Recent advances in 3D depth cameras such as Microsoft Kinect sensors have created many opportunities for more natural ways of interacting with computers and with people across distances. A key enabling technology is human body-language understanding by computer. Only after the computer understands what a user is doing can it respond to the user in a natural way, or capture the essential information and relay it to remote users. This has long been an active research field in computer vision, but it has proven formidably difficult with video cameras. 3D depth cameras such as Microsoft Kinect sensors allow the computer to directly sense the third dimension (depth) of the users and the environment, easing the task of human body-language understanding by computer. In this talk, I will describe our recent work in using such commodity depth cameras for hand gesture recognition, human action recognition, facial expression tracking, engagement detection, human body modeling, and immersive teleconferencing.
Zhengyou Zhang received the B.S. degree in electronic engineering from Zhejiang University, Hangzhou, China, in 1985, the M.S. degree in computer science from the University of Nancy, Nancy, France, in 1987, the Ph.D. degree in computer science from the University of Paris XI, Paris, France, in 1990, and the Doctorate of Science (Habilitation à diriger des recherches) from the University of Paris XI, Paris, France, in 1994. He is a Principal Researcher with Microsoft Research, Redmond, WA, USA, and the Research Manager of the “Multimedia, Interaction, and Communication” group. Before joining Microsoft Research in March 1998, he was with INRIA (French National Institute for Research in Computer Science and Control), France, for 11 years, serving as a Senior Research Scientist from 1991. In 1996-1997, he spent a one-year sabbatical as an Invited Researcher with the Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan. He is also an Affiliate Professor with the University of Washington, Seattle, WA, USA, and an Adjunct Chair Professor with Zhejiang University, Hangzhou, China. He has published over 200 papers in refereed international journals and conferences, and has coauthored the following books: 3-D Dynamic Scene Analysis: A Stereo Based Approach (Springer-Verlag, 1992); Epipolar Geometry in Stereo, Motion and Object Recognition (Kluwer, 1996); Computer Vision (Chinese Academy of Sciences, 1998, 2003, in Chinese); Face Detection and Adaptation (Morgan and Claypool, 2010); and Face Geometry and Appearance Modeling (Cambridge University Press, 2011). He has given a number of keynotes at international conferences and invited talks at universities.
Dr. Zhang is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE), the Founding Editor-in-Chief of the IEEE Transactions on Autonomous Mental Development, an Associate Editor of the International Journal of Computer Vision, an Associate Editor of Machine Vision and Applications, and an Area Editor of the Journal of Computer Science and Technology. He served as an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence from 2000 to 2004, of the IEEE Transactions on Multimedia from 2004 to 2009, and of the International Journal of Pattern Recognition and Artificial Intelligence from 1997 to 2009, among others. He has served as Area Chair, Program Chair, or General Chair of a number of international conferences, most recently as a Program Co-Chair of the International Conference on Multimedia and Expo (ICME), July 2010, a Program Co-Chair of the ACM International Conference on Multimedia (ACM MM), October 2010, a Program Co-Chair of the ACM International Conference on Multimodal Interfaces (ICMI), November 2010, and a General Co-Chair of the IEEE International Workshop on Multimedia Signal Processing (MMSP), October 2011. He also served as the founding Chair of the new “Technical Briefs” track of the ACM SIGGRAPH Asia Conference, Nov. 28 – Dec. 1, 2012. More information is available at http://research.microsoft.com/~zhang/