[San Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA)] :
Morgan & Claypool Publishers,
[2014]
1 online resource (xviii, 116 pages) :
illustrations
Synthesis lectures on computer vision,
2153-1064 ;
#5
Includes bibliographical references (pages 85-113).
1. Introduction -- 1.1 Problem definition and terminology -- 1.2 VBI motivation -- 1.3 A brief history of VBI -- 1.4 Opportunities and challenges for VBI -- 1.5 Organization.
2. Awareness: detection and recognition -- 2.1 What to detect and recognize? -- 2.2 Review of state-of-the-art and seminal works -- 2.2.1 Face -- 2.2.2 Eyes -- 2.2.3 Hands -- 2.2.4 Full body -- 2.3 Contextual human sensing.
3. Control: visual lexicon design for interaction -- 3.1 Static visual information -- 3.1.1 Lexicon design from body/hand posture -- 3.1.2 Lexicon design from face/head/facial expression -- 3.1.3 Lexicon design from eye gaze -- 3.2 Dynamic visual information -- 3.2.1 Model-based approaches -- 3.2.2 Exemplar-based approaches -- 3.3 Combining static and dynamic visual information -- 3.3.1 The SWP systems -- 3.3.2 The VM system -- 3.4 Discussions and remarks.
5. Applications of vision-based interaction -- 5.1 Application scenarios for VBI -- 5.2 Commercial systems.
6. Summary and future directions -- Bibliography -- Authors' biographies.
In its early years, the field of computer vision was largely motivated by researchers seeking computational models of biological vision and solutions to practical problems in manufacturing, defense, and medicine. For the past two decades or so, there has been an increasing interest in computer vision as an input modality in the context of human-computer interaction. Such vision-based interaction can endow interactive systems with visual capabilities similar to those important to human-human interaction, in order to perceive non-verbal cues and incorporate this information in applications such as interactive gaming, visualization, art installations, intelligent agent interaction, and various kinds of command and control tasks. Enabling this kind of rich, visual and multimodal interaction requires interactive-time solutions to problems such as detecting and recognizing faces and facial expressions, determining a person's direction of gaze and focus of attention, tracking movement of the body, and recognizing various kinds of gestures.