Vision-Based Interaction


I am interested in the development of general methods for vision-based interaction that allow dynamic, unencumbered interaction in environments augmented with new display technology and both active and passive vision systems.
I am involved in the VICs project, in which we are investigating a general framework for vision-based human-computer interaction. The project is based on two ideas:

  1. Most interaction occurs in a constrained setting or region of the environment; for example, a button need only observe the image region in its immediate neighborhood.
  2. Image processing is expensive and must be employed sparingly. If one observes the visual cues in the image stream prior to a button push, one notices a "coarse-to-fine" progression: first motion, then color, then shape, then gesture matching. We order our visual processing components to match this progression, so that cheap tests gate the expensive ones (see the sketch after this list).
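
To make the gating concrete, below is a minimal sketch of such a cue cascade for a single button's image region. It is written in Python/NumPy and is not the project's code; the particular cue tests and thresholds are illustrative assumptions, and the final (most expensive) gesture-matching stage is omitted.

    import numpy as np

    def motion_cue(region, prev_region, thresh=10.0):
        # Cheapest test: mean absolute frame difference over the region.
        diff = np.abs(region.astype(np.float32) - prev_region.astype(np.float32))
        return diff.mean() > thresh

    def color_cue(region, min_fraction=0.2):
        # Crude skin-tone test on an RGB region: fraction of red-dominant pixels.
        r = region[..., 0].astype(np.int16)
        g = region[..., 1].astype(np.int16)
        b = region[..., 2].astype(np.int16)
        return ((r > g + 10) & (r > b + 10)).mean() > min_fraction

    def shape_cue(region, template, max_mean_diff=0.25):
        # More expensive: compare a binarized silhouette to a stored template.
        silhouette = (region.mean(axis=-1) > 128).astype(np.float32)
        return np.abs(silhouette - template).mean() < max_mean_diff

    def button_fired(region, prev_region, template):
        # Coarse-to-fine: bail out at the first failed cheap test so the
        # expensive stages run only on frames that already look promising;
        # gesture matching would follow as the final stage.
        return (motion_cue(region, prev_region)
                and color_cue(region)
                and shape_cue(region, template))

Because Python's "and" short-circuits, a static frame costs only one image difference per button per frame.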

Most recently, we have developed a new platform for unencumbered interaction via a projected display passively monitored by uncalibrated cameras: the 4D Touchpad (4DT). The system operates under the VICs paradigm -- that is, coarse-to-fine processing in a constrained setting to minimize unnecessary computation. We built an eight-key piano as a first demonstration of the system [mpeg]. We are currently modifying TWM to operate under the 4DT in order to demonstrate the versatility of this platform.
A provisional patent has been filed for the 4D Touchpad.
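
One standard way to decide, from two uncalibrated views of a known planar display, whether a fingertip is actually on the surface is the plane-induced homography: map each view into display coordinates, and a point on the plane lands at the same coordinates in both views, while a hovering point shows residual parallax. The sketch below (Python/OpenCV) illustrates the idea; the corner correspondences are made-up placeholder values, and whether this matches the 4DT's exact contact test is an assumption.

    import cv2
    import numpy as np

    # Display resolution and the display's corners as seen in each camera.
    # These corner coordinates are placeholders; in practice they would come
    # from detecting a projected calibration pattern in each camera image.
    DISPLAY_W, DISPLAY_H = 1024, 768
    display = np.float32([[0, 0], [DISPLAY_W, 0],
                          [DISPLAY_W, DISPLAY_H], [0, DISPLAY_H]])
    corners_cam0 = np.float32([[112, 80], [905, 95], [880, 640], [130, 655]])
    corners_cam1 = np.float32([[140, 70], [930, 88], [900, 630], [155, 645]])

    # Homographies induced by the display plane, mapping each camera view
    # into display coordinates.
    H0 = cv2.getPerspectiveTransform(corners_cam0, display)
    H1 = cv2.getPerspectiveTransform(corners_cam1, display)

    def touch_point(tip0, tip1, tol=3.0):
        # Map a fingertip detected in each view into display coordinates.
        p0 = cv2.perspectiveTransform(np.float32([[tip0]]), H0).ravel()
        p1 = cv2.perspectiveTransform(np.float32([[tip1]]), H1).ravel()
        if np.linalg.norm(p0 - p1) < tol:
            return 0.5 * (p0 + p1)   # on the surface: display-space position
        return None                  # hovering above the surface

Note that no camera calibration is required here: the four corner correspondences per view determine each homography directly.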




Supplementary Information for ICVS 2003 Paper


VICs: A Modular Vision-Based HCI Framework
The paper can be found on the publications page.
Slides of the talk are here in pdf form [ color | b/w with notes ]


Surface Tracking



Many of the proposed vision-based interfaces will be tied to a surface. We have developed a set of algorithms to directly track planar surfaces and parametric surfaces under a calibrated stereo rig. Papers detailing this work can be found on the publications page.

A movie demonstrating the planar surface tracking is here. A binary pixel mask is maintained that determines which pixels belong to the plane (and have good texture); it is shown in red in the lower left of the video. The green vector being rendered is the plane's normal vector.

Below is an image of a system built with our plane-tracking routines to localize mobile robots. In the image, we show the real scene, the two walls being tracked (one in blue and one in red), and an overhead (orthogonal) projection of the reconstructed walls.
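
The rendered normal corresponds to a standard computation: given 3D points reconstructed by the stereo rig at the masked (on-plane, well-textured) pixels, the plane normal is the direction of least variance in the point cloud. Below is a minimal NumPy sketch of that total-least-squares fit; the point data is synthetic, and this is only an illustration, not our tracking algorithm (which operates directly on the images).

    import numpy as np

    def fit_plane(points):
        # Total-least-squares plane fit: the normal is the right singular
        # vector of the centered points with the smallest singular value.
        # points: (N, 3) array of reconstructed 3D points on the plane.
        centroid = points.mean(axis=0)
        _, _, vt = np.linalg.svd(points - centroid)
        return vt[-1], centroid   # unit normal, point on the plane

    # Synthetic check: noisy samples of the plane z = 0.1x + 0.2y + 5,
    # whose normal is proportional to (0.1, 0.2, -1).
    rng = np.random.default_rng(0)
    xy = rng.uniform(-1.0, 1.0, size=(500, 2))
    z = 0.1 * xy[:, 0] + 0.2 * xy[:, 1] + 5.0 + 0.01 * rng.standard_normal(500)
    normal, centroid = fit_plane(np.column_stack([xy, z]))
    print(normal)   # ~ +/- (0.098, 0.195, -0.976)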



XVision2 notes



last updated: 2003.april.08; © jcorso