Vision-Based Interaction
I am interested in developing general methods for vision-based
interaction that allow dynamic, unencumbered interaction in environments
augmented with new display technology and with both active and passive
vision systems.
I am involved in the
VICs project, in which we are developing a
general framework for vision-based human-computer interaction. The
project is based on two general ideas:
- Most interaction occurs in a constrained setting or region of the
environment; for example, a button need only observe the image region in
its immediate neighborhood.
- Image processing is an expensive task and must be employed
sparingly. If one observes the stream of visual cues in the
image stream prior to a button push, one notices a "coarse-to-fine"
progression of such cues: first motion, then color, then shape, then
gesture matching. Matching this progression, we invoke the corresponding
visual processing components in the same order, so that cheap cues gate
the expensive ones; a sketch of such a cascade follows this list.
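To make this concrete, below is a minimal sketch (in Python with OpenCV;
not the actual VICs implementation) of how a single button component might
gate expensive processing behind cheap cues. The class name, thresholds,
and skin-color range are all illustrative assumptions:

import cv2
import numpy as np

class ButtonRegion:
    """One interface component watching only its own image neighborhood."""

    def __init__(self, x, y, w, h):
        self.roi = (x, y, w, h)
        self.prev = None  # previous grayscale patch, for motion differencing

    def triggered(self, frame_bgr):
        """Run cheap cues first and bail out as soon as one fails."""
        x, y, w, h = self.roi
        patch = frame_bgr[y:y + h, x:x + w]
        gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)

        # 1. Motion: frame differencing restricted to the ROI.
        if self.prev is None:
            self.prev = gray
            return False
        moving = np.mean(cv2.absdiff(gray, self.prev)) > 8.0
        self.prev = gray
        if not moving:
            return False

        # 2. Color: crude skin-tone test in HSV (range is illustrative).
        hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
        skin = cv2.inRange(hsv, (0, 40, 60), (25, 255, 255))
        if cv2.countNonZero(skin) < 0.2 * skin.size:
            return False

        # 3. Shape and gesture matching, the expensive stages, run only on
        #    the few frames that survive the cheaper cues above.
        return self._match_gesture(patch)

    def _match_gesture(self, patch):
        return True  # placeholder for the final, expensive verifier

The key property is that most frames exit at the motion test, so the
expensive gesture matcher runs only on the few frames that survive the
cheaper cues.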
Most recently, we have developed a new platform for unencumbered
interaction via a projected display passively monitored by uncalibrated
cameras: the 4D Touchpad (4DT). The system operates under the VICs
paradigm, that is, coarse-to-fine processing in constrained settings to
minimize unnecessary computation. We built an eight-key piano as a
first demonstration of the system
[mpeg]. We are currently modifying TWM to operate under the 4DT
in order to demonstrate the versatility of this platform.
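Since the cameras are uncalibrated, their images must be related to the
projected display somehow; one standard technique for a planar display
surface is to fit a homography to a few projected fiducial points. The
sketch below (Python with OpenCV) illustrates the idea; the point
correspondences are hypothetical placeholders, and the 4DT's actual
registration procedure is described in the papers:

import cv2
import numpy as np

# Display-space positions of four projected fiducials (projector pixels).
disp_pts = np.float32([[0, 0], [1024, 0], [1024, 768], [0, 768]])
# Where those fiducials were detected in the camera image (placeholders).
cam_pts = np.float32([[102, 87], [598, 95], [610, 462], [95, 455]])

H, _ = cv2.findHomography(cam_pts, disp_pts)

def to_display(u, v):
    """Map a camera pixel (u, v) into display coordinates via H."""
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]

With such a map in hand, every camera observation (e.g. a fingertip) can
be expressed in display coordinates, so interface components can be placed
and tested directly in the projected image.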
A provisional patent has been filed for the
4D Touchpad.
Supplementary Information for ICVS 2003 Paper
VICs: A Modular Vision-Based HCI Framework
The paper can be found on the publications page.
Slides of the talk are available in PDF form
[ color |
b/w with notes ]
Surface Tracking

Many of the proposed vision-based interfaces are tied to a
surface. We have developed a set of algorithms to directly track planar
and parametric surfaces under a calibrated stereo rig. Papers detailing
this work can be found on the publications page.
A movie demonstrating the planar surface tracking is here. A binary pixel
mask is maintained that identifies the pixels belonging to the plane (and
having good texture); it is shown in red in the lower left of the
video. The green vector being rendered is the plane's normal. Below is an
image of a system built with our plane-tracking routines to localize
mobile robots. In the image, we show the real scene, the two walls being
tracked (one in blue and one in red), and an overhead (orthogonal)
projection of the reconstructed walls.
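The membership test behind this mask can be expressed with the standard
plane-induced homography between two calibrated views: for a plane
n . X = d in the left camera frame and relative stereo pose (R, t), the
left-to-right pixel map is H = K_r (R - t n^T / d) K_l^{-1}. Below is a
minimal sketch of such a test in Python with NumPy (my notation and
placeholder values, not the code from the papers):

import numpy as np

def plane_homography(K_l, K_r, R, t, n, d):
    """Left-to-right homography induced by the plane n . X = d."""
    return K_r @ (R - np.outer(t, n) / d) @ np.linalg.inv(K_l)

def on_plane(H, left, right, u, v, tol=10.0):
    """Crude membership test: warp left pixel (u, v) into the right image
    and compare intensities. A practical tracker would also require the
    pixel to have good texture, as described above."""
    p = H @ np.array([u, v, 1.0])
    ur, vr = p[:2] / p[2]
    ui, vi = int(round(ur)), int(round(vr))
    if not (0 <= vi < right.shape[0] and 0 <= ui < right.shape[1]):
        return False
    return abs(float(left[v, u]) - float(right[vi, ui])) < tol

# Placeholder calibration for illustration only.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
H = plane_homography(K, K, np.eye(3), np.array([0.1, 0.0, 0.0]),
                     np.array([0.0, 0.0, 1.0]), 2.0)

Pixels that pass this test (and lie in well-textured areas) form the red
mask shown in the video; re-estimating the plane parameters (n, d) frame
to frame is what constitutes tracking the plane.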
XVision2 notes