Fall 2009

September 1, 2009

Traditional Approximate Pattern Matching (e.g. Hamming Distance errors, edit distance errors) assumes that various types of errors may occur in the data, but an implicit assumption is that the order of the data remains unchanged. Over the years some applications identified types of “errors” where the data remains correct but its order is compromised. The earliest example is the “swap” error motivated by a common typing error. Other widely known examples such as transpositions, reversals, and interchanges, are motivated by Biology. We propose a model that formally separates the concepts of “errors in the data” and “errors in the address” since they present different algorithmic challenges solved by different techniques. The “error in address” model, which we call asynchronous pattern matching, since the data is not assumed to arrive in a synchronous sequential manner, is rich in problems not addressed hitherto. We will consider some reasonable metrics for asynchronous pattern matching and show some efficient algorithms for these problems. As expected, the techniques needed to solve these problems are not taken from the standard pattern matching “toolkit”.

Speaker Biography: Amihood Amir received his Ph.D. in 1983 at Bar-Ilan University. He did his post-doc at MIT, and was on the faculty at Tufts, University of Maryland at College Park, and Georgia Tech, before joining Bar-Ilan University in 1994. Today he is a professor of Computer Science at Bar-Ilan University and a Research Professor at Johns Hopkins University. Amir had been head of the Bar-Ilan Responsa Project, and Chairman of the Department of Computer Science. Today he is dean of the College of Exact Sciences at Bar-Ilan University. Amihood Amir’s Ph.D. thesis was in logic of programs, particularly temporal logics. He later did some work in Complexity theory -sub-recursive classes of functions and the concept of “cheatability” in hard sets. Since the late 1980’s Amir’s research has been in Algorithms design and analysis, in particular Pattern Matching Algorithms. In the later area he has been instrumental in developing the multidimensional pattern matching area, compressed matching, and, recently, asynchronous matching. In this context he has also done work in algorithms for Computational Biology. Amir has also worked in the past in Scheduling algorithms, VLSI algorithms, and Data Mining algorithms.

Slides >>

September 22, 2009

Long-term, environmental sampling by networks of embedded sensors has long been a vision of the computer-science community. This vision has subsequently driven much of the research over the past decade around low-power communications and network protocols. Despite these advances however, almost all existing applications of sensor networks still hold to the traditional “data logger” paradigm, where a user preselects a fixed operating point and then deploys. This talk will present an overview of recent work around changing the way, long-term sensor networks can be used. We propose a revised architecture for environmental sensor networks, where nodes can adapt their behavior based on the amount of energy available to harvest from the environment and within the constraints of a user-policy. An overview of current work around wireless camera networks will also be presented, where smart energy management protocols will become critical for long-term use.

Speaker Biography: Tim Wark is a Senior Research Scientist in the Autonomous Systems Laboratory at CSIRO ICT Centre, Australia. He has worked in a variety of projects around sensor networks ranging from wearable body sensors for indoor monitoring, to long-term, environmental sensing. His current areas of focus are around adaptive energy management of sensor networks and in-network processing of multimedia information. He currently leads a team of scientists and engineers developing the next generation of these long-term, wireless sensing technologies. Tim is a recipient of the CSIRO Julius Career Award, and is currently based at UC Berkeley as a Visiting Scholar with the Wireless and Embedded Systems Laboratory.

Student Seminar

September 25, 2009

English and a small set of other languages have a wealth of available linguistic knowledge resources and annotated language data, but the great majority of the world’s languages have little or none. This seminar will describe work on leveraging the detailed and accurate morphosyntactic analyses available for English to improve analytical capabilities for a diverse set of other languages. This includes the targeted enrichment of English morphosyntactic analysis, translingual projection of that analysis to bootstrap analyses of other languages, and exploitation of that richer feature space for improved machine translation and bitext word alignment. Emphasis is on the combination of multiple sources of information, including both explicitly expressed human linguistic knowledge and patterns observed in monolingual and bilingual corpora, and on language pairs where advanced analysis capabilities are available for one language and unavailable for the other.

Selected contributions to science that will be described include:

  • Proposal and execution of the concept of tagging English with a quasi-universal part-of-speech tag set of fine-grained morphosyntactic features designed for effective translingual annotation transfer from English to a diverse set of world languages.

  • Demonstration of the feasibility of automatically tagging English with a quasi-universal part-of-speech tagset with high accuracy, including the large percentage of quasi-universal features which are not realized via surface English morphology.

  • Demonstration of the high-performance extraction of fine-grained morphosyntactic tags from several state-of-the-art parsers, the combination of which outperforms the syntactic analysis extracted from any individual parser.

  • Demonstration of successful fine-grained tagset mapping between languages to enable translingual projection between non-isomorphic fine-grained tagsets.

  • Demonstration of successful bootstrapping from this projection, using automatically trained system combination to integrate multiple information sources.

  • Demonstration that enrichment of conditioning for machine translation by inclusion of fine-grained morphosyntactic tagging can provide significant gains in the accuracy of lexical choice in machine translation.

  • Demonstration that morphological expansion of a translation lexicon can provide significant improvements in word-alignment performance.

  • Demonstration that such expansion followed by weighting or filtering by empirically estimated correspondences between source- and target-language inflectional forms can improve translation performance.

  • Demonstration that syntactically transforming the target language into an English’ reordering of parsed English to closely parallel the source language word order can provide substantial improvements in word-alignment performance.

Slides >>

October 6, 2009

As robots move to human environments, the ability to learn and imitate from observing human behavior will become important. The talk will focus on our recent work on designing humanoid robots capable of continuous, on-line learning through observation of human movement. Learning behavior and motion primitives from observation is a key skill for humanoid robots, enabling humanoids to take advantage of their similar body structure to humans. First, approaches for designing the appropriate motion representation and abstraction will be discussed. Next, an approach for on-line, incremental learning of whole body motion primitives and primitive sequencing from observation of human motion will be described. The talk will conclude with an overview of preliminary experimental results and a discussion of future research directions.

Speaker Biography: Dana Kulić received the combined B. A. Sc. and M. Eng. degree in electro-mechanical engineering, and the Ph. D. degree in mechanical engineering from the University of British Columbia, Canada, in 1998 and 2005, respectively. From 2002 to 2006, Dr. Kulić worked with Dr. Elizabeth Croft as a Ph. D. student and a post-doctoral researcher at the CARIS Lab at the University of British Columbia, developing human-robot interaction strategies to quantify and maximize safety during the interaction. From 2006 to 2009, Dr. Kulić was a JSPS Post-doctoral Fellow and a Project Assistant Professor at the Nakamura-Yamane Laboratory at the University of Tokyo, Japan, working on algorithms for incremental learning of human motion patterns for humanoid robots. Dr. Kulić is currently an Assistant Professor at the Electrical and Computer Engineering Department at the University of Waterloo. Her research interests include robot learning, humanoid robots, human-robot interaction and mechatronics.

Student Seminar

October 14, 2009

For clinical and scientific studies, it is important to understand the internal muscle motion of tissues such as the tongue during speech and the heart in its beating cycle. Magnetic resonance (MR) tagging places non-invasive and temporary markers (tags) inside the soft tissues in a pre-specified pattern, yielding images that carry information about motion in the tagging direction. These images can be processing using harmonic phase (HARP) method to compute the in-plane motion. The dissertation studies the three-dimensional (3D) muscle motion using MR tagging with a focus on tongue imaging, and addresses the technical challenges in both 2D and 3D motion estimation.

In the dissertation, we developed HARP tracking refinement methods to reliably and automatically track the whole tissue from tagged MRI even when traditional HARP tracking fails. We measured 3D tongue motion during speech by re-implementing and optimizing the zHARP imaging sequence, and using a specialized MR triggering and vocal repetition method. We developed a thin plate spline based 3D tongue motion tracking method using tagged MR images by extending the 3D-HARP method for cardiac motion tracking. We developed a method to reconstruct a 3D, dense, incompressible deformation field from tagged MR images based on the divergence-free vector spline with incomplete data samples, and applied it to both tongue and cardiac motion reconstruction. Finally, we performed preliminary studies of the internal tongue motion pattern and muscle mechanisms of glossectomy patient and compared them with normal speakers.

Slides >>

October 29, 2009

The objective of this talk is to demonstrate the utility of machine learning in developing a cost-effective test solution for analog/RF circuits. I will first introduce the problem of testing analog/RF chips for manufacturing defects and the current industrial practice, which involves explicit specification measurements obtained through expensive instrumentation. I will then describe an ontogenic neural classifier that learns to separate nominal from faulty chip distributions in a low-dimensional space of inexpensive measurements. The key novelty of this classifier is that its topology is not fixed; rather, it adapts dynamically, in order to match the inherent complexity of the separation problem. Thus, it establishes separation hypersurfaces that reciprocate very well even in the presence of complex chip distributions. I will also discuss the construction of guard-bands, which provide a level-of-confidence indication and support a two-tier test method that allows exploration of the trade-off between test quality and test cost. In this method, the majority of chips are accurately classified through inexpensive measurements, while the small fraction of chips for which the decision of the classifier has a low level of confidence is re-tested through traditional specification testing. The ability of the proposed method to drastically reduce the cost of analog/RF testing without compromising its quality will be demonstrated using two example circuits, a simple switched-capacitor filter and an industrial UHF receiver front-end. In addition, the application of the neural classifier to the problem of analog specification test compaction and its potential for developing a stand-alone Built-in Self-Test (BIST) method for analog/RF circuits will be discussed.

Speaker Biography: Yiorgos Makris received the Diploma of Computer Engineering and Informatics from the University of Patras, Greece, in 1995, and the M.S. and Ph.D. degrees in Computer Engineering from the University of California, San Diego, in 1997 and 2001, respectively. He then joined Yale University where he has been instrumental in revitalizing the Computer Engineering program. He is currently an Associate Professor of Electrical Engineering and Computer Science and leads the Testable and Reliable Architectures (TRELA) Laboratory. His main research interests are in the application of machine learning and statistical analysis methods towards increasing testability, reliability, and security of analog/RF devices. He is also interested in test and reliability methods for asynchronous circuits, as well as error detection and correction methods for modern microprocessors. His research has been supported by NSF, DARPA, SRC, IBM, LSI, Intel, and TI.

Student Seminar

November 11, 2009

Most wireless network installations today involve a set of access points with overlapping coverage zones, each access point being connected to a wired network tap. Mesh networks remove this strong connectivity requirement by having only a few of the access points connected to a wired network, and allowing the others to forward packets over multiple wireless hops. This talk is in the area of wireless mesh networking.

Effort has already been made to make wireless mesh networks a reality. However, the systems that we usually see in academic world and industry are either experimental testbeds (tailored to evaluate special kind of protocols), they use expensive hardware for mesh nodes, or have limited (or none) support for mobility. In this talk we present the architecture of the first high-throughput 802.11 wireless mesh network that provides a fast handoff using off-the-shelf low cost routers. This talk also introduces the architecture and protocols of the first robust Push-To-Talk service for wireless mesh networks. This is a joint work with Yair Amir, Claudiu Danilov, Mike Hilsdale, Michael Kaplan and Nilo Rivera.

November 12, 2009

Many engineering and computer graphics applications require computing surface deformations minimizing an energy or solving an equation of motion. This type of deformations are used to model free-form surfaces in computer-aided design systems, to create animated characters, to simulate cloth or analyze stresses in a car body.

Complex surfaces are commonly represented by meshes, that is, piecewise-linear functions which cannot be differentiated directly. At the same time, the equations that we need to solve often involved derivatives of order four or higher. Approximating high-order derivatives on meshes with sufficient accuracy is difficult, and often requires costly computations. These computations may be prohibitively expensive in the context of interactive modeling and simulation. In many cases, cheap, but inaccurate approximations are available, resulting in faster algorithms, but less reliable results.

In this talk, I will discuss how mesh deformations can be computed efficiently while maintaining accuracy, and demonstrate several applications in geometric modeling and animation.

I will review several complimentary approaches that we have explored, in particular, taking advantage of geometric relations to simplify the equations we need to solve, decomposing higher-order problems into several low-order problems, and representing the solution of a general problem as a combination of solutions of special-case simpler problems.

Speaker Biography: Denis Zorin is an associate professor in the Computer Science Department of the Courant Institute of Mathematical Sciences at New York University. He received a PhD in Computer Science from the California Institute of Technology. Before joining the faculty at NYU he was a postdoctoral researcher at the Computer Science Department of Stanford University. He was a Sloan Foundation Fellow in 2000-2002; he received the NSF CAREER award in 2001, and IBM Faculty Partnership Award in 2001-2004 and 2006. His primary research interests are geometric modeling and scientific computing.

Student Seminar

November 13, 2009

Much of our critical infrastructure is controlled by large software systems whose participants are distributed across the Internet. As our dependence on these critical systems continues to grow, it becomes increasingly important that they meet strict availability and performance requirements, even in the face of malicious attacks, including those that are successful in compromising parts of the system. This talk presents the first replication protocols capable of guaranteeing correctness, availability, and good performance even when some of the servers are compromised, enabling the construction of highly available and highly resilient systems for our critical infrastructure.

Prior to this work, intrusion-tolerant replication protocols were designed to perform well in fault-free executions, and this is how they were evaluated. We point out that many state of the art protocols are vulnerable to significant performance degradation by a small number of malicious processors. We present Prime, a new intrusion-tolerant replication protocol that bounds the amount of performance degradation that can be caused by compromised machines, assuming the network is sufficiently stable. Using Prime as a building block, we show how to design and implement an attack-resilient, large-scale intrusion-tolerant replication system for wide-area networks. Our results provide evidence that it is possible to construct highly resilient, large-scale survivable systems that perform well even when some of the servers (and some entire sites) are compromised.

Slides >>

November 19, 2009

While supervised learning methods for classification and structured prediction are very effective in many domains, they require detailed and precise labeling of large amounts of data. Weakly or ambiguously labeled data present major challenges as well as opportunities. For example, to build a machine translation system, we typically have large amounts of translated sentences to learn from, but without word or phrase level correspondence. Copious images and videos on the web or your harddrive are typically labeled with captions of who and what is in the picture, but not where and when. The challenges are both theoretical and algorithmic: under what assumptions can we guarantee effective and efficient learning of precise correspondence from pure co-occurrence? I will describe our ongoing work on weakly supervised learning approaches for machine translation and parsing of images, videos and text.

Speaker Biography: Ben Taskar received his bachelor’s and doctoral degree in Computer Science from Stanford University. After a postdoc at the University of California at Berkeley, he joined the faculty at the University of Pennsylvania Computer and Information Science Department in 2007, where he currently co-directs PRiML: Penn Research in Machine Learning. His research interests include machine learning, natural language processing and computer vision. His work on structured prediction has received best paper awards at NIPS and EMNLP conferences.

December 3, 2009

The tele-immersion system at CITRIS lab at UC Berkeley consists of 48 cameras in 12 stereo clusters. Images are processed by 12 computers running simultaneously and sending data via Internet II connection to near and remote rendering computers. This system enables to generate in real time (22 frames/second) 4D data of moving people. In this presentation I will show the following applications:

  • Two dancers each in separate locations dancing together in the Virtual space;

  • Two Geo-scientists in different locations discussing, manipulating and analyzing seismographic data interactively;

  • Archeologist analyzing 3D data from China virtually.

The scientific agenda stemming from this infrastructure is the dynamic analysis of Human body action and its interpretation. We will show some preliminary results from segmentation and classification of human action. This work is collaboration between UC Berkeley, UIUC, UC Merced and UC Davis.

For more information, go to http://tele-immersion.citris-uc.org/.

Speaker Biography: Dr. Ruzena Bajcsy was appointed Director of CITRIS and professor of EECS department at the University of California, Berkeley on November 1, 2001. In 2004 she became a CITRIS director emeritus and now she is a full time professor of EECS. Dr. Bajcsy is a pioneering researcher in machine perception, robotics and artificial intelligence. Dr. Bajcsy received her master’s and Ph.D. degrees in electrical engineering from Slovak Technical University in 1957 and 1967, respectively. She received a Ph.D. in computer science in 1972 from Stanford University.

Student Seminar

December 4, 2009