Spring 2002

February 13, 2002

When designing a cryptographic system, failure to take into account the behavior of the system’s users can have severe consequences. Nowhere is this more evident than in the design of protocols for user authentication and authenticated key-exchange. The standard protocols for these tasks are secure when parties use long, hard-to-guess secrets; however, these same protocols can be completely broken by an off-line dictionary attack when users choose weak passwords which are easy for an attacker to guess.

A long-standing goal of computer security has been to design password-only authentication and key-exchange protocols which are resistant to such attacks even when users choose poor passwords (as is typically the case). Many previously-proposed solutions offer “heuristic” security guarantees without a proof of security in the standard cryptographic model.

We propose the first efficient protocols for password-only authentication and key-exchange which are provably secure against off-line dictionary attacks. Proofs of security are in the standard cryptographic model using a well-known cryptographic assumption, and the protocols require only slightly more computation than the original key-exchange protocol of Diffie and Hellman (which provides no authentication at all).
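
For reference, here is a minimal Python sketch of the unauthenticated Diffie-Hellman exchange the abstract compares against (the prime is a deliberately small toy value; real deployments use far larger standardized groups):

    import secrets

    # Toy, unauthenticated Diffie-Hellman key exchange.
    P = 2**127 - 1          # a small Mersenne prime, for illustration only
    G = 3

    a = secrets.randbelow(P - 2) + 1     # Alice's secret exponent
    b = secrets.randbelow(P - 2) + 1     # Bob's secret exponent
    A = pow(G, a, P)                     # Alice sends g^a mod p
    B = pow(G, b, P)                     # Bob sends g^b mod p

    # Both sides derive the same shared secret g^(ab) mod p.
    assert pow(B, a, P) == pow(A, b, P)

Using a password-derived secret naively in such an exchange is exactly what an off-line dictionary attack exploits: an eavesdropper can test candidate passwords against the recorded transcript. The goal of the protocols above is to add password-based authentication at roughly this computational cost while making transcripts useless for such testing.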

We will also discuss more recent work focusing on password-based authentication and key-exchange in the public-key model.

February 25, 2002

Network design problems consist of selecting components to purchase (servers, cables, hubs, and so forth) in order to produce the cheapest network which satisfies given constraints. These constraints require (for example) high bandwidth or low latency service for a set of customers. Many such problems exist; some of the simplest can be solved using flow techniques. However, the addition of integrality constraints and economies of scale makes many network design problems NP-Hard.

I will describe combinatorial approximations for several problems in network design. In particular, I will present the problems of single-sink buy-at-bulk network design and cost-distance network design. These problems deal with connecting a set of customers to a single sink node (consider this to be a point on the Internet backbone). The customers have given bandwidth requirements, and various types of cables are available whose costs satisfy economies of scale. The combinatorial approximation algorithms to be presented have a number of advantages over earlier techniques, improving running time, worst-case approximation ratio, and average-case performance.

Earlier work on network design did not account for changes in the set of customers with the passage of time. I will give the first online algorithm for facility location, making it possible to build a solution incrementally as the customers arrive. Similar techniques can be applied to the problem of access network design.
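
To indicate the flavor of such incremental construction, here is a hedged Python sketch of one randomized rule from the online facility location literature (the uniform facility cost f and the specific probability rule are assumptions for illustration, not necessarily the exact algorithm presented): each arriving customer either opens a facility at its own location, with probability proportional to its distance to the nearest open facility, or is assigned to that facility.

    import math, random

    def online_facility_location(demands, f):
        """Randomized online rule: each arriving demand point opens a
        facility at itself with probability min(1, d / f), where d is
        its distance to the nearest open facility; otherwise it is
        assigned to that nearest facility."""
        facilities, cost = [], 0.0
        for p in demands:
            if not facilities:
                facilities.append(p)
                cost += f                      # first point always opens
                continue
            d = min(math.dist(p, q) for q in facilities)
            if random.random() < min(1.0, d / f):
                facilities.append(p)
                cost += f                      # open a new facility here
            else:
                cost += d                      # pay the assignment cost
        return facilities, cost

    # Toy stream of customer locations in the plane, facility cost 2.0.
    pts = [(random.random() * 10, random.random() * 10) for _ in range(20)]
    print(online_facility_location(pts, 2.0))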

Finally, I will outline some avenues for future work, building on the results mentioned and relating them to current research in the theory and networking communities.

Parts of the work described are joint with Sudipto Guha, Kamesh Munagala, and Serge Plotkin.

March 4, 2002

Recent industrial efforts (e.g., Infiniband) in next-generation System Area Network (SAN) architectures introduce new communication abstractions and protocols, coupled with significant hardware support, to meet performance goals of low latency, low host-processor overhead, and high throughput. They largely abandon existing network protocol schemes under the perception that those schemes cannot deliver sufficient performance. However, the success and ubiquity of TCP/IP-based networks cannot be ignored. The suite of Internet protocols is well understood, offers a rich feature set, and has been tested on a variety of platforms and links. This talk examines the potential architectural interactions of new I/O networks, like Infiniband, and Internet protocols in the SAN space.

The first part of the discussion focuses on the Queue Pair (QP) memory-based communication model adopted by modern SAN proposals. The analysis covers network performance and host overhead, and also examines the functional integration of the QP into distributed I/O and IPC application environments. The second part of the talk looks at the merger of inter-network protocols (i.e., TCP/UDP/IP) and the queue-pair abstraction in an architecture called Queue Pair IP (QPIP). I will present a prototype implementation of QPIP that provides offloaded TCP/IP functionality underneath the QP abstraction, along with a performance analysis in the context of networked storage.
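
As background on the queue-pair model, here is a schematic Python sketch (all class and method names are hypothetical; real QP interfaces such as Infiniband verbs differ in detail): the host posts buffer descriptors to send and receive queues, and the NIC consumes them and reports completions asynchronously, avoiding per-packet system calls and data copies.

    from collections import deque

    class QueuePair:
        """Schematic queue pair: descriptors are posted by the host,
        consumed by a (here simulated) NIC, and completions are
        reported through a completion queue."""
        def __init__(self):
            self.send_q = deque()
            self.recv_q = deque()
            self.completion_q = deque()

        def post_send(self, data):
            self.send_q.append(data)          # post a send descriptor

        def post_recv(self, buf):
            self.recv_q.append(buf)           # pre-post a receive buffer

        def nic_process(self, peer):
            # Stand-in for hardware: deliver one message directly into
            # the peer's pre-posted buffer (zero-copy in spirit).
            if self.send_q and peer.recv_q:
                data = self.send_q.popleft()
                buf = peer.recv_q.popleft()
                buf[:len(data)] = data
                self.completion_q.append(("send-done", len(data)))
                peer.completion_q.append(("recv-done", len(data)))

    # Usage: post work, let the "NIC" run, then poll for completions.
    a, b = QueuePair(), QueuePair()
    b.post_recv(bytearray(16))
    a.post_send(b"hello")
    a.nic_process(b)
    print(a.completion_q[0], b.completion_q[0])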

March 5, 2002

Typically, NACK-based congestion control is dismissed as not viable due to the common notion that “open-loop” congestion control is simply “difficult.” Emerging real-time streaming applications, however, often rely on rate-based flow control and would benefit greatly from scalable NACK-based congestion control. This talk sheds new light on the performance of NACK-based congestion control and measures the amount of “difficulty” inherently present in such protocols. We specifically focus on increase-decrease (I-D) congestion control methods for real-time, rate-based streaming. First, we introduce and study several new performance measures that can be used to analyze the class of general I-D congestion control methods. These measures include monotonicity of convergence to fairness and packet-loss scalability. Second, under the assumption that the only feedback from the network is packet loss, we show that AIMD is the only TCP-friendly method with monotonic convergence to fairness. Furthermore, we find that AIMD possesses the best packet-loss scalability among all TCP-friendly binomial schemes and show how poorly all of the existing methods scale as the number of flows is increased. Third, we show that if the flows can obtain knowledge of an additional network parameter (i.e., the bottleneck bandwidth), the scalability of AIMD can be substantially improved. We conclude the talk by studying the performance of a new scheme, called Ideally-Scalable Congestion Control (ISCC), both in simulation and in a NACK-based MPEG-4 streaming application over a Cisco testbed.
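
As a concrete instance of the I-D class, here is a minimal Python sketch of an AIMD rate controller (the parameter values are illustrative assumptions, not those from the talk): additive increase when the NACK-style feedback reports no loss, multiplicative decrease when it does.

    def aimd_step(rate, loss, alpha=1.0, beta=0.5):
        """One adjustment of an AIMD rate controller (packets/RTT):
        add alpha when no loss is reported, multiply by beta on loss."""
        return rate * beta if loss else rate + alpha

    # Toy trace: the rate climbs linearly, then halves at each loss event.
    rate = 10.0
    for loss in [False, False, False, True, False, False, True]:
        rate = aimd_step(rate, loss)
        print(f"rate = {rate:.1f}")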

March 7, 2002

In this talk we consider the $$k$$ edge-disjoint paths problem ($$k$$-EDP), a generalization of the well-known edge-disjoint paths problem: given a graph $$G=(V,E)$$ and a set of terminal pairs (or requests) $$T$$, the problem is to find a maximum subset of the pairs in $$T$$ for which it is possible to select disjoint paths so that each pair is connected by $$k$$ edge-disjoint paths. To the best of our knowledge, nothing nontrivial is known for this problem so far for $$k>1$$. In order to measure the performance of our algorithms we will use the routing number $$R$$ of the graph. This parameter is known to satisfy $$R=O(\Delta \cdot \alpha^{-1} \log n)$$, where $$\Delta$$ is the maximum degree and $$\alpha$$ is the edge expansion of $$G$$. We show, with the help of a simple greedy online algorithm, that it is possible to achieve a competitive ratio of $$O(k^3 \cdot \Delta \cdot R)$$. In order to achieve this competitive ratio, we introduce a new method to convert a system of $$k$$ disjoint paths into a system of $$k$$ short disjoint paths. We also show that our class of online algorithms has a competitive ratio of $$\Omega(k \cdot R)$$.

In addition, we study the $$k$$ disjoint flows problem ($$k$$-DFP), which is a generalization of the well-known unsplittable flow problem (UFP): the $$k$$-DFP is similar to the $$k$$-EDP with the difference that now the graph has edge capacities and the requests can have arbitrary demands $$d_i$$. The aim is to find a subset of requests of maximum total demand for which it is possible to select flow paths so that all capacity constraints are satisfied and each selected request with demand $$d_i$$ is connected by $$k$$ disjoint paths, each of flow value $$d_i/k$$.

The $$k$$-EDP and $$k$$-DFP problems have important applications in fault-tolerant (virtual) circuit switching, which plays an important role in optical networks.

This talk is based on work done jointly with Christian Scheideler and Amitabh Chaudhary of Johns Hopkins University and Petr Kolman of Charles University.

March 11, 2002

Networked sensors – those that coordinate amongst themselves to achieve a sensing task – promise to revolutionize the way we live, work and interact with the physical environment. Fundamental to such coordination is localization, or the ability to establish spatial relationships among such devices. However, existing localization systems such as GPS do not always meet the operational (for example, low power), environmental (for example, indoors) or cost constraints of such systems.

In this talk, I will describe the challenges for localization in very large, ad hoc deployed sensor networks. One approach to localizing small devices is to rely on a system of beacons, nodes that are position-aware by virtue of being pre-positioned, GPS-enabled or endowed with more sophisticated hardware. I will make the case that in unattended sensor networks, instead of relying on extensive pre-configuration, these beacon systems must self-configure, i.e., autonomously adapt to the dynamics of their environmental setting and the availability of other beacons.

I will first quantitatively analyze the impact of beacon density on the quality of localization, and show that localization saturates at a certain beacon density. I will then describe the design of two self-configuring algorithms that build on this observation, to improve localization quality by adding new beacons at low densities, or to increase system lifetime by rotating functionality amongst redundant beacons at high beacon densities. The talk will include performance results from simulations as well as experimental results from an implementation of an RF-proximity-based localization system on a variety of wireless testbeds, including tiny devices known as Rene motes, developed at U.C. Berkeley.
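
As a minimal sketch of the RF-proximity idea (an illustration under simplifying assumptions, not the exact system implemented): a node that hears some set of position-aware beacons can estimate its own location as their centroid.

    def centroid_estimate(position_of, heard_beacons):
        """Localize a node as the centroid of the (position-aware)
        beacons whose RF transmissions it can hear."""
        xs = [position_of[b][0] for b in heard_beacons]
        ys = [position_of[b][1] for b in heard_beacons]
        return (sum(xs) / len(xs), sum(ys) / len(ys))

    # Hypothetical beacon layout; a node hearing b1 and b2 lands midway.
    position_of = {'b1': (0.0, 0.0), 'b2': (10.0, 0.0), 'b3': (5.0, 10.0)}
    print(centroid_estimate(position_of, ['b1', 'b2']))   # -> (5.0, 0.0)

Under such a scheme, the quality of the estimate depends directly on how many beacons are within range, which is why beacon density is the natural parameter to study.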

March 14, 2002

Given a string, a pattern is a repeating substring possibly with some “don’t care” characters. The problem of pattern discovery is that of finding all such substrings that appear at least $$k$$ times. Discovering such patterns in nucleic and amino acid sequences is of great interest in computational biology. If we do not have a model for evaluating the significance of the patterns a priori, it is important that all the patterns be detected. However, the number of such patterns could be exponential in the size of the input. In this talk, I will describe a notion of redundancy which gives rise to a provably small number of irredundant patterns, while preserving the information content of the set of all patterns. I will further discuss how this helps in designing a very efficient pattern discovery algorithm.
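
As a minimal illustration (a naive check, not the algorithm from the talk), the following Python snippet tests whether a single candidate pattern with '.' don't-care positions occurs at least $$k$$ times; discovery must consider all such candidates, which is why the output can be exponential without a notion such as irredundancy.

    def occurs_at_least_k(pattern, text, k):
        """Check whether a pattern with '.' don't-care positions occurs
        at least k times in text (occurrences may overlap)."""
        hits = sum(1 for i in range(len(text) - len(pattern) + 1)
                   if all(p == '.' or p == c
                          for p, c in zip(pattern, text[i:i + len(pattern)])))
        return hits >= k

    print(occurs_at_least_k("A.C", "ABCAXCADC", 3))   # True: ABC, AXC, ADC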

There are several generalizations of the pattern discovery problem which are important in computational biology, and we will discuss one such generalization. We will conclude the talk by presenting several concrete examples of the application of pattern discovery in biology.

March 15, 2002

Until about five years ago, hardware-accelerated, interactive computer graphics and procedural shading were considered opposite ends of the speed/realism spectrum. Rendering hardware uses parallel computational resources to render tens of millions of triangles per second and write billions of pixels per second. Its uses include computer-aided design, flight simulators, games, medical and scientific visualization, and virtual reality. Procedural shading allows a programmer or artist to describe the color and shading of a surface through a function in a special-purpose high-level “shading language”. Its uses in software renderers have included commercials and film. Since procedural shaders are literally short programs, they are known for their detail and complexity and their ability to easily produce an infinite array of similar, yet subtly differing surfaces.

Recent advances in graphics hardware have brought us to a point where simple procedural shaders can be rendered interactively (though we still have yet to reach the often-stated goal of “interactive Toy Story”). This talk describes two interactive shading language systems. The first is PixelFlow, a large-scale graphics hardware project that was completed in 1997 at the University of North Carolina at Chapel Hill. PixelFlow was the first graphics hardware system with a high-level shading language, and compiled shading code into operations for a custom SIMD processor array. The second is OpenGL Shader, a software product first released in 2000 by SGI. OpenGL Shader compiles shading language code into multiple rendering passes on existing graphics hardware, using either standard blending operations or recent low-level hardware shading extensions to perform the basic operations of the shading code. The talk will include examples of the shading code and results of both systems as well as details of the graphics hardware architectures and compilers behind them.
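
To give the flavor of procedural shading code, here is a minimal stripe shader, written in Python for illustration rather than in an actual shading language: the color is computed from surface coordinates by a short program, so the pattern never runs out of resolution and variants are one parameter change away.

    import math

    def stripe_shader(u, v, width=0.1):
        """Procedural stripes: color depends only on surface
        coordinates, so no stored texture is needed and the pattern
        has unbounded resolution."""
        t = 0.5 + 0.5 * math.sin(u / width)          # smooth stripe profile
        dark, light = (0.1, 0.1, 0.4), (0.9, 0.9, 1.0)
        return tuple(d + t * (l - d) for d, l in zip(dark, light))

    print(stripe_shader(0.0, 0.0))   # color at one surface point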

March 25, 2002

Communication networks have witnessed phenomenal growth in recent years. The issues raised by the introduction of new technology and protocols have led to problems which are both theoretically fundamental and relevant to current applications. In this talk, I shall focus on two key themes, namely, virtual private networks (VPN) and the Border Gateway Protocol (BGP).

VPNs provide customers with predictable and secure network connections over a shared network. Provisioning a VPN over a set of terminals gives rise to the following general network design problem: we have bounds on the cumulative amount of traffic each terminal can send and receive; we must reserve enough bandwidth on the network so that any pair-wise traffic matrix consistent with these bounds can be routed.
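
One common way to make this precise is the so-called hose model (stated here as standard background, not necessarily the talk's exact formulation): each terminal $$v$$ comes with thresholds $$b_{out}(v)$$ and $$b_{in}(v)$$; a traffic matrix $$(d_{uv})$$ is valid if $$\sum_u d_{vu} \le b_{out}(v)$$ and $$\sum_u d_{uv} \le b_{in}(v)$$ for every terminal $$v$$; and the reserved bandwidth must suffice to route every valid matrix.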

BGP is the standard protocol for exchanging information between border routers of autonomous systems (AS) in the Internet. Within an AS, border routers exchange externally learned route information. Naive solutions for this peering between the border routers simply cannot scale to the sizes of modern AS networks. Carefully designed route reflector configurations can drastically reduce the communication costs needed by such peering sessions. We give the first algorithmic approach towards designing such configurations.

We show connections of these problems with facility location problems and give efficient algorithms with provable performance guarantees.

March 26, 2002

A large number of fundamental algorithmic problems involve optimization in finite metric spaces. Hence, recent years have seen much work on understanding the structure of such spaces, and particularly on finding meaningful ways to represent metrics so as to make them easy to work with, while retaining much of their inherent structure.

In this talk, I will discuss different ways to embed finite metrics into trees, and how these embeddings can be used to solve a variety of problems. In particular, I will illustrate how tree embeddings can be used for emulating multicasts by unicasts, in building modular network routing algorithms, and in algorithms for graph partitioning and other optimization problems.
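
For concreteness, one standard notion in this area (given here as background; the talk may use other embeddings as well) is a probabilistic embedding into trees: a distribution over trees such that $$d_T(u,v) \ge d(u,v)$$ for every tree $$T$$ in the support, and $$E_T[d_T(u,v)] \le \alpha \cdot d(u,v)$$ for every pair $$u,v$$. An algorithm that is simple on trees can then be run on a sampled tree, losing only an expected factor of $$\alpha$$ in the objective.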

March 28, 2002

The widespread deployment of inexpensive communication technologies, computational resources in the networking infrastructure and network-capable end devices offers a rich design space for novel distributed applications and services. Exploration of this space has given rise, for instance, to the notions of grid and peer-to-peer computing. Both technologies promise to change the way we think about and use computing by harvesting geographically distributed resources to create a universal source of pervasive computing power that will support new classes of applications.

Despite the growing interest in these environments and the increasing availability of the necessary hardware and network infrastructure, few actual applications are readily available or widely deployed. This scarcity results from a number of technical challenges that must be addressed before the full potential of these technologies can be realized. Most of these applications, as well as the services they utilize, are expected to handle dynamically varying demand on resources and to run in large, heterogeneous, and dynamic environments, where the availability of resources cannot be guaranteed a priori — all of this while providing acceptable levels of performance.

In this talk, I will present “Active Streams”, a novel middleware approach for building adaptive distributed systems aimed at such environments. I will describe the design and implementation of its supporting framework and present some experimental results that illustrate its use and demonstrate its performance and flexibility.

April 5, 2002

This talk will survey a variety of uses of modeling and simulation in understanding biological systems. The first part of the talk will cover applications of the local rules dynamics model, a simulation model which combined principles from the earlier local rules model of Berger et al. with techniques from molecular dynamics. Both models were originally developed to study virus protein shell self-assembly, although they were later applied to several other self-assembly systems including protein aggregates and amyloids. The second part of the talk will examine the use of a discrete lattice simulation model for studying sequence predictors of protein aggregate formation. It will also look at a statistical model of amino acid word distributions applied to protein sequence databases in order to confirm a prediction derived from the lattice model. The third part of the talk will examine a graph theoretic model and some associated combinatorial optimization problems formulated to aid in understanding the haplotype structure of sequenced genomes. For each type of system covered, we will examine the chosen methods of model building, the motivations for selecting them, and their specific applications.

April 11, 2002

Motion planning is concerned with computing obstacle-avoiding motion for objects in an environment populated with obstacles (e.g., robot manipulators on an assembly line or drug molecules inside the binding cavity of a protein). It has applications in areas as disparate as robotics, computational biology, animation, and medical surgery planning. The key difficulty in designing general motion planning algorithms lies in the number of degrees of freedom (dofs) that an object has.

In this talk, I will present efficient randomized algorithms for computing obstacle-avoiding motion for objects with many dofs in complex geometric environments. I will show extensions of the basic algorithms to deal with kinematic and dynamic motion constraints, as well as moving obstacles in the environment. The generality and efficiency of these algorithms make them useful in applications such as robotics and computer animation. Using insights gained from randomized motion planning, we developed Stochastic Roadmap Simulation (SRS), a new framework for simulating and analyzing molecular motion. SRS simultaneously examines many molecular motion pathways encoded compactly in a graph and thus achieves tremendous improvement in computational efficiency, compared with classic techniques such as the Monte Carlo method.
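
For context, a common starting point for randomized motion planning is the probabilistic roadmap idea; the sketch below (in Python, with hypothetical sample_free and collision_free_path callbacks standing in for a real collision checker) illustrates that general family, not the specific algorithms of the talk.

    import math, random

    def build_prm(num_samples, radius, sample_free, collision_free_path):
        """Probabilistic-roadmap sketch: sample collision-free
        configurations, then connect nearby pairs whose straight-line
        motion is collision-free; queries search the resulting graph."""
        nodes = [sample_free() for _ in range(num_samples)]
        edges = {i: [] for i in range(num_samples)}
        for i in range(num_samples):
            for j in range(i + 1, num_samples):
                if (math.dist(nodes[i], nodes[j]) < radius
                        and collision_free_path(nodes[i], nodes[j])):
                    edges[i].append(j)
                    edges[j].append(i)
        return nodes, edges

    # Toy instance: a 2-D configuration space with no obstacles, so the
    # path checker trivially succeeds.
    nodes, edges = build_prm(
        50, 0.3,
        sample_free=lambda: (random.random(), random.random()),
        collision_free_path=lambda a, b: True)

A roadmap of this kind is also the sort of graph of motion pathways that Stochastic Roadmap Simulation examines simultaneously rather than one trajectory at a time.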

April 19, 2002

Chip multi-processors (CMP), where a processor core is replicated multiple times on a single chip, are widely recognized as a means to exploit the exponentially growing transistor budgets predicted by Moore’s Law, while respecting design complexity and physical constraints. While some workloads (e.g., server, scientific) already exhibit sufficient thread-level parallelism, most software does not. For these other programs, the additional complexity of manual parallelization is unattractive, and traditional automatic parallelization techniques have yet to be widely adopted.

In this talk, I will present two hardware/software techniques to exploit thread-parallel resources to improve sequential program performance. The first technique uses “helper threads” to avoid stalls (due to cache misses and branch mispredictions) that would be incurred by a single-threaded execution of the program. The second technique uses a “master processor” to compute future checkpoints of program state to achieve a (speculative) parallel execution of the program. Both techniques exploit a common theme: the use of approximate versions of code to generate accurate value predictions.

April 22, 2002

In the more than 30 years that have passed since the first node of the ARPAnet was installed at UCLA, the Internet has grown into a global infrastructure. However, this infrastructure provides only low-level primitives such as packet forwarding and end-to-end reliable delivery. What is still missing from the Internet architecture today is high-level network utilities. The difference between network utilities and network applications is that utilities are more general than applications and can be used as building blocks to enable the creation of new applications.

In this talk I will describe the work I have done in the past two years building such a network utility as one of the co-founders and system architects at Silvan Networks. As part of my talk I will present the architecture of the proposed application-level overlay network. This network exposes high-level primitives upon which applications can easily be built. The primitives offered by this overlay network include the automatic creation of a mesh from a number of network nodes, the capability for a node to announce the existence of certain content, the capability to route towards that content, the capability to replicate content according to demand patterns, and the capability to control where content is replicated according to policy rules provided by content owners. I will also present PDQ (Protocol for Directing Queries), a novel content routing protocol that lies at the center of the application-level overlay network. In addition to content routing, PDQ is responsible for delivering content invalidation signals when different resources are updated and for collecting accounting information about the use of resources throughout the network.

The first application built on top of the Silvan application network was a Content Delivery Network (CDN) for the distribution of static web pages. Our measurements have shown that the performance of the developed CDN is on par with, and in some cases better than, the best commercial service, while the size of our deployed system is two to three orders of magnitude smaller than that of the comparable commercial service.

A comparison of the described architecture with some of the proposed peer-to-peer schemes will be provided. Finally, I will outline some of the future work I plan to do in this area, including Internet-wide storage, collaboration, and computation utilities.

This is joint work with Lixia Zhang and Adam Rosenstein.

May 2, 2002

This talk introduces and explores a new approach to solving the consensus problem in asynchronous systems, which we call the “Condition-Based” approach. It consists of identifying sets of input vectors for which it is possible to design a protocol that solves consensus despite the occurrence of process crashes.

The first main result is a generic consensus protocol for the shared-memory model. It always guarantees agreement, and also guarantees termination (at least) when the inputs satisfy the condition with which the protocol has been instantiated, or when there is no crash. Then, two particular realistic conditions are investigated. A second main result is a general characterization that captures the set of all conditions that allow consensus to be solved, proving that the protocol works in all these cases. An efficient version of the generic protocol is then designed for the message-passing model. It is also shown how protocol safety can be traded for liveness. The theory that underlies the condition-based approach is also investigated.
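
To make the notion of a condition concrete, one example studied in this line of work (stated here from the literature as an illustration, not necessarily one of the talk's two conditions) accepts input vectors whose largest proposed value appears in more than $$f$$ entries, where $$f$$ bounds the number of crashes. A minimal Python sketch of that acceptance test:

    def satisfies_condition(input_vector, f):
        """Example condition from the condition-based literature:
        accept vectors whose maximum value occurs in more than f
        entries (f is the bound on the number of process crashes)."""
        return input_vector.count(max(input_vector)) > f

    print(satisfies_condition([3, 3, 1, 3], 2))   # True: 3 appears 3 > 2 times

Intuitively, if the maximum survives in enough entries, every process can still observe it despite up to $$f$$ missing inputs, which is what makes termination possible for vectors in the condition.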

Speaker Biography: Michel Raynal received his “doctorat d’Etat” in computer science from the University of Rennes, France, in 1981. In 1981 he moved to Brest (France), where he founded the computer science department of a telecommunications engineering school (ENST). In 1984 he moved to the University of Rennes, where he has since been a professor of computer science. At IRISA (the CNRS-INRIA-University joint computing research laboratory located in Rennes), he is the founder and leader of the research group “Distributed Algorithms and Protocols”. His research interests include distributed algorithms, operating systems, computing systems, protocols and dependability.

Professor Michel Raynal has published more than 70 papers in journals and more than 140 papers in conferences. He has written seven books devoted to parallelism, distributed algorithms and systems. Michel Raynal has been invited to serve on program committees for more than 50 international conferences. In 1989 and 1995 he served as the program chair of the WDAG/DISC symposium on Distributed Algorithms. In 1999 he served as the general chair of the IEEE Conference on Object-Oriented Real-time Distributed Computing (ISORC’99). He is the program co-chair of the IEEE International Conference on Distributed Computing Systems (ICDCS’02), which will be held in Vienna (Austria), and the conference co-chair of ISORC’02, which will be held in Washington DC (April 2002).