We typically have seminars on Wednesday at noon in Malone 228. All seminar announcements will be sent to the theory mailing list.

Speaker: Rico Zenklusen

Affiliation: ETH Zurich and Johns Hopkins University

Title: Multi-Budgeted Matchings via the Ham Sandwich Theorem

Abstract:

In many applications, one has to deal with multiple, partially conflicting constraints. In this talk, we consider a multi-objective variant of the maximum weight matching problem, which is a classical combinatorial optimization problem with numerous applications. A natural way to deal with several objectives is to turn all of the objectives but one into budget constraints. This leads to the multi-budgeted matching problem, which asks for a maximum weight matching subject to k linear constraints with nonnegative coefficients. Whereas this problem can easily be shown to be NP-hard even for k=1, I will present in this talk a polynomial-time approximation scheme that works for any constant k. Our algorithm is based on rounding an optimal solution x* of an LP relaxation. Starting with a convex decomposition of x* into few matchings, we reduce the problem of rounding x* to an arguably simpler problem of successively merging two matchings in the convex decomposition of x*.

To prove that our algorithm is correct, we leverage two beautiful non-constructive mathematical theorems. More precisely, the Jordan Curve Theorem gives a concise and intuitive proof of why our algorithm works for k=1, and a result of Stromquist and Woodall that follows from the Ham Sandwich Theorem allows us to show correctness for any constant k.

Part of this work is joint with Fabrizio Grandoni, R. Ravi and Mohit Singh.

Speaker: Rong Ge

Affiliation: Microsoft Research New England

Title: New Algorithms for Learning Incoherent and Overcomplete Dictionaries

Abstract:

In sparse recovery we are given a matrix A (“the dictionary”) and a vector of the form AX where X is sparse, and the goal is to recover X. This is a central notion in signal processing, statistics and machine learning. But in applications such as sparse coding, the dictionary A is unknown and has to be learned from random examples of the form Y = AX where X is drawn from an appropriate distribution; this is the dictionary learning problem. In most settings, A is overcomplete: it has more columns than rows. This talk presents a polynomial-time algorithm for learning overcomplete dictionaries. Our algorithm applies to incoherent dictionaries, which have been a central object of study since they were introduced in the seminal work of Donoho and Huo.
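As a rough illustration of the objects in this abstract (not the algorithm from the talk), the sketch below builds a toy overcomplete dictionary, forms Y = AX for a sparse X, and computes the mutual coherence, i.e. the largest absolute cosine between two distinct columns, which is the usual way incoherence is measured. The names `mutual_coherence`, `apply_dictionary`, `A_COLUMNS` and `X_SPARSE` are hypothetical and chosen only for this example.

```python
import math


def mutual_coherence(columns):
    """Largest absolute cosine between two distinct dictionary columns."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            u, v = columns[i], columns[j]
            cos = abs(dot(u, v)) / math.sqrt(dot(u, u) * dot(v, v))
            best = max(best, cos)
    return best


def apply_dictionary(columns, x):
    """Compute Y = A X, where A is stored as a list of its columns."""
    dim = len(columns[0])
    y = [0.0] * dim
    for col, coeff in zip(columns, x):
        if coeff != 0:  # only the nonzero entries of the sparse X contribute
            for r in range(dim):
                y[r] += coeff * col[r]
    return y


# Toy overcomplete dictionary (assumed for illustration): 3 columns, 2 rows.
A_COLUMNS = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
X_SPARSE = [2.0, 0.0, -1.0]  # two nonzero entries out of three

Y = apply_dictionary(A_COLUMNS, X_SPARSE)
COHERENCE = mutual_coherence(A_COLUMNS)
```

A dictionary is called incoherent when this quantity is small. The toy dictionary here is deliberately tiny and not especially incoherent.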

Based on joint work with Sanjeev Arora, Tengyu Ma and Ankur Moitra.

Bio:

Rong Ge is currently a post-doc at Microsoft Research, New England. He received his Ph.D. from Princeton University, advised by Prof. Sanjeev Arora. His main research interest is in applying algorithm design techniques from theoretical computer science to machine learning problems, with the hope of obtaining provable algorithms and a better understanding of machine learning models.

Title: Correctness Protection via Differential Privacy

Speaker: Aaron Roth

Affiliation: UPenn

Abstract:

False discovery is a growing problem in scientific research. Despite sophisticated statistical techniques for controlling the false discovery rate and related statistics designed to protect against spurious discoveries, there is significant evidence that many published scientific papers contain incorrect conclusions.

In this talk we consider the role that adaptivity has in this problem. A fundamental disconnect between the theorems that control false discovery rate and the practice of science is that the theorems assume a fixed collection of hypotheses to be tested, selected non-adaptively before the data is gathered, whereas science is by definition an adaptive process, in which data is shared and re-used, while hypotheses are generated after seeing the results of previous tests.

We note that false discovery cannot be prevented when a substantial number of adaptive queries are made to the data and the data is used naively, i.e., when queries are answered exactly with their empirical estimates on a given finite data set. However, we show that, remarkably, there is a different way to evaluate statistical queries on a data set that allows even an adaptive analyst to make exponentially many queries to the data set, while guaranteeing that with high probability, all of the conclusions he draws generalize to the underlying distribution. Counter-intuitively, this technique involves actively perturbing the answers given to the data analyst, using techniques developed for privacy preservation; in our application, however, the perturbations are added entirely to increase the utility of the data.
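To make the contrast between exact and perturbed answers concrete, here is a minimal sketch. The abstract does not say which perturbation is used, so the Laplace mechanism, a standard differential privacy tool, is assumed purely for illustration. The function names and the toy data set are hypothetical.

```python
import random


def empirical_answer(data, predicate):
    """Exact empirical mean of a 0/1 predicate (the 'naive' answer)."""
    return sum(1 for row in data if predicate(row)) / len(data)


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturbed_answer(data, predicate, epsilon, rng):
    """Empirical mean plus Laplace noise.

    Changing one row moves the mean of a 0/1 predicate by at most 1/n,
    so noise of scale 1 / (n * epsilon) is the usual Laplace-mechanism
    calibration (an assumption here, not taken from the abstract).
    """
    scale = 1.0 / (len(data) * epsilon)
    return empirical_answer(data, predicate) + laplace_noise(scale, rng)


rng = random.Random(0)   # seeded only to make the sketch reproducible
data = list(range(10))   # toy data set


def below_five(v):
    return v < 5


exact = empirical_answer(data, below_five)
noisy = perturbed_answer(data, below_five, epsilon=0.5, rng=rng)
```

Smaller values of `epsilon` add more noise to each answer. The guarantee described in the abstract concerns how such perturbed answers generalize under adaptive querying, which this sketch does not attempt to demonstrate.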

Joint work with Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, and Omer Reingold.

Speaker: Guy Kortsarz

Affiliation: Rutgers University-Camden

Title: What have we learned about cut, expansion and density problems

Abstract: I will survey several problems related to the above subjects: directed and undirected multicut (for directed multicut I will show the approximation algorithm of Gupta), conductance (and sparsest cut), overlapping and non-overlapping clustering, the small set expansion conjecture and its equivalence to breaking the ratio of 2 for the minimum partial vertex cover problem, the densest subgraph problem, and the dense k-subgraph problem.

Speaker: Robert Krauthgamer

Affiliation: Weizmann Institute of Science

Title: The Sketching Complexity of Graph Cuts

Abstract:

We study the problem of sketching an input graph $G$ on $n$ vertices, so that given the sketch, one can estimate the weight (capacity) of any cut in the graph within a small approximation factor $1+\epsilon$. The classic cut-sparsifier construction of Benczur and Karger (STOC 1996) implies a sketch of size $\tilde O(n/\epsilon^2)$ [this notation hides logarithmic factors].

We show that the dependence on $\epsilon$ can be brought down to only linear, at the price of achieving a slightly relaxed guarantee. Specifically, we design a randomized scheme that produces from $G$ a sketch of size $\tilde O(n/\epsilon)$ bits, from which the weight of any cut $(S,\bar S)$ can be reported, with high probability, within factor $1+\epsilon$. We also demonstrate some applications where this “for each” guarantee is indeed useful.

We further prove that our relaxation is necessary. Specifically, a sketch that can $(1+\epsilon)$-approximate the weight of all cuts in the graph simultaneously (i.e., a “for all” guarantee) must be of size $\Omega(n/\epsilon^2)$ bits.

Joint work with Alexandr Andoni and David Woodruff.

Speaker: Calvin Newport

Affiliation: Georgetown University

Title: Radio Network Lower Bounds Made Easy

Speaker: Amitabh Basu

Affiliation: JHU

Title: Cutting Planes and Geometry of Numbers

Abstract: We survey some recent results in cutting plane theory for integer programming. Cutting planes give a way to reduce the search space for the optimal solution in an integer optimization problem. The results we will present are very recent connections between cutting planes and covering/tiling properties of subsets of Euclidean sets. Important structural information about cutting planes can be translated to geometric questions like: does a particular compact subset B of R^n cover all of R^n when we consider all of its translates by integer vectors? This connects to very classical problems in the geometry of numbers, and deep theorems like the Venkov-Alexandrov-McMullen theorem on tilings, as well as the geometry of zonotopes, can be leveraged. Research in this area of integer optimization is very much work in progress; we will close the presentation with an invitation to join our quest with some open problems.

Speaker: Grigory Yaroslavtsev

Affiliation: University of Pennsylvania

Title: Parallel Algorithms for Geometric Graph Problems

Abstract:

I will describe algorithms for geometric graph problems in the modern parallel models inspired by MapReduce. The talk will be self-contained, including a formal introduction of the modern theoretical computational models used to capture computations in massively parallel “MapReduce”-like systems. It will also include a sample of major open problems in the area.

For example, for the Minimum Spanning Tree (MST) problem over a set of points in two-dimensional space, our algorithm computes an approximate MST. Our algorithms work in a constant number of rounds of communication, while using total space and communication proportional to the size of the data (linear space and near-linear time algorithms).

I will also show how the main ideas from the MST algorithm can be captured within a general “Solve-and-Sketch” algorithmic framework that we develop. Besides MST, it also applies to the approximate Earth-Mover Distance (EMD) and the transportation cost problem. Algorithms designed in the “Solve-and-Sketch” framework have implications which go beyond parallel models. In particular, our work implies new near-linear time algorithms for EMD cost and transportation cost in the plane. Other implications include algorithms in the streaming with sorting model.

Joint work with Alexandr Andoni, Krzysztof Onak and Aleksandar Nikolov.

Speaker: Michael Dinitz

Affiliation: Johns Hopkins University

Title: Approximating Graph Spanners

Abstract:

Graph spanners (subgraphs which approximately preserve distances) have been studied extensively since the 1980s. Many of the known results are about the optimal tradeoffs between various parameters, particularly the stretch and size of the spanner. But there has been some recent progress on a different and less developed line of research: fixing the allowable stretch, and optimizing the size. This turns spanners into more of a computational problem, and allows us to use many of the standard techniques from approximation algorithms (convex relaxations in particular). In this talk we will give an overview of some of the progress in this area, its limitations, and some possible future directions.
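For readers new to the definition, the following sketch checks whether a subgraph is a spanner with a given stretch in an unweighted, undirected graph. It only illustrates the definition, not the approximation algorithms discussed in the talk. It assumes `sub_edges` is a subset of `graph_edges` and relies on the standard fact that checking the stretch on every edge of the original graph suffices for all pairs of vertices. All names are hypothetical.

```python
from collections import deque


def adjacency(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every vertex reachable in adj."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_spanner(graph_edges, sub_edges, stretch):
    """True if each graph edge (u, v) has subgraph distance <= stretch."""
    sub_adj = adjacency(sub_edges)
    for u, v in graph_edges:
        dist = bfs_distances(sub_adj, u)
        if dist.get(v, float("inf")) > stretch:
            return False
    return True


# Toy example: a 4-cycle and the path obtained by dropping one edge.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
path = cycle[:-1]
```

Here `is_spanner(cycle, path, 3)` holds because the dropped edge (3, 0) is replaced by a path of three hops, while a stretch of 2 is not achieved. The optimization problem in the abstract fixes the stretch and asks for a subgraph with as few edges as possible.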