Speaker: Jalaj Upadhyay

Affiliation: Johns Hopkins University

Title: Towards Robust and Scalable Private Data Analysis

Abstract:

In the current age of big data, we are constantly creating new data that is analyzed by various platforms to improve service and the user experience. Given the sensitive and confidential nature of these data, there are obvious security and privacy concerns in storing and analyzing them. In this talk, I will discuss the fundamental challenges in providing robust security and privacy guarantees while storing and analyzing large data. I will also give a brief overview of my contributions and future plans towards addressing these challenges.

To give a glimpse of these challenges in providing a robust privacy guarantee known as differential privacy, I will use spectral sparsification of graphs as an example. Given the ubiquitous nature of graphs, differentially private analysis on graphs has gained a lot of interest. However, existing algorithms for these analyses are tailor-made for the task at hand, making them infeasible in practice. In this talk, I will present a novel differentially private algorithm that outputs a spectral sparsification of the input graph. At the core of this algorithm is a method to privately estimate the importance of an edge in the graph. Prior to this work, there was no known privacy-preserving method that provides such an estimate or spectral sparsification of graphs.
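The abstract does not specify the private estimation mechanism, so as generic background only: a single numeric statistic such as an edge-importance score can be released with differential privacy via the standard Laplace mechanism. A minimal sketch, with illustrative function names and an assumed sensitivity parameter (neither is from the talk):

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Draw one sample from the Laplace(0, scale) distribution via inverse CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def private_edge_importance(importance: float, sensitivity: float, eps: float) -> float:
    """Release an edge-importance score with eps-differential privacy by
    adding Laplace noise calibrated to the score's sensitivity."""
    return importance + laplace_noise(sensitivity / eps)
```

The released value is unbiased; averaging many independent releases concentrates around the true score.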

Since many graph properties are defined by the spectrum of the graph, this work has many analytical as well as learning-theoretic applications. To demonstrate some applications, I will show more efficient and accurate analysis of various combinatorial problems on graphs and the first technique to perform privacy-preserving manifold learning on graphs.

DTSTART;TZID=America/New_York:20190206T120000 DTEND;TZID=America/New_York:20190206T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Jalaj Upadhyay URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-jalaj-upadhyay-3/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-243@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Martin Farach-Colton

Affiliation: Rutgers University

Title: TBA

Abstract: TBA

DTSTART;TZID=America/New_York:20190213T120000 DTEND;TZID=America/New_York:20190213T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Martin Farach-Colton URL:https://www.cs.jhu.edu/~mdinitz/theory/event/martin-farach-colton/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-272@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Xue Chen

Affiliation: Northwestern University

Title: Active Regression via Linear-Sample Sparsification

Abstract: We present an approach that improves the sample complexity for a variety of curve-fitting problems, including active learning for linear regression, polynomial regression, and continuous sparse Fourier transforms. In the active linear regression problem, one would like to estimate the least squares solution \beta^* minimizing ||X \beta - y||_2 given the entire unlabeled dataset X \in \R^{n \times d} but only observing a small number of labels y_i. We show that O(d/\eps) labels suffice to find an \eps-approximation \wt{\beta} to \beta^*:


DTSTART;TZID=America/New_York:20191011T123000 DTEND;TZID=America/New_York:20191011T133000 SEQUENCE:0 SUMMARY:[Theory Seminar] Christopher Musco URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-christopher-musco/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-296@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:

DTSTART;TZID=America/New_York:20191106T120000
DTEND;TZID=America/New_York:20191106T130000
SEQUENCE:0
SUMMARY:[Theory Seminar] Jiapeng Zhang
URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-jiapeng-zhang/
X-COST-TYPE:free
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-310@www.cs.jhu.edu/~mdinitz/theory
DTSTAMP:20220124T035552Z
CATEGORIES:
CONTACT:
DESCRIPTION:E[||X \wt{\beta} - X\beta^*||_2^2] \leq \eps ||X \beta^* - y||_2^2.

This improves on the best previous result of O(d \log d + d/\eps) from leverage score sampling. We also present results for the *inductive* setting, showing when \wt{\beta} will generalize to fresh samples; these apply to continuous settings such as polynomial regression. Finally, we show how the techniques yield improved results for the non-linear sparse Fourier transform setting.
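As a concrete point of reference for the sampling paradigm the abstract compares against, the sketch below queries labels proportional to statistical leverage scores and solves the reweighted least-squares problem. This is a naive reference implementation of the general leverage-score approach, not the paper's O(d/\eps) algorithm:

```python
import numpy as np

def leverage_scores(X):
    """tau_i = x_i^T (X^T X)^+ x_i, computed from the row norms of the
    left singular vectors of X."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return (U ** 2).sum(axis=1)

def active_least_squares(X, y, m, rng):
    """Query only m labels y_i, sampled proportional to leverage scores,
    and solve the importance-reweighted least-squares problem."""
    tau = leverage_scores(X)
    p = tau / tau.sum()
    idx = rng.choice(len(X), size=m, replace=True, p=p)
    w = 1.0 / np.sqrt(m * p[idx])  # importance weights for unbiasedness
    beta, *_ = np.linalg.lstsq(w[:, None] * X[idx], w * y[idx], rcond=None)
    return beta
```

On noiseless data the sampled system is consistent, so the estimate recovers \beta^* exactly.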


Bio: Xue Chen is broadly interested in randomized algorithms and the use of randomness in computation. Specific areas include Fourier transforms, learning theory and optimization, and pseudorandomness. He obtained his Ph.D. at the University of Texas at Austin, under the supervision of David Zuckerman. Currently, he is a postdoctoral fellow at Northwestern University.

DTSTART;TZID=America/New_York:20190227T120000 DTEND;TZID=America/New_York:20190227T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Xue Chen URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-xue-chen/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-282@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Rediet Abebe

Affiliation: Cornell University

Title: Using Search Queries to Understand Health Information Needs in Africa

Abstract:

Access to healthcare and health information is of major global concern. The stark inequality in the availability of health data by country, demographic group, and socioeconomic status impedes the identification of major public health concerns and the implementation of effective interventions. This data gap ranges from basic disease statistics, such as disease prevalence rates, to more nuanced information, such as public attitudes. A key challenge is understanding the health information needs of under-served and marginalized communities. Without understanding people’s everyday needs, concerns, and misconceptions, health organizations lack the ability to effectively target education and programming efforts.

In this presentation, we focus on the lack of comprehensive, high-quality data about the information needs of individuals in developing nations. We propose an approach that uses search data to uncover the health information needs of individuals in all 54 nations in Africa. We analyze Bing searches related to HIV/AIDS, malaria, and tuberculosis; these searches reveal diverse health information needs that vary by demographic group and geographic region. We also shed light on discrepancies in the quality of content returned by search engines.

We conclude with a discussion of computationally-informed interventions, both on- and off-line, in health and related domains, and of the Mechanism Design for Social Good research initiative.

Bio:

Rediet Abebe is a computer scientist with a strong interest in the promotion of equality and justice. Her research focuses on algorithms, AI, and their applications to social good. As part of this research agenda, she co-founded and co-organizes Mechanism Design for Social Good (MD4SG), an interdisciplinary, multi-institutional research initiative with over 300 individuals. She is also a co-founder and co-organizer of Black in AI, an international network of over 1000 individuals focused on increasing the presence and inclusion of Black and African researchers in AI. Her research is deeply influenced by her upbringing in her hometown of Addis Ababa, Ethiopia, where she lived until moving to the U.S. in 2009. Her work has been generously supported by fellowships and scholarships through Facebook, Google, the Cornell Graduate School, and the Harvard-Cambridge Fellowship.

Speaker: Grigory Yaroslavtsev

Affiliation: Indiana University, Bloomington

Title: Advances in Hierarchical Clustering for Vector Data

Abstract:

Compared to the highly successful flat clustering (e.g., k-means), and despite its important role and applications in data analysis, hierarchical clustering lacked rigorous algorithmic study until recently, due to the absence of rigorous objectives. Since 2016, a sequence of works has emerged that gives novel algorithms for this problem in the general metric setting. This was enabled by a breakthrough of Dasgupta, who introduced a formal objective into the study of hierarchical clustering.

In this talk I will give an overview of our recent progress on models and scalable algorithms for hierarchical clustering applicable specifically to high-dimensional vector data. I will first discuss various linkage-based algorithms (single-linkage, average-linkage) and their formal properties with respect to various objectives. I will then introduce a new projection-based approximation algorithm for vector data. The talk will be self-contained and doesn’t assume prior knowledge of clustering methods.

Based on joint works with Vadapalli (ICML’18) and with Charikar, Chatziafratis and Niazadeh (AISTATS’19).
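For readers unfamiliar with the linkage methods mentioned above, here is a minimal, quadratic-time sketch of average-linkage agglomerative clustering. It is a naive reference version for intuition only, not the talk's scalable projection-based algorithm:

```python
def average_linkage(points, k, dist):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest average pairwise distance until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(dist(a, b) for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters
```

Single-linkage is the same loop with `min` over pairwise distances in place of the average.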

DTSTART;TZID=America/New_York:20190306T120000 DTEND;TZID=America/New_York:20190306T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Grigory Yaroslavtsev URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-grigory-yaroslavtsev/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-286@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Arka Rai Choudhury

Affiliation: Johns Hopkins University

Title: Finding a Nash Equilibrium is No Easier than Breaking Fiat-Shamir

Abstract:

The Fiat-Shamir heuristic transforms a public-coin interactive proof into a non-interactive argument, by replacing the verifier with a cryptographic hash function that is applied to the protocol’s transcript. Constructing hash functions for which this transformation is sound is a central and long-standing open question in cryptography.
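To make the heuristic concrete: in code, the verifier's random challenge is simply replaced by a hash of the transcript so far. A toy sketch, with SHA-256 standing in for the hash family (exactly the component whose soundness is the open question):

```python
import hashlib

def fiat_shamir_challenge(transcript: bytes, num_challenges: int, modulus: int):
    """Derive verifier challenges deterministically by hashing the
    transcript, replacing the interactive verifier's random coins."""
    out = []
    for i in range(num_challenges):
        digest = hashlib.sha256(transcript + i.to_bytes(4, "big")).digest()
        out.append(int.from_bytes(digest, "big") % modulus)
    return out
```

Because the challenges are a function of the transcript, anyone can recompute and verify them without interaction.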

We show that solving the END-OF-METERED-LINE problem is no easier than breaking the soundness of the Fiat-Shamir transformation when applied to the sumcheck protocol. In particular, if the transformed protocol is sound, then any hard problem in #P gives rise to a hard distribution in the class CLS, which is contained in PPAD. Our result opens up the possibility of sampling moderately-sized games for which it is hard to find a Nash equilibrium, by reducing the inversion of appropriately chosen one-way functions to #SAT.

Our main technical contribution is a stateful incrementally verifiable procedure that, given a SAT instance over n variables, counts the number of satisfying assignments. This is accomplished via an exponential sequence of small steps, each computable in time poly(n). Incremental verifiability means that each intermediate state includes a sumcheck-based proof of its correctness, and the proof can be updated and verified in time poly(n).

Joint work with Pavel Hubacek, Chethan Kamath, Krzysztof Pietrzak, Alon Rosen, and Guy Rothblum.

DTSTART;TZID=America/New_York:20190327T120000 DTEND;TZID=America/New_York:20190327T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Arka Rai Choudhury URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-arka-rai-choudhury/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-289@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Amitabh Basu

Affiliation: JHU

Title: Admissibility of solution estimators for stochastic optimization

Abstract:

We look at stochastic optimization problems through the lens of statistical decision theory. In particular, we address admissibility, in the statistical decision theory sense, of the natural sample average estimator for a stochastic optimization problem (also known as the empirical risk minimization (ERM) rule in the learning literature). It is well known that for general stochastic optimization problems, the sample average estimator may not be admissible. This is known as Stein’s paradox in the statistics literature. We show that for optimizing stochastic linear functions over compact sets, the sample average estimator is admissible.

Speaker: Daniel Reichman

Affiliation: Princeton University

Title: Contagious sets in bootstrap percolation

Abstract:

Consider the following activation process in undirected graphs: a vertex is active either if it belongs to a set of initially activated vertices or if at some point it has at least r active neighbors. This process (commonly referred to as bootstrap percolation) has been studied in several fields, including combinatorics, computer science, statistical physics, and viral marketing. A contagious set is a set whose activation results in the entire graph being active. Given a graph G, let m(G,r) be the minimal size of a contagious set.
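The activation process above translates directly into code; a naive simulation for small graphs (names are illustrative):

```python
def bootstrap_percolation(adj, seeds, r):
    """Run r-neighbor bootstrap percolation: a vertex activates once it
    has at least r active neighbors. Returns the final active set."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v not in active and sum(1 for u in adj[v] if u in active) >= r:
                active.add(v)
                changed = True
    return active

def is_contagious(adj, seeds, r):
    """A seed set is contagious if the process activates every vertex."""
    return bootstrap_percolation(adj, seeds, r) == set(adj)
```

For example, on the 4-cycle with r = 2, two antipodal seeds are contagious, so m(C4, 2) = 2.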

I will survey upper and lower bounds for m(G,r) in general graphs and for graphs with special properties (random and pseudo-random graphs, graphs without short cycles) and discuss many unresolved questions.

Based on joint work with Amin Coja-Oghlan, Uriel Feige and Michael Krivelevich.

DTSTART;TZID=America/New_York:20190417T120000 DTEND;TZID=America/New_York:20190417T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Daniel Reichman URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-daniel-reichman/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-292@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Xuan Wu

Affiliation: Johns Hopkins

Title: Coreset for Ordered Weighted Clustering

Abstract:

Ordered k-Median is a generalization of classical clustering problems such as k-Median and k-Center that offers a more flexible data analysis, such as easily combining multiple objectives (e.g., to increase fairness or for Pareto optimization). Its objective function is defined via the Ordered Weighted Averaging (OWA) paradigm of Yager (1988), where data points are weighted according to a predefined weight vector, but in order of their contribution to the objective (distance from the centers).
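As a concrete reference, the objective can be computed directly from its definition: sort the point-to-center distances in non-increasing order and weight them by the fixed vector. An all-ones weight vector recovers k-Median, and (1, 0, ..., 0) recovers k-Center. A naive sketch with illustrative names:

```python
def ordered_kmedian_cost(points, centers, weights, dist):
    """Ordered k-Median objective: the OWA aggregation pairs the i-th
    largest point-to-center distance with the i-th weight."""
    d = sorted((min(dist(p, c) for c in centers) for p in points), reverse=True)
    assert len(weights) == len(d), "one weight per data point"
    return sum(w * x for w, x in zip(weights, d))
```
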

Coreset is a powerful data-reduction technique which summarizes the data set into a small (weighted) point set while approximately preserving the objective value of the data set with respect to all centers. When there are multiple objectives (weights), the above standard coreset might have limited usefulness, whereas in a \emph{simultaneous} coreset, introduced recently by Bachem, Lucic and Lattanzi (2018), the approximation holds for all weights (in addition to all centers). Our main result is the first construction of a simultaneous coreset of small size for the Ordered k-Median problem.

In this talk, I will introduce the basics of coreset construction for the clustering problem and the main ideas of our new results. Finally, we discuss some remaining open problems.

This talk is based on joint work with Vladimir Braverman, Shaofeng Jiang, and Robert Krauthgamer.

DTSTART;TZID=America/New_York:20190501T120000 DTEND;TZID=America/New_York:20190501T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Xuan Wu URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-xuan-wu/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-303@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Christopher Musco

Affiliation: NYU

Title: Structured Covariance Estimation

Abstract:

Given access to samples from a distribution D over d-dimensional vectors, how many samples are necessary to learn the distribution’s covariance matrix, T? Moreover, how can we leverage a priori knowledge about T’s structure to reduce this sample complexity?

I will discuss this fundamental statistical problem in the setting where T is known to have Toeplitz structure. Toeplitz covariance matrices arise in countless signal processing applications, from wireless communications, to medical imaging, to time series analysis. In many of these applications, we are interested in learning algorithms that only view a subset of entries in each d-dimensional vector sample from D. We care about minimizing two notions of sample complexity: 1) the total number of vector samples taken and 2) the number of entries accessed in each vector sample. The latter goal typically equates to minimizing equipment or hardware requirements.

I will present several new non-asymptotic bounds on these sample complexity measures. We will start by taking a fresh look at classical and widely used algorithms, including methods based on selecting entries from each sample according to a “sparse ruler”. Then, I will introduce a novel sampling and estimation strategy that improves on existing methods in many settings. Our new approach for learning Toeplitz-structured covariance utilizes tools from random matrix sketching, leverage score sampling for continuous signals, and sparse Fourier transform algorithms. It fits into a broader line of work which seeks to address fundamental problems in signal processing using tools from theoretical computer science and randomized numerical linear algebra.
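To illustrate the classical sparse-ruler idea mentioned above: a Toeplitz covariance depends only on the lag |i - j|, so it suffices to observe entries whose pairwise differences cover every lag. A toy estimator; the particular ruler and plain averaging scheme are illustrative background, not the talk's new method:

```python
import numpy as np

RULER = [0, 1, 2, 5, 8]  # every lag 0..8 appears as a pairwise difference

def toeplitz_from_ruler(samples, d, ruler):
    """Estimate a d x d Toeplitz covariance from samples in which only
    the ruler-indexed entries are observed: average x_a * x_b over all
    ruler pairs at each lag s = |a - b|."""
    lags = np.zeros(d)
    counts = np.zeros(d)
    for x in samples:
        for a in ruler:
            for b in ruler:
                s = abs(a - b)
                lags[s] += x[a] * x[b]
                counts[s] += 1
    t = lags / counts
    return np.array([[t[abs(i - j)] for j in range(d)] for i in range(d)])
```

Here only 5 of the 9 entries per sample are read, yet all 9 lags are estimated.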

Bio: Christopher Musco is an Assistant Professor in the Computer Science and Engineering department at NYU’s Tandon School of Engineering. His research focuses on the algorithmic foundations of data science and machine learning. Christopher received his Ph.D. in Computer Science from the Massachusetts Institute of Technology and B.S. degrees in Applied Mathematics and Computer Science from Yale University. DTSTART;TZID=America/New_York:20191011T123000 DTEND;TZID=America/New_York:20191011T133000 SEQUENCE:0 SUMMARY:[Theory Seminar] Christopher Musco URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-christopher-musco/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-296@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:

Speaker: Guy Kortsarz

Affiliation: Rutgers University – Camden

Title: A survey on the Directed Steiner tree problem

Abstract:

The directed Steiner problem is one of the most important problems in optimization, and in particular is more general than Group Steiner and other problems.

I will discuss the (by now classic) (1/\epsilon^3) n^\epsilon approximation for the problem by Charikar et al. (the algorithm was invented by Kortsarz and Peleg and is called recursive greedy; a technique that people in approximation should know). The running time is more than n^{1/\epsilon}. One of the most important open questions in approximation algorithms is whether there is a polynomial-time polylog ratio for this problem. This has been open since 1997.

I will discuss the Group Steiner problem (a special case of the Directed Steiner problem) and the Directed Steiner Forest problem (a generalization of the Directed Steiner problem) and many more related problems.

DTSTART;TZID=America/New_York:20191016T120000 DTEND;TZID=America/New_York:20191016T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Guy Kortsarz URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-guy-kortsarz/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-294@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Jiapeng Zhang

Affiliation: Harvard University

Title: An improved sunflower bound

Abstract:

Joint work with Ryan Alweiss, Shachar Lovett and Kewen Wu.


A sunflower with $r$ petals is a collection of $r$ sets so that the intersection of each pair is equal to the intersection of all. Erdos and Rado in 1960 proved the sunflower lemma: for any fixed $r$, any family of sets of size $w$, with at least about $w^w$ sets, must contain a sunflower. The famous sunflower conjecture is that the bound on the number of sets can be improved to $c^w$ for some constant $c$. Despite much research, the best bounds until recently were all of the order of $w^{cw}$ for some constant $c$. In this work, we improve the bound to about $(\log w)^w$.
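In code, the sunflower condition is a one-line check over pairwise intersections; a brute-force search (exponential, only to make the definition concrete):

```python
from itertools import combinations

def is_sunflower(sets):
    """A family is a sunflower if every pairwise intersection equals the
    common intersection of all sets (the core)."""
    core = set.intersection(*sets)
    return all(a & b == core for a, b in combinations(sets, 2))

def find_sunflower(family, r):
    """Brute-force search for a sunflower with r petals in a tiny family."""
    for cand in combinations(family, r):
        if is_sunflower(list(cand)):
            return cand
    return None
```

Note that pairwise-disjoint sets form a sunflower with an empty core, exactly as the definition allows.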

Speaker: Robert Krauthgamer

Affiliation: Weizmann Institute of Science

Title: On Solving Linear Systems in Sublinear Time
Abstract:

I will discuss sublinear algorithms that solve linear systems locally. In the classical version of this problem, the input is a matrix S and a vector b in the range of S, and the goal is to output a vector x satisfying Sx=b.

We focus on computing (approximating) one coordinate of x, which potentially allows for sublinear algorithms. Our results show that there is a qualitative gap between symmetric diagonally dominant (SDD) matrices and the more general class of positive semidefinite (PSD) matrices. For SDD matrices, we develop an algorithm that runs in polylogarithmic time, provided that S is sparse and has a small condition number (e.g., the Laplacian of an expander graph). In contrast, for certain PSD matrices with analogous assumptions, the running time must be at least polynomial.

Joint work with Alexandr Andoni and Yosef Pogrow.
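For intuition on why SDD structure helps: writing S = D - A with D the diagonal, the solution expands as a Neumann series that the Jacobi iteration truncates; the sublinear algorithms in the talk estimate a single coordinate of this series locally (e.g., by random walks) rather than iterating full vectors. A dense-matrix sketch of the underlying iteration, for reference only:

```python
import numpy as np

def jacobi_coordinate(S, b, i, iters=200):
    """Estimate coordinate i of the solution to Sx=b via the Jacobi
    iteration x <- D^{-1}(b + A x), which converges when S = D - A is
    strictly diagonally dominant."""
    D = np.diag(S)          # diagonal of S, as a vector
    A = np.diag(D) - S      # off-diagonal part, so S = diag(D) - A
    x = np.zeros(len(b))
    for _ in range(iters):
        x = (b + A @ x) / D
    return x[i]
```
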

DTSTART;TZID=America/New_York:20191108T120000 DTEND;TZID=America/New_York:20191108T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Robert Krauthgamer URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-robert-krauthgamer/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-314@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Yasamin Nazari

Affiliation: Johns Hopkins University

Title: Sparse Hopsets in Congested Clique

Abstract:

We give the first Congested Clique algorithm that computes a sparse hopset with polylogarithmic hopbound in polylogarithmic time. Given a graph G=(V,E), a (β,ϵ)-hopset H with “hopbound” β is a set of edges added to G such that for any pair of nodes u and v in G there is a path with at most β hops in G∪H with length within (1+ϵ) of the shortest path between u and v in G.
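The definition can be checked directly on small graphs by comparing true shortest-path distances in G against β-hop-limited distances in G∪H. A naive verifier with hypothetical helper names, only for tiny inputs:

```python
def shortest_paths(n, edges, max_hops=None):
    """All-pairs distances on an undirected weighted graph via rounds of
    Bellman-Ford relaxation; with max_hops set, only paths of at most
    that many edges are allowed."""
    INF = float("inf")
    d = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for _ in range(max_hops if max_hops is not None else n - 1):
        nd = [row[:] for row in d]
        for u, v, w in edges:
            for s in range(n):
                nd[s][v] = min(nd[s][v], d[s][u] + w)
                nd[s][u] = min(nd[s][u], d[s][v] + w)
        d = nd
    return d

def is_hopset(n, edges, hopset_edges, beta, eps):
    """Check the (beta, eps)-hopset property from the definition above."""
    true_d = shortest_paths(n, edges)
    hop_d = shortest_paths(n, edges + hopset_edges, max_hops=beta)
    return all(hop_d[u][v] <= (1 + eps) * true_d[u][v]
               for u in range(n) for v in range(n)
               if true_d[u][v] < float("inf"))
```

For example, on a 5-node path, adding shortcut edges of weight equal to the distances they span yields a (2, 0)-hopset.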

Our hopsets are significantly sparser than the recent construction of Censor-Hillel et al. [6], which constructs a hopset of size Õ(n^{3/2}), but with a smaller polylogarithmic hopbound. On the other hand, the previously known constructions of sparse hopsets with polylogarithmic hopbound in the Congested Clique model, proposed by Elkin and Neiman [10], [11], [12], all require polynomial rounds.

One tool that we use is an efficient algorithm that constructs an ℓ-limited neighborhood cover, which may be of independent interest.

Finally, as a side result, we also give a hopset construction in a variant of the low-memory Massively Parallel Computation model, with improved running time over existing algorithms.

DTSTART;TZID=America/New_York:20191204T120000
DTEND;TZID=America/New_York:20191204T130000
SEQUENCE:0
SUMMARY:[Theory Seminar] Yasamin Nazari
URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-yasamin-nazari-2/
X-COST-TYPE:free
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-315@www.cs.jhu.edu/~mdinitz/theory
DTSTAMP:20220124T035552Z
CATEGORIES:
CONTACT:
DESCRIPTION:Speaker: Richard Shea

Affiliation: Applied and Computational Math program, Johns Hopkins University

Title: Progress towards building a Dynamic Hawkes Graph

Abstract:

This talk will introduce the Dynamic Hawkes Graph. It builds on developments in multivariate Hawkes processes, locally stable Hawkes distributions, and representations of the Hawkes process as an integer-valued autoregressive (INAR) fit. The background needed to understand the Dynamic Hawkes Graph will be the bulk of the talk. Richard is presenting this as part of his Master’s thesis.
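For background only: a univariate Hawkes process with exponential kernel can be simulated by Ogata's thinning algorithm. This is textbook material, not the talk's construction, and the parameter names are generic:

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Simulate a Hawkes process with intensity
    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i))
    by Ogata thinning: propose with the current (upper-bound) intensity,
    then accept with probability lambda(t) / lambda_bar."""
    t, events = 0.0, []
    while True:
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        t += rng.expovariate(lam_bar)
        if t > horizon:
            return events
        lam_t = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        if rng.random() <= lam_t / lam_bar:
            events.append(t)
```

Stability requires a branching ratio alpha/beta < 1; otherwise the process can explode.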

DTSTART;TZID=America/New_York:20191211T120000
DTEND;TZID=America/New_York:20191211T130000
SEQUENCE:0
SUMMARY:[Theory Seminar] Richard Shea
URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-richard-shea/
X-COST-TYPE:free
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-317@www.cs.jhu.edu/~mdinitz/theory
DTSTAMP:20220124T035552Z
CATEGORIES:
CONTACT:
DESCRIPTION:Speaker: Aditya Krishnan

Affiliation: Johns Hopkins University

Title: Schatten Norms in Matrix Streams: The Role of Sparsity

Abstract:

Spectral functions of large matrices contain important structural information about the underlying data, and are thus becoming increasingly important to efficiently compute. Many times, large matrices representing real-world data are *sparse* or *doubly sparse* (i.e., sparse in both rows and columns), and are accessed as a *stream* of updates, typically organized in *row-order*. In this setting, where space (memory) is the limiting resource, all known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is *independent of the matrix dimension*, assuming the matrix is doubly-sparse and presented in row-order.

In addition, we prove that multiple passes are unavoidable in this setting and show extensions of our primary technique, including a trade-off between space requirements and number of passes. Our algorithms approximate the Schatten p-norms, which we use in turn to approximate other spectral functions, such as the logarithm of the determinant, the trace of the matrix inverse, and the Estrada index.

Joint work with Vladimir Braverman, Robert Krauthgamer and Roi Sinoff.
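For reference, the quantity being approximated: the Schatten p-norm is the l_p norm of the singular values (p=1 is the trace norm, p=2 the Frobenius norm, and large p approaches the spectral norm). An exact, non-streaming computation via SVD, for comparison only:

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm: (sum_i sigma_i^p)^(1/p) over singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    return (s ** p).sum() ** (1.0 / p)
```

The streaming algorithms in the talk approximate this without ever storing A densely.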

DTSTART;TZID=America/New_York:20200311T120000 DTEND;TZID=America/New_York:20200311T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Aditya Krishnan URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-aditya-krishnan/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-305@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Arnold Filtser

Affiliation: Columbia University

Title: TBD

Abstract: TBD

Speaker: Jasper Lee

Affiliation: Brown University

Title: Real-Valued Sub-Gaussian Mean Estimation, Optimal to a (1+o(1)) Multiplicative Factor

Abstract:

We revisit one of the most fundamental problems in statistics: given access to independent samples from a 1D random variable (with finite but unknown mean and variance), what is the best way to estimate the mean, in terms of error convergence with respect to sample size? The conventional wisdom is to use the sample mean as our estimate. However, it is known that the sample mean has optimal convergence if and only if the underlying distribution is (sub-)Gaussian. Moreover, it can even be exponentially slower than optimal for certain heavy-tailed distributions. On the other hand, the median-of-means estimator (invented and reinvented in various literatures) does have sub-Gaussian convergence for all finite-variance distributions, albeit in the big-O sense with a sub-optimal multiplicative constant. The natural remaining question, then, is whether it is possible to bridge the gap: to have an estimator that has optimal sub-Gaussian concentration with an optimal constant, for all finite-variance distributions.

In this talk, we answer the question affirmatively by giving an estimator that converges with the optimal constant inside the big-O, up to a (1+o(1)) multiplicative factor. Our estimator is furthermore computable in time linear in the sample size. The convergence analysis involves deriving tail bounds using tools from linear and convex programming, which may be of independent interest.

Joint work with Paul Valiant.
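For context, the median-of-means estimator mentioned above is only a few lines. This naive version with a fixed group count k illustrates the construction, not the talk's new optimal-constant estimator:

```python
import random
import statistics

def median_of_means(samples, k):
    """Split the samples into k groups, average each group, and return
    the median of the group means -- sub-Gaussian concentration for any
    finite-variance distribution, unlike the plain sample mean."""
    n = len(samples)
    means = [statistics.fmean(samples[i * n // k:(i + 1) * n // k])
             for i in range(k)]
    return statistics.median(means)
```

The median step makes the estimate robust to a minority of groups being corrupted by heavy-tailed outliers.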

DTSTART;TZID=America/New_York:20200916T120000 DTEND;TZID=America/New_York:20200916T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Jasper Lee URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-jasper-lee/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-324@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Edinah Gnang

Affiliation: Johns Hopkins University

Title: On the Kotzig-Ringel-Rosa conjecture

Abstract:

In this talk we describe and motivate the K.R.R. conjecture on graph partitioning, and describe a functional approach enabling us to tackle the K.R.R. conjecture via a new composition lemma. If time permits I will also discuss algorithmic aspects.

Speaker: Aditya Krishnan

Affiliation: Johns Hopkins University

Title: Schatten Norms in Matrix Streams: The Role of Sparsity.

Abstract: Spectral functions of large matrices contain important structural information about the underlying data, and are thus becoming increasingly important to efficiently compute. Many times, large matrices representing real-world data are \emph{sparse} or \emph{doubly sparse} (i.e., sparse in both rows and columns), and are accessed as a \emph{stream} of updates, typically organized in \emph{row-order}. In this setting, where space (memory) is the limiting resource, all known algorithms require space that is polynomial in the dimension of the matrix, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is \emph{independent of the matrix dimension}, assuming the matrix is doubly-sparse and presented in row-order.

In addition, we prove that multiple passes are unavoidable in this setting and show extensions of our primary technique, including a trade-off between space requirements and number of passes. Our algorithms approximate the Schatten p-norms, which we use in turn to approximate other spectral functions, such as the logarithm of the determinant, the trace of the matrix inverse, and the Estrada index.

Joint work with Vladimir Braverman, Robert Krauthgamer and Roi Sinoff.

DTSTART;TZID=America/New_York:20201007T120000 DTEND;TZID=America/New_York:20201007T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Aditya Krishnan URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-aditya-krishnan-2/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-331@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Brian Brubach

Affiliation: Wellesley College

Title: Online matching under three layers of uncertainty

Abstract:

Online matching problems have become ubiquitous with the rise of the internet and e-commerce. From the humble beginnings of a single problem 30 years ago, the theoretical study of online matching now encompasses dozens of variations inspired by diverse applications. We’ll take a tour through the landscape of online matching problems. As we go, we’ll venture deeper and deeper into the jungle of stochasticity and uncertainty. Finally, we’ll cover some very recent work introducing new algorithms and models. Along the way, I’ll highlight techniques and tricks I’ve found useful in studying these problems.

Speaker: Joshua Grochow

Affiliation: University of Colorado

Title: Isomorphism of tensors, algebras, and polynomials, oh my!

Abstract: We consider the problems of testing isomorphism of tensors, p-groups, cubic polynomials, quantum states, algebras, and more, which arise from a variety of areas, including machine learning, group theory, and cryptography. Despite Graph Isomorphism now being in quasi-polynomial time, and having long had efficient practical software, for the problems we consider no algorithm is known that is asymptotically better than brute force, and state-of-the-art software cannot get beyond small instances. We approach this situation in two ways:

\n – Complexity-theoretic: we show that all these problems are polynomial-time equivalent\, giving rise to a class of problems we call Tensor Isomorphism-complete (TI-complete). Perhaps surprising here is that we show that isomorphism of d-tensors for any fixed d (at least 3) is equivalent to testing isomorphism of 3-tensors. These equivalences let us focus on just one problem rather than dozens\, or a whole infinite hierarchy\, separately.

\n – Algorithmic: Adapting the Weisfeiler-Leman method from Graph Isomorphism to tensors\, trying to understand tensor isomorphism by taking advantage of isomorphisms of small sub-tensors. This leads to tensor analogues of the Graph Reconstruction conjecture and related questions.
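For intuition\, a minimal sketch of the 1-dimensional Weisfeiler-Leman method (color refinement) on ordinary graphs\, the starting point that the talk adapts to tensors; the adjacency-dict representation is my own illustration:

```python
from collections import Counter

def wl_color_histogram(adj, rounds=3):
    """1-WL color refinement: iteratively re-color each vertex by its
    current color together with the multiset of its neighbors' colors.
    Isomorphic graphs always yield identical histograms (converse fails)."""
    color = {v: 0 for v in adj}
    for _ in range(rounds):
        sig = {v: (color[v], tuple(sorted(color[u] for u in adj[v])))
               for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        color = {v: palette[sig[v]] for v in adj}
    return Counter(color.values())

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(wl_color_histogram(triangle) == wl_color_histogram(path))  # False
```

Differing histograms certify non-isomorphism; the hard direction\, and the subject of the talk\, is what to do when refinement fails to distinguish.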

Based on joint works with Vyacheslav V. Futorny and Vladimir V. Sergeichuk (Lin. Alg. Appl.\, 2019\; arXiv:1810.09219)\, with Peter A. Brooksbank\, Yinan Li\, Youming Qiao\, and James B. Wilson (arXiv:1905.02518)\, and with Youming Qiao (arXiv:1907.00309).

DTSTART;TZID=America/New_York:20201028T120000 DTEND;TZID=America/New_York:20201028T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Joshua Grochow URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-joshua-grochow/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-338@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Jingfeng Wu

\nAffiliation: Johns Hopkins University

Title: Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate

\nAbstract:

\nUnderstanding the algorithmic regularization effect of stochastic gradient descent (SGD) is one of the key challenges in modern machine learning and deep learning theory. Most of the existing works\, however\, focus on the very small or even infinitesimal learning rate regime\, and fail to cover practical scenarios where the learning rate is moderate and annealing. In this paper\, we make an initial attempt to characterize the particular regularization effect of SGD in the moderate learning rate regime by studying its behavior for optimizing an overparameterized linear regression problem. In this case\, SGD and GD are known to converge to the unique minimum-norm solution\; however\, with a moderate and annealing learning rate\, we show that they exhibit different directional bias: SGD converges along the large eigenvalue directions of the data matrix\, while GD goes after the small eigenvalue directions. Furthermore\, we show that such directional bias does matter when early stopping is adopted\, where the SGD output is nearly optimal but the GD output is suboptimal. Finally\, our theory explains several folk heuristics used in practice for SGD hyperparameter tuning\, such as (1) linearly scaling the initial learning rate with batch size\; and (2) overrunning SGD with a high learning rate even when the loss stops decreasing.
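A toy illustration of the setting (not the paper’s moderate-rate analysis\, which is what separates SGD from GD): in the overparameterized regime\, plain gradient descent started from zero converges to the minimum-norm interpolating solution. The dimensions and step size below are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20                        # overparameterized: d > n
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)                     # gradient descent on 0.5 * ||Xw - y||^2
for _ in range(20000):
    w -= 0.01 * X.T @ (X @ w - y)

w_star = X.T @ np.linalg.solve(X @ X.T, y)   # the minimum-norm solution
print(np.linalg.norm(w - w_star))            # ~0: GD found the min-norm point
```

The talk’s result concerns the finer question of *which directions* SGD and GD traverse on the way there when the learning rate is moderate.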

Speaker: Arnold Filtser

\nAffiliation: Columbia University

Title: Scattering and Sparse Partitions\, and their Applications

\nAbstract:

\nA partition $\\mathcal{P}$ of a weighted graph $G$ is $(\\sigma\,\\tau\,\\Delta)$-sparse if every cluster has diameter at most $\\Delta$\, and every ball of radius $\\Delta/\\sigma$ intersects at most $\\tau$ clusters. Similarly\, $\\mathcal{P}$ is $(\\sigma\,\\tau\,\\Delta)$-scattering if instead of balls we require that every shortest path of length at most $\\Delta/\\sigma$ intersects at most $\\tau$ clusters. Given a graph $G$ that admits a $(\\sigma\,\\tau\,\\Delta)$-sparse partition for all $\\Delta>0$\, Jia et al. [STOC05] constructed a solution for the Universal Steiner Tree problem (and also Universal TSP) with stretch $O(\\tau\\sigma^2\\log_\\tau n)$. Given a graph $G$ that admits a $(\\sigma\,\\tau\,\\Delta)$-scattering partition for all $\\Delta>0$\, we construct a solution for the Steiner Point Removal problem with stretch $O(\\tau^3\\sigma^3)$. We then construct sparse and scattering partitions for various graph families\, obtaining new results for the Universal Steiner Tree and Steiner Point Removal problems.

Speaker: Yu Zheng

\nAffiliation: Johns Hopkins University

Title: Space Efficient Deterministic Approximation of String Measures

\nAbstract:

\nWe study approximation algorithms for the following three string measures that are widely used in practice: edit distance (ED)\, longest common subsequence (LCS)\, and longest increasing sequence (LIS). All three problems can be solved exactly by standard algorithms that run in polynomial time with roughly $\\Theta(n)$ space\, where $n$ is the input length\, and our goal is to design deterministic approximation algorithms that run in polynomial time with significantly smaller space.

Towards this\, we design several algorithms that achieve $1+\\eps$ or $1-\\eps$ approximation for all three problems\, where $\\eps>0$ can be any constant and even slightly subconstant. Our algorithms are flexible and can be adjusted to achieve the following two regimes of parameters: 1) space $n^{\\delta}$ for any constant $\\delta>0$ with running time essentially the same as or slightly more than the standard algorithms\; and 2) space $\\mathsf{polylog}(n)$ with (a larger) polynomial running time\, which puts the approximation versions of the three problems in Steve’s class (SC). Our algorithms significantly improve previous results in terms of space complexity\, where all known results need to use space at least $\\Omega(\\sqrt{n})$. Some of our algorithms can also be adapted to work in the asymmetric streaming model [SS13]\, and output the corresponding sequence. Furthermore\, our results can be used to improve a recent result by Farhadi et al. [FHRS20] about approximating ED in the asymmetric streaming model\, reducing the running time from being exponential in [FHRS20] to a polynomial.

\nOur algorithms are based on the idea of using recursion as in Savitch’s theorem [Sav70]\, and a careful adaptation of previous techniques to make the recursion work. Along the way we also give a new logspace reduction from longest common subsequence to longest increasing sequence\, which may be of independent interest.
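A reduction from LCS to LIS has a classic (non-logspace) predecessor\, cf. Hunt-Szymanski: for each character of one string\, list the positions of its occurrences in the other string in decreasing order\; the LIS of the concatenated position stream equals the LCS. A sketch for intuition (the talk’s contribution is a *new* such reduction computable in logspace\, which this is not):

```python
import bisect

def lis_length(seq):
    """Length of the longest strictly increasing subsequence
    (patience sorting, O(n log n))."""
    tails = []
    for x in seq:
        i = bisect.bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)

def lcs_via_lis(a, b):
    """LCS(a, b) via the classic LCS-to-LIS reduction: positions of
    each a-character's occurrences in b, in decreasing order."""
    occ = {}
    for j, ch in enumerate(b):
        occ.setdefault(ch, []).append(j)
    stream = [j for ch in a for j in reversed(occ.get(ch, []))]
    return lis_length(stream)

print(lcs_via_lis("abcbdab", "bdcaba"))  # 4 ("bcba", for instance)
```

Listing occurrences in decreasing order is what prevents two copies of the same character of `a` from matching the same position of `b`.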

DTSTART;TZID=America/New_York:20201202T120000 DTEND;TZID=America/New_York:20201202T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Yu Zheng URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-yu-zheng/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-341@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Xuan Wu

\nAffiliation: Johns Hopkins University

Title: Coreset for Clustering in Graph Metrics.

\nAbstract:

\nClustering is a fundamental task in machine learning. With the increasing demand for running machine learning algorithms on huge data sets\, classic clustering algorithms were found not to scale well. To this end\, coresets were introduced as a powerful data reduction technique that turns a huge dataset into a tiny proxy. Coresets have also been successfully applied to streaming and distributed computing. Coresets for clustering in Euclidean spaces have been very well studied. However\, very few results are known for non-Euclidean spaces. Beyond Euclidean\, graph metrics are a very important family of metric spaces. In this talk\, I will cover my recent work on coresets for k-clustering in graph metrics\, including bounded treewidth graphs and excluded-minor graphs.

Speaker: Enayat Ullah

\nAffiliation: Johns Hopkins University

Title: Machine unlearning via algorithmic stability

\nAbstract: We study the problem of machine unlearning and identify a notion of algorithmic stability\, Total Variation (TV) stability\, which we argue is suitable for the goal of exact efficient unlearning. For convex risk minimization problems\, we design TV-stable algorithms based on noisy Stochastic Gradient Descent (SGD). Our key contribution is the design of corresponding efficient unlearning algorithms\, which are based on constructing a (maximal) coupling of Markov chains for the noisy SGD procedure. To understand the trade-offs between accuracy and unlearning efficiency\, we give upper and lower bounds on the excess empirical and population risk of TV-stable algorithms for convex risk minimization. Our techniques generalize to arbitrary non-convex functions\, and our algorithms are differentially private as well.
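The forward primitive underlying those algorithms can be sketched as plain noisy SGD. This shows only the learning pass\, not the unlearning coupling from the talk\, and the quadratic objective is my own toy example:

```python
import numpy as np

def noisy_sgd(grad, w0, steps, lr, sigma, rng):
    """Gradient descent with Gaussian noise injected at every step,
    the primitive from which TV-stable algorithms are built."""
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * (grad(w) + sigma * rng.standard_normal(w.shape))
    return w

# Toy convex objective 0.5 * (w - 1)^2, whose gradient is w - 1.
rng = np.random.default_rng(0)
w = noisy_sgd(lambda w: w - 1.0, [0.0], steps=2000, lr=0.1, sigma=0.01, rng=rng)
print(w)  # close to the minimizer 1.0
```

The injected noise is what makes two runs on neighboring datasets statistically close (TV-stable\, and differentially private)\, which the unlearning procedure then exploits.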

DTSTART;TZID=America/New_York:20210217T120000 DTEND;TZID=America/New_York:20210217T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Enayat Ullah URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-enayat-ullah/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-348@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Thomas Lavastida

\nAffiliation: Carnegie Mellon University

Title: Combinatorial Optimization Augmented with Machine Learning

\nAbstract:

\nCombinatorial optimization often focuses on optimizing for the worst case. However\, the best algorithm to use depends on the “relevant inputs”\, which are application specific and often lack a formal definition.

\nThe talk gives a new theoretical model for designing algorithms that are tailored to inputs for the application at hand. In the model\, learning is performed on past problem instances to make predictions on future instances. These predictions are incorporated into the design and analysis of the algorithm. The predictions can be used to achieve “instance-optimal” algorithm design when the predictions are accurate\, and the algorithm’s performance gracefully degrades when there is error in the prediction.

\nThe talk will apply this framework to online algorithm design and give algorithms with theoretical performance that goes beyond worst-case analysis. The majority of the talk will focus on load balancing with restricted assignments.

DTSTART;TZID=America/New_York:20210224T120000 DTEND;TZID=America/New_York:20210224T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Thomas Lavastida URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-thomas-lavastida/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-349@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Hung Le

\nAffiliation: University of Massachusetts\, Amherst

Title: Reliable Spanners: Locality-Sensitive Orderings Strike Back

\nAbstract:

\nA highly desirable property of networks is robustness to failures.

\nConsider a metric space $(X\,d_X)$. A graph $H$ over $X$ is a $\\vartheta$-reliable $t$-spanner if\, for every set of failed vertices $B\\subset X$\, there is a superset $B^+\\supseteq B$ such that the induced subgraph $H[X\\setminus B]$ preserves all the distances between points in $X\\setminus B^+$ up to a stretch factor $t$\, while the expected size of $B^+$ is at most $(1+\\vartheta)|B|$. Such a spanner could withstand a catastrophe: failure of even $90\\%$ of the network.

Buchin\, Har-Peled\, and Olah showed how to construct a sparse reliable spanner for Euclidean space from Euclidean locality-sensitive orderings\, an object introduced by Chan\, Har-Peled\, and Jones. In this talk\, we extend their approach to non-Euclidean metric spaces by generalizing the ordering of Chan\, Har-Peled\, and Jones to doubling metrics and introducing new types of locality-sensitive orderings for other metric spaces. We also show how to construct reliable spanners from the newly introduced locality-sensitive orderings via reliable 2-hop spanners for paths. The highlight of our results is that the number of edges in our spanner has no dependency on the spread.

DTSTART;TZID=America/New_York:20210303T120000 DTEND;TZID=America/New_York:20210303T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Hung Le URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-hung-le/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-356@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Teodor Marinov

\nAffiliation: Johns Hopkins University

Title: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning

\nAbstract:
\nReinforcement Learning (RL) is a general scenario where agents interact with the environment to achieve some goal. The environment and an agent’s interactions are typically modeled as a Markov decision process (MDP)\, which can represent a rich variety of tasks. But for which MDPs can an agent or an RL algorithm succeed? This requires a theoretical analysis of the complexity of an MDP. In this talk I will present information-theoretic lower bounds for a large class of MDPs. The lower bounds are based on studying a certain semi-infinite LP. Further\, I will show that existing algorithms enjoy tighter gap-dependent regret bounds (similar to the stochastic multi-armed bandit problem)\; however\, they are unable to achieve the information-theoretic lower bounds even in deterministic transition MDPs\, unless there is a unique optimal policy.

DTSTART;TZID=America/New_York:20210310T120000 DTEND;TZID=America/New_York:20210310T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Teodor Marinov URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-teodor-marinov/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-355@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Dominik Kempa

\nAffiliation: Johns Hopkins University

Title: How to store massive sequence collections in a searchable form

\nAbstract:

\nCompressed indexing is concerned with the design and construction of data structures to store massive sequences in space close to the size of the compressed data\, while simultaneously providing searching functionality (such as pattern matching) on the original uncompressed data. These indexes have a huge impact on the analysis of large-scale DNA databases (containing hundreds of thousands of genomes)\, but until recently the size of many indexes lacked theoretical guarantees and their construction remained a computational bottleneck.

In this talk I will describe my work proving theoretical guarantees on index size as a function of compressed data size\, resolving a key open question in this field. Related to that\, I will also describe new algorithms for converting between two conceptually distinct compressed representations\, Lempel-Ziv and the Burrows-Wheeler Transform. I will show how these new findings enable advanced computation directly on compressed data. I will conclude by describing avenues for future work\, including new algorithms for the construction of compressed indexes that have the potential to cut indexing time by further orders of magnitude\, unlocking the ability to search truly enormous text or DNA datasets.
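For intuition about one of the two representations\, a quadratic-time toy construction of the Burrows-Wheeler Transform (real indexes build it via suffix arrays in linear time\; the sentinel ‘$’ is a common convention):

```python
def bwt(s):
    """Burrows-Wheeler Transform: the last column of the sorted
    rotations of s + '$'. It groups equal characters into runs,
    which is what makes the transformed text highly compressible."""
    s = s + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

print(bwt("banana"))  # annb$aa
```

The run structure of the output ("nn"\, "aa") is exactly what run-length-based compressed indexes exploit.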

DTSTART;TZID=America/New_York:20210317T120000 DTEND;TZID=America/New_York:20210317T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Dominik Kempa URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-dominik-kempa/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-353@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Audra McMillan

\nAffiliation: Apple

Title: Hiding among the clones: a simple and nearly optimal analysis of privacy amplification by shuffling

\nAbstract:

\nDifferential privacy (DP) is a model of privacy-preserving machine learning that has garnered significant interest in recent years due to its rigorous privacy guarantees. An algorithm is differentially private if its output is stable under small changes in the input database. While DP has been adopted in a variety of applications\, most applications of DP in industry actually satisfy a stronger notion called local differential privacy. In local differential privacy\, data subjects perturb their data before it reaches the data analyst. While this requires less trust\, it comes at a substantial cost to accuracy. Recent work of Erlingsson\, Feldman\, Mironov\, Raghunathan\, Talwar\, and Thakurta [EFMRTT19] demonstrated that random shuffling amplifies the differential privacy guarantees of locally randomized data. Such amplification implies substantially stronger privacy guarantees for systems in which data is contributed anonymously [BEMMRLRKTS17] and has led to significant interest in the shuffle model of privacy [CSUZZ19\, EFMRTT19]. In this talk\, we will discuss a new result on privacy amplification by shuffling\, which achieves the asymptotically optimal dependence in the local privacy parameter. Our result is based on a new proof strategy which is simpler than previous approaches\, and extends to a slightly weaker notion known as approximate differential privacy with nearly the same guarantees. Based on joint work with Vitaly Feldman and Kunal Talwar (https://arxiv.org/abs/2012.12803).
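A minimal sketch of the pipeline being analyzed\, with randomized response as the local randomizer. Parameter names are mine\; the amplification theorem itself concerns the privacy accounting and is not shown:

```python
import math
import random

def randomized_response(bit, eps, rng):
    """eps-locally-DP randomizer: report the true bit w.p. e^eps/(e^eps+1)."""
    p = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if rng.random() < p else 1 - bit

def shuffled_reports(bits, eps, rng):
    """Each user randomizes locally, then an anonymizing shuffler
    permutes the reports; shuffling is what amplifies the privacy."""
    reports = [randomized_response(b, eps, rng) for b in bits]
    rng.shuffle(reports)
    return reports

rng = random.Random(0)
bits = [1] * 3000 + [0] * 7000
reports = shuffled_reports(bits, eps=1.0, rng=rng)
p = math.exp(1.0) / (math.exp(1.0) + 1.0)
estimate = (sum(reports) / len(reports) - (1 - p)) / (2 * p - 1)
print(round(estimate, 2))  # debiased estimate of the true mean, 0.3
```

The analyst still recovers aggregate statistics (here\, the mean) after debiasing\, while the shuffler hides which report came from which user.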

Speaker: Maryam Negahbani

\nAffiliation: Dartmouth College

Title: Revisiting Priority k-Center: Fairness and Outliers

\nAbstract:

\nClustering is a fundamental unsupervised learning and facility location problem extensively studied in the literature. I will talk about a clustering problem called “priority k-center” introduced by Plesnik (in Disc. Appl. Math. 1987). Given a metric space on n points X with distance function d\, an integer k\, and a radius r_v for each point v in X\, the goal is to choose k points S as “centers” to minimize the maximum distance of a point v to S divided by r_v. For uniform r_v’s this is precisely the “k-center” problem\, where the objective is to minimize the maximum distance of any point to S. In the priority version\, points with smaller r_v are prioritized to be closer to S. Recently\, a special case of this problem was studied in the context of “individually fair clustering” by Jung et al.\, FORC 2020. This notion of fairness forces S to open a center in every “densely populated area” by setting r_v to be v’s distance to its (n/k)-th closest neighbor.

In this talk\, I show how to approximate priority k-center with outliers: when\, for a given integer z\, you are allowed to throw away z points as outliers and minimize the objective over the rest of the points. We show there is a 9-approximation\, which is morally a 5 if you have constantly many types of radii or if your radii are powers of 2. This is via an LP-aware reduction to min-cost max-flow and is general enough that it can handle matroid constraints on facilities (where instead of being asked to pick at most k facilities\, you are asked to pick facilities that are independent in a given matroid). Things become quite interesting for priority knapsack-center with outliers\, where opening each center costs something and you have a limited budget to spend on your solution S. In this case\, we do not know how to solve the corresponding flow problem\, so we alter our reduction to reduce to a simpler problem we do know how to solve\, taking a hit of +5 in the approximation ratio. There are still many open problems in this work: in addition to solving the flow problem in the knapsack case\, the best LP integrality gap we know for priority k-center with outliers is 3.

DTSTART;TZID=America/New_York:20210331T120000 DTEND;TZID=America/New_York:20210331T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Maryam Negahbani URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-maryam-negahbani/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-346@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Leonidas Tsepenekas

\nAffiliation: University of Maryland

Title: Approximating Two-Stage Stochastic Supplier Problems

\nAbstract:

\nThe main focus of this talk will be radius-based (supplier) clustering in the two-stage stochastic setting with recourse\, where the inherent stochasticity of the model comes in the form of a budget constraint. Our eventual goal is to provide results in the most general distributional setting\, where there is only black-box access to the underlying distribution. To that end\, we follow a two-step approach. First\, we develop algorithms for a restricted version of the problem\, in which all possible scenarios are explicitly provided\; second\, we employ a novel scenario-discarding variant of the standard Sample Average Approximation (SAA) method\, in which we also crucially exploit structural properties of the algorithms developed for the first step of the framework. In this way\, we manage to generalize the results of the latter to the black-box model. Finally\, we note that the scenario-discarding modification to the SAA method is necessary in order to optimize over the radius.

Paper: https://arxiv.org/abs/2008.03325

DTSTART;TZID=America/New_York:20210407T120000 DTEND;TZID=America/New_York:20210407T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Leonidas Tsepenekas URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-leonidas-tsepenekas/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-358@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20220124T035552Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Samson Zhou

\nAffiliation: Carnegie Mellon University

Title: Tight Bounds for Adversarially Robust Streams and Sliding Windows via Difference Estimators

\nAbstract:

\nWe introduce *difference estimators* for data stream computation\, which provide approximations to F(v)-F(u) for frequency vectors v\,u and a given function F. We show how to use such estimators to carefully trade error for memory in an iterative manner. The function F is generally non-linear\, and we give the first difference estimators for the frequency moments F_p for p between 0 and 2\, as well as for integers p>2. Using these\, we resolve a number of central open questions in the adversarially robust streaming and sliding window models.

For both models\, we obtain algorithms for norm estimation whose dependence on epsilon is 1/epsilon^2\, which shows\, up to logarithmic factors\, that there is no overhead over the standard insertion-only data stream model for these problems.
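The quantity being estimated is simple to state exactly. A sketch (small explicit streams\, no sublinear-space sketching) of a frequency moment and the difference F(v)-F(u) that a difference estimator approximates:

```python
from collections import Counter

def freq_moment(stream, p):
    """F_p = sum of f_i^p over the item frequencies f_i of the stream."""
    return sum(f ** p for f in Counter(stream).values())

u = ["a", "b", "a"]        # frequencies a:2, b:1      -> F_2 = 4 + 1 = 5
v = u + ["b", "c"]         # frequencies a:2, b:2, c:1 -> F_2 = 4 + 4 + 1 = 9
print(freq_moment(v, 2) - freq_moment(u, 2))  # 4, the difference being targeted
```

The point of the talk is that when v is a small update of u\, this difference can be approximated with far less memory than estimating F(v) and F(u) separately.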

DTSTART;TZID=America/New_York:20210414T120000 DTEND;TZID=America/New_York:20210414T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Samson Zhou URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-samson-zhou-3/ X-COST-TYPE:free END:VEVENT END:VCALENDAR