Speaker: Richard Shea

\nAffiliation: Applied and Compu tational Math program\, Johns Hopkins University

\nTitle: Progress t owards building a Dynamic Hawkes Graph

\nAbstract:

\nThis talk will introduce the Dynamic Hawke
s Graph. It builds from developments in multivariate Hawkes Processes\, l
ocally stable Hawkes distributions\, and representations of the Hawkes pro
cess as an Integer Valued autoregressive (INAR) fit. The background to en
able understanding of the Dynamic Hawkes Graph will be the bulk of the tal
k. Richard is presenting this as part of his Master’s thesis.

DTSTART;TZID=America/New_York:20191211T120000
DTEND;TZID=America/New_York:20191211T130000
SEQUENCE:0
SUMMARY:[Theory Seminar] Richard Shea
URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-richard-she
a/
X-COST-TYPE:free
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-317@www.cs.jhu.edu/~mdinitz/theory
DTSTAMP:20221208T031335Z
CATEGORIES:
CONTACT:
DESCRIPTION:Speaker: Aditya Krishnan

\nAffiliation: Johns Hopkins U
niversity

Title: Schatten Norms in Matrix Streams: The Role of Spa rsity

\nAbstract:

\nSpectral functions of large matrices contai
n important structural information about the underlying data\, and are thu
s becoming increasingly important to efficiently compute. Many times\, lar
ge matrices representing real-world data are *sparse* or *doubly
sparse* (i.e.\, sparse in both rows and columns)\, and are accessed a
s a *stream* of updates\, typically organized in *row-order*
. In this setting\, where space (memory) is the limiting resource\, all kn
own algorithms require space that is polynomial in the dimension of the ma
trix\, even for sparse matrices. We address this challenge by providing th
e first algorithms whose space requirement is *independent of the matri
x dimension*\, assuming the matrix is doubly-sparse and presented in r
ow-order.

In addition\, we prove that multiple passes are unavoida ble in this setting and show extensions of our primary technique\, includi ng a trade-off between space requirements and number of passes. Our algori thms approximate the Schatten p-norms\, which we use in turn to approximat e other spectral functions\, such as logarithm of the determinant\, trace of matrix inverse\, and Estrada index.

\nJoint work with Vladimir Br averman\, Robert Krauthgamer and Roi Sinoff.

DTSTART;TZID=America/New_York:20200311T120000 DTEND;TZID=America/New_York:20200311T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Aditya Krishnan URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-aditya-kris hnan/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-305@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Arnold Filtser

\nAffiliation: Columbia Univers
ity

Title: TBD

\nAbstract: TBD

Speaker: Jasper Lee

\nAffiliation: Brown University

Title: Real-Valued Sub-Gaussian Mean Estimation\, Optimal to a (1+o(1 )) Multiplicative Factor

\nAbstract:

\nWe revisit one of the mo
st fundamental problems in statistics: given access to independent samples
from a 1D random variable (with finite but unknown mean and variance)\, w
hat is the best way to estimate the mean\, in terms of error convergence w
ith respect to sample size? The conventional wisdom is to use the sample m
ean as our estimate. However it is known that the sample mean has optimal
convergence if and only if the underlying distribution is (sub-)Gaussian.
Moreover\, it can even be exponentially slower than optimal for certain he
avy-tailed distributions. On the other hand\, the median-of-means estimato
r (invented and reinvented in various literature) does have sub-Gaussian c
onvergence for all finite-variance distributions\, albeit in the big-O sen
se with a sub-optimal multiplicative constant. The natural remaining quest
ion then\, is whether it is possible to bridge the gap\, to have an estima
tor that has optimal sub-Gaussian concentration with an optimal constant\,
for all finite-variance distributions.

In this talk\, we answer t he question affirmatively by giving an estimator that converges with the o ptimal constant inside the big-O\, up to a (1+o(1)) multiplicative factor. Our estimator is furthermore computable in time linear in the sample size . The convergence analysis involves deriving tail bounds using tools from linear and convex programming\, which may be of independent interest.

\nJoint work with Paul Valiant.

DTSTART;TZID=America/New_York:20200916T120000 DTEND;TZID=America/New_York:20200916T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Jasper Lee URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-jasper-lee/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-324@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Edinah Gnang

\nAffiliation: Johns Hopkins Univ
ersity

\nTitle: On the Kotzig-Ringel-Rosa conjecture

Abstract
:

\nIn this talk we describe and motivate the K.R.R. conjecture graph
partitioning and describe a functional approach enabling us to tackle the
K.R.R. conjecture via a new composition lemma. If time permits I will al
so discuss algorithmic aspects.

Speaker: Aditya Krishnan

\nAffiliation: Johns Hopkins U
niversity

Title: Schatten Norms in Matrix Streams: The Role of Spa rsity.

\nAbstract: Spectral functions of large matrices contain i mportant structural information about the underlying data\, and are thus b ecoming increasingly important to efficiently compute. Many times\, large matrices representing real-world data are \\emph{sparse} or \\emph{doubly sparse} (i.e.\, sparse in both rows and columns)\, and are accessed as a \ \emph{stream} of updates\, typically organized in \\emph{row-order}. In th is setting\, where space (memory) is the limiting resource\, all known alg orithms require space that is polynomial in the dimension of the matrix\, even for sparse matrices. We address this challenge by providing the first algorithms whose space requirement is \\emph{independent of the matrix di mension}\, assuming the matrix is doubly-sparse and presented in row-order .

\nIn addition\, we prove that multiple passes are unavoidable in this setting and show extensions of our primary technique\, including a tr ade-off between space requirements and number of passes. Our algorithms ap proximate the Schatten p-norms\, which we use in turn to approximate other spectral functions\, such as logarithm of the determinant\, trace of matr ix inverse\, and Estrada index.

\nJoint work with Vladimir Braverma n\, Robert Krauthgamer and Roi Sinoff.

DTSTART;TZID=America/New_York:20201007T120000 DTEND;TZID=America/New_York:20201007T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Aditya Krishnan URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-aditya-kris hnan-2/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-331@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Brian Brubach

\nAffiliation: Wellesley College

Title: Online matching under three layers of uncertainty

\n
Abstract:

\nOnline matching problems have become ubiquitous with the
rise of the internet and e-commerce. From the humble beginnings of a singl
e problem 30 years ago\, the theoretical study of online matching now enco
mpasses dozens of variations inspired by diverse applications. We’ll take
a tour through the landscape of online matching problems. As we go\, we’ll
venture deeper and deeper into the jungle of stochasticity and uncertaint
y. Finally\, we’ll cover some very recent work introducing new algorithms
and models. Along the way\, I’ll highlight techniques and tricks I’ve foun
d useful in studying these problems.

Speaker: Joshua Grochow

\nAffiliation: University of Co
lorado

Title: Isomorphism of tensors\, algebras\, and polynomials\ , oh my!

\nAbstract: We consider the problems of testing isomorphism
of tensors\, p-groups\, cubic polynomials\, quantum states\, algebras\, a
nd more\, which arise from a variety of areas\, including machine learning
\, group theory\, and cryptography. Despite Graph Isomorphism now being in
quasi-polynomial time\, and having long had efficient practical software\
, for the problems we consider no algorithm is known that is asymptoticall
y better than brute force\, and state-of-the-art software cannot get beyon
d small instances. We approach this situation in two ways:

\n – Compl
exity-theoretic: we show that all these problems are polynomial-time equiv
alent\, giving rise to a class of problems we call Tensor Isomorphism-comp
lete (TI-complete). Perhaps surprising here is that we show that isomorphi
sm of d-tensors for any fixed d (at least 3) is equivalent to testing isom
orphism of 3-tensors. These equivalences let us focus on just one problem
rather than dozens\, or a whole infinite hierarchy\, separately.

\n –
Algorithmic: Adapting the Weisfeiler-Leman method from Graph Isomorphism
to tensors\, trying to understand tensor isomorphism by taking advantage o
f isomorphisms of small sub-tensors. This leads to tensor analogues of the
Graph Reconstruction conjecture and related questions.

Based on j oint works with Vyacheslav V. Futorny and Vladimir V. Sergeichuk (Lin. Alg . Appl.\, 2019\; arXiv:1810.09219)\, with Peter A. Brooksbank\, Yinan Li\, Youming Qiao\, and James B. Wilson (arXiv:1905.02518)\, and with Youming Qiao (arXiv:1907.00309).

DTSTART;TZID=America/New_York:20201028T120000 DTEND;TZID=America/New_York:20201028T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Joshua Grochow URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-joshua-groc how/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-338@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Jingfeng Wu

\nAffiliation: Johns Hopkins Unive
rsity

Title: Direction Matters: On the Implicit Regularization Eff ect of Stochastic Gradient Descent with Moderate Learning Rate

\nAbs
tract:

\nUnderstanding the algorithmic regularization effect of stoch
astic gradient descent (SGD) is one of the key challenges in modern machin
e learning and deep learning theory. Most of the existing works\, however\
, focus on very small or even infinitesimal learning rate regime\, and fai
l to cover practical scenarios where the learning rate is moderate and ann
ealing. In this paper\, we make an initial attempt to characterize the par
ticular regularization effect of SGD in the moderate learning rate regime
by studying its behavior for optimizing an overparameterized linear regres
sion problem. In this case\, SGD and GD are known to converge to the uniqu
e minimum-norm solution\; however\, with the moderate and annealing learni
ng rate\, we show that they exhibit different directional bias: SGD conver
ges along the large eigenvalue directions of the data matrix\, while GD go
es after the small eigenvalue directions. Furthermore\, we show that such
directional bias does matter when early stopping is adopted\, where the SG
D output is nearly optimal but the GD output is suboptimal. Finally\, our
theory explains several folk arts in practice used for SGD hyperparameter
tuning\, such as (1) linearly scaling the initial learning rate with batch
size\; and (2) overrunning SGD with high learning rate even when the loss
stops decreasing.

Speaker: Arnold Filtser

\nAffiliation: Columbia Univers
ity

Title: Scattering and Sparse Partitions\, and their Applicatio
ns

\nAbstract:

\nA partition $\\mathcal{P}$ of a weighted graph
$G$ is $(\\sigma\,\\tau\,\\Delta)$-sparse if every cluster has diameter at
most $\\Delta$\, and every ball of radius $\\Delta/\\sigma$ intersects at
most $\\tau$ clusters. Similarly\, $\\mathcal{P}$ is $(\\sigma\,\\tau\,\
\Delta)$-scattering if instead for balls we require that every shortest pa
th of length at most $\\Delta/\\sigma$ intersects at most $\\tau$ clusters
. Given a graph $G$ that admits a $(\\sigma\,\\tau\,\\Delta)$-sparse part
ition for all $\\Delta>0$\, Jia et al. [STOC05] constructed a solution for
the Universal Steiner Tree problem (and also Universal TSP) with stretch
$O(\\tau\\sigma^2\\log_\\tau n)$. Given a graph $G$ that admits a $(\\sig
ma\,\\tau\,\\Delta)$-scattering partition for all $\\Delta>0$\, we constru
ct a solution for the Steiner Point Removal problem with stretch $O(\\tau^
3\\sigma^3)$. We then construct sparse and scattering partitions for vari
ous different graph families\, receiving new results for the Universal Ste
iner Tree and Steiner Point Removal problems.

Speaker: Yu Zheng

\nAffiliation: Johns Hopkins Universi
ty

Title: Space Efficient Deterministic Approximation of String Me asures

\nAbstract:

\nWe study approximation algorithms for the
following three string measures that are widely used in practice: edit dis
tance (ED)\, longest common subsequence (LCS)\, and longest increasing seq
uence (LIS). All three problems can be solved exactly by standard algorith
ms that run in polynomial time with roughly $\\Theta(n)$ space\, where $n$
is the input length\, and our goal is to design deterministic approximati
on algorithms that run in polynomial time with significantly smaller space
.

Towards this\, we design several algorithms that achieve $1+\\ep s$ or $1-\\eps$ approximation for all three problems\, where $\\eps>0$ can be any constant and even slightly sub constant. Our algorithms are flexib le and can be adjusted to achieve the following two regimes of parameters: 1) space $n^{\\delta}$ for any constant $\\delta>0$ with running time ess entially the same as or slightly more than the standard algorithms\; and 2) space $\\mathsf{polylog}(n)$ with (a larger) polynomial running time\, which puts the approximation versions of the three problems in Steve’s cla ss (SC). Our algorithms significantly improve previous results in terms of space complexity\, where all known results need to use space at least $\\ Omega(\\sqrt{n})$. Some of our algorithms can also be adapted to work in t he asymmetric streaming model [SS13]\, and output the corresponding sequen ce. Furthermore\, our results can be used to improve a recent result by Fa rhadi et. al. [FHRS20] about approximating ED in the asymmetric streaming model\, reducing the running time from being exponential in [FHRS20] to a polynomial.

\nOur algorithms are based on the idea of using recursi on as in Savitch’s theorem [Sav70]\, and a careful adaption of previous te chniques to make the recursion work. Along the way we also give a new logs pace reduction from longest common subsequence to longest increasing seque nce\, which may be of independent interest.

DTSTART;TZID=America/New_York:20201202T120000 DTEND;TZID=America/New_York:20201202T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Yu Zheng URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-yu-zheng/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-341@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Xuan Wu

\nAffiliation: Johns Hopkins Universit
y

Title: Coreset for Clustering in Graph Metrics.

\nAbstract
:

\nClustering is a fundamental task in machine learning. As the incr
easing demand for running machine learning algorithms in the huge data set
s\, classic clustering algorithms were found not to scale well. To this en
d\, coreset is introduced as a powerful data reduction technique that turn
s a huge dataset into a tiny proxy. Coresets have been also successfully a
pplied to streaming and distributed computing. Coresets for clustering in
Euclidean spaces have been very well studied. However\, very few results
were known about the non-Euclidean space. Beyond Euclidean\, graph metrics
is a very important family of metric space. In this talk\, I will cover m
y recent work on coreset for k-clustering in graph metrics\, including bou
nded treewidth graph and excluded-minor graph.

Speaker: Enayat Ullah

\nAffiliation: Johns Hopkins Univ
ersity

Title: Machine unlearning via algorithmic stability

\nAbstract: We study the problem of machine unlearning\, and identify a not ion of algorithmic stability\, Total Variation (TV) stability\, which we a rgue\, is suitable for the goal of exact efficient unlearning. For convex risk minimization problems\, we design TV-stable algorithms based on noisy Stochastic Gradient Descent (SGD). Our key contribution is the design of corresponding efficient unlearning algorithms\, which are based on constru cting a (maximal) coupling of Markov chains for the noisy SGD procedure. T o understand the trade-offs between accuracy and unlearning efficiency\, w e give upper and lower bounds on excess empirical and population risk of T V stable algorithms for convex risk minimization. Our techniques generaliz e to arbitrary non-convex functions\, and our algorithms are differentiall y private as well.

DTSTART;TZID=America/New_York:20210217T120000 DTEND;TZID=America/New_York:20210217T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Enayat Ullah URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-enayat-ulla h/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-348@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Thomas Lavastida

\nAffiliation: Carnegie Mello
n University

Title: Combinatorial Optimization Augmented with Mach ine Learning

\nAbstract:

\nCombinatorial optimization often fo cuses on optimizing for the worst-case. However\, the best algorithm to us e depends on the “relevant inputs”\, which is application specific and oft en does not have a formal definition.

\nThe talk gives a new theor etical model for designing algorithms that are tailored to inputs for the application at hand. In the model\, learning is performed on past problem instances to make predictions on future instances. These predictions are i ncorporated into the design and analysis of the algorithm. The prediction s can be used to achieve “instance-optimal” algorithm design when the pred ictions are accurate and the algorithm’s performance gracefully degrades w hen there is error in the prediction.

\nThe talk will apply this fra mework to online algorithm design and give algorithms with theoretical per formance that goes beyond worst-case analysis. The majority of the talk wi ll focus on load balancing with restricted assignments.

DTSTART;TZID=America/New_York:20210224T120000 DTEND;TZID=America/New_York:20210224T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Thomas Lavastida URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-thomas-lava stida/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-349@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Hung Le

\nAffiliation: University of Massachus
etts\, Amherst

Title: Reliable Spanners: Locality-Sensitive Orderi ngs Strike Back

\nAbstract:

\nA highly desirable property of n
etworks is robustness to failures.

\nConsider a metric space $(X\,d_X
)$\, a graph $H$ over $X$ is a $\\vartheta$-reliable $t$-spanner if\, for
every set of failed vertices $B\\subset X$\, there is a superset $B^+\\sup
seteq B$ such that the induced subgraph $H[X\\setminus B]$ preserves all t
he distances between points in $X\\setminus B^+$ up to a stretch factor $t
$\, while the expected size of $B^+$ is as most $(1+\\vartheta)|B|$. Such
a spanner could withstand a catastrophe: failure of even $90\\%$ of the ne
twork.

Buchin\, Har-Peled\, and Olah showed how to construct a spa rse reliable spanner for Euclidean space from Euclidean locality-sensitive orderings\, an object introduced by Chan\, Har-Peled\, and Jones. In this talk\, we extend their approach to non-Euclidean metric spaces by general izing the ordering of Chan\, Har-Peled\, and Jones to doubling metrics and introducing new types of locality-sensitive orderings for other metric sp aces. We also show how to construct reliable spanners from the newly intro duced locality-sensitive orderings via reliable 2-hop spanners for paths. The highlight of our results is that the number of edges in our spanner ha s no dependency on the spread.

DTSTART;TZID=America/New_York:20210303T120000 DTEND;TZID=America/New_York:20210303T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Hung Le URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-hung-le/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-356@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Teodor Marinov

\nAffiliation: Johns Hopkins Un
iversity

Title: Beyond Value-Function Gaps: Improved Instance-Depe ndent Regret Bounds for Episodic Reinforcement Learning

\nAbstract:< br />\nReinforcement Learning (RL) is a general scenario where agents inte ract with the environment to achieve some goal. The environment and an age nt’s interactions are typically modeled as a Markov decision process (MDP) \, which can represent a rich variety of tasks. But\, for which MDPs can a n agent or an RL algorithm succeed? This requires a theoretical analysis o f the complexity of an MDP. In this talk I will present information-theore tic lower bounds for a large class of MDPs. The lower bounds are based on studying a certain semi-infinite LP. Further\, I will show that existing a lgorithms enjoy tighter gap-dependent regret bounds (similar to the stocha stic multi-armed bandit problem)\, however\, they are unable to achieve th e information-theoretic lower bounds even in deterministic transition MDPs \, unless there is a unique optimal policy.

DTSTART;TZID=America/New_York:20210310T120000 DTEND;TZID=America/New_York:20210310T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Teodor Marinov URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-teodor-mari nov/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-355@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Dominik Kempa

\nAffiliation: Johns Hopkins Uni
versity

Title: How to store massive sequence collections in a sear chable form

\nAbstract:

\nCompressed indexing is concerned with
the design and construction of data structures to store massive sequences
in space close to the size of compressed data\, while simultaneously prov
iding searching functionality (such as pattern matching) on the original u
ncompressed data. These indexes have a huge impact on the analysis of larg
e-scale DNA databases (containing hundreds of thousands of genomes) but un
til recently the size of many indexes lacked theoretical guarantees and th
eir construction remains a computational bottleneck.

In this talk I will describe my work proving theoretical guarantees on index size as a function of compressed data size\, resolving a key open question in this f ield. Related to that\, I will also describe new algorithms for converting between two conceptually distinct compressed representations\, Lempel-Ziv and the Burrows-Wheeler Transform. I will show how these new findings ena ble advanced computation directly on compressed data. I will conclude by d escribing avenues for future work\, including the new algorithms for the c onstruction of compressed indexes that have the potential to cut indexing time by further orders of magnitude\, unlocking the ability to search trul y enormous text or DNA datasets.

DTSTART;TZID=America/New_York:20210317T120000 DTEND;TZID=America/New_York:20210317T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Dominik Kempa URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-dominik-kem pa/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-353@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Audra McMillan

\nAffiliation: Apple

Ti tle: Hiding among the clones: a simple and nearly optimal analysis of priv acy amplification by shuffling

\nAbstract:

\nDifferential priva
cy (DP) is a model of privacy-preserving machine learning that has garnere
d significant interest in recent years due to its rigorous privacy guarant
ees. An algorithm differentially private if the output is stable under sma
ll changes in the input database. While DP has been adopted in a variety o
f applications\, most applications of DP in industry actually satisfy a st
ronger notion called local differential privacy. In local differential pri
vacy data subjects perturb their data before it reaches the data analyst.
While this requires less trust\, it comes a substantial cost to accuracy.
Recent work of Erlingsson\, Feldman\, Mironov\, Raghunathan\, Talwar\, and
Thakurta [EFMRTT19] demonstrated that random shuffling amplifies differen
tial privacy guarantees of locally randomized data. Such amplification imp
lies substantially stronger privacy guarantees for systems in which data i
s contributed anonymously [BEMMRLRKTS17] and has led to significant intere
st in the shuffle model of privacy [CSUZZ19\, EFMRTT19]. In this talk\, we
will discuss a new result on privacy amplification by shuffling\, which a
chieves the asymptotically optimal dependence in the local privacy paramet
er. Our result is based on a new proof strategy which is simpler than prev
ious approaches\, and extends to a lightly weaker notion known as approxim
ate differential privacy with nearly the same guarantees. Based on joint w
ork with Vitaly Feldman and Kunal Talwar (https://arxiv.org/abs/2012.12803
)

Speaker: Maryam Negahbani

\nAffiliation: Dartmouth Univ
ersity

Title: “Revisiting Priority k-Center: Fairness and Outliers

\nAbstract:

\nClustering is a fundamental unsupervised learnin
g and facility location problem extensively studied in the literature. I w
ill talk about a clustering problem called “priority k-center” introduced
by Plesnik (in Disc. Appl. Math. 1987). Given a metric space on n points
X\, with distance function d\, an integer k\, and radius r_v for each poin
t v in X\, the goal is to choose k points S as “centers” to minimize the m
aximum distance of a point v to S divided by r_v. For uniform r_v’s this i
s precisely the “k-center” problem where the objective is to minimize the
maximum distance of any point to S. In the priority version\, points with
smaller r_v are prioritized to be closer to S. Recently\, a special case o
f this problem was studied in the context of “individually fair clustering
” by Jung et al.\, FORC 2020. This notion of fairness forces S to open a
center in every “densely populated area” by setting r_v to be v’s distance
to its closest (n/k)-th neighbor.

In this talk\, I show how to ap proximate priority k-center with outliers: When for a given integer z\, yo u are allowed to throw away z points as outliers and minimize the objectiv e over the rest of the points. We show there is 9-approximation\, which is morally a 5\, if you have constant many types of radii or if your radii a re powers of 2. This is via an LP-aware reduction to min-cost max-flow and is general enough that could handle Matroid constraints on facilities (wh ere instead of asking to pick at most k facilities\, you are asked to pick facilities that are independent in a given matroid). Things become quite interesting for priority knapsack-center with outliers: where opening each center costs something and you have a limited budget to spend on your sol ution S. In this case\, we do not know how to solve the corresponding flow problem\, so we alter our reduction to reduce to a simpler problem we do know how to solve taking a hit of +5 in the approximation ratio. There are still many open problems in this work\, in addition to solving the flow p roblem in the knapsack case\, the best LP integrality gap we know for prio rity k-center with outliers is 3.

DTSTART;TZID=America/New_York:20210331T120000 DTEND;TZID=America/New_York:20210331T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Maryam Negahbani URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-maryam-nega hbani/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-346@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Leonidas Tsepenekas

\nAffiliation: University
of Maryland

Title: Approximating Two-Stage Stochastic Supplier Pro blems

\nAbstract:

\nThe main focus of this talk will be radius-
based (supplier) clustering in the two-stage stochastic setting with recou
rse\, where the inherent stochasticity of the model comes in the form of a
budget constraint. Our eventual goal is to provide results in the most ge
neral distributional setting\, where there is only black-box access to the
underlying distribution. To that end\, we follow a two-step approach. Fir
st\, we develop algorithms for a restricted version of the problem\, in wh
ich all possible scenarios are explicitly provided\; second\, we employ a
novel scenario-discarding variant of the standard Sample Average Approxima
tion (SAA) method\, in which we also crucially exploit structural properti
es of the algorithms developed for the first step of the framework. In thi
s way\, we manage to generalize the results of the latter to the black-box
model. Finally\, we note that the scenario-discarding modification to the
SAA method is necessary in order to optimize over the radius.

Pap er: https://arxiv.org/abs/2008.03325

DTSTART;TZID=America/New_York:20210407T120000 DTEND;TZID=America/New_York:20210407T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Leonidas Tsepenekas URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-leonidas-ts epenekas/ X-COST-TYPE:free END:VEVENT BEGIN:VEVENT UID:ai1ec-358@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20221208T031335Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Samson Zhou

\nAffiliation: Carnegie Mellon Uni
versity

Title: Tight Bounds for Adversarially Robust Streams and S liding Windows via Difference Estimators

\nAbstract:

\nWe intro
duce *difference estimators* for data stream computation\, which pr
ovide approximations to F(v)-F(u) for frequency vectors v\,u and a given f
unction F. We show how to use such estimators to carefully trade error for
memory in an iterative manner. The function F is generally non-linear\, a
nd we give the first difference estimators for the frequency moments F_p f
or p between 0 and 2\, as well as for integers p>2. Using these\, we resol
ve a number of central open questions in adversarial robust streaming and
sliding window models.

For both models\, we obtain algorithms for norm estimation whose dependence on epsilon is 1/epsilon^2\, which shows\, up to logarithmic factors\, that there is no overhead over the standard i nsertion-only data stream model for these problems.

DTSTART;TZID=America/New_York:20210414T120000 DTEND;TZID=America/New_York:20210414T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Samson Zhou URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-samson-zhou -3/ X-COST-TYPE:free END:VEVENT END:VCALENDAR