Speaker: Jasp
er Lee

\nAffiliation: Brown University

Title: Real-Valued Sub -Gaussian Mean Estimation\, Optimal to a (1+o(1)) Multiplicative Factor

\nAbstract:

\nWe revisit one of the most fundamental problems in
statistics: given access to independent samples from a 1D random variable
(with finite but unknown mean and variance)\, what is the best way to esti
mate the mean\, in terms of error convergence with respect to sample size?
The conventional wisdom is to use the sample mean as our estimate. Howeve
r it is known that the sample mean has optimal convergence if and only if
the underlying distribution is (sub-)Gaussian. Moreover\, it can even be e
xponentially slower than optimal for certain heavy-tailed distributions. O
n the other hand\, the median-of-means estimator (invented and reinvented
in various literature) does have sub-Gaussian convergence for all finite-v
ariance distributions\, albeit in the big-O sense with a sub-optimal multi
plicative constant. The natural remaining question then\, is whether it is
possible to bridge the gap\, to have an estimator that has optimal sub-Ga
ussian concentration with an optimal constant\, for all finite-variance di
stributions.

In this talk\, we answer the question affirmatively b y giving an estimator that converges with the optimal constant inside the big-O\, up to a (1+o(1)) multiplicative factor. Our estimator is furthermo re computable in time linear in the sample size. The convergence analysis involves deriving tail bounds using tools from linear and convex programmi ng\, which may be of independent interest.

\nJoint work with Paul Va liant.

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-324@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Edinah Gnang\nAffiliation: Johns Hopkins University\nT itle: On the Kotzig-Ringel-Rosa conjecture\nAbstract:\nIn this talk we des cribe and motivate the K.R.R. conjecture graph partitioning and describe a functional approach enabling us to tackle the K.R.R. conjecture via a new composition lemma. If time permits I will also discuss algorithmic aspec ts. DTSTART;TZID=America/New_York:20200923T120000 DTEND;TZID=America/New_York:20200923T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Edinah Gnang URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-edinah-gnan g/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Edin
ah Gnang

\nAffiliation: Johns Hopkins University

\nTitle: On the
Kotzig-Ringel-Rosa conjecture

Abstract:

\nIn this talk we de
scribe and motivate the K.R.R. conjecture graph partitioning and describe
a functional approach enabling us to tackle the K.R.R. conjecture via a ne
w composition lemma. If time permits I will also discuss algorithmic aspe
cts.

Speaker: Adit
ya Krishnan

\nAffiliation: Johns Hopkins University

Title: Sc hatten Norms in Matrix Streams: The Role of Sparsity.

\nAbstract: Spectral functions of large matrices contain important structural informa tion about the underlying data\, and are thus becoming increasingly import ant to efficiently compute. Many times\, large matrices representing real- world data are \\emph{sparse} or \\emph{doubly sparse} (i.e.\, sparse in b oth rows and columns)\, and are accessed as a \\emph{stream} of updates\, typically organized in \\emph{row-order}. In this setting\, where space (m emory) is the limiting resource\, all known algorithms require space that is polynomial in the dimension of the matrix\, even for sparse matrices. W e address this challenge by providing the first algorithms whose space req uirement is \\emph{independent of the matrix dimension}\, assuming the mat rix is doubly-sparse and presented in row-order.

\nIn addition\, we prove that multiple passes are unavoidable in this setting and show exten sions of our primary technique\, including a trade-off between space requi rements and number of passes. Our algorithms approximate the Schatten p-no rms\, which we use in turn to approximate other spectral functions\, such as logarithm of the determinant\, trace of matrix inverse\, and Estrada in dex.

\nJoint work with Vladimir Braverman\, Robert Krauthgamer and Roi Sinoff.

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-331@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Brian Brubach\nAffiliation: Wellesley College\nTitle: Online matching under three layers of uncertainty\nAbstract:\nOnline match ing problems have become ubiquitous with the rise of the internet and e-co mmerce. From the humble beginnings of a single problem 30 years ago\, the theoretical study of online matching now encompasses dozens of variations inspired by diverse applications. We’ll take a tour through the landscape of online matching problems. As we go\, we’ll venture deeper and deeper in to the jungle of stochasticity and uncertainty. Finally\, we’ll cover some very recent work introducing new algorithms and models. Along the way\, I ’ll highlight techniques and tricks I’ve found useful in studying these pr oblems. DTSTART;TZID=America/New_York:20201021T120000 DTEND;TZID=America/New_York:20201021T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Brian Brubach URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-brian-bruba ch/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Bria
n Brubach

\nAffiliation: Wellesley College

Title: Online matc hing under three layers of uncertainty

\nAbstract:

\nOnline mat
ching problems have become ubiquitous with the rise of the internet and e-
commerce. From the humble beginnings of a single problem 30 years ago\, th
e theoretical study of online matching now encompasses dozens of variation
s inspired by diverse applications. We’ll take a tour through the landscap
e of online matching problems. As we go\, we’ll venture deeper and deeper
into the jungle of stochasticity and uncertainty. Finally\, we’ll cover so
me very recent work introducing new algorithms and models. Along the way\,
I’ll highlight techniques and tricks I’ve found useful in studying these
problems.

Speaker: Josh
ua Grochow

\nAffiliation: University of Colorado

Title: Isomo rphism of tensors\, algebras\, and polynomials\, oh my!

\nAbstract:
We consider the problems of testing isomorphism of tensors\, p-groups\, cu
bic polynomials\, quantum states\, algebras\, and more\, which arise from
a variety of areas\, including machine learning\, group theory\, and crypt
ography. Despite Graph Isomorphism now being in quasi-polynomial time\, an
d having long had efficient practical software\, for the problems we consi
der no algorithm is known that is asymptotically better than brute force\,
and state-of-the-art software cannot get beyond small instances. We appro
ach this situation in two ways:

\n – Complexity-theoretic: we show th
at all these problems are polynomial-time equivalent\, giving rise to a cl
ass of problems we call Tensor Isomorphism-complete (TI-complete). Perhaps
surprising here is that we show that isomorphism of d-tensors for any fix
ed d (at least 3) is equivalent to testing isomorphism of 3-tensors. These
equivalences let us focus on just one problem rather than dozens\, or a w
hole infinite hierarchy\, separately.

\n – Algorithmic: Adapting the
Weisfeiler-Leman method from Graph Isomorphism to tensors\, trying to unde
rstand tensor isomorphism by taking advantage of isomorphisms of small sub
-tensors. This leads to tensor analogues of the Graph Reconstruction conje
cture and related questions.

Based on joint works with Vyacheslav V. Futorny and Vladimir V. Sergeichuk (Lin. Alg. Appl.\, 2019\; arXiv:1810 .09219)\, with Peter A. Brooksbank\, Yinan Li\, Youming Qiao\, and James B . Wilson (arXiv:1905.02518)\, and with Youming Qiao (arXiv:1907.00309).

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-338@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Jingfeng Wu\nAffiliation: Johns Hopkins University\nTi tle: Direction Matters: On the Implicit Regularization Effect of Stochasti c Gradient Descent with Moderate Learning Rate\nAbstract:\nUnderstanding t he algorithmic regularization effect of stochastic gradient descent (SGD) is one of the key challenges in modern machine learning and deep learning theory. Most of the existing works\, however\, focus on very small or even infinitesimal learning rate regime\, and fail to cover practical scenario s where the learning rate is moderate and annealing. In this paper\, we ma ke an initial attempt to characterize the particular regularization effect of SGD in the moderate learning rate regime by studying its behavior for optimizing an overparameterized linear regression problem. In this case\, SGD and GD are known to converge to the unique minimum-norm solution\; how ever\, with the moderate and annealing learning rate\, we show that they e xhibit different directional bias: SGD converges along the large eigenvalu e directions of the data matrix\, while GD goes after the small eigenvalue directions. Furthermore\, we show that such directional bias does matter when early stopping is adopted\, where the SGD output is nearly optimal bu t the GD output is suboptimal. Finally\, our theory explains several folk arts in practice used for SGD hyperparameter tuning\, such as (1) linearly scaling the initial learning rate with batch size\; and (2) overrunning S GD with high learning rate even when the loss stops decreasing. DTSTART;TZID=America/New_York:20201104T120000 DTEND;TZID=America/New_York:20201104T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Jingfeng Wu URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-jingfeng-wu / X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Jing
feng Wu

\nAffiliation: Johns Hopkins University

Title: Direct ion Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate

\nAbstract:

\nUnderstanding
the algorithmic regularization effect of stochastic gradient descent (SGD
) is one of the key challenges in modern machine learning and deep learnin
g theory. Most of the existing works\, however\, focus on very small or ev
en infinitesimal learning rate regime\, and fail to cover practical scenar
ios where the learning rate is moderate and annealing. In this paper\, we
make an initial attempt to characterize the particular regularization effe
ct of SGD in the moderate learning rate regime by studying its behavior fo
r optimizing an overparameterized linear regression problem. In this case\
, SGD and GD are known to converge to the unique minimum-norm solution\; h
owever\, with the moderate and annealing learning rate\, we show that they
exhibit different directional bias: SGD converges along the large eigenva
lue directions of the data matrix\, while GD goes after the small eigenval
ue directions. Furthermore\, we show that such directional bias does matte
r when early stopping is adopted\, where the SGD output is nearly optimal
but the GD output is suboptimal. Finally\, our theory explains several fol
k arts in practice used for SGD hyperparameter tuning\, such as (1) linear
ly scaling the initial learning rate with batch size\; and (2) overrunning
SGD with high learning rate even when the loss stops decreasing.

Speaker: Arno
ld Filtser

\nAffiliation: Columbia University

Title: Scatteri
ng and Sparse Partitions\, and their Applications

\nAbstract:

\n
A partition $\\mathcal{P}$ of a weighted graph $G$ is $(\\sigma\,\\tau\,\\
Delta)$-sparse if every cluster has diameter at most $\\Delta$\, and every
ball of radius $\\Delta/\\sigma$ intersects at most $\\tau$ clusters. Si
milarly\, $\\mathcal{P}$ is $(\\sigma\,\\tau\,\\Delta)$-scattering if inst
ead for balls we require that every shortest path of length at most $\\Del
ta/\\sigma$ intersects at most $\\tau$ clusters. Given a graph $G$ that a
dmits a $(\\sigma\,\\tau\,\\Delta)$-sparse partition for all $\\Delta>0$\,
Jia et al. [STOC05] constructed a solution for the Universal Steiner Tree
problem (and also Universal TSP) with stretch $O(\\tau\\sigma^2\\log_\\ta
u n)$. Given a graph $G$ that admits a $(\\sigma\,\\tau\,\\Delta)$-scatte
ring partition for all $\\Delta>0$\, we construct a solution for the Stein
er Point Removal problem with stretch $O(\\tau^3\\sigma^3)$. We then cons
truct sparse and scattering partitions for various different graph familie
s\, receiving new results for the Universal Steiner Tree and Steiner Point
Removal problems.

Speaker: Yu Z
heng

\nAffiliation: Johns Hopkins University

Title: Space Eff icient Deterministic Approximation of String Measures

\nAbstract:

\nWe study approximation algorithms for the following three string meas
ures that are widely used in practice: edit distance (ED)\, longest common
subsequence (LCS)\, and longest increasing sequence (LIS). All three prob
lems can be solved exactly by standard algorithms that run in polynomial t
ime with roughly $\\Theta(n)$ space\, where $n$ is the input length\, and
our goal is to design deterministic approximation algorithms that run in p
olynomial time with significantly smaller space.

Towards this\, we design several algorithms that achieve $1+\\eps$ or $1-\\eps$ approximati on for all three problems\, where $\\eps>0$ can be any constant and even s lightly sub constant. Our algorithms are flexible and can be adjusted to a chieve the following two regimes of parameters: 1) space $n^{\\delta}$ for any constant $\\delta>0$ with running time essentially the same as or sli ghtly more than the standard algorithms\; and 2) space $\\mathsf{polylog} (n)$ with (a larger) polynomial running time\, which puts the approximatio n versions of the three problems in Steve’s class (SC). Our algorithms sig nificantly improve previous results in terms of space complexity\, where a ll known results need to use space at least $\\Omega(\\sqrt{n})$. Some of our algorithms can also be adapted to work in the asymmetric streaming mod el [SS13]\, and output the corresponding sequence. Furthermore\, our resul ts can be used to improve a recent result by Farhadi et. al. [FHRS20] abou t approximating ED in the asymmetric streaming model\, reducing the runnin g time from being exponential in [FHRS20] to a polynomial.

\nOur al gorithms are based on the idea of using recursion as in Savitch’s theorem [Sav70]\, and a careful adaption of previous techniques to make the recurs ion work. Along the way we also give a new logspace reduction from longest common subsequence to longest increasing sequence\, which may be of indep endent interest.

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-341@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Xuan Wu\nAffiliation: Johns Hopkins University\nTitle: Coreset for Clustering in Graph Metrics.\nAbstract:\nClustering is a fund amental task in machine learning. As the increasing demand for running mac hine learning algorithms in the huge data sets\, classic clustering algori thms were found not to scale well. To this end\, coreset is introduced as a powerful data reduction technique that turns a huge dataset into a tiny proxy. Coresets have been also successfully applied to streaming and distr ibuted computing. Coresets for clustering in Euclidean spaces have been v ery well studied. However\, very few results were known about the non-Eucl idean space. Beyond Euclidean\, graph metrics is a very important family o f metric space. In this talk\, I will cover my recent work on coreset for k-clustering in graph metrics\, including bounded treewidth graph and excl uded-minor graph. DTSTART;TZID=America/New_York:20201209T120000 DTEND;TZID=America/New_York:20201209T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Xuan Wu URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-xuan-wu-2/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Xuan
Wu

\nAffiliation: Johns Hopkins University

Title: Coreset fo r Clustering in Graph Metrics.

\nAbstract:

\nClustering is a fu
ndamental task in machine learning. As the increasing demand for running m
achine learning algorithms in the huge data sets\, classic clustering algo
rithms were found not to scale well. To this end\, coreset is introduced a
s a powerful data reduction technique that turns a huge dataset into a tin
y proxy. Coresets have been also successfully applied to streaming and dis
tributed computing. Coresets for clustering in Euclidean spaces have been
very well studied. However\, very few results were known about the non-Eu
clidean space. Beyond Euclidean\, graph metrics is a very important family
of metric space. In this talk\, I will cover my recent work on coreset fo
r k-clustering in graph metrics\, including bounded treewidth graph and ex
cluded-minor graph.

Speaker: Enay
at Ullah

\nAffiliation: Johns Hopkins University

Title: Machi ne unlearning via algorithmic stability

\nAbstract: We study the pro blem of machine unlearning\, and identify a notion of algorithmic stabilit y\, Total Variation (TV) stability\, which we argue\, is suitable for the goal of exact efficient unlearning. For convex risk minimization problems\ , we design TV-stable algorithms based on noisy Stochastic Gradient Descen t (SGD). Our key contribution is the design of corresponding efficient unl earning algorithms\, which are based on constructing a (maximal) coupling of Markov chains for the noisy SGD procedure. To understand the trade-offs between accuracy and unlearning efficiency\, we give upper and lower boun ds on excess empirical and population risk of TV stable algorithms for con vex risk minimization. Our techniques generalize to arbitrary non-convex f unctions\, and our algorithms are differentially private as well.

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-348@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Thomas Lavastida\nAffiliation: Carnegie Mellon Univers ity\nTitle: Combinatorial Optimization Augmented with Machine Learning\nAb stract:\nCombinatorial optimization often focuses on optimizing for the wo rst-case. However\, the best algorithm to use depends on the “relevant inp uts”\, which is application specific and often does not have a formal defi nition. \nThe talk gives a new theoretical model for designing algorithms that are tailored to inputs for the application at hand. In the model\, l earning is performed on past problem instances to make predictions on futu re instances. These predictions are incorporated into the design and analy sis of the algorithm. The predictions can be used to achieve “instance-op timal” algorithm design when the predictions are accurate and the algorith m’s performance gracefully degrades when there is error in the prediction. \nThe talk will apply this framework to online algorithm design and give a lgorithms with theoretical performance that goes beyond worst-case analysi s. The majority of the talk will focus on load balancing with restricted a ssignments. DTSTART;TZID=America/New_York:20210224T120000 DTEND;TZID=America/New_York:20210224T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Thomas Lavastida URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-thomas-lava stida/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Thom
as Lavastida

\nAffiliation: Carnegie Mellon University

Title: Combinatorial Optimization Augmented with Machine Learning

\nAbstra ct:

\nCombinatorial optimization often focuses on optimizing for the worst-case. However\, the best algorithm to use depends on the “relevant inputs”\, which is application specific and often does not have a formal d efinition.

\nThe talk gives a new theoretical model for designing algorithms that are tailored to inputs for the application at hand. In the model\, learning is performed on past problem instances to make predictio ns on future instances. These predictions are incorporated into the design and analysis of the algorithm. The predictions can be used to achieve “i nstance-optimal” algorithm design when the predictions are accurate and th e algorithm’s performance gracefully degrades when there is error in the p rediction.

\nThe talk will apply this framework to online algorithm design and give algorithms with theoretical performance that goes beyond w orst-case analysis. The majority of the talk will focus on load balancing with restricted assignments.

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-349@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Hung Le\nAffiliation: University of Massachusetts\, Am herst\nTitle: Reliable Spanners: Locality-Sensitive Orderings Strike Back \nAbstract:\nA highly desirable property of networks is robustness to fail ures.\nConsider a metric space $(X\,d_X)$\, a graph $H$ over $X$ is a $\\v artheta$-reliable $t$-spanner if\, for every set of failed vertices $B\\su bset X$\, there is a superset $B^+\\supseteq B$ such that the induced subg raph $H[X\\setminus B]$ preserves all the distances between points in $X\\ setminus B^+$ up to a stretch factor $t$\, while the expected size of $B^+ $ is as most $(1+\\vartheta)|B|$. Such a spanner could withstand a catastr ophe: failure of even $90\\%$ of the network.\nBuchin\, Har-Peled\, and Ol ah showed how to construct a sparse reliable spanner for Euclidean space f rom Euclidean locality-sensitive orderings\, an object introduced by Chan\ , Har-Peled\, and Jones. In this talk\, we extend their approach to non-Eu clidean metric spaces by generalizing the ordering of Chan\, Har-Peled\, a nd Jones to doubling metrics and introducing new types of locality-sensiti ve orderings for other metric spaces. We also show how to construct reliab le spanners from the newly introduced locality-sensitive orderings via rel iable 2-hop spanners for paths. The highlight of our results is that the n umber of edges in our spanner has no dependency on the spread. DTSTART;TZID=America/New_York:20210303T120000 DTEND;TZID=America/New_York:20210303T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Hung Le URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-hung-le/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Hung
Le

\nAffiliation: University of Massachusetts\, Amherst

Titl e: Reliable Spanners: Locality-Sensitive Orderings Strike Back

\nAb
stract:

\nA highly desirable property of networks is robustness to fa
ilures.

\nConsider a metric space $(X\,d_X)$\, a graph $H$ over $X$ i
s a $\\vartheta$-reliable $t$-spanner if\, for every set of failed vertice
s $B\\subset X$\, there is a superset $B^+\\supseteq B$ such that the indu
ced subgraph $H[X\\setminus B]$ preserves all the distances between points
in $X\\setminus B^+$ up to a stretch factor $t$\, while the expected size
of $B^+$ is as most $(1+\\vartheta)|B|$. Such a spanner could withstand a
catastrophe: failure of even $90\\%$ of the network.

Buchin\, Har -Peled\, and Olah showed how to construct a sparse reliable spanner for Eu clidean space from Euclidean locality-sensitive orderings\, an object intr oduced by Chan\, Har-Peled\, and Jones. In this talk\, we extend their app roach to non-Euclidean metric spaces by generalizing the ordering of Chan\ , Har-Peled\, and Jones to doubling metrics and introducing new types of l ocality-sensitive orderings for other metric spaces. We also show how to c onstruct reliable spanners from the newly introduced locality-sensitive or derings via reliable 2-hop spanners for paths. The highlight of our result s is that the number of edges in our spanner has no dependency on the spre ad.

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-356@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Teodor Marinov\nAffiliation: Johns Hopkins University \nTitle: Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bo unds for Episodic Reinforcement Learning\nAbstract:\nReinforcement Learnin g (RL) is a general scenario where agents interact with the environment to achieve some goal. The environment and an agent’s interactions are typica lly modeled as a Markov decision process (MDP)\, which can represent a ric h variety of tasks. But\, for which MDPs can an agent or an RL algorithm s ucceed? This requires a theoretical analysis of the complexity of an MDP. In this talk I will present information-theoretic lower bounds for a large class of MDPs. The lower bounds are based on studying a certain semi-infi nite LP. Further\, I will show that existing algorithms enjoy tighter gap- dependent regret bounds (similar to the stochastic multi-armed bandit prob lem)\, however\, they are unable to achieve the information-theoretic lowe r bounds even in deterministic transition MDPs\, unless there is a unique optimal policy. DTSTART;TZID=America/New_York:20210310T120000 DTEND;TZID=America/New_York:20210310T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Teodor Marinov URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-teodor-mari nov/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Teod
or Marinov

\nAffiliation: Johns Hopkins University

Title: Bey ond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Epi sodic Reinforcement Learning

\nAbstract:

\nReinforcement Learni
ng (RL) is a general scenario where agents interact with the environment t
o achieve some goal. The environment and an agent’s interactions are typic
ally modeled as a Markov decision process (MDP)\, which can represent a ri
ch variety of tasks. But\, for which MDPs can an agent or an RL algorithm
succeed? This requires a theoretical analysis of the complexity of an MDP.
In this talk I will present information-theoretic lower bounds for a larg
e class of MDPs. The lower bounds are based on studying a certain semi-inf
inite LP. Further\, I will show that existing algorithms enjoy tighter gap
-dependent regret bounds (similar to the stochastic multi-armed bandit pro
blem)\, however\, they are unable to achieve the information-theoretic low
er bounds even in deterministic transition MDPs\, unless there is a unique
optimal policy.

Speaker: Domi
nik Kempa

\nAffiliation: Johns Hopkins University

Title: How to store massive sequence collections in a searchable form

\nAbstrac
t:

\nCompressed indexing is concerned with the design and constructio
n of data structures to store massive sequences in space close to the size
of compressed data\, while simultaneously providing searching functionali
ty (such as pattern matching) on the original uncompressed data. These ind
exes have a huge impact on the analysis of large-scale DNA databases (cont
aining hundreds of thousands of genomes) but until recently the size of ma
ny indexes lacked theoretical guarantees and their construction remains a
computational bottleneck.

In this talk I will describe my work pro ving theoretical guarantees on index size as a function of compressed data size\, resolving a key open question in this field. Related to that\, I w ill also describe new algorithms for converting between two conceptually d istinct compressed representations\, Lempel-Ziv and the Burrows-Wheeler Tr ansform. I will show how these new findings enable advanced computation di rectly on compressed data. I will conclude by describing avenues for futur e work\, including the new algorithms for the construction of compressed i ndexes that have the potential to cut indexing time by further orders of m agnitude\, unlocking the ability to search truly enormous text or DNA data sets.

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-353@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Audra McMillan\nAffiliation: Apple\nTitle: Hiding amon g the clones: a simple and nearly optimal analysis of privacy amplificatio n by shuffling\nAbstract:\nDifferential privacy (DP) is a model of privacy -preserving machine learning that has garnered significant interest in rec ent years due to its rigorous privacy guarantees. An algorithm differentia lly private if the output is stable under small changes in the input datab ase. While DP has been adopted in a variety of applications\, most applica tions of DP in industry actually satisfy a stronger notion called local di fferential privacy. In local differential privacy data subjects perturb th eir data before it reaches the data analyst. While this requires less trus t\, it comes a substantial cost to accuracy. Recent work of Erlingsson\, F eldman\, Mironov\, Raghunathan\, Talwar\, and Thakurta [EFMRTT19] demonstr ated that random shuffling amplifies differential privacy guarantees of lo cally randomized data. Such amplification implies substantially stronger p rivacy guarantees for systems in which data is contributed anonymously [BE MMRLRKTS17] and has led to significant interest in the shuffle model of pr ivacy [CSUZZ19\, EFMRTT19]. In this talk\, we will discuss a new result on privacy amplification by shuffling\, which achieves the asymptotically op timal dependence in the local privacy parameter. Our result is based on a new proof strategy which is simpler than previous approaches\, and extends to a lightly weaker notion known as approximate differential privacy with nearly the same guarantees. Based on joint work with Vitaly Feldman and K unal Talwar (https://arxiv.org/abs/2012.12803) DTSTART;TZID=America/New_York:20210324T120000 DTEND;TZID=America/New_York:20210324T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Audra McMillan URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-audra-mcmil lan/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Audr
a McMillan

\nAffiliation: Apple

Title: Hiding among the clone s: a simple and nearly optimal analysis of privacy amplification by shuffl ing

\nAbstract:

\nDifferential privacy (DP) is a model of priva
cy-preserving machine learning that has garnered significant interest in r
ecent years due to its rigorous privacy guarantees. An algorithm different
ially private if the output is stable under small changes in the input dat
abase. While DP has been adopted in a variety of applications\, most appli
cations of DP in industry actually satisfy a stronger notion called local
differential privacy. In local differential privacy data subjects perturb
their data before it reaches the data analyst. While this requires less tr
ust\, it comes a substantial cost to accuracy. Recent work of Erlingsson\,
Feldman\, Mironov\, Raghunathan\, Talwar\, and Thakurta [EFMRTT19] demons
trated that random shuffling amplifies differential privacy guarantees of
locally randomized data. Such amplification implies substantially stronger
privacy guarantees for systems in which data is contributed anonymously [
BEMMRLRKTS17] and has led to significant interest in the shuffle model of
privacy [CSUZZ19\, EFMRTT19]. In this talk\, we will discuss a new result
on privacy amplification by shuffling\, which achieves the asymptotically
optimal dependence in the local privacy parameter. Our result is based on
a new proof strategy which is simpler than previous approaches\, and exten
ds to a lightly weaker notion known as approximate differential privacy wi
th nearly the same guarantees. Based on joint work with Vitaly Feldman and
Kunal Talwar (https://arxiv.org/abs/2012.12803)

Speaker: Mary
am Negahbani

\nAffiliation: Dartmouth University

Title: “Revi siting Priority k-Center: Fairness and Outliers

\nAbstract:

\nC
lustering is a fundamental unsupervised learning and facility location pro
blem extensively studied in the literature. I will talk about a clustering
problem called “priority k-center” introduced by Plesnik (in Disc. Appl.
Math. 1987). Given a metric space on n points X\, with distance function
d\, an integer k\, and radius r_v for each point v in X\, the goal is to c
hoose k points S as “centers” to minimize the maximum distance of a point
v to S divided by r_v. For uniform r_v’s this is precisely the “k-center”
problem where the objective is to minimize the maximum distance of any poi
nt to S. In the priority version\, points with smaller r_v are prioritized
to be closer to S. Recently\, a special case of this problem was studied
in the context of “individually fair clustering” by Jung et al.\, FORC 20
20. This notion of fairness forces S to open a center in every “densely po
pulated area” by setting r_v to be v’s distance to its closest (n/k)-th ne
ighbor.

In this talk\, I show how to approximate priority k-center with outliers: When for a given integer z\, you are allowed to throw away z points as outliers and minimize the objective over the rest of the poin ts. We show there is 9-approximation\, which is morally a 5\, if you have constant many types of radii or if your radii are powers of 2. This is via an LP-aware reduction to min-cost max-flow and is general enough that cou ld handle Matroid constraints on facilities (where instead of asking to pi ck at most k facilities\, you are asked to pick facilities that are indepe ndent in a given matroid). Things become quite interesting for priority kn apsack-center with outliers: where opening each center costs something and you have a limited budget to spend on your solution S. In this case\, we do not know how to solve the corresponding flow problem\, so we alter our reduction to reduce to a simpler problem we do know how to solve taking a hit of +5 in the approximation ratio. There are still many open problems i n this work\, in addition to solving the flow problem in the knapsack case \, the best LP integrality gap we know for priority k-center with outliers is 3.

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-346@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Leonidas Tsepenekas\nAffiliation: University of Maryla nd\nTitle: Approximating Two-Stage Stochastic Supplier Problems\nAbstract: \nThe main focus of this talk will be radius-based (supplier) clustering i n the two-stage stochastic setting with recourse\, where the inherent stoc hasticity of the model comes in the form of a budget constraint. Our event ual goal is to provide results in the most general distributional setting\ , where there is only black-box access to the underlying distribution. To that end\, we follow a two-step approach. First\, we develop algorithms fo r a restricted version of the problem\, in which all possible scenarios ar e explicitly provided\; second\, we employ a novel scenario-discarding var iant of the standard Sample Average Approximation (SAA) method\, in which we also crucially exploit structural properties of the algorithms develope d for the first step of the framework. In this way\, we manage to generali ze the results of the latter to the black-box model. Finally\, we note tha t the scenario-discarding modification to the SAA method is necessary in o rder to optimize over the radius.\nPaper: https://arxiv.org/abs/2008.03325 DTSTART;TZID=America/New_York:20210407T120000 DTEND;TZID=America/New_York:20210407T130000 LOCATION:https://wse.zoom.us/j/91450299380 SEQUENCE:0 SUMMARY:[Theory Seminar] Leonidas Tsepenekas URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-leonidas-ts epenekas/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Leon
idas Tsepenekas

\nAffiliation: University of Maryland

Title: Approximating Two-Stage Stochastic Supplier Problems

\nAbstract:

\nThe main focus of this talk will be radius-based (supplier) clustering
in the two-stage stochastic setting with recourse\, where the inherent st
ochasticity of the model comes in the form of a budget constraint. Our eve
ntual goal is to provide results in the most general distributional settin
g\, where there is only black-box access to the underlying distribution. T
o that end\, we follow a two-step approach. First\, we develop algorithms
for a restricted version of the problem\, in which all possible scenarios
are explicitly provided\; second\, we employ a novel scenario-discarding v
ariant of the standard Sample Average Approximation (SAA) method\, in whic
h we also crucially exploit structural properties of the algorithms develo
ped for the first step of the framework. In this way\, we manage to genera
lize the results of the latter to the black-box model. Finally\, we note t
hat the scenario-discarding modification to the SAA method is necessary in
order to optimize over the radius.

Paper: https://arxiv.org/abs/2 008.03325

\n END:VEVENT BEGIN:VEVENT UID:ai1ec-358@www.cs.jhu.edu/~mdinitz/theory DTSTAMP:20230327T101959Z CATEGORIES: CONTACT: DESCRIPTION:Speaker: Samson Zhou\nAffiliation: Carnegie Mellon University\n Title: Tight Bounds for Adversarially Robust Streams and Sliding Windows v ia Difference Estimators\nAbstract:\nWe introduce difference estimators fo r data stream computation\, which provide approximations to F(v)-F(u) for frequency vectors v\,u and a given function F. We show how to use such est imators to carefully trade error for memory in an iterative manner. The fu nction F is generally non-linear\, and we give the first difference estima tors for the frequency moments F_p for p between 0 and 2\, as well as for integers p>2. Using these\, we resolve a number of central open questions in adversarial robust streaming and sliding window models.\nFor both model s\, we obtain algorithms for norm estimation whose dependence on epsilon i s 1/epsilon^2\, which shows\, up to logarithmic factors\, that there is no overhead over the standard insertion-only data stream model for these pro blems. DTSTART;TZID=America/New_York:20210414T120000 DTEND;TZID=America/New_York:20210414T130000 SEQUENCE:0 SUMMARY:[Theory Seminar] Samson Zhou URL:https://www.cs.jhu.edu/~mdinitz/theory/event/theory-seminar-samson-zhou -3/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nSpeaker: Sams
on Zhou

\nAffiliation: Carnegie Mellon University

Title: Tigh t Bounds for Adversarially Robust Streams and Sliding Windows via Differen ce Estimators

\nAbstract:

\nWe introduce *difference estimat
ors* for data stream computation\, which provide approximations to F(v
)-F(u) for frequency vectors v\,u and a given function F. We show how to u
se such estimators to carefully trade error for memory in an iterative man
ner. The function F is generally non-linear\, and we give the first differ
ence estimators for the frequency moments F_p for p between 0 and 2\, as w
ell as for integers p>2. Using these\, we resolve a number of central open
questions in adversarial robust streaming and sliding window models.

For both models\, we obtain algorithms for norm estimation whose depe ndence on epsilon is 1/epsilon^2\, which shows\, up to logarithmic factors \, that there is no overhead over the standard insertion-only data stream model for these problems.

\n END:VEVENT END:VCALENDAR