Speaker: Jasper Lee

Affiliation: Brown University

Title: Real-Valued Sub-Gaussian Mean Estimation, Optimal to a (1+o(1)) Multiplicative Factor

Abstract:

We revisit one of the most fundamental problems in statistics: given access to independent samples from a 1D random variable (with finite but unknown mean and variance), what is the best way to estimate the mean, in terms of error convergence with respect to sample size? The conventional wisdom is to use the sample mean as our estimate. However it is known that the sample mean has optimal convergence if and only if the underlying distribution is (sub-)Gaussian. Moreover, it can even be exponentially slower than optimal for certain heavy-tailed distributions. On the other hand, the median-of-means estimator (invented and reinvented in various literature) does have sub-Gaussian convergence for all finite-variance distributions, albeit in the big-O sense with a sub-optimal multiplicative constant. The natural remaining question then, is whether it is possible to bridge the gap, to have an estimator that has optimal sub-Gaussian concentration with an optimal constant, for all finite-variance distributions.

In this talk, we answer the question affirmatively by giving an estimator that converges with the optimal constant inside the big-O, up to a (1+o(1)) multiplicative factor. Our estimator is furthermore computable in time linear in the sample size. The convergence analysis involves deriving tail bounds using tools from linear and convex programming, which may be of independent interest.

Joint work with Paul Valiant.