Strong lower bounds for approximating distribution support size and the distinct elements problem

Sofya Raskhodnikova, Dana Ron, Amir Shpilka, Adam Smith

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Scopus citations

Abstract

We consider the problem of approximating the support size of a distribution from a small number of samples, when each element in the distribution appears with probability at least 1/n. This problem is closely related to the problem of approximating the number of distinct elements in a sequence of length n. For both problems, we prove a nearly linear in n lower bound on the query complexity, applicable even for approximation with additive error. At the heart of the lower bound is a construction of two positive integer random variables, $X_1$ and $X_2$, with very different expectations and the following condition on the first k moments: $E[X_1]/E[X_2] = E[X_1^2]/E[X_2^2] = \cdots = E[X_1^k]/E[X_2^k]$. Our lower bound method is also applicable to other problems. In particular, it gives new lower bounds for the sample complexity of (1) approximating the entropy of a distribution and (2) approximating how well a given string is compressed by the Lempel-Ziv scheme.
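The moment condition in the abstract can be made concrete with a small check. The sketch below is illustrative only and is not the construction from the paper: the helper names `moment` and `moment_ratios` and the toy distributions `x1` and `x2` are assumptions chosen for this example. It computes the ratios E[X1^j]/E[X2^j] for j = 1, ..., k exactly, using rational arithmetic, for two finite positive-integer distributions.

```python
from fractions import Fraction

def moment(pmf, j):
    """Return E[X^j] for a finite pmf given as {value: probability}."""
    return sum(Fraction(p) * Fraction(x) ** j for x, p in pmf.items())

def moment_ratios(pmf1, pmf2, k):
    """Return the list of ratios E[X1^j] / E[X2^j] for j = 1, ..., k."""
    return [moment(pmf1, j) / moment(pmf2, j) for j in range(1, k + 1)]

# Toy pair (hypothetical, chosen by hand for this illustration):
# X1 is 1 with probability 3/4 and 3 with probability 1/4; X2 is always 2.
x1 = {1: Fraction(3, 4), 3: Fraction(1, 4)}
x2 = {2: Fraction(1)}

# The first two ratios agree; the third does not, so this pair satisfies
# the condition only for k = 2.
ratios = moment_ratios(x1, x2, 3)
print(ratios)
```

In this toy pair the expectations differ (3/2 versus 2) while the first two moment ratios coincide. The abstract's construction needs the same equality for all of the first k moments together with a much larger gap between the expectations.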

Original language: English (US)
Title of host publication: Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2007
Pages: 559-569
Number of pages: 11
State: Published - 2007
Event: 48th Annual Symposium on Foundations of Computer Science, FOCS 2007 - Providence, RI, United States
Duration: Oct 20 2007 to Oct 23 2007

Publication series

Name: Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
ISSN (Print): 0272-5428

Other

Other: 48th Annual Symposium on Foundations of Computer Science, FOCS 2007
Country/Territory: United States
City: Providence, RI
Period: 10/20/07 to 10/23/07

All Science Journal Classification (ASJC) codes

  • General Engineering
