Also, neat trick related to the Poisson distribution 🐟... if you take n i.i.d. samples from some discrete p, then the # times you see domain elem'ts are not independent (they sum to n).

Independence would be great though... enters "Poissonization:"

1/3 https://twitter.com/ccanonne_/status/1263931437352226817
Instead of n samples, introduce randomness: take N~Poi(n)! With overwhelming probability, .9n≤N ≤1.1n (see below [1]), so that's not too expensive.
But now, the # of times you see elems m and k *are* independent. Even better, they're Poi(npₘ) and Poi(npₖ), respectively!

2/3
That simplifies a lot of variance or moments analysis. It's not necessarily optimal (you introduce randomness, so a bit more variance), but in most cases, it's fine, and that can make your life so. much. easier.

[1] https://github.com/ccanonne/probabilitydistributiontoolbox/blob/master/poissonconcentration.pdf
[2] my survey, Appendix D.3

3/3
You can follow @ccanonne_.
Tip: mention @twtextapp on a Twitter thread with the keyword “unroll” to get a link to it.

Latest Threads Unrolled: