The Chernoff Bound

The Chernoff bound is like a genericized trademark: it refers not to a particular inequality, but rather to a technique for obtaining exponentially decreasing bounds on tail probabilities.

Much of this material comes from my CS 365 textbook, Randomized Algorithms by Motwani and Raghavan.

Let $X_1, \ldots, X_n$ be independent random variables, where $X_i$ takes the value $1$ with probability $p_i$ and $0$ otherwise (a Bernoulli distribution). Suppose at least one of the $p_i$ is nonzero. Let $X = \sum_{i=1}^n X_i$, and let $\mu = \mathrm{E}[X] = \sum_{i=1}^n p_i$.
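As a quick illustration of this setup (the particular $p_i$ values and helper names below are hypothetical, not from the text), we can simulate independent Bernoulli trials and confirm that the sample mean of $X$ hovers near $\mu = \sum_i p_i$:

```python
import random

random.seed(1)  # fixed seed for reproducibility

# Hypothetical example: four independent Bernoulli variables X_i,
# where X_i is 1 with probability p[i] and 0 otherwise.
p = [0.1, 0.5, 0.9, 0.3]
mu = sum(p)  # E[X] = sum of the p_i, by linearity of expectation

def sample_X():
    """Draw one value of X = X_1 + ... + X_n."""
    return sum(1 for pi in p if random.random() < pi)

# A Monte Carlo average of X should land close to mu = 1.8.
trials = 100_000
avg = sum(sample_X() for _ in range(trials)) / trials
```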

Multiplicative Chernoff Bound

We bound $\Pr[X > (1+\delta)\mu]$ for $\delta > 0$ via Markov's inequality and a Taylor series approximation of the exponential function.

We have $\Pr[X > (1+\delta)\mu] = \Pr[e^{tX} > e^{t(1+\delta)\mu}]$ for all $t > 0$. We'll later select an optimal value for $t$. By Markov's inequality, we have:

$$\Pr[e^{tX} > e^{t(1+\delta)\mu}] < \frac{\mathrm{E}[e^{tX}]}{e^{t(1+\delta)\mu}}.$$

My textbook stated this inequality is in fact strict if we assume none of the $p_i$ are $0$ or $1$, but I'm not sure this assumption is required, since a strict inequality appears later on regardless.

Then, since the $X_i$ are independent:

$$\mathrm{E}[e^{tX}] = \mathrm{E}\left[\prod_{i=1}^n e^{tX_i}\right] = \prod_{i=1}^n \mathrm{E}[e^{tX_i}].$$

We can compute $\mathrm{E}[e^{tX_i}]$ explicitly: this random variable is $e^t$ with probability $p_i$, and $1$ otherwise, that is, with probability $1 - p_i$, thus this is equal to:

$$p_i e^t + 1 - p_i = 1 + p_i(e^t - 1).$$

We have $1 + x \le e^x$ for all $x$, with equality only at $x = 0$. As long as at least one $p_i$ is nonzero:

$$\prod_{i=1}^n \left(1 + p_i(e^t - 1)\right) < \prod_{i=1}^n e^{p_i(e^t - 1)} = e^{\mu(e^t - 1)}.$$

Whence:

$$\Pr[X > (1+\delta)\mu] < \frac{e^{\mu(e^t - 1)}}{e^{t(1+\delta)\mu}}.$$

It is time to choose $t$. Differentiating the right-hand side shows we attain the minimum at $t = \ln(1+\delta)$, which is positive when $\delta$ is. This value of $t$ yields the Chernoff bound:

$$\Pr[X > (1+\delta)\mu] < \left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu.$$
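For completeness, the differentiation step can be spelled out. The logarithm of the right-hand side is $\mu(e^t - 1) - t(1+\delta)\mu$, and setting its derivative in $t$ to zero gives:

```latex
\frac{d}{dt}\left[\mu(e^t - 1) - t(1+\delta)\mu\right]
  = \mu e^t - (1+\delta)\mu = 0
  \quad\Longrightarrow\quad
  e^t = 1 + \delta, \qquad t = \ln(1+\delta).
```

The second derivative is $\mu e^t > 0$, so this is indeed a minimum. Substituting back, $e^{\mu(e^t - 1)} = e^{\mu\delta}$ and $e^{t(1+\delta)\mu} = (1+\delta)^{(1+\delta)\mu}$, which yields the stated bound.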

Below Expectations

We use the same technique to bound $\Pr[X < (1-\delta)\mu]$ for $0 < \delta < 1$. We have:

$$\Pr[X < (1-\delta)\mu] = \Pr[-X > -(1-\delta)\mu] = \Pr[e^{-tX} > e^{-t(1-\delta)\mu}]$$

for any $t > 0$. If we proceed as before, that is, apply Markov's inequality, use the approximation $1 + x \le e^x$, then pick $t = \ln\frac{1}{1-\delta}$ to minimize the bound, we have:

$$\Pr[X < (1-\delta)\mu] < \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^\mu.$$
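Both tail bounds can be sanity-checked numerically in the special case where all the $p_i$ equal a common $p$, so that $X$ is binomial and its tails can be computed exactly. This is only a sketch with arbitrarily chosen parameters; the helper names are ours, not the text's.

```python
from math import comb, exp

def binom_upper_tail(n, p, k):
    """Exact Pr[X > k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1, n + 1))

def binom_lower_tail(n, p, k):
    """Exact Pr[X < k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k))

n, p, delta = 100, 0.5, 0.2
mu = n * p  # 50

# Chernoff bounds for the upper and lower tails.
upper_bound = (exp(delta) / (1 + delta)**(1 + delta)) ** mu
lower_bound = (exp(-delta) / (1 - delta)**(1 - delta)) ** mu

# (1+delta)*mu = 60 and (1-delta)*mu = 40, hardcoded to dodge float rounding.
assert binom_upper_tail(n, p, 60) < upper_bound  # Pr[X > 60]
assert binom_lower_tail(n, p, 40) < lower_bound  # Pr[X < 40]
```

The exact tails here are around $0.02$, while the Chernoff bounds evaluate to roughly $0.39$ and $0.34$: valid, but loose, which motivates the cruder approximations below being good enough in practice.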

Bounding the bounds

Unfortunately, the above bounds are difficult to use, so in practice we use cruder but friendlier approximations.

Recall $\ln(1-\delta) = -\delta - \frac{\delta^2}{2} - \frac{\delta^3}{3} - \cdots$. Thus if $0 < \delta < 1$, we have:

$$(1-\delta)\ln(1-\delta) = -\delta + \frac{\delta^2}{2} + \frac{\delta^3}{6} + \cdots > -\delta + \frac{\delta^2}{2},$$

where we dropped only positive higher-order terms. Exponentiating both sides and raising to the power of $\mu$ yields:

$$\left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^\mu < e^{-\mu\delta^2/2}.$$

Thus for $0 < \delta < 1$:

$$\Pr[X < (1-\delta)\mu] < e^{-\mu\delta^2/2}.$$
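The crude lower-tail bound can be checked against the exact Chernoff expression over a grid of $\delta$ values (a quick numeric verification of our own, with an arbitrary choice of $\mu$):

```python
from math import exp

mu = 50.0  # arbitrary mu for the check
for k in range(1, 100):
    d = k / 100.0  # delta ranges over (0, 1)
    exact = (exp(-d) / (1 - d)**(1 - d)) ** mu
    crude = exp(-mu * d * d / 2)
    # The friendlier bound must dominate the exact Chernoff expression.
    assert exact <= crude
```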

As for the other Chernoff bound, Wikipedia states:

$$\Pr[X > (1+\delta)\mu] < e^{-\mu\delta^2/4} \qquad \text{for } \delta < 2e - 1,$$

but my textbook quotes a better bound:

$$\Pr[X > (1+\delta)\mu] < e^{-\mu\delta^2/3} \qquad \text{for } 0 < \delta \le 1.$$
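A similar grid check (again ours, with an arbitrary $\mu$) confirms that for $0 < \delta \le 1$ the exact Chernoff expression sits below the $\delta^2/3$ bound, which in turn sits below the $\delta^2/4$ bound:

```python
from math import exp

mu = 50.0  # arbitrary mu for the check
for k in range(1, 101):
    d = k / 100.0  # delta ranges over (0, 1]
    exact = (exp(d) / (1 + d)**(1 + d)) ** mu
    # exact expression <= e^{-mu d^2/3} <= e^{-mu d^2/4}
    assert exact <= exp(-mu * d * d / 3) <= exp(-mu * d * d / 4)
```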

An Additive Chernoff Bound

This Chernoff bound, due to Hoeffding, appears as Problem 4.6 in Motwani and Raghavan. It was also mentioned in a cryptography class I took long ago.

For $1 \le i \le n$, let $X_i$ be a random variable that takes the value $1$ with probability $p$ and $0$ otherwise, and suppose the $X_i$ are independent. Let $X = \sum_{i=1}^n X_i$. Then for $\epsilon > 0$:

$$\Pr[X > (p + \epsilon)n] \le e^{-2n\epsilon^2}.$$
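The additive form can also be checked exactly in the binomial case. The parameter values and the `binom_tail` helper below are our own illustration, not from the text:

```python
from math import comb, exp

def binom_tail(n, p, k):
    """Exact Pr[X > k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1, n + 1))

n, p, eps = 100, 0.3, 0.1
hoeffding = exp(-2 * n * eps**2)  # e^{-2n eps^2} = e^{-2}

# (p + eps) * n = 40, hardcoded to avoid float rounding inside int().
assert binom_tail(n, p, 40) < hoeffding
```

Note that the bound depends on $\epsilon$ and $n$ but not on $p$, which is what makes it "additive": it controls deviations of $X/n$ from $p$ by a fixed amount $\epsilon$.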


Ben Lynn blynn@cs.stanford.edu 💡