First, notice that if \(X > \mu + a\text{,}\) then \(X - \mu > a\) and so \((X-\mu)^2 > a^2\text{.}\) Similarly, if \(X < \mu - a\text{,}\) then \((X-\mu)^2 > a^2\text{.}\)
Start with the definition of variance for a continuous variable \(X\) and bound the integral from below using the two inequalities above.
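A sketch of how this step can be carried out, assuming \(X\) has density \(f(x)\) and that the theorem is stated in the form \(P(\mu - a < X < \mu + a) \ge 1 - \frac{\sigma^2}{a^2}\) (the form the corollaries below rely on). Here \(x\) is only the variable of integration:

```latex
\begin{align*}
\sigma^2 &= \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx \\
  &\ge \int_{-\infty}^{\mu-a} (x-\mu)^2 f(x)\,dx + \int_{\mu+a}^{\infty} (x-\mu)^2 f(x)\,dx \\
  &\ge a^2 \left[ P(X < \mu - a) + P(X > \mu + a) \right] \\
  &= a^2 \left[ 1 - P(\mu - a \le X \le \mu + a) \right]
\end{align*}
```

Dividing by \(a^2\) and rearranging gives \(P(\mu - a \le X \le \mu + a) \ge 1 - \frac{\sigma^2}{a^2}\text{.}\) For a continuous variable it makes no difference whether the endpoints are included.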
Set \(a = k\sigma \) and plug into Chebyshev’s Theorem.
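Written out, the substitution looks like this, again assuming the theorem has the form \(P(\mu - a < X < \mu + a) \ge 1 - \frac{\sigma^2}{a^2}\text{:}\)

```latex
\begin{align*}
P(\mu - k\sigma < X < \mu + k\sigma)
  &\ge 1 - \frac{\sigma^2}{(k\sigma)^2} \\
  &= 1 - \frac{1}{k^2}
\end{align*}
```

The bound depends only on \(k\text{,}\) the number of standard deviations, and it says something useful only when \(k > 1\text{.}\)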
Corollary 10.2.3. Special Cases for Chebyshev’s Theorem.
For any distribution, it is not possible for \(f(x) = 0\) everywhere within one standard deviation of the mean. Also, at least 75% of the data for any distribution must lie within two standard deviations of the mean and at least 88% must lie within three.
Apply Chebyshev’s Theorem with \(a = 2 \sigma\) to get \(1 - \frac{1}{2^2} = 0.75\) and with \(a = 3 \sigma\) to get \(1 - \frac{1}{3^2} = \frac{8}{9} > 0.8888\text{.}\)
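These special cases are easy to check numerically. The short Python sketch below evaluates the bound \(1 - \frac{1}{k^2}\) for \(k = 1, 2, 3\text{.}\) The function name `chebyshev_lower_bound` is made up for this illustration.

```python
def chebyshev_lower_bound(k):
    """Lower bound 1 - 1/k^2 on P(|X - mu| < k*sigma).

    The bound is clipped at 0, since for k <= 1 the formula
    gives a value that is zero or negative and carries no information.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return max(0.0, 1.0 - 1.0 / k**2)


for k in (1, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.4f}")
```

For \(k = 1\) the bound is 0, so the theorem by itself gives no information about the interval within one standard deviation of the mean.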
Chebyshev’s Theorem requires you to know the mean and standard deviation of the variable if you are seeking the lower bound. On the other hand, you can "go backward" and find the mean and standard deviation for a given interval if you presume that the unknown mean is actually the midpoint of the interval and that \(a\) is the (equal) distance from that midpoint to either endpoint. Use this when working the exercise below.
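The "backward" calculation can be sketched in Python as follows. It assumes, as described above, that the mean is the midpoint of the interval and that the stated percentage equals the Chebyshev bound \(1 - \frac{\sigma^2}{a^2}\) exactly, so that \(\sigma = a\sqrt{1 - p}\text{.}\) The function name and the numbers in the example (75% between 10 and 30) are made up for illustration.

```python
import math


def mean_and_sd_from_interval(low, high, p):
    """Recover (mu, sigma) from an interval and a Chebyshev lower bound p.

    Assumes mu is the midpoint of [low, high] and that
    p = 1 - sigma^2 / a^2, where a is the half-width of the interval.
    """
    if not low < high:
        raise ValueError("low must be less than high")
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    mu = (low + high) / 2      # midpoint of the interval
    a = (high - low) / 2       # distance from the midpoint to either endpoint
    sigma = a * math.sqrt(1 - p)
    return mu, sigma


# Hypothetical example: at least 75% of the population lies between 10 and 30.
mu, sigma = mean_and_sd_from_interval(10, 30, 0.75)
print(mu, sigma)
```

With these hypothetical numbers the half-width is \(a = 10\text{,}\) so \(\sigma = 10\sqrt{0.25} = 5\) and \(\mu = 20\text{.}\)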
A statistician uses Chebyshev’s Theorem to estimate that at least 20% of a population lies between the values 4 and 18. Use this information to find the values of the population mean, \(\mu\text{,}\) and the population standard deviation, \(\sigma\text{.}\)
Suppose that the blood pressure of the human inhabitants of a certain Pacific island is distributed with mean \(\mu = 82\) mmHg and standard deviation \(\sigma = 10\) mmHg. According to Chebyshev’s Theorem, at least what percentage of the islanders have blood pressure in the range from 49 mmHg to 115 mmHg?
Suppose that you have an exponential random variable \(X\) with mean 7. Using properties of exponential distributions, you also know that the standard deviation is 7. Note that for an exponential distribution the random variable represents time and thus can never be smaller than 0. It follows then that
The difference between these two results is not a problem. The first gives a precise answer because \(X\) has a known probability function, whereas the second presumes only that \(X\) has the given mean and standard deviation. With less information, you get a less precise lower bound, but since the lower bound \(0.691\) is less than the exact value \(0.939\text{,}\) there is no conflict.
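The comparison can be reproduced with a short Python sketch. The interval used here is an assumption: taking \(k = 1.8\) standard deviations (the interval from \(-5.6\) to \(19.6\)) reproduces both quoted values, 0.691 and 0.939. The exact probability uses the exponential distribution function \(1 - e^{-x/7}\text{.}\)

```python
import math

mean = 7.0   # mean of the exponential variable
sd = 7.0     # for an exponential distribution, sd equals the mean
k = 1.8      # ASSUMED number of standard deviations (chosen to match 0.691 and 0.939)

a = k * sd
low, high = mean - a, mean + a     # roughly -5.6 and 19.6

# Exact probability: X cannot be negative, so the interval is
# effectively (0, high) and P(0 < X < high) = 1 - exp(-high / mean).
low = max(low, 0.0)
exact = math.exp(-low / mean) - math.exp(-high / mean)

# Chebyshev lower bound, which uses only the mean and standard deviation.
bound = 1.0 - sd**2 / a**2

print(f"Chebyshev lower bound: {bound:.3f}")
print(f"Exact probability:     {exact:.3f}")
```

Any other choice of \(k > 1\) shows the same pattern: the Chebyshev bound never exceeds the exact probability.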