Interval Estimates

Section 10.2 Interval Estimates - Chebyshev

Chebyshev’s Theorem gives an interval centered on the mean in which at least a certain proportion of the data must lie, whatever the underlying distribution.

Theorem 10.2.1. Chebyshev’s Theorem.

Let X be a random variable with mean \(\mu\) and standard deviation \(\sigma\text{.}\) Then for any \(a \in \mathbb{R}^+\text{,}\)

\begin{equation*} P( \big | X - \mu \big | \lt a ) \gt 1 - \frac{\sigma^2}{a^2} \end{equation*}

Proof.

First, notice that if \(x > \mu + a\text{,}\) then \(x - \mu > a\) and so \((x-\mu)^2 > a^2\text{.}\) Similarly, if \(x < \mu - a\text{,}\) then \(x - \mu < -a\) and again \((x-\mu)^2 > a^2\text{.}\)

Starting with the definition of variance for a continuous variable X,

\begin{align*} \sigma^2 & = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) dx\\ & \ge \int_{-\infty}^{\mu-a} (x - \mu)^2 f(x) dx + \int_{\mu + a}^{\infty} (x - \mu)^2 f(x) dx\\ & \ge \int_{-\infty}^{\mu-a} a^2 f(x) dx + \int_{\mu + a}^{\infty} a^2 f(x) dx\\ & = a^2 \left ( \int_{-\infty}^{\mu-a} f(x) dx + \int_{\mu + a}^{\infty} f(x) dx \right )\\ & = a^2 P( X \le \mu - a \text{ or } X \ge \mu + a )\\ & = a^2 P( \big | X - \mu \big | \ge a) \end{align*}

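Dividing by \(a^2\) gives \(P( \big | X - \mu \big | \ge a) \le \frac{\sigma^2}{a^2}\text{,}\) and taking the complement gives \(P( \big | X - \mu \big | \lt a) \ge 1 - \frac{\sigma^2}{a^2}\text{.}\) For a continuous X the inequality is in fact strict: equality in both steps of the chain above would require \(f(x) = 0\) almost everywhere, which is impossible for a density. This gives the result.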

Corollary 10.2.2. Alternate Form for Chebyshev’s Theorem.

For positive k,

\begin{equation*} P( \big | X - \mu \big | \lt k \sigma ) \gt 1 - \frac{1}{k^2} \end{equation*}

Proof.

Set \(a = k\sigma\) in Chebyshev’s Theorem, so that \(\frac{\sigma^2}{a^2} = \frac{1}{k^2}\text{.}\)

Corollary 10.2.3. Special Cases for Chebyshev’s Theorem.

For any distribution, the density f(x) cannot be zero everywhere within one standard deviation of the mean. Also, at least 75% of the data for any distribution must lie within two standard deviations of the mean, and at least 88% must lie within three.

Proof.

Apply Chebyshev’s Theorem with \(a = \sigma\) to get

\begin{equation*} P(\mu - \sigma \lt X \lt \mu + \sigma) \gt 1 - \frac{\sigma^2}{\sigma^2} = 0 \end{equation*}

Since this probability is strictly positive, f(x) cannot vanish throughout the interval. Apply Chebyshev’s Theorem with \(a = 2 \sigma\) to get \(1 - \frac{1}{2^2} = 0.75\) and with \(a = 3 \sigma\) to get \(1 - \frac{1}{3^2} = \frac{8}{9} \approx 0.889 > 0.88\text{.}\)
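The special cases in this corollary can be tabulated with a few lines of Python. This is only a minimal sketch of the alternate form \(1 - \frac{1}{k^2}\text{;}\) the function name `chebyshev_lower_bound` is an illustrative choice and does not come from the text.

```python
def chebyshev_lower_bound(k):
    """Chebyshev lower bound 1 - 1/k^2 on P(|X - mu| < k*sigma), for k > 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1 - 1 / k**2


# Bounds for one, two and three standard deviations from the mean.
for k in (1, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.4f}")
```

For k = 1, 2 and 3 the loop reports 0.0000, 0.7500 and 0.8889, matching the three cases in the corollary.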

To compute the lower bound, Chebyshev’s Theorem requires you to know the mean and standard deviation of the variable. Conversely, you can "go backward" and recover the mean and standard deviation from a given interval and lower bound, provided you presume that the unknown mean is the midpoint of the interval and that \(a\) is the (equal) distance from that midpoint to either endpoint. Use this when working the exercise below.

Checkpoint 10.2.4. WebWork - Chebyshev.

A statistician uses Chebyshev’s Theorem to estimate that at least 20% of a population lies between the values 4 and 18. Use this information to find the values of the population mean, \(\mu\text{,}\) and the population standard deviation, \(\sigma\text{.}\)

a) \(\mu =\)

b) \(\sigma =\)

Answer 1.

\(11\)

Answer 2.

\(6.26099033699941\)
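One way to check these answers is to carry out the "go backward" computation numerically. The sketch below assumes, as described above, that the interval is centered on the mean with half-width \(a\) and that the stated proportion equals \(1 - \frac{\sigma^2}{a^2}\text{;}\) the helper name `solve_mean_and_sd` is hypothetical.

```python
import math


def solve_mean_and_sd(lower, upper, proportion):
    """Recover (mu, sigma) from an interval and a Chebyshev lower bound.

    Assumes the interval is centered on mu with half-width a, and that
    proportion = 1 - sigma^2 / a^2.
    """
    if not lower < upper:
        raise ValueError("lower must be less than upper")
    if not 0 <= proportion < 1:
        raise ValueError("proportion must lie in [0, 1)")
    mu = (lower + upper) / 2          # midpoint of the interval
    a = (upper - lower) / 2           # distance from midpoint to an endpoint
    sigma = a * math.sqrt(1 - proportion)  # solve 1 - sigma^2/a^2 = proportion
    return mu, sigma


mu, sigma = solve_mean_and_sd(4, 18, 0.20)
print(mu, sigma)
```

Here the midpoint is 11 and the half-width is 7, so \(\sigma = 7\sqrt{0.8}\text{,}\) in agreement with the answers above.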

Checkpoint 10.2.5. WebWork - More Chebyshev.

Suppose that the blood pressure of the human inhabitants of a certain Pacific island is distributed with mean \(\mu\) = 82 mmHg and standard deviation \(\sigma\) = 10 mmHg. According to Chebyshev’s Theorem, at least what percentage of the islanders have blood pressure in the range from 49 mmHg to 115 mmHg?

answer: %

Answer.

\(90.8172635445363\)
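The forward computation can be sketched in the same way. The function below assumes the interval is symmetric about the mean, as it is in this exercise, and rejects anything else; the name `chebyshev_percentage` is illustrative only.

```python
import math


def chebyshev_percentage(mu, sigma, lower, upper):
    """Chebyshev lower bound, as a percentage, for an interval centered on mu."""
    if not math.isclose(mu - lower, upper - mu):
        raise ValueError("interval must be centered on the mean")
    a = (upper - lower) / 2           # half-width of the interval
    bound = 1 - sigma**2 / a**2       # Chebyshev's Theorem
    return 100 * max(0.0, bound)      # a negative bound carries no information


percentage = chebyshev_percentage(82, 10, 49, 115)
print(percentage)
```

With \(a = 33\) the bound is \(1 - \frac{100}{1089}\text{,}\) or roughly 90.8%, which agrees with the answer above.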

Example 10.2.6. - Comparing known distribution to Chebyshev.

Suppose that you have an exponential random variable X with mean 7. Using properties of exponential distributions, you also know that the standard deviation is 7. Note as well that an exponential random variable typically represents a waiting time and thus can never be smaller than 0. It follows then that

\begin{align*} P( \mu - 1.8 \sigma \le X \le \mu + 1.8 \sigma) & = P( 7 - 1.8 \cdot 7 \le X \le 7 + 1.8 \cdot 7)\\ & = P( -5.6 \le X \le 19.6)\\ & = P( 0 \le X \le 19.6) = F(19.6) = 1 - e^{-19.6/7} \approx 0.939 \end{align*}

since the exponential distribution has the known distribution function \(F(x) = 1 - e^{-x/\mu}\) for \(x \ge 0\text{.}\)

However, using Chebyshev’s Theorem,

\begin{equation*} P( \mu - 1.8 \sigma \le X \le \mu + 1.8 \sigma) = P( \big | X - \mu \big | \lt 1.8 \cdot \sigma ) \gt 1 - \frac{1}{{1.8}^2} \approx 0.691. \end{equation*}

The difference between these two results is not a problem. The first gives a precise answer because it uses the knowledge that X has a known probability function, whereas the second presumes only that X has the stated mean and standard deviation. With less information you get a less precise lower bound, but since the lower bound \(0.691\) is less than the exact value \(0.939\text{,}\) there is no conflict.
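The comparison in this example can be reproduced numerically. The following is a minimal sketch using only the standard library; the helper `exponential_cdf` is an illustrative name, and it assumes the exponential distribution is parameterized by its mean.

```python
import math


def exponential_cdf(x, mean):
    """Distribution function of an exponential variable with the given mean."""
    return 1 - math.exp(-x / mean) if x > 0 else 0.0


mean = 7.0
sigma = 7.0   # for an exponential distribution, sigma equals the mean
k = 1.8

lower = max(0.0, mean - k * sigma)   # X cannot be negative
upper = mean + k * sigma

# Exact probability from the known distribution function.
exact = exponential_cdf(upper, mean) - exponential_cdf(lower, mean)

# Distribution-free lower bound from Chebyshev's Theorem.
bound = 1 - 1 / k**2

print(f"exact = {exact:.3f}, Chebyshev bound = {bound:.3f}")
```

The exact value stays above the Chebyshev bound, as it must; trying other values of k greater than 1 shows how loose the bound can be for this particular distribution.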

Essentials of Mathematical Probability and Statistics
