
Section 9.7 Central Limit Theorem

Often, when solving scientific problems, one makes several assumptions regarding the nature of the underlying setting and bases conclusions on those assumptions. Indeed, to use a Binomial Distribution or a Negative Binomial Distribution, an assumption on the value of p is necessary. For Poisson and Exponential Distributions, one must know the mean. For Normal Distributions, one must assume values for both the mean and the standard deviation. Where do these values come from? Often, one performs a preliminary study and obtains a sample statistic, such as a sample mean or a relative frequency, and uses that value for μ or p.

But what is the underlying distribution of these sample statistics? The Central Limit Theorem gives the answer.

The results from the previous section illustrate the tendency of sample means toward bell-shaped distributions. This tendency can be described more mathematically through the following theorem, presented here without proof.

Often the Central Limit Theorem is stated more formally using a conversion to standard units. Indeed, the theorem indicates that the random variable \(\overline{X}\) has variance \(\frac{\sigma^2}{n}\text{,}\) which approaches 0 as n grows. So the limiting random variable would have zero variance and therefore would no longer be a nondegenerate random variable. To avoid this issue, the Central Limit Theorem is often stated as:

For random variables

\begin{equation*} W_n = \frac{\overline{X} - \mu}{\sigma/ \sqrt{n}} \end{equation*}

with corresponding distribution function \(F_n(W_n)\text{,}\)

\begin{equation*} \lim_{n \rightarrow \infty} F_n(c) = \int_{-\infty}^c \frac{1}{\sqrt{2 \pi}} e^{-z^2/2} dz = \Phi(c) \end{equation*}

that is, the standard normal distribution function.
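This convergence can be observed numerically. The sketch below (not part of the text; the exponential population with mean 4 and the choices n = 32 and 50,000 repetitions are arbitrary) standardizes many simulated sample means and checks that the result behaves like a standard normal variable.

```python
import math
import random
import statistics

random.seed(1)

# Draw sample means from a skewed population (exponential with mean 4)
# and standardize each one: W_n = (xbar - mu) / (sigma / sqrt(n)).
mu = sigma = 4.0          # for an exponential variable, sigma equals the mean
n, reps = 32, 50_000
w = []
for _ in range(reps):
    xbar = sum(random.expovariate(1 / mu) for _ in range(n)) / n
    w.append((xbar - mu) / (sigma / math.sqrt(n)))

# The standardized means should look standard normal: mean near 0,
# standard deviation near 1, and P(W_n <= 1) near Phi(1) ~ 0.8413.
phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
frac_below_1 = sum(1 for v in w if v <= 1) / reps
print(statistics.mean(w), statistics.stdev(w), frac_below_1, phi(1.0))
```

Even though the population here is strongly skewed, the standardized sample means already track \(\Phi\) closely at n = 32.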

While we are at it, we can again "go backwards" and figure out the mean and variance if given some probabilities.

Checkpoint 9.7.2. WebWork - Computing probabilities on \(\overline{X}\).

Example 9.7.3. Exponential X vs Normal \(\overline{X}\).

Consider an exponential variable X with mean time till first success of \(\mu = 4\text{.}\) Then, \(\sigma = 4\) as well, since for an exponential variable the standard deviation equals the mean.

You can use the exponential probability function to compute probabilities dealing with X. Indeed,

\begin{equation*} P(X \lt 3.9) = F(3.9) = 1 - e^{-3.9/4} \approx 0.6228 . \end{equation*}

If instead you plan to sample from this distribution n=32 times, the Central Limit Theorem implies that you will get a random variable \(\overline{X}\) which has an approximate normal distribution with the same mean but with new variance \(\sigma_{\overline{X}}^2 = \frac{16}{32} = \frac{1}{2}\text{.}\) Therefore

\begin{equation*} P( \overline{X} \lt 3.9 ) \approx \text{normalcdf}(0,3.9,4,sqrt(1/2)) \approx 0.4438 . \end{equation*}
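These two probabilities can be double-checked in a few lines of Python. Here `normalcdf` is a small stand-in for the calculator function, built from `math.erf`, and the exponential's standard deviation σ = μ = 4 is used for the sample mean.

```python
import math

# Normal cdf over an interval: Phi(z) = (1 + erf(z / sqrt 2)) / 2.
def normalcdf(a, b, mu, sigma):
    phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

mu = 4.0        # exponential mean
sigma = mu      # for an exponential variable, sigma = mu
n = 32

# Exact exponential probability: P(X < 3.9) = 1 - e^(-3.9/4).
exact = 1 - math.exp(-3.9 / mu)

# CLT approximation for the mean of n = 32 draws:
# Xbar ~ N(4, sigma^2 / n), so sigma_xbar = 4 / sqrt(32).
approx = normalcdf(0, 3.9, mu, sigma / math.sqrt(n))

print(round(exact, 4), round(approx, 4))   # 0.6228 0.4438
```

Note how much smaller the second probability is: averaging 32 observations concentrates \(\overline{X}\) near 4, so falling below 3.9 becomes less likely than for a single skewed observation.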

When converting probability problems from a continuous distribution (such as exponential or uniform), no adjustment to the question is needed since you are approximating one area with another area. However, when converting from a discrete distribution (such as binomial or geometric), you need to consider how the interval should be adjusted so that histogram areas for the discrete problem correspond to areas under the normal curve. Generally, you will need to expand the stated interval each way by half the spacing between adjacent possible values: 1/2 for an integer-valued variable.

The Central Limit Theorem guarantees that, regardless of the distribution of X, the distribution of an average of X's is approximately normal. However, it also shows why X itself may be approximated for some distributions using the normal distribution as certain parameters are allowed to increase. Below, you can see how Binomial and Poisson distributions can be approximated directly using the Normal distribution.

Toward that end, for \(0 \lt p \lt 1\) consider a sequence of Bernoulli trials \(Y_1, Y_2, ..., Y_n\) with each over the space {0,1}. Then,

\begin{equation*} X = \sum_{k=1}^n Y_k \end{equation*}

is a Binomial variable.

Using the Bernoulli variables \(Y_k\) each with mean p and variance p(1-p), note that the Central Limit Theorem applied to \(\overline{X} = \frac{\sum Y_k}{n}\) gives that

\begin{equation*} \frac{\overline{X}-p}{\sqrt{p(1-p)/n}} \end{equation*}

is approximately standard normal. By multiplying top and bottom by n yields

\begin{equation*} \frac{\sum Y_k - np}{\sqrt{np(1-p)}} \end{equation*}

is approximately standard normal. But \(\sum Y_k\) actually is the sum of the number of successes in n trials and is therefore a Binomial variable.

Example 9.7.5. Binomial as Normal.

Binomial becomes normal as \(n \rightarrow \infty\text{.}\) Consider n = 50 and p = 0.3. Then, \(\mu = 15\) and \(\sigma^2 = 10.5\text{.}\)

Using the binomial formulas, for example,

\begin{equation*} P( X = 16 ) = \binom{50}{16} 0.3^{16} \cdot 0.7^{34} \approx 0.11470 \end{equation*}

Using the normal distribution,

\begin{align*} P( X = 16 ) & = P( 15.5 \lt X \lt 16.5) \\ & \approx normalcdf(15.5,16.5,15,sqrt(10.5)) \\ & = 0.11697 \end{align*}

Notice that these are very close.
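Both numbers in this example can be reproduced with a short Python sketch, using `math.erf` in place of a calculator's normalcdf.

```python
import math

n, p = 50, 0.3
mu = n * p                  # 15
var = n * p * (1 - p)       # 10.5

# Exact binomial probability of exactly 16 successes.
exact = math.comb(n, 16) * p**16 * (1 - p)**34

# Normal approximation with the 1/2 continuity correction:
# P(X = 16) ~ P(15.5 < X < 16.5) under N(15, 10.5).
phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
approx = phi((16.5 - mu) / math.sqrt(var)) - phi((15.5 - mu) / math.sqrt(var))

print(round(exact, 5), round(approx, 5))
```

Without the continuity correction the normal curve would assign probability 0 to the single value 16, which is why the interval (15.5, 16.5) is essential here.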

Note from before that the Poisson distribution function was derived by approximating with Binomial and letting n approach infinity. Therefore, by the previous theorem, the Poisson variable is also approximately Normal using the Poisson mean and variance rather than the binomial's. Indeed, in standard units

\begin{equation*} \frac{Y - \mu}{\sqrt{\mu}} \end{equation*}

is approximately normal for large \(\mu\text{.}\)

Example 9.7.7. Poisson as Normal.

Poisson becomes normal as \(\mu \rightarrow \infty\text{.}\) Consider \(\mu = 20\text{.}\) Then, \(\sigma^2 = \mu = 20\text{.}\)

Using the Poisson formulas, for example,

\begin{equation*} P( X = 19 ) = \frac{20^{19} e^{-20}}{19!} \approx 0.08883 \end{equation*}

Using the normal distribution,

\begin{align*} P( X = 19 ) & = P( 18.5 \lt X \lt 19.5) \\ & \approx normalcdf(18.5,19.5,20,sqrt(20)) \\ & = 0.08683 \end{align*}

Again, these are very close.
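As with the binomial example, both values can be checked directly; the sketch below builds the normal cdf from `math.erf`.

```python
import math

mu = 20.0

# Exact Poisson probability of exactly 19 events: 20^19 e^(-20) / 19!.
exact = mu**19 * math.exp(-mu) / math.factorial(19)

# Normal approximation N(20, 20) with the 1/2 continuity correction:
# P(X = 19) ~ P(18.5 < X < 19.5).
phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
sd = math.sqrt(mu)
approx = phi((19.5 - mu) / sd) - phi((18.5 - mu) / sd)

print(round(exact, 5), round(approx, 5))
```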

Example 9.7.8. Gamma as Normal.

Gamma becomes normal as \(r \rightarrow \infty\text{.}\) Assume that the average time till a first success is 12 minutes and that \(r = 8\text{.}\) Then, the mean for the Gamma distribution is \(\mu = 12 \cdot 8 = 96\) and \(\sigma^2 = 8 \cdot 12^2 = 1152\) and so \(\sigma \approx 33.9411\text{.}\)

Using the Gamma formulas,

\begin{align*} P( 90 \le X \le 100 ) & = \int_{90}^{100} f(x) dx \\ & = 0.59252 - 0.47536 = 0.11716. \end{align*}

Using the normal distribution,

\begin{equation*} P( 90 \le X \le 100) \approx normalcdf(90,100,96,33.9411) = 0.11707. \end{equation*}

Amazingly, these are also very close.
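The Gamma integral above can be evaluated without numerical integration: for an integer shape r, the Gamma cdf has the closed (Erlang) form \(F(x) = 1 - \sum_{k=0}^{r-1} e^{-x/\theta} (x/\theta)^k / k!\text{.}\) The sketch below uses that form with scale \(\theta = 12\) and compares against the normal approximation.

```python
import math

r, theta = 8, 12.0   # shape r = 8, scale theta = 12 (mean wait per success)

def erlang_cdf(x):
    # Gamma cdf for integer shape r (the Erlang closed form):
    # F(x) = 1 - sum_{k=0}^{r-1} e^(-x/theta) (x/theta)^k / k!
    lam = x / theta
    return 1 - sum(math.exp(-lam) * lam**k / math.factorial(k)
                   for k in range(r))

exact = erlang_cdf(100) - erlang_cdf(90)

# Normal approximation with mu = r*theta = 96 and sigma = sqrt(r)*theta.
phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
mu, sd = r * theta, math.sqrt(r) * theta
approx = phi((100 - mu) / sd) - phi((90 - mu) / sd)

print(round(exact, 5), round(approx, 5))
```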

Example 9.7.9. Uniform X vs Normal \(\overline{X}\).

Consider a discrete uniform variable X over R = {1,2,...,20}. Then, \(\mu = 10.5\) and \(\sigma^2 = \frac{20^2-1}{12} = 33.25\) using the uniform formulas.

You can use the uniform probability function to compute probabilities dealing with X. Indeed,

\begin{equation*} P(8 \le X \lt 12) = P(X \in \{8,9,10,11 \}) = \frac{4}{20} = 1/5. \end{equation*}

If instead you plan to sample from this distribution n=49 times, the Central Limit Theorem implies that you will get a random variable \(\overline{X}\) which has an approximate normal distribution with the same mean but with new variance \(\sigma_{\overline{X}}^2 = \frac{33.25}{49} \approx 0.6786\text{.}\) Since the possible values of \(\overline{X}\) are spaced \(\frac{1}{49}\) apart, the corresponding histogram areas call for expanding the interval each way by half of that spacing:

\begin{equation*} P( 8 \le \overline{X} \lt 12 ) = P(8 - \tfrac{1}{98} \le \overline{X} \le 12 - \tfrac{1}{98}) \approx normalcdf(7.9898,11.9898,10.5,0.8238) \approx 0.9636 . \end{equation*}
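A Monte Carlo sketch (seeded for reproducibility; the repetition count is arbitrary) confirms this value. The population variance \((20^2-1)/12 = 33.25\) comes from the discrete uniform formula.

```python
import math
import random

random.seed(7)

# Simulate many sample means of n = 49 draws from uniform {1, ..., 20}
# and estimate P(8 <= Xbar < 12) by counting.
n, reps = 49, 100_000
hits = 0
for _ in range(reps):
    xbar = sum(random.randint(1, 20) for _ in range(n)) / n
    if 8 <= xbar < 12:
        hits += 1
est = hits / reps

# CLT approximation: Xbar ~ N(10.5, 33.25/49); the continuity
# adjustment for Xbar's grid of spacing 1/49 is half of that, 1/98.
phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
sd = math.sqrt((20**2 - 1) / 12 / n)
approx = phi((12 - 1/98 - 10.5) / sd) - phi((8 - 1/98 - 10.5) / sd)

print(round(est, 3), round(approx, 4))
```

The simulated relative frequency and the normal approximation agree to within Monte Carlo error.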

As these examples illustrate, you will have increasing success in approximating the desired probabilities so long as the distribution's corresponding parameter is allowed to be "sufficiently large". The mathematical reason this works is the Central Limit Theorem presented above.

The above theorems allow you to utilize the normal distribution to compute approximate probabilities for the variable X in the stated distributions. This is not always possible, since some distributions do not have parameters which allow for approaching normality. However, regardless of the distribution, the Central Limit Theorem always allows you to approximate probabilities that involve an average of repeated attempts, that is, for the variable \(\overline{X}\text{.}\) This usefulness is illustrated in the examples above.