Section 9.7 Central Limit Theorem
Often, when solving scientific problems, one makes several assumptions regarding the nature of the underlying setting and bases conclusions on those assumptions. Indeed, to use a Binomial Distribution or a Negative Binomial Distribution, an assumption on the value of \(p\) is necessary. For Poisson and Exponential Distributions, one must know the mean. For Normal Distributions, one must assume values for both the mean and the standard deviation. Where do these values come from? Often, one performs a preliminary study and obtains a sample statistic, such as a sample mean or a relative frequency, and uses these values for \(\mu\) or \(p\text{.}\) But what is the underlying distribution of these sample statistics? The Central Limit Theorem gives the answer. The results from the previous section illustrate the tendency toward bell-shaped distributions. This tendency can be described more precisely through the following theorem, presented here without proof.

Theorem 9.7.1. Central Limit Theorem.
Presume \(X\) is a random variable from a distribution with known mean \(\mu\) and known variance \(\sigma_X^2\text{.}\) For some natural number \(n\text{,}\) sample the distribution repeatedly, creating a string of random variables denoted \(X_1, X_2, \ldots, X_n\text{,}\) and set \(\overline{X} = \frac{\sum X_k}{n}\text{.}\)
Then, \(\overline{X}\) is approximately normally distributed with mean \(\mu\) and variance \(\sigma^2 = \frac{\sigma_X^2}{n}\text{.}\)
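The theorem can be illustrated numerically. The following sketch (the exponential distribution, the seed, and the sample sizes are illustrative choices, not from the text) draws many samples of size \(n\) from a decidedly non-normal distribution and checks that the resulting sample means have approximately the mean \(\mu\) and variance \(\sigma_X^2/n\) that the theorem predicts.

```python
import random
import statistics

# Simulate the CLT: draw n = 30 values from an exponential
# distribution with mean mu = 4, average them, and repeat many times.
random.seed(1)
mu, n, trials = 4, 30, 20000

sample_means = [
    statistics.fmean(random.expovariate(1 / mu) for _ in range(n))
    for _ in range(trials)
]

# By the CLT these should land near mu and sigma_X^2 / n = 16 / 30.
mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)
```

A histogram of `sample_means` would also look bell-shaped, even though the exponential distribution itself is heavily skewed.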
Checkpoint 9.7.2. WebWork - Computing probabilities on \(\overline{X}\).
Example 9.7.3. Exponential \(X\) vs Normal \(\overline{X}\).
Consider an exponential variable \(X\) with mean time till first success of \(\mu = 4\text{.}\) Then, \(\sigma = 4\) using the exponential formulas, since an exponential variable's standard deviation equals its mean.
You can use the exponential probability function to compute probabilities dealing with X. Indeed,
If instead you plan to sample from this distribution \(n = 32\) times, the Central Limit Theorem implies that you will get a random variable \(\overline{X}\) which has an approximately normal distribution with the same mean but with new variance \(\sigma_{\overline{X}}^2 = \frac{16}{32} = \frac{1}{2}\text{.}\) Therefore
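A sketch of this comparison in Python, assuming \(\sigma = \mu = 4\) for the exponential variable; the particular cutoff value 4 and the helper-function names are illustrative choices, not from the text.

```python
import math

# Exponential X with mean mu = 4 (so sigma = mu = 4), sampled n = 32 times.
mu, n = 4, 32
sigma_xbar = mu / math.sqrt(n)  # standard deviation of the sample mean

def expon_cdf(x, mean):
    """CDF of an exponential variable with the given mean."""
    return 1 - math.exp(-x / mean)

def norm_cdf(x, mean, sd):
    """CDF of a normal variable, computed via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

# Illustrative probabilities (the cutoff 4 is our choice):
p_x_gt_4 = 1 - expon_cdf(4, mu)                # P(X > 4), exact: e^{-1}
p_xbar_gt_4 = 1 - norm_cdf(4, mu, sigma_xbar)  # P(Xbar > 4), CLT approx: 0.5
```

Note how different the two answers are: a single exponential draw exceeds its mean only about 37% of the time, while the sample mean of 32 draws exceeds it about half the time.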
Theorem 9.7.4. The Binomial Distribution is approximately Normal if \(n\) is large.
Given a Binomial variable \(X\) with \(\mu = np\) and \(\sigma^2 = np(1-p)\text{,}\) \(X\) is also approximately normal with the same mean and variance so long as \(np > 5\) and \(n(1-p) > 5\text{.}\)
Proof.
Using the Bernoulli variables \(Y_k\text{,}\) each with mean \(p\) and variance \(p(1-p)\text{,}\) note that the Central Limit Theorem applied to \(\overline{X} = \frac{\sum Y_k}{n}\) gives that
\[
\frac{\overline{X} - p}{\sqrt{p(1-p)/n}}
\]
is approximately standard normal. Multiplying top and bottom by \(n\) yields that
\[
\frac{\sum Y_k - np}{\sqrt{np(1-p)}}
\]
is approximately standard normal. But \(\sum Y_k\) is precisely the number of successes in \(n\) trials and is therefore a Binomial variable.
Example 9.7.5. Binomial as Normal.
Binomial becomes normal as \(n \to \infty\text{.}\) Consider \(n = 50\) and \(p = 0.3\text{.}\) Then, \(\mu = 15\) and \(\sigma^2 = 10.5\text{.}\)
Using the binomial formulas, for example,
Using the normal distribution,
Notice that these are very close.
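The comparison in this example can be sketched in Python; the particular probability \(P(X \le 15)\) is an illustrative choice, and the normal computation expands the cutoff by \(0.5\) to include the boundary of the histogram bar (a continuity correction), as in Example 9.7.9 below.

```python
import math

n, p = 50, 0.3
mu = n * p            # 15
var = n * p * (1 - p) # 10.5

def binom_cdf(k, n, p):
    """Exact binomial CDF by summing the pmf."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def norm_cdf(x, mean, sd):
    """CDF of a normal variable, computed via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

exact = binom_cdf(15, n, p)                  # P(X <= 15), exact
approx = norm_cdf(15.5, mu, math.sqrt(var))  # normal, continuity-corrected
```

Running this shows the two probabilities agree to about two decimal places, matching the "very close" observation in the text.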
Corollary 9.7.6. The Poisson Distribution is approximately Normal if \(\mu\) is large.
Given a Poisson variable \(X\) with mean \(\mu\) and variance \(\sigma^2 = \mu\text{,}\) \(X\) is also approximately normal with the same mean and variance so long as \(\mu > 5\text{.}\)
Proof.
Note from before that the Poisson distribution function was derived by approximating with the Binomial and letting \(n\) approach infinity. Therefore, by the previous theorem, the Poisson variable is also approximately Normal, using the Poisson mean and variance rather than the Binomial's. Indeed, in standard units,
\[
\frac{X - \mu}{\sqrt{\mu}}
\]
is approximately standard normal for large \(\mu\text{.}\)
Example 9.7.7. Poisson as Normal.
Poisson becomes normal as \(\mu \to \infty\text{.}\) Consider \(\mu = 20\text{.}\) Then, \(\sigma^2 = \mu = 20\text{.}\)
Using the Poisson formulas, for example,
Using the normal distribution,
Again, these are very close.
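As with the Binomial, this comparison can be sketched numerically; the probability \(P(X \le 20)\) is an illustrative choice, and the normal computation again uses a continuity correction.

```python
import math

mu = 20  # Poisson mean; the variance is also mu

def poisson_cdf(k, mu):
    """Exact Poisson CDF by summing the pmf."""
    return sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))

def norm_cdf(x, mean, sd):
    """CDF of a normal variable, computed via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

exact = poisson_cdf(20, mu)                # P(X <= 20), exact
approx = norm_cdf(20.5, mu, math.sqrt(mu)) # normal, continuity-corrected
```

The two values differ by under 0.02, again "very close" in the sense of the text.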
Example 9.7.8. Gamma as Normal.
Gamma becomes normal as \(r \to \infty\text{.}\) Assume that the average time till a first success is 12 minutes and that \(r = 8\text{.}\) Then, the mean for the Gamma distribution is \(\mu = 12 \cdot 8 = 96\) and \(\sigma^2 = 8 \cdot 12^2 = 1152\text{,}\) so \(\sigma \approx 33.9411\text{.}\)
Using the Gamma formulas,
Using the normal distribution,
Amazingly, these are also very close.
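A numerical sketch of this example, assuming the standard closed form for the Gamma CDF with integer shape \(r\) (the Erlang form); the particular probability \(P(X \le 96)\) is an illustrative choice.

```python
import math

theta, r = 12, 8                 # mean wait per success, number of successes
mu = theta * r                   # 96
sigma = math.sqrt(r * theta**2)  # sqrt(1152)

def erlang_cdf(x, r, theta):
    """CDF of a Gamma variable with integer shape r and scale theta."""
    lam = x / theta
    return 1 - math.exp(-lam) * sum(lam**k / math.factorial(k) for k in range(r))

def norm_cdf(x, mean, sd):
    """CDF of a normal variable, computed via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

exact = erlang_cdf(96, r, theta)  # P(X <= 96), exact
approx = norm_cdf(96, mu, sigma)  # normal approximation: 0.5, since 96 = mu
```

Since \(r = 8\) is still fairly small, the Gamma distribution is noticeably skewed and the agreement here is a bit rougher than in the Binomial and Poisson examples.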
Example 9.7.9. Uniform \(X\) vs Normal \(\overline{X}\).
Consider a discrete uniform variable \(X\) over \(R = \{1, 2, \ldots, 20\}\text{.}\) Then, \(\mu = 10.5\) and \(\sigma = \sqrt{\frac{20^2 - 1}{12}}\) using the uniform formulas.
You can use the uniform probability function to compute probabilities dealing with X. Indeed,
If instead you plan to sample from this distribution \(n = 49\) times, the Central Limit Theorem implies that you will get a random variable \(\overline{X}\) which has an approximately normal distribution with the same mean but with new variance \(\sigma_{\overline{X}}^2 = \frac{(20^2 - 1)/12}{49} = \frac{133}{196}\text{.}\) Therefore, expanding the interval to include the boundaries of the corresponding histogram areas,
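This last example can also be sketched in Python. The interval \(10 \le \overline{X} \le 11\) is an illustrative choice, and since \(\overline{X}\) takes values spaced \(1/n\) apart, each endpoint is expanded by half that spacing to cover the boundaries of the histogram bars.

```python
import math
import statistics

# Discrete uniform X on {1, ..., 20}.
values = range(1, 21)
mu = statistics.fmean(values)       # 10.5
var = statistics.pvariance(values)  # (20**2 - 1) / 12 = 133/4

n = 49
var_xbar = var / n                  # variance of the sample mean, 133/196

def norm_cdf(x, mean, sd):
    """CDF of a normal variable, computed via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

# Illustrative: P(10 <= Xbar <= 11) under the CLT, with each endpoint
# pushed out by half the spacing (1/n) of Xbar's possible values.
half = 0.5 / n
sd = math.sqrt(var_xbar)
approx = norm_cdf(11 + half, mu, sd) - norm_cdf(10 - half, mu, sd)
```

Here `statistics.pvariance` confirms the uniform variance formula \((20^2-1)/12\) directly from the 20 values.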