Section 9.4 Other "Bell Shaped" distributions

The Normal distribution discussed above is very important in statistical analysis. It is not, however, the only distribution that is symmetric about its mean and shaped like a bell. In this section, we consider two other options: one that is of little practical use and another that is very useful.

Definition 9.4.1. The Cauchy Distribution.

Consider a continuous random variable on the real numbers with probability function

\begin{equation*} f(x) = \frac{1/\pi}{1+x^2}. \end{equation*}

A random variable with this probability function is said to have a Cauchy distribution.

Theorem 9.4.2. The Cauchy Distribution.

\begin{equation*} f(x) = \frac{1/\pi}{1+x^2} \end{equation*}

is a probability function on \((-\infty, \infty)\text{.}\)

Proof.

Note that \(f(x) \ge 0\) for all \(x\) and that

\begin{equation*} \int_{-\infty}^{\infty} \frac{1}{1+x^2} \, dx = \lim_{b \to \infty} \tan^{-1}(b) - \lim_{a \to -\infty} \tan^{-1}(a) = \frac{\pi}{2} - \left( -\frac{\pi}{2} \right) = \pi. \end{equation*}

Dividing by \(\pi\) shows that the Cauchy probability function integrates to 1.
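As a quick numerical sanity check of this theorem, the following Python sketch approximates the area under the Cauchy probability function with a composite Simpson rule on a wide but finite interval and compares it with the exact value from the arctangent antiderivative. The helper names `cauchy_pdf` and `simpson` and the truncation point `A` are illustrative choices and do not come from the text.

```python
import math

def cauchy_pdf(x):
    """Cauchy probability function f(x) = (1/pi) / (1 + x^2)."""
    return (1.0 / math.pi) / (1.0 + x * x)

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

# Area over [-A, A]; the exact value is (2/pi) * atan(A), which tends to 1.
A = 1000.0
numeric_area = simpson(cauchy_pdf, -A, A, 200_000)
exact_area = (2.0 / math.pi) * math.atan(A)
print(numeric_area, exact_area)
```

Even with the interval running out to 1000 in each direction, the area is still visibly short of 1, which is an early hint of how slowly the Cauchy tails decay.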

Now that we have a probability function, it is natural to ask for its mean and variance. With the Cauchy probability function, problems arise quickly. Indeed, apart from the constant factor \(1/\pi\), the mean requires

\begin{equation*} \int_{-\infty}^{\infty} \frac{x}{1+x^2} \, dx = \frac{1}{2} \left[ \ln(1+x^2) \right]_{-\infty}^{\infty} \end{equation*}

which has the indeterminate form \(\infty - \infty\), so the mean is not well-defined. Further, even if one appeals to symmetry and treats the mean as 0, the variance requires (again up to the factor \(1/\pi\))

\begin{equation*} \int_{-\infty}^{\infty} \frac{x^2}{1+x^2} \, dx \end{equation*}

and here the integrand tends to 1 rather than 0 as \(x \to \pm\infty\), so the integral diverges. Thus the Cauchy distribution has no variance.
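One way to see concretely why the mean is problematic is to truncate the integrals and watch how the answer depends on the way the endpoints go to infinity. The sketch below uses the closed-form antiderivatives, with the factor \(1/\pi\) included; the function names and the particular choice of a lopsided interval \([-a, 2a]\) are illustrative assumptions, not something taken from the text.

```python
import math

def truncated_mean(a, b):
    """Integral of x * f(x) over [-a, b] for the Cauchy probability function."""
    return (math.log1p(b * b) - math.log1p(a * a)) / (2.0 * math.pi)

def truncated_second_moment(a):
    """Integral of x^2 * f(x) over [-a, a]; equals (2/pi) * (a - atan(a))."""
    return (2.0 / math.pi) * (a - math.atan(a))

for a in (10.0, 1e3, 1e6):
    symmetric = truncated_mean(a, a)        # always 0 by symmetry
    lopsided = truncated_mean(a, 2.0 * a)   # tends to ln(2)/pi, not 0
    second = truncated_second_moment(a)     # grows without bound
    print(a, symmetric, lopsided, second)
```

The symmetric truncation always returns 0, while the lopsided one settles near \(\ln(2)/\pi\). Since the limit depends on how the endpoints are taken, no single value can serve as the mean. The truncated second moment simply keeps growing with \(a\).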

The formula for this curve is much easier to work with than that of the normal distribution, so it is tempting to think it should be used more. The calculations above show why it is not: its theoretical statistics are not well-defined. In the interactive cell below, you can see some of the issues right away by comparing the Cauchy probability function against a normal probability function (with \(\mu = 0\) but with varied standard deviations). Notice especially that as you change the normal distribution's \(\sigma_{\text{normal}}\), the area you see under the normal curve totally overwhelms the area under the stationary Cauchy curve. That means that the two tails of the Cauchy distribution have a lot more area far away from zero than the normal distribution does. This heavy-tailed behavior is one reason the Cauchy distribution does not give good results.
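The tail comparison can also be made numerically. The following sketch computes the two-sided tail area \(P(|X| > c)\) for the Cauchy distribution from its arctangent antiderivative and for a mean-zero normal distribution from the complementary error function. The function names and the cutoffs chosen are illustrative only.

```python
import math

def cauchy_tail(c):
    """P(|X| > c) for the Cauchy distribution, from the arctangent antiderivative."""
    return 1.0 - (2.0 / math.pi) * math.atan(c)

def normal_tail(c, sigma=1.0):
    """P(|X| > c) for a normal distribution with mean 0 and standard deviation sigma."""
    return math.erfc(c / (sigma * math.sqrt(2.0)))

# Columns: cutoff, Cauchy tail, normal tail (sigma = 1), normal tail (sigma = 2)
for c in (1.0, 3.0, 10.0):
    print(c, cauchy_tail(c), normal_tail(c), normal_tail(c, sigma=2.0))
```

At a cutoff of 3 the Cauchy distribution still has roughly a fifth of its area in the tails, while the standard normal has well under one percent there.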

On the other hand, there is another "bell-shaped" distribution that is useful, and its random variable can be created by combining a normal variable and a \(\chi^2\) variable.

Definition 9.4.3. Student-t Distribution.

Suppose Z is a standard normal variable and Y is \(\chi^2(r)\) with Y and Z independent. Define a new random variable

\begin{equation*} T = \frac{Z}{\sqrt{Y/r}}. \end{equation*}

Then, T is said to have a (Student) t distribution with r degrees of freedom, and its probability function is given by

\begin{equation*} f(x) = \frac{\Gamma \left ( \frac{r+1}{2} \right ) }{\sqrt{r \pi} \; \Gamma \left ( \frac{r}{2} \right ) } \left ( 1 + \frac{x^2}{r}\right )^{ - \left ( \frac{r+1}{2} \right )}. \end{equation*}
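To connect the ratio construction with the stated probability function, the sketch below simulates T directly from its definition and compares the fraction of draws landing in \([-2, 2]\) with the area under the density over the same interval. It assumes the standard fact that a \(\chi^2(r)\) variable can be built as a sum of r squared independent standard normals. The function names, the seed, the sample size, and the choice r = 5 are all illustrative.

```python
import math
import random

def t_pdf(x, r):
    """Student t probability function with r degrees of freedom."""
    log_c = math.lgamma((r + 1) / 2) - math.lgamma(r / 2) - 0.5 * math.log(r * math.pi)
    return math.exp(log_c - ((r + 1) / 2) * math.log1p(x * x / r))

def sample_t(r, rng):
    """Draw T = Z / sqrt(Y / r); Y is assumed built as a sum of r squared normals."""
    z = rng.gauss(0.0, 1.0)
    y = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(r))
    return z / math.sqrt(y / r)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

r = 5
rng = random.Random(2024)  # fixed seed so the run is repeatable
draws = [sample_t(r, rng) for _ in range(20_000)]
simulated = sum(1 for t in draws if -2.0 <= t <= 2.0) / len(draws)
from_density = simpson(lambda x: t_pdf(x, r), -2.0, 2.0)
print(simulated, from_density)
```

The two numbers should agree to within ordinary simulation error, which for this sample size is a few thousandths.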

The good news is that this distribution is useful and its properties are presented below without proof.

Theorem 9.4.4. Student t-distribution properties.

For the Student t variable T defined above with r degrees of freedom, if r>1

\begin{equation*} \mu = 0 \end{equation*}

and if r>2

\begin{equation*} \sigma^2 = \frac{r}{r-2} \end{equation*}

and if r>3

\begin{equation*} \gamma_1 = 0 \end{equation*}

and if r>4

\begin{equation*} \gamma_2 = \frac{6}{r-4} + 3. \end{equation*}
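Although these properties are stated without proof, they can be checked numerically for a particular r. The following sketch integrates \(x^2 f(x)\) and \(x^4 f(x)\) against the t probability function for r = 10 and compares the results with the formulas above. The truncation point, grid size, and function names are assumptions made for illustration, and the integration is only approximate.

```python
import math

def t_pdf(x, r):
    """Student t probability function with r degrees of freedom."""
    log_c = math.lgamma((r + 1) / 2) - math.lgamma(r / 2) - 0.5 * math.log(r * math.pi)
    return math.exp(log_c - ((r + 1) / 2) * math.log1p(x * x / r))

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def t_moment(k, r, upper=100.0, n=100_000):
    """Approximate E[T^k] for even k by integrating over [0, upper] and doubling."""
    return 2.0 * simpson(lambda x: x ** k * t_pdf(x, r), 0.0, upper, n)

r = 10
variance = t_moment(2, r)                  # the mean is 0, so Var(T) = E[T^2]
kurtosis = t_moment(4, r) / variance ** 2  # gamma_2 = E[T^4] / sigma^4
print(variance, r / (r - 2))
print(kurtosis, 6 / (r - 4) + 3)
```

Trying the same check with r at or below 4 shows the fourth-moment integral failing to settle as `upper` grows, which matches the restriction r>4 on the kurtosis formula.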

Example 9.4.5. Similarity between Normal and t-distributions for larger r.

Consider the probabilities \(P(-2 \le Z \le 2)\) vs \(P(-2 \le T \le 2)\) for a t-distribution with r=30 degrees of freedom.

For normal,

\begin{equation*} P(-2 \le Z \le 2) = \Phi(2) - \Phi(-2) = 0.9545 \end{equation*}

while for t,

\begin{equation*} P(-2 \le T \le 2) = 0.9454. \end{equation*}

Here is a calculator for obtaining probabilities for the t-distribution over an interval.
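The same kind of interval probability can also be approximated in a few lines of Python. This is a minimal sketch that integrates the t probability function with a composite Simpson rule and uses the error function for the normal case. The helper names are hypothetical, and a statistics library would normally be used instead.

```python
import math

def t_pdf(x, r):
    """Student t probability function with r degrees of freedom."""
    log_c = math.lgamma((r + 1) / 2) - math.lgamma(r / 2) - 0.5 * math.log(r * math.pi)
    return math.exp(log_c - ((r + 1) / 2) * math.log1p(x * x / r))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def normal_cdf(z):
    """Standard normal cumulative distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval(a, b):
    """P(a <= Z <= b) for a standard normal Z."""
    return normal_cdf(b) - normal_cdf(a)

def t_interval(a, b, r):
    """P(a <= T <= b) for a t variable with r degrees of freedom (numerical)."""
    return simpson(lambda x: t_pdf(x, r), a, b)

p_normal = normal_interval(-2.0, 2.0)
p_t = t_interval(-2.0, 2.0, 30)
print(round(p_normal, 4), round(p_t, 4))
```

With r = 30 the two rounded values should reproduce the probabilities quoted in Example 9.4.5.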

As has been our pattern for some time, it is of interest to see what happens to the t-distribution's graph as (in this case) the number of degrees of freedom increases. The interactive cell below illustrates what happens up to 30 degrees of freedom. Notice that at that point, the t-distribution and the normal distribution have almost the same probability function.
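The convergence can also be quantified without a graph. The sketch below measures the largest vertical gap between the t probability function and the standard normal probability function on a grid, for several values of r. The grid limits, the list of r values, and the function names are illustrative assumptions.

```python
import math

def t_pdf(x, r):
    """Student t probability function with r degrees of freedom."""
    log_c = math.lgamma((r + 1) / 2) - math.lgamma(r / 2) - 0.5 * math.log(r * math.pi)
    return math.exp(log_c - ((r + 1) / 2) * math.log1p(x * x / r))

def normal_pdf(x):
    """Standard normal probability function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def max_gap(r, lo=-6.0, hi=6.0, steps=1200):
    """Largest |t_pdf - normal_pdf| on an evenly spaced grid over [lo, hi]."""
    h = (hi - lo) / steps
    return max(
        abs(t_pdf(lo + i * h, r) - normal_pdf(lo + i * h))
        for i in range(steps + 1)
    )

gaps = {r: max_gap(r) for r in (1, 2, 5, 10, 30)}
for r, gap in gaps.items():
    print(r, round(gap, 4))
```

The gap shrinks steadily as r grows and is well under 0.01 by r = 30, consistent with the two curves being nearly indistinguishable at that point.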