Section 9.4 Other "Bell Shaped" distributions
The normal distribution discussed above is very important in statistical analysis. It is not, however, the only distribution that is symmetric about its mean and looks like a bell. In this section, we consider two other options: one that is virtually useless and another that is very useful.
Definition 9.4.1. The Cauchy Distribution.
Consider a continuous random variable on the real numbers defined by
\begin{equation*}
f(x) = \frac{1/\pi}{1+x^2}.
\end{equation*}
A random variable with this probability function is said to have a Cauchy distribution.
Theorem 9.4.2. The Cauchy Distribution.
\begin{equation*}
f(x) = \frac{1/\pi}{1+x^2}
\end{equation*}
is a probability function on \((-\infty,\infty)\).
Proof.
\begin{equation*}
\int_{-\infty}^{\infty} \frac{1}{1+x^2} \, dx = \tan^{-1}(\infty) - \tan^{-1}(-\infty) = \pi/2 - (-\pi/2) = \pi.
\end{equation*}
Dividing by \(\pi\) shows that the Cauchy probability function integrates to 1.
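As a quick numerical check of this proof, the following Python sketch uses the antiderivative \(\tan^{-1}(x)/\pi\) to compute the area under the Cauchy probability function on \([-A, A]\). The function name `cauchy_area` is only an illustrative choice and is not part of the text's own cells.

```python
import math

def cauchy_area(A):
    """Area under f(x) = 1/(pi*(1+x^2)) on [-A, A], from the antiderivative."""
    return (math.atan(A) - math.atan(-A)) / math.pi

# The area creeps toward 1 as the interval widens.
for A in (1, 10, 100, 10000):
    print(A, cauchy_area(A))
```

The area is exactly one half on \([-1, 1]\) and approaches 1 only slowly, roughly like \(1 - 2/(\pi A)\), as \(A\) grows.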
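Now consider the mean. Ignoring the constant factor \(1/\pi\),
\begin{equation*}
\int_{-\infty}^{\infty} x \frac{1}{1+x^2} \, dx = \frac{1}{2} \left( \ln(|\infty|) - \ln(|-\infty|) \right)
\end{equation*}
which is problematic since it has the indeterminate form \(\infty - \infty\). Further, even assuming that the distribution is symmetric and therefore has a mean of 0, the variance requires
\begin{equation*}
\int_{-\infty}^{\infty} x^2 \frac{1}{1+x^2} \, dx.
\end{equation*}
The integrand converges to 1 rather than to 0 as \(x \to \pm\infty\), and therefore the integral diverges. Thus the Cauchy distribution has no variance.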
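To make the trouble with these two integrals concrete, here is a small Python sketch that evaluates them over finite intervals using the antiderivatives \(\frac{1}{2}\ln(1+x^2)\) and \(x - \tan^{-1}(x)\). As in the integrals above, the constant factor \(1/\pi\) is omitted. The function names are hypothetical and chosen only for this illustration.

```python
import math

def truncated_mean_integral(a, b):
    """Integral of x/(1+x^2) over [a, b], from the antiderivative (1/2)ln(1+x^2)."""
    return 0.5 * (math.log(1 + b * b) - math.log(1 + a * a))

def truncated_second_moment(A):
    """Integral of x^2/(1+x^2) over [-A, A], from the antiderivative x - arctan(x)."""
    return 2 * (A - math.atan(A))

for A in (10, 1000, 100000):
    symmetric = truncated_mean_integral(-A, A)      # both limits grow at the same rate
    lopsided = truncated_mean_integral(-A, 2 * A)   # upper limit grows twice as fast
    print(A, symmetric, lopsided, truncated_second_moment(A))
```

With symmetric limits the first integral is 0, but letting the upper limit grow twice as fast drives it toward \(\ln 2\) instead, so the answer depends on how the limits are taken. The second integral simply grows without bound as \(A\) increases.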
The formula for this curve is much easier to work with than that of the normal distribution, so one might think it should be used more. As shown above, however, it is inadequate for most purposes since its theoretical statistics are not well-defined. In the interactive cell below, you can see some issues right away by comparing the Cauchy probability function against a normal probability function with \(\mu = 0\) and varied standard deviations. Notice especially that as you change the normal distribution's \(\sigma_{\text{normal}}\), the area you see under the normal curve overwhelms the area under the stationary Cauchy curve. That means the two tails of the Cauchy distribution hold much more area far from zero than those of the normal distribution. These heavy tails are one reason the Cauchy distribution does not give good results.
# A nice picture of Cauchy compared to normal
f = 1/pi*(1/(1+x^2))  # Cauchy probability function
@interact
def _(sigma=slider(2/10,4,1/10,1,label="$$ \\sigma_{\\text{normal}}$$")):
    # Normal probability function with mean 0 and standard deviation sigma
    g = 1/(sigma*sqrt(2*pi))*e^(-x^2/(2*sigma^2))
    G = (plot(f,(x,-6,6),color='blue')
         +plot(g,(x,-6,6),color='red'))
    T = "Cauchy (blue) vs Normal (red)"
    G.show(title=T,figsize=[5,3])
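The heavier tails of the Cauchy distribution can also be measured directly. The Python sketch below compares the two-sided tail area \(P(|X| > k)\) for the Cauchy distribution with the same area for a normal distribution with mean 0. It relies on the arctangent antiderivative from the proof above and on the standard library's complementary error function. The helper names `cauchy_tail` and `normal_tail` are assumptions made for this illustration.

```python
import math

def cauchy_tail(k):
    """P(|X| > k) for the Cauchy distribution, using its arctangent antiderivative."""
    return 1 - 2 * math.atan(k) / math.pi

def normal_tail(k, sigma=1.0):
    """P(|X| > k) for a normal distribution with mean 0 and standard deviation sigma."""
    return math.erfc(k / (sigma * math.sqrt(2)))

for k in (2, 4, 6):
    print(k, cauchy_tail(k), normal_tail(k))
```

Even at \(k = 6\), the edge of the plotting window used above, the Cauchy distribution still has more than 10% of its area outside \([-k, k]\), while the standard normal has essentially none.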
Definition 9.4.3. Student-t Distribution.
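Suppose \(Z\) is a standard normal variable and \(Y\) is \(\chi^2(r)\) with \(Y\) and \(Z\) independent. Define a new random variable
\begin{equation*}
T = \frac{Z}{\sqrt{Y/r}}.
\end{equation*}
Then \(T\) is said to have a (Student) t distribution with \(r\) degrees of freedom and probability function given by
\begin{equation*}
f(x) = \frac{\Gamma\left(\frac{r+1}{2}\right)}{\sqrt{r\pi}\,\Gamma\left(\frac{r}{2}\right)} \left(1+\frac{x^2}{r}\right)^{-\frac{r+1}{2}}.
\end{equation*}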
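The probability function in this definition can be evaluated directly with the gamma function from Python's standard library. The sketch below is only an illustration, with hypothetical function names. It checks two facts: with one degree of freedom the t probability function coincides with the Cauchy probability function from earlier in this section, and with many degrees of freedom it is close to the standard normal probability function.

```python
import math

def t_density(x, r):
    """Student t probability function with r degrees of freedom."""
    const = math.gamma((r + 1) / 2) / (math.sqrt(r * math.pi) * math.gamma(r / 2))
    return const * (1 + x * x / r) ** (-(r + 1) / 2)

def cauchy_density(x):
    """Cauchy probability function 1/(pi*(1+x^2))."""
    return 1 / (math.pi * (1 + x * x))

def normal_density(x):
    """Standard normal probability function."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

for x in (0.0, 1.0, 2.5):
    print(x, t_density(x, 1), cauchy_density(x), t_density(x, 200), normal_density(x))
```

The agreement at one degree of freedom is exact, since \(\Gamma(1) = 1\) and \(\Gamma(1/2) = \sqrt{\pi}\) reduce the leading constant to \(1/\pi\).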
Theorem 9.4.4. Student t-distribution properties.
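For the Student t variable \(T\) defined above, if \(r > 1\)
\begin{equation*}
\mu = 0
\end{equation*}
and if \(r > 2\)
\begin{equation*}
\sigma^2 = \frac{r}{r-2}
\end{equation*}
and if \(r > 3\)
\begin{equation*}
\gamma_1 = 0
\end{equation*}
and if \(r > 4\)
\begin{equation*}
\gamma_2 = \frac{6}{r-4} + 3.
\end{equation*}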
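To see how these properties behave as the degrees of freedom grow, the following Python sketch tabulates the variance and kurtosis formulas from the theorem. The function names are illustrative assumptions. Each function raises an error outside the range where the theorem says its formula applies.

```python
def t_variance(r):
    """Variance r/(r-2) of the t distribution, valid only when r > 2."""
    if r <= 2:
        raise ValueError("variance formula requires r > 2")
    return r / (r - 2)

def t_kurtosis(r):
    """Kurtosis 6/(r-4) + 3 of the t distribution, valid only when r > 4."""
    if r <= 4:
        raise ValueError("kurtosis formula requires r > 4")
    return 6 / (r - 4) + 3

for r in (5, 10, 30, 100):
    print(r, t_variance(r), t_kurtosis(r))
```

As \(r\) increases, the variance decreases toward 1 and the kurtosis toward 3, which are the values for the standard normal distribution.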
Example 9.4.5. Similarity between Normal and t-distributions for larger n.
Consider the probabilities P(−2≤Z≤2) vs P(−2≤T≤2) for a t-distribution with r=30 degrees of freedom.
For normal,
P(−2≤Z≤2)=Φ(2)−Φ(−2)=0.9545
while for t,
P(−2≤T≤2)=0.9454.
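The two probabilities in this example can be reproduced without statistical tables. The Python sketch below is one possible approach, not the method used by the text's own cells. It obtains the normal probability from the error function and the t probability by integrating the t probability function numerically with the composite Simpson's rule. The helper names and the choice of 1000 subintervals are assumptions made for illustration.

```python
import math

def t_density(x, r):
    """Student t probability function with r degrees of freedom."""
    const = math.gamma((r + 1) / 2) / (math.sqrt(r * math.pi) * math.gamma(r / 2))
    return const * (1 + x * x / r) ** (-(r + 1) / 2)

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

p_normal = math.erf(2 / math.sqrt(2))                 # Phi(2) - Phi(-2)
p_t = simpson(lambda x: t_density(x, 30), -2.0, 2.0)  # P(-2 <= T <= 2) with r = 30
print(round(p_normal, 4), round(p_t, 4))
```

Changing the second argument of `t_density` shows how quickly the t probability approaches the normal one as the degrees of freedom increase.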
pretty_print("Calculator for t-Distribution")
var("x")
@interact(layout=dict(top=[['a', 'b']]))
def _(a=input_box(-1,width=10,label='$$ a =$$'),
      b=input_box(1,width=10,label='$$ b =$$'),
      r0=input_box(5,width=8,label='$$ df = $$')):
    T = RealDistribution('t', r0)  # use built-in t distribution
    # P(a < X < b) as a difference of cumulative probabilities
    P = T.cum_distribution_function(b)-T.cum_distribution_function(a)
    pretty_print(html("$$ P("+str(a)+" < X < "
                      +str(b)+") \\approx "+str(P)+"$$"))
# Display the Student's t distributions with various
# degrees of freedom and compare to the normal distribution
# Copied from www.statmethods.net

x <- seq(-4, 4, length=100)
hx <- dnorm(x)

degf <- c(1, 3, 8, 30)
colors <- c("red", "blue", "darkgreen", "gold", "black")
labels <- c("df=1", "df=3", "df=8", "df=30", "normal")

plot(x, hx, type="l", lty=2, xlab="x value",
     ylab="Density", main="Comparison of t Distributions")

for (i in 1:4){
  lines(x, dt(x,degf[i]), lwd=2, col=colors[i])
}

legend("topright", inset=.05, title="Distributions",
       labels, lwd=2, lty=c(1, 1, 1, 1, 2), col=colors)