Section 8.4 Gamma Distribution
Extending the exponential distribution model developed above, consider a Poisson Process where you start with an interval of variable length X so that X measures the interval needed in order to obtain the rth success for some natural number r. Then and the resulting distribution of X will be called a Gamma distribution.
Theorem 8.4.2. Gamma Function on the natural numbers.
Proof.
Letting n be a natural number and applying integration by parts one time gives
Continuing using an inductive argument to obtain the final result.
To find the probability function for the gamma distribution, once again focus on the development of F(x). Assuming r is a natural number greater than 1 and noting that X measures the interval length needed in order to achieve the rth success
where the discrete Poisson probability function is used on the interval [0,x]. The derivative of this function however is "telescoping" and terms cancel. Indeed,
where you can replace
Notice that for this random variable, can be obtained for the exponential distribution. For the Gamma distribution, the following takes to be the average interval till the first success and then modifies the corresponding Gamma parameters according to increasing values of r.
So, to summarize, you get the Gamma Probability Function below.
Definition 8.4.3. Gamma Probability Function.
The Gamma Distribution is a general one that has some important special cases. We will analyze the Gamma Distributions properties in general but a special case of the Gamma Distribution that extends the exponential distribution is especially appropriate at this point. Indeed, the distribution below considers the case when is an integer.
Definition 8.4.4. Weibull Probability Function.
Checkpoint 8.4.5.
It can be presumed that the life span in years of a certain brand of lightbulbs follows the Poisson process assumptions. Suppose that the mean life span till failure is known to be 0.7 years. To do long-term study, a series of light bulbs are arranged so that when the first one fails, the next one comes on, etc.
Determine the probability that it takes at most 3.5 years before the 4th bulb fails:
Example 8.4.6. Router Requests Revisited Again.
For the third time, let’s consider a router which, over time, has been shown to receive on average 1000 requests in any given 10 minute period during regular working hours and you want to know the likelihood that it takes more than 4 seconds in order to receive the 5th request. As you have already seen, it takes on average minutes to receive the first request so we use that again here. If X were to measure the time interval until the fifth actual request comes in, then the Gamma distribution would be a good model using
The question above asks for
Therefore
Again, since X is a continuous variable you must integrate to compute probabilities. This will require integration by parts or you can use the F(x) from the derivation above. Here, let’s just let Sage do the integration for us noting that You can compute the needed integral using the interactive cell immediately below.
xxxxxxxxxx
var('x')
f = x^4*e^(-x/0.01)/(24*0.01^5)
prob_complement = integrate(f,x,0,4/60)
print('For the interval from 0 to 4/60, the probability above is')
print(n(1-prob_complement))
Theorem 8.4.7. Verify Gamma Probability function.
Proof.
This is kind of a tough integral to do. Let’s just evaluate the sage code below.
xxxxxxxxxx
# Gamma Distribution
var('x,lam,r')
assume(lam>0)
assume(r,'integer')
assume(r>1)
f(x) = x^(r-1)*e^(-x/lam)/(gamma(r)*lam^r)
S = integral(f,x,0,oo).full_simplify()
F = ("$ \\int_0^{\\infty} \\frac{x^{r-1}"+
"e^{-x/ \\lambda}}{\\Gamma(r) \\lambda^r} dx = %s $"%str(S))
show(html(F))
You can graph the gamma distribution’s probability function for various parameters below. Notice as r increases the curve becomes increasingly bell-shaped whicle changing the mean only shifts the curve around.
xxxxxxxxxx
# Gamma Distribution Graphing
var('x,lam,r')
assume(lam>0)
assume(r,'integer')
def _(r=[2,3,6,12,24],mu=slider(1,12,1,5,label='$$\\lambda$$')):
f(x) =x^(r-1)*e^(-x/lam)/(gamma(r)*lam^r)
plot(f,x,0,r*15).show(figsize=[5,3])
Theorem 8.4.8. Properties of the Gamma Distribution.
Proof.
Derivation of mean, variance, skewness, and kurtosis. Pick "alpha" for the general formulas.
xxxxxxxxxx
# Gamma Distribution
var('x,lam,r,alpha')
assume(lam>0)
assume(alpha,'integer')
assume(alpha>1)
def _(r=[2,3,6,9,alpha]):
f(x) =x^(r-1)*e^(-x/lam)/(gamma(r)*lam^r)
mean = integral(x*f,x,0,oo).full_simplify()
M2 = integral(x^2*f,x,0,oo).full_simplify()
M3 = integral(x^3*f,x,0,oo).full_simplify()
M4 = integral(x^4*f,x,0,oo).full_simplify()
print('Mean = ',mean)
v = (M2-mean^2).full_simplify().factor()
print('Variance = ',v)
stand = sqrt(v)
sk = (((M3 - 3*M2*mean + 2*mean^3))/stand^3).full_simplify()
print('Skewness = ',sk)
kurt = (M4 - 4*M3*mean + 6*M2*mean^2 -3*mean^4).factor()/stand^4
print('Kurtosis = ',(kurt-3).factor(),'+3')
The interactive cell below can be used to compute the distribution function for the gamma distribution for various input values. If you desire to let r get bigger than the slider allows, feel free to edit the cell above and evaluate again.
xxxxxxxxxx
# Gamma Distribution Calculator
var('x,lam,r')
pretty_print('Enter the number of successes r desired,')
pretty_print('the given lambda, and the value of X to get F(X)')
layout=dict(top=[['lambda','b']],bottom=[['r']])) (
def _(r=slider(1,10,1,2, label="$$ r = $$"),
lam = input_box(2,label="$$\\lambda = $$",width=10),
b=input_box(2,label="$$ X = $$",width=10)):
f(x) =x^(r-1)*e^(-x/lam)/(gamma(r)*lam^r)
p = integral(f,x,0,b)
pretty_print(html('$$\\text{Probability} = %s'%str(latex(p))
+' \\approx %s$$'%str(p.n(digits=5))))
xxxxxxxxxx
r=3 # the number of successes desired
mu1 = 3 # the mean till first must be given
mu = mu1*r
sdev = sqrt(r)*mu1 # the formula for the standard deviation
M = mu*3 # the space is infinite but we just go out 3 standard deviations
X = 0:M # quantiles for the space R of the random variable
Ppois <- function(x){dgamma(x, shape=r, scale=mu1 )} # create the probability function over X
curve(Ppois, from=0, to=M, xlab="X", col="blue", lwd=3,
main="Gamma Sampling vs Gamma Curve vs Approximating 'Bell Curve'")
Pnormal <- function(X){dnorm(X, mean=mu, sd=sdev)} # to overlap a bell curve
curve(Pnormal, col="red", lwd=2, add=TRUE)
Psample = rgamma(10^6, shape=r, scale=mu1) # to create a histogram, sample a lot
# Xtop=max(Psample) # for scaling the x-axis. Shift by 1/2 below.
hist(Psample, prob=TRUE, add=TRUE)