Section 7.4 Negative Binomial
Consider the situation where one can observe a sequence of independent trials where the likelihood of a success on each individual trial stays constant from trial to trial. Call this likelihood the probability of "success" and denote its value by p, where 0<p<1. If we let the variable X measure the number of trials needed in order to obtain the rth success, r≥1, with R={r,r+1,r+2,...}, then the resulting distribution of probabilities is called a Negative Binomial Distribution. Note that r=1 gives the Geometric Distribution.

Theorem 7.4.1. Negative Binomial Series.

For \(|a| < |b|\) and any positive integer n,

\(\frac{1}{(a+b)^n} = \sum_{k=0}^{\infty} (-1)^k \binom{n+k-1}{k} \frac{a^k}{b^{n+k}}.\)
Proof.
First, convert the problem to a slightly different form: \(\frac{1}{(a+b)^n} = \frac{1}{b^n} \frac{1}{(\frac{a}{b}+1)^n} = \frac{1}{b^n} \sum_{k=0}^{\infty} {(-1)^k \binom{n + k - 1}{k} \left ( \frac{a}{b} \right ) ^k}\)
So, let's replace \(\frac{a}{b}\) with x and ignore for a while the term factored out. Then, we only need to show

\(\frac{1}{(1+x)^n} = \sum_{k=0}^{\infty} (-1)^k \binom{n+k-1}{k} x^k.\)

However, by the geometric series,

\(\frac{1}{(1+x)^n} = \left ( \frac{1}{1+x} \right )^n = \left ( \sum_{m=0}^{\infty} (-1)^m x^m \right )^n.\)
This infinite sum raised to a power can be expanded by distributing terms in the standard way. In doing so, the various powers of x multiplied together will create a series in powers of x involving \(x^0, x^1, x^2, ...\text{.}\) To determine the final coefficients, notice that the number of times \(x^k\) will appear in this product depends upon the number of ways one can write k as a sum of n nonnegative integers.
For example, the coefficient of \(x^3\) will come from the ways of multiplying the terms \(x^3, x^0, ..., x^0\) and \(x^2, x^1, x^0, ..., x^0\) and \(x^1, x^1, x^1, x^0,..., x^0\text{.}\) This is equivalent to finding the number of ways to write the number k as a sum of n nonnegative integers. The possible set of nonnegative integers is {0,1,2,...,k} and one way to count the combinations is to separate k *'s by n-1 |'s. For example, if n = 3 and k = 3, then *||** means \(x^1 x^0 x^2 = x^3\text{.}\) Similarly, for n = 5 and k = 5, |**|*|**| implies \(x^0 x^2 x^1 x^2 x^0 = x^5\text{.}\) The number of ways to interchange the identical *'s among the identical |'s is \(\binom{n+k-1}{k}\text{.}\)
Furthermore, each factor \((-1)^m x^m\) carries the sign \((-1)^m\), so any product contributing to \(x^k\) carries the sign \((-1)^{m_1 + \cdots + m_n} = (-1)^k\text{.}\) That is, the coefficient of each even power of x is positive and the coefficient of each odd power of x is negative. Therefore,

\(\frac{1}{(1+x)^n} = \sum_{k=0}^{\infty} (-1)^k \binom{n+k-1}{k} x^k.\)
Similarly, replacing x with -x,

\(\frac{1}{(1-x)^n} = \sum_{k=0}^{\infty} \binom{n+k-1}{k} x^k.\)
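The stars-and-bars count used above can be spot-checked by brute force. Below is a small sketch in Python (the helper name `compositions` is mine, not from the text) that enumerates all ordered ways to write k as a sum of n nonnegative integers and compares the count to \(\binom{n+k-1}{k}\text{:}\)

```python
from itertools import product
from math import comb

def compositions(n, k):
    """Count ordered n-tuples of nonnegative integers that sum to k."""
    return sum(1 for t in product(range(k + 1), repeat=n) if sum(t) == k)

# The brute-force count matches the stars-and-bars formula C(n+k-1, k)
for n in range(1, 5):
    for k in range(0, 5):
        assert compositions(n, k) == comb(n + k - 1, k)
```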
Theorem 7.4.2. Negative Binomial Probability Function.
\(f(x) = \binom{x-1}{r-1} p^r (1-p)^{x-r}\)

for x∈R={r,r+1,...}.
Proof.
Since successive trials are independent, the probability that the rth success occurs on the xth trial presumes that the previous x-1 trials contained r-1 successes and x-r failures. You can arrange these indistinguishable successes (and failures) in \(\binom{x-1}{r-1}\) unique ways. Therefore the desired probability is given by

\(f(x) = \binom{x-1}{r-1} p^{r-1} (1-p)^{x-r} \cdot p = \binom{x-1}{r-1} p^r (1-p)^{x-r}.\)
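As a quick sketch, the probability function just derived can be written directly in Python (the function name `neg_binom_pmf` is mine, not from the text):

```python
from math import comb

def neg_binom_pmf(x, r, p):
    """P(rth success occurs on trial x) = C(x-1, r-1) p^r (1-p)^(x-r)."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

# With r = 1 this reduces to the geometric probability p(1-p)^(x-1):
print(neg_binom_pmf(3, 1, 0.5))  # 0.125
```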
Example 7.4.3. Rolling a die until getting a third success.
Once again, consider rolling our 24-sided die until you get a multiple of 9...that is, either a 9 or an 18...for the third time. Once again, the probability of getting a 9 or 18 on any given roll is p=1/12, but since we will continue rolling until we get a success for the third time, this is modeled by a negative binomial distribution and you are looking for

\(f(x) = \binom{x-1}{2} \left ( \frac{1}{12} \right )^3 \left ( \frac{11}{12} \right )^{x-3}, \quad x \in \{3, 4, 5, ...\}.\)
Computing f(x) for any given x is relatively painless but computing F(x) could take some effort. There is generally not a graphing calculator distribution option for negative binomial, but the interactive cells below can be utilized to help with the tedious computations. For example, if you were interested in P(X<20)=P(X≤19)=F(19), then the interactive calculator below using r=3 and p=1/12 gives F(19)≈0.20737.
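The value F(19) quoted above is easy to verify by summing the probability function directly; a sketch in Python:

```python
from math import comb

r, p = 3, 1/12
# F(19) = P(X <= 19): sum f(x) = C(x-1, r-1) p^r (1-p)^(x-r) for x = 3..19
F19 = sum(comb(x - 1, r - 1) * p**r * (1 - p)**(x - r) for x in range(r, 20))
print(round(F19, 5))  # approximately 0.20737
```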
Example 7.4.4. Testing a critical component until failure.
Let's once again test a series of critical system components until you find two that fail. Again, suppose a particular component has a p=0.01 probability of breaking on any given trial. Since you will stop when you encounter the second failure, you can model this situation with a negative binomial distribution, and the probability that the 2nd component fails on the xth trial is given by

\(f(x) = \binom{x-1}{1} (0.01)^2 (0.99)^{x-2}, \quad x \in \{2, 3, 4, ...\}.\)
Once again, let's compute the likelihood that you get the second failure on one of the first five trials. Then,

\(F(5) = \sum_{x=2}^{5} \binom{x-1}{1} (0.01)^2 (0.99)^{x-2} \approx 0.00098,\)

which is still relatively small.
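A quick numerical check of this probability (a sketch; the variable names are mine):

```python
from math import comb

r, p = 2, 0.01
# P(second failure occurs within the first five trials) = F(5)
F5 = sum(comb(x - 1, r - 1) * p**r * (1 - p)**(x - r) for x in range(r, 6))
print(F5)  # roughly 0.00098
```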
Theorem 7.4.5. Negative Binomial Distribution Sums to 1.

\(\sum_{x=r}^{\infty} \binom{x-1}{r-1} p^r (1-p)^{x-r} = 1.\)
Proof.
Consider

\(\sum_{x=r}^{\infty} \binom{x-1}{r-1} p^r (1-p)^{x-r}\)

and by using \(k = x-r\) this becomes

\(p^r \sum_{k=0}^{\infty} \binom{k+r-1}{k} (1-p)^{k} = p^r \cdot \frac{1}{(1-(1-p))^r} = \frac{p^r}{p^r} = 1\)

by the Negative Binomial Series, Theorem 7.4.1.
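You can also see the probabilities accumulate to 1 numerically; below is a sketch that truncates the infinite sum after enough terms (the helper name `total_prob` is mine):

```python
from math import comb

def total_prob(r, p, terms=3000):
    """Partial sum of the negative binomial probabilities from x = r on."""
    return sum(comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)
               for x in range(r, r + terms))

# The truncated sum equals 1 up to a negligible tail
assert abs(total_prob(3, 1/12) - 1) < 1e-9
```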
## Negative Binomial calculator
@interact
def _(p=input_box(1/12,width=15), r=slider(1,20,1,2)):
    n = 4*(floor(r/p)+1)
    np1 = n+1
    R = range(r,np1)
    f(x) = ((factorial(x-1)/(factorial(r-1)*factorial(x-r)))
            *(1-p)^(x-r)*p^r)
    acc = 0
    for k in R:
        prob = f(x=k)
        acc = acc+prob
        pretty_print('f(%s) = '%k, ' %.8f'%prob,
                     ' and F(%s) = '%k, ' %.8f'%acc)
Checkpoint 7.4.6.
Theorem 7.4.7. Statistics for Negative Binomial Distribution.
For the Negative Binomial Distribution,

\(\mu = \frac{r}{p} \qquad \text{and} \qquad \sigma^2 = \frac{r(1-p)}{p^2},\)

with

\(\text{Skewness} = \frac{2-p}{\sqrt{r(1-p)}} \qquad \text{and} \qquad \text{Kurtosis} = \frac{6}{r} + \frac{p^2}{r(1-p)} + 3.\)
Proof.
# Negative Binomial
var('x,n,p,r,alpha')
assume(x,'integer')
assume(alpha,'integer')
assume(alpha > 2)
assume(0 < p < 1)
@interact
def _(r=[2,5,10,15,alpha]):
    f(x) = binomial(x-1,r-1)*p^r*(1-p)^(x-r)
    mu = sum(x*f(x),x,r,oo).full_simplify()
    M2 = sum(x^2*f(x),x,r,oo).full_simplify()
    M3 = sum(x^3*f(x),x,r,oo).full_simplify()
    M4 = sum(x^4*f(x),x,r,oo).full_simplify()
    pretty_print('Mean = ',mu)
    v = (M2-mu^2).full_simplify()
    pretty_print('Variance = ',v)
    stand = sqrt(v)
    sk = ((M3 - 3*M2*mu + 2*mu^3).full_simplify()/stand^3).factor()
    pretty_print('Skewness = ',sk)
    kurt = ((M4 - 4*M3*mu + 6*M2*mu^2 - 3*mu^4)/v^2).full_simplify()
    pretty_print('Kurtosis = ',(kurt-3).factor(),'+3')
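The closed forms for the mean and variance can also be checked numerically by truncating the moment sums; a sketch in plain Python (the helper name `moment` is mine), assuming \(\mu = r/p\) and \(\sigma^2 = r(1-p)/p^2\text{:}\)

```python
from math import comb

def moment(k, r, p, terms=3000):
    """Approximate E[X^k] for the negative binomial by a truncated sum."""
    return sum(x**k * comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)
               for x in range(r, r + terms))

r, p = 3, 0.4
mu = moment(1, r, p)
var = moment(2, r, p) - mu**2
assert abs(mu - r / p) < 1e-8                 # mean = r/p = 7.5
assert abs(var - r * (1 - p) / p**2) < 1e-6   # variance = r(1-p)/p^2 = 11.25
```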
Theorem 7.4.8. Negative Binomial Limiting Distribution.
For the negative binomial distribution 7.4.2, as r→∞,

\(\text{Skewness} = \frac{2-p}{\sqrt{r(1-p)}} \to 0\)

and

\(\text{Kurtosis} = \frac{6}{r} + \frac{p^2}{r(1-p)} + 3 \to 3.\)
Proof.
For skewness, take the limit of the skewness result above:

\(\lim_{r \to \infty} \frac{2-p}{\sqrt{r(1-p)}} = 0.\)

Similarly for kurtosis,

\(\lim_{r \to \infty} \left ( \frac{6}{r} + \frac{p^2}{r(1-p)} + 3 \right ) = 3,\)

which matches the skewness and kurtosis of the normal distribution.
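These limits can be illustrated numerically. Below is a sketch using the closed forms for skewness and kurtosis, \((2-p)/\sqrt{r(1-p)}\) and \(6/r + p^2/(r(1-p)) + 3\) (the function names are mine):

```python
from math import sqrt

def skewness(r, p):
    """Closed-form negative binomial skewness."""
    return (2 - p) / sqrt(r * (1 - p))

def kurtosis(r, p):
    """Closed-form negative binomial kurtosis (not excess)."""
    return 6 / r + p**2 / (r * (1 - p)) + 3

p = 0.4
for r in (1, 10, 100, 10000):
    print(r, round(skewness(r, p), 4), round(kurtosis(r, p), 4))
# skewness shrinks toward 0 and kurtosis approaches 3 as r grows
```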
r = 3
p = 0.4
M = 20*r
X = r:M                  # the space R should be infinite
Xshift = 0:(M-r)         # R does dnbinom, etc. by letting
                         # x measure the number of failures
                         # rather than the number of total trials.
                         # So, x = r would be 0 failures and
                         # R would count that as x = 0.
mu = r/p                 # the formula for the mean of the Negative Binomial Distribution
sdev = sqrt(r*(1-p)/p^2) # the formula for the standard deviation
if(M < 101){
  dnbinom( Xshift, r, p )          # let's print out a bunch of actual probs
}
Pnbinom = dnbinom(Xshift, r, p )   # the probability function over X

Psample = rnbinom(10^6, r, p)      # create a histogram, sample a lot

Xtop = max(Psample)      # for scaling the x-axis. Shift by 1/2 below.
hist(Psample, prob=TRUE, br=(-1:Xtop)+0.5, col="skyblue2", xlab="X - r",
     main="Negative Binomial pdf vs Approximating 'Bell Curve'")

points(Xshift, Pnbinom, pch=19, col="darkgreen")   # actual (x, f(x))

Pnormal <- function(X){dnorm(X, mean=mu-r, sd=sdev)}
# overlay a bell curve, shifting our mean by r to fit the R protocol
curve(Pnormal, col="red", lwd=2, add=TRUE)