Binomial Distribution

Section 7.2 Binomial Distribution

Consider a sequence of $n$ independent Bernoulli trials in which the probability of success $p$ stays constant from trial to trial, with $0 \lt p \lt 1 \text{.}$ If the variable $X$ counts the number of successes obtained in the fixed number of trials $n$, with space $R = \{ 0, 1, ..., n \}\text{,}$ then the resulting probability distribution is called a Binomial Distribution.

# Binomial distribution over 0 .. n
# Probability of success on one independent trial = p must also be given
var('x')
@interact
def _(n=slider(3,50,1,3),p=slider(1/20,19/20,1/20,1/2)):
    # Build the space R = {0, 1, ..., n} as a list so that it prints its elements
    R = list(range(n+1))
    f(x) = factorial(n)/(factorial(x)*factorial(n-x))*p^x*(1-p)^(n-x)
    pretty_print(html('Density Function: $f(x) =%s$'%str(latex(f(x)))))
    pretty_print(html('over the space $R = %s$'%str(R)))
    G = points((k,f(x=k)) for k in R)
    G.show()
    R = [k for k in R]
    probs = [f(x=k) for k in R]
#    H = histogram( R, weights = probs, align="mid", linewidth=2, edgecolor="blue", color="yellow")
#    H.show()
    for k in R:
        pretty_print(html('$f(%s'%k+') = %s'%latex(f(x=k))+' \\approx %s$'%f(x=k).n(digits=5)))

You can also compute specific values and graph the Binomial Distribution using the R language:

n <- 10
p <- 0.3

paste('Probability Function')
dbinom(0:n, n, p)   # gives the probability function f(x) for x = 0, ..., n
paste('Distribution function')
pbinom(0:n, n, p)   # gives the distribution function F(x) for x = 0, ..., n
paste('A random sample')
rbinom(15, n, p)    # gives a random sample of 15 items from b(n,p)

x <- dbinom(0:n, size=n, prob=p)
barplot(x, names.arg=0:n, main=paste0('n = ', n, ' and p = ', p))

Theorem 7.2.1 Derivation of Binomial Probability Function

For R = {0, 1, ..., n},

$\begin{equation*} f(x) = \binom{n}{x}p^x(1-p)^{n-x} \end{equation*}$

Proof

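Since successive trials are independent, any particular sequence of $x$ successes and $n-x$ failures, such as $SS...SFF...F$, has probability $p^x(1-p)^{n-x}$. There are $\binom{n}{x}$ such sequences, so the probability of exactly $x$ successes occurring within $n$ trials is given by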

\begin{equation*} P(X=x) = \binom{n}{x}P(SS...SFF...F) = \binom{n}{x}p^x(1-p)^{n-x} \end{equation*}
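The counting argument in this proof can be checked by brute force for a small number of trials. The following is a minimal Python sketch, not part of the original Sage or R cells: it lists every success/failure sequence, multiplies the trial probabilities (which is where independence is used), and groups the totals by the number of successes. The function name `pmf_by_enumeration` and the values n = 5, p = 3/10 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb


def pmf_by_enumeration(n, p):
    """Return [P(X=0), ..., P(X=n)] by summing over all 2^n outcome sequences."""
    totals = [Fraction(0)] * (n + 1)
    for seq in product((0, 1), repeat=n):  # 1 = success, 0 = failure
        prob = Fraction(1)
        for outcome in seq:
            # Independence: multiply the probabilities of the individual trials
            prob *= p if outcome == 1 else 1 - p
        totals[sum(seq)] += prob
    return totals


n, p = 5, Fraction(3, 10)  # illustrative values, chosen small so 2^n stays tiny
enumerated = pmf_by_enumeration(n, p)
formula = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# Exact rational arithmetic, so the comparison is not affected by rounding
print(enumerated == formula)
```

Enumeration grows like 2^n, so this is only a sanity check for small n; the closed formula is what one would use in practice.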

Theorem 7.2.2 Verification of Binomial Distribution Formula

$\begin{equation*} \sum_{x \in R} f(x) = \sum_{x=0}^n \binom{n}{x}p^x(1-p)^{n-x} = 1. \end{equation*}$

Proof

Using the Binomial Theorem with a = p and b = 1-p yields

\begin{equation*} \sum_{x=0}^n \binom{n}{x}p^x(1-p)^{n-x} = (p + (1-p))^n = 1 \end{equation*}
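As a quick numerical companion to this identity, the sketch below sums the probability function exactly for a few choices of n and p. It is a hedged illustration in Python using rational arithmetic; the helper name `binomial_pmf` and the particular grid of values are assumptions made for the example.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return the list [f(0), ..., f(n)] for the binomial probability function."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


# Illustrative grid of parameters; any 0 < p < 1 and positive integer n would do
checks = {
    (n, p): sum(binomial_pmf(n, p))
    for n in (1, 4, 10, 25)
    for p in (Fraction(1, 20), Fraction(1, 2), Fraction(19, 20))
}

# Every total should be exactly 1 because Fractions avoid floating-point error
print(all(total == 1 for total in checks.values()))
```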

Use the interactive cell below to compute f(x) and F(x) for the Binomial distribution.

# Binomial calculator
@interact
def _(p=input_box(0.3,width=15),n=input_box(10,width=15)):
    R = range(n+1)
    f(x) = binomial(n,x)*p^x*(1-p)^(n-x)
    acc = 0
    for k in R:
        prob = f(x=k)
        acc = acc+prob
        pretty_print('f(%s) = '%k,' %.8f'%prob,' and F(%s) = '%k,' %.8f'%acc)

Theorem 7.2.3 Binomial Distribution Statistics

For the Binomial Distribution

$\begin{equation*} \mu = np \end{equation*}$

$\begin{equation*} \sigma^2 = np(1-p) \end{equation*}$

$\begin{equation*} \gamma_1 = \frac{1-2p}{\sqrt{np(1-p)}} \end{equation*}$

$\begin{equation*} \gamma_2 = \frac{1-6p(1-p)}{np(1-p)} + 3 \end{equation*}$

Proof

For the mean,

\begin{align*} \mu & = E[X] \\ & = \sum_{x=0}^{n} {x \binom{n}{x} p^x (1-p)^{n-x}}\\ & = \sum_{x=1}^{n} {x \frac{n(n-1)!}{x(x-1)!(n-x)!} p^x (1-p)^{n-x}}\\ & = np \sum_{x=1}^{n} {\frac{(n-1)!}{(x-1)!((n-1)-(x-1))!} p^{x-1} (1-p)^{(n-1)-(x-1)}} \end{align*}

Using the change of variables $k=x-1$ and $m = n-1$ yields a binomial series

\begin{align*} & = np \sum_{k=0}^{m} {\frac{m!}{k!(m-k)!} p^k (1-p)^{m-k}}\\ & = np (p + (1-p))^m = np \end{align*}

For the variance,

\begin{align*} \sigma^2 & = E[X(X-1)] + \mu - \mu^2 \\ & = \sum_{x=0}^{n} {x(x-1) \binom{n}{x} p^x (1-p)^{n-x}} + np - n^2p^2\\ & = \sum_{x=2}^{n} {x(x-1) \frac{n(n-1)(n-2)!}{x(x-1)(x-2)!(n-x)!} p^x (1-p)^{n-x}} + np - n^2p^2\\ & = n(n-1)p^2 \sum_{x=2}^{n} {\frac{(n-2)!}{(x-2)!((n-2)-(x-2))!} p^{x-2} (1-p)^{(n-2)-(x-2)}} + np - n^2p^2 \end{align*}

Using the change of variables $k=x-2$ and $m = n-2$ yields a binomial series

\begin{align*} & = n(n-1)p^2 \sum_{k=0}^{m} {\frac{m!}{k!(m-k)!} p^k (1-p)^{m-k}} + np - n^2p^2\\ & = n(n-1)p^2 + np - n^2p^2 = np - np^2 = np(1-p) \end{align*}
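The mean and variance formulas can be spot-checked for specific parameters by computing the sums directly. The sketch below mirrors the derivation by using $E[X(X-1)] + \mu - \mu^2$ for the variance; it is an illustrative Python check with assumed values n = 12 and p = 1/4, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return the list [f(0), ..., f(n)] for the binomial probability function."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def mean_and_variance(n, p):
    """Compute mu = E[X] and sigma^2 = E[X(X-1)] + mu - mu^2 from the definition."""
    f = binomial_pmf(n, p)
    mu = sum(x * f[x] for x in range(n + 1))
    second_factorial_moment = sum(x * (x - 1) * f[x] for x in range(n + 1))
    return mu, second_factorial_moment + mu - mu**2


n, p = 12, Fraction(1, 4)  # illustrative values
mu, var = mean_and_variance(n, p)

print(mu == n * p)             # compares with mu = np
print(var == n * p * (1 - p))  # compares with sigma^2 = np(1-p)
```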

The skewness and kurtosis can be found similarly using formulas involving E[X(X-1)(X-2)] and E[X(X-1)(X-2)(X-3)]. The complete determination is performed using Sage below.

The following Sage cell derives the general formulas for the mean, variance, skewness, and kurtosis of the Binomial distribution symbolically.

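# Symbolic moments of the Binomial distribution
var('x,n,p')
assume(x,'integer')
f(x) = binomial(n,x)*p^x*(1-p)^(n-x)

# Raw moments E[X], E[X^2], E[X^3], E[X^4]
mu = sum(x*f,x,0,n)
M2 = sum(x^2*f,x,0,n)
M3 = sum(x^3*f,x,0,n)
M4 = sum(x^4*f,x,0,n)

pretty_print('Mean = ',mu)

# Variance is the second central moment
v = (M2-mu^2).factor()
pretty_print('Variance = ',v)
stand = sqrt(v)

# Skewness: third central moment divided by sigma^3
sk = ((M3 - 3*M2*mu + 2*mu^3)).factor()/stand^3
pretty_print('Skewness = ',sk)

# Kurtosis: fourth central moment divided by sigma^4
kurt = (M4 - 4*M3*mu + 6*M2*mu^2 -3*mu^4).factor()/stand^4
pretty_print('Kurtosis = ',(kurt-3).factor(),'+3')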
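For readers without Sage at hand, the skewness and kurtosis formulas of Theorem 7.2.3 can also be checked numerically for one choice of parameters. The following Python sketch computes central moments straight from the probability function; the helper `central_moment` and the values n = 10, p = 3/10 are assumptions for illustration, and the skewness comparison uses floating point because of the square root.

```python
from fractions import Fraction
from math import comb, isclose, sqrt


def central_moment(n, p, k):
    """Return E[(X - mu)^k] for the binomial distribution, computed from f(x)."""
    f = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    mu = sum(x * f[x] for x in range(n + 1))
    return sum((x - mu) ** k * f[x] for x in range(n + 1))


n, p = 10, Fraction(3, 10)  # illustrative values
var = central_moment(n, p, 2)

# Skewness = third central moment / sigma^3 (float, since sigma is a square root)
skew = float(central_moment(n, p, 3)) / float(var) ** 1.5
skew_formula = float(1 - 2 * p) / sqrt(float(n * p * (1 - p)))

# Kurtosis = fourth central moment / sigma^4 (exact, since sigma^4 = var^2)
kurt = central_moment(n, p, 4) / var**2
kurt_formula = (1 - 6 * p * (1 - p)) / (n * p * (1 - p)) + 3

print(isclose(skew, skew_formula), kurt == kurt_formula)
```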

Flipping Coins

Suppose you flip a coin exactly 20 times. Determine the probability of getting exactly 10 heads and then determine the probability of getting 10 or fewer heads.

Solution

This is a binomial distribution with n = 20 and p = 1/2, and you are looking for f(10). With these values,

\begin{equation*} f(10) = \binom{20}{10} \cdot \left ( \frac{1}{2} \right )^{10} \cdot \left ( \frac{1}{2} \right )^{20-10} = \frac{46189}{262144} \approx 0.176 \end{equation*}

Notice that the mean of this distribution is also 10, so 10 heads is the expected number, even though that exact outcome occurs only about 17.6% of the time. Next, determining the probability of 10 or fewer heads requires F(10) = f(0) + f(1) + ... + f(10). There is no "nice" formula for F, but this calculation can be performed using a graphing calculator, such as the TI-84 with F(x) = binomcdf(n,p,x). In this case, F(10) = binomcdf(20,1/2,10) $\approx 0.588$.
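The same two values can be reproduced without a calculator. The sketch below is a small Python illustration using exact fractions; the variable names `f10` and `F10` are chosen here for the example and are not part of the original text.

```python
from fractions import Fraction
from math import comb

n, p = 20, Fraction(1, 2)

# Probability function f(x) for x = 0, ..., 20
f = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

f10 = f[10]        # probability of exactly 10 heads
F10 = sum(f[:11])  # probability of 10 or fewer heads: f(0) + ... + f(10)

print(f10, round(float(f10), 3))
print(F10, round(float(F10), 3))
```

Because the distribution is symmetric when p = 1/2, F(10) also equals (1 + f(10))/2, which gives a quick consistency check on the sum.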