Bayes’ Theorem

Section 4.6 Bayes’ Theorem

Conditional probabilities 4.5.3 can be computed using the methods developed above if the appropriate information is available. Sometimes, however, you will have some information available, such as

P (A | B)

but need

P (B | A) .

The ability to "play around with history" by switching what has been presumed to occur leads to an important result known as Bayes’ Theorem.

Theorem 4.6.1. Bayes’ Theorem.

Let

S_{1}, S_{2}, \ldots, S_{m}

be pairwise disjoint events with

S_{1} \cup S_{2} \cup \cdots \cup S_{m} = S

(i.e. a partition of the sample space S) and with each P (S_{k}) > 0. Then for any

A \subseteq S

with P (A) > 0,

P (S_{j} | A) = \frac{P (S_{j}) P (A | S_{j})}{\sum_{k = 1}^{m} P (S_{k}) P (A | S_{k})} .

The conditional probability

P (S_{j} | A)

is called the posterior probability of

S_{j} .

Proof.

Notice, by the definition of conditional probability 4.5.3 and the multiplication rule 4.5.5

P (S_{j} | A) = \frac{P (S_{j} \cap A)}{P (A)} = \frac{P (S_{j}) P (A | S_{j})}{P (A)} .

But using the disjointness of the partition

\begin{aligned} P (A) & = P ((A \cap S_{1}) \cup (A \cap S_{2}) \cup \cdots \cup (A \cap S_{m})) \\ & = P (A \cap S_{1}) + P (A \cap S_{2}) + \cdots + P (A \cap S_{m}) \\ & = P (S_{1} \cap A) + P (S_{2} \cap A) + \cdots + P (S_{m} \cap A) \\ & = P (S_{1}) P (A | S_{1}) + P (S_{2}) P (A | S_{2}) + \cdots + P (S_{m}) P (A | S_{m}) \\ & = \sum_{k = 1}^{m} P (S_{k}) P (A | S_{k}) \end{aligned}

Put these two expansions together to obtain the desired result.
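
The theorem translates directly into a few lines of code. The following is a minimal Python sketch, not part of the original text: the function name `posterior` and its argument names are illustrative choices. It takes the partition probabilities P(S_k) and the conditional probabilities P(A | S_k) and returns P(A) together with every posterior P(S_j | A). The sample inputs are the default values of the interactive cell later in this section.

```python
def posterior(priors, likelihoods):
    """Return (P(A), [P(S_1 | A), ..., P(S_m | A)]) via Bayes' Theorem.

    priors[k]      plays the role of P(S_k); the values must sum to 1.
    likelihoods[k] plays the role of P(A | S_k).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1) > 1e-9:
        raise ValueError("partition probabilities must sum to 1")
    # Numerators: P(S_k) P(A | S_k) = P(S_k and A), by the multiplication rule.
    joint = [p * q for p, q in zip(priors, likelihoods)]
    # Denominator: P(A) is the sum of those joint probabilities.
    p_A = sum(joint)
    if p_A == 0:
        raise ValueError("P(A) = 0, so conditioning on A is undefined")
    return p_A, [j / p_A for j in joint]


p_A, post = posterior([0.35, 0.25, 0.40], [0.02, 0.01, 0.03])
print(p_A)    # P(A)
print(post)   # P(S_1 | A), P(S_2 | A), P(S_3 | A)
```

The posteriors always sum to 1, which is a quick sanity check on any hand computation.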

To illustrate this result, consider the following problem from the web site http://stattrek.com/probability/bayes-theorem.aspx:

Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn’t rain, he incorrectly forecasts rain 10% of the time. What is the probability that it will rain on the day of Marie’s wedding?

Notice, all days can be classified into one of two disjoint options:

Rainy, in which case we can deduce from the given info that P(Rain) = 5/365
Not Rainy, and since this is the complement of above, P(Not Rain) = 360/365

In the notation of Bayes’ Theorem 4.6.1, let A represent a forecast of rain and note that you have

P (Rain) = P (S_{1}) = \frac{5}{365}

and

P (Not Rain) = P (S_{2}) = \frac{360}{365} .

Further, you are given the conditional probabilities

P (Forecast Rain | Rain) = P (A | S_{1}) = 0.9

P (Forecast Rain | Not Rain) = P (A | S_{2}) = 0.1

Notice that the question asks for the probability of Rain given that the weatherman has forecast rain. What is given, on the other hand, is the reverse of that conditional probability. Bayes’ Theorem allows you to turn this around. First, the denominator is the overall probability of a rain forecast:

\begin{aligned} P (Forecast Rain) = P (A) & = P (S_{1}) P (A | S_{1}) + P (S_{2}) P (A | S_{2}) \\ & = \frac{5}{365} \cdot 0.9 + \frac{360}{365} \cdot 0.1 \end{aligned}

Hence, putting these together gives

\begin{aligned} P (Rain | Forecast Rain) & = \frac{\frac{5}{365} \cdot 0.9}{\frac{5}{365} \cdot 0.9 + \frac{360}{365} \cdot 0.1} \\ & = \frac{5 \cdot 0.9}{5 \cdot 0.9 + 360 \cdot 0.1} \\ & = \frac{45}{45 + 360} \approx 0.111 \end{aligned}

So, normally there is only approximately a 1.37 percent chance of rain (5/365) on a given day, but given that the weatherman has forecast rain, the chance of rain increases to a little more than 11 percent.
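
The arithmetic in this example can be checked exactly with rational numbers. The sketch below is an illustration only (the variable names are made up for this purpose); it uses Python's standard `fractions` module so that no rounding enters until the very end.

```python
from fractions import Fraction

p_rain = Fraction(5, 365)              # P(S_1) = P(Rain)
p_dry = 1 - p_rain                     # P(S_2) = P(Not Rain) = 360/365
p_fc_given_rain = Fraction(9, 10)      # P(A | S_1)
p_fc_given_dry = Fraction(1, 10)       # P(A | S_2)

# Denominator of Bayes' Theorem: total probability of a rain forecast.
p_forecast = p_rain * p_fc_given_rain + p_dry * p_fc_given_dry

# Posterior probability of rain given a rain forecast.
p_rain_given_forecast = p_rain * p_fc_given_rain / p_forecast

print(p_forecast)                      # 81/730
print(p_rain_given_forecast)           # 1/9
print(float(p_rain_given_forecast))    # about 0.111
```

The exact posterior is 1/9, which agrees with the value 45/405 obtained above.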

Here’s a standard and yet surprising example:

Your neighbor has 2 children. You learn that he has a daughter Anna. What is the probability that Anna’s sibling is a sister? Your first reaction might be that it is obviously 1/2 since (ok, roughly) it is equally likely that the other sibling was born female or male. But is that accurate?

If you were to list all of the possible female/male outcomes when having two children, then the sample space is S = {FF, FM, MF, MM} with FM meaning the first child is female and the second is male. Assuming again that girls and boys are equally likely to be born, these 4 outcomes have equal probability of

1 / 4 .

The question asks whether the neighbor has another daughter, that is, whether the outcome lies in the set Fem2 = {FF}. But since Anna is a girl, the possible outcomes are now only Fem = {FF, FM, MF}.

So,

\begin{aligned} P (Fem2 | Fem) & = \frac{P (Fem2 \cap Fem)}{P (Fem)} \\ & = \frac{P (FF)}{P (FF \text{ or } FM \text{ or } MF)} \\ & = \frac{\frac{1}{4}}{\frac{3}{4}} = \frac{1}{3} \end{aligned}
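
A brute-force enumeration gives the same value. The sketch below is illustrative only and follows the model used in the text: the four birth orders are equally likely, and the information learned is treated as "at least one child is a daughter".

```python
from fractions import Fraction
from itertools import product

# Sample space of birth orders: FF, FM, MF, MM (equally likely by assumption).
sample_space = [''.join(pair) for pair in product('FM', repeat=2)]

fem = [s for s in sample_space if 'F' in s]    # at least one daughter
fem2 = [s for s in fem if s == 'FF']           # both children are daughters

p_fem2_given_fem = Fraction(len(fem2), len(fem))
print(sample_space)
print(p_fem2_given_fem)    # 1/3
```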

Bayes’ Theorem 4.6.1 also works well for analyzing the final choice on "Let’s Make A Deal".

Checkpoint 4.6.2. WebWork - Bayes’.

A biomedical research company produces 46% of its insulin at a plant in Kansas City, and the remainder is produced at a plant in Jefferson City. Quality control has shown that 0.6% of the insulin produced at the plant in Kansas City is defective, while 0.8% of the insulin produced at the plant in Jefferson City is defective. What is the probability that a randomly chosen unit of insulin came from the plant in Jefferson City given that it is defective?

(Hint: Draw a tree diagram first)

Answer.

0.610169

You have to be careful to extract the conditional probabilities from the problem.
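
One way to mimic the hinted tree diagram is to follow a batch of units down each branch and count. The sketch below is illustrative only; the batch size of 100,000 is an arbitrary assumption chosen so that every branch holds a whole number of units.

```python
units = 100_000                    # hypothetical batch size (assumption)

# First level of the tree: which plant produced the unit.
kansas = units * 46 // 100         # 46% from Kansas City
jefferson = units - kansas         # the remainder from Jefferson City

# Second level: defective units within each plant.
kansas_defective = kansas * 6 // 1000          # 0.6% of Kansas City units
jefferson_defective = jefferson * 8 // 1000    # 0.8% of Jefferson City units

# Among all defective units, the fraction that came from Jefferson City.
p_jefferson_given_defective = jefferson_defective / (
    kansas_defective + jefferson_defective)

print(kansas_defective, jefferson_defective)       # 276 432
print(round(p_jefferson_given_defective, 6))       # 0.610169
```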

Checkpoint 4.6.3. WebWork- Bigger Bayes’.

Data from the Office on Smoking and Health, Centers for Disease Control and Prevention, indicate that 42% of adults who did not finish high school, 34% of high school graduates, 26% of adults who completed some college, and 13% of college graduates smoke. Suppose that one individual is selected at random and it is discovered that the individual smokes. Use the probabilities in the following table to calculate the probability that the individual is a college graduate.

Education	Employed	Unemployed
Not a high school graduate	0.0975	0.0080
High school graduate	0.3108	0.0128
Some college, no degree	0.1785	0.0062
Associate Degree	0.0849	0.0023
Bachelor Degree	0.1959	0.0041
Advanced Degree	0.0975	0.0015

Probability =

Hints: This problem has all the information you need, but not in the typical ready-to-use form. The table above can tell you the proportion of people with various levels of education in the population. Keep in mind that any degree (Associate, Bachelor, or Advanced) counts as graduating from college.

Answer.

0.198786832540129

Notice that having the data expressed in tabular form sometimes makes it easier to deal with.
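
The steps suggested by the hints can be sketched in Python as follows. This is an illustration, not the WebWork solution itself: the dictionary names are made up, and the only modelling assumption is the one stated in the hints, namely that the Associate, Bachelor, and Advanced Degree rows all count as college graduates with the 13% smoking rate.

```python
# (employed, unemployed) proportions copied from the table
table = {
    "Not a high school graduate": (0.0975, 0.0080),
    "High school graduate":       (0.3108, 0.0128),
    "Some college, no degree":    (0.1785, 0.0062),
    "Associate Degree":           (0.0849, 0.0023),
    "Bachelor Degree":            (0.1959, 0.0041),
    "Advanced Degree":            (0.0975, 0.0015),
}

# P(smokes | education level) from the problem statement
smoking_rate = {
    "Not a high school graduate": 0.42,
    "High school graduate":       0.34,
    "Some college, no degree":    0.26,
    "Associate Degree":           0.13,
    "Bachelor Degree":            0.13,
    "Advanced Degree":            0.13,
}

# P(education level) = employed share + unemployed share
prior = {level: sum(cells) for level, cells in table.items()}

# P(education level and smokes), then P(smokes) by summing over the partition
joint = {level: prior[level] * smoking_rate[level] for level in table}
p_smoke = sum(joint.values())

college = ["Associate Degree", "Bachelor Degree", "Advanced Degree"]
p_college_given_smoke = sum(joint[level] for level in college) / p_smoke
print(p_college_given_smoke)
```

The printed value agrees with the stated answer to within floating-point roundoff.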

The interactive cell below can be used to easily compute all of the conditional probabilities associated with Bayes’ Theorem 4.6.1. Notice how the relative size of the pie-shaped partition changes when you presume that an event in the space has already occurred.


    
        
#  This function is used to convert an input string into separate entries
def g(s):
    S = str(s).replace(',',' ').replace('(',' ').replace(')',' ').split()
    return S

@interact
def _(Partition_Probabilities
         =input_box('0.35,0.25,0.40',
                    label="$$ P(S_1),P(S_2),... $$",width=50),
        Conditional_Probabilities
         =input_box('0.02,0.01,0.03',
                    label='$$ P(A|S_1),P(A|S_2),... $$',width=45),
        print_numbers=checkbox(True,label='Numerical Results on Graphs?'),
        auto_update=False):

    Partition_Probabilities = g(Partition_Probabilities)
    Conditional_Probabilities = g(Conditional_Probabilities)
    n = len(Partition_Probabilities)
    n0 = len(Conditional_Probabilities)

    if n != n0:                 # the two lists must pair up entry by entry
        pretty_print("Unmatched data input.")

    else:                       # data streams now are the same size!
        colors = rainbow(n)
        accum = float(0)        # running sum: do partition probs sum to one?
        ends = [0]              # where graphed partition sectors change
        mid = []                # used for placement of text
        p_Sk_given_A = []       # holds P(S_k and A); divide by pA for P(S_k | A)
        pA = 0                  # P(A)
        PP = []                 # the numerical Partition Probabilities
        CP = []                 # numerical Conditional Probabilities
        for k in range(n):
            PP.append(float(Partition_Probabilities[k]))
            CP.append(float(Conditional_Probabilities[k]))
            p_Sk_given_A.append(PP[k]*CP[k])
            pA += p_Sk_given_A[k]
            accum = accum + PP[k]
            ends.append(accum)
            mid.append((ends[k]+accum)/2)
        #
        #  From 0 to 1, saving angles for each partition sector boundary.
        #  Later, multiply these by 2*pi to get actual sector boundary angles.
        #
        if abs(accum-float(1)) > 0.0000001:     #  Due to roundoff issues
            pretty_print("Sum of probabilities should equal 1.")

        else:                           # probability data is sensible
            #
            #  Venn diagram by drawing sectors from the angles determined above
            #  Create a circle of radius 1 to illustrate the sample space S
            #  Sectors with varying colors and print out their names on the edge
            #
            G = circle((0,0), 1, rgbcolor='black',fill=False, alpha=0.4,
                       aspect_ratio=True,axes=False,thickness=5)
            for k in range(n):
                G += disk((0,0), 1, (ends[k]*2*pi, ends[k+1]*2*pi),
                          color=colors[mod(k,10)],alpha = 0.2)
                G += text('$S_'+str(k+1)+'$',(1.1*cos(mid[k]*2*pi),
                          1.1*sin(mid[k]*2*pi)), rgbcolor='black')

            G += circle((0,0), 0.6, facecolor='yellow', fill = True,
                        alpha = 0.1, thickness=5,edgecolor='black')

            #  probabilities corresponding to each particular region as a list
            if print_numbers:

                pretty_print(html("$P(A) = %s$"%(str(pA),)))
                for k in range(n):
                    pretty_print(html("$P(S_{%s} | A)$"%(str(k+1))
                           +"$ = %s$"%str(p_Sk_given_A[k]/pA)))

                    G += text(str(p_Sk_given_A[k]),
                              (0.4*cos(mid[k]*2*pi),
                               0.4*sin(mid[k]*2*pi)), rgbcolor='black')
                    G += text(str(PP[k] - p_Sk_given_A[k]),
                              (0.8*cos(mid[k]*2*pi),
                               0.8*sin(mid[k]*2*pi)), rgbcolor='black')

            #  sectors now correspond in area to the Bayes Theorem probabilities

            accum = float(0)
            ends = [0]       # where the graphed partition sectors change
            mid = []         # middle of each pie chart sector
            for k in range(n):
                accum += float(p_Sk_given_A[k]/pA)
                ends.append(accum)
                mid.append((ends[k]+accum)/2)
            H = circle((0,0), 1, rgbcolor='black',fill=False,
                        alpha=0,aspect_ratio=True,axes=False,
                       thickness=0)
            H += circle((0,0), 0.6, facecolor='yellow',
                        fill=True,alpha=0.1,
                        aspect_ratio=True,axes=False,
                        thickness=5,edgecolor='black')

            for k in range(n):
                H += disk((0,0), 0.6, (ends[k]*2*pi, ends[k+1]*2*pi),
                    color=colors[mod(k,10)],alpha = 0.2)
                H += text('$S_'+str(k+1)+'|A$',
                    (0.7*cos(mid[k]*2*pi), 0.7*sin(mid[k]*2*pi)),
                    rgbcolor='black')

            #  bayesian probabilities using the smaller set A only

            if print_numbers:
                for k in range(n):
                    H += text(str( N(p_Sk_given_A[k]/pA,digits=4) ),
                      (0.4*cos(mid[k]*2*pi), 0.4*sin(mid[k]*2*pi)),
                      rgbcolor='black')

            G.show(title='Venn diagram of partition with A in middle')
            print()
            H.show(title='Venn diagram presuming A has occurred')

You can also use Bayes’ Theorem 4.6.1 to answer an easy question! Indeed, suppose that you draw one card from a shuffled standard 52-card deck. Given that you know the card is an Ace, what is the probability that it is also a Heart?

Using the Bayes’ approach, let’s break up the world into Hearts (H) and non-Hearts (N), and let A be the event that the card is an Ace. Easily,

P (A | H) = 1 / 13

P (A | N) = 3 / 39

and so by Bayes’

P (H | A) = \frac{P (H) P (A | H)}{P (H) P (A | H) + P (N) P (A | N)} = \frac{\frac{13}{52} \cdot \frac{1}{13}}{\frac{13}{52} \cdot \frac{1}{13} + \frac{39}{52} \cdot \frac{3}{39}} = \frac{1}{4}

as expected!
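
Because every card is equally likely, both routes to this answer can be checked by counting. The following sketch is illustrative only (the helper `prob` and the predicate names are made up here): it builds the deck, counts Hearts among the Aces directly, and then evaluates the Bayes' Theorem expression from counted probabilities.

```python
from fractions import Fraction
from itertools import product

ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
deck = list(product(ranks, suits))     # 52 (rank, suit) pairs


def prob(event, given=lambda card: True):
    """P(event | given), found by counting equally likely cards."""
    pool = [card for card in deck if given(card)]
    return Fraction(sum(1 for card in pool if event(card)), len(pool))


def is_ace(card):
    return card[0] == 'A'


def is_heart(card):
    return card[1] == 'Hearts'


def not_heart(card):
    return not is_heart(card)


# Direct count: Hearts among the four Aces.
direct = prob(is_heart, given=is_ace)

# Bayes' Theorem with the partition {Hearts, non-Hearts}.
numerator = prob(is_heart) * prob(is_ace, given=is_heart)
bayes = numerator / (numerator + prob(not_heart) * prob(is_ace, given=not_heart))

print(direct, bayes)    # 1/4 1/4
```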

Checkpoint 4.6.4. Insured vs Accident.

Your automobile insurance company uses past history to determine how to set rates by measuring the number of accidents caused by clients in various age ranges. The following table summarizes the proportion of those insured and the corresponding probabilities by age range:

Table 4.6.5. Age vs Accident Likelihood

Age	Proportion of Insured	Probability of Accident
16-20	0.05	0.08
21-25	0.06	0.07
26-55	0.49	0.02
55-65	0.25	0.03
over 65	0.15	0.04

One of your family friends insured by this company has an accident.

Determine the conditional probability that the driver was in the 16-20 age range.
Compare this to the probability that the driver was in the 18-20 age range. Discuss the difference.
Determine how much more the company should charge for someone in the 16-20 age range compared to someone in the 26-55 age range.

Solution.

Plug the middle column into the first input box and the right column into the second input box of the Bayes Sage Cell.
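
For readers without access to the Sage cell, the same posterior calculation can be sketched in plain Python. This is only an illustration with made-up variable names; the numbers are copied from Table 4.6.5, and the interpretation of the results is still left to the checkpoint.

```python
# age range: (proportion of insured, probability of accident), from Table 4.6.5
table = {
    "16-20":   (0.05, 0.08),
    "21-25":   (0.06, 0.07),
    "26-55":   (0.49, 0.02),
    "55-65":   (0.25, 0.03),
    "over 65": (0.15, 0.04),
}

# P(age range and accident) for each row, then P(accident) by summing.
joint = {age: share * p_acc for age, (share, p_acc) in table.items()}
p_accident = sum(joint.values())

# Posterior P(age range | accident) for every row of the table.
posterior = {age: j / p_accident for age, j in joint.items()}

for age, p in posterior.items():
    print(f"{age:>8}: {p:.4f}")
```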

Checkpoint 4.6.6. Spina bifida odds.

Congratulations...your family is having a baby! As part of the prenatal care, some testing is part of the normal procedure, including one for spina bifida (a condition in which part of the spinal cord may be exposed). Indeed, measurement of maternal serum AFP values is a standard tool used in obstetrical care to identify pregnancies that may have an increased risk for this disorder. You want to make plans for the new child’s care and want to know how seriously to take the test results. However, sometimes the test indicates that the child has the disorder when in actuality it does not (a false positive), and likewise it may indicate that the child does not have the disorder when in fact it does (a false negative).

The combined accuracy rate for the screen to detect the chromosomal abnormalities mentioned above is approximately 85% with a false positive rate of 5%. This means that (from americanpregnancy.org ¹)

Approximately 85 out of every 100 babies affected by the abnormalities addressed by the screen will be identified. (True Positive)
Approximately 5% of all normal pregnancies will receive a positive result or an abnormal level. (False Positive)

Given that your test came back negative, determine the likelihood that the child will actually have spina bifida.
Given that your test came back negative, determine the likelihood that the child will not have spina bifida.
Given that a positive test means you have a 1/100 to 1/300 chance of experiencing one of the abnormalities, determine the likelihood of spina bifida in a randomly selected child.

You can get some help checking your arithmetic using the Bayes’ Sage interact.

americanpregnancy.org/prenatal-testing/first-trimester-screen/

Essentials of Mathematical Probability and Statistics
