Section 10.6 Interval Estimates - Confidence Interval for $\sigma^2$
Once again, you may need to approximate the population variance or standard deviation but only have the sample values available. One difference from the previous sections is that you are not dealing with an average of values (such as $\bar{x}$ or $\tilde{p}$) but with the average of the squares of values. The Central Limit Theorem does not directly help you in this case, but the following result (presented without proof) provides a solution.
Theorem 10.6.1. Relationship between Variance and $\chi^2$.
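If $S^2$ is the random variable of sample variances from samples of size $n$ drawn from a population with variance $\sigma^2$, then
$$W = \frac{(n-1)S^2}{\sigma^2}$$
is approximately $\chi^2(n-1)$, that is, chi-square with $n-1$ degrees of freedom. The result is exact when the population is normally distributed.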
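Since the theorem is stated without proof, a small simulation can make it plausible. The sketch below is only an illustration under assumed values (a normal population with mean 570 and $\sigma = 20$, samples of size $n = 9$, and a fixed random seed, none of which come from the text). It checks that the simulated values of $W$ have mean close to $n-1$ and variance close to $2(n-1)$, which are the mean and variance of a $\chi^2(n-1)$ variable.

```python
import random
import statistics

random.seed(1)                      # fixed seed so the run is repeatable
n, mu, sigma = 9, 570.0, 20.0       # assumed sample size and population values
trials = 20000                      # number of simulated samples

w_values = []
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)   # sample variance
    w_values.append((n - 1) * s2 / sigma ** 2)            # W from the theorem

mean_w = statistics.fmean(w_values)
var_w = statistics.variance(w_values)
print("mean of W:", mean_w, " (chi-square mean is", n - 1, ")")
print("variance of W:", var_w, " (chi-square variance is", 2 * (n - 1), ")")
```

Matching the first two moments does not prove the distributional claim, but it is a quick sanity check that the scaling by $(n-1)/\sigma^2$ is the right one.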
To build a confidence interval for $\sigma^2$, consider an interval of the form
$$E_1 < \sigma^2 < E_2$$
and determine values for the boundaries so that the likelihood of this being true is high. Since the chi-square distribution is defined only for positive values and is not symmetric, you should not expect a symmetric confidence interval. Therefore, consider
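$$P(E_1 < \sigma^2 < E_2) = 1-\alpha$$
and dividing each part of the inequality by the positive quantity $(n-1)S^2$ gives
$$P\left(\frac{E_1}{(n-1)S^2} < \frac{\sigma^2}{(n-1)S^2} < \frac{E_2}{(n-1)S^2}\right) = 1-\alpha.$$
Taking reciprocals, which reverses the inequalities, yields
$$P\left(\frac{(n-1)S^2}{E_2} < \frac{(n-1)S^2}{\sigma^2} < \frac{(n-1)S^2}{E_1}\right) = 1-\alpha.$$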
Using the previous theorem, the middle expression can be replaced with a chi-square variable with $n-1$ degrees of freedom. If $F$ is the distribution function for chi-square, then you get
$$F\left(\frac{(n-1)S^2}{E_1}\right) - F\left(\frac{(n-1)S^2}{E_2}\right) = 1-\alpha.$$
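For a given value of $\alpha$ there are many possible choices of $E_1$ and $E_2$, but the one most often used places probability $\alpha/2$ in each tail, so that
$$F(\chi^2_{1-\alpha/2}) = F\left(\frac{(n-1)S^2}{E_1}\right) = 1-\alpha/2$$
and
$$F(\chi^2_{\alpha/2}) = F\left(\frac{(n-1)S^2}{E_2}\right) = \alpha/2.$$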
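Applying the inverse chi-square distribution function gives the values $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ of the inside expressions, and algebra can then be used to solve for each of $E_1$ and $E_2$. Indeed,
$$E_1 = \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}}$$
and
$$E_2 = \frac{(n-1)S^2}{\chi^2_{\alpha/2}}.$$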
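The two endpoint formulas can also be evaluated without tables. The following is a minimal sketch in plain Python, assuming the chi-square distribution function is computed from the standard power series for the regularized incomplete gamma function and then inverted by bisection. The helper names `chi2_cdf`, `chi2_ppf`, and `variance_interval` are hypothetical and chosen here only for illustration.

```python
import math

def chi2_cdf(x, df):
    """Chi-square distribution function F(x), computed as the regularized
    lower incomplete gamma function P(df/2, x/2) via its power series."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0
    k = 0
    while term > 1e-16 * total:        # stop once new terms are negligible
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))

def chi2_ppf(p, df):
    """Inverse of chi2_cdf found by bisection (0 < p < 1)."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:        # widen the bracket until it contains p
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def variance_interval(s2, n, conf=0.95):
    """Equal-tail interval (E1, E2) for sigma^2 from sample variance s2."""
    alpha = 1.0 - conf
    df = n - 1
    e1 = df * s2 / chi2_ppf(1.0 - alpha / 2.0, df)   # divide by upper quantile
    e2 = df * s2 / chi2_ppf(alpha / 2.0, df)         # divide by lower quantile
    return e1, e2

# Illustrative call with an assumed sample variance and sample size.
print(variance_interval(479.5, 9))
```

Note that the larger chi-square value produces the smaller endpoint $E_1$, because the quantiles sit in the denominators.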
To determine appropriate values for $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ with equal probabilities in each tail, consider using the interactive cell below:
# Chi-Square Calculator for confidence intervals with equal alpha/2 tails
# (Sage interact cell; n here is the degrees of freedom, not the sample size)
var('t')
@interact(layout=dict(top=[['c'],['n']]))
def _(c=input_box(0.95,width=10,label='Confidence Level = '),n=input_box(20,width=8,label='n =')):
    alpha = 1-c
    T = RealDistribution('chisquared', n)
    # a and b cut off probability alpha/2 in the lower and upper tails
    a = T.cum_distribution_function_inv(alpha/2)
    a1 = T.cum_distribution_function(a)
    b = T.cum_distribution_function_inv(1-alpha/2)
    b1 = T.cum_distribution_function(b)
    print('From the Chi-Square distribution for X:')
    print('P(',a,'< X < ',(b),') = ',c)
    print('with')
    print('P( X < ',a,') = ',a1)
    print('P( X < ',b,') = ',b1)
    # chi-square density with n degrees of freedom
    f = x^(n/2-1)*e^(-x/2)/(gamma(n/2)*2^(n/2))
    G = plot(f,x,0,b+(b-a)/2)+plot(f,x,a,b,thickness=5,color='green')
    G += line([(a,0),(a,f(x=a))],color='green',thickness=3)
    G += line([(b,0),(b,f(x=b))],color='green',thickness=3)
    G += text(str(c.n(digits=5)),((a+b)/2,f(x=(a+b)/2)/3),color='green')
    G.show()
# Chi-Square Calculator specifics
var('t')
c=0.95   # confidence level
n=8      # degrees of freedom, not the sample size
alpha = 1-c
T = RealDistribution('chisquared', n)
a = T.cum_distribution_function_inv(alpha/2)
a1 = T.cum_distribution_function(a)
b = T.cum_distribution_function_inv(1-alpha/2)
b1 = T.cum_distribution_function(b)
print('From the Chi-Square distribution for X:')
print('P(',a,'< X < ',(b),') = ',c)
print('with')
print('P( X < ',a,') = ',a1)
print('P( X < ',b,') = ',b1)
f = x^(n/2-1)*e^(-x/2)/(gamma(n/2)*2^(n/2))
G = plot(f,x,0,b+(b-a)/2)+plot(f,x,a,b,thickness=5,color='green')
G += line([(a,0),(a,f(x=a))],color='green',thickness=3)
G += line([(b,0),(b,f(x=b))],color='green',thickness=3)
G += text(str(c.n(digits=5)),((a+b)/2,f(x=(a+b)/2)/3),color='green')
G.show()
Example 10.6.2. - Two-sided Confidence interval for $\sigma^2$ and $\sigma$.
Given the data 570, 561, 546, 540, 609, 580, 550, 577, 585, determine a 95% confidence interval for $\sigma^2$.
Using the computational formula (or your calculator) gives $s^2 \approx 479.5$. Also, notice that for $n=9$ the resulting interval will use a chi-square variable with 8 degrees of freedom. Using the equal-tail option gives $\chi^2_{0.025} = 2.18$ and $\chi^2_{0.975} = 17.53$. Therefore
$$E_1 = \frac{8 \cdot 479.5}{17.53} \approx 218.82$$
and
$$E_2 = \frac{8 \cdot 479.5}{2.18} \approx 1759.63.$$
Hence, you are 95% confident that
$$218.82 < \sigma^2 < 1759.63.$$
By taking square roots you get
$$14.79 < \sigma < 41.95.$$
Notice that this interval is relatively wide, which is a result both of the number of data values being relatively small ($n=9$) and of the data values being relatively spread out.
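The arithmetic in this example can be reproduced with a few lines of plain Python. This is a sketch rather than part of the original worked solution. It takes the two chi-square values 2.18 and 17.53 from the example as given, and the variable names are chosen only for illustration.

```python
import math
import statistics

data = [570, 561, 546, 540, 609, 580, 550, 577, 585]
n = len(data)
s2 = statistics.variance(data)        # sample variance (divisor n - 1)

chi2_lower, chi2_upper = 2.18, 17.53  # chi-square values for 8 df, from the example

e1 = (n - 1) * s2 / chi2_upper        # lower endpoint for sigma^2
e2 = (n - 1) * s2 / chi2_lower        # upper endpoint for sigma^2

print("s^2 =", s2)
print("interval for sigma^2:", (e1, e2))
print("interval for sigma:  ", (math.sqrt(e1), math.sqrt(e2)))
```

Because the square root is an increasing function, the endpoints for $\sigma$ are simply the square roots of the endpoints for $\sigma^2$.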
# Chi-Square Calculator specifics
var('t')
c=0.95   # confidence level
n=399    # degrees of freedom, not the sample size
alpha = 1-c
T = RealDistribution('chisquared', n)
a = T.cum_distribution_function_inv(alpha/2)
a1 = T.cum_distribution_function(a)
b = T.cum_distribution_function_inv(1-alpha/2)
b1 = T.cum_distribution_function(b)
print('From the Chi-Square distribution for X:')
print('P(',a,'< X < ',(b),') = ',c)
print('with')
print('P( X < ',a,') = ',a1)
print('P( X < ',b,') = ',b1)
Checkpoint 10.6.3. WebWork - Two-sided Confidence Interval with large n.
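With the same sample variance $s^2 \approx 479.5$ but 399 degrees of freedom, the equal-tail chi-square values are approximately $\chi^2_{0.025} = 345.55$ and $\chi^2_{0.975} = 456.24$, so
$$E_1 = \frac{399 \cdot 479.5}{456.24} \approx 419.3$$
and
$$E_2 = \frac{399 \cdot 479.5}{345.55} \approx 553.7.$$
Hence, you are 95% confident that
$$419.3 < \sigma^2 < 553.7.$$
By taking square roots you get
$$20.48 < \sigma < 23.53$$
which is a relatively tight confidence interval. Notice that both intervals are completely contained in the corresponding confidence intervals from the previous small-$n$ example.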
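To see how the interval tightens as the degrees of freedom grow, the sketch below recomputes the endpoints for several values of the degrees of freedom while holding $s^2 = 479.5$ fixed. It uses the Wilson-Hilferty normal approximation to the chi-square quantiles, which is a swapped-in technique and not the method of the text. It is reasonably accurate for large degrees of freedom and rougher for small ones. The function names are hypothetical.

```python
from statistics import NormalDist

def chi2_quantile_wh(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile
    with left-tail probability p and df degrees of freedom."""
    z = NormalDist().inv_cdf(p)       # standard normal quantile
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3

def equal_tail_interval(s2, df, conf=0.95):
    """Approximate equal-tail interval (E1, E2) for sigma^2."""
    alpha = 1.0 - conf
    e1 = df * s2 / chi2_quantile_wh(1.0 - alpha / 2.0, df)
    e2 = df * s2 / chi2_quantile_wh(alpha / 2.0, df)
    return e1, e2

s2 = 479.5                            # sample variance held fixed
for df in (50, 399, 2000):            # assumed degrees of freedom for comparison
    e1, e2 = equal_tail_interval(s2, df)
    print(df, round(e1, 1), round(e2, 1), "width:", round(e2 - e1, 1))
```

The width shrinks steadily as the degrees of freedom increase, which is the same effect seen when moving from 8 to 399 degrees of freedom.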
Similar to above, another way to estimate $\sigma^2$ is with a one-sided confidence interval. To find one, continue as described above but leave one endpoint off and place all of $\alpha$ in a single tail. Indeed,
$$\sigma^2 < E_2$$
can be determined using
$$F(\chi^2_{\alpha}) = F\left(\frac{(n-1)S^2}{E_2}\right) = \alpha$$
and
$$E_1 < \sigma^2$$
can be determined using
$$F(\chi^2_{1-\alpha}) = F\left(\frac{(n-1)S^2}{E_1}\right) = 1-\alpha.$$
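Solving the two relations above for the endpoints gives $E_2 = (n-1)S^2/\chi^2_{\alpha}$ and $E_1 = (n-1)S^2/\chi^2_{1-\alpha}$. The sketch below evaluates both under assumed inputs: a sample variance of 479.5 from $n = 9$ values, $\alpha = 0.05$, and the standard table values $\chi^2_{0.05} = 2.733$ and $\chi^2_{0.95} = 15.507$ for 8 degrees of freedom. The function names are hypothetical.

```python
def upper_bound(s2, n, chi2_alpha):
    """E2 in the one-sided interval sigma^2 < E2."""
    return (n - 1) * s2 / chi2_alpha

def lower_bound(s2, n, chi2_one_minus_alpha):
    """E1 in the one-sided interval E1 < sigma^2."""
    return (n - 1) * s2 / chi2_one_minus_alpha

# Assumed inputs for illustration only.
s2, n = 479.5, 9
chi2_05, chi2_95 = 2.733, 15.507      # table values for 8 degrees of freedom

e2 = upper_bound(s2, n, chi2_05)
e1 = lower_bound(s2, n, chi2_95)
print("95% one-sided upper bound: sigma^2 <", e2)
print("95% one-sided lower bound: sigma^2 >", e1)
```

Each one-sided bound is a separate 95% statement. Combining the two into a single two-sided interval would give only 90% confidence.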
Example 10.6.4. - One-sided Confidence intervals for $\sigma^2$.
TBA
Checkpoint 10.6.5. WebWork - One-sided Confidence Interval.
Example 10.6.6. - Confidence intervals for $\sigma$.
TBA