Section 11.2 Hypotheses and Errors
In formulating a hypothesis, you will be making a declarative statement (a proposition) that has an actual truth value. That is, it is either true or it is not true in real life. However, since you will assess that truth using a test sample, your measurement of that truth value can never be 100% certain, in the same manner that a confidence interval can never capture a population statistic with 100% confidence. Hence, it is possible for you to make an incorrect conclusion.
In general, four different outcomes are possible when testing a hypothesis:
- Your hypothesis is true and you determine that it is true.
- Your hypothesis is false but you determine that it is true.
- Your hypothesis is true but you determine that it is false.
- Your hypothesis is false and you determine that it is false.
The first and last cases are "good" since you have accurately determined the truth of the hypothesis. The second and third, however, are bad since you either believe something that is not true or you fail to believe something that is true. We would like to minimize the likelihood of these last two possibilities.
Toward that end, let's consider the case where the hypothesis is true but you determine (in error) that it is false. This is called a Type I error, and we will designate the probability of this error by \(\alpha\text{.}\) To lower the risk of a Type I error, you will want to make \(\alpha\) smaller. The value \(\alpha\) is also called the significance level, with \(\alpha = 0.05\) a common choice and 0.01 and 0.10 also sometimes used. In principle, any value between 0 and 1 will work, but larger values mean a larger likelihood of error, so a value closer to 0 is preferred.
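To see what \(\alpha\) means in practice, here is a minimal simulation sketch in Python (using numpy and scipy; the sample size of 30 and the 10,000 trials are arbitrary choices of ours): when the null hypothesis really is true, a test run at significance level \(\alpha = 0.05\) should commit a Type I error roughly 5% of the time.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05          # the chosen significance level
n, trials = 30, 10_000

# Sample repeatedly from a population where the null hypothesis
# (population mean = 0) is actually TRUE, and count how often a
# two-sided t-test rejects it anyway -- a Type I error.
rejections = 0
for _ in range(trials):
    sample = rng.normal(loc=0.0, scale=1.0, size=n)
    _, p = stats.ttest_1samp(sample, popmean=0.0)
    rejections += p < alpha

print(rejections / trials)   # long-run Type I error rate, close to alpha
```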
In a similar manner, consider the case where the hypothesis is false but you determine (in error) that it is true. This is called a Type II error, and we will denote the probability of a Type II error by \(\beta\text{.}\) Again, your goal is to lower the risk of a Type II error, and so you want \(\beta\) to be as small as possible.
In the following sections you will play around with minimizing Type I and Type II errors. Type I errors are minimized by simply choosing a smaller value for \(\alpha\) when working through the formulas. Type II errors are minimized by taking (if possible) a larger sample size when computing the needed sample statistics, as the sketch below illustrates.
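Here is a rough simulation sketch of the sample-size effect (the shift of 0.5 and the particular sample sizes below are invented for illustration): when the null is false, larger samples cause the test to miss far less often, so the estimated \(\beta\) shrinks.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, trials = 0.05, 5_000
true_mean = 0.5   # the null claim (mean = 0) is actually FALSE here

# For each sample size, estimate beta: the fraction of samples for
# which we fail to reject the false null -- a Type II error.
for n in (10, 30, 100):
    misses = 0
    for _ in range(trials):
        sample = rng.normal(loc=true_mean, scale=1.0, size=n)
        _, p = stats.ttest_1samp(sample, popmean=0.0)
        misses += p >= alpha
    print(n, misses / trials)   # estimated beta shrinks as n grows
```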
To set up each test, you will compose a Null Hypothesis (denoted \(N_0\)). This statement is what you will test for truth. You will also compose an Alternate Hypothesis (denoted \(N_a\)) that is often (but not always) the logical complement of \(N_0\text{.}\) The null hypothesis is often a statement corresponding to the likelihood that observations occur purely by chance, while the alternate hypothesis often indicates that outcomes are not actually random but are influenced by some (possibly unknown) cause. If your sample shows that the null hypothesis \(N_0\) is false, then you will accept the alternate \(N_a\text{.}\) If this is a bad decision, then \(N_0\) was true but you determined that it was false...that is, you made a Type I error.
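As a concrete (and purely illustrative) example: to test a manufacturer's claim that the mean fill weight of a bottle is 16 ounces, you might take \(N_0: \mu = 16\) as the null hypothesis and \(N_a: \mu \neq 16\) as its alternate, so that rejecting \(N_0\) says the observed deviation from 16 ounces is unlikely to be chance alone.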
To avoid ever making a Type II error, one often never "accepts" the null hypothesis \(N_0\text{,}\) even if the sample does not show it to be false. In other words, if you follow this practice then you never allow yourself to determine that \(N_0\) is true; you only fail to reject it. It seems odd, but this is one way to avoid ever worrying much about Type II errors. The best way to address this blind spot is to use relatively large test samples, which minimizes the likelihood of a Type II error.
So, our plan of attack is to find some way to determine, from a sample, the amount of Type I error we might make. For a given problem, you will use the sample to compute a specific estimate of that error...called a p-value...and compare it to the chosen significance level \(\alpha\text{.}\) Sounds easy enough?
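A minimal sketch of that comparison (the measurements and the null value of 10 below are invented for illustration):

```python
import numpy as np
from scipy import stats

alpha = 0.05
# Hypothetical measurements; the null hypothesis N_0 claims the
# population mean is 10.
sample = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.3, 9.7])

# Two-sided one-sample t-test: p_value estimates the chance of a
# sample at least this extreme if N_0 were true.
t_stat, p_value = stats.ttest_1samp(sample, popmean=10.0)
print(p_value)

if p_value < alpha:
    print("Reject N_0: a sample this extreme is unlikely by chance.")
else:
    print("Fail to reject N_0 (not the same as accepting it).")
```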
Finally, you will notice in each of the instances presented that the form of the solution method is very similar to the forms we used in creating confidence intervals. The difference is that for hypothesis testing we assume the value in the middle and test whether the sample statistic lands in one of the tails, so that we can reject; with confidence intervals, we assume the sample statistic is in the middle and then use that sample to create boundaries within which the theoretical population statistic must lie with high confidence.
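To make the contrast concrete, here is a rough sketch under assumed data (a normal sample and a null mean of 10, both our inventions): for the one-sample t-test, rejecting at level \(\alpha\) corresponds exactly to the null value falling outside the \((1-\alpha)\) confidence interval.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha = 0.05
sample = rng.normal(loc=10.3, scale=1.0, size=25)
n, xbar, s = len(sample), sample.mean(), sample.std(ddof=1)

# Confidence interval: put the sample statistic in the middle and
# build boundaries around it.
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
half_width = t_crit * s / np.sqrt(n)
ci = (xbar - half_width, xbar + half_width)

# Hypothesis test: put the assumed null value 10 in the middle and
# ask whether the sample statistic lands in a tail.
_, p_value = stats.ttest_1samp(sample, popmean=10.0)

# The two views agree: reject at level alpha exactly when the null
# value lies outside the (1 - alpha) confidence interval.
print(ci, p_value < alpha, not (ci[0] <= 10.0 <= ci[1]))
```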