The 5 steps of hypothesis testing are:
These 5 steps remain the same regardless of the type of hypothesis being tested.
Here are the steps in more detail.
The null hypothesis represents the status quo - the way things have
always been. It is abbreviated Ho.
The alternative hypothesis represents the challenge to the status quo.
In testing a hypothesis, it is what we want to find out. In textbooks,
it is abbreviated as either Ha or H1.
When I do hypothesis testing, I always find it easier to set up Ha
first. Suppose we were conducting a Z test for one mean. Here are some
examples of alternative hypotheses:
| Our method will make more than 100 widgets per hour | Ha: μ > 100 |
| This traffic sign will result in fewer than 4 accidents per month | Ha: μ < 4 |
| There is a problem if sales fall below $1000 per day | Ha: μ < 1000 |
| There is a problem if the assembly line has more than 1 defect per hour | Ha: μ > 1 |
| The bottle volume must be maintained at 250 ml (too much or too little is not good) | Ha: μ ≠ 250 |
The other thing to note is that the equality always goes with Ho, the strict inequality with Ha.
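In reaching a verdict to reject or not reject the null hypothesis, there are 2 errors that can be committed: a Type I error (rejecting Ho when it is true) and a Type II error (not rejecting Ho when it is false).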
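With that in mind, there are some basic definitions. The level of significance, α, is the probability of committing a Type I error. β is the probability of committing a Type II error. The power of the test, 1 − β, is the probability of rejecting Ho when it is false.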
Ultimately, the objective in any hypothesis testing should be to maximize the power of the test.
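To make the idea of power concrete, here is a minimal sketch for an upper-tail Z test for one mean. It assumes the population standard deviation is known and that we can name a specific "true" mean to test against. The function name `power_upper_tail` and all of its parameters are my own illustration, not something defined elsewhere on this site.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tail Z test (Ha: mean > mu0) when the true mean is mu1.

    Assumes sigma (the population standard deviation) is known.
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha)      # reject Ho if Z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))     # how far the true mean moves Z
    # Probability that Z lands in the rejection region when the true mean is mu1
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical numbers: Ho mean 100, true mean 105, sigma 10
for n in (10, 25, 50):
    print(n, round(power_upper_tail(100, 105, 10, n), 3))
```

With these made-up numbers, the power rises as the sample size grows. When the true mean equals the Ho value, the function returns α, the chance of a Type I error.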
As a general rule of thumb, where the rejection region is located depends on Ha. Tests are generally classified as upper-tail, lower-tail or two-tail tests. For example, suppose α = 0.05 and we are using Z. The critical values are Z_0.05 = 1.645 and Z_0.025 = 1.96.
| Ha: μ > 200 | Reject Ho if Z > 1.645 |
| Ha: μ < 5 | Reject Ho if Z < -1.645 |
| Ha: μ ≠ 10 | Reject Ho if Z > 1.96 or Z < -1.96 |
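The critical values in the table can be reproduced with Python's standard library. This is only a sketch: the function names and the "upper", "lower" and "two" labels are hypothetical names I picked for the illustration.

```python
from statistics import NormalDist


def rejection_region(tail, alpha=0.05):
    """Return (lower, upper) critical Z values; None means no cutoff on that side."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "upper":        # Ha uses >
        return (None, std_normal.inv_cdf(1 - alpha))
    if tail == "lower":        # Ha uses <
        return (std_normal.inv_cdf(alpha), None)
    if tail == "two":          # Ha uses "not equal", so alpha is split in half
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError("tail must be 'upper', 'lower' or 'two'")


def in_rejection_region(z, tail, alpha=0.05):
    """True if the test statistic z falls in the rejection region."""
    lower, upper = rejection_region(tail, alpha)
    below = lower is not None and z < lower
    above = upper is not None and z > upper
    return below or above


print(rejection_region("upper"))
print(rejection_region("two"))
```

At α = 0.05 the upper-tail cutoff rounds to 1.645 and the two-tail cutoffs round to ±1.96, matching the table.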
I usually test hypotheses at a 5% level of significance. The reason is that this provides a nice balance between the risks of committing a Type I error and a Type II error. If I set the level too low (say below 1%), there is an increased risk of committing a Type II error. If I set it too high (say over 10%), there is an increased risk of a Type I error.
One final note: some of us prefer using p-values and may or may not set a level of significance. More on this later.
The formula depends on the test. See individual tests for details. On the site we have
Z test for one mean using Microsoft Excel
t test for one mean using Microsoft Excel
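As one illustration, the Z statistic for one mean is the sample mean minus the Ho value, divided by the standard error. Here is a minimal sketch in Python, assuming the population standard deviation is known. The function name and the numbers in the example are hypothetical, not taken from the Excel pages.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n):
    """Z test statistic for one mean.

    sample_mean: mean of the sample
    mu0:         the value of the mean stated in Ho
    sigma:       population standard deviation (assumed known)
    n:           sample size
    """
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Hypothetical data: sample mean 52 from 25 observations, Ho value 50, sigma 10
print(z_statistic(52, 50, 10, 25))
```

The t test for one mean has the same form, with the sample standard deviation in place of sigma.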
If your test statistic falls in the rejection region, reject Ho. Otherwise, do not reject Ho.
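If you use the p-value method, the approach depends on whether or not you set a level of significance. Here is the rule: if you set a level of significance, reject Ho if the p-value is less than α; otherwise, do not reject Ho. If you did not set a level of significance, a general rule of thumb is to reject Ho if the p-value is below 1% and to not reject Ho if it is above 10%.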
What you do if the p-value falls between 1% and 10% varies from textbook to textbook. In the end, common sense should prevail. If I get a p-value of 1.5%, I'm likely to reject Ho. Similarly, if I get a p-value of 9.7%, I'm likely to not reject Ho. If I get a p-value around 5%, I'm likely to declare the results are inconclusive.
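Here is a rough sketch of the p-value method for a Z test. The 1% and 10% cutoffs used when no level of significance is set are my reading of the discussion above, so treat them as an assumption. The function names and return strings are made up for the illustration.

```python
from statistics import NormalDist


def p_value(z, tail):
    """p-value for a Z test statistic; tail is 'upper', 'lower' or 'two'."""
    cdf = NormalDist().cdf  # standard normal cumulative probability
    if tail == "upper":
        return 1 - cdf(z)
    if tail == "lower":
        return cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower' or 'two'")


def decide(p, alpha=None):
    """Decision from a p-value, with or without a level of significance."""
    if alpha is not None:
        return "reject Ho" if p < alpha else "do not reject Ho"
    # No level of significance set: assumed 1% / 10% rule of thumb
    if p < 0.01:
        return "reject Ho"
    if p > 0.10:
        return "do not reject Ho"
    return "judgment call"


print(decide(p_value(2.5, "upper"), alpha=0.05))
print(decide(0.05))
```

The "judgment call" branch is where common sense has to take over, since the code cannot decide that for you.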
To write your conclusion, you have to go back to Ho/Ha.
For example, suppose you wanted to see if the average sale in your store is more than $50. The Ho/Ha would be:
Ho: μ ≤ $50
Ha: μ > $50
Suppose the p-value was 0.0092. Based on either a 5% level of significance or the general rule of thumb, you would reject Ho and conclude that the average sale is more than $50. On the other hand, if the p-value was 0.2273, you would not reject Ho and conclude that there is not sufficient evidence that the average sale is more than $50.
Note that in the latter case, we did not conclude that the average sale is less than or equal to $50. Keep in mind that the burden of proof is on Ha to provide sufficient evidence to reject Ho. If it does, then we reject Ho. This is the equivalent of finding a defendant guilty. If Ha does not provide sufficient evidence, then we do not reject Ho, the equivalent of finding a defendant not guilty.