Let’s imagine this story.
One day in town, you meet an awkward-looking Mathematics tuition teacher. He brags that 95% of his pupils get A’s for Mathematics T in STPM every year. Since you love Mathematics so much, you think you might want to take his tuition class. But being sceptical by nature, you wonder whether 95% of his students getting A’s is a little too much. So you decide to put this teacher to the test. You manage to get some information from 15 of his ex-students, and find out that 11 of them got A’s for Maths T in the previous year.
Now your question is: is the Maths tuition teacher’s claim a little bit overboard? Is 11 out of 15 the same as 95%? Obviously it isn’t (11 out of 15 is only about 73%), but since you are only taking a sample, you can’t be sure that you are right. What if 13 or 14 students had got A’s? You know that if only 2 or 3 students got A’s, he is almost certainly lying. Then how about 10 students? 8 or 9 students? There must be a cut-off point, below which you are VERY SURE that he is lying. Isn’t there?
Or let’s think of another story. Suppose you are an athlete participating in the MSSM 400m race. You find that your running speed follows a normal distribution with a mean of 40km/h. Bored of running every day, you decide to test whether drinking 2 cups of milk every morning helps improve your running. After drinking milk for 5 days, you find that your mean speed has turned out to be 40.9km/h.
Again you question yourself: did you really “improve”? Well, it might just so happen that you ran a little faster this time, and it has nothing to do with the milk. You might also be wondering how much of an increase in speed counts as an ‘improvement’. You need a cut-off point, again.
NULL AND ALTERNATIVE HYPOTHESES
If you didn’t notice, you were actually carrying out a hypothesis test, also known as a significance test. You were trying to test a hypothesis, to determine whether you can conclude something: whether 95% of the students really get A’s, and whether the ‘improvement’ in running is real. The initial assumption is what we call the null hypothesis, H0. It is very important as it provides the model for the calculations. The null hypothesis for the first case is “95% of the students get A’s for Maths T”. If your results are consistent with 95% of the students getting A’s in Maths T, you do not reject the null hypothesis. This does not prove that the claim is true; the case is simply this: you can’t reject his claim if you don’t have enough evidence to do so. If, after your test, you do have enough evidence to reject his claim, then you need an alternative hypothesis, H1, to accept in its place. The alternative hypothesis for this case is “less than 95% of the students get A’s in Maths T”. The number of A’s in a sample of students follows a binomial distribution, so in mathematical terms, with p as the probability that a student gets an A, we have
H0: p = 0.95
H1: p < 0.95
Notice that you are only interested in whether the probability is less than 95% or not, which means that we are interested in the left-hand end of the distribution. This is known as the lower tail, and a test that looks at only one end is a one-tailed test. In the second case, we are interested in the upper tail, since you want to know whether your speed has increased. There are also cases where you want to know whether a value has changed in either direction, e.g. whether there is a change in the level of support for Barisan Nasional or Pakatan Rakyat. For such a case, we use a two-tailed test.
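To make the idea of the lower tail a bit more concrete, here is a minimal Python sketch. It assumes that the number of A’s among the 15 ex-students, X, follows B(15, 0.95) when H0 is true, and it adds up the probabilities at the left-hand end of that distribution. The function names `binomial_pmf` and `lower_tail` are made up for this illustration.

```python
from math import comb  # available in Python 3.8 and later

def binomial_pmf(k, n, p):
    """P(X = k) when X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def lower_tail(k, n, p):
    """P(X <= k) when X ~ B(n, p): the probability in the lower tail."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

n, p0 = 15, 0.95   # sample size, and the teacher's claim under H0
observed = 11      # number of A's found among the 15 ex-students

# How likely is each "k or fewer A's" outcome if the claim is true?
for k in range(9, n + 1):
    print(f"P(X <= {k:2d}) = {lower_tail(k, n, p0):.4f}")

tail_at_observed = lower_tail(observed, n, p0)
```

If the teacher’s claim is true, getting 11 or fewer A’s out of 15 has a probability of only about 0.0055. Probabilities like these are what the “cut-off point” will be built on.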
TEST STATISTIC & TEST VALUE
So now you have a null and an alternative hypothesis. The next thing you need is a test statistic. A test statistic is the variable X that you are measuring in the experiment. In the first case, it is the number of students (out of the 15 sampled) who get A’s in Maths T, while in the second case, it is your mean running speed. A test value is the value of the test statistic that you find once you have conducted the experiment. The test value in the first case is 11, the number of A’s you found among the 15 ex-students, while in the second case it is 40.9km/h. You definitely want to know what you are experimenting on, don’t you?
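The difference between a test statistic and a test value can be laid out in a small Python sketch. The `Experiment` class and its field names are invented for this illustration. The “expected under H0” figures are simply 15 × 0.95 for the first story and the usual mean of 40km/h for the second.

```python
from dataclasses import dataclass

@dataclass
class Experiment:
    statistic: str      # the test statistic: what is being measured
    null_value: float   # what H0 leads you to expect for it
    test_value: float   # the test value: what was actually observed

tuition = Experiment("number of A's among 15 ex-students", 15 * 0.95, 11)
running = Experiment("mean running speed over 5 days (km/h)", 40.0, 40.9)

for case in (tuition, running):
    gap = case.test_value - case.null_value
    print(f"{case.statistic}: expected {case.null_value:.2f} under H0, "
          f"observed {case.test_value:.2f}, gap {gap:+.2f}")
```

The sketch only reports the gap between what H0 expects and what was observed. Whether that gap is big enough to reject H0 is exactly the cut-off question, and it is not settled here.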
Proceed to the next section to continue our discussion… ☺