## Friday, August 5, 2011

### 14.3 – Tests Of Significance

A hypothesis test is a test of significance. In this section, we will look at the types of hypothesis test that can come up in STPM. Before we start, note that every hypothesis test follows the same general pattern. You need to show these 7 steps (or workings) on your answer sheet:

1. Define the variable X.

Let X be …,
X ~ B(n, p) / X ~ N(μ, σ²) / X ~ Po(λ)

2. Define H0 and H1.
H0: p / λ / μ / μ1 – μ2     =      ?
H1: p / λ / μ / μ1 – μ2 <, >, ≠  ?

3. Write down the case if H0 is true.
If H0 is true, then
p / λ / μ / μ1 – μ2 = ?
and X ~ B(n, ?) / X ~ N(?, σ²) / X ~ Po(?)

4. Define your type of test and significance level.
Use an upper / lower / two-tailed test, at the ?% level.

5. Set the criteria to reject H0.
Reject H0 if P(X ≥ x) < ? / P(X ≤ x) < ? / z < ? / z > ? / |z| > ? / T < ?

6. Do the calculations.
P(X ≥ ?) = ? / P(X ≤ ?) = ? / z = ? / T = ?

7. State your conclusion.
Since P(X ≥ x) = ? / P(X ≤ x) = ? / z = ? / T = ?, x lies / doesn’t lie in the critical region.
H0 is rejected in favour of H1 / not rejected. We conclude that ………. at the ?% level.

If you have all 7 steps on your answer sheet, you will probably get about 90% of the marks. Don’t make calculation mistakes though.

TYPES OF SIGNIFICANCE TESTS

In this part, there are 12 kinds of significance tests that you might face, be it lower tail, two-tailed or upper tail tests. I will go through this section with an example for each one. Questions are in blue, answers are in red:

1. Binomial Proportion p (n < 30)
A certain type of seed has a germination rate of 70%. The seeds undergo a new treatment, after which 9 germinate in a packet of 10 seeds. Test, at the 5% level, whether this is evidence of an increase in the germination rate.

Let X be the number of seeds that germinate in a packet of 10, X ~ B(10, p)
H0: p = 0.7 [the germination rate is 70%]
H1: p > 0.7 [the germination rate increases]
If H0 is true, then p = 0.7, and X ~ B(10, 0.7)
Use an upper tail test, at the 5% level.
Reject H0 if P(X ≥ x) < 0.05 [0.05 stands for 5%]
P(X ≥ 9) = P(X = 9) + P(X = 10) = 0.1211 + 0.0282 = 0.1493 = 14.93%
Since P(X ≥ 9) = 14.93% > 5%, x doesn’t lie in the critical region.
H0 is not rejected.
We conclude that there is no evidence of an increase in the germination rate, at the 5% level.

A binomial proportion test with a small sample isn’t hard. The part that might bother you is the calculation of P(X ≥ 9). Remember the formula for the binomial distribution, $P(X = x) = {}^{n}C_{x}\,p^{x}q^{n-x}$, where q = 1 – p.
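As a quick check on the arithmetic, here is a minimal Python sketch of the tail probability in the seed example. The function name `binomial_upper_tail` is just an illustrative label of mine; it simply sums the binomial formula for P(X ≥ x).

```python
from math import comb

def binomial_upper_tail(n, p, x):
    """Return P(X >= x) for X ~ B(n, p) by summing nCk * p^k * q^(n-k)."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(x, n + 1))

# Seed example: X ~ B(10, 0.7) under H0, observed x = 9, 5% upper tail test.
p_value = binomial_upper_tail(10, 0.7, 9)
reject_h0 = p_value < 0.05

print(round(p_value, 4))  # 0.1493
print(reject_h0)          # False, so H0 is not rejected
```

The same function works for any small-sample upper tail test; for a lower tail test you would sum from 0 up to x instead.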

2. Binomial Proportion p (n ≥ 30)
For this case, a normal approximation to the binomial distribution is used. Remember to apply the continuity correction to the observed value before you calculate z.

A manufacturer claims that 8 out of 10 dogs prefer its brand of dog food to any other. In a random sample of 120 dogs, it was found that 85 appeared to prefer that brand. Test, at the 5% level, whether you would accept the manufacturer’s claim.

Let X be the number of dogs which prefer the manufacturer’s brand of dog food,
X ~ B(120, p)
H0: p = 0.8
H1: p ≠ 0.8 [notice that we are using the ≠ sign. This is because we are testing whether the claim is exactly correct. That means the claim is wrong if more than 8 out of 10 dogs prefer the brand, and also if fewer than 8 out of 10 dogs prefer the brand.]
If H0 is true, then p = 0.8 and X ~ B(120, 0.8)
Since np > 5, nq > 5, then X is approximately normal,
X ~ N(np, npq), which is X ~ N(96, 19.2).

Use a two-tailed test, at the 5% level.
Reject H0 if |z| > 1.960 [Still remember how to get this value 1.960? Remember that a two-tailed test at 5% means that each end of the bell curve has 2.5%. Refer to the critical values for the normal distribution at the end of this post.]

$$z = \frac{85.5 - 96}{\sqrt{19.2}} = -2.396$$
[85.5 is used instead of 85 because of the continuity correction: P(X ≤ 85) for the binomial becomes P(X < 85.5) for the normal approximation.]
Since |z| = 2.396 > 1.960, z lies in the critical region.
H0 is rejected in favour of H1. There is evidence that the proportion is not 0.8 (the sample suggests it is lower), and therefore the manufacturer’s claim is not accepted, at the 5% level.
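The dog food calculation can be reproduced with a short Python sketch. This assumes the standard library's `statistics.NormalDist` for the critical value; the variable names are mine, and the continuity correction is applied by moving the observed count half a unit towards the mean.

```python
from math import sqrt
from statistics import NormalDist

n, p0, observed, alpha = 120, 0.8, 85, 0.05

mean = n * p0                 # np = 96
variance = n * p0 * (1 - p0)  # npq = 19.2

# Continuity correction: shift the observed count 0.5 towards the mean.
corrected = observed + 0.5 if observed < mean else observed - 0.5

z = (corrected - mean) / sqrt(variance)
critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed, 2.5% in each tail

print(round(z, 3))         # -2.396
print(round(critical, 3))  # 1.96
print(abs(z) > critical)   # True, so H0 is rejected
```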

3. Poisson Mean λ
The number of white corpuscles on a slide has a Poisson distribution with mean 3.5. After treatment, a sample was taken and the number of white corpuscles was found to be 8. Test, at the 5% level of significance, whether the number of white corpuscles has increased.

Let X be the number of white corpuscles on a slide, X ~ Po(λ).
H0: λ = 3.5
H1: λ > 3.5
If H0 is true, then λ = 3.5, and X ~ Po(3.5).
Use an upper tail test, at the 5% level.
Reject H0 if P(X ≥ x) < 0.05.
P(X ≥ 8) = 1 – P(X ≤ 7) = 1 – 0.9733 = 0.0267 = 2.7% [I hope you remember the Poisson formula. In some formula booklets, there are Poisson cumulative probability tables, they help too.]
Since P(X ≥ 8) = 2.7% < 5%, x lies in the critical region.
H0 is rejected in favour of H1. There is evidence, at the 5% level, that the number of white corpuscles has increased.

Not a hard one, I suppose. Remember that if λ > 5, you can actually make an approximation to the Normal distribution, X ~ N(λ, λ), because the mean and the variance of a Poisson distribution are both λ.
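If you don't have cumulative Poisson tables to hand, the tail probability P(X ≥ 8) from the corpuscle example can be checked with a few lines of Python. This is only a sketch; `poisson_cdf` is a helper name I made up, and it just adds up the Poisson formula term by term.

```python
from math import exp, factorial

def poisson_cdf(lam, x):
    """Return P(X <= x) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

# Corpuscle example: X ~ Po(3.5) under H0, observed x = 8, 5% upper tail test.
p_value = 1 - poisson_cdf(3.5, 7)  # P(X >= 8) = 1 - P(X <= 7)

print(round(p_value, 4))  # 0.0267
print(p_value < 0.05)     # True, so H0 is rejected
```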

4. Population Mean μ (Normal, σ² known)
A machine fills cans with soft drinks so that the volume of liquid in the cans follows a normal distribution with mean 335ml and standard deviation of 3ml. A setting on the machine is altered, following which the operator suspects that the mean volume of liquid discharged by the machine into the cans has decreased. He takes a random sample of 50 cans and finds that the mean volume of liquid in these cans is 334.6ml. Does this confirm his suspicion? Perform a significance test at the 5% level and assume that the standard deviation remains unchanged.

Let X be the volume (ml) of liquid in a can, X ~ N(μ, 3²)
H0: μ = 335
H1: μ < 335
The sample size is 50, so X̅ ~ N(μ, 3²/50) [recall what you learned in the previous chapter]
If H0 is true, then μ = 335, and X̅ ~ N(335, 9/50)
Use a lower tail test, at 5% level.
Reject H0 if z < –1.645

$$z = \frac{334.6 - 335}{\sqrt{9/50}} = -0.9428$$
Since z = –0.9428 > –1.645, z doesn’t lie in the critical region.
H0 is not rejected. There is no evidence to confirm the suspicion of the operator, at the 5% level.
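Here is a small Python sketch of the z calculation for the soft drink example, again leaning on `statistics.NormalDist` for the critical value. The function `z_statistic` is a name I chose for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardise a sample mean when the population sigma is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

# Soft drink example: H0: mu = 335, sigma = 3, n = 50, sample mean 334.6.
z = z_statistic(334.6, 335, 3, 50)
critical = NormalDist().inv_cdf(0.05)  # lower tail test at the 5% level

print(round(z, 4))         # -0.9428
print(round(critical, 3))  # -1.645
print(z < critical)        # False, so H0 is not rejected
```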

For hypothesis test types 4 to 8, you might want to recall what you learned in the previous chapter. Remember when to use the t-distribution, when to use a normal approximation, and so on. These types all make use of the sampling distribution of the mean.

5. Population Mean μ (Non-normal, σ² known)
I don’t think I need to show you an example of this one. It is similar to type 4. Provided the sample is large, the central limit theorem lets you treat the sample mean from the non-normal (or sometimes unnamed, or unknown) distribution as approximately normal, and you follow the exact same steps as type 4.

6. Population Mean μ (Normal, σ² unknown, n ≥ 30)
When the variance is unknown, you make use of the best unbiased estimate of the population variance,

$$\hat{\sigma}^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right)$$

and the rest of the steps follow.

7. Population Mean μ (Non-normal, σ² unknown, n ≥ 30)
Similar to type 6, you make use of the best unbiased estimate of the population variance. The following example illustrates both types 6 and 7:

A random sample of 75 11-year-olds performed a simple task and the time taken, x minutes, noted for each. The results were summarized as follows:
Σx = 1215, Σx² = 21708
Test, at the 1% level, whether there is evidence that the mean time taken to perform the task is greater than 15 minutes.

Let X be the time taken to perform a simple task by the 11-year-olds.
H0: μ = 15
H1: μ > 15
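The distribution is unknown. But since n = 75 is large, by the central limit theorem, X̅ is approximately normally distributed, X̅ ~ N(μ, σ̂²/75), where the unknown variance is replaced by its unbiased estimate σ̂².
If H0 is true, then
μ = 15,
and X̅ ~ N(15, σ̂²/75)
Use an upper tail test, at the 1% level.
Reject H0 if z > 2.326

$$\bar{x} = \frac{1215}{75} = 16.2, \qquad \hat{\sigma}^2 = \frac{1}{74}\left(21708 - \frac{1215^2}{75}\right) = 27.36$$
$$z = \frac{16.2 - 15}{\sqrt{27.36/75}} = 1.987$$
Since z = 1.987 < 2.326, z doesn’t lie in the critical region.
H0 is not rejected. There is no evidence, at the 1% level, that the mean time is greater than 15 minutes.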
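When a question only gives Σx and Σx², the working can be sketched in Python as below. This assumes the usual unbiased estimate (Σx² – (Σx)²/n)/(n – 1); the helper name `unbiased_variance` is my own.

```python
from math import sqrt

def unbiased_variance(n, sum_x, sum_x2):
    """Best unbiased estimate of the population variance from summary totals."""
    return (sum_x2 - sum_x**2 / n) / (n - 1)

# Task-time example: n = 75, sum of x = 1215, sum of x squared = 21708, H0: mu = 15.
n, sum_x, sum_x2, mu0 = 75, 1215, 21708, 15

x_bar = sum_x / n                               # 16.2
var_hat = unbiased_variance(n, sum_x, sum_x2)   # about 27.36
z = (x_bar - mu0) / sqrt(var_hat / n)

print(round(var_hat, 2))  # 27.36
print(round(z, 3))        # 1.987
print(z > 2.326)          # False, so H0 is not rejected at the 1% level
```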

8. Population Mean μ (Normal, σ² unknown, n < 30)
You have probably guessed it already: you should use the t-distribution for this kind of significance test.

Family packs of bacon slices are sold in 1.5kg packs. A sample of 12 packs was selected at random and their masses, measured in kilograms, noted. The following results were obtained: Σx = 17.81, Σx² = 26.4357
Assuming that the masses of the packs follow a normal distribution with unknown variance σ², test at the 1% level whether the packs are underweight.

Let X be the mass (kg) of a pack of bacon slices, X ~ N(μ, σ²)
H0: μ = 1.5
H1: μ < 1.5
Since σ² is unknown and n < 30, a t-distribution is used, T ~ t(n – 1)
If H0 is true, then μ = 1.5, T ~ t(11), where

$$T = \frac{\bar{X} - 1.5}{\hat{\sigma}/\sqrt{12}}$$

Use a lower tail test, at the 1% level.
Reject H0 if t < –2.718 [refer to the t-distribution tables]
x̅ = 17.81/12 = 1.484 and σ̂² = (26.4357 – 17.81²/12)/11 = 0.0002447, so T = –3.506 < –2.718
t lies in the critical region.
H0 is rejected in favour of H1. There is evidence that the packs are underweight, at 1% level.
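The T value in the bacon example is easy to get wrong by hand because the variance is so small, so here is a hedged Python sketch of the calculation. The critical value –2.718 is simply copied from the t-tables as quoted above (the standard library has no t-distribution), and `t_statistic` is an illustrative name.

```python
from math import sqrt

def t_statistic(n, sum_x, sum_x2, mu0):
    """One-sample t statistic from summary totals, using the unbiased variance."""
    x_bar = sum_x / n
    var_hat = (sum_x2 - sum_x**2 / n) / (n - 1)
    return (x_bar - mu0) / sqrt(var_hat / n)

# Bacon example: n = 12, sum of x = 17.81, sum of x squared = 26.4357, H0: mu = 1.5.
t = t_statistic(12, 17.81, 26.4357, 1.5)
critical = -2.718  # t(11), 1% lower tail, taken from the tables

print(round(t, 3))   # -3.506
print(t < critical)  # True, so H0 is rejected
```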

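9. Difference between Means μ1 – μ2 (different variances σ1² & σ2² known)
This is something new. Types 9, 10 and 11 are only for 2 Normal populations, X1 and X2, with unknown means μ1 and μ2. So here you have 2 samples, with the new test statistic X̅1 – X̅2, and we need to consider its sampling distribution. Doing some expectation algebra,

$$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2, \qquad \mathrm{Var}(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

and so our sampling distribution of the difference of means will be

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

and therefore, in standardised form,

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$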
Let’s try one example:

Due to differences in the environment, the masses of a certain species of small animal are believed to be greater in Region A than in Region B. It is known that the masses in both regions are normally distributed with masses in Region A having a standard deviation of 0.04kg and masses in region B having a standard deviation of 0.09kg.
To test this theory, random samples are taken: 60 animals from Region A had a mean mass of 3.03kg and 50 animals from Region B had a mean mass of 3.00kg.
Does this provide evidence, at the 1% level that the animals of this species in Region A have a greater mass than those in Region B?

Let X1 be the mass (kg) of an animal in Region A, with population mean μ1. X1 ~ N(μ1, 0.04²)
Let X2 be the mass (kg) of an animal in Region B, with population mean μ2. X2 ~ N(μ2, 0.09²)
H0: μ1 – μ2 = 0 [there is no difference in the masses between the regions]
H1: μ1 – μ2 > 0 [the animals in Region A have greater mass]
Consider the distribution of the difference between the means X̅1 – X̅2, with n1 = 60, n2 = 50.
If H0 is true, then
μ1 – μ2 = 0 [there can be cases where it is not 0 too.]

$$\bar{X}_1 - \bar{X}_2 \sim N\left(0,\ \frac{0.04^2}{60} + \frac{0.09^2}{50}\right)$$

Use an upper tail test, at the 1% level.
Reject H0 if z > 2.326

$$z = \frac{3.03 - 3.00}{\sqrt{\dfrac{0.04^2}{60} + \dfrac{0.09^2}{50}}} = 2.184$$

Since z = 2.184 < 2.326, z doesn’t lie in the critical region.
H0 is not rejected. There is no evidence, at the 1% level, that the animals in region A have a greater mass than those in region B.
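The two-sample z value for the Region A and Region B example can be checked with this Python sketch. It assumes both population standard deviations are known, as in the question; `two_sample_z` is a name I picked for illustration.

```python
from math import sqrt

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, diff0=0.0):
    """z statistic for a difference of means with known, different variances."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((mean1 - mean2) - diff0) / standard_error

# Region A: n = 60, mean 3.03, sigma 0.04. Region B: n = 50, mean 3.00, sigma 0.09.
z = two_sample_z(3.03, 3.00, 0.04, 0.09, 60, 50)

print(round(z, 3))  # 2.184
print(z > 2.326)    # False, so H0 is not rejected at the 1% level
```

The `diff0` argument covers the cases where H0 states a non-zero difference between the means.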

10. Difference between Means μ1 – μ2 (common σ² known)
This one is not much different from the one above. Here the 2 populations have a common variance, so σ1² = σ2² = σ². The distribution will then be represented by

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right)$$

and the test statistic,

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

By the way, you can also construct confidence intervals for situations like this. Try it out yourself. There can be two-tailed, upper tail and lower tail tests as well.

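11. Difference between Means μ1 – μ2 (common σ² unknown)
I don’t know if the variances are different. But if both populations have a common unknown variance, the unbiased estimate σ̂², also known as a pooled two-sample estimate, has the formula

$$\hat{\sigma}^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$$

where n1 and n2 are the sample sizes and s1² and s2² are the variances of the 2 samples respectively (calculated with divisors n1 and n2). The distribution will be

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \hat{\sigma}^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right)$$

and the test statistic,

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

This is, however, not always the case. When both the samples are small, we should use the t-distribution instead. The test statistic will now be

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where T ~ t(n1 + n2 – 2). We should only use the t-distribution when n1 + n2 – 2 < 30.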

A large group of sunflowers is growing in the shady side of a garden. A random sample of 36 of these sunflowers is measured. The sample mean height is found to be 2.86m, and the sample standard deviation is found to be 0.60m. A second group of sunflowers is growing in the sunny side of the garden. A random sample of 26 of these sunflowers is measured. The sample mean height is found to be 3.29m and the sample standard deviation is found to be 0.9m. Treating the samples as large samples from normal distributions having the same variance but possibly different means, obtain a pooled estimate of the variance and test whether the results provide significant evidence at the 5% level that the sunny-side sunflowers grow taller, on average, than the shady-side sunflowers.

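Let X1 be the height of sunflowers in the sunny side, X1 ~ N(μ1, σ²)
Let X2 be the height of sunflowers in the shady side, X2 ~ N(μ2, σ²)
where σ² is unknown.
H0: μ1 – μ2 = 0
H1: μ1 – μ2 > 0 [the sunny-side sunflowers are taller on average]
Consider the distribution of the difference between the means X̅1 – X̅2, with n1 = 26, n2 = 36.
If H0 is true, then
μ1 – μ2 = 0

$$\hat{\sigma}^2 = \frac{26(0.9^2) + 36(0.60^2)}{26 + 36 - 2} = 0.567$$
[taking the given sample standard deviations as calculated with divisor n]

and therefore

$$\bar{X}_1 - \bar{X}_2 \sim N\left(0,\ 0.567\left(\frac{1}{26} + \frac{1}{36}\right)\right)$$

Use an upper tail test, at the 5% level.
Reject H0 if z > 1.645

$$z = \frac{3.29 - 2.86}{\sqrt{0.567\left(\dfrac{1}{26} + \dfrac{1}{36}\right)}} = 2.219$$

Since z = 2.219 > 1.645, z lies in the critical region.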
H0 is rejected in favour of H1. There is evidence, at the 5% level, that the sunny-side sunflowers grow taller than the shady-side sunflowers.
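For the sunflower example, here is a Python sketch of the pooled estimate and the resulting z value, taking the difference as sunny side minus shady side. One assumption to flag loudly: I am treating the quoted sample standard deviations as calculated with divisor n, so the pooled estimate is (n1·s1² + n2·s2²)/(n1 + n2 – 2). If your textbook's s already uses divisor n – 1, the weights become (n1 – 1) and (n2 – 1) and z changes slightly, though the conclusion here stays the same. The function name is illustrative.

```python
from math import sqrt

def pooled_variance(n1, s1, n2, s2):
    """Pooled estimate of a common variance.

    Assumes s1 and s2 are sample standard deviations with divisor n.
    """
    return (n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2)

# Sunny side: n = 26, mean 3.29, s = 0.9. Shady side: n = 36, mean 2.86, s = 0.60.
n_sunny, mean_sunny, s_sunny = 26, 3.29, 0.9
n_shady, mean_shady, s_shady = 36, 2.86, 0.60

var_hat = pooled_variance(n_sunny, s_sunny, n_shady, s_shady)
z = (mean_sunny - mean_shady) / sqrt(var_hat * (1 / n_sunny + 1 / n_shady))

print(round(var_hat, 3))  # 0.567
print(round(z, 3))        # 2.219
print(z > 1.645)          # True, so H0 is rejected at the 5% level
```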

When you perform a significance test, you can reach the wrong decision. If H0 is true and you accept it, or if H0 is false and you reject it, then you’ve made a correct decision. However, there are 2 kinds of errors that you can make:

1. A Type I Error, which is made when you reject H0 when it is true
2. A Type II Error, which is made when you accept H0 when it is false.

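Questions usually ask for the probability of making these errors. The first one is easy, P(Type I error) = level of significance (for a discrete distribution such as the binomial, the exact probability is that of the critical region, which is at most the level of significance). For the Type II error, things are not so straightforward. A specific value for the parameter under H1 has to be stated in order to find the probability of this error. I’ll show you an example below: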

A random observation is taken from a binomial distribution X ~ B(20, p) and used to test the null hypothesis p = 0.8 against the alternative hypothesis p > 0.8. The significance level of the test is 7%. Find the probability of making a Type I error. Find also the probability of making a Type II error if in fact p = 0.85.

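The probability of making a Type I error is 7%. [same as the level of significance; strictly, since X is discrete, the exact value is the probability of the critical region under H0, which is 6.9% here]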
You make a Type II error if you accept H0 when p is the value specified in H1.
For Type II error,
H0: p = 0.80
H1: p = 0.85
If H0 is true, X ~ B(20, 0.8), and
P(X = 20) = 0.012 = 1.2%
P(X ≥ 19) = 0.069 = 6.9%
P(X ≥ 18) = 0.206 = 20.6%
Since 6.9% < 7% < 20.6%, the critical region is X ≥ 19.
So P(Type II error) = P(accept H0 when H1 is true)
= P(X < 19 when p = 0.85)
= P(X < 19 when X ~ B(20, 0.85))
P(X ≤ 18) = 1 – P(X = 20) – P(X = 19) = 0.824 = 82.4% [Note that in this part of the calculations, you are using p = 0.85, but not 0.80 as when you were finding the critical region above.]
∴ The probability of making a Type II error is 82.4%.
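The two stages of the Type II error calculation (find the critical region using p = 0.80, then find the chance of missing it using p = 0.85) can be sketched in Python like this. The function names `upper_tail` and `critical_value` are my own, and the search simply looks for the smallest c with P(X ≥ c) below the significance level.

```python
from math import comb

def upper_tail(n, p, x):
    """Return P(X >= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def critical_value(n, p0, alpha):
    """Smallest c such that P(X >= c | p0) < alpha, for an upper tail test."""
    for c in range(n + 1):
        if upper_tail(n, p0, c) < alpha:
            return c
    return None  # no critical region exists at this level

n, p0, p1, alpha = 20, 0.80, 0.85, 0.07

c = critical_value(n, p0, alpha)      # critical region is X >= c
type_one = upper_tail(n, p0, c)       # exact P(reject H0 | p = 0.80)
type_two = 1 - upper_tail(n, p1, c)   # P(X < c | p = 0.85)

print(c)                   # 19
print(round(type_one, 3))  # 0.069
print(round(type_two, 3))  # 0.824
```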

Let me summarize how you find the probability of a Type II error: