Numeracy, Maths and Statistics - Academic Skills Kit

Introduction to Hypothesis Testing and Confidence Intervals (Animal Science)


The Outline for an Experiment

An experiment or trial should follow the steps below.

Come up with a hypothesis (see below).
Collect data and carry out a hypothesis test.
Decide whether you have statistically significant results, i.e. have you got sufficient evidence to support your hypothesis?
Report your findings.

What is a Hypothesis test?

A statistical hypothesis is an unproven statement which can be tested. A hypothesis test uses sample data to assess whether there is sufficient evidence to support or reject this statement.

The Structure of a Hypothesis Test

The first step of a hypothesis test is to state the null hypothesis $H_0$ and the alternative hypothesis $H_1$. The null hypothesis is the statement or claim being made (which we are trying to disprove) and the alternative hypothesis is the hypothesis that we are trying to prove and which is accepted if we have sufficient evidence to reject the null hypothesis.

For example, consider a person in court who is charged with murder. The jury needs to decide whether the person is innocent (the null hypothesis) or guilty (the alternative hypothesis). As usual, we assume the person is innocent unless there is sufficient evidence that the person is guilty. Similarly, we assume that $H_0$ is true unless we can provide sufficient evidence that it is false and that $H_1$ is true, in which case we reject $H_0$ and accept $H_1$.

To decide if we have sufficient evidence against the null hypothesis to reject it (in favour of the alternative hypothesis), we must first decide upon a significance level. The significance level is the probability of rejecting the null hypothesis when the null hypothesis is true and is denoted by $\alpha$. The $5\%$ significance level is a common choice for statistical tests.

The next step is to collect data and calculate the test statistic and associated $p$-value using the data. Assuming that the null hypothesis is true, the $p$-value is the probability of obtaining a sample statistic equal to or more extreme than the observed test statistic.

Next we must compare the $p$-value with the chosen significance level. If $p \lt \alpha$ then we reject $H_0$ and accept $H_1$. The lower $p$, the more evidence we have against $H_0$ and so the more confidence we can have that $H_0$ is false. If $p \geq \alpha$ then we do not have sufficient evidence to reject $H_0$ and so must accept it (strictly speaking, we fail to reject it, since a lack of evidence against $H_0$ does not prove that it is true).
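The comparison of the $p$-value with $\alpha$ can be sketched in Python. This is a minimal illustration, not a prescribed method: it assumes a two-tailed one-sample z-test with a known population standard deviation, and the function names, the data and the values of `mu0` and `sigma` are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-tailed z-test of H0: population mean == mu0.

    Assumes the population standard deviation `sigma` is known.
    Returns the test statistic z and the associated p-value.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Probability, assuming H0 is true, of a statistic at least as
    # extreme as the observed z in either tail.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def decide(p, alpha=0.05):
    """Apply the rule from the text: reject H0 only if p < alpha."""
    return "reject H0" if p < alpha else "do not reject H0"


# Hypothetical daily weight gains (kg) for 8 animals; H0: mean gain is 1.0 kg.
gains = [1.12, 0.98, 1.25, 1.10, 1.18, 1.05, 1.22, 1.15]
z, p = z_test_p_value(gains, mu0=1.0, sigma=0.1)
print(f"z = {z:.2f}, p = {p:.4f}: {decide(p)}")
```

Note that `decide` uses a strict inequality, so a $p$-value exactly equal to $\alpha$ does not lead to rejection.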

Alternatively, we can compare our test statistic with the appropriate critical value for the chosen significance level. We can look up critical values in distribution tables (see worked examples below). If our test statistic is:

positive and greater than the critical value, then we have sufficient evidence to reject the null hypothesis and accept the alternative hypothesis.
positive and lower than or equal to the critical value, we must accept the null hypothesis.
negative and lower than the critical value, then we have sufficient evidence to reject the null hypothesis and accept the alternative hypothesis.
negative and greater than or equal to the critical value, we must accept the null hypothesis.
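The four cases above can be sketched as a single function. This is an illustrative sketch only: it assumes a two-tailed test whose statistic follows the standard Normal distribution, so the critical value is computed rather than looked up in a table, and the function name is hypothetical.

```python
from statistics import NormalDist


def reject_by_critical_value(test_statistic, alpha=0.05):
    """Return True if H0 should be rejected in a two-tailed z-test."""
    # Upper critical value: about 1.96 when alpha = 0.05.
    upper = NormalDist().inv_cdf(1 - alpha / 2)
    lower = -upper
    if test_statistic >= 0:
        # Positive statistic: reject only if it exceeds the critical value.
        return test_statistic > upper
    # Negative statistic: reject only if it falls below the negative critical value.
    return test_statistic < lower


for stat in (2.5, 1.0, -2.5, -1.0):
    print(stat, "reject H0" if reject_by_critical_value(stat) else "do not reject H0")
```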

For either method:

Significant difference found: Reject the null hypothesis
No significant difference found: Accept the null hypothesis

Finally, we must interpret our results and come to a conclusion. Returning to the example of the person in court, if the result of our hypothesis test indicated that we should accept $H_1$ and reject $H_0$, our conclusion would be that the jury should declare the person guilty of murder.

Summary of Steps for a Hypothesis Test

Specify the null and the alternative hypotheses.
Decide upon the significance level.
Collect data and decide whether to accept $H_0$ or reject $H_0$ and accept $H_1$ by either:
- Comparing the $p$-value to the significance level $\alpha$, or
- Comparing the test statistic to the critical value.
Interpret your results and draw a conclusion.

How to Report your Findings

If you were writing about findings of a hypothesis test in a report/project, you would do so in the following way:

You would state what the results mean in context of your experiment.
Immediately after the statement, in brackets, you would include what test you used, the test statistic and the $p$-value it yielded.
Findings are reported in this way not only at undergraduate level; published papers use this method too.
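As a rough illustration of this reporting style, the sketch below assembles such a sentence. The function name, the example finding and the convention of writing very small $p$-values as "p < 0.001" are assumptions made for illustration, not requirements stated here.

```python
def report_finding(statement, test_name, stat_symbol, stat_value, p_value):
    """Build a sentence: the finding in context, then the test details in brackets."""
    # Assumed convention: very small p-values are reported as an inequality.
    p_text = "p < 0.001" if p_value < 0.001 else f"p = {p_value:.3f}"
    return f"{statement} ({test_name}, {stat_symbol} = {stat_value:.2f}, {p_text})."


# Hypothetical finding and numbers, for illustration only.
print(report_finding(
    "Mean daily weight gain differed from 1.0 kg",
    "one-sample z-test", "z", 3.71, 0.0002,
))
```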

Parametric and Non-Parametric Hypothesis Tests

There are parametric and non-parametric hypothesis tests.

A parametric hypothesis test assumes that the data follows a Normal probability distribution (with equal variances if we are working with more than one set of data). A parametric hypothesis test tests a statement about the parameters of this distribution (typically the mean).

A non-parametric test does not assume that the data follows any particular distribution and usually bases its calculations on the median. Note that although we do not assume the data follows a particular distribution, it may do anyway. We do not cover non-parametric hypothesis tests in detail on the Animal Science area of the wiki. However, if you would like to find out more about them you can look at the Psychology section.

One- and two- tailed tests

Whether a one-tailed or a two-tailed test is appropriate depends upon the alternative hypothesis $H_1$.

One-tailed tests are used when the alternative hypothesis states the direction of the difference: that the parameter of interest is bigger than the value stated in the null hypothesis, or that it is smaller. For example, the null hypothesis might state that the average weight of chocolate bars produced by a chocolate factory in Slough is 35g (as is printed on the wrapper), while the alternative hypothesis might state that the average weight of the chocolate bars is in fact lower than 35g.

Two-tailed tests are used when the alternative hypothesis states that the parameter of interest differs from the value stated in the null hypothesis but does not specify in which direction. In the above example, a two-tailed alternative hypothesis would be that the average weight of the chocolate bars is not equal to 35g.
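The difference between the two kinds of test shows up in how the $p$-value is calculated. The sketch below uses the chocolate bar example with made-up numbers: the sample size, sample mean and (assumed known) population standard deviation are hypothetical, and a z-test is assumed for simplicity.

```python
from math import sqrt
from statistics import NormalDist


def tail_p_values(sample_mean, mu0, sigma, n):
    """p-values for the three possible alternative hypotheses in a z-test."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    lower = NormalDist().cdf(z)        # H1: mean is lower than mu0
    upper = 1 - NormalDist().cdf(z)    # H1: mean is greater than mu0
    two_tailed = 2 * min(lower, upper)  # H1: mean is not equal to mu0
    return {"z": z, "lower": lower, "upper": upper, "two_tailed": two_tailed}


# Hypothetical sample: 25 bars with mean weight 34.5g, sigma assumed to be 1.5g.
result = tail_p_values(sample_mean=34.5, mu0=35.0, sigma=1.5, n=25)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these made-up numbers the one-tailed (lower) $p$-value is below $0.05$ while the two-tailed $p$-value is above it, so the conclusion at the $5\%$ level depends on which alternative hypothesis was specified in advance.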

Type I and Type II Errors

A Type I error is made if we reject the null hypothesis when it is true (so should have been accepted). Returning to the example of the person in court, a Type I error would be made if the jury declared the person guilty when they are in fact innocent. The probability of making a Type I error is equal to the significance level $\alpha$.

A Type II error is made if we accept the null hypothesis when it is false, i.e. we should have rejected the null hypothesis and accepted the alternative hypothesis. This would occur if the jury declared the person innocent when they are in fact guilty.
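The link between the significance level and the Type I error rate can be checked with a small simulation. This is a sketch under stated assumptions: the data are drawn from a Normal distribution for which $H_0$ really is true, a two-tailed z-test with known standard deviation is used, and the function name and default settings are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_one_error_rate(trials=2000, n=20, alpha=0.05, seed=0):
    """Estimate how often a true H0 is rejected at significance level alpha."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0  # H0 is true: samples really have mean mu0
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
        p = 2 * (1 - NormalDist().cdf(abs(z)))
        if p < alpha:
            rejections += 1  # a Type I error: H0 rejected although true
    return rejections / trials


rate = type_one_error_rate()
print(f"Proportion of Type I errors: {rate:.3f}")
```

The estimated proportion should be close to the chosen $\alpha$, although it will vary slightly with the random seed and the number of trials.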

Confidence Intervals

A confidence interval describes our uncertainty about where the population mean of a measurement lies, based on a sample. It is calculated using the standard error of the mean. We first choose the confidence level of the interval; usually we choose 95%. This means that if we were to repeat our experiment 100 times and compute the 100 corresponding confidence intervals, we would expect approximately 95 of them to contain the population mean.
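The "repeat the experiment many times" interpretation can be illustrated by simulation. The sketch below makes simplifying assumptions: the population mean and standard deviation are hypothetical and treated as known, so a Normal (z) value is used for the interval instead of a t-value.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(trials=1000, n=25, level=0.95, seed=1):
    """Proportion of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    mu, sigma = 50.0, 5.0  # hypothetical population mean and standard deviation
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1  # this interval contains the population mean
    return hits / trials


print(f"Proportion of intervals containing the mean: {coverage():.3f}")
```

The proportion should be close to the chosen confidence level, but not exactly equal to it in any one run.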

A confidence interval consists of an upper and lower bound, calculated using the sample mean $\bar{x}$, the standard error of the mean (the sample standard deviation $s$ divided by the square root of the sample size $n$), and a $t$-value corresponding to the chosen confidence level and the degrees of freedom in the sample.

\begin{align} \text{Upper bound} &= \bar{x} + \left(t\times\frac{s}{\sqrt{n}}\right) \\ \text{Lower bound} &= \bar{x} - \left(t\times\frac{s}{\sqrt{n}}\right) \end{align}

Minitab or R can calculate this range for you.
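For readers who want to see the arithmetic, here is a minimal Python sketch of the calculation. It is an illustration and not the Minitab or R procedure. It assumes the margin is the $t$-value multiplied by the standard error of the mean (the sample standard deviation divided by $\sqrt{n}$), and that the $t$-value is looked up in a table and passed in by hand. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def confidence_interval(sample, t_value):
    """Lower and upper bounds of a confidence interval for the population mean."""
    n = len(sample)
    x_bar = mean(sample)
    standard_error = stdev(sample) / sqrt(n)  # sample standard deviation / sqrt(n)
    margin = t_value * standard_error
    return x_bar - margin, x_bar + margin


# Hypothetical weights (kg) of 10 lambs. The tabulated two-tailed t-value
# for 95% confidence with 10 - 1 = 9 degrees of freedom is 2.262.
weights = [4.1, 3.8, 4.5, 4.0, 4.3, 3.9, 4.2, 4.4, 4.0, 4.1]
lower, upper = confidence_interval(weights, t_value=2.262)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The degrees of freedom for this interval are the sample size minus one, so a different sample size needs a different $t$-value from the table.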

Test Yourself

Try our Numbas test on hypothesis testing: Practising confidence intervals and hypothesis tests.