A one-sample t-test is used when you are testing only one sample, for example the number of owls in an area over the past 10 years. Usually you compare your data with a known value, typically a mean that has been derived from previous research. A one-sample t-test compares a sample mean $\bar{x}$ (calculated using the data) to a known "population" mean $\mu$ (typically obtained in previous research). We want to test the null hypothesis that the mean of the population from which the sample was drawn is equal to the known mean $\mu$. For example, we might want to test whether the proportion of red squirrels to grey squirrels in Newcastle is different from the known UK average.
As you progress through your university career you will be introduced to statistical packages such as R and Minitab that can perform these tests for you and present the final significance level. However, you may also be shown how to conduct and interpret hypothesis tests without such software, which is a good way to demonstrate a thorough knowledge of what is really happening with the data. The t statistic is calculated as follows:
\begin{equation} t = \dfrac{\bar{x}-\mu}{\sqrt{\dfrac{s^2}{n}}} \end{equation}
where $\bar{x}$ is the sample mean, $\mu$ is the population mean, $n$ is our sample size and $s$ is the sample standard deviation.
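Since $\sqrt{s^2/n} = s/\sqrt{n}$, the same statistic can be written with the standard error of the mean in the denominator. The definition of $s$ below is the usual sample standard deviation (dividing by $n-1$); it is stated here as the standard textbook definition and is an assumption about what these notes intend, with $x_i$ denoting the individual observations.

\begin{equation} t = \dfrac{\bar{x}-\mu}{s/\sqrt{n}}, \qquad s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2 \end{equation}

The statistic is then compared to a t-table on $n-1$ degrees of freedom.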
The number of owls in an area has been recorded for the past 50 years, and the average number over those 50 years (from a previous experiment) was found to be $106$. The counts for the last 9 years are recorded in the table below.
Has there been a change in the number of owls in the area?
Year | Owl Count |
---|---|
2005 | 108 |
2006 | 131 |
2007 | 156 |
2008 | 113 |
2009 | 105 |
2010 | 99 |
2011 | 140 |
2012 | 123 |
2013 | 110 |
For the last 9 years, the mean number of owls has been $120.6$ (1 d.p.) with a standard deviation of $18.6$.
The null hypothesis is that the mean number of owls has remained the same as the 50-year average of $106$. The alternative hypothesis is that the mean number of owls has changed (from $106$). (Note: This is a two-tailed t-test because we are just testing for a change, with no specific direction.) I.e. $H_0: \mu = 106$ and $H_1: \mu \neq 106$.
Using Minitab or R we find the t statistic is $2.342$ (3 d.p.). Since $n = 9$, we compare our t statistic to a t-table on $9 - 1 = 8$ degrees of freedom.
Looking at the table, we can see that the critical t-value at the $95$% level $(P=0.05)$ is $2.306$. We see that $2.342$ is greater than $2.306$. Therefore, our t-statistic is statistically significant at the $95$% level and its corresponding P-value will be less than $0.05$.
We have evidence for the alternative hypothesis and thus can conclude that the mean number of owls has changed.
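The calculation in this example can also be checked by hand. The following is a minimal Python sketch, assuming only the standard library; the function name `one_sample_t` is our own choice for illustration, and the critical value is copied from the t-table rather than computed.

```python
import math
import statistics

# Owl counts for 2005-2013, copied from the table in this example.
owl_counts = [108, 131, 156, 113, 105, 99, 140, 123, 110]
known_mean = 106     # 50-year average from the previous experiment
critical_t = 2.306   # two-tailed critical value, 8 d.f., P = 0.05 (read from the t-table)


def one_sample_t(sample, mu):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (x_bar - mu) / math.sqrt(s ** 2 / n)
    return t, n - 1


t_stat, df = one_sample_t(owl_counts, known_mean)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
if abs(t_stat) > critical_t:
    print("Significant at the 95% level")
else:
    print("Not significant at the 95% level")
```

The absolute value of the t statistic is used in the comparison because the test is two-tailed.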
Here are video tutorials on using R Studio and Minitab (ver. 16) for this example:
A two sample t-test compares two samples of normally distributed data where the population variance is unknown and the sample sizes are small ($n \lt 30$). We shall look at two types of two sample t-tests: paired and independent.
The main difference between these two tests is that the t statistic is calculated differently (the paired test uses the differences between each pair of observations). However, Minitab and R calculate this for you once you specify which type of two sample t-test you would like to perform.
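For reference, the two statistics can be written out as follows. These are the standard textbook forms rather than formulas given in these notes, and the symbols are our own: $\bar{d}$ and $s_d$ are the mean and sample standard deviation of the $n$ paired differences, while $\bar{x}_1, \bar{x}_2$, $s_1, s_2$ and $n_1, n_2$ are the means, standard deviations and sizes of the two independent samples. The independent version shown assumes equal variances, which matches the $n_1 + n_2 - 2$ degrees of freedom used in the worked example.

\begin{equation} t_{\text{paired}} = \dfrac{\bar{d}}{s_d/\sqrt{n}} \quad \text{on } n-1 \text{ degrees of freedom} \end{equation}

\begin{equation} t_{\text{independent}} = \dfrac{\bar{x}_1-\bar{x}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}, \qquad s_p^2 = \dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} \end{equation}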
See the page of worked examples.
We use F-tests (usually in Minitab or R) to check whether our two samples have equal variances. See F-test for more information. Minitab and R can also be used to test for normality.
Here is an example of a t-table with explanations of what each part means. (This is for a two-tailed or paired t-test. For a one-tailed t-test the probabilities are halved; see the worked example below.)
This example is very similar to examples in the lecture notes in the first year animal behaviour module (ACE1027).
This is a paired t-test because there is one group being tested twice, rather than two independent groups.
A class were conducting an experiment to assess interobserver reliability. They observed stabled horses performing stereotypies (repetitive behaviours indicative of poor welfare: weaving, wind sucking and crib biting). They watched the footage 3 times and assessed whether their observation and recording skills had improved each time they watched. They used data from a group of animal behaviourist students and performed intra-observation agreement calculations between watching the first and second time, then between the second and third times. The group's results are displayed in the table below.
The results were $44$% agreement between the first and second observations and $61$% between the second and third. This seemed like a difference. They conducted a t-test and found the t statistic to be $1.731$ with $P = 0.134$. Is this a significant result?
First to Second | Second to Third |
---|---|
16 | 41 |
41 | 54 |
65 | 76 |
65 | 61 |
12 | 83 |
45 | 46 |
68 | 69 |
Since the t statistic $1.731$ is less than the critical t-value of $2.447$ on $6$ degrees of freedom at the $95$% level $(P = 0.05)$ (circled in the table above), we conclude that this is not statistically significant. We have no evidence of a change in agreement.
The P-value is $0.134$, which is above $0.05$ and so not significant at the $95$% level, but it is low enough to suggest that more experiments are needed to be more conclusive.
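The paired calculation can be sketched in Python as follows. This is a minimal illustration using only the standard library, assuming the standard paired form (mean of the differences divided by its standard error); the function name `paired_t` is hypothetical, and the critical value is copied from the t-table.

```python
import math
import statistics

# Percentage agreement for each student, copied from the table in this example.
first_to_second = [16, 41, 65, 65, 12, 45, 68]
second_to_third = [41, 54, 76, 61, 83, 46, 69]
critical_t = 2.447   # two-tailed critical value, 6 d.f., P = 0.05 (read from the t-table)


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # A paired test works on the difference within each pair.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1


t_stat, df = paired_t(first_to_second, second_to_third)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
if abs(t_stat) > critical_t:
    print("Significant at the 95% level")
else:
    print("Not significant at the 95% level")
```

Note that the degrees of freedom are based on the number of pairs (7), not the total number of observations (14).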
A behaviourist is interested in the time taken to complete a maze for two different strains of laboratory rat. The trial involves 20 animals: 10 rats were from a strain selected according to the performance of their parents and 10 rats were from an unselected line. The time in seconds to complete the maze is recorded in the table below.
Is there a difference between the average times to complete the maze for the two strains?
Selected Strain | Unselected Strain |
---|---|
30 | 35 |
52 | 40 |
37 | 59 |
49 | 29 |
27 | 60 |
35 | 35 |
52 | 65 |
40 | 49 |
43 | 73 |
61 | 39 |
The mean for the selected strain is $42.6$ and the standard deviation is $10.8$. The mean for the unselected strain is $48.4$ and the standard deviation is $15.0$.
Using Minitab we find the t-statistic is $0.99$. (R calculates it as $0.992$.) We compare this to the t-value on $n_1 + n_2 - 2 = 18$ degrees of freedom.
Looking at the table, we can see that the critical t-value at the $95$% level $(P=0.05)$ is $2.101$. We see that $0.992$ is less than $2.101$, so our t-statistic is not significant at the $95$% level. Minitab calculates $P = 0.336$, which is greater than $0.05$, so there is no evidence to reject the null hypothesis.
We conclude that there is no evidence of a difference in average time to complete the maze between the two strains.
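The independent two sample calculation can be sketched in Python as follows. This is a minimal illustration using only the standard library. It assumes equal variances and so uses a pooled variance, which is consistent with the $n_1 + n_2 - 2 = 18$ degrees of freedom used above; the function name `pooled_two_sample_t` is hypothetical, and the critical value is copied from the t-table.

```python
import math
import statistics

# Maze completion times in seconds, copied from the table in this example.
selected = [30, 52, 37, 49, 27, 35, 52, 40, 43, 61]
unselected = [35, 40, 59, 29, 60, 35, 65, 49, 73, 39]
critical_t = 2.101   # two-tailed critical value, 18 d.f., P = 0.05 (read from the t-table)


def pooled_two_sample_t(sample_1, sample_2):
    """Return (t statistic, degrees of freedom), assuming equal variances."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n1 - 1) * statistics.variance(sample_1)
        + (n2 - 1) * statistics.variance(sample_2)
    ) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / std_error
    return t, df


# Unselected minus selected, so the statistic comes out positive.
t_stat, df = pooled_two_sample_t(unselected, selected)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
if abs(t_stat) > critical_t:
    print("Significant at the 95% level")
else:
    print("Not significant at the 95% level")
```

Swapping the order of the two samples only changes the sign of the t statistic, which does not affect a two-tailed test.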
Here are video tutorials for Minitab (ver. 16) and R Studio for this example:
Try our Numbas test on hypothesis testing: Practising confidence intervals and hypothesis tests
To develop these ideas further see the other sections of Hypothesis Tests (Animal Science).
For additional information on topics covered in this section see the main site's page on hypothesis testing.