The mean of a set of numbers in a data set is obtained by adding up all the numbers then dividing by the size of the data set. When people use the word 'average' in everyday conversation, they are usually referring to the mean.
The ages of people in the checkout queue at Aldi are as follows: \[23, 54, 2, 6, 20, 25, 21, 64, 19, 19, 75, 36\text{.}\] Find the mean.
To find the mean, add all of the observed numbers together, then divide by the number of observations, which in this case is $12$.
\[\dfrac{23+54+2+6+20+25+21+64+19+19+75+36}{12}=\dfrac{364}{12}=30.33333\ldots\]
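The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the variable names `ages`, `total`, `n` and `mean_age` are chosen here for illustration and are not part of the original example.

```python
# Ages of the people in the checkout queue (from the example above).
ages = [23, 54, 2, 6, 20, 25, 21, 64, 19, 19, 75, 36]

total = sum(ages)      # add all of the observed numbers together
n = len(ages)          # the number of observations
mean_age = total / n   # divide the total by the number of observations

print(total)               # 364
print(n)                   # 12
print(round(mean_age, 2))  # 30.33
```

Rounding is applied only when printing, so `mean_age` itself keeps the full recurring decimal to floating-point precision.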
It is unusual to have data for an entire population. When we do, the mean of all of the values is called the population mean, and we represent it by $\mu$. It is calculated in the same way as before: \[\mu = \frac{1}{N} \sum\limits_{i=1}^N x_i\]
where $N$ is the size of the population consisting of $x_1, x_2, \ldots, x_N$.
When we have taken a sample of $n$ observations $x_1, x_2, \ldots, x_n$ from the underlying population, we use the term sample mean for the mean of these observations. It is represented by \[\bar{x} = \frac{1}{n} \sum\limits_{i=1}^n x_i\] where $n$ is the size of the sample and $x_1, x_2, \ldots, x_n$ are the $n$ observations obtained. This is exactly the calculation carried out in the example above; it is just a more formal way of expressing it.
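The median is usually described as the ‘middle number’. We can obtain the median by ordering the data in terms of size, then taking the middle value if the number of data values is odd, or the mean of the two middle values if the number of data values is even.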
When you have a large data set, it is often more useful to find the position of the median within the ordered data set. This is given by $\frac{n+1}{2}$, where $n$ is the number of data values in the data set.
The ages of people in the checkout queue at Aldi are as follows: \[23, 54, 2, 6, 20, 25, 21, 64, 19, 19, 75, 36\text{.}\]
Find the median.
To find the median, first reorder the numbers in terms of size.
\[2 , 6 , 19 , 19 , 20 , 21 , 23 , 25 , 36 , 54, 64 , 75\text{.}\]
The number of data entries is $12$ so the position of the median is
\[\frac{n+1}{2}=\frac{12+1}{2}=\frac{13}{2}=6.5\text{.}\]
This means the median is between the $6$th and $7$th values, which are $21$ and $23$ respectively. In this case we compute
\[\dfrac{21+23}{2}=22\text{.}\]
So $22$ is the median age of people in the checkout queue at Aldi.
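The steps of this worked example can be sketched in Python as follows. The function name `median_by_position` is hypothetical, chosen for this illustration; it assumes a non-empty list of numbers and uses the position $\frac{n+1}{2}$ described above.

```python
def median_by_position(values):
    """Return the median using the (n + 1) / 2 position rule."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")

    ordered = sorted(values)   # reorder the numbers in terms of size
    n = len(ordered)
    position = (n + 1) / 2     # 1-based position of the median

    if n % 2 == 1:
        # Odd n: the position is a whole number, so take that value.
        return ordered[int(position) - 1]

    # Even n: the position falls halfway between two values,
    # so take the mean of the values either side of it.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


ages = [23, 54, 2, 6, 20, 25, 21, 64, 19, 19, 75, 36]
median_age = median_by_position(ages)
print(median_age)  # 22.0
```

Python lists are indexed from $0$, which is why the $6$th and $7$th values are read from indices $5$ and $6$.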
The mode is the value that appears most often in your set of data. To find the mode, count how often each value appears; the value that appears the most times is the mode.
The ages of people in the checkout queue at Aldi are as follows: \[23, 54, 2, 6, 20, 25, 21, 64, 19, 19, 75, 36\text{.}\] Find the mode.
The age that appears most frequently is $19$, which appears twice while every other age appears once; so the modal age is $19$.
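Counting how often each value appears can be sketched in Python using `collections.Counter` from the standard library. The function name `modes` is hypothetical, chosen for this illustration. It assumes a non-empty data set and returns a list, because more than one value can be tied for the highest count.

```python
from collections import Counter


def modes(values):
    """Return every value that appears the greatest number of times."""
    counts = Counter(values)        # how often each value appears
    highest = max(counts.values())  # the largest count found
    return sorted(v for v, c in counts.items() if c == highest)


ages = [23, 54, 2, 6, 20, 25, 21, 64, 19, 19, 75, 36]
modal_ages = modes(ages)
print(modal_ages)  # [19]
```

If every value appears exactly once, this sketch returns all of the values, which is one way of signalling that the data set has no single most common value.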
This workbook produced by HELM is a good revision aid, containing key points for revision and many worked examples.