Confidence Intervals (Business)

Confidence Intervals

If you are working with a sample from a population but want to use this sample to estimate the mean of the whole population, you might use a confidence interval. A confidence interval is a range of values, calculated from the sample, that is constructed so that it contains the true mean with a specified level of confidence. To calculate a confidence interval, we must first calculate the standard error of the sample mean, which measures how much the sample mean varies from one sample to the next.

We typically denote the unknown population mean by $\mu$ and the population variance by $\sigma^2$. A sample of size $n$ has sample mean $\bar{x}$ and sample variance $s^2$.

There are two important situations:

Population Variance Known

Before calculating a confidence interval, we must first decide upon the significance level $\alpha$ that we wish to work with. The most common level is $0.05$, which means that if we were to repeat our experiment $100$ times and calculate a confidence interval each time, approximately $95$ of those intervals would contain the true (population) value of the mean. We therefore call the resulting confidence interval a 95% confidence interval.

Note: if we are using a significance level of $\alpha$ then the corresponding confidence interval is a $(1-\alpha)100\%$ confidence interval.

Other significance levels (such as 0.1 or 0.01) may be used. A 0.01 significance level, for instance, makes the confidence intervals more likely to capture the true value (approximately $99$ out of $100$ intervals will contain the true (population) value of the mean) but also wider, and therefore less precise.
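The trade-off between significance level and interval width can be checked with a small simulation. The following is only a sketch: it uses the known-variance interval $\bar{x} \pm z\sqrt{\sigma^2/n}$ that is derived later in this section, and the population values (`mu`, `sigma`), the sample size `n`, the number of `trials` and the `seed` are all made-up assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

def simulate(alpha, n=25, mu=10.0, sigma=2.0, trials=4000, seed=1):
    """Return (coverage, half_width) for known-variance intervals.

    All default values are hypothetical and only serve the illustration.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    half_width = z * sigma / n ** 0.5         # identical for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # Count the interval as a "hit" if it contains the true mean.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials, half_width

for alpha in (0.10, 0.05, 0.01):
    coverage, half_width = simulate(alpha)
    print(f"alpha={alpha:.2f}  coverage={coverage:.3f}  half-width={half_width:.3f}")
```

The printed coverage should come out close to $1-\alpha$ in each case, while the half-width grows as $\alpha$ shrinks: a higher confidence level is paid for with a wider interval.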

If we assume that the population of interest is (approximately or exactly) normally distributed, we can begin constructing the confidence interval for the population mean $\mu$ using the following formula:

\begin{equation} z = \dfrac{(\bar{x} -\mu)}{\sqrt{\dfrac{\sigma^2}{n}}} \end{equation}

This means that $z$ follows a $N(0,1)$ distribution (see converting Normal to Standard Normal for more information).

Note: if we are using a significance level of $\alpha$, we denote the corresponding critical value of $z$ by $z_{\frac{\alpha}{2}}$.

If we want to calculate a $95$% confidence interval, we have that $P( -1.96 < z < 1.96 ) = 0.95$ so we set $z_{0.025} =1.96$. For different levels of significance, replace $1.96$ by the desired value from the normal table (see reading Normal tables for more information).
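If a normal table is not to hand, the same critical values can be obtained in software. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `z_critical` is our own and is not part of any library.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Two-sided critical value z_{alpha/2} of the standard normal."""
    # inv_cdf returns the point with the given probability to its left,
    # so we ask for 1 - alpha/2 to leave alpha/2 in the upper tail.
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}  z={z_critical(alpha):.4f}")
# alpha=0.10  z=1.6449
# alpha=0.05  z=1.9600
# alpha=0.01  z=2.5758
```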

As we want a confidence interval for $\mu$, after setting $z$ to the desired value, we rearrange the above formula for $z$ to get:

\begin{equation} \bar{x} - z \times \sqrt{\dfrac{\sigma^2}{n}} \leq \mu \leq \bar{x} + z \times \sqrt{\dfrac{\sigma^2}{n}} \end{equation}
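The rearrangement can be spelled out in three steps. Writing $z_{\alpha/2}$ for the chosen critical value (the plain $z$ in the interval above), we start from the statement that the standardised sample mean lies between $-z_{\alpha/2}$ and $z_{\alpha/2}$:

\begin{equation} -z_{\alpha/2} \leq \dfrac{(\bar{x} -\mu)}{\sqrt{\dfrac{\sigma^2}{n}}} \leq z_{\alpha/2} \end{equation}

Multiplying through by the (positive) standard error gives:

\begin{equation} -z_{\alpha/2}\sqrt{\dfrac{\sigma^2}{n}} \leq \bar{x} -\mu \leq z_{\alpha/2}\sqrt{\dfrac{\sigma^2}{n}} \end{equation}

Subtracting $\bar{x}$ and then multiplying by $-1$ (which reverses the inequalities) isolates $\mu$:

\begin{equation} \bar{x} - z_{\alpha/2}\sqrt{\dfrac{\sigma^2}{n}} \leq \mu \leq \bar{x} + z_{\alpha/2}\sqrt{\dfrac{\sigma^2}{n}} \end{equation}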

The interval is usually written $\bar{x} \pm z \times \sqrt{\dfrac{\sigma^2}{n}}$ and we can calculate this because we know the values of $\bar{x}, \sigma^2$ and $n$.

For instance, for a $95$% confidence interval, we would use $\bar{x} \pm 1.96\sqrt{\dfrac{\sigma^2}{n}}$.

Population Variance Unknown

You may not always work with data where you know $\sigma^2$ (the population variance). In fact, it is usually unknown. This means you cannot use the above formula for working out confidence intervals. What you can do instead is substitute the sample variance $s^2$ for $\sigma^2$ in the formula for $z$ and call the result $T$. (Note: this is still assuming that the observations are approximately normally distributed.)

\begin{equation} T =\dfrac{(\bar{x} -\mu)}{\sqrt{\dfrac{s^2}{n}}} \end{equation}

Unfortunately, because $s^2$ itself changes from sample to sample, $T$ varies more than $z$ does. This means $T$ does not have a $N(0,1)$ distribution, so we have to use a t-distribution with $(n-1)$ degrees of freedom instead. The t-distribution gets closer to the normal distribution as the number of degrees of freedom increases, and the two are very similar once there are more than about $30$. (See reading tables for the t-table and how to read it.) The formula for the confidence interval when the population variance is unknown is:

\begin{equation} \bar{x} - t_{\alpha/2}\sqrt{\dfrac{s^2}{n}} <\mu <\bar{x} +t_{\alpha/2}\sqrt{\dfrac{s^2}{n}} \end{equation}

Again for convenience this is usually written: $\bar{x} \pm t_{\alpha/2}\sqrt{\dfrac{s^2}{n}}$ and this is still interpreted as a confidence interval for the true value of the population mean.
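To see why the t-distribution matters for small samples, the sketch below builds intervals from the sample standard deviation twice: once with the normal value $1.96$ and once with the t table value for $4$ degrees of freedom. It is only an illustration: the sample size `n = 5`, the population values, the number of `trials` and the `seed` are assumptions made for the example, and the t value $2.7764$ is typed in from a standard t table rather than computed.

```python
import random
from statistics import mean, stdev

def coverage(crit, n=5, mu=0.0, sigma=1.0, trials=4000, seed=2):
    """Share of intervals xbar +/- crit * sqrt(s^2 / n) that contain mu.

    All default values are hypothetical and only serve the illustration.
    """
    rng = random.Random(seed)   # same seed, so both runs see the same samples
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar, s = mean(sample), stdev(sample)   # stdev uses the n-1 divisor
        half_width = crit * s / n ** 0.5
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials

Z_CRIT = 1.96     # standard normal value for a 95% interval
T_CRIT = 2.7764   # t table value, 4 degrees of freedom, alpha/2 = 0.025

print("using the normal value:", coverage(Z_CRIT))
print("using the t value:     ", coverage(T_CRIT))
```

With the normal value the intervals should contain the true mean noticeably less often than $95$% of the time, whereas the t value should bring the coverage back to roughly $95$%.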

Worked Example 1

The percentage yearly return on a particular share in Herschel Inc. is known to be normally distributed with a standard deviation of $0.8$%. A sample of $20$ randomly selected yearly returns yields a mean of $\bar{x} = 3.2$%.

Obtain a $99$% confidence interval for the mean yearly return on this share.

Solution

We want a $99$% confidence interval so $\alpha = 0.01$ and $\alpha/2 = 0.005$. Thus $z_{0.005} = 2.5758$. The confidence interval is:

$3.2 \pm 2.5758\sqrt{\dfrac{0.8^2}{20}~} = 3.2 \pm 0.4608$ (the standard deviation $0.8$ must be squared to give the variance $\sigma^2 = 0.64$). We often write this (here to 3 d.p.) in the form $(2.739, 3.661)$.

As this interval lies entirely above zero, the data suggest that the mean yearly return is positive, so if you have shares in Herschel Inc. you will probably make money on your investment!
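The calculation for this example can be reproduced in a few lines. This is a sketch rather than a definitive implementation: the function name `z_interval` and its parameters are our own, and it treats $0.8$ as the standard deviation, as the question states, so that the standard error is $0.8/\sqrt{20}$.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when the population sd is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value z_{alpha/2}
    half_width = z * sigma / n ** 0.5         # z * sqrt(sigma^2 / n)
    return xbar - half_width, xbar + half_width

# Values from the Herschel Inc. example (all in percent).
low, high = z_interval(xbar=3.2, sigma=0.8, n=20, confidence=0.99)
print(f"({low:.3f}, {high:.3f})")   # (2.739, 3.661)
```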

Worked Example 2

A sample of the travel expenses of $15$ workers from the same company was collected. This sample gave a mean of $£15.29$ and a sample standard deviation of $£1.95$.

Find the $95$% confidence interval for the mean cost of travel expenses. What do we need to assume?

Solution

Firstly, the population variance is unknown and we have a relatively small sample, so to construct this interval we must use the t-distribution. To do so, we must assume that the data are normally distributed.

We have a sample of size $n = 15$ so we have $n-1= 14$ degrees of freedom. Since we want a $95$% confidence interval, $\alpha = 0.05$ and we take $\dfrac{\alpha}{2} = 0.025$. Looking up the corresponding t value in the t-distribution table gives $t_{0.025} = 2.1448$.

The formula for the 95% confidence interval is:

$\bar{x} \pm t_{0.025}\sqrt{\dfrac{s^2}{n}~}$

Substituting the values we have in:

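£$15.29 \pm 2.1448\sqrt{\dfrac{1.95^2}{15}~} = 15.29 \pm 1.08$ (the sample standard deviation $1.95$ must be squared to give the sample variance $s^2$).

Thus the $95$% confidence interval for the true value of the mean is $(£14.21 , £16.37)$.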
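The same arithmetic can be checked in code. The sketch below is an illustration only: the function name `t_interval` is our own, and the critical value is passed in by hand from the t table (as in the solution above) rather than computed. It uses $s = 1.95$ as the sample standard deviation, so the standard error is $1.95/\sqrt{15}$.

```python
def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when the population variance is unknown.

    t_crit is the t table value for n - 1 degrees of freedom.
    """
    half_width = t_crit * s / n ** 0.5   # t * sqrt(s^2 / n)
    return xbar - half_width, xbar + half_width

# Values from the travel expenses example; 2.1448 is t_{0.025} with 14 d.f.
low, high = t_interval(xbar=15.29, s=1.95, n=15, t_crit=2.1448)
print(f"(£{low:.2f}, £{high:.2f})")   # (£14.21, £16.37)
```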

See Also

For more information on the topics covered in this section see descriptive statistics and presenting data. To develop these ideas further see hypothesis testing.