23 One-Sample T-Test
The one-sample t-test is a statistical procedure used to determine whether the mean of a single sample differs significantly from a known or hypothesized population mean. This test is particularly useful when the population standard deviation is unknown and the sample size is small, which is a common scenario in many practical research applications.
23.1 Assumptions
Before conducting a one-sample t-test, certain assumptions must be verified to ensure the validity of the test results:
- Normality: The data should be approximately normally distributed. This assumption is especially important with smaller sample sizes. For larger samples, the Central Limit Theorem helps as it suggests that the means of the samples will be approximately normally distributed regardless of the shape of the population distribution.
- Independence: The sampled observations must be independent of each other. This means that the selection of one observation does not influence or alter the selection of other observations.
- Scale of Measurement: The data should be measured at least at the interval level, which means that the numerical distances between measurements are defined.
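As an illustration, the normality assumption can be screened with a Shapiro-Wilk test before running the t-test; the sketch below uses SciPy and a made-up sample:

```python
# Hypothetical sample; the Shapiro-Wilk test screens the normality
# assumption before a one-sample t-test is run.
from scipy import stats

sample = [4.9, 5.1, 4.8, 5.2, 5.0, 4.7, 5.3, 4.6, 5.0, 4.9]
stat, p = stats.shapiro(sample)
# A p-value above the significance level (e.g. 0.05) gives no
# evidence against normality.
print(f"W = {stat:.3f}, p = {p:.3f}")
```

Formal tests like Shapiro-Wilk have low power in small samples, so a normal Q-Q plot is a useful visual complement.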
23.2 Hypotheses
The hypotheses for a one-sample t-test are structured as follows:
- Null Hypothesis (H₀): The population mean is equal to the specified value (\(\mu = \mu_0\)).
- Alternative Hypothesis (H₁): The population mean is not equal to the specified value (\(\mu \neq \mu_0\)). The alternative hypothesis can also be directional, stating that the mean is greater than (\(\mu > \mu_0\)) or less than (\(\mu < \mu_0\)) the specified value, depending on the research question.
23.2.1 Formula
The t-statistic is calculated using the formula:
\[t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}\]
Where:
- \(\bar{x}\) is the sample mean.
- \(\mu_0\) is the hypothesized population mean.
- \(s\) is the sample standard deviation.
- \(n\) is the sample size.
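As a minimal sketch of this formula (the sample values and \(\mu_0\) below are made up), the t-statistic can be computed with the Python standard library alone:

```python
# Made-up sample; computes the one-sample t-statistic from the
# formula above using only the standard library.
from math import sqrt
from statistics import mean, stdev

sample = [4.9, 5.1, 4.8, 5.2, 5.0, 4.7]
mu0 = 5.0                     # hypothesized population mean
n = len(sample)
x_bar = mean(sample)          # sample mean
s = stdev(sample)             # sample std. dev. (n - 1 in the denominator)
t = (x_bar - mu0) / (s / sqrt(n))
print(f"t = {t:.3f}")
```

Note that `statistics.stdev` uses the \(n - 1\) denominator required here; `statistics.pstdev` (the population version) would understate the standard error.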
23.2.2 Calculating Degrees of Freedom
The degrees of freedom for the one-sample t-test are calculated as \(n - 1\). This value is crucial for determining the critical values from the t-distribution, which are needed to assess the significance of the test statistic.
23.2.3 Interpretation
To decide whether to reject the null hypothesis, compare the calculated t-value to the critical t-value from the t-distribution at the desired significance level (\(\alpha\), often 0.05 for a 5% significance level). The decision rules are:
- If the absolute value of the calculated t-value is greater than the critical t-value, reject the null hypothesis.
- If the absolute value of the calculated t-value is less than or equal to the critical t-value, do not reject the null hypothesis.
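The decision rules above can be sketched as follows; the significance level, degrees of freedom, and t-value are illustrative numbers, and SciPy supplies the two-tailed critical value:

```python
# Illustrative critical-value decision rule for a two-tailed test.
from scipy import stats

alpha = 0.05          # significance level
df = 14               # degrees of freedom (n - 1)
t_calc = -2.32        # t-statistic computed from the sample (illustrative)

# Two-tailed critical value: the upper alpha/2 quantile of t(df)
t_crit = stats.t.ppf(1 - alpha / 2, df)
if abs(t_calc) > t_crit:
    print(f"|t| = {abs(t_calc):.2f} > {t_crit:.2f}: reject H0")
else:
    print(f"|t| = {abs(t_calc):.2f} <= {t_crit:.2f}: fail to reject H0")
```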
23.3 One-Sample T-Test Example problem
A bakery claims that its chocolate chip cookies weigh at least 60 grams on average. A quality control manager is skeptical of this claim and decides to test it. She randomly selects 15 cookies and finds the following weights in grams:
52, 55, 61, 54, 58, 59, 62, 53, 56, 57, 60, 59, 61, 64, 58

She decides to use a one-sample t-test to see if there is evidence that the average weight differs from the bakery's claim, choosing a two-tailed test so that a deviation in either direction can be detected. She sets the significance level at 0.05.
Hypotheses
- Null Hypothesis (\(H_0\)): \(\mu = 60\) grams. The average weight of the cookies is 60 grams.
- Alternative Hypothesis (\(H_1\)): \(\mu \neq 60\) grams. The average weight of the cookies is not 60 grams.
First, let’s calculate the sample mean (\(\bar{x}\)), sample standard deviation (\(s\)), and the t-statistic.
Calculate the Sample Mean (\(\bar{x}\)):
The sample size \(n\) is 15.
\[ \bar{x} = \frac{\sum \text{sample values}}{n} \]
\[ = \frac{52 + 55 + 61 + 54 + 58 + 59 + 62 + 53 + 56 + 57 + 60 + 59 + 61 + 64 + 58}{15} \]
\[ = \frac{869}{15} \approx 57.93 \text{ grams} \]
Calculate the Sample Standard Deviation (s):
To calculate \(s\), use the formula:
\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} \]
First, compute the deviations from the mean (using the unrounded mean \(\bar{x} = 57.9333\)), square each, and then sum them up:
- \((52 - 57.9333)^2 = 35.2044\)
- \((55 - 57.9333)^2 = 8.6044\)
- \((61 - 57.9333)^2 = 9.4044\)
- \((54 - 57.9333)^2 = 15.4711\)
- \((58 - 57.9333)^2 = 0.0044\)
- \((59 - 57.9333)^2 = 1.1378\)
- \((62 - 57.9333)^2 = 16.5378\)
- \((53 - 57.9333)^2 = 24.3378\)
- \((56 - 57.9333)^2 = 3.7378\)
- \((57 - 57.9333)^2 = 0.8711\)
- \((60 - 57.9333)^2 = 4.2711\)
- \((59 - 57.9333)^2 = 1.1378\)
- \((61 - 57.9333)^2 = 9.4044\)
- \((64 - 57.9333)^2 = 36.8044\)
- \((58 - 57.9333)^2 = 0.0044\)
Sum of squared deviations:
\[ \sum (x_i - \bar{x})^2 = 166.9333 \]
Now calculate \(s\):
\[ s = \sqrt{\frac{166.9333}{14}} = 3.45 \text{ grams} \]
Compute the T-Statistic:
Using the t-test formula:
\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{57.9333 - 60}{3.45 / \sqrt{15}} = -2.32 \]
Determine Degrees of Freedom:
\[ df = n - 1 = 15 - 1 = 14 \]
Calculate P-Value for a Two-Tailed Test:
Based on the t-statistic, look up or compute the p-value for \(|t| = 2.32\) with \(df = 14\). This value is approximately \(p = 0.036\).
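As a cross-check of the arithmetic above, the same quantities can be recomputed in Python; the mean, standard deviation, and t-statistic use only the standard library, while the two-tailed p-value comes from SciPy's t distribution:

```python
# Recompute the worked example: x_bar, s, t, and the two-tailed
# p-value for the 15 cookie weights.
from math import sqrt
from statistics import mean, stdev
from scipy import stats

weights = [52, 55, 61, 54, 58, 59, 62, 53, 56, 57, 60, 59, 61, 64, 58]
mu0, n = 60, len(weights)

x_bar = mean(weights)                  # sample mean
s = stdev(weights)                     # sample standard deviation
t = (x_bar - mu0) / (s / sqrt(n))      # t-statistic
p = 2 * stats.t.sf(abs(t), df=n - 1)   # two-tailed p-value
print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, t = {t:.2f}, p = {p:.3f}")
```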
Interpretation
T-Statistic: The negative value of the t-statistic (-2.32) indicates that the sample mean is less than the null hypothesis mean of 60 grams.
P-Value: The p-value of 0.036 is less than the chosen significance level of 0.05. This suggests that there is statistically significant evidence to reject the null hypothesis.
Therefore, based on the sample of 15 cookies, there is sufficient statistical evidence to conclude that the average weight of the bakery’s chocolate chip cookies is different from the claimed 60 grams.
Given the direction indicated by the t-statistic, it suggests that the cookies may, on average, weigh less than the claimed 60 grams.
23.3.1 One-Sample T-Test calculation using Excel
23.4 One-Sample T-Test calculation using R and Python
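In Python, the entire test is a single call to SciPy's `ttest_1samp`, which is two-tailed by default; applied to the cookie weights from the example:

```python
# One-sample t-test on the cookie weights against the claimed
# mean of 60 grams (two-tailed by default).
from scipy import stats

weights = [52, 55, 61, 54, 58, 59, 62, 53, 56, 57, 60, 59, 61, 64, 58]
result = stats.ttest_1samp(weights, popmean=60)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```

The equivalent call in R is `t.test(weights, mu = 60)`, which reports the same t-statistic and p-value along with a confidence interval for the mean.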
Summary
| Concept | Description |
|---|---|
| Foundations | |
| One-Sample t-test | Tests whether the mean of a single sample differs significantly from a known reference value |
| Benchmark or Reference Mean | The fixed value to which the sample mean is compared, derived from theory, claim, or benchmark |
| Assumptions | |
| Normality | The sampled values should be approximately normally distributed for the test to be exact |
| Independence | Each observation must be independent of the others |
| Interval Scale | Data must be measured on an interval or ratio scale |
| Hypotheses | |
| Null Hypothesis | States that the population mean equals the hypothesized reference value |
| Alternative Hypothesis | States that the population mean differs from the hypothesized reference value |
| One-Tailed vs Two-Tailed | Use a two-tailed test when deviation in either direction matters, one-tailed when only one direction is of interest |
| Computation | |
| t-statistic Formula | t equals the sample mean minus the reference mean divided by the standard error |
| Sample Standard Deviation | Estimate of population standard deviation computed with n minus 1 in the denominator |
| Degrees of Freedom (n-1) | Reference distribution parameter equal to the sample size minus one |
| Decision and Reporting | |
| p-value Decision Rule | Reject the null when the p-value falls below the chosen significance level |
| Confidence Interval | Interval estimate of the population mean complementing the test result |
| In R and Python | |
| R t.test() | Built-in R function that performs the one-sample t-test with optional mu argument |
| Python scipy.stats.ttest_1samp | Python function in SciPy that performs the one-sample t-test with an explicit popmean |