24 Two Sample t-test / Independent samples t-test
The independent samples t-test, also known as the two-sample t-test or Student's t-test, is a statistical procedure used to determine whether there is a significant difference between the means of two independent groups. It is commonly used to compare two groups, such as two different treatments or conditions, to see whether their means differ in a statistically significant way.
24.1 Assumptions
You would use an independent samples t-test under the following conditions:
- Independence of Samples: The two groups being compared must be independent, meaning the samples drawn from one group do not influence the samples from the other group.
- Normally Distributed Data: The data in the two groups should be roughly normally distributed.
- Equality of Variances: The variances of the two groups are assumed to be equal. If this assumption is significantly violated, a variation of the t-test, like Welch’s t-test, may be used instead.
24.2 Hypotheses
The hypotheses for an independent samples t-test are usually framed as follows:
- Null Hypothesis (H₀): The means of the two groups are equal (\(\mu_1 = \mu_2\)).
- Alternative Hypothesis (H₁): The means of the two groups are not equal (\(\mu_1 \neq \mu_2\)), which can be two-tailed, or one-tailed if the direction of the difference is specified.
24.3 Formula
The t-statistic is calculated using the following formula: \[ t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \] Where:
- \(\bar{X}_1\) and \(\bar{X}_2\) are the sample means of groups 1 and 2, respectively.
- \(n_1\) and \(n_2\) are the sample sizes of groups 1 and 2, respectively.
- \(s_p\) is the pooled standard deviation of the two samples, calculated as: \[ s_p = \sqrt{\frac{(n_1 - 1) \cdot s_1^2 + (n_2 - 1) \cdot s_2^2}{n_1 + n_2 - 2}} \]
- \(s_1^2\) and \(s_2^2\) are the variances of the two samples.
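The formulas above translate directly into code. The sketch below implements them in Python using only the standard library; `pooled_t` is a hypothetical helper name chosen for illustration:

```python
import math

def pooled_t(x, y):
    """Equal-variance two-sample t-statistic, following the formulas above."""
    n1, n2 = len(x), len(y)
    m1 = sum(x) / n1
    m2 = sum(y) / n2
    # Sample variances with the n - 1 denominator
    s1_sq = sum((v - m1) ** 2 for v in x) / (n1 - 1)
    s2_sq = sum((v - m2) ** 2 for v in y) / (n2 - 1)
    # Pooled standard deviation s_p: a df-weighted average of the variances
    sp = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    return (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))

t = pooled_t([85, 88, 90, 95, 78], [80, 83, 79, 92, 87])  # ≈ 0.811
```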
24.3.1 Calculating Degrees of Freedom
The degrees of freedom for this test are \(n_1 + n_2 - 2\).
24.3.2 Interpretation
To decide whether to reject the null hypothesis, compare the calculated t-value to the critical t-value from the t-distribution at the desired significance level (\(\alpha\), often 0.05 for a 5% significance level). The decision rules are:
- If the absolute value of the calculated t-value is greater than the critical t-value, reject the null hypothesis.
- If the absolute value of the calculated t-value is less than or equal to the critical t-value, do not reject the null hypothesis.
This test allows researchers to understand whether different conditions have a statistically significant impact on the means of the groups being compared, providing crucial insights in fields such as medicine, psychology, and economics.
24.4 Two-Sample t-test Example Problem
Suppose we want to determine if there is a significant difference in the average test scores between two classes. Class A has 5 students, and Class B has 5 students. Here are their test scores:
- Class A: 85, 88, 90, 95, 78
- Class B: 80, 83, 79, 92, 87
Hypotheses:
- Null Hypothesis (H₀): \(\mu_1 = \mu_2\) (The means of both classes are equal)
- Alternative Hypothesis (H₁): \(\mu_1 \neq \mu_2\) (The means of both classes are not equal)
We will use a significance level (\(\alpha\)) of 0.05.
To illustrate the mathematics behind the calculations performed for the independent samples t-test, let’s break down each step using the provided scores for Class A and Class B:
Calculate the means (\(\bar{X}_1\) and \(\bar{X}_2\)):
For Class A: \[ \bar{X}_1 = \frac{85 + 88 + 90 + 95 + 78}{5} = 87.2 \]
For Class B: \[ \bar{X}_2 = \frac{80 + 83 + 79 + 92 + 87}{5} = 84.2 \]
Calculate the sample variances (\(s_1^2\) and \(s_2^2\)):
For Class A: \[ s_1^2 = \frac{(85 - 87.2)^2 + (88 - 87.2)^2 + (90 - 87.2)^2 + (95 - 87.2)^2 + (78 - 87.2)^2}{4} \]
\[ s_1^2 = \frac{(-2.2)^2 + (0.8)^2 + (2.8)^2 + (7.8)^2 + (-9.2)^2}{4} \] \[ s_1^2 = \frac{4.84 + 0.64 + 7.84 + 60.84 + 84.64}{4} = 39.7 \]
For Class B: \[ s_2^2 = \frac{(80 - 84.2)^2 + (83 - 84.2)^2 + (79 - 84.2)^2 + (92 - 84.2)^2 + (87 - 84.2)^2}{4} \] \[ s_2^2 = \frac{(-4.2)^2 + (-1.2)^2 + (-5.2)^2 + (7.8)^2 + (2.8)^2}{4} \] \[ s_2^2 = \frac{17.64 + 1.44 + 27.04 + 60.84 + 7.84}{4} = 28.7 \]
Calculate the pooled variance (\(s_p^2\)):
\[ s_p^2 = \frac{(4 \times 39.7) + (4 \times 28.7)}{8} \] \[ s_p^2 = \frac{158.8 + 114.8}{8} = 34.2 \]
Calculate the t-statistic:
\[ t = \frac{87.2 - 84.2}{\sqrt{34.2} \cdot \sqrt{\frac{1}{5} + \frac{1}{5}}} \] \[ t = \frac{3}{\sqrt{34.2} \cdot \sqrt{0.4}} \] \[ t = \frac{3}{5.848 \cdot 0.6325} \approx 0.8111 \]
Degrees of freedom:
\[ \text{df} = 5 + 5 - 2 = 8 \]
The critical t-value and p-value:
The t-value needs to be compared against the critical value from a t-distribution table for df = 8 and a two-tailed test with \(\alpha = 0.05\).
If \(|t| > 2.306\), the null hypothesis is rejected.
In this case, \(|t| = 0.8111 < 2.306\), so the null hypothesis is not rejected: the data do not provide evidence of a difference between the two classes' mean scores.
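The intermediate quantities in the hand calculation can be double-checked with Python's standard library; `statistics.variance` uses the same \(n - 1\) denominator as the formulas above:

```python
import statistics as st

class_a = [85, 88, 90, 95, 78]
class_b = [80, 83, 79, 92, 87]

m1, m2 = st.mean(class_a), st.mean(class_b)                # 87.2, 84.2
s1_sq, s2_sq = st.variance(class_a), st.variance(class_b)  # 39.7, 28.7

n1, n2 = len(class_a), len(class_b)
# Pooled variance: each sample variance weighted by its degrees of freedom
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)  # 34.2
```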
24.4.1 Two-Sample t-test Calculation Using Excel:
In Excel, the built-in `T.TEST` function returns the p-value directly. With the Class A scores in cells A1:A5 and the Class B scores in B1:B5, `=T.TEST(A1:A5, B1:B5, 2, 2)` gives the two-tailed p-value (the third argument selects two tails; the fourth, type 2, selects the two-sample equal-variance test).
24.5 Two-Sample T-Test calculation using R and Python:
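A minimal sketch for the example above, using SciPy in Python:

```python
from scipy import stats

class_a = [85, 88, 90, 95, 78]
class_b = [80, 83, 79, 92, 87]

# equal_var=True gives the classic pooled (Student's) form;
# equal_var=False would apply Welch's correction instead
result = stats.ttest_ind(class_a, class_b, equal_var=True)
print(round(result.statistic, 4))  # 0.8111
print(result.pvalue > 0.05)        # True: fail to reject H0
```

In R, `t.test(class_a, class_b, var.equal = TRUE)` produces the same t-statistic and p-value; dropping `var.equal` switches to Welch's form.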
24.6 Example Research Articles on Independent samples t-test:
- Analysis of Job Satisfaction and Turnover Intention According to the Characteristics of Forest Industry Workers — Forests, 2024.
- Factors Affecting Remote Workers’ Job Satisfaction in Utah: An Exploratory Study — International Journal of Environmental Research and Public Health, 2023.
- Job Satisfaction, Perceived Performance and Work Regime: What Is the Relationship Between These Variables? — Administrative Sciences, 2025.
- Customer orientation, open innovation and enterprise performance: Evidence from Ethiopian SMEs — Cogent Business & Management, 2024.
Summary
| Concept | Description |
|---|---|
| Foundations | |
| Independent-Samples t-test | A parametric test that compares the means of two independent groups to detect a significant difference |
| Two Independent Groups | The observations in group 1 do not overlap with the observations in group 2 |
| Assumptions | |
| Independence of Samples | Each observation is collected independently within and between groups |
| Normality | The values in each group are approximately normally distributed, especially for small samples |
| Equality of Variances | The variances of the two groups are assumed equal unless Welch's correction is applied |
| Hypotheses | |
| Null Hypothesis | States that the two population means are equal |
| Alternative Hypothesis | States that the two population means differ |
| Two-tailed vs One-tailed | Two-tailed tests any difference; one-tailed tests a specific direction stated in advance |
| Computation | |
| Sample Means | The arithmetic averages of the two samples, which drive the numerator of the t-statistic |
| Pooled Standard Deviation | A weighted average of the two sample variances, used when equal variances are assumed |
| t-Statistic | The ratio of the difference in means to the standard error of that difference |
| Degrees of Freedom | Equals \(n_1 + n_2 - 2\) for the equal-variances form |
| Decision Rules | |
| Critical Value Rule | Reject H0 when the absolute t-statistic exceeds the critical value for the chosen alpha |
| p-value Rule | Reject H0 when the p-value is below the chosen alpha, which is equivalent to the critical-value rule |
| In R and Python | |
| Welch's t-test | A variant used when variances are clearly unequal, with an adjusted degrees-of-freedom formula |
| R via t.test() | Use t.test(x, y, var.equal = TRUE) for the classic form and drop var.equal for Welch's correction |
| Python via ttest_ind() | Use scipy.stats.ttest_ind(x, y, equal_var=True) and set equal_var=False for Welch's correction |
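The last rows of the table can be sketched with hypothetical data: when group sizes and spreads differ, the pooled and Welch forms give different t-statistics (with equal group sizes the two statistics coincide and only the degrees of freedom differ):

```python
from scipy import stats

# Hypothetical groups with different sizes and clearly different spreads
x = [10, 12, 11, 13, 12, 14]
y = [9, 15, 7, 16, 8, 14, 10, 17, 6, 13, 18, 5]

pooled = stats.ttest_ind(x, y, equal_var=True)   # classic Student form
welch = stats.ttest_ind(x, y, equal_var=False)   # Welch's correction

# The two standard errors weight the sample variances differently,
# so the t-statistics (and p-values) no longer agree
print(pooled.statistic != welch.statistic)  # True
```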