25 Paired Samples t-test

The paired samples t-test, also known as the dependent samples t-test or repeated measures t-test, is a statistical method for comparing the means of two related sets of measurements. It applies when the data consist of matched pairs of similar units, or when the same unit is measured at two different times or under two different conditions.

25.1 Key Features and Applications

The paired sample t-test is commonly used in situations such as:

Comparing the before and after effects of a treatment on the same subjects.
Measuring performance on two different occasions.
Comparing two different treatments on the same subjects in a crossover study.

25.2 Assumptions

To properly conduct a paired sample t-test, the data must meet the following assumptions:

Paired Data: The observations are collected in pairs, such as pre-test and post-test measurements or measurements of the same subjects under two different conditions.
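Normality: The differences between the paired observations (not the raw measurements themselves) should be approximately normally distributed. This assumption can be checked with plots, such as a histogram or Q-Q plot of the differences, or with a normality test such as the Shapiro-Wilk test.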
Scale of Measurement: The variable being tested should be continuous and measured at least at the interval level.
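As a minimal sketch of the normality check, the snippet below applies SciPy's `scipy.stats.shapiro` to the paired differences. The `pre` and `post` arrays are made-up illustrative values, not data from this chapter.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements (illustrative values only)
pre = np.array([12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8, 13.1, 12.4, 15.0])
post = np.array([11.4, 13.9, 11.1, 14.1, 13.5, 12.0, 14.2, 12.8, 11.6, 14.3])

# The normality assumption concerns the differences, not pre or post themselves
differences = pre - post

# Shapiro-Wilk test: H0 is that the differences come from a normal distribution
w_stat, p_value = stats.shapiro(differences)
print(f"W = {w_stat:.3f}, p = {p_value:.3f}")

# A small p-value (for example, below 0.05) is evidence against normality
normality_plausible = p_value > 0.05
```

With only a handful of pairs the test has little power to detect non-normality, so it is usually worth looking at a plot of the differences as well.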

25.3 Hypotheses

The hypotheses for a paired sample t-test are as follows:

Null Hypothesis (H₀): The mean difference between the paired observations is zero (\(\mu_d = 0\)).
Alternative Hypothesis (H₁): The mean difference between the paired observations is not zero (\(\mu_d \neq 0\)). This can be tailored to a one-tailed test if a specific direction is hypothesized (\(\mu_d > 0\) or \(\mu_d < 0\)).
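The choice between a two-tailed and a one-tailed alternative can be sketched in Python as follows. This assumes SciPy 1.6 or later, where `scipy.stats.ttest_rel` accepts an `alternative` argument; the scores are hypothetical.

```python
from scipy import stats

# Hypothetical paired scores (illustrative values only)
first = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9]
second = [11.4, 13.9, 11.1, 14.1, 13.5, 12.0]

# Two-tailed test: H1 is mu_d != 0 (the default)
two_sided = stats.ttest_rel(first, second)

# One-tailed test: H1 is mu_d > 0, where d = first - second
one_sided = stats.ttest_rel(first, second, alternative="greater")

print(f"two-sided p = {two_sided.pvalue:.4f}")
print(f"one-sided p = {one_sided.pvalue:.4f}")
```

The t-statistic is the same in both cases; only the p-value changes. When the observed mean difference lies in the hypothesized direction, the one-sided p-value is half the two-sided one.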

25.3.1 Formulae

Mean Difference (\(\bar{d}\)):

The mean difference is calculated by taking the average of the differences between all paired observations. \[ \bar{d} = \frac{1}{n} \sum_{i=1}^n (x_{i1} - x_{i2}) \]

Where \(x_{i1}\) and \(x_{i2}\) are the measurements from the first and second condition for the ith pair, and \(n\) is the number of pairs.

Standard Deviation of the Differences (\(s_d\)):

This measures the variability of the differences between the paired observations. \[ s_d = \sqrt{\frac{\sum_{i=1}^n (d_i - \bar{d})^2}{n-1}} \] Here, \(d_i = x_{i1} - x_{i2}\) represents the difference for each pair.

t-Statistic:

The t-statistic is calculated to determine if the differences are statistically significant. \[ t = \frac{\bar{d}}{s_d / \sqrt{n}} \] This formula represents the ratio of the mean difference to the standard error of the difference.
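The three formulae above translate directly into code. The function below is a sketch using only the Python standard library; the name `paired_t_statistic` and the sample values are illustrative assumptions.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (d_bar, s_d, t) for two equal-length sequences of paired values."""
    if len(first) != len(second):
        raise ValueError("both conditions must have the same number of observations")
    diffs = [a - b for a, b in zip(first, second)]  # d_i = x_i1 - x_i2
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are needed")
    d_bar = mean(diffs)               # mean difference
    s_d = stdev(diffs)                # sample standard deviation (n - 1 denominator)
    t = d_bar / (s_d / math.sqrt(n))  # mean difference over its standard error
    return d_bar, s_d, t


# Hypothetical data: differences are [2, 1, 0, 3]
d_bar, s_d, t_stat = paired_t_statistic([10, 12, 9, 11], [8, 11, 9, 8])
print(d_bar, round(s_d, 3), round(t_stat, 3))
```

If every difference is identical, \(s_d\) is zero and the t-statistic is undefined, so the division in this sketch would fail.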

25.3.2 Degrees of Freedom

The degrees of freedom for the paired sample t-test are \(n - 1\), where \(n\) is the number of pairs.

25.3.3 Interpretation

To decide whether to reject the null hypothesis, compare the calculated t-value with the critical t-value from the t-distribution with \(n - 1\) degrees of freedom at the chosen significance level (\(\alpha\)), typically 0.05. For a two-tailed test, if the absolute value of the t-statistic exceeds the critical value (equivalently, if the p-value is below \(\alpha\)), the null hypothesis is rejected, suggesting a statistically significant mean difference between the paired measurements.
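The decision rule can be sketched with SciPy's t-distribution functions `scipy.stats.t.ppf` (quantile) and `scipy.stats.t.sf` (upper-tail probability). The helper name `paired_t_decision` and the input values are hypothetical.

```python
from scipy import stats


def paired_t_decision(t_stat, n_pairs, alpha=0.05):
    """Two-tailed decision for a paired t-test with n_pairs pairs."""
    df = n_pairs - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df)    # two-tailed critical value
    p_value = 2 * stats.t.sf(abs(t_stat), df)  # two-tailed p-value
    reject = abs(t_stat) > t_crit              # same outcome as p_value < alpha
    return t_crit, p_value, reject


# Hypothetical result: t = 2.5 from 10 pairs
t_crit, p_value, reject = paired_t_decision(2.5, 10)
print(f"critical t = {t_crit:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Comparing \(|t|\) with the critical value and comparing the p-value with \(\alpha\) always lead to the same decision; the p-value additionally shows how strong the evidence is.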

This test is particularly valuable for detecting changes in conditions or treatments when the same subjects are observed under both scenarios, as it effectively accounts for variability between subjects.
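To illustrate how pairing accounts for between-subject variability, the sketch below analyses the same hypothetical data twice: once correctly as paired data with `scipy.stats.ttest_rel`, and once as if the two columns were independent samples with `scipy.stats.ttest_ind`. The values are invented so that subjects differ widely from one another while each subject changes only slightly.

```python
from scipy import stats

# Hypothetical data: large differences between subjects, small change within each
first = [50, 60, 70, 80, 90]
second = [48, 59, 67, 78, 88]

paired = stats.ttest_rel(first, second)    # uses within-subject differences
unpaired = stats.ttest_ind(first, second)  # ignores the pairing

print(f"paired:   t = {paired.statistic:.2f}, p = {paired.pvalue:.4f}")
print(f"unpaired: t = {unpaired.statistic:.2f}, p = {unpaired.pvalue:.4f}")
```

The mean difference is the same in both analyses, but the unpaired test divides it by a standard error inflated by the spread between subjects, so it yields a much smaller t-statistic.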

25.4 Paired Samples t-test: Example Problem

A nutritionist wants to test the effectiveness of a new diet program. To do this, they measure the weight of 5 participants before starting the program and again after 6 weeks on the program. The goal is to see if there is a significant change in weight due to the diet.

Participant Weights (kg) Before the Diet: 70, 72, 75, 80, 78
Participant Weights (kg) After the Diet: 68, 70, 74, 77, 76

Hypotheses:

Null Hypothesis (H₀): The mean difference in weight before and after the diet is zero. (\(\mu_d = 0\))
Alternative Hypothesis (H₁): The mean difference in weight before and after the diet is not zero. (\(\mu_d \neq 0\))

Significance Level:

We will use a significance level (\(\alpha\)) of 0.05.

The steps below work through the paired samples t-test for this example, using the weights recorded before and after the diet.

Calculate the differences for each participant (before minus after):

\[ \begin{aligned} d_1 & = 70 - 68 = 2 \\ d_2 & = 72 - 70 = 2 \\ d_3 & = 75 - 74 = 1 \\ d_4 & = 80 - 77 = 3 \\ d_5 & = 78 - 76 = 2 \end{aligned} \]

Differences: \(d = [2, 2, 1, 3, 2]\)

Calculate the mean difference (\(\bar{d}\)):

\[ \bar{d} = \frac{2 + 2 + 1 + 3 + 2}{5} = \frac{10}{5} = 2 \text{ kg} \]

Calculate the standard deviation of the differences (\(s_d\)):

First, calculate the squared deviations from the mean:

\[ \begin{aligned} (2 - 2)^2 & = 0 \\ (2 - 2)^2 & = 0 \\ (1 - 2)^2 & = 1 \\ (3 - 2)^2 & = 1 \\ (2 - 2)^2 & = 0 \end{aligned} \]

Sum of squared deviations:

\[ 0 + 0 + 1 + 1 + 0 = 2 \]

Now, calculate \(s_d\):

\[ s_d = \sqrt{\frac{2}{4}} = \sqrt{0.5} \approx 0.707 \text{ kg} \]

Calculate the t-statistic:

Use the formula for the t-statistic with \(n = 5\) (number of participants), keeping four decimal places in the intermediate values to limit rounding error: \[ t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{2}{0.7071 / \sqrt{5}} = \frac{2}{0.7071 / 2.2361} = \frac{2}{0.3162} \approx 6.325 \]

Degrees of freedom (\(df\)):

\[ df = n - 1 = 5 - 1 = 4 \]
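The hand calculation so far can be checked with a few lines of NumPy. This is a sketch that mirrors the steps above using the weights from the example; the variable names are illustrative.

```python
import numpy as np

# Weights (kg) from the example
before = np.array([70, 72, 75, 80, 78])
after = np.array([68, 70, 74, 77, 76])

d = before - after          # differences for each participant
n = d.size                  # number of pairs
d_bar = d.mean()            # mean difference
s_d = d.std(ddof=1)         # sample standard deviation (n - 1 denominator)
se = s_d / np.sqrt(n)       # standard error of the mean difference
t_stat = d_bar / se         # t-statistic
df = n - 1                  # degrees of freedom

print(f"d = {d.tolist()}, mean = {d_bar}, s_d = {s_d:.3f}")
print(f"t = {t_stat:.2f}, df = {df}")
```

Note the `ddof=1` argument: NumPy's `std` divides by \(n\) by default, whereas the formula for \(s_d\) divides by \(n - 1\).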

Compare the calculated t-statistic to the critical t-value:

The critical t-value for \(df = 4\) and a two-tailed test with \(\alpha = 0.05\) is approximately 2.776 (from t-distribution tables).
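Instead of a printed table, the critical value can be obtained from SciPy's t-distribution, along with the exact two-tailed p-value. The sketch below starts from the summary values computed in the steps above (\(\bar{d} = 2\), \(s_d = \sqrt{0.5}\), \(n = 5\)).

```python
import math
from scipy import stats

alpha = 0.05
d_bar, s_d, n = 2.0, math.sqrt(0.5), 5   # summary values from the worked example

t_stat = d_bar / (s_d / math.sqrt(n))
df = n - 1

t_crit = stats.t.ppf(1 - alpha / 2, df)     # two-tailed critical value
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-tailed p-value

print(f"t = {t_stat:.3f}, critical t = {t_crit:.3f}, p = {p_value:.4f}")
```

The p-value comes out well below 0.05, which agrees with the comparison against the critical value.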

25.4.1 Interpretation

Since the absolute value of the calculated t-statistic (approximately 6.32) exceeds the critical t-value (2.776), we reject the null hypothesis. The mean weight after the diet is significantly lower than before, by 2 kg on average. With only 5 participants and no control group, this result supports the effectiveness of the nutritionist’s program but does not by itself confirm that the diet caused the weight loss.

25.5 Paired Samples t-test Calculation Using R and Python

Python
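A minimal sketch of the diet example in Python, assuming SciPy is installed. It uses `scipy.stats.ttest_rel`, which runs the paired t-test on two arrays of equal length; the variable names are illustrative.

```python
from scipy import stats

# Weights (kg) from the example
before = [70, 72, 75, 80, 78]
after = [68, 70, 74, 77, 76]

# Paired t-test on the differences before - after (two-tailed by default)
result = stats.ttest_rel(before, after)

alpha = 0.05
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
if result.pvalue < alpha:
    print("Reject H0: the mean weight change differs from zero.")
else:
    print("Fail to reject H0.")
```

The argument order determines the sign of the t-statistic: passing `before` first gives a positive value when weights decrease. The equivalent call in R is `t.test(before, after, paired = TRUE)`.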

25.6 Example Research Articles on Paired t-test:

How to maximize the impact of workplace training: a mixed-method analysis of social support, training transfer and knowledge sharing. European Journal of Work and Organizational Psychology, 2024.

Summary

Concept	Description
Foundations
Paired-Samples t-test	A parametric test that compares two related means by analysing within-pair differences
Matched Pairs Design	The same units or matched pairs appear in both conditions, creating natural pairings
Before-and-After Studies	A common use case where the same subjects are measured before and after an intervention
Assumptions
Paired Data	Observations come as pairs rather than as two independent samples
Normality of Differences	The differences between paired observations should be approximately normally distributed
Interval or Ratio Scale	The measured variable must be continuous and on at least an interval scale
Hypotheses
Null Hypothesis	States that the mean of the paired differences is zero
Alternative Hypothesis	States that the mean of the paired differences is not zero, or is in a specified direction
Computation
Mean Difference	The average of all paired differences, which drives the numerator of the t-statistic
Standard Deviation of Differences	Measures variability among the paired differences and drives the denominator of the t-statistic
t-Statistic	The ratio of the mean difference to the standard error of the mean difference
Degrees of Freedom	Equals n minus 1, where n is the number of pairs
Interpretation
Decision Rule	Reject H₀ when the absolute t-statistic exceeds the critical value or, equivalently, when the p-value is below alpha
Removes Between-Subject Variability	Pairing cancels out stable differences between units, boosting sensitivity compared with an unpaired design
In R and Python
R via t.test(paired = TRUE)	Use t.test(x, y, paired = TRUE) to run the paired t-test on two vectors of equal length
Python via ttest_rel()	Use scipy.stats.ttest_rel(x, y) to run the paired t-test on two arrays of equal length