36 Friedman Tests and Related Post-hoc Tests

The Friedman test, developed by Milton Friedman (1937), is a non-parametric statistical test used to detect differences in treatments across multiple test attempts. It is the non-parametric alternative to the one-way ANOVA with repeated measures and is suitable for experiments where the data violate the assumption of normality required for ANOVA. It is commonly used when the measurement variable is ordinal or when interval-scale measurements fail the normality assumption.

36.1 Assumptions

The Friedman test operates under several key assumptions:

  1. Dependent Samples: The groups are related or matched; typically, measurements are taken from the same subjects at different times or under different conditions.
  2. Ordinal or Continuous Data: The data should be at least ordinal, though they can also be continuous.
  3. Same Number of Observations: Each subject is measured the same number of times, ensuring that the data matrix is balanced.

36.2 Hypotheses

The hypotheses for the Friedman test can be framed as:

  • Null Hypothesis (H₀): The distributions of the treatments are identical across repeated measures.
  • Alternative Hypothesis (H₁): At least one treatment yields a different distribution.

36.3 Formula

The Friedman test statistic \(\chi^2_F\) is calculated as follows:

$$\chi^2_F = \frac{12}{nk(k+1)} \sum_{j=1}^{k} R_j^2 - 3n(k+1)$$

Where:

  • \(n\) is the number of subjects.
  • \(k\) is the number of conditions or treatments.
  • \(R_j\) is the sum of ranks for each treatment across all subjects.

36.3.1 Calculation Steps

  1. Rank each row (or block) of observations separately from 1 to \(k\), treating ties by assigning average ranks.
  2. Calculate the sum of ranks for each treatment across all subjects.
  3. Compute the test statistic using the formula above.
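The three steps above can be sketched in Python (a minimal sketch using NumPy and SciPy's rankdata; the subject scores below are hypothetical and chosen to be tie-free, since this formula omits the tie correction that library implementations apply):

```python
import numpy as np
from scipy.stats import rankdata

def friedman_statistic(data):
    """Friedman chi-square statistic for an (n subjects) x (k treatments) table."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    # Step 1: rank each subject's row from 1 to k (ties would get average ranks).
    ranks = np.apply_along_axis(rankdata, 1, data)
    # Step 2: sum the ranks for each treatment across all subjects.
    R = ranks.sum(axis=0)
    # Step 3: plug the rank sums into the Friedman formula.
    return 12.0 / (n * k * (k + 1)) * np.sum(R**2) - 3.0 * n * (k + 1)

# Hypothetical scores: 4 subjects, each measured under 3 conditions.
data = [[7.1, 9.5, 8.2],
        [6.0, 7.8, 7.1],
        [5.5, 6.9, 6.2],
        [7.3, 8.4, 6.8]]
print(friedman_statistic(data))  # → 6.5
```

With no ties in any row, this hand-rolled statistic agrees exactly with scipy.stats.friedmanchisquare on the same columns.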

36.3.2 Interpretation

A significant \(\chi^2_F\) statistic indicates that at least one of the treatments significantly differs from the others. The value is compared against critical values from the chi-square distribution with \(k-1\) degrees of freedom. If \(\chi^2_F\) is greater than the critical value for a given level of significance (\(\alpha\)), the null hypothesis is rejected.
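This decision rule can be checked numerically (a sketch using scipy.stats.chi2; the statistic value here is a hypothetical placeholder):

```python
from scipy.stats import chi2

chi2_F = 8.0   # hypothetical Friedman statistic
k = 3          # number of treatments
alpha = 0.05

critical = chi2.ppf(1 - alpha, df=k - 1)  # critical value, ~5.99 for df = 2
p_value = chi2.sf(chi2_F, df=k - 1)       # right-tail p-value

print(f"critical value = {critical:.3f}, p = {p_value:.4f}")
if chi2_F > critical:
    print("Reject H0: at least one treatment differs.")
else:
    print("Fail to reject H0.")
```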

36.4 Example Problem

Imagine a study examining the effect of three different diets on weight loss. Each of four participants tries each diet for one month, and their weight loss is recorded:

  • Diet A: 5, 4, 6, 5
  • Diet B: 4, 3, 5, 4
  • Diet C: 6, 5, 7, 6

Hypotheses:

  • Null Hypothesis (H₀): All three diets lead to the same distribution of weight loss.
  • Alternative Hypothesis (H₁): At least one diet leads to a different distribution of weight loss.
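Working the steps through by hand with \(n = 4\) and \(k = 3\): within every participant, Diet B receives rank 1, Diet A rank 2, and Diet C rank 3, giving rank sums \(R_A = 8\), \(R_B = 4\), and \(R_C = 12\). Substituting into the formula:

$$\chi^2_F = \frac{12}{4 \cdot 3 \cdot 4}\left(8^2 + 4^2 + 12^2\right) - 3 \cdot 4 \cdot 4 = 56 - 48 = 8$$

With \(k - 1 = 2\) degrees of freedom, 8 exceeds the 5% critical value of 5.99, so \(H_0\) is rejected: the diets do not produce the same distribution of weight loss.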

36.4.1 Friedman Test using Excel

📥 Stats Basics (Excel)

36.5 Friedman Test using R and Python

This test is crucial in fields like medicine, psychology, and agronomy where multiple treatments are compared under non-normal conditions and repeated measures are common.
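As a sketch of the Python route, SciPy's friedmanchisquare can be applied directly to the diet data from the example above; the R analogue is friedman.test on a subjects-by-conditions matrix:

```python
from scipy.stats import friedmanchisquare

# Each argument is one diet's measurements across the same 4 participants.
diet_a = [5, 4, 6, 5]
diet_b = [4, 3, 5, 4]
diet_c = [6, 5, 7, 6]

stat, p = friedmanchisquare(diet_a, diet_b, diet_c)
print(f"chi2_F = {stat:.2f}, p = {p:.4f}")  # chi2_F = 8.00, p = 0.0183
```

Since p < 0.05, the null hypothesis is rejected, matching the hand calculation; a post-hoc test such as Nemenyi or Conover would then identify which diets differ.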


Summary

Foundations

  • Friedman Test: A non-parametric test for differences across three or more related conditions on the same subjects.
  • Non-parametric Counterpart of RM ANOVA: A stand-in for repeated-measures ANOVA when normality or interval-scale assumptions fail.
  • Ranked Within Blocks: Differences are detected by ranking values within each subject rather than across the whole dataset.

Assumptions

  • Dependent Samples: Each subject is measured under every condition, creating natural blocks.
  • Ordinal or Continuous Outcome: The outcome must be at least ordinal so within-subject ranks can be assigned.
  • Balanced Block Design: Each subject contributes exactly one observation per condition, producing a complete data matrix.

Hypotheses

  • Null Hypothesis: All conditions have the same distribution within blocks.
  • Alternative Hypothesis: At least one condition tends to produce systematically higher or lower ranks.

Computation

  • Within-Block Ranking: Within each subject, observations are ranked from 1 to \(k\) across the \(k\) conditions.
  • Sum of Ranks Per Treatment: Ranks are summed within each treatment column to produce \(R_j\), the rank sum for treatment \(j\).
  • Friedman Chi-square Statistic: \(\chi^2_F\) is computed from the rank sums, the number of subjects, and the number of conditions.
  • Chi-square Approximation: Under \(H_0\), the statistic approximately follows a chi-square distribution with \(k-1\) degrees of freedom.

Decision and Follow-up

  • Decision Rule: Reject \(H_0\) when \(\chi^2_F\) exceeds the critical value or when \(p < \alpha\).
  • Post-hoc Tests (Nemenyi, Conover): Follow a significant Friedman test with pairwise rank-based comparisons such as Nemenyi or Conover.

In R and Python

  • R: Use friedman.test(matrix), where rows are subjects and columns are conditions.
  • Python: Use scipy.stats.friedmanchisquare(c1, c2, ..., ck) on equal-length sequences.