18 Cochran’s Q Test and Post-hoc Tests

Cochran’s Q Test

Cochran’s Q Test (William G. Cochran, 1950) is a non-parametric statistical test used to determine whether the proportion of a binary outcome differs across three or more related groups or conditions. It extends the McNemar test to more than two related groups and is commonly used for repeated measures where the response variable is dichotomous. The test is useful for studies in which the same subjects are measured under different conditions, such as different time points or different treatments.

18.1 Understanding Cochran’s Q Test:

1. Null and Alternative Hypotheses:

Null Hypothesis (H0): The proportion of successes (the binary outcome of interest) is the same in every group or condition.
Alternative Hypothesis (H1): At least one condition has a proportion of successes that differs from the others.
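
The same pair of hypotheses can be written in symbols. The notation \(\pi_j\), for the probability of a success under condition \(j\), is introduced here only for illustration and is not used elsewhere in the chapter:

\[ H_0: \pi_1 = \pi_2 = \cdots = \pi_k \qquad \text{versus} \qquad H_1: \pi_i \neq \pi_j \ \text{for at least one pair } i \neq j \]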

2. Test Statistic:

The Cochran’s Q test statistic is built from two sets of totals: the number of successes in each condition (the column totals) and the number of successes recorded for each subject across all conditions (the row totals).
Under the null hypothesis, the test statistic approximately follows a chi-squared distribution with \(k - 1\) degrees of freedom, where \(k\) is the number of related groups or conditions; the approximation improves as the number of subjects grows.

3. Calculation of Test Statistic:

Let \(n\) be the number of subjects and \(k\) the number of conditions. The Cochran’s Q test statistic compares the variability of the condition totals with the variability expected if every condition had the same probability of success.
The formula for Cochran’s Q test statistic is: \[ Q = (k-1)\;\frac{k\sum T_j^2 - (\sum T_j)^2}{k\sum t_i - \sum t_i^2} \] where \(T_j\) is the total number of times the characteristic appears in the \(j\)-th condition and \(t_i\) is the total number of times the characteristic appears for the \(i\)-th subject. Note that \(\sum T_j = \sum t_i\), since both count every success in the table.
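
As a minimal sketch, the statistic can be computed directly from a table of 0/1 responses, using the denominator \(k\sum t_i - \sum t_i^2\). The function name `cochrans_q` and the toy data are made up for illustration; they are not taken from any library or from the example later in this chapter.

```python
def cochrans_q(rows):
    """Return (Q, df) for a table of 0/1 responses.

    `rows` holds one list per subject, with one 0/1 entry per condition.
    """
    k = len(rows[0])                                          # number of conditions
    col_totals = [sum(r[j] for r in rows) for j in range(k)]  # T_j
    row_totals = [sum(r) for r in rows]                       # t_i
    grand_total = sum(row_totals)                             # equals the sum of T_j

    numerator = (k - 1) * (k * sum(T * T for T in col_totals) - grand_total ** 2)
    denominator = k * grand_total - sum(t * t for t in row_totals)
    if denominator == 0:
        # Happens when every subject gives the same response in all conditions.
        raise ValueError("Q is undefined: no subject's responses vary across conditions")
    return numerator / denominator, k - 1


# Made-up toy data (4 subjects, 3 conditions), for illustration only.
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
q, df = cochrans_q(toy)
print(f"Q = {q:.4f}, df = {df}")   # Q = 2.6667, df = 2
```

Here the column totals are 3, 3 and 1 and the row totals are 2, 1, 3 and 1, so \(Q = 2 \times (57 - 49) / (21 - 15) = 16/6\).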

4. Interpretation of Results:

If the calculated \(Q\) value exceeds the critical value of the chi-squared distribution with \(k - 1\) degrees of freedom at the chosen significance level (commonly \(\alpha = 0.05\)), or equivalently if the p-value is below \(\alpha\), the null hypothesis is rejected, indicating that the proportions differ significantly across conditions.
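
The decision rule can be sketched with `scipy.stats.chi2`. The values of `q_stat` and `k` below are the ones that arise in the worked example in Section 18.3; the variable names are chosen here for illustration.

```python
from scipy.stats import chi2

alpha = 0.05      # significance level
k = 3             # number of conditions
q_stat = 7.0      # value of Q to be assessed

df = k - 1
critical_value = chi2.ppf(1 - alpha, df)   # upper-alpha quantile of chi-squared(df)
p_value = chi2.sf(q_stat, df)              # P(chi-squared(df) >= q_stat)

reject_h0 = q_stat > critical_value        # equivalent to p_value < alpha
print(f"critical value = {critical_value:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing \(Q\) with the critical value and comparing the p-value with \(\alpha\) always lead to the same decision; reporting the p-value is usually more informative.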

18.2 Applications of Cochran’s Q Test:

A. Medical Research:

In clinical trials, the Cochran’s Q test is used to assess the consistency of treatment effects observed at different time points or under different conditions within the same group of patients.

B. Psychology:

Psychologists may apply the Cochran’s Q test to evaluate the consistency of binary responses (like success or failure) across repeated measures or different experimental conditions.

C. Quality Control:

In industrial settings, the Cochran’s Q test can be used to compare the pass/fail rates of products or processes across different shifts or batches.

Considerations:

Cochran’s Q test assumes that the subjects are independent of one another, while the observations taken on the same subject under the different conditions are related (matched).
The test is only applicable to binary (dichotomous) outcomes.
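The chi-squared approximation behind Cochran’s Q test can be unreliable, and the test can lack power, when the sample size is small; exact or permutation versions of the test should be considered in such cases. Subjects who give the same response under every condition contribute nothing to \(Q\), so the effective sample size may be smaller than the number of subjects.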
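
One possible alternative for small samples is a permutation version of the test. The sketch below is an assumption about how such a check might look, not a method prescribed by this chapter: under the null hypothesis each subject's responses can be shuffled across conditions, so the observed \(Q\) is compared with the \(Q\) values of many shuffled tables. The function names, the number of permutations and the seed are all illustrative choices; the data are those of the example in Section 18.3.

```python
import random


def cochrans_q(rows):
    """Cochran's Q for a list of per-subject 0/1 response lists."""
    k = len(rows[0])
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    total = sum(row_totals)
    num = (k - 1) * (k * sum(T * T for T in col_totals) - total ** 2)
    den = k * total - sum(t * t for t in row_totals)
    if den == 0:
        raise ValueError("Q is undefined: no subject's responses vary across conditions")
    return num / den


def permutation_p_value(rows, n_perm=5000, seed=0):
    """Monte Carlo p-value obtained by shuffling responses within each subject."""
    rng = random.Random(seed)
    q_obs = cochrans_q(rows)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(r, len(r)) for r in rows]   # within-subject shuffle
        if cochrans_q(shuffled) >= q_obs - 1e-12:          # tolerance for float ties
            hits += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return q_obs, (hits + 1) / (n_perm + 1)


# Testers 1..10 from Section 18.3; columns are Version A, B, C.
data = [
    [1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0],
    [1, 1, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1], [0, 1, 1],
]
q_obs, p_perm = permutation_p_value(data)
print(f"Q = {q_obs:.3f}, permutation p = {p_perm:.4f}")
```

Shuffling within a subject leaves the row totals unchanged, so only the column totals, and therefore the numerator of \(Q\), vary from one permutation to the next.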

In summary, Cochran’s Q test offers a robust method for analyzing differences in binary outcomes across more than two related groups or conditions. It is especially valuable for repeated-measures designs in which the same subjects are exposed to different conditions, allowing researchers to investigate the consistency of an effect or response across those conditions.

18.3 Example Problem: Cochran’s Q test

A software company wants to test the reliability of three versions of a software application (Version A, Version B, and Version C) under the same conditions. Ten testers each test all three versions for reliability. The outcome is binary: Pass (if the software version works reliably during the test) or Fail (if it does not). The results are as follows:

Tester	Version A	Version B	Version C
1	1	0	0
2	1	0	0
3	1	0	0
4	1	0	0
5	1	0	0
6	1	1	0
7	1	1	0
8	1	1	1
9	1	1	1
10	0	1	1

(Pass=1, Fail=0)

The company wants to know if there is a significant difference in reliability between the three software versions.

18.3.1 Calculation of Cochran’s Q Test:

Calculate the totals for each version (sum across testers):
- \(T_A = 1+1+1+1+1+1+1+1+1+0 = 9\)
- \(T_B = 0+0+0+0+0+1+1+1+1+1 = 5\)
- \(T_C = 0+0+0+0+0+0+0+1+1+1 = 3\)
Calculate the totals for each tester (sum across versions):
- \(t_1 = 1 + 0 + 0 = 1\)
- \(t_2 = 1 + 0 + 0 = 1\)
- \(t_3 = 1 + 0 + 0 = 1\)
- \(t_4 = 1 + 0 + 0 = 1\)
- \(t_5 = 1 + 0 + 0 = 1\)
- \(t_6 = 1 + 1 + 0 = 2\)
- \(t_7 = 1 + 1 + 0 = 2\)
- \(t_8 = 1 + 1 + 1 = 3\)
- \(t_9 = 1 + 1 + 1 = 3\)
- \(t_{10} = 0 + 1 + 1 = 2\)
Compute the sums needed for the Q statistic:
- \(\sum T_j^2 = 9^2 + 5^2 + 3^2 = 115\)
- \((\sum T_j)^2 = (9+5+3)^2 = 17^2 = 289\)
- \(\sum t_i = 1+1+1+1+1+2+2+3+3+2 = 17\)
- \(\sum t_i^2 = 1^2+1^2+1^2+1^2+1^2+2^2+2^2+3^2+3^2+2^2 = 35\)
- \(k\sum t_i - \sum t_i^2 = 3 \times 17 - 35 = 16\)
- Number of versions: \(k = 3\)
Calculate the Q statistic: \[ Q = (k-1)\;\frac{k\sum T_j^2 - (\sum T_j)^2}{k\sum t_i - \sum t_i^2} \]

Substituting values:

\[ Q = (3-1)\;\frac{3(115) - 289}{16} = 2 \times \frac{345 - 289}{16} = 2 \times \frac{56}{16} = 7 \]

Result:
- \(Q = 7\) with \(df = k-1 = 2\)
- Comparing with the \(\chi^2\) distribution with 2 degrees of freedom (critical value 5.99 at \(\alpha = 0.05\)) gives \(p \approx 0.03\).
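
As a check on this p-value: for 2 degrees of freedom the upper-tail probability of the chi-squared distribution has the closed form \(P(\chi^2_2 \ge q) = e^{-q/2}\), so the value can be worked out by hand:

\[ p = P(\chi^2_2 \ge 7) = e^{-7/2} = e^{-3.5} \approx 0.0302 \]

This shortcut applies only when \(df = 2\); other degrees of freedom require a table or software.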

Interpretation

Since \(p < 0.05\), the Cochran’s Q test indicates a statistically significant difference in reliability among the three software versions.

Conclusion

At least one software version differs in reliability. The company can conclude that the performance of the software versions is not the same, and some versions are more reliable than others.

18.4 Cochran’s Q Test in R and Python

Python
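
The following is a minimal sketch, assuming NumPy and SciPy are available, that reproduces the hand calculation of Section 18.3. It computes \(Q\) from the column and row totals and then cross-checks the result against `scipy.stats.friedmanchisquare`, whose tie-corrected statistic coincides with Cochran’s Q on 0/1 data. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2, friedmanchisquare

# Rows = testers 1..10, columns = Version A, B, C (Pass = 1, Fail = 0).
data = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 1],
    [0, 1, 1],
])

n, k = data.shape
col_totals = data.sum(axis=0)    # T_j: passes per version -> 9, 5, 3
row_totals = data.sum(axis=1)    # t_i: passes per tester

numerator = (k - 1) * (k * np.sum(col_totals ** 2) - col_totals.sum() ** 2)
denominator = k * row_totals.sum() - np.sum(row_totals ** 2)
q_stat = numerator / denominator
df = k - 1
p_value = chi2.sf(q_stat, df)
print(f"Q = {q_stat:.3f}, df = {df}, p = {p_value:.4f}")   # Q = 7.000, df = 2, p = 0.0302

# Cross-check: the Friedman test on binary data gives the same statistic.
fr_stat, fr_p = friedmanchisquare(data[:, 0], data[:, 1], data[:, 2])
print(f"Friedman chi-squared = {fr_stat:.3f}, p = {fr_p:.4f}")
```

statsmodels also provides a `cochrans_q` function in `statsmodels.stats.contingency_tables`, which may be more convenient when that package is installed.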

18.5 Post-hoc Tests for Cochran’s Q Test:

After performing Cochran’s Q test, if the result is significant, it implies that there are differences in the binary outcomes across the related groups. However, Cochran’s Q test does not specify which groups differ from each other. To identify the specific groups between which these differences occur, you would perform post-hoc pairwise comparisons.

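For Cochran’s Q test, one common approach for post-hoc analysis is to run a McNemar test on each pair of conditions and apply a Bonferroni correction to adjust for multiple testing. With \(m\) pairwise comparisons, each p-value is compared against \(\alpha / m\), or equivalently each p-value is multiplied by \(m\) (capped at 1) and compared against \(\alpha\).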

Let’s consider a hypothetical example involving Cochran’s Q test and its subsequent post-hoc analysis:

18.6 Post-hoc Tests for Cochran’s Q Test in R and Python

Python
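
The following is a minimal sketch of pairwise post-hoc comparisons for the software data of Section 18.3, using only the standard library. It assumes the exact (binomial) form of the McNemar test, which suits small samples; the chi-squared form of McNemar would be an equally valid choice and gives different p-values. The helper name `exact_mcnemar_p` is made up for illustration.

```python
from itertools import combinations
from math import comb

# Pass (1) / Fail (0) results for testers 1..10, one list per version.
data = {
    "A": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "B": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "C": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
}


def exact_mcnemar_p(x, y):
    """Two-sided exact McNemar p-value based on the discordant pairs of x and y."""
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    n = b + c
    if n == 0:
        return 1.0   # no discordant pairs, so no evidence of a difference
    # Binomial(n, 0.5) probability of a split at least as uneven as (b, c).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


pairs = list(combinations(data, 2))
m = len(pairs)       # number of comparisons (3 for three versions)
results = {}
for g1, g2 in pairs:
    p_raw = exact_mcnemar_p(data[g1], data[g2])
    p_adj = min(1.0, p_raw * m)   # Bonferroni adjustment
    results[(g1, g2)] = (p_raw, p_adj)
    print(f"{g1} vs {g2}: raw p = {p_raw:.4f}, Bonferroni p = {p_adj:.4f}")
```

With only ten testers, none of the Bonferroni-adjusted p-values from this exact variant falls below 0.05, even though the overall \(Q\) is significant. This can happen because pairwise tests use only the discordant pairs and the Bonferroni correction is conservative.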

Summary

Concept	Description
Foundations
Cochran's Q Test	Non-parametric test for three or more related groups measuring a binary outcome on the same subjects
Extension of McNemar	Generalises McNemar's test from two paired conditions to any number of paired conditions
Binary Outcome	Applicable only when the outcome takes one of two possible values, such as pass and fail
Hypotheses and Formula
Null Hypothesis	All conditions share the same proportion of successes, so any condition is exchangeable with any other
Alternative Hypothesis	At least one condition differs in success proportion from the others
Q Statistic	Formula based on condition totals and subject totals that follows a chi-squared distribution
Degrees of Freedom (k-1)	Reference degrees of freedom equal the number of conditions minus one
Condition Totals (T_j)	Column totals representing the number of successes in each condition
Subject Totals (t_i)	Row totals representing the number of successful responses for each subject
Applications
Medical Research	Compares treatment effects at multiple time points within the same patient cohort
Psychology	Tests whether repeated binary responses differ across experimental conditions
Quality Control	Compares pass and fail rates across shifts, batches, or machines
Social Sciences	Analyses repeated survey outcomes across multiple waves of the same respondents
Post-hoc Analysis
Post-hoc Pairwise Tests	Pairwise McNemar tests that identify which conditions differ when the overall Q is significant
Bonferroni Correction	Divides the alpha level by the number of comparisons to control the family-wise error rate
In R and Python
R friedman.test / cochrans_q	In R, Cochran's Q can be computed via friedman.test or the cochrans_q helper
Python scipy.stats.friedmanchisquare	In Python, Cochran's Q is reproduced by scipy.stats.friedmanchisquare on binary data
Caveats
Sample Size Sensitivity	The test loses power with small samples, requiring alternative approaches when n is tiny