32 Introduction to Non-parametric Tests

Non-parametric tests, also known as distribution-free tests, are statistical hypothesis tests that do not assume a specific distribution for the population from which the samples are drawn. This contrasts with parametric tests, which assume that the data come from a particular family of distributions (most often the normal) described by parameters such as the mean and standard deviation. Non-parametric methods are particularly useful when these assumptions cannot be met, either because the sample is too small to check the distributional assumption reliably or because the data clearly deviate from it (e.g., strongly skewed, non-normal data).

32.1 Key Features of Non-parametric Tests

Distribution-Free: Non-parametric tests do not require the data to follow a specific distribution, which makes them more flexible and widely applicable. They still carry other assumptions, such as independent observations.
Small Sample Sizes: Non-parametric tests can be more appropriate for analyses with small sample sizes, where the shape of the distribution cannot be assessed reliably.
Ordinal and Nominal Data: These tests are particularly suited to ordinal data (data that can be ranked but not numerically measured) and nominal data (categorical data without a natural order).
Robustness: Because most non-parametric tests work with ranks rather than raw values, they are less sensitive to outliers and to non-normality, both of which can distort the results of parametric tests.
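The robustness point can be illustrated with a minimal sketch of the rank transformation that many non-parametric tests rely on. The helper `average_ranks` and both data sets are hypothetical and written only for this illustration; the sketch assumes tied values share the average of their positions.

```python
from statistics import mean, median

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical measurements; the second list contains one extreme value
clean = [4.1, 5.3, 4.8, 5.0, 6.2]
with_outlier = [4.1, 5.3, 4.8, 5.0, 62.0]

print(mean(clean), mean(with_outlier))      # the mean shifts substantially
print(median(clean), median(with_outlier))  # the median is unchanged
print(average_ranks(clean))                 # [1.0, 4.0, 2.0, 3.0, 5.0]
print(average_ranks(with_outlier))          # [1.0, 4.0, 2.0, 3.0, 5.0]
```

The largest observation keeps rank 5 however extreme it becomes, so any statistic built on the ranks is unaffected by the size of the outlier.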

32.2 Common Non-parametric Tests

Mann-Whitney U Test: Used to compare differences between two independent groups when the dependent variable is either ordinal or continuous, but not normally distributed. It’s the non-parametric alternative to the independent samples t-test.
Wilcoxon Signed-Rank Test: Used to compare two related samples, matched samples, or repeated measurements on a single sample. It assesses whether the paired differences are symmetrically distributed around zero, which is often phrased as asking whether the population mean ranks differ. It can be viewed as the non-parametric counterpart to the paired samples t-test.
Kruskal-Wallis H Test: An extension of the Mann-Whitney U Test to more than two groups. It assesses whether there are statistically significant differences between three or more independent groups on a continuous or ordinal dependent variable. It’s the non-parametric alternative to the one-way ANOVA.
Friedman Test: Used to compare three or more paired groups. It’s the non-parametric alternative to the one-way ANOVA with repeated measures.
Spearman’s Rank Correlation: Measures the strength and direction of the monotonic association between two variables, computed on their ranks. It’s used when the assumptions of Pearson’s correlation, such as a linear relationship, are not met.
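To make the rank-based idea behind these tests concrete, here is a minimal standard-library sketch of two of the statistics: the Mann-Whitney U statistic and Spearman's rank correlation. The function names and the example data are hypothetical. The sketch computes the statistics only, not p-values, and a statistics library would normally be used in practice.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]

def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples."""
    ranks = average_ranks(list(x) + list(y))  # rank the pooled observations
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x               # the two U values sum to n_x * n_y
    return u_x, u_y

def spearman_rho(x, y):
    """Pearson correlation applied to the ranks of x and y.

    Undefined (division by zero) if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for two independent groups
group_a = [12, 15, 11, 18]
group_b = [22, 19, 25, 21]
print(mann_whitney_u(group_a, group_b))  # (0.0, 16.0)

# Hypothetical paired measurements with a monotonic but non-linear relationship
dose = [1, 2, 3, 4, 5]
response = [1, 8, 27, 64, 125]
print(spearman_rho(dose, response))      # 1.0
```

A U value of 0 for `group_a` means that none of its observations exceeds any observation in `group_b`. Spearman's coefficient reaches 1 for the second data set because the ranks agree perfectly, even though the relationship is not linear.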

32.3 When to Use Non-parametric Tests

When your data doesn’t meet the normality assumption required for parametric tests.
When dealing with ordinal data or nominal data that cannot be suitably analyzed with parametric methods.
When sample sizes are too small to reliably estimate the distribution of the population.
When you doubt that the parametric assumptions hold and prefer a cautious testing approach. The trade-off is that non-parametric tests are typically less powerful than parametric tests when all assumptions of the latter are met.
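The small-sample case can be sketched as follows. With few observations, a rank-based test does not need an estimate of the population distribution: if the two groups come from the same distribution, every way of splitting the pooled ranks between them is equally likely, so an exact p-value can be obtained by enumeration. The function `exact_rank_sum_p` and the data are hypothetical; this is an illustrative sketch for the two-sample rank-sum (Mann-Whitney) setting, and it is only practical for small samples.

```python
from itertools import combinations

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    return [(2 * ordered.index(v) + 1 + ordered.count(v)) / 2 for v in values]

def exact_rank_sum_p(x, y):
    """Two-sided exact p-value for the rank sum of x, by full enumeration."""
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    n_x = len(x)
    expected = n_x * (len(pooled) + 1) / 2  # mean rank sum of x under the null
    observed = abs(sum(ranks[:n_x]) - expected)
    total = extreme = 0
    # Under the null, each choice of n_x positions for group x is equally likely
    for idx in combinations(range(len(pooled)), n_x):
        total += 1
        deviation = abs(sum(ranks[i] for i in idx) - expected)
        if deviation >= observed - 1e-12:   # small tolerance for float sums
            extreme += 1
    return extreme / total

# Hypothetical scores for two small independent groups
group_a = [12, 15, 11, 18]
group_b = [22, 19, 25, 21]
print(exact_rank_sum_p(group_a, group_b))   # 2 of the 70 possible splits, about 0.029
```

Only the two most lopsided splits (all of the lowest ranks in one group, or all of the highest) are as extreme as the observed one, which gives 2/70. No assumption about the shape of the population distribution enters the calculation.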

Summary

Concept	Description
Foundations
Non-parametric Tests	Statistical tests that do not assume a specific population distribution
Distribution-Free	Inference is valid without requiring the data to follow a normal or other parametric distribution
Key Features
Small Sample Friendliness	Often appropriate when n is too small to reliably estimate distributional parameters
Ordinal and Nominal Data	Well-suited to ranked or categorical outcomes where parametric assumptions cannot apply
Robustness to Outliers	Less sensitive to extreme values than parametric counterparts that rely on means and variances
Common Tests
Mann-Whitney U Test	Compares two independent groups, the rank-based counterpart of the independent-samples t-test
Wilcoxon Signed-Rank Test	Compares two related samples or matched pairs, the rank-based counterpart of the paired t-test
Kruskal-Wallis H Test	Compares three or more independent groups, the rank-based counterpart of one-way ANOVA
Friedman Test	Compares three or more related groups, the rank-based counterpart of repeated-measures ANOVA
Spearman's Rank Correlation	Measures the strength and direction of association between two ranked variables
When to Use
Violation of Normality	Use a non-parametric test when residuals are clearly non-normal and a transformation does not help
Inability to Estimate Distribution	When the underlying distribution cannot be estimated reliably, distribution-free methods are safer
Conservative Testing Approach	Non-parametric tests offer a cautious approach when assumptions for parametric tests are doubtful
Trade-offs
Power Trade-off	Non-parametric tests are typically less powerful than parametric tests when all parametric assumptions hold