37  Spearman Rank Correlation

37.1 Spearman Rank Correlation Scale Tests

The Spearman rank correlation coefficient (Charles Spearman, 1904), often called Spearman’s rho, is a non-parametric measure of rank correlation. It assesses how well the relationship between two variables can be described by a monotonic function. It is well suited to cases where the variables do not meet the assumptions of Pearson’s correlation coefficient, for example when the data are not normally distributed or the relationship is not linear.

37.1.1 Assumptions

The Spearman Rank Correlation test operates under the following assumptions:

  1. Monotonic Relationship: The relationship between the variables should be monotonic, either increasing or decreasing, but not necessarily at a constant rate.
  2. Ordinal Data: The test can be applied to ordinal data or to continuous data that do not meet the assumptions required for Pearson’s correlation.

37.1.2 Hypotheses

The hypotheses for the Spearman Rank Correlation test are:

  • Null Hypothesis (H₀): There is no association between the two variables (the correlation is zero).
  • Alternative Hypothesis (H₁): There is an association between the two variables (the correlation is not zero).

37.1.3 Formula

Spearman’s rho (\(\rho\)) is calculated as follows:

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

Where:

  • \(d_i\) is the difference between the two ranks of the \(i\)-th pair of observations.
  • \(n\) is the number of observations.

37.1.4 Calculation Steps

  1. Rank each variable separately. Assign average ranks in case of ties.
  2. Compute the difference (\(d_i\)) between the two ranks of each paired observation.
  3. Square each difference (\(d_i^2\)).
  4. Sum all squared differences.
  5. Substitute the summed value into the formula to find \(\rho\).
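The steps above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names `average_ranks` and `spearman_rho` are hypothetical, and the rank-difference formula is exact only when there are no ties (with ties it is an approximation, even though average ranks are assigned).

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest (n); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Average of the 1-based positions start+1 .. end+1.
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho via the rank-difference formula (exact when there are no ties)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two paired observations are required")
    rank_x = average_ranks(x)                              # Step 1: rank each variable
    rank_y = average_ranks(y)
    diffs = [rx - ry for rx, ry in zip(rank_x, rank_y)]    # Step 2: rank differences
    squared = [d * d for d in diffs]                       # Step 3: square each difference
    total = sum(squared)                                   # Step 4: sum the squares
    return 1 - 6 * total / (n * (n ** 2 - 1))              # Step 5: apply the formula


# Toy data (hypothetical): a perfectly decreasing relationship.
print(spearman_rho([1, 2, 3, 4], [40, 30, 20, 10]))  # -1.0
```

Ranking from smallest to largest or from largest to smallest gives the same \(\rho\), provided both variables are ranked in the same direction.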

37.1.5 Interpretation

Spearman’s rho ranges from -1 to +1:

  • A \(\rho\) of +1 indicates a perfect positive association.
  • A \(\rho\) of -1 indicates a perfect negative association.
  • A \(\rho\) of 0 suggests no association.

The significance of \(\rho\) can be tested using tables of critical values or computed with software, to determine whether the observed correlation is unlikely under the null hypothesis.
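One way to test significance computationally, for small samples, is an exact permutation test: enumerate every possible ordering of one variable's ranks and count how often \(|\rho|\) is at least as large as the observed value. The sketch below assumes the data have already been converted to ranks; the function names `rho_from_ranks` and `exact_two_sided_p` and the example ranks are hypothetical. Enumerating all \(n!\) orderings is only practical for small \(n\).

```python
from itertools import permutations


def rho_from_ranks(rank_x, rank_y):
    """Spearman's rho from two equal-length sequences of ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def exact_two_sided_p(rank_x, rank_y):
    """Share of all orderings of rank_y whose |rho| is at least the observed |rho|."""
    observed = abs(rho_from_ranks(rank_x, rank_y))
    count = 0
    total = 0
    for shuffled in permutations(rank_y):
        total += 1
        # Small tolerance guards against floating-point rounding in the comparison.
        if abs(rho_from_ranks(rank_x, shuffled)) >= observed - 1e-12:
            count += 1
    return count / total


# Hypothetical ranks for five paired observations (rho = 0.9).
rank_x = [1, 2, 3, 4, 5]
rank_y = [1, 3, 2, 4, 5]
print(rho_from_ranks(rank_x, rank_y))
print(exact_two_sided_p(rank_x, rank_y))  # 10 of the 120 orderings qualify
```

In this toy case 10 of the 120 orderings give \(|\rho| \ge 0.9\), so the exact two-sided p-value is about 0.083. Even a large \(\rho\) may not reach conventional significance levels when there are only five pairs.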

37.1.6 Example Problem

Suppose a researcher wants to examine if there is a correlation between the ranks of employees based on their performance scores and peer ratings. Here are the data for 5 employees:

  • Performance Scores: 90, 85, 80, 95, 70
  • Peer Ratings: 88, 80, 85, 90, 75

Hypotheses:

  • Null Hypothesis (H₀): There is no correlation between performance scores and peer ratings.
  • Alternative Hypothesis (H₁): There is a correlation between performance scores and peer ratings.
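Applying the calculation steps by hand to these data gives the worked computation below. The employee numbering 1 to 5 is added here for illustration, and ranks are assigned from lowest (1) to highest (5); there are no ties, so the rank-difference formula applies directly.

```latex
\begin{array}{l|rrrrr}
\text{Employee}            &  1 &  2 &  3 &  4 &  5 \\ \hline
\text{Performance score}   & 90 & 85 & 80 & 95 & 70 \\
\text{Peer rating}         & 88 & 80 & 85 & 90 & 75 \\
\text{Rank (performance)}  &  4 &  3 &  2 &  5 &  1 \\
\text{Rank (peer rating)}  &  4 &  2 &  3 &  5 &  1 \\
d_i                        &  0 &  1 & -1 &  0 &  0 \\
d_i^2                      &  0 &  1 &  1 &  0 &  0
\end{array}

\sum d_i^2 = 2, \qquad
\rho = 1 - \frac{6 \times 2}{5\,(5^2 - 1)} = 1 - \frac{12}{120} = 0.9
```

A \(\rho\) of 0.9 indicates a strong positive monotonic association in this sample. Whether it is statistically significant with only five pairs is a separate question, answered by the test.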

37.1.7 Spearman Rank Correlation using Excel

📥 Stats Basics (Excel)

37.2 Spearman Rank Correlation using R and Python
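As a minimal Python sketch, the example data for the five employees can be passed to `scipy.stats.spearmanr`, which ranks both samples internally and returns rho together with a two-sided p-value. The variable names below are illustrative, and the snippet assumes SciPy is installed.

```python
from scipy.stats import spearmanr

# Data from the example problem: five employees.
performance_scores = [90, 85, 80, 95, 70]
peer_ratings = [88, 80, 85, 90, 75]

# spearmanr returns the correlation coefficient and a two-sided p-value.
rho, p_value = spearmanr(performance_scores, peer_ratings)

print(f"Spearman's rho: {rho:.3f}")
print(f"Two-sided p-value: {p_value:.4f}")
```

For these data rho is 0.9. SciPy's documentation cautions that this p-value is reliable mainly for large samples, so with only five pairs it should be read as a rough guide rather than an exact result.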

This test is particularly valuable in research areas where data are ordinal or do not meet the prerequisites for parametric tests, providing a robust method for correlation analysis under such conditions.


Summary

| Topic | Concept | Description |
|---|---|---|
| Foundations | Spearman's Rho | A non-parametric measure of monotonic association between two variables, based on their ranks |
| Foundations | Monotonic Relationship | Captures relationships in which one variable consistently increases or decreases with the other |
| Foundations | Non-parametric Counterpart of Pearson | Used in place of Pearson's r when assumptions of linearity or bivariate normality are not met |
| Assumptions | Ordinal or Continuous Data | Applicable when data are ordinal or when continuous variables fail Pearson's assumptions |
| Assumptions | Robustness to Outliers | Less sensitive to extreme values than Pearson's r because it operates on ranks rather than raw values |
| Interpretation | Range of Rho | Always lies between minus one and plus one inclusive |
| Interpretation | Interpretation | Plus one is a perfect monotonic increase, minus one a perfect decrease, zero indicates no monotonic association |
| Hypotheses | Null Hypothesis | States that there is no monotonic association between the two variables |
| Hypotheses | Alternative Hypothesis | States that there is a monotonic association, with rho not equal to zero |
| Computation | Rank Each Variable Separately | Each variable is ranked independently, with average ranks assigned for ties |
| Computation | Differences in Ranks | Compute the difference between paired ranks to quantify how closely the orderings agree |
| Computation | Spearman Formula | Rho equals 1 minus 6 times the sum of squared rank differences divided by n times (n squared minus 1) |
| Computation | Tied Ranks Handling | Tied values receive the average of the ranks they would otherwise have occupied |
| In R and Python | R via cor(method = 'spearman') | Use cor(x, y, method = 'spearman') for rho and cor.test(..., method = 'spearman') for the test in R |
| In R and Python | Python via scipy.stats.spearmanr() | Use scipy.stats.spearmanr(x, y) to obtain rho and a two-sided p-value in Python |