| Concept | Description |
|---|---|
| Foundations | |
| ANOVA | Analysis of Variance, a parametric test for comparing means across three or more groups |
| Why Not Multiple t-tests | Running many pairwise t-tests inflates the family-wise error rate, so ANOVA is used for the omnibus test |
| Types of ANOVA | |
| One-Way ANOVA | Examines the effect of a single factor on a continuous outcome across two or more groups |
| Two-Way ANOVA | Examines the effects of two factors and their interaction on a continuous outcome |
| Repeated Measures ANOVA | Used when the same subjects are measured under every condition or at every time point |
| MANOVA | Multivariate extension of ANOVA when there are two or more dependent variables |
| Assumptions | |
| Independence of Cases | Each group must contain different individuals, with no overlap between groups |
| Normality of Residuals | The residuals should be approximately normally distributed |
| Homogeneity of Variances | The variances should be approximately equal across groups, tested with Levene's test or Bartlett's test |
| Computation | |
| Between-Group Variance | Variability of group means around the overall mean, the signal in the F-ratio |
| Within-Group Variance | Variability of observations around their own group mean, the noise in the F-ratio |
| F-Statistic | The ratio of between-group variance to within-group variance, compared against an F-distribution |
| Decision and Follow-up | |
| Null Hypothesis | States that all group means are equal |
| Decision Rule | Reject H0 when the F-statistic exceeds the critical value or when p is below alpha |
| Tukey's HSD | A common post-hoc procedure that compares all pairs of group means while controlling family-wise error |
| Bonferroni and Dunnett | Alternative post-hoc corrections, used for conservative control or comparisons against a single reference group |
26 ANOVA
ANOVA (Ronald A. Fisher, 1925), which stands for Analysis of Variance, is a statistical technique used to determine whether there are statistically significant differences between the means of three or more independent (unrelated) groups. It tests the hypothesis that the means of several groups are equal, and it does this by comparing the variance (spread) of scores among the groups to the variance within each group. The primary goal of ANOVA is to determine whether at least one group mean differs from the others, rather than identifying which specific groups differ.
26.1 Types of ANOVA
One-Way ANOVA: Also known as single-factor ANOVA, it assesses the impact of a single factor (independent variable) on a continuous outcome variable. It compares the means across two or more groups. For example, testing the effect of different diets on weight loss.
Two-Way ANOVA: This extends the one-way design by examining the impact of two factors simultaneously on a continuous outcome. It can also evaluate the interaction effect between the two factors. For example, studying the effect of diet and exercise on weight loss.
Repeated Measures ANOVA: Used when the same subjects are used for each treatment (e.g., measuring student performance at different times of the year).
Multivariate Analysis of Variance (MANOVA): MANOVA is an extension of ANOVA when there are two or more dependent variables.
26.2 Assumptions of ANOVA
ANOVA relies on several assumptions about the data:
- Independence of Cases: The groups compared must be composed of different individuals, with no individual being in more than one group.
- Normality: The distribution of the residuals (differences between observed and predicted values) should follow a normal distribution.
- Homogeneity of Variances: The variance among the groups should be approximately equal. This can be tested using Levene’s Test or Bartlett’s Test.
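As a minimal sketch, the normality and homogeneity assumptions can be checked with SciPy. The three groups below are hypothetical, randomly generated data used only for illustration:

```python
# Checking ANOVA assumptions with SciPy on hypothetical example data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Three hypothetical groups drawn with equal variance.
group_a = rng.normal(loc=5.0, scale=1.0, size=30)
group_b = rng.normal(loc=5.5, scale=1.0, size=30)
group_c = rng.normal(loc=6.0, scale=1.0, size=30)

# Normality of residuals: Shapiro-Wilk on the group-mean-centered values.
residuals = np.concatenate([g - g.mean() for g in (group_a, group_b, group_c)])
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Homogeneity of variances: Levene's test (median-centered, robust version).
levene_stat, levene_p = stats.levene(group_a, group_b, group_c, center="median")

print(f"Shapiro-Wilk p = {shapiro_p:.3f}")  # a large p gives no evidence against normality
print(f"Levene p = {levene_p:.3f}")         # a large p suggests comparable variances
```

Large p-values here mean the data show no evidence of violating the assumption, not that the assumption is proven true.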
26.3 ANOVA Formula
The basic formula for ANOVA is centered around the calculation of two types of variances: within-group variance and between-group variance. The F-statistic is calculated by dividing the variance between the groups by the variance within the groups:
\[F = \frac{\text{Variance between groups}}{\text{Variance within groups}}\]
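The ratio can be computed by hand from sums of squares. The tiny data set below is invented purely to make the arithmetic easy to follow:

```python
# Worked F-statistic computation on three small illustrative groups.
import numpy as np

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([6.0, 7.0, 8.0]),
          np.array([8.0, 9.0, 10.0])]

n_total = sum(len(g) for g in groups)       # 9 observations in total
k = len(groups)                             # 3 groups
grand_mean = np.concatenate(groups).mean()  # overall mean = 7.0

# Between-group sum of squares: group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: observations around their own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Mean squares divide each sum of squares by its degrees of freedom.
ms_between = ss_between / (k - 1)       # 24 / 2 = 12.0
ms_within = ss_within / (n_total - k)   # 6 / 6 = 1.0

f_stat = ms_between / ms_within
print(f_stat)  # 12.0
```

Here the group means (5, 7, 9) are far apart relative to the spread within each group, so the F-statistic is large.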
26.3.1 Steps to Conduct ANOVA
1. State the Hypotheses:
   - Null hypothesis (H0): the means of the different groups are equal.
   - Alternative hypothesis (Ha): at least one group mean is different from the others.
2. Calculate the F-Statistic: compute the between-group variance and the within-group variance, then take their ratio using the ANOVA formula.
3. Compare to the Critical Value: compare the calculated F-value to a critical value obtained from an F-distribution table, using the degrees of freedom for the numerator (between-group variance) and the denominator (within-group variance) and the significance level (alpha, usually set at 0.05).
4. Make a Decision: if the F-value is greater than the critical value, reject the null hypothesis. This indicates that there are significant differences between the means of the groups.
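The steps above can be sketched with SciPy's omnibus test. The diet data are hypothetical, chosen to echo the weight-loss example from earlier in the chapter:

```python
# One-way ANOVA from hypotheses to decision, using SciPy.
from scipy import stats

# Hypothetical data: weight loss (kg) under three diets, 5 subjects each.
diet_1 = [2.1, 2.5, 1.8, 2.9, 2.3]
diet_2 = [3.4, 3.1, 3.8, 2.9, 3.5]
diet_3 = [1.2, 1.5, 0.9, 1.8, 1.1]

# Step 2: compute the F-statistic and its p-value.
f_stat, p_value = stats.f_oneway(diet_1, diet_2, diet_3)

# Step 3: critical value at alpha = 0.05 with df1 = k - 1 = 2
# and df2 = N - k = 12.
alpha = 0.05
f_crit = stats.f.ppf(1 - alpha, dfn=2, dfd=12)

# Step 4: decision rule.
reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.2f}, critical F = {f_crit:.2f}, p = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these (deliberately well-separated) group means, the F-statistic far exceeds the critical value, so H0 is rejected.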
26.3.2 Post-hoc Tests
If the ANOVA indicates significant differences, post-hoc tests like Tukey’s HSD, Bonferroni, or Dunnett’s can be used to identify exactly which groups differ from each other.
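As one option, a recent SciPy (1.8 or later, an assumption about the installed version) provides Tukey's HSD directly; the diet groups below are the same hypothetical data as above:

```python
# Tukey's HSD post-hoc test on hypothetical diet data (requires SciPy >= 1.8).
from scipy import stats

diet_1 = [2.1, 2.5, 1.8, 2.9, 2.3]
diet_2 = [3.4, 3.1, 3.8, 2.9, 3.5]
diet_3 = [1.2, 1.5, 0.9, 1.8, 1.1]

# All pairwise comparisons while controlling the family-wise error rate.
res = stats.tukey_hsd(diet_1, diet_2, diet_3)

# res.pvalue[i, j] is the adjusted p-value for group i vs. group j.
for i in range(3):
    for j in range(i + 1, 3):
        print(f"diet {i + 1} vs diet {j + 1}: adjusted p = {res.pvalue[i, j]:.4f}")
```

Because the omnibus F-test only says that some difference exists, a post-hoc table like this is what actually pinpoints which pairs of diets differ.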