Q&A 5 How do you perform a one-way ANOVA?
5.1 Explanation
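ANOVA (Analysis of Variance) compares the means of three or more groups. It tests the null hypothesis that all group means are equal by comparing the variation between groups with the variation within groups. A small p-value suggests that at least one group mean differs from the others, but it does not say which one.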
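To make the F-statistic less of a black box, the sketch below computes it by hand from the between-group and within-group sums of squares and compares the result with `scipy.stats.f_oneway`. The helper name `one_way_anova` and the three small samples are assumptions made up for illustration; they are not part of the iris data used later.

```python
import numpy as np
from scipy.stats import f, f_oneway


def one_way_anova(*groups):
    """Return (F, p) for a one-way ANOVA computed from sums of squares.

    Hypothetical helper for illustration only.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = np.concatenate(groups).mean()

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # Upper tail of the F distribution gives the p-value
    p_val = f.sf(f_stat, df_between, df_within)
    return f_stat, p_val


# Made-up measurements for three groups (illustrative only)
a = [4.9, 5.1, 5.0, 4.8, 5.2]
b = [5.9, 6.1, 6.0, 5.8, 6.2]
c = [6.5, 6.7, 6.6, 6.4, 6.8]

f_manual, p_manual = one_way_anova(a, b, c)
f_scipy, p_scipy = f_oneway(a, b, c)
print(f"manual: F={f_manual:.4f}, p={p_manual:.3g}")
print(f"scipy:  F={f_scipy:.4f}, p={p_scipy:.3g}")
```

The two lines should agree up to floating-point error, which shows that the F-statistic is simply the ratio of the between-group mean square to the within-group mean square.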
5.2 Python Code
import pandas as pd
from scipy.stats import f_oneway
# Load sample data
df = pd.read_csv("data/iris.csv")
# Extract the sepal lengths for each species
setosa = df[df["species"] == "setosa"]["sepal_length"]
versicolor = df[df["species"] == "versicolor"]["sepal_length"]
virginica = df[df["species"] == "virginica"]["sepal_length"]
# Perform ANOVA
f_stat, p_val = f_oneway(setosa, versicolor, virginica)
print(f"F-statistic: {f_stat}, P-value: {p_val}")
F-statistic: 119.26450218450468, P-value: 1.6696691907693826e-31
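Filtering each species by name works for three groups but becomes repetitive with more. A possible alternative, sketched here under the assumption that the data has one grouping column and one numeric column, is to let `groupby` build the samples and unpack them into `f_oneway`. The small data frame is a made-up stand-in so the example runs without `data/iris.csv`; its values are not the real iris measurements.

```python
import pandas as pd
from scipy.stats import f_oneway

# Made-up stand-in with the same column names as the iris file (illustrative only)
df = pd.DataFrame({
    "species": ["setosa"] * 4 + ["versicolor"] * 4 + ["virginica"] * 4,
    "sepal_length": [5.0, 4.8, 5.2, 4.9,
                     6.0, 5.7, 6.2, 5.9,
                     6.6, 6.3, 6.9, 6.5],
})

# One array of sepal lengths per species, however many species there are
samples = [grp["sepal_length"].to_numpy() for _, grp in df.groupby("species")]

# f_oneway accepts any number of samples as positional arguments
f_stat, p_val = f_oneway(*samples)
print(f"F-statistic: {f_stat}, P-value: {p_val}")
```

Replacing the stand-in frame with `pd.read_csv("data/iris.csv")` should give the same result as the explicit version above, since the same three samples are passed to `f_oneway`.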
5.3 R Code
library(readr)
# Load sample data
df <- read_csv("data/iris.csv")
# Perform one-way ANOVA
anova_result <- aov(sepal_length ~ species, data = df)
summary(anova_result)
             Df Sum Sq Mean Sq F value Pr(>F)
species       2  63.21  31.606   119.3 <2e-16 ***
Residuals   147  38.96   0.265
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1