Q&A 5 How do you perform a one-way ANOVA?

5.1 Explanation

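One-way ANOVA (Analysis of Variance) compares the means of three or more independent groups defined by a single factor. It tests the null hypothesis that all group means are equal against the alternative that at least one group mean differs. The test statistic F is the ratio of between-group variance to within-group variance, so a large F with a small p-value is evidence against the null hypothesis.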
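To make the F statistic concrete, the sketch below computes it by hand as the between-group mean square divided by the within-group mean square, then compares the result with scipy's f_oneway. The three small arrays are made-up values chosen only for illustration, and the variable names (ss_between, ss_within, and so on) are not from the original text.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical measurements for three groups (made-up values, illustration only)
groups = [
    np.array([5.1, 4.9, 5.0, 5.2]),
    np.array([5.9, 6.1, 6.0, 5.8]),
    np.array([6.5, 6.7, 6.6, 6.4]),
]

k = len(groups)                          # number of groups
n_total = sum(len(g) for g in groups)    # total number of observations
grand_mean = np.concatenate(groups).mean()

# Between-group sum of squares: spread of the group means around the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)        # degrees of freedom: k - 1
ms_within = ss_within / (n_total - k)    # degrees of freedom: N - k
f_manual = ms_between / ms_within

# Cross-check against scipy's implementation
f_scipy, p_scipy = f_oneway(*groups)
print(f"manual F = {f_manual:.4f}, scipy F = {f_scipy:.4f}, p = {p_scipy:.3g}")
```

The degrees of freedom used here, k - 1 and N - k, correspond to the Df column of an ANOVA table (2 and 147 for three groups and 150 observations).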

5.2 Python Code

import pandas as pd
from scipy.stats import f_oneway

# Load sample data
df = pd.read_csv("data/iris.csv")

# Group by species
setosa = df[df["species"] == "setosa"]["sepal_length"]
versicolor = df[df["species"] == "versicolor"]["sepal_length"]
virginica = df[df["species"] == "virginica"]["sepal_length"]

# Perform ANOVA
f_stat, p_val = f_oneway(setosa, versicolor, virginica)
print(f"F-statistic: {f_stat}, P-value: {p_val}")

F-statistic: 119.26450218450468, P-value: 1.6696691907693826e-31
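When there are many groups, naming each one by hand becomes tedious. A possible alternative, sketched here under the assumption that the data has the same species and sepal_length columns, is to let pandas groupby build one sample per group and unpack the list into f_oneway. The small DataFrame is a made-up stand-in for data/iris.csv so the example runs on its own.

```python
import pandas as pd
from scipy.stats import f_oneway

# Hypothetical stand-in for the iris data: same column names, made-up values
df = pd.DataFrame({
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
    "sepal_length": [5.1, 4.9, 5.0, 5.9, 6.1, 6.0, 6.5, 6.7, 6.6],
})

# One array of measurements per species, without naming each species explicitly
samples = [grp["sepal_length"].to_numpy() for _, grp in df.groupby("species")]

# f_oneway accepts any number of samples as separate positional arguments
f_stat, p_val = f_oneway(*samples)
print(f"F-statistic: {f_stat:.2f}, P-value: {p_val:.3g}")
```

With the real file, replacing the inline DataFrame with pd.read_csv("data/iris.csv") should give the same F-statistic as the explicit three-group call above, since the groups passed to f_oneway are identical.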

5.3 R Code

library(readr)

# Load sample data
df <- read_csv("data/iris.csv")

# Perform one-way ANOVA
anova_result <- aov(sepal_length ~ species, data = df)
summary(anova_result)

             Df Sum Sq Mean Sq F value Pr(>F)    
species       2  63.21  31.606   119.3 <2e-16 ***
Residuals   147  38.96   0.265                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1