Q&A 8 How do you test whether the means of two groups are significantly different?
8.1 Explanation
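You can use a two-sample t-test to compare the means of two independent groups. The test checks whether the observed difference in means is larger than sampling variation alone would explain. A small p-value (conventionally below 0.05) is evidence against the null hypothesis that the two population means are equal.
## Python Code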
import pandas as pd
from scipy.stats import ttest_ind
# Load sample data
df = pd.read_csv("data/iris.csv")
# t-test between two species
group1 = df[df['species'] == 'setosa']['sepal_length']
group2 = df[df['species'] == 'versicolor']['sepal_length']
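# ttest_ind defaults to Student's t-test with pooled variance (equal_var=True);
# pass equal_var=False to run Welch's test, which is R's t.test() default
t_stat, p_val = ttest_ind(group1, group2)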
print(f"T-statistic: {t_stat}, P-value: {p_val}")
T-statistic: -10.52098626754911, P-value: 8.985235037487079e-18
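The call above uses SciPy's default, the pooled-variance Student's t-test, whereas R's `t.test()` runs Welch's test unless told otherwise. The following is a minimal sketch of the difference, using synthetic data rather than the iris measurements (the group means, spreads, and sizes are made up for illustration). When both groups have the same number of observations, the two tests give the same t-statistic and differ only in degrees of freedom, and therefore in the p-value.

```python
import numpy as np
from scipy.stats import ttest_ind

# Synthetic data for illustration only (not the iris measurements):
# two groups of equal size with clearly different spreads
rng = np.random.default_rng(0)
a = rng.normal(loc=5.0, scale=0.3, size=50)
b = rng.normal(loc=6.0, scale=0.6, size=50)

# Student's t-test: assumes equal population variances (SciPy default)
student = ttest_ind(a, b)

# Welch's t-test: does not assume equal variances
welch = ttest_ind(a, b, equal_var=False)

print(f"Student: t = {student.statistic:.3f}, p = {student.pvalue:.3g}")
print(f"Welch:   t = {welch.statistic:.3f}, p = {welch.pvalue:.3g}")
```

Assuming the standard iris data with 50 rows per species, this is why the Python and R results in this section show the same t-statistic but different p-values. Welch's test is usually the safer choice when you have no reason to believe the group variances are equal.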
8.2 R Code
library(readr)
library(dplyr)
# Load data
df <- read_csv("data/iris.csv")
# t-test between two species
setosa <- df %>% filter(species == "setosa") %>% pull(sepal_length)
versicolor <- df %>% filter(species == "versicolor") %>% pull(sepal_length)
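# t.test() runs Welch's test by default (var.equal = FALSE),
# so it does not assume the two groups have equal variances
t.test(setosa, versicolor)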
Welch Two Sample t-test
data: setosa and versicolor
t = -10.521, df = 86.538, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.1057074 -0.7542926
sample estimates:
mean of x mean of y
    5.006     5.936
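The R output also reports fractional degrees of freedom and a 95 percent confidence interval for the difference in means. As a rough sketch of where those numbers come from, the Python snippet below computes Welch's statistic, the Welch-Satterthwaite degrees of freedom, and the interval by hand. The helper name `welch_ci` and the synthetic data are assumptions for illustration, not part of the original example.

```python
import numpy as np
from scipy import stats

def welch_ci(x, y, confidence=0.95):
    """Hypothetical helper: Welch t-statistic, df, and CI for mean(x) - mean(y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    vx = x.var(ddof=1) / x.size   # squared standard error of mean(x)
    vy = y.var(ddof=1) / y.size   # squared standard error of mean(y)
    diff = x.mean() - y.mean()
    se = np.sqrt(vx + vy)         # standard error of the difference
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))
    # Two-sided critical value from the t distribution with df degrees of freedom
    t_crit = stats.t.ppf(0.5 + confidence / 2, df)
    return diff / se, df, (diff - t_crit * se, diff + t_crit * se)

# Synthetic data for illustration only (not the iris measurements)
rng = np.random.default_rng(1)
x = rng.normal(5.0, 0.3, 50)
y = rng.normal(6.0, 0.6, 40)

t_stat, df, (low, high) = welch_ci(x, y)
print(f"t = {t_stat:.3f}, df = {df:.3f}")
print(f"95% CI for the difference in means: ({low:.3f}, {high:.3f})")
```

If the interval excludes 0, the difference is significant at the matching level (here 5 percent), which agrees with the p-value from the test.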