Q&A 8 How do you test if the means of two groups are significantly different?

8.1 Explanation

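You can use a two-sample t-test to compare the means of two independent groups. The test asks whether the observed difference in means is larger than sampling variability alone would plausibly produce: a small p-value (conventionally below 0.05) is evidence against the null hypothesis that the two population means are equal.

## Python Code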

import pandas as pd
from scipy.stats import ttest_ind

# Load sample data
df = pd.read_csv("data/iris.csv")

# Select sepal lengths for the two species to compare
group1 = df.loc[df['species'] == 'setosa', 'sepal_length']
group2 = df.loc[df['species'] == 'versicolor', 'sepal_length']

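# Student's t-test: ttest_ind pools the variances by default (equal_var=True);
# pass equal_var=False for Welch's test
t_stat, p_val = ttest_ind(group1, group2)
print(f"T-statistic: {t_stat}, P-value: {p_val}")

T-statistic: -10.52098626754911, P-value: 8.985235037487079e-18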
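
Note that `ttest_ind` pools the two variances by default (`equal_var=True`), whereas R's `t.test` runs Welch's test unless told otherwise. The following is a minimal sketch of the two variants side by side. It assumes synthetic data in place of `data/iris.csv`; the group sizes, means, and spreads are made up for illustration and are not the iris values.

```python
import numpy as np
from scipy.stats import ttest_ind

# Synthetic stand-ins for the two groups (illustrative values, not the iris data)
rng = np.random.default_rng(0)
group_a = rng.normal(loc=5.0, scale=0.35, size=50)
group_b = rng.normal(loc=5.9, scale=0.52, size=50)

# Student's t-test: assumes both groups share one variance (SciPy's default)
t_pooled, p_pooled = ttest_ind(group_a, group_b)

# Welch's t-test: drops the equal-variance assumption
t_welch, p_welch = ttest_ind(group_a, group_b, equal_var=False)

print(f"Pooled: t = {t_pooled:.3f}, p = {p_pooled:.3g}")
print(f"Welch:  t = {t_welch:.3f}, p = {p_welch:.3g}")
```

When the two groups have the same size, as here, the two t-statistics coincide and only the degrees of freedom (and therefore the p-value) differ. With unequal sizes and unequal variances the statistics themselves can diverge, which is when Welch's version is usually the safer choice.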

8.2 R Code

library(readr)
library(dplyr)

# Load data
df <- read_csv("data/iris.csv")

# t-test between two species
setosa <- df %>% filter(species == "setosa") %>% pull(sepal_length)
versicolor <- df %>% filter(species == "versicolor") %>% pull(sepal_length)

# Welch's t-test is R's default; set var.equal = TRUE for the pooled version
t.test(setosa, versicolor)

    Welch Two Sample t-test

data:  setosa and versicolor
t = -10.521, df = 86.538, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.1057074 -0.7542926
sample estimates:
mean of x mean of y 
    5.006     5.936
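
The quantities in this printout (t, the fractional degrees of freedom, and the confidence interval) can be reproduced by hand. Below is a rough sketch in Python, assuming synthetic data rather than the iris file. The helper name `welch_summary` is hypothetical and introduced only for illustration.

```python
import numpy as np
from scipy import stats

def welch_summary(x, y, confidence=0.95):
    """Hypothetical helper: Welch t, Welch-Satterthwaite df, p-value, and CI."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    # Squared standard error of each group mean
    vx, vy = x.var(ddof=1) / x.size, y.var(ddof=1) / y.size
    diff = x.mean() - y.mean()
    se = np.sqrt(vx + vy)
    t_stat = diff / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (x.size - 1) + vy ** 2 / (y.size - 1))
    p_val = 2 * stats.t.sf(abs(t_stat), df)  # two-sided p-value
    margin = stats.t.ppf(0.5 + confidence / 2, df) * se
    return t_stat, df, p_val, (diff - margin, diff + margin)

# Synthetic groups (illustrative values, not the iris data)
rng = np.random.default_rng(1)
x = rng.normal(5.0, 0.35, size=50)
y = rng.normal(5.9, 0.52, size=50)

t_stat, df, p_val, ci = welch_summary(x, y)
print(f"t = {t_stat:.3f}, df = {df:.3f}, p = {p_val:.3g}")
print(f"95% CI for the difference in means: ({ci[0]:.3f}, {ci[1]:.3f})")
```

A confidence interval that excludes 0 tells the same story as a small p-value: the data are hard to reconcile with equal population means.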