Q&A 11 How do you use a chi-squared test to determine if two categorical variables are independent?
11.1 Explanation
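This question demonstrates how to perform a chi-squared test of independence to assess whether two categorical variables are related. The test starts from a contingency table of observed counts and compares them with the counts that would be expected under the null hypothesis that the two variables are independent. A large chi-squared statistic, and correspondingly a small p-value (commonly below 0.05), is evidence against independence; a large p-value means the data are consistent with independence.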
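For reference, the uncorrected Pearson statistic can be written as follows. The notation is an assumption introduced here, not taken from the original: O_ij is the observed count in row i and column j, R_i and C_j are the row and column totals, n is the grand total, and r and c are the numbers of rows and columns.

```latex
E_{ij} = \frac{R_i \, C_j}{n}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
\mathrm{df} = (r - 1)(c - 1)
```

For 2x2 tables, both SciPy and R apply Yates' continuity correction by default, which reduces each |O_ij - E_ij| by 0.5 before squaring.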
11.2 Python Code
import pandas as pd
import scipy.stats as stats
# Sample contingency table
data = pd.DataFrame({
    "A": [20, 15],
    "B": [30, 35]
}, index=["Yes", "No"])
# Chi-squared test of independence
# (for a 2x2 table, SciPy applies Yates' continuity correction by default)
chi2, p, dof, expected = stats.chi2_contingency(data)
print(f"Chi2: {chi2:.2f}, p-value: {p:.4f}")
Chi2: 0.70, p-value: 0.4017
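To show where the reported statistic comes from, the following is a minimal sketch that recomputes it by hand from the expected counts, assuming the same 2x2 table as above. The variable names (observed, expected, chi2_yates, and so on) are illustrative choices, not part of the original example. It also calls chi2_contingency with correction=False to show the uncorrected Pearson statistic for comparison.

```python
import numpy as np
from scipy import stats

# Same observed counts as the example table (rows: Yes/No, columns: A/B)
observed = np.array([[20, 30],
                     [15, 35]])

# Expected counts under independence: row total * column total / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
n = observed.sum()
expected = row_totals * col_totals / n

# Yates' continuity correction: shrink each |O - E| by 0.5
# (or by |O - E| itself when that is smaller than 0.5)
abs_diff = np.abs(observed - expected)
corrected_diff = abs_diff - np.minimum(0.5, abs_diff)
chi2_yates = (corrected_diff ** 2 / expected).sum()

# Degrees of freedom and upper-tail p-value from the chi-squared distribution
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_yates = stats.chi2.sf(chi2_yates, dof)

# Uncorrected Pearson statistic, for comparison
chi2_plain, p_plain, _, _ = stats.chi2_contingency(observed, correction=False)

print(f"Manual (Yates): chi2={chi2_yates:.4f}, p={p_yates:.4f}")
print(f"Uncorrected:    chi2={chi2_plain:.4f}, p={p_plain:.4f}")
```

The manual Yates-corrected value should agree with the statistic and p-value printed above; the uncorrected statistic is somewhat larger, since the correction makes the test more conservative.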
11.3 R Code
# Create a contingency table
data <- matrix(c(20, 30, 15, 35), nrow = 2, byrow = TRUE)
colnames(data) <- c("A", "B")
rownames(data) <- c("Yes", "No")
# Perform chi-squared test
# (for a 2x2 table, chisq.test applies Yates' continuity correction by default)
test <- chisq.test(data)
test
Pearson's Chi-squared test with Yates' continuity correction
data: data
X-squared = 0.7033, df = 1, p-value = 0.4017