Q&A 3 How do you compute the correlation between two variables?

3.1 Explanation

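Correlation measures the strength and direction of the linear relationship between two numeric variables. The Pearson correlation coefficient ranges from -1 to 1: values near 1 indicate a strong positive linear relationship, values near -1 a strong negative one, and values near 0 little or no linear relationship.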
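
To make the definition concrete, here is a minimal sketch that computes the Pearson coefficient directly as the covariance divided by the product of the standard deviations. The helper name pearson_r and the small input lists are assumptions made up for illustration; they are not part of the iris data or of any library.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences (illustrative helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (covariance numerator)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (variance numerators)
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

# Made-up values for illustration only (not taken from the iris data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]    # rises with x: perfect positive relationship
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]  # falls as x rises: perfect negative relationship

print(pearson_r(x, y_up))
print(pearson_r(x, y_down))
```

The two calls land on the extremes of the range because each y is an exact linear function of x. In practice the library functions in the sections below should give the same result as this formula and are the more convenient choice.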

3.2 Python Code

import pandas as pd

# Load sample data
df = pd.read_csv("data/iris.csv")

# Pearson correlation between sepal length and sepal width
# (method="pearson" is the default; it is spelled out here for clarity)
correlation = df["sepal_length"].corr(df["sepal_width"], method="pearson")
print("Correlation:", correlation)
Correlation: -0.11756978413300208

3.3 R Code

library(readr)

# Load sample data
df <- read_csv("data/iris.csv")

# Pearson correlation between sepal length and sepal width
# (cor() uses method = "pearson" by default)
cor(df$sepal_length, df$sepal_width)
[1] -0.1175698