Diffstat (limited to 'tutorials/module_4/3_linear_regression.md')
-rw-r--r-- tutorials/module_4/3_linear_regression.md | 106
1 files changed, 99 insertions, 7 deletions
diff --git a/tutorials/module_4/3_linear_regression.md b/tutorials/module_4/3_linear_regression.md
index 511ea1a..6c60531 100644
--- a/tutorials/module_4/3_linear_regression.md
+++ b/tutorials/module_4/3_linear_regression.md
@@ -1,18 +1,110 @@
# Linear Regression
-## Statistical tools
-Numpy comes with some useful statistical tools that we can use to analyze our data.
+###
+
+## Linear Regression
+### What is Linear Regression?
+Linear regression is one of the most fundamental techniques in data analysis.
+It models the relationship between two (or more) variables by fitting a **straight line** that best describes the trend in the data.
+
+Mathematically, the model assumes a linear equation:
+$$
+y = m x + b
+$$
+where
+- $y$ = dependent variable
+- $x$ = independent variable
+- $m$ = slope (rate of change)
+- $b$ = intercept (value of $y$ when $x = 0$)
+
+Linear regression helps identify proportional relationships, estimate calibration constants, or model linear system responses.
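
For a single independent variable, the best-fit slope and intercept have a closed form: $m = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $b = \bar{y} - m \bar{x}$. The following is a minimal sketch of that calculation, assuming ordinary least squares; the helper name `fit_line` and the check data are hypothetical and not part of the tutorial.

```python
import numpy as np

def fit_line(x, y):
    """Closed-form ordinary least-squares fit of y = m*x + b (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: summed co-deviation of x and y divided by summed squared deviation of x
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point (x_mean, y_mean)
    b = y_mean - m * x_mean
    return m, b

# Hypothetical check data lying exactly on y = 2x + 1
m, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(f"m = {m:.2f}, b = {b:.2f}")  # prints "m = 2.00, b = 1.00"
```

Library routines such as `np.polyfit` with `deg=1` solve the same least-squares problem, so they should agree with this helper up to floating-point rounding.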
+
+### Problem 1: Stress–Strain Relationship
+Suppose we have measured the stress (σ) and strain (ε) in a material test and want to estimate Young’s modulus (E) from the slope of the stress–strain line.
```python
import numpy as np
+import matplotlib.pyplot as plt
+
+# Example data (strain, stress)
+strain = np.array([0.000, 0.0005, 0.0010, 0.0015, 0.0020, 0.0025])
+stress = np.array([0.0, 52.0, 104.5, 157.2, 208.1, 261.4]) # MPa
+
+# Fit a linear regression line using NumPy
+coeffs = np.polyfit(strain, stress, deg=1)
+m, b = coeffs
+print(f"Slope (E) = {m:.2f} MPa, Intercept = {b:.2f} MPa")
+
+# Predicted stress
+stress_pred = m * strain + b
+
+# Plot
+plt.figure()
+plt.scatter(strain, stress, label="Experimental Data", color="navy")
+plt.plot(strain, stress_pred, color="red", label="Linear Fit")
+plt.xlabel("Strain (mm/mm)")
+plt.ylabel("Stress (MPa)")
+plt.title("Linear Regression – Stress–Strain Curve")
+plt.legend()
+plt.grid(True)
+plt.show()
-mean = np.mean([1, 2, 3, 4, 5])
-median = np.median([1, 2, 3, 4, 5])
-std = np.std([1, 2, 3, 4, 5])
-variance = np.var([1, 2, 3, 4, 5])
```
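+The slope `m` is the estimate of Young’s modulus (E) in MPa, a measure of the material’s stiffness in the linear elastic region. The intercept `b` should be close to zero, since an unloaded specimen carries no stress.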
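
A single slope value says little about how well it is determined. As a rough sketch, `np.polyfit` accepts `cov=True`, which additionally returns the covariance matrix of the fitted coefficients; the square roots of its diagonal give one-standard-deviation uncertainties for the slope and intercept. The names `m_err` and `b_err` are illustrative, and the estimate assumes independent measurement errors of similar size, which the tutorial does not state.

```python
import numpy as np

# Same example data as in the tutorial (strain, stress)
strain = np.array([0.000, 0.0005, 0.0010, 0.0015, 0.0020, 0.0025])
stress = np.array([0.0, 52.0, 104.5, 157.2, 208.1, 261.4])  # MPa

# cov=True returns the coefficients and their covariance matrix
coeffs, cov = np.polyfit(strain, stress, deg=1, cov=True)
m, b = coeffs

# Assumption: errors are independent and of similar size, so the diagonal
# of the covariance matrix holds the variances of the slope and intercept
m_err, b_err = np.sqrt(np.diag(cov))

print(f"E = {m:.0f} ± {m_err:.0f} MPa")
print(f"Intercept = {b:.2f} ± {b_err:.2f} MPa")
```

If the intercept is within a few standard deviations of zero, the data are consistent with a line through the origin, as expected for an unloaded specimen.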
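+The same fit can be done with scikit-learn, which also provides metrics for evaluating the fit, such as R² and the mean squared error:
+```python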
+import numpy as np
+import matplotlib.pyplot as plt
+from sklearn.linear_model import LinearRegression
+from sklearn.metrics import r2_score, mean_squared_error
-###
+# ------------------------------------------------
+# 1. Example Data: Stress vs. Strain
+# (Simulated material test data)
+strain = np.array([0.000, 0.0005, 0.0010, 0.0015, 0.0020, 0.0025])
+stress = np.array([0.0, 52.0, 104.5, 157.2, 208.1, 261.4]) # MPa
+
+# Reshape strain for scikit-learn (expects 2D input)
+X = strain.reshape(-1, 1)
+y = stress
+
+# ------------------------------------------------
+# 2. Fit Linear Regression Model
+model = LinearRegression()
+model.fit(X, y)
+
+# Extract slope and intercept
+m = model.coef_[0]
+b = model.intercept_
+print(f"Linear model: Stress = {m:.2f} * Strain + {b:.2f}")
+
+# ------------------------------------------------
+# 3. Predict Stress Values and Evaluate the Fit
+y_pred = model.predict(X)
+
+# Coefficient of determination (R²)
+r2 = r2_score(y, y_pred)
+
+# Root mean square error (RMSE)
+rmse = np.sqrt(mean_squared_error(y, y_pred))
+
+print(f"R² = {r2:.4f}")
+print(f"RMSE = {rmse:.3f} MPa")
+
+# ------------------------------------------------
+# 4. Visualize Data and Regression Line
+plt.figure(figsize=(6, 4))
+plt.scatter(X, y, color="navy", label="Experimental Data")
+plt.plot(X, y_pred, color="red", label="Linear Fit")
+plt.xlabel("Strain (mm/mm)")
+plt.ylabel("Stress (MPa)")
+plt.title("Linear Regression – Stress–Strain Relationship")
+plt.legend()
+plt.grid(True)
+plt.tight_layout()
+plt.show()
+
+```
\ No newline at end of file