| Field | Value | Date |
|---|---|---|
| author | Christian Kolset <christian.kolset@gmail.com> | 2025-10-20 17:40:42 -0600 |
| committer | Christian Kolset <christian.kolset@gmail.com> | 2025-10-20 17:40:42 -0600 |
| commit | 1630ff5771ba7aa4623d25fd9d97af3a6facecbb | |
| tree | dc13362b08d8e0711ec5d7432bcf0042e7d026e1 /tutorials/module_4 | |
| parent | ba75ac4c0a829316b3cd30cd8110d353cde2b6d2 | |
Re-arranged content. Moved manipulating pandas
dataframe to lecture 1.
Diffstat (limited to 'tutorials/module_4')
| Mode | File | Lines changed |
|---|---|---|
| -rw-r--r-- | tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md | 55 |
| -rw-r--r-- | tutorials/module_4/4.3 Importing and Managing Data.md | 42 |
2 files changed, 50 insertions, 47 deletions
diff --git a/tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md b/tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md
index 8327006..d52c33c 100644
--- a/tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md
+++ b/tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md
@@ -14,16 +14,16 @@ We may collect this in the following ways:
 - **Experiments** – temperature readings from thermocouples, strain or force from sensors, vibration accelerations, or flow velocities.
 - **Simulations** – outputs from finite-element or CFD models such as pressure, stress, or temperature distributions.
 - **Instrumentation and sensors** – digital or analog signals from transducers, encoders, or DAQ systems.
-
-### Introduction to pandas
-`pandas` (**Pan**el **Da**ta) is a Python library designed for **data analysis and manipulation**, widely used in engineering, science, and data analytics. It provides two core data structures: the **Series** and the **DataFrame**.
+## Introduction to pandas
+`pandas` (**Pan**el **Da**ta) is a Python library designed for data analysis and manipulation, widely used in engineering, science, and data analytics. It provides two core data structures: the **Series** and the **DataFrame**.
 A `Series` represents a single column or one-dimensional labeled array, while a `DataFrame` is a two-dimensional table of data, similar to a spreadsheet table, where each column is a `Series` and each row has a labeled index.

 DataFrames can be created from dictionaries, lists, NumPy arrays, or imported from external files such as CSV or Excel. Once data is loaded, you can **view and explore** it using methods like `head()`, `tail()`, and `describe()`.

 Data can be **selected by label** or **by position**. These indexing systems make it easy to slice, filter, and reorganize datasets efficiently.

-### Problem 1: Import a text file
+### Problem 1: Create a dataframe from a text file
+Given the the file `force_displacement_data.txt`. Use pandas to tabulate the data into a dataframe
 ```python
 import pandas as pd

@@ -38,7 +38,7 @@ df_txt = pd.read_csv(
 )

 print("\n=== Basic Statistics ===")
-print(df_txt.describe())
+print(df_txt.describe())232

 if "Force_N" in df_txt.columns:
     print("\nFirst five Force readings:")
@@ -59,6 +59,51 @@ except ImportError:

 ```

+## Subsetting and Conditional filtering
+You can select rows, columns, or specific conditions from a DataFrame.
+
+```python
+# Select a column
+force = df["Force_N"]
+
+# Select multiple columns
+subset = df[["Time_s", "Force_N"]]
+
+# Conditional filtering
+df_high_force = df[df["Force_N"] > 50]
+```
+
+
+![[Pasted image 20251013064718.png]]
+
+## Combining and Merging Datasets
+Often, multiple sensors or experiments must be merged into one dataset for analysis.
+
+```python
+# Merge on a common column (e.g., time)
+merged = pd.merge(df_force, df_temp, on="Time_s")
+
+# Stack multiple test runs vertically
+combined = pd.concat([df_run1, df_run2], axis=0)
+```
+
+
+## Problem 1: Describe a dataset
+Use pandas built-in describe data to report on the statistical mean of the given experimental data.
+
+```python
+import matplotlib.pyplot as plt
+
+plt.plot(df["Time_s"], df["Force_N"])
+plt.xlabel("Time (s)")
+plt.ylabel("Force (N)")
+plt.title("Force vs. Time")
+plt.show()
+```
+
+
+
+

 **Activities & Examples:**
 - Load small CSV datasets using `numpy.loadtxt()` and `pandas.read_csv()`
diff --git a/tutorials/module_4/4.3 Importing and Managing Data.md b/tutorials/module_4/4.3 Importing and Managing Data.md
index 101d5ab..ef44a7a 100644
--- a/tutorials/module_4/4.3 Importing and Managing Data.md
+++ b/tutorials/module_4/4.3 Importing and Managing Data.md
@@ -85,48 +85,6 @@ df.to_csv("edited_experiment.csv", index=False)

 This workflow makes pandas ideal for working with tabular data, you can quickly edit or generate datasets, verify values, and save clean, structured files for later visualization or analysis.

-## Subsetting and Conditional filtering
-You can select rows, columns, or specific conditions from a DataFrame.
-
-```python
-# Select a column
-force = df["Force_N"]
-
-# Select multiple columns
-subset = df[["Time_s", "Force_N"]]
-
-# Conditional filtering
-df_high_force = df[df["Force_N"] > 50]
-```
-
-
-![[Pasted image 20251013064718.png]]
-
-## Combining and Merging Datasets
-Often, multiple sensors or experiments must be merged into one dataset for analysis.
-
-```python
-# Merge on a common column (e.g., time)
-merged = pd.merge(df_force, df_temp, on="Time_s")
-
-# Stack multiple test runs vertically
-combined = pd.concat([df_run1, df_run2], axis=0)
-```
-
-
-## Problem 1: Describe a dataset
-Use pandas built-in describe data to report on the statistical mean of the given experimental data.
-
-```python
-import matplotlib.pyplot as plt
-
-plt.plot(df["Time_s"], df["Force_N"])
-plt.xlabel("Time (s)")
-plt.ylabel("Force (N)")
-plt.title("Force vs. Time")
-plt.show()
-```
-
 ### Problem 2: Import time stamped data
