author: Christian Kolset <christian.kolset@gmail.com>, 2025-10-26 15:34:27 -0600
committer: Christian Kolset <christian.kolset@gmail.com>, 2025-10-26 15:34:27 -0600
commit: 0fa588700aa9d222cc8f3e562a7b30e9fb75e287
tree: 87b25a1ff267b08b3e47b13c18271dfbfeff2c4e /tutorials
parent: eb0ee1f0d51d33666376552e610de15f233167f5
Updated tutorials
Diffstat (limited to 'tutorials')
-rw-r--r-- tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md | 20
-rw-r--r-- tutorials/module_4/4.2 Interpreting Data.md | 82
-rw-r--r-- tutorials/module_4/4.3 Importing and Managing Data.md | 11
-rw-r--r-- tutorials/module_4/4.4 Statistical Analysis.md | 11
-rw-r--r-- tutorials/module_4/4.6 Data Filtering and Signal Processing.md | 3
-rw-r--r-- tutorials/module_4/4.7 Data Visualization and Presentation.md | 3
6 files changed, 85 insertions, 45 deletions
diff --git a/tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md b/tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md
index 3ad34e4..5cd3879 100644
--- a/tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md
+++ b/tutorials/module_4/4.1 Introduction to Data and Scientific Datasets.md
@@ -18,6 +18,16 @@ We may collect this in the following ways:
flowchart
A[Collecting] --> B[Cleaning & Filtering] --> C[Analysis] --> D[Visualization]
```
+Data processing begins with **collection**, where measurements are recorded either manually using instruments or electronically through sensors. Regardless of the method, every measurement contains some degree of error, whether due to instrument limitations or external interference. In engineering, recognizing and quantifying this uncertainty is essential, as it defines the confidence range of our predictions.
+
+Once the data has been collected, the next step is **cleaning and filtering**. This involves addressing missing data points, managing outliers, and reducing noise. Errors can arise from faulty readings, sensor drift, or transcription mistakes. By cleaning and filtering the data, we ensure it accurately represents the system being measured.
+
+After the data is refined, we move into **analysis**. Here, statistical methods and computational tools are applied to model the data, uncover trends, and test hypotheses. This stage transforms raw numbers into meaningful insight.
+
+Finally, **visualization** allows us to communicate these insights effectively. Visualization can occur alongside analysis to guide interpretation or as the concluding step to present results clearly and purposefully. Well-designed visualizations make complex findings intuitive and accessible to the intended audience.
+
+To carry out this workflow efficiently, particularly during the cleaning, analysis, and visualization stages, we rely on powerful computational tools. In Python, one of the most versatile and widely used libraries for handling tabular data is pandas. It simplifies the process of managing, transforming, and analyzing datasets, allowing engineers and scientists to focus on interpreting results rather than wrestling with raw data.
+
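The cleaning and filtering stage described above can be sketched in pandas. This is a minimal illustration rather than the tutorial's own code: the `temperature_C` readings are made up, and the 5 °C outlier threshold and 3-point smoothing window are assumptions chosen only to keep the example small.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor log: one missing reading and one obvious spike
raw = pd.DataFrame({
    "time_s": [0, 1, 2, 3, 4, 5, 6, 7],
    "temperature_C": [20.1, 20.3, np.nan, 20.4, 95.0, 20.6, 20.8, 20.7],
})

# 1) Missing data: drop rows that have no temperature reading
clean = raw.dropna(subset=["temperature_C"])

# 2) Outliers: keep points within 5 degrees C of the median (assumed threshold)
median = clean["temperature_C"].median()
clean = clean[(clean["temperature_C"] - median).abs() <= 5.0]

# 3) Noise: smooth with a 3-point centered rolling mean
clean = clean.assign(
    smoothed_C=clean["temperature_C"]
    .rolling(window=3, center=True, min_periods=1)
    .mean()
)

print(clean)
```

In practice the threshold and window size depend on the sensor and the physics of the system, so they should be justified rather than copied.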
## Introduction to pandas
`pandas` (**Pan**el **Da**ta) is a Python library designed for data analysis and manipulation, widely used in engineering, science, and data analytics. It provides two core data structures: the **Series** and the **DataFrame**.
@@ -25,7 +35,7 @@ A `Series` represents a single column or one-dimensional labeled array, while
DataFrames can be created from dictionaries, lists, NumPy arrays, or imported from external files such as CSV or Excel. Once data is loaded, you can **view and explore** it using methods like `head()`, `tail()`, and `describe()`. Data can be **selected by label** or **by position**. These indexing systems make it easy to slice, filter, and reorganize datasets efficiently.
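As a rough sketch of the operations just mentioned, the snippet below builds a small DataFrame from a dictionary, explores it, and selects values by label and by position. The pressure readings and the row labels `a` to `d` are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical pressure readings used only for illustration
df = pd.DataFrame(
    {
        "time_s": [0.0, 0.5, 1.0, 1.5],
        "pressure_kPa": [101.3, 101.9, 102.4, 103.0],
    },
    index=["a", "b", "c", "d"],
)

print(df.head(2))     # first two rows
print(df.tail(2))     # last two rows
print(df.describe())  # summary statistics for each numeric column

by_label = df.loc["b", "pressure_kPa"]  # select by row and column label
by_position = df.iloc[1, 1]             # select by integer position
subset = df.loc["b":"c", ["time_s"]]    # label slices include both endpoints
```

Note that `loc` slices are inclusive of the end label, whereas `iloc` slices follow the usual Python convention and exclude the end position.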
-### Problem 1: Create a dataframe from an array
+### Problem: Create a dataframe from an array
Given the data `force_N` and `time_s`
```python
@@ -70,8 +80,6 @@ merged = pd.merge(df_force, df_temp, on="Time_s")
# Stack multiple test runs vertically
combined = pd.concat([df_run1, df_run2], axis=0)
```
-
-
https://pandas.pydata.org/docs/user_guide/merging.html
#### Creating new columns based on existing ones
@@ -90,10 +98,9 @@ air_quality["ratio_paris_antwerp"] = (
```
https://pandas.pydata.org/docs/getting_started/intro_tutorials/03_subset_data.html
-
https://pandas.pydata.org/docs/user_guide/reshaping.html
-### Problem 1: Create a dataframe from data
+### Problem: Create a dataframe from data
Given the file `force_displacement_data.txt`, use pandas to tabulate the data into a dataframe.
```python
import pandas as pd
@@ -130,6 +137,5 @@ except ImportError:
```
-**Activities & Examples:**
-- Load small CSV datasets using `numpy.loadtxt()` and `pandas.read_csv()`
+## **Activities & Examples:**
- Discuss real ME examples: strain gauge data, thermocouple readings, pressure transducers
\ No newline at end of file
diff --git a/tutorials/module_4/4.2 Interpreting Data.md b/tutorials/module_4/4.2 Interpreting Data.md
index c889b86..836d181 100644
--- a/tutorials/module_4/4.2 Interpreting Data.md
+++ b/tutorials/module_4/4.2 Interpreting Data.md
@@ -1,7 +1,7 @@
# Interpreting Data for Plotting
Philosophy of visualizing data
-A useful tool is using the acronym **PCC** to enforce legibility of your data.
+A useful tool is using the acronym **PCC** to enforce legibility of your data. These three principles form the foundation of data visualization and ensure your figures communicate meaning rather than just display numbers.
```mermaid
flowchart LR
@@ -11,24 +11,25 @@ flowchart LR
```
Whether we are preparing figures for a lab report or a research paper, these three elements should always be applied when presenting data. They ensure that our figures are clear, effective, and convey the intended message to our audience.
- - Purpose -> explain a process, compare of contrast, show a change or establish a relationship
- - Composition -> How do you arrange to components of the plot to clearly show the purpose
- - Color -> Using contrasts and colors to highlight important elements in a figure.
-We may re-iterate a figure 2 or 3 times until the figure shows the data we want to present is clear.
+- **Purpose** -> What are you trying to communicate? Are you explaining a process, comparing results, showing change, or revealing a relationship?
+- **Composition** -> How do you arrange the elements of your figure so that the story is clear?
+- **Color** -> How can you use contrast and tone to highlight key insights and guide your viewer’s attention?
+
+Remember: great figures rarely emerge on the first attempt. Iterating on a figure by refining the layout, simplifying elements, or adjusting colors helps ensure your data is represented honestly and effectively.
*Remember:* Data don't lie and neither should your figures, even unintentionally.
## Syntax and semantics in Mathematics - The meaning of our data
-In the English language, grammar defines the syntax—the structural rules that determine how words are arranged in a sentence. However, meaning arises only through semantics, which tells us what the sentence actually conveys.
+In the English language, grammar defines the syntax, the structural rules that determine how words are arranged in a sentence. However, meaning arises only through semantics, which tells us what the sentence actually conveys.
-Similarly, in the language of mathematics, syntax consists of the formal rules that govern how we combine symbols, perform operations, and manipulate equations. Yet it is semantics—the interpretation of those symbols and relationships—that gives mathematics its meaning and connection to the real world.
+Similarly, in the language of mathematics, syntax consists of the formal rules that govern how we combine symbols, perform operations, and manipulate equations. Yet it is semantics, the interpretation of those symbols and relationships, that gives mathematics its meaning and connection to the real world.
-As engineers and scientists, we must grasp the semantics of our work—not merely the procedures—it is our responsibility to understand the meaning behind it. YouTube creator and rocket engineer Destin Sandlin, better known as SmarterEveryDay, illustrates this concept in his video on the “backwards bicycle,” which demonstrates how syntax and semantics parallel the difference between knowledge and understanding.
+As engineers and scientists, we must grasp the semantics of our work, not merely the procedures; it is our responsibility to understand the meaning behind it. YouTube creator and rocket engineer Destin Sandlin, better known as SmarterEveryDay, illustrates this concept in his video on the “backwards bicycle,” which demonstrates how syntax and semantics parallel the difference between knowledge and understanding.
![Backwards Brain Bike](https://www.youtube.com/watch?v=MFzDaBzBlL0)
## Purpose - Why?
-Does the figure show the overall story or main point when you hide the text?
+> Does the figure show the overall story or main point when you hide the text?
The most important aspect of a figure is its purpose. What do you want to show? Why are you showing it? What makes it important? These questions help us decide what type of plot we need. There are many types of plots, and some suit certain purposes better than others.
@@ -42,9 +43,9 @@ There are many other types of plots that you can choose from so it can be useful
- Clients (may not always be technical professionals).
## Composition - Making good plots
-Can you remove or adjust unnecessary elements that attract your attention?
+>Can you remove or adjust unnecessary elements that attract your attention?
-Composition refers to how you choose to format your plot — including labeling, gridlines, and axis scaling.
+Composition refers to how you choose to format your plot, including labeling, gridlines, and axis scaling.
Often, the main message of a figure can be obscured by too much information. To improve clarity, consider removing or simplifying unnecessary elements such as repetitive labels, bounding boxes, background colors, extra lines or colors, redundant text, and shadows or shading. You can also reduce clutter by adjusting or removing excess data and moving supporting information to supplementary figures.
@@ -52,30 +53,13 @@ If applicable, be sure to follow any additional formatting or figure guidelines
<img src="image_1760986811788.png" width="500" center="true">
-## Color - Highlight
-Does the color palette enhance or distract from the story?
-
-Similarly to composition using color or the absence thereof (gray scale) can help you draw the attention of the read to a specific element of the plot.
+## Color - Highlight Meaning
+>Does the color palette enhance or distract from the story?
-Here is an example of how color can be used to enhance the difference between the private-for-profit.
+As with composition, using color, or the absence of it (grayscale), can help you draw the reader's attention to a specific element of the plot. Here is an example of how color can be used to make the private-for-profit data stand out.
<img src="image_1760986545639.png" width="500">
-
-## In the end
-
-#### Data don't lie
-And neither should your figures, even unintentionally. So it's important that you understand every step that stands between your raw data and the final figure. One way to think of this is that your data undergoes a series of transformations to get from what you measure to what ends up in the journal. For example, you might start with a set of mouse weight measurements. These numbers get 'transformed' into the figure as the vertical position of points on a chart, arranged in such a way that 500g is twice as far from the chart baseline as 250g. Or, a raw immunofluorescence image (a grid of photon counts) gets transformed by the application of a lookup table into a grayscale image. Either way, exactly what each transformation entails should be clear and reproducible. Nothing in the workflow should be a magic "black box."
-
-#### Follow the formatting rules
-Following one set of formatting rules shouldn't be too hard, at least when the journal is clear about what it expects, which isn't always the case. But the trick is developing a workflow that is sufficiently flexible to handle a wide variety of formatting rules — 300dpi or 600dpi, Tiff or PostScript, margins or no margins. The general approach should be to push decisions affecting the final figure format as far back in the workflow as possible so that switching does not require rebuilding the entire figure from scratch.
-
-#### Quality
-Unfortunately, making sure your figures look just the way you like is one of the most difficult goals of the figure-building process. Why? Because what you give the journal is _not_ the same thing that will end up on the website or in the PDF. Or in print, but who reads print journals these days? The final figure files you hand over to the editor will be further processed — generally through some of those magic "black boxes." Though you can't control journal-induced figure quality loss, you can make sure the files you give them are as high-quality as possible going in.
-
-#### Transparency
-If Reviewer #3 — or some guy in a bad mood who reads your paper five years after it gets published — doesn't like what he sees, you are going to have to prove that you prepared the figure appropriately. That means the figure-building workflow must be transparent. Every intermediate step from the raw data to the final figure should be saved, and it must be clear how each step is linked. Another reason to avoid black boxes.
-
Checklist
- [ ] Select appropriate type
- [ ] Labels
@@ -85,8 +69,42 @@ Checklist
## Problem 1:
+```python
+import matplotlib.pyplot as plt
+import numpy as np
+
+# Pseudo data
+time_s = np.linspace(0,300,15)
+temperature_C = 20 + 0.05 * time_s + 2 * np.random.randn(len(time_s))
+
+
+# Plot
+plt.figure(figsize=(8,6))
+plt.plot(time_s, temperature_C, 'r--o', linewidth=5)
+plt.title("Experiment 3")
+plt.xlabel("x")
+plt.ylabel("y")
+plt.grid(True)
+plt.legend(["line1"])
+plt.show()
+
+# Plot (IMPROVED)
+plt.figure(figsize=(7,5))
+plt.plot(time_s, temperature_C, color='steelblue', marker='o', linewidth=2, label='Measured Temperature')
+plt.title("Temperature Rise of Metal Rod During Heating", fontsize=14, weight='bold')
+plt.xlabel("Time [s]", fontsize=12)
+plt.ylabel("Temperature [°C]", fontsize=12)
+plt.grid(True, linestyle='--', alpha=0.5)
+plt.legend(frameon=False)
+plt.tight_layout()
+plt.show()
+```
+
+
+## Data don't lie
+And neither should your figures, even unintentionally. So it's important that you understand every step that stands between your raw data and the final figure. One way to think of this is that your data undergoes a series of transformations to get from what you measure to what ends up in your final results. Nothing in the workflow should be a magic "black box".
-## Problem 2:
+## Problem 2: Misleading plots
diff --git a/tutorials/module_4/4.3 Importing and Managing Data.md b/tutorials/module_4/4.3 Importing and Managing Data.md
index cd66164..91411c6 100644
--- a/tutorials/module_4/4.3 Importing and Managing Data.md
+++ b/tutorials/module_4/4.3 Importing and Managing Data.md
@@ -86,11 +86,17 @@ df.to_csv("edited_experiment.csv", index=False)
This workflow makes pandas ideal for working with tabular data: you can quickly edit or generate datasets, verify values, and save clean, structured files for later visualization or analysis.
-### Problem 2: Import time stamped data
+### Problem: Import time stamped data
-### Further Docs
+
+
+
+
+
+
+# Further Docs
[Comparison with Spreadsheets](https://pandas.pydata.org/docs/getting_started/comparison/comparison_with_spreadsheets.html#compare-with-spreadsheets)
[Intro to Reading/Writing Files](https://pandas.pydata.org/docs/getting_started/intro_tutorials/02_read_write.html)
[Subsetting Data](https://pandas.pydata.org/docs/getting_started/intro_tutorials/03_subset_data.html)
@@ -98,3 +104,4 @@ This workflow makes pandas ideal for working with tabular data, you can quickly
[Reshaping Data](https://pandas.pydata.org/docs/user_guide/reshaping.html)
[Merging DataFrames](https://pandas.pydata.org/docs/user_guide/merging.html)
[Combining DataFrames](https://pandas.pydata.org/docs/getting_started/intro_tutorials/08_combine_dataframes.html)
+
diff --git a/tutorials/module_4/4.4 Statistical Analysis.md b/tutorials/module_4/4.4 Statistical Analysis.md
index de6de07..1112ca8 100644
--- a/tutorials/module_4/4.4 Statistical Analysis.md
+++ b/tutorials/module_4/4.4 Statistical Analysis.md
@@ -42,9 +42,16 @@ Pandas also includes several built-in statistical tools that make it easy to ana
Great, so we
## Statistical Distributions
Normal distributions
-
+<img src="image_1761513820040.png" width="650">
- Design thinking -> Motorola built its Six Sigma methodology around the probability of a product failing. The approach has since been adopted worldwide.
+- Statistical analysis of data.
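
To connect the normal distribution to a failure probability, the sketch below computes the fraction of a normally distributed quantity that falls within ±k standard deviations of the mean, using the identity P(|X − μ| ≤ kσ) = erf(k/√2). The helper name `fraction_within` is hypothetical, and the example assumes an ideal, centered normal process; it is not a description of how Six Sigma defect rates are formally specified.

```python
import math


def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    inside = fraction_within(k)
    # The remainder (1 - inside) is the fraction outside the band, split over both tails
    print(f"±{k}σ: {inside:.4%} inside, {1 - inside:.4%} outside")
```

For k = 1, 2, and 3 this reproduces the familiar 68-95-99.7 rule. Tightening the acceptable band from ±3σ outward shrinks the out-of-tolerance fraction very quickly.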
+
+## Spectroscopy
+### Background
+Spectroscopy is the study of how matter interacts with electromagnetic radiation, including the absorption and emission of light and other forms of radiation. It examines how these interactions depend on the wavelength of the radiation, providing insight into the physical and chemical properties of materials. In simple terms, spectroscopy helps us understand what substances are made of and how they behave when exposed to energy.
+
+In engineering applications, spectroscopy is a powerful diagnostic and analysis tool. It can be used for material identification, such as how NASA determines the composition of planetary surfaces and atmospheres. It’s also applied in combustion and thermal analysis, where emission spectroscopy measures plasma temperatures and monitors exhaust composition in rocket engines. These applications allow engineers to better understand material behavior under extreme conditions and improve system performance and efficiency.
## Problem: Spectroscopy
-Let's
+
diff --git a/tutorials/module_4/4.6 Data Filtering and Signal Processing.md b/tutorials/module_4/4.6 Data Filtering and Signal Processing.md
index 112826e..ac1760e 100644
--- a/tutorials/module_4/4.6 Data Filtering and Signal Processing.md
+++ b/tutorials/module_4/4.6 Data Filtering and Signal Processing.md
@@ -10,7 +10,6 @@
---
-
#### Topics
- Review: what “noise” looks like statistically
@@ -41,7 +40,7 @@
#### Problems
-- Filter noisy vibration or pressure data and compare spectra before/after
+- Filter noisy spectroscopy data and compare spectra before/after
- Apply a moving average and a Butterworth filter to the same dataset — evaluate differences
- Use `ndimage.sobel()` to highlight temperature gradients in a heat-map image
- Challenge: write a short Python function that automatically chooses an appropriate smoothing window based on noise level
\ No newline at end of file
diff --git a/tutorials/module_4/4.7 Data Visualization and Presentation.md b/tutorials/module_4/4.7 Data Visualization and Presentation.md
index b788fc7..3b53ff0 100644
--- a/tutorials/module_4/4.7 Data Visualization and Presentation.md
+++ b/tutorials/module_4/4.7 Data Visualization and Presentation.md
@@ -20,6 +20,9 @@
+:
+
+