path: root/tutorials/module_4/1_importing_scientific_data.md
Diffstat (limited to 'tutorials/module_4/1_importing_scientific_data.md')
-rw-r--r-- tutorials/module_4/1_importing_scientific_data.md | 65
1 files changed, 60 insertions, 5 deletions
diff --git a/tutorials/module_4/1_importing_scientific_data.md b/tutorials/module_4/1_importing_scientific_data.md
index b51558b..f609b39 100644
--- a/tutorials/module_4/1_importing_scientific_data.md
+++ b/tutorials/module_4/1_importing_scientific_data.md
@@ -12,12 +12,10 @@ import numpy as np
import pandas as pd
```
-
+---
DataFrame
-A `DataFrame` in pandas is a two-dimensional data structure like table with rows and columns.
-
-
+A `DataFrame` in pandas is a two-dimensional data structure, like a table with rows and columns. We can either load some data from a [Python dictionary](https://www.w3schools.com/python/python_dictionaries.asp):
```python
# Tensile test data
@@ -28,13 +26,70 @@ data = {
}
```
+Or we can load a spreadsheet in the form of a CSV file with the `read_csv()` function. In order to do this we need
+
+```python
+# Read CSV file into a DataFrame
+pd.read_csv("data.csv")
+```
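
A minimal sketch of the same call on concrete input. The CSV text, the column names (`strain`, `stress_MPa`) and the values are made up for illustration, and `io.StringIO` stands in for a file on disk so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file such as "data.csv"
csv_text = """strain,stress_MPa
0.000,0.0
0.001,210.0
0.002,420.0
"""

# read_csv accepts a file path or any file-like object
print(pd.read_csv(io.StringIO(csv_text)))
```

The first line of the CSV is used as the column labels by default.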
+
+---
+Alternatively, the `read_excel()` function allows you to import xlsx files. Use `sheet_name` to choose which sheet to read; if it is omitted, the first sheet is read.
+```python
+pd.read_excel("foo.xlsx", sheet_name="Sheet1", index_col=None, na_values=["NA"])
+```
+---
Object Creation
+Up to this point, the data has been read into memory but not assigned to a variable. By convention, we will assign the resulting DataFrame to the variable `df`, where the name `df` serves as a common shorthand for “data frame.”
+
```python
# Create DataFrame with row labels
df = pd.DataFrame(data, index=["Test1", "Test2", "Test3", "Test4"])
print(df)
-``` \ No newline at end of file
+```
+
+Assigning the DataFrame to a variable lets us access the data, or specific data points, again whenever we need them.
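
As a sketch of what that looks like, the snippet below builds a small DataFrame and inspects it through its variable. The column names and values are placeholders for illustration, not the tutorial's tensile test data.

```python
import pandas as pd

# Placeholder data (hypothetical values, for illustration only)
data = {
    "load_kN": [10.0, 12.5, 15.0, 17.5],
    "elongation_mm": [0.10, 0.13, 0.16, 0.20],
}
df = pd.DataFrame(data, index=["Test1", "Test2", "Test3", "Test4"])

print(df.shape)    # number of rows and columns
print(df.columns)  # column labels
print(df.index)    # row labels
print(df.head(2))  # first two rows
```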
+
+---
+
+Calling Data
+
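+Data can be selected using square brackets. Plain brackets, `df[...]`, select a column by its label; use `.iloc` to select by integer position and `.loc` to select by row label:
+
+Single point (row position 3, column position 2):
+```python
+df.iloc[3, 2]
+```
+
+Ranges:
+```python
+df.iloc[2:4, 2:3]
+```
+
+Data series by label (here, the row labelled `Test1`):
+```python
+df.loc["Test1"]
+```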
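
The snippet below is a runnable sketch of position-based and label-based selection using `.iloc` and `.loc`. The DataFrame is a hypothetical stand-in with made-up columns and values.

```python
import pandas as pd

# Hypothetical stand-in data (illustrative values only)
df = pd.DataFrame(
    {
        "load_kN": [10.0, 12.5, 15.0, 17.5],
        "elongation_mm": [0.10, 0.13, 0.16, 0.20],
        "area_mm2": [50.0, 50.0, 50.0, 50.0],
    },
    index=["Test1", "Test2", "Test3", "Test4"],
)

point = df.iloc[3, 2]              # row position 3, column position 2
block = df.iloc[2:4, 0:2]          # row positions 2 and 3, column positions 0 and 1
column = df["load_kN"]             # one column by label, returned as a Series
row = df.loc["Test1"]              # one row by label, returned as a Series
cell = df.loc["Test2", "load_kN"]  # one value by row and column label

print(point, cell)
print(block)
```

Positions are zero-based, and the end of an `.iloc` slice is exclusive.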
+
+---
+
+
+Assignment 1:
+
+
+```python
+
+```
+
+---
+
+Assignment 2:
+Load the data from the CSV file and calculate the mean value of the array.
+
+```python
+
+```