Introduction to Python for Data Science
This is a four-day workshop that introduces students to Python as a data science language. We teach the basics of programming and logic in the context of Python, then show the Python tools used for modern data analysis. The workshop assumes no prior knowledge of Python but moves at a quick pace to cover all the content. It meets for four sessions of 3 hours each.
Getting started
These workshops are Jupyter notebooks: documents that contain interactive Python code blocks interleaved with formatted text. To participate, click the links below, which open the respective notebooks in Google Colab. That's it! Each notebook should load and run without any additional setup.
Workshop content
Part 0: Python Healthy Habits
In this bonus notebook, we've collected tips, tricks, and healthy habits for Python and programming in general that we couldn't fit in the main course or that were too lecture-heavy. We will refer to it in class but not spend class time on it.
- Troubleshooting Guide
- Code annotation
- More advanced control flow with break and continue
- Exception Handling
- Working with dates and times
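As a small taste of two topics from this list, the sketch below combines `break`, `continue`, and exception handling in one loop. It is not taken from the notebook; the function name `first_valid_number` and the sample values are made up for illustration.

```python
def first_valid_number(values):
    """Return the first entry that parses as a positive number, or None."""
    result = None
    for raw in values:
        try:
            number = float(raw)
        except ValueError:
            # The entry is not a number: skip to the next one.
            continue
        if number <= 0:
            # Parsed fine, but not positive: skip it as well.
            continue
        result = number
        break  # Stop looping at the first match.
    return result


# Hypothetical sample input, chosen only to exercise each branch.
found = first_valid_number(["n/a", "-3", "4.5", "10"])
print(found)  # 4.5
```

Here `continue` moves on to the next entry, `break` ends the loop early, and `try`/`except` keeps one bad entry from stopping the whole program.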
Part 1: Programming basics and intro to logic
- What is a computer program?
- Functions and data types
- Operators as functions
- Logical operations
- Metaprogramming skills
Part 2: Control flow and iterable data structures
- Control flow
- If statements
- Loops
- Iterables
- Lists
- Dictionaries
Part 3: Writing functions and putting it all together
- Writing functions
- Functions to programs
- Writing a random walk program
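To give a sense of where this part ends up, here is a minimal sketch of a one-dimensional random walk. It is only one possible version and may differ from the program built in the notebook; the function name `random_walk` and its `seed` parameter are assumptions made for this example.

```python
import random


def random_walk(n_steps, seed=None):
    """Return the list of positions visited by a 1-D random walk from 0."""
    rng = random.Random(seed)  # A seed makes the walk reproducible.
    position = 0
    path = [position]
    for _ in range(n_steps):
        # Step one unit left or right with equal probability.
        position += rng.choice([-1, 1])
        path.append(position)
    return path


path = random_walk(10, seed=42)
print(path)
```

The walk uses only the pieces from Parts 1 to 3: a function, a loop, and a list that grows by one position per step.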
Day 4: Introduction to Python as a data science language
- Python refresher
- Reading and writing files
- Introduction to pandas
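Reading and writing files can be previewed with the standard library alone. The sketch below writes a small CSV file to a temporary folder and reads it back; the helper `write_and_read`, the file name, and the rows are made up for illustration and are not workshop data.

```python
import csv
import tempfile
from pathlib import Path


def write_and_read(rows, path):
    """Write rows (a list of dicts) to a CSV file, then read them back."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["name", "value"])
        writer.writeheader()
        writer.writerows(rows)
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Made-up rows, used only to show the round trip.
sample_rows = [{"name": "a", "value": 1}, {"name": "b", "value": 2}]

with tempfile.TemporaryDirectory() as tmp:
    loaded = write_and_read(sample_rows, Path(tmp) / "example.csv")

print(loaded)
```

Note that the numbers come back as text (`"1"`, not `1`), because a CSV file stores only characters. Converting columns to the right type by hand is one of the chores that pandas can take over.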
Day 5: More pandas dataframes and plotting
- Filtering & modifying pandas dataframes
- Summarizing and grouping dataframes
- Plotting data with Seaborn
- Data cleaning exercise
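The filtering and grouping topics above can be sketched in a few lines, assuming pandas is installed (it is preinstalled on Google Colab). The column names and values are invented for this example and do not come from the workshop datasets.

```python
import pandas as pd

# A tiny made-up dataframe with one grouping column and one numeric column.
df = pd.DataFrame(
    {
        "group": ["a", "a", "b", "b"],
        "value": [1, 5, 2, 8],
    }
)

# Filtering: keep only the rows where the condition is True.
large = df[df["value"] > 1]

# Grouping and summarizing: mean of "value" within each group.
summary = large.groupby("group")["value"].mean()
print(summary)
```

The filter is a boolean mask (`df["value"] > 1`) applied to the dataframe, and `groupby` followed by `mean` returns one summary number per group.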
Day 6: Analyzing a real dataset: Indiana storms
- Introduction to the Indiana storms dataset
- Exercises to practice the skills learned in the previous days