Course Syllabus
Welcome to Lede 2019: Foundations of Computing
Details
- Instructor: Jonathan Soma, js4571@columbia.edu
- Dates: Tuesdays and Thursdays, 5/28-7/2 + Saturdays 6/15, 6/22, 6/29
- Class: 10am-1pm, World Room
- Lab: 2pm-5pm, World Room
- Slack channel: #foundations
Course Overview
By the end of this course you’ll have the flexibility to find and execute solutions to most any coding- or data-related problem you run across. In theory we’re focusing on Python in general, the data package pandas, and comfort with the command line.
Homework
On many, many assignments, I will give you:
- More homework than you can reasonably accomplish
- Homework that involves googling answers
- Homework that requires thinking through problems and answers in complicated and specific ways
What’s this mean? It’s going to be hard.
If you find yourself falling down a black hole: just take a break. Or stop altogether! Don’t worry about it - ask people near you or TAs for guidance.
Oh, and most importantly: if you can’t finish? Not a problem. It’s far more important to not get burned out and discouraged.
Schedule
This is a rough outline, and will absolutely change very, very often.
Introduction to Python and the command line (5/28 + 5/30)
In our first week we’ll take a look at the insides of our computers using the command line, with tools like cd, grep, and cat. Learn to navigate your computer and run basic Python scripts.
Exploring data and APIs with Jupyter Notebooks (6/4 + 6/6)
Become more comfortable with data types in Python by consuming data from APIs - dynamic sources of information that are easily understandable by Python. Learn the joys of Jupyter Notebooks, and the basics of git for version control.
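As a rough illustration of what "easily understandable by Python" means: an API response usually arrives as JSON text, which maps directly onto Python dictionaries and lists. The sketch below uses a made-up response string instead of a real API, so the field names (city, temps, units) are assumptions for illustration only.

```python
import json

# Hypothetical API response; a real API would send similar JSON text over the network.
response_text = '{"city": "New York", "temps": [71, 68, 75], "units": "F"}'

# json.loads turns JSON text into ordinary Python dicts, lists, strings, and numbers.
data = json.loads(response_text)

city = data["city"]                                      # a str
average_temp = sum(data["temps"]) / len(data["temps"])   # a float, from a list of ints

print(type(data).__name__)
print(city, round(average_temp, 1))
```

Once the response is a dictionary, the usual Python tools (indexing, loops, sum, len) work on it like any other data.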
Analyzing structured data with Pandas (6/11 + 6/13 + 6/15)
Begin work with pandas, a data analysis library that runs circles around Excel in analyzing, cleaning, and presenting data.
Also, how to take semi-structured text data and clean/extract the parts we’re interested in, along with taming troublesome datasets with format conversion, and filling in and ignoring “bad” values.
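To give a feel for these cleaning steps, here is a minimal sketch using only the Python standard library; in class the same ideas would be applied with pandas. The sample lines, the regular expression, and the parse_age helper are all hypothetical, invented for this example.

```python
import re

# Made-up semi-structured lines; the format is an assumption for illustration.
lines = [
    "Name: Ada | Age: 36",
    "Name: Grace | Age: unknown",
    "Name: Alan | Age: 41",
]

# Extract only the parts we're interested in: the name and the age.
pattern = re.compile(r"Name: (\w+) \| Age: (\S+)")


def parse_age(text):
    """Format conversion: text to int, with None standing in for a "bad" value."""
    try:
        return int(text)
    except ValueError:
        return None


records = []
for line in lines:
    match = pattern.search(line)
    if match:
        records.append({"name": match.group(1), "age": parse_age(match.group(2))})

# Ignoring bad values: leave them out of the calculation.
known_ages = [r["age"] for r in records if r["age"] is not None]
mean_age = sum(known_ages) / len(known_ages)

# Filling in bad values: replace them with the mean of the known ones.
filled_ages = [r["age"] if r["age"] is not None else mean_age for r in records]

print(records)
print(mean_age, filled_ages)
```

Whether to ignore or fill in a bad value depends on the question being asked; the sketch shows both options side by side.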
Obtaining data through scraping (6/18 + 6/20)
Using our Python skills to scrape web sites, including advanced scraping involving form submission and page interaction.
Servers and “too much” problems (6/22)
What do you do when your data is too big or your scraping takes too long? Servers lend a hand.
Geographic analysis with QGIS (6/25 + 6/27)
Become familiar with different geographic data types, geocoding, and the limitless power of column and spatial joins.
Diving deeper into pandas visualization (6/29 + 7/2)
Getting deeper into matplotlib and seaborn with fancy plots and customization. Quick introduction to declarative visualization grammars with Altair and Vega.
BONUS CLASS: Self-directed project management and workflow (7/9)
How to manage your time, workflow, and expectations when working on a project.