Welcome to Lede 2017: Foundations of Computing

Details

  • Instructor: Jonathan Soma, js4571@columbia.edu
  • Dates: Mondays and Wednesdays, 5/22-7/10 (Holiday: Monday May 29)
  • Class: 10am-1pm, room 601B
  • Lab: 2pm-5pm, 501A
  • Slack channel: #foundations

Course Overview

By the end of this course you’ll have the flexibility to find and execute solutions to most any coding- or data-related problem you run across. In theory we’re focusing on Python in general, the data package pandas, and comfort with the command line.

Homework

On many, many assignments, I will give you:

  • More homework than you can reasonably accomplish
  • Homework that involves googling answers
  • Homework that requires thinking through problems and answers in complicated and specific ways

What’s this mean? It’s going to be hard.

If you find yourself falling down a black hole: just take a break. Or stop altogether! Don’t worry about it - ask people near you or the TAs for guidance.

Oh, and most importantly: if you can’t finish? Not a problem. It’s far more important to not get burned out and discouraged.

Schedule

This is a rough outline, and will absolutely change very, very often.

Week 1: Introduction to Python and the command line (5/22 + 5/24)

In our first week we’ll take a look at the insides of our computers using the command line, with tools like cd, grep, and cat. Learn to navigate your computer and run basic Python scripts.

Week 2: Navigating APIs (5/31, Wed only)

No class on Monday due to Memorial Day

Become more comfortable with data types in Python by consuming data from APIs - dynamic sources of information that are easily understandable by Python. Learn the joys of Jupyter Notebooks.
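To give a feel for what "easily understandable by Python" means, here is a minimal sketch that uses only the standard library. The response text and its field names (city, forecast, day, high_f) are made up for illustration and do not come from any real API; a real API would send similar text over the network.

```python
import json

# Hypothetical API response body (invented for this example).
response_text = (
    '{"city": "New York", "forecast": ['
    '{"day": "Mon", "high_f": 71}, {"day": "Tue", "high_f": 68}]}'
)

# json.loads turns JSON text into ordinary Python data types:
# a JSON object becomes a dict, a JSON array becomes a list.
data = json.loads(response_text)
forecast = data["forecast"]

# Once it is plain Python data, the usual tools apply.
highs = [day["high_f"] for day in forecast]
warmest = max(forecast, key=lambda day: day["high_f"])

print(type(data).__name__, type(forecast).__name__)
print(data["city"], "warmest day:", warmest["day"], warmest["high_f"])
```

Most of the work with an API is exactly this kind of digging through nested dictionaries and lists.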

Week 3: Analyzing structured data with pandas (6/5 + 6/7)

Begin work with pandas, a data analysis library that runs circles around Excel.

Week 4: Obtaining data through scraping (6/12 + 6/14)

Use our Python skills to scrape web sites with BeautifulSoup, then move on to advanced scraping such as form submission and the browser automation tool Selenium.

Week 5: Dealing with text-based data in pandas (6/19 + 6/21)

How to take semi-structured text data and clean/extract the parts we’re interested in.
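As a rough illustration of the idea, here is a small sketch using Python's built-in re module rather than pandas itself. The sample lines, the pattern, and the field names (number, date, holder) are all invented for this example.

```python
import re

# Made-up, messy input lines (invented for illustration).
lines = [
    "Permit #1042 issued 06/19/2017 to ACME Corp ",
    "permit #77 issued 6/21/2017 to  Smith & Sons",
    "no permit information on this line",
]

# Capture the permit number, the date, and the holder's name.
pattern = re.compile(
    r"permit #(\d+) issued (\d{1,2}/\d{1,2}/\d{4}) to\s+(.+)",
    re.IGNORECASE,
)

records = []
for line in lines:
    match = pattern.search(line)
    if match is None:
        continue  # skip lines that don't fit the pattern
    number, date, holder = match.groups()
    records.append(
        {"number": int(number), "date": date, "holder": holder.strip()}
    )

for record in records:
    print(record)
```

The same kind of pattern can be applied to a whole column of text once the data lives in pandas.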

Week 6: Visualization and dealing with missing and dirty data (6/26 + 6/28)

Taming troublesome datasets through format conversion and filling in or ignoring “bad” values.

Week 7: Geographic analysis with geopandas and friends (7/3 + 7/5)

Become familiar with different geographic data types, geocoding, and the limitless power of column and spatial joins.

Week 8: Float (7/10)

Left empty to allow some cushion in topic coverage.