Welcome to Lede 2018: Foundations of Computing

Details

  • Instructor: Jonathan Soma, js4571@columbia.edu
  • Dates: Mondays and Wednesdays, 5/21-7/11 (Holidays: Monday 5/28, Wednesday 7/4)
  • Class: 10am-1pm, World Room
  • Lab: 2pm-5pm, World Room
  • Slack channel: #foundations

Course Overview

By the end of this course you’ll have the flexibility to find and execute solutions to most any coding- or data-related problem you run across. In theory we’re focusing on Python in general, the data package pandas, and comfort with the command line.

Homework

On many, many assignments, I will give you:

  • More homework than you can reasonably accomplish
  • Homework that involves googling answers
  • Homework that requires thinking through problems and answers in complicated and specific ways

What’s this mean? It’s going to be hard.

If you find yourself falling down a black hole: just take a break. Or stop altogether! Don’t worry about it - ask people near you or the TAs for guidance.

Oh, and most importantly: if you can’t finish? Not a problem. It’s far more important to not get burned out and discouraged.

Schedule

This is a rough outline, and will absolutely change very, very often.

Week 1: Introduction to Python and the command line (5/21 + 5/23)

In our first week we’ll take a look at the insides of our computers using the command line, with tools like cd, grep, and cat. Learn to navigate your computer and run basic Python scripts.

Week 2: Navigating APIs (5/30, Wed only)

No class on Monday due to Memorial Day

Become more comfortable with data types in Python by consuming data from APIs - dynamic sources of information that are easily understandable by Python. Learn the joys of Jupyter Notebooks.
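As a rough taste of what that looks like, here is a minimal sketch that assumes an API has already handed back the JSON text below. The payload, its field names, and its values are made up for illustration; a real API will return its own structure.

```python
import json

# Hypothetical response body; real APIs return their own fields.
response_text = '{"city": "New York", "temp_f": 71.5, "hourly": [70, 71, 73]}'

# json.loads turns JSON text into ordinary Python data types.
data = json.loads(response_text)

print(type(data))            # the whole response becomes a dictionary
print(data["city"])          # a string
print(data["temp_f"] + 1)    # a float, so arithmetic works
print(max(data["hourly"]))   # a list of integers
```

Once the response is a dictionary, everything you already know about dictionaries, lists, strings, and numbers applies to it.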

Week 3: Analyzing structured data with Pandas (6/4 + 6/6)

Begin work with pandas, a data analysis library that runs circles around Excel.

Week 4: Obtaining data through scraping (6/11 + 6/13)

Use our Python skills to scrape web sites with BeautifulSoup, then move on to advanced scraping such as form submission and browser automation with Selenium.

Week 5: Dealing with text-based data in pandas (6/18 + 6/20)

How to take semi-structured text data and clean/extract the parts we’re interested in.

Week 6: Visualization and dealing with missing and dirty data (6/25 + 6/27)

Taming troublesome datasets with format conversion, and filling in or ignoring “bad” values.

Week 7: Geographic analysis with QGIS (7/2, Monday only)

No class on Wednesday due to Independence Day

Become familiar with different geographic data types, geocoding, and the limitless power of column and spatial joins.

Week 8: Float (7/9 + 7/11)

Left empty to allow some cushion in topic coverage. We’ll probably cover more QGIS,