Adding ZIP codes to lat/lon coordinates (spatial joins)¶
...or adding states to latitude/longitude pairs, or anything else!
When you want to combine one set of geographic information with another, that's called a spatial join. You can think of spatial joins as "if this thing overlays this other thing, combine the information." If you only had addresses you'd need to geocode them first, but since we already have coordinates we can jump right into spatial joins.
We start off with some points in a CSV file (called lat-lons.csv in the commands below):
name,latitude,longitude
South Dakota Tractor Museum,43.7363023,-98.9598954
Corn Palace,43.7148,-98.0254
Next we need the shapes of the ZIP codes themselves. We're probably looking for a geographic file type called a shapefile. You can find one for ZIP codes at data.gov or census.gov.
Since a "spatial join" is a pretty common concept, we have around ten thousand ways of doing this. QGIS, mapshaper, geopandas... the list goes on and on! We can also skirt around the issue by reverse geocoding with a service like geocod.io or HERE.
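To make the "if this thing overlays this other thing" idea concrete, here's a minimal pure-Python sketch of what a spatial join does under the hood. The two rectangles are made-up stand-ins, not real ZIP code boundaries, and the function names are my own. Real tools like mapshaper or geopandas take care of real shapefiles, projections, and speed for you.

```python
# Points from the CSV above, as (name, latitude, longitude).
points = [
    ("South Dakota Tractor Museum", 43.7363023, -98.9598954),
    ("Corn Palace", 43.7148, -98.0254),
]

# HYPOTHETICAL shapes: rough rectangles, NOT real ZIP code boundaries.
# Each polygon is a list of (longitude, latitude) corners.
shapes = {
    "west box": [(-99.5, 43.5), (-98.5, 43.5), (-98.5, 44.0), (-99.5, 44.0)],
    "east box": [(-98.5, 43.5), (-97.5, 43.5), (-97.5, 44.0), (-98.5, 44.0)],
}


def point_in_polygon(x, y, polygon):
    """Ray casting: count the polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's y value can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside


def spatial_join(points, shapes):
    """Attach the label of the first shape containing each point (or None)."""
    joined = []
    for name, lat, lon in points:
        match = None
        for label, polygon in shapes.items():
            if point_in_polygon(lon, lat, polygon):  # x is longitude, y is latitude
                match = label
                break
        joined.append((name, lat, lon, match))
    return joined


joined = spatial_join(points, shapes)
for row in joined:
    print(row)
```

Each point ends up with the label of whichever shape it falls inside, which is the same thing the tools below do with real ZIP code shapes.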
Spatial joins with mapshaper¶
After you've installed mapshaper, you can run the following command to perform a spatial join. This assumes you're in the same directory as lat-lons.csv and you've unzipped your zip codes shapefile into a folder named tl_2020_us_zcta520.
mapshaper lat-lons.csv \
-proj wgs84 \
-points \
-join tl_2020_us_zcta520/tl_2020_us_zcta520.shp \
-o joined.csv
Most people would type it out on a single line, but for readability's sake I put it on separate lines by adding a \ at the end of each line. Let's look at the parts of the command:
- -proj to say, my coordinates are in latitude/longitude (20m video here)
- -points to say, this CSV is points and not shapes
- -join to say, here's the file to combine it with
- -o to say, here's the file to save it as
And now we have a delightfully long CSV file, full of extra information pulled out of the shapefile!
name,latitude,longitude,ZCTA5CE20,GEOID20,CLASSFP20,MTFCC20,FUNCSTAT20,ALAND20,AWATER20,INTPTLAT20,INTPTLON20
South Dakota Tractor Museum,43.7363023,-98.9598954,57355,57355,B5,G6350,S,994988736,5685572,+43.7836075,-098.9314441
Corn Palace,43.7148,-98.0254,57301,57301,B5,G6350,S,617320997,3321485,+43.7109118,-098.0404632
Done and done.
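If that's more columns than you want, here's a small sketch for trimming the output down to the original columns plus the ZIP code, which lives in the ZCTA5CE20 column. It uses only Python's built-in csv module. The sample text is copied from the output above so the snippet runs on its own, and the trim_columns helper is a name I made up for illustration.

```python
import csv
import io

# Sample of joined.csv copied from the output above. In practice you would
# pass open("joined.csv", newline="") instead of io.StringIO(joined_csv).
joined_csv = """\
name,latitude,longitude,ZCTA5CE20,GEOID20,CLASSFP20,MTFCC20,FUNCSTAT20,ALAND20,AWATER20,INTPTLAT20,INTPTLON20
South Dakota Tractor Museum,43.7363023,-98.9598954,57355,57355,B5,G6350,S,994988736,5685572,+43.7836075,-098.9314441
Corn Palace,43.7148,-98.0254,57301,57301,B5,G6350,S,617320997,3321485,+43.7109118,-098.0404632
"""

# The original columns plus the ZIP code column from the shapefile.
KEEP = ["name", "latitude", "longitude", "ZCTA5CE20"]


def trim_columns(source, keep):
    """Read CSV rows from a file-like object, keeping only the named columns."""
    reader = csv.DictReader(source)
    return [{column: row[column] for column in keep} for row in reader]


rows = trim_columns(io.StringIO(joined_csv), KEEP)
for row in rows:
    print(row["name"], row["ZCTA5CE20"])
```

The csv module reads every value as text, so the ZIP codes stay exactly as they appear in the file.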
ZIP codes aren't numbers!
Even though ZIP codes look like numbers (for example, 57355 and 57301 up above), you can find all sorts of them that start with 0! If you treat them like numbers, you'll accidentally turn 03301 from Concord, NH into 3301, and risk getting later calculations wrong.
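Here's a quick sketch of the problem and one way to repair it, using only Python's standard library. The two-row sample and the restore_zip helper are made up for illustration. The repair assumes plain five-digit ZIP codes.

```python
import csv
import io

# HYPOTHETICAL two-row sample, just to show the leading-zero problem.
sample = "name,zip\nConcord,03301\nKimball,57355\n"

rows = list(csv.DictReader(io.StringIO(sample)))

# The csv module keeps every value as text, so the leading zero survives.
as_text = [row["zip"] for row in rows]

# Converting to numbers silently drops it: "03301" becomes 3301.
as_numbers = [int(z) for z in as_text]


def restore_zip(value):
    """Turn a ZIP that was stored as a number back into five-character text."""
    return str(value).zfill(5)


repaired = [restore_zip(n) for n in as_numbers]

print(as_text)
print(as_numbers)
print(repaired)
```

The safest fix is to never convert in the first place: keep the ZIP column as text from the moment you read the file.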
Spatial joins with QGIS¶
QGIS is a great visual tool if you don't like poking around on the command line. This web page has a great summary of spatial joins in QGIS.
Spatial joins with geopandas¶
If you'd prefer something reproducible that can live in, say, a Jupyter notebook, geopandas is a great option. Here's a video of mine from back in the day about spatial joins with geopandas.
Spatial joins (kind of) with Geocod.io¶
If you're dealing with spatial data in the United States or Canada, geocod.io is an absolute treat. You'll have to pay if you're using a lot of data, but hopefully you can sneak in through the free tier for now.
You'll start by uploading your data from the Geocod.io homepage.
Watch out: the cut-and-paste option only works from Excel, so I had to upload my CSV even though it was teeny-tiny. We're interested in reverse geocoding, where we take lat/lon and turn it into an address. This is useful to us since an address includes a ZIP code!
Check out the preview of your data, and bask in the pleasure of all of the extra columns you're getting: city, state, country, etc.
You can add additional data, but I just scrolled down and clicked Continue.
Wait a bit for processing to finish, and click Download to get your updated CSV!
That's a lot of extra data.
name,latitude,longitude,Latitude,Longitude,"Accuracy Score","Accuracy Type",Number,Street,"Unit Type","Unit Number",City,State,County,Zip,Country,Source
"South Dakota Tractor Museum",43.7363023,-98.9598954,43.736575,-98.959228,0.99,rooftop,117,"Cemetery Rd",,,Kimball,SD,"Brule County",57355,US,"Statewide SD"
"Corn Palace",43.7148,-98.0254,43.714644,-98.025515,1,rooftop,612,"N Main St",,,Mitchell,SD,"Davison County",57301,US,"Statewide SD"