Parts of a shapefile

When you’re using D3, you’re generally going to use TopoJSON or GeoJSON, two JavaScript-friendly formats for geographic information. Not the rest of the world, though! Most GIS people use shapefiles.

The term shapefile is a little misleading. Yes, there’s usually a .shp file that’s the guts of the shapefile, but a proper shapefile comes zipped up with all sorts of friends.

What’s in a shapefile?

Here’s a partial list:

  • .shp is the geometry for the data - polygons, points, you name it.
  • .prj describes the projection and coordinate system the geometry uses, since GIS data isn’t always stored as plain latitude and longitude.
  • .dbf is the data associated with each point, line, or polygon. Think state names, population data, or the year a house was built.
  • .shx is an index file for the .shp’s geometry (i.e. it helps speed things up).
  • .shp.xml is “geospatial metadata in XML format,” but I don’t know exactly what it’s used for.
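
If it helps to see the list above as data, here is a minimal sketch of a lookup from extension to role. The names PART_ROLES and describePart are made up for illustration and don't come from any library; the descriptions just restate the bullets.

```javascript
// Hypothetical lookup table: extension -> role, restating the list above.
const PART_ROLES = {
  ".shp.xml": "geospatial metadata in XML format",
  ".shp": "geometry (points, lines, polygons)",
  ".shx": "index into the .shp geometry",
  ".dbf": "attribute data for each feature",
  ".prj": "projection / coordinate system",
};

// Return the role of a file based on how its name ends, or "unknown".
function describePart(filename) {
  const lower = filename.toLowerCase(); // extensions sometimes arrive uppercased
  const ext = Object.keys(PART_ROLES).find((e) => lower.endsWith(e));
  return ext ? PART_ROLES[ext] : "unknown";
}

console.log(describePart("United_States.dbf"));
```

Anything not in the table (including the many other sidecar files a shapefile can carry) simply comes back as "unknown".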

There are many more, but the ones above are the ones you’ll see most often. They know they’re related because they all share the same base name. Let’s say I was trying to get a shapefile for the United States. I might have:

  • United_States.zip, which extracts to contain
  • United_States.shp
  • United_States.prj
  • United_States.dbf
  • United_States.shx
  • United_States.shp.xml
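
Since the shared name is what ties the pieces together, a quick sanity check on an unzipped folder is to group the files by base name and see what’s missing. The sketch below is only an illustration: the function names are hypothetical, and it assumes .shp, .shx and .dbf are the required pieces (which matches my reading of the Esri shapefile spec), with everything else optional.

```javascript
// Extensions from the list above; ".shp.xml" is matched as a single unit.
const KNOWN_EXTENSIONS = [".shp.xml", ".shp", ".shx", ".dbf", ".prj"];

// Assumption: these three are treated as required, the rest as optional.
const REQUIRED = [".shp", ".shx", ".dbf"];

// Split "United_States.shp.xml" into { base: "United_States", ext: ".shp.xml" }.
// Returns null for files that are not recognized shapefile parts.
function splitPart(filename) {
  const lower = filename.toLowerCase();
  const ext = KNOWN_EXTENSIONS.find((e) => lower.endsWith(e));
  if (!ext) return null;
  return { base: filename.slice(0, filename.length - ext.length), ext };
}

// Group file names by base name: Map { "United_States" => Set of extensions }.
function groupByBaseName(filenames) {
  const groups = new Map();
  for (const name of filenames) {
    const part = splitPart(name);
    if (!part) continue; // e.g. the .zip itself
    if (!groups.has(part.base)) groups.set(part.base, new Set());
    groups.get(part.base).add(part.ext);
  }
  return groups;
}

// List the required extensions that a group does not have.
function missingParts(extensions) {
  return REQUIRED.filter((e) => !extensions.has(e));
}

const files = [
  "United_States.zip",
  "United_States.shp",
  "United_States.prj",
  "United_States.dbf",
  "United_States.shx",
  "United_States.shp.xml",
];
for (const [base, exts] of groupByBaseName(files)) {
  console.log(base, "missing:", missingParts(exts));
}
```

If someone emails you a lone .shp file, this kind of check reports the missing .shx and .dbf right away.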
