Wide vs. Long Data in D3

Your data can be formatted in a few different ways, chief among them wide versus long.

Wide data looks like this:

State	1960	1970	1980	1990	2000
New York	2	5	2	5	4
New Jersey	3	1	4	1	5
Arizona	3	9	8	7	5

While long data looks like this:

State	Year	Value
New York	1960	2
New York	1970	5
New York	1980	2
New York	1990	5
New York	2000	4
New Jersey	1960	3
New Jersey	1970	1
New Jersey	1980	4
…	…	…

Which do you want? It depends on the application! Long data tends to be easier to group and filter, while wide data gives you one row per state with every year sitting right there in it.
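
As a rough illustration of how the shape changes the code you end up writing, here is a minimal sketch in plain JavaScript (no D3 needed). The arrays are typed in by hand from the first two years of the tables above, and the variable names are just placeholders.

```javascript
// The same numbers in both shapes, typed in by hand for illustration.
var wideRows = [
  { "State": "New York", "1960": 2, "1970": 5 },
  { "State": "New Jersey", "1960": 3, "1970": 1 }
];
var longRows = [
  { "State": "New York", "Year": 1960, "Value": 2 },
  { "State": "New York", "Year": 1970, "Value": 5 },
  { "State": "New Jersey", "Year": 1960, "Value": 3 },
  { "State": "New Jersey", "Year": 1970, "Value": 1 }
];

// Wide: one object per state, so a single year is a direct lookup.
var newYork1970 = wideRows[0]["1970"];

// Long: one object per measurement, so filtering and summarizing
// across every state and year is one pass over the array.
var values1960 = longRows
  .filter(function(d) { return d["Year"] === 1960; })
  .map(function(d) { return d["Value"]; });
var largestValue = Math.max.apply(null, longRows.map(function(d) {
  return d["Value"];
}));
```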

Converting between the two

While you could convert between the two in pandas without too much work, I like to use the original data files in my work whenever I can. Not only does that reduce the number of mistakes you might make, it also lets you update to a new release (of the census, for example) by just dropping in a new CSV. Converting in D3 isn’t too tough, either.

Usually you can use just part of these methods to get where you’re going, but I’m showing each one in full just in case.

Converting Long Data to Wide Data

Your data, which we’ll call long.csv

State	Year	Value
New York	1960	2
New York	1970	5
New Jersey	1960	3
New Jersey	1970	1
Arizona	1960	7
Arizona	1970	2
…	…	…

Your data would look like

[ 
  { "State": "New York", "Year": 1960, "Value": 2 },
  { "State": "New York", "Year": 1970, "Value": 5 },
  { "State": "New Jersey", "Year": 1960, "Value": 3 },
  // etc
]
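
One thing to keep in mind: d3.csv reads every field as a string, so getting the numbers shown above takes a conversion step. A minimal sketch of such a step follows. coerceLongRow is a hypothetical helper name, not part of D3, and it assumes the State, Year and Value columns from long.csv.

```javascript
// Hypothetical helper: turn the string fields of one parsed CSV row
// into numbers. Assumes the columns are named State, Year and Value.
function coerceLongRow(row) {
  return {
    "State": row["State"],
    "Year": +row["Year"],   // unary + converts "1960" to 1960
    "Value": +row["Value"]  // and "2" to 2
  };
}

// What a row looks like straight out of a CSV parser (all strings).
var rawRow = { "State": "New York", "Year": "1960", "Value": "2" };
var cleanRow = coerceLongRow(rawRow);
```

In D3 version 3, a function like this can be handed to d3.csv as the row accessor, which is the same slot the row functions in the queue examples occupy.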

Converting Long Data to Wide Data, Method A

Stored in a d3.map()
Using queue

var state_map = d3.map();

queue()
  .defer(d3.csv, "../long.csv", function(row) {
    // Try to get the state, if it doesn't exist make a new one
    var datapoint = state_map.get(row["State"]) || {"State": row["State"]};
    // row["Year"] becomes, say, 1960, and row["Value"] is 2, so really it's
    // datapoint[1960] = 2; (the + turns the CSV string "2" into a number)
    datapoint[row["Year"]] = +row["Value"];
    // store the datapoint back in the map so the next row can find it
    state_map.set(row["State"], datapoint);
    // return the unaltered row to be passed on to ready
    return row;
  })
  .await(ready);

Then, later…

state_map.get("New York");
// { "State": "New York", "1960": 2, "1970": 5 }

state_map.values();
// [ 
//   { "State": "New York", "1960": 2, "1970": 5 },
//   { "State": "New Jersey", "1960": 3, "1970": 1 },
//   { "State": "Arizona", "1960": 7, "1970": 2 },
//   ...
// ]
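
To see the bookkeeping on its own, here is a small sketch of the same idea without D3 or a CSV file. It swaps d3.map() for a plain JavaScript Map, and the rows array is typed in by hand to stand in for the parsed long.csv. longToWideMap is a made-up name.

```javascript
// Stand-in for the parsed rows of long.csv (values are strings, as a
// CSV parser would deliver them).
var rows = [
  { "State": "New York", "Year": "1960", "Value": "2" },
  { "State": "New York", "Year": "1970", "Value": "5" },
  { "State": "New Jersey", "Year": "1960", "Value": "3" },
  { "State": "New Jersey", "Year": "1970", "Value": "1" }
];

// Hypothetical helper: build a Map from state name to one wide object.
function longToWideMap(longRows) {
  var byState = new Map();
  longRows.forEach(function(row) {
    // Reuse the state's object if we have seen it, otherwise start one
    var datapoint = byState.get(row["State"]) || { "State": row["State"] };
    datapoint[row["Year"]] = +row["Value"];
    // Put it back so the next row for this state can find it
    byState.set(row["State"], datapoint);
  });
  return byState;
}

var stateMap = longToWideMap(rows);
var wideRows = Array.from(stateMap.values());
```

The step that is easy to forget is the set() call: without it, every row starts a fresh object and the map stays empty.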

Converting Long Data to Wide Data, Method B

Doesn’t need a d3.map()
Created using d3.nest()

d3.csv("../long.csv", function(error, long_data) {
  // long_data looks like this (d3.csv reads every value as a string)
  // [
  //   { "State": "New York", "Year": "1960", "Value": "2" },
  //   { "State": "New York", "Year": "1970", "Value": "5" },
  //   { "State": "New Jersey", "Year": "1960", "Value": "3" },
  //   ... etc
  // ]

  var wide = d3.nest()
    .key(function(d) { return d["State"]; }) // group the rows by state
    .rollup(function(d) { // do this to each grouping
      // reduce takes a list and returns one value
      // in this case, the list is all the grouped elements
      // and the final value is an object with keys
      return d.reduce(function(prev, curr) {
        prev["State"] = curr["State"];
        // the + turns the CSV string into a number
        prev[curr["Year"]] = +curr["Value"];
        return prev;
      }, {});
    })
    .entries(long_data) // tell it what data to process
    .map(function(d) { // pull out only the values
      return d.values;
    });
});

wide would look like

[ 
  { "State": "New York", "1960": 2, "1970": 5 },
  { "State": "New Jersey", "1960": 3, "1970": 1 },
  { "State": "Arizona", "1960": 7, "1970": 2 },
  // etc
]

If you left off the last .map() section, wide would still be an array, but each state’s object would be wrapped in a key/values pair. To get an object keyed by state name instead of an array, call .map(long_data) in place of .entries(long_data) and drop the final .map() section. That would look like this:

{
  "New York": { "State": "New York", "1960": 2, "1970": 5 },
  "New Jersey": { "State": "New Jersey", "1960": 3, "1970": 1 },
  "Arizona": { "State": "Arizona", "1960": 7, "1970": 2 },
  // etc
}

You can find out more about .reduce in the MDN documentation for Array.prototype.reduce.
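
If .reduce is new to you, the following sketch shows the step that runs inside .rollup() by itself, in plain JavaScript. The group array is a hand-typed stand-in for the rows that d3.nest() would hand over for one state.

```javascript
// Stand-in for one group: every long row belonging to a single state.
var group = [
  { "State": "Arizona", "Year": "1960", "Value": "7" },
  { "State": "Arizona", "Year": "1970", "Value": "2" }
];

// reduce visits each row in turn, carrying along an accumulator (prev)
// that starts out as the empty object passed as the second argument.
var wideRow = group.reduce(function(prev, curr) {
  prev["State"] = curr["State"];
  prev[curr["Year"]] = +curr["Value"];
  return prev;
}, {});
```

After the last row, wideRow holds the state name plus one key per year, which is exactly one line of the wide table.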

Converting Wide Data to Long Data

Your data, which we’ll call wide.csv

State	1960	1970	1980	1990	2000
New York	2	5	2	5	4
New Jersey	3	1	4	1	5
Arizona	3	9	8	7	5

Your data would look like

[ 
  { "State": "New York", "1960": 2, "1970": 5, "1980": 2, "1990": 5, "2000": 4 },
  { "State": "New Jersey", "1960": 3, "1970": 1, "1980": 4, "1990": 1, "2000": 5 },
  { "State": "Arizona", "1960": 3, "1970": 9, "1980": 8, "1990": 7, "2000": 5 },
  // etc
]

Converting Wide Data to Long Data, Method A

Stored in an array
Using queue
Not returned to ready

var long_data = [];

queue()
  .defer(d3.csv, "wide.csv", function(row) {
    // Loop through all of the columns, and for each column
    // make a new row
    Object.keys(row).forEach( function(colname) {
      // Ignore the 'State' column, every other column is a year
      if(colname == "State") {
        return;
      }
      // the + turns the CSV strings into numbers
      long_data.push({"State": row["State"], "Year": +colname, "Value": +row[colname]});
    });
    return row;
  })
  .await(ready);

Then, later, long_data would look like this:

[ 
  { "State": "New York", "Year": 1960, "Value": 2 },
  { "State": "New York", "Year": 1970, "Value": 5 },
  { "State": "New York", "Year": 1980, "Value": 2 },
  // etc
]
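
The same loop can be pulled out into a reusable function. The sketch below is plain JavaScript with hand-typed rows standing in for the parsed wide.csv (only the first two years, to keep it short). wideToLong and its idColumn parameter are hypothetical names, not part of D3.

```javascript
// Stand-in for the parsed rows of wide.csv (values are strings, as a
// CSV parser would deliver them).
var wideRows = [
  { "State": "New York", "1960": "2", "1970": "5" },
  { "State": "New Jersey", "1960": "3", "1970": "1" }
];

// Hypothetical helper: every column except idColumn is treated as a
// year and becomes its own row.
function wideToLong(rows, idColumn) {
  var longRows = [];
  rows.forEach(function(row) {
    Object.keys(row).forEach(function(colname) {
      if (colname === idColumn) {
        return; // the id column is copied onto each row, not unpivoted
      }
      var longRow = { "Year": +colname, "Value": +row[colname] };
      longRow[idColumn] = row[idColumn];
      longRows.push(longRow);
    });
  });
  return longRows;
}

var longRows = wideToLong(wideRows, "State");
```

Passing the id column as an argument is only a convenience, so the function is not tied to a column called State.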

Converting Wide Data to Long Data, Method B

This is honestly the same thing as Method A just wrapped a little differently.

Stored in an array
Not using queue

d3.csv("../wide.csv", function(error, wide_data) {
  var long_data = [];
  wide_data.forEach( function(row) {
    // Loop through all of the columns, and for each column
    // make a new row
    Object.keys(row).forEach( function(colname) {
      // Ignore the 'State' column, every other column is a year
      if(colname == "State") {
        return;
      }
      // the + turns the CSV strings into numbers
      long_data.push({"State": row["State"], "Year": +colname, "Value": +row[colname]});
    });
  });

  // do magic with long_data down here
});

Then, later, long_data would look like this:

[ 
  { "State": "New York", "Year": 1960, "Value": 2 },
  { "State": "New York", "Year": 1970, "Value": 5 },
  { "State": "New York", "Year": 1980, "Value": 2 },
  // etc
]