An important concept to understand with d3 is that you (typically) need exactly one data point for every element you want to draw on the page.

An example

If you have one year of monthly temperatures for 3 cities, that’s going to be 12 * 3 = 36 data points.

That’s fine if you want to draw a dot for each of those points - that would be 36 circles, needing 36 data points.

But if you want to draw one line per city you’re in trouble. Since you have three cities, you want 3 lines but you have 36 data points. You need to reorganize your data points by city so the two numbers match up!
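
To make the mismatch concrete, here is a small sketch in plain JavaScript (no d3 needed). The `high` values are made-up placeholders, not real temperatures, and the variable names are only for illustration.

```javascript
// Hypothetical data: 3 cities x 12 months. The `high` values are
// placeholders, not real temperatures.
var cities = ["Dallas", "NYC", "San Diego"];
var months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sept", "Oct", "Nov", "Dec"];

var datapoints = [];
cities.forEach(function(city) {
	months.forEach(function(month, i) {
		datapoints.push({ cityname: city, month: month, high: 50 + i });
	});
});

// One circle per data point: the counts already match.
var circlesNeeded = datapoints.length;

// One line per city: count the distinct city names.
var linesNeeded = new Set(datapoints.map(function(d) {
	return d.cityname;
})).size;

console.log(circlesNeeded); // 36
console.log(linesNeeded);   // 3
```

Thirty-six data points work for circles, but for lines we need an array with only three things in it.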

Usually this is called ‘grouping’ or ‘group by’ or something like that, but in d3 it’s called nesting. That’s because d3.nest can also do a lot of other things, but we’re going to ignore those for now.

In order to have d3.nest group your data, you need to give it two things:

  1. The value you want to group by (the KEY)
  2. Your data (the ENTRIES)

So in our case, we might do

var nested = d3.nest()
	.key(function(d) { return d.cityname; })
	.entries(datapoints)

It will eat up our 36 data points and group them by the ‘cityname’ key: if we have 3 different city names, we’ll wind up with 3 elements in nested.
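
If it helps to see what the grouping step does, here is a rough plain-JavaScript sketch that produces the same kind of result. `groupByKey` is a made-up helper for illustration: it is not part of d3 and it is not how d3.nest is implemented. It only mimics the shape you get back, an array with one object per group, each holding a `key` and its `values`.

```javascript
// Hypothetical helper, NOT part of d3: groups an array into
// [{ key: ..., values: [...] }, ...] in order of first appearance.
function groupByKey(data, keyFn) {
	var groups = [];
	var lookup = {}; // key -> group object, so we can find it again
	data.forEach(function(d) {
		var key = String(keyFn(d));
		if (!Object.prototype.hasOwnProperty.call(lookup, key)) {
			lookup[key] = { key: key, values: [] };
			groups.push(lookup[key]);
		}
		lookup[key].values.push(d);
	});
	return groups;
}

var sample = [
	{ cityname: "Dallas", high: 56 },
	{ cityname: "NYC", high: 54 },
	{ cityname: "Dallas", high: 67 },
	{ cityname: "San Diego", high: 54 }
];

var grouped = groupByKey(sample, function(d) { return d.cityname; });

console.log(grouped.length);           // 3
console.log(grouped[0].key);           // Dallas
console.log(grouped[0].values.length); // 2
```

Notice that the function passed in plays the same role as the one given to .key: it says which value each data point should be grouped by.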

The grouped data

Suppose our original data looked like this:

[
	{ cityname: "Dallas", high: 56, low: 34, month: "Jan" },
	{ cityname: "Dallas", high: 67, low: 34, month: "Mar" },
	{ cityname: "Dallas", high: 67, low: 34, month: "Sept" },
	{ cityname: "NYC", high: 54, low: 21, month: "Jan"},
	{ cityname: "NYC", high: 45, low: 21, month: "Mar"},
	{ cityname: "NYC", high: 45, low: 21, month: "Sept"},
	{ cityname: "San Diego", high: 54, low: 21, month: "Jan"},
	{ cityname: "San Diego", high: 45, low: 21, month: "Mar"},
	{ cityname: "San Diego", high: 45, low: 21, month: "Sept"}
]

The grouped data now looks like this:

[
	{
		key: "Dallas",
		values: [
			{ cityname: "Dallas", high: 56, low: 34, month: "Jan" },
			{ cityname: "Dallas", high: 67, low: 34, month: "Mar" },
			{ cityname: "Dallas", high: 67, low: 34, month: "Sept" }
		]
	},
	{
		key: "NYC",
		values: [
			{ cityname: "NYC", high: 54, low: 21, month: "Jan"},
			{ cityname: "NYC", high: 45, low: 21, month: "Mar"},
			{ cityname: "NYC", high: 45, low: 21, month: "Sept"}
		]
	},
	{
		key: "San Diego",
		values: [
			{ cityname: "San Diego", high: 54, low: 21, month: "Jan"},
			{ cityname: "San Diego", high: 45, low: 21, month: "Mar"},
			{ cityname: "San Diego", high: 45, low: 21, month: "Sept"}
		]
	}
]

Where we once had an array 9 elements long, now we have an array 3 elements long. Perfect for making into 3 lines!

You can find some good examples by looking at Mister Nester. He also uses .rollup and likes to output hashes/maps/dictionaries/objects instead of arrays, but you’ll get the idea.
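
For a rough idea of what "rolling up" into an object means, the sketch below groups in plain JavaScript and keeps one summary number per city instead of an array of groups. This only illustrates the idea: it is not Mister Nester's output format or the signature of d3's .rollup, and `highestByCity` is a made-up name.

```javascript
var datapoints = [
	{ cityname: "Dallas", high: 56, month: "Jan" },
	{ cityname: "Dallas", high: 67, month: "Mar" },
	{ cityname: "NYC", high: 54, month: "Jan" },
	{ cityname: "NYC", high: 45, month: "Mar" }
];

// Object keyed by city name, holding one summary value per group
// (here, the highest `high`) rather than the full list of points.
var highestByCity = {};
datapoints.forEach(function(d) {
	var current = highestByCity[d.cityname];
	if (current === undefined || d.high > current) {
		highestByCity[d.cityname] = d.high;
	}
});

console.log(highestByCity.Dallas); // 67
console.log(highestByCity.NYC);    // 54
```

For drawing lines we want the array-of-groups form from the section above, since d3 binds arrays to elements.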

Using the grouped data, part 1

Once you’ve grouped your data, you need to bind it and use it. This might look like:

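// Assumes svg is your selection of the <svg> element
// and line is a line generator you've already set up
svg.selectAll("path")
	.data(nested)
	.enter().append("path")
	.attr("d", function(d) {
		return line(d.values);
	})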

It’s kind of weird because d is now not just a data point, it’s one of the groups. So instead of d just having a high and low temperature, it has:

  1. The name of the group
  2. All of the data points in that group.

Take another look at “The grouped data” section above if you’d like to think a little harder about it.
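
To see why passing d.values to the line generator gives one path per group, here is a sketch with a hand-written stand-in for the generator. `simpleLine`, `xPosition`, and `yPosition` are hypothetical and only for illustration; in a real chart you would normally use a d3 line generator with proper scales.

```javascript
// Hypothetical pixel positions, standing in for real d3 scales.
var monthOrder = ["Jan", "Mar", "Sept"];
function xPosition(d) { return monthOrder.indexOf(d.month) * 100; }
function yPosition(d) { return 100 - d.high; }

// Stand-in for a line generator: takes an ARRAY of data points and
// returns one SVG path string ("M" = move to, "L" = line to).
function simpleLine(points) {
	return points.map(function(d, i) {
		return (i === 0 ? "M" : "L") + xPosition(d) + "," + yPosition(d);
	}).join("");
}

var nested = [
	{ key: "Dallas", values: [
		{ cityname: "Dallas", high: 56, month: "Jan" },
		{ cityname: "Dallas", high: 67, month: "Mar" },
		{ cityname: "Dallas", high: 67, month: "Sept" }
	] },
	{ key: "NYC", values: [
		{ cityname: "NYC", high: 54, month: "Jan" },
		{ cityname: "NYC", high: 45, month: "Mar" },
		{ cityname: "NYC", high: 45, month: "Sept" }
	] }
];

// d is a whole group here, so we hand its values to the generator.
var paths = nested.map(function(d) {
	return simpleLine(d.values);
});

console.log(paths.length); // 2
console.log(paths[0]);     // M0,44L100,33L200,33
```

Two groups in, two path strings out. Passing d itself instead of d.values would fail, because a group is an object, not an array of points.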

Using the grouped data, part 2

To access the name of the group, use d.key. To access the data points of that group, use d.values.

That means code you’d normally write like this:

svg.selectAll("circle")
	.data(datapoints)
	.enter().append("circle")

svg.append("text")
	.text("NYC")
	.attr("x", 30)
	.attr("y", 0)

Gets changed into something like this (inside a function where d is one of the groups):

svg.selectAll("circle")
	.data(d.values)
	.enter().append("circle")

svg.append("text")
	.text(d.key)
	.attr("x", 30)
	.attr("y", 0)

Here datapoints is replaced by d.values (the data points for that particular group), and hardcoded values like “NYC” are replaced by d.key (the name of the city).
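
The same substitution can be tried without a browser. The sketch below walks over each group and builds plain strings of SVG-like markup instead of calling d3. `describeGroup` and the attribute values are made up for illustration.

```javascript
var nested = [
	{ key: "Dallas", values: [{ high: 56 }, { high: 67 }] },
	{ key: "NYC", values: [{ high: 54 }] }
];

// Hypothetical helper: d is one group, not one data point.
function describeGroup(d) {
	// One circle per data point in THIS group (d.values)...
	var circles = d.values.map(function(point) {
		return '<circle r="3" cy="' + point.high + '"></circle>';
	});
	// ...and one label using the group's name (d.key).
	var label = '<text x="30" y="0">' + d.key + '</text>';
	return circles.concat([label]);
}

var output = nested.map(describeGroup);

console.log(output[0].length); // 3 (two circles plus one label)
console.log(output[1][1]);     // <text x="30" y="0">NYC</text>
```

Nothing in describeGroup mentions a specific city, so the same code works for every group.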

Two examples

Now that you have some background, try to understand the difference between these two examples - one with normal data, and one with grouped data.

TK