{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Altair is the best graphing software ever made, but right now there's an open bug that prevents me from really recommending it to you. It's incredible, though, and I think the LA Times uses it a lot." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making charts of categorical vs. categorical data\n", "\n", "Let's say we have some crimes that occur across different months." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>murder</th><th>theft</th><th>burglary</th><th>month</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>4</td><td>6</td><td>January</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>5</td><td>4</td><td>February</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>4</td><td>3</td><td>March</td></tr>\n",
"  </tbody>\n",
"</table>\n", "
" ], "text/plain": [ " murder theft burglary month\n", "0 1 4 6 January\n", "1 2 5 4 February\n", "2 3 4 3 March" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "# Your data might look like this\n", "df = pd.DataFrame({\n", " 'murder': [1, 2, 3],\n", " 'theft': [4, 5, 4],\n", " 'burglary': [6, 4, 3],\n", " 'month': ['January', 'February', 'March'],\n", "})\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most graphing software needs this to be long data, not wide data, so we'll melt it." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>month</th><th>crime_type</th><th>crime_count</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>January</td><td>murder</td><td>1</td></tr>\n",
"    <tr><th>1</th><td>February</td><td>murder</td><td>2</td></tr>\n",
"    <tr><th>2</th><td>March</td><td>murder</td><td>3</td></tr>\n",
"    <tr><th>3</th><td>January</td><td>theft</td><td>4</td></tr>\n",
"    <tr><th>4</th><td>February</td><td>theft</td><td>5</td></tr>\n",
"    <tr><th>5</th><td>March</td><td>theft</td><td>4</td></tr>\n",
"    <tr><th>6</th><td>January</td><td>burglary</td><td>6</td></tr>\n",
"    <tr><th>7</th><td>February</td><td>burglary</td><td>4</td></tr>\n",
"    <tr><th>8</th><td>March</td><td>burglary</td><td>3</td></tr>\n",
"  </tbody>\n",
"</table>\n", "
" ], "text/plain": [ " month crime_type crime_count\n", "0 January murder 1\n", "1 February murder 2\n", "2 March murder 3\n", "3 January theft 4\n", "4 February theft 5\n", "5 March theft 4\n", "6 January burglary 6\n", "7 February burglary 4\n", "8 March burglary 3" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "melted = df.melt(id_vars='month', var_name='crime_type', value_name='crime_count')\n", "melted" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can use altair to graph each point. Our X axis is going to be the month, and our Y axis is going to be the crime type. We'll use `crime_count` for the size of each circle.\n", "\n", "**You'll need to `pip install vega` and `pip install altair` before this**" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "application/javascript": [ "var spec = {\"config\": {\"view\": {\"width\": 400, \"height\": 300}, \"mark\": {\"tooltip\": null}}, \"data\": {\"name\": \"data-6c27e0442d5189f4c44a697a9dd4bac2\"}, \"mark\": \"circle\", \"encoding\": {\"color\": {\"type\": \"nominal\", \"field\": \"crime_type\"}, \"size\": {\"type\": \"quantitative\", \"field\": \"crime_count\"}, \"x\": {\"type\": \"nominal\", \"field\": \"month\"}, \"y\": {\"type\": \"nominal\", \"field\": \"crime_type\"}}, \"height\": 200, \"width\": 200, \"$schema\": \"https://vega.github.io/schema/vega-lite/v3.3.0.json\", \"datasets\": {\"data-6c27e0442d5189f4c44a697a9dd4bac2\": [{\"month\": \"January\", \"crime_type\": \"murder\", \"crime_count\": 1}, {\"month\": \"February\", \"crime_type\": \"murder\", \"crime_count\": 2}, {\"month\": \"March\", \"crime_type\": \"murder\", \"crime_count\": 3}, {\"month\": \"January\", \"crime_type\": \"theft\", \"crime_count\": 4}, {\"month\": \"February\", \"crime_type\": \"theft\", \"crime_count\": 5}, {\"month\": \"March\", \"crime_type\": \"theft\", \"crime_count\": 4}, {\"month\": \"January\", \"crime_type\": 
\"burglary\", \"crime_count\": 6}, {\"month\": \"February\", \"crime_type\": \"burglary\", \"crime_count\": 4}, {\"month\": \"March\", \"crime_type\": \"burglary\", \"crime_count\": 3}]}};\n", "var opt = {};\n", "var type = \"vega-lite\";\n", "var id = \"8ef15c1d-2b2f-4706-8478-1bdec98f16db\";\n", "\n", "var output_area = this;\n", "\n", "require([\"nbextensions/jupyter-vega/index\"], function(vega) {\n", " var target = document.createElement(\"div\");\n", " target.id = id;\n", " target.className = \"vega-embed\";\n", "\n", " var style = document.createElement(\"style\");\n", " style.textContent = [\n", " \".vega-embed .error p {\",\n", " \" color: firebrick;\",\n", " \" font-size: 14px;\",\n", " \"}\",\n", " ].join(\"\\\\n\");\n", "\n", " // element is a jQuery wrapped DOM element inside the output area\n", " // see http://ipython.readthedocs.io/en/stable/api/generated/\\\n", " // IPython.display.html#IPython.display.Javascript.__init__\n", " element[0].appendChild(target);\n", " element[0].appendChild(style);\n", "\n", " vega.render(\"#\" + id, spec, type, opt, output_area);\n", "}, function (err) {\n", " if (err.requireType !== \"scripterror\") {\n", " throw(err);\n", " }\n", "});\n" ], "text/plain": [ "" ] }, "metadata": { "jupyter-vega": "#8ef15c1d-2b2f-4706-8478-1bdec98f16db" }, "output_type": "display_data" }, { "data": { "text/plain": [] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "" }, "metadata": { "jupyter-vega": "#8ef15c1d-2b2f-4706-8478-1bdec98f16db" }, "output_type": "display_data" } ], "source": [ "import altair as alt\n", "alt.renderers.enable('notebook')\n", "\n", "# https://altair-viz.github.io/gallery/index.html\n", "chart = alt.Chart(melted, width=200, height=200).mark_circle().encode(\n", " x='month',\n", " y='crime_type',\n", " size='crime_count',\n", " color='crime_type'\n", ")\n", "chart" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can save it, too." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# https://altair-viz.github.io/user_guide/saving_charts.html\n", "# The file extension picks the output format. In this Altair version, .svg and .png\n", "# need extra dependencies (see the link above), while .html and .json do not.\n", "chart.save('chart.svg')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }