The Many-Faced Infographic: Brooklyn, Elephants, and the Visualization of Data
Let me tell you the story of an outlandish analogy—a story that involves a visualization of data, an elephant, and a group of blind men. It begins—of all places—in Brooklyn.
First, the visualization. The Brooklyn natives I know often complain that the media neglect their borough in favor of other parts of New York City, particularly Manhattan. I don't know if there's quantitative evidence to support this idea, but it's the main driving force behind BKLYNR.com, another one of those small, independent, and highly energetic online news publications that I praised in my previous article (1). Its founders, Thomas Rhiel and Raphael Pope-Sussman, have written that "it’s harder than it should be to find quality journalism about Brooklyn."
BKLYNR's latest visualization project, All the Stops (Figure 1), presents data on the NYPD's controversial "stop, question, and frisk" policy. Half a million people were stopped in 2012; most of them were African-American men. This interactive graphic is simple and visually powerful. Grey circles bounce around and cluster together depending on what you choose to see: the race of the suspects, their ages and sex, the reasons for the stops, etc.
So this is the visualization. Now, let me take a detour and refer to our elephant.
The Elephant and the Blind Men
The famous parable of the blind men and the elephant has many versions, but they follow the same basic structure: Several blind men are asked to describe the appearance of an elephant. They surround the animal, and each of them feels a different part of its anatomy.
Then, "the blind man who feels a leg says the elephant is like a pillar; the one who feels the tail says the elephant is like a rope; the one who feels the trunk says the elephant is like a tree branch; the one who feels the ear says the elephant is like a hand fan; the one who feels the belly says the elephant is like a wall; and the one who feels the tusk says the elephant is like a solid pipe." (2) Obviously, the men are not able to reach a consensus on what the shape of an elephant is.
Now, let's change the terms of this parable: Suppose you are someone blinded (or at least confused) by the complexity of a dataset—our elephant. You are presented with a visualization that encodes those data into a single graphic form—this would be the part of the elephant you feel; let's assume that it's the trunk.
If you are like me, someone interested in social equality and fairness, you wouldn’t be satisfied feeling that an elephant looks just like a soft, wobbly pipe—or, going back to BKLYNR's graphic, feeling that our dataset “looks” just like a bunch of bubbles. You'd want to make sure that you are devising a complete and cohesive mental image of the elephant. In other words, you'd wish to see the dataset represented in many different ways to be able to explore it in its entirety, all its angles and nuances, its averages and outliers alike.
OK, the analogy may be a bit fishy, I'll give you that, but I got your attention, didn't I?
The Multiple Representation Strategy
In The Functional Art I explained that one of the keys to designing effective information graphics is to accept that function constrains form. This means that, if your goal is to communicate well, the visual shape you make your data adopt is not primarily a matter of aesthetic preferences, but should depend on the questions readers may want to get answered, or on the tasks they may wish to complete.
For instance, if you can make an informed guess that most readers will try to compare and rank figures accurately, you shouldn't design a shaded map—or, to use the right term, a choropleth map (3). Design a bar graph instead, or a dot plot (4). But if you conjecture that readers will want to see geographic patterns, by all means choose the map, as a bar graph would be useless for this task.
But what if it's necessary to facilitate both tasks, or more? Given that you have enough space, and that you can sequence different graphs and maps inside a visualization, why wouldn't you do it? Why wouldn't you visually represent your data more than once? Those are key questions many visualization designers tend to ignore—consciously or not.
Back to BKLYNR's All the Stops visualization. What can you really do with it? What do you see? In my case, I could make some intriguing but imprecise comparisons. For instance, go to "The suspected," then to "What was the suspect's race," and try to tell how much bigger the "Black" bubble is than the "White Hispanic" one. You can't. You can see that it is indeed bigger, and that's all.
Why, then, don't we give readers the option to switch back and forth between the bubble chart, which is a nice if somewhat shallow overview of the data, and a dot plot or a bar graph? That way, we would preserve the excitement provoked by the unusual initial graphic form and add the precision of a graph in which readers can really compare and rank, because all items sit on a common zero baseline. Figure 2 is a rough sketch of what I mean.
Figure 2: Offering alternative visual representations of the same data. The bubble chart on the left is the original one. The bar graph on the right is the proposed alternative, which can be accessed through the buttons of the upper menu.
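To make the precision argument concrete, here is a minimal Python sketch. It assumes that bubbles are sized so that their area is proportional to the value, which is the usual convention but not something I have verified for All the Stops. The counts and the names bubble_radius and bar_length are hypothetical, invented only for illustration.

```python
import math

# Hypothetical counts, invented for illustration; these are NOT the NYPD figures.
stops = {"Group A": 280_000, "Group B": 140_000, "Group C": 50_000}


def bubble_radius(value, scale=1.0):
    """Radius of a circle whose AREA is proportional to the value."""
    return scale * math.sqrt(value / math.pi)


def bar_length(value, scale=1.0):
    """Length of a bar drawn from a common zero baseline."""
    return scale * value


largest = max(stops.values())
for group, count in sorted(stops.items(), key=lambda kv: kv[1], reverse=True):
    value_ratio = count / largest
    width_ratio = bubble_radius(count) / bubble_radius(largest)
    length_ratio = bar_length(count) / bar_length(largest)
    bar = "#" * round(40 * length_ratio)  # crude text bar, 40 characters at most
    print(f"{group:8s} value {value_ratio:4.2f}  "
          f"bubble width {width_ratio:4.2f}  bar {length_ratio:4.2f}  {bar}")
```

With these made-up numbers, a group with half as many stops gets a bubble that is still about 71 percent as wide as the largest one, so the reader has to judge areas to recover the difference. The bar is exactly half as long, and the comparison becomes a matter of reading lengths along a shared baseline.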
Moreover, what if we also represented these same data on a map of the NYC area, to see whether people of different races are more likely to be stopped in certain neighborhoods? The possibilities are not endless, but there are certainly more of them than what we currently see in BKLYNR's interactive graphic.
(By the way, it's possible to argue that you cannot show these numbers without also including standardized variables, such as rates. If you really want to make the case that minorities are disproportionately targeted in police stop-and-frisk searches, you need to compare the stops somehow with NYC's racial makeup. And then perhaps even think about bringing other variables to the table, such as poverty rates, average educational attainment, etc. This would also contribute to providing a more accurate portrait of our “elephant.”)
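A minimal Python sketch of what standardizing could look like follows. Every number in it is hypothetical and invented for illustration, not real NYPD or census data, and the function names are my own. It computes stops per 1,000 residents and a simple disparity ratio, meaning each group's share of stops divided by its share of the population.

```python
# Hypothetical figures, invented for illustration only; NOT real NYPD or census data.
stop_counts = {"Group A": 50_000, "Group B": 30_000, "Group C": 20_000}
residents = {"Group A": 2_000_000, "Group B": 2_500_000, "Group C": 3_500_000}


def stops_per_1000(stops, population):
    """Stops per 1,000 residents of each group."""
    return {g: 1000 * stops[g] / population[g] for g in stops}


def shares(counts):
    """Each group's fraction of the total."""
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}


def disparity_ratio(stops, population):
    """Share of stops divided by share of population; 1.0 means proportional."""
    stop_share, pop_share = shares(stops), shares(population)
    return {g: stop_share[g] / pop_share[g] for g in stops}


rates = stops_per_1000(stop_counts, residents)
ratios = disparity_ratio(stop_counts, residents)
for group in stop_counts:
    print(f"{group}: {rates[group]:.1f} stops per 1,000 residents, "
          f"disparity ratio {ratios[group]:.2f}")
```

In this invented example, Group A accounts for half of the stops but a quarter of the residents, so its disparity ratio is 2.0. Raw counts alone would not reveal that.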
Overview First, Zoom and Filter, Then Details on Demand
The strategy I'm proposing here is not original. Ben Shneiderman, a professor of computer science at the University of Maryland who is an expert in visualization, wrote a classic paper in 1996 in which he outlined a “Visual Information-Seeking Mantra,” which could be applied to most exploratory representations of data: “Overview first, zoom and filter, then details on demand.” (5)
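As a rough illustration, the three steps of the mantra can be sketched as three small operations on a dataset. This Python sketch is my own simplification, not something taken from Shneiderman's paper; the records, field names, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical records, invented for illustration only.
records = [
    {"id": 1, "borough": "Brooklyn", "age_group": "18-24"},
    {"id": 2, "borough": "Brooklyn", "age_group": "25-34"},
    {"id": 3, "borough": "Manhattan", "age_group": "18-24"},
    {"id": 4, "borough": "Brooklyn", "age_group": "18-24"},
    {"id": 5, "borough": "Queens", "age_group": "35-44"},
]


def overview(rows, key):
    """Overview first: aggregate counts for one variable."""
    return Counter(row[key] for row in rows)


def zoom_and_filter(rows, **criteria):
    """Zoom and filter: keep only the rows matching every criterion."""
    return [row for row in rows
            if all(row[k] == v for k, v in criteria.items())]


def details(rows, record_id):
    """Details on demand: one full record, or None if it does not exist."""
    return next((row for row in rows if row["id"] == record_id), None)


print(overview(records, "borough"))
subset = zoom_and_filter(records, borough="Brooklyn", age_group="18-24")
print([row["id"] for row in subset])
print(details(records, 3))
```

In an interactive graphic, each of these operations would be wired to a view or a control: the overview to the initial chart, the filter to buttons or sliders, and the details to a tooltip or a click.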
Shneiderman's mantra is another way to explain that visually presenting information always involves layering, sequencing, or a multiplicity of representations, which can be controlled by the reader either by manipulating buttons and sliders or, in the case of static graphics, by directing her eyes to different parts of the visualization. Let me give you a couple of examples.
Take 50 Years of Change, a visualization made by Erin Hamilton, Rashauna Mead, and Vanessa Knoppke-Wetzel, cartography graduate students at the University of Wisconsin-Madison (Figure 3). It documents “laws and regulations that both restrict and promote” the rights of gays in the U.S. Notice that the map is not the only way data are represented. The map is just the entry point to the visualization, a useful means to see regional patterns, and also bait that attracts readers' attention; maps are beautiful, after all, much more than abstract graphs.
Underneath the map, you'll see a bar graph (should I dare to call it a “stacked-squares bar graph”?) and a large timeline that extends all the way back to the 1960s. Each one of these graphic forms, as you probably understand at this point, lets you see something different in the data: regional patterns (the map), number of law changes (the bar graph), etc.
Figure 3: Fifty Years of Change, a look at LGBT civil rights over the last half-century of state legislation changes.
Another example, which aesthetically is much more creative, is Exoplanets Discoveries (Figure 4), by Jan Willem Tulp (6), published by Scientific American magazine. In this one, readers can arrange and sort the planets in several ways, and also filter the different kinds in or out. Sure, I miss the opportunity to zoom in and out to see things more clearly, but this visualization was likely produced under a tight deadline.
Figure 4: Exoplanets Discoveries, a visualization by Jan Willem Tulp. Reproduced with permission. Copyright ©2012 Scientific American, a division of Nature America, Inc. All rights reserved.
Be Creative, But Be Efficient
Here's the main takeaway of this article: When creating interactive visualizations for general readers, there's nothing wrong with trying to be a bit innovative and experimenting with unusual graphic forms if they are reasonably efficient and if you wish to bring attention to your graphic. But then, whenever it's possible—and it usually is—offer readers the opportunity to visualize the data in multiple and perhaps more traditional ways. One of the keys to designing compelling visualizations is to understand that there's more to an elephant than just its beautiful ivory tusks.
NOTES:
(1) The Confederacy of Truth-Tellers http://www.peachpit.com/articles/article.aspx?p=2126864
(2) This quotation comes from the excellent Wikipedia article about the parable: http://en.wikipedia.org/wiki/Blind_men_and_an_elephant
(3) About choropleth maps: http://my.ilstu.edu/~jrcarter/Geo204/Choro/Tom/
(4) Read about dot plots here: http://trellischarts.com/what-is-a-dot-plot
(5) Shneiderman's classic article is available online: http://www.cs.umd.edu/~ben/papers/Shneiderman1996eyes.pdf
(6) Jan Willem Tulp's portfolio can be found at http://www.tulpinteractive.com