Category Archives: Design

Infographics and Data Visualization (Week 4)

The assignment for Week 4 is based on data used in a recent Guardian article on U.S. unemployment. Having used Bureau of Labor Statistics (BLS) data for many years at my previous job, I am far more familiar with this topic than I was with the data we used for last week’s assignment. In fact, I have already written several blog posts dealing with general employment statistics, so it will be challenging to come up with something fresh.

The Guardian article includes an interactive map that highlights the lower 48 states (Hawaii and Alaska are off screen) and allows the user to select one of eight different employment metrics. A five-color scale defines the range of each metric while clicking on an individual state brings up a bar chart displaying a few data points and some additional text.

One problem I have with this map is that I think the states are too large to tell a detailed story about how unemployment affects different areas of the country. Maps at the county level (like this one from the BLS or this gorgeous D3 example posted on GitHub) show far more interesting regional employment patterns and help create a more compelling story. (Alberto Cairo talks about the importance of enumeration unit size in this week’s reading assignment.)

Another criticism is that the map only uses a fraction of the employment/unemployment information available from the BLS. This data is relatively easy to download, so there’s no real reason not to include a richer dataset in the graphic. Additional data would allow more detailed monthly trends and more meaningful comparisons to the national rate and/or the rates of other states.

Finally, I think the color scheme used on this map is hard to interpret. The color categories are not easily distinguished from one another and they don’t relate to any natural scale that the user could use to detect patterns. Creating more categories might also help with interpretation of the data.

The range and structure of the data suggest that there is a good story to be found looking at unemployment before, during and after Obama’s first term. There were certainly some unusual statistics associated with the 2007-2009 recession (as defined by the National Bureau of Economic Research). It was the worst period of economic performance in the U.S. since the Great Depression and the pace of the recovery is one of the slowest on record.

In fact, until President Obama was re-elected a few weeks ago, no sitting president since World War II had been returned to office with an unemployment rate above 7.2%. This metric was such a sacred cow that conservative pundits accused the BLS of bias when data more favorable to the President was released in the run-up to the election. So, how did Obama earn a second term fighting these headwinds?

My first set of charts presents an overview of unemployment in the U.S. over the past twelve years. I wanted to show both the long-term trend in unemployment and a side-by-side comparison of the three most recent presidential terms. On the first chart, I’ve shaded each of the past two recessions to show their effect on the unemployment rate.

The first thing I noticed by looking at these charts is that, over the past twelve years, the U.S. unemployment rate has never been lower than it was during George W. Bush’s first month in office. The rate got pretty close to that mark in the final months of Bush’s second term but it never quite made it. The second thing I noticed was that the drop in unemployment during the months following the Great Recession was slightly faster than it was during the recovery period following the 2001 recession.

My second chart shows the unemployment rate for each state over the course of Obama’s first term. It also includes a ranking of states by total unemployment and colors each chart using the results of the 2012 election.

A Thanksgiving Meal Preparation Timeline

The art of timing the preparation of Thanksgiving dishes takes years of experience and perhaps more than a few hard lessons in the kitchen (ever have anyone de-bone a turkey?). For those less experienced chefs, I’ve always felt that a good infographic might help organize the work so that all the dishes are ready at the proper time.

I didn’t have the time to document my own family’s meal this year but I noticed that L.V. Anderson over at Slate wrote a great piece on her attempt to organize a full dinner. She sums up the issues nicely:

Cooking a Thanksgiving meal is a somewhat masochistic enterprise. It’s rewarding, for sure, and fun if you like cooking. But perfectly coordinating the timing of several dishes—nearly all of which taste best hot, many of which require oven time, and some of which begin deteriorating in quality shortly after you finish cooking them—is, well, impossible.

I’ve taken her instructions and organized them into a timeline with a target mealtime of 3:00 PM. Each box in the chart represents a 15-minute interval and clicking on it describes the task and provides a link to the recipe. Here it is … posted just under the wire:

It still needs some work so I’ll be making a few changes over the weekend. Meanwhile, Happy Thanksgiving!

Infographics and Data Visualization (Week 3)

The goal of this week’s assignment is to review some global aid data from the Guardian and evaluate how this information should be presented. This is a two-part assignment and I have been able to download the data and let my thoughts percolate over the past few days. The focus is on the aid transparency index, which uses a broad set of criteria to rank major aid donors on their openness.

I’ll have to admit that my first reaction after looking at the data a bit was a muted “so what?” A simple rank of the aid organizations shows some of the usual good samaritans at the top and an apparent decline in transparency that roughly corresponds to a drop in GDP per capita (or possibly happiness or density of heavy metal bands).

Part of my lukewarm response stems from the fact that I don’t really know what the consequences of transparency (or a lack of transparency) are. Is there a concern about influence? Bribery? Funding of criminal or terrorist organizations? The U.S. aid organizations are kind of in the middle of the pack, which I suppose is not ideal. However, the U.S. list includes the Department of Defense, which I wouldn’t necessarily expect to be that open given the particular nature of its mission.

Other questions that come to mind include:

  • What criteria are used to pick the organizations in this list? Who’s missing?
  • Do other military organizations make the list?
  • How is aid defined?
  • Why are some countries’ scores aggregated while others are listed separately by organization?

Some of these answers can be found in the primary report, which suggests that the goal of aid transparency is to allow for effective policy planning and decision-making. The report states:

For aid to be more effective it needs to be more predictable, coordinated between donors, managed for results, and aligned to recipient countries’ own plans and systems. To achieve this, the information has to be shared between all parties involved in the delivery of aid in a timely, comprehensive and comparable way. Without this information it is not possible to know what is being spent where, by whom and with what results.

This makes sense … but I don’t know if I would normally associate this goal with “transparency.” To me, transparency has more to do with promoting accountability and providing information to citizens about what their Government is doing. The aid Index seems to be more about project coordination, efficiency and data governance. (Later on in the report, the text does mention that citizens will want to know where their money is going … more of a traditional goal of transparency.)

One of the major tools in the push for transparency is the development of a common standard for publishing aid information through the International Aid Transparency Initiative (IATI). The IATI registry has improved the quality and transparency of aid information, particularly for organizations that have either automated their publication or have already begun to address gaps and inconsistencies.

So, is there a story in the development and adoption of this standard? The report itself suggests that the purpose of the Index is in flux and asks whether a simpler methodology could still achieve the goal of providing effective, efficient and accountable aid information.

As I thought about this chart, I decided that any overview should show both the total transparency score and some measure of improvement from the previous year (there is both a 2011 and 2012 score). I decided on a scatterplot with the total score on the horizontal axis and the change in score (a ratio or percent) from 2011 to 2012 on the vertical axis. Along the right side I also thought I’d include a regular bar chart sorted by score.
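The coordinates for a scatterplot like this could be derived as in the minimal sketch below. It assumes each donor has one score per year and treats the change as a percent of the 2011 score; the `DonorScore` type, the donor names and the numbers are hypothetical placeholders, not values from the Index.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class DonorScore:
    name: str          # hypothetical donor label
    score_2011: float  # transparency score for 2011
    score_2012: float  # transparency score for 2012


def scatter_point(d: DonorScore) -> Tuple[float, float]:
    """Return (x, y): x is the 2012 score, y is the percent change from 2011."""
    if d.score_2011 == 0:
        raise ValueError(f"{d.name}: percent change is undefined for a 2011 score of 0")
    pct_change = (d.score_2012 - d.score_2011) / d.score_2011 * 100.0
    return d.score_2012, pct_change


# Placeholder values for illustration only; these are not figures from the Index.
donors = [DonorScore("Donor A", 40.0, 50.0), DonorScore("Donor B", 80.0, 76.0)]

# One scatterplot point per donor.
points = {d.name: scatter_point(d) for d in donors}

# The companion bar chart is simply the same donors sorted by 2012 score.
ranking = sorted(donors, key=lambda d: d.score_2012, reverse=True)
```

A donor with a low score but a large positive change lands in the upper left of the plot, which is exactly the group a plain ranking would hide.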

A static sketch of this first chart:

I like the way the scatterplot emphasizes both the overall score and the year-over-year improvement. This shows organizations that have made progress toward the ultimate goal of transparency but may not have reached the heights of a group like the World Bank. The bar chart on the right shows the standard ranking.

From this chart, the user should be able to navigate to details for each agency. I’d like to see comparisons of each sub-level (agency, organization, country) as well as the individual survey questions. There’s a pretty interesting chart toward the end of the report that shows the responses to all questions for all agencies as colored dots. It is intriguing and might offer some direction to these detailed charts. Otherwise it may be worth exploring standard charts.

 

 

Infographics and Data Visualization (Week 2)

For week two of the course, we’ve been asked to take a look at this interactive graphic from the New York Times, which compares the different words that Democratic and Republican speakers used during their respective conventions.

Overall, I thought that the graphic was pretty good but there were a few things that I might consider redesigning. The first problem I noticed was that, when you click on the word bubbles, the political quotes below the chart change based on your selection. Unfortunately, most of this interactivity occurs “below the fold” or off-screen so you don’t necessarily see it right away. I would need to be presented with more cues to know that this was going on. It seems like tightening up the top part of the chart and shrinking some of the ad space or menu heights might help here.

It also took me a while to figure out that you could type in your own words and add them to the graphic. This feature is pretty cool but I don’t think it is necessarily obvious to first-time visitors. I liked how the new word bubbles kind of migrated around to find a spot in the crowd but they sometimes got stuck in the middle of the pack if the words around them were too big.

The bubble sizes are difficult to interpret directly but I don’t think that is necessary for this graphic. I do have a problem with the way the bubbles indicate the % of word usage by political party. I would expect either a pie chart with the % in a slice or maybe a color difference along a spectrum (blue to red).

My first redesign attempt:

Although this “sketch” is not interactive, you can kind of see where I was headed. The first issue I tackled was trying to make it more obvious that the individual words or phrases could be shown in context. I did this by moving the quotes up from the bottom and placing them in cartoon speech bubbles along the sides of the graphic. The directional arrow for each speech bubble points to the word being examined and also indicates a slider that can be moved up and down from word to word. The speech bubbles could expand to include multiple quotes or maybe there could be some other form of gallery navigation within the bubble itself.

The individual words are displayed in a standard bar chart that clearly shows the word itself but doesn’t play with the font size at all. I let all comparisons between the words be shown using the red and blue bars, with relative usage rates represented by the length of the bars. This allows direct comparison of usage rates between the two parties as well as relative comparison between words.

I imagined that typing a word or phrase in the box would add that word or phrase to the top of the “stack” of bar charts, moving the rest of the words down one slot. This way the user could add as many words as they want and scroll down the length of the chart to look at their entire list and make comparisons.

Despite these adjustments, it’s still hard to see how the average user would pull a compelling narrative out of this presentation without some assistance. To me, the story of this graphic is about the language that the different parties use to craft their messages. The use of certain words over others reflects each party’s priorities and their understanding of the intended audience.

Since we know word choice is designed to influence the audience in some way, it might be interesting to include examples of how the two parties have used language in the past. On the Republican side, Newt Gingrich’s 1994 memo to the GOPAC titled “Language: A Key Mechanism of Control” is a famous example. It contains a list of “optimistic positive governing words” that Gingrich recommended for use in describing Republican politicians and “contrasting words” that he suggested using to describe Democrats.

On the other side of the aisle, people like George Lakoff and Elisabeth Wehling at The Little Blue Blog use concepts like “frames” to describe how the use of particular words triggers associations with either conservative or progressive moral systems. (Another interesting look at the use of language in politics can be found at Sasha Issenberg’s Victory Lab site.)

Either of these resources might be a good starting point for an analysis of word usage by politicians. In fact, one member of the class posted a quick graphic using Gingrich’s positive words here and I found it fascinating that the top three positive words used by Democrats (fair, building and reform) demonstrated a far different focus than those used by Republicans (liberty, freedom and lead).

Modifying the NYT graphic to accommodate these investigations might involve the addition of “starter lists” of words such as the top 10 words for each party by word count, top 10 words by uniqueness to each party, or Gingrich’s positive word list. I also like the idea of a word association feature which could suggest related topics via a word cloud or a “you might also try this word” feature.
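The first two starter lists could be built along the lines of this minimal sketch. It assumes per-party word counts are already available as `Counter` objects, and it defines “uniqueness” as the share of a word’s total mentions that come from one party, which is only one possible reading. The function names and the counts are hypothetical, not data from the NYT graphic.

```python
from collections import Counter
from typing import List


def top_by_count(counts: Counter, n: int = 10) -> List[str]:
    """Words with the highest raw counts for one party."""
    return [word for word, _ in counts.most_common(n)]


def top_by_uniqueness(own: Counter, other: Counter, n: int = 10) -> List[str]:
    """Words ranked by the share of all mentions that come from `own`."""
    def share(word: str) -> float:
        return own[word] / (own[word] + other[word])

    candidates = [w for w in own if own[w] > 0]
    # Ties on share are broken by higher count, then alphabetically.
    return sorted(candidates, key=lambda w: (-share(w), -own[w], w))[:n]


# Placeholder counts for illustration only; not actual convention data.
dem = Counter({"jobs": 8, "fair": 5, "freedom": 1})
rep = Counter({"jobs": 8, "freedom": 6, "liberty": 4})

dem_top = top_by_count(dem, 2)
dem_unique = top_by_uniqueness(dem, rep, 2)
rep_unique = top_by_uniqueness(rep, dem, 2)
```

A word both parties use equally often (like “jobs” in the placeholder data) ranks high by count but low by uniqueness, which is why the two lists are worth offering separately.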

 

Trends in NFL Football Scores (Part 1)

One of the goals I set for myself this summer was to learn a bit about D3, a visualization toolkit that can be used to manipulate and display data on the web. Considering that the trees are bare and we’ve already had our first frost here in Wisconsin, you can safely assume that I am behind schedule. Nevertheless, I feel that I’ve finally reached a point where I have something to publish, so here goes.

First of all, a little background. D3 is a JavaScript library that allows you to bind data to any of the elements (text, lines and shapes) you might normally find on a web page. These objects can be styled using CSS and animated using simple dynamic functions. These features make D3 a perfect tool for creating interactive charts and graphs without having to depend on third-party programs like Google Charts, Many Eyes or Tableau.

I wanted to start out with something simple so I elected to go with a basic line chart using data I pulled from Pro-Football-Reference.com. This site contains a ton of great information and statistics from the past 90+ years of the National Football League but, for now, I just looked at the final scores of all the games played from 1920 to 2011. My first D3-powered chart is below. It shows the average combined scores of winning and losing teams for each year of the NFL’s existence.
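The yearly averages behind a chart like this could be prepared before handing the data to D3. The sketch below is in Python rather than JavaScript and assumes each game is reduced to a (season, winner points, loser points) record; the `yearly_averages` helper and the sample rows are hypothetical placeholders, not actual results.

```python
from collections import defaultdict
from statistics import mean

# Each game is (season, winner_points, loser_points).
# Placeholder rows for illustration only; not real game results.
games = [(1920, 14, 0), (1920, 10, 6), (1921, 21, 7)]


def yearly_averages(games):
    """Return {season: (average winning score, average losing score)}."""
    by_year = defaultdict(lambda: ([], []))
    for season, win_pts, lose_pts in games:
        by_year[season][0].append(win_pts)
        by_year[season][1].append(lose_pts)
    return {
        season: (mean(wins), mean(losses))
        for season, (wins, losses) in sorted(by_year.items())
    }


averages = yearly_averages(games)
```

Each entry supplies one point on the winners’ line and one on the losers’ line for that season.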

Although this chart looks pretty simple, every element (including titles, subtitles, axes, labels, grids and data lines) has been created manually using the D3 code. The payoff is pretty nice. All of the elements can be reused and you have tremendous control over what is shown onscreen. To demonstrate some of these capabilities, I’ve added interactive overlays that show a few of the major eras in NFL football (derived from the work of David Neft and this discussion thread). If you move your mouse over the graph, you will see these different eras highlighted:

Early NFL (1920-1933) – The formation of the American Professional Football Association (APFA) in 1920 marked the official start of what was to become the National Football League. This era was marked by rapid formation (and dissolution) of small town franchises, vast differences in team capabilities and a focus on a relatively low-scoring running game. At this time, the pass was considered more of an emergency option than a reliable standard. The rapid growth in popularity of the NFL during this era culminated with the introduction of a championship game in 1932.

Introduction of the Forward Pass (1933-1945) – The NFL discontinued the use of collegiate football rules in 1933 and began to develop its own set of rules designed around a faster-paced, higher-scoring style of play. These innovations included the legalization of the forward pass from anywhere behind the line of scrimmage, a change that is often called the “Bronko Nagurski Rule” after his controversial touchdown in the 1932 NFL Playoff Game.

Post-War Era (1945-1959) – The end of WWII saw the expansion of the NFL beyond its East Coast and Midwestern roots with the move of the Cleveland Rams to Los Angeles — the first big-league sports franchise on the West Coast. This period also saw the end of racial segregation (enacted in the 30s) and the start of nationally televised games.

Introduction of the AFL (1959-1966) – Professional football’s surge in popularity led to the formation of a rival organization — the American Football League — in 1960. The growth of the flashy AFL was balanced by a more conservative style of play in the NFL. This style was epitomized by coach Vince Lombardi and the Green Bay Packers, who would win five championships in the 1960s. In 1966, the two leagues agreed to merge as of the 1970 season.

Dead Ball Era (1966-1977) – Driven in part by stringent restrictions on the offensive line, this period is marked by low scores and tough defensive play. Teams that thrived in this environment include some of the most famous defenses in modern NFL history: Pittsburgh’s Steel Curtain, Dallas’ Doomsday Defense, Minnesota’s Purple People Eaters and the Rams’ Fearsome Foursome.

Live Ball Era (1978-present) – Frustrated by the decreasing ability of offenses to score points in the 70s, the NFL began to add rules and make other changes to the structure of the game in an attempt to boost scoring. The most famous of these initiatives was the so-called “Mel Blount Rule” (introduced in 1978), which severely restricted the defense’s ability to interfere with passing routes. With the subsequent introduction of the West Coast Offense in 1979 (an offense based on precise, short passes), this period became marked by a major focus on the passing game.

Having created this first chart, I decided to build a second chart based on the ratio of average winning scores to average losing scores to see if there were any patterns.

The chart above shows how, after a period of incredibly lopsided victories, the average scoring differential settled into a very steady pattern by the late 1940s and stayed at that level (roughly 2:1) for the next 30 years. Despite many changes in rules, coaching techniques, technology and other factors, only the pass interference rules of the late 1970s seemed to have any significant effect on this ratio, shifting it to just under 1.8:1 for the next 30 years.
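For reference, a ratio like the ones quoted above could be computed as in this minimal sketch. It assumes the yearly average winning and losing scores are already known; the function names and the input values are hypothetical and chosen only to show the arithmetic.

```python
def score_ratio(avg_winner: float, avg_loser: float) -> float:
    """Ratio of the average winning score to the average losing score."""
    if avg_loser <= 0:
        raise ValueError("average losing score must be positive to form a ratio")
    return avg_winner / avg_loser


def as_ratio_label(ratio: float, digits: int = 1) -> str:
    """Format a ratio the way it is usually written, e.g. '2.0:1'."""
    return f"{ratio:.{digits}f}:1"


# Placeholder averages for illustration only; not values from the data set.
label_a = as_ratio_label(score_ratio(20.0, 10.0))
label_b = as_ratio_label(score_ratio(18.0, 10.0))
```

The guard against a non-positive losing average matters for very small samples, where every losing team in a season could have been shut out.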

While I had the data available, I also decided to look at the differences in average scores between home teams and away teams. The chart below plots this data along with the same overlay I used in the first chart.

A look at the ratio of average home team scores to average away team scores follows:

What’s fascinating about this chart is how quickly a form of parity was achieved among all the NFL teams. By the mid-30s, a measurable home field advantage can be seen at roughly 15%, a rate that has remained essentially constant for over 70 years. Factors for this boost could include the psychological support of fans, familiar weather conditions, unique features of local facilities, lack of travel fatigue, referee bias and/or increased levels of motivation in hometown players.

Thanks to Charles Martin Reid for his solution to getting D3 and WordPress to play nice.

Infographics and Data Visualization (Week 1)

The Introduction to Infographics and Data Visualization course begins Sunday so I’m starting to receive emails from the instructor. The first thing I need to do is tackle the reading list and then take a look at the first assignment, which involves reviewing this graphic, based on a survey of 32,000 Internet users from 16 different countries. The survey asked these users about the kind of online services they used on a regular basis.

The online class discussion was pretty good and very thorough. My own thoughts began with the graphic “building block” that the designer used to organize and convey information. This consisted of a nested group of overlapping doughnut charts that used color, size and fractional divisions to represent the data for each country (see below).

I think that the arcs of the doughnut are meant to be interpreted in two dimensions: 1) the sweep of the arc represents the % of the category population that is engaged in the activity (similar to a regular pie chart) and 2) the radius from the center represents the overall size of the category population (similar to a regular bubble chart). Both pie charts and bubble charts can work in certain circumstances but they make direct comparisons difficult. Throw in the fact that the arcs overlap and it is almost impossible to understand the meaning associated with different variables. For example, the predominant color in graphs for countries like the U.S. or Canada is pink, which downplays the larger population of social profile users.

My first instinct for adjusting this infographic was to “unpack” the doughnut chart and place the data in a regular bar chart. By using standard bars, it is fairly easy to make comparisons between the different categories. The bar chart also shows percentages naturally if I include a gray bar that represents the total population of internet users. (The value of the gray bar is an assumption on my part, calculated by dividing the user value by the access percentage. This works for almost every country except the U.S. and the U.K.)
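The gray-bar estimate can be written out as a small calculation. This sketch assumes the “user value” is the number of people in a category and the “access percentage” is the share of all Internet users that the category represents; the function name and the numbers are hypothetical.

```python
def estimated_internet_population(category_users: float, access_percent: float) -> float:
    """Estimate total Internet users from one category's size and its percent share."""
    if not 0 < access_percent <= 100:
        raise ValueError("access_percent must be greater than 0 and at most 100")
    return category_users / (access_percent / 100.0)


# Placeholder values for illustration only: 30 (million) users at a 50% share.
total = estimated_internet_population(30.0, 50.0)
```

If the same country gives noticeably different totals depending on which category is used, the inputs probably do not share a common base, which may be what happens with the U.S. and U.K. figures.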

The real power of this approach comes with side-by-side comparisons of the data. After swapping the axes and adding in the other countries, the resulting chart allows for relatively easy comparison of both overall Internet usage and individual social media involvement. Both the U.S. and U.K. totals are fudged.

One problem I have with this chart is the huge amount of white space in the upper right quadrant. This is caused by the great disparity in size between the Internet populations of the largest and smallest countries. Adjustments like the use of logarithmic scales or scatterplots might be able to fill out the canvas a bit but they also make direct comparisons more difficult. I’m also not too sure about the color scheme, which I find somewhat distracting.
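To illustrate the trade-off with a logarithmic scale, here is a minimal sketch that maps a value to a position on a 100-unit axis, once linearly and once on a base-10 log scale. The function names, the axis length and the sample values are hypothetical.

```python
import math


def linear_position(value: float, max_value: float, axis_length: float = 100.0) -> float:
    """Position of `value` on a linear axis running from 0 to `max_value`."""
    return value / max_value * axis_length


def log_position(value: float, min_value: float, max_value: float,
                 axis_length: float = 100.0) -> float:
    """Position of `value` on a log10 axis running from `min_value` to `max_value`."""
    if min(value, min_value, max_value) <= 0:
        raise ValueError("a log scale requires strictly positive values")
    if min_value >= max_value:
        raise ValueError("min_value must be smaller than max_value")
    span = math.log10(max_value) - math.log10(min_value)
    return (math.log10(value) - math.log10(min_value)) / span * axis_length


# Placeholder values: a small country (10) against a large one (1000).
small_linear = linear_position(10.0, 1000.0)
small_log = log_position(10.0, 1.0, 1000.0)
```

On the linear axis the small value is barely visible, while on the log axis it takes up about a third of the length. The cost is that bar lengths no longer compare proportionally.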

Tackling both of these issues at once, I’ve removed the separate colors for the social media categories and added in an overlay that uses a radar chart to show the relative differences between social media usage within countries.

The radar charts are kind of fun and they make it pretty easy to see different patterns of Internet usage among the 16 countries. The higher social profile participation (and lower blog usage) of Western countries creates a distinctive shape when compared to Asian countries like Japan and South Korea. The two-color scheme also makes it easier to see patterns in the column charts. However, I’m not sure that depending on the order of the columns is enough to compare social categories across countries.

I’m going to let my solution stand for now. Meanwhile, here are some other solutions from the class and around the web:

 

 

Infographics and Data Visualization (Sign Up)

Despite a crazy schedule, I’ve decided to sign up for a free online course offered by the Knight Center for Journalism called Introduction to Infographics and Data Visualization. It runs from October 28 – December 8 and will be taught by Alberto Cairo, the author of The Functional Art: an Introduction to Information Graphics and Visualization, published by PeachPit Press. I will be sure to post my completed assignments here. It should be fun!