Category Archives: Infographics

What ‘The Office’ Gets Wrong About the Office

I start a new job next week and so I’ve been working on documenting all of my old tasks and projects in preparation for the transition. As I was going through old e-mails, I came across the introductory note my manager sent out to the department on my first day back in June 2004. Comparing it to the departure e-mail from my current manager, it’s amazing to see the changes in personnel over a seven-year span.

I prepared this chart using the distribution list from both e-mails, a drawing program, and a site that creates proportional Venn diagrams. Only eight people are listed twice, including me and a person who left the company and has since returned. Some of the people who are only listed once have more tenure than me; they just may have gone to/come from another department. Still, it represents an interesting fact about the modern office. Change is constant.
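The three counts behind a proportional Venn diagram can be pulled from the two distribution lists with plain set operations. This is only a sketch: the function name `roster_overlap` and the name-normalization step are assumptions, not the process actually used for the chart.

```python
def roster_overlap(first_list, last_list):
    """Count who appears on only the first list, on both, or on only the last."""
    # Normalize names so "Bob " and "bob" count as the same person (an assumption).
    first = {name.strip().lower() for name in first_list}
    last = {name.strip().lower() for name in last_list}
    return {
        "only_first": len(first - last),
        "both": len(first & last),
        "only_last": len(last - first),
    }


# Made-up names, purely for illustration.
counts = roster_overlap(["Ann", "Bob", "Cy"], ["bob ", "Dee"])
print(counts)
```

The "both" count sizes the overlapping region; the other two size the outer crescents.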

Thick as a [LEGO] Brick

A few weeks ago, Samuel Arbesman wrote an article in Wired touching on the mathematical properties inherent in LEGO structures. In it, he discussed the results of a 10-year-old study of natural and human-made networks that described how the number of distinct components in a network increased with the overall size of the network.

The study showed that the LEGO systems did indeed follow this rule. However, Arbesman noted that the relationship increased sublinearly, suggesting that LEGO systems were under some form of selection pressure (like the economics of production) that made it more expensive to grow the system and create new types of pieces. He was curious to see whether or not these findings would hold true with a more complete list of LEGO sets available today (n=389 in the 2002 study).

After using a webcrawler to pull the data for the available sets and their component pieces, I was presented with a list of over 6,800 individual toys or kits. Not all of these kits fit the criteria of the original study, which investigated sets that were designed to build something specific as opposed to generic collections of pieces.

Paring down this list turned out to be the most difficult part of this exercise. I ended up eliminating any set that had words like “accessories,” “supplemental,” or “universal building set” in the name. I also removed entire toy lines such as DUPLO, Clikits, and Primo/Baby which didn’t seem to fit in the standard LEGO system. Basically, I tried to include anything with a brick, plate, or tile that had a picture of a single object on the box. I ended up with about 3,750 sets … or about ten times the number in the original study.
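A name-and-toy-line filter of the kind described above might look like the following sketch. The keyword and toy-line lists come from the paragraph, but the function name `keep_set` and the (name, toy line) record layout are assumptions; the "single object on the box" judgment call is not something this code can capture.

```python
# Phrases and toy lines taken from the description above.
EXCLUDED_PHRASES = ("accessories", "supplemental", "universal building set")
EXCLUDED_LINES = {"duplo", "clikits", "primo", "baby"}


def keep_set(name, toy_line):
    """Return True if a kit passes the keyword and toy-line filters."""
    lowered = name.lower()
    if any(phrase in lowered for phrase in EXCLUDED_PHRASES):
        return False
    return toy_line.lower() not in EXCLUDED_LINES


# Hypothetical records: (set name, toy line).
kits = [("Death Star", "Star Wars"), ("Universal Building Set 5", "Basic"), ("Farm", "DUPLO")]
kept = [kit for kit in kits if keep_set(*kit)]
print(kept)
```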

So, do the results hold up with the new data? At first glance, it appears they do. Both the log-log and semi-log plots described in the study are reproduced here with the larger counts. Note that a power-law relationship still appears to fit the data better than a logarithmic relationship.
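For readers who want to try the comparison themselves, here is a rough sketch of fitting both models with ordinary least squares. A power law is a straight line on log-log axes and a logarithmic relationship is a straight line on semi-log axes, so both reduce to a linear fit. The function names are assumptions and the sample data is synthetic, not the LEGO data. The two R² values are computed on different scales (log y versus y), so treat the comparison as a rough guide only.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1 - ss_res / ss_tot


def compare_fits(total_pieces, distinct_pieces):
    """Fit a power law and a logarithmic model to the same data."""
    log_x = [math.log(x) for x in total_pieces]
    log_y = [math.log(y) for y in distinct_pieces]
    # Power law: log(y) = log(c) + k*log(x), a line on log-log axes.
    _, exponent, r2_power = linear_fit(log_x, log_y)
    # Logarithmic: y = a + b*log(x), a line on semi-log axes.
    _, _, r2_log = linear_fit(log_x, distinct_pieces)
    return {"exponent": exponent, "r2_power": r2_power, "r2_log": r2_log}


# Synthetic data that follows y = 2 * x**0.7 exactly (not real set counts).
sizes = [10, 50, 100, 500, 1000, 3000]
kinds = [2 * x ** 0.7 for x in sizes]
print(compare_fits(sizes, kinds))
```

An exponent below 1 is what "sublinear" means here: the number of distinct piece types grows more slowly than the total piece count.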

Once I had access to all of that cool LEGO data, of course, I couldn’t resist a few more visuals. The first thing I developed was an interactive chart that lets you navigate the size and complexity data to see specific kits. Check out the links for pictures and parts lists.

This display was interesting because the LEGO kits with the most pieces tended to be elaborate secret bases or fortresses while the LEGO kits with the most variety of pieces were cultural artifacts like the Taj Mahal or the Statue of Liberty. Ironically, the Death Star (which might be considered both a cultural icon and a fortress) fits neatly in the upper right corner.

The following charts look at the trend of unique pieces over time as well as the distribution of color over the distinct LEGO sets available (this includes all LEGO products, not just the specific “objects” used in the logarithmic plots above). Note both the increasing variety of the LEGO pieces and the move away from the traditional color palette. The mottled gray represents the “other” category.

It is interesting to note that the shift toward more complexity in both pieces and colors corresponds with the deal LEGO inked with Lucasfilm in 1999 that allowed the company to sell toys based on the “Star Wars” universe. These changes came at a time of turmoil for LEGO as it struggled to remain true to its roots while competing with a flood of specialty toys and video games. Licensing products from Lucasfilm was a big step for LEGO but one that seems to have paid some creative dividends … five of the top ten largest LEGO structures ever released commercially are spaceships from the “Star Wars” series.

This trend toward replicating such specific visions (LEGO has also licensed themes from Harry Potter, Toy Story, Pirates of the Caribbean, and others) explains some of the incredible variety of pieces now in circulation. Items from these new kits introduced many pieces used only once.

On the opposite end of the spectrum, the most commonly shared LEGO piece in the database is a black 1 x 2 plate (part number 3004). The other pieces in the top 10 are also very simple and very monochromatic. I found it interesting that all the colors in the top ten reflected the sequence of Berlin and Kay’s basic color terms (in which Stage I cultures have only the colors black (dark–cool) and white (light–warm) and Stage II adds Red).

One thing this database does not cover is the huge market for non-standard kits and free-form LEGO bricks. According to Chris Anderson’s Long Tail blog:

“… 90% of Lego’s products are not available in traditional retail. They’re only available in the catalogs and online … [o]verall, those non-retail parts of the business represent 10-15% of Lego’s annual $1.1 billion in sales.”

User-created structures represent an amazingly creative use of the standard set of parts available. Check out this football stadium or this minifig-scaled Saturn V rocket. Some of these models were created using the old LEGO Factory/Design by Me software but some are done on the fly. It would be interesting to see if some of the above findings apply to these custom structures.

For more stats and a company timeline, check out this site.



Now that we can safely say that Brett Favre has retired (notwithstanding rumors to the contrary), I thought it was time to pull out some data on the indecisive quarterback’s career touchdown passes. Stats on passes say a lot about the relationship between a quarterback and his receivers so I wanted to create a visual that captured some of these stories.

The chart below shows each touchdown pass that Brett Favre threw during his NFL career and displays it by receiver (vertical axis), season (horizontal axis), average yardage per month (size of marker), and team (color of marker).

Packer fans will immediately recognize the significance of some of the data points. For the rest of you, here are a few highlights:

  • Sterling Sharpe caught Brett Favre’s first touchdown pass as a Green Bay Packer in 1992 and continued to be the quarterback’s primary receiver for the next three years. The 5x All-Pro led the NFL in touchdown receptions in both 1992 and 1994 and would certainly have played a major role in the team’s subsequent success if he hadn’t suffered a career-ending neck injury at the end of the 1994 season.
  • Following Sharpe’s early exit from football, Favre was forced to distribute his passes among a broader range of players, chief among them wide receivers Robert Brooks and Antonio Freeman. These two players would serve as the primary pillars of the passing game throughout Favre’s most successful period with Green Bay.
  • During the 1996 season (the year the Packers won Super Bowl XXXI), Favre threw touchdowns to ten different receivers, a career high. His total touchdown pass yardage that year also reached a high water mark.
  • Following Favre’s two Super Bowl appearances, there was a noticeable dropoff in the number of new players catching touchdowns. It is not clear whether it was because the receiving corps had stabilized or the coaches were focused on developing other aspects of the team but there were no fresh faces in the 1998 season and only two (Corey Bradford and Donald Driver) in 1999.
  • Favre did not have another pair of favorite “big play” receivers until his last two seasons with the Packers, when he had both Driver and Greg Jennings.
  • After Favre’s retirement from the Packers, he was introduced to an entirely new slate of receivers with the New York Jets in 2008. This situation was repeated in 2009 when he signed up with the Minnesota Vikings. He threw his final touchdown pass to Percy Harvin in December 2010.

Unemployment vs. Underemployment

The Bureau of Labor Statistics releases the results of two major surveys on the first Friday of every month (the Current Employment Statistics or CES and the Current Population Survey or CPS). Although the amount of information in these two surveys is quite extensive, the general public is probably familiar with only a few specific metrics.

First and foremost among these is the unemployment rate, which represents the ratio of unemployed workers to the overall civilian labor force. As with anything involving the government, this simple number is more complex than it seems. For one thing, the BLS has no fewer than six different methods of calculating unemployment … and each one comes in a seasonally adjusted and unadjusted format. The standard unemployment rate (the one that makes all the headlines) is called U-3 and it is usually seasonally adjusted.

Many economists feel that U-3 is misleading because, over the years, it has slowly excluded many of the factors that used to go into how the U.S. reported unemployment. They prefer to use the “underemployment” rate or U-6, which is the BLS’s broadest measure of unemployment.

The basic definitions:

  • U-3 – Total unemployed persons, as a percent of the civilian labor force (the official unemployment rate).
  • U-6 – Includes those people counted by U-3, plus marginally attached workers (not looking, but want and are available for a job and have looked for work sometime in the recent past), as well as persons employed part time for economic reasons (they want and are available for full-time work but have had to settle for a part-time schedule).
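The two definitions above can be written out as simple ratios. In this sketch the function and parameter names are my own, and the sample figures (in millions) are made up for illustration rather than taken from any BLS release. Marginally attached workers sit outside the civilian labor force, so U-6 adds them to the denominator as well as the numerator.

```python
def u3_rate(unemployed, labor_force):
    """U-3: unemployed persons as a percent of the civilian labor force."""
    return 100.0 * unemployed / labor_force


def u6_rate(unemployed, marginally_attached, part_time_economic, labor_force):
    """U-6: adds marginally attached and involuntary part-time workers."""
    numerator = unemployed + marginally_attached + part_time_economic
    # Marginally attached workers are not in the labor force,
    # so they are added to the denominator too.
    denominator = labor_force + marginally_attached
    return 100.0 * numerator / denominator


# Hypothetical figures in millions of people.
print(round(u3_rate(14.0, 154.0), 1))
print(round(u6_rate(14.0, 2.5, 8.5, 154.0), 1))
```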

Keeping all of these terms straight can be difficult for the average person, so — despite Stephen Few’s objections — I have created a pie chart that attempts to explain all of the various relationships. The central pie shows the  basic division of the working age population into the civilian labor force and people who are outside of the labor force. Each subsequent pie divides these categories into smaller and more specific subcategories. 

The calculations for U-3 and U-6 can then be represented as slices of the pie:

Right off the bat you can see that there is a problem with some of the various categories. For one thing, there is an entire group of people who are listed as Want a Job Now but aren’t working and aren’t counted as unemployed. This category includes people who have been out of work for over a year and have officially fallen out of the civilian labor force. Although the U-6 figure includes a portion of this group, many critics still feel that this practice understates unemployment.

Another way to show the calculation of the two metrics is graphically, using the color coding of the legend from the chart to show the details for each metric:

This exercise highlights another potential issue for measurement of the economy by showing the importance of the denominator (in this case, the Civilian Labor Force). Variations in this number have a tremendous effect on the outcome of both calculations. By reclassifying certain groups of unemployed (the Want a Job Now crowd), people are siphoned off from both the numerator and the denominator. The end result is a slight reduction of both the U-3 and U-6 rates. Not a big deal … unless you happen to be running for office.
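The arithmetic behind that reduction is easy to check. Because the unemployed are a small fraction of the labor force, subtracting the same number of people from both the numerator and the denominator always lowers the ratio. The sketch below uses a hypothetical function name and made-up figures in millions.

```python
def rate_after_reclassifying(unemployed, labor_force, moved_out):
    """Unemployment rate (percent) after `moved_out` unemployed people
    are reclassified as outside the civilian labor force."""
    return 100.0 * (unemployed - moved_out) / (labor_force - moved_out)


# Hypothetical figures in millions of people.
before = rate_after_reclassifying(14.0, 154.0, 0.0)
after = rate_after_reclassifying(14.0, 154.0, 1.0)
print(round(before, 2), round(after, 2))
```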

Six Degrees of Joy Division

My local record store used to have this great poster on the back wall that explained how several dozen British indie bands from the 80s were all linked together through their various group members. The title of the poster was something like “Why All These Bands Sound the Same” and it was clearly a tongue-in-cheek slam of the gloomy post-punk sound of musical groups like Bauhaus and the Smiths.

I loved the design concept and looked for the poster when the store finally went out of business a few years ago. Although I never found it, it occurred to me recently that I might be able to reconstruct the graphic using some modern tools and data from the online music site AllMusic.

AllMusic is an outstanding musical resource and their meticulous site formatting allowed me to write a program that would crawl from page to page gathering information about interrelated bands and band members as it went. I decided to use the group Joy Division as a starting point because I liked the movie Control and had a vague memory of that particular band name appearing on the poster. The program ran overnight … evaluating 37,538 separate pages before it completed its run.

Using the IBM visualization tool, Many Eyes, I created a network diagram of the bands that are within six steps of my “seed” group. The full interactive results are at the end of the post (worth the effort if the Many Eyes site is working) but here is a detail:


The Joy Division Network

At nearly 38K records, this particular musical network covers a huge swath of Anglo-American rock-and-roll and includes almost all of the major groups in the Pop/Rock genre. What’s perhaps most interesting about this massive network is the fact that Joy Division is only linked to two bands directly, the acclaimed New Order (formed in 1980 after the death of JD vocalist Ian Curtis) and the Manchester supergroup Freebass (formed in 2004). All other connections are indirect, with a total of 20 degrees of separation between Joy Division and the most distant band in the network, post-grunge Los Angeles outfit Open Hand (formed in 2000).
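Degrees of separation in a network like this are usually measured with a breadth-first search from the seed band. The sketch below assumes the crawl has already been reduced to a dictionary mapping each band to the bands it shares a member with. The function name and the "Band A"/"Band B" entries are hypothetical placeholders; only the two direct Joy Division links come from the data described above.

```python
from collections import deque


def degrees_of_separation(links, seed, max_steps=None):
    """Breadth-first search: fewest hops from `seed` to each reachable band."""
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        band = queue.popleft()
        # Stop expanding once the step limit is reached.
        if max_steps is not None and distance[band] >= max_steps:
            continue
        for neighbor in links.get(band, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[band] + 1
                queue.append(neighbor)
    return distance


# Toy graph; "Band A" and "Band B" are placeholders, not real connections.
links = {
    "Joy Division": ["New Order", "Freebass"],
    "New Order": ["Joy Division", "Band A"],
    "Band A": ["New Order", "Band B"],
    "Band B": ["Band A"],
}
print(degrees_of_separation(links, "Joy Division", max_steps=2))
```

Leaving `max_steps` unset walks the whole connected network, and the largest value in the result is the distance to the most remote band.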


Other Thoughts on the Data

The first odd thing I noticed about the network was that, by focusing on the relationships between bands, the network excludes a lot of well-known solo artists. Even when these musicians joined a band, their independent careers limited these associations to one or two instances. The best example of this situation would be someone like Elvis Presley or Johnny Cash. Both of these artists were loosely linked together through a glorified hootenanny called The Million Dollar Quartet (along with Carl Perkins and Jerry Lee Lewis). The only other bands in this network are The Offenders and the Cash-related groups The Highwaymen and Johnny Cash & the Tennessee Two. Some of the other solo artists in this minor network are household names (depending on the household, of course), including Waylon Jennings, Kris Kristofferson, and Willie Nelson. Three bands, a half-dozen stars and a lot of hits … but no direct connection to the huge Joy Division network. Many current rap artists seem to fit this mold as well.

On the flip side, progressive rock groups like King Crimson had members who were in dozens of other bands. These social connectors can be seen at the center of a huge spider web of interrelated groups in the network diagram. Bands like these are often experimental in nature, with talented musicians putting their stamp on a number of different side projects. Some very influential artists can be spotted in the midst of these groups, including — using King Crimson as an example — famous journeyman players like Robert Fripp, Adrian Belew, John Wetton and Greg Lake.

Finally, although I distinctly remember the band Bauhaus and its associated constellation of bands (Love & Rockets, Tones on Tail, The Jazz Butcher, etc.) on the poster, they were not within six degrees of separation of Joy Division in the network data (they were about eight links away). This exposes an issue with my data gathering methodology because it doesn’t take into account other relationships between artists such as mentors, guest musicians, common producers or other ties. Still, it was an interesting exercise with fruitful results.

Additional Interactive Charts

Bubble diagram of musical styles (full band network):
Network diagram (six degrees of Joy Division):

Time-Distance Diagrams

After I was in a car accident a few years ago, I contacted the city Traffic Control Engineer to see if I could get a copy of the signal timing sequence for the intersection of the street where the accident occurred. The information they provided allowed me to construct a time-distance diagram to relate the path the car traveled to the 90-second traffic signal cycles for several streets.

The time in seconds can be read down the side of the diagram and the distance can be read across the bottom. Stationary objects (like the traffic lights) show up vertically on the chart while moving objects cross the chart at different slopes depending on their speed. The blue line represents the path taken by a car which starts from a complete stop at one intersection and accelerates to a speed of 35 miles-per-hour in a very leisurely 9 or 10 seconds. Note that the line crosses the final intersection during a green light.
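The blue line can be approximated with basic kinematics. This sketch assumes constant acceleration from a stop up to 35 miles per hour over 10 seconds, then a steady cruise. The 90-second cycle comes from the signal data, but the function names, the 1,000-foot distance, and the green-phase start and length are hypothetical values chosen only to show the calculation.

```python
MPH_TO_FPS = 5280 / 3600  # feet per second in one mile per hour


def travel_time(distance_ft, cruise_mph=35.0, accel_seconds=10.0):
    """Seconds needed to cover distance_ft from a standing start."""
    cruise_fps = cruise_mph * MPH_TO_FPS
    accel = cruise_fps / accel_seconds            # constant acceleration, ft/s^2
    accel_distance = 0.5 * accel * accel_seconds ** 2
    if distance_ft <= accel_distance:
        # Still accelerating: d = 0.5 * a * t^2, solved for t.
        return (2 * distance_ft / accel) ** 0.5
    return accel_seconds + (distance_ft - accel_distance) / cruise_fps


def is_green(arrival_s, green_start_s, green_length_s, cycle_s=90.0):
    """True if the arrival time falls inside the repeating green phase."""
    return (arrival_s - green_start_s) % cycle_s < green_length_s


# Hypothetical intersection 1,000 feet away, green from 20 s to 50 s of each cycle.
arrival = travel_time(1000.0)
print(round(arrival, 2), is_green(arrival, green_start_s=20.0, green_length_s=30.0))
```

Plotting time against distance for several cruise speeds, with each signal's red and green phases stacked along the time axis at its fixed distance, reproduces the layout of the diagram.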

What I liked about this diagram was how easy it was to show a series of timed lights and the effects that different average speeds had on the outcome. I was reading Edward Tufte’s The Visual Display of Quantitative Information at the time and his section on train schedules was very inspirational. Despite my efforts, however, the other driver sued for injury and my insurance company settled out of court. Oh well, at least I was able to get this great chart out of the process.

Fighting Insidious Business Jargon with Design

One of the biggest barriers to introducing new concepts to people is that they often have old, preconceived notions about those concepts that are just close enough to the truth to cause confusion. For example, my company recently started a sales program that establishes performance incentives by “vertical” — one of those vague business terms that could mean just about anything to anybody. Tracking such an ambiguous concept can cause a lot of angst when someone’s paycheck is on the line so it fell to me to come up with some ideas to help clarify the definition.

The first problem I needed to address was the fact that most people already thought they knew what vertical meant. When you try to find a definition online, you’ll usually come across terms like “vertical industry” or “vertical market” which both refer to groups of companies that serve specific, related industries (i.e. a niche market). In contrast, a “horizontal market” refers to companies that meet more general business needs.

The differences between these two definitions are pretty subtle and, as a result, most people tend to associate the term vertical with almost any industry, departmental function or even groups of occupations. In our business, we need to keep such categories distinct so I decided to create a matrix that placed our two main areas of focus — jobs and industries — on two separate axes. This would provide a simple visual cue to the differences during future discussions and presentations. The basic distinctions are:

  1. Industry (based on the NAICS standard) applies to a company or client.
  2. Occupation (based on the SOC standard) applies to a person or individual.

Unfortunately, a standard table would still present the information in columns and rows — leaving the vague association with “vertical” unresolved. To address this, I decided to take a cue from a common Scandinavian holiday decoration and rotate the table 45 degrees. This eliminates all vertical and horizontal lines in the diagram and forces the observer to abandon the concept altogether. In the diagrams, the industry information appears in the orange axis, while the occupation appears in the blue axis.
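Geometrically, the rotation is a standard 2-D rotation applied to each cell of the matrix. The sketch below shows how a cell's (industry, occupation) grid position maps to plotting coordinates once the table is turned 45 degrees. The function name and index-based layout are assumptions for illustration, not how the diagrams were actually drawn.

```python
import math


def rotate_cell(industry_index, occupation_index, angle_degrees=45.0):
    """Rotate a grid position counterclockwise about the origin."""
    theta = math.radians(angle_degrees)
    x = industry_index * math.cos(theta) - occupation_index * math.sin(theta)
    y = industry_index * math.sin(theta) + occupation_index * math.cos(theta)
    return x, y


# The cell at industry 1, occupation 1 lands on the vertical centerline.
print(rotate_cell(1, 1))
```

After the rotation, both axes run diagonally, so neither industries nor occupations line up with the page's vertical direction.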


Once this basic structure is established, unique industry/occupation combinations can be “mapped” to demonstrate situations that are familiar to the audience. These examples help reinforce the concepts while emphasizing the difference between the two categories. It can be particularly helpful explaining examples where industries and occupations share some elements in their names (e.g. health services vs. healthcare practitioners).