Tag Archives: Tableau

A Thanksgiving Meal Preparation Timeline

The art of timing the preparation of Thanksgiving dishes takes years of experience and perhaps more than a few hard lessons in the kitchen (ever have anyone de-bone a turkey?). For less experienced chefs, I’ve always felt that a good infographic might help organize the work so that all the dishes are ready at the proper time.

I didn’t have the time to document my own family’s meal this year but I noticed that L.V. Anderson over at Slate wrote a great piece on her attempt to organize a full dinner. She sums up the issues nicely:

Cooking a Thanksgiving meal is a somewhat masochistic enterprise. It’s rewarding, for sure, and fun if you like cooking. But perfectly coordinating the timing of several dishes—nearly all of which taste best hot, many of which require oven time, and some of which begin deteriorating in quality shortly after you finish cooking them—is, well, impossible.

I’ve taken her instructions and organized them into a timeline with a target mealtime of 3:00 PM. Each box in the chart represents a 15-minute interval and clicking on it describes the task and provides a link to the recipe. Here it is … posted just under the wire:

It still needs some work so I’ll be making a few changes over the weekend. Meanwhile, Happy Thanksgiving!
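
For anyone who wants to build a similar schedule, the back-counting from mealtime can be sketched in a few lines of Python. This is only an illustration: the task names and slot counts below are hypothetical and are not taken from the Slate recipes, and the date is a placeholder since only the clock time matters.

```python
from datetime import datetime, timedelta

SLOT = timedelta(minutes=15)  # one box in the chart

def start_time(mealtime, slots_before_meal):
    """Return the clock time that many 15-minute slots before the meal."""
    return mealtime - slots_before_meal * SLOT

# Placeholder date; the target mealtime is 3:00 PM.
mealtime = datetime(2000, 1, 1, 15, 0)

# Hypothetical tasks: (description, slots before the meal when the task starts)
tasks = [
    ("Put turkey in oven", 16),
    ("Mash potatoes", 2),
]

for description, slots in tasks:
    print(start_time(mealtime, slots).strftime("%I:%M %p"), description)
```

Counting in whole slots keeps every task aligned to the same 15-minute grid as the chart.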

The Days Keep Getting Longer … Literally

Keeping track of time is never easy without an accurate clock and so people have come up with a number of different folk methods to keep themselves on pace. One of the most common techniques is to introduce a multi-syllable word as you count seconds so that you don’t count too fast. The most familiar phrase is probably something like “A thousand one, a thousand two …” but there are several others. My Dad actually had a teacher in school who used the phrase “steam engine” and I’ve heard others use words like “Mississippi” or even “alligator.” Basically, any three- to five-syllable phrase will serve as a good placeholder. Whatever phrase you favor, be prepared to dust it off tonight as the world is officially given an extra leap second at the end of the day.

The reason for this extra second is rather complicated. A normal “day” was officially defined back in 1967 as 86,400 seconds in the International System of Units (SI) and it is tracked by a very precise atomic clock. This is the Coordinated Universal Time (UTC) that we all know and love. The actual solar day is pretty much the same length but not quite. There are several different events that can speed up or slow down the Earth’s rotation by a few thousandths of a second. These events can include earthquakes, changes in the jet stream, the tidal pull of the Moon, the position of the Earth in its orbit, fluid motion at the Earth’s core, and the gradual slowing of the Earth’s rotation.

Whenever these forces cause the solar day and UTC to get too far out of whack, the Sub-bureau for Rapid Service and Predictions of Earth Orientation Parameters of the International Earth Rotation and Reference Systems Service (let me pause here while I catch my breath) calls for a leap second. This manifests itself as an additional second tacked on to the normal clock reading around midnight (11:59:59 -> 11:59:60 -> 12:00:00). This whole process is essentially designed to keep the sun directly above you at high noon.

Pretty cool, eh?

What’s really interesting about this issue is that, in the long run, it doesn’t really matter because the Earth’s rotation is slowing by a few fractions of a second each year and the standard Earth day continues to get longer. This is one of those weird facts that kind of blew my mind when I first heard it. I guess I had read too many science fiction stories where the hero hops into his time machine and goes back to some ridiculously precise date like 10:24 AM on Tuesday, August 13, 250,000,000 B.C. In reality, our current concept of days and dates is firmly based on the Earth’s current circumstances. Back in dinosaur times, the typical Earth day was an hour or two shorter and there were an extra 10-20 days in the year (the length of the year was the same overall). When the Earth was really young, days were only six hours long and there were over 1,000 of them per year.
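
The relationship between day length and days per year is simple division if the year's total duration is held fixed. Here is a rough sketch under that assumption, using a year of 365.25 present-day days (8,766 hours). The function name and the sample day lengths are mine, chosen only for illustration.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumed constant: 8,766 present-day hours

def days_per_year(day_length_hours):
    """Number of days that fit in one year for a given day length."""
    return HOURS_PER_YEAR / day_length_hours

for hours in (24, 23, 22, 6):
    print(f"{hours:>2}-hour day: about {days_per_year(hours):.0f} days per year")
```

With a six-hour day this works out to 1,461 days per year, which lines up with the "over 1,000" figure.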

In order to visualize this, I found a paper online which provided me with a model that estimates the length of a day and the number of days per year for any time period. I’m not sure how official these calculations are but they do appear to correlate with data obtained from fossil corals and radiometric dating methods. I’ve included information on each geologic period from Wikipedia, so use with the appropriate amount of caution.

Anyway, enjoy your leap second! One steam engine …


You Are What You Watch

Experian-Simmons released some survey data in December that looked at the relative popularity of major television shows for three different political groups: liberal Democrats; conservative Republicans; and middle-of-the-road voters. Each show was given an index based on the concentration of specific voters and this information was used to create lists of the top programs for each political group in both entertainment and news categories.

Although these top ten lists were interesting on their own, the fact that each individual TV program actually had an index rating for all three groups offers an opportunity for more complex analysis. The most obvious next step involves comparing pairs of groups in a 2D scatterplot chart. The Tableau visualization below shows the results.

A few notes:

  • Entertainment shows are in blue, news shows are in orange.
  • Shows without enough data for a particular group were still plotted as a zero index.
  • Hovering over each data point reveals the show and its indices.
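
With three groups there are three possible pairings to chart. Here is a minimal sketch of how the points for each pairing could be assembled, including the zero-fill rule from the notes. The show names and index values are hypothetical placeholders, not figures from the Experian-Simmons survey.

```python
from itertools import combinations

GROUPS = ("liberal", "middle-of-the-road", "conservative")

def scatter_points(shows, x_group, y_group):
    """Build (name, x, y) tuples; a missing index is plotted as zero."""
    return [
        (name, indices.get(x_group, 0), indices.get(y_group, 0))
        for name, indices in shows.items()
    ]

# Hypothetical index values, for illustration only
shows = {
    "Show A": {"liberal": 120, "middle-of-the-road": 95, "conservative": 80},
    "Show B": {"liberal": 150, "conservative": 60},  # no middle-of-the-road data
}

for x_group, y_group in combinations(GROUPS, 2):
    print(f"{x_group} vs {y_group}:", scatter_points(shows, x_group, y_group))
```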


The first thing I noticed was that news shows were much more partisan than entertainment shows. In fact, almost all of the shows with the most extreme scores were either news shows (primarily FOX and MSNBC) or fake news shows (Comedy Central’s Daily Show and Colbert Report). PBS gets a few high scores on the liberal side but the standard television networks are all fairly evenly watched.

Another thing that strikes me is how similar the watching habits of middle-of-the-road voters are to those of conservative Republicans. The only noticeable exception occurs with news programs, but it is a pretty big exception: FOX News. All of the top ten conservative news programs were on FOX while none of the top middle-of-the-road news programs were on that network. It might be encouraging for conservative politicians to see the similarities in entertainment interests between conservative voters and independents but I suspect that the gulf in news sources would be hard to overcome.

Many of the other differences have been noted elsewhere but are worth repeating: liberal Democrats tend to favor funnier shows and stories involving morally complex characters while conservative Republicans favor shows where people are doing stuff — either real work or reality competitions.

Of course, having complained about the lack of 2D analysis for this data in the major online outlets, I would be remiss if I didn’t point out the fact that each show has three indices. Logically, we should be trying to show the data in a 3D scatterplot.

This isn’t as easy as it sounds since most of the major charting applications aren’t very good in 3D and they don’t provide any interactive option for the web that I could find. The best options seemed to be R or something called CanvasXpress — neither of which I had worked with before. I chose R, which allowed me to create both static and interactive 3D plots. However, only screenshots of the interactive plot are available at the moment. Several hours later …

Geographic References in Local Business Names

This little exercise came about after I read an article on the old Northwest Territory in the U.S., which basically consisted of all the land west of Pennsylvania, northwest of the Ohio River, and east of the Mississippi River. As the country expanded westward, this geographic area gradually became known as the “Midwest” (or the East North Central States region) but not before the older name left its mark on the local culture. Organizations like Northwestern Mutual Life (Milwaukee) and Northwestern University (Chicago) still refer back to the days when these places were located on the fringe of the country, not at its center.

It occurred to me that researching such place names would be a good way to see if there was still a residual “shadow” of the old Northwest territory so I downloaded a sample list of company headquarters with the phrase “Northwest” or “Northwestern” in their names and plotted them on a map. Alas, this attempt failed to find anything significant (there was too much competition with the Pacific Northwest in name usage). However, I did look up some other regional terms with more positive results.
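
Matching names against regional terms is easy to sketch. The version below is an assumption about how such a lookup might work, not a record of the actual process. The sample names are hypothetical, apart from Northwestern Mutual Life, which is mentioned above.

```python
import re

# "Northwestern" and "Northwest" are matched as separate whole words.
REGION_TERMS = ("Northwestern", "Northwest", "Midwest", "Great Lakes", "Prairie")

def regional_terms(name):
    """Return the regional terms that appear as whole words in a business name."""
    return [
        term
        for term in REGION_TERMS
        if re.search(rf"\b{re.escape(term)}\b", name, re.IGNORECASE)
    ]

for name in ("Northwestern Mutual Life", "Great Lakes Prairie Supply", "Acme Tool"):
    print(name, "->", regional_terms(name))
```

The word-boundary check keeps "Northwest" from also firing on every "Northwestern" name, so the two variants can be counted separately or merged later.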


The geographic patterns for most of these terms are fairly distinct but there are also some areas of overlap. It was especially interesting to see regions that had local businesses in three or more categories. The old Northwest territory fits this mold with a combination of Midwest, Great Lakes, and Prairie.

Thick as a [LEGO] Brick

A few weeks ago, Samuel Arbesman wrote an article in Wired touching on the mathematical properties inherent in LEGO structures. In it, he discussed the results of a 10-year-old study of natural and human-made networks that described how the number of distinct components in a network increased with the overall size of the network.

The study showed that the LEGO systems did indeed follow this rule. However, Arbesman noted that the relationship increased sublinearly, suggesting that LEGO systems were under some form of selection pressure (like the economics of production) that made it more expensive to grow the system and create new types of pieces. He was curious to see whether or not these findings would hold true with a more complete list of LEGO sets available today (n=389 in the 2002 study).

After using a webcrawler to pull the data for the available sets and their component pieces, I was presented with a list of over 6,800 individual toys or kits. Not all of these kits fit the criteria of the original study, which investigated sets that were designed to build something specific as opposed to generic collections of pieces.

Paring down this list turned out to be the most difficult part of this exercise. I ended up eliminating any set that had words like “accessories,” “supplemental,” or “universal building set” in the name. I also removed entire toy lines such as DUPLO, Clikits, and Primo/Baby which didn’t seem to fit in the standard LEGO system. Basically, I tried to include anything with a brick, plate, or tile that had a picture of a single object on the box. I ended up with about 3,750 sets … or about ten times the number in the original study.
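
The name and toy-line rules described above could be expressed roughly as follows. This is a sketch only: the record layout (a name plus a theme) and the sample sets are hypothetical, and the "single object on the box" judgment call is not something a filter like this can capture.

```python
EXCLUDED_PHRASES = ("accessories", "supplemental", "universal building set")
EXCLUDED_THEMES = {"DUPLO", "Clikits", "Primo", "Baby"}

def keep_set(name, theme):
    """Return True if a set survives the name and toy-line filters."""
    if theme in EXCLUDED_THEMES:
        return False
    lowered = name.lower()
    return not any(phrase in lowered for phrase in EXCLUDED_PHRASES)

# Hypothetical records: (set name, theme)
candidates = [
    ("Fire Station", "Town"),
    ("Universal Building Set", "Basic"),
    ("Farm Animals", "DUPLO"),
    ("Supplemental Wheels", "Basic"),
]

kept = [name for name, theme in candidates if keep_set(name, theme)]
print(kept)
```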

So, do the results hold up with the new data? At first glance, it appears they do. Both the log-log and semi-log plots described in the study are reproduced here with the larger counts. Note that a power-law relationship still appears to fit the data better than a logarithmic relationship.
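
One way to make the power-law versus logarithmic comparison concrete is to fit each curve by least squares and compare how well it reproduces the counts. The sketch below does this on made-up data generated from a power law with exponent 0.7. It is not the LEGO data, and it is not necessarily the fitting method used in the study.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - slope * mean_x, slope

def r_squared(ys, predicted):
    """Share of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: total pieces per set and distinct piece types per set
sizes = [10, 20, 50, 100, 200, 500, 1000]
types = [3 * s ** 0.7 for s in sizes]
log_sizes = [math.log(s) for s in sizes]

# Power law, types = a * size^b: a straight line on a log-log plot
log_a, b = linear_fit(log_sizes, [math.log(t) for t in types])
power_pred = [math.exp(log_a) * s ** b for s in sizes]

# Logarithmic, types = c + k * ln(size): a straight line on a semi-log plot
c, k = linear_fit(log_sizes, types)
log_pred = [c + k * ls for ls in log_sizes]

print(f"power-law exponent b = {b:.2f}")
print(f"R^2 power law = {r_squared(types, power_pred):.4f}")
print(f"R^2 logarithmic = {r_squared(types, log_pred):.4f}")
```

An exponent below 1 is what "sublinear" means here: doubling the size of a set less than doubles the number of distinct piece types.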

Once I had access to all of that cool LEGO data, of course, I couldn’t resist a few more visuals. The first thing I developed was an interactive chart that lets you navigate the size and complexity data to see specific kits. Check out the links for pictures and parts lists.

This display was interesting because the LEGO kits with the most pieces tended to be elaborate secret bases or fortresses while the LEGO kits with the most variety of pieces were cultural artifacts like the Taj Mahal or the Statue of Liberty. Ironically, the Death Star (which might be considered both a cultural icon and a fortress) fits neatly in the upper right corner.

The following charts look at the trend of unique pieces over time as well as the distribution of color over the distinct LEGO sets available (this includes all LEGO products, not just the specific “objects” used in the logarithmic plots above). Note both the increasing variety of the LEGO pieces and the move away from the traditional color palette. The mottled gray represents the “other” category.

It is interesting to note that the shift toward more complexity in both pieces and colors corresponds with the deal LEGO inked with Lucasfilm in 1999 that allowed the company to sell toys based on the “Star Wars” universe. These changes came at a time of turmoil for LEGO as it struggled to remain true to its roots while competing with a flood of specialty toys and video games. Licensing products from Lucasfilm was a big step for LEGO but one that seems to have paid some creative dividends … five of the top ten largest LEGO structures ever released commercially are spaceships from the “Star Wars” series.

This trend toward replicating such specific visions (LEGO has also licensed themes from Harry Potter, Toy Story, Pirates of the Caribbean, and others) explains some of the incredible variety of pieces now in circulation. Items from these new kits introduced many pieces used only once.

On the opposite end of the spectrum, the most commonly shared LEGO piece in the database is a black 1 x 2 plate (part number 3004). The other pieces in the top 10 are also very simple and very monochromatic. I found it interesting that all the colors in the top ten reflected the sequence of Berlin and Kay’s basic color terms (in which Stage I cultures have only the colors black (dark–cool) and white (light–warm) and Stage II adds red).

One thing this database does not cover is the huge market for non-standard kits and free-form LEGO bricks. According to Chris Anderson’s Long Tail blog:

“… 90% of Lego’s products are not available in traditional retail. They’re only available in the catalogs and online … [o]verall, those non-retail parts of the business represent 10-15% of Lego’s annual $1.1 billion in sales.”

User-created structures represent an amazingly creative use of the standard set of parts available. Check out this football stadium or this minifig-scaled Saturn V rocket. Some of these models were created using the old LEGO Factory/Design by Me software but some are done on the fly. It would be interesting to see if some of the above findings apply to these custom structures.

For more stats and a company timeline, check out this site.



Now that we can safely say that Brett Favre has retired (notwithstanding rumors to the contrary), I thought it was time to pull out some data on the indecisive quarterback’s career touchdown passes. Stats on passes say a lot about the relationship between a quarterback and his receivers so I wanted to create a visual that captured some of these stories.

The chart below shows each touchdown pass that Brett Favre threw during his NFL career and breaks it down by receiver (vertical axis), season (horizontal axis), average yardage per month (size of marker), and team (color of marker).

Packers fans will immediately recognize the significance of some of the data points. For the rest of you, here are a few highlights:

  • Sterling Sharpe caught Brett Favre’s first touchdown pass as a Green Bay Packer in 1992 and continued to be the quarterback’s primary receiver for the next three years. The 5x All-Pro led the NFL in touchdown receptions in both 1992 and 1994 and would certainly have played a major role in the team’s subsequent success if he hadn’t suffered a career-ending neck injury at the end of the 1994 season.
  • Following Sharpe’s early exit from football, Favre was forced to distribute his passes among a broader range of players, chief among them wide receivers Robert Brooks and Antonio Freeman. These two players would serve as the primary pillars of the passing game throughout Favre’s most successful period with Green Bay.
  • During the 1996 season (the year the Packers won Super Bowl XXXI), Favre threw touchdowns to ten different receivers, a career high. His total touchdown pass yardage that year also reached a high water mark.
  • Following Favre’s two Super Bowl appearances, there was a noticeable dropoff in the number of new players catching touchdowns. It is not clear whether it was because the receiving corps had stabilized or the coaches were focused on developing other aspects of the team but there were no fresh faces in the 1998 season and only two (Corey Bradford and Donald Driver) in 1999.
  • Favre did not have another pair of favorite “big play” receivers until his last two seasons with the Packers, when he had both Driver and Greg Jennings.
  • After Favre’s retirement from the Packers, he was introduced to an entirely new slate of receivers with the New York Jets in 2008. This situation was repeated in 2009 when he signed up with the Minnesota Vikings. He threw his final touchdown pass to Percy Harvin in December 2010.
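
Counts like "ten different receivers in a season" come from a simple grouping of the touchdown records. Here is a minimal sketch of that grouping, assuming one (season, receiver) row per touchdown pass. The rows below are placeholders, not Favre's actual stats.

```python
from collections import defaultdict

def receivers_per_season(passes):
    """Count the distinct receivers who caught a touchdown in each season."""
    seasons = defaultdict(set)
    for season, receiver in passes:
        seasons[season].add(receiver)
    return {season: len(names) for season, names in seasons.items()}

# Placeholder rows: (season, receiver), one per touchdown pass
passes = [
    (1996, "Receiver A"),
    (1996, "Receiver B"),
    (1996, "Receiver A"),
    (1997, "Receiver A"),
]

print(receivers_per_season(passes))
```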

Earnings and Unemployment by College Major

The Wall Street Journal recently published a table of income and unemployment data for various college majors. The original study by Georgetown University’s Center on Education and the Workforce contained enough additional details that I thought it might be worth trying to incorporate the information into a Tableau visualization.

After a little data massaging, I created charts for both the high-level fields of study and the more detailed individual majors. Each level contains unemployment rates, income levels, and popularity of major measured by number of enrollees.

One of the first things you notice is that, despite frequent claims to the contrary, college graduates with a degree in Education have the lowest median earnings overall. The Education field also has the narrowest range of income and includes four of the ten majors with the lowest median earnings. On the plus side, fifteen of the sixteen Education majors have (or had at the time of the study) unemployment rates below 5.5% — the weighted average rate of unemployment for all majors in the study.
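
A weighted average here means each major's unemployment rate counts in proportion to the number of people holding that degree, so a handful of tiny majors can't skew the benchmark. A quick sketch with hypothetical numbers (not figures from the Georgetown study):

```python
def weighted_unemployment(majors):
    """majors: list of (number_of_grads, unemployment_rate_percent) pairs."""
    total_grads = sum(grads for grads, _ in majors)
    return sum(grads * rate for grads, rate in majors) / total_grads

# Hypothetical majors: a small one at 3% and a large one at 7%
majors = [(2000, 3.0), (6000, 7.0)]

print(weighted_unemployment(majors))            # weighted by headcount
print(sum(rate for _, rate in majors) / len(majors))  # simple average, for contrast
```

In this made-up case the weighted rate is 6.0% versus a simple average of 5.0%, because the larger major pulls the figure toward its own rate.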

Graduates with an Engineering degree have the highest median earnings overall and a relatively low unemployment rate compared to other disciplines. In addition, seven of the ten majors with the highest median earnings were found in Engineering.

Other majors with good earnings potential included the usual suspects (Computers & Mathematics, Health, and Business) while the best employment prospects were found in Education, Health, Physical Sciences, and Agriculture & Natural Resources.

As for individual majors, the winners in my completely fictitious categories are as follows:

  • Most Popular –  Business Management & Administration takes this category with nearly 2.8 million grads holding this degree. The next two majors in line (also in the Business field) weren’t even close — trailing by over a million people.
  • Best Prospects –  Actuarial Science beat out four other fully-employed competitors by coming in with a median income of over $80K.
  • Worst Prospects –  Clinical Psychology tops this category with an estimated unemployment rate of nearly 20%. Yikes! I also noticed that a number of other majors in the Psychology field had unemployment rates above 10%, which means that intra-discipline career changes for people with this major would be difficult.
  • Most Deceptive – The “winner” here is Architecture, an outlier with the lowest median earnings and the highest unemployment rate of all of the Engineering majors. For this category, I wanted a relatively popular major with an uncommonly high unemployment rate … the kind of major that churns out grads and then strands them in the unemployment line. An educational Judas, if you will. (Full disclosure: I have an Architecture degree, but I can’t say I wasn’t warned.)
  • Hidden Gem – I’m going to call this one a tie between Petroleum Engineering and Pharmacy Pharmaceutical Sciences & Administration. Petroleum Engineering has a slight edge on median earnings ($127K vs. $105K) but the Pharma major has a lower overall unemployment rate (3.2% vs. 4.4%). You probably can’t go wrong with either one but keep an eye on the horizon … Petroleum Engineering is notoriously dependent on the boom/bust cycles of the oil and gas industry while workers in the pharmaceutical industry are facing major changes as companies try to adjust to globalization and increasing costs of product development.