Category Archives: Information

Politicians Discover Data Science

During the 2008 U.S. Presidential campaign, the online design community devoted a lot of pixels to comparisons of the two candidates’ websites (a few great examples here, here, and here). The overall consensus was that Obama won the war for eyeballs by emphasizing design, web usability, multimedia, and robust social networking. According to an in-depth study by the Pew Research Center’s Project for Excellence in Journalism, Obama’s online network was over five times larger than McCain’s by election day, and his site was drawing almost three times as many unique visitors each week.

There is no doubt that the web has fundamentally transformed the way political campaigns are run. Voters are no longer tied to traditional media outlets for information and they can participate directly in a campaign in ways that were unimaginable only a few years ago. Adam Nagourney, columnist for the New York Times, summed it up nicely:

[The Internet has] rewritten the rules on how to reach voters, raise money, organize supporters, manage the news media, track and mold public opinion, and wage — and withstand — political attacks.

So, with the next campaign season gearing up, what technology-driven changes can we expect for 2012? If the rumblings are true, this election may see the ascendancy of data science as a formal part of the campaign toolkit.

In a recent CNN article, Micah Sifry wrote about the Obama campaign’s establishment of a “multi-disciplinary team of statisticians, predictive modelers, data mining experts, mathematicians, software developers, general analysts and organizers.” The article goes on to discuss the importance of data harmonization (a fancy term for master data management), geo-targeting, and integrated marketing.

Obama may be struggling in the polls and even losing support among his core boosters, but when it comes to the modern mechanics of identifying, connecting with and mobilizing voters, as well as the challenge of integrating voter information with the complex internal workings of a national campaign, his team is way ahead of the Republican pack.

All this has some GOP supporters concerned. Martin Avila, a Republican technology consultant, states in the same article that he doesn’t think that anyone on the opposing side fully understands the power of organizing and analyzing all of this data. According to Avila, the current GOP use of information technology is still largely shaped by its pre-Internet experience in broadcast advertising.

In some ways, this cavalier attitude toward the value of data shouldn’t come as a complete surprise. One trait that many members of the so-called “party of business” share with executives in the private sector is a strong attachment to a “gut based” approach to making decisions.

A recent Accenture Analytics survey of over 600 managers at more than 500 companies found that senior managers rarely used data-driven analysis when making key business decisions and instead relied heavily on intuition, peer-to-peer consultation, and other soft factors. According to the study, 50% of companies weren’t even structured in a way that would allow them to use data and analytical talent to generate enterprise-wide insight. In addition, those organizations that did make analytics-based decisions often depended on inconsistent, inaccurate, or incomplete data.

Savvy voters, like savvy customers, have come to expect a certain level of performance and consistency from the IT systems they use. This is bad news for businesses that still think that things like social media, data analytics, and master data management are gimmicks:

Organizations that fail to tackle the issues around data, technology and analytics talent will lose out to the high-performing 10 percent who have leveraged predictive analytics to become more agile and gain competitive advantage.

Creating a structured program for better targeting and more efficient communications seems like a no-brainer these days, but, for now, there doesn’t seem to be a lot of competition.

Further Reading:

    • 1/30/2012 – Slate recently published an article that talks about the different philosophies guiding the development of Democratic and Republican voter databases. Catalist, an independent data initiative, is focused less on profit and more on becoming “an indispensable tactical resource for the American left” with a privately-funded data warehouse containing records of the entire voting-age population combined with other commercially available data. Its customers include many traditionally liberal groups who consider the Democratic National Committee’s database insufficient. In response, the DNC has stepped up development of its own database, the Voting List Management Cooperative (or “Co-op”). To take advantage of the increased desire for voter information, the DNC has also developed statistical models that are particularly valuable for candidates. Meanwhile, the Republican National Committee established the Data Trust, a private company filled to the brim with former RNC staffers and committee members. The goal of this organization is to create robust voter profiles that can be shared with political allies. However, because of concerns about outside influence, the RNC is modeling it more along the lines of the DNC’s data co-operative instead of the more independent Catalist. The Data Trust development model is also less focused on data mining activities and more on basic data.
    • 7/17/2012 – Another Slate article. This one covers the Romney campaign’s attempt to boost its analytics efforts. Their initial approach appears to center on trying to figure out the President’s strategy by tracking his movements and breaking down his ad buys. This seems pretty reactive to me but time will tell.

King Bhumibol is a Noob

In an effort to suppress disparaging remarks about the monarchy, the government of Thailand has recently established an official agency called the Office of Prevention and Suppression of Information Technology Crimes. The sole purpose of this department is to enforce the country’s lèse-majesté laws by combing the Internet for anything offensive to King Bhumibol Adulyadej and his family and then either eliminating or blocking the offending material.

Agency technicians have apparently blocked over 70,000 pages so far, including those with pictures of the king with a foot above his head (considered very rude) and those that misuse informal pronouns before the king’s name.

Punishment for such disrespect for authority can be harsh. Under Thai law, even the digital distribution of information that threatens the “good morals of the people” will get you five years in prison. For anyone who insults or defames the royal family, sentences can stretch to 15 years.

I am always surprised at the lengths to which repressive regimes will go in order to “safeguard” the sensibilities of their citizens while trying to maintain the openness and flexibility of the Internet. I’m even more surprised at this particular effort to shield a grown man from the forms of mild online abuse and disagreement that confront other world leaders every day.

This kind of experience can certainly be frustrating. However, is whitewashing the Internet really the answer? Is it even possible? Wouldn’t everyone’s time be better spent teaching the king to deal with a few negative comments rather than censoring the entire Web? I understand people’s desire to protect someone they love from getting hurt but, in the long run, such heavy-handed tactics will probably fail. The Internet is just too irrepressible.

In an old discussion of online ethics, Simon Waldman notes:

“I find the views expressed on many organizations’ sites repellent. But one of the greatest achievements of the Internet has been to create the greatest gallery of human opinion in history, and that is something we should marvel at, rather than shake our heads in dismay.”

Would the people of Thailand deny their king access to such a place?

BTW: Sawatdee-krap, Mr. Surachai


Three Rules of PowerPoint

The sheer ubiquity of the Microsoft Office suite has created a cottage industry around the evaluation and critique of its bundled applications. Microsoft Excel, with its attendant realm of spreadmarts and shadow databases, seems to draw the most negative attention from the business world (particularly IT), but PowerPoint isn’t far behind.

At various times, the world’s leading presentation software has been banned by CEOs of some of the world’s largest companies, called an internal threat by the U.S. military, and served as the driving force behind the establishment of the Anti-PowerPoint Party — a Swiss political party whose only stated goal is to rid the world of boring presentations. It has even been suggested that the “chronic use of PowerPoint” at NASA helped obscure critical information that might have prevented the 2003 Columbia space shuttle disaster.

How has a little presentation program like PowerPoint earned the ire of so many people? Sure, the tool has its flaws (discussed here, here, and here) but are these shortcomings really the cause of all the world’s slide show ills? Well, yes and no.

Generalized applications like those found in the Microsoft Office suite help companies hold down costs while providing their workforce with a fairly decent toolset. However, once these programs are in place, there is rarely any business incentive to provide additional training or purchase more specialized applications for more complex tasks. This leaves users in a bind: either they can sit around waiting for more instruction and more powerful tools, or they can start experimenting with the tools they already have.

Like its close cousin, Excel, PowerPoint suffers from the fact that most people end up using it for tasks that it was never designed to do. In the case of Excel, a simple accounting application has become the de facto database and analytics package for most businesses while, with PowerPoint, a basic slide management tool has supplanted lectures and written reports to become the sole information delivery platform.

You would think that PowerPoint would be well-suited to this role. People seem to prefer multimedia presentations over standard lectures, and studies in dual-coding theory suggest that they retain more information from presentations that have both verbal and visual content. However, because most speakers don’t really exploit the full visual capabilities of PowerPoint, their presentations become a combination of verbal and textual content … and retention of information presented in this format may be much worse.

The problem is that the cognitive process of creating a presentation in PowerPoint is very different from the cognitive process of watching one. Speakers get so involved in the preparation of their slide deck that they rarely give much thought to how it will be received by the audience.

Max Atkinson sums it up:

“PowerPoint makes it so easy to put detailed written and numerical information on slides that it leads presenters into the mistaken belief that all the detail will be successfully transmitted through the air into the brains of the audience.”

This assumption fails because:

“… the audience’s attention is split between (1) trying to read what’s on the screen at the same time as (2) listening to and following what the speaker is saying and (3) looking repetitively from speaker to screen and back again.”

Simply adding a graphic element to a text slide doesn’t necessarily improve knowledge retention, either. Researchers have found that the use of unrelated pictures in a presentation (think clip art) can actually distract the audience from the main content and interfere with overall learning. This is because people end up paying too much attention to the non-essential material on the screen and not enough to the text or narration.

A more effective technique involves the use of custom-designed images that relate directly to the concepts being presented. Even very complex topics can be tackled using such a combination of pictures and text, and the level of information recall is much higher. However, this approach requires a better understanding of how people absorb information, and not everyone will have the time or inclination to learn the basic principles of multimedia design.

The realities of human learning would seem to suggest that a presenter err on the side of simplicity, but that approach comes with its own set of pitfalls. For example, some experts say that you should use only one slide for every 2-3 minutes of speaking, while others suggest that you should never use more than 2-3 sentences per slide. Doing the math, this means that the “ideal” PowerPoint presentation would deliver no more than 1 to 1.5 sentences to the audience every minute, and an hour-long presentation would contain a maximum of 90 sentences. Even an adult with below-average reading skills can tackle that amount of text in about 10 minutes. Expecting people to sit through such a glorified guided-reading course is a recipe for boredom.
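Working the quoted rules of thumb through quickly (the figures are the rules of thumb from the text, not measured data, and the variable names are mine):

```python
# Rules of thumb quoted above -- illustrative only, not measured data.
minutes_per_slide = 2        # low end of "one slide every 2-3 minutes"
sentences_per_slide = 3      # high end of "2-3 sentences per slide"
talk_minutes = 60

max_slides = talk_minutes // minutes_per_slide       # 30 slides
max_sentences = max_slides * sentences_per_slide     # 90 sentences
sentences_per_minute = max_sentences / talk_minutes  # 1.5 per minute

print(max_slides, max_sentences, sentences_per_minute)  # 30 90 1.5
```

Even at the generous end of both rules, an hour of speaking yields less text than a single page of a report.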

Some executives respond to this kind of presentation bloat by imposing an upper limit on the total number of slides … say six or so. While this directive cuts down on the overall size of the presentation, it also starts to have a negative impact on the content. To meet the restriction, presenters are either going to try to cram more information onto the few slides they have available (making their presentation incomprehensible) or dumb down their presentation entirely (making it irrelevant or even ridiculous).

Therein lies the dilemma. Some say that creating a meaningful presentation in PowerPoint is impossible:

“There is simply no way to express precise, detailed and well-articulated ideas or subjects through Powerpoint.”

Others say that the tool is perfectly fine and that any fault lies with the user:

“Is PowerPoint bad? No, in fact, it is quite a useful tool. Boring talks are bad. Poorly structured talks are bad. Don’t blame the problem on the tool.”

My own thoughts tilt towards the idea that the tool has been badly misused and its reputation can be redeemed through proper use. Many people have written extensively on what you should and shouldn’t do with your PowerPoint presentations (here’s one) but I have distilled my own thoughts on the subject down to three basic rules or guiding principles (with exceptions, of course):

1. Don’t use any text – That’s right, you heard me, people … none. PowerPoint is a visual medium and should only be used for visual images. You’re supposed to be telling a story, not writing a grocery list. Simply putting your speaker notes on screen is a cop-out and will leave your audience squirming in their seats after the first five bullet points. Yes, you can create an outline to help organize your thoughts, but by the time you’re done developing your presentation, these blocks of text should be gone. Exceptions: Every visual medium uses some text on occasion. Things like titles, section breaks, tables, end notes, and explanatory text on charts are all welcome in moderation … but, if you strive for a slide deck that is 100% text-free, you might actually achieve something that is 80% text-free, which is way better than 90% of PowerPoint presentations out there.

2. Only use images or videos that you create yourself – As you struggle with the content of your presentation, it is always tempting to add a little cartoon, GIF animation, or random stock photo to spice things up. Don’t do this. Your presentation should be tailored to deliver a specific idea to a specific audience. Adding someone else’s work to your presentation – even a picture from your company’s own brand library – is just a distraction. Build your own charts, draw your own diagrams, and create your own videos. You will be rewarded with a presentation that is consistent and perfectly suited to your message. Exceptions: If you are truly creatively challenged, find someone else who can help you visualize your ideas. Don’t appoint a committee to the task, however, since you want to maintain a consistent visual language.

3. Focus on your delivery, not your handouts – Using PowerPoint to display pictures and graphs that support your presentation is good … using PowerPoint as a crutch to help you get through your talk is bad. Memorize what you want to say and prepare notes that you can use for reference while you are speaking. The audience should be getting a well-delivered presentation from someone who is organized and confident, not the half-formed thoughts of someone reading from their slide handouts. Exceptions: Seth Godin recommends creating a written document that complements your PowerPoint presentation and handing it out after you’re done speaking. This document shouldn’t substitute for adequate preparation but it should support your key points and provide additional details that help your audience understand the topic.

P.S. For a PowerPoint presentation of this post, click here.


Addressing the Loss of Institutional Memory

There was an interesting article in the news today that highlights the struggle organizations face when they try to preserve knowledge, codify decisions, or record experiences in a way that can be passed on from one generation of staff to the next.

The backstory is that United Airlines recently re-assigned the numbers Flight 93 and Flight 175 to existing Continental Airlines flights, triggering protests from pilots and flight attendants who felt the reuse of these numbers was disrespectful to the people who lost their lives in the 9/11 hijackings. The two companies are in the process of merging, and a spokesperson for the airline claimed that the revival of these particular numbers was “an unfortunate and inadvertent mistake” caused by a computer glitch.

I certainly believe that the error was unintentional, but I think it’s disingenuous to blame this on a simple computer glitch. This is clearly a business process error stemming from the consolidation of two different systems.

My guess is that somewhere in the bowels of the program that handles the random assignment of flight numbers, there is (or was) a hard-coded sequence that screens out codes like 93 and 175 (and, for superstitious reasons, 13 and 666). With the merger of United and Continental, this process was either erased or bypassed, allowing the error to manifest itself in an improper assignment.

Nobody caught the problem at this point because the activity of assigning flight numbers became disassociated from the cultural weight given to certain numbers. In other words, the company’s business processes and business rules were no longer capable of conveying the rationale behind a particular business decision.
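To make the idea concrete, the lost rule might have looked something like this minimal Python sketch (the names, ranges, and blocked values are my own invention, not United’s actual system):

```python
import random

# Numbers excluded from assignment for historical or superstitious
# reasons. The rule lives only in this set -- nothing here records
# *why* these values are blocked, so a systems merger that drops or
# bypasses the set silently discards the rationale along with it.
BLOCKED_NUMBERS = {13, 93, 175, 666}

def assign_flight_number(in_use, low=1, high=9999):
    """Randomly pick an unused flight number, skipping blocked codes."""
    available = set(range(low, high + 1)) - BLOCKED_NUMBERS - set(in_use)
    if not available:
        raise ValueError("no flight numbers left to assign")
    return random.choice(sorted(available))

number = assign_flight_number(in_use={11, 77})
```

The comment block is the only place the rationale survives, and comments rarely make it through a migration intact.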

This kind of knowledge hiccup happens all the time (although probably with less media impact). Most companies are just not very good at setting up a system that reliably captures institutional memory in a way that guarantees continuity. Instead, they rely on their employees to learn and store important aspects of their business. This is called experience or, alternatively, job security.

The problem with this approach is that demographic trends are working against you. With the inevitable retirement of older workers and the more nomadic nature of the younger workforce, businesses are going to need to start paying more attention to the transmission of knowledge and resources from one person to another. Don’t wait for a knowledge management mistake to show up in the media before you act.

Is Lying to the Public OK?

In a recent post, Lane Wallace discusses the pros and cons of a proposed amendment to Canada’s Broadcasting Act of 1986 which would have allowed broadcasters more leeway to air false or misleading news. As you might imagine, the proposal generated some controversy, with free speech advocates saying the current law is too restrictive and others worrying that the modifications would lead to a more toxic (and perhaps less accurate) American-style news environment.

Basically, the issue boiled down to whether or not people felt that TV and radio broadcasters had the same right to free speech as individuals. Because of their access to the public airwaves and the incredible power associated with holding a broadcast license, the CRTC (the Canadian equivalent of America’s FCC) determined that licence holders must be held to a different standard and withdrew the proposed amendment.

Ms. Wallace wonders why we don’t have something similar in the U.S.:

Is it unacceptable censorship to require someone to be basically honest in what they broadcast as “news” — and which we are more likely to accept as truth, because it comes from a serious and authoritative-sounding news anchor?

We prohibit people from lying in court, because the consequences of those lies are serious. That’s a form of censorship of free speech, but one we accept quite willingly. And while the consequences of what we hear on television and radio are not as instantly severe as in a court case, one could argue that the damage widely-disseminated false information does to the goal of a well-informed public and a working, thriving democracy is significant, as well.

One could counter that it is up to the individual to pick and choose their own news sources (something I have also discussed in the past) but she points out that:

In theory, we could all fact-check everything we hear on the TV or radio, of course. But few people have the time to do that, even if they had the contacts or resources.

Regulating U.S. broadcasters in such a manner would no doubt raise cries against the “nanny state” from many circles, but I suspect that these same folks are the ones who have raised obfuscation to a high art. In the long run, failing to hold news agencies responsible for their content does more harm to our society than good.

Notes from the Margin: The Tipping Point

I finally had a chance to read Malcolm Gladwell’s book The Tipping Point the other day and I thought I’d pull together a few quick comments.

Things that caught my fancy:

1. Social epidemics – On his website, Gladwell defines a meme as an infectious idea that spreads through a population like a virus or a disease. He prefers the term social epidemic, however, because he wants to understand more about the details of the ideas or behaviors that are being transmitted as well as the methods of transmission. (Richard Dawkins originated the term “meme” in his book The Selfish Gene.)
2. The Broken Windows Theory – Even though the debate on the efficacy of this theory still rages, Gladwell gives a good overview of the basic idea that small positive changes in the urban environment can have a large effect on social behavior.
3. The Rule of 150 – I had heard of Dunbar’s Number (or “monkeysphere”) before I read the book but I really liked Gladwell’s presentation of the concept. He builds up to it by first introducing the idea of channel capacity — the fact that the brain only has a certain amount of space that it can devote to storing or processing various types of information. He then relates channel capacity to a few familiar studies (like George Miller’s “The Magical Number Seven”) and notes how these cognitive limitations can have real-world applications (such as limiting phone numbers to seven digits). Finally, he discusses social channel capacity, or the natural limit of close relationships among humans, which appears to be approximately 150 people. In any group larger than 150, it becomes difficult to keep track of all the relationships.
4. Maven traps – These are methods of identifying and communicating with people who (perhaps obsessively) accumulate knowledge about very specific things. Gladwell devotes an entire section of the book to these information specialists but only mentions a few examples of Maven traps near the end. I liked his examples because they reveal a lot about human quirks and interests while demonstrating the opportunities available to companies that are willing to listen to these personalities.
5. Law of Plentitude – This gets a brief mention in the afterword, and it refers to a chapter in Kevin Kelly’s book New Rules for the New Economy. The concept here is that the value of an individual node in a network starts at zero when there is only one node and increases exponentially with the addition of each node. (I haven’t read this particular book yet but I’ve really enjoyed some of KK’s other work, so I’ll have to add it to the list.)
6. Network nuisance costs – In response to the law of plentitude, Gladwell discusses some of the inherent issues associated with the growth of large networks. He points out that the human capacity to deal with phone calls and e-mails does not scale well and that, eventually, people develop an “immunity” to these forms of communication. In other words, as the number of messages increases, people become more selective (through the use of filters) so they don’t become overwhelmed. (He also notes that part of that increased selectivity involves a greater dependence on personal connections and trusted advisers.)
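A rough way to see the shape of the curve Gladwell is describing is to count the possible connections in a network, Metcalfe-style (a simplification used purely for illustration; Kelly’s own argument is broader):

```python
def possible_links(n):
    """Distinct pairwise connections among n network nodes."""
    return n * (n - 1) // 2

# A lone node has nothing to connect to; each newcomer links to every
# existing node, so value climbs much faster than the node count.
growth = [possible_links(n) for n in range(1, 6)]
print(growth)  # [0, 1, 3, 6, 10]
```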

Rise of the (Old) Machines

With the recent shutdown of Egypt’s Internet and cell phone service, people have started to break out some older technology to keep the flow of information going. Fax machines, dial-up modems and ham radios are all back in vogue now that the communications blockade is up and running. (It kind of makes me regret throwing out my U.S. Robotics 56K modem over the holidays.)

Interestingly, a contractual loophole that prevents the Egyptian government from accessing decrypted messages and data sent from BlackBerrys has allowed these devices to remain functional during the blackout. Apparently, in negotiations with Research in Motion (RIM), the Egyptian government failed to gain access to encrypted data sent via BlackBerry servers. Unfortunately for hopeful revolutionaries in the rest of the Islamic world, the governments of Saudi Arabia, the United Arab Emirates and Indonesia all demanded — and received — such access.