Tag Archives: Map

Most Popular Word Roots in U.S. Place Names

My family visited Washington, D.C. last year for Spring Break and, during our 12-hour drive, I remember noticing a subtle change in the names of the cities and towns we were passing through. In the beginning, the place names had a familiar Midwestern flavor, one that mixed Native American origins (e.g. Milwaukee, Chicago) with bits of French missionary and 19th-century European settler. The names slowly took on a more Anglo-Saxon bent as we moved east, traveling through spots like Wexford, PA, Pittsburgh, PA, Gaithersburg, MD, Boonsboro, MD, Hagerstown, MD, and Reston, VA.

We have English-sounding place names in Wisconsin, of course, including highfalutin towns like Brighton, Kingston, and New London, but they seem to get overwhelmed by the sheer number of places with syllables like “wau”, “kee”, and “sha” (or all three combined). Many of these town names can be difficult for “outsiders” to pronounce and the spelling is all over the place since they were often coined by non-native speakers who’d misheard the original words. (The Native American word for “firefly”, for example, is linked to variations like Wauwatosa (WI), Wawatasso (MN), and Wahwahtaysee Way (a street in MI).)

I thought it would be interesting to see if there were any patterns to these U.S. place names or toponyms so I pulled a list of Census Places and extracted the most frequent letter combinations from the names of the country’s cities, towns, and villages. I tried to isolate true prefixes and suffixes by removing any letter pairings that were simply common to the English language, and I then counted up the number of times each word root appeared and ranked them by state.
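A minimal Python sketch of this kind of counting might look like the following. It is an illustration rather than the original script: the function name `affix_counts`, the 3-to-5 letter window, and the six sample names are all assumptions, and it skips the step of filtering out pairings that are merely common in English.

```python
from collections import Counter


def affix_counts(names, min_len=3, max_len=5):
    """Count the leading and trailing letter runs of each place name.

    Hypothetical helper: the window sizes are an assumption, not the
    values used for the maps in this post.
    """
    counts = Counter()
    for name in names:
        # Keep letters only so "New London" becomes "newlondon".
        word = "".join(ch for ch in name.lower() if ch.isalpha())
        for n in range(min_len, max_len + 1):
            if len(word) > n:
                counts["-" + word[-n:]] += 1  # candidate suffix
                counts[word[:n] + "-"] += 1   # candidate prefix
    return counts


# Sample names taken from the post; a real run would use the full Census list.
places = ["Appleton", "Kingston", "Brighton", "Waukesha", "Wauwatosa", "Waunakee"]
counts = affix_counts(places)
top_roots = counts.most_common(2)
```

On this tiny sample the two most frequent candidates are the "-ton" suffix and the "wau-" prefix, each appearing three times.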

Top 10 Word Roots by State

After looking over the top word roots by state, I was interested in seeing more detail so I calculated a location quotient for some of the most common word roots and plotted these out by county. Click on the maps for a larger D3 map.
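For readers unfamiliar with the term, a location quotient compares a local share to a national share. The sketch below assumes the textbook definition (share of places in a county containing the root, divided by the same share nationwide); the function name and the sample numbers are made up for illustration and are not taken from the data behind the maps.

```python
def location_quotient(local_hits, local_total, national_hits, national_total):
    """Return (local share of a root) / (national share of a root).

    Values above 1 mean the root is over-represented locally.
    Hypothetical helper assuming the standard LQ definition.
    """
    if local_total == 0 or national_hits == 0 or national_total == 0:
        return 0.0  # nothing to compare against
    local_share = local_hits / local_total
    national_share = national_hits / national_total
    return local_share / national_share


# Made-up example: 6 of a county's 40 places contain the root,
# versus 300 of 30,000 places nationwide.
lq = location_quotient(6, 40, 300, 30000)
```

With these invented numbers the quotient comes out at about 15, meaning the root is roughly fifteen times more common in the county than in the country as a whole.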

Location Quotient for “ton”
The word town derives from the Germanic word for “fence” or “fenced settlement.” In the U.S., the use of -ton/-town to honor important landowners or political leaders began before the American Revolution (think Jamestown, VA or Charleston, SC) and continued throughout the settlement of the country. (Interestingly, my hometown of Appleton, WI was named for philanthropist Samuel Appleton and is not a true town-based word root.)

Location Quotient for “boro/borough”
The word borough originates from the Germanic word for “fort” and has many common variations, including suffixes like -borough/-boro, and -burgh/-burg. Like -ton/-town, these place name suffixes became popular in the 18th century and were used extensively throughout New England and the Atlantic coastal colonies. You can see how dominant the -boro/-borough suffix is in the upper Northeast.

Location Quotient for “ville”
The suffix “ville” comes from the French word for “farm” and is the basis for common words like “villa” and “village”. The use of the suffix -ville for the names of cities and towns in the U.S. didn’t really begin until after the Revolution, when pro-French sentiment spread throughout the country, particularly in the South and Western Appalachian regions. The popularity of this suffix began to decline in the middle of the 19th century but you can still see its strong influence in the southern states.

Location Quotient for “san/santa”
The Spanish colonial period in the Americas left a large legacy of Spanish place names, particularly in the American West and Southwest. Many of the Californian coastal cities were named after saints by early Spanish explorers, while other cities in New Spain simply included the definite article (“la”, “el”, “las” and “los”) in what was often a very long description (e.g. “El Pueblo de Nuestra Señora la Reina de los Ángeles del Río de Porciúncula” … now known simply as Los Angeles or LA). The map shows the pattern for the San/Santa prefix, which is strong on the West Coast and weaker inland, where it may actually be an artifact of some Native American word roots.

Location Quotient for “Lake/Lakes”
The practice of associating a town with a nearby body of water puts a wrinkle into the process of tracking place names (the history of “hydronyms” being an entirely different area of study) but it was common in parts of the country that were mapped by explorers first and settled later. This can be seen in the prevalence of town names with word roots like Spring, Lake, Bay, River, and Creek.

Location Quotient for “Beach”
There is a similar process for other prominent features of the landscape such as fields, woods, hills, mountains, and — in Florida’s case — beaches.

Location Quotient for “wau”
Here is the word root that started this whole line of inquiry. It is apparently a very iconic Wisconsin toponym, with even some of the outlying place names having Wisconsin roots (the city of Milwaukie in Clackamas County, Oregon was named after Milwaukee, Wisconsin in the 1840s).

D3 Notes:

Infographics and Data Visualization (Week 5/6)

I took part in a brief discussion on the student forum after the Week 4 project and it made me realize that I’d been spending so much time trying to create a functional interactive graphic in Tableau that I was missing out on practicing some of the basic techniques of the class. Add the fact that my favorite attempt so far was a sketch I laid out in PowerPoint, and I decided that I should focus on the structure and design of the graphic to see what I could come up with.

The topic I picked was based on some data that I’d pulled back in May/June that I’d never had a chance to use. This data covered all of the various U.S. breweries and the variety of beers they made. I did some additional research to add some information on beer ingredients (especially water, barley and hops) as well as some interesting stats on beer consumption based on a few fun maps done at FloatingSheep.

I spent a good deal of time coming up with the basic grid of the graphic, which ended up having a static left-hand column for the introduction to each topic and an interactive map of the U.S. on the right. The interactive portion consists of tabbed sections that allow you to navigate through several subtopics.

The flow of the series starts with an overview of beer production in the U.S., moves to a section on the ingredients of beer, and ends with information on American beer consumption. (I also thought about including some local beer stats for the great State of Wisconsin but that may have to wait.)

Due to time constraints, these mockups contain sample maps from other sources (here and here):

Geographic References in Local Business Names

This little exercise came about after I read an article on the old Northwest Territory in the U.S., which basically consisted of all the land west of Pennsylvania, northwest of the Ohio River, and east of the Mississippi River. As the country expanded westward, this geographic area gradually became known as the “Midwest” (or the East North Central States region) but not before the older name left its mark on the local culture. Organizations like Northwestern Mutual Life (Milwaukee) and Northwestern University (Chicago) still refer back to the days when these places were located on the fringe of the country, not at its center.

It occurred to me that researching such place names would be a good way to see if there was still a residual “shadow” of the old Northwest territory so I downloaded a sample list of company headquarters with the phrase “Northwest” or “Northwestern” in their names and plotted them on a map. Alas, this attempt failed to find anything significant (there was too much competition with the Pacific Northwest in name usage). However, I did look up some other regional terms with more positive results.

 

The geographic patterns for most of these terms are fairly distinct but there are also some areas of overlap. It was especially interesting to see regions that had local businesses in three or more categories. The old Northwest territory fits this mold with a combination of Midwest, Great Lakes, and Prairie.

Wisconsin Voters Banished to NULL Island

The top headline in my local paper this morning was “Glitch puts some Wisconsin voters in Africa” … an interesting thing to ponder over a bowl of Quaker Oatmeal Squares. I suppose this problem merits at least some attention given the heated political climate surrounding the state’s voter redistricting process. But headline news? Above the fold? Sounds like a slow news day to me.

Online, of course, the debate has already devolved into the standard round of mudslinging and name-calling so good luck trying to find out what’s going on from that crowd. The reporters themselves focused on the political fallout of the issue rather than an explanation so no help there either. I guess it’s up to the humble folks at Ideas Illustrated to offer up some insight!

The first clue to the problem can be found in the article’s pullout quote, which describes the voter’s location as the “coast of Africa” and not a specific country in Africa. The second clue can be found deep within the article when it is mentioned that clerks have recently made changes to the way voters are being entered into the voter registration database:

“ … voters are [now] being entered into different districts by the physical location of their address in computerized maps. Previously, they were entered into different districts in the state voter database according to where their address fell in certain address ranges.”

These two hints point to a very common problem associated with geocoding, which is the process of converting a postal address to a set of map coordinates. Let’s backtrack. An online mapping tool like Google Maps uses specific geographic coordinates (latitude and longitude) to place a location on a map. However, because none of these physical locations are actually stored in a database anywhere, the tool needs to interpolate the coordinates from a vector database of the road network (i.e. a mathematically represented set of lines).

For example, if you look up the address for Trump Tower, you find that it is located at 725 Fifth Avenue in Midtown Manhattan. When you enter this address into Google Maps, the tool finds 5th Avenue on the underlying road grid and then uses an algorithm to determine that the “725” address is somewhere between 56th and 57th streets. It will also determine which side of the street the address is located based on stored knowledge of the “odd” and “even” numbering pattern. In other words, it’s guessing.
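To make the guessing concrete, here is a rough Python sketch of linear address interpolation along a single street segment. It shows the general idea only and makes no claim about how Google Maps or any particular geocoder works. The segment layout, field names, address range, and coordinates below are all invented for illustration.

```python
def interpolate_address(house_number, segment):
    """Estimate (lat, lon) for a house number on one side of a street segment.

    `segment` is a hypothetical record holding the address range for one
    side of the block, the parity of that side, and the two end points.
    """
    low, high = segment["from_number"], segment["to_number"]
    if not low <= house_number <= high:
        return None  # number is not on this block
    if house_number % 2 != segment["parity"]:
        return None  # wrong side of the street
    # Fraction of the way along the block, from 0.0 to 1.0.
    t = (house_number - low) / (high - low) if high != low else 0.0
    (lat1, lon1), (lat2, lon2) = segment["start"], segment["end"]
    return (lat1 + t * (lat2 - lat1), lon1 + t * (lon2 - lon1))


# Invented block: odd numbers 701-799 between two made-up intersections.
block = {
    "from_number": 701,
    "to_number": 799,
    "parity": 1,  # 1 = odd side, 0 = even side
    "start": (40.0, -74.0),
    "end": (40.002, -73.998),
}
guess = interpolate_address(725, block)
```

The result lands about a quarter of the way along the block, which is exactly the kind of educated guess described above: it assumes the buildings are evenly spaced, whether or not they really are.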

Google Map detail of the area around Trump Tower

 

TIGER/Line® Shapefile detail of the same area

These guesstimates work pretty well in dense urban environments where there are a lot of cross streets to serve as reference points. In rural areas, the curvilinear streets and widely-spaced buildings make things a little more difficult. When the situation gets really muddled, some mapping tools essentially “punt” and enter a default set of coordinates. In the case of the Wisconsin voter addresses, these default coordinates are 0.00 degrees latitude and 0.00 degrees longitude. Where is this exactly? It is the intersection of the Prime Meridian and the Equator … which occurs just off the coast of Africa.

Geographers have actually given this place a rather fanciful name: NULL Island (it is not, in fact, a real island). It even has its own web site and unofficial flag (below right).

So there are no nefarious schemes behind this situation … just normal, everyday data problems. The state clerks need to tell their IT guys to flag the errant voter addresses and then they can assign them to the appropriate districts by hand. Problem solved. However, they should be aware that interpolation is an imperfect process and, in addition to assigning blocks of voters to NULL island, the geocoding process may also assign voters to the wrong districts. This could be particularly true for people who live close to a district boundary. It might actually make sense to keep the old method around for backup.
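Flagging the errant records is not much work. The sketch below assumes the geocoded voters are available as a list of simple records with `lat` and `lon` fields; the record layout, the function name, and the sample data are hypothetical and have nothing to do with the state's actual database.

```python
def split_geocoded(records, tolerance=1e-9):
    """Separate usable records from those that need manual review.

    A record is flagged when a coordinate is missing or when both
    coordinates sit at (0, 0), the usual "geocoder gave up" default.
    """
    usable, flagged = [], []
    for record in records:
        lat, lon = record.get("lat"), record.get("lon")
        missing = lat is None or lon is None
        if missing or (abs(lat) < tolerance and abs(lon) < tolerance):
            flagged.append(record)
        else:
            usable.append(record)
    return usable, flagged


# Invented sample: one plausible Wisconsin point, one NULL Island, one blank.
voters = [
    {"id": "A1", "lat": 43.07, "lon": -89.40},
    {"id": "B2", "lat": 0.0, "lon": 0.0},
    {"id": "C3", "lat": None, "lon": None},
]
usable, flagged = split_geocoded(voters)
```

A check like this only catches the obvious failures. It says nothing about a voter who was geocoded to a real but wrong location near a district boundary, which is the harder problem mentioned above.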

Six Degrees of Joy Division

My local record store used to have this great poster on the back wall that explained how several dozen British indie bands from the 80s were all linked together through their various group members. The title of the poster was something like “Why All These Bands Sound the Same” and it was clearly a tongue-in-cheek slam of the gloomy post-punk sound of musical groups like Bauhaus and the Smiths.

I loved the design concept and looked for the poster when the store finally went out of business a few years ago. Although I never found it, it occurred to me recently that I might be able to reconstruct the graphic using some modern tools and data from the online music site AllMusic.com.

AllMusic is an outstanding musical resource and their meticulous site formatting allowed me to write a program that would crawl from page to page gathering information about interrelated bands and band members as it went. I decided to use the group Joy Division as a starting point because I liked the movie Control and had a vague memory of that particular band name appearing on the poster. The program ran overnight … evaluating 37,538 separate pages before it completed its run.

Using the IBM visualization tool, Many Eyes, I created a network diagram of the bands that are within six steps of my “seed” group. The full interactive results are at the end of the post (worth the effort if the Many Eyes site is working) but here is a detail:


The Joy Division Network

At nearly 38K records, this particular musical network covers a huge swath of Anglo-American rock-and-roll and includes almost all of the major groups in the Pop/Rock genre. What’s perhaps most interesting about this massive network is the fact that Joy Division is only linked to two bands directly, the acclaimed New Order (formed in 1980 after the death of JD vocalist Ian Curtis) and the Manchester supergroup Freebass (formed in 2004). All other connections are indirect, with a total of 20 degrees of separation between Joy Division and the most distant band in the network, post-grunge Los Angeles outfit Open Hand (formed in 2000).
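Counting degrees of separation in a network like this is a job for a breadth-first search. The Python sketch below is an illustration rather than the program used for this post. The two links from Joy Division come from the paragraph above, while "Band A" and "Band B" are placeholders and the helper names are made up.

```python
from collections import deque


def build_links(pairs):
    """Turn (band, band) pairs into an undirected adjacency map."""
    links = {}
    for left, right in pairs:
        links.setdefault(left, set()).add(right)
        links.setdefault(right, set()).add(left)
    return links


def degrees_from(seed, links):
    """Return the number of steps from `seed` to every reachable band."""
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        band = queue.popleft()
        for neighbor in links.get(band, ()):
            if neighbor not in distance:  # first visit is the shortest path
                distance[neighbor] = distance[band] + 1
                queue.append(neighbor)
    return distance


# Shared-member links; "Band A" and "Band B" are placeholders.
links = build_links([
    ("Joy Division", "New Order"),
    ("Joy Division", "Freebass"),
    ("New Order", "Band A"),
    ("Band A", "Band B"),
])
distance = degrees_from("Joy Division", links)
within_six = {band for band, steps in distance.items() if steps <= 6}
```

Bands that never show up in `distance` are not connected to the seed at all, and the largest value in `distance` is the depth of the network as measured from the seed.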


Other Thoughts on the Data

The first odd thing I noticed about the network was that, by focusing on the relationships between bands, the network excludes a lot of well-known solo artists. Even when these musicians joined a band, their independent careers limited these associations to one or two instances. The best example of this situation would be someone like Elvis Presley or Johnny Cash. Both of these artists were loosely linked together through a glorified hootenanny called The Million Dollar Quartet (along with Carl Perkins and Jerry Lee Lewis). The only other bands in this network are The Offenders and the Cash-related groups The Highwaymen and Johnny Cash & the Tennessee Two. Some of the other solo artists in this minor network are household names (depending on the household, of course), including Waylon Jennings, Kris Kristofferson, and Willie Nelson. Three bands, a half-dozen stars and a lot of hits … but no direct connection to the huge Joy Division network. Many current rap artists seem to fit this mold as well.

On the flip side, progressive rock groups like King Crimson had members who were in dozens of other bands. These social connectors can be seen at the center of a huge spider web of interrelated groups in the network diagram. Bands like these are often experimental in nature, with talented musicians putting their stamp on a number of different side projects. Some very influential artists can be spotted in the midst of these groups, including — using King Crimson as an example — famous journeyman players like Robert Fripp, Adrian Belew, John Wetton and Greg Lake.

Finally, although I distinctly remember the band Bauhaus and its associated constellation of bands (Love & Rockets, Tones on Tail, The Jazz Butcher, etc.) on the poster, they were not within six degrees of separation of Joy Division in the network data (they were about eight links away). This exposes an issue with my data gathering methodology because it doesn’t take into account other relationships between artists such as mentors, guest musicians, common producers or other ties. Still, it was an interesting exercise with fruitful results.

Additional Interactive Charts

Bubble diagram of musical styles (full band network):
Network diagram (six degrees of Joy Division):