This Amazing Map Tells You How Many Times You Live
How babby is formed, in one chart.
Blue states watch more porn. But what’s the matter with Kansas?
According to Pornhub Insights, Kansas leads the nation in porn pageviews per capita at roughly 194. They don’t specify what interval this covers (monthly, weekly, etc.), but the state-by-state comparison is nonetheless interesting.
Plotting Obama vote share in 2012 versus porn consumption, it looks like blue states consume more porn per capita than red ones. Aside from Kansas - a clear outlier - and Georgia, the remaining top ten per-capita porn consumers are all blue. Similarly, New Mexico and Maine are the only blue states in the bottom ten per-capita porn consumers.
The data do make you wonder what’s going on in Kansas. Per-capita page views are not the most precise metric - a handful of mega-users could easily skew a state’s view totals, as could a high proportion of page views from bots or search engine crawls. Pornhub is probably being purposefully vague here - when it comes to web analytics, small differences can mean millions of dollars. Giving away too much of that information could give their competitors a leg up.
Aha - sharp-eyed readers of Andrew Sullivan’s blog have noted that Kansas’ strong showing is likely an artifact of geolocation - when a U.S. site visitor’s exact location can’t be determined by the server, they are placed at the center of the country - in this case, Kansas. So what you’re seeing here is likely a case of Kansas getting blamed for (taking credit for?) anonymous Americans’ porn searches.
Ridin dirty, in one chart
Gallup just released new economic confidence figures with a surprising finding - confidence in DC is flying 20 points higher than the next-highest state, and a full 60+ points higher than West Virginia, the least-confident state.
Except that this isn’t shocking at all - DC is a city, and Gallup is comparing it with states. A few months ago Matt Yglesias wrote about why comparing DC to the states is an exercise in absurdity:
But as a matter of demographics, comparing D.C. to the 50 states leads to madness. If D.C. were a state, it would have the demographic characteristics of a central city. Because of course that’s what it is. You have to compare D.C. to other cities to say something interesting. Or else you can compare the D.C. metro area to other metro areas.
Is it any surprise that economic confidence in an affluent metro area is much, much higher than in a largely rural state like West Virginia? Of course it isn’t. But you can’t really blame the media for running with the comparison when Gallup draws it so explicitly.
Graphic by Christopher Ingraham
Ted Cruz has already referred to Barack Obama’s use of executive orders as “lawless.” Considering it’s a safe bet we’ll be hearing a lot more of this rhetoric in the months to come, I decided to dig into the historical data on presidential executive orders. The American Presidency Project at UCSB helpfully maintains a table of executive orders issued per president, from George Washington onwards. This is super-useful, but it doesn’t account for disparities in the amount of time each president served. So I normalized the data against the number of days in office per president, taken from Wikipedia.
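A minimal sketch of that normalization in Python. The function name and the sample figures are hypothetical placeholders, not values from the UCSB table or Wikipedia; all it assumes is a total order count plus start and end dates for a president’s time in office.

```python
from datetime import date


def orders_per_day(total_orders: int, start: date, end: date) -> float:
    """Executive orders issued per day in office."""
    days_in_office = (end - start).days
    if days_in_office <= 0:
        raise ValueError("end date must fall after start date")
    return total_orders / days_in_office


# Hypothetical president: 365 orders over exactly one non-leap year.
rate = orders_per_day(365, date(2001, 1, 1), date(2002, 1, 1))
print(f"{rate:.2f} orders per day")
```

Dividing by days rather than by terms keeps presidents who served partial terms on the same scale as everyone else.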
It turns out that the first half of the 20th century was the high water mark of presidential executive orders, with FDR issuing them at a rate of nearly one per day! Obama, by contrast, is currently issuing orders more slowly than any president since Grover Cleveland. It’ll be interesting to check back on this number again a year from now.
One more tidbit: Republican presidents issue more orders than their Democratic counterparts, at a rate of 0.23 vs. 0.18 orders per day in office.
The Brookings Institution’s Vital Statistics on Congress provides average party ideology scores going back to 1947, based on voteview’s DW-NOMINATE numbers. Combining these figures with bill passage numbers from the Resume of Congressional Activity, we get a pretty clear picture of the link between increasing polarization (measured here as the gap between the average ideology score of each party) and decreasing legislative activity.
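For readers who want a single number to go with the picture, here is a rough sketch in Python. The gap calculation follows the definition above (distance between the two parties’ average ideology scores). The correlation coefficient is a swapped-in summary, not something computed for the charts here, and the sample terms are made-up placeholders rather than Brookings or voteview figures.

```python
from math import sqrt


def ideology_gap(dem_mean: float, rep_mean: float) -> float:
    """Polarization measured as the distance between average party scores."""
    return abs(rep_mean - dem_mean)


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (Democratic mean, Republican mean, bills passed) per term.
terms = [(-0.30, 0.30, 300), (-0.35, 0.40, 200), (-0.40, 0.50, 100)]
gaps = [ideology_gap(dem, rep) for dem, rep, _ in terms]
bills = [float(passed) for _, _, passed in terms]
r = pearson_r(gaps, bills)
print(f"correlation between gap and bills passed: {r:.2f}")
```

A value near -1 would mean bills passed fall almost in lockstep as the gap widens; real data will be noisier than this toy example.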
Some standouts in the data: in the Senate, there’s a period of relative stability in the 70s and 80s, visible as a tangled cluster of values. Looking along the x-axis, it’s plain that in the 2013-2014 term the ideological gap widened considerably. In the House, a similar jump occurred between ‘94 and ‘96 - the time of the Gingrich Revolution.
As this chart from Brookings makes clear, the widening gulf between the parties is primarily due to a strong rightward shift in the GOP. While the gap between parties can theoretically widen indefinitely, we are nearly at the rock-bottom of legislative activity - the Senate, for instance, only passed 14 bills last year. Whether we’re at the bottom of a trough or the start of a new normal remains to be seen.
Notes on data and charts: The “projected” values for the term ending this year use the actual DW-NOMINATE scores listed on voteview.com, combined with a projected number of bills passed based on A) the number of bills passed in the 2013 session and B) the average increase from first to second session throughout the 2000s. The inspiration for the connected scatterplots is Hannah Fairfield’s work at the New York Times, as seen in a blog post by Alberto Cairo.
We’ve all read the headlines: the just-wrapped session of Congress is literally the least productive on record, passing fewer bills (55) than in any previous session, including the “Do Nothing Congress” of 1948.
As we roll into a midterm election year, conventional wisdom holds that Congress will do even less in 2014, as political posturing and gridlock will rule in the run-up to November. But records going back to 1947 show that, without a single exception, Congresses actually pass more legislation in election years than in off-years. Our own do-nothing Congress will actually do a little more this year.
Looking at yearly bills passed in the Senate and the House, a clear pattern of alternating peaks and valleys emerges. For any given Congress, the first session (which falls on an odd-numbered off-election year) sees fewer bills passed than the second session (which falls on an even-numbered election year). On average, the number of bills passed from the first to second session jumps by an astonishing 73.4% in the Senate and 64.4% in the House.
This pattern holds true even in Presidential election years - while these are slightly less productive than midterm election years, they still see an average of 61.6% (Senate) and 55.5% (House) more bills passed than in off years.
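The average first-to-second-session jump can be sketched in a few lines of Python. This version assumes the average is taken over each Congress’s own percent change (rather than comparing the two overall means), and the bill counts are made-up placeholders, not figures from the Resume of Congressional Activity.

```python
def average_session_jump(sessions: list[tuple[int, int]]) -> float:
    """Mean percent change from first to second session.

    sessions: (first_session_bills, second_session_bills), one pair per Congress.
    """
    if not sessions:
        raise ValueError("need at least one Congress")
    changes = [(second - first) / first * 100 for first, second in sessions]
    return sum(changes) / len(changes)


# Hypothetical Congresses: one rises 50%, the other 100%.
sample = [(100, 150), (200, 400)]
print(f"{average_session_jump(sample):.1f}% average jump")
```

Filtering the input to Congresses whose second session falls in a presidential election year would give the narrower comparison.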
Now, this is admittedly a crude yardstick for measuring Congressional productivity - for starters, it doesn’t distinguish between significant legislation (the Civil Rights Act) and minor legislation (naming a post office). But it should at least thoroughly debunk the notion that Congress gets nothing done during election years.
So take heart, Congress-watchers: if history is any guide, we’ll see a whopping 24 bills coming out of the Senate and 67 from the House this year. To find out if 2013 was truly the rock-bottom of Congressional inactivity, we’ll have to wait until 2015.
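A projection like this can be sketched by scaling the first-session count by the historical average jump. The snippet below assumes the Senate passed 14 bills in 2013 (a count reported elsewhere on this blog) and applies the 73.4% average Senate increase cited above. Whether the figures in this post were produced exactly this way is an assumption, and the function name is hypothetical.

```python
def project_second_session(first_session_bills: int, avg_jump_pct: float) -> int:
    """Scale a first-session bill count by an average percent increase."""
    return round(first_session_bills * (1 + avg_jump_pct / 100))


# Assumed inputs: 14 Senate bills in the first session, 73.4% average jump.
senate_projection = project_second_session(14, 73.4)
print(senate_projection)  # 24
```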
Note on the data: these figures come from the Resume of Congressional Activity. The House clerk maintains these as separate PDF files going back to the 1940s. The charts above count both public and private bills in the tally. I’ve got a csv of this kicking around now - if there’s interest I’ll put it up on GitHub for others to use.
One of the strongest factors predicting divorce rates (per 1000 married couples) is the concentration of conservative or evangelical Protestants in that county… It turns out that people who simply live in counties with high proportions of religious conservatives are also more likely to divorce than their counterparts elsewhere.
Also, snazzy multi-variate color grid on that map. How do we feel about this? Initial impression: when I look at that map I register the light-to-dark variation, but the color variation gets lost.
A commenter “Florin” at the Scotch and Ice Cream blog cleaned up the data and re-ran the analysis, and generated four slightly different clusters: peaty whiskies, ex-sherry whiskies, ex-bourbon / no peat whiskies, and whiskies with some ex-sherry blended in or with some peat. Extending the analysis to five clusters apparently succeeded in “separating the hard-core peated whiskies from the less-peated ones”.
David Wishart, the Nate Silver of whiskey tasting.
The data file on the University of Strathclyde page was completely unsourced, leaving a lot of open questions:
I went to /r/scotch with my questions, and within an hour they set me on the right path. Redditor “howheels” did some domain research and found that whiskyclassified.com changed hands and entered its current spammy incarnation in April 2013. Prior to that it was a promotional site for a book, Whisky Classified: Choosing Single Malts by Flavour. Written by David Wishart of the University of Saint Andrews, the book had its most recent printing in February 2012.
You can see the original site for Wishart’s book using the Internet Archive. The most current version, however, seems to have migrated to Saint Andrews, where among other things you can find a fairly detailed methodology for how the flavor scores were arrived at. Bingo!
I’ll quote at length, because it’s interesting stuff. Wishart started with tasting notes from ten different previously-published books. The man was an aggregator before aggregating was cool - a Nate Silver of whiskey tasting.
Most distilleries produce several brands that are differentiated by length of time in cask, special conditioning or finishing, e.g. to impart flavours such as oak, sherry, port or Madeira to the whisky. As our objective was to develop a classification of malts that are readily available to consumers, we felt we should select a benchmark malt whisky from each distillery. We firstly excluded rare malts and any premium brands that are specially aged, cask conditioned or finished. We also decided not to cover distilleries that had been demolished or are not currently in production.
Not all of our 10 authors reviewed the same distillation from each distillery, as some limit their tasting notes to house style only (e.g. Milroy (1995)). Where more than one distillation is produced we selected the most widely available brand, usually of 10-15 years maturation in cask. New distilleries that currently offer young malts (Arran and Drumguish) were included for future reference, as they evolve. Vatted malts (blends of pure malts), and malt whiskies produced in Ireland, Japan, New Zealand and Wales were excluded. We thus arrived at 86 single malt whiskies of around 10-15 years maturation, most of which are widely available in the U.K.
This is key - the scores are based on one representative, commonly-available whiskey from each distillery, not an average across its range. This obviously elides any differences between a distillery’s products, but on the other hand you don’t have to worry about the score for any given distillery reflecting some arcane bottling that you can never hope to find.
A vocabulary of 500 aromatic and taste descriptors was thus compiled from the tasting notes in the 8 books. These were grouped into 12 broad aromatic features: Body (Light-Heavy), Sweetness (Dry-Sweet), Smoky (Peaty), Medicinal (Salty), Feinty (Sulphury), Honey (Vanilla), Spicy (Woody), Winey (Sherry), Nutty (Oaky-Creamy), Malty (Cerealy), Fruity (Estery) and Floral (Herbal).
The 12 flavor categories are condensed from 500 different descriptors used by the original authors (not sure why he says 8 books here?). This might have been more of an art than a science - one man’s ‘smoky’ is another’s ‘peaty’ - but a necessary one.
Similar to the Revolution Analytics blog post, Wishart does some cluster analysis to arrive at 10 different flavor groupings. There’s a lot more detail on his methodology page that I won’t get into - be sure to give it a read.
So it looks like we’ve answered our initial questions about the data. Finally, if you poke around the site you’ll find a link to a downloadable Windows program Wishart wrote called Whiskey Analyst. Meant primarily as a way to record tasting notes on individual whiskies, it also contains a text file of Wishart’s notes and scores. The data are in an idiosyncratic format (one column, 11,000 rows, with groups of rows for each whiskey) so I can’t be certain, but it looks like there might be a lot more detail here than in the original dataset - multiple products for each distillery, rather than one representative sample. This would definitely be a fun Processing/R project for someone to go through and convert it to a standard .csv - have at it!
Kevin Schaul recently posted an example of a reusable d3.js radar chart, using data on 86 scotch distilleries. I forked it and made a few cosmetic changes, and got a request to make an infographic version. An image file of the results is here; pdf version is here.
The dataset is something of a mystery. Among other things it contains flavor scores (0-4) across 12 categories for each of the whiskies. But there’s no definition of what, exactly, the categories mean, or how the scores were tabulated. A link in the footer of the data page goes to a rather generic site offering whiskies for sale. I shot an email to the math department at the University of Strathclyde, which hosts the page containing the data - I’ll update if I hear anything.
Those caveats aside, assuming the data is legit it paints a fascinating picture of scotch flavor varieties. I can testify that the flavor scores for Laphroaig appear to be spot-on, and based on how the Laphroaig chart looks I should probably try Lagavulin and Ardbeg.
For a more stats-based dig into the whiskey data, check out this recent post from Revolution Analytics.
A paper recently published in Frontiers in Zoology finds that dogs exhibit a significant preference for aligning themselves along the magnetic North-South axis when pooping. There are charts, graphs, and yes, a photo.
Just as interesting as the clustering along the N-S axis is the marked avoidance of the E-W axis, with the notable exception of two deviants (free-thinking beagles? loopy retrievers?) who prefer East-to-West pooping.
There’s lots of interesting stuff in the report and it’s fairly accessible to the non-specialist, although I’d love it if someone could explain like I’m five the difference between ‘calm’ and ‘unstable’ magnetic fields.
2014 is already shaping up to be a great year.
My brother and I grew up in Oneonta, NY, a smallish town roughly smack-dab in the middle of the state. Recently we both took that NYTimes dialect quiz (my map above).
I got two Western New York cities (Rochester and Buffalo) along with Arlington for my similars. My bro, on the other hand, got Newark NJ, Yonkers NY (just north of NYC), and Providence RI.
If you’ve ever been to Buffalo and Yonkers, or Rochester and Newark, you know that the cities have very different dialects. How did my brother and I end up with such different responses despite growing up under the same roof with the same parents in the same town?
My father grew up in Western New York - Niagara County, a stone’s throw from both Rochester and Buffalo. Mom, on the other hand, is from downstate - Rockland County, across the river from Westchester and not far from Newark.
Evidently, I had subconsciously modeled my speech patterns after Dad’s, while my brother grew up talking like Mom. The fact that each of us had absorbed different dialect quirks from a different parent is something that never occurred to me. In fact, it’s the kind of fundamental insight into a family’s dynamics that usually only comes with, say, years of therapy.
That this type of insight can now be gleaned from a 15-minute web quiz tells me that we are living in a ridiculously exciting time - what else are we going to learn about ourselves through smart and creative applications of data?
Incidentally, have you compared your dialect results to those of your siblings? If so, drop me a line.
In case you were wondering.