Link roundup for 1/27/2017

I first tried the experiment of a roughly weekly link roundup (including both my own stuff and my favorite reads) in November, and I haven’t tried it since. So, weekly it is not (or has not been, at least). But that’s because of the baseball off-season, during which my work schedule slows down dramatically. With work ramping back up now (pitchers and catchers report in just over two weeks!), I intend to make this a more regular feature at my blog.

With that preamble out of the way, here’s what I’ve been working on recently:

I was happy to write up this piece that summarized a recently accepted paper I worked on with Greg Matthews and an undergraduate student of his. (I was responsible for a very, very small portion of that paper, so I am kind of mooching off their work. Forgive me.) In it, we analyzed the ways that the BBWAA voters seem to cluster their votes, and found (predictably enough) that the major split is between PED supporters and non-supporters. You can find the paper linked in the article, but I thought it was a cool example of how you can use the summary statistics and public portion of partially anonymous datasets to infer characteristics about the anonymous portion. That fact has applications as far afield as genetics, where some patients may choose to participate anonymously while others reveal their data. If a small enough portion is anonymous, and you have the overall statistics, you can effectively “de-anonymize” the remaining portion with a method like this.
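To make the basic idea concrete, here is a minimal sketch of the subtraction step only. It is not the method from the paper (which dealt with how votes cluster); the function name, the ballot format, and the numbers are all made up for illustration. Given the published totals and the ballots that voters chose to reveal, whatever is left over must have come from the anonymous voters.

```python
from collections import Counter


def anonymous_support(public_ballots, total_votes, total_ballots):
    """Infer vote counts and support rates among anonymous voters.

    public_ballots: list of sets of candidate names (the revealed ballots)
    total_votes:    dict mapping candidate -> votes across ALL ballots
    total_ballots:  total number of ballots cast, public plus anonymous
    """
    # Tally how often each candidate appears on the revealed ballots.
    public_counts = Counter(name for ballot in public_ballots for name in ballot)

    n_anonymous = total_ballots - len(public_ballots)
    if n_anonymous <= 0:
        raise ValueError("no anonymous ballots to infer")

    result = {}
    for candidate, total in total_votes.items():
        # Overall total minus public votes leaves the anonymous votes.
        anon_votes = total - public_counts.get(candidate, 0)
        result[candidate] = (anon_votes, anon_votes / n_anonymous)
    return result


# Hypothetical data: five ballots in total, three of them public.
public_ballots = [{"Player A", "Player B"}, {"Player A"}, {"Player B", "Player C"}]
total_votes = {"Player A": 4, "Player B": 3, "Player C": 1}
support = anonymous_support(public_ballots, total_votes, total_ballots=5)

for candidate, (votes, rate) in support.items():
    print(f"{candidate}: {votes} anonymous votes ({rate:.0%} of anonymous ballots)")
```

Note that if only one ballot were anonymous, every rate would come out as exactly 0 or 1, which means that ballot would be recovered in full. That is the sense in which a small anonymous portion stops being anonymous.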

I wrote this short piece in reaction to Tim Raines finally being elected to the Hall. I never quite embraced the Raines campaign, and that’s mostly because I simply can’t muster up much outrage about the Hall of Fame. It’s never been consistent or objective, and it never will be. We know a lot more about baseball now than we did in 1970, and our metrics have changed, and it doesn’t bother me that we elect a different kind of player. Similarly, I will never understand the moral relativism that goes on in these Hall of Fame debates. That the fans of the 1910s tolerated Ty Cobb launching himself into their midst and throwing punches does not mean that we need to–or should–put up with much less dickish behavior today. As far as I’m concerned, it’s OK (and indeed unavoidable) that our standards for inclusion have changed over time. I guess this opinion is not a very hot take.

For the Athletic, I did a short piece about whether the Cubs hitters and pitchers would suffer at all from the long postseason. As with many effects in sabermetrics these days, there was a small hint that playing deep into October could make a difference, but it did not pass any stringent statistical threshold. As all the low-hanging fruit gets picked, we should expect this to be a more and more common pattern.

Have to say: never did I think I would write an entire article in response to a tweet from the President, but here we are. The point is simple: Chicago’s murder rate is high, but no higher than it was in the 1990s. In fact, it’s not higher than New York City’s rate was in the 1990s. It’s still too many deaths (as any number of murders would be too many), but the notion that this murder rate represents some radical departure from the past is wrong.


And here’s what I’ve read.

This article is incredibly demoralizing for me. In it, they show Drumpf supporters photos of the Drumpf and Obama crowds, asking which one is larger–a simple, obvious question with only one reasonable answer. And yet, a surprising number of Drumpf supporters pick his crowd as the larger one, defying all rational belief.

So why was this demoralizing? Photographic evidence is a kind of gold standard in my mind. If we can’t convince people with side-by-side photos, then what hope does a more sophisticated and nuanced argument have? I think about this especially with regard to journalism, where we are often trying to make points using words or (in my case) numbers, both of which are abstract representations of data from the real world. If people can ignore what is immediately in front of their eyes, why would they ever choose to think through a reasoned but complex argument that they disagree with? So this piece left me feeling hopeless.

This article was just fantastic, one of my favorites of the last few months. The history of statistics stuff is interesting, if necessarily incomplete (they barely touch on the important role science played in developing statistical knowledge). But where it really came through for me was the ending, and how they described a future–perhaps even a present–in which people don’t buy into government-supplied statistics.

There are a lot of reasons for the current state of the electorate, and their overall disbelief in objective knowledge. Some of it has to do with identity politics: There is a fraction of the electorate who don’t believe even photographic evidence when it is provided, as detailed above.

But some of the problem is undoubtedly due to the ways statistics has been used and described. In reading the article, one problem that occurred to me is how statisticians consistently describe the average of a group as being representative of the experience of the group as a whole. In other words, if I am describing the population of some town in Nebraska, I may summarize its wealth by the median (or mean) income. But that will grate on individuals in the town who live on the poverty line, and incorrectly describe those who are upper class there.

This is a long-term issue with how statistics are discussed and written about. The noise or variation around the mean is just as important as the mean itself. I think too often statisticians (and writers like myself who translate statistics to a larger audience) stop at the mean, assuming incorrectly that it is a sufficient description of the population.

While the mean should be representative, humans are inclined to disregard information if it doesn’t accord with their prior beliefs (or lived experience). So if you describe a population by the mean, and the reader is at one end of a distribution, they assume (falsely) that not only the mean but the whole dataset is flawed. As a profession (or group of professions), I think we statisticians have to develop a language and a framework to describe the average in the context of the variation around it, in such a way that readers intuitively understand the idea of a range of outcomes–a distribution. At the end of the day, it’s the distribution that matters the most, and the idea of describing that curve has often been abandoned in favor of the simplification of a single number. That’s a mistake.
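As a rough illustration of what describing the average in the context of its variation could look like in practice, here is a small sketch using Python's standard `statistics` module. The incomes are invented, and the choice of the 10th and 90th percentiles as the reported range is just one reasonable assumption, not a standard.

```python
import statistics


def describe_distribution(values):
    """Summarize a sample by its center AND its spread."""
    # quantiles(n=10) returns the nine cut points between deciles.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "p10": deciles[0],   # 10th percentile
        "p90": deciles[-1],  # 90th percentile
    }


# Hypothetical household incomes for a small town, with one very high earner.
incomes = [12_000, 18_000, 25_000, 31_000, 38_000,
           45_000, 52_000, 64_000, 90_000, 250_000]

summary = describe_distribution(incomes)
print(
    f"The median household earns about ${summary['median']:,.0f}, "
    f"but the middle 80% range from ${summary['p10']:,.0f} "
    f"to ${summary['p90']:,.0f}."
)
```

In this made-up sample the single high earner pulls the mean well above the median, so a reader near the bottom of the range would be right to feel that the mean alone does not describe them. Reporting the range alongside the center gives that reader a place in the picture.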



Link roundup for 9/23/2016

No matter how complete an article feels at the time of publication, there are always a handful of interesting details that slip through the cracks or don’t fit under the word limit. On top of that, I tend to receive a ton of feedback post-publication, some of which is even worth addressing.

Twitter isn’t the ideal medium to respond or provide those additional details. So I wanted to experiment with a kind of weekly, link roundup-style blog post, summarizing the articles I’ve done in the past week and highlighting a few pieces from other authors that are worth reading as well. As mentioned, this will be a trial run for now; your comments and criticisms are welcome.

My Articles

Baseball’s Savviest (And Crappiest!) Bullpen Managers

At FiveThirtyEight, I wrote a followup to last week’s piece on optimal bullpen management with Rian Watt. We extended our metric, which measured the extent to which managers used their best relievers in the highest-leverage spots, to grade individual skippers. Better still, we established a run value for the skill, allowing us to say how many additional wins optimal bullpen management was worth.

The metric itself, which we called weighted Reliever Management+ (wRM+), is best thought of as a retrospective yardstick of a manager’s decisions. It is limited in that it does not factor in fatigue (both day-to-day and cumulative effects), matchups, or how bullpens can change over the year. Much of the criticism toward the piece focused on the fact that we didn’t account for these issues.

All of that criticism is, of course, fair. But insofar as it’s incredibly difficult to judge bullpen management in any sort of rigorous, quantitative way, I think this piece was a significant step forward.

The optimal metric would probably appraise bullpen decisions in a dynamic way, that is to say, on an inning-by-inning basis according to what the manager knows at the time of the decision. (The distinction between retrospective and dynamic measurements was suggested to me by BP writer and all-around good guy Rob Mains.)

So, for example, rather than aggregating season-level statistics as we did, you could build a system to grade every individual call to the bullpen according to which relievers were available in that game, their statistics to date in the season, their projections, the matchup (who they’d be facing), and so on. In this way, you could say whether a manager made the optimal decision based on the information he had at the time, and price in the effects of fatigue and availability.

Such a system would be exceedingly difficult to create, however. You’d need game-by-game information, and you’d have to make a lot of assumptions about when relievers were tired and how much to consider matchups. With that said, I have full faith that eventually, someone is going to make this kind of dynamic scoring system. It’s going to be awesome, and probably more accurate and insightful than wRM+ (although by how much, I do not know). In the meantime, I think of the metric Rian and I built as a step in the right direction, an approximation that works better over longer managerial careers, where factors like bullpen quality tend to even out.
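To give a feel for the dynamic approach described above, here is a toy sketch. It is not wRM+ and not a real grading system: the `BullpenCall` structure, the "regret" score, and the projections are all hypothetical, and it assumes someone has already decided which relievers were rested and available for each call.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BullpenCall:
    leverage: float              # leverage index when the call was made
    chosen: str                  # reliever the manager brought in
    available: Dict[str, float]  # available reliever -> projected runs
                                 # allowed per nine innings (lower is better)


def call_regret(call: BullpenCall) -> float:
    """Leverage-weighted gap between the chosen and best available reliever."""
    best = min(call.available.values())
    return call.leverage * (call.available[call.chosen] - best)


def grade_manager(calls: List[BullpenCall]) -> float:
    """Average regret per unit of leverage; 0.0 means every call was optimal."""
    total_leverage = sum(c.leverage for c in calls)
    return sum(call_regret(c) for c in calls) / total_leverage


# Hypothetical two-man bullpen and three calls to it.
pen = {"Closer": 2.5, "Mopup": 4.5}
calls = [
    BullpenCall(leverage=2.0, chosen="Closer", available=pen),  # optimal
    BullpenCall(leverage=0.5, chosen="Mopup", available=pen),   # cheap mistake
    BullpenCall(leverage=1.5, chosen="Mopup", available=pen),   # costly mistake
]
grade = grade_manager(calls)
print(f"Average leverage-weighted regret: {grade:.2f} runs per nine")
```

Even this toy version shows where the hard part lies. Saving the best reliever for a later, higher-leverage spot is counted as a small mistake here, so a real system would need the fatigue, availability, and matchup assumptions that the `available` dictionary quietly takes for granted.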

For The Athletic, I wrote about some of the ways October baseball is different from the regular season, and how those factors may affect the overwhelming postseason-favorite Cubs.

It’s striking how distinct playoff baseball is from the rest of the year. On top of the weather and the higher caliber of opponent, you have very different patterns of pitching usage. As managers get more sabermetrically savvy, I think that October is going to get even weirder and more tactically separated from the rest of the year. Ned Yost pioneered a new style of employing his relievers to fuller effect in the postseason, and that increased usage will only grow more pronounced. The increase in pitching quality–both in terms of higher-caliber starting pitchers and more bullpen action–is probably the single biggest factor that separates October from the rest of the year.

Long-term, I think that means there will be a premium on hitters who can maintain their performance against the highest-quality opposition. That is, if those hitters really exist; so far, sabermetrics hasn’t found much evidence for there being a kind of hitter who is less susceptible to the quality of the opposing pitcher. (Of course, that doesn’t mean that front offices can’t find those hitters better than public analysts.)

Other links
Looking at some of the team-level records being broken this year. More on the Cubs BABIP here:
It’s probably the biggest deviation from the league average of all time (at least for BABIP). So what is it? Defense? A new kind of positioning or shifting? Pitchers who can suppress batted ball velocity?
You’ll never guess the luckiest team in baseball this year.
3% of American adults own half of the guns in the United States. Think about that for a minute. The article is worth a full read.
From R.J. Anderson, on how the Oakland front office has failed to navigate the modern age of sabermetric equality.
A distillation of the righteous anger many feel when thinking about a Drumpf voter. I think I’m more insulated from Drumpf voters than most people; only one person on my Facebook feed ever posts pro-Drumpf propaganda. As a result, I’m more bewildered and confused than angry.

How I got my job

I get asked one question in particular more than any other: how did you get your job at FiveThirtyEight? It’s a reasonable thing to ask. Data journalism is still so new and rapidly developing that I don’t think there’s any standard path into a position like mine. To whatever extent there is such a path, it probably follows the same routes as traditional media jobs, either via J-school or from outlet to outlet.

That’s not the path I took. I started by getting my PhD in evolutionary genetics. I had a long-term ambition (since I was a kid) to get my PhD in something, and I felt passionate about understanding evolution in particular. I had the idea (along with many other people) to combine genomics/systems biology methods with evolutionary questions, and so went about finding an opportunity to do that.

I loved the first half of grad school. The first two years of most PhD programs are focused on learning the skills and theory within your discipline, before applying them later on to a research question. My program was an incredible intellectual environment, and I was able to test out ideas among a varied, brilliant group of students and professors.

At the same time, science can be stifling. Setting aside matters of intellectual curiosity, graduate school is also about getting you a job and launching you into (most frequently) an academic career. To that end, a great deal of it is devoted to the messy everyday business of doing science: publishing papers, applying for grants, going to conferences, making the appropriate contacts, and so on. Much of that everyday work isn’t about science at all. I know that academia isn’t unique in this. As in many careers, you have to grit your teeth and accomplish certain goals in a prescribed manner, even sometimes (for me, often) to the detriment of your broader intellectual mission.

About halfway through graduate school, I became increasingly frustrated with that side of my job. I started looking around for a more creative outlet, one where I could ask interesting, data-centric questions without needing the payoff of a full-fledged academic paper to justify my efforts. I started a blog, this blog, and forced myself to do about one piece every two weeks on any topic that interested me. (That pace was calculated to be difficult and uncomfortable, but achievable.)

Perhaps the hardest thing about any regular, frequent writing assignment is finding enough material to sustain it. In search of topics for my blog, I turned to baseball, which has an abundance of available and well-curated data. I mixed a few baseball topics into my rotation, typically doing fairly simple modeling work.

I had neither expectation nor plan that this writing would lead to anything, but about a year into writing my blog, I received an email from Ben Lindbergh, who was then the Editor in Chief of Baseball Prospectus. He asked if I’d like to write for them. I said yes, reasoning that I’d be doing largely the same thing but getting paid a small amount for it.

I wanted to keep the same bi-weekly to weekly schedule, but again, I had no particular ambition to make a career out of my baseball writing. Frankly, I figured I’d be a spectacular failure, which is how I enter a lot of situations. Imagining that I’d be belly-flopping anyway, I decided to be bold in doing so, and try to take on topics too big or complex for others to attempt. I asked Ben if I could name my column Moonshot, partially as a sarcastic joke on myself, and partially in reference to the other, baseball meaning of the word.

I found myself loving the work at Baseball Prospectus. Instead of largely speaking into the echo chamber of my blog (or the broader, but still depressingly empty world of science), I was getting feedback—not only from Ben and the other wonderful writer/researchers at BP, but from the internet at large. (A paradox of internet writing tends to be that the fewer the people reading your work, the greater the percentage of the feedback that is positive.)

After a handful of articles, I was contacted by an MLB team about doing consulting, an opportunity at which I jumped. That seemed to legitimize my efforts, and so for the first time I started considering baseball-writing related careers, instead of just the default academic science path. With that said, I explored team-related opportunities and found them wanting. And with no immediate prospects at other media outlets, I tabled the idea.

One year into my work at Baseball Prospectus, Nate Silver contacted me about the open baseball writer position at FiveThirtyEight. Apparently Ben, who had recently joined Grantland, had recommended me. I didn’t apply; I hadn’t considered myself good enough to have a shot. But following a few phone conversations, I had a contract offer from ESPN for part-time work that would be about the same time commitment as what I managed at BP.

At this point, I started to consider the idea of a career as a writer more seriously. The contract offer came as I was finishing my PhD, in the last year of it. Those of you who have survived graduate school will recognize this chapter as the toughest time. (The difficulty was further compounded by working two jobs, as well as some changes in my personal life.) For me and for most people, this period is mostly about the part of graduate school I liked the least: finishing papers, pleasing faculty members, and lining up some kind of post-graduate opportunity to show that you are ready to receive your PhD.

Partially as a result of the misery of finishing my PhD and partially because it was the natural continuation of a longer arc in my life, I started scheming toward a career in journalism instead of science. I succeeded in getting a temporary postdoctoral fellowship while I made up my mind and surveyed my options. By the time I had to make a decision about going on to another fellowship, I was completely certain and ready for a change. I renewed my contract with FiveThirtyEight, quit my postdoc, and set about freelancing to build a fuller journalistic resume.

I don’t know that there are any major lessons to take from my career path except: be extremely lucky. I certainly was: I owe my whole career to Ben Lindbergh finding my site on a list of Google results. What are the odds?

To the extent that there is a lesson, I’d say it’s that you should start a blog. This is the advice, puny as it is, that I give to nearly everyone. When I was still blogging anonymously, I thought of each piece as a lottery ticket. As with the lottery, the expected payoff on any given ticket is negative, but you can maximize your chances by getting more tickets. Most of my blog posts went basically unvisited, or were read by perhaps a handful of Facebook friends. Once I got to the front page of Reddit, but it never went any further than that.

The winning ticket, so to speak, was a random piece on baseball player injuries that happened to be posted around the same time that Ben was researching one of his articles on the same subject. I doubt it was the best piece on the blog at the time; I’m ashamed of the quality of it now. But it was Good Enough to lead to the next opportunity, which took me to the next opportunity, and so on. I can’t claim much credit for the progression except insofar as I was doggedly persistent in continuing to write on a regular schedule. The rest was good fortune.