A New View on Pitch Sequencing

The core of baseball is the duel between pitcher and batter.  The pitcher’s job is to throw a baseball past the batter, or otherwise induce the batter to make an out (via a weakly hit ball, for instance).  The batter’s job is the opposite: to make solid contact with the ball, transferring enough force to either cause the batted ball to be difficult to field (e.g. a line drive) or impossible to field (a home run).

An interesting aspect of this battle between pitcher and batter is the fact that the batter can, in theory, hit any pitch within the strike zone (at the major league level).  It is straightforward physics to reason that, given enough information, a batter can make good contact with any strike, and indeed, MLB hitters can get hits from even 100+ mph fastballs (albeit rarely).

The fastest pitches still take about four-tenths of a second to go from hand to plate, which is more than enough time to swing.  I would argue that the crucial weapon the pitcher has in his arsenal isn’t speed, it’s uncertainty.  If a pitcher can cause a batter to adjust their swing mid-cut, or not swing at all, that is a vastly more powerful advantage than the milliseconds that can be chopped off by simply throwing harder.
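
As a rough check on that four-tenths figure, here is a back-of-the-envelope sketch in Python. It assumes a release point about 55 feet from the plate (the rubber is 60.5 feet away, and the ball leaves the hand a few feet in front of it) and ignores drag, so the ball is treated as holding its release speed. Both are simplifying assumptions of mine, not measurements.

```python
# Back-of-the-envelope flight time for a pitch, ignoring air resistance.
# RELEASE_DISTANCE_FT is an assumed value, not a measured one.

FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600
RELEASE_DISTANCE_FT = 55.0  # assumed distance from release point to home plate


def flight_time_seconds(speed_mph, distance_ft=RELEASE_DISTANCE_FT):
    """Time for a ball travelling at constant speed to cover distance_ft."""
    speed_ft_per_s = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return distance_ft / speed_ft_per_s


if __name__ == "__main__":
    for mph in (90, 95, 100):
        print(f"{mph} mph: {flight_time_seconds(mph):.3f} s")
    saved_ms = (flight_time_seconds(95) - flight_time_seconds(100)) * 1000
    print(f"95 -> 100 mph saves about {saved_ms:.0f} ms")
```

Under these assumptions a 100 mph pitch arrives in a bit under 0.4 seconds, and an extra 5 mph buys the pitcher only about 20 milliseconds.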

For the remainder of this post, I’ll investigate how pitchers utilize uncertainty in their pitch sequences.  MGL has hypothesized (or perhaps has data to show?) that pitchers optimize their sequencing of pitches such that the identity of the first pitch in a sequence gives no information as to the identity of the next pitch.  That is to say, conditional on the first pitch being, say, a fastball, the batter has no more idea of what the next pitch will be than he did before the first pitch.


(Re-)Introducing Entropy

While unpredictability in pitch type is an important weapon in the pitcher’s arsenal, it’s not clear how to quantify that uncertainty.  For the duration of this post, I’ll borrow a metric from information theory, entropy (for more details on its calculation, see below).  Entropy is a measure of uncertainty, and it can be thought of as the number of yes/no questions required to determine the identity of an unknown quantity.  In baseball terms, a pitcher whose arsenal is employed with great entropy prevents the batter from guessing the next pitch effectively.  Conversely, a pitcher who throws less entropically allows the batter to predict the identity of the next pitch more easily.

Using PITCHf/x data from Baseball Savant, I investigated patterns of pitch type entropy in two pitchers (an important caveat is that I’m trusting PITCHf/x’s pitch type classifications).  Let’s start with one Clayton Kershaw, he of the recently signed $215 million extension.

Kershaw employs four pitches.

Pitch Type          Fastball   Slider   Curve   Changeup
Number of pitches   2237       920      464     85

Generally, Kershaw uses his fastball most of the time, which is just fine because he averages a solid 92.6 mph on it.  When he’s not using his fastball, however, he has excellent breaking balls, most especially his trademark curve.  Thinking about it from the batter’s perspective, the most important question might be: fastball or breaking ball?  If a fastball, the batter has precious little time to react, but knows that there won’t be too much motion.  If a breaking ball, the batter will have slightly more time, but has to factor in the horizontal break on the ball in his decision when and where to swing.

Doing the math, one finds that the total entropy of Kershaw’s repertoire is almost exactly 1 bit (.997 bits).  Remembering that entropy can be thought of as the number of yes or no questions required to elucidate the type of pitch, we can sort of imagine that yes or no question being “Is it a fastball?”  (I’m sweeping some of the mathematical details under the carpet here, but if you’re interested there’s more explanation below.)

Because the maximal entropy possible with 4 pitches is exactly 2 bits (which would be achieved if each pitch was used exactly 25% of the time), we can see that Kershaw is approximately half as entropic as he could be.  That doesn’t necessarily disprove MGL’s theory, because not all pitches are equally effective.  To figure out how often to use each pitch, Kershaw must also consider the fact that his fastball is lethal, and his curve devastating, while his changeup is merely adequate.  In more mathematical terms, Kershaw is really optimizing along two dimensions: the first is pitch quality, and the second is overall entropy.  The maximal entropy distribution need not be the best to get hitters out.
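
The calculation behind that figure can be sketched in a few lines of Python. The counts are my reading of the pitch table above (2237 fastballs, 920 sliders, 464 curves, 85 changeups), and the function name is my own. One thing the sketch makes visible: with these counts the natural logarithm gives a value near the quoted .997, while the base-2 logarithm (true bits) gives a larger number, so the units are worth checking before comparing against the 2-bit maximum.

```python
import math

# Pitch counts as read from the table above (an assumption about how the
# digits split): fastball, slider, curve, changeup.
KERSHAW_COUNTS = {"fastball": 2237, "slider": 920, "curve": 464, "changeup": 85}


def entropy(counts, base=2.0):
    """Shannon entropy of the empirical distribution given by counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]  # 0 * log(0) is taken as 0
    return -sum(p * math.log(p, base) for p in probs)


if __name__ == "__main__":
    counts = list(KERSHAW_COUNTS.values())
    print("entropy, base 2 (bits):", round(entropy(counts, 2), 3))
    print("entropy, base e (nats):", round(entropy(counts, math.e), 3))
    print("maximum for 4 pitches (bits):", math.log2(len(counts)))
```

The observed frequency of each pitch stands in for its true probability, which is the same estimate described in the appendix.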


The Entropy of Sequences

Let’s now engage with sequences of pitches.  As I noted above, under conditions of optimal pitch sequencing, the first pitch ought to provide no information at all concerning the next pitch.  With conditional entropy, we can rigorously quantify this possibility.  Conditional entropy is as it sounds: the entropy of the next pitch, conditional upon knowing the first pitch in a sequence.  If Kershaw manages his sequencing optimally, the conditional entropy ought to be nearly equal to the unconditional entropy.

I pulled out all 2-pitch sequences in Kershaw’s PITCHf/x data.  All 16 possible combinations occurred (too many to display well); I reproduce here the set beginning with fastballs.

Pitch Type        Fastball   Slider   Curveball   Changeup
Pitch Frequency   504        251      177         15

First, I must note that limiting the data to two-pitch sequences changes the pitch counts slightly (one-pitch at-bats and such are eliminated), and thus the entropy of Kershaw’s pitches; it becomes 1.087193.  But more interestingly, calculating the conditional entropy for each possible beginning pitch results in this:

Pitch Sequence Starts With (Conditioned On)   Conditional Entropy

Remembering that more entropy = less predictability, we see that Kershaw’s conditional entropy is only very, very slightly lower than his unconditional entropy, and certainly not significantly different.

Conditional entropy ~ unconditional entropy.

The takeaway here is that MGL is right; Kershaw sequences his pitches in such a manner that the batter, knowing the first pitch, has no more information about the next pitch.
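
Here is a sketch of the conditional-entropy calculation in Python. Only the fastball row of the two-pitch table is reproduced in this post (read here as 504, 251, 177, 15), so the example computes the entropy of the next pitch given a fastball from those counts. The general function takes a full table of first-pitch rows, and the extra row used to demonstrate it is a made-up placeholder, not Kershaw’s data. Function names are mine.

```python
import math


def entropy(counts, base=math.e):
    """Shannon entropy of the empirical distribution given by counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def conditional_entropy(table, base=math.e):
    """H(next | first): row entropies weighted by how often each first pitch occurs.

    table maps a first-pitch label to the list of next-pitch counts.
    """
    grand_total = sum(sum(row) for row in table.values())
    return sum(
        (sum(row) / grand_total) * entropy(row, base) for row in table.values()
    )


# Next-pitch counts after a fastball, as read from the table above:
# fastball, slider, curveball, changeup.
AFTER_FASTBALL = [504, 251, 177, 15]

if __name__ == "__main__":
    print("H(next | first = fastball):", round(entropy(AFTER_FASTBALL), 3))

    # Placeholder second row (NOT real data) to show the weighted average at work.
    toy_table = {"fastball": AFTER_FASTBALL, "slider": [120, 60, 40, 5]}
    print("H(next | first), toy table:", round(conditional_entropy(toy_table), 3))
```

Natural logarithms are the default here because, on that scale, the fastball row lands just below the 1.087 figure quoted above; pass base=2 to get bits instead.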


Kershaw Isn’t Unique

The initial comment that may come to mind is that Kershaw is special.  After all, he’s the reigning Cy Young winner and arguably one of the best pitchers in baseball, if not the best.  Perhaps lesser pitchers are worse at managing their entropy, resulting in the batter being able to predict the next pitch and capitalize on that knowledge.

To go to the opposite extreme, I selected one Joe Saunders, a decidedly lesser pitcher with a similarly large number of pitches (sorry, Joe…).  I ran him through the same set of analyses.  Mr. Saunders is a rather more entropic pitcher at 1.443 bits, which may have to do with the fact that he has worse “stuff” than Kershaw (an idea to investigate further in future posts).  Nevertheless, Saunders doesn’t come close to maximizing the entropy he could possibly achieve with his 5-pitch arsenal.

And yet, despite his greater entropy and worse results, his conditional entropy nearly matches his unconditional entropy, just as we saw before in Kershaw’s case.

As above.

Thus we see that Kershaw, one of the very best pitchers in MLB, is similar to Saunders with respect to the entropy of his pitch sequencing, even though Saunders is decidedly… ahem… not one of the best pitchers in MLB.


Entropy as Tool

I’ll wind up with a couple of conclusions.  The first is that hurlers don’t come close to maximizing the entropy of their pitch types.  Given how widely a thrower’s pitches may vary in quality, this pattern is perhaps unsurprising.  The second is that sequences of two pitches, at least for this limited sample of pitchers, are effectively as entropic as possible.  This pattern confirms that for a batter, knowing the first pitch helps not one bit in guessing the next pitch, and effectively reaffirms MGL’s excellent prediction.

Which is all fine, well, and good; but for my part, I’m more excited about the prospects of using entropy in future studies.  There’s a whole host of questions still to be answered, for which entropy seems an ideal tool: from the mysteries of longer sequences (3 pitch sequences already begin to show interesting trends) to the differences between pitchers, entropy is a promising step towards understanding how pitchers vary speed, spin, and location to disrupt hitters.


Appendix: A Brief Calculation of Entropy

Entropy is one of the most important and mysterious quantities in science, and plays pivotal roles in both computer science and physics (not to mention statistics).  It can be thought of in the following way.  Imagine a pitcher with two pitches: fastball and changeup.  Imagine that the two pitches are equally effective. The formula for said hurler’s entropy would be:

Entropy of Imaginary Pitcher = -1 * (probability of fastball x logarithm(probability of fastball) + probability of changeup x logarithm(probability of changeup))

It is straightforward to show that the entropy is maximized when the pitcher divides his pitch frequency evenly, so that he pitches exactly 50% fastballs, and 50% changeups.  In the case of two pitches, the maximum entropy is 1 bit.
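
Written out with p as the share of fastballs and a base-2 logarithm (the base that makes the unit a bit), the two-pitch case can be worked through as follows. The symbol p and the choice of base are my notation, not something fixed by the text above.

```latex
H(p) = -\bigl[\, p \log_2 p + (1-p) \log_2 (1-p) \,\bigr],
\qquad
\frac{dH}{dp} = \log_2\!\frac{1-p}{p} = 0 \;\Longrightarrow\; p = \tfrac{1}{2},
\qquad
H\!\left(\tfrac{1}{2}\right) = -\bigl[\tfrac{1}{2}(-1) + \tfrac{1}{2}(-1)\bigr] = 1 \text{ bit}.
```

The derivative is positive for p below one half and negative above it, so the 50/50 split is the maximum rather than a minimum.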

The more general formula is…

$$H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i)$$

(Formula via Wikipedia.)

Let’s break this down into words, working left to right (H(X) is simply a way to denote entropy).  First, the squiggly sign (a capital sigma) stands for the sum, which just means that we repeat the calculation for each of the categories in our set.  For pitch types, we can think of these as being the different varieties of pitches, where n is the total number of different types.  Going right once more, we see p(x_i), which is the probability of the ith pitch.  We then multiply it by the log of that same probability.  For my purposes, not knowing the true probability of e.g. a fastball, I consider the observed frequency of fastballs in the sample divided by the total number of pitches as an estimate of that probability.  We then repeat this calculation for each of the pitch types, sum the results, and multiply by -1; voilà, entropy.  As above, the maximal entropy is achieved when each of n pitches is thrown with probability 1/n.

Entropy’s pretty complex, so for more thorough (and eloquent) treatments, I’d recommend these links.

The Hall is Getting Smaller

This post is in reaction to Dave Cameron’s recent pair of entries on Fangraphs, which show that the percentage of players enshrined in the baseball Hall of Fame (henceforth, HoF) has shrunk quite substantially in recent history.  I liked the argument, but I thought it missed the point to some degree.  The HoF isn’t about inducting the best X% of players per decade, it’s about inducting players who meet some standard of value, and also balancing their baseball achievements against their reputations (rightly or wrongly, this is very much the case).

In the following post, I’ll examine the distribution of WAR by birth year to determine how one aspect of HoF induction has changed over time.  One response to Dave’s post could be summarized as follows: if the percentage of players being enshrined has declined, perhaps that is simply because fewer HoF-type players are being born per year.  This argument is not as silly as it sounds on first glance: there are various reasons, related to baseball’s eligible population changing size over time, modern medicine and training, the use of relievers, and so on, which might cause fluctuations in players’ career WAR over decades.

A disclaimer is in order: for the remainder of the post, I’ll be using WAR as an approximate metric of player value.  All SABRists know that WAR is an imperfect metric, but it is fair in the sense that it is consistently applied to all of the players in the sample.  If you don’t like WAR, though, you’ll think the remainder of my argument is bullshit, and that’s OK with me.  If you have a better suggestion than WAR, let me know.


The Trend of Mean WAR is Constant


The experiment is simple: given all of the position players born in some year (1880-1970), what is the mean WAR they produced?  The data is plotted above.  Clearly, there is no significant trend: mean WAR stays roughly constant over the entire range, and a linear regression of mean WAR on birth year produces a p-value of .17.  The same, incidentally, is true of max WAR per birth year, but I didn’t think it deserved a graph.
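
The computation behind that plot is a group-by followed by a straight-line fit. Here is a minimal Python sketch of it using only the standard library. The (birth year, career WAR) pairs are invented placeholders rather than real player records, and the helper names are mine. It computes the least-squares slope only; the p-value quoted above would need a t-test on that slope, which this sketch does not attempt.

```python
from collections import defaultdict


def mean_war_by_birth_year(players):
    """players: iterable of (birth_year, career_war) pairs -> {year: mean WAR}."""
    by_year = defaultdict(list)
    for year, war in players:
        by_year[year].append(war)
    return {year: sum(wars) / len(wars) for year, wars in sorted(by_year.items())}


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Placeholder records (NOT real data): (birth year, career WAR).
TOY_PLAYERS = [
    (1880, 2.0), (1880, 6.0),
    (1881, 1.0), (1881, 7.0),
    (1882, 3.0), (1882, 5.0),
]

if __name__ == "__main__":
    means = mean_war_by_birth_year(TOY_PLAYERS)
    print(means)
    print("slope (WAR per birth year):", ols_slope(list(means), list(means.values())))
```

A slope near zero is what "no trend" looks like in this setup; whether a nonzero slope is distinguishable from noise is the job of the significance test.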


The Trend of HoF-class Players is Increasing

The above graph is not a precise answer to the question, however.  It may still be possible that despite average WAR not changing, the distribution of that WAR has been altered, so that perhaps there are more mediocre-good players and fewer outstanding players.  This would lead to a decrease in HoF inductions over time.  To examine this, I defined a cutoff of 60 WAR as the minimum for a player to be “in the conversation” for the HoF.  I then examined how many such players were born each year from 1880-1970 (same timeframe as above).


The number of Hall of Fame class players per birth year stays in roughly the same range for the whole duration I’ve examined.  In point of fact, there is a slight increase towards the end of the timeframe, and, surprisingly, a regression of the number of HoF class players on birth year is significant.  Rephrasing: if any trend exists in the data, it is that more HoF class players were born recently.
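
Counting "in the conversation" players per birth year is a one-pass tally. The Python sketch below uses the 60 WAR cutoff and the 1880-1970 range from the text; the sample records are invented placeholders, and the function name is mine.

```python
from collections import Counter

HOF_CLASS_CUTOFF = 60.0  # career WAR threshold used in the post


def hof_class_counts(players, first_year=1880, last_year=1970, cutoff=HOF_CLASS_CUTOFF):
    """Count players at or above cutoff for every birth year in the range.

    Years with no qualifying player are kept with a count of 0 so that a
    later trend fit sees them.
    """
    counts = Counter(
        year for year, war in players
        if war >= cutoff and first_year <= year <= last_year
    )
    return {year: counts[year] for year in range(first_year, last_year + 1)}


# Placeholder records (NOT real data): (birth year, career WAR).
TOY_PLAYERS = [(1966, 75.2), (1968, 61.0), (1968, 80.3), (1968, 12.4), (1969, 59.9)]

if __name__ == "__main__":
    print(hof_class_counts(TOY_PLAYERS, first_year=1966, last_year=1970))
```

Keeping the zero-count years matters: dropping them would bias any regression toward years that happened to produce a star.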


The Hall is Getting Smaller

So, there are two conclusions from this.  One, the mean WAR per birth year hasn’t changed over time at all.  Two, the number of roughly Hall-class players either hasn’t changed or has perhaps increased over time.

What light does this shed on the Hall voting process?  Not much, but maybe some.  By this I mean the following: clearly, the threshold for induction has increased in recent years.  Interestingly, however, the number of players in the range necessary to be considered for induction has also increased.  One finds, upon perusal of the late 60s players, a rather suspicious trend (of course).  1968, with the greatest number of Hall-class players, has a few rather familiar names: Jeff Bagwell, Mike Piazza, and Sammy Sosa.  These players have in common the assumed use of steroids, and I think that is the root of the Smaller Hall.

I won’t pass judgment on the BBWAA for this decision; it is what it is.  What’s clear is that the Hall has become harder to achieve.  If you side with the writers, that’s because people started achieving superhuman greatness via artificial means, and so it became more difficult to disentangle the “true” HoF-class players from the “artificial” HoF-class players.  If you are against this line of reasoning, it is perhaps because you don’t see a real distinction to be made there.

The Hall itself is a weird, artificial construct, designed by humans to celebrate greatness in adults who play kids’ games for lots of money.  It’s hard for me to say definitively “the Hall should be this way!” or that way or another way; it’s made up, so it should be the way whoever made it wanted it to be.  To the extent that the Hall influences the people that society reveres, maybe it should focus on inducting players who were good people.  On the other hand, to the extent that the Hall reflects greatness in baseball achievement specifically, maybe Sammy Sosa and whoever else used anabolic steroids should be there.  Perhaps there’s a perfect balance between those two purposes, but that balance is not something I feel qualified to determine.  It’s fortunate, then, that I’m not a member of the BBWAA and therefore not meant to determine it.

The Possibly Injury-Prone Jacoby Ellsbury

Ellsbury as Example

Jacoby Ellsbury just signed with the Yankees.  As part of the analysis of his contract, there was much consternation from the internet concerning two factors: his age and his injury history.  No one doubts that Ellsbury could be a productive player over the length of his contract, but there is a significant concern that due to his position, skills (speed and defense), age, and injury history, he will fail to meet his projections and end up a massive overpay.

Because I had Jeff Zimmerman’s injury data on hand from my recent post on aging and injury probability, I decided to take up the question of how much a player’s prior history predicts his future probability of injury.  In Ellsbury’s case, there is a further complicating factor: his prior injuries occurred in non-standard (“freak”) ways, primarily via collision with other players.  Because these prior injuries didn’t occur in the typical course of playing baseball, the thought is that they will be less predictive for his future injury probability (or so the argument goes).  I can’t speak directly to the question of injury weirdness and how it affects recurrence probability–sadly, there’s no “freak” variable in Jeff Zimmerman’s dataset–but I will look at how different types of injury can be more or less predictive of future injury.


The Model

Like last time, I analyze injury risk by using a set of logistic regression models which consider whether or not a player spent time on the DL in a given year as a response variable, and various factors such as age and injury history as predictors.  For starters, I know from my last post that injury probability is strongly influenced by age to the tune of about a 2% higher injury risk per year.  I incorporate injury history by asking whether a player’s injury status in 2011 is predictive for his injury status in 2012.

Variation in injury risk as explained by different models.  Age and injury history are significant, type of injury is not.

Lo and behold, it is, as symbolized by the fact that the model’s accuracy increases substantially after the inclusion of injury history (for those worried about overfitting, this is true even after model selection using AIC).  In fact, injury history is more predictive than age in the combined model.  It is worth noting that while age and injury history have significant effects, they explain relatively little of the variation in injury occurrence–I suspect because there is a large stochastic (luck-based) component.  However, the point remains: past injury strongly predicts future injury.
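
To make the model-comparison step concrete, here is a deliberately simplified Python sketch. It drops age and keeps a single yes/no predictor (on the DL the previous year), because in that special case the maximum-likelihood logistic fit simply reproduces the observed injury rate in each group, so no optimizer is needed. It then compares that model with an intercept-only model by AIC. The counts are invented placeholders, not Jeff Zimmerman’s data, and this is not the model fitted in the post.

```python
import math


def bernoulli_log_likelihood(events, trials):
    """Log-likelihood of `events` successes in `trials` at the observed rate."""
    p = events / trials
    ll = 0.0
    if events > 0:
        ll += events * math.log(p)
    if trials - events > 0:
        ll += (trials - events) * math.log(1 - p)
    return ll


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Placeholder counts (NOT real data): players grouped by prior-year DL status.
# Each entry is (injured this year, total players in group).
PRIOR_DL = (45, 100)
NO_PRIOR_DL = (60, 300)

if __name__ == "__main__":
    # Intercept-only model: one shared injury rate for everyone.
    ll_null = bernoulli_log_likelihood(
        PRIOR_DL[0] + NO_PRIOR_DL[0], PRIOR_DL[1] + NO_PRIOR_DL[1]
    )
    # Intercept + prior-injury flag: each group gets its own rate.
    ll_history = bernoulli_log_likelihood(*PRIOR_DL) + bernoulli_log_likelihood(*NO_PRIOR_DL)

    print("AIC, intercept only:     ", round(aic(ll_null, 1), 1))
    print("AIC, with injury history:", round(aic(ll_history, 2), 1))
```

AIC charges two points per extra parameter, so the history model only wins if its log-likelihood improves by more than one. That is the sense in which a gain can be "more than you would expect given the additional variable."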


Injury Type

As I mentioned above, the significant mitigating factor in Ellsbury’s particular history is that his mishaps looked like freak accidents, not routine plays.  I can’t directly parse how the “freakness” of an injury predicts future injuries, but as a proxy, I can look at whether injury type is useful for predicting the future probability of injury.  I consider here two ‘type’ variables from the data: injury location and injury description.  Injury location is quite simply what part of the body was injured; as you might expect, this has categories such as ribs, legs, foot, shoulder, abdominal, etc.  Injury description is a more nebulous category, but it basically explains what sort of injury occurred: strain, break, bruise and so on (also including sleep disorder, oddly enough).

Rerunning the model whilst considering the location of the injury resulted in a slightly worse model fit (not pictured).  Meanwhile, considering the injury description, the model became only slightly more accurate, which is represented by the third bar in the above graph (not more accurate than you would expect given the additional variable).  For both of these injury types, then, I can say that I see no evidence that they predict future injury probability.


Caveats and Summary

Caveats are in order here.  While I am fairly confident that past injury predicts future injury, the converse conclusion, in the case of injury type, cannot be made.  Absence of evidence is not the same as evidence of absence, and for that reason I am unwilling to definitively state that injury type is inconsequential, only that it is not so consequential as to be obvious in the data.  Indeed, looking more carefully into injury description, one finds some nearly significant patterns: surgeries and tendinitis increase future injury risk quite a bit, whereas sprains, spasms, and infections seem not to increase injury risk at all.  Medically (but I am not a doctor), these patterns of causation make sense to me.  Surgeries are invasive and can result in complications, while infections are easily curable and unimportant after they’re gone.  So a more cautious conclusion might be that we have insufficient power at present to determine whether injury type influences future injury risk.

For this reason, I can’t directly predict whether Ellsbury’s past injuries make it more or less likely that he will be injured in the future.  Personally, however, I am skeptical of the freak injury hypothesis (namely, that nonstandard ways of getting injured shouldn’t increase future injury risk).  Having watched no small sample of baseball in the past, I have seen players regularly perform stunning acrobatic feats; the resulting falls would break, I am quite sure, most of the bones in my body.

Ballplayers are tough.  They have to be, because they are constantly running and jumping around at 10-20 mph, colliding with walls and the ground and each other.  Most of the time, despite that, they remain uninjured.  I wonder whether things we consider as “freak injuries” aren’t really just the metaphorical last straw of, say, a tibia stretched to its absolute, physiological limits.  I wonder whether there isn’t a strong observation bias, whereby we remember all the ballplayers who somehow came back from these injuries and forget all the guys whose now-weakened bones gave in at some later time in more spectacular fashion.

As before, definite conclusions are hard to come by in the case of sabermetrics.  What is clear is that Ellsbury is probably a risky acquisition, if not due to his injury history then due simply to his age and the position he plays.  Whether it pays off for the Yankees is now in the hands of the luck dragons.