Monday, November 16, 2015

Hypothetical Ballot: MVP

Some time in early September, the media decided that Josh Donaldson was the AL MVP. I don't purposefully seek out media on the awards, but I've not heard any mainstream support for a non-Donaldson (read: Mike Trout) candidate since that point. Obviously Donaldson has the playoffs and the RBI, but for my money this is not a particularly close race.

Even if you take away park adjustments, which favor Trout to the tune of 7%, I estimate Trout created 124 runs and Donaldson 123. But Trout did that whilst making 26 fewer outs. Third base and center field are essentially a wash when it comes to position adjustments, and the most favorable comparison in the big three fielding metrics for Donaldson is his 11 DRS to Trout's 0 UZR. Bringing park factors back in, I have Trout with 79 RAR and Donaldson 64, leaving Trout ahead even with the most lopsided fielding comparison feasible.

The rest of my AL ballot is pretty straightforward based on the RAR list, with the exceptions of Manny Machado and Lorenzo Cain, who jump up a few spots on the basis of strong showings in fielding (Machado averaged +14 runs in the big three metrics, Cain +17) and baserunning (+3 and +4 after removing steals respectively, per Baseball Prospectus). I regress fielding just enough to let Nelson Cruz hang on to what started as a 15 run RAR lead over Machado, sprinkle in the top four pitchers, and wind up with this ballot:

1. CF Mike Trout, LAA
2. 3B Josh Donaldson, TOR
3. SP Dallas Keuchel, HOU
4. SP David Price, DET/TOR
5. RF Nelson Cruz, SEA
6. 3B Manny Machado, BAL
7. SP Sonny Gray, OAK
8. CF Lorenzo Cain, KC
9. SP Corey Kluber, CLE
10. RF Jose Bautista, TOR

In the National League, there's absolutely no question for me: Bryce Harper had an epic season with 96 RAR, and that's before adding his positive baserunning and fielding contributions. For the first time in his full-time career, Mike Trout would not be my choice for overall MLB MVP.

Behind him, five candidates have separation for the next five spots on the ballot--the top first basemen Joey Votto and Paul Goldschmidt, and the top three starting pitchers (Jake Arrieta, Zack Greinke, and Clayton Kershaw). Looking solely at offense, Votto and Goldschmidt are basically even; while Votto's fielding is seen as above average, Goldschmidt is strong across the board (+13 FRAA, +5 UZR, and +18 DRS) and BP's baserunning metric has him as a positive (+2) while Votto is a big negative (-6).

Without Goldschmidt's strong ancillary contributions, I would drop him behind two or maybe even three of the pitchers, but I think he's got just enough value to stay ahead of them as is (and yes, I did consider that both Greinke with 5 runs created and Arrieta with 2 added value that wasn't considered in the Cy Young post. Greinke's offensive edge made me tempted to flip him and Arrieta on the MVP ballot, but it would have been to generate a curiosity rather than borne of strong conviction).

Two things worth discussing on the rest of the ballot: AJ Pollock would be here with 57 RAR regardless, but his defense and baserunning graded out well (-3 FRAA, +7 UZR, +14 DRS, +5 BP baserunning) while Andrew McCutchen's did not (-16, -5, -8, -2), enough to jump Pollock ahead of McCutchen who led him with 65 RAR.

1. RF Bryce Harper, WAS
2. 1B Paul Goldschmidt, ARI
3. SP Jake Arrieta, CHN
4. SP Zack Greinke, LA
5. 1B Joey Votto, CIN
6. SP Clayton Kershaw, LA
7. C Buster Posey, SF
8. CF AJ Pollock, ARI
9. SP Max Scherzer, WAS
10. CF Andrew McCutchen, PIT

Thursday, November 12, 2015

Hypothetical Ballot: Cy Young

I think that the Cy Young is the most interesting award to write about from a sabermetric perspective. The MVP debate can be fierce, but it often gets bogged down in semantic arguments about "what is value?" rather than substantive arguments about the candidates' resumes. It seems as if consensus about who is the "best player" is readily found in many years, and then people attempt to construct a narrative by which they can justify ignoring it.

On the other hand, the Cy Young debate is blissfully free from the semantic debate about what the award should represent, and instead discussion can be focused on how one determines the best pitcher. In the nascent days of sabermetrics, this could take the form of a classic ERA v. wins debate. Today, it often is sabermetricians and pseudo-sabermetricians duking it out over which type of performance metric should be used.

The NL race has that potential, while the AL race seems much more straightforward. Dallas Keuchel topped David Price by 12 RAR based on actual runs allowed adjusted for bullpen support. He topped Sonny Gray by 13 RAR and Price by 14 if you look at component statistics (including actual hits allowed). Using a DIPS-like approach, Keuchel was three RAR behind David Price and Corey Kluber. I give the most weight to the first, but unless you go full DIPS, Keuchel pretty clearly offers the best blend. Since Gray only had 35 RAR by DIPS, Price is a clear #2.

The last two spots on my ballot go to Kluber and Chris Archer, edging ahead of Jose Quintana and besting his teammate Chris Sale. Quintana had a slight edge in RAR over Kluber and Archer, but his 4.17 eRA was the worst of any contender and is enough for me to put Kluber and Archer, whose peripherals were stronger than their actual runs allowed, ahead. Sale led the league in dRA at 2.98 thanks to allowing a .331 average on balls in play (his teammate Quintana fared little better at .329), but Kluber and Archer's edge in the non-DIPS metrics is enough to get my vote:

1. Dallas Keuchel, HOU
2. David Price, DET/TOR
3. Sonny Gray, OAK
4. Corey Kluber, CLE
5. Chris Archer, TB

The NL race is a three-way battle between Zack Greinke, Clayton Kershaw, and Jake Arrieta. Greinke has a slight lead in RAR with 88 to Arrieta's 86 and Kershaw's 79. In RAR based on eRA, the two Dodgers are tied with 79 while Arrieta had 85. In dRA (DIPS)-based RAR, Kershaw leads with 72, while Arrieta had 65 and Greinke 48.

In comparing teammates, it becomes more difficult to accept at face value the DIPS position. They pitched in the same park, with the same teammates behind them. That in no way means that the defensive support they received had to have been of equal quality, or that Greinke couldn't have benefitted from random variation on balls in play (this formulation works better than Kershaw being unlucky, given that Greinke's BABIP was .235 and Kershaw's .286). The gap in dRA is large, but not large enough for me to wipe out a nine run difference in RAR.

But while Greinke grades out as the Dodger Cy Young, I don't consider his two run lead in RAR over Arrieta significant enough given the latter's edge in the peripherals. While I think Kershaw is the best NL pitcher from a true talent perspective by a significant margin, I think Arrieta is most worthy of the Cy Young.

Max Scherzer is an easy choice for the #4 spot and would probably be in a virtual tie for second with his short-time teammate Price on my AL ballot. The last spot goes to Gerrit Cole over Jacob deGrom and John Lackey; the former was consistently valued by each of the three approaches (51 RAR based on actual runs allowed, 52 based on peripherals and DIPS):

1. Jake Arrieta, CHN
2. Zack Greinke, LA
3. Clayton Kershaw, LA
4. Max Scherzer, WAS
5. Gerrit Cole, PIT

Monday, November 09, 2015

Hypothetical Ballot: Rookie of the Year

In the AL, only one rookie reached 500 plate appearances (five did in the NL) and none reached 150 innings pitched (three in the NL), so there is a dearth of full season candidates for Rookie of the Year honors. The only full-time rookie was Billy Burns, and his 20 RAR was good for just fourth among AL rookie hitters. Still, two rookie shortstops managed to stand out and rise above the pack as the clear 1 and 2 choices for the award. Offensively, Carlos Correa and Francisco Lindor had nearly identical production; Lindor's OBA was eleven points higher, Correa's SLG was twenty points higher. In ten more PA, Correa created three more runs, so the two were nearly identical in RG and RAR. Correa's 33 to 31 RAR lead doesn't hold up, though, when fielding and baserunning are brought into the equation. While both were average baserunners according to Baseball Prospectus (0 and -1 runs respectively), Lindor was +2 in FRAA, +11 in UZR, and +10 in DRS while Correa was -3, 0, -6. That's convincing enough to place Lindor ahead on my ballot.

One thing to note is that I think Correa's performance was more impressive than Lindor's in terms of "prospect" status, but I don't think that's what the award is for. Correa is a year younger and his offensive performance was less dependent on a high batting average (Lindor hit .313 with a .249 SEC, Correa hit .279 with a .339 SEC) and Lindor's power output was higher than most expected. But while that matters going forward, I think Lindor was a more valuable player in 2015.

Lance McCullers, Nate Karns, Andrew Heaney, and Carlos Rodon were all candidates for ballot spots from the pitching side. I chose to value Karns' 147 innings over Heaney and Rodon's better peripherals. Miguel Sano was sixth in the AL in RG among players with more than 300 PA (basically equivalent to Edwin Encarnacion and Jose Bautista), but he had just 333 PA and questionable value as a fielder or baserunner. So I have it:

1. SS Francisco Lindor, CLE
2. SS Carlos Correa, HOU
3. SP Lance McCullers, HOU
4. DH Miguel Sano, MIN
5. SP Nathan Karns, TB

The NL race is not close, as Kris Bryant put up a 50 RAR season and wasn't panned by the fielding metrics (-2 FRAA, +5 UZR, +3 DRS). Matt Duffy was thirteen runs behind offensively and was seen to be a good fielder, but even using the fielding metrics with no accounting for the additional uncertainty, Bryant would still be ahead. Joc Pederson and Jung Ho Kang are the other top position player candidates with 29 and 28 RAR, but FRAA hates Pederson (-19) while UZR and DRS just dislike him (-4 and -3 respectively). And yes I'm intentionally being silly by suggesting that the metrics like or dislike players. The consensus on Kang was slightly above average, which makes him the clear #3 hitter. Randal Grichuk is in the mix at 26 RAR, and one could certainly make a fielding case to put him ahead of Pederson.

Among pitchers, Noah Syndergaard's 29 RAR bests Anthony DeSclafani's 24, and Thor's peripherals are right in line with his RRA. So I see it as:

1. 3B Kris Bryant, CHN
2. 3B Matt Duffy, SF
3. SS Jung Ho Kang, PIT
4. SP Noah Syndergaard, NYN
5. CF Joc Pederson, LA

Monday, November 02, 2015

Royal Mythology

Rarely has the performance of a single team led to so many attempts to rationalize, explain, project virtue, and the like as the 2014-15 Royals. Focusing on the 2015 edition, here are just a handful of Royals myths that I have been particularly annoyed at hearing. The "analysis" that follows is not comprehensive nor is it intended to be. That's kind of the point. The level of extraordinary claims that have been made about the Royals should be apparent even with the crudest of inquiries into the objective record.

Myth #1: Whatever the Heck Andy McCullough Tweeted

"The entire point of the Royals is that baseball is a hard game and if you make your opponent do things, sometimes they will screw up"

The Kansas City Royals reached base on error 58 times in 2015. The AL average was 57. In 2014 they had 51 ROE versus a league average of 57.

Myth #2: The Royals Don't Make Mistakes

Errors leave a lot to be desired as a metric, but when traditional thinkers talk about making mistakes, errors are first and foremost on their mind. The 2015 Royals had a mFA of .973; the AL average was .971. The 2014 Royals had a mFA of .968; the AL average was .970.

Myth #3: The Royals had a long World Series drought

There are 30 MLB teams. It should be obvious, then, that 30 years is the expected time between world titles. Thus a streak of thirty years is not particularly long in theory. It's also not long in practice, as it was only the 12th longest drought (the Mets had the 13th longest drought). Last year en route to the pennant, two of the three teams Kansas City beat had (slightly) longer droughts and the other had a slightly shorter drought.

To find the Royals worthy of any particular sympathy, one must give extra credit for how poorly the franchise performed for much of that period. While this is unfortunate for the fans, it seems like such a group would be less traumatized by losing the World Series and more appreciative just to get there. Fan "suffering" is very low on my list of factors in deciding which teams to pull for in the playoffs, but to the extent I consider it, I tend to side with teams that have been good and just have not had the bounces go their way in October. Teams like the Marlins and the Royals who parlay their only two playoff teams in an extended period into pennants and world titles are quite galling to anyone who has rooted for a titleless yet competent franchise.

But more broadly, I think that the media and fans have yet to understand how championships will be distributed over the long haul in leagues that are double or close to it in size from what they were for so many years. Lengthy droughts, the types that the Red Sox, Cubs, or to a lesser extent Indians and Giants have suffered, will be quite commonplace. Basic logic tells you that they have to be.
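To put a rough number on that logic: if every team in an n-team league has an equal 1/n chance at the title each year, the chance that a given team goes a stretch of years without one is (1 - 1/n) raised to the number of years. The sketch below assumes exactly that idealized equal-odds league; the function name `prob_titleless` is made up for illustration.

```python
def prob_titleless(n_teams, years):
    """Chance a given team wins no title over `years` seasons,
    assuming every team has an equal 1/n_teams shot each year."""
    return (1 - 1 / n_teams) ** years


# Compare a sixteen-team league with a thirty-team league.
for n_teams in (16, 30):
    for years in (30, 50, 100):
        share = prob_titleless(n_teams, years)
        print(f"{n_teams} teams, {years} years without a title: {share:.1%}")
```

Under these assumptions a thirty-year dry spell is much more likely in a thirty-team league than in a sixteen-team one, before any real-world imbalance is considered.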

I did a "simulation" (which is a pretentious way of saying I used the RAND() function in Excel) to simulate 1,000 seasons of a thirty-team league in which each team had a 1/30 chance to win the World Series in any given year. Remember, this is the height of competitive balance. The probability of a championship could not be any more evenly distributed. There are no market disadvantages, no bad franchise stewardship, no billy goats. It is theoretically possible that the timing of championships could be more evenly distributed, but admittedly my imagination is insufficient to describe a specific scenario that would force a more even temporal distribution.

After 1,000 years, the average team should have had 33 1/3 titles. The most successful had 45; the two least successful each had 22 (as an aside, and granting that it was a sixteen team universe for an extended period, think about the Yankees' 27 in this context).

For years 501-1000, I calculated the average of the quartiles, as well as the percentage of active droughts as of a given year greater than 30 years. Since droughts for these 500 years are not independent of one another, be cautious with extrapolating those averages to anything else (for what it's worth, the medians are similar).

The average for these seasons was a first quartile drought of 8.4 years; a median drought of 20.2 years; a third quartile drought of 39.8 years, and a maximum drought of 115.0 years. In the average season, 34.4% of droughts exceeded 30 years (note that the current MLB figure is 12/26 = 46.2% of droughts exceeding 30 years, excluding the four subsequent expansion franchises, which suggests but in no way proves that, not surprisingly, the observed title distribution is not as egalitarian as the theoretical one used here).
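The experiment described above can be approximated in a few lines of Python. This is a sketch of the same idea, not the original spreadsheet: it uses Python's `random` module in place of Excel's RAND(), and the seed, the function names, and the choice of the "inclusive" quartile method are all assumptions, so the figures it prints will differ somewhat from those quoted here.

```python
import random
import statistics


def simulate_league(n_teams=30, n_years=1000, seed=2015):
    """Pick a champion uniformly at random each year.

    Returns title counts per team and, for every year, each team's active
    drought (years since its last title; 0 for the reigning champion).
    Teams that have not yet won are dated from the start of the simulation.
    """
    rng = random.Random(seed)  # arbitrary seed, only for reproducibility
    titles = [0] * n_teams
    last_title = [0] * n_teams
    snapshots = {}
    for year in range(1, n_years + 1):
        champ = rng.randrange(n_teams)
        titles[champ] += 1
        last_title[champ] = year
        snapshots[year] = [year - won for won in last_title]
    return titles, snapshots


def drought_summary(droughts, threshold=30):
    """Quartiles, maximum, and share of active droughts longer than threshold."""
    q1, median, q3 = statistics.quantiles(droughts, n=4, method="inclusive")
    share = sum(d > threshold for d in droughts) / len(droughts)
    return q1, median, q3, max(droughts), share


titles, snapshots = simulate_league()
print("most titles:", max(titles), "fewest titles:", min(titles))

# Average the yearly summaries over years 501-1000, as in the text.
rows = [drought_summary(snapshots[year]) for year in range(501, 1001)]
avg = [statistics.mean(column) for column in zip(*rows)]
print("avg Q1 %.1f, median %.1f, Q3 %.1f, max %.1f, share over 30 years %.1f%%"
      % (avg[0], avg[1], avg[2], avg[3], 100 * avg[4]))
```

Changing the seed gives a different millennium; as noted above, the yearly snapshots are not independent, so the averages from any single run should be read loosely.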

Freezing it at year 1,000, this is what the drought picture looks like:


Even with new champions in 7 consecutive and 16 out of 20 seasons, a pretty typical 1/3 of droughts exceed 30 years, one team has exceeded the Cubs, and two more have exceeded the Indians.

The longest drought for any team during the millennium was 215 years. The poor fans of Team 6 celebrated a title in year 306, then went through many generations (or not, who knows, it's the future) before finally winning again in year 622. Then they waited another 120 years for good measure. Should baseball survive for 1,000 years with 30 or more teams, think about all of the narratives that the sportswriters of the future will get to craft.

Myth #4: The Royals Need to Be Explained

This is more of a meta-analytical comment than specific to the Royals, but there is an underlying notion, seen even on some sabermetrically-inclined outlets, that the Royals are an anomaly that demands our attention and an explanation. Please note that I am not criticizing the act of questioning one's premises, of attempting to update hypotheses as new data becomes available, of recognizing that we don't know everything about baseball, or anything of the sort. This is all laudable. But such inquisition must not be confused with an imperative to find fault in one's null hypotheses either.

But there all too often is a reflexive desire to be too conciliatory, too eager to throw out one's existing knowledge and toolkit in an attempt to explain something that may just be a fluke. Witness "The Year That Base Runs Failed" (an article that demands a thorough undressing that I just do not have the will to give justice to right now). Recently this has seemed to manifest itself more at outlets that rely on 1) boisterous, opinionated writers and 2) daily content production.

When you are boisterous and opinionated, you need your opinions to be right in order to maintain credibility. If you have to blame the tools (Base Runs, W% Estimators, the entirety of sabermetric theory) that you used to justify your initial opinion, that's fair game. On the other hand, my position on the Royals doesn't demand I apologize for it (maybe I should--as I acknowledged above, I could be wrong, and inquiry into why that might be the case is healthy). My position is simply that the Royals were a fairly average team as indicated by their component statistics, but that sometimes teams outplay their component statistics. The Royals made the playoffs and over two seasons went 22-9, but a .500 team would go 22-9 or better with 1.5% probability--it's not likely but it also must happen now and again. You can disagree, but it's inherently a passive argument.
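The 1.5% figure is a binomial tail probability, and it is easy to check. The sketch below assumes each of the 31 games is an independent coin flip for a true .500 team, which ignores opponent quality and home field; `prob_at_least` is a made-up helper name.

```python
from math import comb


def prob_at_least(wins, games, p=0.5):
    """Probability of winning at least `wins` of `games` independent games,
    each won with probability p (binomial upper tail)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


# A 22-9 record is 22 wins in 31 games.
print(f"P(22-9 or better for a .500 team) = {prob_at_least(22, 31):.2%}")
```

Under those assumptions the result lands right around the 1.5% quoted above.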

If you need to produce content daily, then you have to write about something, and writing "the sample size precludes us from drawing firm conclusions" over and over again doesn't drive readership. So there's a temptation to overfit your model, to declare that the secret sauce has been found, to cheat on the degree of certainty you require before you declare correlation to be causation, to investigate one positively correlated variable at the expense of other potential explanatory variables, to overreact to a year in which your metric's standard error is higher than it typically is.

Even great sabermetricians can get caught in this trap; I have never been confused with a great sabermetrician, but I have written things along these lines that I am not proud of as well. Bill James and Nate Silver have both, using different but understandable means when considered in the context of their work, failed pretty miserably at predicting playoff success based on historical data. The simple fact of the matter is that there were 32 playoff games (not counting the wildcard games) this season, which is fairly typical. At 30 games/season, you need five seasons to have a sample size the same as that of one major league team-season.

This is particularly problematic when so many of the attempts to explain playoff performance are based on theories about changes in the game. Contact superseding Moneyball, bullpen construction and usage patterns which have been in a constant state of change throughout baseball history...you could never have credible data without the conditions of the game shifting. This is not to say don't try to advance our understanding, it's to say be extremely cautious as you attempt to do so.

So what winds up happening is that a potential explanation ("Contact works, allow it" is a particularly poor paraphrase since it makes it sound like your pitchers should allow contact, but I saw that Colin Cowherd promo too many times not to use it) is honed in on, and maybe there's evidence of some effect, so other potential explanatory variables are ignored and the correlation is exaggerated and soon there's a truism that must be disproved rather than a hypothesis which must be proved.

There's a difference between saying "I don't know" and "No one will ever know". If it seems as if my school of thought arrives at the latter, that's a fair criticism. But I personally would rather be too certain about how much I can't know than to be too quick to think I've learned something new.