Monday, November 12, 2012

IBA Ballot: MVP

There have been way too many words written about the AL MVP race already. I’m pretty sure that I don’t have any perspective to offer that you have not already had the opportunity to read from someone else. Nonetheless, I will run through a perfunctory comparison of the top two candidates and then address a couple of other side issues that the discussion has raised.

Mike Trout by my estimation created 131 runs, adjusted for park (the key word in that sentence is “estimation”). Miguel Cabrera created about 133 runs. Trout did this while making 382 outs; Cabrera while making 418 outs. It does not take any advanced understanding of sabermetrics to conclude that two fewer runs in 36 fewer outs is a tradeoff that would benefit a team. This is before considering the fact that Trout is an excellent center fielder and baserunner and Cabrera is a third baseman of questionable ability and is not going to add much of anything on the bases. It’s pretty clear that Trout is ahead before any factors not captured in the statistics are taken into account.

If you want to poke holes in that perfunctory analysis, one place you might start is the park factors. I estimate that Angel Stadium has a park factor of .96 and that Comerica Park has a park factor of 1.02. I don’t want to get into a debate about the park factors themselves, but rather I’ll assume for the sake of argument that both parks were neutral. After making that change, I estimate that Trout created 125 runs and Cabrera 136. Instead of a two-run difference over 36 outs, we now have an eleven-run difference over 36 outs, which suggests that Cabrera was the more valuable offensive player. Of course, the aforementioned fielding and baserunning are more than enough to preserve the choice of Trout as more valuable before subjective factors are considered.
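To make the runs-for-outs tradeoff concrete, here is a minimal sketch of the arithmetic. The runs and outs are the figures quoted above; the league-average rate of 0.18 runs per out is an illustrative assumption of mine, not a number from this post, and is only there to give the marginal rate something to be compared against.

```python
# (runs created, outs made), as quoted in the post.
park_adjusted = {"Trout": (131, 382), "Cabrera": (133, 418)}
park_neutral = {"Trout": (125, 382), "Cabrera": (136, 418)}

# ASSUMPTION (not from the post): a rough league-average rate of
# runs per out, used only as a yardstick for the marginal rate.
ASSUMED_LEAGUE_RUNS_PER_OUT = 0.18


def marginal_runs_per_out(players):
    """Extra runs Cabrera created per extra out he made, relative to Trout."""
    trout_runs, trout_outs = players["Trout"]
    cabrera_runs, cabrera_outs = players["Cabrera"]
    return (cabrera_runs - trout_runs) / (cabrera_outs - trout_outs)


adjusted_rate = marginal_runs_per_out(park_adjusted)  # 2 runs / 36 outs
neutral_rate = marginal_runs_per_out(park_neutral)    # 11 runs / 36 outs

print(f"park-adjusted: {adjusted_rate:.3f} runs per extra out")  # 0.056
print(f"park-neutral:  {neutral_rate:.3f} runs per extra out")   # 0.306
```

Under that assumed yardstick, Cabrera's extra 36 outs bought far less than an ordinary hitter would have produced in the park-adjusted case, and noticeably more in the neutral-park case, which is the flip described above.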

Side issues:

* It has become surprisingly common to see sabermetric-minded people suggest that leadoff hitters should have their RAR discounted in some manner because of the extra plate appearances their role gives them. I don’t know how widespread this view is, or where exactly it got started, but I find it quite odd.

My conception of value holds that if a player is used or is able to take advantage of his talents in such a way as to increase his contribution to the team, then he should be credited for this added value. One example is a hitter that can exploit his home park. Some people would look at the hitter’s home/road splits and discount his value accordingly. I would only discount his home stats to the degree to which the runs have a lower win value (in other words, use a runs-only park factor).

In order for me to believe that a leadoff hitter should not get credit for the additional PA he takes, you would have to demonstrate to me that his average PA had less win impact (production as measured by context-neutral metrics being equal) than the average PA of a hitter lower in the order. Before you conclude this would be easy to do, I’d invite you to read the point I made about leverage and relievers in the Cy Young post--I don’t believe it is necessary to limit value to a real-time perspective. This applies within innings as well as within games.

In Trout/Cabrera, though, the real-time perspective measure of RE24 (real-time on the inning level) does not even support the contention that Cabrera’s batting order position meant he had a greater impact. Fangraphs’ context-neutral wRAA has Trout at +48, while their RE24 for Trout is +54. Cabrera is +57/+47. So even if I were to accept the premise, I’m not sure how this is supposed to help Cabrera.
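The comparison above boils down to one subtraction per player. As a crude sketch (the dictionary layout and the `context_bonus` name are just for illustration), the gap between RE24 and wRAA shows which way the in-inning context moved each hitter's runs:

```python
# Fangraphs figures quoted in the post, in runs above average.
fangraphs = {
    "Trout": {"wRAA": 48, "RE24": 54},
    "Cabrera": {"wRAA": 57, "RE24": 47},
}

# Positive means the base-out context added runs beyond what the
# context-neutral estimate credits; negative means it took runs away.
context_bonus = {
    name: figures["RE24"] - figures["wRAA"]
    for name, figures in fangraphs.items()
}

for name, bonus in context_bonus.items():
    print(f"{name}: {bonus:+d} runs from context")
# Trout: +6 runs from context
# Cabrera: -10 runs from context
```

By this rough measure the context helped Trout and hurt Cabrera, turning a nine-run wRAA deficit for Trout into a seven-run RE24 lead.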

* An argument that was oft-cited but got less traction with saber-minded folks was the notion of “penalizing” Cabrera for playing third base. This argument holds that Cabrera made a noble sacrifice for the good of the team to play third, which allowed Detroit to sign Fielder and improve the team. Off the bat, I reject the notion of crediting a player for allowing another to be signed, because it removes the focus from the player’s on-field contributions and opens a Pandora’s box of circumstantial arguments that could not be objectively evaluated or even fully documented (just as a hint of where this road could end up leading, compare Cabrera and Trout’s salaries--or the fact that Fielder was signed after Victor Martinez’s injury, which means that Cabrera’s sacrifice, at least for 2012, allowed Delmon Young and his pitiful 3.9 RG to play every day at DH/LF).

Setting that portion of the debate aside, the RAR figures I use do not account for fielding, so any penalty that Cabrera takes for playing third base can only be added after the fact. Playing third rather than first earned Cabrera 7 RAR. Even if Cabrera is an average third baseman (and I don’t think his backers would claim much more than that), it’s hard to spin this into a positive compared to Trout.

* I’ve seen the argument floated that Cabrera has been a great player for several years and has not won an MVP award; this may be his last, best chance, while Trout, given his age, may have many MVP opportunities in front of him.

The primary reason I disagree with this position is that the MVP award is a single-season award, and as such I believe that the criterion should be a good faith evaluation of which player was more valuable in the season in question. If the award were a true talent award, then certainly Cabrera’s track record would be relevant, and in fact if I could choose one of these players for my team in 2013 (with no consideration given to anything beyond 2013), I would take Cabrera. But that’s not the criterion suggested by either the voting instructions or consensus of interested parties.

More generally, I call this the Zenyatta argument. Zenyatta won Horse of the Year in 2010 over Blame despite there being no way to argue that Zenyatta had a more impressive 2010 campaign than Blame without twisting oneself into knots. But Zenyatta was a great mare of historical significance who had been edged out for the award by Curlin in 2008 and Rachel Alexandra in 2009 (in those years, I believe that a very reasonable case could be advanced for Zenyatta, but ultimately agreed with the selections of Curlin and Rachel Alexandra). It was seen as unfair that a horse as accomplished as Zenyatta would never win Horse of the Year.

I find this argument utterly unpersuasive. Miguel Cabrera has been an excellent player over an extended period, which is why he ranked fourth on my IBA ballot in 2006, tenth in 2009, second in 2010, second in 2011, and second in 2012. There’s no shame in being the second-best player in the AL or the second-best horse in the country for three years running--it’s a more impressive achievement than being MVP one year and not on the ballot in the other two years. But it doesn’t entitle one to the MVP in any given season.

With respect to the “Trout is young and will have many more chances” component of the argument, we’d all like to think this is the case but you never know. Al Kaline was a great player for many years and finished in the top ten in MVP voting nine times, but he arguably had his best major league season at age 21 and never won an MVP (I’m not suggesting that he should have won it in the age 21 season, only that it may have been his best chance). Mike Trout could be a slam dunk Hall of Famer and yet never match his 2012 season.

* Finally, there is the issue of a margin of error in RAR/WAR calculations. Let’s just assume for the sake of argument that the 95% confidence interval on RAR is 15 runs wide (I pulled this number completely out of thin air, and am just using it to make a point; of course, a 95% confidence standard is also pulled out of thin air despite its ubiquitous application in statistics).

So I have Trout at 81 RAR and Cabrera at 78, not considering fielding and baserunning. Let’s suppose that Trout was worth 9 runs in these areas to give him an even 90 and that Cabrera was worth -3 for an even 75. Obviously, I’ve engineered this example so that they are separated by 15 runs.

So to put it in stats lingo, we cannot, at a 5% significance level, reject the null hypothesis that Cabrera and Trout were of equal value. So if you believe that Cabrera was as valuable as Trout, it is a defensible position. But saying that we can’t be confident at the 5% significance level that Trout was more valuable than Cabrera does not change the fact that our analysis indicates that it is highly likely that Trout was more valuable than Cabrera.
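A rough sketch can put a number on “highly likely”. Everything statistical here is an assumption layered on top of the made-up example: I read the 15-run figure as a half-width of plus or minus 15 runs on each player's 95% interval, and I treat the two estimation errors as normal and independent. The function name is hypothetical, and other readings of the 15 runs would give different numbers.

```python
from math import sqrt
from statistics import NormalDist


def prob_first_is_better(rar_a, rar_b, half_width, level=0.95):
    """P(A's true value exceeds B's), assuming independent normal errors.

    half_width is the ASSUMED half-width of each player's `level`
    confidence interval on RAR (an illustrative reading of the post).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    sd_each = half_width / z                   # implied std. error per player
    sd_diff = sqrt(2) * sd_each                # std. error of the difference
    return NormalDist().cdf((rar_a - rar_b) / sd_diff)


def can_reject_equal_value(rar_a, rar_b, half_width, level=0.95):
    """Two-sided test of 'equal value' under the same assumptions."""
    return abs(rar_a - rar_b) > sqrt(2) * half_width


p_trout = prob_first_is_better(90, 75, half_width=15)
rejected = can_reject_equal_value(90, 75, half_width=15)

print(f"P(Trout more valuable) is roughly {p_trout:.2f}")
print(f"Reject equal value at the 5% level? {rejected}")
```

Under these assumptions the test fails to reject equal value, yet the estimated chance that Trout was the more valuable player still comes out a little under 92%, which is the distinction the paragraph above is drawing.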

What I’m trying to get at here is that I sometimes detect (perhaps incorrectly) two assumptions in the arguments of folks who like to harp on a margin of error: 1) that if the confidence intervals overlap, then you cannot use RAR to make the case that Trout was probably more valuable, and 2) that in lieu of airtight evidence that Trout was more valuable, you should go with Cabrera. Maybe I’m imagining this, particularly the second, but that is the impression that I was left with after reading some discussions.

Of course, the proponents of the pure confidence interval approach need to be cognizant of the logical conclusion of their arguments--if we can’t argue for Trout ahead of Cabrera on a ballot on the basis of his higher RAR, we also can’t argue for Cabrera over Robinson Cano or Justin Verlander, because our confidence intervals on their RARs overlap with Cabrera’s.

It would be nice if an MVP ballot were constructed in such a way that you didn’t have to assign a strict rank order. It might be a better system if you could give Trout the equivalent of a 1.2nd place vote, and Cabrera the equivalent of a 2.5th place vote. It might be a better system if you could somehow throw a net over Trout and Cabrera on your ballot, and then throw another net over Cabrera, Verlander, and Cano, and then another over Verlander, Cano, David Price, and Adrian Beltre. But that’s not how the system works--you have to make a rank order, and all the margin of error tells you is that it’s not crazy to think that perhaps Cabrera was as good as Trout, and that you’re not an abject idiot for putting Cabrera first. It doesn’t do much to convince anyone else to follow suit, though.

Given the way MVP voting is constructed, I am going to vote for a guy with 76 RAR over a guy with 75 RAR every time unless I can be convinced of a reason not incorporated into those figures to do otherwise. I say this even though one run (or an alternatively small quantity) is a meaningless distinction--the ballot structure forces one to make meaningless distinctions, and just parroting your value estimates is no less arbitrary than any other way of making those distinctions (and at least allows for consistency in lieu of confidence).

Getting back to the rest of the ballot, I have five pitchers in eight spots, which I think is a record for me. Outside of Trout, Cabrera, and Cano, the rest of the AL position players didn’t put up seasons that jump out. Joe Mauer is fourth on the RAR list, but that gives him full-time credit for being a catcher; take that away and he drops to 52 RAR. Prince Fielder is at 55, but that’s before baserunning or fielding, which knocks him down a bit. Edwin Encarnacion is next, but he adds nothing outside of the bat, which leaves two Rangers, Adrian Beltre and Josh Hamilton, to battle with Mauer for the ballot spots. I chose the two Rangers. They were extremely close in offensive value (.347/.543 for Beltre and .342/.557 for Hamilton) and equal in RAR (53) thanks to playing positions with even position adjustments. I nudged Beltre ahead on the basis of fielding and Hamilton’s extensive play in left field:

1. CF Mike Trout, LAA
2. 3B Miguel Cabrera, DET
3. SP Justin Verlander, DET
4. 2B Robinson Cano, NYA
5. SP David Price, TB
6. SP Chris Sale, CHA
7. SP Felix Hernandez, SEA
8. 3B Adrian Beltre, TEX
9. CF Josh Hamilton, TEX
10. SP Jered Weaver, LAA

You would never know it from the clash of worldviews offered by the AL race, but the NL MVP race was much closer, and there are three candidates between whom it’s tough to make meaningful distinctions.

My figures credit Buster Posey with 77 RAR, Andrew McCutchen with 75, and Ryan Braun with 74. Posey’s RAR is inflated since he’s considered a full-time catcher, but the evidence seems to suggest that he is a solid enough catcher and not a disastrous baserunner. McCutchen does not fare all that well in fielding metrics, while Ryan Braun is considered a solid left fielder. Seeing no reason to knock Posey down, I put him in the top spot, but I would certainly accept an argument on behalf of any of the three. With so little to separate Braun and McCutchen, I chose to go with the one who fares better in the area in which I’m more confident in our value estimates--offense. Braun created 6 more runs in just 3 more outs, a difference well within a margin of error but also the largest daylight you’ll find between these two.

There are a number of interesting position player candidates for the remainder of the ballot. The two closest position players are a pair of third basemen, Chase Headley and David Wright. Headley is ahead in RAR, 68-61, but fielding metrics suggest that Wright may have been better. UZR really liked Wright’s fielding at +15 to Headley’s +2; Baseball Prospectus’ FRAA was less enthusiastic about both (Wright +1, Headley -7). I’ll side with offense and keep Headley ahead.

Joey Votto missed a significant amount of time (only 111 games), but was brilliant when in the lineup, leading the NL with a .465 OBA and 9.5 RG for 58 RAR. Yadier Molina is also a candidate with 55 RAR and a brilliant fielding reputation backed up by what limited data we have, but he also appears to have been a liability on the bases (-6 runs according to Baseball Prospectus) and it’s tough to know exactly how to evaluate his fielding. Votto is no slouch in the field, either, albeit at a far less demanding position. Aramis Ramirez was also quite good, and I had no idea until looking at the stats systematically that Aaron Hill hit .293/.347/.507. Hill is a case in which fielding metrics disagree (+21 FRAA, +2 UZR), and I’m inclined to give more credence to Molina’s fielding. Mixing in the starting pitchers, I have it as:

1. C Buster Posey, SF
2. LF Ryan Braun, MIL
3. CF Andrew McCutchen, PIT
4. SP Clayton Kershaw, LA
5. 3B Chase Headley, SD
6. 3B David Wright, NYN
7. SP RA Dickey, NYN
8. 1B Joey Votto, CIN
9. C Yadier Molina, STL
10. SP Johnny Cueto, CIN
