Friday, November 26, 2010

The Curious Case of Brooks Robinson's Batting Runs (rWAR)

Colin Wyers of Baseball Prospectus pointed this out to me, and neither he nor I have an explanation for it. Rally's WAR estimates have become the most widely used on the internet, especially since they are available at Baseball-Reference. However, some of the batting runs figures don't make a whole lot of sense, and the specific player that Colin brought to my attention was Brooks Robinson.

Rally lists Robinson with a career total of 20 batting runs (above average). That figure does not include baserunning (0 runs) or GDP runs (-35 runs) or reached on error runs (-2 runs), so I will omit those areas of the game from my estimates which follow. The 20 batting runs seems awfully low. My own crude ERP-based estimate is 154 batting runs with park adjustment, 113 without (I estimate Robinson's career season-weighted PF to be .97, meaning he played in moderate pitcher's parks on average). Colin's estimate is 84 batting runs without park adjustment. Pete Palmer's estimate from the 2005 ESPN Baseball Encyclopedia (which does include base stealing) is 53 batting runs. Wyers used OPS+ to generate a crude estimate of +57.

When I was initially discussing this on Twitter, it completely slipped my mind that my figures were comparing Robinson to all league hitters (including pitchers for 1955-1972). Palmer's estimate and OPS+ exclude pitchers from league totals, and they are the closest to Rally's. Even so, a thirty-run difference is fairly large when dealing with offense.

In order to understand why we see discrepancies, it makes sense to attempt to replicate Rally's approach. His explanation of batting runs allows us to get a sense of his process:

Bat runs - This is park adjusted linear weights batting runs, using customized weights at the team level to ensure that total runs credited to players will equal the actual runs scored for that team.

While it is not specified in the quoted entry, Rally has explained elsewhere that he uses Base Runs to generate the linear weights. Since the weights are set so as to ensure that team BsR is equal to actual team runs, I'm going to assume that he's achieving this through the use of a custom B multiplier for each team-season.

I attempted to mimic this through use of a BsR formula that only considered the basic batting events--singles, doubles, triples, homers, walks, and at bats. This formula is far from the most accurate BsR equation ever devised, but it should perform well enough in this role:

A = H + W - HR
B = 2TB - H - 4HR + .05W
C = AB - H
D = HR

Using this equation, I calculated the B multiplier needed for each Orioles season to make BsR = actual runs scored. Then I calculated the corresponding intrinsic weights for each team-season, and used these to estimate Robinson's runs created. From there, I estimated Batting Runs by taking RC - Avg(RC/Out)*Robinson outs.
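The multiplier step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard Base Runs form, BsR = A*B/(B + C) + D, applies the multiplier to the entire B term, and solves for it algebraically. The function names and the sample team line are hypothetical (the line is not real Orioles data), and the sketch stops short of deriving the intrinsic weights.

```python
def bsr_components(h, w, hr, tb, ab):
    """A, B, C, D from the simple Base Runs formula above."""
    a = h + w - hr
    b = 2 * tb - h - 4 * hr + 0.05 * w
    c = ab - h
    d = hr
    return a, b, c, d


def base_runs(h, w, hr, tb, ab, b_mult=1.0):
    """BsR = A * (m*B) / (m*B + C) + D, where m is the B multiplier."""
    a, b, c, d = bsr_components(h, w, hr, tb, ab)
    return a * (b_mult * b) / (b_mult * b + c) + d


def solve_b_multiplier(h, w, hr, tb, ab, runs):
    """Multiplier m on B that makes BsR equal the team's actual runs."""
    a, b, c, d = bsr_components(h, w, hr, tb, ab)
    # Share of baserunners (A) that must score for BsR to match actual runs.
    score_rate = (runs - d) / a
    # From m*B / (m*B + C) = score_rate, solve for m.
    return score_rate * c / (b * (1 - score_rate))


def batting_runs(player_rc, player_outs, league_rc_per_out):
    """RC minus what an average hitter would create in the same outs."""
    return player_rc - league_rc_per_out * player_outs


# Hypothetical team-season line, for illustration only (NOT real data).
team = dict(h=1400, w=550, hr=150, tb=2200, ab=5500)
m = solve_b_multiplier(runs=750, **team)
print(round(m, 3), round(base_runs(b_mult=m, **team), 1))
```

With the multiplier in hand, the same `base_runs` function reproduces the team's actual run total, which is the constraint described in Rally's definition.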

Using this approach, I estimate that Robinson contributed 107 batting runs (without accounting for pitchers and without a park adjustment).

In order to better mimic Rally's approach, I needed to remove pitcher hitting from the league total. To do this, I used the BsR formula to estimate intrinsic linear weights for each league-season, then figured the league RC/O for non-pitchers (I used a spreadsheet published by Terpsfan101 to get the non-pitcher totals). Using those figures as the baseline for Robinson, I got an estimate of 30 batting runs, which isn't that far off of Rally's. However, when a park adjustment is applied, it shoots back up to 71 runs, which is much closer to the Palmer and Wyers estimates.

More concerning was another curiosity that Wyers noted--the 1969 season. Robinson's Orioles are credited with a team total of 40 batting runs. However, they averaged 4.81 R/G in a league with an average of 4.09 R/G, which means that they scored about 117 runs more than average. That's nearly an 80 run discrepancy!
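The arithmetic behind that discrepancy is easy to check. The sketch below assumes a 162-game schedule, which is not stated in the text; the variable names are just for illustration.

```python
GAMES = 162  # assumed schedule length (not stated in the post)

team_rpg = 4.81           # 1969 Orioles runs per game, from the text
league_rpg = 4.09         # 1969 AL average runs per game, from the text
listed_batting_runs = 40  # team batting runs total cited in the text

runs_above_average = (team_rpg - league_rpg) * GAMES
discrepancy = runs_above_average - listed_batting_runs

print(round(runs_above_average), round(discrepancy))  # 117 77
```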

It gets even more confusing when one looks at the league total listed at Baseball-Reference for the 1969 AL--a whopping -685 batting runs. I have no idea whether this is a problem with B-R's implementation of Rally's method or something else, but it obviously is an error of some sort.

What is different between Rally's figures and my attempts to replicate them? Obviously, if we knew for sure, this exercise wouldn't be necessary, but it's safe to assume that:

1. Rally is using a different (and probably better) BsR equation than I am
2. His park factors and mine are probably similar, but surely there are differences
3. Rally may be incorporating some additional categories that I've ignored (intentional walks and sacrifices)

However, the likelihood that all of those differences work against Robinson and account for the difference is not that great (not to mention that the 1969 league figures are illogical). I feel a little guilty posting this without first consulting Rally about it, but I did not have his contact information. He's a good sabermetrician and it is quite possible that I am missing something here--but I do think there is enough smoke to warrant some further explanation.

Monday, November 22, 2010


What follows is a very lightweight post, even for one of this nature.

* I have a half-written post somewhere about the generation gap in sabermetrics between people who got into the discipline prior to the explosion of online sources and those who started at some time after that. I've never finished it or posted it because it's not about baseball--it's about sabermetrics, and because one could easily read it as self-aggrandizing (and perhaps even as a sign of old fogeyism). But the themes have manifested themselves a little bit in the reaction to Felix Hernandez winning the Cy Young.

I'm not crazy about looking at the BBWAA votes for an award as any kind of triumph or defeat for sabermetrics, but if you are inclined to view it in those terms, it's tough to see how Hernandez' win was anything but a victory for the discipline. The win craze in Cy Young voting may have reached its zenith after the Stone/Vuckovich/Hoyt selections stopped in the early 1980s, but it never fully died, not with Jack McDowell in 1993 or John Smoltz in 1996 or Bartolo Colon in 2005. The shiniest W-L may not have been the strong Cy indicator it once was, but a good W-L record was still necessary to get a seat at the BBWAA table (provided the pitcher in question was a starter). It was unprecedented that a 13-12 pitcher would get serious consideration.

It's absolutely true that one didn't need FIP or xFIP or SIERA to make a case for Felix Hernandez; ERA, innings pitched, and strikeouts, which have been kept for the last century, were sufficient to make one consider that Hernandez might have been the league's outstanding hurler. Still, it should not be forgotten that the notion that ERA and strikeouts and the like were useful indicators is one embraced by sabermetrics, that had many fewer adherents pre-James than it did in 1990, and many fewer adherents in 1990 than it did in...well, you get the idea.

But for certain members of the community (largely peripheral members, i.e. not the people authoring sabermetric blogs or engaging in their own research), generally those that fall into what could be called (uncharitably to the site) the "Fangraphs generation" of saberites, the notion that actual runs allowed is an acceptable tool by which to evaluate starting pitchers is foreign, as foreign as the notion that W-L was the key evidence was to my generation of sabermetricians.

* Any skirmishes about the baseball awards are a garden party compared to the battle being waged over Horse of the Year between Zenyatta and Blame. I would vote for the latter without a moment's hesitation, and I've yet to see a coherent argument for Zenyatta that is based solely on her 2010 performance. The Zenyatta crowd talks about her "transcending racing" (it's not a popularity contest), or about how she should have won in 2008 or 2009 (arguable, but wrong I believe, and irrelevant to a 2010 award), or about her accomplishments in 2008 and 2009 (beyond irrelevant). Blame ran a more ambitious campaign, beat better horses more times, beat Zenyatta head-to-head, had better speed figures, won more money, and ran exclusively on dirt and at classic distances.

Hernandez/Sabathia is actually not a bad comparison--Sabathia pitched well and would hardly have been the worst selection in the award's history--but outside of W-L record, it was hard to find an area in which he had Hernandez beat. Outside of the fact that she's Zenyatta, it's hard to find an area where she had Blame beat. To the same degree that I was reasonably confident that Felix Hernandez was the best AL pitcher in 2010, I'm reasonably confident that Blame was the best North American thoroughbred of 2010.

* It now looks as if the expanded playoff format is an unstoppable train. Writing on the idea in an earlier post, I said "In this case, not only do I consider the idea stupid, but it would seriously dampen my own enthusiasm for the playoffs."

Reading it back, I realize that was an overreaction. I don't like the idea of an extra wildcard team any more today than I did then, but I do realize that the likelihood of my enthusiasm for the playoffs being dampened is next to zero. If anything, I'll probably be happy to have a few extra games to watch. The allure of the game is too strong, and to make bold statements about my own ability to resist is self-flattery. I'll object with my head, but I'll tune in and I'll like watching the games if not agreeing with their existence--and so will others, and everyone will make money.

Also, it is worth noting that even with ten playoff teams, MLB will still have the lowest proportion of playoff teams among the big four US leagues.

Tuesday, November 16, 2010

IBA Ballot: MVP

I don't see any slam dunk choice for the AL MVP. My initial RAR numbers have Miguel Cabrera at 74, Jose Bautista 71, Josh Hamilton 68, and Robinson Cano 64. Adding in a crude fielding estimate ((UZR + Dewan's RS)/4) puts Hamilton in the lead at 72, followed by Cabrera 70, Bautista 70, Cano 66, and Longoria 62. Hamilton is also hurt by the fact that the initial RAR considers him a left fielder, but he actually played 22% of his innings in center. Refiguring his position adjustment to take this into account, his offense-only RAR is bumped up by a run, leaving him at 73 total.

It also stands to reason that Hamilton contributed as much or more on the bases than his competitors--BP's EqBRR less stolen base runs (steals are already accounted for in my RC formula) has Hamilton +2, Cabrera 0, Cano +1, Bautista -1, and Longoria +3, and thus only increases Hamilton's insignificant edge. It's not a factor that I consider, but Hamilton will almost certainly win the BBWAA award as he played for a playoff team and Cabrera did not.

There's one player left to consider before handing the award to Hamilton--Felix Hernandez. Hernandez' 76 RAR is definitely comparable to Hamilton's grand total of 75 RAR. However, Hernandez' peripherals are not quite as brilliant as his actual runs allowed, and while I have no qualms about choosing a pitcher as MVP, I like it to be a somewhat clear choice. Since the one run difference in RAR is meaningless and the evidence suggests that Hernandez is getting credit for a fair/favorable runs allowed rate, I can't justify going with him.

The bottom of the ballot is just a matter of mixing in the top starting pitchers with the position players, for whom I see little reason to deviate from RAR ranking. The exception is Paul Konerko, who is at 55 RAR but frowned upon by the fielding metrics (-8) and is in front of a bunch of guys who I think most people would agree bring a lot more to the table in every area except batting (Adrian Beltre, Joe Mauer, Shin-Soo Choo, Carl Crawford). I would love to be able to justify getting Choo onto my ballot, but Carl Crawford ranks as his equal at the plate and adds more on the field and the basepaths:

1) LF Josh Hamilton, TEX
2) 1B Miguel Cabrera, DET
3) SP Felix Hernandez, SEA
4) RF Jose Bautista, TOR
5) 2B Robinson Cano, NYA
6) 3B Evan Longoria, TB
7) 3B Adrian Beltre, BOS
8) SP CC Sabathia, NYA
9) SP Jered Weaver, LAA
10) LF Carl Crawford, TB

The battle for top position player in the National League can be fairly safely restricted to three first basemen: Albert Pujols (82 RAR), Joey Votto (71), and Adrian Gonzalez (69). Next on the RAR list is Matt Holliday (61). Pujols has a sizeable lead over Votto in my RAR figures, one that may surprise a lot of readers at first glance, and even I was surprised at the margin.

Looking at their unadjusted batting lines, Votto (.324/.420/.600, 8.8 RG) appears to have the slight offensive edge over Pujols (.312/.414/.596, 8.6 RG). However, Pujols still has a four-run cushion in RAR thanks to an extra nine games played and 52 PA. When park is taken into account, Votto (.319/.414/.591, 8.6) and Pujols (.317/.421/.605, 8.9) essentially exchange raw stat lines with one another.

Consider that over the last five seasons, St. Louis's average RPG is 8.8 at home and 9.4 on the road. Cincinnati's split is 9.6/9.1. The two parks have played as something close to mirror images of one another. Of course park factors can't capture all of the potential influences on those figures--team construction, year-to-year weather fluctuations, chance, etc.--but I don't think it's outlandish to suggest, as my park factors do, that the overall run environment in which Cincinnati plays its schedule is 6% higher than that of St. Louis.
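A rough check of that 6% figure, using only the home/road splits quoted above. This is a deliberately crude sketch, not the park factor method actually used for the RAR figures: it assumes half of a team's games are at home, so only half of the home/road gap carries into the overall run environment, and it ignores regression and the makeup of the road schedule. The function name is hypothetical.

```python
def crude_park_factor(home_rpg, road_rpg):
    """Overall run-environment factor from a home/road RPG split.

    Assumes half of all games are played at home, so the home/road
    ratio is averaged with a neutral 1.0 for the road half.
    """
    return (home_rpg / road_rpg + 1) / 2


stl = crude_park_factor(8.8, 9.4)  # St. Louis five-year split from the text
cin = crude_park_factor(9.6, 9.1)  # Cincinnati five-year split from the text

print(round(stl, 3), round(cin, 3), round(cin / stl, 3))
```

Under these assumptions the Cincinnati environment comes out about 6% higher than St. Louis's, in line with the figure quoted in the text.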

Maybe you don't trust the park adjustment. Maybe you'd prefer to look at each player's performance in the actual run context of his team in 2010, rather than the idealized league average context offered by park adjustments. There are drawbacks to such an approach, most notably that it assumes that each team is equally strong offensively and defensively, but there's an argument to be made that it captures value more effectively than does the neutralization approach. (Bill James made this argument using a fictional Jim Rice as an example in the original Historical Baseball Abstract, and it's something that I intend to ruminate on at some point).

Cincinnati games saw an average of 9.1 runs in 2010 (or 4.55 per team); St. Louis 8.5 (4.25); and throwing in San Diego for good measure, 7.69 (3.85). Using those figures as the new league average, and refiguring HRAA, RAR, and ARG (RG relative to average), the three come out:

Pujols: 70 HRAA, 79 RAR, 203 ARG
Votto: 63, 72, 194
Gonzalez: 53, 61, 184

I have no choice but to conclude that Pujols was the superior offensive player--to the extent that the tools being used capture reality. You can knock a few runs off of Pujols' figure for excess intentional walks, if you'd like, but it's not enough to make the gap disappear. Factoring in other areas of the game doesn't figure to do much to boost Votto--Pujols has a good fielding reputation and a track record of good performance in metrics, although this year the two are both rated as just about average by both UZR and RS, with a one run edge for Votto. BP's figures have Pujols as a +5 baserunner, Votto average.

To swing the comparison in Votto's favor, you either need to put stock in a metric like WPA (Votto was +7, Pujols +5.4) or give Votto a bonus because his team bested Pujols' for the division crown. I do neither.

The other interesting comparison is Pujols v. Halladay. Both have 82 RAR initially, but Pujols would actually pick up a few runs for fielding and baserunning, while Halladay would have to lose a tick for his hitting (-1 RC). Factor in the peripheral issue discussed re: Hernandez, and I favor Pujols. This is the second time in three years that I have listed Halladay second on an MVP ballot (last time, in the 2008 AL, he was ahead of the position players but lost out to Cliff Lee).

Adam Wainwright and Ubaldo Jimenez are also deserving of prominent positions on the ballot. Among the down ballot position players, I allow fielding to have just enough influence to push Troy Tulowitzki ahead of Carlos Gonzalez for Most Valuable Rockie, and to put Ryan Zimmerman ahead of some others (Dan Uggla, Jayson Werth, Hanley Ramirez, David Wright, not Ryan Howard):

1) 1B Albert Pujols, STL
2) SP Roy Halladay, PHI
3) 1B Joey Votto, CIN
4) SP Adam Wainwright, STL
5) SP Ubaldo Jimenez, COL
6) 1B Adrian Gonzalez, SD
7) LF Matt Holliday, STL
8) 3B Ryan Zimmerman, WAS
9) SS Troy Tulowitzki, COL
10) LF Carlos Gonzalez, COL

Monday, November 08, 2010

IBA Ballot: Cy Young

For the Cy Young award, I generally do not consider hitting, although this is more sheer laziness than any strongly held belief that non-pitching aspects of the game shouldn't count towards the Cy. For the majority of pitchers it doesn't really matter (and fielding is at least included in Run Average, even if jumbled up with the other eight guys' glove work).

My suspicion is that the AL Cy will be the award for which my choices most differ from the sabermetric consensus, as I don't make DIPS metrics a primary consideration. My #1 choice is not one of those differences, though. I'll get into the other candidates a bit below, but assume for the sake of argument that the top two candidates are Felix Hernandez and CC Sabathia. Hernandez bests Sabathia in every single category I list on my pitcher report, albeit not always by significant margins:

* Hernandez pitched more innings (249.2 to 237.2)
* Hernandez had a lower ERA (2.34 to 3.12) and a lower RRA (2.96 to 3.39)
* switching to the more traditional RA estimators, Hernandez had a lower eRA (2.97 to 3.48) and a lower dRA (3.50 to 3.82)
* using batted ball inputs, Hernandez had a lower cRA (3.56 to 3.61) and a lower sRA (3.36 to 3.91)
* Hernandez also had a higher percentage of quality starts, which considering it's quality starts and doesn't include a park adjustment isn't something I'd stress, but he leads Sabathia 85% to 76%.
* So of course Hernandez has the margin in RAA (41 to 28) and RAR (76 to 61)

After Felix, it gets a little less clear--Sabathia leads a pack of five pitchers (Jered Weaver, David Price, Clay Buchholz, and Justin Verlander) separated by just five RAR, with two other pitchers cited as candidates (Cliff Lee and Jon Lester) within another five runs of them. Since a lot of saber-minded people consider similar metrics, I added a dRAR column, based on dRA (my BsR application of DIPS). It requires a new innings pitched figure, dIP, which can be figured as (1 - e%H - %W - %HR)*PA/2.84 (see this post for an explanation of the inputs).
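For reference, the dIP arithmetic can be written as a small function. This sketch only applies the formula as stated; the precise definitions of e%H, %W, and %HR live in the linked post and are not reproduced here, so the inputs are treated as given rates, and the sample values are hypothetical round numbers rather than any real pitcher's line.

```python
def dips_innings(pa, exp_hit_rate, walk_rate, hr_rate, divisor=2.84):
    """dIP = (1 - e%H - %W - %HR) * PA / 2.84, as given in the text.

    The three rates are taken as already computed; how each is defined
    is an assumption left to the caller.
    """
    out_rate = 1 - exp_hit_rate - walk_rate - hr_rate
    return out_rate * pa / divisor


# Hypothetical inputs, for illustration only (not a real pitcher).
example_dip = dips_innings(pa=1000, exp_hit_rate=0.21, walk_rate=0.07, hr_rate=0.02)
print(round(example_dip, 1))
```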

I'm sure you'll see a lot of sabermetric ballots that list Felix #1, but then turn to Lee and Liriano due to their strong showing in the DIPS metrics. For me, they are a secondary consideration, enough to move a pitcher ahead of one a few RAR better, but not enough to turn the ballot upside down. Actual runs allowed contain many biases, but they also carry real and important data (at least from a retrospective value perspective) about sequencing (in addition to the more muffled signals about BABIP). It is also worth noting that when batted ball data is considered (another potential minefield, certainly), a pitcher might give back the advantage dRA indicates (Verlander is the best example here, as his sRA (SIERA-style) is 4.09). Weighing all of the metrics very unscientifically, but giving preference to RAR based on actual runs allowed, this is how I see it:

1) Felix Hernandez, SEA
2) CC Sabathia, NYA
3) Jered Weaver, LAA
4) Cliff Lee, TEX
5) David Price, TB

In the National League, I expect to see much more of a consensus as many of the top candidates have peripherals less impressive than their actual runs allowed rate. Unlike the AL in which pitchers like Lee and Liriano have much better DIPS numbers, in the NL a lot of the top starters move in the same direction. Roy Halladay's dRA may be .84 runs higher than his RRA, but Adam Wainwright's is .72, Ubaldo Jimenez's .52, Tim Hudson's a whopping 1.87, Roy Oswalt's .84, Matt Cain's .87, Cole Hamels' .83...this allows us to sideline the ideological debates to a greater extent.

Roy Halladay is the obvious #1 choice, trailing only Josh Johnson in RRA while pitching twenty innings more than anyone else and 67 more innings than Johnson. Not that he should care, but this is actually the first time I've personally ranked Halladay as the top starter in his league. When he won the Cy in 2003, I favored Pedro Martinez or Tim Hudson. In 2005 he was on his way to another Cy Young when he was injured; pitching just 141 innings he still would have ranked second on my ballot. In 2006 he lost a few starts in September and was again second to Johan Santana by my reckoning (although unlike in 2005, Santana was still on pace to earn my vote without Halladay's injury). In 2008 he was second by a slim margin to Cliff Lee, a pitcher he'd eventually become inextricably linked to. In 2009 he had a season so good it would almost always win my vote, but Zack Greinke had to go and turn in the season of the decade.

None of that is to put down Halladay, or say that the 2003 award the BBWAA bestowed upon him was a poor choice (while I favored Pedro, Halladay was a thoroughly defensible choice as well). Rather, it should serve to illustrate how consistently good he's been, and how close he has come to winning three or four Cy Youngs.

After park adjustments, Wainwright and Jimenez are impossibly close--they have the same RA (2.74), nearly the same RRA (2.74 to 2.68), similar ERAs (2.50 to 2.67), the same QS% (76), the same RAA (41) and essentially the same RAR (72 to 71). Jimenez looks a little better in the traditional peripheral RAs, while Wainwright looks better in the (not park-adjusted) batted ball RAs. Flip a coin, because you can't go wrong.

Tim Hudson actually has a below-average dRA thanks to a .249 BABIP, but his batted ball metrics look a little better and there's no obvious candidate to replace him. Josh Johnson had an outstanding year, but pitching forty fewer innings than his competitors consigns him to fifth place:

1) Roy Halladay, PHI
2) Adam Wainwright, STL
3) Ubaldo Jimenez, COL
4) Tim Hudson, ATL
5) Josh Johnson, FLA

Monday, November 01, 2010

IBA Ballot: Rookie of the Year

Over the next few weeks I'll be posting the ballots I submitted to the Internet Baseball Awards, hosted by Baseball Prospectus. While I think too much is made about the post-season awards in general, I also can't deny that they are fun to discuss. Additionally, they present the opportunity to put theories about how to compare players' performance into action, and thus have the potential to stimulate a lot of interesting research and philosophical discussion that can be applied to more general questions. (To be clear, that potential has not been fulfilled here.)

I approach the ROY the same way I do the MVP, except limited to rookies of course. I don't consider age, expected future production, or any related factor. I'm perfectly happy to vote for a 34 year old Japanese reliever if they were one of the five top-performing rookies.

Throughout my award posts, the fielding numbers I use are based on the average of Dewan's Runs Saved and Lichtman's UZR, regressed 50% towards zero; or (RS + UZR)/4. Looking at multiple metrics and regressing does not completely alleviate concerns about fielding metrics, but I would not feel comfortable throwing them out completely.

In the AL, the top position player is Austin Jackson, who I have at 27 RAR. Dewan's system loves his fielding (+21); UZR is not as enthusiastic (+4), but it's enough to push him past Brian Matusz for the top spot on my ballot. Matusz had 29 RAR, while Wade Davis had 30, but Matusz' DIPS/batted ball estimators are stronger, and so that puts him ahead on my ballot.
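As a worked example of the fielding adjustment, here is the (RS + UZR)/4 calculation applied to the Jackson figures quoted above. The function name is hypothetical, and adding the result directly to the 27 RAR is an assumption about how the pieces combine.

```python
def regressed_fielding(runs_saved, uzr):
    """Average of the two metrics, regressed 50% toward zero: (RS + UZR) / 4."""
    return (runs_saved + uzr) / 4


jackson_fielding = regressed_fielding(runs_saved=21, uzr=4)
jackson_total = 27 + jackson_fielding  # 27 RAR before fielding, from the text

print(jackson_fielding, jackson_total)  # 6.25 33.25
```

That total of roughly 33 is what moves Jackson past the two pitchers at 29 and 30 RAR.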

Neftali Feliz is getting some buzz as a candidate, thanks to his saves. I have seen Andrew Bailey's victory last year cited as a reason to support Feliz, but of course, that's a red herring. The comparison should not be of Feliz to a similar past winner, but of Feliz to this year's crop. Without considering leverage, it's not entirely clear that he deserves to rank ahead of another rookie reliever, Daniel Bard. With Jackson at 33 RAR, one would have to give Feliz a leverage multiplier of 1.57 to get them even. His 1.74 LI would suggest a multiplier of about 1.37, which brings him to 29 RAR, roughly equal to Matusz and Davis. I'm still uncomfortable with ascribing that much weight to LI and boosting a reliever who pitched 69.1 IP over a batter with 398 PA playing a demanding position (John Jaso).
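The leverage arithmetic can be reconstructed from the numbers in that paragraph. Two assumptions are made here: the multiplier is taken to be halfway between 1 and the pitcher's LI, (1 + LI)/2, which reproduces the 1.37 quoted for a 1.74 LI but is not spelled out in the text; and Feliz's unleveraged RAR is not stated, so it is inferred from the 1.57 break-even multiplier.

```python
def leverage_multiplier(li):
    """Assumed rule: credit a reliever with half of his leverage above 1."""
    return (1 + li) / 2


jackson_rar = 33        # Jackson's total, from the text
breakeven_mult = 1.57   # multiplier needed to draw even, from the text

# Inferred, not stated in the post: Feliz's RAR before any leverage credit.
feliz_base_rar = jackson_rar / breakeven_mult

mult = leverage_multiplier(1.74)
feliz_leveraged_rar = feliz_base_rar * mult

print(round(mult, 2), round(feliz_base_rar), round(feliz_leveraged_rar))  # 1.37 21 29
```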

Jaso will probably be overlooked by a lot of people, but a catcher with a .376 OBA is nothing to sneeze at. Danny Valencia played well, but he had nearly 70 fewer PA than Jaso, didn't have a significant offensive rate advantage (5.6 to 5.3 RG), and while it doesn't matter retrospectively, his offensive value was largely BA-driven (.313 BA, .205 SEC). I have it:

1) CF Austin Jackson, DET
2) SP Brian Matusz, BAL
3) SP Wade Davis, TB
4) C John Jaso, TB
5) RP Neftali Feliz, TEX

In the National League, the race comes down to Heyward and Posey, so I'll set them aside for a moment to discuss other candidates. Neil Walker checks in at 31 RAR, but -5 fielding and the possible over-adjustment for second basemen in my RAR methodology knocks him off the ballot. Ike Davis was the best rookie first baseman, as far as I can tell, on the basis of his superior OBA to Gaby Sanchez (.358 to .337) and high fielding marks. Chris Johnson's fielding is estimated at -5, which is enough to knock him out of contention, while Starlin Castro's season is more impressive for his age (20) than his performance (albeit not bad at all, 10 RAA and 23 RAR).

Among pitchers, Jaime Garcia stands out at 35 RAR. He will be somewhat overrated by mainstream analysis as just 77% of his runs allowed were earned, the lowest percentage of any NL starter. A 3.60 RRA is quite respectable, though, and his peripherals are similar. Madison Bumgarner was very good as well, turning in 28 RAR in just 110 IP.

I side with Heyward over Posey, largely on the basis of playing time: Heyward played 142 games and batted 611 times, while Posey played 108 games and batted 436 times. It also is important to note that Posey played 35% of his innings at first, which lowers his RAR to 33 versus Heyward's 42. After making that adjustment, Heyward's RG relative to position is 131 versus Posey's 140, which really cuts into Posey's rate stat advantage. Yes, it would have been nice if Posey had spent the whole season in the majors, but Brian Sabean prevented him from contributing for two months, and thus made this an easier choice for me than it seems to be for many others:

1) RF Jason Heyward, ATL
2) C Buster Posey, SF
3) SP Jaime Garcia, STL
4) 1B Ike Davis, NYN
5) SP Madison Bumgarner, SF