Friday, November 26, 2010

The Curious Case of Brooks Robinson's Batting Runs (rWAR)

Colin Wyers of Baseball Prospectus pointed this out to me, and neither he nor I have an explanation for it. Rally's WAR estimates have become the most widely used on the internet, especially since they are available at Baseball-Reference. However, some of the batting runs figures don't make a whole lot of sense, and the specific player that Colin brought to my attention was Brooks Robinson.

Rally lists Robinson with a career total of 20 batting runs (above average). That figure does not include baserunning (0 runs) or GDP runs (-35 runs) or reached on error runs (-2 runs), so I will omit those areas of the game from my estimates which follow. The 20 batting runs seems awfully low. My own crude ERP-based estimate is 154 batting runs with park adjustment, 113 without (I estimate Robinson's career season-weighted PF to be .97, meaning he played in moderate pitcher's parks on average). Colin's estimate is 84 batting runs without park adjustment. Pete Palmer's estimate from the 2005 ESPN Baseball Encyclopedia (which does include base stealing) is 53 batting runs. Wyers used OPS+ to generate a crude estimate of +57.

When I was initially discussing this on Twitter, it completely slipped my mind that my figures were comparing Robinson to all league hitters (including pitchers for 1955-1972). Palmer's estimate and OPS+ exclude pitchers from league totals, and they are the closest to Rally's. Still, a thirty-run difference is fairly large when dealing with offense.

In order to understand why we see discrepancies, it makes sense to attempt to replicate Rally's approach. His explanation of batting runs allows us to get a sense of his process:

Bat runs - This is park adjusted linear weights batting runs, using customized weights at the team level to ensure that total runs credited to players will equal the actual runs scored for that team.

While it is not specified in the quoted entry, Rally has explained elsewhere that he uses Base Runs to generate the linear weights. Since the weights are set so as to ensure that team BsR is equal to actual team runs, I'm going to assume that he's achieving this through the use of a custom B multiplier for each team-season.

I attempted to mimic this through use of a BsR formula that only considered the basic batting events--singles, doubles, triples, homers, walks, and at bats. This formula is far from the most accurate BsR equation ever devised, but it should perform well enough in this role:

A = H + W - HR
B = 2TB - H - 4HR + .05W
C = AB - H
D = HR
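As a rough sketch of how a per-team B multiplier can be found (not necessarily how the original spreadsheet was built), the block below uses the standard Base Runs form BsR = A*B/(B+C) + D with the factors listed above and solves algebraically for the multiplier that makes BsR equal actual runs. The team line and the function names are hypothetical, chosen only for illustration.

```python
def base_runs(h, tb, hr, w, ab, b_mult=1.0):
    """Base Runs with the simple factors above; b_mult scales the B factor."""
    a = h + w - hr
    b = (2 * tb - h - 4 * hr + 0.05 * w) * b_mult
    c = ab - h
    d = hr
    return a * b / (b + c) + d


def solve_b_multiplier(h, tb, hr, w, ab, runs):
    """Return the multiplier x on B such that base_runs(..., b_mult=x) == runs."""
    a = h + w - hr
    b = 2 * tb - h - 4 * hr + 0.05 * w
    c = ab - h
    target = runs - hr  # runs that must come from the A*B/(B+C) term
    # target = a*x*b / (x*b + c)  =>  x = target*c / (b * (a - target))
    return target * c / (b * (a - target))


# Hypothetical team-season line (not real Orioles data).
team = dict(h=1400, tb=2200, hr=150, w=550, ab=5500)
actual_runs = 750

x = solve_b_multiplier(runs=actual_runs, **team)
reconciled = base_runs(b_mult=x, **team)
print(round(x, 4), round(reconciled, 1))
```

The closed-form solution works because only the B factor is being scaled; a multiplier applied elsewhere in the formula would need a different rearrangement.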

Using this equation, I calculated the B multiplier needed for each Orioles season to make BsR = actual runs scored. Then I calculated the corresponding intrinsic weights for each team-season, and used these to estimate Robinson's runs created. From there, I estimated Batting Runs by taking RC - Avg(RC/Out)*Robinson outs.
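The intrinsic-weight step can be sketched as follows. Each event's weight is the partial derivative of BsR with respect to that event, evaluated at the team's totals. The event coefficients are derived from the A, B, C, D factors above; the team line, the multiplier value, the player line, and the league RC/out figure are hypothetical placeholders, not the actual inputs used in this post.

```python
# (a, b, c, d): how much one event adds to each BsR factor above.
COEFFS = {
    "1B": (1, 1.0, 0, 0),
    "2B": (1, 3.0, 0, 0),
    "3B": (1, 5.0, 0, 0),
    "HR": (0, 3.0, 0, 1),   # HR adds to H and subtracts from A, so a = 0
    "BB": (1, 0.05, 0, 0),
    "OUT": (0, 0.0, 1, 0),  # outs here are AB - H
}


def factors(events, b_mult=1.0):
    """Team-level A, B, C, D totals; b_mult scales the B factor."""
    A = sum(COEFFS[e][0] * n for e, n in events.items())
    B = b_mult * sum(COEFFS[e][1] * n for e, n in events.items())
    C = sum(COEFFS[e][2] * n for e, n in events.items())
    D = sum(COEFFS[e][3] * n for e, n in events.items())
    return A, B, C, D


def intrinsic_weights(team_events, b_mult=1.0):
    """Partial derivative of A*B/(B+C) + D with respect to each event."""
    A, B, C, _ = factors(team_events, b_mult)
    weights = {}
    for e, (a, b, c, d) in COEFFS.items():
        b *= b_mult
        weights[e] = ((B + C) * (A * b + B * a) - A * B * (b + c)) / (B + C) ** 2 + d
    return weights


def batting_runs(player_events, weights, lg_rc_per_out):
    """RC from the team's weights, minus league-average RC for the same outs."""
    rc = sum(weights[e] * n for e, n in player_events.items())
    return rc - lg_rc_per_out * player_events["OUT"]


# Hypothetical inputs for illustration only.
team = {"1B": 1000, "2B": 250, "3B": 30, "HR": 150, "BB": 550, "OUT": 4100}
player = {"1B": 110, "2B": 30, "3B": 3, "HR": 20, "BB": 50, "OUT": 430}
w = intrinsic_weights(team, b_mult=0.85)
print({e: round(v, 3) for e, v in w.items()})
print(round(batting_runs(player, w, lg_rc_per_out=0.17), 1))
```

Because this BsR form scales linearly with the event totals, the weights multiplied by the team's own event counts sum back to the team's BsR, which is a convenient sanity check.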

Using this approach, I estimate that Robinson contributed 107 batting runs (without accounting for pitchers and without a park adjustment).

In order to better mimic Rally's approach, I needed to remove pitcher hitting from the league total. To do this, I used the BsR formula to estimate intrinsic linear weights for each league-season, then figured the league RC/O for non-pitchers (I used a spreadsheet published by Terpsfan101 to get the non-pitcher totals). Using those figures as the baseline for Robinson, I got an estimate of 30 batting runs, which isn't that far off of Rally's. However, when a park adjustment is applied, it shoots back up to 71 runs, which is much closer to the Palmer and Wyers estimates.

More concerning was another curiosity that Wyers noted--the 1969 season. Robinson's Orioles are credited with a team total of 40 batting runs. However, they averaged 4.81 R/G in a league with an average of 4.09 R/G, which means that they scored about 117 runs more than average. That's nearly an 80-run discrepancy!
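The arithmetic behind that comparison, assuming a 162-game schedule (the games total is an assumption here, not stated in the post):

```python
GAMES = 162  # assumed schedule length

orioles_rpg = 4.81
league_rpg = 4.09
listed_team_batting_runs = 40  # team total shown at Baseball-Reference

runs_above_average = (orioles_rpg - league_rpg) * GAMES
discrepancy = runs_above_average - listed_team_batting_runs
print(round(runs_above_average), round(discrepancy))
```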

It gets even more confusing when one looks at the league total listed at Baseball-Reference for the 1969 AL--a whopping -685 batting runs. I have no idea whether this is a problem with B-R's implementation of Rally's method or something else, but it obviously is an error of some sort.

What is different between Rally's figures and my attempts to replicate them? Obviously, if we knew for sure, this exercise wouldn't be necessary, but it's safe to assume that:

1. Rally is using a different (and probably better) BsR equation than I am
2. His park factors and mine are probably similar, but surely there are differences
3. Rally may be incorporating some additional categories that I've ignored (intentional walks and sacrifices)

However, the likelihood that all of those differences work against Robinson and account for the difference is not that great (not to mention that the 1969 league figures are illogical). I feel a little guilty posting this without first consulting Rally about it, but I did not have his contact information. He's a good sabermetrician and it is quite possible that I am missing something here--but I do think there is enough smoke to warrant some further explanation.

1. I am learning today that I have a hard time remembering to account for pitcher hitting--it's been pointed out to me that the league totals remove pitcher hitting from the baseline, but include pitchers on the league level. While this explains some of the problem, it doesn't speak well for the value of the team Batting Run totals.

Also, it doesn't fully explain the issue with the 1969 Orioles as a team. The 1969 AL average of BsR-figured RC/O is .173 for non-pitchers. The '69 Orioles made 4053 outs and scored 779 runs, which result in something like 779-.173*4053 = 78 batting runs, still a far cry from the 40 provided by B-R.

2. When I first read this I thought the 1969 league figure had to be an error, but it actually isn't.

I set the baserun linear weights to deal with non-pitchers, and they zero out for those players. But the league total includes pitchers as well. So something like non-pitchers zero, pitchers -685 at the league level.

Another step is to remove any runs from baserunning, ROE, and GIDP from the team totals so that these are not double counted. So if team runs (leaving aside pitcher hitting) is 750 runs, and the team was +15 baserunning, -5 in GIDP, and +5 in ROE, then I assume 735 runs need to be accounted for by the batting events.

Sean Forman's batting wins put Brooks around +50; for a 20+ year career, 30 runs is not an uncommon difference.

3. No, there is no value in the team batting runs, because the intent was to measure the individual, and to put the players in DH leagues on an even playing field with those in non-DH leagues. If I was trying to compare team offense I would have done things differently.

And if you want to ask me questions, you can email me at rallymonkey (numeral five) at comcast dot net.

4. The league average team for 1969 shows -57 batting runs, so the +40 on BB-ref is equivalent to +97.

5. Rally, Patriot corrected himself on the '69 league figures in the comments after I pointed the pitchers issue out to him.

What I - and I think Patriot as well - am interested in is figuring out how you come to the values you do for Brooks. He obviously put more effort into the matter than I did, but neither of us is able to figure out how to go from the underlying components to the results you give.

6. As for why I didn't contact Rally about it directly - I wasn't necessarily meaning to get into a discussion of the issue, we were simply discussing this blog post:

http://www.beyondtheboxscore.com/2010/11/26/1834014/gidp-the-underrated-production-killer

after it had been linked by Primer and I was rather taken aback by the figures for Brooks. It snowballed from there. I had meant to send you something, but my kid's off school today so I've been rather busy.

(Also, am I the only person that writes a comment here, hits post and switches windows before it gets to the CAPTCHA?)

7. A number of leagues without pitcher hitting also don't add up to zero (the discrepancies are of a much smaller magnitude, although the ones I checked are still around 4.5 runs/team). For example, for the 1987-93 AL, the league batting totals are:

98, -99, 65, -99, -20, -74, 11

I realize that this may be a B-R problem and not a problem on your end, as you don't provide the team totals on baseballprojection.com.

"I set the baserun linear weights to deal with non-pitchers, and they zero out for those players. But the league total includes pitchers as well. So something like non-pitchers zero, pitchers -685 at the league level.

Another step is to remove any runs from baserunning, ROE, and GIDP from the team totals so that these are not double counted. So if team runs (leaving aside pitcher hitting) is 750 runs, and the team was +15 baserunning, -5 in GIDP, and +5 in ROE, then I assume 735 runs need to be accounted for by the batting events."

Let's say you have a team that scores 800 runs in a pitcher's league, with the total of +15 baserunning/GDP/ROE runs you described. When you reconcile your BsR, are you reconciling to 800? 785 (this is what I gather from the quote above)? Something else?

IMO, I'm not sure it's a good idea to use the estimated ancillary components as offsets against known runs scored, although I can see the upside to doing it this way. I think it might be a better approach to use unadjusted batting values, and then apply some sort of corrector to the team's total of batting + ancillary.

Don't blame Colin--we were just having a discussion on Twitter, and I'm the one that went and escalated it to blog post level.

"(Also, am I the only person that writes a comment here, hits post and switches windows before it gets to the CAPTCHA?)"

I turned the CAPTCHA on because all of a sudden I was getting 40 Russian spam comments a day. My spam filter was catching them, but then I almost deleted an actual comment by accident. I should probably try turning it back off, since I know I hate it on other sites.

8. "Let's say you have a team that scores 800 runs in a pitcher's league, with the total of +15 baserunning/GDP/ROE runs you described. When you reconcile your BsR, are you reconciling to 800? 785 (this is what I gather from the quote above)? Something else?"

785. If the team scores 800 in an average park where league average is 750, then I want the bat, baserunning, ROE, GIDP to sum to +50. Is that the best way to do it? I don't know. But that's how I did it.
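The reconciliation target described in this reply can be written out as a small sketch. The function name is hypothetical; the numbers are the ones from the example in the thread (+15 baserunning, -5 GIDP, +5 ROE, 800 team runs, 750 league average).

```python
def batting_target(team_runs, baserunning, gidp, roe):
    """Runs the batting events must account for once ancillary runs are removed."""
    return team_runs - (baserunning + gidp + roe)


team_runs = 800
league_average_runs = 750
ancillary = dict(baserunning=15, gidp=-5, roe=5)

target = batting_target(team_runs, **ancillary)

# All components together should sum to the team's runs above average,
# so batting runs absorb whatever the ancillary components do not.
total_above_average = team_runs - league_average_runs
batting_above_average = total_above_average - sum(ancillary.values())
print(target, total_above_average, batting_above_average)
```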

BB-ref has taken the numbers I provided and summed them up for a team up through 2009. For 2010, I believe Sean Forman is just using his linear weights batter runs in the WAR calculation. When we discussed the details, my opinion was that one measure was not better than the other, they usually come to the same results, and since he already calculated one no need to try and implement what may be an overly complex formula.

For Brooks I get the following customized LW values for Baltimore:
1b .49, 2b .80, 3b 1.1, hr 1.42, ubb .32, ibb .13, hbp .35, out -.11

The park factor for 1969 is 1.01 (I just used the baseball-databank figure); it looks like it's the multi-year BPF on BB-ref.

Then, after adjusting the RC from the above formula, subtract .169 * outs (AB-H only) to get runs above average.
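A minimal sketch of the calculation just described, using the weights and the .169 runs-per-out baseline quoted above. Two things are assumptions: the stat line is made up for illustration (it is not Robinson's 1969 line), and the park adjustment is applied here by dividing RC by the park factor, since the comment does not say exactly how the adjustment is made.

```python
# Customized linear weights for Baltimore as quoted in the comment above.
RALLY_WEIGHTS = {
    "1b": 0.49, "2b": 0.80, "3b": 1.1, "hr": 1.42,
    "ubb": 0.32, "ibb": 0.13, "hbp": 0.35, "out": -0.11,
}


def runs_above_average(line, park_factor=1.01, lg_runs_per_out=0.169):
    """Linear-weights RC, park adjusted (assumed: RC / PF), minus the league baseline.

    line["out"] is AB - H only, matching the comment.
    """
    rc = sum(RALLY_WEIGHTS[event] * line.get(event, 0) for event in RALLY_WEIGHTS)
    return rc / park_factor - lg_runs_per_out * line["out"]


# Hypothetical stat line for illustration only.
line = {"1b": 100, "2b": 25, "3b": 3, "hr": 20, "ubb": 50, "ibb": 5, "hbp": 3, "out": 430}

unadjusted = runs_above_average(line, park_factor=1.0)
adjusted = runs_above_average(line)
print(round(unadjusted, 2), round(adjusted, 2))
```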

9. The linear weight values are fairly close to what I got. Using those weights I had the non-pitcher average as .173 R/O, so .169 is also in the neighborhood. I have the PF at .99, which I certainly wouldn't want to argue strenuously is more accurate than 1.01.

Calculating as you just described, I get -14 versus your -13, so no problem there. It's possible that I overreacted to easily explicable differences.

The ancillary category adjustment rubs me the wrong way (just an opinion), as it seems to treat those figures as correct and adjust real runs scored accordingly, placing all the burden for reconciliation on the batting component.

I'm still a little concerned about the team/league figures (again with the caveat that this might well be a problem on B-R's end and not in your methodology). Some of the leagues, even those without pitchers batting, have offensive runs that don't sum to zero. For instance, 1980 AL:

-142 bat, +5 bsr, -13 roe, -18 dp = -168 (12/team)

Of course, once you drill down to the player level the 168 runs is spread pretty thin, and there are systems that don't reconcile and thus start with similar differences.

10. Things aren't going to sum to zero because the numbers are rounded at the player level. So things like +13, -18, etc. are completely expected. That does not explain the -142 for the 1980 AL batting runs; at least I doubt it would.

That one is on me; for years before 2010, BB-ref just took what I provided and loaded it onto the site. 2010 and on is Sean Forman's implementation. That is the worst figure of the DH era before pitchers took to bat again; some of the other years have differences like 2 or 4. I don't have an explanation but I'll let you know what I can find.

11. Thinking about this has reminded me why I don't particularly like reconciling runs created to equal team runs. Doing it by altering linear weight coefficients makes the downside more obvious than when it is covered up by simply multiplying RC by the ratio of team R to RC (as Bill James does).
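For comparison, the proportional approach mentioned above (scaling every player's RC by the ratio of team runs to team RC) can be sketched like this. The player names and RC figures are hypothetical.

```python
def reconcile_proportionally(player_rc, team_runs):
    """Scale each player's RC by team runs / total RC so the sum matches team runs."""
    factor = team_runs / sum(player_rc.values())
    return {name: rc * factor for name, rc in player_rc.items()}


# Hypothetical unreconciled RC estimates and actual team runs.
player_rc = {"A": 120.0, "B": 60.0, "C": 20.0}
team_runs = 210

adjusted = reconcile_proportionally(player_rc, team_runs)
changes = {name: adjusted[name] - player_rc[name] for name in player_rc}
print(adjusted, changes)
```

Every player moves by the same percentage, so the absolute change is largest for the player with the most estimated runs created.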

Rally's approach assumes that the ancillary +/- run figures are without error--but in fact, they are subject to the same kind of error that the coefficient for any event is. All of the weight of reconciliation is borne by the batting events.

In theory, the value of a batting event is the average change in RE due to events of that type. By using BsR to estimate that value, we are doing some combination of 1) acknowledging that the sample size is insufficient in many cases to use empirical values and 2) making the calculations a lot easier (even if you could use a single team-season RE table with confidence, it would be a royal pain). One can view changing the coefficients due to an error in estimating actual runs scored as a shorthand way of adjusting the RE table--but actually doing so would also change the ancillary figures.

The other issue I have is that the process of reconciling has more effect on players that are estimated to create a lot of runs (the percentage change will have a greater effect on them, which can also be seen through the fact that the absolute value of the out will change very little, while the values of the positive events will be more fluid). But I see no obvious reason why this should necessarily be the case.
