Thursday, February 13, 2020

Tripod: Clay Davenport's Equivalent Runs

See the first paragraph of this post for an explanation of this series. The content of this article is also the topic of better, more recent posts.

Equivalent Runs and Equivalent Average are offensive evaluation methods published by Clay Davenport of Baseball Prospectus. Equivalent Runs (EQR) is an estimator of runs created. Equivalent Average (EQA) is the rate stat companion: it is EQR per out, transposed onto a batting average scale.

There seems to be a lot of misunderstanding about the EQR/EQA system. Although I am not the inventor of the system and don't claim to speak for Davenport, I can address some of the questions I have seen raised as an objective observer. The first thing to get out of the way is how Davenport adjusts his stats. Using Davenport Translations, or DTs, he converts the stats of everyone in organized baseball to a common major league. All I know about DTs is that Davenport says that the player retains his value (EQA) after his raw stats are translated (except, of course, that minor league stats are converted to Major League equivalents).

But the DTs are not the topic here; we want to know how the EQR formula works. So here are Clay's formulas, as given in the 1999 BP:

RAW = (H+TB+SB+1.5W)/(AB+W+CS+.33SB)
EQR(absolute) = (RAW/LgRAW)^2*PA*LgR/PA
EQR(marginal) = (2*RAW/LgRAW-1)*PA*LgR/PA
EQA =(.2*EQR/(AB-H+CS))^.4

where PA is AB+W, and LgR/PA is the league's runs per PA, a single rate that multiplies the player's or team's PA
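As a rough sketch, the four formulas above could be written in Python as follows. The function and argument names are illustrative labels rather than anything Davenport published, the stat line in the usage example is invented, and the league values passed in are the 1980-2000 figures quoted later in this article.

```python
def raw(h, tb, sb, w, ab, cs):
    """RAW = (H+TB+SB+1.5W)/(AB+W+CS+.33SB)."""
    return (h + tb + sb + 1.5 * w) / (ab + w + cs + 0.33 * sb)


def eqr_absolute(raw_value, lg_raw, pa, lg_r_per_pa):
    """Absolute EQR: squared adjusted RAW times PA times league R/PA."""
    return (raw_value / lg_raw) ** 2 * pa * lg_r_per_pa


def eqr_marginal(raw_value, lg_raw, pa, lg_r_per_pa):
    """Marginal EQR: (2*ARAW - 1) times PA times league R/PA."""
    return (2 * raw_value / lg_raw - 1) * pa * lg_r_per_pa


def eqa(eqr, ab, h, cs):
    """EQA = (.2*EQR/(AB-H+CS))^.4."""
    return (0.2 * eqr / (ab - h + cs)) ** 0.4


# Hypothetical stat line, invented purely for illustration.
ab, h, tb, w, sb, cs = 550, 160, 260, 60, 10, 4
pa = ab + w
player_raw = raw(h, tb, sb, w, ab, cs)
player_eqr = eqr_marginal(player_raw, lg_raw=0.746, pa=pa, lg_r_per_pa=0.121)
print(round(player_raw, 3), round(player_eqr, 1), round(eqa(player_eqr, ab, h, cs), 3))
```

Passing the league values as arguments keeps the same functions usable with either single-season league figures or long-term constants.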

When I refer to various figures here, like what the league RAW was or what the RMSE of a formula was, they are based on data for all teams 1980-2000. Now, RAW is the basis of the whole method. It has a good correlation with runs scored, and is an odd formula that Davenport has said is based on what worked rather than on a theory.

Both the absolute and marginal EQR formulas lay out a relationship between RAW and runs. The absolute formula is designed to work for teams, where offensive interaction compounds and increases scoring (thus the exponential function). The marginal formula is designed to estimate how much a player has added to the league (and is basically linear). Both formulas, though, try to relate the Adjusted RAW (ARAW, which is RAW/LgRAW) to the Adjusted Runs/PA (aR/PA). This brings in one of the most misunderstood issues in EQR.

Many people have said that Davenport "cheated" by including LgRAW and LgR/PA in his formula. By doing this, they say, you reduce the potential error of the formula by homing it in on the league values, whereas a formula like Runs Created is estimating runs from scratch, without any knowledge of anything other than the team's basic stats. This is true to some extent: if you are doing an accuracy test, EQR has an unfair advantage. But every formula was developed with empirical data as a guide, so they all have a built-in consideration. To put EQR on a level playing field, just take a long term average for LgRAW and LgR/PA and plug that into the formula. For the 1980-2000 period we are testing, the LgRAW is .746 and the LgR/PA is .121. If we use these as constants, the accuracy test will be fair.
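A level-playing-field test of this kind might be set up as in the sketch below: the league values are frozen at the two constants just given, and the estimate is scored by RMSE against actual runs. The three team-seasons are invented placeholders, not real data, and the function names are illustrative.

```python
import math

LG_RAW = 0.746       # long-term league RAW quoted in the article (1980-2000)
LG_R_PER_PA = 0.121  # long-term league R/PA quoted in the article (1980-2000)


def eqr_marginal_fixed(raw_value, pa):
    """Marginal EQR with the league terms held at long-term constants."""
    return (2 * raw_value / LG_RAW - 1) * pa * LG_R_PER_PA


def rmse(estimates, actuals):
    """Root mean squared error between estimated and actual runs."""
    if not estimates or len(estimates) != len(actuals):
        raise ValueError("need two non-empty lists of equal length")
    squared = [(e - a) ** 2 for e, a in zip(estimates, actuals)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical team-seasons: (RAW, PA, actual runs). Made-up numbers.
teams = [(0.760, 6100, 760), (0.730, 6050, 700), (0.790, 6200, 820)]
estimates = [eqr_marginal_fixed(r, pa) for r, pa, _ in teams]
actuals = [runs for _, _, runs in teams]
print(round(rmse(estimates, actuals), 2))
```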

One of the largest (and most widely read) errors in this area is an accuracy test written up by Jim Furtado in the 1999 Big Bad Baseball Annual. Furtado tests EQR both in the way prescribed by Davenport and in the way he converts all rate stats to runs. Furtado takes RAW/LgRAW*LgR/O*O. He also does this for OPS, Total Average, and the like. Davenport railed against this test in the 2000 BP, and he was right to do so. First of all, most stats will have better accuracy if the comparison is based on R/PA, which is why Davenport uses R/PA in his EQR statistic in the first place. In all fairness to Furtado, though, he was just following the precedent set by Pete Palmer in The Hidden Game of Baseball, where he based the conversion of rate stats on innings batted, essentially outs/3. Unfortunately, Furtado did not emulate a good part of Palmer's test. Palmer used this equation to relate rate stats to runs:

Runs = (m*X/LgX+b)*IB*LgR/IB

Where X is the rate stat in question and IB is Innings Batted. m and b are, respectively, the slope and intercept of a linear regression relating the adjusted rate stat to the adjusted scoring rate. This is exactly what Davenport did; he uses m=2 and b=-1. Why is this necessary? Because the relationship between RAW and runs is not 1:1. For most stats the relationship isn't; OBA*SLG is the only one really, and that is the reason why it scores so high in the Furtado study. So Furtado finds RAW to be worse than Slugging Average just because of this issue. The whole study is a joke, really; he finds OPS worse than SLG too! However, when EQR's accuracy comes up, people will invariably say, "Furtado found that..." It doesn't matter. The study is useless.
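To make the role of m and b concrete, here is a minimal sketch of fitting them by ordinary least squares and then applying Palmer's conversion. The helper names are illustrative, the sample points are invented so that the fit lands on m=2 and b=-1, and any per-opportunity basis (IB, PA, or outs) can stand in for the opportunities argument.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def runs_from_rate(x, lg_x, m, b, opportunities, lg_runs_per_opportunity):
    """Runs = (m*X/LgX + b) * opportunities * league runs per opportunity."""
    return (m * x / lg_x + b) * opportunities * lg_runs_per_opportunity


# Invented sample: adjusted rate stat (X/LgX) against adjusted scoring rate.
adjusted_rate = [0.9, 1.0, 1.1]
adjusted_scoring = [0.8, 1.0, 1.2]
m, b = fit_line(adjusted_rate, adjusted_scoring)
print(round(m, 3), round(b, 3))
```

Setting m=1 and b=0 in `runs_from_rate` reproduces the straight proportional conversion, which only fits a stat whose relationship with scoring really is 1:1.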

Now let's move on to a discussion of the Absolute EQR formula. It states that ARAW^2 = aR/PA, and uses this fact to estimate runs. How well does it estimate runs? In the period we are studying, RMSE = 23.80. For comparison, RC comes in at 24.80 and BsR is at 22.65. One thing that is suspicious about the formula is that the exponent is the simple 2. Could we get better results with a different exponent? We can determine the perfect exponent for a team by taking (log aR/PA)/(log ARAW). The median value for our teams is 1.91, and plugging that in gives a RMSE of 23.25.
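The exponent search described above can be sketched in a few lines. The (ARAW, aR/PA) pairs below are invented for illustration, and the function name is a label of convenience. One caution the sketch makes visible: a team with ARAW very close to 1 has log ARAW near zero, so its individual exponent is unstable, which is one reason to take the median rather than the mean.

```python
import math
from statistics import median


def perfect_exponent(araw, adj_runs_per_pa):
    """Exponent x that solves ARAW^x = aR/PA for one team (ARAW must not be 1)."""
    return math.log(adj_runs_per_pa) / math.log(araw)


# Hypothetical (ARAW, aR/PA) pairs for four teams. Made-up numbers.
teams = [(1.05, 1.10), (0.95, 0.91), (1.08, 1.15), (0.92, 0.86)]
exponents = [perfect_exponent(araw, adj) for araw, adj in teams]
print(round(median(exponents), 2))
```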

In the BsR article, I describe how you can find linear values for a non-linear formula. Using the long term stats we used in the BsR article(1946-1995), this is the resulting equation for Absolute EQR:
.52S+.83D+1.14T+1.46HR+.36W+.24SB-.23CS-.113(AB-H)
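One common way to pull linear weights out of a non-linear estimator is the "plus one" approach: add a single event to a baseline stat line and see how much the estimate changes. The sketch below applies that idea to Absolute EQR. It is an assumption that this resembles the procedure from the BsR article, the baseline totals are invented rather than the 1946-1995 figures, and the league RAW is simply set equal to the baseline's own RAW.

```python
# Stat changes produced by one additional event of each type.
EVENT_DELTAS = {
    "S": {"AB": 1, "H": 1, "TB": 1},
    "D": {"AB": 1, "H": 1, "TB": 2},
    "T": {"AB": 1, "H": 1, "TB": 3},
    "HR": {"AB": 1, "H": 1, "TB": 4},
    "W": {"W": 1},
    "SB": {"SB": 1},
    "CS": {"CS": 1},
    "O": {"AB": 1},  # a batting out, i.e. one more AB-H
}


def raw(s):
    """RAW = (H+TB+SB+1.5W)/(AB+W+CS+.33SB) for a dict of counting stats."""
    return (s["H"] + s["TB"] + s["SB"] + 1.5 * s["W"]) / (
        s["AB"] + s["W"] + s["CS"] + 0.33 * s["SB"]
    )


def absolute_eqr(s, lg_raw, lg_r_per_pa):
    """Absolute EQR with the league terms held fixed."""
    pa = s["AB"] + s["W"]
    return (raw(s) / lg_raw) ** 2 * pa * lg_r_per_pa


def plus_one_weights(base, estimator):
    """Change in estimated runs from adding one of each event to the baseline."""
    base_runs = estimator(base)
    weights = {}
    for event, deltas in EVENT_DELTAS.items():
        bumped = dict(base)
        for stat, amount in deltas.items():
            bumped[stat] += amount
        weights[event] = estimator(bumped) - base_runs
    return weights


# Hypothetical team-season totals, invented for illustration.
base = {"AB": 5500, "H": 1450, "TB": 2250, "W": 530, "SB": 110, "CS": 55}
lg_raw = raw(base)
weights = plus_one_weights(base, lambda s: absolute_eqr(s, lg_raw, 0.121))
for event, value in weights.items():
    print(f"{event:>2}: {value:+.3f}")
```

Swapping in a different estimator (Marginal EQR, BsR, RC) only requires passing a different function to `plus_one_weights`.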

Those weights are fairly reasonable, but unfortunately, the Absolute EQR formula isn't. We can demonstrate using BsR that as the OBA approaches 1, the run value of the offensive events converges around 1. We can see the flaw in Absolute EQR by finding the linear weights (LW) for Babe Ruth's best season, 1920:

EVENT   BsR    EQR
S       .68    .74
D      1.00   1.28
T      1.32   1.82
HR     1.40   2.36
W       .52    .47
O      -.22   -.33
SB      .24    .31
CS     -.52   -.68

As you can see, Absolute EQR overestimates the benefit of positive events and the cost of negative events. The reason for this is that the compounding effect in EQR is wrong. When a team has a lot of HR, it also means that runners are taken off base, reducing the potential impact of singles, etc. that follow. The Absolute EQR seems to assume that once a runner gets on base, he stays there for a while, thus the high value for the HR. Besides, the Absolute EQR formula is supposed to work better for teams, but the Marginal EQR formula has a RMSE of 23.23, better than Absolute EQR. So the entire Absolute EQR formula should be scrapped (incidentally, I haven't seen it in print since 1999, so it may have been).

The Marginal formula can also be improved. If we run a linear regression of ARAW to predict aR/PA for our sample, we get:

EQR=(1.9*ARAW-.9)*PA*LgR/PA, which improves the RMSE to 22.89.

Some misunderstanding has also been perpetuated about the linearity of Marginal EQR. Technically, Marginal EQR is not linear, but it is very close to it. If the denominator for RAW were just PA, the formula would be linear, because the denominator would cancel out with the multiplication by PA. But since SB and CS are also included in the denominator, it isn't quite linear. However, since most players don't have high SB or CS totals, the difference is hard to see, so Marginal EQR is essentially linear. Some, myself included, would consider it a flaw to include SB and CS in the denominator. It would have been better, for linearity's sake, to put just PA in the denominator and everything else in the numerator. But Davenport apparently was looking to maximize accuracy, and it may be the best way to go for his goals. One possible solution would be to use the RAW denominator as the multiplier in place of PA, and multiply this by LgR/Denominator. However, I tried this, and the RMSE was 23.04. I'll publish the formula here, where the final term is league runs divided by the league's RAW denominator: EQR = (1.92*RAW/LgRAW-.92)*(AB+W+CS+.33SB)*LgR/(AB+W+CS+.33SB)
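The near-linearity can be checked numerically. A linear estimator is additive: the runs for two stat lines combined equal the sum of the runs for each line. The sketch below compares Davenport's denominator with a PA-only denominator. Both stat lines are invented, and the PA-only variant simply drops CS and .33SB from the denominator to show the cancellation; it is not a proposed replacement formula.

```python
LG_RAW = 0.746       # long-term league RAW quoted in the article
LG_R_PER_PA = 0.121  # long-term league R/PA quoted in the article


def marginal_eqr(line, pa_only_denominator=False):
    """Marginal EQR for a dict of counting stats, with league terms fixed."""
    numerator = line["H"] + line["TB"] + line["SB"] + 1.5 * line["W"]
    pa = line["AB"] + line["W"]
    if pa_only_denominator:
        denominator = pa  # simplified variant used only for this comparison
    else:
        denominator = pa + line["CS"] + 0.33 * line["SB"]
    return (2 * numerator / denominator / LG_RAW - 1) * pa * LG_R_PER_PA


def combine(a, b):
    """Add two stat lines together, stat by stat."""
    return {stat: a[stat] + b[stat] for stat in a}


def additivity_gap(a, b, pa_only_denominator):
    """Runs for the combined line minus the sum of the separate estimates."""
    whole = marginal_eqr(combine(a, b), pa_only_denominator)
    parts = marginal_eqr(a, pa_only_denominator) + marginal_eqr(b, pa_only_denominator)
    return whole - parts


# Hypothetical players: a heavy base stealer and a player who never runs.
runner = {"AB": 500, "H": 150, "TB": 240, "W": 60, "SB": 50, "CS": 15}
slugger = {"AB": 500, "H": 130, "TB": 200, "W": 40, "SB": 0, "CS": 0}
print(round(additivity_gap(runner, slugger, False), 3))  # small but non-zero
print(round(additivity_gap(runner, slugger, True), 3))   # zero up to rounding
```

The gap stays under a run even with an extreme base stealer in the sample, which is consistent with calling the formula essentially linear.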

Now, back to the material at hand, Davenport's EQR. If we find the linear weights for the marginal equation we get:

.52S +.84D+1.16T+1.48HR+.36W+.24SB-.23CS-.117(AB-H)

As was the case with the Absolute formula, I generated these weights through Davenport's actual formula, not my proposed modification using 1.9 and -.9 rather than 2 and -1 for the slope and intercept. I wondered what difference this would make, if any, so I tried it with my formula:

.50S+.80D+1.11T+1.41HR+.35W+.23SB-.22CS-.105(AB-H)

These values seem to be more in line with the "accepted" LW formulas. However, EQR does not seem to properly penalize the CS, which should be more harmful than the SB is helpful.

Finally, we are ready to discuss EQA. Most of the complaints about EQA are along the lines of taking an important value, like runs/out, and putting it on a scale (BA) which has no organic meaning. Also mentioned is that it dumbs people down. In trying to reach out to non-sabermetricians and give them standards that they understand easily, you fail to educate them about what is really important. Both of these arguments have merit. But ultimately, it is the inventor's call. You can convert between EQA and R/O, so if you don't like how Clay publishes it, you can convert it to R/O yourself. R/O = EQA^2.5*5.

Personally, I don't like EQA because it distorts the relationship between players:

PLAYER  R/O   EQA
A       .2    .276
B       .3    .325

Player B has a R/O 1.5x that of player A, but his EQA is only 1.18x player A's, which is the 2.5th root of 1.5.
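The conversion in both directions, and the compression of the ratio, can be verified with a short sketch. The function names are illustrative; the formulas are the EQA definition and its inverse, R/O = EQA^2.5*5.

```python
def eqa_from_runs_per_out(runs_per_out):
    """EQA = (.2 * R/O)^.4."""
    return (0.2 * runs_per_out) ** 0.4


def runs_per_out_from_eqa(eqa):
    """R/O = 5 * EQA^2.5, the inverse of the EQA formula."""
    return 5 * eqa ** 2.5


eqa_a = eqa_from_runs_per_out(0.2)
eqa_b = eqa_from_runs_per_out(0.3)
print(round(eqa_a, 3), round(eqa_b, 3))          # 0.276 0.325
print(round(eqa_b / eqa_a, 2))                   # 1.18, i.e. 1.5 ** 0.4
print(round(runs_per_out_from_eqa(eqa_b), 3))    # back to 0.3
```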

But again, this is a quick thing you can change if you so desire, so I think it is wrong to criticize Davenport for his scale because it is his method.
