Comments on Walk Like a Sabermetrician: Esoteric Ramblings About +

Actually, if individual ERA is ER*9/IP And leagu...

2014-02-08T23:30:45.033-05:00

Actually, if individual ERA is

ER*9/IP

And league ERA is

LgER*9/LgIP,

you get:

ERA+ = (LgER/ER)*(IP/LgIP)
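A quick numeric check of that algebra, as a sketch only: the function names and the pitcher and league totals below are made up for illustration and do not come from any real season.

```python
def era(er, ip):
    """Earned run average: earned runs per nine innings."""
    return er * 9 / ip


def era_plus_ratio(er, ip, lg_er, lg_ip):
    """ERA+ as league ERA divided by individual ERA (unscaled, 1.0 = average)."""
    return era(lg_er, lg_ip) / era(er, ip)


def era_plus_shares(er, ip, lg_er, lg_ip):
    """Same quantity written as (LgER/ER) * (IP/LgIP)."""
    return (lg_er / er) * (ip / lg_ip)


# Hypothetical totals: 60 ER in 200 IP, league 9000 ER in 20000 IP.
er, ip, lg_er, lg_ip = 60, 200.0, 9000, 20000.0
print(era_plus_ratio(er, ip, lg_er, lg_ip))
print(era_plus_shares(er, ip, lg_er, lg_ip))
```

The two forms agree because the factors of 9 cancel, which leaves only the two share terms.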

That is, one way to treat the "weighting" is as an individual's share of IP. The other way to treat the weighting is as the inverse of the individual's share of ER. As I think of it, we should actually compute the league ERA with the individual's own totals removed, as

LgERA = (LgER*9 - ER*9)/(LgIP - IP)

I'm not sure I want to try to decompose that into league and individual effects...
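For what it's worth, here is a minimal sketch of that "league minus the individual" version. The function name and the totals are hypothetical, chosen only to show the size of the adjustment.

```python
def league_era_excluding(er, ip, lg_er, lg_ip):
    """League ERA with one pitcher's ER and IP removed from the league totals."""
    return (lg_er * 9 - er * 9) / (lg_ip - ip)


# Hypothetical totals: 60 ER in 200 IP, league 9000 ER in 20000 IP.
er, ip, lg_er, lg_ip = 60, 200.0, 9000, 20000.0

plain_lg_era = lg_er * 9 / lg_ip
adjusted_lg_era = league_era_excluding(er, ip, lg_er, lg_ip)
pitcher_era = er * 9 / ip

print(plain_lg_era, adjusted_lg_era)
print(plain_lg_era / pitcher_era, adjusted_lg_era / pitcher_era)
```

Removing an above-average pitcher raises the league ERA he is compared against, so his ratio goes up slightly. With league totals this large the shift is small.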

Kincaid and Greg have demonstrated that this post ...

2010-03-30T22:23:33.547-04:00

Kincaid and Greg have demonstrated that this post needed some more proof-reading. At least I got Sean's name right the first time.

I didn't touch on the name confusion issue, which is certainly a valid point. It's probably true that the ship has sailed on the use of ERA+ as a name for anything other than LgERA/ERA.

1. There's no e in Forman, in this case at lea...

2010-03-30T14:16:03.317-04:00

1. There's no e in Forman, in this case at least.

2. The problem with changing the definition of statistics is that you end up causing arguments that happen only because people are using two different definitions. I remember having a long argument with Michael Wolverton on Usenet in the mid-90s that happened only because we were using ERA+ figures from different editions of Total Baseball. (That argument directly resulted in the invention of SNWL, so it turned out to be a good thing.) Regardless of the merits of the versions of the stats, I'd like to avoid confusion. Maybe the new version could be called ERA* instead of ERA+.

(P.S. - You are correct that AERA was used in the ESPN Baseball Encyclopedia for legal reasons only. Of course, OPS+ and ERA+ were listed under a variety of names in the various editions of Total Baseball; it wasn't until I came in for the 7th edition that we settled on ERA+ and OPS+ there.)

Kincaid, you're absolutely right about me reve...

2010-03-29T09:05:02.427-04:00

Kincaid, you're absolutely right about me reversing signs on the equation. Thanks a bunch for pointing that out--I've now corrected both the equations and the accompanying text.

I didn't even consider the ramifications of using a floating exponent, but you're right: that does skew things even more in favor of above-average pitchers, since standard Pyth, on which the approximation is based, overestimates their W%s.

Good summary of ERA+ vs ERA# (and aERA). I like t...

2010-03-29T07:10:01.856-04:00

Good summary of ERA+ vs ERA# (and aERA). I like the way you presented the mathematical implications of each.

For this part:

"W% = R/(2*RA) if R > RA, W% = 1 - RA/(2*R) if R < RA The second equation, for RA < R, is the one that can be directly tied to ERA#."

I think there is a typo where the > and < signs are reversed. R/(2*RA) is for when a team is outscored (R < RA), and 1 - RA/(2*R) is for when a team outscores its opponents (R > RA). Then, in the next sentence, you have it right: the second equation (1 - RA/(2*R)) is ERA#/2, and it is the equation for RA < R, which is the opposite of what the first sentence says. But the next paragraph says ERA# ~ 2*W% for below-average pitchers: that should be for above-average pitchers, when RA < R.
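To spell out the corrected version, here is a small sketch. The function names are mine, and `era_sharp` is inferred from the statement that 1 - RA/(2*R) equals ERA#/2, with ERA standing in for RA and LgERA for R; it is left unscaled here, so 1.0 is average.

```python
def linear_wpct(r, ra):
    """Piecewise linear W% estimate with the inequality signs the right way round."""
    if r >= ra:
        # Team outscores its opponents (RA < R): the branch tied to ERA#.
        return 1 - ra / (2 * r)
    # Team is outscored (R < RA).
    return r / (2 * ra)


def era_sharp(era, lg_era):
    """ERA# implied by ERA#/2 = 1 - RA/(2*R), unscaled (assumed form)."""
    return 2 - era / lg_era


# Hypothetical above-average pitcher: 4.00 ERA in a 5.00 league.
print(linear_wpct(5.0, 4.0), era_sharp(4.0, 5.0) / 2)
# Mirror-image case, where the other branch applies.
print(linear_wpct(4.0, 5.0))
```

For the above-average case the two printed values match, which is the sense in which ERA#/2 is tied to the RA < R branch.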

That only really matters once you start to get close to the negative ERA# range, though; as you noted, it works pretty well in normal situations either way. The errors for the estimation vs. Pythagorean record (with an exponent of 2) are pretty symmetrical around average until you get to the extreme ranges (i.e. the error in the estimation for a 90 ERA# is about the same as for a 110 ERA#, and the error for a 70 ERA# is about the same as for a 130 ERA#).
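A rough way to see that symmetry, as a sketch: it assumes ERA# on the 100 scale is 100*(2 - ERA/LgERA), treats ERA/LgERA as RA/R, and uses ERA#/2 as the estimate on both sides of average. The function names are hypothetical.

```python
def pyth_wpct(run_ratio, z=2.0):
    """Pythagorean W% from the run ratio R/RA with exponent z."""
    return run_ratio**z / (run_ratio**z + 1)


def estimate_error(era_sharp_scaled, z=2.0):
    """(ERA#/2 estimate) minus Pythagorean W% for a given 100-scale ERA#."""
    era_ratio = 2 - era_sharp_scaled / 100  # ERA/LgERA, treated as RA/R
    estimate = era_sharp_scaled / 200       # ERA#/2 expressed as a W%
    return estimate - pyth_wpct(1 / era_ratio, z)


for value in (70, 90, 110, 130):
    print(value, round(estimate_error(value), 4))
```

The 90 and 110 errors come out nearly identical, and so do the 70 and 130 errors, with the gap between each pair growing as you move away from average.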

There is another quirk of the ERA#/2 trick that causes it to skew toward working better for above-average pitchers, though. When you use PythagenPat to estimate W%, the exponent will generally be smaller than 2, and that will change the linear estimate. Using the estimate for z=2 eliminates the symmetry of errors (against PythagenPat) around average, and the estimation becomes better for above-average pitchers and worse for below-average pitchers. So it is probably worth noting that in practice, the ERA#/2 trick actually works better for above-average pitchers, although it's because using the linear estimate for z=2 skews the errors, not so much because the run ratio is greater than or less than 1.
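Here is a sketch of that skew. The exponent 1.8 is an assumed stand-in for "smaller than 2", not a PythagenPat value computed from any actual run environment, and the function names are hypothetical. ERA# is again taken as 100*(2 - ERA/LgERA).

```python
def pyth_wpct(run_ratio, z):
    """Pythagorean W% from the run ratio R/RA with exponent z."""
    return run_ratio**z / (run_ratio**z + 1)


def abs_error(era_sharp_scaled, z):
    """Absolute gap between the ERA#/2 estimate and Pythagorean W% at exponent z."""
    era_ratio = 2 - era_sharp_scaled / 100  # ERA/LgERA, treated as RA/R
    return abs(era_sharp_scaled / 200 - pyth_wpct(1 / era_ratio, z))


ASSUMED_Z = 1.8  # illustrative exponent below 2, not derived from data

for value in (90, 110):
    print(value, round(abs_error(value, 2.0), 4), round(abs_error(value, ASSUMED_Z), 4))
```

At z = 2 the two errors are about equal; at the smaller exponent the error for the 110 ERA# pitcher is the smaller of the two, which is the asymmetry described above.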