Tuesday, December 27, 2005

A Review of "Baseball Superstats 1989"

Baseball Superstats 1989 is a book by Steve Mann that outlines his approach to sabermetric analysis, uses his methods to make predictions for the 1989 season, and displays complete career statistics for players. The “superstats” of the title are his own Runs Above Average figures, based essentially on OPS for hitters and ERA for pitchers. Read sixteen years later, the book does not really provide any new sabermetric insight. Mann did note back then that a “plays made” approach to fielding would result in a very cloudy picture when it came to team fielding evaluation, because there are a certain number of outs that have to be made by somebody. This is now regarded as a truism in sabermetrics, as range factor methods have given way to newer approaches based on play-by-play data, regression analysis, and assorted other techniques. Mann essentially predicts this in the book.

The pitcher ratings are structurally sound; they are essentially equivalent to Pitching Runs, except that Mann has normalized everything to an ideal context of 4.00 ERA, similar to how Clay Davenport would later construct an ideal league for his evaluations. The major quibble I have is with the use of one-year park factors (for hitters too); one line in the book that made me chuckle is “It’s often not until July or August or even September that the run production rate at a ballpark will settle into its true range, causing early evaluations to bounce around wildly.” As if you can tell the true range of a ballpark from 81 games played there and 81 on the road. Now if you want to use a one-year PF based on some consideration, be it about the weather or what have you, go ahead, but don’t try to pass it off as if you know the true range of the park.

The offensive ratings are, well, not so super, but I’m not sure Mr. Mann (at least his 1989 incarnation) would take kindly to hearing this. The book is not at all short on, ahem, naked displays of ego. For example, “[if you] would like to get clean, accurate, reliable answers to your most burning baseball questions, then you’ve come to the right oasis. The superstats are like a cool refreshing dip in a clear blue pool of common sense.” Of course they are. He later writes “There is a smattering of unofficial stats that have been foisted on the public by the baseball media in recent years that are generally even more seriously flawed…we won’t even go into the spate of statistical inventions that has flowed from the fertile minds of Earnshaw Cook, Bob Kingsley, Bill James, Tom Boswell, and other researchers and writers during the last quarter century.”

Now I suppose those two clauses don’t have to be connected, but it seems fairly clear to me at least that they are. Now some of the stuff Cook and James came up with is flawed, but all of it was ground-breaking. And of course Mann leaves his atrocious offensive centerpiece out of this mix.

That offensive centerpiece is the Run Productivity Average. The RPA was developed by Mann based on tallies of runs scored and RBIs generated by various offensive events in that season’s Phillies games. So one team season is used as the baseline, and runs and RBI are used rather than run expectancy. This causes some serious problems. Mann’s method is to see what percentage of singles wind up scoring, and add the average number of RBI per single. That is the single coefficient (the SB and CS coefficients were figured through a different approach which he does not exactly explain). Then he does this for all of the events, sums them, and multiplies by .52 (runs are generally equal to about 52% of R+RBI). This is the base estimate, and is:
(.51S + .82D + 1.38T + 2.63HR + .25W + .29HB + .15SB - .28CS)*.52 = .265S + .426D + .718T + 1.368HR + .13W + .151HB + .078SB - .146CS
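The scaling step is easy to check in a few lines of Python. This is only a sketch of the arithmetic in the formula above; the dictionary and function names are mine, not Mann's.

```python
# Raw R+RBI coefficients per event, as quoted from the book.
RAW_WEIGHTS = {
    "S": 0.51, "D": 0.82, "T": 1.38, "HR": 2.63,
    "W": 0.25, "HB": 0.29, "SB": 0.15, "CS": -0.28,
}

# The .52 multiplier converts the R+RBI tallies into runs.
RUN_SHARE = 0.52


def scaled_weights(raw=RAW_WEIGHTS, share=RUN_SHARE):
    """Return the per-event run weights of the base estimate."""
    return {event: coef * share for event, coef in raw.items()}


if __name__ == "__main__":
    for event, weight in scaled_weights().items():
        print(f"{event:>2}: {weight:+.3f}")
```

Rounded to three places, the printed values should line up with the right-hand side of the equation above.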

As you can see, these weights are very low compared to the RE based weights from Linear Weights, with the exception of the homer, which is already right around where it should be. Incredibly, Mann touts his run/RBI based approach as easier to understand than the RE approach of Pete Palmer, whom Mann refers to reverently throughout the book.

Perhaps it is easier on first blush to understand the weights based on R and RBI, but understanding RE opens the door to a whole world of knowledge in sabermetrics: win expectancy, strategy analysis, linear weights, how context affects event values, etc. Any serious analyst or would-be sabermetrician should make the effort to learn how RE works. Of course, the LW approach also saves complexity later in the process, when Mann has to add bells and whistles to feign accuracy for the RPA.

The first is that the estimate is low by about 52 runs/team, so he adds 52 as the “garbage constant”. Then he also adds a “corrector” for OBA: because the base estimate does not account for outs, it captures neither the benefit of not making outs nor the negative impact of outs on runs scored. The OBA corrector is 3000*(OBA - .330). The reasoning is that each point of OBA above or below the long-term average of .330 adds or subtracts 3 runs. After this, the final formula as Mann writes it is:
RPA = (.51S + .82D + 1.38T + 2.63HR + .25W + .29HB + .15SB - .28CS)*.52 + 52 + 3000*(OBA - .330)
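For anyone who wants to try the formula on team totals, here is a minimal Python sketch of it as written above. I am assuming OBA is figured as (H+W+HB)/(AB+W+HB), the definition used elsewhere in this post, and that hits are the sum of S, D, T, and HR; the function names and the sample team line are made up for illustration.

```python
def oba(h, w, hb, ab):
    """On base average, assumed here to be (H+W+HB)/(AB+W+HB)."""
    return (h + w + hb) / (ab + w + hb)


def rpa(s, d, t, hr, w, hb, sb, cs, ab):
    """Team RPA: scaled base estimate + garbage constant + OBA corrector."""
    base = (0.51 * s + 0.82 * d + 1.38 * t + 2.63 * hr
            + 0.25 * w + 0.29 * hb + 0.15 * sb - 0.28 * cs) * 0.52
    hits = s + d + t + hr
    corrector = 3000 * (oba(hits, w, hb, ab) - 0.330)
    return base + 52 + corrector


if __name__ == "__main__":
    # Hypothetical team line, not taken from any real season.
    estimate = rpa(s=1000, d=250, t=30, hr=150, w=550, hb=30,
                   sb=100, cs=50, ab=5500)
    print(f"Estimated runs: {estimate:.1f}")
```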

How does this formula stack up in an accuracy test? I tried all teams 1980-1989 with the exception of the strike-shortened 1981. The SB version of James’ RC comes in at a RMSE of 25.15. The SB version of ERP comes in at 23.64, and the SB version of BsR at 22.85. Mann’s RPA, with both of the correctors, comes in at 27.82. It is not really in the same ballpark.
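For reference, the RMSE figures quoted here are root mean square errors of estimated versus actual team runs. A minimal sketch of that calculation in Python follows; I am assuming the plain definition over team-seasons with no degrees-of-freedom adjustment, and the numbers in the usage lines are invented purely to show the call.

```python
import math


def rmse(estimated, actual):
    """Root mean square error between estimated and actual team runs."""
    if len(estimated) != len(actual):
        raise ValueError("need one estimate per team")
    squared = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up estimates and actual run totals for three teams.
    print(rmse([700, 650, 810], [710, 640, 800]))
```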

However, Mann publishes a table based on Pete Palmer’s accuracy tests that shows RPA at 22.5, behind Batting Runs and OTS Plus, ahead of RC, DLSI, TA, OPS, DX, etc. I am not quite sure how that worked out, but perhaps the regression equations that Palmer uses helped Mann because if he applied them yearly, the “garbage constant” would have been customized by year rather than a constant 52. I am not sure if this is what happened, but that’s my best guess.

Anyway, looking at the RPA formula, we do not have any way of knowing what the intrinsic linear weights used by the formula are. However, we can very closely approximate this. Each event has a linear weight, but these are complicated by the garbage constant and the OBA corrector. Let’s look at each separately. The garbage constant is 52 for every team. How can we apportion this across a team’s offensive events? Well, the one event that is pretty much a constant for every team is outs. So let’s take 52/(AB-H). This is our first value on the out, and it is positive.

The OBA corrector for a particular team gives them an additional 3000*(OBA-.330) runs. We can calculate this value for any team. For example, the 1980 Orioles had a .3441 OBA (figured as (H+W+HB)/(AB+W+HB)), so their corrector was 42.3 runs. We add these to the 52 above, and now have an out constant of (52+42.3)/(AB-H). For the O’s, this comes to .0232 runs per out, positive. A general equation for a team is (52+3000*(OBA-.330))/(AB-H).
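The two pieces of that out constant can be sketched in Python as below. The function and parameter names are my own. The Orioles' AB-H total is not listed in this post, so the number of outs is left as an argument.

```python
def oba_corrector(oba, baseline=0.330, runs_per_unit=3000):
    """Runs added (or subtracted) by Mann's OBA corrector."""
    return runs_per_unit * (oba - baseline)


def out_constant(oba, outs, garbage=52):
    """Garbage constant plus OBA corrector, spread over a team's outs (AB-H)."""
    return (garbage + oba_corrector(oba)) / outs


if __name__ == "__main__":
    # The 1980 Orioles' OBA as quoted above.
    print(f"Orioles corrector: {oba_corrector(0.3441):.1f} runs")
    # A team at exactly .330 with 4050 outs carries only the garbage constant.
    print(f"Out constant at .330: {out_constant(0.330, 4050):.4f}")
```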

There is another factor we have to consider with respect to the OBA corrector, though, which is that it adds additional value for each on base event and reduces the run estimate for each out. We can differentiate OBA to approximate this. We’ll write OBA as N/P, where N = H+W+HB and P = AB+W+HB. Then the derivative of OBA is dOBA = (P*n-N*p)/P^2, where n is the derivative of N with respect to any event (one for on base events, zero for other events) and p is the derivative of P with respect to any event (one for any batting event). We can simplify those formulas to pdOBA, the derivative for an on base event, and ndOBA, the derivative for a batting out:
pdOBA = ((P-N)/P^2)*3000
ndOBA = (-N/P^2)*3000
I have multiplied by 3000 because the OBA corrector multiplies by 3000. For the Orioles, this means that each on base event raises their RPA by .318 runs and each out reduces their RPA by .167 runs. -.167+.0232 = -.1438, our final out coefficient. We then add .318 to the coefficient of each on base batting event. So we have:
RPA = (.13 + pdOBA)*W + (.151 + pdOBA)*HB + (.265 + pdOBA)*S + (.426 + pdOBA)*D + (.718 + pdOBA)*T + (1.368 + pdOBA)*HR + .078*SB - .146*CS + (ndOBA+52/(AB-H))*(AB-H)

If we use this formula to estimate runs, it has a RMSE of 27.85, just slightly worse than the official RPA formula. The estimates for teams generally agree within 2 or 3 runs (this formula has a RMSE of .252 in predicting RPA).

We can also approximate the adjustments for a hypothetical team. Mann assumes that each team will make 25 outs a game for 4050 for a season with a .330 OBA. This means they will have 4050/(1-.330) = 6044.78 PA, which means they have N = 6044.78-4050 = 1994.78. This gives them a pdOBA of .3325 and a ndOBA of -.1638. Applying this to the above formula, we have this 100% linear equation:
RPA = .598S + .759D + 1.051T + 1.701HR + .463W + .484HB + .078SB - .146CS - .151(AB-H)
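The arithmetic for the hypothetical team can be reproduced with the Python sketch below. It starts from the rounded base weights and the 4050 out, .330 OBA assumptions given above; the names are mine. At a .330 OBA the corrector itself is zero, so only the garbage constant gets spread over outs.

```python
# Rounded base weights for the events that count as reaching base.
BASE = {"S": 0.265, "D": 0.426, "T": 0.718, "HR": 1.368,
        "W": 0.13, "HB": 0.151}


def linearized_weights(outs=4050, oba=0.330, garbage=52, scale=3000):
    """Return (pdOBA, ndOBA, weights) for a team with the given outs and OBA."""
    pa = outs / (1 - oba)        # P, total plate appearances
    on_base = pa - outs          # N, times on base
    pd_oba = (pa - on_base) / pa ** 2 * scale   # value added per on base event
    nd_oba = -on_base / pa ** 2 * scale         # value lost per batting out
    weights = {event: w + pd_oba for event, w in BASE.items()}
    weights["SB"] = 0.078
    weights["CS"] = -0.146
    # The corrector is zero at the baseline OBA, so only the garbage
    # constant is apportioned to outs here.
    weights["OUT"] = nd_oba + garbage / outs
    return pd_oba, nd_oba, weights


if __name__ == "__main__":
    pd_oba, nd_oba, weights = linearized_weights()
    print(f"pdOBA = {pd_oba:.4f}, ndOBA = {nd_oba:.4f}")
    for event, weight in weights.items():
        print(f"{event:>3}: {weight:+.3f}")
```

Because this carries full precision in pdOBA while the equation above works from rounded figures, a printed coefficient can differ from the post's in the third decimal place.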

This formula has a RMSE of 27.71, slightly better than the official version, but still not nearly as accurate as the other run estimators. You can see why, when you compare to linear weights coefficients. RPA overvalues walks, and overvalues home runs with respect to other hits. That’s the basic reason why the RPA does not work very well.

Anyway, the full version of RPA divides by PA. There are also some corrections for applying it to individual hitters, splitting the 52 and the OBA corrector so that players are not treated as teams. But I’m not going to go into that, because this method is not accurate enough to waste my time on it further. And of course, putting it over PA will not give a proper rate stat, as we saw in part 3 of my rate stat series. The OBA corrector compensates for the fact that outs are not considered at all and that the value of reaching base is underweighted; it does not account for the extra PA a player generates by avoiding outs. In that respect RPA is just like ERP, RC, and BsR, which are better (but not precisely) put into a rate by dividing by outs.

Mann then goes on to lay out the superstats for batters, which are said to closely approximate the true values. And he is right that these methods are not that bad. They are basically based on OPS. OBS, which is On Base plus Steals, is used. OBS = (H+W+HB+.5SB-CS)/(AB+W+HB). I’m not exactly sure why steals are included in the OBA, but they are. Then (OBS/LgOBS + SLG/LgSLG - 1) is essentially set equal to the player’s R/PA as a proportion of the league R/PA.

He also offers a quick superstats rate, which replaces the above equation with (2*OPS/LgOPS-1). These equations are perfectly acceptable ways to convert OPS to runs, but OPS has its own problems, as has been well established elsewhere.
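Both versions are simple enough to sketch in Python. By construction each expression equals 1 for a league-average hitter, so I read the result as a ratio to league R/PA. The function names and the sample inputs are hypothetical and not from the book.

```python
def obs(h, w, hb, sb, cs, ab):
    """On Base plus Steals: (H + W + HB + .5*SB - CS) / (AB + W + HB)."""
    return (h + w + hb + 0.5 * sb - cs) / (ab + w + hb)


def relative_rate(obs_player, slg_player, obs_league, slg_league):
    """Full version: OBS/LgOBS + SLG/LgSLG - 1 (1.0 = league average)."""
    return obs_player / obs_league + slg_player / slg_league - 1


def quick_rate(ops_player, ops_league):
    """Quick version: 2*OPS/LgOPS - 1 (1.0 = league average)."""
    return 2 * ops_player / ops_league - 1


if __name__ == "__main__":
    # Made-up player and league figures, for illustration only.
    player_obs = obs(h=150, w=60, hb=5, sb=20, cs=5, ab=500)
    print(f"OBS: {player_obs:.3f}")
    print(f"Full rate: {relative_rate(player_obs, 0.450, 0.330, 0.400):.3f}")
    print(f"Quick rate: {quick_rate(0.830, 0.730):.3f}")
```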

So basically, as far as I am concerned, Mann’s theory behind run production is flawed and overly complex, and his specific execution is nothing special in modern sabermetrics, or even 1989 sabermetrics.

It probably sounds as if I’m hammering the book. That’s not really my intention; I’m just trying to explain his methodology. It is not a bad book; Mann is obviously a smart fellow, and it is a book you might wish to have in your sabermetric library. But it does not live up to its title, and if you don’t read it, you won’t be missing out on any great sabermetric truth you cannot read anywhere else.
