Comments on Walk Like a Sabermetrician: Run Estimation Stuff, pt. 3

Ideally, the BsR B factor does not need to be cali...

2008-07-16T15:41:00.000-04:00

Ideally, the BsR B factor does not need to be calibrated for each particular environment to produce an accurate estimate. Obviously, you can make it return the exact number of runs scored for a given dataset if you calibrate it specifically for that dataset, but that isn't a requirement of using the model.
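As a rough sketch of what "calibrating B to a dataset" amounts to, assume the general Base Runs structure, runs = A*B/(B+C) + D, with A as baserunners, B as advancement, C as outs, and D as home runs. The comment does not spell the formula out, and the function names and team totals below are made up for illustration.

```python
def base_runs(a, b, c, d):
    """Base Runs in its general form: A * B / (B + C) + D."""
    return a * b / (b + c) + d


def b_multiplier_for_exact_fit(a, b, c, d, actual_runs):
    """Multiplier k on B that makes base_runs(a, k*b, c, d) equal actual_runs.

    Solves A * B' / (B' + C) + D = R for B', then returns B' / B.
    """
    scored_baserunners = actual_runs - d  # runs other than the home runs themselves
    if not 0 < scored_baserunners < a:
        raise ValueError("no positive B can reproduce this run total")
    needed_b = scored_baserunners * c / (a - scored_baserunners)
    return needed_b / b


# Hypothetical team totals, not taken from any real season.
a, b, c, d, runs = 1800.0, 1950.0, 4100.0, 160.0, 750.0
k = b_multiplier_for_exact_fit(a, b, c, d, runs)
print(f"uncalibrated estimate: {base_runs(a, b, c, d):.1f}")
print(f"B multiplier for an exact fit: {k:.3f}")
```

The multiplier forces an exact match for that one dataset only; the point of the comment is that the uncalibrated estimate should already be reasonable.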

As a model of the run scoring process, BsR should adapt to any conditions. It doesn't of course--after all, there is no such thing as a perfect model. However, I think that it gives reasonable results for a very wide range of conditions.

The charts in part 4 for the '68 NL and '96 AL, while far from conclusive, demonstrate that Base Runs tracks the empirical linear weights pretty well, with only the SF being particularly troublesome. It certainly matches the empirical coefficients of each event better over disparate conditions than Runs Created does for the period in which it was designed to work.

Fair enough. My point with the last example was j...

2008-07-16T14:51:00.000-04:00

Fair enough. My point with the last example was just that exactly one run scores either way.

My preference for counting bases vs. run expectancy is because the former is valid in the absence of any prior knowledge or assumptions (other than all bases being equal). One argument for BaseRuns is that it works in any environment, but that is true only if you have a large enough and representative enough data set for proper calibration of the B factor.

So far in 2008, Major League teams are scoring an average of 4.54 runs per game. Over the time period 1961-2005 (which you referenced above), the average is 4.39 runs per game. So it seems to me that you already have a 3.4% discrepancy between the basis for your formulas and something to which you presumably want to be able to apply them.
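The 3.4% figure can be checked with a couple of lines. The two averages are the ones quoted in the comment; the function name is just for illustration.

```python
def percent_difference(new, baseline):
    """Difference of new from baseline, as a percentage of baseline."""
    return (new - baseline) / baseline * 100


runs_per_game_2008 = 4.54       # from the comment
runs_per_game_1961_2005 = 4.39  # from the comment

discrepancy = percent_difference(runs_per_game_2008, runs_per_game_1961_2005)
print(f"{discrepancy:.1f}%")  # prints 3.4%
```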

Thanks again.

Using your example, moving from first to second wi...

2008-07-16T09:54:00.000-04:00

Using your example, moving from first to second with no outs has no run value if the next three hitters all strike out. Moving from second to third with two outs is very valuable if the next thing that happens is a wild pitch. Both events are equally valuable if the next hitter singles on a ground ball up the middle.

I don't want to belabor these disagreements too much, because I have absolutely no illusions that I am going to persuade you (*), but the runner would have, more often than not, scored from second on the single with two outs. Sure, there are infield hits, hard-hit singles to left, and cases in which the baserunner is Ernie Lombardi, but more often than not, he will score.

While it is true that using RE is using past results, it also captures a level of reality that is left out by using bases. Which is not to say that the differences won't be subtle, or that the base approach won't give *similar* results.

(*) Just to clarify, that is not a comment on you, it is a recognition that I was not on the debate team and that I'm sure you've thoughtfully considered these issues, and have for whatever reason come to a different conclusion than I have.

Thanks for the thoughtful discussion. Perhaps the d...

2008-07-15T17:50:00.000-04:00

Thanks for the thoughtful discussion.

Perhaps the difference between us is captured by your comment about "serious evaluation of players". I guess that I am more interested in using approaches that are "really cool and convenient", but also accurate enough to be legitimate. It figures, since I am a structural engineer; that is pretty much what I do for a living.

As far as the different values of bases gained and lost, my assumption is that this would tend to take care of itself in the aggregate. Assuming that the scoring rules are unambiguous, you would have an objective way of giving partial run credit to individuals in any environment without having to do statistical analysis beforehand.

Using your example, moving from first to second with no outs has no run value if the next three hitters all strike out. Moving from second to third with two outs is very valuable if the next thing that happens is a wild pitch. Both events are equally valuable if the next hitter singles on a ground ball up the middle.

My point here is that applying run expectancies is essentially using past data to predict future results; assigning BG and BL happens in real time and reflects actual outcomes. The close correlation of average bases and average runs suggests that counting actual bases would be a good way to attribute actual runs.

as long as we use the same metric for everyone--RC...

2008-07-15T15:40:00.000-04:00

as long as we use the same metric for everyone--RC, BsR, or whatever--we are still comparing apples to apples.

The problem that I have with this is that we know that the RC model is biased in favor of certain profiles. Just because everyone is evaluated with the same metric does not wash those biases away. We could rate everyone by (HB + SB)/D and it would be perfectly "fair". It just wouldn't measure anything useful.

Of course, I'm not trying to say that RC measures nothing useful, and I conceded in the original post that it is useful for quick calculations. And the OBA*SLG = runs/at bat, OBA*SLG/(1-BA) = runs/out, and OBA*SLG*(1-OBA)/(1-BA) = runs/PA properties are really cool and convenient. But for serious evaluation of players, they leave a lot to be desired.
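Those three identities are easy to verify numerically. This is a minimal sketch assuming basic RC = (H + BB) * TB / (AB + BB), a simplified OBA with no HBP or SF, outs taken as AB - H, and PA taken as AB + BB. The season line is made up.

```python
import math


def rc_basic(h, bb, tb, ab):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def rate_stats(h, bb, tb, ab):
    """Return (BA, OBA, SLG) under the simplified definitions assumed here."""
    ba = h / ab
    oba = (h + bb) / (ab + bb)  # simplified: ignores HBP and SF
    slg = tb / ab
    return ba, oba, slg


h, bb, tb, ab = 150, 60, 250, 550  # hypothetical season line
ba, oba, slg = rate_stats(h, bb, tb, ab)
rc = rc_basic(h, bb, tb, ab)

per_ab = (rc / ab, oba * slg)                                # runs per at bat
per_out = (rc / (ab - h), oba * slg / (1 - ba))              # runs per out
per_pa = (rc / (ab + bb), oba * slg * (1 - oba) / (1 - ba))  # runs per PA

for direct, from_rates in (per_ab, per_out, per_pa):
    assert math.isclose(direct, from_rates)
```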

The problem with the bases gained/lost line of analysis is that it treats all bases equally. But this is simply not the case. Moving from first to second with no one out benefits your team a lot more than moving from second to third with two outs.

Run Expectancy, the foundation of Linear Weights, enables each play to be valued in a number of runs, based on actual situational data instead of assuming that all bases are created equal. I realize that RE/LW are based on average situations, and that you may be hesitant to say "this play is worth .3 runs" or whatever because of this.

However, assuming an average situation is not a whole lot different than assuming that all bases are equally valuable.

Anyway, just as the linear weight of a single is the average of all of the changes in RE on singles, you can find the average number of bases produced by each event. Tango Tiger figured this for 1999-2002 recently (Bases).

You can see that the average bases correlate very well to the average runs as measured by linear weights. Of the existing offensive metrics, LW is the one that comes closest to matching the results that you would get from your idealized metric.

Of course, run expectancy can also be applied to each play individually, giving each play a different value based on the unique situation in which it occurred, as they do at Fangraphs with BRAA.
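To make the run expectancy idea concrete, here is a small sketch. The value of a play is taken as RE after the play, minus RE before it, plus any runs that scored, and the linear weight of an event is the average of those values. The RE numbers below are placeholders chosen for illustration, not measured values from any season, and the function names are hypothetical.

```python
# Placeholder run expectancy table keyed by (bases occupied, outs).
# These numbers are illustrative only, not empirical.
RUN_EXPECTANCY = {
    ("---", 0): 0.50,
    ("1--", 0): 0.90,
    ("-2-", 0): 1.15,
    ("1-3", 0): 1.80,
}


def play_value(before, after, runs_scored):
    """Run value of one play: change in run expectancy plus runs scored."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before] + runs_scored


def linear_weight(plays):
    """Average run value over all observed plays of one event type."""
    values = [play_value(before, after, runs) for before, after, runs in plays]
    return sum(values) / len(values)


# Three hypothetical singles, each as (state before, state after, runs scored).
singles = [
    (("---", 0), ("1--", 0), 0),  # bases empty
    (("1--", 0), ("1-3", 0), 0),  # runner goes first to third
    (("-2-", 0), ("1--", 0), 1),  # runner scores from second
]
print(f"average value of these singles: {linear_weight(singles):.3f}")
```

Valuing each play individually, as in the last paragraph above, is just `play_value` without the averaging step.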

Well, Bill James is pretty powerful--he may have a...

2008-07-15T13:48:00.000-04:00

Well, Bill James is pretty powerful--he may have a deathray at his disposal! :-)

I take exception to the assertion that various events are each "worth" a particular number of runs. Maybe I am just getting hung up on semantics. It is certainly valid to use linear weights in run estimators, but they are another model that approximates reality--not the Truth in an absolute sense.

Consequently, ten-run discrepancies between different estimators do not bother me. We are not actually measuring real runs anyway--we are measuring overall hitting performance with a single number, which can be characterized as the runs attributable to an individual player. As long as we use the same metric for everyone--RC, BsR, or whatever--we are still comparing apples to apples.

Personally, I think that the most accurate way to assign runs to players would be to count the bases gained and lost by batters and runners within each inning, add up the results, and then divide the difference (BG-BL) by 4. This has the advantage of being 100% accurate at the team level, but obviously requires complete play-by-play data and some basic "scoring" rules.

For example, if a runner advances from first to third on a single, the batter would get 2 BG (himself to first and the runner to second), while the runner would get 1 BG (for taking the extra base). Meanwhile, the pitcher would presumably be charged with all 3 BG; or maybe just the batter's 2 BG, with the defense taking the runner's 1 BG.

Other tricky parts include what to do with errors (perhaps keep these BG in a separate category) and how to distribute BL among the three out-makers when runners are left on base. I am sure that those details could be worked out by the sabermetric community. Does this idea have merit, or am I way off base here?
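A minimal sketch of the bookkeeping described above, for one made-up inning. The scoring rules follow the first-to-third example; splitting the stranded runners' bases equally among the three out-makers is only an assumption, since that detail is left open here. Player labels and the function name are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def runs_from_bases(events):
    """Credit each player with (bases gained - bases lost) / 4.

    events: iterable of (player, bases_gained, bases_lost).
    """
    net = defaultdict(Fraction)
    for player, gained, lost in events:
        net[player] += Fraction(gained) - Fraction(lost)
    return {player: bases / 4 for player, bases in net.items()}


# B is stranded on third (3 bases) and C on second (2 bases).
stranded = Fraction(3 + 2)

inning = [
    ("A", 1, 0),  # A singles
    ("B", 2, 0),  # B singles: himself to first, A moved to second
    ("A", 1, 0),  # A takes the extra base, second to third
    ("C", 5, 0),  # C doubles: himself 2, A scores 1, B first to third 2
    ("D", 0, stranded / 3),  # D, E, F make the outs; the equal split of
    ("E", 0, stranded / 3),  # the lost bases is an assumption
    ("F", 0, stranded / 3),
]

credit = runs_from_bases(inning)
for player, runs in credit.items():
    print(player, runs)
```

The individual credits sum to exactly one run, which matches the one run (A) that scored in this inning.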

That is one of my favorite temporary insanity homo...

2008-07-15T12:25:00.000-04:00

That is one of my favorite temporary insanity homophone mixups of all time. It sounds as if you are going to be shot by a deathray from OBA*SLG/(1 - BA) or something.

I was looking for "faze".

To each his own. However, I will point out for the ...

2008-07-15T09:48:00.000-04:00

To each his own.

However, I will point out for the benefit of other readers that RC/O is starting by assuming (for the 1961-2005 major league totals) that a single is worth .57 runs, a double .88, a triple 1.20, a homer 1.52, a walk just .25, and an out -.115.

It will also tell you that Magglio Ordonez was 84 runs better than an average AL hitter last year, whereas I would say he was around +64 and Palmer's Batting Runs say he was around +62.

But if having ten run discrepancies for the best hitters in a league, solely as a result of which estimator you choose, doesn't phase you, go for it.
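To see what the implied weights quoted above mean for a batting line, here is a sketch that simply applies them linearly. The coefficients are the ones listed in this comment for the 1961-2005 totals; the batting line and the function name are made up.

```python
# Implied RC/O event values quoted in the comment (1961-2005 totals).
IMPLIED_WEIGHTS = {
    "single": 0.57,
    "double": 0.88,
    "triple": 1.20,
    "homer": 1.52,
    "walk": 0.25,
    "out": -0.115,
}


def weighted_runs(line, weights=IMPLIED_WEIGHTS):
    """Sum of event counts times their run weights."""
    return sum(weights[event] * count for event, count in line.items())


# Hypothetical batting line, not a real player.
line = {"single": 100, "double": 30, "triple": 3, "homer": 20, "walk": 60, "out": 400}
print(f"{weighted_runs(line):.1f} runs")
```

Swapping in a different set of weights, for example a walk worth more than .25, shows how much of the gap between estimators comes from the coefficients alone.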

I like the BaseRuns concept, but I will always pre...

2008-07-14T22:29:00.000-04:00

I like the BaseRuns concept, but I will always prefer basic RC per out = OBP*SLG/(1-AVG) because it is both simple and elegant, plus it incorporates all three statistics that are commonly used to characterize the overall hitting performance of a player or team. Multiplying by 25 (not 27) then gives you the approximate RC per game for comparison with the league average, since the denominator of AB-H omits DP, CS, and other outs on the bases. No run estimator can be 100% accurate--there are way too many variables--and this one is certainly close enough for me.
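The per-game conversion described above, as a short sketch. The 25 outs per game figure is the one given in the comment; the slash line and the function name are hypothetical.

```python
def rc_per_game(obp, slg, avg, outs_per_game=25):
    """Basic RC per out, OBP * SLG / (1 - AVG), scaled to a game.

    25 rather than 27 because AB - H leaves out DP, CS, and other
    outs on the bases, as noted in the comment.
    """
    return obp * slg / (1 - avg) * outs_per_game


# Hypothetical hitter with a .270 AVG, .350 OBP, .450 SLG.
print(f"{rc_per_game(0.350, 0.450, 0.270):.2f} RC per game")
```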

As David seemed to be assuming, the "wrong" versio...

2008-07-10T20:12:00.000-04:00

As David seemed to be assuming, the "wrong" version does in fact win--23.37 to 23.77.

The next post will have the RMSEs, FWIW, for all of the formulas I've discussed in the first three posts.
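For readers who want to run the same comparison on their own data, RMSE is just the square root of the mean squared error between estimated and actual runs. A minimal sketch, with made-up team totals:

```python
import math


def rmse(estimated, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical team run totals: estimator output vs. runs actually scored.
estimated_runs = [700, 750, 800]
actual_runs = [710, 740, 830]
print(f"RMSE: {rmse(estimated_runs, actual_runs):.2f}")
```

A lower RMSE over the same set of teams means the estimator's errors are smaller on average, which is the sense in which one version "wins" above.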

Patriot, what is the RMSE of the "wrong" RC versio...

2008-07-10T12:10:00.000-04:00

Patriot, what is the RMSE of the "wrong" RC version and your corrected "B" weights for that RC version?

Thanks. And THT deserves props for making the swi...

2008-07-08T22:38:00.000-04:00

Thanks. And THT deserves props for making the switch, even if it was long overdue...:-)

Another great article. You do a great job explaini...

2008-07-08T18:53:00.000-04:00

Another great article. You do a great job explaining the technical aspects of Run Estimators in a simple and concise manner. The only reason I can come up with why some baseball fans continue to use Runs Created is that there are custom versions of the formula dating back to 1876. I don't think the reason is that RC is simpler to compute and understand. You were right when you said "If it's really simplicity that you crave, a simple linear weights formula can't be beat." Keep up the good work.