Wednesday, June 09, 2021

Rate Stat Series, pt. 5: Linear Weights Background

Linear methods sidestep the issues that arise from applying dynamic run estimators to players by simply ignoring any non-linearity in the run scoring process altogether. While this is clearly technically incorrect, it is closer to reality than pretending that a player’s performance interacts with itself. Since an individual makes up only 1/9 of a lineup, it is much closer to reality to pretend that his performance has no impact on the run environment of his team than to pretend that it defines the run environment of his team. Linear weights also have the advantage of being easy to work with, easy to adapt to different baselines, and easy to understand and build. Their major drawback is that the weights are subject to variation due to changes in the macro run environment (as distinguished from the marginal change to the run environment attributable to an individual player). 

Linear methods were pioneered by FC Lane and George Lindsey, but it was Pete Palmer who used them to develop an entire player evaluation system, publish historical results, and establish them as the chief rival to Runs Created in the 1980s. Curiously (especially since Palmer is a prolific and brilliant sabermetrician whose pioneering work includes park factors, variable runs per win, using the negative binomial distribution to model team runs per game, and more), Palmer’s player evaluation system as laid out in The Hidden Game of Baseball and later Total Baseball and the ESPN Baseball Encyclopedia never bothered to convert its offensive centerpiece Linear Weights into a rate statistic.

This gap contributed to two developments that I personally consider unfortunate. First, confusion about how to convert linear weights to a rate may have hampered the adoption of the entire family of metrics, and this confusion generally persisted until the publication of The Book by Tom Tango, Mitchel Lichtman, and Andy Dolphin. Second, Palmer did offer a rate stat, but it was not tied to linear weights, nor, in its crudest form, to any meaningful units at all: Normalized OPS (later called Production), which you may know as OPS+, was the rate stat coupled with linear weights batting runs.

To my knowledge, Palmer has never really explained why he didn’t derive a rate stat from linear weights; the explanations have instead focused on the ease and reasonable accuracy of OPS. In The Hidden Game, the discussion of linear weights transitions to OPS with “For those to whom calculation is anathema, or at least no pleasure, Batter Runs, or Linear Weights, has a ‘shadow stat’ which tracks its accuracy to a remarkable degree and is a breeze to calculate: OPS, or On Base Average Plus Slugging Percentage.”

Coincidentally, Palmer recently published an article in the Fall 2019 Baseball Research Journal titled “Why OPS Works”, which covers a lot of the history of his development of linear weights and OPS, but still doesn’t explain exactly why a linear weights rate wasn’t part of the presentation.

Without the brilliant mind of Palmer to guide us, where should we turn for a proper linear weights-based rate stat? To answer that question, I think it’s necessary to briefly examine how linear weights work. For this discussion, I am taking for granted that the empirical derivation of linear weights is representative of all linear weight formulas. This is not literally true; in fact, the linear weights I’m using in this series were derived from Base Runs, not from empirical data. If we were using an optimized Base Runs formula, the resulting weights would be very close to empirical weights derived for a similar offensive environment, but other approaches to calculating linear coefficients like multiple regression can deviate significantly from the empirical weights. Even so, the final results are similar enough that the principles hold for reasonable alternative linear weight approaches.

What follows will be elementary for those of you familiar with linear weights, but let’s walk through a sample inning featuring the star of our series, Frank Thomas. I want to use this example to illustrate two properties of linear weights when using the “-.3 type out value” (i.e. when the result is runs above average): the conservation of runs, and the constant negative value of outs. This example simplifies things slightly, as in reality not every event in the inning cleanly maps to a batting event that is included in a given linear weights formula (e.g. wild pitches, balks, extra bases on errors, etc.). It will also presume that the run expectancy table we use for the example corresponds perfectly to our linear weights, which it does not. Still, the principles are generally applicable to properly constructed linear weights methods, even if the weights were derived from other run expectancy tables or, as is the case for us in this series, by another means altogether (I’m using the intrinsic weights derived from Base Runs for the 1994 AL totals).

Baseball Prospectus has annual run expectancy tables; their table for the 1994 majors is:

[Table: 1994 major league run expectancy by base-out state]
On July 18, Chicago came to bat in the bottom of the seventh trailing Detroit 9-5. Their run expectancy for the inning was .5545 as Mike LaValliere stood in against Greg Cadaret. He drew a walk, which raised the Sox RE to .9543, and thus was worth .3998 runs. The rest of the inning played out as follows:

[Table: play-by-play of the rest of the inning, with run expectancy before and after each event and the resulting run values]
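The walk illustrates the basic linear weights accounting: the run value of a play is the change in run expectancy plus any runs that score on it. Here is a minimal sketch in Python using the two RE values quoted above (the function and variable names are mine, for illustration only):

```python
# Run value of a single event: change in run expectancy plus runs scored.
# The RE values are the ones quoted in the text (1994: bases empty and
# runner on first, each with nobody out); everything else is illustrative.

def event_run_value(re_before, re_after, runs_scored):
    return re_after - re_before + runs_scored

# LaValliere's leadoff walk: no runs score, RE rises from .5545 to .9543
walk_value = event_run_value(0.5545, 0.9543, 0)
print(round(walk_value, 4))  # 0.3998
```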
1. If we were going to develop empirical LW coefficients based on this inning, we would conclude that a home run was worth 2.658 runs on average, and thus our linear weight coefficient for a home run would be 2.658. The other events would be valued:

[Table: linear weight run values of the other events in the inning]
This is in fact how empirical LW are developed, but of course a much larger sample size (typically at least an entire league-season) is used.
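As a sketch of that averaging process (the sample below is a toy; only the walk and home run values from our inning are real, and the other two entries are invented for illustration):

```python
# Empirical linear weights: average the observed run value of each event
# type over a large sample of plays. A real derivation would use at
# least a full league-season of events, not four plays.
from collections import defaultdict

def empirical_weights(plays):
    """plays: iterable of (event_type, run_value) pairs."""
    totals, counts = defaultdict(float), defaultdict(int)
    for event, value in plays:
        totals[event] += value
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}

sample = [("W", 0.3998), ("HR", 2.658), ("W", 0.3200), ("HR", 1.9500)]
print({k: round(v, 4) for k, v in empirical_weights(sample).items()})
# {'W': 0.3599, 'HR': 2.304}
```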

2. The team’s runs above average for the inning is always conserved. We started the inning with the bases empty and nobody out for a RE of .5545. This is the same as saying that the average for an inning is .5545 runs scored. The White Sox actually scored 4 runs, and the total of the linear weight values of the plays was 3.4455 runs, which is 4 - .5545. They scored 3.4455 runs more than an average team would be expected to in an inning. The sum of the linear weight values will always match this.

Because of this, we can be assured that the run value of additional plate appearances created by the positive events of the batters has been taken into account in the linear weight values. If this were not the case, runs would not be conserved.
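A quick check of the conservation property, using the figures from our inning (all values are those quoted above):

```python
# Conservation of runs above average: actual runs minus the starting RE
# must equal the sum of the linear weight values of the inning's plays.
starting_re = 0.5545   # bases empty, nobody out
actual_runs = 4        # runs the White Sox actually scored in the inning

raa_for_inning = actual_runs - starting_re
print(round(raa_for_inning, 4))  # 3.4455, matching the sum of play values
```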

3. Since that is true, it is also true that the sum of the LW values of the positive events (which is 4.8128 runs) plus the sum of the LW values of the outs (-1.3673) must be equal to the runs above average for the inning (3.4455). The sum of the values of the outs will be higher in innings in which more potential runs were “undone” by outs, as was the case here. On the other hand, an inning in which three outs are recorded in order will result in -.5545 runs.

We can use this fact to separate the run value of the out into the portion that is due to ending the inning (what Tom Tango has called the “inning killer” effect of the out; this is the -.5545 that is the minimum out value for an inning) and the portion that is due to wasting the run potential of the positive events (what’s left over; in this case, -.8128 runs).
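The decomposition, sketched with the inning’s figures (again, the variable names are mine):

```python
# Split the inning's total out value into the "inning killer" portion
# (the starting RE, lost when three outs end the inning) and the
# wasted-potential portion (run potential of positive events undone).
positive_events = 4.8128   # sum of LW values of the positive events
outs_total = -1.3673       # sum of LW values of the three outs
starting_re = 0.5545

inning_killer = -starting_re              # -0.5545, the minimum for any inning
wasted_potential = outs_total - inning_killer
print(round(wasted_potential, 4))         # -0.8128
print(round(positive_events + outs_total, 4))  # 3.4455 = RAA for the inning
```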

If we wish to convert our linear weights from an estimator of runs above average to an estimator of absolute runs, we need to back the inning-killer value of the out (which is present equally in every inning and serves to conserve total RAA) out of the overall value of the out. The remainder we do not need to adjust, as it would otherwise have to be debited from the value of the positive events in order to conserve runs.

So we can take .5545/3 = .1848 and add it back to the linear weight RAA out value, which for our example was -.3150. This results in an absolute out run value of -.1302 (this arithmetic is sketched in code after the list below). In our example we’re using -.1076; these don’t reconcile because:

1. our linear weights don’t consider all events (we’re ignoring hit batters, sacrifices, all manner of baserunning outs, etc.)

2. our linear weights weren’t empirically derived from the 1994 RE table as the .1848 adjustment was
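Here is that conversion arithmetic in a short sketch (the RE and out values are those quoted above; this illustrates the theoretical bridge, not a formula from any published source):

```python
# Convert the RAA out value to an absolute-runs out value by adding back
# the per-out share of the inning-killer value (.5545 spread over 3 outs).
re_start_of_inning = 0.5545
raa_out_value = -0.3150    # the "-.3 type" out value in our example

per_out_adjustment = re_start_of_inning / 3   # approximately 0.1848
absolute_out_value = raa_out_value + per_out_adjustment
print(round(absolute_out_value, 4))           # -0.1302
```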

While the numbers don’t (and shouldn’t!) balance perfectly in this case, this is the theoretical bridge for converting empirical linear weights from a RAA basis to an absolute runs basis. I would also contend it serves as a demonstration by inductive reasoning that absolute linear weights do not capture the PA generation impact of avoiding outs, but RAA linear weights do.

Note that converting to the “-.1 type out value” does not eliminate negative runs altogether. An offensive player who is bad enough will be credited with negative runs created (if it helps you to imagine what this level of production might look like, consider that the total offensive contributions of pitchers have hovered near zero absolute runs created over the last decade). For real major league position players, this will not happen except due to sample size. If you’d like an interpretation, I have found this helpful (I stole it from someone, probably Tom Tango, and have badly paraphrased): since linear weights fix the values of each event for all members of the team, the level at which runs created go negative is the level at which the weights of the positive events cannot be reduced any further while still conserving team runs; the poor batter essentially undoes some of the positive contributions of his teammates.

As an aside, the first paper I’m aware of that made the connection between the two linear weight approaches in this manner (rather than simply solving algebraically for the difference between the two without providing theoretical underpinning) was published by Gary Skoog in a guest article in the 1987 Baseball Abstract. This article, titled “Measuring Runs Created: The Value Added Approach” is available at Baseball Think Factory.
