Wednesday, November 04, 2020

Musings on Positional Adjustments

This is an old post that I never published. It is sort of an attempt to justify why I use offensive positional adjustments, which is an even more dated position today than it was when I wrote it. In re-reading it, though, I thought my comments about zero-level defense were at least somewhat pertinent (if not particularly insightful) given Bill James' current effort at developing "Runs Saved Against Zero".

This post is not intended to be a comprehensive discussion of the issue of position adjustments; it will just quickly sketch out a system to classify adjustments and then I’ll offer a few of my opinions on them. There is a lot more that could be said, and most of it could be, and has been, said more eloquently by others.

The most important technical distinction between position adjustments (which I’ll shorten to PADJ sometimes) is which type of metric is used to set them--offensive or defensive. This distinction is well-known and gets a lot of attention. One that is talked about less is the difference between explicit and implicit position adjustments, and while people who get their hands dirty with various rating systems are well aware of implicit position adjustments, the average reader presented with a metric might gloss over them.

Explicit position adjustments are obvious and are acknowledged as being position adjustments. The first well-known example of their usage was in Pete Palmer’s linear weights system. They have also been used in VORP, just about every implementation of WAR, and many other metrics.

Implicit position adjustments usually crop up in the work of Bill James, although there are other metrics out there that utilize them. Implicit position adjustments are not really implicit in the truest sense of the word--they are obviously position adjustments if you look at them and consider what their function is. James likes to hide them in his fielding systems.

James’ metrics have always attempted to measure absolute wins and losses. I’ve always maintained that this is a fool’s errand, and that absolute wins and losses only make sense on the team level, not the player level. Most sabermetricians are in general agreement on this, and construct systems that are built to yield results against some baseline.

This is especially true for defensive metrics, whether for pitching or fielding. Absolute metrics (such as runs created) are tempting to apply to individual batters because there is a theoretical minimum on the number of runs a player can create (zero, of course), and such a performance represents the worst possible performance. There is no such cap on the poor performance of a defense; a team could allow an infinite number of runs. The only real cap on the poor performance of an individual fielder is the number of balls that are hit to the locations on the field that fall under his responsibility.

As such, it is impossible to develop a true zero baseline metric to evaluate pitchers or fielders (one can certainly argue that it’s impossible for batters as well, but the existence of the theoretical floor makes it undeniably more tempting). You have to start by comparing to a non-zero baseline (average being the most straightforward), but the problem is compounded for fielders by the fact that it’s also impossible to directly compare fielders at different positions. The fielding standards, whether expressed in fielding average, range factor, or more complex methods, vary wildly from one position to another. While all fielders have the same objective (record outs and prevent the opponent from scoring), the primary ways in which fielders at different positions contribute to the common goal are very different.

That pretty much leaves comparing a player to the average fielder at his position as the only viable starting point for the developer of a fielding metric. As is, the results are not satisfactory for inclusion in a total value metric, because they implicitly assume that an average fielder at any position is equal in value to an average fielder at any other position.

There is no one with any degree of baseball knowledge who believes this to be true. Everyone agrees that an average shortstop is harder to find than an average first baseman--that is, the pool of available players that can adequately field shortstop is much smaller than the pool of adequate available first basemen. This basic truth is sometimes obfuscated by silly hypotheticals (e.g. “if you didn’t have a catcher, every pitch with a runner on base would be a passed ball” and “without a first baseman, it would be nearly impossible to convert a groundball into an out”), but serious people agree on this.

So what is one to do about this problem? You have to do something--you cannot have a functioning estimate of total value that pretends first basemen and shortstops are equal in fielding value. The easiest answer is a position adjustment.

While James attempts to express all of his value metrics relative to an absolute baseline, he of course can’t pull off a clean implementation. His solution, in both his early 1980s Defensive Winning Percentage and his more recent Win Shares, is to develop a fielding winning percentage for each position and convert this to wins and losses (the terminology and procedure are a little different in Win Shares but that’s a long story).

To make the conversion from a rate to a total of wins and losses, James multiplies by a number of games for which each position is assumed to be responsible. Positions on the left side of the defensive spectrum are assigned less responsibility than those on the right side…and thus this is an implicit position adjustment.
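As a rough sketch of that mechanism in Python: the games-of-responsibility figures below are invented for illustration and are not James' actual values, and the function name is my own. The point is only that two fielders with identical winning percentages come out with different win totals because of the games assigned to their positions.

```python
# Hypothetical games of responsibility per position over a season.
# These are made-up numbers, NOT the values from James' systems.
GAMES_RESPONSIBLE = {"1B": 2.0, "3B": 4.0, "SS": 6.0}

def fielding_wins_losses(position, fielding_win_pct, share_of_team_innings=1.0):
    """Convert a positional fielding winning percentage into wins and losses."""
    games = GAMES_RESPONSIBLE[position] * share_of_team_innings
    wins = fielding_win_pct * games
    losses = games - wins
    return wins, losses

# Two exactly average (.500) fielders who each play every inning:
ss_wins, ss_losses = fielding_wins_losses("SS", 0.500)
fb_wins, fb_losses = fielding_wins_losses("1B", 0.500)
```

With these made-up inputs the shortstop is credited with three times as many fielding wins as the first baseman, even though neither was better than average at his own position.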

In pointing this out, I don’t mean to suggest that James is in any way dishonest in describing his systems--the assigned games are clearly defined in the system and aren’t hidden. The characterization I’ve offered of these adjustments as “implicit” is therefore not really accurate. The real difference between James-style position adjustments and the ones I’ve defined as “explicit” is that explicit adjustments either add or subtract a set number of runs dependent upon a player’s position or apply a different baseline to their rate in converting to a value stat.
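The two explicit forms can be sketched side by side. Everything here is an assumption for illustration: the adjustment sizes, the per-600-PA scaling, the runs-per-PA baselines, and the function names are hypothetical rather than taken from Palmer, VORP, or any WAR implementation.

```python
# Form 1: a set number of runs added or subtracted by position.
# Hypothetical values, expressed in runs per 600 plate appearances.
LUMP_RUNS_PER_600_PA = {"SS": 5.0, "1B": -8.0}

# Form 2: a different baseline rate by position.
# Hypothetical values, expressed in runs created per plate appearance.
BASELINE_RUNS_PER_PA = {"SS": 0.110, "1B": 0.132}

def value_lump_sum(runs_above_average, pa, position):
    """Start from runs above the overall average, then add the positional lump sum."""
    return runs_above_average + LUMP_RUNS_PER_600_PA[position] * pa / 600

def value_shifted_baseline(runs_created, pa, position):
    """Compare absolute runs created to a baseline that depends on position."""
    return runs_created - BASELINE_RUNS_PER_PA[position] * pa

# The same 80 runs created in 600 PA looks different at each position.
ss_value = value_shifted_baseline(80.0, 600, "SS")
fb_value = value_shifted_baseline(80.0, 600, "1B")
```

Either form ends up adding or subtracting runs as a function of position and playing time; they differ in where the adjustment enters the calculation.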

The other major characteristic that defines a position adjustment’s type is whether it is an offensive PADJ or a defensive PADJ. The categories are not black and white--many positional adjustments incorporate subjective weighting of various factors, which could include offensive performance by players at a position, the range of offensive performance, the performance of fielders at multiple positions, comparisons of average salary as a stand-in for how teams value players, and subjective corrections that the developer feels better match the results of the system to common sense--but usually the primary basis can be identified as either offensive or defensive.

Offensive position adjustments have fallen out of favor recently, although there are still some people using them (including me). The offensive PADJ originated with Pete Palmer, who used it as part of his linear weights system. The other most prominent use came in Keith Woolner’s VORP.

Defensive positional adjustments are a more recent phenomenon, but are key to both the Fangraphs and Chone WAR methodology. Tango Tiger was the driving force behind their development, and Chone has also done his own research to establish the adjustments for his WAR.

Before deciding how to construct a position adjustment, it’s a good idea to take a step back and figure out why you need a position adjustment at all. Taking the reason behind your metric for granted is a path to just slapping numbers around indiscriminately and failing to model baseball reality. From my perspective, the only real reason that a PADJ is necessary is that it is essentially impossible to measure a player’s fielding value independent of his position. Therefore, one has to have a way of comparing the value of fielding performances across positions--a position adjustment.

A common misperception regarding all position adjustments among people not that well-versed in sabermetrics is that they provide a bonus “just for playing the position”. While I suppose that might be technically true in the sense of calculation, the underlying need for such an adjustment is discussed above. If one does not believe in applying a positional adjustment, and accepts the use of defensive metrics baselined to an average fielder at the position, then one must conclude that, as a group, the most valuable players are those at positions on the left side of the spectrum. Or, in other words, that the overall average value of players at a given position is strictly a function of their aggregate offensive production.

It is possible to complicate the question of position adjustments by talking about baselines (particularly replacement level) and other considerations, but at the heart of the issue is the need to compare the value of a shortstop who is -5 fielding runs relative to an average shortstop, a third baseman who is +10 runs relative to an average third baseman, and an average first baseman.
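To make that comparison concrete, here is a minimal sketch of how a defensive PADJ puts those three fielders on one scale. The adjustment values are hypothetical (they are not the figures from any published WAR system), as is the function name.

```python
# Hypothetical defensive position adjustments, in runs per full season.
# Illustrative only; not the values used by any published system.
DEFENSIVE_PADJ = {"SS": 7.0, "3B": 2.0, "1B": -12.0}

def cross_position_fielding_value(position, runs_vs_position_avg, share_of_season=1.0):
    """Fielding runs on a common scale: runs vs. positional average plus the PADJ."""
    return runs_vs_position_avg + DEFENSIVE_PADJ[position] * share_of_season

# The three fielders from the text: SS at -5, 3B at +10, an average 1B at 0.
players = [("SS", -5.0), ("3B", 10.0), ("1B", 0.0)]
common_scale = {pos: cross_position_fielding_value(pos, runs) for pos, runs in players}
```

Under these assumed adjustments the below-average shortstop (+2) still rates ahead of the average first baseman (-12), while the good third baseman (+12) leads all three. Different adjustment values would of course reorder them, which is why the choice of PADJ matters so much.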

Such a viewpoint suggests that a defensive PADJ is the way to go, since the sole reason for needing the adjustment is consideration of defense. So while the overwhelming positive of a defensive PADJ is that it is defined in the terms that necessitate the entire endeavor, it also carries a few negatives.

One is the difficulty of accurately evaluating fielding value, even within the constraints of one’s own position. While it is quite possible that any biases or methodological errors will balance out when aggregated over a large population of players, it would nonetheless be more comforting to begin from metrics in which one had a great deal of confidence.

Another key issue is that while the pool of players who log time at multiple positions is relatively large when comparing similar position groups (particularly outfielders, but also middle infielders, corner infielders, etc.), there is a much smaller available sample of players who play very different positions, at least in the same or adjacent seasons. And catcher? Forget about it--Bill James even left catcher off of the defensive spectrum due to the difficulty of comparing it directly to the positions whose occupants stand in fair territory.

Players that move positions introduce all kinds of selective sampling issues as well. Consider the problem of comparing positions where left-handers are de facto ineligible, and the fact that players moved off a position are more likely to have been stopgaps. For a more complete discussion of these issues (and an all-around good discussion of PADJ issues), see Colin Wyers’ article at Baseball Prospectus.

Thus, to avoid strange conclusions, defensive position adjustments are always going to require a little subjective massaging. That’s not necessarily a bad thing--the construction of any metric requires subjective decisions made on the part of the developer--but it makes them inherently high maintenance.

Of course, offensive position adjustments are best employed with a measure of caution as well. The main advantage of offensive adjustments is that they are very easy to work with. Offensive statistics are more reliable than fielding statistics, can be calculated from much more basic data, and are available throughout the entire history of the game. Rather than having to compare performance of players across positions, one can at least start by simply looking at the average performance of all players at a particular position.

An offensive PADJ implicitly assumes that teams allocate their talent in such a manner that the average player at any position is equal in total value to the average player at any other position--alternatively stated, that the offensive value gap between positional averages is equal to the defensive value gap. This is never exactly the case for any sample period, particularly for single years. Offensive PADJs based on one year of data or other short stretches should be viewed with a great deal of skepticism.
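A bare-bones version of the calculation might look like the following. The multiplicative form (positional runs per PA divided by the overall rate), the pooled totals, and the names are all assumptions made for the sake of the sketch; the input numbers are invented and do not come from any real league.

```python
def offensive_padj(totals_by_position):
    """Ratio of each position's runs-per-PA to the overall runs-per-PA.

    totals_by_position maps position -> (runs_created, plate_appearances),
    ideally pooled over several seasons rather than taken from a single year.
    """
    total_rc = sum(rc for rc, _ in totals_by_position.values())
    total_pa = sum(pa for _, pa in totals_by_position.values())
    overall_rate = total_rc / total_pa
    return {pos: (rc / pa) / overall_rate
            for pos, (rc, pa) in totals_by_position.items()}

# Invented multi-year totals for three positions (hypothetical data).
pooled = {"1B": (1150.0, 9000), "CF": (1010.0, 9000), "SS": (900.0, 9000)}
padj = offensive_padj(pooled)

# A shortstop's offensive baseline is the overall rate scaled by his position's PADJ.
overall_rate = sum(rc for rc, _ in pooled.values()) / sum(pa for _, pa in pooled.values())
ss_baseline_per_pa = overall_rate * padj["SS"]
```

Because the adjustment is nothing more than an observed positional average, it inherits every quirk of the sample it is drawn from, which is the reason for pooling multiple seasons.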

Another problem lurking is what Wyers, in the linked article, refers to as the “Mays problem”--the existence of supremely talented players that excel at both hitting and fielding. Such players might be superstars at any position (ignoring handedness and other impediments), even first base, thanks to their hitting alone but are able to handle the defensive rigors of right-side defensive spectrum positions. While more ordinary players offer a package of offensive and defensive skills that limits their possible fielding positions commensurate to their offensive production, these players are playable anywhere. There are also potential issues with the very worst players at a position.

The Mays problem skews offensive positional averages, so Wyers proposes using an alternative offensive PADJ that adjusts the overall positional average for the gap between the upper and lower median of observed performance at the position. This approach (and other similar algorithms that could be offered) is novel but involves subjective choices similar to those necessitated by defensive PADJs.

The offensive PADJ will surely fail at lower levels of baseball thanks to the Mays problem--the best player on a high school team, for instance, is often the cleanup hitter, ace pitcher, and center fielder or shortstop when not pitching. Such all-around stars are also more common in college ball or in earlier, less developed major leagues than they are in the modern major leagues with their high overall quality of play and relatively strong competitive balance. Without serious alterations, the approach cannot be trusted in those settings.

There are other relevant issues to discuss with respect to position adjustments, such as their relationship to replacement level and the manner in which they are applied (that is, whether they should be used to change the baseline to which performance is compared or assigned as a lump sum based on playing time; the possible answers to this question are also closely tied to one’s choice of offensive or defensive adjustment), but those will have to wait for some other time.