This blog has existed for sixteen years now, and yet with the exception of some (relatively) recent stuff I’ve written about the Enby distribution for team runs per game and the Cigol approach to estimating team winning percentage from Enby, almost all of the interesting sabermetric work appeared in the blog’s first five years, and most in the first year or two.
There are a number of reasons for that - one is that when I started, I was a college student with a lot more free time on his hands than I have now with a 9-5. Relatedly, I was also more eager to spend a lot of time staring at numbers in my free time when I didn't spend a good portion of my day staring at numbers. Remember the Bill James line about how a column of numbers that would put an actuary to sleep can be made to dance if you put Bombo Rivera's picture on the flip side of the card? Sometimes the numbers do indeed dance, but the actuary in question would rather watch a ballgame or read about the Battle of Gravelines than manipulate them in the evening, dancing or no.
More generally, there has been much less to investigate in the area of sabermetrics that I primarily practice, which I will call, for lack of a better term, "classical sabermetrics". I would define classical sabermetrics as sabermetric study which is primarily focused on game-level (or higher, e.g. season, player career, etc.) data that relates to baseball outcomes on the field (e.g. hits, walks, runs scored, wins). Classical sabermetrics is/was the primary field of inquiry of those I have previously called first or second-generation sabermetricians.
Classical sabermetrics is not dead, but to date the last great achievement of the field was turned in by Voros McCracken when he developed DIPS. I'm not arrogant enough to declare that nothing more will ever be found in the classical field, and there is still much work to be done, but at least as far as I can see, it is highly likely that future work will consist of tinkering with and incrementally improving what has already been done, and probably with little impact on the practical implementation of sabermetric ideas. For example, I still would love to find a modification to Pythagenpat that works better for 2 RPG environments, or a different run estimator construct that would preserve the good properties of Base Runs while better handling teams that hit tons of triples. All of this is quite theoretical, and of no practical value to someone who is attempting to run the Pirates.
Which increasingly is what sabermetric practitioners are attempting to do, whether directly through employment by major league teams, or indirectly through publishing post-classical sabermetric research in the public sphere. Let me be very clear: this is not in any way a lament for a simpler, purer time in the past. I think it’s wonderful that sabermetric analysis has transcended the constraints of the data used in its classical practice and is exerting an influence on the game on the field.
Notwithstanding, I am still a classical sabermetrician, not because I don’t value the insight provided by post-classical sabermetrics but because I don’t have some combination of the skillset or the way of thinking or the resources or the drive to become proficient enough in newer techniques to offer anything of value in that space. Thus it is natural that I have less to share here.
The topic that I am embarking on discussing is squarely in the realm of "quite theoretical and of no practical value to someone who is attempting to run the Pirates". About fifteen years ago, I started writing a "Rate Stat Series", and aborted it somewhere in the middle. I have stated several times that I intend to revisit it, but until now have not. The Rate Stat Series was and now is intended to be a discussion of how best to express a batter's overall productivity in a single rate stat. I should note three things that it is not:
1. The discussion is strictly limited to the construction of a rate stat measuring overall offensive productivity, not a subset thereof. I am not suggesting that if you are measuring a batter’s walk rate, strikeout rate, ground-rule double rate, or any other component rate you can dream up, that you should follow the conclusions here. For most general applications, plate appearances makes perfect sense as the denominator for a rate for any of those quantities. There may be reasons to follow a sort of decision tree approach that results in different denominators for some applications (McCracken was an innovator in this approach, in DIPS and park factors). All of that is well and good and completely outside the scope of this series.
2. The premise presupposes that the unit of measurement of a batter’s productivity has already been converted to a run-basis. Thus it is not a question of OPS v. OTS v. OPS+ v. 1.8*OBA + SLG v. wOBA v. EqA v. TAv v. whatever, but rather what the denominator for a batter’s estimated run contribution should be. The obvious choices are outs and plate appearances, but there are other possibilities. Spoiler alert: My answer is “it depends”.
3. Revolutionary, groundbreaking, or any other similar adjective. I’m attempting to describe my thoughts on methods that already exist and were created by other people in a coherent, unified format.
In sitting down to write this, I realized I made two fundamental mistakes in my first attempt:
1. I was attempting to “prove” my preferences mathematically, which is not a bad thing in theory, but some of what I was doing begged the question and some of this discussion is of a theoretical nature that lends itself more to logical reasoning/“proofs” than to mathematical “proofs”. I’ve tried to anchor my conclusions in math, logic, and reason where possible, but have also embraced that some of it is subjective and must be so.
2. I posted pieces before I finished writing the whole thing, or even knowing exactly where it was going.
These are rectified in this attempt – all of my assertions are wildly unsupported, and as I hit post, all planned installments exist in at least a detailed outline form. While I have attempted to avoid the two mistakes I identified in the previous series, as I look at this series in full I can see I may have just replaced them with two characteristics that will make reading this a real chore:
1. I'm overly wordy, repeating myself a lot and trying to be way too precise in my language (although I fear not as precise as the topic demands). There's a lot of jargon in an attempt to delineate the various concepts and methodological choices.
2. There's way too much algebra. Where possible, I didn't want to simply assert that mathematical operations resolve in a certain way and offer an empirical example that backs me up, so there are a lot of "proofs" that will be of no general interest.
Allow me to close by laying some groundwork for future posts. I am going to use the 1994 AL as the reference league throughout this series, for two reasons:
1. 1994 was the year I became a baseball fan, and I was primarily focused on the AL at the time.
2. As the year in which the “silly ball era” really broke out, and due to the strike shortening the season, there are some fairly extreme performances that are useful when talking about the differences between rate stat approaches.
As discussed, this series starts from the premise that a batter's contribution is measured in terms of runs, and works from there. This approach does not require the use of any particular run estimator, although one of my assertions is that the choice of run estimator and the choice of rate/denominator are logically linked. There are three types of run estimators that I will use in the series: a dynamic model, a linear model, and a hybrid theoretical team model.
In order to avoid differences in the run estimator(s) used unduly influencing differences in the resulting rate stats, I am going to anchor a set of internally consistent run estimators in the reference period of the 1994 AL.
A version of Base Runs I have used is below. It's not perfect by any means; it overvalues extra base hits, as we'll see below. But again, the specific estimator serves only as an example in this series – the thinking behind constructing the resulting rates is what we're after:
A = H + W – HR
B = (2TB - H – 4HR + .05W)*.78
C = AB – H
D = HR
BsR = (A*B)/(B + C) + D
Typically, any reconciliation of Base Runs to a desired number of runs scored for an entity like a league is done using the B factor, since it is already something of a balancing factor in the formula, representing the somewhat nebulous concept of "advancement", while the other components (A = baserunners, C = outs, D = automatic runs) represent much more tightly defined quantities. In order to force the Base Runs estimate for the 1994 AL to equal the league's actual runs scored, we can solve for the required B value:
Needed B = (R – D)*C/(A – R + D)
Divide this by (2TB – H – 4HR + .05W) and you get a .79776 multiplier. I usually don't force the estimated runs to equal the actual runs, but for this series, I want to be internally consistent across all of the estimators and also be able to write formulas using league runs rather than having to worry about any discrepancies between league runs and estimated runs.
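For concreteness, here is a minimal Python sketch of this reconciliation; the function name is mine, and the league totals must be supplied (I have not reproduced the actual 1994 AL line here):

```python
def b_multiplier(S, D, T, HR, W, AB, R):
    """Solve for the multiplier that forces Base Runs to equal actual runs R."""
    H = S + D + T + HR
    A = H + W - HR                    # baserunners
    TB = S + 2*D + 3*T + 4*HR         # total bases
    raw_B = 2*TB - H - 4*HR + .05*W   # unscaled "advancement" factor
    C = AB - H                        # outs
    # BsR = A*B/(B + C) + HR; set BsR = R and solve for the needed B:
    needed_B = (R - HR) * C / (A - R + HR)
    return needed_B / raw_B
```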
So our dynamic run estimator (BsR) used throughout this series will be:
A = H + W – HR = S + D + T + W
B = (2TB - H – 4HR + .05W)*.79776 = .7978S + 2.3933D + 3.9888T + 2.3933HR + .0399W
C = AB – H = Outs
D = HR
BsR = (A*B)/(B + C) + D
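As a convenience, the same estimator as a small Python function (a sketch; the name and signature are mine):

```python
def base_runs(S, D, T, HR, W, outs):
    """Base Runs with the series' reference multiplier; returns estimated runs."""
    A = S + D + T + W                            # baserunners
    B = .79776 * (S + 3*D + 5*T + 3*HR + .05*W)  # advancement
    C = outs
    return A * B / (B + C) + HR                  # HR are the automatic runs
```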
To be consistent, I will also use the intrinsic linear weights for the 1994 AL implied by this Base Runs equation, found by taking the partial derivative of Base Runs with respect to each event. Using lowercase letters to denote each event's contribution to the corresponding component (a = change in A per event, and likewise for b, c, and d), the weight for any event is:
LW = ((B + C)*(A*b + B*a) – A*B*(b + c))/(B + C)^2 + d
For the 1994 AL, this produces:
LW_RC = .5069S + .8382D + 1.1695T + 1.4970HR + .3495W - .1076(outs)
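To illustrate how these coefficients fall out of the formula above, here is a sketch in Python; the event table and function name are mine, and the league totals of A, B, C, and D (with B already scaled by .79776) must be supplied:

```python
# Intrinsic linear weights from the Base Runs components. A, B, C, D are
# league totals; (a, b, c, d) are one event's contributions to them.
EVENTS = {
    # event: (a, b, c, d)
    "S":   (1, .79776 * 1.00, 0, 0),
    "D":   (1, .79776 * 3.00, 0, 0),
    "T":   (1, .79776 * 5.00, 0, 0),
    "HR":  (0, .79776 * 3.00, 0, 1),   # a = 0 since A = H + W - HR
    "W":   (1, .79776 * 0.05, 0, 0),
    "out": (0, 0.0,           1, 0),
}

def intrinsic_weights(A, B, C, D):
    """Evaluate LW = ((B+C)*(A*b + B*a) - A*B*(b+c))/(B+C)^2 + d per event."""
    return {
        ev: ((B + C)*(A*b + B*a) - A*B*(b + c)) / (B + C)**2 + d
        for ev, (a, b, c, d) in EVENTS.items()
    }
```

Evaluated at the 1994 AL totals, this reproduces the coefficients above (e.g. roughly +.5069 for a single and -.1076 for an out).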
We will also need a version of LW expressed in the classic Pete Palmer style to produce runs above average rather than absolute runs. That’s just a simple algebra problem to solve for the out value needed to bring the league total to zero, which results in:
LW_RAA = .5069S + .8382D + 1.1695T + 1.4970HR + .3495W - .3150(outs)
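One way to see the algebra: since the absolute weights were constructed to sum to league runs, zeroing the league total just means adding league runs per out to the absolute out value. A minimal sketch (function name mine):

```python
def raa_out_value(lg_runs, lg_outs, abs_out_value=.1076):
    """Out value that brings league linear-weight runs above average to zero."""
    return abs_out_value + lg_runs / lg_outs

# For the 1994 AL this yields .1076 + lgR/lgOuts = .3150,
# implying league runs per out of about .2074.
```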
I am ignoring any questions about what the appropriate baseline for valuing individual offensive performance is. Regardless of where you side between replacement level, average, and other less common approaches, I hope you will agree that average is a good starting point, and one from which a conversion to an alternative baseline is usually much easier than the reverse. Average is also the natural starting point for linear weights analysis, since the empirical technique of calculating linear weights based on average changes in run expectancy is by definition going to produce an estimate of runs above average.
Later we will also have some “theoretical team” run estimators built off this same foundation, but discussion of them will fit better when discussing that concept in greater detail.
I will also be ignoring park factors and the question of context in this series (at least until the very end, where I will circle back to context). Since I am narrowly focused on the construction of the final rate stat, rather than a full-blown implementation of a rating system for players, park factors can be ignored. Since I am anchoring everything in the 1994 AL, the run environment is held constant throughout in any event.