Wednesday, April 28, 2021

Rate Stat Series, pt. 2: PA Generation

This is a little bit of a detour and certainly nothing new (I don’t know who originally laid out this logic/math – the earliest use I’m aware of was in 1960 by D’Esopo & Lefkowitz as part of their Scoring Index model), but I think a discussion of it is appropriate in the context of this series, and I will later make use of these formulas.  It’s also ground I covered in the original series, but I think my explanation this time is slightly more coherent.

Each batting team starts each inning (excluding scenarios where a walkoff is possible) with three plate appearances guaranteed. Thus each team starts each game with twenty-seven plate appearances guaranteed (excluding scenarios where the home team forgoes batting the bottom of the ninth, rainouts, post-2020 doubleheaders, etc.). Any plate appearances beyond that must be earned by batters avoiding outs. Since it’s more natural to think of a positive outcome rather than the avoidance of a negative outcome, I will simplify and say that each extra plate appearance must be earned by a batter reaching base (and not being subsequently retired on the basepaths).

For the sake of discussion (and keeping with the simple set of statistics being used in the metrics in this series), I’m going to ignore the existence of baserunning outs, including caught stealing, pickoffs, outs stretching, outs advancing, and runners retired on double/triple plays (although not on fielder’s choices, since the batter is charged with an out in that case). I’m going to assume that the out rate is the complement of on base average, which in this series will be defined simply as (H + W)/(AB + W). In reality, considering all the ways in which outs can be made, it would be a more involved equation (I’ve used the acronym NOA for Not Out Average and OA for the complement, Out Average) which would look something like this, although I still don’t think I’ve accounted for every possible event (you try incorporating fielders’ choices without complicating the equation significantly):

NOA = (H + W + HB + CI + ROE – CS – DP – Outs Stretching – Outs Advancing – Pickoffs – 2*TP)/(AB + W + HB + SF + SH + CI)

Alternatively, for a team when LOB data is available (and ignoring the walkoff situation), you could have OA = (Plate Appearances – Runs Scored – Left On Base)/Plate Appearances. All of this is just an attempt to calculate, as best we can from the available statistics we have restricted ourselves to, Outs/Plate Appearances. NOA or OA as appropriate could be substituted for OBA in the equations that follow as long as the appropriate corresponding adjustments are made to the numerator.
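A minimal Python sketch of the two simplest out-rate calculations above: the series' simple OBA, and the team OA from left-on-base data. The function names and the sample inputs are hypothetical and purely illustrative; they are not actual 1994 AL figures.

```python
def simple_oba(h, w, ab):
    """On base average as defined in this series: (H + W)/(AB + W)."""
    return (h + w) / (ab + w)


def team_out_average(pa, runs, lob):
    """Team out average from LOB data: (PA - R - LOB)/PA.

    Ignoring walkoffs, every batter either scores, is left on base,
    or is out, so whatever is not a run or a runner left on base
    must have been an out.
    """
    return (pa - runs - lob) / pa


# Made-up inputs, for illustration only.
oba = simple_oba(h=150, w=50, ab=500)
oa = team_out_average(pa=40, runs=5, lob=8)
print(f"OBA = {oba:.3f}, implied out rate = {1 - oba:.3f}")
print(f"Team OA from LOB = {oa:.3f}")
```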

Let’s assume for the purpose of developing an equation for team plate appearances that the OBA is constant across each of the nine batters in the lineup and doesn’t vary for any other reason (this is obviously never true, but it is a fine simplifying assumption for modeling PA generation). Then a team will start an inning with three plate appearances. For each of those three guaranteed PAs, there is a probability (equal to OBA, given our assumption) that the batter avoids an out (reaches base, given that there are no baserunning outs). This increases the expected number of plate appearances by OBA.

It doesn’t stop there, though. Each additional PA that is generated also has an OBA chance of creating an additional PA, which itself has an OBA chance of creating an additional PA. Thus, for each of the guaranteed PA, the expected number of additional team PA is:

OBA + OBA*OBA + OBA*OBA*OBA + … = OBA + OBA^2 + OBA^3 + … + OBA^n

which when n is infinity and OBA is between 0 and 1 (which it must be by definition) resolves to:

OBA/(1 – OBA)
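To see the series converge, here is a small Python sketch that compares partial sums against the closed form. The function names are hypothetical, and the OBA of .300 is an arbitrary illustrative value rather than a figure for any real league.

```python
def extra_pa_partial_sum(oba, n_terms):
    """Sum OBA + OBA^2 + ... + OBA^n_terms."""
    return sum(oba ** k for k in range(1, n_terms + 1))


def extra_pa_closed_form(oba):
    """Limit of the series as n goes to infinity: OBA/(1 - OBA)."""
    if not 0 <= oba < 1:
        raise ValueError("OBA must be in [0, 1) for the series to converge")
    return oba / (1 - oba)


oba = 0.300  # arbitrary illustrative value
for n in (1, 2, 5, 10, 50):
    print(n, round(extra_pa_partial_sum(oba, n), 6))
print("closed form:", round(extra_pa_closed_form(oba), 6))
```

The partial sums climb quickly toward the closed form because each term shrinks by a factor of OBA.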

The 1994 AL had an OBA of .343. Thus, each guaranteed plate appearance should have generated .343/(1 - .343) = .522 additional plate appearances. In an average inning, starting with three guaranteed PA, we would expect 3 + 3*.522 = 3*(1 + .522) = 4.566 PA, and thus in a game we would expect 9*4.566 = 41.09 PA. Note that instead of calculating the .522 additional PA, we can simplify this to 3/(1 – OBA) for an inning or 27/(1 – OBA) for a game. In reality there were 39.24 PA, so we have an unacceptable 4.7% error. What went wrong?

I mixed definitions of plate appearances and definitions of OBA incorrectly, and also ignored that the three guaranteed PA are equal to the number of outs permitted in the inning. In order to estimate the number of plate appearances per inning or game consistently, we need to divide the average number of outs/game by 1 – OBA:

PA/G = (O/G)/(1 – OBA)

The definition of outs that corresponds to our simple (H + W)/(AB + W) complement of out average is AB – H. In the 1994 AL there were 25.19 outs/game using this definition, so our expected PA/G is:

25.19/(1 - .343) = 38.34

The actual average was 38.35; we’re off only due to rounding, since this is now just a mathematical truism: by our simplified definitions, plate appearances = outs + times on base. Using this equation to estimate team PA/G from their OBA for the 1994 AL, the RMSE is .259, which is about .7% of the average PA/G. We shouldn’t expect perfect accuracy at the team level since team PA will be affected by different quantities of all the statistical categories we’re ignoring that have an impact on the actual number of PA a team generates, as well as differences in the number of extra-inning games, forgone bottoms of the ninth, and walkoff-shortened innings.
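The PA/G equation can be sketched in a few lines of Python, along with a quick simulation of the constant-OBA assumption. The .343 OBA and 25.19 outs/game are the figures quoted in this post; the function names, the seed, and the number of simulated innings are assumptions made for illustration.

```python
import random


def expected_pa(outs, oba):
    """Expected plate appearances for a given number of outs: outs/(1 - OBA)."""
    return outs / (1 - oba)


def simulate_pa_per_inning(oba, n_innings, seed=0):
    """Average PA per inning when every batter reaches with probability OBA
    and there are no baserunning outs (the simplifying assumptions above)."""
    rng = random.Random(seed)
    total_pa = 0
    for _ in range(n_innings):
        outs = 0
        while outs < 3:
            total_pa += 1
            if rng.random() >= oba:  # batter fails to reach base
                outs += 1
    return total_pa / n_innings


print("27 outs/game:   ", round(expected_pa(27, 0.343), 2))
print("25.19 outs/game:", round(expected_pa(25.19, 0.343), 2))
print("formula, per inning:  ", round(expected_pa(3, 0.343), 3))
print("simulated, per inning:", round(simulate_pa_per_inning(0.343, 100_000), 3))
```

The simulation only checks the algebra under the constant-OBA assumption; it says nothing about how well that assumption holds for real lineups.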

The key points to keep in mind as we move forward in discussing rate stats are:

1.      The number of plate appearances a team will get is a function of their out rate, and simplifying terms we can very accurately estimate team PA as a function of on base average

2.      Since players have an impact on the number of plate appearances their team gets, and thus the number of plate appearances they get, a proper rate stat for measuring overall offensive productivity must account for that impact

Thursday, April 15, 2021

Almost Perfect

In my earlier days as a baseball fan, I was really interested in no-hitters, and outside of the Indians winning the World Series, my most fervent desire as a fan was to witness one even if only on the radio. Eventually this faded, due to some combination of growing jaded about the extent to which baseball fans sometimes elevate trivial events above game outcomes, the pernicious influence of Voros McCracken on how I thought about the hits column for pitchers, and after fifteen years of intense baseball-watching finally witnessing one (I'm now up to five).

Perfect games retain a bit more of their mystique for me, due to being much more rare (someone who has watched as many games over the years as I have is bound to have seen a no-hitter, but one can't really expect to see a perfect game) and not relying on any arbitrary distinction between hits and errors (which of course doesn't affect all no-hitters). The two closest games I have taken in to being perfect games prior to last night were Mike Mussina against the Indians in 1997 and Armando Galarraga's should-have-been perfect game against the Indians in 2010. The latter game is a case in point of what I meant about fans sometimes being more interested in trivial events than game outcomes - there was more outcry in favor of replay as a result of that game than there was cumulatively from many calls that much more directly influenced which team won a given game.

Last night's effort by Carlos Rodon combined elements of both of the ninth innings of these games in a way that people who believe in hocus pocus should embrace. From Galarraga's, we took the extremely close play at first base, with Josh Naylor playing the role of Jason Donald, desperately trying to reach first after making weak contact towards first base. In this case, the play was actually much closer, but no replay was required as the call on the field was that Jose Abreu beat him to the bag by a narrow margin.

From the Mussina game, we borrowed the man, lineup slot, and fielding position to break it up: with one out in the ninth, the Indians catcher. Sandy Alomar singled off Mussina, while Roberto Perez was only hit in the back foot with a slider, but history repeated itself in who ended it. Of course, if Rodon had to lose the perfect game, he got a better outcome than the other two, as he at least got to keep the no-hitter.

Naturally, all of the near perfect games I've seen have been pitched against the Indians. In addition to the infinitely more important distinction of now having the longest World Series drought, after Joe Musgrove's no-hitter for the Padres, the Indians now have the longest drought between no-hitters, it having been nearly forty years since Len Barker's perfect game.

I was keeping score of the Mussina game and Rodon's effort last night, but not the Galarraga game, which I listened to on the radio while I watched some other game on TV. 



Wednesday, April 14, 2021

Rate Stat Series, pt. 1: Introduction

This blog has existed for sixteen years now, and yet with the exception of some (relatively) recent stuff I’ve written about the Enby distribution for team runs per game and the Cigol approach to estimating team winning percentage from Enby, almost all of the interesting sabermetric work appeared in the blog’s first five years, and most in the first year or two.

There are a number of reasons for that - one is that when I started, I was a college student with a lot more free time on his hands than I have with a 9-5. Relatedly, I was also more eager to spend a lot of time staring at numbers in my free time when I didn’t spend a good portion of my day staring at numbers. Remember the Bill James line about how a column of numbers that would put an actuary to sleep can be made to dance if you put Bombo Rivera’s picture on the flip side of the card? Sometimes the numbers do indeed dance, but the actuary in question would rather watch a ballgame or read about the Battle of Gravelines than manipulate them in the evening, dancing or no.

More generally, there has been much less to investigate in the area of sabermetrics that I primarily practice, which I will call for the lack of a better term “classical sabermetrics”. I would define classical sabermetrics as sabermetric study which is primarily focused on game-level (or higher, e.g. season, player career, etc.) data that relates to baseball outcomes on the field (e.g. hits, walks, runs scored, wins). Classical sabermetrics is/was the primary field of inquiry of those I have previously called first or second-generation sabermetricians.

Classical sabermetrics is not dead, but to date the last great achievement of the field was turned in by Voros McCracken when he developed DIPS. I’m not arrogant enough to declare that nothing more will ever be found in the classical field, and there is still much work to be done, but at least as far as I can see, it is highly likely that it will consist of tinkering and incrementally improving work that has already been done, and probably with little impact on the practical implementation of sabermetric ideas. For example, I still would love to find a modification to Pythagenpat that works better for 2 RPG environments, or a different run estimator construct that would preserve the good properties of Base Runs while better handling teams that hit tons of triples. All of this is quite theoretical, and of no practical value to someone who is attempting to run the Pirates.

Which increasingly is what sabermetric practitioners are attempting to do, whether directly through employment by major league teams, or indirectly through publishing post-classical sabermetric research in the public sphere. Let me be very clear: this is not in any way a lament for a simpler, purer time in the past. I think it’s wonderful that sabermetric analysis has transcended the constraints of the data used in its classical practice and is exerting an influence on the game on the field.

Notwithstanding, I am still a classical sabermetrician, not because I don’t value the insight provided by post-classical sabermetrics but because I don’t have some combination of the skillset or the way of thinking or the resources or the drive to become proficient enough in newer techniques to offer anything of value in that space. Thus it is natural that I have less to share here.

The topic that I am embarking on discussing is squarely in the realm of “quite theoretical and of no practical value to someone who is attempting to run the Pirates”. About fifteen years ago, I started writing a “Rate Stat Series”, and aborted it somewhere in the middle. I have stated several times that I intend to revisit it, but until now have not. The Rate Stat Series was and now is intended to be a discussion of how best to express a batter’s overall productivity in a single rate stat. I should note three things that it is not:

1. The discussion is strictly limited to the construction of a rate stat measuring overall offensive productivity, not a subset thereof. I am not suggesting that if you are measuring a batter’s walk rate, strikeout rate, ground-rule double rate, or any other component rate you can dream up, that you should follow the conclusions here. For most general applications, plate appearances makes perfect sense as the denominator for a rate for any of those quantities. There may be reasons to follow a sort of decision tree approach that results in different denominators for some applications (McCracken was an innovator in this approach, in DIPS and park factors). All of that is well and good and completely outside the scope of this series.

2. The premise presupposes that the unit of measurement of a batter’s productivity has already been converted to a run-basis. Thus it is not a question of OPS v. OTS v. OPS+ v. 1.8*OBA + SLG v. wOBA v. EqA v. TAv v. whatever, but rather what the denominator for a batter’s estimated run contribution should be. The obvious choices are outs and plate appearances, but there are other possibilities. Spoiler alert: My answer is “it depends”.

3. Revolutionary, groundbreaking, or any other similar adjective. I’m attempting to describe my thoughts on methods that already exist and were created by other people in a coherent, unified format.

In sitting down to write this, I realized I made two fundamental mistakes in my first attempt:

1. I was attempting to “prove” my preferences mathematically, which is not a bad thing in theory, but some of what I was doing begged the question and some of this discussion is of a theoretical nature that lends itself more to logical reasoning/“proofs” than to mathematical “proofs”. I’ve tried to anchor my conclusions in math, logic, and reason where possible, but have also embraced that some of it is subjective and must be so.

2. I posted pieces before I finished writing the whole thing, or even knowing exactly where it was going.

These are rectified in this attempt – all of my assertions are wildly unsupported and as I hit post, all planned installments exist in at least a detailed outline form. While I have attempted to avoid the two mistakes I identified in the previous series, as I look at this series in full I can see I may have just replaced them with two characteristics that will make reading this a real chore:

1. I’m overly wordy; repeating myself a lot and trying to be way too precise in my language (although I fear not as precise as the topic demands). There’s a lot of jargon in an attempt to delineate between the various concepts and methodological choices.

2. There’s way too much algebra; where possible, I didn’t want to just assert that mathematical operations resolved in a certain way and give an empirical example that backs me up, so there’s a lot of “proofs” that will be of no general interest.

Allow me to close by laying some groundwork for future posts. I am going to use the 1994 AL as a reference point, and when I use examples they will generally be drawn from this league-season. Why have I chosen the 1994 AL?

1. 1994 was the year I became a baseball fan, and I was primarily focused on the AL at that time, so it is nostalgic. I have not turned into a get off my lawn type who thinks that baseball reached its zenith in 1994 and it’s all been downhill since, but I do think that about 1994 Topps, the greatest baseball card set of all-time.

2. As the year in which the “silly ball era” really broke out, and due to the strike shortening the season, there are some fairly extreme performances that are useful when talking about the differences between rate stat approaches.

As discussed, this series starts from the premise that a batter’s contribution is measured in terms of runs, and works from there. This approach does not require the use of any particular run estimator, although one of my assertions is that the choice of run estimator and the choice of rate/denominator for the rate are logically linked. There are three types of run estimators that I will use in the series: a dynamic model, a linear model, and a hybrid theoretical team model.

In order to avoid differences in the run estimator(s) used unduly influencing differences in the resulting rate stats, I am going to anchor a set of internally consistent run estimators in the reference period of the 1994 AL. It will come as no surprise if you’ve read anything I’ve written about run estimators in the past that I am using Base Runs for this job. The point of this series is not to tell you which particular run estimator to use or how to construct it. It really doesn’t matter which version of Base Runs I use (if you are still stuck on Runs Created, there’s no judgment from this corner, at least for the duration of this discussion), or which categories I include in the formula – this is about the conceptual issues regarding the rate that you calculate after estimating the batter’s run contribution, so I am keeping it very simple, looking just at hits, walks, and at bats (thus defining outs as at bats minus hits) and ignoring steals/caught stealing, hit batters, intentional walks, sacrifices, etc. Since I’m doing this with the run estimator, I will also do it with most other statistics I cite – for example, throughout this series OBA will be (H + W)/(AB + W), and PA will just be AB + W.

A version of Base Runs I have used is below. It’s not perfect by any means; it overvalues extra base hits as we’ll see below, but again, the specific estimator is for example only in this series – the thinking behind constructing the resulting rates is what we’re after:

A = H + W – HR

B = (2TB - H – 4HR + .05W)*.78

C = AB – H

D = HR

BsR = (A*B)/(B + C) + D

Typically, any reconciliation of Base Runs to a desired estimate number of runs scored for an entity like a league is done using the B factor, since it is already something of a balancing factor in the formula, representing the somewhat nebulous concept of “advancement” while the other components (A = baserunners, C = outs, D = automatic runs) represent much more tightly defined quantities. In order to force the Base Runs estimate for the 1994 AL to equal the actual number of runs scored, you need to replace the .78 multiplier with .79776, which can be determined by first calculating the needed B value (where R is the actual runs scored total):

Needed B = (R – D)*C/(A – R + D)

Divide this by (2TB – H – 4HR + .05W) and you get a .79776 multiplier. I usually don’t force the estimated runs equal to the actual runs, but for this series, I want to be internally consistent between all of the estimators and also be able to write formulas using league runs rather than having to worry about any discrepancies between league runs and estimated runs.
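The reconciliation step can be sketched in Python as follows. The Base Runs formula and the needed-B algebra are the ones given above; the function names are hypothetical, and the league totals are made up for illustration (they are NOT the 1994 AL totals, so the resulting multiplier will not be .79776).

```python
def base_runs(ab, h, tb, hr, w, b_mult=0.78):
    """Simple Base Runs: A*B/(B + C) + D, using only AB, H, TB, HR, and W."""
    a = h + w - hr                                # baserunners
    b = (2 * tb - h - 4 * hr + 0.05 * w) * b_mult  # advancement
    c = ab - h                                    # outs
    d = hr                                        # automatic runs
    return a * b / (b + c) + d


def reconciled_b_multiplier(ab, h, tb, hr, w, runs):
    """Multiplier on the B factor that forces BsR to equal actual runs."""
    a = h + w - hr
    c = ab - h
    d = hr
    needed_b = (runs - d) * c / (a - runs + d)
    raw_b = 2 * tb - h - 4 * hr + 0.05 * w
    return needed_b / raw_b


# Hypothetical league totals, invented for this example.
league = dict(ab=78000, h=21000, tb=33100, hr=2400, w=8000)
actual_runs = 11500

mult = reconciled_b_multiplier(**league, runs=actual_runs)
print("BsR with .78 multiplier:", round(base_runs(**league), 1))
print("reconciled multiplier:  ", round(mult, 5))
print("BsR after reconciling:  ", round(base_runs(**league, b_mult=mult), 1))
```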

So our dynamic run estimator (BsR) used throughout this series will be:

A = H + W – HR = S + D + T + W

B = (2TB - H – 4HR + .05W)*.79776 = .7978S + 2.3933D + 3.9888T + 2.3933HR + .0399W

C = AB – H = Outs

D = HR

BsR = (A*B)/(B + C) + D

To be consistent, I will also use the intrinsic linear weights for the 1994 AL that are derived from this BsR equation as the linear weights run estimator. The intrinsic linear weights are derived through partial differentiation of BsR with respect to each component. If we define A, B, C, and D to be the league totals of those, and a, b, c, and d to be the coefficient for a given event in each of the A, B, C, and D factors respectively, then the linear weight of a given event is calculated as:

LW = ((B + C)*(A*b + B*a) – A*B*(b + c))/(B + C)^2 + d

For the 1994 AL, this results in the equation, where RC is to denote absolute runs created:

LW_RC = .5069S + .8382D + 1.1695T + 1.4970HR + .3495W - .1076(outs)
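As a sketch of how the intrinsic weights fall out of the LW formula, the Python below applies it to each event's (a, b, c, d) coefficients from the BsR equation used in this series. The function and variable names are hypothetical, and the league A, B, C, D totals are made up for illustration, so the printed weights will be in the same neighborhood as the 1994 AL values but will not match them.

```python
def bsr(A, B, C, D):
    """Base Runs from its four factors."""
    return A * B / (B + C) + D


def intrinsic_lw(A, B, C, a, b, c, d):
    """Change in BsR from adding one event with coefficients (a, b, c, d)
    to a league with factor totals A, B, C."""
    return ((B + C) * (A * b + B * a) - A * B * (b + c)) / (B + C) ** 2 + d


M = 0.79776  # B multiplier from the BsR equation in this series

# (a, b, c, d) coefficients for each event
EVENTS = {
    "S": (1, 1 * M, 0, 0),
    "D": (1, 3 * M, 0, 0),
    "T": (1, 5 * M, 0, 0),
    "HR": (0, 3 * M, 0, 1),
    "W": (1, 0.05 * M, 0, 0),
    "out": (0, 0, 1, 0),
}

# Hypothetical league factor totals, invented for this example.
A, B, C, D = 26600.0, 29640.0, 57000.0, 2400.0

for name, coef in EVENTS.items():
    print(name, round(intrinsic_lw(A, B, C, *coef), 4))
```

The bsr helper is not needed to compute the weights; it is there so the formula can be checked against a numerical derivative of Base Runs.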

We will also need a version of LW expressed in the classic Pete Palmer style to produce runs above average rather than absolute runs. That’s just a simple algebra problem to solve for the out value needed to bring the league total to zero, which results in:

LW_RAA = .5069S + .8382D + 1.1695T + 1.4970HR + .3495W - .3150(outs)
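The "simple algebra problem" amounts to choosing the out weight so that the league total is zero. Here is a Python sketch using the LW_RC weights above; the function names are hypothetical and the event totals are made up for illustration (they are not the 1994 AL totals, so the out value that comes out will differ from -.3150).

```python
# Absolute linear weights from the LW_RC equation in this post.
LW_RC = {"S": 0.5069, "D": 0.8382, "T": 1.1695,
         "HR": 1.4970, "W": 0.3495, "out": -0.1076}


def linear_runs(weights, totals):
    """Sum of weight * count over all events."""
    return sum(weights[e] * totals[e] for e in weights)


def raa_out_weight(weights, totals):
    """Out weight that makes the league total equal zero runs above average."""
    positive = sum(weights[e] * totals[e] for e in weights if e != "out")
    return -positive / totals["out"]


# Hypothetical league event totals, invented for this example.
totals = {"S": 14150, "D": 4000, "T": 450, "HR": 2400, "W": 8000, "out": 57000}

lw_raa = dict(LW_RC, out=raa_out_weight(LW_RC, totals))
print("RAA out weight:  ", round(lw_raa["out"], 4))
print("league RAA total:", round(linear_runs(lw_raa, totals), 6))
```

Equivalently, the RAA out weight is the absolute out weight minus the league's absolute linear weights runs per out.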

I am ignoring any questions about what the appropriate baseline for valuing individual offensive performance is. Regardless of where you side between replacement level, average, and other less common approaches, I hope you will agree that average is a good starting point which can usually be converted to an alternative baseline much more easily than if you start with an alternative baseline. Average is also the natural starting point for linear weights analysis since the empirical technique of calculating linear weights based on average changes in average run expectancy is by definition going to produce an estimate of runs above average.

Later we will also have some “theoretical team” run estimators built off this same foundation, but discussion of them will fit better when discussing that concept in greater detail.

I will also be ignoring park factors and the question of context in this series (at least until the very end, where I will circle back to context). Since I am narrowly focused on the construction of the final rate stat, rather than a full-blown implementation of a rating system for players, park factors can be ignored. Since I am anchoring everything in the 1994 AL, the context of the league run environment can also be ignored since it will be equal for all players once we ignore park factors.

Thursday, April 01, 2021

Give Us This Day Our Daily Ball

Rob Manfred, who art Commissioner

Halloweth be our game

Thy rule changes be undone, thy no longer assault fun

In 2022 as it was in 2002

Give us this day our daily ball

And reconcile with Tony Clark as we reconcile to runners on in extra innings

And lead us not into strike or lockout

And deliver us from pitchers hitting

For thine is the office and the power and the responsibility until 2024

Play ball