Wednesday, August 18, 2021

Rate Stat Series, pt. 10: Rate Stats for the Theoretical Team Framework I

In calculating TT_BsR for a batter, we have taken into account both his primary and tertiary impact on the offense, but we have neglected to address his secondary impact – that is, the value of the additional plate appearances he generates for his team by avoiding outs. There’s a relatively simple way to apply an adjustment for this using the framework for TT_BsR we’ve already developed. David Smyth called this adjustment PAR for Plate Appearance Ratio, and it is based on the same logic about how PAs are generated that we have relied on many times.

PAR is equal to the ratio of the theoretical team’s plate appearances to the plate appearances a league average team would have had. Remember that:

PA/G = (O/G)/(1 – OBA)

O/G is a constant that we set at the league level – I will call it X in the algebra that follows. We need to know the OBA of the theoretical team; since our player in question gets 1/9 of the PA and the rest of the team is assumed to be league average, this is very simply:

T_OBA = 1/9*OBA + 8/9*LgOBA

Then T_PA/G = X/(1 - (1/9*OBA + 8/9*LgOBA)) = X/(1 – 1/9*OBA - 8/9*LgOBA), while the league PA/G will be X/(1 – LgOBA). The ratio between the two will be:

(X/(1 – 1/9*OBA – 8/9*LgOBA))/(X/(1 – LgOBA)) = (1 – LgOBA)/(1 – 1/9*OBA – 8/9*LgOBA) = PAR

Since Frank Thomas had a .4921 OBA and the league average was .3433, his PAR is:

(1 - .3433)/(1 – 1/9*.4921 – 8/9*.3433) = 1.0258

This means that a theoretical team on which the Big Hurt had an equal share of the PA would end up generating 2.58% more PA than a league average team. 
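As a quick check on the arithmetic, here is a minimal Python sketch of the PAR calculation. The function and argument names are my own labels rather than anything from Smyth, and the 1/9 lineup share is exposed as a parameter only for illustration.

```python
def plate_appearance_ratio(oba: float, lg_oba: float, lineup_share: float = 1 / 9) -> float:
    """PAR: theoretical-team PA/G divided by league-average PA/G.

    The outs-per-game constant X cancels, so only the two OBAs matter.
    """
    # OBA of a team where the batter takes `lineup_share` of the PA
    # and league-average hitters take the rest.
    team_oba = lineup_share * oba + (1 - lineup_share) * lg_oba
    return (1 - lg_oba) / (1 - team_oba)


# Frank Thomas, 1994: .4921 OBA against a .3433 league average.
thomas_par = plate_appearance_ratio(0.4921, 0.3433)
print(round(thomas_par, 4))  # 1.0258
```

A league-average hitter comes out at a PAR of 1, as he should, since his theoretical team is just a league-average team.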

In order to take Thomas’ secondary contribution into account, we can return to the definitions from the last installment and calculate:

TT_BsRP = T_BsR*PAR – R_BsR

PAR is only applied to T_BsR (the base runs estimate for the theoretical team with Thomas) because the reference team, filled with league average players, will continue to have the same number of PA as before (which we’ve set to equal eight times Thomas’ PA). Filling in those terms for the 1994 AL, the formula is:

TT_BsRP = ((A + 2.514PA)*(B + 2.722PA)/(B + C + 7.975PA) + D + .232PA)*PAR – 1.090PA

Note that we can no longer combine the D term from T_BsR with the R_BsR term as the former also needs to be inflated by PAR (Thomas’ teammates will hit more homers in those extra 2.58% PA they now enjoy).
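The formula above translates directly into code. This is only a sketch: the function names are mine, the constants are the 1994 AL values from the formula, and A, B, C, and D are assumed to be the batter's own base runs components as defined in the last installment.

```python
def t_bsr(a: float, b: float, c: float, d: float, pa: float) -> float:
    """Base runs for the theoretical team: the batter plus eight average hitters (1994 AL)."""
    return (a + 2.514 * pa) * (b + 2.722 * pa) / (b + c + 7.975 * pa) + d + 0.232 * pa


def tt_bsrp(a: float, b: float, c: float, d: float, pa: float, par: float) -> float:
    """Theoretical-team base runs with the PAR adjustment.

    PAR scales all of T_BsR, including the D (home run) term;
    the reference team's 1.090 runs per batter PA are left unscaled.
    """
    return t_bsr(a, b, c, d, pa) * par - 1.090 * pa
```

Setting par to 1 recovers the unadjusted TT_BsR, so the entire effect of the adjustment is (PAR – 1) times T_BsR.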

Applying PAR increases Thomas’ TT_BsR from 132.2 to 149.9, a significant increase. This figure is more comparable to his wRC (147.1) than to other runs created estimates we’ve examined, as it’s already taken into account the value of his secondary contributions.

You may note that there is the potential of some circularity here, as we are using Thomas’ actual PA as the starting point, but Thomas’ actual PA already inherently include his real secondary contribution to the 1994 White Sox. That is to say that some of the 508 PA that Thomas actually recorded were made possible by his own generation of PA for that team. This is a good argument for using a theoretical number of PA for Thomas rather than his actual PA. Thomas recorded 508 of Chicago’s 4439 PA, or 11.44%. So we could instead use 11.44% of the league average team PA total (4366.9), in which case he would have 499.7 restated PA to plug into the Theoretical Team methodology (this is ignoring that his contribution to the White Sox also had an impact on the league average PA). Of course, in so doing we would also have to proportionally scale back his portion of the T_A, T_B, T_C, and T_D components by 499.7/508. 
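The restatement described above is a two-line calculation. A small sketch, with variable names of my own choosing and the figures taken from the paragraph:

```python
thomas_pa = 508
white_sox_pa = 4439
lg_avg_team_pa = 4366.9

# Thomas' share of his real team's PA, applied to a league-average team total.
share = thomas_pa / white_sox_pa
restated_pa = share * lg_avg_team_pa

# The factor by which his A, B, C, and D components would be scaled back.
scale = restated_pa / thomas_pa

print(f"{share:.2%}", round(restated_pa, 1))  # 11.44% 499.7
```

Note that the scale factor reduces to the league-average team PA divided by Chicago's PA, so it would be the same for every White Sox hitter.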

On the other hand, the secondary contribution of a batter through generating PA is in the background of the linear framework as well (and of any other framework that considers his actual PA); it’s just that the connection leaps to mind more quickly when modeling the other aspects of a theoretical team. I’m going to ignore this going forward, as this is after all a rate stat series, and I’ll also note that we shouldn’t ignore the fact that a batter can benefit from the additional opportunities he helps to create. The fact that the quality of his teammates influences how many opportunities he gets in the real world is at some level unavoidable.

At this point, we should also express Thomas’ contribution in terms of RAA. This is a simple modification; instead of setting R_BsR equal to the league average BsR/PA times 8 times the player’s PA, we would just need to multiply by 9 times the player’s PA so that the lineup isn’t magically shortened and instead we compare T_BsR to what a team would score with an average player in our man’s place. I did not bother running this before introducing PAR, because if there’s one thing we’ve learned from this series, it’s that it doesn’t make a lot of sense to talk about batter RAA without taking out rate into account. So with PAR for the 1994 AL we have:

TT_RAA = ((A + 2.514PA)*(B + 2.722PA)/(B + C + 7.975PA) + D + .232PA)*PAR – 1.226PA
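Here is a sketch of the RAA version, along with a sanity check. The function name is mine, and the check assumes that the league-average per-PA components can be backed out of the formula's constants by dividing by eight (since those constants cover eight average teammates); that is my reading of the formula rather than something stated explicitly.

```python
def tt_raa(a: float, b: float, c: float, d: float, pa: float, par: float) -> float:
    """Theoretical-team runs above average with the PAR adjustment (1994 AL constants)."""
    t_bsr = (a + 2.514 * pa) * (b + 2.722 * pa) / (b + c + 7.975 * pa) + d + 0.232 * pa
    # Reference team is now nine average hitters: 1.090 * 9/8 = 1.226 runs per batter PA.
    return t_bsr * par - 1.226 * pa


# Sanity check: a batter with league-average components has a PAR of 1
# and should come out at roughly zero RAA (up to rounding in the constants).
# ASSUMPTION: per-PA league components are the formula's constants divided by 8.
pa = 508
avg_batter_raa = tt_raa(
    a=2.514 / 8 * pa,
    b=2.722 / 8 * pa,
    c=(7.975 - 2.722) / 8 * pa,
    d=0.232 / 8 * pa,
    pa=pa,
    par=1.0,
)
print(round(1.090 * 9 / 8, 3))  # 1.226
```

The average batter lands within a fraction of a run of zero over 508 PA, with the small residual coming from the three-decimal rounding of the constants.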

We now have three possible theoretical team approaches, and have yet to address the question of this series: what should the rate form be? The guiding principle of this series has been that the properties of the numerator (usually a run estimate) should be logically consistent with the choice of denominator, so we should consider each of the three theoretical team approaches separately.

First is TT_BsR, which is just an estimate of the batter’s impact on the team runs scored, taking into account primary and tertiary (but not secondary) impacts. It is akin to LW_RC, with the key difference being that LW_RC does not attempt to value the batter’s tertiary impact. However, I contend that incorporating tertiary contributions does not alter the considerations when developing a rate stat. The tertiary effect is how the batter’s performance changes the underlying run environment of the team, independently of the change in plate appearances. What we are left with is an estimate of the contribution the batter made in his actual plate appearances – the only difference is that we recognize that those outcomes influenced the value of all of the other offensive events recorded by the team.

So our choices for a rate stat are the same as those for LW_RC. We can first calculate RAA (using R/O), and then take RAA/PA, or we can calculate how many additional PA the batter generated/outs he avoided, add those to his TT_BsR, and divide by PA (the R+/PA approach). These approaches will be equivalent if we add back in LgR/PA to RAA/PA. We could convert to wOBA, we could calculate wRC along the way...all the same options.

The math will be the same as shown in parts 7 and 8, except we will substitute TT_BsR for LW_RC everywhere it pops up. Here is a leaderboard for some of the key metrics using LW_RC (we’ve seen this all before):

Now the same metrics, except substituting TT_BsR for LW_RC in all calculations:

This was a lot of work to get largely the same results. Maybe applying PAR will make things more interesting? 
