Tuesday, January 03, 2006

Win Shares Walkthrough, pt. 6

Distributing Fielding Win Shares to Positions

Win Shares takes a different approach from other fielding evaluation methods in that it first assigns a value to each position, then splits that up among the men who played that position. This allows James to use data for the team as a whole, rather than trying to estimate how many strikeouts there were when a particular player was in the field.

Each position has four criteria which are used to assign Win Shares. A “claim percentage” is derived from the sum of these four scales divided by 100. Each position has different criteria and different weightings assigned to them. I will call the four criteria N, X, Y, and Z in order to keep the quantity of abbreviations to a minimum.

At catcher, the four criteria are Caught Stealing Percentage, Error Percentage, Passed Balls, and Sacrifice Hits. The 50 point scale is CS% (meaning that this criterion will compose 50% of the rating). CS% = CS/(SB + CS) for the team as a whole. The Braves allowed 121 SB and 53 CS, for a CS% of 53/(53 + 121) = .305. N = 25 + (CS% - LgCS%)*150. The NL average was .3149, so the Braves’ N is 25 + (.305-.3149)*150 = 23.52. I should point out now that all of the scales at each position have a minimum value of 0 and a maximum value equal to the number of points the criterion is assigned (50 in this case).

Error Percentage for a catcher is E% = 1 - (cPO + cA - TmK)/(cPO + cA - TmK + cE). This removes the putouts credited for strikeouts from the catchers’ total. The Braves catchers had 1056 PO, 89 A, and 13 E, while the team struck out 1036. So their E% is 1-(1056+89-1036)/(1056+89-1036+13) = .107. X = 30 - 15*E%/LgE%. NL catchers had an E% of .084, so the Braves’ X is 30-15*.107/.084 = 10.89 on a 30 point scale.

The Passed Ball criterion incorporates something we will use in many fielding formulas, the Team League Putout Percentage (TLPO%). TLPO% = Tm(PO - K)/Lg(PO - K), or in other words the percentage of the league’s total putouts in the field that were recorded by our team. The Braves had 4365 PO, while the league had 60854 PO and 13358 K, so the TLPO% = (4365-1036)/(60854-13358) = .070.

Y = (LgPB*TLPO% - TmPB)/5 + 5. The Braves had 13 PB versus 199 for the league, so their Y was (199*.07-13)/5 + 5 = 5.19 on a 10 point scale.

The final criterion is based on team sacrifice hits allowed and is Z = 10 - TmSH/(LgSH*TLPO%)*5. Atlanta allowed 77 SH while the league allowed 1110, so the Z = 10 - 77/(1110*.07)*5 = 5.05 on a 10 point scale.

The claim percentage for Atlanta catchers will be, just as it will be at all positions, (N + X + Y + Z)/100. This is (23.52+10.89+5.19+5.05)/100 = .447. An average team would score .500.
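
The four catcher scales can be sketched in Python as follows. This is only an illustration of the arithmetic above: the function and argument names are made up for the example, and the inputs are the rounded figures quoted in the text, so the claim percentage agrees with .447 only to within rounding.

```python
def clamp(value, max_points):
    """Hold a scale between 0 and the points assigned to its criterion."""
    return max(0.0, min(float(max_points), value))


def catcher_claim(cs_pct, lg_cs_pct, e_pct, lg_e_pct,
                  tm_pb, lg_pb, tm_sh, lg_sh, tlpo_pct):
    """Return the N, X, Y, Z scales and the claim percentage for catchers."""
    n = clamp(25 + (cs_pct - lg_cs_pct) * 150, 50)       # caught stealing
    x = clamp(30 - 15 * e_pct / lg_e_pct, 30)            # error percentage
    y = clamp((lg_pb * tlpo_pct - tm_pb) / 5 + 5, 10)    # passed balls
    z = clamp(10 - tm_sh / (lg_sh * tlpo_pct) * 5, 10)   # sacrifice hits
    return n, x, y, z, (n + x + y + z) / 100


# Braves inputs, as rounded in the text
n, x, y, z, claim = catcher_claim(.305, .3149, .107, .084,
                                  13, 199, 77, 1110, .07)
print(round(n, 2), round(x, 2), round(y, 2), round(z, 2), round(claim, 3))
```

The clamp is applied to every scale because each one is bounded by 0 and by the number of points its criterion is assigned.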

At first base, the criteria are Plays Made, Error %, “Arm Rating”, and errors by shortstops and third basemen. To find Plays Made, first a very complicated estimate of unassisted putouts by first basemen is made:
Est1BUnPO = (1bPO - .7*pA - .86*2bA - .78*(3bA + ssA) + .115*(RoF + SH) - .0575*BIP)*2/3 + (BIP*.1 - 1bA)*1/3
BIP = IP*3 - K, and is an estimate of Balls in Play. The Braves’ first baseman had 1423 PO and 130 A, while P, 2B, 3B, and SS had 228, 496, 331, and 476 A respectively. We earlier found RoF as 1287, and the BIP is 4314. So the Braves’ Estimated UA PO at first is (1423 - .7*228 - .86*496 - .78*(331 + 476) + .115*(1287 + 77)- .0575*4314)*2/3 + (4314*.1 - 130)*1/3 = 177.9

We also need to find what is called the LHP+/-, the number of balls in play against left-handed pitchers above what you would expect from the league average. The formula is:
LHP+/- = TmBIP(lefties) - (LgBIP(lefties)/LgBIP*TmBIP)
Atlanta lefthanders pitched 582 innings with 349 Ks for 1397 BIP. The NL had 5918 IP from lefties with 3655 Ks for 14099 BIP, while the total league BIP was 47494. Therefore, the Braves’ LHP+/- is 1397-(14099/47494*4314) = 116.4

Then N = ((Est1BUnPO + 1bA + .0285*LHP+/-) - Lg(Est1BUnPO + 1bA)*TLPO%)/5 + 20. The league’s first basemen had 3114 estimated unassisted putouts and 1670 assists, so the Atlanta N = (177.9 + 130 + .0285*116.4 - (3114+1670)*.07)/5 + 20 = 15.27 on a 40 point scale.
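
Since the Plays Made scale chains three formulas together, here is a minimal Python sketch of the chain. The function names are mine, not James’, the scale is not clamped to 0-40 here, and the inputs are the same figures used in the text (including the 4314 BIP).

```python
def est_1b_unassisted_po(po_1b, a_1b, a_p, a_2b, a_3b, a_ss, rof, sh, bip):
    """Estimate of unassisted putouts by first basemen."""
    from_putouts = (po_1b - .7 * a_p - .86 * a_2b - .78 * (a_3b + a_ss)
                    + .115 * (rof + sh) - .0575 * bip)
    from_assists = bip * .1 - a_1b
    # Two-thirds weight on the putout-based estimate, one-third on the other
    return from_putouts * 2 / 3 + from_assists / 3


def lhp_plus_minus(tm_bip_lhp, lg_bip_lhp, lg_bip, tm_bip):
    """Team balls in play against lefties, above the league-average share."""
    return tm_bip_lhp - lg_bip_lhp / lg_bip * tm_bip


def first_base_plays_made(est_unpo, a_1b, lhp_pm,
                          lg_est_unpo, lg_a_1b, tlpo_pct):
    """N scale at first base, centered on 20."""
    team = est_unpo + a_1b + .0285 * lhp_pm
    league_share = (lg_est_unpo + lg_a_1b) * tlpo_pct
    return (team - league_share) / 5 + 20


unpo = est_1b_unassisted_po(1423, 130, 228, 496, 331, 476, 1287, 77, 4314)
lhp = lhp_plus_minus(1397, 14099, 47494, 4314)
n = first_base_plays_made(unpo, 130, lhp, 3114, 1670, .07)
print(round(unpo, 1), round(lhp, 1), round(n, 2))
```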

The E% at all positions other than catcher is figured as E/(PO + A + E). The Braves 1B made 9 errors, for an E% of 9/(1423+130+9) = .0058. X = 30 - 15*E%/LgE%, so with a LgE% at 1B of .0085, Atlanta gets 30 - 15*.0058/.0085 = 19.76 claim points on a 30 point scale.

The Arm rating is figured as Arm = 1bA + .5*ssDP - pPO - .5*2bDP + .015*LHP+/-. Braves 2B and SS had 108 and 97 DP, while the pitchers had 119 PO, giving an Arm of 130 + .5*97 - 119 - .5*108 + .015*116.4 = 7.25. Y = (Arm - LgArm/T)/5 + 10, where T = the number of teams in the league. The LgArm was 27.32 per team, so the Braves get (7.25-27.32)/5 + 10 = 5.99 points on a 20 point scale.

Z = 10 - 5*(3bE + ssE)/(Lg(3bE + ssE)*TLPO%). NL third basemen made 357 errors and the shortstops made 389. Braves third basemen and shortstops each made 19, so the Z is 10 - 5*(19 + 19)/((357+389)*.07) = 6.36 on a 10 point scale. The claim% at first base is (15.27+19.76+5.99+6.36)/100 = .474

At second base, the criteria are team DP, Assists, E%, and Putouts. N = 20 + (TmDP - ExpDP)/3. We already found the Braves’ ExpDP of 133.1, and they actually turned 146, giving an N of 24.3 on a 40 point scale.

The Assists rating is found as:
X = ((2bA - 2bDP) - (Lg(2bA - 2bDP)*TLPO% - 1/35*LHP+/-))/6 + 15. The Braves 2B had 364 PO, 496 A, 15 E, and 108 DP. The NL 2B had 4863 PO, 6776 A, 236 E, and 1412 DP. So they have an X of ((496-108)-((6776-1412)*.07 - 1/35*116.4))/6 + 15 = 17.64 on a 30 point scale.

Y = 24 - 14*2bE%/Lg2bE%. Atlanta’s E% at second is .0171 versus .0199 for the league, for a Y of 24-14*.0171/.0199 = 11.97 on a 20 point scale.

To find the putout criteria, we first find expected 2B PO by this formula:
Exp2bPO = Tm(PO - K)*Lg2bPO/Lg(PO - K) + 1/13*(W - Lg(W/IP)*TmIP) + 1/32*LHP+/-. The team PO-K is 3329 while the league is at 47496. The Braves walked 480 batters, while the league average was .3502 W/IP. This gives:
3329*4863/47496 + 1/13*(480 - .3502*1455) + 1/32*116.4 = 342.2.
Z = 5 + (2bPO - Exp2bPO)/12, giving 5 + (364-342.2)/12 = 6.82 on a 10 point scale. The claim% at 2B is (24.3+17.64+11.97+6.82)/100 = .607
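
The expected putout formula has three separate pieces (a league share, a walk adjustment, and a lefty adjustment), which a short Python sketch makes easier to see. The names are illustrative only, and the Z scale is left unclamped.

```python
def expected_2b_po(tm_po_less_k, lg_2b_po, lg_po_less_k,
                   tm_walks, lg_walks_per_ip, tm_ip, lhp_pm):
    """Expected putouts by second basemen."""
    league_share = tm_po_less_k * lg_2b_po / lg_po_less_k
    walk_adjustment = (tm_walks - lg_walks_per_ip * tm_ip) / 13
    lefty_adjustment = lhp_pm / 32
    return league_share + walk_adjustment + lefty_adjustment


exp_po = expected_2b_po(3329, 4863, 47496, 480, .3502, 1455, 116.4)
z = 5 + (364 - exp_po) / 12   # Braves second basemen had 364 PO
print(round(exp_po, 1), round(z, 2))
```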

At third base the criteria are Assists, Errors Above Average, Sacrifice Hits, and Double Plays. We first find Exp3bA = TmA*Lg3bA/LgA + 1/31*LHP+/-. Braves 3B had 131 PO, 331 A, 19 E, and 32 DP against the league totals of 1676, 4414, 357, and 374. The Braves had 1769 total assists and the league had 24442. Therefore, the Exp3bA = 1769*4414/24442 + 1/31*116.4 = 323.3. N = 25 + (3bA - Exp3bA)/4, giving 25 + (331-323.3)/4 = 26.93 on a 50 point scale.

Exp3bE = (3bPO + 3bA)/LgFA@3B - (3bPO + 3bA). The league FA at third base is .945 (figured as (PO + A)/(PO + A + E)), giving the Braves (131+331)/.945 - (131+331) = 26.9 expected errors. X = 15 + (Exp3bE - 3bE)/2 or 15 + (26.9 - 19)/2 = 18.95 on a 30 point scale.

The Sacrifice Hit criteria uses what I will call Sacrifice Hit Rating, or SH/(G + L) = SH/(W + 2*L). The Atlanta SHR is 77/(104 + 2*58) = .35 against a league average of .326. Y = 10 - SHR/LgSHR*5, or 10 - .35/.326*5 = 4.63 points on a 10 point scale.

Expected DP at third base are found very simply as ExpDP*Lg3bDP/LgDP, or 133.1*374/2028 = 24.55. Z = (3bDP - Exp3bDP)/2 + 5 or (32-24.55)/2 + 5 = 8.73 on a 10 point scale. The Claim% at 3B is (26.93+18.95+4.63+8.73)/100 = .592
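
Third base is the one position where every scale is built from an expectation, so it makes a good candidate for a worked sketch in Python. The function and argument names are assumptions for the example, nothing is clamped, and because the intermediate values are not rounded the scales differ from the text in the second decimal place.

```python
def third_base_claim(po_3b, a_3b, e_3b, dp_3b,
                     tm_a, lg_a, lg_a_3b, lhp_pm, lg_fa_3b,
                     tm_sh, wins, losses, lg_shr,
                     exp_dp, lg_dp_3b, lg_dp):
    """Return the N, X, Y, Z scales and the claim percentage at third base."""
    # N: assists against expected assists
    exp_a = tm_a * lg_a_3b / lg_a + lhp_pm / 31
    n = 25 + (a_3b - exp_a) / 4
    # X: errors against the errors a league-average fielder would make
    handled = po_3b + a_3b
    exp_e = handled / lg_fa_3b - handled
    x = 15 + (exp_e - e_3b) / 2
    # Y: sacrifice hit rating, SH/(W + 2*L), relative to the league
    shr = tm_sh / (wins + 2 * losses)
    y = 10 - shr / lg_shr * 5
    # Z: double plays against the league share of expected team DP
    exp_dp_3b = exp_dp * lg_dp_3b / lg_dp
    z = (dp_3b - exp_dp_3b) / 2 + 5
    return n, x, y, z, (n + x + y + z) / 100


n, x, y, z, claim = third_base_claim(131, 331, 19, 32,
                                     1769, 24442, 4414, 116.4, .945,
                                     77, 104, 58, .326,
                                     133.1, 374, 2028)
print(round(n, 2), round(x, 2), round(y, 2), round(z, 2), round(claim, 3))
```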

For shortstops, the criteria are Assists, Double Plays, E%, and Putouts. First we find ExpssA = TmA*LgssA/LgA + 1/100*LHP+/-. Atlanta shortstops had 217 PO, 476 A, and 19 E versus league totals of 3647, 6930, and 389, giving an expectation of 1769*6930/24442 + 1/100*116.4 = 502.7. N = (ssA - ExpssA)/4 + 20 or (476-502.7)/4 + 20 = 13.33 on a 40 point scale.

X = 15 + (TmDP – ExpDP)/4 = 15 + (146-133.1)/4 = 18.23 on a 30 point scale.

Y = 20 - 10*ssE%/LgssE% = 20 - 10*.0267/.0355 = 12.48 on a 20 point scale.

Expected PO at shortstop are found by ExpssPO = Tm(PO - K)*Lg(ssPO/(PO - K)) + 1/14*(W - LgW/IP*TmIP) - 1/64*LHP+/- or 3329*3647/47496 + 1/14*(480-.3502*1455) - 1/64*116.4 = 251.7. Then Z = 5 + (ssPO - ExpssPO)/15 = 5 + (217 - 251.7)/15 = 2.69 on a 10 point scale. Therefore, the Claim% at shortstop = (13.33+18.23+12.48+2.69)/100 = .467.

For outfielders, the criteria are Putouts, the team’s Defensive Efficiency Record, “Arm Elements”, and E%. Outfield putouts are first expressed as a percentage of team putouts less strikeouts and assists (assists generally come on groundballs). I will call this Putout Rating, POR = ofPO/(TmPO - TmA - TmK). Braves outfielders recorded 1055 PO, 19 A, 21 E, and 5 DP while the league had 15361, 480, 337, and 87. So the ATL POR is 1055/(4365-1769-1036) = .6763. N = 20 + 100*(POR - LgPOR) so with a league POR of .6663, it is 20 + 100*(.6763-.6663) = 21 on a 40 point scale.

The second criterion is very easy to calculate, using CL-1 from way back in the process when we were dividing defense between the pitchers and the fielders. X = CL-1*.24 - 9, which for Atlanta is 134.82*.24 - 9 = 23.36 on a 30 point scale.

The third criterion, “Arm Elements”, compares the team sum of outfield assists and DP less SF to the league total of the same, discounted at the TLPO%:
Y = ((ofA + ofDP - TmSF) - Lg(ofA + ofDP - SF)*TLPO%)/5 + 10. Since the Braves allowed 39 SF and the league 701, their Y is ((19+5-39)-(480+87-701)*.07)/5 + 10 = 8.88 on a 20 point scale.

Finally, Z = 10 - 5*E%/LgE%, which is 10 - 5*.0192/.0208 = 5.38 on a 10 point scale. The OF Claim% is therefore (21+23.36+8.88+5.38)/100 = .586.
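
A hedged Python sketch of the outfield scales follows. The names are illustrative and the scales are not clamped. For the DER scale it uses the .24 coefficient from the worked arithmetic, since that is the value that reproduces 23.36.

```python
def outfield_claim(of_po, of_a, of_dp, of_e_pct,
                   tm_po, tm_a, tm_k, tm_sf, cl1,
                   lg_por, lg_of_a, lg_of_dp, lg_sf, lg_e_pct, tlpo_pct):
    """Return the N, X, Y, Z scales and the claim percentage for outfielders."""
    # N: outfield putouts as a share of putouts less assists and strikeouts
    por = of_po / (tm_po - tm_a - tm_k)
    n = 20 + 100 * (por - lg_por)
    # X: team defensive efficiency, via the CL-1 figure found earlier
    x = cl1 * .24 - 9
    # Y: arm elements against the team's share of the league total
    team_arm = of_a + of_dp - tm_sf
    league_arm = (lg_of_a + lg_of_dp - lg_sf) * tlpo_pct
    y = (team_arm - league_arm) / 5 + 10
    # Z: error percentage relative to the league
    z = 10 - 5 * of_e_pct / lg_e_pct
    return n, x, y, z, (n + x + y + z) / 100


n, x, y, z, claim = outfield_claim(1055, 19, 5, .0192,
                                   4365, 1769, 1036, 39, 134.82,
                                   .6663, 480, 87, 701, .0208, .07)
print(round(n, 2), round(x, 2), round(y, 2), round(z, 2), round(claim, 3))
```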

We are now ready to distribute the fielding win shares to each position. Each position has an “intrinsic weight”, which we will abbreviate IW. These weight the claim percentages at each position by the importance of that position. The IWs are: C = 38, 1B = 12, 2B = 32, 3B = 24, SS = 36, and OF = 58. We take, for each position (Claim% - .200)*IW, and sum these:
C = (.447-.200)*38 = 9.39
1B = (.474-.200)*12 = 3.29
2B = (.607-.200)*32 = 13.02
3B = (.592-.200)*24 = 9.41
SS = (.467-.200)*36 = 9.61
OF = (.586-.200)*58 = 22.39
These sum up to 67.11. So catchers get 9.39/67.11 = 14% of the team’s 54 FWS, or 7.56. Doing this for all positions (and not rounding the numbers):
C = 7.523, 1B = 2.644, 2B = 10.476, 3B = 7.616, SS = 7.753, OF = 18.068
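
The distribution step itself is just a weighted split, which might be sketched in Python like this. The names are mine, and the inputs are the rounded claim percentages and the round figure of 54 FWS, so the output will not match the unrounded list above exactly. The sketch also assumes every claim percentage is above .200; a lower one would produce a negative share.

```python
INTRINSIC_WEIGHT = {"C": 38, "1B": 12, "2B": 32, "3B": 24, "SS": 36, "OF": 58}


def distribute_fielding_ws(claims, team_fws, floor=.200):
    """Split team fielding win shares by (claim% - floor) * intrinsic weight."""
    points = {pos: (claims[pos] - floor) * INTRINSIC_WEIGHT[pos]
              for pos in claims}
    total = sum(points.values())
    return {pos: team_fws * pts / total for pos, pts in points.items()}


claims = {"C": .447, "1B": .474, "2B": .607, "3B": .592, "SS": .467, "OF": .586}
shares = distribute_fielding_ws(claims, 54)
for pos, ws in shares.items():
    print(pos, round(ws, 2))
```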

My take: As I said earlier, fielding analysis is not something I am really qualified to pontificate about. I will leave it to the Tangos and the Mike Emeighs and the MGLs, etc. to debate the merits of the method. I will instead focus on its similarity to Defensive Winning Percentage.

DW% was used by James in his early Abstracts to evaluate fielding and then combined with OW% to give a total player rating. DW% was not used after the 1984 book. My first reaction when I saw the Defensive Win Shares formula was “it’s a revised DW%”.

Just like in WS, each position had four criteria, rated on scales that added up to 100. The criteria have changed over the years, sometimes based on better data being available (for example, James used to use opposition SB/G to rate catchers, whereas now we know SB and CS and can find the percentage) or based on new research and ideas of how to evaluate fielding (James used A/G for 1B, but now estimates unassisted putouts as well). But many of the criteria are the same or similar.

Another feature of the system was that each position had an intrinsic weight. These were 10 at C, 3 @ 1B, 8 @ 2B, 6 @ 3B, 11 @ SS, 4 @ LF, 6 @ CF, and 5 @ RF. These sum up to 53, which is the value in games given to fielding (both wins and losses) for a 162-game season. Dividing 53 by 162 gives .327, i.e. the system considers fielding to make up 32.7% of defense. Win Shares puts fielding, for an average team, at 32.5%. If you consider the outfield as a unit and scale these to 200 (the total of the intrinsic weights in Win Shares), you have:
C = 37.7(38); 1B = 11.3(12), 2B = 30.2(32), 3B = 22.6(24), SS = 41.5(36), OF = 56.6(58)
The numbers in parentheses are the WS intrinsic weights. As you can see, both systems say that fielding is approximately 32.5% of defense and weight the positions similarly (shortstop is the only position with a significantly different weight).
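
The rescaling is simple enough to check in a few lines of Python. The dictionaries below just restate the weights given above, with the three outfield spots combined (4 + 6 + 5 = 15); the variable names are only for the example.

```python
DW_WEIGHT = {"C": 10, "1B": 3, "2B": 8, "3B": 6, "SS": 11, "OF": 4 + 6 + 5}
WS_WEIGHT = {"C": 38, "1B": 12, "2B": 32, "3B": 24, "SS": 36, "OF": 58}

dw_total = sum(DW_WEIGHT.values())   # 53 games given to fielding
ws_total = sum(WS_WEIGHT.values())   # 200 points of intrinsic weight

# Put the DW% weights on the same 200-point scale as Win Shares
scaled = {pos: w * ws_total / dw_total for pos, w in DW_WEIGHT.items()}
for pos in DW_WEIGHT:
    print(pos, round(scaled[pos], 1), WS_WEIGHT[pos])
```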

I must reach the same conclusion as my first glance: Fielding Win Shares is an updated Defensive Winning Percentage. This is not necessarily a bad thing; perhaps the original system was very good to begin with, and it has been improved by better data, better estimates, and presumably more research. And of course a huge difference is the fact that DW% looks at each fielder individually while WS starts by crediting the team, and distributes value to the players from there. But the similarities between the two systems, separated by twenty years, are still striking, at least to me.

Again, just as in the stage where responsibility was split between pitching and fielding, there is an explanation of how, but not why. Why is the intrinsic weight at shortstop 36? Why are team sacrifice hits allowed weighted double a third baseman’s double plays over expectation? Etc. These questions are not answered, nor even acknowledged by James. That is not to say that he did not think of them himself, as I’m sure he did--just that we have no way of knowing what the thought process behind the system was, and are left to puzzle over it ourselves.

Along the same line, there are differences in how the ratings are formed at each position. Most positions are given a rating for errors based on their error percentage. But at third base it is based on errors above average. These sorts of things seem like inconsistencies within the system, but if there is a good reason for them, we have not been told what it is.

Aside from the fielding nature of the method itself, the subtraction of .200 from each claim percentage hammers home that the system is giving out absolute wins on the basis of marginal runs. 50% of the league average in runs scored, with a Pythagorean exponent of 2, corresponds to a W% of .200. It is for this reason that in old FanHome discussions others and I said that WS had an intrinsic baseline of .200 (James changed the offensive margin line to 52%, which corresponds to about .213).
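
The correspondence between a margin and a W% can be checked with the Pythagorean formula. A minimal Python sketch, assuming an exponent of 2 and a league-average defense, so that the run ratio is simply the margin:

```python
def pythagorean_wpct(run_ratio, exponent=2.0):
    """W% for a team whose runs scored / runs allowed equals run_ratio."""
    return run_ratio ** exponent / (1 + run_ratio ** exponent)


baseline_50 = pythagorean_wpct(.50)   # offense at 50% of league average
baseline_52 = pythagorean_wpct(.52)   # offense at James' 52% margin
print(round(baseline_50, 3), round(baseline_52, 3))
```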

In an essay in the book, James discusses this, and says that the margin level (i.e. 52%) “is not a replacement level; it’s assumed to be a zero-win level”. This is fine on its face; you can assume 105% to be a zero-win level if you want. But the simple fact is that a team that scored runs at 52% of the league average with average defense will win around 20% of their games. Just because we assume this to not be the case does not mean that it is so.

Win Shares would not work for a team with a .200 W%, because the team itself would come out with negative marginal runs. If it doesn’t work at .200, how well does it work at .300, where there are real teams? That’s a rhetorical question; I don’t know. I do know that there will be a little bit of distortion everywhere.

In discussing the .200 subtraction, James says “Intuitively, we would assume that one player who creates 50 runs while making 400 outs does not have one-half the offensive value of a player who creates 100 runs while making 400 outs.” This is either true or not true, depending on what you mean by “value”. The first player has one-half the run value of the second player; 50/100 = 1/2, a mathematical fact. The first player will not have one-half the value of the second player if they are compared to some other standard. From zero, i.e. zero RC, one is valued at 50 and one is valued at 100.

By using team absolute wins as the unit to be split up, James implies that zero is the value line in win shares. Anyone who creates a run has done something to help the team win. It may be very small, but he has contributed more wins than zero. Wins above zero are useless in a rating system; you need wins and losses to evaluate something. If I told you one pitcher won 20 and the other won 18, what can you do? I guess you assume the guy who won 20 was more valuable. But what if he was 20-9, and the other guy was 18-5?

You can’t rate players on wins alone. You must have losses, or games. The problem with Win Shares is that they are neither wins nor wins above some baseline. They are wins above some very small baseline, re-scaled against team wins. If you want to evaluate WS against some baseline, you will have to jump through all sorts of hoops because you first must determine what a performance at that baseline will imply in win shares. Sabermetricians commonly use a .350 OW%, about 73% of the average runs/out, as the replacement level for a batter. A 73% batter though will not get 73% as many win shares as an average player. He will get less than that, because only 21% (73% - 52%) of his runs went to win shares, while for an average player it was 48%. So maybe he will get .21/.48 = 44%. I’m not sure, because I don’t jump through hoops.
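
For what it is worth, the back-of-the-envelope figure above can be written out in Python. This only mirrors the rough reasoning in the paragraph (equal outs, offense only, 52% margin); it is not a claim about what the full Win Shares process would actually award, and the function name is invented for the example.

```python
OFFENSIVE_MARGIN = .52


def share_of_average_batter(rate_vs_league, margin=OFFENSIVE_MARGIN):
    """Marginal runs per out, relative to a league-average batter's."""
    return (rate_vs_league - margin) / (1 - margin)


replacement_share = share_of_average_batter(.73)
# OW% implied by a 73% run rate with a Pythagorean exponent of 2
replacement_ow = .73 ** 2 / (1 + .73 ** 2)
print(round(replacement_share, 2), round(replacement_ow, 3))
```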

Bill could use his system, and get Loss Shares, and have the whole thing balance out all right in the end. But to do it, you would have to accept negative loss shares for some players, just as you would have to accept negative win shares for some players. Since there are few players who get negative wins, and they rarely have much playing time, you can ignore them and get away with it for the most part. But in the James system, you could not just wipe out all of the negative loss shares. Any hitter who performed at greater than 152% of the league average would wind up with them, and there are (relatively) a lot of hitters who create seven runs a game.

James writes in the book that with Win Shares, he has recognized that Pete Palmer was right after all in saying that using linear methods to evaluate players would result in only “limited distortions”. And it’s true that a linear method involves distortions, because when you add a player to a team, he changes the linear weights of the team. This is why Theoretical Team approaches are sometimes used. But the difference between the Palmer system and the James system is that Palmer takes one member of the team, isolates him, and evaluates him. James takes the entire team.

So while individual players vary far more in their performance than teams, they are still just a part of the team. Barry Bonds changes the linear weight values of his team, no doubt; but the difference might only be five or ten runs. Significant? Yes. Crippling to the system? Probably not. But when you take a team, particularly an unusually good or bad team, and use a linear method on the entire team, you have much bigger distortions.

Take the 1962 Mets. They scored 617 and allowed 948, in a league where the average was 726. Win Shares’ W% estimator tells me they should be (617-948+726)/(2*726) = .272. Pythagoras tells us they should be .304. That’s a difference of 5 wins. WS proceeds as if this team will win 5 fewer games than it probably will. Bonds’ LW estimate may be off by 1 win, but that is for him only. It does not distort the rest of the players (they cause their own smaller distortions themselves, but the error does not compound). Win Shares takes the linear distortion and thrusts it onto the whole team.
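
Here is a small Python sketch of the linear estimator, written so that the two margins are visible. Splitting it into marginal offense (R - .52*Lg) and marginal defense (1.52*Lg - RA) is my reading of how the estimator arises; the sum reduces to the (R - RA + Lg)/(2*Lg) form used above. The .304 is taken as given from the text rather than recomputed, and 160 decisions is an assumption used only to turn the gap into wins.

```python
def linear_wpct(runs, runs_allowed, lg_runs, off_margin=.52, def_margin=1.52):
    """Linear W% estimate built from marginal runs on offense and defense."""
    marginal_offense = runs - off_margin * lg_runs
    marginal_defense = def_margin * lg_runs - runs_allowed
    return (marginal_offense + marginal_defense) / (2 * lg_runs)


mets_linear = linear_wpct(617, 948, 726)
mets_pythagorean = .304          # figure quoted in the text
decisions = 160                  # assumed number of decisions
gap_in_wins = (mets_pythagorean - mets_linear) * decisions
print(round(mets_linear, 3), round(gap_in_wins, 1))
```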

Finally, the defensive margin of 152% corresponds to a W% of about .300, compared to .213 for the offense. The only possible cutoffs which would produce equal percentages are .618/1.618 (the golden ratio). That is not to say that they are right, because Bill is trying to make margins that work out in a linear system, but we like to think of 2 runs and 5 allowed as being equal to the complement of 5 runs and 2 allowed. In Win Shares, this is not the case. And it could be another reason why pitchers seem to rate too low with respect to batters (and our expectations).
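
The .618/1.618 pair can be verified numerically. The Python sketch below assumes a Pythagorean exponent of 2 and assumes that the two margins must stay exactly 1.00 apart, as 52% and 152% are; under those assumptions the offensive margin m has to satisfy m*(m + 1) = 1.

```python
import math


def offense_baseline(margin, exponent=2):
    """W% of a team scoring at `margin` of league with average defense."""
    return margin ** exponent / (1 + margin ** exponent)


def defense_baseline(margin, exponent=2):
    """W% of a team allowing `margin` of league with average offense."""
    return 1 / (1 + margin ** exponent)


m = (math.sqrt(5) - 1) / 2       # positive root of m*(m + 1) = 1
print(round(m, 3), round(m + 1, 3))
print(round(offense_baseline(m), 3), round(defense_baseline(m + 1), 3))
# The margins actually used in Win Shares give two different baselines
print(round(offense_baseline(.52), 3), round(defense_baseline(1.52), 3))
```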

Finally, one little nit-picky thing; why do expected putouts by second basemen and shortstops go up as walks go up? Obviously, more walks means more runners on first who may be put out at second on fielder’s choices, or steal attempts, or double plays, but so do singles and hit batters. Am I missing something really obvious here?

6 comments:

  1. Really nice exposition, Patriot.

  2. I didn't really tackle the fielding data. I don't feel qualified, either, but I am also hoping that someday Michael Humphreys will publish his DRA system so I can plug it into Win Shares. But not having negative fielding Win Shares is a huge issue.

    I agree that Win Shares aren't complete without "Loss Shares." I've spent a lot of time researching that and came up with a system that does exactly what you say: it sometimes results in negative Loss Shares for extraordinary players.

    It's counterintuitive, but it does work logically. That is the basis for "expected Win Shares" that we publish at the Hardball Times. Here are a couple of articles I printed at baseball graphs about it...

    Loss Shares (with a comment by you!), and
    The Win Shares Baseline (which I've since renamed expected Win Shares).

    I've refined the system for expected Win Shares since I first published this article, and I plan to publish it on baseball graphs soon.

    Expected Win Shares are really powerful. You can come up with a Win Shares rate stat (win shares percentage, or wsp) and calculate Win Shares above/below expected. You can also use them as a baseline for establishing replacement levels for individuals.

    To his credit, James realized that something like "Loss Shares" was required, but I think he just burnt out on working on it. I think he has plans to republish something, but I don't know if he's actively working on it.

  3. Am likely missing something obvious, but how did you get 4314 for Atl's BIP? I keep getting 3*1455-1036 = 3329

  4. You're right, that was my mistake. I'd have to go back through the series to try to figure out where the 4314 came from, but I assume it was a proper BIP estimate (like the one from DER) rather than James' quick 3*IP-K.

    Replies
    1. I take it from the seventh part that that won't be anytime soon (especially since this has been some time since the original post)! What is a good alternative to the quick version? would (H-HR+3IP-K-.5DP+.75E) be valid?

    2. Earlier in the Win Shares system, James uses DER with a denominator (BIP) of BF - HR - W - K - HB. Reading back through this, I don't think there's anything wrong with James using 3*IP-K in this instance, since he appears to be using it as an estimate of outs in play rather than balls in play. But he shouldn't have called it balls in play when it is actually outs in play, and I shouldn't have accidentally used the wrong number in this post.

