Wednesday, September 27, 2006

Evaluating Pitcher W%, Pt. 2

As discussed in the first part of this series, an overlooked aspect of the traditional NW%/WAT approach is that it makes certain assumptions about how a team achieves its winning percentage (namely, all through the efforts of non-pitchers). So why not attempt to improve our methodology by using a more realistic model of W% causation?

I should note at this point that a lot of the ideas I am going to discuss were first published by Rob Wood in the August, 1999 edition of By the Numbers (see link to BTN archives on the right side of the page). While my results may not exactly match his, and my explanation is my own, it would be disingenuous to not acknowledge that he did this stuff first.

For any given team, our best assumption will be that their offense and defense are equally responsible for the team’s deviation from .500. Certainly this assumption will be wrong in some cases, worse than using the Oliver or Deane assumptions. But it will be correct more often, and the overall error introduced by this assumption will be less than for the others.

To keep things simple, I will assume that all defense is pitching. This is an obviously faulty assumption, but it will keep things workable, and again, while this assumption will not always hold, it is better to assume that all defense is pitching than to assume that all deviation from .500 is the product of the offense. If one wanted to be even more precise than we are going to be, one could introduce a correction for this.

Suppose we have a pitcher working for a team with Mate .540, who goes 15-10 (.600 W%). His NW% and WAT under Oliver are .560 (.600 - .540 + .500) and +1.5. Under the Deane method, they are .565 and +1.63.

However, we are now going to assume that this team has pulled itself away from .500 through equal efforts by the offense and the pitching (excluding the pitcher in question of course, since we have removed his decisions from the rest of the team’s when calculating Mate). Using the Pythagorean theory, we can write this equation:
Mate = x^2/(x^2+(1/x)^2)

x is the ratio to the league average at which the team must score runs (and, inversely, allow them) in order to achieve a given W%. x can be solved for:
x = (Mate/(1-Mate))^.25
Thus, we expect a .540 team to score runs at 104.1% of the league average and allow runs at 96.1%. Once we know this, we can calculate the W% that we would expect the team to have given only its non-average offense (since we are assuming that all defense is pitching, and the other pitchers are irrelevant when evaluating our pitcher). That W% is x^2/(x^2+1), in this case .520.
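For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two steps above, assuming a Pythagorean exponent of 2. The function names are my own labels for illustration, not anything from Wood's article.

```python
def team_run_ratio(mate: float) -> float:
    """Runs-scored ratio x (relative to league average) for a balanced team.

    Solves Mate = x^2 / (x^2 + (1/x)^2) for x, i.e. x = (Mate/(1-Mate))^.25.
    """
    return (mate / (1.0 - mate)) ** 0.25


def offense_only_wpct(mate: float) -> float:
    """W% expected from the team's offense paired with league-average pitching."""
    x = team_run_ratio(mate)
    return x ** 2 / (x ** 2 + 1.0)


x = team_run_ratio(0.540)
print(f"runs scored ratio:  {x:.3f}")      # roughly 1.041
print(f"runs allowed ratio: {1 / x:.3f}")  # roughly 0.961
print(f"offense-only W%:    {offense_only_wpct(0.540):.3f}")  # roughly 0.520
```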

We can also solve for the implied runs allowed ratio of our pitcher. He achieved a W% of .600 on a team with an offense that scored at 104.1% of average, so:
W% = x^2/(x^2 + y^2)
.6 = 1.041^2/(1.041^2 + y^2)

y can be solved for as:
y = x*sqrt((1-W%)/W%)
Or in this case, y = .85.

Now we know that by achieving a .600 W% for a team of this caliber, the pitcher’s performance was equivalent to allowing runs at 85% of the league average. To calculate his Neutral W%, we put him on a team with an average offense, and find that 1/(1+.85^2) = .581. That makes his WAT +2.03--significantly different than the Oliver and Deane estimates, because they (largely) assume that only the offense has caused the team to rise above .500, whereas we are assuming that it is a joint and balanced effort between the offense and the other members of the pitching staff.
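The pitcher's side of the calculation can be sketched the same way. This is only an illustration under the assumptions already stated (exponent of 2, balanced team); the helper names are hypothetical, and WAT is taken as (NW% - .500) times decisions, which is what the worked numbers imply.

```python
import math


def implied_runs_allowed_ratio(wpct: float, offense_ratio: float) -> float:
    """Solve W% = x^2 / (x^2 + y^2) for y, the pitcher's runs-allowed ratio."""
    return offense_ratio * math.sqrt((1.0 - wpct) / wpct)


def neutral_wpct(runs_allowed_ratio: float) -> float:
    """W% of a pitcher allowing runs at ratio y in front of an average offense."""
    return 1.0 / (1.0 + runs_allowed_ratio ** 2)


wins, losses, mate = 15, 10, 0.540
decisions = wins + losses

x = (mate / (1.0 - mate)) ** 0.25                     # team offense ratio
y = implied_runs_allowed_ratio(wins / decisions, x)   # roughly 0.85
nw = neutral_wpct(y)                                  # roughly 0.581
wat = (nw - 0.5) * decisions                          # assumed WAT definition

print(f"y = {y:.3f}, NW% = {nw:.3f}, WAT = {wat:+.2f}")
```

Carrying full precision gives a WAT a hair above +2.0; the +2.03 figure comes from rounding NW% to .581 before multiplying by 25 decisions.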

We can generalize this for non-2 exponents as x = (Mate/(1-Mate))^(1/(2z)), y = x*((1-W%)/W%)^(1/z), and NW% = 1/(1 + y^z), where z is the exponent we are using (with z = 2, the exponent on x reduces to the .25 used above). But is any of this really necessary?
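As a rough check on the general-exponent version, here is a small Python sketch. The exponent on x is written as 1/(2z) so that z = 2 reproduces the .25 exponent from the worked example; the function name and the 1.83 test value are just illustrative choices of mine.

```python
def pythagorean_neutral_wpct(wpct: float, mate: float, z: float = 2.0):
    """Return (x, y, NW%) for a pitcher with the given W% and Mate.

    z is the Pythagorean exponent. The team is assumed to be balanced:
    it scores at x and allows at 1/x times the league average.
    """
    x = (mate / (1.0 - mate)) ** (1.0 / (2.0 * z))  # team offense ratio
    y = x * ((1.0 - wpct) / wpct) ** (1.0 / z)      # pitcher's implied ratio
    nw = 1.0 / (1.0 + y ** z)                       # W% with an average offense
    return x, y, nw


for z in (2.0, 1.83):
    x, y, nw = pythagorean_neutral_wpct(0.600, 0.540, z)
    print(f"z = {z}: x = {x:.3f}, y = {y:.3f}, NW% = {nw:.3f}")
```

One thing the sketch makes visible: under the balanced assumption, x^z always equals sqrt(Mate/(1-Mate)), so the exponent changes the intermediate ratios x and y but not the resulting NW%.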

We found that a balanced .540 team would allow a .500 pitcher to be a .520 pitcher. If we simply calculate NW% for our pitcher as .600-.520+.500, as we did for Oliver, we find .580--pretty much equivalent to our convoluted Pythagorean approach. Not only that, but .520 is also equal to the average of .540 and .500. So can we just use this kind of approximation?

While I am a strong advocate of using methods that are theoretically sound across as many potential contexts as possible, practically we only care about the real range of major league teams, which I’ll just assume for modern times is bounded by .250 and .750. If we find our expected W% for a .250 team with only the offense, it is .366. The average of .5 and .25 is .375, a difference of less than 3%. So it seems pretty safe to use this simplified assumption, and not screw around with all of the Pythagorean calculations. I should also note that a further error is introduced when you eschew the calculation of the pitcher’s NW% through the Pythagorean approach as well. So there are two sources of error: 1) estimating the comparison level as the midway point between Mate and .500, and 2) estimating that the pitcher’s NW% will be the same linear distance from .500 as his W% is from the estimate in 1).
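To get a feel for how the first source of error behaves across that range, here is a short Python sketch comparing the Pythagorean offense-only W% (exponent of 2 assumed) with the simple midpoint of Mate and .500. The function names and the particular Mate values sampled are my own choices.

```python
def offense_only_wpct(mate: float) -> float:
    """Pythagorean W% from the offense alone, for a balanced team (exponent 2)."""
    x_squared = (mate / (1.0 - mate)) ** 0.5
    return x_squared / (x_squared + 1.0)


def midpoint_wpct(mate: float) -> float:
    """Shortcut comparison level: halfway between Mate and .500."""
    return (mate + 0.5) / 2.0


print("Mate   Pyth   Mid    Diff")
for mate in (0.250, 0.350, 0.450, 0.540, 0.650, 0.750):
    exact = offense_only_wpct(mate)
    approx = midpoint_wpct(mate)
    print(f"{mate:.3f}  {exact:.3f}  {approx:.3f}  {approx - exact:+.3f}")
```

The gap is widest at the extremes of the range and shrinks to nothing as Mate approaches .500.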

Again, I should note that this is the conclusion that Rob Wood came to, and it is also the same as Tango Tiger’s quick and dirty method, linked below, which he posted in response to the first installment of this series. (As an aside, that is the problem with doing series in installments as I am wont to do, and at the same time having a few smart people read it. They figure out what you are doing, or what you should be doing, before you post it. I should either scrap the installment approach or not allow any readers.) So, to summarize the quick approach:
NW% = W% - (Mate + .5)/2 + .5
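The quick approach is simple enough to write as a pair of one-line Python functions. This is only a sketch: the names are hypothetical, and WAT is again assumed to be (NW% - .500) times decisions.

```python
def quick_neutral_wpct(wpct: float, mate: float) -> float:
    """Quick approach: compare the pitcher to the midpoint of Mate and .500."""
    return wpct - (mate + 0.5) / 2.0 + 0.5


def quick_wat(wins: int, losses: int, mate: float) -> float:
    """Wins Above Team implied by the quick NW% (assumed definition)."""
    decisions = wins + losses
    return (quick_neutral_wpct(wins / decisions, mate) - 0.5) * decisions


# The 15-10 pitcher on a .540 Mate team from the worked example
print(f"NW% = {quick_neutral_wpct(15 / 25, 0.540):.3f}")  # roughly .580
print(f"WAT = {quick_wat(15, 10, 0.540):+.2f}")           # roughly +2.0
```

For this pitcher the shortcut lands within about a point of the .581 from the full Pythagorean calculation.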

How does this kind of approach change the standing of the historical pitchers discussed in the first installment? Well, Red Ruffing now has a NW% of .521 and +10.5 WAT. That is still not in the same league as many other Hall of Fame pitchers, but it is a long way from the conclusion that he was a true sub-.500 pitcher. Steve Carlton in 1972 now has a NW% of .846 and +12.8 WAT. Interestingly, he does better under this approach than under the Deane approach, probably because the Deane approach looks at the percentage of possible improvement. But in Carlton’s case, the offense is, at least by our assumptions, so bad that even an otherworldly performance can only do so much to raise the team’s fortunes.

In a third and final installment, which I promise will be posted by the end of the decade, I will look at replacement level and ask the question “Why even compare to Mate at all?”, and perhaps throw in some other odds and ends.

Rob Wood in Aug 99 BTN(pdf)

Tango's blog entry