Tuesday, January 09, 2007

Career WAT Data, Intro

This post introduces a series presenting Wins Above Team statistics for every Hall of Fame pitcher who worked primarily in the twentieth century (excluding those who were primarily relievers, so Eck is here but Sutter, Fingers, etc. are not), along with some other assorted pitchers that I have chosen. Some of these pitchers were chosen for idiosyncratic reasons; the list should not be construed as an attempt to identify the top 100 pitchers or anything of the sort, just those I was interested enough in to run the numbers for. The series will be broken down into historical periods roughly corresponding to those used to divide post-1900 baseball history in the Neft & Cohen Sports Encyclopedia: Baseball. I have split the modern era (1973 to the present, as Neft & Cohen divide it) into earlier and more recent pitchers, but rather arbitrarily, as you will see later. For example, Dave Stieb is a “recent” pitcher in that period and Nolan Ryan an “early” one, despite the fact that both of their careers essentially ended in 1993. I didn’t try to divide them precisely into groups; really, the only reason I have divided them at all is so that there is a workable number in each installment.

I am doing this not because I feel that Wins Above Team is a particularly important statistic. I would definitely consider measures based on runs allowed, earned runs allowed, or even estimated runs allowed to be the primary way to assess the value of pitchers past and present. However, won-loss records can be a decent measure if they are interpreted properly, and they may provide some insight. Plus, W-L records will always be a primary part of the discussion when non-sabermetricians evaluate pitchers, so we may as well be able to use them in the most sabermetrically sound way.

I believe that the figures that are often published, based either on comparing a pitcher’s record to his teammates’ W-L record (his team’s record with his own decisions removed) or on Bill Deane’s modification used in Total Baseball, are not the most logically sound way to evaluate W-L records. I discussed this in a three-part series and will not rehash those posts here, except to give a brief explanation of the method I prefer.

Simply comparing a pitcher’s W% to that of his teammates, which I’ll call Mate as Rob Wood does, implicitly assumes that his team’s deviation from .500 is solely the product of the support he received, with none of it due to the other pitchers. After all, if a staff had Greg Maddux, John Smoltz, Tom Glavine, and Denny Neagle, what shame would it be if the fifth starter had a worse record than Mate, which would largely be composed of the records of the four aces? The point of considering his team’s record when evaluating a pitcher’s won-loss record is to account for the support that he received. Having Denny Neagle on the staff does not make it any easier for the #5 starter to win a game, so why compare to him?

Of course, when the standard of comparison is Mate, we cannot completely remove the influence of other pitchers, because without additional data we do not know how much of Mate’s deviation from .500 is due to pitching, offense, or any other facet of the game, and bringing in that data would defeat the purpose. But we can make the assumption that will be most accurate in more situations than any other.

That assumption is that the deviation of Mate from .500 is equally a result of offensive and defensive (or, to keep it simple, pitching) efforts. Obviously, this will not be true in all or even most cases, and it will sometimes be more incorrect than assuming that the deviation is solely a product of offense, but it will have a lower average degree of error than any other assumption.

For an example of this in action, let’s look at the case of Iron Joe McGinnity, pitching for the 1905 New York Giants. His record was 21-15 (.583), fairly impressive on its face, but his teammates were 84-33 (.718) without him. McGinnity recorded just 24% of the team’s decisions, but 31% of its losses.

The traditional Wins Above Team method will look at the direct comparison between .718 and .583, extrapolate it over McGinnity’s 36 decisions, and proclaim that he was 4.8 wins worse than an average pitcher would have been. Bill Deane’s modified method would account for the fact that it is hard to improve on a .718 mark, with just .282 potential wins to improve, and instead estimate that McGinnity was -3.4 wins.
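The traditional figure is easy to check. Here is a minimal Python sketch of that direct comparison; the function and argument names are my own labels for illustration, and only the traditional calculation is shown, since the exact form of Deane's adjustment is not spelled out in this post.

```python
def traditional_wat(w, l, mate_w, mate_l):
    """Direct comparison of a pitcher's W% to his teammates' W% (Mate),
    extrapolated over the pitcher's own decisions."""
    decisions = w + l
    w_pct = w / decisions
    mate = mate_w / (mate_w + mate_l)  # teammates' record, pitcher's decisions removed
    return (w_pct - mate) * decisions


# McGinnity, 1905 Giants: 21-15, teammates 84-33
print(round(traditional_wat(21, 15, 84, 33), 1))  # -4.8
```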

My method, which is essentially the same as those that have been proposed by Rob Wood and Tangotiger (with Wood being the formative influence in my thinking on this matter), will assume that the .218 extra wins/game compared to average are half the result of offense and half of pitching, and will therefore credit .109 wins to the offense. Therefore, an average pitcher coupled with this offense would record a .609 W%. Comparing McGinnity’s .583 to this lower standard, we conclude that he was only .9 wins worse than an average pitcher.
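The half-and-half split can be sketched in Python as follows. The constant and function names are hypothetical, chosen only for this illustration; the one substantive assumption is the one stated above, that half of Mate's deviation from .500 is credited to the offense.

```python
OFFENSE_SHARE = 0.5  # assumed share of Mate's deviation from .500 credited to offense


def expected_w_pct(baseline, mate):
    """W% expected with this team from a pitcher who would post `baseline` on a .500 team."""
    return baseline + (mate - 0.5) * OFFENSE_SHARE


w, l = 21, 15                  # McGinnity, 1905
mate = 84 / (84 + 33)          # teammates' W%, about .718
average_here = expected_w_pct(0.5, mate)        # about .609
wins_vs_average = (w / (w + l) - average_here) * (w + l)

print(round(average_here, 3), round(wins_vs_average, 1))  # 0.609 -0.9
```

Passing a different baseline gives the expected W% for any other quality of pitcher on the same team.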

Continuing along the same logic, I can also compare to a replacement level pitcher (the standard Oliver and Deane approaches can do this as well, although their creators did not go down this path). I assume that a replacement level pitcher is a .390 pitcher on a .500 team, and conclude that he would be a .499 pitcher with this Giants team. Comparing McGinnity’s .583 to the .499 replacement, we conclude that he was 3 wins better than replacement.

The formulas for the Wood-inspired methods are:
NW% = W% - Mate/2 + .25
WAT = (NW% - .5)*(W+L)
WCR = (NW% - .39)*(W+L)
NW = NW%*(W+L)
NL = W + L - NW

Where NW% is Neutral W%, the W% we would expect for this pitcher on a .500 team; WAT is Wins Above Team, the number of wins over what a .500 pitcher on a .500 team would have won with this team; and WCR is Wins Compared to Replacement, the number of wins above and beyond those of a hypothetical replacement level pitcher, who is assumed to be a .390 pitcher on a .500 team. NW and NL are Neutral Wins and Neutral Losses; these are the win/loss totals a pitcher would have had if he recorded the same number of decisions that he actually did, but won them at his Neutral W% rather than his actual percentage.
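For anyone who wants to run these numbers themselves, here is a short Python sketch of the formulas. The names NeutralRecord and neutral_record are my own, made up for this example, and the replacement level is the .390 figure described in the text.

```python
from typing import NamedTuple

REPLACEMENT_W_PCT = 0.390  # replacement-level W% on a .500 team, per the text


class NeutralRecord(NamedTuple):
    nw_pct: float  # Neutral W%
    wat: float     # Wins Above Team
    wcr: float     # Wins Compared to Replacement
    nw: float      # Neutral Wins
    nl: float      # Neutral Losses


def neutral_record(w, l, mate):
    """Apply the Wood-inspired formulas to a W-L record and the teammates' W% (Mate)."""
    decisions = w + l
    nw_pct = w / decisions - mate / 2 + 0.25
    wat = (nw_pct - 0.5) * decisions
    wcr = (nw_pct - REPLACEMENT_W_PCT) * decisions
    nw = nw_pct * decisions
    nl = decisions - nw
    return NeutralRecord(nw_pct, wat, wcr, nw, nl)


rec = neutral_record(21, 15, 84 / 117)  # McGinnity, 1905
print(f"NW% {rec.nw_pct:.3f}  WAT {rec.wat:+.1f}  WCR {rec.wcr:+.1f}  "
      f"NW-NL {rec.nw:.1f}-{rec.nl:.1f}")
```

For McGinnity's 1905 season this gives a Neutral W% of about .474, a WAT of about -0.9, a WCR of about +3.0, and a neutral record of roughly 17-19, matching the worked example above.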

Next time, I will begin with a look at pitchers primarily active in the 1900-1919 period.
