Monday, September 03, 2007

Early NL Series: Pitchers and Teams

We have looked at batter evaluation; now we need to tackle pitchers. Or we could run away whimpering and go hide in the closet. This at times seems like a very appealing option when you have to deal with the problems of evaluating early (in baseball terms) nineteenth century pitchers.

First, throw win-loss records out the window. Many pitchers are throwing most of their team’s innings, making them almost solely responsible for the team’s W% (insofar as the pitching staff can control it), which makes it impossible to compare their win-loss records to those of the team. Of course, even if that were feasible, it would not be preferable. I am an advocate of looking at win-loss records intelligently, but only on a multi-season level (and still with many caveats), and we want to be able to evaluate single seasons.

What about Run Average? Well, there are so many errors that even a die-hard RA supporter like me starts to get queasy about using it. Not to mention that fielding was a much bigger factor in the game than it is today. In modern times, when we evaluate a pitcher solely based on RA, we may overstate his value a bit--after all, some of the credit is due to the fielders. But in this time, we will be wildly overstating his value.

As for ERA, I oppose the practice of “reconstructing innings” when there is one error per game, as I think it muddies things up more than it clarifies them, by wiping out real events given up by the pitcher just because they were preceded by an error. How much worse then would it be when errors infest the game? So many innings were reconstructed to produce earned runs that we could be losing tons of information.

How about an Estimated RA, most often exemplified by Bill James’ Component ERA? It will allow us to use expected errors in place of actual errors, but it still will not alleviate the problem of over-crediting the pitcher.

So then we come to DIPS. But we quickly realize that the three true outcomes are a misnomer in this time; most home runs are actually in play. We can’t set them aside, and even if we did, they would have very little effect on the evaluation as there were only 533 home runs hit over the eight seasons in question, just 1.2% of all hits. And strikeouts and walks make up just 10% of all plate appearances, compared to, just picking a year out of the air, 17.4% in the 1939 NL. That doesn’t leave us with a large base to evaluate on.

And who’s to say that hits/ball in play were as bunched together as they are now? Given the greater importance of fielding, they may not have been. On the other hand, pitchers are throwing underhand from 45 feet, needing a large number of strikes to retire the batter, and without the wide variety of pitches we see today. So maybe we should expect all hurlers to be about equal in that regard.

If we try to test year-to-year correlations, though, we will have a lot of issues. Most obvious is the small sample size. Of course selective sampling is ever-present. We can’t really compare a pitcher to his teammates, since he often makes up such a large share of the team total. Even if we were able to and found a low year-to-year correlation on H/BIP, that would not necessarily render it meaningless or make a DIPS-like approach appropriate for evaluating value retrospectively.

So what to do? I have decided to go an extremely lazy route, and to evaluate pitchers based on Wins Above Average, based on runs allowed, except I have multiplied that result by the league percentage of earned runs for the given season. This is a terrible hodge-podge of an approach, as I don’t think that earned runs are particularly useful. But you have to account in some way for the fact that fielding was a much larger share of defense than it is today, and my guess would not be any more legitimate than the percentage of earned runs.

I have also chosen average as a baseline, rather than replacement, because I have decided to evaluate the pitchers first as hitters, versus a replacement level hitter, and then add on their pitching value. This is what I would do today for a position player, at least for simplicity’s sake (there are many knocks against offensive positional adjustments and I don’t disagree): first find the batter’s offensive value versus an average player at his position, and then add in the runs he saved above an average fielder at his position.

For an example of how this works, let’s take a look at 1876 Hartford’s #2 pitcher, Candy Cummings of supposed curveball invention fame (the Dark Blues played 69 games; Tommy Bond started and completed 45 of them, Cummings 25). The Dark Blues’ context is 10 total RPG (both teams combined), so we would expect Candy to allow half of that, 5 runs per 9 innings. In fact, he allowed 97 runs in 216 innings for a 4.04 RA. This means that his Adjusted RA, which I will use pretty extensively instead of the actual figure, was 100*5/4.04 = 124.

In 1876, 39.6% of the league runs were earned. Cummings’ RAA is therefore (5-4.04)*.396*216/9 = +9.12. Converting this to WAA is as simple as dividing by 10, and so he is +.91 wins. If you want formulas:
ARA = 100*(RPG/2)/RA = 50*RPG/RA
RAA = (RPG/2 - RA)*LgER%*IP/9
WAA = RAA/RPG
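
The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the Cummings example; the function and variable names are made up for this sketch, and it takes the rounded 4.04 RA as input so that the results line up with the figures quoted above.

```python
def adjusted_ra(rpg, ra):
    """Adjusted RA: expected runs allowed (half of total RPG) relative to actual RA, times 100."""
    return 100 * (rpg / 2) / ra


def runs_above_average(rpg, ra, lg_er_pct, ip):
    """Runs saved versus average, scaled down by the league share of earned runs."""
    return (rpg / 2 - ra) * lg_er_pct * ip / 9


def wins_above_average(raa, rpg):
    """Convert runs to wins by dividing by the total runs per game in the context."""
    return raa / rpg


# Inputs for Candy Cummings, 1876, as given in the text
rpg = 10.0          # total runs per game in the Hartford context
ra = 4.04           # rounded RA (97 runs in 216 innings)
lg_er_pct = 0.396   # league share of runs that were earned
ip = 216.0

ara = adjusted_ra(rpg, ra)
raa = runs_above_average(rpg, ra, lg_er_pct, ip)
waa = wins_above_average(raa, rpg)

print(round(ara), round(raa, 2), round(waa, 2))
```

Feeding in the unrounded RA (97*9/216) instead of 4.04 would shift RAA slightly, which is just rounding and not a different method.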

Cummings was a “replacement-level hitter” for a pitcher (the phrase “replacement-level hitter” is silly, since what we really mean by a replacement level player is one who is replacement level in his total value; however, it is a necessary evil when one has taken the flawed offensive positional adjustment path), with +.01 WAR, so his total value is +.92 WAR.

As for teams, things are much more straightforward. Expected Winning Percentage, based on runs and runs allowed, can be calculated using Pythagenpat. I have also figured Predicted W% (PW%), based on Base Runs and Base Runs Allowed; this is a little shakier, since the Base Runs formula isn’t particularly accurate, and many of the defensive components had to be estimated. To find PW%, just use BsR and BsRA as you would R and RA. I have included those figures here, but I wouldn’t put much stock in them for this era.
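
For anyone who has not seen Pythagenpat written out, here is a rough Python sketch. It assumes the usual form, in which the Pythagorean exponent is total runs per game raised to a small power; the 0.29 default is an assumption on my part (values around .28 to .29 are commonly quoted), not a figure taken from this post, and the team totals in the usage lines are invented purely for illustration.

```python
def pythagenpat_wpct(runs, runs_allowed, games, z=0.29):
    """Estimate W% from runs scored and allowed.

    The exponent x floats with the run environment: x = RPG ** z,
    where RPG is total runs per game (both teams). z=0.29 is an
    assumed default, not a value stated in the text.
    """
    rpg = (runs + runs_allowed) / games
    x = rpg ** z
    return runs ** x / (runs ** x + runs_allowed ** x)


# Hypothetical team totals, for illustration only
ew_pct = pythagenpat_wpct(runs=400, runs_allowed=300, games=70)

# PW% uses the same function, with Base Runs estimates in place of actual runs
pw_pct = pythagenpat_wpct(runs=380, runs_allowed=310, games=70)

print(round(ew_pct, 3), round(pw_pct, 3))
```

The appeal of a floating exponent for this era is that run environments were far higher than in the modern game, so a fixed exponent of 2 would be a poor fit.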
