Monday, February 21, 2011

Comments on Bill James Gold Mine 2010, pt. 1

I quite enjoyed the third edition of the Bill James Gold Mine, even though I didn't get around to reading it until a few months after it was published. It jogged some thoughts, which led to this post; the post is based less on James' essays themselves than on the semi-related paths they sent my mind down. To me, that is one of the tests of a really good sabermetric work--does it get you thinking, even if not about the exact topics covered? James' book passed that test for me.

However, I do think that the book would be stronger if it contained more of James' essays and fewer "statistical nuggets". The nuggets were of less interest to me, and seemed to be present in lesser quantity than they were in the first two editions of the book. The reverse was true for the essays, and those are what compel me to buy the book. Not being a subscriber to Bill James Online, I'm not positive about this, but I believe that James writes a number of additional essays each year that are not included in the book.

If that is indeed the case, I believe that they'd be much better off collecting all of Bill's essays in the Gold Mine, and leaving the nuggets for readers to dredge up themselves online. The website lends itself more to the statistics (the data there is much more extensive than what can be printed in the book, even if the book were the size of one of the old Great American Baseball Stat Books), while the essays lend themselves to the printed page. And if there are any folks out there who still refuse to use the Internet and are interested in James, I'd think they'd be more enticed by the essays. A book of just the essays, with some other filler of some sort, would have a character not unlike that of the 1990-1992 Baseball Books, which I liked very much.

Of course, since it appears that the book is not even being published in 2011, those suggestions are for naught.

I have three subjects to touch on, two of which could be considered critiques and one of which is just a good old-fashioned tangent. This post went a lot longer than I originally intended so it's been broken up into two portions:

1. Starting pitcher rankings

The longest essay in the book deals with a system to rate starting pitchers based on where they place among other starters in each of their league seasons. James first ranks pitchers by Season Score (*), and then assigns points based on the pitcher's standing in the league. Each league season has 5.5 points per team available. In a fourteen team league, the top ranked pitcher gets 12 points, the #2 ranked pitcher gets 11 points, and so on down to the #11 pitcher who gets 2 points. There are also three-point bonuses, up to nine points per season, available for truly historic seasons. The resulting metric is called Strong Season points.

(*) James does not give the formula for Season Score in the article, but explains that it is based on W, L, IP, ERA, K, BB, and SV. "The point of the system is to evaluate a pitcher's record without context"..."This was a way of trying to say 'How good are the numbers themselves?', rather than 'How good was the pitcher who compiled these numbers?'".
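As a rough illustration of the points scale described above, here is a minimal Python sketch. It covers only the fourteen-team case, since that is the only one spelled out here; how the 5.5 points per team are distributed in leagues of other sizes is not something I can reconstruct. The function names are mine, not James', and the criteria for earning a bonus are not modeled, so the number of bonuses is simply passed in.

```python
# Sketch of the 14-team points scale described above.
# Names are illustrative assumptions, not taken from James' essay.

RANKED_SLOTS = 11   # only the top 11 pitchers score in a 14-team league
TOP_POINTS = 12     # points for the #1 ranked pitcher
BONUS_STEP = 3      # each bonus is worth three points
MAX_BONUSES = 3     # at most nine bonus points per season


def points_for_rank(rank: int) -> int:
    """Base points for a league rank: 12 for #1 down to 2 for #11, else 0."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    if rank > RANKED_SLOTS:
        return 0
    return TOP_POINTS - (rank - 1)


def strong_season_points(rank: int, bonuses: int = 0) -> int:
    """Base points plus three per bonus (bonus criteria are not modeled)."""
    if not 0 <= bonuses <= MAX_BONUSES:
        raise ValueError("bonuses must be between 0 and 3")
    return points_for_rank(rank) + BONUS_STEP * bonuses


# The eleven scoring slots should add up to 5.5 points per team: 5.5 * 14 = 77.
total_available = sum(points_for_rank(r) for r in range(1, RANKED_SLOTS + 1))
print(total_available)  # 77
```

The check at the end confirms that 12 + 11 + ... + 2 equals 77, which matches 5.5 points for each of fourteen teams.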

Personally, I'm not sure that I have a whole lot of interest in rankings of pitchers based on a method that deliberately ignores context (and James certainly does not deny the importance of context). Setting my objection aside, though, it seems to me that the Season Score is yet another result of a process that James has repeated over the course of his career: the re-invention of Approximate Value. Of all of his methods, my impression is that there is none that James personally likes more than AV. Even Win Shares is in some respects a return to AV--while it attempts to adjust for everything, it still expresses the result as an integer. The scale is simply higher than that of AV (a 20 AV would be an extraordinary season, while 20 WS is good but ordinary).

And so after attempting to adjust for everything, it seems James still had a void in his own toolkit, and so he filled it with the Season Score.

Digression aside, James found that a career total of 43 Strong Season points marks a fairly clear line for the Hall of Fame in retrospect. Only five pitchers retired for a significant length of time have more than 44 points and are not in the Hall--Vida Blue, Bert Blyleven, Ron Guidry, Carl Mays and Billy Pierce. James says that Blyleven and Guidry (60 points) are the only two pitchers who were far above 43 yet are excluded from Cooperstown. (Blyleven has been elected since James wrote the book and I wrote this post, obviously.)

Since Guidry's is the most surprising result of James' survey, I'll take a closer look at him. I do not intend the discussion about Guidry to be a commentary on his Hall worthiness or even his value, but rather as a means of discussing the issue I have with the strong season method. It is important to note that James does not in any way claim that the strong season method must be used in ranking pitchers, that it is better than any methods X, Y, and Z, or any such thing. James does not argue that Guidry should be in the Hall of Fame because of his showing in the system.

Guidry earned points for six seasons in James' analysis--1977-79, 1982-83, and 1985. Suppose we apply James' method, but use a different metric--a simple Runs Above Replacement, figured using total runs allowed and adjusted for park. How many points would Guidry earn under such a system?

* James ranked Guidry #6 in the AL in 1977, which is worth seven points. I have him #7, worth six points.

* James and I both have Guidry #1 in 1978 with an extraordinary season, worth 12 points plus a bonus (James awards the full nine-point bonus, and I'll do so as well to keep things comparable). Guidry turned in 101 RAR, seventeen more than the next closest pitcher and nine more than any other AL pitcher in any of these six years.

* James had Guidry #3 in 1979 for ten points. I have him second, for eleven points.

* James ranks Guidry #11 in 1982 for two points. I have him all the way down at #26. His RA was 4.22 in a league in which 4.5 runs were scored per game, and he pitched in a moderate pitchers' park (.97 PF). At 34 RAR, he is eleven runs behind the eleventh-place pitcher (Geoff Zahn, 45). Presumably Season Score gives Guidry a boost because of his 14-8 record, one of the most impressive in the league (seventh in the league in Win Points).

* James ranks Guidry #4 in 1983 for nine points; I have him #6 for seven points.

* James ranks Guidry #2 in 1985 for eleven points; I have him #11 for two points. This is another season in which Guidry's W-L record seems to give him a huge season score boost (22-6).

Add it all up, and I have Guidry at 47 points--suddenly not that far above the Hall of Fame line James observed. I followed his scoring method exactly, but the results changed significantly simply by changing ranking methods.
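The tally above can be checked with a short Python sketch. The ranks and the nine-point 1978 bonus are the ones listed in the bullets; the function and variable names are my own, and the scale is the fourteen-team one (12 points for #1 down to 2 for #11, nothing below that).

```python
# Tally Guidry's points under both rankings, using the ranks listed above.
# Names are illustrative assumptions; only the 14-team scale is modeled.

def points_for_rank(rank: int) -> int:
    """12 points for #1 down to 2 points for #11; nothing below #11."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return 13 - rank if rank <= 11 else 0


# League rank by season under each method.
james_ranks = {1977: 6, 1978: 1, 1979: 3, 1982: 11, 1983: 4, 1985: 2}
rar_ranks = {1977: 7, 1978: 1, 1979: 2, 1982: 26, 1983: 6, 1985: 11}

# The same nine-point bonus for 1978 is applied to both lists.
bonus_points = {1978: 9}


def career_points(ranks: dict, bonuses: dict) -> int:
    """Sum base points plus any bonus points over all listed seasons."""
    return sum(points_for_rank(rank) + bonuses.get(year, 0)
               for year, rank in ranks.items())


james_total = career_points(james_ranks, bonus_points)
rar_total = career_points(rar_ranks, bonus_points)
print(james_total, rar_total)  # 60 47
```

The thirteen-point gap comes mostly from 1985 (nine points) and 1982 (two points), the two seasons in which the rankings disagree most.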

More interesting, IMO, is how the use of in-season rank elevates the importance of very small performance differences. In 1979, Guidry ranked second in RAR at 71. However, Tommy John (71) and Jerry Koosman (70) were right behind him. Given that Guidry relied much less on his fielders, I strongly support the notion that he had a better season than the other lefties. Still, negligible differences in actual performance are given much greater impact when one uses a points system like James'.

Another example is 1985, in which Guidry ranks eleventh on my list at 61. Jimmy Key ranks sixth at 62--there are six pitchers within two RAR of each other. Guidry could very easily rank sixth in this season, which would be worth an additional five points. That would vault him from 47 points to 52 points, and give him a great deal more clearance over the HOF line.

This is not to say that James' ranking system is without its strong points with respect to its aims--it values peak performance and it sets an equal total value relative to the size of the league, which depending on one's perspective might be very good properties. My contention is that such a system is very sensitive to small changes in statistics, ones that would have no impact on a career-based evaluation. If Guidry had been evaluated at 62 RAR and thus sixth in 1985, the extra run saved would have zero impact on your evaluation of his career RAR total--and rightfully so. Allowing one run to exert a significant difference in a player's rank on an all-time list strikes me as utterly illogical and unsatisfactory.
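To put numbers on the 1985 example, here is a small Python sketch comparing the size of the performance gap to the size of the points swing. The RAR figures (61 for Guidry, 62 for Key) and the ranks are the ones quoted above; the names are my own, and the scale is the fourteen-team one.

```python
# One run of RAR versus the points swing it can produce in 1985.
# Names are illustrative assumptions; only the 14-team scale is modeled.

def points_for_rank(rank: int) -> int:
    """12 points for #1 down to 2 points for #11; nothing below #11."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return 13 - rank if rank <= 11 else 0


guidry_rar, guidry_rank = 61, 11   # Guidry's 1985 RAR and rank on my list
key_rar, key_rank = 62, 6          # Jimmy Key's 1985 RAR and rank

rar_gap = key_rar - guidry_rar
points_gap = points_for_rank(key_rank) - points_for_rank(guidry_rank)

career_total = 47                  # Guidry's total under the RAR ranking
what_if_total = career_total + points_gap

print(rar_gap, points_gap, what_if_total)  # 1 5 52
```

A one-run difference in a single season moves the career total by five points, while a career RAR total would move by one run.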

You may object and say that I am using RAR rather than Season Score, and that Season Score is not subject to minute differences in performance having a large effect on rank order as is the case for RAR. While it is true that RAR and Season Score are very different methods, and that their application to Guidry might be very different as well, any metric is going to be subject to the same concerns when making a rank order over one season. There is always the potential that a very small margin could be the difference between a batting title and third place, between fifth in the league on a list and out of the top ten. That is true for any metric you want to pick, from BA to home runs to ERA to Season Score to RAR.
