Tuesday, June 14, 2011

Comments on Bill James' “Solid Fool’s Gold”

For the past three seasons, Bill James published an annual book called the Bill James Gold Mine. The book included a sampling of the material available to subscribers of his Bill James Online website, including some unique split data (unique in the sense that it’s not commonly in print on actual pieces of paper), like breakdowns of pitches thrown to left-handed and right-handed batters. There were little boxes with “nuggets” (gold is a theme, needless to say) pointing out various oddities.

Those elements of the book took up a lot of space but were woefully incomplete (they clearly weren’t intended to be complete, but the point is that the book had no utility as a reference). The most interesting aspect of the book was that it reprinted a number of full-length essays that James had written for his website over the course of the year. I generally enjoyed those essays, as you will see if you look back at my comments on the previous editions of the book.

This year, there was no Gold Mine. Instead, ACTA and James released a slimmer, smaller volume entitled Solid Fool’s Gold, which includes only essays. I didn’t bother to count, but I’d guess that the new book has about as many essays as the Gold Mine. It also sells for a lower price, and personally I won’t miss the hodgepodge of charts that much.

If a fairly quick read of non-technical essays by Bill James is something you think you’d enjoy, you’ll probably like Solid Fool’s Gold, unless you already subscribe to Bill James Online. The new format is much better, as it includes a bunch of essays that you could pick up and read in five years rather than some of the more limited shelf-life aspects of the Gold Mine and it does it more cheaply and compactly. I certainly enjoyed it, which you should keep in mind as I now launch into a more critical review of some specific essays in the book.

The essay that has probably gotten the most attention (outside of the much-panned essay on Shakespeare and Topeka, which was published on Slate) is called “Minor League Pyramid”, and it includes James’ outline of a way to reform the minor league structure to make it resemble a pyramid rather than a tube (his description) as it does now. I won’t repeat his argument here, but there are a couple of points that I feel strongly about:

1. I agree with James that talent is choked out of the game by the limited number of openings at the entry level. This is offset somewhat by the existence of college baseball as an alternative means to improve baseball, but scholarship restrictions make it a less reliable means of keeping quality talent engaged than college football or basketball. The scholarship restrictions also leave college baseball as next to useless for attracting low-income players to the game.

2. The proposal that James offers includes limits on the rate at which a prospect can be advanced through the minors, with the bottom line being that a player could not reach the majors until he’d played three seasons in the minors. This is impossible to square with the existence of college baseball, and it also removes the illusion of a meritocracy, which I think is very important, even if it is only an illusion. We all know that teams play service time games, and that there are few twenty-year-olds ready to be major league contributors anyway, but the potential for a wunderkind to reach the majors, even if there are only a handful each year, is something that should be preserved.

3. James doesn’t seem to think his system would have much of an impact on the ability of baseball to attract talented athletes with other options. In fact, he claims that a minor league pyramid would reduce the pressure on signing bonuses. While I’m sure MLB CFOs would like that (as they would like the destruction of college as an alternative path), I can’t fathom how it wouldn’t make MLB a much less attractive option.

He doubles down on this by saying that it would reduce “pressure” on teams to scout internationally to find talent, since they would have a larger supply of homegrown players. I can’t for the life of me understand why that would be considered a positive. The piece does make some good points, but that one is a real head-scratcher.

Another essay in the book is called “Stink-O-Meter”; it discusses a fairly simple method of tracking the persistency of losing for a franchise. The article is good in that it reinforces something that the average fan with no historical perspective constantly needs to be reminded of--the state of even sorry franchises like the Pirates and Royals is nothing like that of the terrible franchises of the game’s past.

The article is way too long, though--James feels compelled to run a chart every few paragraphs listing the top five or ten losing teams at a given moment in time. This works better online than in print, where it just wastes space, but it also helped me to see what I think is a pattern in James’ more recent work. James has realized that it’s more difficult (not just for him, but for anyone) to produce cutting-edge technical work. Rather than introducing more rigorous means of analysis, James has decided to play show-and-tell; his more recent essays are filled with tables that in the past would have been left to the imagination or determination of the reader. I could be off base, but it seems that increased comprehensiveness represents James’ attempt to keep up with the Joneses.

Another essay is a reprinting of a speech James gave called “Battling Expertise with the Power of Ignorance”. It’s a good read, but there was a portion that mentioned Pythagorean record and Runs Created that, shockingly, I can’t help but comment on.

James describes the two methods as the best-known of the “large number of heuristic rules” he developed during what could loosely be defined as his Abstract years. About the Pythagorean theorem, James said: “Later research has demonstrated that it works better still if you modify the exponent for the level of scoring”.

The relationship between the slope of a run-to-win converter and the level of scoring has been known for many years (dating back at least to Pete Palmer), but Clay Davenport’s Pythagenport was the first well-known modification to James’ Pythagorean theorem. Later, this was refined further into Pythagenpat by recognizing that the minimum theoretical exponent was one.
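
The difference between the two refinements is easiest to see at the low end of the scoring range. Here is a minimal sketch, assuming the commonly cited forms of each rule: 1.5*log10(RPG) + 0.45 for Pythagenport and RPG^0.29 for Pythagenpat. The 0.29 power is an assumption (values between roughly 0.28 and 0.29 are in circulation), and the function names are purely illustrative.

```python
import math

def pythagenport_exponent(rpg):
    """Commonly cited Pythagenport form: 1.5*log10(RPG) + 0.45."""
    return 1.5 * math.log10(rpg) + 0.45

def pythagenpat_exponent(rpg, z=0.29):
    """Pythagenpat form: RPG**z. The default z = 0.29 is an assumed value."""
    return rpg ** z

# Compare the two exponents across a range of scoring levels.
# At 1 RPG, Pythagenpat returns exactly 1, while this Pythagenport form
# falls below 1.
for rpg in (1.0, 5.5, 9.0, 10.9):
    port = pythagenport_exponent(rpg)
    pat = pythagenpat_exponent(rpg)
    print(f"RPG {rpg:5.1f}  Pythagenport {port:.2f}  Pythagenpat {pat:.2f}")
```

At 1 RPG every game ends 1-0, so winning percentage must equal runs scored per game, which is what an exponent of one delivers.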

James freely acknowledges the refinements to Pythagorean and in fact has used Pythagenpat in at least one of his own studies. That only makes it all the more strange that he continues to cling to Runs Created, even when RC has been demonstrated to be a less accurate tool than Pythagorean with a fixed exponent. RC is subject to complete meltdown under theoretical extreme conditions; while Pythagorean incorrectly handles the known point of 1 RPG, it correctly imposes a range of [0, 1] on all of its estimates.

Discussing RC, James made no mention of subsequent work on other run estimators. Understand that I am not trying to claim that he had any obligation to do so in the context of this speech--only that his continued attachment to RC, while recognizing other refinements to his original tools, grows more bizarre as time passes. I suppose one could argue that at least the Pythagorean refinements maintain the original R^x/(R^x + RA^x) model, and James always pointed out that an exponent other than two could result in more accurate estimates. In any event, it’s extremely hard for me not to comment on a run estimator when the opportunity arises.

While James recognized the existence of variable exponent refinements to Pythagorean record, he unfortunately missed a golden opportunity to utilize them in another essay in the book. There is an article in which James examines the performance of starting pitchers when supported by X runs--basically, an attempt to examine the mystical phenomenon of “pitching to the score”. I won’t steal his thunder by discussing his conclusions, but I will point out a methodological shortcoming in his approach.

When Whitey Ford’s teams scored one run with him pitching, their record (not Ford’s record) was 10-28. James converts this to an effective rate of runs allowed using Pythagorean math. In this case we know that Ford’s teams scored 38 runs (one for each game), so the equivalent number of runs Ford allowed to produce a Pythagorean record of 10-28 is x in the following equation:

10/(10 + 28) = 38^2/(38^2 + x^2)

This eventually simplifies to sqrt(L/W)*R, or sqrt(28/10)*38 = 63.6. With two runs, Ford’s teams were 19-22, which is equivalent to sqrt(22/19)*82 = 88.2 runs. Adding these up and dividing by the total number of games produces an “Effective Runs Allowed Rate” for Ford in games in which his team scored one or two runs: (63.6 + 88.2)/(38 + 41) = 1.92. Continuing in this manner for scoring three, four, five, … runs (while ignoring shutouts, which are always losses), James has a measure of pitching effectiveness given the level of offensive support on a discrete game-by-game basis.
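
The arithmetic can be sketched in a few lines. This is only an illustration of the calculation as described here, not James’ own code; the function and variable names are made up, and only the one-run and two-run splits quoted above are included.

```python
def equivalent_runs_allowed(wins, losses, runs_scored, exponent=2.0):
    """Solve W/(W+L) = R^x/(R^x + RA^x) for RA, giving RA = R*(L/W)^(1/x)."""
    return runs_scored * (losses / wins) ** (1.0 / exponent)

# (runs of support per game, team wins, team losses) in Ford's starts,
# using only the two splits quoted in the text
ford_splits = [(1, 10, 28), (2, 19, 22)]

total_equivalent_runs = 0.0
total_games = 0
for support, wins, losses in ford_splits:
    games = wins + losses
    runs_scored = support * games  # e.g. 38 games at one run each = 38 runs
    total_equivalent_runs += equivalent_runs_allowed(wins, losses, runs_scored)
    total_games += games

effective_ra_rate = total_equivalent_runs / total_games
print(round(effective_ra_rate, 2))
```

With the default exponent of 2, the two splits reproduce the 63.6 and 88.2 figures; passing a different exponent is all it takes to try a variable-exponent version.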

However, the use of a fixed exponent severely distorts things by essentially assuming an average run scoring environment (an exponent of 2 corresponds to a RPG of around 10.9 using Pythagenpat), when we know the scoring output of one of the teams involved. If one team scored only one run, the expected RPG is going to be lower than average.
We could assume that an average number of runs would be scored by the other team involved in the game, and instead say that the RPG is 5.5, and the Pythagorean exponent should be around 1.64. In that case, the equivalent runs allowed would be (28/10)^(1/1.64)*38 = 71.2 runs, a 12% difference from James’ estimate.
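
A rough sketch of that adjustment follows, assuming the Pythagenpat exponent is RPG^0.29 (the power that matches the 10.9 RPG and 1.64 figures above) and that the opponent scores an average of 4.5 runs. The names are illustrative only.

```python
def pythagenpat_exponent(rpg, z=0.29):
    """Pythagenpat exponent RPG**z; z = 0.29 is an assumed value."""
    return rpg ** z

def equivalent_runs_allowed(wins, losses, runs_scored, exponent):
    """RA that yields the given W-L record under R^x/(R^x + RA^x)."""
    return runs_scored * (losses / wins) ** (1.0 / exponent)

support = 1         # runs scored by Ford's team in each of these games
opponent_avg = 4.5  # assumed average scoring by the other team
rpg = support + opponent_avg

exponent = pythagenpat_exponent(rpg)
fixed = equivalent_runs_allowed(10, 28, 38, 2.0)          # fixed exponent of 2
variable = equivalent_runs_allowed(10, 28, 38, exponent)  # variable exponent

print(round(exponent, 2), round(variable, 1), round(variable / fixed - 1, 2))
```

Carrying the unrounded exponent through the calculation still lands on the same 71.2 runs and 12% gap.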

That approach assumes that we know nothing about the “other” team’s run scoring rate--but of course, we know a great deal about it, because we know the identity of the starting pitcher: Whitey Ford. For his career, Whitey Ford had a Run Average of 3.14 and averaged 6.94 innings/start in a league that averaged about 4.31 runs/game, so we could estimate that his team’s RA is (6.94*3.14 + (9 - 6.94)*4.31)/9 = 3.41, and that the expected RPG for a game in which his offense scores one run is 4.41, producing a Pythagorean exponent of 1.54 and (28/10)^(1/1.54)*38 = 74.2 run equivalent. This new estimate is approximately 17% higher than James’ original estimate.
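
The customized version can be sketched the same way. The blend below simply weights the pitcher’s RA by his innings per start and the league average by the remaining innings, as in the formula above; the 0.29 Pythagenpat power is again an assumption, and the function name is made up.

```python
def blended_team_ra(pitcher_ra, innings_per_start, league_ra):
    """Team RA per nine: starter's RA for his innings, league average for the rest."""
    relief_innings = 9 - innings_per_start
    return (innings_per_start * pitcher_ra + relief_innings * league_ra) / 9

# Ford's career figures as quoted in the text
ford_team_ra = blended_team_ra(pitcher_ra=3.14, innings_per_start=6.94, league_ra=4.31)

support = 1                      # runs scored by Ford's team
rpg = support + ford_team_ra     # expected runs per game, both teams
exponent = rpg ** 0.29           # assumed Pythagenpat power of 0.29

# Equivalent runs allowed for the 10-28 record with 38 runs scored
equivalent_runs = 38 * (28 / 10) ** (1 / exponent)

print(round(ford_team_ra, 2), round(rpg, 2), round(exponent, 2), round(equivalent_runs, 1))
```

Swapping in another pitcher only requires changing the three arguments to blended_team_ra.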

The good news is that when you extend things across the entire spectrum of the run distribution, much of the distortion is canceled out. James presents complete breakouts for several pitchers, but the two of historical interest are Ford and Tom Seaver. Setting aside the adjustments he introduces to smooth the data and restate the effective runs allowed rate on the actual RA scale, let me just run the crude totals for those two under three assumptions: the James approach of a Pythagorean exponent of 2, an assumption that the RPG at each scoring level is 4.5 plus the number of runs the pitcher’s team scored, and the customized type assumption I described for Ford above (Seaver had a career RA of 3.15 and averaged 7.38 innings/start in a league that averaged approximately 4.11 runs per game, resulting in a customized team RA of 3.32 runs/game):

These differences are small, in the neighborhood of 1%, and thus not worth getting too worked up about. However, it’s important to keep in mind that fixed Pythagorean will not work particularly well at the extremes, and it would be a mistake to put a lot of confidence in the isolated application of the fixed exponent Pythagorean estimate to an extreme RPG.
