Monday, May 04, 2015

Counter-Revolution

Continuing the tradition of haphazard “book reviews” appearing on this blog well past the time that such a review would be relevant, I recently read The Sabermetric Revolution by Benjamin Baumer and Andrew Zimbalist and have a few thoughts on the book.

On the whole, I am not a fan of the book. While I am not personally very familiar with Baumer’s work, Zimbalist is a seminal figure in baseball economics (starting over twenty years ago with his Baseball and Billions). Unfortunately, The Sabermetric Revolution is too short (153 pages of prose not counting footnotes) and too unfocused to really showcase the authors’ knowledge.

In many respects it appears that the book was intended to be something of a rejoinder to Moneyball, pointing out areas in which Michael Lewis either played fast and loose with the facts or omitted key details. The preface is clear about this motivation, as the authors write: “This book will attempt to set the record straight on Moneyball and the role of ‘analytics’ in baseball.”

There’s no doubt from reading the book that this is a major goal of the authors, as the first chapter is devoted to “Revisiting Moneyball”. I found some of the criticism to be fair (for example, Lewis’ tendency to gloss over how much the A’s success owed to young talent the team had produced, such as Eric Chavez, Miguel Tejada, Tim Hudson, Mark Mulder, and Barry Zito). Some, though, strikes me as 20/20 hindsight (such as a review of the infamous 2002 amateur draft) or nit-picking (such as the fact that the A’s OBA decreased in 2002 despite Beane’s emphasis on OBA). In other places, I would contend the authors are guilty of some of the same offenses they accuse Lewis of (for example, they state that Lewis gives short shrift to the work of Bill James and other sabermetric pioneers; however, their own discussion of the internet sabermetric community begins at Baseball Prospectus).

The fundamental issue I had with the book is that it is not clear what it is intended to be (aside from a Moneyball response) or who the intended audience is. The book is not detailed enough to serve as a technical introduction to sabermetrics for newcomers (for instance, I’m not sure park factors are ever discussed outside of brief allusions), but neither is it detailed or advanced enough to strongly appeal to the smaller audience of practicing sabermetricians. There is even a chapter on statistical analysis in other sports, a topic on which I am closer to the novice group, but it too is short on details, an omission that is all the more glaring since the baseball material at least gets a quick overview of sabermetric theory.

At the risk of falling into the trap of nit-picking myself, I think listing a number of my issues with the book might be the easiest way to write it up:

* A number of acronyms are expanded incorrectly in the book, some of which surprised me. OPS is said to be an acronym for “Offensive Performance Statistic”; DER an acronym for “Defensive Efficiency Rating”.

* The authors state that the formula for Isolated Power weights doubles and triples equally and is roughly the difference between SLG and BA, “or sometimes” is (D + 2T + 3HR)/AB. While I understand the argument for treating doubles and triples equally in a power metric, Isolated Power is not “sometimes” defined as SLG minus BA (which is algebraically identical to (D + 2T + 3HR)/AB). That formulation has been used in conjunction with the term “Isolated Power” since Branch Rickey linked the two (but did not set them equal) in his 1954 Life magazine article, and it was used in that manner by Bill James. While this or the meaning of the OPS acronym may seem like insignificant details, they suggest something less than a full command of sabermetric history.
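The identity between the two forms of Isolated Power is easy to check. The following is a minimal sketch in Python; the function names and the batting line used in the usage note are hypothetical, chosen only for illustration, and exact fractions are used so that the comparison is not muddied by rounding.

```python
from fractions import Fraction


def iso_from_rates(ab, singles, doubles, triples, homers):
    """Isolated Power computed as SLG minus BA."""
    hits = singles + doubles + triples + homers
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    ba = Fraction(hits, ab)
    slg = Fraction(total_bases, ab)
    return slg - ba


def iso_from_extra_bases(ab, doubles, triples, homers):
    """Isolated Power computed as (D + 2T + 3HR) / AB."""
    return Fraction(doubles + 2 * triples + 3 * homers, ab)
```

For a made-up line of 500 AB with 90 singles, 30 doubles, 5 triples, and 25 home runs, both functions return the same value, because subtracting one base for every hit from total bases leaves exactly D + 2T + 3HR in the numerator.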

* The authors state that in economic terms, WAR measures “marginal physical product” and state that this is a good idea, but are not fans of the methodology used to calculate current WAR implementations. Their concerns include fair ones, such as failure to report error bars and the use of black box methodologies. But while their reasoning behind these criticisms is clearly laid out, they sometimes engage in what might be called “drive-by” criticisms, in which issues are alluded to but not fully fleshed out to the point where the creators and users of these metrics could offer a defense. In this manner, Baumer and Zimbalist reflect the attitude of another “insider” who has criticized replacement-level metrics, Christopher Long.

One such comment is “It is not clear that there exists a pool of replacement players with the productivity that is ascribed to them”. This basically questions the entire concept of replacement level, but is not supported other than with a footnote to cite the work of JC Bradbury. This does nothing to advance the discussion of replacement level, nor does it alert the readers to the well-reasoned and spirited rejoinders sabermetricians have issued to Bradbury’s contentions.

The authors then use a single example to question what is one of the least controversial and most similar steps in any WAR methodology--the run-to-win conversion. The authors simply write: “The use of James’ Pythagorean Expectation to convert runs to wins is less than robust. One need only reflect on the 2012 Baltimore Orioles, who outperformed their expected win total by 11 games, to see how inaccurate the runs to wins conversion can be.”

If I may be impolitic and a bit unhinged for a moment, the authors should be ashamed of themselves for this statement. It is the type of statistically illiterate cherry-picking that one might expect from a Bill Madden rather than from respected professionals familiar with statistical methods. While it is without question true that win estimators (like every other statistical estimator known to man) produce poor estimates in certain individual cases, a reasoned discussion of their error bars does not begin and end with a single poor estimate. Any regression equation presented by Zimbalist in Baseball and Billions or in this work could be easily impugned by similar rhetoric, and likely more effectively given that win estimators are among the more accurate and stable estimates one will find in baseball analytics.

It might also be pointed out that the run-to-win converters actually used in WAR calculations are likely more robust (in the true meaning of the term, rather than denoting a single outlier) than Pythagorean, because they recognize that the shape of the relationship between runs and wins changes as the scoring environment changes. While the authors are surely aware of this, one could never tell from the discussion of run/win estimators in the book, as only Pythagorean constructs with fixed exponents are discussed, with no reference to alternative exponent constructions like Pythagenport/pat or dynamic linear run-to-win estimators.
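To make the distinction between fixed and floating exponents concrete, here is a rough sketch in Python. It is not the converter used by any particular WAR implementation; the function names are hypothetical, and the Pythagenpat constant of 0.287 is an assumption (values in the neighborhood of 0.28 to 0.29 are commonly quoted).

```python
def pythagorean_wpct(runs, runs_allowed, exponent=2.0):
    """Estimated winning percentage: R^x / (R^x + RA^x).

    With the default exponent of 2 this is the classic fixed-exponent form.
    """
    rx = runs ** exponent
    return rx / (rx + runs_allowed ** exponent)


def pythagenpat_exponent(runs, runs_allowed, games, z=0.287):
    """Pythagenpat exponent: (runs per game, both teams) ** z.

    z = 0.287 is an assumed, commonly quoted value, not a canonical constant.
    """
    runs_per_game = (runs + runs_allowed) / games
    return runs_per_game ** z


def pythagenpat_wpct(runs, runs_allowed, games, z=0.287):
    """Pythagorean estimate using the scoring-dependent Pythagenpat exponent."""
    x = pythagenpat_exponent(runs, runs_allowed, games, z)
    return pythagorean_wpct(runs, runs_allowed, exponent=x)
```

For a hypothetical team that scores 750 runs and allows 700 over 162 games, the floating exponent comes out somewhat below 2, and it falls further as the scoring environment drops. That is the sense in which the exponent tracks the changing shape of the run/win relationship rather than assuming one shape for every context.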

* My sense, and it may be unfair, from reading the book, is that Baumer and Zimbalist are eager to emphasize areas and issues in which sabermetric findings have been wrong and/or incomplete. An example is the discussion on sacrifice bunts, which points out that the initial sabermetric analysis (they do not reference Palmer and Thorn by name in this section, but The Hidden Game of Baseball is the usual source of the classical argument) was incomplete in not considering the other outcomes that may occur on a sacrifice bunt attempt, such as bunt hits and errors.

This is without question a valid criticism. However, neither Baumer/Zimbalist nor other present day critics of the conclusion acknowledge that the conventional wisdom that was pushed back against was not that the bunt was a good play because of those outcomes, but that the sacrifice if successfully executed was a good play. I still find myself as one of the few patrons clapping when I attend a game and the team for which I am rooting successfully records the out at first base on a sacrifice. This play was seen, and still is seen by casual fans and presumably a non-negligible portion of major league managers, as a success for the offense, even without the benefit of the error or hit that make the play a palatable strategy in certain situations. Sabermetricians have moved to a more “nuanced understanding” of the sacrifice, but they have also forced the conventional wisdom to tack on a bunch of addendums and hypotheticals that had rarely been discussed before.

* In other cases it is unclear how deep a literature review of the field the authors have performed. For instance, the authors criticize FIP for using an ERA scale (a criticism with which I agree but also note can be relatively easily corrected) but state that “What this field needs is a simple, illustrative, but effective model to evaluate pitchers. Until a model can be constructed with interpretable coefficients (a la linear weights), or with meaningful interaction of terms (a la Runs Created), no real insight will be gained, and there is unlikely to be any consensus about which metric is best.”

In all, The Sabermetric Revolution is a book that I think might have been better conceived as a couple of separate journal articles on the topics on which Baumer and Zimbalist have something new to say, because the rest of the book feels like filler and does not establish a consistent purpose or tone.

Sunday, May 03, 2015