Wednesday, March 22, 2006

Review of "Baseball Prospectus 2006"

I assume that most of the readers of this blog are familiar with Baseball Prospectus, know whether it is something they want to read, and don’t really need somebody to tell them what he thinks of it. So this review will simply point out a few changes in the book and discuss some of the sabermetric articles in it.

They have cut back slightly on the number of player comments this year, which is actually, IMO, a good thing. Personally, I got tired of reading somebody try to come up with something insightful to say about yet another middling pseudo-prospect. Instead, they have added a little section at the end of each team’s chapter that lists stat lines for several other batters and pitchers, with a sentence or two about each of them.

One slightly annoying change is that they have cut back on the number of stat lines displayed for a player who only spent a brief period with a team, like a cup of coffee in AAA or the majors. It is frustrating to read a comment like "He showed good command in a brief September call-up" and then look up and not see the numbers for yourself. Also, if a guy split time at a level between a couple of teams, it would make sense to combine the numbers when one of the stat lines falls below their stand-alone cutoff.

Also interesting to note is that EQA has reclaimed its position in the book, while MLVr is only shown in its non-translated form. I am not crazy about either of these metrics, as EQA distorts the scale and MLVr is based on the Basic RC model, but it is interesting how their fortunes in print at BP have varied over the years.
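For anyone who has not seen it, the Basic RC model referred to above is Bill James’ original Runs Created formula, (H + BB) x TB / (AB + BB). A quick sketch, purely for illustration (the function name and the sample season line are mine):

```python
def basic_runs_created(h, bb, tb, ab):
    """Bill James' Basic Runs Created: A * B / C, where A = times on base (H + BB),
    B = advancement (total bases), and C = opportunities (AB + BB)."""
    return (h + bb) * tb / (ab + bb)

# A made-up season line: 600 AB, 170 H, 60 BB, 280 TB
print(round(basic_runs_created(h=170, bb=60, tb=280, ab=600), 1))  # about 97.6
```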

The articles in the back of the book are a mixed bag. The article on the off-field business aspects of the game in 2005, written by Andrew Bahrlias, is excellent. According to his bio in the back of the book, he is a former counsel to the Yankees, so he knows his stuff.

Then there is "Injury Accounting", by Thomas Gorman. This piece attempts to quantify the effects of injuries, but it is quite dull for my money. It simply assumes that the injured player will be replaced by a replacement-level player (ignoring chaining), uses a combination of actual performance and PECOTA projection to estimate how the player would have performed if healthy, and uses days on the DL to estimate games missed. I am not saying that any of it is bad or wrong, just that it is pretty straightforward and doesn’t break any new ground. Chaining is a difficult thing to quantify, so I appreciate that it was set aside for simplicity’s sake, but I think it is too big a problem to ignore on the injury front. If they could have modeled chaining, even rudimentarily, this would be a grand slam piece.
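As I read it, the accounting boils down to something like the sketch below. This is my own rough rendering of the approach, not Gorman’s actual method or numbers; the rates and the 60-day DL stint in the example are invented.

```python
def injury_cost(projected_rate, replacement_rate, dl_days,
                team_games=162, season_days=183):
    """Rough sketch of the injury-accounting idea: the value lost to an injury is
    the gap between what the healthy player was projected to produce and what a
    replacement-level fill-in produces, prorated over the games the DL stint is
    assumed to cost. Rates are in runs per game; chaining is ignored."""
    games_missed = team_games * (dl_days / season_days)
    return (projected_rate - replacement_rate) * games_missed

# Invented example: a regular projected for .60 runs/game, a .45 runs/game
# replacement, and a 60-day DL stint
print(round(injury_cost(0.60, 0.45, 60), 1))  # about 8 runs lost
```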

Keith Woolner has a piece on Win Expectancy, expanding upon his work from last year. The article is interesting, but WE is hardly an unexplored topic, so it is most interesting for some of the results it generates, like breakeven percentages for stolen bases in different leagues. Definitely a worthwhile article, but there is a lot of other good work being done with WE by others as well.
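The breakeven figures fall right out of the WE framework: a steal attempt is worthwhile when its expected win expectancy at least matches standing pat. A minimal sketch of that calculation (the win expectancy values in the example are placeholders I made up, not Woolner’s tables):

```python
def breakeven_sb_rate(we_stay, we_success, we_caught):
    """Success rate b at which attempting the steal has the same expected win
    expectancy as not attempting it:
        b * we_success + (1 - b) * we_caught = we_stay
    solved for b."""
    return (we_stay - we_caught) / (we_success - we_caught)

# Placeholder win expectancies for some base/out/inning/score state
print(round(breakeven_sb_rate(we_stay=0.58, we_success=0.62, we_caught=0.52), 2))  # 0.6
```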

The other "fungo" is Gary Huckabay’s article entitled "Where Does Statistical Analysis Fall Down? Reality and Perception". This piece did not sit well with me at all. To be fair, one must keep in mind that its main focus is the usefulness and implementation of "performance analysis" in Major League front offices, so some of the comments are directed towards the use of statistical/sabermetric/performance analysis in that context.

Still, the piece comes off as dismissive of much of the sabermetric community. For example: "First, throw the term ‘sabermetrics’ out the window. It’s slippery, doesn’t describe anything of substance, and trivializes the nature of serious analysis." Okay, then. Sabermetrics describes nothing of substance. Now perhaps it is true that a front office does not really care about the exact Pythagorean exponent that should be used, or some similar thing that sabermetricians like myself care about. I accept that, and quite frankly don’t care. Maybe another couple of quotes will help me explain further:

"The baseball analysis ‘community’ lacks standards; people self-publish their work and feel confident that they’re qualified to offer advice on multi-million dollar transactions."

Again, there is an element of truth to this, but is it not true of just about everything in life, not just "baseball analysis"? Aren’t there people who’ve never spent a day in the military who feel compelled to give military advice, and people who’ve never taken an economics course who decry "price gouging" every time the gas price goes up by a nickel? Now there is no reason why one should heed the advice of many of these self-appointed experts, but such a dismissive attitude towards all commentary by non-experts insulates the industry in question from any criticism. Perhaps only soldiers should comment on the military, only gas station owners on gas prices, and only actors and professional movie critics on movies. This all sounds fine and dandy until you realize that nobody is an expert on everything, and you would be silencing yourself on some matter that interests you, be it politics or military strategy or whether Ben Roethlisberger made it across the goal line in the Super Bowl.

Of course, the bit about self-publishing is sort of funny as well, because self-published outlets like Baseball Think Factory or blogs have more interaction and more out-in-the-open peer review than Baseball Prospectus does. Who has ever been allowed to review PECOTA? Heck, in their Baseball Between the Numbers book, they don’t even give formulas for things as elementary as EQR and the Pythagenpat exponent! I understand the need to keep things proprietary, but if you’re going to do that, don’t turn around and dismiss others who publish their results in the open and solicit rebuttals and debate as "lacking standards".

"There is excessive attention paid to the ‘academic’ race, refining a model to another 1% of precision, without regard to its utility for making decisions that will actually help a ballclub, or the enormous error bars inherent in the entire exercise."

This one could have been aimed at me (note that I am not suggesting it actually was, because it obviously wasn’t, and there are plenty of other people, some of whom people have actually heard of, to whom it could apply; what I mean is that it is aimed at people with sabermetric interests similar to mine). It presumes that the only purpose of doing sabermetric-type research is to help a ballclub. I have never pretended that using RPG^.285 as a Pythagorean exponent rather than "2" will have any tangible consequences for running a team, or anything similar. Some things are worth knowing for extreme theoretical situations, to people who are interested in theoreticals. Knowing how many runs it will take to win a game in a 25 RPG context may not interest a major league team. But neither will knowing pi to one thousand digits interest someone who just needs the area of a circle. Yes, that is an academic pursuit, but that is what academics do. Maybe I should pretentiously call myself an academic sabermetrician in order to avoid confusion and dispel the impression that I secretly want total control over the Rangers organization.
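To make the theoretical point concrete, here is what using RPG^.285 as the exponent looks like next to a fixed exponent of 2; the run environments in the example are made up for illustration:

```python
def pythagenpat_wpct(r_per_g, ra_per_g):
    """Expected winning percentage using the Pythagenpat exponent x = RPG^.285
    in W% = R^x / (R^x + RA^x)."""
    x = (r_per_g + ra_per_g) ** 0.285
    return r_per_g ** x / (r_per_g ** x + ra_per_g ** x)

# The same 1.2 runs-to-runs-allowed ratio in a normal (~10 RPG) context and in
# an extreme (~25 RPG) context; a fixed exponent of 2 would give .590 in both.
print(round(pythagenpat_wpct(5.4, 4.5), 3))     # about .587
print(round(pythagenpat_wpct(13.5, 11.25), 3))  # about .612
```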

Now the point about error bars is one that I agree with, to a point--there will be sizeable error in all estimates, but that is no reason that those with the patience and interest should not endeavor to make the initial estimate as precise, as well-reasoned, and as applicable across a wide range of contexts as possible.

Of course, this is yet another funny criticism to be coming from Baseball Prospectus, since it is they who published an accuracy study showing EQR to be slightly more accurate since 1871 than BsR, XR, or other run estimators. It is they who have a six-page article in the very same book about optimizing PECOTA, their projection method, which is the most error-prone activity of all--trying to forecast the future performance of individual human beings trying to hit a little white sphere going 90 miles an hour with a thirty-four ounce piece of wood. It is they who, in the next article (the Woolner piece on WE), print the slope of a regression to five decimal places and the intercept to six. I realize that Baseball Prospectus is made up of individuals and is not monolithic--however, there is no acknowledgement that some of these criticisms could easily be applied to members of Huckabay’s own group and their work.

Some readers have always found BP to be unduly arrogant. I have never really shared that opinion, but I think Gary Huckabay’s piece is exactly that. However, if you have read BP in the past and enjoyed it, you should not let one article stop you from reading it this year.

In closing, the book has a quote from an Esquire review on the cover calling it the "heir" to the Baseball Abstract. For several years this may have been true by default, since after the BBBA went under in 2001 no other sabermetric annual was being published, so the title fell to BP. But that is no longer the case. While not as true to the Abstract format as the BBBA was, there is no doubt in my mind that the Hardball Times Annual is the current book that best encapsulates the spirit of the Abstract. This is not an inherently good or bad thing; the books have different purposes and different target audiences. But if you’re looking for which one is more like the Abstract, it’s not even close.
