## Tuesday, October 30, 2007

### A Replacement Post (brought to you by eMachines)

The next few entries at this blog will test replacement level theory, as these are replacement posts. The computer on which my regular posts were stored was done in by a power surge, which apparently is a recurring problem with eMachines systems. The disk drive appears to be all right, but I don’t have access to it at the moment, so it will be some time before I can publish the posts I had already written, and since they are already written, I’m not going to write them again. So the 1876-1881 NL series is on hold, as are a few other articles.

So one replacement topic will be rate stats. I still intend to finish the rate stat series in which I discuss all of the options for expressing an individual’s run creation performance in rate form. For now, I just want to talk about the mathematical consequences of a couple of approaches, not their theoretical underpinnings.

Jin-AZ at On Baseball and the Reds has been doing a sort of “Player Valuation 101” series that I would recommend to anyone, but particularly novice sabermetricians (in the interest of full disclosure, he liberally cites some of the stuff I have written in his series). Anyway, one thing that he mentioned was the choice between runs per plate appearance and runs per out to calculate individual offensive value above baseline.

Like I said, I don’t want to get into the theoretical underpinnings here, so suffice it to say, R/PA doesn’t cut it. Absolute RC methods do not account for the “inning killer” value of the out (as Tango called it in his series on run creation); this is not a flaw in their design, since they are attempting to measure the number of runs that actually resulted from the batter’s performance. But when discussing his value to a team, the inning killer value of the out must be considered. Absolute RC does not do so, and neither does the denominator of plate appearances.

Runs per out, on the other hand, should not really be directly applied to players (it is the one true rate stat for teams), but by dividing runs by outs it does incorporate the full effect of the out.

Let’s look at some actual numbers to illustrate the point. First, let’s define our RC formula as ERP, with the out values customized for the 2007 AL. Whether these particular values are correct is immaterial for the purposes of this exercise:

Abs RC = .49S + .81D + 1.13T + 1.46HR + .32W - .09628(AB - H)

BR = .49S + .81D + 1.13T + 1.46HR + .32W - .29083(AB - H)

Let’s use David Ortiz as our example. Big Papi had 140.595 RC and 69.195 BR under these formulas in 660 PA and 367 outs.

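One approach to runs above average (RAA) would be to take (R/O - LgR/O)*O. The league R/O was .19455, and so Ortiz was (140.595/367 - .19455)*367 = +69.195. As you can see, this is exactly equal to his Batting Runs. That is no coincidence: the two formulas differ only in the out coefficient, and the gap between them (.29083 - .09628) is the league R/O of .19455. So while R/O may not be the ideal rate stat, using it as the fuel for the RAA figure is equivalent to using Batting Runs.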
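For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the Ortiz totals and coefficients quoted above. The function names are my own labels for illustration, not part of any established library, and the sketch assumes that "outs" means AB - H, as in the formulas.

```python
# Figures quoted in the post for David Ortiz, 2007 AL.
ABS_RC = 140.595        # absolute runs created
OUTS = 367              # AB - H
LG_R_PER_OUT = 0.19455  # league runs per out

# Out coefficients from the two ERP formulas in the post.
ABS_OUT_WEIGHT = -0.09628
BR_OUT_WEIGHT = -0.29083


def raa_from_rate(runs, outs, lg_r_per_out):
    """Runs above average computed as (R/O - LgR/O) * O."""
    return (runs / outs - lg_r_per_out) * outs


def batting_runs_from_abs_rc(abs_rc, outs):
    """Turn absolute RC into Batting Runs by swapping the out coefficient.

    The two formulas share every event weight except the one on outs,
    so BR = Abs RC + (BR out weight - Abs out weight) * outs.
    """
    return abs_rc + (BR_OUT_WEIGHT - ABS_OUT_WEIGHT) * outs


raa = raa_from_rate(ABS_RC, OUTS, LG_R_PER_OUT)
br = batting_runs_from_abs_rc(ABS_RC, OUTS)
print(round(raa, 3), round(br, 3))
```

Both figures come out to about 69.195, matching the Batting Runs total quoted above. The match depends on the out coefficients being tied to the league R/O; with out values from a different league or season, the two routes would agree only if the same calibration were applied.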

Switching subjects, during the Breeders’ Cup on Saturday, George Washington was injured in the Classic and had to be destroyed. This led to a few European racing folks blasting the Breeders’ Cup for being run on dirt.

One of the complaints is that the Breeders’ Cup races are subtitled with some variation of “world championships”, and since dirt racing is largely an American phenomenon, this is a misrepresentation. That complaint means nothing to me; if the Euros want to get agitated about the semantics of how the event markets itself, they can knock themselves out. What was obnoxious was the complaint that the event was run on the dirt at all.

Americans have always preferred dirt racing. I prefer dirt racing--it is much more conducive to speed, and speed is exciting in thoroughbred racing. America’s major events have always been run on dirt, a tradition dating back nearly 150 years for races like the Travers or the Kentucky Derby.

If European trainers think dirt is too dangerous, or they don’t think their horses will adapt well to the surface, they are free to leave their horses at home. Of course, of the eleven BC races, four are run on the turf, so it’s hardly as if the opportunity is not there. And of course the Europeans have their own high profile meets in which horses run solely on the turf.

George Washington’s connections obviously felt that he should run in the Classic. Perhaps that was a poor decision (although more likely it was a flukish event that couldn’t have been foreseen). It’s not the other Euros’ business.

It is common to see Americans, usually but not always liberal, carp about how Americans are so provincial when it comes to sports. The fact is that people all around the world have similar mindsets. Europeans thumb their noses at dirt racing, which is king in America but also practiced to limited extents in other parts of the world, including South America, Japan, and Dubai (where the world’s richest race, the Dubai World Cup, is run on the…dirt). The IOC drops baseball because it supposedly is not played by enough countries around the world. In fact, baseball is played at a high level by just about every country in North and Central America, several in South America (Venezuela and Colombia along with a lesser presence in Brazil), three major east Asian countries (Japan, Korea, and Taiwan), and Australia. What they really mean is that baseball is not played by enough European countries. Casting baseball aside as a major sport cannot be done from a global perspective. It can only be done from a Euro-centric perspective (or an African or West Asian one, but of course it is not people from those areas who dominate these types of governing bodies).

The fact of the matter is that each individual has their own opinion about what makes good sport (mine is that baseball is easily the best sport with football a close second. Basketball played with college rules is excellent, but the NBA and international games put me to sleep. Hockey is great if played by skilled players and awful below the professional or collegiate level. Horse racing on the dirt is better than horse racing on the turf, but turf racing is still interesting. Soccer is the most boring thing mankind has ever invented). From the preferences of individuals rise the preferences of nations and regions viewed as a whole. I refuse to make apologies for my individual sports preferences, and if some European trainer doesn’t like it, he can jump in a lake. And if Europeans and Africans and Brazilians (or my fellow Americans) want to kick around a checkered ball for 90 minutes, that’s no skin off my back, nor is it an excuse for soccer fans to act morally superior because their sport is played by more people than mine.

1. Hi p,

Thanks for the nice words about my series--your work has been very helpful in getting me up to speed on some of these issues (not that I've mastered everything yet..).

I think you mentioned at Baseball Fever that the issue with using R/O as a rate for players comes up when using non-average baselines, like a 73% replacement level. How much of a difference does it make? And is there a better alternative when dealing with individual hitters (I guess you mentioned RC+/PA?). Thanks--just trying to figure this stuff out.
-j

2. I didn't do a good job of explaining what I meant about the non-average baselines. When I or BBBA or Clay Davenport say that 73% is what we use for replacement level, we mean in terms of runs/out.

Tango defines things a bit differently (see post #10 in the linked thread), and so his 73% is the same as others' 73% under certain conditions, but it wouldn't be exactly the same based on how you do his process.

I have to admit that I have never sat down and figured out exactly how the two approaches compare, but eyeballing it they are pretty close. And since replacement level is a fuzzy, inexact point anyway, I'm not bothered by estimates being a little bit off from each other.