Wednesday, April 23, 2008

While I'm on a Roll...

I have written about this before, and asked nicely. Now I’m going to be a little snarky, since I’ve seen more of it lately. “It” is the use of OBA-BA, which goes by all sorts of different names, to measure patience or walking ability or however you would phrase it. If you take just a second to look at the underlying math, you will see what a stupid, stupid stat this is.

Let’s make everything a lot simpler by eliminating sacrifices from existence, and if you want to consider hit batters, you can count them as walks.

Now, what is OBA-BA? It is:

(H + W)/(AB + W) - H/AB

Anyone who thinks about this should immediately notice that the different denominators are going to make things difficult. The difficulty is illustrated when I ask you, “What is the unit of OBA-BA?” I challenge anyone to explain what this denominator represents, in coherent baseball terms, in ten words or less.

If you write it with a common denominator, you end up with something equal to this (you can obviously re-write it in some other ways):

W*(AB - H)/(AB*(AB + W))

You are multiplying walks by outs. And then dividing by at bats times plate appearances. Great stat you have there.

Tying this back into BA, you can also write that as:

W/PA * (1 - BA)
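For anyone who wants the intermediate steps, the algebra can be written out as below. This is just a sketch under the same assumptions as the rest of the post (no sacrifices, hit batters folded into walks), with PA taken to mean AB + W:

```latex
\begin{align*}
\mathrm{OBA} - \mathrm{BA}
  &= \frac{H + W}{AB + W} - \frac{H}{AB} \\
  &= \frac{AB\,(H + W) - H\,(AB + W)}{AB\,(AB + W)} \\
  &= \frac{W\,(AB - H)}{AB\,(AB + W)} \\
  &= \frac{W}{AB + W} \cdot \frac{AB - H}{AB}
   = \frac{W}{PA}\,\left(1 - \mathrm{BA}\right)
\end{align*}
```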

Walks/PA is what you are after…the player’s walk rate. But OBA-BA multiplies walk rate by one minus BA. So if you have two guys who each draw 50 walks in 550 PA (500 AB), but one gets 170 hits (.340) and one gets 135 hits (.270), the one with the lower BA has a higher OBA-BA, .066 to .060.
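A quick way to check that arithmetic is a few lines of Python. This is only an illustrative sketch: the function name is made up for this example, and it assumes the same simplifications as above (no sacrifices, so PA = AB + W), with a .340 hitter at 170 hits in 500 AB and a .270 hitter at 135 hits in 500 AB.

```python
def oba_minus_ba(hits, walks, at_bats):
    """OBA - BA with no sacrifices; hit batters would be counted as walks."""
    oba = (hits + walks) / (at_bats + walks)
    ba = hits / at_bats
    return oba - ba


# Both hitters draw 50 walks in 550 PA, so each has 500 AB.
walks, at_bats = 50, 500
high_ba = oba_minus_ba(170, walks, at_bats)  # the .340 hitter
low_ba = oba_minus_ba(135, walks, at_bats)   # the .270 hitter

# Identical walk rate for both players: 50 / 550.
walk_rate = walks / (at_bats + walks)

print(f"walk rate for both: {walk_rate:.3f}")
print(f"OBA-BA, .340 hitter: {high_ba:.3f}")
print(f"OBA-BA, .270 hitter: {low_ba:.3f}")
```

Both players walk in exactly the same share of their plate appearances, yet the stat ranks the lower-average hitter ahead.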

It is remarkably lazy to leave it in the form OBA-BA. You can very easily get walk rate from (OBA - BA)/(1 - BA). If you’d like, you can also use (OBA - BA)/(1 - OBA) = W/AB. Walks per AB is not in as useful a form as Walks per PA, but given the assumptions of this post, it will produce a ranking in the same order. All W/AB is is (I like doing that, even though my English teachers would strangle me) a ratio of walks to non-walks, which if divided by one plus itself will give W/PA.
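The two conversions can be sketched in Python as well. The function names here are hypothetical, chosen for this example, and the identities only hold exactly under this post's assumptions (no sacrifices, hit batters counted as walks).

```python
def walks_per_pa(oba, ba):
    """Recover W/PA from the two averages: (OBA - BA) / (1 - BA)."""
    return (oba - ba) / (1 - ba)


def walks_per_ab(oba, ba):
    """Recover W/AB from the two averages: (OBA - BA) / (1 - OBA)."""
    return (oba - ba) / (1 - oba)


# Example hitter: 170 hits and 50 walks in 500 AB (550 PA).
hits, walks, at_bats = 170, 50, 500
oba = (hits + walks) / (at_bats + walks)
ba = hits / at_bats

w_pa = walks_per_pa(oba, ba)      # should match 50 / 550
w_ab = walks_per_ab(oba, ba)      # should match 50 / 500
w_pa_from_w_ab = w_ab / (1 + w_ab)  # walks-to-non-walks ratio back to W/PA

print(f"W/PA = {w_pa:.4f}, W/AB = {w_ab:.4f}, W/AB/(1 + W/AB) = {w_pa_from_w_ab:.4f}")
```

Since x/(1 + x) is increasing in x, ranking players by W/AB gives the same order as ranking them by W/PA.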

Of course, a lot of this confusion can be traced to the somewhat silly nature of Batting Average and At Bats themselves. Joe Posnanski’s riff on that subject encapsulates my thoughts fairly well (and of course is a billion times funnier than the way I’d write it).

Tuesday, April 22, 2008

No Post, but a Link

I do not have a post for you this week; however, I do have a link to an article I wrote for The Hardball Times about bases/(outs or plate appearances) metrics. It is somewhat similar to the blog post I wrote a couple months ago, but goes in a different direction at the end. Thanks to Tango Tiger for providing data (and Retrosheet for compiling raw data), and to Studes and the gang at THT for publishing it.

Monday, April 14, 2008

Jimmy Rollins, Curtis Granderson, and the Barbara Walters Number

The 20-20-20-20 achievement of Jimmy Rollins and Curtis Granderson is a great starting point for rumination on the difference between sabermetricians and what I will call, for lack of a better term, traditionalists. (And an upfront disclaimer: I am by no means the first person to write about this.)

Bill James, in his 1982 Abstract, wrote about how people were bothered by sabermetricians because sabermetricians treat numbers as numbers. We multiply and divide them, we endeavor to express relationships between them, we try to put them in context. The traditionalists, for all of their complaints about numbers, use them all the time. Everybody reads the stats; it’s a matter of which ones and how they are interpreted.

Speaking for myself, I have very little interest in numbers for numbers’ sake, and thus the Barbara Walters number means absolutely nothing to me. It goes without saying that it is worthless from a valuation standpoint, but I’m referring to my own curiosity here. I couldn’t be less interested in those types of statistical groupings, even if they are offered in the spirit of fun.

Of course, that is personal preference; there’s nothing wrong with being interested in odd statistical combinations as long as you don’t ascribe deep meaning to them. Unfortunately, there are folks out there who do, whether it’s the Barbara Walters number, or the 30/30 club, or what have you.

I may be off base here, but it seems as if sabermetrically-inclined people are less interested in the steroids issue than others. I am in no way saying that there is a correct sabermetric position on the issue, or that there is any sort of uniform view among sabermetricians; it may well be the case that a serious examination would find my intuition faulty.

If we accept my premise, though, I think the “sanctity of the records” may be a contributing factor. As Bonds approached the record, it seemed as if some people were realizing for the first time that 755 home runs in one context might not be equally impressive, equally valuable, equally something as 755 home runs in another context. Of course, most people of non-sabermetric perspective recognize this, but the reactions of a minority seemed to indicate that the idea was relatively unfamiliar to them.

As for me, I don’t care about the home run record. I recognize its historical significance as the most hallowed record in baseball, and think it is worthy of attention for that reason. It has absolutely no personal meaning to me, though, and I don’t attach any meaning to it like “greatest home run hitter of all time”, etc. Its holder is simply the man who hit the most home runs in major league games. Without even a cursory consideration of context, it takes on no significance to me. Nor do 3,000 hits, 300 wins, a 56-game hitting streak, and other such marks.

So when I look at Barry Bonds breaking the record, I do not feel in any way violated. (Of course, I don’t care about steroids for other reasons, which are in the philosophical/political realm and are not germane to this discussion. Even if I were opposed to steroids, though, I wouldn’t be any more upset at the record being broken by a user than I already would be that the guy was allowed to play and have an impact on pennant races.)

I will admit to having an obsession with no-hitters, particularly my quest to have a scoresheet of one. I don’t care about no-hitters analytically, particularly in a post-DIPS world, but they are at least a rare occurrence that is a feat of unique value. It’s tough to give up many runs without yielding a hit (yes, yes, I know about Matt Young and company). Contrast the no-hitter with the cycle, though. Of course, any game in which one manages to collect four hits, three of them for extra bases, is a superlative performance. But the cycle is noted not because it is the most valuable of games, as is arguably true for a no-hitter, but because it is a curiosity. There are many combinations of hits, walks, and outs that produce more value than a cycle. These, however, don’t get put on any special lists for historical posterity, unless they involve hitting a lot of home runs.

And that’s the point I’m trying to make here. The general baseball public loves things that they can wrap their head around easily. Sabermetricians love things that produce the most runs or the most wins, regardless of whether they arise from unique combinations. Those on the other side who sneer at the seeming analytical coldness of the position should not lose track of why Jimmy Rollins is trying to hit doubles in the first place. Likewise, I need to remind myself occasionally that there is nothing wrong with numbers for their own sake in the spirit of fun.

Monday, April 07, 2008