Friday, February 10, 2006

Rate Stat Series, pt. 5

This series is pretty disjointed, and this entry will be no exception. I have realized that in this piece I will discuss some of the assumptions that influence my thinking on the other issues I have discussed, so this probably should have come first.

The major point of this installment is that we should state our assumptions and our goals before we begin so that we can make the right choices. The right choice will be different, potentially wildly different, depending on what we are setting out to do. This series purports to choose which rate stat is best to use for an individual batter. But what is best?

It depends on what you are trying to measure, of course. A frivolous example is that ISO is a good metric if you are trying to measure power, but a horrible metric if you are trying to measure on-base ability. We first must define what the properties of a good rate stat for a batter would be. If we use a different definition, we will get different answers.

What I have used as my definition throughout this series, without explicitly stating it (which was a mistake), is that the true measure of a batter’s production is how many runs an otherwise average team would score if we added the batter to it. Ideally it would be wins rather than runs, but adding more runs will almost always add more wins for a team that begins at average.

What I want to do is look at a team that scored, say, 750 runs in a league where the average team scored 750 runs, add one player to that team, give him one-ninth of the team plate appearances, and see how many runs that team will score with the player added. We will account for both the player’s impact on the team scoring rate and the number of plate appearances that he has. A “good” rate stat will be one that accurately reflects the rank order of players under this criterion and, as a bonus, accurately reflects the magnitude of the players’ contributions. In other words, if we find that Batter A will add 50 runs to a team and Batter B will add 45 runs, our ideal rate stat would rank A ahead of B, but not by an enormous amount; in fact, by a margin that, if converted to runs above average, would be about five.
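The two properties just described (same rank order, and a margin that converts to about five runs) can be checked mechanically. What follows is only a rough sketch: the 50 and 45 come from the example above, while the 700 plate appearances, the function names, and the "bad" stat values are hypothetical and chosen purely for illustration.

```python
def same_rank_order(runs_added, rate_stat):
    """True if ranking by the rate stat matches ranking by runs added."""
    by_runs = sorted(runs_added, key=runs_added.get, reverse=True)
    by_rate = sorted(rate_stat, key=rate_stat.get, reverse=True)
    return by_runs == by_rate


def margin_in_runs(rate_a, rate_b, pa):
    """Convert a gap between two per-PA rates into runs over `pa` PA."""
    return (rate_a - rate_b) * pa


runs_added = {"A": 50, "B": 45}  # runs added to an average team (from the text)
PA = 700                         # hypothetical one-ninth share of team PA

# A rate stat that is simply runs added per PA satisfies both properties.
good_stat = {name: runs / PA for name, runs in runs_added.items()}
# A hypothetical stat that happens to rank B ahead of A fails the first one.
bad_stat = {"A": 0.300, "B": 0.310}

print(same_rank_order(runs_added, good_stat))
print(same_rank_order(runs_added, bad_stat))
print(round(margin_in_runs(good_stat["A"], good_stat["B"], PA), 1))
```

The hard part, of course, is computing the runs-added figures in the first place; this sketch takes them as given.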

If you start with different assumptions, you may get different answers. For example, if your goal is to find out how many runs a batter would add to a team filled with replacement-level players, you will likely reach similar conclusions about which rate stat is superior, but you may not, especially for close calls. If your goal is to estimate a batter’s contribution if he does not affect the team’s run environment, you will potentially get different answers. If your goal is to estimate an individual’s ability on a realistic range of different teams, you will have a much more complicated probabilistic function and get potentially different answers. And on and on. But for my purposes, I have defined the goal above, and every comment I make about a rate stat being “right” or “wrong”, “better” or “worse”, etc., is based on that assumption.

With that out of the way, we can start to tackle the issue of comparing players to different baselines. I have written a long article on my site entitled “Baselines” which talks at length about various ways people have set baselines, why they have done so, which I prefer, etc., so I will not repeat that here. Instead, I will point out that based on the assumption I gave above, about adding a player to an average team, the most obvious choice is to compare to the league average. This would be .500 in OW% terms (while most sabermetricians acknowledge OW% as faulty, it is still convenient to use the terminology, so long as we understand it is just a shorthand and do not start building bridges based on it).

So the baseline I will look at is average. This is also convenient because the other baselines are not as straightforward to apply. Later we will see a rate stat, R+/PA, that requires not only R/PA but also a comparison of OBA to the league average. If you want to apply a replacement baseline to this, all sorts of sticky problems arise. First of all, when most sabermetricians say the “replacement level” is a .350 OW%, they are defining OW% by R/O as we did in the last installment. So you need to convert the .350 OW% into a runs/PA ratio. And then you still have to deal with the OBA. Do you still use league average OBA in the R+/PA formula, and then compare an individual’s R+/PA to a replacement player’s R+/PA, or do you compare the player’s OBA directly to a replacement player’s OBA? And what is a replacement player’s OBA anyway? That answer is tied directly to your answer to what is a replacement player’s R/PA, since R/O = (R/PA)/(1 - OBA). But how did you answer that? What assumptions are you making about a replacement player? Is he a certain X% below the league average in hitting singles, doubles, etc.? Or is he around 95% of the league average in terms of singles with bigger losses in secondary skills? And how does his sacrifice bunt rate compare to the league? Is it higher? Does he hit into more double plays, or does he strike out more and hit into fewer?
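To make the dependency concrete, here is a minimal sketch of the conversion described above. It assumes OW% is a Pythagorean function of relative R/O with an exponent of 2 (that exponent is an assumption, since the formula is not restated in this entry), and the league R/O of 0.180 and the three candidate OBAs are hypothetical numbers, not real data. The only relationship taken from the text is R/O = (R/PA)/(1 - OBA).

```python
def relative_ro_from_ow(ow_pct, exponent=2.0):
    """Invert OW% = x**e / (x**e + 1), where x is player R/O over league R/O.

    The exponent of 2 is an assumed Pythagorean exponent.
    """
    return (ow_pct / (1.0 - ow_pct)) ** (1.0 / exponent)


def r_per_pa(r_per_out, oba):
    """Rearranged from R/O = (R/PA) / (1 - OBA)."""
    return r_per_out * (1.0 - oba)


LEAGUE_R_PER_OUT = 0.180  # hypothetical league figure

# Replacement-level R/O implied by a .350 OW%.
repl_r_per_out = LEAGUE_R_PER_OUT * relative_ro_from_ow(0.350)

# The same R/O maps to a different R/PA for each assumed replacement OBA,
# which is exactly why the OBA question cannot be dodged.
for oba in (0.300, 0.320, 0.340):
    print(oba, round(r_per_pa(repl_r_per_out, oba), 4))
```

Under these assumptions a .350 OW% works out to roughly 73% of the league R/O, but the R/PA figure is not pinned down until you also commit to a replacement OBA.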

Those are all useful questions to ask if you are serious about applying a replacement level type analysis. But they make life a lot more complicated. Average, while it may well be flawed, has the advantage of being very clean. It is a mathematically defined fact rather than a calculated value based on a series of assumptions.

So we’re using average, if for nothing else than to make this discussion manageable. This does not mean that I advocate using average as the baseline for all of the types of questions you want to answer, or even many types of questions. But I do think that average is a good starting point for theoretical discussion, especially since, again, it is the only choice for which we know all of the parameters we need to know. Now what do I mean by applying a baseline anyway?

All it means is that we compare the player’s performance to the performance of a baseline (in this case average) player. If a player creates 100 runs in 400 outs, and an average player would create 75 runs in 400 outs, then he is 25 runs above average (or +.0625 RAA per out, since 25/400 = .0625). Now since this is a series primarily about rate stats, the second format is a rate, and is more useful to us. But if you want to go from “rate” to “value” or include playing time, then you are going to want to use some sort of baseline.
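The arithmetic in that example can be written out as a short sketch. The function name is just an illustrative label, and the inputs are the 100 runs, 75 runs, and 400 outs used above.

```python
def runs_above_average(player_runs, player_outs, avg_runs_per_out):
    """Player's runs minus what an average hitter creates in the same outs."""
    baseline_runs = avg_runs_per_out * player_outs
    return player_runs - baseline_runs


OUTS = 400
raa = runs_above_average(100, OUTS, 75 / OUTS)  # the total ("value") form
raa_rate = raa / OUTS                           # the rate form

print(raa, raa_rate)  # 25.0 0.0625
```

Going the other direction, multiplying the rate by playing time recovers the total, which is the sense in which the baseline ties the rate and the value together.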

Sometimes, it may be useful to use the baseline even if we do not convert our rate stat to take playing time into account. For example, suppose we have determined that a team would score 800 runs with our player and 750 without him. We could leave that as +50, which would be a rate if we have made some constant assumption about how much playing time he will get. For example, the simplest version of Marginal Lineup Value (link) assumes that the player gets 1/9 of the team plate appearances, and is expressed as the number of runs he would add over the course of a full season. While it is not a format that one usually sees, a +50 MLV is still a rate: it is a rate of runs added per season.

And that leads to another point about rates. Since people are used to seeing a rate expressed as runs per out, or runs per PA, they will sometimes have a negative initial reaction to a rate which does not look like that. Like the MLV rate. Or like RAA/PA, a very important rate stat we’ll discuss later. That can have negatives, of course (as can MLV). And you can no longer divide them. For example, a player with 6 R/G in a 4.5 R/G league is often written as 1.33. The relative stat in this way is instantly adjusted for league context, and people like percentages. But if you are working with RAA/PA, you can’t express it as a percentage of the league, because the league is zero. You can’t say a player who is +.08 RAA/PA is -4 times better than a player who is -.02 RAA/PA. RAA/PA figures must be compared as differences, not as ratios.
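A quick sketch of the contrast, using the figures from the paragraph above. The function names are illustrative, and the 600 plate appearances used to turn the RAA/PA gap into runs is a hypothetical playing-time figure.

```python
def ratio_to_league(player_rate, league_rate):
    """Relative rate, e.g. 6 R/G in a 4.5 R/G league."""
    return player_rate / league_rate


def raa_gap(raa_pa_a, raa_pa_b, pa):
    """Runs separating two players over `pa` plate appearances."""
    return (raa_pa_a - raa_pa_b) * pa


# Ratio form works when the league figure is a positive rate.
print(round(ratio_to_league(6.0, 4.5), 2))  # 1.33

# Difference form is the only sensible comparison for RAA/PA.
print(round(raa_gap(0.08, -0.02, 600), 1))  # 60.0 (600 PA is hypothetical)

# The league RAA/PA is zero by definition, so the ratio is undefined.
try:
    ratio_to_league(0.08, 0.0)
except ZeroDivisionError:
    print("no ratio to a league average of zero")
```

The difference of .10 RAA/PA has a direct interpretation in runs once you pick a number of plate appearances, which the quotient of the two figures never does.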

I will expound on that topic more in an upcoming installment. The point here is just that a figure like that is every bit as much a potential choice of rate stat as the formats that people are used to seeing. And that when you bring the baseline into it, the difference is just the total above the baseline divided by some unit of playing time, while a ratio needs to be manipulated to be in that kind of format. So if your total stat of choice is RAR, you might want your rate stat of choice to be expressed in the same units. The difference allows you to do what the ratio cannot.

2 comments:

  1. Nice article. But when are you gonna stop beating around the bush and get to the crux? :}

  2. Actually, that's a good question. I think that this is overly verbose. The next installment will talk about R+/PA.
