Tuesday, November 29, 2005

Rate Stats, part 1

A while back I attempted to write an article for my website on rate stats for batters. I wrote about a page, got frustrated, and quit. It was not going to be anything groundbreaking, just a summary of the existing work in the area mixed with some of my personal thoughts, like most of the other articles on my site. I have decided to try again, but this time I will write it as several blog segments and copy and paste them together when I'm done.

In sabermetrics, we usually express individual hitter contributions in terms of an estimated number of runs created, because the goal of a baseball offense is to score runs. This is extremely clear on the team level, and it logically follows that if the team's goal is to score runs, the player's goal should be to create runs for his team. There is a caveat that complicates everything, however. Everything a team does offensively is captured in its statistics (setting aside things like taking the extra base or hitting with runners in scoring position, which are not accounted for in the official statistics). When the team avoids an out by reaching base, and thereby creates another opportunity for itself, whatever production comes from the extra opportunity is included in its statistics. When an individual does this, he gets credit for reaching base, but he is not explicitly credited for creating an opportunity for a teammate (depending on what kind of metric we are using to evaluate him, which adds more complexity).

So for a team, it is all very simple. The goal is to score runs, specifically as many runs as possible within the constraints set by the rules of the game. In most sports, the constraint is time, but in baseball, it is outs. A team's goal is to maximize the number of runs it scores per out. Since 1 inning = 3 outs, you could also state that the goal is to maximize runs scored per inning. Since 1 game = 9 innings, it could be stated as maximizing runs scored per game. (Of course not all innings have 3 outs, particularly 3 outs that can be found in the official statistics, and not all games have 9 innings, but you get the point. In theory this is true, and in practice it is close enough to true to not cause any problems.) And since 1 season = 162 games, you can say the goal is to maximize runs scored for the entire season.

So really, for teams, you don't need a rate stat. The only reason we use R/G or R/Inning or R/Out for a team is the variance from the theoretical conversions. If there were no variance from the theoretical conversions, runs scored would be the only stat you would need to know for a team's offense.
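Here is a minimal Python sketch of that point, assuming the theoretical conversions hold exactly. The team names and runs-per-out figures are made up for illustration, and `theoretical_rates` and `ranking` are hypothetical helpers, not anything from an existing library. Because each step up is just multiplication by a constant, every unit ranks the teams the same way.

```python
# Theoretical conversions (real seasons deviate from these slightly).
OUTS_PER_INNING = 3
INNINGS_PER_GAME = 9
GAMES_PER_SEASON = 162


def theoretical_rates(runs_per_out):
    """Scale a runs-per-out rate up to inning, game, and season totals."""
    per_inning = runs_per_out * OUTS_PER_INNING
    per_game = per_inning * INNINGS_PER_GAME
    per_season = per_game * GAMES_PER_SEASON
    return {
        "per_out": runs_per_out,
        "per_inning": per_inning,
        "per_game": per_game,
        "per_season": per_season,
    }


# Hypothetical teams with invented runs-per-out rates.
teams = {"Team A": 0.18, "Team B": 0.20, "Team C": 0.16}
rates = {name: theoretical_rates(rate) for name, rate in teams.items()}


def ranking(unit):
    """Team names ordered from best to worst offense in the given unit."""
    return sorted(rates, key=lambda name: rates[name][unit], reverse=True)


for unit in ("per_out", "per_inning", "per_game", "per_season"):
    print(unit, ranking(unit))
```

Every line printed shows the same ordering, which is why the choice among R/Out, R/Inning, R/G, and season runs only matters when the actual outs, innings, and games differ from the theoretical ones.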

It is absolutely clear that Runs per Plate Appearance (R/PA) is an inappropriate measure of a team's offense. The number of plate appearances is itself a product of the team's performance. Higher on base averages lead to more plate appearances. If two teams score 800 runs, but one does it in 6200 PA and one does it in 6400 PA, the first team may have had an offense more slanted towards power, and the second towards getting on base. But they are equivalent in their impact on the game. Scoring one run in an inning on 3 outs and a home run is worth exactly the same as scoring one run in an inning on 3 outs and 4 walks. R/PA may have use as a descriptive stat for a team, but it is not a measure of its offensive productivity.
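The 800-run example can be worked through in a few lines of Python. This is only a sketch: it assumes both teams made the theoretical 3 * 9 * 162 = 4374 outs, which the example above does not state, and `team_rates` is a hypothetical helper. The `non_out_share` figure is a rough stand-in for on base average (plate appearances that did not end in an out), not the official formula.

```python
# Assumption: both teams used exactly the theoretical number of outs.
THEORETICAL_OUTS = 3 * 9 * 162


def team_rates(runs, plate_appearances, outs=THEORETICAL_OUTS):
    """Return runs per PA, runs per out, and the share of PA not ending in an out."""
    return {
        "runs_per_pa": runs / plate_appearances,
        "runs_per_out": runs / outs,
        # Simplification: treats every out as charged to exactly one PA.
        "non_out_share": (plate_appearances - outs) / plate_appearances,
    }


power_team = team_rates(runs=800, plate_appearances=6200)
on_base_team = team_rates(runs=800, plate_appearances=6400)

for label, team in (("6200 PA", power_team), ("6400 PA", on_base_team)):
    print(
        label,
        f"R/PA={team['runs_per_pa']:.4f}",
        f"R/O={team['runs_per_out']:.4f}",
        f"non-out share={team['non_out_share']:.3f}",
    )
```

Under these assumptions the 6200 PA team looks better by R/PA, while R/O is identical for both, which matches their identical impact on the scoreboard. The only thing R/PA picks up is that the 6400 PA team reached base more often.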

Next time: R/PA and R/O applied to individual batters
