Walk Like a Sabermetrician: A Perfunctory Look at Run Distribution and W%, 2008

Before I start, I want to emphasize the word “perfunctory” in the title again, as it really is just that. There is a lot of stuff that you could do with this data, and I’m just looking at a couple of things here.

Baseball Prospectus has some nice stats on their website for team record by runs scored, runs allowed, and run differential. You can download these as CSV files and open them in your spreadsheet program.

Let’s start by looking at one-run games. There were 680 one-run games in the majors last season out of a total of 2,428 (28%). The team with the most one-run games was Toronto (56), not a surprise considering that their games also had the lowest RPG in the majors. Cleveland (31) played in the fewest one-run games.

The chart is sorted by the difference between W% in one-run games and W% in all other games (labeled “else”):

I don’t need to lecture the readers of this blog about how unsurprising it is that most of the teams at the top of this list were not very good in terms of overall record, and that the opposite is true for those on the bottom.

Atlanta was peerless in their struggles in one-run games; Seattle had the second-worst W%, but they lost the same number as the Braves while winning seven more. Milwaukee had the highest W% and tied with Tampa Bay at 11 over. (I am using “over .500” in the sense that the mainstream uses it, simply wins minus losses; I am well aware of the difference between this and wins above average.)

Let’s also consider record in blowouts. There is no agreed-upon definition for a blowout (nor is “blowout” necessarily the best word to use in this context), but I have defined them as games decided by five or more runs. The five-run threshold, for 2008 at least, has the nice property that there were 712 such games, or 29% of all games, compared to the 28% decided by one run.

The team that participated in the most blowouts was Colorado (58), which is no surprise when you consider their home park. Next was the Dodgers (55), which goes to show you that park is by no means the sole determining factor (although Dodger Stadium’s PF has crept up to hover near 1.00 in recent years). Toronto played in the fewest blowouts (34):

However, when a Jays game was not close, it was usually a very good thing for them. Their .706 W% was best in the majors, although Chicago was 19 games over .500 while Boston and Minnesota also had a greater margin than Toronto’s +14. On the flip side, the Nationals’ dreadful .280 W% and -22 was most closely approached by the Royals (.340, -15). The Angels were the only playoff team that did not have a winning record in blowouts (20-20).

What about the other 43% of games, those decided by two to four runs? Cleveland played the most of these games (82), while Colorado and Minnesota participated in the fewest (59):

It was in these games that the Angels really made their hay, playing .700 baseball; among the other 29 clubs, only Houston was above .600, and LA’s +28 blew away the Astros’ +17. Seattle (.390, -16) brought up the rear in these intermediate-margin games.
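For anyone who wants to reproduce the three margin categories from raw scores, here is a minimal sketch. The function names and the sample scores are made up for illustration; only the cutoffs (one run, two to four runs, five or more) come from the text above.

```python
from collections import defaultdict

def margin_bucket(runs_for, runs_against):
    """Label a game by its final margin, using the cutoffs from this post."""
    margin = abs(runs_for - runs_against)
    if margin == 0:
        raise ValueError("tied games are not handled in this sketch")
    if margin == 1:
        return "one-run"
    if margin >= 5:
        return "blowout"
    return "2-4 runs"

def record_by_bucket(games):
    """games: iterable of (runs_for, runs_against) for one team.
    Returns {bucket: (wins, losses, w_pct)}."""
    tally = defaultdict(lambda: [0, 0])
    for runs_for, runs_against in games:
        bucket = margin_bucket(runs_for, runs_against)
        if runs_for > runs_against:
            tally[bucket][0] += 1
        else:
            tally[bucket][1] += 1
    return {b: (w, l, w / (w + l)) for b, (w, l) in tally.items()}

# Hypothetical scores, invented purely to exercise the function
sample = [(3, 2), (1, 2), (9, 3), (4, 6), (5, 2), (0, 7)]
records = record_by_bucket(sample)
```

"Over .500" in the sense used above is then just wins minus losses within each bucket.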

Since BP kept data based on the total number of runs scored and allowed, we can look at some other, more interesting breakdowns. The most basic is the overall MLB frequencies and winning percentages by runs scored:

Three runs is the mode, occurring in 13.7% of games. Teams that scored four runs had a .480 W%, while a fifth run boosted the W% up to .639; that increase is the largest for any one-run increment.
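A table like this can be built from any list of team-games. The sketch below assumes the input is a list of (runs scored, won) pairs, which is not necessarily how the BP files are laid out; the `cap` argument, which lumps high-scoring games into one bucket, is a convenience of mine.

```python
from collections import Counter

def by_runs_scored(team_games, cap=10):
    """team_games: iterable of (runs_scored, won) pairs, one per team-game.
    Returns {runs: (share_of_games, w_pct)}, lumping cap-or-more together."""
    games = Counter()
    wins = Counter()
    for runs, won in team_games:
        key = min(runs, cap)
        games[key] += 1
        if won:
            wins[key] += 1
    total = sum(games.values())
    return {r: (games[r] / total, wins[r] / games[r]) for r in sorted(games)}

def biggest_jump(table):
    """Return (runs, gain) for the largest W% gain over the total one run lower."""
    runs = sorted(table)
    gains = [(b, table[b][1] - table[a][1])
             for a, b in zip(runs, runs[1:]) if b - a == 1]
    return max(gains, key=lambda pair: pair[1])

# Hypothetical team-games, invented for illustration
sample = [(3, True), (3, False), (4, True), (4, True), (12, True), (10, True)]
table = by_runs_scored(sample)
jump = biggest_jump(table)
```

Run against the real 2008 data, `biggest_jump` should pick out the four-to-five-run step described above.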

In the 1986 Baseball Abstract, Bill James wrote about an alternative kind of Offensive Winning Percentage based on the number of times X runs were scored. If a team scored one run, they would get credit for .083 offensive wins, since the average team that scores one run wins 8.3% of the time. If they scored two, they would be credited with .202 offensive wins, and so on.

I will apply this Jamesian concept here. I have lumped all games with ten or more runs together with a uniform .950 W%. I acknowledge that it would be better to base this type of analysis on a theoretical model rather than the actual winning percentages, but again, I’m not holding up anything in this post as state of the art. I have figured the team’s conventional OW% as well, with the exponent figured as x = (R/G + 4.65)^.29, and OW% = (R/G)^x/((R/G)^x + 4.65^x), where 4.65 is the combined MLB average for R/G. There is no park adjustment in either figure; I have labeled the James alternate approach, based on the number of times scoring x runs, as “gOW%”.
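Both figures are easy to compute. The sketch below follows the formulas just given; the function names are mine, and the lookup table holds only the league winning percentages actually quoted in this post (a full table would cover zero through nine runs). The five-game "team" is hypothetical.

```python
LG_RPG = 4.65  # 2008 MLB average R/G, from the text

def ow_pct(r_per_g, lg=LG_RPG):
    """Conventional OW%: Pythagorean record against a league-average defense."""
    x = (r_per_g + lg) ** 0.29
    return r_per_g ** x / (r_per_g ** x + lg ** x)

def gow_pct(run_counts, w_pct_by_runs, cap=10, cap_w_pct=0.950):
    """run_counts: {runs_scored: number_of_games} for one team.
    Each game earns the league W% for that run total; games with
    cap or more runs all earn cap_w_pct."""
    credit = 0.0
    games = 0
    for runs, n in run_counts.items():
        w = cap_w_pct if runs >= cap else w_pct_by_runs[runs]
        credit += w * n
        games += n
    return credit / games

# Only the league values quoted in the post; the rest are not filled in here
league_w_pct = {1: 0.083, 2: 0.202, 4: 0.480, 5: 0.639}

# Hypothetical five-game sample: one game each at 1, 2, 4, 5, and 12 runs
toy_team = {1: 1, 2: 1, 4: 1, 5: 1, 12: 1}
toy_gow = gow_pct(toy_team, league_w_pct)
```

Note that `ow_pct` takes runs per game, not season runs, so a team at exactly the league average comes out at .500.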

One note about my use of OW%, a construct I have railed against in the past, for this application. My objections to OW% are mostly to its use on the player level. On the team level, it is much more palatable, as it does answer the question “If this team had average defense, how many games would we expect them to win?” When you ask that question for a player, it is at best an abstraction since no player is his own offense.

As you will see, the two approaches generally yield similar results. The large differences, arbitrarily defined as +/- .015, are as follows:

Positive (gOW% > OW%): HOU, SF, MIL, SEA
Negative: TEX, NYA, DET

If a team has a higher gOW% than OW%, they are expected to win more games when you consider their run distribution than if you just look at their average runs scored. gOW% is not any more impressed by a team scoring 17 runs in a game than 11 (it should be, a little bit, but it is definitely a situation in which diminishing returns are in play), while conventional OW% would add around half a win/162 games for those six (mostly meaningless) extra runs.
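The half-a-win figure can be checked with a quick calculation. This sketch assumes the team is otherwise exactly league average (4.65 R/G over 162 games), which is my simplification; a different baseline would shift the answer slightly.

```python
LG_RPG = 4.65  # 2008 MLB average R/G, from the text
GAMES = 162

def ow_pct(r_per_g, lg=LG_RPG):
    """Conventional OW% with the exponent x = (R/G + lg)^.29."""
    x = (r_per_g + lg) ** 0.29
    return r_per_g ** x / (r_per_g ** x + lg ** x)

# Assumed baseline: a league-average offense over a full season
base_runs = LG_RPG * GAMES
extra_runs = 17 - 11  # the six runs gOW% ignores

before = ow_pct(base_runs / GAMES)
after = ow_pct((base_runs + extra_runs) / GAMES)
wins_added = (after - before) * GAMES
```

The value of `wins_added` is what conventional OW% credits for the six extra runs; gOW% credits nothing, since both games fall in the ten-or-more bucket.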

The Tigers are a team that many people observed were boom or bust offensively. Based on their average of 5.07 R/G, they have an OW% of .542, but their gOW% is just .524. The comparison of OW% to gOW% backs up the observation, although any potential predictive value is undemonstrated.

We can also turn things around and look at team defenses by figuring DW% and gDW%, assuming for DW% that the offense is league average (4.65 R/G).
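The defensive versions mirror the offensive ones. In the sketch below, DW% puts the league-average offense in the numerator. For gDW%, I am assuming the credit for allowing k runs is one minus the league W% when scoring k runs, since league-wide every game in which one team allows k is a game in which its opponent scored k. The function names and the toy team are mine.

```python
LG_RPG = 4.65  # 2008 MLB average R/G, from the text

def dw_pct(ra_per_g, lg=LG_RPG):
    """Conventional DW%: Pythagorean record with a league-average offense."""
    x = (ra_per_g + lg) ** 0.29
    return lg ** x / (lg ** x + ra_per_g ** x)

def gdw_pct(allowed_counts, w_pct_by_runs_scored, cap=10, cap_w_pct=0.950):
    """allowed_counts: {runs_allowed: number_of_games} for one team.
    Credit per game is 1 minus the league W% when scoring that many
    runs (an assumption of this sketch, see the text above)."""
    credit = 0.0
    games = 0
    for runs, n in allowed_counts.items():
        opp_w = cap_w_pct if runs >= cap else w_pct_by_runs_scored[runs]
        credit += (1 - opp_w) * n
        games += n
    return credit / games

# Only the league values quoted in the post
league_w_pct = {1: 0.083, 2: 0.202, 4: 0.480, 5: 0.639}

# Hypothetical sample: one game each allowing 1, 2, 4, 5, and 12 runs
toy_gdw = gdw_pct({1: 1, 2: 1, 4: 1, 5: 1, 12: 1}, league_w_pct)
```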

The biggest divergences:

Positive (gDW% > DW%): TOR, PHI, TB
Negative: TEX, COL

Nice coincidence that both pennant winners are in the positive group.

We can combine gOW% and gDW% into what I will call gEW%, and then compare it to actual W% or EW% (based on Pythagenpat or your win estimator of choice). As I have set it up here, we are assuming independence between the number of runs scored and allowed in a game (this is very likely a faulty assumption). In order to make this easy, I first convert gOW% and gDW% into an equivalent run ratio (rather than a winning percentage). Given a pythagorean exponent of x (x is unique for the offense and defense as above):

Rrat = (gOW%/(1 - gOW%))^(1/x)
RArat = ((1-gDW%)/gDW%)^(1/x)

Then, figure a new pyth exponent for the entire team (y = RPG^.29). That allows us to calculate gEW%:

gEW% = Rrat^y/(Rrat^y + RArat^y)
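Put together, the conversion looks like the sketch below. I am assuming that RPG in the team exponent means the team's own runs scored plus runs allowed per game, and that the offensive and defensive exponents are each figured against the 4.65 league average as described earlier; the function names are mine.

```python
LG_RPG = 4.65  # 2008 MLB average R/G, from the text

def pyth_exponent(rpg):
    """Exponent from total runs per game: RPG^.29."""
    return rpg ** 0.29

def gew_pct(gow, gdw, r_per_g, ra_per_g, lg=LG_RPG):
    """Combine gOW% and gDW% into gEW% via equivalent run ratios."""
    x_off = pyth_exponent(r_per_g + lg)   # exponent used for OW%
    x_def = pyth_exponent(ra_per_g + lg)  # exponent used for DW%
    r_rat = (gow / (1 - gow)) ** (1 / x_off)
    ra_rat = ((1 - gdw) / gdw) ** (1 / x_def)
    # Assumption: the team exponent uses the team's own R/G + RA/G
    y = pyth_exponent(r_per_g + ra_per_g)
    return r_rat ** y / (r_rat ** y + ra_rat ** y)

# A team that is league average on both sides comes out at .500
even_team = gew_pct(0.5, 0.5, LG_RPG, LG_RPG)
```

One useful sanity check: if you feed in the conventional OW% and DW% in place of gOW% and gDW%, the run ratios reduce to R/4.65 and RA/4.65, and the result collapses to the ordinary Pythagenpat estimate from actual runs.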

Here are the teams sorted by gEW%:

Most of the big differences between gEW% and W% will be teams whose pythagorean records also diverge (like the Angels). So the differences I’ll highlight here are between gEW% and EW%, by two or more games:

Positive (gEW% > EW%): HOU, SF, KC, COL, LAA
Negative: DET, CHN, TOR, PHI, CHA

To put this into words, teams in the positive category were expected to win more games when their distributions of runs scored and allowed are considered independently than when only their averages of runs scored and allowed are considered.

While the Angels look better when you consider their run distribution rather than average runs, they still ended up with an actual W% far beyond their gEW% (.617 to .559, easily the largest positive difference in the majors at +9.5 wins). The team with the largest negative difference was the Mariners (.377 to .420, -7 wins).

Unsurprisingly, twenty of the thirty teams have a smaller difference between gEW% and W% than between EW% and W%. The RMSE/162 games for predicting W% is 3.47 for gEW% and 4.19 for EW%.
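For reference, the RMSE per 162 games is the root mean squared difference between predicted and actual W%, scaled up to a full season. A minimal sketch follows; the function name is mine, and the two-team example uses only the Angels and Mariners figures quoted above, so its result is not comparable to the 30-team numbers.

```python
from math import sqrt

def rmse_per_162(predicted, actual, games=162):
    """Root mean squared error of predicted vs. actual W%, in wins per season."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared) / len(squared)) * games

# Angels and Mariners only, using the figures quoted in this post
g_ew = [0.559, 0.420]
w_pct = [0.617, 0.377]
two_team_rmse = rmse_per_162(g_ew, w_pct)
```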

One interesting thing is that no teams are in the range of .477-.505 gEW%. Only two teams were in this range in actual W% although four were for EW%. It doesn’t mean anything, but it’s odd to have a .028 range of gEW% (between 78 and 81 wins) right around the mean not represented.

As I said, there are other sabermetricians out there who could come up with some more revealing ways to look at this data. Hopefully you have found something interesting in this perfunctory look.