Thursday, December 08, 2011

2011 Leadoff Hitters

This post kicks off a series of posts that I write every year, and therefore struggle to infuse with any sort of new perspective. However, they're a tradition on this blog and hold some general interest, so away we go.

This post looks at the offensive performance of teams' leadoff batters. I will try to make this as clear as possible: the statistics are based on the players that hit in the #1 slot in the batting order, whether they were actually leading off an inning or not. It includes the performance of all players who batted in that spot, including substitutes like pinch-hitters.

Listed in parentheses after a team are all players that appeared in twenty or more games in the leadoff slot--while you may see a listing like "BOS (Ellsbury)", this does not mean that the statistic is based solely on Ellsbury's performance; it is the total of all Boston batters in the #1 spot, of which Ellsbury was the only one to appear in that spot in twenty or more games. I will list the top and bottom three teams in each category (plus the top/bottom team from each league if they don't make the ML top/bottom three); complete data is available in a spreadsheet linked at the end of the article. No park factors are applied anywhere in this article.

That's as clear as I can make it, and I hope it will suffice. I always feel obligated to point out that as a sabermetrician, I think that the importance of the batting order is often overstated, and that the best leadoff hitters would generally be the best cleanup hitters, the best #9 hitters, etc. However, since the leadoff spot gets a lot of attention, and teams pay particular attention to the spot, it is instructive to look at how each team fared there.

The conventional wisdom is that the primary job of the leadoff hitter is to get on base and, most simply, to score runs. So let's start by looking at runs scored per 25.5 outs, with outs figured as AB - H + CS:

1. TEX (Kinsler), 6.8
2. MIL (Weeks/Hart), 6.5
3. BOS (Ellsbury), 6.4
Leadoff average, 5.0
ML average, 4.3
28. LAA (Izturis/Aybar), 4.0
29. STL (Theriot/Furcal), 3.9
30. WAS (Bernadina/Desmond/Espinosa), 3.9

Obviously you all know the biases inherent in looking at actual runs scored. It is odd to see St. Louis near the bottom as they had a good offense overall. Usually the leadoff hitters will manage to score some runs when they have Pujols, Holliday and Berkman coming up behind them whether they get on base that much or not.
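For anyone who wants to reproduce the rate from the spreadsheet, here is a minimal sketch of the calculation described above. The function name and the sample totals are made up for illustration; only the outs definition (AB - H + CS) and the 25.5 outs per game come from this post.

```python
def runs_per_25_5_outs(r, ab, h, cs, outs_per_game=25.5):
    """Runs scored per 25.5 outs, with outs estimated as AB - H + CS."""
    outs = ab - h + cs
    return r * outs_per_game / outs


# Hypothetical leadoff-slot totals, not any real team's line
print(round(runs_per_25_5_outs(r=100, ab=680, h=180, cs=10), 1))  # 5.0
```

With these invented totals the slot makes 510 outs, which is exactly twenty "games" of 25.5 outs, so 100 runs works out to 5.0 per game.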

Speaking of getting on base, the other obvious measure to look at is On Base Average. The figures here exclude HB and SF to be directly comparable to earlier versions of this article, but those categories are available in the spreadsheet if you'd like to include them:

1. CHN (Castro/Fukudome), .364
2. NYN (Reyes/Pagan), .364
3. BOS (Ellsbury), .362
Leadoff average, .324
ML average, .317
28. BAL (Hardy/Roberts/Andino), .287
29. SF (Torres/Rowand), .282
30. WAS (Bernadina/Desmond/Espinosa), .277

I would not have correctly identified the Cubs as having the highest OBA out of the leadoff spot in my first fifteen guesses, I don’t think. The seven point difference between the overall major league OBA and the OBA of leadoff men is a little smaller than it usually is, but last year the gap was just two points.

The next statistic is what I call Runners On Base Average. The genesis of it is from the A factor of Base Runs. It measures the number of times a batter reaches base per PA--excluding homers, since a batter that hits a home run never actually runs the bases. It also subtracts caught stealing, because the BsR version I often use does as well, although BsR versions based on initial baserunners rather than final baserunners do not. In formula terms, ROBA = (H + W - HR - CS)/(AB + W).

My 2009 leadoff post was linked to a Cardinals message board, and this metric was the cause of a lot of confusion (this was mostly because the poster in question was thick-headed as could be, but it's still worth addressing). ROBA, like several other methods that follow, is not really a quality metric, it is a descriptive metric. A high ROBA is a good thing, but it's not necessarily better than a slightly lower ROBA plus a higher home run rate (which would produce a higher OBA and more runs). Listing ROBA is not in any way, shape or form a statement that hitting home runs is bad for a leadoff hitter. It is simply a recognition of the fact that a batter that hits a home run is not a baserunner. Base Runs is an excellent model of offense and ROBA is one of its components, and thus it holds some interest in describing how a team scored its runs, rather than how many it scored:

1. CHN (Castro/Fukudome), .339
2. NYN (Reyes/Pagan), .336
3. PIT (Tabata/McCutchen/Presley), .315
Leadoff average, .291
ML average, .285
28. SF (Torres/Rowand), .253
29. BAL (Hardy/Roberts/Andino), .253
30. WAS (Bernadina/Desmond/Espinosa), .247

You are probably starting to notice a lot of repetition in the leaders and trailers. Obviously a lot of these metrics measure the same thing in slightly different ways or measure similar things, so it’s to be expected.

I will also include what I've called Literal OBA here--this is just ROBA with HR subtracted from the denominator so that a homer does not lower LOBA, it simply has no effect. You don't really need ROBA and LOBA (or either, for that matter), but this might save some poor message board out there twenty posts, so here goes. LOBA = (H + W - HR - CS)/(AB + W - HR):

1. CHN (Castro/Fukudome), .344
2. NYN (Reyes/Pagan), .341
3. PIT (Tabata/McCutchen/Presley), .321
Leadoff average, .297
ML average, .292
28. BAL (Hardy/Roberts/Andino), .261
29. SF (Torres/Rowand), .257
30. WAS (Bernadina/Desmond/Espinosa), .252

In this presentation, the rank difference between ROBA and LOBA is barely noticeable.
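Since the three on-base measures differ only in how they treat homers and caught stealing, a short sketch may help. It assumes the definitions used in this post: OBA without HB and SF, LOBA as given above, and ROBA as LOBA without the HR adjustment in the denominator. The function names and the stat line are hypothetical.

```python
def oba(h, w, ab):
    """On Base Average, excluding HB and SF."""
    return (h + w) / (ab + w)


def roba(h, w, hr, cs, ab):
    """Runners On Base Average: homers and caught stealing come out of the numerator."""
    return (h + w - hr - cs) / (ab + w)


def loba(h, w, hr, cs, ab):
    """Literal OBA: ROBA with homers also removed from the denominator."""
    return (h + w - hr - cs) / (ab + w - hr)


# Made-up line, then the same line with one extra at bat that ends in a homer
base = dict(h=180, w=60, hr=15, cs=10, ab=680)
plus_hr = dict(h=181, w=60, hr=16, cs=10, ab=681)

print(round(oba(base["h"], base["w"], base["ab"]), 3))
print(roba(**plus_hr) < roba(**base))   # True: the extra homer lowers ROBA
print(loba(**plus_hr) == loba(**base))  # True: the extra homer leaves LOBA alone
```

The last two lines are the whole point of listing both: an extra homer adds a PA without adding a baserunner, so ROBA drops, while LOBA simply ignores the plate appearance.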

The next two categories are most definitely categories of shape, not value. The first is the ratio of runs scored to RBI. Leadoff hitters as a group score many more runs than they drive in, partly due to their skills and partly due to lineup dynamics. Those with low ratios don’t fit the traditional leadoff profile as closely as those with high ratios (at least in the way their seasons played out):

1. LA (Gordon/Gwynn/Carroll/Furcal), 2.5
2. HOU (Bourn/Bourgeois/Schafer), 2.1
3. DET (Jackson), 2.0
Leadoff average, 1.6
26. WAS (Bernadina/Desmond/Espinosa), 1.2
28. KC (Gordon/Getz), 1.2
29. BOS (Ellsbury), 1.2
30. BAL (Hardy/Roberts/Andino), 1.2
ML average, 1.1

The presence of the Red Sox in the bottom three on this list should drive home the point about this not being a quality metric. The leadoff hitters that rank the lowest in R/BI are those that drive in almost as many runs as they score. If you had a leadoff hitter that was driving in many more runs than he scored, that might be cause for some reconsideration of your batting order, but having some scored/batted in parity is not inherently a bad thing.

A similar gauge, but one that doesn't rely on the teammate-dependent R and RBI totals, is Bill James' Run Element Ratio. RER was described by James as the ratio between those things that were especially helpful at the beginning of an inning (walks and stolen bases) to those that were especially helpful at the end of an inning (extra bases). It is a ratio of "setup" events to "cleanup" events. Singles aren't included because they often function in both roles.

Of course, there are RBI walks, and doubles are a great way to start an inning, but RER classifies events based on when they have the highest relative value, at least from a simple analysis:

1. CHA (Pierre), 2.4
2. MIN (Revere/Span), 1.9
3. LA (Gordon/Gwynn/Carroll/Furcal), 1.8
Leadoff average, 1.0
ML average, .8
28. BAL (Hardy/Roberts/Andino), .6
29. BOS (Ellsbury), .6
30. MIL (Weeks/Hart), .6

Last year, the White Sox led handily in RER, due in large part to Pierre’s steals. This year, Pierre didn’t steal as many bases but still managed to slap his team to the top.
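The post describes RER in words only, so here is a sketch using the formula James's ratio is usually written with, (W + SB)/(TB - H), where TB - H is extra bases on hits. Treat that exact form as an assumption on my part rather than something spelled out above; the inputs are invented.

```python
def run_element_ratio(w, sb, tb, h):
    """Setup events (walks, steals) divided by cleanup events (extra bases)."""
    extra_bases = tb - h
    return (w + sb) / extra_bases


# Hypothetical totals: 60 walks, 30 steals, 270 total bases on 180 hits
print(run_element_ratio(w=60, sb=30, tb=270, h=180))  # 1.0
```

A slap hitter who walks, runs, and rarely hits for extra bases pushes the ratio well above 1; a power hitter batting first pulls it below 1.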

Speaking of stolen bases, last year I started including a measure that considered only base stealing. Obviously there's a lot more that goes into being a leadoff hitter than simply stealing bases, but it is one of the areas that is often cited as important. So I've included the ranking for what some analysts call net steals, SB - 2*CS. I'm not going to worry about the precise breakeven rate, which is probably closer to 75% than 67%, but is also variable based on situation. The ML and leadoff averages in this case are per team lineup slot:

1. HOU (Bourn/Bourgeois/Schafer), 29
1. NYN (Reyes/Pagan), 29
3. SEA (Suzuki), 26
Leadoff average, 11
ML average, 3
28. CHA (Pierre), -3
29. STL (Theriot/Furcal), -6
29. CLE (Brantley/Sizemore/Carrera), -6

The Indians have just missed the trailer spots on a number of these lists. At least Cleveland and St. Louis are at the bottom largely because their leadoff hitters didn’t attempt that many steals. Only Milwaukee and Baltimore leadoff hitters (16 and 21 respectively) attempted fewer steals than Cleveland (24) and St. Louis (18). Neither the Tribe (58%) nor the Redbirds (56%) had much success when they did run, but they weren’t trying it all that much. The White Sox, on the other hand, were 31-for-48 (65%), a poor percentage on the eleventh-most attempts.
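As a quick check on the arithmetic, the sketch below computes net steals for the White Sox line quoted above (31 steals in 48 attempts) and shows the breakeven success rate implied by a given caught-stealing weight. The function names are mine; the weight of 2 is the one used in this post, and a weight of 3 is shown only because it corresponds to the 75% figure mentioned earlier.

```python
def net_steals(sb, cs, cs_weight=2):
    """Net steals: SB minus a penalty of cs_weight for each caught stealing."""
    return sb - cs_weight * cs


def breakeven_rate(cs_weight=2):
    """Success rate at which SB - cs_weight*CS equals zero."""
    return cs_weight / (cs_weight + 1)


sb, attempts = 31, 48          # White Sox leadoff hitters, from the paragraph above
cs = attempts - sb             # 17 times caught
print(net_steals(sb, cs))      # -3, matching the list
print(round(breakeven_rate(2), 3), breakeven_rate(3))  # 0.667 0.75
```

So SB - 2*CS quietly assumes a 67% breakeven, and moving to SB - 3*CS would be the equivalent of assuming 75%.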

Let's shift gears back to quality measures, beginning with one that David Smyth proposed when I first wrote this annual leadoff review. Since the optimal weight for OBA in an x*OBA + SLG metric is generally something like 1.7, David suggested figuring 2*OBA + SLG for leadoff hitters, as a way to give a little extra boost to OBA while not distorting things too much, or even suffering an accuracy decline from standard OPS. Since this is a unitless measure anyway, I multiply it by .7 to approximate the standard OPS scale and call it 2OPS:

1. BOS (Ellsbury), 882
2. NYN (Reyes/Pagan), 835
3. MIL (Weeks/Hart), 834
Leadoff average, 733
ML average, 723
28. CHA (Pierre), 669
29. SF (Torres/Rowand), 645
30. WAS (Bernadina/Desmond/Espinosa), 630
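A minimal sketch of the 2OPS calculation, assuming the figures in the list are simply .7*(2*OBA + SLG) shown without the decimal point (that is, multiplied by 1000 and rounded). The function name and the sample OBA/SLG pair are hypothetical.

```python
def two_ops(oba, slg):
    """2OPS: (2*OBA + SLG) scaled by .7, shown as a three-digit whole number."""
    return round(1000 * 0.7 * (2 * oba + slg))


# Made-up hitter: .350 OBA, .450 SLG (a standard OPS of .800)
print(two_ops(0.350, 0.450))  # 805
```

The made-up hitter's 805 sits right next to his 800 OPS, which is the intent of the .7 multiplier: the scale stays familiar while OBA carries twice the weight.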

Along the same lines, one can also evaluate leadoff hitters in the same way I'd go about evaluating any hitter, and just use Runs Created per Game with standard weights (this will include SB and CS, which are ignored by 2OPS):

1. BOS (Ellsbury), 6.7
2. NYN (Reyes/Pagan), 6.2
3. TEX (Kinsler), 6.1
Leadoff average, 4.6
ML average, 4.4
28. CHA (Pierre), 3.4
29. SF (Torres/Rowand), 3.4
30. WAS (Bernadina/Desmond/Espinosa), 3.4

Finally, allow me to close with a crude theoretical measure of linear weights supposing that the player always led off an inning (that is, batted in the bases empty, no outs state). There are weights out there (see The Book) for the leadoff slot in its average situation, but this variation is much easier to calculate (although also based on a silly and impossible premise).

The weights I used were based on the 2010 run expectancy table from Baseball Prospectus. Ideally I would have used multiple seasons but this is a seat-of-the-pants metric. Last year’s post went into the detail of how I figured it; this year, I’ll just tell you that the out coefficient was -.22, the CS coefficient was -.587, and for other details refer you to that post. I then restate it per the number of PA for an average leadoff spot (741 in 2011):

1. BOS (Ellsbury), 29
2. TEX (Kinsler), 26
3. NYN (Reyes/Pagan), 25
Leadoff average, 0
ML average, -3
28. CHA (Pierre), -20
29. WAS (Bernadina/Desmond/Espinosa), -20
30. SF (Torres/Rowand), -21

From an overview of all of these metrics, I think it’s safe to say that Red Sox and Mets leadoff hitters were pretty effective while White Sox, Nationals and Giants were not. I was a little disappointed that the Braves and Astros didn’t make any lists together here as each team used both Michael Bourn and Jordan Schafer in twenty or more games out of the #1 spot. Obviously that’s a possibility when players are traded for each other, but it would have been particularly amusing had one team been on the leader list and the other on the trailer list.

A spreadsheet with all of the data and the full lists is available.

2 comments:

  1. Wow! Excellent information that is well written, thorough, yet easy to read.

    I've been seeking the best metric available (I know there is most likely more than one) to determine if Ian Kinsler is truly the better leadoff option over Elvis Andrus for the Texas Rangers.

    I know that regardless of which statistic(s) implicated, the argument is so subjective in nature that there might not be a definitive answer.

    Any suggestions?

    Regardless, I greatly appreciate you compiling all of this information and making it available online! Take care.

    Best,

    Tim

  2. Tim, as you allude to, it's hard to pick just one criterion on which to choose a leadoff hitter. If I had to pick just one metric, I would look at the last one (the linear weights metric), but that doesn't help you identify whether a player should be a leadoff hitter or, say, a #3 hitter (most of the top scorers on the league level in many of these metrics will be guys like Joey Votto or Jose Bautista that no manager would ever lead off).

    If I was Ron Washington, and Elvis Andrus was roughly my seventh best overall offensive player (behind Napoli, Kinsler, Beltre, Hamilton, Cruz for the sake of discussion), I wouldn't consider giving my 7th best hitter the most PA on the team.
