Monday, November 12, 2007

Leadoff Hitters, 2007

For the last two years I have written a piece giving the leading and trailing teams in various categories that can be used to evaluate leadoff performance. I always try to stress that, as numerous studies have shown, batting order construction is not as crucial as conventional wisdom holds it to be. I am personally much more concerned about how a player performs in an average situation than in any particular lineup slot.

Nonetheless, the matter of who will lead off for a team is certainly one that is oft-discussed and is given particular attention by the men who run major league teams. Thus, it is useful to actually know which teams got good production out of the leadoff spot and which did not.

Before I start going into the various categories, let me first emphasize that the data is for each team’s aggregate leadoff performance. In parentheses after each team on a list, I will give the names of the individuals who appeared in at least 20 games in the leadoff spot, but unless the player took every plate appearance of the team’s season in the #1 slot, the statistics are not solely his. Also, the 20-game cutoff does not mean 20 starts at leadoff hitter--it is 20 appearances, regardless of whether some of those came as a pinch hitter, pinch runner, defensive replacement, or what have you.

With the disclaimers out of the way, the most basic job of a leadoff hitter is to score runs. So runs scored per 25.5 outs (outs here are AB-H+CS) seem to be a good place to start:
1. PHI (Rollins), 7.2
2. MIL (Weeks/Hart), 7.2
3. DET (Granderson), 6.8
Leadoff Average, 5.6
ML Average, 4.8
28. CHA (Owens/Erstad/Podsednik), 4.5
29. STL (Eckstein/Taguchi/Miles), 4.5
30. WAS (Lopez/Logan), 4.1

Leadoff Average is the average for the team’s leadoff performances, while ML Average is the average for the league as a whole, slots one through nine. This is a sabermetric blog--I don’t need to point out to you the biases that exist in using actual runs scored data, so I will let those figures stand without comment.
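For anyone who wants to reproduce the rate on their own data, here is a minimal Python sketch of runs scored per 25.5 outs, using the definition of outs given above (AB-H+CS). The function names and the sample batting line are hypothetical and are not taken from any team's actual 2007 totals.

```python
OUTS_PER_GAME = 25.5  # scaling constant used in the text


def outs(ab, h, cs):
    """Outs as defined in the text: at bats minus hits plus caught stealing."""
    return ab - h + cs


def runs_per_game(runs, ab, h, cs):
    """Runs scored per 25.5 outs."""
    return runs * OUTS_PER_GAME / outs(ab, h, cs)


# Hypothetical leadoff line: 120 R, 700 AB, 200 H, 10 CS
print(round(runs_per_game(runs=120, ab=700, h=200, cs=10), 1))
```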

Perhaps even more elemental to the traditional role of the leadoff hitter than scoring runs is getting on base. On Base Average is as important a statistic as there is anyway, so it’s only natural to look at how the leadoff men did:
1. SEA (Suzuki), .389
2. LAA (Willits/Figgins/Matthews), .377
3. FLA (Ramirez/Amezaga), .376
Leadoff Average, .341
ML Average, .332
28. ARI (Young/Byrnes/Drew), .309
29. HOU (Biggio/Burke), .305
30. WAS (Lopez/Logan), .305

To me, the Angels’ high showing is a bit of a surprise, as Chone Figgins and Gary Matthews have never been huge OBA guys, and Reggie Willits was a relative unknown. On the flip side, seeing Craig Biggio and company in a virtual tie for last in baseball is somewhat sad.

A slightly modified version of OBA that is worth looking at is what I call the Runners On Base Average. ROBA removes home runs and caught stealings from the OBA numerator, leaving only those times in which a runner was actually on base to be advanced by his teammates. However, in this stat the home run is treated no differently than an out, so it is to some extent a “style” stat and not a quality stat. That is not to say that ROBA is not a practical thing to know--it is after all just the Base Runs A factor per PA. Just keep in mind that it is a statistic in which higher is usually, but not always, better:
1. SEA (Suzuki), .370
2. BAL (Roberts), .352
3. LAA (Willits/Figgins/Matthews), .351
Leadoff Average, .309
ML Average, .300
28. HOU (Biggio/Burke), .278
29. TOR (Rios/Johnson/Wells), .277
30. ARI (Young/Byrnes/Drew), .260

Not surprisingly, four of the extreme teams are holdovers from the OBA list.
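As a rough illustration of how ROBA relates to OBA, here is a small Python sketch. It assumes a simplified OBA of (H + W)/(AB + W), ignoring hit batters and sacrifice flies; that simplification is my assumption for the example, not necessarily the exact formula behind the lists above. The sample line is hypothetical.

```python
def oba(h, w, ab):
    """Simplified On Base Average (assumption: no HBP or SF terms)."""
    return (h + w) / (ab + w)


def roba(h, w, hr, cs, ab):
    """Runners On Base Average: OBA with HR and CS removed from the numerator."""
    return (h + w - hr - cs) / (ab + w)


# Hypothetical line: 200 H, 60 W, 10 HR, 10 CS, 700 AB
print(round(oba(200, 60, 700), 3))
print(round(roba(200, 60, 10, 10, 700), 3))
```

Because home runs and caught stealings only ever subtract from the numerator, ROBA can never exceed OBA for the same batting line.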

Moving further down the path of style stats is Bill James’ Run Element Ratio, which divides walks and steals by extra bases. The idea behind RER was that it was the ratio of those events that are most important early in an inning (table-setting events with little advancement value like the walk) against those that are most important late in an inning, when runners are already on base (power). Singles are ignored because they serve both purposes well.

RER is not really a statement of quality at all, but a statement of shape. In theory, players with high RERs would seem to be better suited as leadoff hitters than those with low RERs, but it doesn’t necessarily mean that they are actually more productive in the role. I believe RER is most useful when discussing leadoff hitters as a tool to pick out players who don’t fit the conventional wisdom of what a leadoff hitter should be, but who were utilized as such:
1. MIN (Castillo/Casilla/Tyner/Bartlett), 2.3
2. LAA (Willits/Figgins/Matthews), 2.2
3. CHA (Owens/Erstad/Podsednik), 2.1
Leadoff Average, 1.0
ML Average, .7
28. HOU (Biggio/Burke), .5
29. DET (Granderson), .5
30. CHN (Soriano/Theriot), .4

As you can see, we have teams show up in the leaders who have previously been among the trailers in “effectiveness” categories, leaders who were previously leaders, and all other such combinations.
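The ratio described above (walks and steals divided by extra bases) can be sketched in Python as follows. Extra bases are taken here as total bases minus hits, which is my reading of the description; the function name and sample line are hypothetical.

```python
def run_element_ratio(w, sb, tb, h):
    """Run Element Ratio: (walks + steals) / extra bases.

    Extra bases are computed as total bases minus hits (an assumption
    based on the verbal description in the text).
    """
    extra_bases = tb - h
    if extra_bases <= 0:
        raise ValueError("RER is undefined without any extra bases")
    return (w + sb) / extra_bases


# Hypothetical line: 60 W, 30 SB, 290 TB, 200 H
print(run_element_ratio(w=60, sb=30, tb=290, h=200))
```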

Going back to context neutral effectiveness metrics, another Bill James invention was an estimated runs scored figure, based on assumptions about how often a leadoff hitter scored from each base (James used 35% from first, 55% from second, and 80% from third). I call this Leadoff Efficiency when viewed per 25.5 outs:
1. FLA (Ramirez/Amezaga), 8.1
2. MIL (Weeks/Hart), 7.6
3. BAL (Roberts), 7.6
Leadoff Average, 6.2
ML Average, 5.8
28. WAS (Lopez/Logan), 5.1
29. HOU (Biggio/Burke), 5.1
30. STL (Eckstein/Taguchi/Miles), 4.9
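Here is a hedged Python sketch of how an estimate of this kind might be computed. The 35/55/80 scoring rates come from the text; the way events are assigned to bases (singles and walks put the runner on first, steals move him from first to second, caught stealings remove him, home runs always score) is an assumption of mine for illustration and may not match James' exact bookkeeping. Outs are AB-H+CS, as defined earlier in the post. The sample line is hypothetical.

```python
OUTS_PER_GAME = 25.5
P_FIRST, P_SECOND, P_THIRD = 0.35, 0.55, 0.80  # scoring rates quoted in the text


def estimated_runs(singles, doubles, triples, hr, w, sb, cs):
    """Estimated runs scored from where the batter ends up on the bases.

    The base assignments below are assumptions made for this sketch.
    """
    on_first = singles + w - sb - cs   # steals and caught stealings leave first
    on_second = doubles + sb           # steals are treated as arriving at second
    on_third = triples
    return P_FIRST * on_first + P_SECOND * on_second + P_THIRD * on_third + hr


def leadoff_efficiency(singles, doubles, triples, hr, w, sb, cs, ab):
    """Estimated runs per 25.5 outs, with outs = AB - H + CS."""
    hits = singles + doubles + triples + hr
    outs = ab - hits + cs
    est = estimated_runs(singles, doubles, triples, hr, w, sb, cs)
    return est * OUTS_PER_GAME / outs


# Hypothetical line: 150 1B, 30 2B, 10 3B, 10 HR, 60 W, 30 SB, 10 CS, 700 AB
print(round(leadoff_efficiency(150, 30, 10, 10, 60, 30, 10, 700), 2))
```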

Of course, we can always just look at leadoff hitters the same way we would any other player, with a standard, context neutral run estimator. Using ERP as the estimator, here is good old Runs Created per Game:
1. FLA (Ramirez/Amezaga), 7.0
2. MIL (Weeks/Hart), 6.5
3. DET (Granderson), 6.5
Leadoff Average, 5.0
ML Average, 4.9
28. WAS (Lopez/Logan), 3.8
29. STL (Eckstein/Taguchi/Miles), 3.8
30. CHA (Owens/Erstad/Podsednik), 3.6

Finally, as David Smyth suggested for the first incarnation of this piece, we can look at a modified OPS with a weight of 2 for OBA. The most accurate weight for OBA is somewhere in the neighborhood of 1.7, so using 2 is closer to optimal than using 1, but serves to give a little extra boost to OBA, which may be justified when looking at leadoff hitters. The list presented below is actually (2*OBA + SLG)*.7, as the .7 multiplier makes it approximately equal to traditional OPS on the league level. Since we are dealing with meaningless units anyway, we might as well scale them to a meaningless scale with more familiarity (OPS):
1. FLA (Ramirez/Amezaga), 889
2. CHN (Soriano/Theriot), 855
3. DET (Granderson), 851
Leadoff Average, 767
ML Average, 761
28. STL (Eckstein/Taguchi/Miles), 685
29. WAS (Lopez/Logan), 679
30. CHA (Owens/Erstad/Podsednik), 673
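The scaled figure in this last list is simple enough to sketch in a few lines of Python. The formula is the one stated above, (2*OBA + SLG)*.7; the function name is hypothetical, and the slugging value in the example is purely illustrative rather than an actual league figure.

```python
def modified_ops(oba, slg, oba_weight=2.0, scale=0.7):
    """OBA-weighted OPS, rescaled to sit near traditional OPS."""
    return (oba_weight * oba + slg) * scale


# .332 is the ML Average OBA listed earlier; .423 is an illustrative SLG.
value = modified_ops(oba=0.332, slg=0.423)
print(round(1000 * value))  # shown without the decimal point, as in the list
```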

If you are interested in looking at this stuff on your own, I have posted a Google spreadsheet with all of the data.


  1. IIRC, a leadoff hitter will come to bat 48% of the time with 0 outs, 26% with 1 out, and 26% with 2 outs.

    The chances of scoring from 1B are: 40%, 27%, 13%, respectively. That means the scoring rate for a leadoff hitter from 1B is 30%.

    From 2B: 66%, 44%, 22%, for a weighted total of 50%.

    From 3B: 85%, 66%, 28%: 65%.

    James: 35/65/80 don't make sense, unless you make the leadoff hitter come to bat with 0 outs 80 or 90% of the time.

  2. Error on my part: Bill used 55% for second, not 65%. That brings it closer to the optimal value that Tango derived above, leaving the triple with the biggest discrepancy.

  3. If you make the leadoff hitter come to bat 70% with 0 outs, you get this:

    1B: 34%
    2B: 56%
    3B: 74%

    I recommend using 30/50/65.
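The arithmetic in the comments above is a weighted average of the scoring chances by out state. A short Python sketch, using only the percentages quoted in the comments, makes it easy to try other out distributions. The even 15%/15% split of the remaining plate appearances in the 70% case is my assumption; it is consistent with the 34/56/74 figures quoted.

```python
# Chance of scoring from each base with 0, 1, and 2 outs (from the comments)
SCORE_PROB = {
    "1B": (0.40, 0.27, 0.13),
    "2B": (0.66, 0.44, 0.22),
    "3B": (0.85, 0.66, 0.28),
}


def weighted_rates(out_dist):
    """Scoring rate from each base, weighted by the share of PA in each out state."""
    return {
        base: sum(p * w for p, w in zip(probs, out_dist))
        for base, probs in SCORE_PROB.items()
    }


for dist in [(0.48, 0.26, 0.26), (0.70, 0.15, 0.15)]:
    rates = weighted_rates(dist)
    print(dist, {base: round(100 * r) for base, r in rates.items()})
```

With the 48/26/26 split the unrounded results are roughly 30%, 49%, and 65%, so the recommended 30/50/65 is a rounding of those values.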

