Monday, December 26, 2005

Ranking the Leadoff Hitters

The recent signing of Johnny Damon by the Yankees raised the question of “who is the best leadoff man in baseball?”, or at least “how did the ML leadoff hitters perform last year?” On one hand, this is silly, because in general, the guys who would be the best leadoff hitters are the guys who are the best hitters, period. Albert Pujols would create more runs out of the leadoff spot than the best hitter who actually bats leadoff. And this is true for every lineup spot. But this is implicitly recognized by a lot of people, sabermetricians at least, when you ask the question. The question then becomes “of the players who actually hit leadoff, who is the best?” or “whose talents are best suited to hitting leadoff?” There is also the issue that leadoff hitters are only guaranteed to lead off an inning once a game. They presumably will have fewer runners on base ahead of them than other hitters because when they truly lead off, there is nobody on, and when they bat after others they follow the weaker hitters at the bottom of the lineup.

I will go through a number of different methods and show the top and bottom three teams in the league, as well as how the Yankees and Red Sox ranked last year. I have the complete leadoff stats for each team, from STATS Inc., which include all players who hit in the leadoff spot. In parentheses I have the primary leadoff hitter for the team. Some of these primary guys played almost 162 games out of the leadoff spot, while some might have led the team with 60 games. In one case two players were so close in terms of playing time that I have them designated as co-primary leadoff hitters (Jason Ellison and Randy Winn in SF--Winn obviously did the bulk of the leading off after he was acquired).

Anyway, the basic job of a leadoff hitter is said to be to get on base or score runs. So we’ll start with Runs scored per 25.5 outs (AB-H+CS):
1. BOS(Damon), 6.75
2. NYA(Jeter), 6.55
3. PHI(Rollins), 6.04
MLB Avg, 5.24
28. MIN(Stewart), 4.18
29. LA(Izturis), 4.08
30. CHN(Hairston), 3.98
The MLB average in this case is the average for leadoff hitters. This average for all hitters will be pretty much equal to league runs/game. In some of the other categories below, I will provide the overall MLB figures to go along with the leadoff average.
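As a minimal sketch of the rate used above, here is runs per 25.5 outs, with outs taken as AB-H+CS as stated. The stat line is invented purely for illustration and does not belong to any team in the lists.

```python
def runs_per_game(runs, ab, h, cs, outs_per_game=25.5):
    """Runs scored per 25.5 outs, with outs counted as AB - H + CS."""
    outs = ab - h + cs
    return runs * outs_per_game / outs


# Hypothetical leadoff line (made up for illustration, not real data).
example = runs_per_game(runs=120, ab=700, h=200, cs=10)
print(round(example, 2))
```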

Or since these figures are of course dependent on the hitters coming up behind the leadoff men, we can look at getting on base, with On Base Average:
1. NYA(Jeter), .372
2. BAL(Roberts), .370
3. BOS(Damon), .364
MLB Avg, .337
28. COL(Barmes), .293
29. NYN(Reyes), .292
30. CHN(Hairston), .291
Interestingly, MLB leadoff men’s OBA of .337 is just slightly better than the overall OBA of .327. Sadly, the Rockies’ leadoff men put up a .293 OBA despite the fact that their park inflates rate stats by about six or seven percent, which would make it about .276 park-adjusted. Ouch.

OBA includes the times the runner gets on base, but it does not subtract the outs that they make once they are there. If you are leading off, and you get thrown out on the bases, you have done nothing to help your team, because there was nobody to advance. So let’s use what I have called Not Out Average, which in this case is (H+W-CS)/(AB+W):
1. NYA(Jeter), .365
2. BOS(Damon), .363
3. BAL(Roberts), .354
MLB Average, .323
28. COL(Barmes), .288
29. CHN(Hairston), .274
30. NYN(Reyes), .263
This list of course is very similar to the others because all we have done is take out caught stealing. The MLB Average for all hitters was .320. There is a ten point difference between leadoff hitters and the average in OBA, but just three here, because leadoff men tend to get caught stealing more than other hitters since they attempt more steals.
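A small sketch of the two rates, assuming OBA is computed on the same AB+W denominator as NOA (the post does not spell out its OBA formula, so that part is an assumption). Under that assumption the gap between the two is simply CS per (AB+W), and the averages quoted above imply a gap about twice as large for leadoff men as for all hitters. The individual stat line is made up.

```python
def on_base_average(h, w, ab):
    """OBA, assumed here to be (H + W) / (AB + W)."""
    return (h + w) / (ab + w)


def not_out_average(h, w, cs, ab):
    """NOA as given in the post: (H + W - CS) / (AB + W)."""
    return (h + w - cs) / (ab + w)


# Hypothetical line (not real data).
h, w, cs, ab = 200, 60, 10, 700
oba = on_base_average(h, w, ab)
noa = not_out_average(h, w, cs, ab)
print(round(oba, 3), round(noa, 3), round(oba - noa, 3))

# Gaps implied by the MLB averages quoted in the post.
leadoff_gap = round(0.337 - 0.323, 3)  # leadoff OBA minus leadoff NOA
overall_gap = round(0.327 - 0.320, 3)  # all-hitter OBA minus all-hitter NOA
print(leadoff_gap, overall_gap)
```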

We could also look at this in terms of Runners On Base Average. ROBA is the A factor from BsR, per PA. This subtracts HR as well as CS. We could offer this ranking of leadoff men on the grounds that their job is to set the table, and the home run clears the table. The implication is not necessarily that the HR is a bad thing, just that it is not something that lends itself to being a leadoff hitter. I do not personally support this line of thinking, but we can still look at a list:
1. BOS(Damon), .348
2. NYA(Jeter), .340
3. STL(Eckstein), .334
MLB Average, .305
28. TEX(Dellucci), .266
29. NYN(Reyes), .263
30. CHN(Hairston), .256
Texas leadoff men rate poorly here because they clubbed 37 home runs, between Dellucci (23), Matthews (8), Soriano (4), and DeRosa (2). They are well below average in OBA as well (.319, 25th), but they hit seventeen more longballs than the next highest team (Cleveland).
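To make the home run penalty concrete, here is a rough sketch of ROBA as described: hits plus walks, minus home runs and caught stealing, per plate appearance. Plate appearances are approximated as AB+W, which is my assumption, and both stat lines are invented. The two hitters reach base equally often, but the one with more home runs rates lower.

```python
def roba(h, w, hr, cs, ab):
    """Runners On Base Average: (H + W - HR - CS) / PA, with PA taken as AB + W."""
    return (h + w - hr - cs) / (ab + w)


# Two hypothetical hitters with identical H, W, CS and AB (not real data).
slugger = roba(h=200, w=60, hr=25, cs=2, ab=700)
slapper = roba(h=200, w=60, hr=3, cs=2, ab=700)
print(round(slugger, 3), round(slapper, 3))
```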

Another thing we can look at is Bill James’ Run Element Ratio, which is (W + SB)/EB. This is a ratio of things that “set up” innings over things that finish off innings (drive in the runs). It’s not a measure of who is the best leadoff man, it just gives an indication as to which players have strengths that are suited to batting earlier in the inning. If two hitters are equally productive, then the one with the higher RER might well be the better choice to lead off. But a player can have a very high RER while being a terrible player on the basis of a complete lack of power:
1. CHA(Podsednik), 3.24
2. FLA(Pierre), 1.82
3. LAA(Figgins), 1.59
12. NYA(Jeter), .979
MLB Average, .977
22. BOS(Damon), .837
28. CLE(Sizemore), .678
29. KC(DeJesus), .598
30. TEX(Dellucci), .576
Here we see the first significant difference between Damon and Jeter, that being that Jeter’s talents are better suited to setting up an inning, at least according to RER. Again the Rangers’ power out of this spot puts them near the bottom. Grady Sizemore is often mentioned as a guy who will mature out of the leadoff spot and this provides some evidence for that, although the Indians’ leadoff men hit ten triples, which increases their EB total but is not really a power indicator any more than doubles are. The RER for all players was .690, so you can see that this stat seems to incorporate some of the thinking that goes into choosing a leadoff man.
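A quick sketch of RER, taking EB (extra bases) to be total bases minus hits, i.e. D + 2T + 3HR. That reading of EB is my assumption, since it is not defined above. The line is made up, and it also shows how triples count double in the denominator, which is the effect noted for Cleveland.

```python
def run_element_ratio(w, sb, d, t, hr):
    """RER = (W + SB) / EB, with EB assumed to be D + 2*T + 3*HR."""
    extra_bases = d + 2 * t + 3 * hr
    return (w + sb) / extra_bases


# Hypothetical line (not real data).
base = run_element_ratio(w=60, sb=30, d=30, t=5, hr=10)
# Same line with five more triples: EB grows by ten, so RER falls.
more_triples = run_element_ratio(w=60, sb=30, d=30, t=10, hr=10)
print(round(base, 3), round(more_triples, 3))
```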

Another Bill James tool which does speak directly to the question of best leadoff man, and is used by Bill for that purpose, is what I will call LE for Leadoff Efficiency. It is the number of runs per 25.5 outs that a leadoff man is expected to score. Apparently, this formula was introduced in the 1979 Abstract and has not changed since. The premise is that a leadoff man will score 35% of the time he is on first, 55% of the time from second, 80% of the time from third, and that he always scores on a home run (if only Bill would see that this last part is true in team run estimation as well). Times on first is singles plus walks minus stolen base attempts. Times on second is doubles plus steals, and times on third and home are triples and homers respectively. This method gives these rankings:
1. BAL(Roberts), 6.48
2. NYA(Jeter), 6.35
3. BOS(Damon), 6.28
MLB Average, 5.44
28. HOU(Taveras), 4.51
29. COL(Barmes), 4.34
30. CHN(Hairston), 4.31
You can see that this formula expected leadoff hitters to score 5.44 runs per game, but in fact they scored 5.24. Perhaps the formula had a bad year, or perhaps it is a little too rosy. One would expect that, with the increase in overall offense in MLB since this formula was developed, it would estimate too low. But that was not the case, this year at least. Willy Taveras shows up here, and is the perfect example of a leadoff hitter who does nothing to help his team score runs. Taveras actually did not score many runs, ranking fourth from the bottom in R/G. Having speed can certainly be a plus from the leadoff spot. But getting on base is the key, because you can’t use your speed on the bench. Baltimore’s leadoff hitters, ranked on top here, would actually gain in this estimate had they never attempted a steal, because they stole bases at only a 70% clip. Their raw run estimate would decrease, but saving the twelve extra outs would push their efficiency up. Interestingly, the average for all players was 5.41, meaning that an average hitter would have scored runs with practically the same efficiency leading off as the real leadoff men.
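Here is a rough sketch of LE as described, using the 35/55/80/100 percent scoring rates and counting outs as AB-H+CS (the same out definition used for runs per 25.5 outs earlier; applying it here is my assumption). The stat line is invented, with a 28-for-40 steal record chosen only to mimic a 70% success rate. It is not Baltimore's actual line.

```python
def leadoff_efficiency(ab, s, d, t, hr, w, sb, cs, outs_per_game=25.5):
    """Return (expected runs, LE per 25.5 outs) under the stated scoring rates."""
    on_first = s + w - sb - cs   # singles and walks, minus steal attempts
    on_second = d + sb           # doubles plus successful steals
    runs = 0.35 * on_first + 0.55 * on_second + 0.80 * t + 1.0 * hr
    hits = s + d + t + hr
    outs = ab - hits + cs
    return runs, runs * outs_per_game / outs


# Hypothetical line with 28 SB and 12 CS (a 70% clip); not real data.
runs_steal, le_steal = leadoff_efficiency(
    ab=700, s=140, d=35, t=5, hr=20, w=60, sb=28, cs=12)
# Same hitter if he had never attempted a steal.
runs_stay, le_stay = leadoff_efficiency(
    ab=700, s=140, d=35, t=5, hr=20, w=60, sb=0, cs=0)

print(round(runs_steal, 2), round(le_steal, 2))
print(round(runs_stay, 2), round(le_stay, 2))
```

With these made-up numbers the raw run estimate is slightly higher with the steals, but the per-out rate is higher without them, since each caught stealing costs both a baserunner and an out.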

Reminding ourselves that the main job of a leadoff hitter, just like for any hitter, is to create runs and avoid outs, we can look at good old RG:
1. BAL(Roberts), 6.13
2. BOS(Damon), 5.83
3. NYA(Jeter), 5.82
MLB Average, 4.77
28. HOU(Taveras), 3.66
29. COL(Barmes), 3.39
30. CHN(Hairston), 3.38
Again, Colorado has a pathetic showing despite a 15% boost from the park. The Cubs are also frequent residents at the bottom of these lists, which gives you some idea why they set out to get a leadoff hitter. Unfortunately for them, Juan Pierre is 26th at 3.96, and is also just 24th in LE/G at 4.81. The leadoff average of 4.77 is higher than the league R/G, so at least teams put an above average player in the leadoff spot.

Finally, we can look at three specialized measures derived from linear weights-type thinking. The first is to ask the question, “What would the batter’s RC be if he led off an inning in every plate appearance?” I will further add the assumption that if he attempts a steal, it is of second base and it comes while the second batter is at the plate. Ideally I would use a RE table based on current data, but for ease I will use the one published by Palmer and Thorn in The Hidden Game. The RE for the inning, before anything happens, is .454. This goes up to .783 with a runner at first and no outs, so times on first add .329 runs. With a runner at second, it is 1.068, so times on second add .614 runs. With a runner at third, it is 1.277, so times on third add .823 runs. Homers of course add one run. If he makes an out, it drops to .249, so outs are worth -.205 runs. This will be expressed in terms of RAA, which I will just keep as a total. So what I’ll call Pure Leadoff RAA can be written as:
PLRAA = .329(S + W - SB - CS) + .614(D + SB) + .823(T) + HR - .205(AB - H +CS)
1. BAL(Roberts), +25
2. ATL(Furcal), +22
3. BOS(Damon), +20.1
4. NYA(Jeter), +19.8
MLB Average, +5
28. HOU(Taveras), -15
29. CHN(Hairston), -18.7
30. COL(Barmes), -19.4
This can be considered an abstract rating for a leadoff hitter because while they face this situation more than other batting order positions, it is only guaranteed to happen once a game.
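The PLRAA weights can be re-derived from the run expectancy figures quoted above, as in this sketch. The RE values are the ones cited from The Hidden Game in the text, while the stat line is invented for illustration and the function and variable names are mine.

```python
# Run expectancy values quoted in the post (no outs unless noted).
RE_START = 0.454    # bases empty, start of inning
RE_FIRST = 0.783    # runner on first
RE_SECOND = 1.068   # runner on second
RE_THIRD = 1.277    # runner on third
RE_ONE_OUT = 0.249  # bases empty, one out

# Each weight is the change in run expectancy from the starting state.
W_FIRST = round(RE_FIRST - RE_START, 3)    # .329
W_SECOND = round(RE_SECOND - RE_START, 3)  # .614
W_THIRD = round(RE_THIRD - RE_START, 3)    # .823
W_HR = 1.0                                 # one run scores, state resets
W_OUT = round(RE_ONE_OUT - RE_START, 3)    # -.205


def plraa(ab, s, d, t, hr, w, sb, cs):
    """Pure Leadoff RAA as a season total, per the formula in the post."""
    hits = s + d + t + hr
    return (W_FIRST * (s + w - sb - cs)
            + W_SECOND * (d + sb)
            + W_THIRD * t
            + W_HR * hr
            + W_OUT * (ab - hits + cs))


# Hypothetical line (not real data).
example = plraa(ab=700, s=140, d=35, t=5, hr=20, w=60, sb=28, cs=12)
print(round(example, 1))
```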

In the end, it is clear that of the guys who actually hit leadoff last year, Johnny Damon was one of the best. As was Derek Jeter. So the Yankees didn’t need Damon because they had a deficiency at leadoff. What they did have was a deficiency, particularly defensive, in center field. Paying Damon thirteen million a year certainly seems excessive. He is a good player, but not one of the top players in the game. For the past three years his offensive RAA compared to a center fielder has been 0, +20, and +16. That put him second among AL center fielders to Sizemore and 23rd among all AL batters. However, given that the Yankees apparently still have money to spend, and that the biggest need they had (offensively at least; I would like to have better starting pitchers if I were Brian Cashman) was a center fielder, the contract does not seem unjustifiable.


  1. A good way to evaluate leadoff men is by 2*OBA+SLG (as contrasted with the 1.6 to 1.7*OBA +SLG for overall batters). The change in weights pretty much captures the dynamics of the slot (twice as many innings led off as any other slot, fewest runners on base, etc.) This is better than those choices which only concentrate on the scoring part. You could add a SB component if you wanted. Patriot, can you draw up a list for this?

  2. Here are the top and bottom 5, along with a few other names of interest that I mentioned in the post:
    1. BAL(Roberts), 1213
    2. NYA(Jeter), 1182
    3. BOS(Damon), 1170
    4. PIT(Lawton et al), 1169
    5. SF(Ellison/Winn), 1163
    6. ATL(Furcal), 1155
    7. SEA(Suzuki), 1132
    13. TEX(Dellucci), 1102
    15. CLE(Sizemore), 1088
    MLB Average, 1081
    25. CHA(Podsednik), 1007
    26. FLA(Pierre), 989
    27. HOU(Taveras), 979
    28. NYN(Reyes), 962
    29. CHN(Hairston), 940
    30. COL(Barmes), 933

  3. A more pertinent question, once you're through with various estimations of leadoff slot performance, is whether stockpiling high-level top-of-the-batting-order hitters (Jeter and Damon) has any more intrinsic value for run scoring in the context of the batting order. The Yanks are probably going to set up their #1-#2 as Damon-Jeter; how much of an advantage do they get over the average performance in the #2 slot?

