Monday, December 12, 2016

Hitting by Lineup Position, 2016

I devoted a whole post to leadoff hitters, whether justified or not, so it's only fair to have a post about hitting by batting order position in general. I certainly consider this piece to be more trivia than sabermetrics, since there’s no analytic content.

The data in this post was taken from Baseball-Reference. The figures are park-adjusted. RC is ERP, including SB and CS, as used in my end of season stat posts. The weights used are constant across lineup positions; there was no attempt to apply specific weights to each position, although they are out there and would certainly make this a little bit more interesting:

The seven-year run of NL #3 hitters as the best position in baseball was snapped, albeit by an insignificant .01 RG by AL #3 hitters. Since Mike Trout’s previous career high in PA out of the #3 spot was 336 in 2015 and he racked up 533 this year, I’m going to give full credit to Trout; as we will see in a moment, the Angels’ #3 hitters were the best single lineup spot in baseball. #2 hitters did not outperform #5 in both circuits as they did last year, just the AL. However, the NL made up for it by having their leadoff hitters create runs at almost the exact same rate as their #5s.

Next are the team leaders and trailers in RG at each lineup position. The player listed is the one who appeared in the most games in that spot (which can be misleading, especially for spots low in the batting order where many players cycle through):

A couple of things that stood out to me were St. Louis’ dominance at the bottom of the order and the way in which catchers named Perez managed to sabotage lineup spots for two teams. Apologies to Carlos Beltran (the real culprits for the poor showing of Texas #3 hitters were Adrian Beltre, Prince Fielder, and Nomar Mazara) and Luis Valbuena (Carlos Gomez and Marwin Gonzalez).

The case of San Diego’s cleanup hitters deserves special attention. Yangervis Solarte was actually pretty good when batting cleanup, as his .289/.346/.485 line in 289 PA compares favorably to the NL average for cleanup hitters. The rest of the Padres who appeared in that spot combined for 399 PA with a dreadful .187/.282/.336 line. Just to give you a quick idea of how bad this is, the 618 OPS would have been the eleventh-worst among any non-NL #9 lineup spot in the majors, leading only 6 AL #9s, 2 #2s, a #7, and the horrible Oakland #2s. It was also worse than the Cardinals’ #9 hitters.

The next list is the ten best positions in terms of runs above average, measured against the average for their particular league spot (so AL leadoff spots are compared to the AL average leadoff performance, etc.):

And the ten worst:

Joe Mauer himself wasn’t that bad, with a 799 OPS when hitting third. That’s still well below the AL average, but not bottom ten in RAA bad without help from his friends.
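
The comparison to a league-and-spot baseline can be sketched in a few lines. This is only an illustration: the post does not spell out its RAA formula, so the conversion below (the RG difference times outs, divided by 25.5 outs per game) is an assumption, and the function name and sample numbers are hypothetical.

```python
OUTS_PER_GAME = 25.5  # assumed outs per game; the same figure the leadoff post uses


def runs_above_average(rg, league_spot_rg, outs):
    """RAA for one lineup spot versus the average for that league and spot.

    Assumed formula (not confirmed by the post):
        (RG - league-spot RG) * outs / 25.5
    """
    return (rg - league_spot_rg) * outs / OUTS_PER_GAME


# Hypothetical inputs: a spot creating 6.0 RG against a 5.0 baseline over 459 outs.
sample_raa = runs_above_average(6.0, 5.0, 459)
print(round(sample_raa, 1))
```

The point of the baseline argument is that the same RG produces a different RAA depending on the league and lineup spot it is measured against.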

The last set of charts shows each team’s RG rank within their league at each lineup spot. The top three are bolded and the bottom three displayed in red to provide quick visual identification of excellent and poor production:

The full spreadsheet is available here.

Monday, December 05, 2016

Leadoff Hitters, 2016

I will try to make this as clear as possible: the statistics are based on the players that hit in the #1 slot in the batting order, whether they were actually leading off an inning or not. It includes the performance of all players who batted in that spot, including substitutes like pinch-hitters.

Listed in parentheses after a team are all players that started in twenty or more games in the leadoff slot--while you may see a listing like "COL (Blackmon)" this does not mean that the statistic is based solely on Blackmon's performance; it is the total of all Colorado batters in the #1 spot, of which Blackmon was the only one to start in that spot in twenty or more games. I will list the top and bottom three teams in each category (plus the top/bottom team from each league if they don't make the ML top/bottom three); complete data is available in a spreadsheet linked at the end of the article. There are also no park factors applied anywhere in this article.

That's as clear as I can make it, and I hope it will suffice. I always feel obligated to point out that as a sabermetrician, I think that the importance of the batting order is often overstated, and that the best leadoff hitters would generally be the best cleanup hitters, the best #9 hitters, etc. However, since the leadoff spot gets a lot of attention, and teams pay particular attention to the spot, it is instructive to look at how each team fared there.

The conventional wisdom is that the primary job of the leadoff hitter is to get on base, and most simply, score runs. It should go without saying on this blog that runs scored are heavily dependent on the performance of one’s teammates, but when writing on the internet it’s usually best to assume nothing. So let's start by looking at runs scored per 25.5 outs (AB - H + CS):

1. HOU (Springer/Altuve), 6.9
2. COL (Blackmon), 6.7
3. DET (Kinsler), 6.6
Leadoff average, 5.2
ML average, 4.5
28. SF (Span), 4.4
29. KC (Escobar/Dyson/Merrifield), 4.1
30. OAK (Crisp/Burns), 3.4

Again, no park adjustments were applied, so the Rockies’ performance was good but it wasn’t really “best in the NL good”. I’m also going to have a hard time resisting just writing “Esky Magic” every time the Royals appear on a trailers list.
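
For anyone who wants to reproduce the rate from the spreadsheet, here is a minimal sketch of runs scored per 25.5 outs, with outs figured as AB - H + CS as stated above. The function name and the sample totals are hypothetical, not taken from any team's actual line.

```python
def runs_per_game(runs, ab, h, cs, outs_per_game=25.5):
    """Runs scored per 25.5 outs, with outs = AB - H + CS."""
    outs = ab - h + cs
    if outs <= 0:
        raise ValueError("outs must be positive")
    return runs * outs_per_game / outs


# Hypothetical leadoff-slot totals: 110 R, 650 AB, 180 H, 10 CS (480 outs).
sample_rate = runs_per_game(runs=110, ab=650, h=180, cs=10)
print(round(sample_rate, 1))
```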

The most basic team independent category that we could look at is OBA (figured as (H + W + HB)/(AB + W + HB)):

1. CHN (Fowler/Zobrist), .383
2. HOU (Springer/Altuve), .375
3. STL (Carpenter), .370
Leadoff average, .341
ML average, .324
28. WAS (Turner/Revere/Taylor), .305
29. KC (Escobar/Dyson/Merrifield), .298
30. OAK (Crisp/Burns), .290

Esky Magic. And once again Billy Burns chipped in to Oakland’s anemic showing, and of course Kansas City just had to have Billy Burns.

The next statistic is what I call Runners On Base Average. The genesis for ROBA is the A factor of Base Runs. It measures the number of times a batter reaches base per PA--excluding homers, since a batter that hits a home run never actually runs the bases. It also subtracts caught stealing here because the BsR version I often use does as well, but BsR versions based on initial baserunners rather than final baserunners do not. Here ROBA = (H + W + HB - HR - CS)/(AB + W + HB).
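
A short sketch of the two formulas side by side may help. It uses the OBA and ROBA definitions exactly as given above; the function names and the sample totals are hypothetical.

```python
def oba(h, w, hb, ab):
    """On Base Average: (H + W + HB) / (AB + W + HB)."""
    return (h + w + hb) / (ab + w + hb)


def roba(h, w, hb, hr, cs, ab):
    """Runners On Base Average: (H + W + HB - HR - CS) / (AB + W + HB)."""
    return (h + w + hb - hr - cs) / (ab + w + hb)


# Hypothetical leadoff-slot totals.
totals = dict(h=180, w=60, hb=5, ab=650)
sample_oba = oba(**totals)
sample_roba = roba(hr=20, cs=10, **totals)

# The two share a denominator, so the gap is just (HR + CS) per PA.
gap = sample_oba - sample_roba
print(f"OBA {sample_oba:.3f}, ROBA {sample_roba:.3f}, gap {gap:.3f}")
```

Because the denominators match, the difference between a team's OBA and ROBA is simply its homers plus caught stealings per plate appearance.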

This metric has caused some confusion, so I’ll expound. ROBA, like several other methods that follow, is not really a quality metric, it is a descriptive metric. A high ROBA is a good thing, but it's not necessarily better than a slightly lower ROBA plus a higher home run rate (which would produce a higher OBA and more runs). Listing ROBA is not in any way, shape or form a statement that hitting home runs is bad for a leadoff hitter. It is simply a recognition of the fact that a batter that hits a home run is not a baserunner. Base Runs is an excellent model of offense and ROBA is one of its components, and thus it holds some interest in describing how a team scored its runs, rather than how many it scored:

1. CHN (Fowler/Zobrist), .351
2. MIA (Gordon/Suzuki/Dietrich/Realmuto), .335
3. ATL (Inciarte/Peterson/Markakis), .331
4. HOU (Springer/Altuve), .331
Leadoff average, .305
ML average, .287
28. TEX (Choo/Odor/DeShields/Profar), .264
29. WAS (Turner/Revere/Taylor), .260
30. MIN (Dozier/Nunez), .256

Kansas City leadoff hitters finished tied for last in the majors with five home runs (with Miami), so Esky Magic was only good for 23rd place. Twins leadoff hitters, thanks primarily to Dozier, led the majors with 39 homers. So only after around 25.6% of leadoff hitter plate appearances did they actually wind up with a runner on base. Their .320 OBA was well below average too, but again ROBA describes how an offense plays out--other considerations are necessary to determine how good it was.

I also include what I've called Literal OBA--this is just ROBA with HR subtracted from the denominator so that a homer does not lower LOBA, it simply has no effect. It “literally” (not really, thanks to errors, out stretching, caught stealing after subsequent plate appearances, etc.) is the proportion of plate appearances in which the batter becomes a baserunner able to be advanced by his teammates. You don't really need ROBA and LOBA (or either, for that matter), but this might save some poor message board out there twenty posts, by not implying that I think home runs are bad, so here goes. LOBA = (H + W + HB - HR - CS)/(AB + W + HB - HR):

1. CHN (Fowler/Zobrist), .360
2. HOU (Springer/Altuve), .344
3. STL (Carpenter), .342
Leadoff average, .313
ML average, .297
28. OAK (Crisp/Burns), .273
29. MIN (Dozier/Nunez), .270
30. WAS (Turner/Revere/Taylor), .268
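
The claim that a homer "simply has no effect" on LOBA is easy to check numerically. The sketch below uses the ROBA and LOBA formulas as defined in this post; the function names and the sample totals are hypothetical.

```python
def roba(h, w, hb, hr, cs, ab):
    """(H + W + HB - HR - CS) / (AB + W + HB)."""
    return (h + w + hb - hr - cs) / (ab + w + hb)


def loba(h, w, hb, hr, cs, ab):
    """(H + W + HB - HR - CS) / (AB + W + HB - HR)."""
    return (h + w + hb - hr - cs) / (ab + w + hb - hr)


# Hypothetical totals, then the same line with one extra home run
# (one more AB, one more H, one more HR).
before = dict(h=180, w=60, hb=5, hr=20, cs=10, ab=650)
after = dict(before, h=181, hr=21, ab=651)

loba_before, loba_after = loba(**before), loba(**after)
roba_before, roba_after = roba(**before), roba(**after)
print(f"LOBA {loba_before:.4f} -> {loba_after:.4f}")
print(f"ROBA {roba_before:.4f} -> {roba_after:.4f}")
```

The extra homer adds one to both H and HR in the numerator and one to both AB and HR in the LOBA denominator, so LOBA is unchanged while ROBA ticks down.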

The next two categories are most definitely categories of shape, not value. The first is the ratio of runs scored to RBI. Leadoff hitters as a group score many more runs than they drive in, partly due to their skills and partly due to lineup dynamics. Those with low ratios don’t fit the traditional leadoff profile as closely as those with high ratios (at least in the way their seasons played out):

1. MIA (Gordon/Suzuki/Dietrich/Realmuto), 2.6
2. SD (Jankowski/Jay), 2.3
3. ATL (Inciarte/Peterson/Markakis), 2.0
6. LAA (Escobar/Calhoun), 1.9
Leadoff average, 1.5
ML average, 1.0
26. STL (Carpenter), 1.3
28. BOS (Betts/Pedroia), 1.2
29. OAK (Crisp/Burns), 1.2
30. MIN (Dozier/Nunez), 1.1

This speaks more to me than the measure, but the most interesting thing I learned from that list was that Travis Jankowski was San Diego’s primary leadoff hitter (71 games). Looking at the rest of the list, I think I could have guessed most teams in two or three tries, but I never would have gotten the Padres.

A similar gauge, but one that doesn't rely on the teammate-dependent R and RBI totals, is Bill James' Run Element Ratio. RER was described by James as the ratio between those things that were especially helpful at the beginning of an inning (walks and stolen bases) to those that were especially helpful at the end of an inning (extra bases). It is a ratio of "setup" events to "cleanup" events. Singles aren't included because they often function in both roles.

Of course, there are RBI walks, and doubles are a great way to start an inning, but RER classifies events based on when they have the highest relative value, at least from a simple analysis:

1. MIA (Gordon/Suzuki/Dietrich/Realmuto), 1.8
2. ATL (Inciarte/Peterson/Markakis), 1.4
3. PHI (Herrera/Hernandez), 1.4
6. NYA (Ellsbury/Gardner), 1.2
Leadoff average, .8
ML average, .7
26. COL (Blackmon), .5
28. TB (Forsythe/Guyer), .5
29. DET (Kinsler), .5
30. BAL (Jones/Rickard), .4

The Orioles certainly had a non-traditional leadoff profile thanks mostly to Jones; their five stolen base attempts were the fewest of any team, they were tied for third with 30 homers, and they drew 20 fewer walks than an average team out of the leadoff spot.
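
For reference, a minimal sketch of the ratio. The post describes RER only in words, so the formula used here, (W + SB)/(TB - H), is my understanding of James' definition and should be treated as an assumption; the function name and sample totals are hypothetical.

```python
def run_element_ratio(w, sb, tb, h):
    """Setup events over cleanup events: (W + SB) / (TB - H).

    TB - H is extra bases on hits; singles contribute zero to it.
    """
    extra_bases = tb - h
    if extra_bases <= 0:
        raise ValueError("RER is undefined without extra bases")
    return (w + sb) / extra_bases


# Hypothetical totals: 60 W, 20 SB, 280 TB on 180 H (100 extra bases).
sample_rer = run_element_ratio(w=60, sb=20, tb=280, h=180)
print(round(sample_rer, 1))
```

A low-walk, low-steal, high-power line like the Orioles' pushes the numerator down and the denominator up at the same time, which is why it lands at the bottom of the list.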

Since stealing bases is part of the traditional skill set for a leadoff hitter, I've included the ranking for what some analysts call net steals, SB - 2*CS. I'm not going to worry about the precise breakeven rate, which is probably closer to 75% than 67%, but is also variable based on situation. The ML and leadoff averages in this case are per team lineup slot:

1. WAS (Turner/Revere/Taylor), 30
2. MIL (Villar/Santana), 27
3. MIA (Gordon/Suzuki/Dietrich/Realmuto), 22
4. CLE (Santana/Davis), 20
Leadoff average, 6
ML average, 2
28. TB (Forsythe/Guyer), -11
29. SEA (Aoki/Martin), -13
30. PHI (Herrera/Hernandez), -16

The Indians are a good example of why I list all players who had at least twenty starts in the leadoff spot; AL steal leader Rajai Davis’ 69 games leading off led to them leading the AL in net steals.
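
The link between the caught stealing weight and the breakeven rate mentioned above can be made explicit. In this sketch (function names and sample totals are hypothetical), a weight of k on CS means net steals are zero when SB = k*CS, which is a success rate of k/(k + 1).

```python
def net_steals(sb, cs, cs_weight=2):
    """Net steals as used in this post: SB - 2*CS by default."""
    return sb - cs_weight * cs


def implied_breakeven(cs_weight):
    """Success rate at which SB - k*CS equals zero: k / (k + 1)."""
    return cs_weight / (1 + cs_weight)


sample_net = net_steals(sb=30, cs=5)       # hypothetical totals
rate_with_2 = implied_breakeven(2)          # the 67% figure
rate_with_3 = implied_breakeven(3)          # a 75% breakeven would need SB - 3*CS
print(sample_net, round(rate_with_2, 3), rate_with_3)
```

So using 2 as the weight is equivalent to assuming a 67% breakeven; moving to the 75% figure would mean charging three steals for every caught stealing.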

Shifting back to quality measures, first up is one that David Smyth proposed when I first wrote this annual leadoff review. Since the optimal weight for OBA in an x*OBA + SLG metric is generally something like 1.7, David suggested figuring 2*OBA + SLG for leadoff hitters, as a way to give a little extra boost to OBA while not distorting things too much, or even suffering an accuracy decline from standard OPS. Since this is a unitless measure anyway, I multiply it by .7 to approximate the standard OPS scale and call it 2OPS:

1. COL (Blackmon), 880
2. BOS (Betts/Pedroia), 872
3. HOU (Springer/Altuve), 865
Leadoff average, 775
ML average, 745
28. SF (Span), 722
29. OAK (Crisp/Burns), 654
30. KC (Escobar/Dyson/Merrifield), 650

Esky Magic.
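
As a worked example of the 2OPS arithmetic, here is a small sketch applied to the two Royals slash lines quoted at the end of this post (Escobar's .272 OBA/.289 SLG in the #1 spot, and .317/.378 for the rest of the Royals). The function names are hypothetical; the formula is the .7*(2*OBA + SLG) described above, shown in points to match the list.

```python
def two_ops(oba, slg):
    """2OPS = .7 * (2*OBA + SLG), on roughly the standard OPS scale."""
    return 0.7 * (2 * oba + slg)


def as_points(value):
    """Express a rate like .583 as 583, matching how the list is displayed."""
    return round(value * 1000)


escobar_2ops = as_points(two_ops(oba=0.272, slg=0.289))
other_royals_2ops = as_points(two_ops(oba=0.317, slg=0.378))
print(escobar_2ops, other_royals_2ops)
```

That works out to 583 for Escobar and 708 for everyone else, which lines up with the team total of 650 sitting at the bottom of the list.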

Along the same lines, one can also evaluate leadoff hitters in the same way I'd go about evaluating any hitter, and just use Runs Created per Game with standard weights (this will include SB and CS, which are ignored by 2OPS):

1. COL (Blackmon), 6.4
2. BOS (Betts/Pedroia), 6.3
3. HOU (Springer/Altuve), 6.2
Leadoff average, 4.9
ML average, 4.5
28. SF (Span), 4.1
29. KC (Escobar/Dyson/Merrifield), 3.4
30. OAK (Crisp/Burns), 3.3

Esky Magic.

The same six teams make up the leaders and trailers, which shouldn’t be a big surprise.

Allow me to close with a crude theoretical measure of linear weights supposing that the player always led off an inning (that is, batted in the bases empty, no outs state). There are weights out there (see The Book) for the leadoff slot in its average situation, but this variation is much easier to calculate (although also based on a silly and impossible premise).

The weights I used were based on the 2010 run expectancy table from Baseball Prospectus. Ideally I would have used multiple seasons but this is a seat-of-the-pants metric. The 2010 post goes into the detail of how this measure is figured; this year, I’ll just tell you that the out coefficient was -.224, the CS coefficient was -.591, and for other details refer you to that post. I then restate it per the number of PA for an average leadoff spot (746 in 2014):

1. HOU (Springer/Altuve), 30
2. COL (Blackmon), 28
3. CHN (Fowler/Zobrist), 27
Leadoff average, 7
ML average, 0
28. SF (Span), -8
29. KC (Escobar/Dyson/Merrifield), -19
30. OAK (Crisp/Burns), -21

Esky Magic. Lest anyone think I am being unduly critical of Escobar's performance (he did after all start only half (actually 82) of KC's games as the leadoff hitter), note that Escobar when in the #1 spot hit .242/.272/.289. The rest of the Royals combined for .274/.317/.378, which would only rank second worst in the majors in 2OPS. So the Royals team performance was terrible, but Escobar was dreadful. Just the worst.

The spreadsheet with full data is available here.