Monday, December 21, 2015

Leadoff Hitters, 2015

I will try to make this as clear as possible: the statistics are based on the players that hit in the #1 slot in the batting order, whether they were actually leading off an inning or not. It includes the performance of all players who batted in that spot, including substitutes like pinch-hitters.

Listed in parentheses after a team are all players that started in twenty or more games in the leadoff slot--while you may see a listing like "COL (Blackmon)", this does not mean that the statistic is based solely on Blackmon's performance; it is the total of all Colorado batters in the #1 spot, of which Blackmon was the only one to start in that spot in twenty or more games. I will list the top and bottom three teams in each category (plus the top/bottom team from each league if they don't make the ML top/bottom three); complete data is available in a spreadsheet linked at the end of the article. There are also no park factors applied anywhere in this article.

That's as clear as I can make it, and I hope it will suffice. I always feel obligated to point out that as a sabermetrician, I think that the importance of the batting order is often overstated, and that the best leadoff hitters would generally be the best cleanup hitters, the best #9 hitters, etc. However, since the leadoff spot gets a lot of attention, and teams pay particular attention to the spot, it is instructive to look at how each team fared there.

The conventional wisdom is that the primary job of the leadoff hitter is to get on base, and most simply, score runs. It should go without saying on this blog that runs scored are heavily dependent on the performance of one’s teammates, but when writing on the internet it’s usually best to assume nothing. So let's start by looking at runs scored per 25.5 outs (AB - H + CS):

1. BOS (Betts/Pedroia), 5.7
2. CHN (Fowler), 5.6
3. STL (Carpenter/Wong), 5.5
Leadoff average, 4.9
28. TB (Guyer/Kiermeier/Jaso), 4.4
ML average, 4.2
29. ATL (Peterson/Markakis), 4.0
30. SEA (Marte/Jackson/Morrison), 3.8
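For anyone who wants to reproduce this rate from the spreadsheet, a minimal Python sketch might look like the following. The function name and the sample totals are made up for illustration; only the outs estimate (AB - H + CS) and the 25.5 outs per game come from the post.

```python
def runs_per_25_5_outs(r, ab, h, cs, outs_per_game=25.5):
    """Runs scored per 25.5 outs, with outs estimated as AB - H + CS."""
    outs = ab - h + cs
    if outs <= 0:
        raise ValueError("estimated outs must be positive")
    return r * outs_per_game / outs


# Hypothetical leadoff-slot totals, not taken from the spreadsheet.
rate = runs_per_25_5_outs(r=100, ab=650, h=180, cs=10)
print(round(rate, 1))
```

Because the numerator is runs scored, the result still carries all of the teammate dependence discussed above; the sketch only standardizes the denominator.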

The Rays are the team that stands out here, below average despite a healthy .339 OBA. Otherwise the leaders were above average in OBA and the trailers below average, although they weren’t extreme:

1. CLE (Kipnis), .368
2. HOU (Altuve/Springer), .367
3. CHA (Eaton), .356
4. SF (Aoki/Pagan/Blanco), .353
Leadoff average, .329
ML average, .319
28. KC (Escobar), .297
29. CIN (Phillips/Hamilton/Bourgeois), .291
30. LAA (Aybar/Calhoun/Giavotella), .282

I did include HB in OBA this year, so it is (H + W + HB)/(AB + W + HB).

I recently heard someone on MLB Network saying that a key for the White Sox would be Adam Eaton getting back to form. But the Eaton-led Chicago leadoff men were quite solid. They even posted a .138 ISO which was one point better than the average for leadoff hitters, so I’m not sure where the notion that Eaton was the problem with the Chicago offense came from.

Escy-magic alright. But if it magically works for a handful of playoff games, by all means, let’s start a trend towards hacking, low-OBA leadoff hitters. Maybe the Angels will be the first takers and lead off Andrelton Simmons--he couldn’t do much worse than their 2015 output.

The next statistic is what I call Runners On Base Average. The genesis for ROBA is the A factor of Base Runs. It measures the number of times a batter reaches base per PA--excluding homers, since a batter that hits a home run never actually runs the bases. It also subtracts caught stealing here because the BsR version I often use does as well, but BsR versions based on initial baserunners rather than final baserunners do not. Here ROBA = (H + W + HB - HR - CS)/(AB + W + HB).

My 2009 leadoff post was linked to a Cardinals message board, and this metric was the cause of a lot of confusion (this was mostly because the poster in question was thick-headed as could be, but it's still worth addressing). ROBA, like several other methods that follow, is not really a quality metric, it is a descriptive metric. A high ROBA is a good thing, but it's not necessarily better than a slightly lower ROBA plus a higher home run rate (which would produce a higher OBA and more runs). Listing ROBA is not in any way, shape or form a statement that hitting home runs is bad for a leadoff hitter. It is simply a recognition of the fact that a batter that hits a home run is not a baserunner. Base Runs is an excellent model of offense and ROBA is one of its components, and thus it holds some interest in describing how a team scored its runs, rather than how many it scored:

1. CLE (Kipnis), .337
2. SF (Aoki/Pagan/Blanco), .326
3. HOU (Altuve/Springer), .324
Leadoff average, .296
ML average, .286
28. SD (Myers/Solarte/Venable), .267
29. LAA (Aybar/Calhoun/Giavotella), .263
30. MIN (Dozier/Hicks), .257

I will also include what I've called Literal OBA here--this is just ROBA with HR subtracted from the denominator so that a homer does not lower LOBA, it simply has no effect. It “literally” (not really, thanks to errors, out stretching, caught stealing after subsequent plate appearances, etc.) is the proportion of plate appearances in which the batter becomes a baserunner able to be advanced by his teammates. You don't really need ROBA and LOBA (or either, for that matter), but this might save some poor message board out there twenty posts, by not implying that I think home runs are bad, so here goes. LOBA = (H + W + HB - HR - CS)/(AB + W + HB - HR):

1. CLE (Kipnis), .343
2. HOU (Altuve/Springer), .332
3. SF (Aoki/Pagan/Blanco), .331
Leadoff average, .303
ML average, .294
28. CIN (Phillips/Hamilton/Bourgeois), .273
29. LAA (Aybar/Calhoun/Giavotella), .267
30. MIN (Dozier/Hicks), .267
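Since the three on-base variants differ only in what gets subtracted, it may help to see them side by side. This is a rough Python sketch of the formulas as written in this post; the function names and the sample batting line are hypothetical.

```python
def oba(h, w, hb, ab):
    """On Base Average: (H + W + HB) / (AB + W + HB)."""
    return (h + w + hb) / (ab + w + hb)


def roba(h, w, hb, hr, cs, ab):
    """Runners On Base Average: homers and caught stealing leave the numerator."""
    return (h + w + hb - hr - cs) / (ab + w + hb)


def loba(h, w, hb, hr, cs, ab):
    """Literal OBA: same as ROBA, but homers also leave the denominator."""
    return (h + w + hb - hr - cs) / (ab + w + hb - hr)


# Hypothetical leadoff-slot line, not taken from the spreadsheet.
line = dict(ab=650, h=180, w=60, hb=5, hr=15, cs=10)
print(round(oba(line["h"], line["w"], line["hb"], line["ab"]), 3))
print(round(roba(**line), 3))
print(round(loba(**line), 3))
```

For any line with at least one homer, LOBA comes out above ROBA, because the homers drop out of the denominator instead of counting as times the batter failed to become a baserunner.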

Usually the various OBA lists are pretty stable, and that was the case in 2015 as the Indians, Astros, and Giants leadoff hitters were the best at getting on base regardless of any slight differences in one’s definition of “getting on base” in this context.

The next two categories are most definitely categories of shape, not value. The first is the ratio of runs scored to RBI. Leadoff hitters as a group score many more runs than they drive in, partly due to their skills and partly due to lineup dynamics. Those with low ratios don’t fit the traditional leadoff profile as closely as those with high ratios (at least in the way their seasons played out):

1. CHN (Fowler), 2.3
2. CIN (Phillips/Hamilton/Bourgeois), 2.2
3. MIA (Gordon), 2.0
4. TEX (DeShields/Choo/Martin), 1.9
Leadoff average, 1.6
28. ATL (Peterson/Markakis), 1.3
29. SEA (Marte/Jackson/Morrison), 1.3
30. BOS (Betts/Pedroia), 1.2
ML average, 1.1

You may recall that the Red Sox leadoff hitters led the majors in runs scored per out, so seeing them with the lowest R/RBI ratio suggests they drove in a whole bunch of runs. Their 95 RBI easily led the majors (St. Louis was next with 82). Meanwhile, the Braves and Mariners had the lowest runs scored per out, so they got here more conventionally.

A similar gauge, but one that doesn't rely on the teammate-dependent R and RBI totals, is Bill James' Run Element Ratio. RER was described by James as the ratio between those things that were especially helpful at the beginning of an inning (walks and stolen bases) to those that were especially helpful at the end of an inning (extra bases). It is a ratio of "setup" events to "cleanup" events. Singles aren't included because they often function in both roles.

Of course, there are RBI walks and doubles are a great way to start an inning, but RER classifies events based on when they have the highest relative value, at least from a simple analysis:

1. MIA (Gordon), 1.7
2. CIN (Phillips/Hamilton/Bourgeois), 1.3
3. SF (Aoki/Pagan/Blanco), 1.2
Leadoff average, .8
ML average, .7
28. BOS (Betts/Pedroia), .5
29. MIN (Dozier/Hicks), .5
30. STL (Carpenter/Wong), .5
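The post describes RER in words rather than as a formula, so the sketch below assumes the usual reading: walks plus stolen bases in the numerator, and extra bases (total bases minus hits) in the denominator. The function name and the sample totals are hypothetical.

```python
def run_element_ratio(w, sb, tb, h):
    """Run Element Ratio, assumed here to be (W + SB) / (TB - H)."""
    extra_bases = tb - h  # bases gained beyond first on hits
    if extra_bases <= 0:
        raise ValueError("RER is undefined without any extra bases")
    return (w + sb) / extra_bases


# Hypothetical leadoff-slot totals, not taken from the spreadsheet.
print(round(run_element_ratio(w=60, sb=25, tb=270, h=180), 2))
```

Singles add one to both TB and H, so they cancel out of the denominator, which matches the note above that singles aren't included.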

Since stealing bases is part of the traditional skill set for a leadoff hitter, I've included the ranking for what some analysts call net steals, SB - 2*CS. I'm not going to worry about the precise breakeven rate, which is probably closer to 75% than 67%, but is also variable based on situation. The ML and leadoff averages in this case are per team lineup slot:

1. CIN (Phillips/Hamilton/Bourgeois), 28
2. MIA (Gordon), 21
3. COL (Blackmon), 17
4. TOR (Reyes/Revere/Tulowitzki/Travis), 14
Leadoff average, 4
ML average, 1
28. STL (Carpenter/Wong), -8
29. ATL (Peterson/Markakis), -10
30. CLE (Kipnis), -10
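The link between the CS weight and the breakeven rate is simple arithmetic: net steals come out to zero when SB = weight * CS, which is a success rate of weight/(weight + 1). A small Python sketch, with hypothetical function names and sample totals, shows why a weight of 2 corresponds to 67% and a weight of 3 to 75%.

```python
def net_steals(sb, cs, cs_weight=2.0):
    """Net steals: SB minus a penalty for each caught stealing."""
    return sb - cs_weight * cs


def implied_breakeven(cs_weight):
    """Success rate at which net steals equal zero for a given CS weight."""
    return cs_weight / (cs_weight + 1.0)


# Hypothetical totals, not taken from the spreadsheet.
print(net_steals(sb=30, cs=8))
print(round(implied_breakeven(2.0), 3))
print(round(implied_breakeven(3.0), 3))
```

Passing cs_weight=3.0 to net_steals would give the version implied by a 75% breakeven, if one prefers that assumption.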

Shifting back to quality measures, I'll begin with one that David Smyth proposed when I first wrote this annual leadoff review. Since the optimal weight for OBA in an x*OBA + SLG metric is generally something like 1.7, David suggested figuring 2*OBA + SLG for leadoff hitters, as a way to give a little extra boost to OBA while not distorting things too much, or even suffering an accuracy decline from standard OPS. Since this is a unitless measure anyway, I multiply it by .7 to approximate the standard OPS scale and call it 2OPS:

1. HOU (Altuve/Springer), 833
2. BOS (Betts/Pedroia), 823
3. CLE (Kipnis), 823
4. BAL (Machado), 815
5. STL (Carpenter/Wong), 812
Leadoff average, 745
ML average, 730
28. CIN (Phillips/Hamilton/Bourgeois), 641
29. KC (Escobar), 640
30. LAA (Aybar/Calhoun/Giavotella), 639
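As a quick illustration of the 2OPS arithmetic, here is a minimal Python sketch. The function name and the sample OBA and SLG are hypothetical, and the assumption that the list above shows the result multiplied by 1000 is mine, inferred from the scale of the numbers.

```python
def two_ops(oba, slg, oba_weight=2.0, scale=0.7):
    """2OPS: scale * (oba_weight * OBA + SLG), on roughly the OPS scale."""
    return scale * (oba_weight * oba + slg)


# Hypothetical inputs, not taken from the spreadsheet.
value = two_ops(oba=0.350, slg=0.400)
print(round(value, 3))        # on the OPS scale
print(round(value * 1000))    # assumed display format of the list above
```

Changing oba_weight to 1.7 gives the roughly optimal weighting mentioned above, although the .7 scale factor would then no longer be the right one to approximate OPS.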

Along the same lines, one can also evaluate leadoff hitters in the same way I'd go about evaluating any hitter, and just use Runs Created per Game with standard weights (this will include SB and CS, which are ignored by 2OPS):

1. BOS (Betts/Pedroia), 5.6
2. HOU (Altuve/Springer), 5.6
3. COL (Blackmon), 5.4
Leadoff average, 4.4
ML average, 4.2
28. CIN (Phillips/Hamilton/Bourgeois), 3.4
29. LAA (Aybar/Calhoun/Giavotella), 3.2
30. KC (Escobar), 3.1

This is as good a time as any to note that no park adjustments are applied anywhere in this post, which explains the presence of Colorado (St. Louis was the next highest-ranked NL team with 5.3).

Allow me to close with a crude theoretical measure of linear weights supposing that the player always led off an inning (that is, batted in the bases empty, no outs state). There are weights out there (see The Book) for the leadoff slot in its average situation, but this variation is much easier to calculate (although also based on a silly and impossible premise).

The weights I used were based on the 2010 run expectancy table from Baseball Prospectus. Ideally I would have used multiple seasons but this is a seat-of-the-pants metric. The 2010 post goes into the detail of how this measure is figured; this year, I’ll just tell you that the out coefficient was -.217, the CS coefficient was -.584, and for other details refer you to that post. I then restate it per the number of PA for an average leadoff spot (742 in 2014):

1. HOU (Altuve/Springer), 19
2. COL (Blackmon), 19
3. BOS (Betts/Pedroia), 19
Leadoff average, 3
ML average, 0
28. CIN (Phillips/Hamilton/Bourgeois), -14
29. LAA (Aybar/Calhoun/Giavotella), -19
30. KC (Escobar), -20

The Mets (Granderson) were the top non-Coors NL team at 16. Just to think that a few years ago Billy Hamilton was being hyped as a potential leadoff dynamo, the Angels had Mike Trout doing leadoff duties, and Alcides Escobar…well, I’m pretty sure everyone thought he would be a terrible leadoff hitter.

The spreadsheet with full data is available here.

Tuesday, December 15, 2015

Statistical Meanderings 2015

This is an annual, largely analysis-free look at some things that I found interesting when compiling my end of season statistical reports. My whole series of annual posts will be a little late and a little brief thanks to some computer issues that prevented me from working on them for a few weeks. They might be the better for it:

* Minnesota was 46-35 at home and 37-44 on the road, close to an inverse record. Nothing noteworthy about that. More amusing is that they almost had an inverse R-RA, scoring 373 and allowing 323 at home while scoring 323 and allowing 377 on the road.

* Every year I run a chart showing runs above average (based on park adjusted runs per game) for each playoff team’s offense and defense. Usually I do this and get to slyly point out that the average playoff team was stronger offensively, but that is not the case this year, and it would be bad form not to show it even when there are no guffaws to be had:

Although it is interesting narrative-wise that the Mets’ offense wound up being twenty runs better than their defense.

* There were nine teams whose starters had a lower eRA than their relievers, led by the Dodgers (3.67/4.36) and also including the A’s, Red Sox, Mariners, Cubs, Rays, Braves, Cardinals, and Mets. One might note that four of the five NL playoff teams are represented; only the Pirates had a lower bullpen eRA (4.09/3.50).

In 2014 there were eight teams with a lower starter eRA and two made the playoffs; in 2013 seven with two playoff participants; in 2012 five with just one playoff club; and in 2011 eight and two.

I certainly would not claim that this little piece of trivia demonstrates any larger truth about the importance of starters and relievers, but it certainly is the kind of factoid that could be used in the style of a Verducci to do so. Of course, the blessed Royals completely break the narrative as the team with the biggest difference in favor of their relievers (4.74/3.34; that 1.39 run gap was much higher than the next closest team, the Brewers (4.92/3.88)).

It is also interesting to see the Rays on the list given the attention they got for aggressively pulling starters on the basis of times through the order. Tampa was 23rd in the majors in innings/start, but second to last in the AL (TB starters worked 5.65 innings per game, KC 5.63).

* Speaking of things you’re probably not supposed to say about the Royals, they were an excellent fielding team with a .690 DER, fourth in the majors. But the two teams they bested in the AL playoffs each had a better DER (TOR .696, HOU also .690; San Francisco led the NL at .694).

* Minnesota starters had a 4.68 eRA, above the AL average of 4.47 and in the bottom third of the circuit. But this was a big improvement from their deplorable pitching of the last three seasons. That leaves Philadelphia as the team that can make everyone else feel good about their rotations. Phillie starters had a 5.59 eRA, much worse than their closest competition, Colorado at 5.08 (these figures are all park-adjusted). Rockie starters were last in IP/S (5.29, .2 innings fewer than PHI and ARI) and QS% (33%, MIL at 39% and PHI at 41%).

* If you’d have given me ten guesses, I’m not sure I would have come up with San Francisco leading the majors in park adjusted OBA (.342). In my defense it was a BA-driven performance as their .278 BA was nine points better than Detroit and their walk & hit batter per at bat ratio was .097, just three points above the NL average.

* Dellin Betances and Andrew Miller were 1-2 among AL relievers in strikeout rate. David Robertson was third, although granted, he and Miller wouldn't both have been in the same bullpen.

* Evan Scribner had one of the craziest lines you will ever see. He struck out 64 and walked 4 in 60 innings, but he yielded 14 homers, so he was sub-replacement level (I have him at -4 RAR, which is based on runs allowed adjusted for inherited and bequeathed runners). Scribner had the best K/W ratio among relievers; the next best was Kenley Jansen at 80/8.

If you rank AL relievers by the difference between strikeout and walk rate ((K-W)/PA), a better metric, Scribner ranks eighth. The seven relievers ahead of him were all at least 15 RAR except David Robertson (6). The next sub-replacement level relievers on the list are Aaron Loup (17th) and Mike Morin (19th), but both of them were hit-unlucky (.352 and .353 BABIP respectively) and comfortably above average in dRA. To find the next sub-replacement level performance you have to go all the way down to 46th and Danny Farquhar.

Scribner's 2.2 HR/G (games based on 37 PA rather than 9 IP) rate was the highest among major league relievers. The top three AL relievers in HR rate were all A's, with Scribner followed by Fernando Abad (2.0) and Edward Mujica (1.9), even though OAK's HR park factor of 93 is tied for lowest in the AL.
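Both of the rates used here are per plate appearance rather than per inning, which a short Python sketch can make concrete. The strikeout, walk, and homer totals are Scribner's from above, but the 240 batters faced is a hypothetical stand-in (his actual total isn't given here), and the function names are made up for illustration.

```python
def k_minus_w_rate(k, w, pa):
    """Strikeout rate minus walk rate: (K - W) / PA."""
    return (k - w) / pa


def hr_per_game(hr, pa, pa_per_game=37):
    """Home runs per 'game', where a game is 37 PA rather than 9 IP."""
    return hr * pa_per_game / pa


# K, W and HR are from the post; PA is a hypothetical placeholder.
scribner_pa = 240
print(round(k_minus_w_rate(k=64, w=4, pa=scribner_pa), 3))
print(round(hr_per_game(hr=14, pa=scribner_pa), 1))
```

Defining a game by batters faced keeps a pitcher who allows a lot of baserunners from looking better simply because each of his innings contains more plate appearances.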

* My stat reports set a minimum of 40 relief appearances to be included as a reliever, but sometimes I cheat and let in players I’m interested in. One case this year was Jeff Manship. Manship pitched 39 1/3 innings over 32 games. But if you include him, he:

1. Led in RRA (.67 to Wade Davis’ ridiculous .75 over 67 1/3 innings)
2. Led in eRA (1.51 to Davis’ 1.79)
3. Was 13th in dRA (2.92, teammate Cody Allen led the way at 2.24)
4. And as you probably surmised by now, led the AL in lowest BABIP (.194, Will Harris was next at .201. Manship’s teammate Allen of the league-leading dRA gave up a .348, eleventh worst of the 95 AL relievers)

Terry Francona frequently used Allen in the eighth inning. Allen’s .37 IR/G was fifth among AL relievers with double digit saves, and Roberto Osuna was the only one of those five with twenty or more saves (twenty on the nose and .49 IR/G). Allen allowed only 4/26 inherited runners to score, lowering his 3.38 RA to a 2.86 RRA.

* Does Jerry DiPoto know that Joaquin Benoit had a .190 BABIP? (That's not intended as a shot at Jerry DiPoto; Benoit was in the news so it stood out.)

* Ground zero for DIPS intrigue was Toronto. Toronto led the majors with a .696 DER, and their starting pitchers with 15 starts were:

1. Marco Estrada, who had the highest ratio of dRA/eRA (basically, my DIPS run average to a component run average, both based on the same Base Runs formula but the latter considering actual singles, doubles, and triples allowed) of any AL starter (4.73/3.40) thanks to a .223 BABIP

2. RA Dickey, who ranked eighth with 4.72/4.00 and as a knuckleballer falls in one of the first categories of pitchers Voros McCracken carved out of DIPS theory

3. Mark Buehrle, whose dRA/eRA ratio in his final (?) season was an unremarkable 4.52/4.41 but who over the course of his career was an occasional DIPS lightning rod

4. Poor Drew Hutchison, who had the third lowest ratio at 4.46/5.50 and was pounded for a .344 BABIP. On the other hand, he had a 13-5 record despite his BABIP-fueled -6 RAR (second worst in the league, ahead of only...)

* One of the more amusing bits of media silliness during 2015 was Bill Madden's fixation on Shane Greene, which included a caption on an article that asked if Shane Greene was Brian Cashman's biggest mistake, and Madden pondering whether the Yankees would still rather have Nathan Eovaldi and Didi Gregorius than Greene and Martin Prado.

Greene was the worst starting pitcher in the AL with -13 RAR.

I like that as a punchline, but the alternate punchline is that while Prado hit fine (20 RAR) for Miami, Gregorius hit enough (only -3 RAA versus an average shortstop) and gave New York their first good fielding shortstop in goodness knows how long, while Eovaldi chipped in 22 RAR. 36 RAR to 7 RAR, I think Cashman is pretty happy with his choices.

* I will point out that my RAR formula includes no leverage adjustment (which I defend), but then leave this without further comment because you can get chastised for talking about this:

2015 RAR
Jake Odorizzi +36
Wade Davis +30
James Shields +26
Wil Myers +14

* Would you concur that it’s plausible that all five of these seasons could have been produced by the same pitcher?

These are by no means the five most similar in value seasons you could pull out of this year’s pitching lines, but they are broadly similar, no? The reason I like this group so much is that the pitchers are John Lackey, Shelby Miller, Jaime Garcia, Carlos Martinez, Lance Lynn, and Michael Wacha. Not only did St. Louis use five clones as their rotation, they traded a sixth away.

* Ichiro was last in the NL in RAR as a 42 year old corner outfielder. His batting average--Ichiro's batting average--was .229. It is almost inconceivable that he will get another job and that my Twitter feed will react with anything but scorn. But sometimes the inconceivable is reality.

* Speaking of Marlins with terrible secondary averages, Dee Gordon posted a .128, same as Suzuki. The only NL hitters with 250 PA and SECs lower than .128 were Milwaukee's sometimes double play combination of Jean Segura (.110) and Hernan Perez (.101). Ben Revere posted a .128 as well, and both Revere and Gordon were above average offensively, but the next lowest SEC by an above-average NL hitter was .151 (Brandon Phillips). I remain skeptical about Gordon's long-term outlook; it is exceedingly rare for a player to be able to remain an offensive contributor with so little to offer other than singles.

Nonetheless, for 2015, Gordon was a defensible choice for the Silver Slugger, as only Joe Panik had a higher RG and he compiled 220 fewer PA. Still not a good look for NL second basemen.

* The Speed Score trajectories of Bryce Harper and Mike Trout have been something I’ve been watching as much has been made of Trout reconstituting his offensive game as more of a power hitter and less of a baserunning threat. But as of last year his Speed Score was still quite high, albeit lower than when he broke in. With a fourth season under his belt, Trout’s Speed Score sequence is 8.7, 7.0, 7.2, 4.9. So 2015 did mark a significant downturn in terms of Trout’s speed manifesting itself through the official statistics (or at least stolen base percentage, stolen base attempt frequency, triples/BIP, and runs scored per time on base).

Meanwhile, Harper’s sequence is 7.5, 4.9, 2.7, 3.0. If that keeps up Dusty Baker will accuse him of clogging the bases.

* The best season you probably weren’t aware of (which is really to say the best season I wasn’t aware of): Logan Forsythe hit .287/.370/.454 over 609 PA, good for 39 RAR. It was basically the same season that Jason Kipnis had, trading some BA for SEC.