Thursday, December 11, 2014

Hitting by Position, 2014

Of all the annual repeat posts I write, this is the one which most interests me--I have always been fascinated by patterns of offensive production by fielding position, particularly trends over baseball history and cases in which teams have unusual distributions of offense by position. I also contend that offensive positional adjustments, when carefully crafted and appropriately applied, remain a viable and somewhat more objective competitor to the defensive positional adjustments often in use, although this post does not really address those broad philosophical questions.

The first obvious thing to look at is the positional totals for 2014. “MLB” is the overall total for MLB, which is not the same as the sum of all the positions here, as pinch-hitters and runners are not included in those. “POS” is the MLB totals minus the pitcher totals, yielding the composite performance by non-pitchers. “PADJ” is the position adjustment, which is the position RG divided by the overall major league average (this is a departure from past posts; I’ll discuss this a little at the end). “LPADJ” is the long-term positional adjustment that I use, based on 2002-2011 data. The rows “79” and “3D” are the combined corner outfield and 1B/DH totals, respectively:

The most notable deviations from historical norms (which, when limited to one year, are strictly trivia rather than trends) were present in the outfield, where all three spots provided essentially equal production. The shape was not equal--centerfielders had a higher batting average and lower secondary average (.213 to .236) than did corner outfielders. Catchers also outhit their usual levels, bringing the three rightmost positions on the defensive spectrum together around 95% of league average production. DHs rebounded from a poor 2013 showing (102 PADJ) to get back to their historical level.

Of course, a DH-supporter like myself can’t avoid commenting on pitchers, who are wont to set a new low every couple years but went so far as to fall below the negative absolute RC threshold in 2014, with a -4 PADJ eclipsing 2012’s 1 as the worst in history. Pitchers struck out in 41% of their plate appearances, double the rate of position players (20%). Still, I’ll take a moment and provide the list of NL pitching staffs by runs above average. I need to stress that the runs created method I’m using here does not take into account sacrifices, which usually is not a big deal but can be significant for pitchers. Note that all team figures from this point forward in the post are park-adjusted. The RAA figures for each position are baselined against the overall major league average RG for the position, except for left field and right field which are pooled. So for pitchers, the formula for RAA was fun to write this year, since it involved adding the league average performance (well, subtracting the negative):

RAA = (RG + .15)*(AB - H + CS)/25.5
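The pitcher formula is one instance of the general pattern used for every position: the difference between a position's RG and its baseline RG, scaled by outs over 25.5. Here is a minimal Python sketch of that pattern, assuming the pitcher baseline was roughly -0.15 RG (inferred from the "+ .15" above). The function names and the sample batting line are hypothetical, not taken from the spreadsheet.

```python
OUTS_PER_GAME = 25.5  # outs per game, as used in the formula above


def outs(ab, h, cs):
    """Estimate outs as AB - H + CS."""
    return ab - h + cs


def raa(rg, baseline_rg, ab, h, cs):
    """Runs above average relative to a positional baseline RG."""
    return (rg - baseline_rg) * outs(ab, h, cs) / OUTS_PER_GAME


# Hypothetical pitching staff: 0.35 RG in 300 AB with 40 H and 0 CS,
# measured against an assumed league pitcher baseline of -0.15 RG.
example = raa(rg=0.35, baseline_rg=-0.15, ab=300, h=40, cs=0)
print(round(example, 1))
```

With a negative baseline, subtracting it is the same as adding .15, which is why even a staff that created almost no runs can come out above average.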

This marked the second straight triumph for Dodger pitchers as most productive, with Zack Greinke again the standout, although his numbers were much less gaudy than in 2013 (this year he hit .200/.262/.350). Pittsburgh’s hurlers extended a complete power outage to a third season; while they topped Miami and Milwaukee in isolated power thanks to a Gerrit Cole home run, their two extra base hits (Vance Worley doubled) were the fewest. This comes on the heels of 2012 (one double) and 2013 (zero extra base hits), giving them a three year stretch of 894 at bats with two doubles and a home run (.006 ISO).

I don’t run a full chart of the leading positions since you will very easily be able to go down the list and identify the individual primarily responsible for the team’s performance and you won’t be shocked by any of them, but the teams with the highest RAA at each spot were:


More interesting are the worst performing positions; the player listed is the one who started the most games at that position for the team:

I will take the poor performance of the Jeter-led Yankee shortstops as an opportunity to share some wholly unoriginal thoughts about the Didi Gregorius trade. I have no particularly strong feelings on Gregorius’ long-term outlook; I’ll leave that to the projection mavens and the scouts. However, some remarkably silly columns have been written about his assuming Saint Derek’s mantle. One sneered that he wasn’t likely to be a long-term solution. This is probably true, but most major league lineup spots are filled by guys who are not long-term solutions. Only a minority of teams have a long-term answer at shortstop, let alone a ten-year answer.

But more important is that even a static Gregorius could be an immediate boost to the Yankees. No team in baseball got less out of the position offensively, and fielding? (Rhetorical question). Last year, Gregorius hit .222/.279/.356 in 292 PA (3.4 RG); Jeter hit .253/.294/.309 in 616 PA (3.1 RG).

I like to attempt to measure each team’s offensive profile by position relative to a typical profile. I’ve found it frustrating as a fan when my team’s offensive production has come disproportionately from “defensive” positions rather than offensive positions (“Why can’t we just find a corner outfielder who can hit?”) The best way I’ve yet been able to come up with to measure this is to look at the correlation between RG at each position and the long-term positional adjustment. A positive correlation indicates a “traditional” distribution of offense by position--more production from the positions on the right side of the defensive spectrum. (To calculate this, I use the long-term positional adjustments that pool 1B/DH as well as LF/RF, and because of the DH I split it out by league):
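For anyone who wants to reproduce the correlation, here is a minimal Python sketch of a Pearson correlation between a team's RG by position and the long-term positional adjustments. Every number in the two dictionaries is a hypothetical placeholder (the real LPADJ values and team figures are in the spreadsheet), and the pooling of 1B/DH and LF/RF is represented simply by giving each pair the same adjustment.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(vx * vy)


# Hypothetical long-term adjustments (pooled pairs share a value).
lpadj = {"C": 0.89, "1B": 1.14, "2B": 0.93, "3B": 1.01, "SS": 0.86,
         "LF": 1.05, "CF": 1.02, "RF": 1.05, "DH": 1.14}

# Hypothetical AL team RG by position.
team_rg = {"C": 3.8, "1B": 5.6, "2B": 4.0, "3B": 4.5, "SS": 3.6,
           "LF": 4.9, "CF": 4.6, "RF": 5.0, "DH": 5.4}

positions = sorted(lpadj)
r = pearson([team_rg[p] for p in positions], [lpadj[p] for p in positions])
print(round(r, 2))
```

An NL team would simply drop the DH entry before computing the correlation.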

My comments on frustration are based on the Indians, who have often had a negative correlation but this year exhibited a more normal profile. The Tigers’ +.88 is about as high as you’ll see. Of course, offensive positions were their biggest producers with Cabrera and the two Martinezes, and their right fielders were their fourth most productive position. They did get more out of second than third or right, and more out of catcher than shortstop, but otherwise they fell right in place.

The following charts, broken out by division, display RAA for each position, with teams sorted by the sum of positional RAA. Positions with negative RAA are in red, and positions that are +/-20 RAA are bolded:

Washington led the majors in RAA from both their corner infield spots and the entire infield. Giancarlo Stanton almost single-handedly led Miami to the majors’ top outfield RAA total. Atlanta had the majors’ worst RAA from middle infielders. Philadelphia’s corner infielders had the lowest RAA in the NL.

I was bullish on Milwaukee this year, which looked smart for four and a half months before they wound up with an overall season record close to where most people picked them to finish. One factor I cited was how bad their production from first base had been in 2013. While the Brewers did not replicate their dreadful -37 RAA first base performance from ’13, they only gained a win or so by improving to -26, still the worst first base production in the NL. 84% of their first base PA went to Lyle Overbay (628 OPS as a first baseman) and Mark Reynolds (632), who had basically the same overall production with different shapes. And so it was no surprise that for the second straight year, Milwaukee had the NL’s most oddly distributed offense by position (based on the correlation approach described above). Cincinnati had the majors’ worst outfield production, and consistently so from left to right (-24, -20, -18).

The Dodgers were the only NL team to be above average at seven of the eight positions, but their catchers went all-in as the NL’s least productive unit. Los Angeles tied Pittsburgh with 123 total RAA, but they did it with opposite production from the backstops (the Pirates’ +24 was a perfect offset for LA’s -24). Dodger middle infielders led the NL in RAA. San Diego was on the other end of the spectrum, the NL’s only team with just one above-average position, with catcher once again serving as the exception. The Padres infield was the least productive in MLB.

Boston went from having one below-average position in 2013 to having just two above-average in 2014, which is how a team can go from leading the majors in total RAA to ranking second-last in the AL. Their infielders were the worst in the AL with a total of -46 RAA; their outfielders only tied for second-last, but matched the total of -46. Yet much of the winter discussion regarding the Red Sox has involved how they will parcel out their outfield surplus.

Detroit led the AL in corner infield RAA thanks to first base. Just eyeballing the charts, the Indians may have been the most average in terms of combining roughly average overall RAA with close to average production at many positions. Minus the outfield corners, there weren’t many extremes in Cleveland.

The Angels led the AL in outfield RAA; the corner outfielders washed each other out for a total of -4 RAA, but the Trout-led centerfielders could not be washed out. Houston had the majors’ best middle infield production, but the worst corner infield production, and the latter exceeded the former by 24 absolute runs. The Astros had three -20 positions (the corner infielders and left field, so get your JD Martinez victim of their own success jokes in); so did the Reds, but just barely (two of theirs were -20 and the other -24). Seattle had the AL’s worst outfield, largely due to trotting James Jones out there for 72 starts, then seeing Austin Jackson crater when they acquired him to address the problem. Texas had eight below-average positions, but made their one bright spot count with the AL’s best third base RAA.

The data for each team-position is available in this spreadsheet.

Wednesday, December 03, 2014

Hitting by Lineup Position, 2014

I devoted a whole post to leadoff hitters, whether justified or not, so it's only fair to have a post about hitting by batting order position in general. I certainly consider this piece to be more trivia than sabermetrics, since there’s no analytic content.

The data in this post was taken from Baseball-Reference. The figures are park-adjusted. RC is ERP, including SB and CS, as used in my end of season stat posts. When I started I didn’t have easy access to HB, so they are not included in any of the stats, including OBA. The weights used are constant across lineup positions; there was no attempt to apply specific weights to each position, although they are out there and would certainly make this a little bit more interesting.

This is the sixth consecutive season in which NL #3 hitters were the top producing lineup spot, while AL teams demonstrated more balance between #3 and #4. This is a fairly consistent pattern and the most interesting thing I’ve found from doing this every year. I have no explanation for this phenomenon and suspect that there really is none--the NL has had a run of outstanding hitters who happened to bat third in the lineup (e.g. Pujols, Votto, Braun, Gonzalez). NL hitters were more productive at spots 2-3 and 5-7, while the AL got more production at leadoff, cleanup, #8, and #9 (the latter is a given, of course).

The position that sticks out the most to me is AL #6; even with hit batters included, they managed an OBA of just .300 and outhit only AL #9, NL #8, and NL #9. I’d assume this is a one-year oddity and nothing more; in 2013 they created 4.55 runs, trailing only 3-5 among AL slots.

Next are the team leaders and trailers in RG at each lineup position. The player listed is the one who appeared in the most games in that spot (which can be misleading, particularly when there is no fixed regular as in the case of the Astros #5 spot). Or poor Matt Dominguez, who was perhaps the worst regular hitter in MLB (.215/.253/.330 for 2.6 RG, only Zack Cozart was worse among those with 500 PA), but doesn’t deserve to be blamed for sinking two Houston lineup spots as he had plenty of help in both.

Some random thoughts:

* You can see why Seattle felt they needed Nelson Cruz, with the worst production out of the cleanup spot in the AL.

* Texas #3 hitters were a complete disaster. At 2.54 RG, they managed to outhit only five non-pitcher lineup spots. No #1, #2, #3 (obviously), #4, #5, or #6 spots in the majors were worse.

* Kansas City’s production was oddly distributed. Their #1-3 hitters combined for 3.59 RG (no park adjustment applied in this bullet), their #4-6 for 4.86, and their #7-9 for 3.76. I’ll call that a 3-1-2 pattern of hitting by batting order third (1-3 least productive, 4-6 most productive, 7-9 in the middle). 22 teams exhibited a 1-2-3 pattern; 4 teams a 2-1-3; and 2 teams each with 1-3-2 and 3-1-2. Texas was the other team with a 3-1-2 pattern.

* And then there are the Padres, who take on the role that was filled so well by the Mariners for many years of being the source of ridiculous offensive futility factoids. As you can see, San Diego got the NL’s worst production at four lineup spots, all at the top or middle of the order, but also had the most productive #7 and #8 hitters. In fact, Padre #7 hitters were the most productive of any of their lineup spots. Two teams got their top production from leadoff hitters, six from #2, seventeen from #3, two from #4, two from #5, but only one from #7:
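The "3-1-2" labels in the Kansas City bullet above are just the productivity ranks of the three batting order thirds, read in lineup order. A small Python sketch of that labeling (the function name is my own; the Kansas City figures are the ones quoted in that bullet):

```python
def thirds_pattern(top, middle, bottom):
    """Rank the #1-3, #4-6, and #7-9 groups by RG (1 = most productive).

    Returns a label such as "3-1-2", read in lineup order.
    """
    values = [top, middle, bottom]
    # Indices of the groups, ordered from most to least productive.
    order = sorted(range(3), key=lambda i: values[i], reverse=True)
    ranks = [0, 0, 0]
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return "-".join(str(r) for r in ranks)


kc = thirds_pattern(3.59, 4.86, 3.76)  # Kansas City, unadjusted RG
print(kc)  # 3-1-2
```

Ties are broken by lineup order here, which is an arbitrary choice; with RG carried to two decimals it should rarely matter.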

The next list is the ten best positions in terms of runs above average relative to average for their particular league spot (so AL leadoff spots are compared to the AL average leadoff performance, etc.):

And the worst:

The -54 figure for Texas #3 hitters is a big number; I’ve been running this report since 2009 and that is the worst performance by a team batting spot, topping the -53 runs turned in by KC’s Mike Jacobs-led cleanup hitters in 2009. Considering that the AL average RPG in 2014 was 13% lower than in 2009, that one run difference is approximately a full win difference.
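The arithmetic behind the "one run is about one win" remark can be sketched as follows, using the rough rule that runs per win is about equal to average runs per game by both teams. The 2009 figure of 9.6 is an assumption for illustration only and is not taken from the post; the 13% drop is the one quoted above.

```python
# Assumed 2009 AL runs per game (both teams combined); illustrative only.
RPG_2009 = 9.6
# 2014 level, 13% lower per the text.
RPG_2014 = RPG_2009 * (1 - 0.13)


def runs_to_wins(runs, runs_per_game):
    """Convert runs to wins, treating runs per win as runs per game."""
    return runs / runs_per_game


kc_2009 = runs_to_wins(-53, RPG_2009)   # KC cleanup hitters, 2009
tex_2014 = runs_to_wins(-54, RPG_2014)  # Texas #3 hitters, 2014
gap = tex_2014 - kc_2009
print(round(kc_2009, 1), round(tex_2014, 1), round(gap, 1))
```

Under these assumptions the gap works out to a bit under one win, most of it coming from the lower run environment rather than the single extra run.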

The last set of charts shows each team’s RG rank within their league at each lineup spot. The top three are bolded and the bottom three displayed in red to provide quick visual identification of excellent and poor production:

If you are interested in digging in yourself, see the spreadsheet here.

Monday, November 24, 2014

Leadoff Hitters, 2014

This post kicks off a series of posts that I write every year, and therefore struggle to infuse with any sort of new perspective. However, they're a tradition on this blog and hold some general interest, so away we go.

First, the offensive performance of teams' leadoff batters. I will try to make this as clear as possible: the statistics are based on the players that hit in the #1 slot in the batting order, whether they were actually leading off an inning or not. It includes the performance of all players who batted in that spot, including substitutes like pinch-hitters.

Listed in parentheses after a team are all players that started in twenty or more games in the leadoff slot--while you may see a listing like “OAK (Crisp)” this does not mean that the statistic is based solely on Crisp's performance; it is the total of all Oakland batters in the #1 spot, of which Crisp was the only one to start in that spot in twenty or more games. I will list the top and bottom three teams in each category (plus the top/bottom team from each league if they don't make the ML top/bottom three); complete data is available in a spreadsheet linked at the end of the article. There are also no park factors applied anywhere in this article.

That's as clear as I can make it, and I hope it will suffice. I always feel obligated to point out that as a sabermetrician, I think that the importance of the batting order is often overstated, and that the best leadoff hitters would generally be the best cleanup hitters, the best #9 hitters, etc. However, since the leadoff spot gets a lot of attention, and teams pay particular attention to the spot, it is instructive to look at how each team fared there.

The conventional wisdom is that the primary job of the leadoff hitter is to get on base, and most simply, score runs. It should go without saying on this blog that runs scored are heavily dependent on the performance of one’s teammates, but when writing on the internet it’s usually best to assume nothing. So let's start by looking at runs scored per 25.5 outs (AB - H + CS):

1. MIL (Gomez/Gennett), 5.9
2. MIN (Santana/Dozier), 5.8
3. STL (Carpenter), 5.6
Leadoff average, 4.8
28. CHN (Bonifacio/Coghlan), 4.1
ML average, 4.0
29. SD (Cabrera/Solarte/Venable/Denorfia), 3.5
30. SEA (Jackson/Chavez/Jones/Almonte), 3.5
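The rate in the list above is runs scored scaled to 25.5 outs, with outs estimated as AB - H + CS. A minimal Python sketch, using a hypothetical leadoff line rather than any team's actual totals:

```python
def per_25_5_outs(value, ab, h, cs, outs_per_game=25.5):
    """Scale a counting stat (here runs scored) to a per-25.5-outs rate."""
    outs = ab - h + cs
    if outs <= 0:
        raise ValueError("outs must be positive")
    return value * outs_per_game / outs


# Hypothetical leadoff slot: 100 R in 680 AB with 180 H and 10 CS.
rate = per_25_5_outs(100, ab=680, h=180, cs=10)
print(round(rate, 1))  # 5.0
```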

The Twins leading the AL in run scoring rate for leadoff hitters is a surprise--usually the leading teams in this category are good offenses or have high OBA guys, and neither description fits Minnesota. They combined for a .324 OBA, just a tick above the major league average in the other obvious measure to look at. The figures here exclude HB and SF to be directly comparable to earlier versions of this article, but those categories are available in the spreadsheet if you'd like to include them:

1. STL (Carpenter), .366
2. HOU (Altuve/Grossman/Fowler), .352
3. WAS (Span), .346
Leadoff average, .322
ML average, .310
28. CIN (Hamilton), .295
29. SD (Cabrera/Solarte/Venable/Denorfia), .293
30. SEA (Jackson/Chavez/Jones/Almonte), .287

The next statistic is what I call Runners On Base Average. The genesis for ROBA is the A factor of Base Runs. It measures the number of times a batter reaches base per PA--excluding homers, since a batter that hits a home run never actually runs the bases. It also subtracts caught stealing here because the BsR version I often use does as well, but BsR versions based on initial baserunners rather than final baserunners do not.

My 2009 leadoff post was linked to a Cardinals message board, and this metric was the cause of a lot of confusion (this was mostly because the poster in question was thick-headed as could be, but it's still worth addressing). ROBA, like several other methods that follow, is not really a quality metric, it is a descriptive metric. A high ROBA is a good thing, but it's not necessarily better than a slightly lower ROBA plus a higher home run rate (which would produce a higher OBA and more runs). Listing ROBA is not in any way, shape or form a statement that hitting home runs is bad for a leadoff hitter. It is simply a recognition of the fact that a batter that hits a home run is not a baserunner. Base Runs is an excellent model of offense and ROBA is one of its components, and thus it holds some interest in describing how a team scored its runs, rather than how many it scored:

1. STL (Carpenter), .349
2. HOU (Altuve/Grossman/Fowler), .326
3. WAS (Span), .325
Leadoff average, .294
ML average, .281
27. SEA (Jackson/Chavez/Jones/Almonte), .272
28. MIL (Gomez/Gennett), .270
29. SD (Cabrera/Solarte/Venable/Denorfia), .266
30. CIN (Hamilton), .253

Milwaukee’s leadoff hitters are a good example of why ROBA is not a quality metric. Their .325 OBA was slightly above average, but they also led leadoff hitters with 26 home runs. They also were caught stealing 13 times, which tied for the fourth-most among leadoff hitters, which brought it down some more. It’s CS that really brings down the Reds, as the Hamilton-led leadoff hitters led all teams by getting caught 20 times.

I will also include what I've called Literal OBA here--this is just ROBA with HR subtracted from the denominator so that a homer does not lower LOBA, it simply has no effect. You don't really need ROBA and LOBA (or either, for that matter), but this might save some poor message board out there twenty posts, by not implying that I think home runs are bad, so here goes. LOBA = (H + W - HR - CS)/(AB + W - HR):

1. STL (Carpenter), .353
2. HOU (Altuve/Grossman/Fowler), .331
3. WAS (Span), .328
Leadoff average, .298
ML average, .287
28. SEA (Jackson/Chavez/Jones/Almonte), .274
29. SD (Cabrera/Solarte/Venable/Denorfia), .269
30. CIN (Hamilton), .257
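To make the difference between the two measures concrete, here is a short Python sketch. The LOBA formula is the one given above; the ROBA formula is inferred from the statement that LOBA is ROBA with HR removed from the denominator, so treat it as my reading rather than a quoted definition. The batting line is hypothetical.

```python
def roba(h, w, hr, cs, ab):
    """Runners On Base Average: baserunners (no HR, minus CS) per PA."""
    return (h + w - hr - cs) / (ab + w)


def loba(h, w, hr, cs, ab):
    """Literal OBA: same numerator, with HR removed from the denominator."""
    return (h + w - hr - cs) / (ab + w - hr)


# Hypothetical leadoff line.
base = dict(h=180, w=60, hr=20, cs=10, ab=660)
# The same line with one extra home run (one more AB, H, and HR).
plus_hr = dict(h=181, w=60, hr=21, cs=10, ab=661)

print(round(roba(**base), 3), round(roba(**plus_hr), 3))  # ROBA drops slightly
print(round(loba(**base), 3), round(loba(**plus_hr), 3))  # LOBA is unchanged
```

The extra homer nudges ROBA down and leaves LOBA alone, which is the whole point of listing both.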

There is a high degree of repetition for the various OBA lists, which shouldn’t come as a surprise since they are just minor variations on each other.

The next two categories are most definitely categories of shape, not value. The first is the ratio of runs scored to RBI. Leadoff hitters as a group score many more runs than they drive in, partly due to their skills and partly due to lineup dynamics. Those with low ratios don’t fit the traditional leadoff profile as closely as those with high ratios (at least in the way their seasons played out):

1. LA (Gordon), 2.5
2. PHI (Revere), 2.4
3. BOS (Holt/Pedroia/Betts), 2.3
Leadoff average, 1.7
28. NYA (Gardner/Ellsbury), 1.3
29. DET (Kinsler/Davis/Jackson), 1.3
30. COL (Blackmon), 1.3
ML average, 1.1

A similar gauge, but one that doesn't rely on the teammate-dependent R and RBI totals, is Bill James' Run Element Ratio. RER was described by James as the ratio between those things that were especially helpful at the beginning of an inning (walks and stolen bases) to those that were especially helpful at the end of an inning (extra bases). It is a ratio of "setup" events to "cleanup" events. Singles aren't included because they often function in both roles.

Of course, there are RBI walks, and doubles are a great way to start an inning, but RER classifies events based on when they have the highest relative value, at least from a simple analysis:

1. STL (Carpenter), 1.6
2. LA (Gordon), 1.5
3. PHI (Revere), 1.5
4. KC (Aoki/Cain), 1.5
Leadoff average, 1.0
ML average, .7
28. PIT (Harrison/Polanco/Marte), .6
29. LAA (Calhoun/Cowgill), .5
30. DET (Kinsler/Davis/Jackson), .5
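For reference, the commonly cited form of James' Run Element Ratio is (W + SB)/(TB - H), which matches the description above: setup events over extra bases. The post does not spell the formula out, so take this Python sketch as an assumption about the exact version used; the sample line is hypothetical.

```python
def run_element_ratio(w, sb, tb, h):
    """RER = (W + SB) / (TB - H): setup events per extra base."""
    extra_bases = tb - h
    if extra_bases <= 0:
        raise ValueError("RER is undefined without extra bases")
    return (w + sb) / extra_bases


# Hypothetical leadoff slot: 60 W, 30 SB, 270 TB on 180 H.
rer = run_element_ratio(w=60, sb=30, tb=270, h=180)
print(rer)  # 1.0
```

TB - H counts one for a double, two for a triple, and three for a homer, so singles drop out of both halves of the ratio, as described above.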

Since stealing bases is part of the traditional skill set for a leadoff hitter, I've included the ranking for what some analysts call net steals, SB - 2*CS. I'm not going to worry about the precise breakeven rate, which is probably closer to 75% than 67%, but is also variable based on situation. The ML and leadoff averages in this case are per team lineup slot:

1. PHI (Revere), 34
2. TOR (Reyes), 29
3. LA (Gordon), 23
Leadoff average, 8
ML average, 3
28. LAA (Calhoun/Cowgill), -3
29. CHA (Eaton), -4
30. SD (Cabrera/Solarte/Venable/Denorfia), -7
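The weight placed on caught stealing implies a breakeven success rate, which is where the 67% and 75% figures come from. A small Python sketch (the function names are mine, and the 40 SB / 3 CS line is hypothetical):

```python
def net_steals(sb, cs, cs_weight=2):
    """Net steals with a configurable penalty per caught stealing."""
    return sb - cs_weight * cs


def breakeven_rate(cs_weight):
    """Success rate at which net steals equals zero for a given CS weight."""
    return cs_weight / (1 + cs_weight)


print(net_steals(40, 3))             # 34
print(round(breakeven_rate(2), 3))   # 0.667, the SB - 2*CS version
print(breakeven_rate(3))             # 0.75, i.e. SB - 3*CS
```

Moving to a 75% breakeven would mean weighting each CS three times as heavily as a steal.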

Last year I noted that since 2007, the percentage of major league stolen base attempts from leadoff hitters has declined. It was up to 28.8% in 2014, so the 2007-14 figures are (2007 is an arbitrary endpoint due to it being the first year I have the data at my fingertips):

30.2%, 29.6%, 27.8%, 25.9%, 27.9%, 25.1%, 25.9%, 28.8%

Shifting back to quality measures, I'll begin with one that David Smyth proposed when I first wrote this annual leadoff review. Since the optimal weight for OBA in an x*OBA + SLG metric is generally something like 1.7, David suggested figuring 2*OBA + SLG for leadoff hitters, as a way to give a little extra boost to OBA while not distorting things too much, or even suffering an accuracy decline from standard OPS. Since this is a unitless measure anyway, I multiply it by .7 to approximate the standard OPS scale and call it 2OPS:

1. HOU (Altuve/Grossman/Fowler), 789
2. MIL (Gomez/Gennett), 781
3. WAS (Span), 776
Leadoff average, 724
ML average, 704
28. NYN (Young/Granderson/Lagares), 654
29. SD (Cabrera/Solarte/Venable/Denorfia), 637
30. SEA (Jackson/Chavez/Jones/Almonte), 625
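As a quick illustration, 2OPS is .7*(2*OBA + SLG), shown above without the decimal point. A minimal Python sketch with a hypothetical .350 OBA / .400 SLG line:

```python
def two_ops(oba, slg):
    """2OPS = .7 * (2*OBA + SLG), expressed in points (x1000)."""
    return round(0.7 * (2 * oba + slg) * 1000)


def ops(oba, slg):
    """Standard OPS in points, for scale comparison."""
    return round((oba + slg) * 1000)


print(two_ops(0.350, 0.400))  # 770
print(ops(0.350, 0.400))      # 750
```

The two land on a similar scale, with the on-base-heavy line getting a modest bump from 2OPS.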

Along the same lines, one can also evaluate leadoff hitters in the same way I'd go about evaluating any hitter, and just use Runs Created per Game with standard weights (this will include SB and CS, which are ignored by 2OPS):

1. HOU (Altuve/Grossman/Fowler), 5.4
2. MIL (Gomez/Gennett), 5.2
3. NYA (Gardner/Ellsbury), 5.2
Leadoff average, 4.4
ML average, 4.1
28. NYN (Young/Granderson/Lagares), 3.6
29. SEA (Jackson/Chavez/Jones/Almonte), 3.1
30. SD (Cabrera/Solarte/Venable/Denorfia), 3.1

You may note that the spread in RG between team leadoff spots is not that great, ranging from just 3.1 to 5.4. This seemed very unusual to me, so I checked the last five years and it was in fact an unusual year (chart shows standard deviation and coefficient of variation of leadoff RG by team):

Originally I just included the most recent five seasons, but I’m glad I dug up the 2009 data, because the COV was similar to that in 2014. However, the history does indicate that this is an unusually small spread in production from the leadoff spot. It seems far more likely to be a blip than anything of note, though.
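For anyone recomputing the spread from the spreadsheet, the coefficient of variation is just the standard deviation divided by the mean. A minimal Python sketch follows; whether the population or sample standard deviation was used for the chart is not stated, so the population version here is an assumption, and the input list is only a hypothetical handful of team RG values.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


# Hypothetical leadoff RG values for a handful of teams.
sample_rg = [3.1, 3.6, 4.4, 5.2, 5.4]
print(round(pstdev(sample_rg), 2), round(coefficient_of_variation(sample_rg), 3))
```

Dividing by the mean is what makes seasons with different run environments comparable.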

Allow me to close with a crude theoretical measure of linear weights supposing that the player always led off an inning (that is, batted in the bases empty, no outs state). There are weights out there (see The Book) for the leadoff slot in its average situation, but this variation is much easier to calculate (although also based on a silly and impossible premise).

The weights I used were based on the 2010 run expectancy table from Baseball Prospectus. Ideally I would have used multiple seasons but this is a seat-of-the-pants metric. The 2010 post goes into the detail of how this measure is figured; this year, I’ll just tell you that the out coefficient was -.215, the CS coefficient was -.582, and for other details refer you to that post. I then restate it per the number of PA for an average leadoff spot (736 in 2014):

1. HOU (Altuve/Grossman/Fowler), 16
2. STL (Carpenter), 12
3. NYA (Gardner/Ellsbury), 12
Leadoff average, 0
ML average, -5
28. CHN (Bonifacio/Coghlan), -12
29. SEA (Jackson/Chavez/Jones/Almonte), -21
30. SD (Cabrera/Solarte/Venable/Denorfia), -23

I doubt I would have guessed Houston in ten guesses at the most productive leadoff spot in MLB, but Altuve and Fowler both were very productive when leading off (Robbie Grossman started 43 games as a leadoff hitter but hit .262/.340/.337, and thus was not a major contributor to the Astros’ #1 rank). Seattle managed to contend for a playoff spot despite woeful leadoff production, and attempted to address the issue (and the related center field woes) at the trade deadline by acquiring Austin Jackson. But Jackson hit just .229/.267/.260 in 236 PA in those roles after the trade. A center fielder led off for Seattle in 117 of 162 games, an outfielder in 148 of 162 games.

For the full lists and data, see the spreadsheet here.

Tuesday, November 11, 2014

Hypothetical Ballot: MVP

Last year, I thought that Clayton Kershaw was the most valuable player in the National League. The BBWAA voters did not concur, placing Kershaw seventh in the voting; the IBA voters were more generous at third. This season, though, it appears Kershaw is going to win the MVP award.

When you compare Kershaw 2013 to Kershaw 2014, it’s difficult to find good reasons for this (one obvious reason which I’ll discuss in a minute is anything but good). Granted, MVP voting does not occur in a vacuum--the 2013 field had much more to offer from a position player perspective, with Yadier Molina having an outstanding full season, his teammate Matt Carpenter, a pair of big-mashing first basemen in Joey Votto and Paul Goldschmidt, and the one holdover, Andrew McCutchen. Thus it makes sense that more voters would turn to Kershaw in a season in which there are fewer alternatives. Still, Kershaw pitched 38 fewer innings over six fewer starts in 2014. His ERA dropped slightly (1.83 to 1.77), but that hardly makes up for 38 innings. A big factor for the BBWAA will be his win-loss record, 21-3 in 2014 rather than a more pedestrian 16-9 in 2013, but it goes without saying on this blog that W-L is a silly basis to vote for MVP.

It might be useful to take a look at Kershaw’s performance in the categories that I feel are useful, with adjustments for league average since we are comparing across seasons, seasons in which the NL average RA dropped very slightly from 4.04 to 4.01 (the three RA figures have been divided by the league average RA; RAA and RAR have been very simply converted to WAA and WAR by dividing by the league average runs scored per game by both teams):
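The runs-to-wins conversion described in the parenthetical can be sketched in a few lines of Python. It assumes the league RA figure is per team per game, so that runs per game by both teams is twice that number; the 70 RAR input is Kershaw's 2014 figure as quoted later in this post.

```python
def runs_to_wins(runs, league_ra_per_team_game):
    """Convert RAA or RAR to WAA or WAR.

    Divides by average runs per game scored by both teams, assumed to be
    twice the per-team league RA.
    """
    runs_per_win = 2 * league_ra_per_team_game
    return runs / runs_per_win


kershaw_2014_war = runs_to_wins(70, 4.01)  # 2014 NL average RA
print(round(kershaw_2014_war, 1))
```

The same function handles RAA to WAA; only the run total fed in changes.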

While Kershaw was slightly better in the pitching metrics that focus on actual results (RRA and eRA) and noticeably better in DIPS (dRA), the 38 inning difference looms large. I would take Kershaw’s 2013 season over his 2014 season. Obviously both were outstanding, but the fact that he will be an MVP afterthought in one and a strong winner in the other speaks to the arbitrary and narrative-driven voting that still reigns supreme in the BBWAA even as more sabermetric approaches gain some traction.

For my ballot, last year I chose Kershaw narrowly over McCutchen. This year I’ve done the opposite. McCutchen starts with a 77 to 70 lead over Kershaw in RAR, but he does give some of that back. Per Fangraphs McCutchen was an average baserunner, while Kershaw created three runs at the plate with a .178/.228/.211 line. Since pitchers essentially average zero runs created per out, I credit the three absolute RC with no baseline. McCutchen also doesn’t fare particularly well in defensive metrics, -11 DRS, -11 UZR, -8 FRAA (Baseball Prospectus). Regressing these a little as I am wont to do, it’s very close between Kershaw and McCutchen.

However, Kershaw’s RAR is based on his actual runs allowed; were one to use eRA or dRA as the basis, he’d start from just 63 or 58 RAR respectively, and that would be too large of a gap to McCutchen to close with fielding, even with no regression. I have no issue with the notion of a pitcher being MVP, but I think it’s a pretty high bar, and when the alternate ways of valuing pitchers don’t support placing the pitcher ahead, I can’t do it either.

Giancarlo Stanton would have made things very interesting had he not been injured, although in the end there was very little difference between the amount of time missed by McCutchen and Stanton. McCutchen played 146 games with 632 PA; Stanton played 145 games with 633 PA. Stanton came in at 67 RAR, ten fewer than McCutchen, partially due to the position adjustment difference between right field and center field, but not exclusively. Based on my estimates McCutchen created five more runs (121 to 116) in six fewer outs, making him the superior (albeit well within the margin of error) hitter (62 to 56 runs above average, hitting-only). While Stanton fares better in the fielding metrics, Fangraphs has him as a -2 baserunner, and so the ten run gap holds up for McCutchen. Kershaw/Stanton for second is a tossup, but I went with Stanton on the same reasoning discussed in the prior paragraph.

After them I have the top Cy Young challengers, who each pitched significantly more than Kershaw despite less impressive rates (Johnny Cueto and Adam Wainwright). The rest of my ballot is pretty self-explanatory from my RAR leaders, except Anthony Rendon is placed ahead of Jonathan Lucroy and Cole Hamels thanks to a strong showing in baserunning (+6) and fielding metrics (16 DRS, 7 UZR, -1 FRAA):

1. CF Andrew McCutchen, PIT
2. RF Giancarlo Stanton, MIA
3. SP Clayton Kershaw, LA
4. SP Johnny Cueto, CIN
5. SP Adam Wainwright, STL
6. C Buster Posey, SF
7. 3B Anthony Rendon, WAS
8. C Jonathan Lucroy, MIL
9. SP Cole Hamels, PHI
10. RF Yasiel Puig, LA

Many words were used to discuss the 2012 and 2013 AL MVP votes; many fewer will be used in 2014, but the basic story is the same for me--Mike Trout was pretty clearly the most valuable player in the AL. This year, the Angels’ record, Trout leading the league in RBI, and the lack of a triple crown stat standout other than Trout have combined to make the mainstream media agree. While comprehensive WAR metrics that include fielding with no regression may suggest this was the least valuable of Trout’s three full seasons, I would point out that offensively, there’s no pattern that could not be due to sheer random fluctuation. Trout’s RG relative to the league average for the past three seasons is 196, 209, 186. Trout is such a towering figure among intelligent followers of the game that he has become subject to intense scrutiny--Trout death watch has become a bizarrely popular topic at sites that should know better. This is not to say that Trout will continue to dominate baseball for the next decade with no risk, or that Trout will ever match his 2012-2014 performances. But if you think you've found a clear decline trend in a 23 year old who was the best player in baseball for a third straight season, you are likely overanalyzing. You may want to take a gander at Alex Rodriguez 1997-1999 as well. It’s more than a little uncouth if you ask me.

Rant aside, the rest of the ballot is not particularly interesting, and I’ve mostly stuck with the RAR list. Some exceptions to note:

* Victor Martinez, a distant second among hitters with 64 RAR, just squeaks on to my ballot. Martinez was a -5 baserunner per Fangraphs, his RAR doesn’t penalize him for being a DH at all, and when he did play the field, he was poor in just 280 innings (-4 DRS, -6 UZR, -4 FRAA). It was not an easy choice to keep Martinez on the ballot ahead of Josh Donaldson or Adrian Beltre, who spotted him around twenty RAR but did everything else better.

* Similarly, Jose Abreu drops off entirely--it's the same story except he starts from 55 RAR.

* Jose Altuve is knocked down a few pegs thanks to fielding metrics; I kept him ahead of Cano among second basemen on the basis of his forty plate appearance edge, but with no strong conviction.

* Corey Kluber beats out Michael Brantley as Most Valuable Indian; the two are tied at 61 RAR prior to considering Brantley’s baserunning (good) and fielding (meh). Kluber would do worse using eRA, better using dRA, and the latter tipped the scales in his favor for me. After watching the Tribe all season, it would feel wrong to decide a tossup in favor of a fielder rather than the pitcher he played behind. Cleveland’s fielding was dreadful and Brantley, while not a main culprit, did not really help either. I remain unimpressed by Brantley as an outfielder, even in left; his arm is solid, but he’s a left fielder, so...

1. CF Mike Trout, LAA
2. SP Felix Hernandez, SEA
3. SP Corey Kluber, CLE
4. LF Michael Brantley, CLE
5. SP Chris Sale, CHA
6. RF Jose Bautista, TOR
7. SP Jon Lester, BOS/OAK
8. 2B Jose Altuve, HOU
9. 2B Robinson Cano, SEA
10. DH Victor Martinez, DET

Sunday, November 09, 2014

Hypothetical Ballot: Cy Young

The National League Cy Young voting will not entail much intrigue. Clayton Kershaw will win in a romp, and will probably win the MVP as well. And while I agree that Kershaw deserves the Cy Young, I believe that the margin in the voting will greatly overstate the value difference between Kershaw and his closest competitors, Johnny Cueto and Adam Wainwright.

Kershaw and Cueto each were worth about 70 RAR through their pitching efforts based on actual runs allowed adjusted for bullpen support. Kershaw had a much better RRA at 1.98 while Cueto’s was 2.55, but Cueto pitched an additional 45.1 IP. The difference between the two in runs allowed amounts to 45.1 innings of 5.07 RRA pitching. I estimate that the replacement level for starting pitchers is 128% of the league average runs allowed, which for the 2014 NL works out to 5.13. Thus there is essentially no difference between Cueto and Kershaw from a replacement level perspective. Cueto essentially tacked 45 innings of replacement level performance onto what Kershaw did.

Of course actual runs allowed are just one way to evaluate a pitcher. Cueto actually closes the gap when using eRA, which estimates runs allowed based on inputs, including actual hits allowed. Kershaw’s eRA was 2.30 to Cueto’s 2.73, a more narrow gap than the difference in actual runs allowed. Figuring RAR based on eRA, Cueto edges Kershaw 65 to 63. Kershaw has a significant advantage in DIPS measures, though, 2.47 to Cueto’s 3.60 in dRA, as Cueto’s BABIP allowed was just .246. And even considering just actual runs allowed, I am slightly biased towards the better performer on a rate basis rather than compiler. I concur that Kershaw is more deserving of the Cy than Cueto, but the gap just isn’t that large given Kershaw’s missed starts and 198 innings.

I’ve focused on Cueto v. Kershaw, but Wainwright is right on Cueto’s heels with 67 RAR. Wainwright also has an edge on Cueto in dRA (3.38 to 3.60), and could easily place ahead if one values the DIPS metrics.

For the rest of the ballot, Cole Hamels is a comfortable pick for fourth, and I have fifth as a close battle between Washington teammates, Tanner Roark and Jordan Zimmermann. Roark ranks ahead in RAR 50 to 46, but Zimmermann’s .78 dRA edge and superior peripherals are enough to slip ahead in my book:

1. Clayton Kershaw, LA
2. Johnny Cueto, CIN
3. Adam Wainwright, STL
4. Cole Hamels, PHI
5. Jordan Zimmermann, WAS

There will be more controversy associated with the AL award as Felix Hernandez and Corey Kluber jockey for the top spot. Kluber has become something of a darling among the DIPS-first portion of the sabermetric crowd, as his dRA is better than Hernandez’ (2.88 to 3.07) and he had around 38 additional opponent plate appearances. Cleveland’s fielders, in general, were bad--they ranked second-to-last in the AL in DER while Seattle led the AL.

However, I don’t believe in throwing out all elements of pitching results outside of the three true outcomes, nor do I believe that it’s a trivial matter to parcel out adjustments for fielding support amongst pitchers. For me, Hernandez’ large edge in runs-based metrics (Hernandez has a 12 RAR lead; the two differed in innings pitched by a single out, but Hernandez’ RRA of 2.54 was better than Kluber’s 2.98; using eRA, Hernandez has an even larger advantage of 20 RAR) is too large to ignore.

One point to note when using runs allowed metrics--Hernandez got less help from his bullpen than did Kluber. Hernandez bequeathed 13 runners, and 7 of them came around to score. Kluber bequeathed 20 runners and only 2 were allowed to score. That’s a seven run swing in the King’s favor that is not apparent from the traditional stat line.

Using RRA (which considers bequeathed runners), Hernandez’ advantage is 12 runs. Using eRA, which is just a component estimate, Hernandez leads by 20 runs. Using dRA, Kluber leads by 8. Quality of opposing hitters doesn’t change the picture much; according to Baseball Prospectus, Hernandez’ opponents combined for a .264 True Average, Kluber’s for .263. To vault Kluber ahead, one must put a lot more stock in DIPS or in the quality of adjustments for fielding support than I am willing to grant. I’m not saying it’s wrong to do so, but if you read any columns ripping the choice of Hernandez (and I don’t know there will be any), it’s likely you are dealing with a zealot.

And while I did not consciously consider it in my choice, another plus of choosing Hernandez is that I can dodge charges of pro-Indians bias.

For the rest of the ballot, I have stuck with the RAR order, as I see no particular reason to make any changes. Chris Sale is held back by just 174 innings, but he led the AL in RRA and dRA and was second in eRA to Garrett Richards, who pitched five fewer innings:

1. Felix Hernandez, SEA
2. Corey Kluber, CLE
3. Chris Sale, CHA
4. Jon Lester, BOS/OAK
5. Max Scherzer, DET

Sunday, November 02, 2014

Hypothetical Ballot: Rookie of the Year

Just off the top of my head, the National League rookie class is one of the least inspiring that I can remember. Not only were there no real standout performances, there’s not a lot of competition for the top of the ballot, and there aren’t a lot of big-time prospects who simply didn’t produce in their first major league season (tragically, the closest to meeting this description is the late Oscar Taveras).

Jacob deGrom is the relatively clear choice with 33 RAR over just 22 starts. Over those 140 frames, though, he was excellent by any measure--his eRA and dRA were commensurate with his RRA, and it’s hard to argue with 9.5/2.8 strikeout/walks per game. Another pitcher is a name that I don’t recall seeing in much chatter about the award, Colorado lefty Tyler Matzek. Matzek pitched just 117 innings, which may help explain why he didn’t draw much attention, and of course his statistics don’t look very good without a park adjustment. After park adjustment, a 3.35 RRA was good for 23 RAR.

The other ballot spots go to batters; Ken Giles has drawn some attention, and he had an excellent season, ranking eighth among NL relievers with 17 RAR in just 45.2 innings thanks to sub-2 figures in all of the run average categories. Giles’ strikeout rate of 14.1 trailed only the usual suspects among NL relievers (Chapman, Kimbrel, Jansen). However, 45.2 innings is the rub--it's hard for a reliever facing fewer than 200 batters to stand up against even average everyday rookies.

The NL had three such players worthy of recognition. Travis d’Arnaud led NL rookies with 22 RAR and also led with 4.5 RG. While his defensive reputation is not great, it would take a fair amount of credit for fielding and baserunning to move Kolten Wong (15 RAR) or Billy Hamilton (11 RAR) ahead. Both appear to be good fielders and baserunners, Hamilton’s puzzling 23 caught stealings notwithstanding. Hamilton had 160 more PA thanks to leading off and not being subject to odd management by Michigan Mike, and he rates very highly in the various fielding metrics. After deGrom, it’s splitting hairs to fill out the rest of the ballot:

1. SP Jacob deGrom, NYN
2. C Travis d’Arnaud, NYN
3. SP Tyler Matzek, COL
4. CF Billy Hamilton, CIN
5. 2B Kolten Wong, STL

Were they in the AL, only deGrom would crack the ballot, as the AL crop put the NL’s to shame. Part of that is due to experienced international players, who are subject to bizarre treatment by the BBWAA. The BBWAA has rarely voted for Japanese rookies in recent years, but Jose Abreu will win the award despite having high-level experience in Cuba. Personally, I draw no distinction between international free agents and minor league graduates for award purposes.

Abreu is an easy choice for the top of the ballot, as his 55 RAR ranked sixth among all AL hitters and led first basemen, and his 7.2 RG ranked fourth in the league. The rest of the ballot spots go to pitchers, although Minnesota’s Danny Santana could certainly be considered with 30 RAR, and George Springer might have been a contender even with his late recall had he not been injured.

The three starters who made my ballot were Collin McHugh, Masahiro Tanaka, and Yordano Ventura. Tanaka looked like a Cy Young contender until his injury, but McHugh ended up edging him in RRA (3.01 to 3.05) in addition to pitching eighteen more innings. And while I wouldn’t have guessed it (largely due to McHugh toiling in obscurity with Houston), McHugh’s peripherals were every bit a match for Tanaka’s. It is worth noting that Tanaka, despite his experience pitching in NPB, is a year younger than McHugh.

Ventura pitched many more innings than either (183), but wasn’t as good on a rate basis and despite his ridiculous velocity struck out two batters fewer per game than either. In the end, 40 RAR for McHugh, 35 for Ventura, and 34 for Tanaka make it easy to justify any order depending on what factors one values. I’ve slid Tanaka ahead of Ventura thanks to better peripherals.

Apologies to Matt Shoemaker and Marcus Stroman, but the last spot on my ballot goes to Dellin Betances. Betances, unlike Giles in the NL, was a workhorse out of the pen, throwing 90 innings which helped him lead all AL relievers with 33 RAR. Only Andrew Miller and Brad Boxberger topped his 15.0 strikeout rate, and Betances was outstanding in the peripheral run averages as well. Were there an award for best reliever, Betances would get my vote, but on the rookie ballot he’s just fifth in a strong season for the AL:

1. 1B Jose Abreu, CHA
2. SP Collin McHugh, HOU
3. SP Masahiro Tanaka, NYA
4. SP Yordano Ventura, KC
5. RP Dellin Betances, NYA

Saturday, October 04, 2014

End of Season Statistics, 2014

These reports will be trickling out over the next month as I don’t have as much time as I’d like to devote to them right now. I wanted to get the park factors out as soon as possible and everything else will be added later.

The spreadsheets are published as Google Spreadsheets, which you can download in Excel format by changing the extension in the address from "=html" to "=xls". That way you can download them and manipulate things however you see fit.

The data comes from a number of different sources. Most of the basic data comes from Doug's Stats, which is a very handy site, or Baseball-Reference. KJOK's park database provided some of the data used in the park factors, but for recent seasons park data comes from B-R. Data on pitcher's batted ball types allowed, doubles/triples allowed, and inherited/bequeathed runners comes from Baseball Prospectus.

The basic philosophy behind these stats is to use the simplest methods that have acceptable accuracy. Of course, "acceptable" is in the eye of the beholder, namely me. I use Pythagenpat not because other run/win converters, like a constant RPW or a fixed exponent, are not accurate enough for this purpose, but because it's mine and it would be kind of odd if I didn't use it.

If I seem to be a stickler for purity in my critiques of others' methods, I'd contend it is usually in a theoretical sense, not an input sense. So when I exclude hit batters, I'm not saying that hit batters are worthless or that they *should* be ignored; it's just easier not to mess with them and not that much less accurate.

I also don't really have a problem with people using sub-standard methods (say, Basic RC) as long as they acknowledge that they are sub-standard. If someone pretends that Basic RC doesn't undervalue walks or cause problems when applied to extreme individuals, I'll call them on it; if they explain its shortcomings but use it regardless, I accept that. Take these last three paragraphs as my acknowledgment that some of the statistics displayed here have shortcomings as well.

The League spreadsheet is pretty straightforward--it includes league totals and averages for a number of categories, most or all of which are explained at appropriate junctures throughout this piece. The advent of interleague play has created two different sets of league totals--one for the offense of league teams and one for the defense of league teams. Before interleague play, these two were identical. I do not present both sets of totals (you can figure the defensive ones yourself from the team spreadsheet, if you desire), just those for the offenses. The exception is for the defense-specific statistics, like innings pitched and quality starts. The figures for those categories in the league report are for the defenses of the league's teams. However, I do include each league's breakdown of basic pitching stats between starters and relievers (denoted by "s" or "r" prefixes), and so summing those will yield the totals from the pitching side. The one abbreviation you might not recognize is "N"--this is the league average of runs/game for one team, and it will pop up again.

The Team spreadsheet focuses on overall team performance--wins, losses, runs scored, runs allowed. The columns included are: Park Factor (PF), Home Run Park Factor (PFhr), Winning Percentage (W%), Expected W% (EW%), Predicted W% (PW%), wins, losses, runs, runs allowed, Runs Created (RC), Runs Created Allowed (RCA), Home Winning Percentage (HW%), Road Winning Percentage (RW%) [exactly what they sound like--W% at home and on the road], Runs/Game (R/G), Runs Allowed/Game (RA/G), Runs Created/Game (RCG), Runs Created Allowed/Game (RCAG), and Runs Per Game (the average number of runs scored and allowed per game). Ideally, I would use outs as the denominator, but for teams, outs and games are so closely related that I don’t think it’s worth the extra effort.

The runs and Runs Created figures are unadjusted, but the per-game averages are park-adjusted, except for RPG which is also raw. Runs Created and Runs Created Allowed are both based on a simple Base Runs formula. The formula is:

A = H + W - HR - CS
B = (2TB - H - 4HR + .05W + 1.5SB)*.76
C = AB - H
D = HR
Naturally, A*B/(B + C) + D.

I have explained the methodology used to figure the PFs before, but the cliff’s notes version is that they are based on five years of data when applicable, include both runs scored and allowed, and they are regressed towards average (PF = 1), with the amount of regression varying based on the number of years of data used. There are factors for both runs and home runs. The initial PF (not shown) is:

iPF = (H*T/(R*(T - 1) + H) + 1)/2
where H = RPG in home games, R = RPG in road games, T = # teams in league (14 for AL and 16 for NL). Then the iPF is converted to the PF by taking x*iPF + (1-x), where x = .6 if one year of data is used, .7 for 2, .8 for 3, and .9 for 4+.

It is important to note, since there always seems to be confusion about this, that these park factors already incorporate the fact that the average player plays 50% on the road and 50% at home. That is what the adding one and dividing by 2 in the iPF is all about. So if I list Fenway Park with a 1.02 PF, that means that it actually increases RPG by 4%.
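
Here is a minimal sketch of the iPF-to-PF conversion described above. The function names and the sample RPG figures are assumptions for illustration; the multi-year weighting of the underlying home and road data is not shown because the text does not spell it out.

```python
def initial_pf(home_rpg, road_rpg, teams):
    """iPF = (H*T/(R*(T - 1) + H) + 1)/2, with H and R as RPG at home and on the road."""
    return (home_rpg * teams / (road_rpg * (teams - 1) + home_rpg) + 1) / 2


# Regression weight x by years of data; four or more years use .9.
REGRESSION_WEIGHT = {1: 0.6, 2: 0.7, 3: 0.8}


def park_factor(home_rpg, road_rpg, teams, years):
    """Regress the initial PF toward 1 using x*iPF + (1 - x)."""
    x = REGRESSION_WEIGHT.get(years, 0.9)
    return x * initial_pf(home_rpg, road_rpg, teams) + (1 - x)


# Hypothetical park: 10 RPG at home, 9 RPG on the road, 15-team league, 5 years of data.
example_pf = park_factor(home_rpg=10.0, road_rpg=9.0, teams=15, years=5)
print(round(example_pf, 3))
```

A park with identical home and road RPG comes out at exactly 1 regardless of how many years are used, which is a quick sanity check on the formula.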

In the calculation of the PFs, I did not get picky and take out “home” games that were actually at neutral sites.

There are also Team Offense and Defense spreadsheets. These include the following categories:

Team offense: Plate Appearances, Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Walks per At Bat (WAB), Isolated Power (SLG - BA), R/G at home (hR/G), and R/G on the road (rR/G). BA, OBA, SLG, WAB, and ISO are park-adjusted by dividing by the square root of park factor (or the equivalent; WAB = (OBA - BA)/(1 - OBA) and ISO = SLG - BA).

Team defense: Innings Pitched, BA, OBA, SLG, Innings per Start (IP/S), Starter's eRA (seRA), Reliever's eRA (reRA), Quality Start Percentage (QS%), RA/G at home (hRA/G), RA/G on the road (rRA/G), Battery Mishap Rate (BMR), Modified Fielding Average (mFA), and Defensive Efficiency Record (DER). BA, OBA, and SLG are park-adjusted by dividing by the square root of PF; seRA and reRA are divided by PF.

The three fielding metrics I've included are limited to metrics that a) I can calculate myself and b) are based on the basic available data, not specialized PBP data. The three metrics are explained in this post, but here are quick descriptions of each:

1) BMR--wild pitches and passed balls per 100 baserunners = (WP + PB)/(H + W - HR)*100

2) mFA--fielding average removing strikeouts and assists = (PO - K)/(PO - K + E)

3) DER--the Bill James classic, using only the PA-based estimate of plays made. Based on a suggestion by Terpsfan101, I've tweaked the error coefficient. Plays Made = PA - K - H - W - HR - HB - .64E and DER = PM/(PM + H - HR + .64E)

Next are the individual player reports. I defined a starting pitcher as one with 15 or more starts. All other pitchers are eligible to be included as a reliever. If a pitcher has 40 appearances, then they are included. Additionally, if a pitcher has 50 innings and less than 50% of his appearances are starts, he is also included as a reliever (this allows some swingmen type pitchers who wouldn’t meet either the minimum start or appearance standards to get in).

For all of the player reports, ages are based on simply subtracting their year of birth from 2014. I realize that this is not compatible with how ages are usually listed and so “Age 27” doesn’t necessarily correspond to age 27 as I list it, but it makes everything a heckuva lot easier, and I am more interested in comparing the ages of the players to their contemporaries, for which case it makes very little difference. The "R" category records rookie status with a "R" for rookies and a blank for everyone else; I've trusted Baseball Prospectus on this. Also, all players are counted as being on the team with whom they played/pitched (IP or PA as appropriate) the most.

For relievers, the categories listed are: Games, Innings Pitched, estimated Plate Appearances (PA), Run Average (RA), Relief Run Average (RRA), Earned Run Average (ERA), Estimated Run Average (eRA), DIPS Run Average (dRA), Strikeouts per Game (KG), Walks per Game (WG), Guess-Future (G-F), Inherited Runners per Game (IR/G), Batting Average on Balls in Play (%H), Runs Above Average (RAA), and Runs Above Replacement (RAR).

IR/G is per relief appearance (G - GS); it is an interesting thing to look at, I think, in lieu of actual leverage data. You can see which closers come in with runners on base, and which are used nearly exclusively to start innings. Of course, you can’t infer too much; there are bad relievers who come in with a lot of people on base, not because they are being used in high leverage situations, but because they are long men being used in low-leverage situations already out of hand.

For starting pitchers, the columns are: Wins, Losses, Innings Pitched, Estimated Plate Appearances (PA), RA, RRA, ERA, eRA, dRA, KG, WG, G-F, %H, Pitches/Start (P/S), Quality Start Percentage (QS%), RAA, and RAR. RA and ERA you know--R*9/IP or ER*9/IP, park-adjusted by dividing by PF. The formulas for eRA and dRA are based on the same Base Runs equation and they estimate RA, not ERA.

* eRA is based on the actual results allowed by the pitcher (hits, doubles, home runs, walks, strikeouts, etc.). It is park-adjusted by dividing by PF.

* dRA is the classic DIPS-style RA, assuming that the pitcher allows a league average %H, and that his hits in play have a league-average S/D/T split. It is park-adjusted by dividing by PF.

The formula for eRA is:

A = H + W - HR
B = (2*TB - H - 4*HR + .05*W)*.78
C = AB - H = K + (3*IP - K)*x (where x is figured as described below for PA estimation and is typically around .93) = PA (from below) - H - W
eRA = (A*B/(B + C) + HR)*9/IP
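
The eRA steps above might be coded roughly as follows. This is only a sketch: the function name and the sample pitcher line are hypothetical, IP is assumed to be in true fractional innings (200.333 rather than the 200.1 box-score notation), and x is passed in rather than derived from league totals.

```python
def era_estimate(h, w, hr, tb, k, ip, x, pf=1.0):
    """Estimated run average from a pitcher's actual results allowed."""
    a = h + w - hr
    b = (2 * tb - h - 4 * hr + 0.05 * w) * 0.78
    c = k + (3 * ip - k) * x          # estimated AB - H
    runs = a * b / (b + c) + hr       # Base Runs estimate of runs allowed
    return runs * 9 / ip / pf         # per nine innings, park-adjusted by PF


# Hypothetical starter: 200 IP, 170 H, 50 W, 15 HR, 260 TB allowed, 200 K, x = .93.
example_era = era_estimate(h=170, w=50, hr=15, tb=260, k=200, ip=200.0, x=0.93)
print(round(example_era, 2))
```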

To figure dRA, you first need the estimate of PA described below. Then you calculate W, K, and HR per PA (call these %W, %K, and %HR). Percentage of balls in play (BIP%) = 1 - %W - %K - %HR. This is used to calculate the DIPS-friendly estimate of %H (H per PA) as e%H = Lg%H*BIP%.

Now everything has a common denominator of PA, so we can plug into Base Runs:

A = e%H + %W
B = (2*(z*e%H + 4*%HR) - e%H - 5*%HR + .05*%W)*.78
C = 1 - e%H - %W - %HR
dRA = (A*B/(B + C) + %HR)/C*a

z is the league average of total bases per non-HR hit (TB - 4*HR)/(H - HR), and a is the league average of (AB - H) per game.
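
A sketch of the DIPS-style calculation, following the per-PA steps above. The parameter names are my own labels, and the league inputs in the example (a .300 league %H, z of 1.3, and 25.2 league outs per game) are placeholder values chosen for illustration, not actual 2014 figures.

```python
def dra_estimate(w, k, hr, pa, lg_pct_h, z, lg_outs_per_game, pf=1.0):
    """DIPS-style run average assuming league-average results on balls in play."""
    pct_w, pct_k, pct_hr = w / pa, k / pa, hr / pa
    bip = 1 - pct_w - pct_k - pct_hr       # share of PA ending with a ball in play
    e_h = lg_pct_h * bip                   # estimated non-HR hits per PA
    a = e_h + pct_w
    b = (2 * (z * e_h + 4 * pct_hr) - e_h - 5 * pct_hr + 0.05 * pct_w) * 0.78
    c = 1 - e_h - pct_w - pct_hr           # outs per PA
    runs_per_pa = a * b / (b + c) + pct_hr
    # Divide by outs per PA, scale to a game's worth of outs, then park-adjust.
    return runs_per_pa / c * lg_outs_per_game / pf


# Hypothetical pitcher: 800 estimated PA, 50 W, 200 K, 15 HR.
example_dra = dra_estimate(w=50, k=200, hr=15, pa=800,
                           lg_pct_h=0.300, z=1.3, lg_outs_per_game=25.2)
print(round(example_dra, 2))
```

Because hits allowed never enter the function, two pitchers with the same walks, strikeouts, and home runs per PA get the same dRA no matter how their balls in play turned out.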

In the past couple years I’ve presented a couple of batted ball RA estimates. I’ve removed these this year, not just because batted ball data exhibits questionable reliability but because these metrics were complicated to figure, required me to collate the batted ball data, and were not personally useful to me. I figure these stats for my own enjoyment and have in some form or another going back to 1997. I share them here only because I would do it anyway, so if I’m not interested in certain categories, there’s no reason to keep presenting them.

Instead, I’m showing strikeout and walk rate, both expressed as per game. By game I mean not 9 innings but rather the league average of PA/G. I have always been a proponent of using PA and not IP as the denominator for non-run pitching rates, and now the use of per PA rates is widespread. Usually these are expressed as K/PA and W/PA, or equivalently, percentage of PA with a strikeout or walk. I don’t believe that any site publishes these as K and W per equivalent game as I am here. This is not better than K%--it’s simply applying a scalar multiplier. I like it because it generally follows the same scale as the familiar K/9.

To facilitate this, I’ve finally corrected a flaw in the formula I use to estimate plate appearances for pitchers. Previously, I’ve done it the lazy way by not splitting strikeouts out from other outs. I am now using this formula to estimate PA (where PA = AB + W):

PA = K + (3*IP - K)*x + H + W
Where x = league average of (AB - H - K)/(3*IP - K)

Then KG = K/PA*Lg(PA/G) and WG = W/PA*Lg(PA/G).
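
To make the per-game scaling concrete, here is a small sketch that estimates PA and then restates a per-PA rate as a per-game figure. The function names are hypothetical, and the league value of 38 PA per game in the example is a placeholder, not a published 2014 number.

```python
def estimate_pa(k, ip, h, w, x):
    """PA = K + (3*IP - K)*x + H + W, with IP in true fractional innings."""
    return k + (3 * ip - k) * x + h + w


def per_game_rate(events, pa, lg_pa_per_game):
    """Events per PA, scaled by the league average of PA per game."""
    return events / pa * lg_pa_per_game


# Hypothetical pitcher: 200 IP, 200 K, 170 H, 50 W, x = .93.
example_pa = estimate_pa(k=200, ip=200.0, h=170, w=50, x=0.93)
example_kg = per_game_rate(200, example_pa, lg_pa_per_game=38.0)
example_wg = per_game_rate(50, example_pa, lg_pa_per_game=38.0)
print(round(example_pa), round(example_kg, 1), round(example_wg, 1))
```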

G-F is a junk stat, included here out of habit because I've been including it for years. It was intended to give a quick read of a pitcher's expected performance in the next season, based on eRA and strikeout rate. Although the numbers vaguely resemble RAs, it's actually unitless. As a rule of thumb, anything under four is pretty good for a starter. G-F = 4.46 + .095(eRA) - .113(K*9/IP). It is a junk stat. JUNK STAT JUNK STAT JUNK STAT. Got it?

%H is BABIP, more or less--%H = (H - HR)/(PA - HR - K - W), where PA was estimated above. Pitches/Start includes all appearances, so I've counted relief appearances as one-half of a start (P/S = Pitches/(.5*G + .5*GS)). QS% is just QS/GS; I don't think it's particularly useful, but Doug's Stats include QS so I include it.

I've used a stat called Relief Run Average (RRA) in the past, based on Sky Andrecheck's article in the August 1999 By the Numbers; that one only used inherited runners, but I've revised it to include bequeathed runners as well, making it equally applicable to starters and relievers. I am using RRA as the building block for baselined value estimates for all pitchers this year. I explained RRA in this article, but the bottom line formulas are:

BRSV = BRS - BR*i*sqrt(PF)
IRSV = IR*i*sqrt(PF) - IRS
RRA = ((R - (BRSV + IRSV))*9/IP)/PF
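
The formulas above can be sketched as below. The text here does not define i, so the code treats it as the league rate at which inherited runners score, which is an assumption on my part; the .30 value in the example is likewise a placeholder. BR/BRS are bequeathed runners and those who scored, IR/IRS the inherited equivalents.

```python
from math import sqrt


def bequeathed_runs_saved(br, brs, i, pf=1.0):
    """BRSV = BRS - BR*i*sqrt(PF)."""
    return brs - br * i * sqrt(pf)


def inherited_runs_saved(ir, irs, i, pf=1.0):
    """IRSV = IR*i*sqrt(PF) - IRS."""
    return ir * i * sqrt(pf) - irs


def relief_run_average(r, ip, br, brs, ir, irs, i, pf=1.0):
    """RRA = ((R - (BRSV + IRSV))*9/IP)/PF, with IP in true fractional innings."""
    adj = bequeathed_runs_saved(br, brs, i, pf) + inherited_runs_saved(ir, irs, i, pf)
    return ((r - adj) * 9 / ip) / pf


# Bequeathed-runner counts quoted in the Cy Young post, with an assumed i of .30.
hernandez_brsv = bequeathed_runs_saved(br=13, brs=7, i=0.30)
kluber_brsv = bequeathed_runs_saved(br=20, brs=2, i=0.30)
swing = hernandez_brsv - kluber_brsv
print(round(swing, 1))
```

Under that assumed i, the gap between the two bullpens' handling of bequeathed runners comes to about seven runs, in line with the swing described in the Cy Young discussion.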

The two baselined stats are Runs Above Average (RAA) and Runs Above Replacement (RAR). RAA uses the league average runs/game (N) for both starters and relievers, while RAR uses separate replacement levels for starters and relievers. Thus, RAA and RAR will be pretty close for relievers:

RAA = (LgRA - RRA)*IP/9
RAR (relievers) = (1.11*LgRA - RRA)*IP/9
RAR (starters) = (1.28*LgRA - RRA)*IP/9
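
As a quick check on these formulas, the sketch below recomputes the Kershaw and Cueto figures from the Cy Young post. The function names are hypothetical. The league RA is backed out of the 5.13 NL replacement level quoted there, and the innings are assumptions: 198 for Kershaw as stated, plus 45 1/3 for Cueto, reading "45.1" as box-score notation.

```python
def pitcher_raa(rra, ip, lg_ra):
    """RAA = (LgRA - RRA)*IP/9."""
    return (lg_ra - rra) * ip / 9


def pitcher_rar(rra, ip, lg_ra, starter=True):
    """RAR with replacement level at 128% of league RA for starters, 111% for relievers."""
    multiplier = 1.28 if starter else 1.11
    return (multiplier * lg_ra - rra) * ip / 9


nl_lg_ra = 5.13 / 1.28                      # implied by the 5.13 replacement level
kershaw_rar = pitcher_rar(rra=1.98, ip=198.0, lg_ra=nl_lg_ra)
cueto_rar = pitcher_rar(rra=2.55, ip=198.0 + 45 + 1 / 3, lg_ra=nl_lg_ra)
print(round(kershaw_rar), round(cueto_rar))
```

Both land at roughly 70 RAR, which matches the claim that Cueto's extra innings were essentially replacement-level work tacked onto Kershaw's season.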

All players with 300 or more plate appearances are included in the Hitters spreadsheets (along with some players close to the cutoff point who I was interested in). Each is assigned one position, the one at which they appeared in the most games. The statistics presented are: Games played (G), Plate Appearances (PA), Outs (O), Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Runs Created (RC), Runs Created per Game (RG), Speed Score (SS), Hitting Runs Above Average (HRAA), Runs Above Average (RAA), Hitting Runs Above Replacement (HRAR), and Runs Above Replacement (RAR).

I do not bother to include hit batters, so take note of that for players who do get plunked a lot. Therefore, PA are simply AB + W. Outs are AB - H + CS. BA and SLG you know, but remember that without HB and SF, OBA is just (H + W)/(AB + W). Secondary Average = (TB - H + W)/AB = SLG - BA + (OBA - BA)/(1 - OBA). I have not included net steals as many people (and Bill James himself) do--it is solely hitting events.

BA, OBA, and SLG are park-adjusted by dividing by the square root of PF. This is an approximation, of course, but I'm satisfied that it works well. The goal here is to adjust for the win value of offensive events, not to quantify the exact park effect on the given rate. I use the BA/OBA/SLG-based formula to figure SEC, so it is park-adjusted as well.

Runs Created is actually Paul Johnson's ERP, more or less. Ideally, I would use a custom linear weights formula for the given league, but ERP is just so darn simple and close to the mark that it’s hard to pass up. I still use the term “RC” partially as a homage to Bill James (seriously, I really like and respect him even if I’ve said negative things about RC and Win Shares), and also because it is just a good term. I like the thought put in your head when you hear “creating” a run better than “producing”, “manufacturing”, “generating”, etc. to say nothing of names like “equivalent” or “extrapolated” runs. None of that is said to put down the creators of those methods--there just aren’t a lot of good, unique names available. Anyway, RC = (TB + .8H + W + .7SB - CS - .3AB)*.322.

RC is park adjusted by dividing by PF, making all of the value stats that follow park adjusted as well. RG, the rate, is RC/O*25.5. I do not believe that outs are the proper denominator for an individual rate stat, but I also do not believe that the distortions caused are that bad. (I still intend to finish my rate stat series and discuss all of the options in excruciating detail, but alas you’ll have to take my word for it now).

I have decided to switch to a watered-down version of Bill James' Speed Score this year; I only use four of his categories. Previously I used my own knockoff version called Speed Unit, but trying to keep it from breaking down every few years was a wasted effort.

Speed Score is the average of four components, which I'll call a, b, c, and d:

a = ((SB + 3)/(SB + CS + 7) - .4)*20
b = sqrt((SB + CS)/(S + W))*14.3
c = ((R - HR)/(H + W - HR) - .1)*25
d = T/(AB - HR - K)*450

James actually uses a sliding scale for the triples component, but it strikes me as needlessly complex and so I've streamlined it. I also changed some of his division to mathematically equivalent multiplications.
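
A sketch of the four-component average, transcribed from the formulas above. I am assuming S is singles and T is triples, since the text does not define them here, and the sample batting line is made up for illustration.

```python
from math import sqrt


def speed_score(sb, cs, s, w, r, hr, h, t, ab, k):
    """Average of the four Speed Score components a, b, c, and d."""
    a = ((sb + 3) / (sb + cs + 7) - 0.4) * 20      # stolen base percentage
    b = sqrt((sb + cs) / (s + w)) * 14.3           # stolen base attempt frequency
    c = ((r - hr) / (h + w - hr) - 0.1) * 25       # runs scored per time on base
    d = t / (ab - hr - k) * 450                    # triples per ball in play
    return (a + b + c + d) / 4


# Hypothetical hitter, not a real 2014 line.
example_ss = speed_score(sb=20, cs=5, s=100, w=50, r=80, hr=15,
                         h=150, t=5, ab=550, k=100)
print(round(example_ss, 2))
```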

There are a whopping four categories that compare to a baseline; two for average, two for replacement. Hitting RAA compares to a league average hitter; it is in the vein of Pete Palmer’s Batting Runs. RAA compares to an average hitter at the player’s primary position. Hitting RAR compares to a “replacement level” hitter; RAR compares to a replacement level hitter at the player’s primary position. The formulas are:

HRAA = (RG - N)*O/25.5
RAA = (RG - N*PADJ)*O/25.5
HRAR = (RG - .73*N)*O/25.5
RAR = (RG - .73*N*PADJ)*O/25.5

PADJ is the position adjustment, and it is based on 2002-2011 offensive data. For catchers it is .89; for 1B/DH, 1.17; for 2B, .97; for 3B, 1.03; for SS, .93; for LF/RF, 1.13; and for CF, 1.02. I had been using the 1992-2001 data as a basis for the last ten years, but finally have done an update. I’m a little hesitant about this update, as the middle infield positions are the biggest movers (higher positional adjustments, meaning less positional credit). I have no qualms for second base, but the shortstop PADJ is out of line with the other position adjustments widely in use and feels a bit high to me. But there are some decent points to be made in favor of offensive adjustments, and I’ll have a bit more on this topic in general below.
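
Putting the hitter formulas together, a rough sketch might look like this. The function and dictionary names are my own, the PADJ values are the ones listed above, and the sample batting line and the league N of 4.2 are hypothetical.

```python
PADJ = {"C": 0.89, "1B": 1.17, "DH": 1.17, "2B": 0.97, "3B": 1.03,
        "SS": 0.93, "LF": 1.13, "RF": 1.13, "CF": 1.02}


def runs_created(tb, h, w, sb, cs, ab, pf=1.0):
    """RC = (TB + .8H + W + .7SB - CS - .3AB)*.322, park-adjusted by dividing by PF."""
    return (tb + 0.8 * h + w + 0.7 * sb - cs - 0.3 * ab) * 0.322 / pf


def hitter_values(rc, outs, n, pos):
    """RG plus the four baselined figures for a hitter at his primary position."""
    rg = rc / outs * 25.5
    games = outs / 25.5                     # outs expressed in team games
    padj = PADJ[pos]
    return {
        "RG": rg,
        "HRAA": (rg - n) * games,
        "RAA": (rg - n * padj) * games,
        "HRAR": (rg - 0.73 * n) * games,
        "RAR": (rg - 0.73 * n * padj) * games,
    }


# Hypothetical center fielder in a neutral park.
example_rc = runs_created(tb=300, h=170, w=70, sb=10, cs=3, ab=550)
example_outs = 550 - 170 + 3                # AB - H + CS
example_vals = hitter_values(example_rc, example_outs, n=4.2, pos="CF")
print({name: round(value, 1) for name, value in example_vals.items()})
```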

That was the mechanics of the calculations; now I'll twist myself into knots trying to justify them. If you only care about the how and not the why, stop reading now.

The first thing that should be covered is the philosophical position behind the statistics posted here. They fall on the continuum of ability and value in what I have called "performance". Performance is a technical-sounding way of saying "Whatever arbitrary combination of ability and value I prefer".

With respect to park adjustments, I am not interested in how any particular player is affected, so there is no separate adjustment for lefties and righties for instance. The park factor is an attempt to determine how the park affects run scoring rates, and thus the win value of runs.

I apply the park factor directly to the player's statistics, but it could also be applied to the league context. The advantage to doing it my way is that it allows you to compare the component statistics (like Runs Created or OBA) on a park-adjusted basis. The drawback is that it creates a new theoretical universe, one in which all parks are equal, rather than leaving the player grounded in the actual context in which he played and evaluating how that context (and not the player's statistics) was altered by the park.

The good news is that the two approaches are essentially equivalent; in fact, they are equivalent if you assume that the Runs Per Win factor is equal to the RPG. Suppose that we have a player in an extreme park (PF = 1.15, approximately like Coors Field pre-humidor) who has an 8 RG before adjusting for park, while making 350 outs in a 4.5 N league. The first method of park adjustment, the one I use, converts his value into a neutral park, so his RG is now 8/1.15 = 6.957. We can now compare him directly to the league average:

RAA = (6.957 - 4.5)*350/25.5 = +33.72

The second method would be to adjust the league context. If N = 4.5, then the average player in this park will create 4.5*1.15 = 5.175 runs. Now, to figure RAA, we can use the unadjusted RG of 8:

RAA = (8 - 5.175)*350/25.5 = +38.77

These are not the same, as you can obviously see. The reason for this is that they take place in two different contexts. The first figure is in a 9 RPG (2*4.5) context; the second figure is in a 10.35 RPG (2*4.5*1.15) context. Runs have different values in different contexts; that is why we have RPW converters in the first place. If we convert to WAA (using RPW = RPG, which is only an approximation, so it's usually not as tidy as it appears below), then we have:

WAA = 33.72/9 = +3.75
WAA = 38.77/10.35 = +3.75
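For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two park-adjustment routes using the numbers from the example. The function and variable names are illustrative only, and the conversion to wins uses the same RPW = RPG simplification as the text.

```python
OUTS_PER_GAME = 25.5  # outs per game used in the RAA formulas above


def raa(rg, baseline_rg, outs):
    """Runs above average: (RG - baseline RG) * outs / 25.5."""
    return (rg - baseline_rg) * outs / OUTS_PER_GAME


# Inputs from the worked example
pf = 1.15    # park factor
rg = 8.0     # player's unadjusted runs per game
outs = 350   # outs made
n = 4.5      # league average runs per game (N)

# Method 1: park-adjust the player, compare to the neutral league
raa_player_adj = raa(rg / pf, n, outs)
# Method 2: park-adjust the league context, keep the player's raw RG
raa_league_adj = raa(rg, n * pf, outs)

# Convert to wins with the simplifying assumption RPW = RPG
waa_player_adj = raa_player_adj / (2 * n)
waa_league_adj = raa_league_adj / (2 * n * pf)

print(round(raa_player_adj, 2), round(raa_league_adj, 2))
print(round(waa_player_adj, 2), round(waa_league_adj, 2))
```

Under the RPW = RPG assumption the two WAA figures agree exactly, since dividing the second RAA by a run environment that is PF times larger cancels the PF in the baseline.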

Once you convert to wins, the two approaches are equivalent. The other nice thing about the first approach is that once you park-adjust, everyone in the league is in the same context, and you can dispense with the need for converting to wins at all. You still might want to convert to wins, and you'll need to do so if you are comparing the 2014 players to players from other league-seasons (including between the AL and NL in the same year), but if you are only looking to compare Jose Bautista to Miguel Cabrera, it's not necessary. WAR is somewhat ubiquitous now, but personally I prefer runs when possible--why mess with decimal points if you don't have to?

The park factors used to adjust player stats here are run-based. Thus, they make no effort to project what a player "would have done" in a neutral park, or account for the different effects parks have on specific events (walks, home runs, BA) or types of players. They simply account for the difference in run environment that is caused by the park (as best I can measure it). As such, they don't evaluate a player within the actual run context of his team's games; they attempt to restate the player's performance as an equivalent performance in a neutral park.

I suppose I should also justify the use of sqrt(PF) for adjusting component statistics. The classic defense given for this approach relies on basic Runs Created--runs are proportional to OBA*SLG, and OBA*SLG/PF = OBA/sqrt(PF)*SLG/sqrt(PF). While RC may be an antiquated tool, you will find that the square root adjustment is fairly compatible with linear weights or Base Runs as well. I am not going to take the space to demonstrate this claim here, but I will some time in the future.
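The basic Runs Created identity behind the square root adjustment is easy to verify numerically. The sketch below is only an illustration: the .350 OBA, .500 SLG, and the function name are made up for the example, and it checks nothing about linear weights or Base Runs.

```python
import math


def park_adjust_components(oba, slg, pf):
    """Divide each component by sqrt(PF) so that their product is divided by PF."""
    root = math.sqrt(pf)
    return oba / root, slg / root


# Hypothetical batting line in a PF = 1.15 park
oba, slg, pf = 0.350, 0.500, 1.15
adj_oba, adj_slg = park_adjust_components(oba, slg, pf)

raw_product = oba * slg            # proportional to basic RC per PA
adj_product = adj_oba * adj_slg    # same quantity after the sqrt adjustment

print(round(adj_oba, 3), round(adj_slg, 3))
print(math.isclose(adj_product, raw_product / pf))
```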

Many value figures published around the sabersphere adjust for the difference in quality level between the AL and NL. I don't, but this is a thorny area where there is no right or wrong answer as far as I'm concerned. I also do not make an adjustment in the league averages for the fact that the overall NL averages include pitcher batting and the AL does not (not quite true in the era of interleague play, but you get my drift).

The difference between the leagues may not be precisely calculable, and it certainly is not constant, but it is real. If the average player in the AL is better than the average player in the NL, it is perfectly reasonable to expect the average AL player to have more RAR than the average NL player, and that will not happen without some type of adjustment. On the other hand, if you are only interested in evaluating a player relative to his own league, such an adjustment is not necessarily welcome.

The league argument only applies cleanly to metrics baselined to average. Since replacement level compares the given player to a theoretical player that can be acquired on the cheap, the same pool of potential replacement players should by definition be available to the teams of each league. One could argue that if the two leagues don't have equal talent at the major league level, they might not have equal access to replacement level talent--except such an argument is at odds with the notion that replacement level represents talent that is truly "freely available".

So it's hard to justify the approach I take, which is to set replacement level relative to the average runs scored in each league, with no adjustment for the difference in the leagues. The best justification is that it's simple and it treats each league as its own universe, even if in reality they are connected.

The replacement levels I have used here are very much in line with the values used by other sabermetricians. This is based on my own "research", my interpretation of other people's research, and a desire to not stray from consensus and make the values unhelpful to the majority of people who may encounter them.

Replacement level is certainly not settled science. There is always going to be room to disagree on what the baseline should be. Even if you agree it should be "replacement level", any estimate of where it should be set is just that--an estimate. Average is clean and fairly straightforward, even if its utility is questionable; replacement level is inherently messy. So I offer the average baseline as well.

For position players, replacement level is set at 73% of the positional average RG (since there's a history of discussing replacement level in terms of winning percentages, this is roughly equivalent to .350). For starting pitchers, it is set at 128% of the league average RA (.380), and for relievers it is set at 111% (.450).
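The winning percentage equivalents can be roughly reproduced with a Pythagorean conversion. The text does not say which conversion was used, so the exponent of 2 and the function names below are assumptions for illustration, not the method behind the published figures.

```python
PYTH_EXPONENT = 2.0  # assumption: classic Pythagorean exponent


def hitter_wpct(rg_ratio, x=PYTH_EXPONENT):
    """W% of a team scoring rg_ratio times average while allowing average."""
    return rg_ratio ** x / (rg_ratio ** x + 1)


def pitcher_wpct(ra_ratio, x=PYTH_EXPONENT):
    """W% of a team allowing ra_ratio times average while scoring average."""
    return 1 / (1 + ra_ratio ** x)


position_player = hitter_wpct(0.73)   # 73% of positional average RG
starter = pitcher_wpct(1.28)          # 128% of league average RA
reliever = pitcher_wpct(1.11)         # 111% of league average RA

print(round(position_player, 3), round(starter, 3), round(reliever, 3))
```

With these assumptions the three results land within a few points of the .350, .380, and .450 figures quoted above.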

I am still using an analytical structure that makes the comparison to replacement level for a position player by applying it to his hitting statistics. This is the approach taken by Keith Woolner in VORP (and some other earlier replacement level implementations), but the newer metrics (among them Rally and Fangraphs' WAR) handle replacement level by subtracting a set number of runs from the player's total runs above average in a number of different areas (batting, fielding, baserunning, positional value, etc.), which for lack of a better term I will call the subtraction approach.

The offensive positional adjustment makes the inherent assumption that the average player at each position is equally valuable. I think that this is close to being true, but it is not quite true. The ideal approach would be to use a defensive positional adjustment, since the real difference between a first baseman and a shortstop is their defensive value. When you bat, all runs count the same, whether you create them as a first baseman or as a shortstop.

That being said, using "replacement hitter at position" does not cause too many distortions. It is not theoretically correct, but it is practically powerful. For one thing, most players, even those at key defensive positions, are chosen first and foremost for their offense. Empirical work by Keith Woolner has shown that the replacement level hitting performance is about the same for every position, relative to the positional average.

Figuring what the defensive positional adjustment should be, though, is easier said than done. Therefore, I use the offensive positional adjustment. So if you want to criticize that choice, or criticize the numbers that result, be my guest. But do not claim that I am holding this up as the correct analytical structure. I am holding it up as the most simple and straightforward structure that conforms to reality reasonably well, and because while the numbers may be flawed, they are at least based on an objective formula that I can figure myself. If you feel comfortable with some other assumptions, please feel free to ignore mine.

That still does not justify the use of HRAR--hitting runs above replacement--which compares each hitter, regardless of position, to 73% of the league average. Basically, this is just a way to give an overall measure of offensive production without regard for position with a low baseline. It doesn't have any real baseball meaning.

A player who creates runs at 90% of the league average could be above-average (if he's a shortstop or catcher, or a great fielder at a less important fielding position), or sub-replacement level (DHs that create 4 runs a game are not valuable properties). Every player is chosen because his total value, both hitting and fielding, is sufficient to justify his inclusion on the team. HRAR fails even if you try to justify it with a thought experiment about a world in which defense doesn't matter, because in that case the absolute replacement level (in terms of RG, without accounting for the league average) would be much higher than it is currently.

The specific positional adjustments I use are based on 2002-2011 data. I stick with them because I have not seen compelling evidence of a change in the degree of difficulty or scarcity between the positions between now and then, and because I think they are fairly reasonable. The positions for which they diverge the most from the defensive position adjustments in common use are 2B, 3B, and CF. Second base is considered a premium position by the offensive PADJ (.97), while third base and center field have similar adjustments in the opposite direction (1.03 and 1.02).

Another flaw is that the PADJ is applied to the overall league average RG, which is artificially low for the NL because of pitchers' batting. When using the actual league average runs/game, it's tough to just remove pitchers--any adjustment would be an estimate. If you use the league total of runs created instead, it is a much easier fix.

One other note on this topic is that since the offensive PADJ is a proxy for average defensive value by position, ideally it would be applied by tying it to defensive playing time. I have done it by outs, though.

The reason I have taken this flawed path is because 1) it ties the position adjustment directly into the RAR formula rather than leaving it as something to subtract on the outside and more importantly 2) there’s no straightforward way to do it. The best would be to use defensive innings--set the full-time player to X defensive innings, figure how Derek Jeter’s innings compared to X, and adjust his PADJ accordingly. Games in the field or games played are dicey because they can cause distortion for defensive replacements. Plate Appearances avoid the problem that outs have of being highly related to player quality, but they still carry the illogic of basing it on offensive playing time. And of course the differences here are going to be fairly small (a few runs). That is not to say that this way is preferable, but it’s not horrible either, at least as far as I can tell.

To compare this approach to the subtraction approach, start by assuming that a replacement level shortstop would create .86*.73*4.5 = 2.825 RG (or would perform at an overall level of equivalent value to being an average fielder at shortstop while creating 2.825 runs per game). Suppose that we are comparing two shortstops, each of whom compiled 600 PA and played an equal number of defensive games and innings (and thus would have the same positional adjustment using the subtraction approach). Alpha made 380 outs and Bravo made 410 outs, and each ranked as dead-on average in the field.

The difference in overall RAR between the two using the subtraction approach would be equal to the difference between their offensive RAA compared to the league average. Assuming the league average is 4.5 runs, and that both Alpha and Bravo created 75 runs, their offensive RAAs are:

Alpha = (75*25.5/380 - 4.5)*380/25.5 = +7.94

Similarly, Bravo is at +2.65, and so the difference between them will be 5.29 RAR.

Using the flawed approach, Alpha's RAR will be:

(75*25.5/380 - 4.5*.73*.86)*380/25.5 = +32.90

Bravo's RAR will be +29.58, a difference of 3.32 RAR, which is two runs off of the difference using the subtraction approach.
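The Alpha and Bravo comparison can be reproduced with a short Python sketch. The constants come straight from the example (4.5 league RG, .73 replacement factor, .86 shortstop PADJ, 75 runs created); the function names are illustrative.

```python
OUTS_PER_GAME = 25.5
LEAGUE_RG = 4.5          # league average runs per game
REPLACEMENT_PCT = 0.73   # replacement level as a share of positional average
SS_PADJ = 0.86           # shortstop positional adjustment from the example


def runs_above(runs_created, outs, baseline_rg):
    """(RG - baseline RG) * outs / 25.5, with RG = RC * 25.5 / outs."""
    rg = runs_created * OUTS_PER_GAME / outs
    return (rg - baseline_rg) * outs / OUTS_PER_GAME


players = {"Alpha": 380, "Bravo": 410}  # outs made; both created 75 runs

# Subtraction approach: only offensive RAA differs between the two players
raa = {name: runs_above(75, outs, LEAGUE_RG) for name, outs in players.items()}
subtraction_gap = raa["Alpha"] - raa["Bravo"]

# Offensive PADJ approach: baseline is the replacement-level shortstop
ss_replacement_rg = LEAGUE_RG * REPLACEMENT_PCT * SS_PADJ
rar = {name: runs_above(75, outs, ss_replacement_rg) for name, outs in players.items()}
padj_gap = rar["Alpha"] - rar["Bravo"]

print({k: round(v, 2) for k, v in raa.items()}, round(subtraction_gap, 2))
print({k: round(v, 2) for k, v in rar.items()}, round(padj_gap, 2))
```

The gap shrinks under the second approach because each extra out is charged at the replacement-level rate of about 2.825 runs per 25.5 outs rather than the league average rate of 4.5.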

The downside to using PA is that you really need to consider park effects if you do, whereas outs allow you to sidestep park effects. Outs are constant; plate appearances are linked to OBA. Thus, plate appearances depend not only on the offensive context (including park factor), but also on the quality of one's team. Of course, attempting to adjust for team PA differences opens a huge can of worms which is not really relevant; for now, the point is that using outs for individual players causes distortions, sometimes trivial and sometimes bothersome, but almost always makes one's life easier.

I do not include fielding (or baserunning outside of steals, although that is a trivial consideration in comparison) in the RAR figures--they cover offense and positional value only. This in no way means that I do not believe that fielding is an important consideration in player valuation. However, two of the key principles of these stat reports are 1) not incorporating any data that is not readily available and 2) not simply including other people's results (of course I borrow heavily from other people's methods, but only adapting methodology that I can apply myself).

Any fielding metric worth its salt will fail to meet either criterion--they use zone data or play-by-play data which I do not have easy access to. I do not have a fielding metric that I have stapled together myself, and so I would have to simply lift other analysts' figures.

Setting the practical reason for not including fielding aside, I do have some reservations about lumping fielding and hitting value together in one number because of the obvious differences in reliability between offensive and fielding metrics. In theory, they absolutely should be put together. But in practice, I believe it would be better to regress the fielding metric to a point at which it would be roughly equivalent in reliability to the offensive metric.

Offensive metrics have error bars associated with them, too, of course, and in evaluating a single season's value, I don't care about the vagaries that we often lump together as "luck". Still, there are errors in our assessment of linear weight values and players that collect an unusual proportion of infield hits or hits to the left side, errors in estimation of park factor, and any number of other factors that make their events more or less valuable than an average event of that type.

Fielding metrics offer up all of that and more, as we cannot be nearly as certain of true successes and failures as we are when analyzing offense. Recent investigations, particularly by Colin Wyers, have raised even more questions about the level of uncertainty. So, even if I were including a fielding value, my approach would be to assume that the offensive value was 100% reliable (which it isn't), and regress the fielding metric relative to that (so if the offensive metric was actually 70% reliable, and the fielding metric 40% reliable, I'd treat the fielding metric as .4/.7 = 57% reliable when tacking it on, to illustrate with a simplified and completely made up example presuming that one could have a precise estimate of nebulous "reliability").
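As a rough sketch of that made-up example, the relative-reliability idea might look like the following in Python. Treating "regress" as shrinking the fielding figure toward zero (average for the position) is an assumption here, as are the function names and the +30/+10 run values.

```python
def relative_fielding_weight(offense_reliability, fielding_reliability):
    """Reliability of fielding expressed relative to offense (offense treated as 1.0)."""
    return fielding_reliability / offense_reliability


def combined_runs(offense_runs, fielding_runs, offense_reliability, fielding_reliability):
    """Offense taken at face value; fielding shrunk toward zero by the relative weight."""
    weight = relative_fielding_weight(offense_reliability, fielding_reliability)
    return offense_runs + weight * fielding_runs


weight = relative_fielding_weight(0.7, 0.4)  # the .4/.7 example from the text
# Hypothetical player: +30 runs offensively, +10 runs by some fielding metric
total = combined_runs(30.0, 10.0, 0.7, 0.4)

print(round(weight, 2), round(total, 1))
```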

Given the inherent assumption of the offensive PADJ that all positions are equally valuable, once RAR has been figured for a player, fielding value can be accounted for by adding on his runs above average relative to a player at his own position. If there is a shortstop that is -2 runs defensively versus an average shortstop, he is without a doubt a plus defensive player, and a more valuable defensive player than a first baseman who was +1 run better than an average first baseman. Regardless, since it was implicitly assumed that they are both average defensively for their position when RAR was calculated, the shortstop will see his value docked two runs. This DOES NOT MEAN that the shortstop has been penalized for his defense. The whole process of accounting for positional differences, going from hitting RAR to positional RAR, has benefited him.

I've found that there is often confusion about the treatment of first basemen and designated hitters in my PADJ methodology, since I consider DHs as in the same pool as first basemen. The fact of the matter is that first basemen outhit DHs. There are any number of potential explanations for this; DHs are often old or injured, players hit worse when DHing than they do when playing the field, etc. This actually helps first basemen, since the DHs drag the average production of the pool down, thus resulting in a lower replacement level than I would get if I considered first basemen alone.

However, this method does assume that a 1B and a DH have equal defensive value. Obviously, a DH has no defensive value. What I advocate to correct this is to treat a DH as a bad defensive first baseman, and thus knock another five or ten runs off of his RAR for a full-time player. I do not incorporate this into the published numbers, but you should keep it in mind. However, there is no need to adjust the figures for first basemen upwards--the only necessary adjustment is to take the DHs down a notch.

Finally, I consider each player at his primary defensive position (defined as where he appears in the most games), and do not weight the PADJ by playing time. This does shortchange a player like Ben Zobrist (who saw significant time at a tougher position than his primary position), and unduly boost a player like Buster Posey (who logged a lot of games at a much easier position than his primary position). For most players, though, it doesn't matter much. I find it preferable to make manual adjustments for the unusual cases rather than add another layer of complexity to the whole endeavor.

2014 Park Factors

2014 League

2014 Team

2014 Team Offense

2014 Team Defense

2014 AL Relievers

2014 NL Relievers

2014 AL Starters

2014 NL Starters

2014 AL Hitters

2014 NL Hitters

Monday, September 29, 2014

Crude Playoff Odds--2014

In a world where there are plenty of sources for playoff odds that actually take into account the personnel currently available for each team, use projected rather than 2014-only performance, consider pitching matchups, and the like, there is no real reason for me to post this. Nonetheless, here are some very crude playoff odds. The key assumptions:

• Team strength is constant and is measured by my Crude Team Ratings, using an equal weight of W%, EW%, and PW% regressed with 69 games of .500
• Home field advantage is uniform and the home team wins 54.5% of the time
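These two assumptions are enough to sketch the kind of calculation involved. The Python below is a minimal illustration, not the exact procedure behind the tables: it assumes the ratings behave like win ratios (so the odds for a matchup are the ratio of the two ratings), applies home field as an odds multiplier, and uses made-up ratings of 110 and 100 with a hypothetical 2-2-1 home pattern.

```python
HOME_WIN_PCT = 0.545  # from the assumptions above: home team wins 54.5% between equals


def regress_wpct(wins, games, prior_games=69, prior_wpct=0.5):
    """Regress a winning percentage by adding 69 games of .500."""
    return (wins + prior_games * prior_wpct) / (games + prior_games)


def game_prob(rating_home, rating_away, home_win_pct=HOME_WIN_PCT):
    """P(home team wins one game). Assumption: ratings act as win ratios."""
    home_odds = home_win_pct / (1 - home_win_pct)
    odds = (rating_home / rating_away) * home_odds
    return odds / (1 + odds)


def series_prob(rating_a, rating_b, a_home_pattern):
    """P(team A wins a series); a_home_pattern has one bool per possible game."""
    wins_needed = len(a_home_pattern) // 2 + 1
    states = {(0, 0): 1.0}   # (A wins, B wins) -> probability of reaching that state
    p_series = 0.0
    for a_home in a_home_pattern:
        if a_home:
            p_a = game_prob(rating_a, rating_b)
        else:
            p_a = 1 - game_prob(rating_b, rating_a)
        next_states = {}
        for (a, b), p in states.items():
            for a_won, q in ((1, p_a), (0, 1 - p_a)):
                na, nb = a + a_won, b + (1 - a_won)
                if na == wins_needed:
                    p_series += p * q          # A clinches the series
                elif nb < wins_needed:
                    key = (na, nb)             # series still undecided
                    next_states[key] = next_states.get(key, 0.0) + p * q
        states = next_states
    return p_series


# Hypothetical best-of-five with team A at home for games 1, 2, and 5
pattern = (True, True, False, False, True)
print(round(series_prob(110, 100, pattern), 3))
print(round(regress_wpct(96, 162), 3))
```

A single-element pattern such as `(True,)` covers a one-game wild card, and multiplying a series probability by the chance that the matchup occurs gives the combined figure for that series.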

From there, the math is pretty simple and I will present it with little explanation. First, the ratings which are used to feed the estimates:

These ratings don’t know or care that Oakland has stopped scoring runs (or that the magic influence of Cespedes is gone); they don’t know that Garrett Richards is injured; they don’t know anything other than these teams’ schedules, their wins and losses, runs and runs allowed, and BA/OBA/SLG and BA/OBA/SLG allowed.

From here, it’s just plug and chug. Wildcard game odds:

Home field offsets OAK’s perceived strength advantage over KC.

Division series:

Here, “P” is the probability that the series occurs; P(H win) is the probability that the home team wins should the series occur; and P(H) is the probability that the series occurs and that the home team wins [P*P(H win)].


World Series:

The probability of one of the “rivalry” matchups (OAK/SF, LAA/LA, BAL/WAS, or KC/STL) occurring is 21.5%, which is not bad at all. LAA/LA is the second-most likely series; the least likely is KC/SF, which is good because that is the one that I would least like to see.

Putting it all together:

One might be surprised by OAK having better odds to win a Division Series and each subsequent round than KC even though the latter is favored in their wildcard game, but the ratings (perhaps incorrectly) think the A’s are one of the strongest teams in the playoffs, and thus the Royals’ wildcard game edge, solely due to home field, is insufficiently large to keep them ahead. KC would need a rating of around 122 to have an equal World Series win probability to the A’s.

The odds suggest a 57.6% chance that the junior circuit wins it all, not a surprise given that the AL teams rank 1-2-3-6-7 in strength and they have home field in the World Series to boot. In fact, the AL is favored in 22 of 25 potential matchups, with the only exceptions being Washington against Detroit or Kansas City and Los Angeles against Kansas City.