Walk Like a Sabermetrician: 2010

Thursday, December 23, 2010

Great Moments in Yahoo! Box Scores

It's not just baseball.

Luckily, I saw a couple tweets about this...I don't follow the NBA closely enough to ever click on Rockets/Clippers box scores.

Monday, December 20, 2010

Hitting by Lineup Slot, 2010

I devoted a whole post to leadoff hitters, whether justified or not, so it's only fair to have a post about hitting by batting order position in general. I certainly consider this piece to be more trivia than sabermetrics, but if nothing else it's fun to find new ways to expose the disaster that was Seattle's offense.

The data in this post was taken from Baseball-Reference. The figures for each team's runs are park-adjusted; BA, OBA, and SLG are raw, and OBA is figured as (H + W)/(AB + W). RC is ERP, including SB and CS, as used in my end of season stat posts. The weights used are constant across lineup positions; there was no attempt to apply specific weights to each position, although they are out there and would certainly make this a little bit more interesting.

NL #3 hitters were the most productive in 2009 as well. American League teams had their best hitters in the cleanup spot on average. In the leadoff piece, I touched on how leadoff hitters as a group were below-average; here we see that it was the AL that produced that result, as junior circuit leadoff men were less productive than every other spot except #8 and #9. #2 hitters continued to be above-average, which is something of a departure from the long-term trend, and certainly is sabermetrically-approved.

Excluding NL #9 hitters (because of the many PA for pitchers), the two least-productive lineup slots were AL #8 and #9. The NL averaged higher RG at slots 1, 3, 5, 6, and 8.

Next, here are the team leaders in RG at each lineup position. The player listed is the one who appeared in the most games in that spot, which is sometimes misleading. For example, Nelson Cruz appeared in 54 games for Texas at #5 to Josh Hamilton's 52. Cruz had a very good 979 OPS in those appearances, but Hamilton turned in a whopping 1178 and was obviously the man most responsible for Texas' superiority at #5.

The only team with two league-leading positions was Oakland; as we'll see below, they also had two league-trailing positions. Keeping in mind that the AL and NL RG averages were 4.45 and 4.33 respectively, every lineup spot had at least one above-average team performance except for NL #9.

Two A's, three Mariners, two Astros, two Dodgers. What really boggles the mind is that Seattle had a spot (5th) in the heart of the order slugging under .300. Their .297 SLG was better than only six other lineup spots (excluding NL #9 hitters), the highest of which were the PIT/LAA eights...and the SEA sevens. It was also a bad year to put a guy named Lee in the heart of your order as a NL team.

The two charts that follow display the top ten positions based on runs above average. RAA in this case is only in comparison to the AL or NL average at each position. The practical result of this is that NL #3 slots are being measured against a 6.04 RG baseline, versus 5.20 for NL #4s. That doesn't actually mean that if your NL team got 5.5 out of both #3 and #4, the former were hurting and the latter helping. So these figures are presented for fun more than analysis:
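To make the baseline issue concrete, here is a minimal sketch of the comparison described above. The two NL slot averages are the ones quoted in the text; the function name and the 5.5 RG team are purely illustrative, and the RAA totals in the charts involve playing time as well, which this sketch ignores.

```python
# League-average RG by lineup slot, as quoted in the post for the 2010 NL.
NL_SLOT_RG = {3: 6.04, 4: 5.20}


def rg_vs_slot_average(team_rg, slot, baselines=NL_SLOT_RG):
    """Runs per game above (+) or below (-) the league average for that slot."""
    return team_rg - baselines[slot]


# A hypothetical NL team getting 5.5 RG from both its #3 and #4 slots.
for slot in (3, 4):
    diff = rg_vs_slot_average(5.5, slot)
    print(f"#{slot}: {diff:+.2f} RG vs. slot average")
```

The same 5.5 RG comes out below average at #3 and above average at #4, which is an artifact of the baseline rather than a statement about which slot was actually helping the team.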

As you'll see with the bottom ten, most of the extreme team positions occurred in key lineup slots--3, 4, 5, 1, etc. This makes sense, since those positions are more likely to be manned by one player and there aren't many teams that can go nine deep, so there's not as much variation at the bottom of the order.

There are those Mariner #5s again. The average AL #5 spot hit .264/.329/.437, 5.0 RG to Seattle's .210/.258/.297, 2.5 RG.

Finally, these charts give each team's ranking within their league in RG at each spot. The top and bottom three in each league are highlighted. While the NL has sixteen teams to the AL's fourteen, three still represents the top and bottom 20% in each circuit (rounded to a whole number--for the AL 20% is actually 2.8, for the NL it's 3.2).

The only teams that did not have even a single spot in the top or bottom three were San Francisco and Tampa Bay. The Rays were #4 at spots 1-3, so the top of their order was quite productive relative to the league average.

Seattle was the punchline that keeps giving. Their leadoff hitters (almost all Ichiro) were the most productive in the AL, but eleventh was their best showing at any other slot.

Complete data is available in this spreadsheet.

Wednesday, December 15, 2010

Shallow Reflections on Bob Feller

Like many other Cleveland kids of the last couple of generations, I met Bob Feller and got his autograph once. It was at a Discount Drug Mart. Feller made countless such appearances, at drug stores and county fairs and other places where the pay couldn't have possibly been that lucrative. I'm not an autograph/memorabilia aficionado, but I recall reading that Feller's autograph was one of the least valuable for any player of his stature because of the huge supply.

At least in Cleveland, the legend outranked the ballplayer. There's only one player statue outside of Jacobs Field, and it is of course of Bob Feller. Feller is nearly unique in Indians history--a great, Hall of Fame level player that spent his career with just the Tribe. Even counting half-career Indian stars, only Tris Speaker and Nap Lajoie would be able to contest the title of greatest Indian. While Lajoie spent most of his career in Cleveland (65% measured by games played), with the team called the "Naps" for many years, he played before the radio era, before the liveball era, before the team was called the Indians, and never won a pennant in Cleveland.

Thus, Feller's status as a singular franchise icon is arguably unique for one of the sixteen franchises that made up MLB for sixty years. The Yankees (Ruth, Gehrig), Giants (Mathewson, Hubbell), Dodgers (Robinson, Snider), Red Sox (Williams, Yaz), White Sox (Thomas, Fox), Cardinals (Musial, Gibson), Tigers (Cobb, Kaline), Pirates (Wagner, Clemente), Cubs (Banks, Santo), Phillies (Schmidt, Roberts), Reds (Rose, Bench), and Braves (Aaron, Mathews) all have at least two players who, had they been Indians, could compete with Feller for that title (before you complain about omissions from that list, I stopped at two for each team even if there were more, and I'm not suggesting that all of the listed players were better than Feller by any stretch).

The A's might have the situation that most closely parallels the Indians--many of their stars were half-career guys. Still, Foxx, Grove, Simmons, Plank, Cochrane, Collins, Baker...the sheer bulk of icon candidates has to count for something. Among the teams that have used a name change to make a clean break from their past, the Browns/Orioles boast Ripken and Palmer and the Senators/Twins combine for Johnson and Puckett. I might be overselling my case, but I think it's safe to say that if there are any other long-time franchises with a player that is equally Mr. [Nickname] as Feller is Mr. Indian, it's due to that individual's transcendent greatness and not to the dearth of other candidates.

If my thoughts ring a little cold when offered upon the passing of a legend, please reconsider. I said the legend outranked the ballplayer in Cleveland, but I said nothing about the legend outranking the man.

Monday, December 13, 2010

Leadoff Hitters, 2010

This post kicks off a series of posts that I write every year, and therefore struggle to infuse with any sort of new perspective. However, they're a tradition on this blog and hold some general interest, so away we go.

This post looks at the offensive performance of teams' leadoff batters. I will try to make this as clear as possible: the statistics are based on the players that hit in the #1 slot in the batting order, whether they were actually leading off an inning or not. It includes the performance of all players who batted in that spot, including substitutes like pinch-hitters. Listed in parentheses after a team are all players that appeared in twenty or more games in the leadoff slot--while you may see a listing like "MIN (Span)", this does not mean that the statistic is based solely on Span's performance; it is the total of all Minnesota batters in the #1 spot, of which Span was the only one to appear in that spot in twenty or more games. I will list the top and bottom three teams in each category (plus the top/bottom team from each league if they don't make the ML top/bottom three); complete data is available in a spreadsheet linked at the end of the article. There are also no park factors applied anywhere in this article.

That's as clear as I can make it, and I hope it will suffice. I always feel obligated to point out that as a sabermetrician, I think that the importance of the batting order is often overstated, and that the best leadoff hitters would generally be the best cleanup hitters, the best #9 hitters, etc. However, since the leadoff spot gets a lot of attention, and teams pay particular attention to the spot, it is instructive to look at how each team fared there.

The conventional wisdom is that the primary job of the leadoff hitter is to get on base, and most simply, score runs. So let's start by looking at runs scored per 25.5 outs (AB - H + CS):

1. NYA (Jeter/Gardner), 6.5
2. FLA (Coghlan/Maybin/Bonifacio/Ramirez), 6.0
3. DET (Jackson), 5.9
Leadoff average, 5.0
ML average, 4.4
28. CLE (Brantley/Crowe/Cabrera), 4.0
29. WAS (Morgan), 4.0
30. SEA (Suzuki), 4.0

Obviously this category is heavily influenced by the quality of the subsequent batters in the lineup; the best indication of this is Ichiro's last-place finish, as you'll see that his leadoff spot actually ranks among the leaders in a couple of more independent categories. Ichiro was the only batter to appear in the leadoff spot in all of his team's games; Juan Pierre (156), Rickie Weeks (155), and Denard Span (151) were the other batters to appear in 150 or more games.
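For anyone who wants to reproduce the rate from the spreadsheet, a minimal sketch of runs scored per 25.5 outs follows. The out definition (AB - H + CS) is the one given above; the function name and the sample totals are hypothetical.

```python
def runs_per_25_5_outs(runs, ab, hits, cs, outs_per_game=25.5):
    """Runs scored per 25.5 outs, with outs counted as AB - H + CS."""
    outs = ab - hits + cs
    return runs * outs_per_game / outs


# Hypothetical leadoff-slot totals, not taken from any team in the post.
print(round(runs_per_25_5_outs(runs=100, ab=650, hits=180, cs=10), 1))
```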

The other obvious metric to look at is On Base Average, which speaks to the other conventional goal of a leadoff hitter. The figures here exclude HB and SF to be directly comparable to earlier versions of this article, but those categories are available in the spreadsheet if you'd like to include them:

1. ARI (Johnson/Drew/Young), .366
2. SEA (Suzuki), .358
3. LA (Furcal/Podsednik), .351
Leadoff average, .324
ML average, .322
28. CIN (Phillips/Cabrera/Stubbs), .299
29. WAS (Morgan), .293
30. CLE (Brantley/Crowe/Cabrera), .292

The Reds just cannot seem to find a way to get their leadoff hitters on base. Last year they ranked 29th with a .301 OBA led by Willy Taveras, Drew Stubbs, and Chris Dickerson; and in 2008 they were 24th with Jerry Hairston, Corey Patterson, Jay Bruce and Dickerson. At least this year they weren't wasting PAs on proven failures like Taveras and Patterson.

As alluded to above, Seattle's leadoff hitters had the highest OBA in the American League, yet scored the fewest runs per out.

The next statistic is what I call Runners On Base Average. The genesis of it is from the A factor of Base Runs. It measures the number of times a batter reaches base per PA--excluding homers, since a batter that hits a home run never actually runs the bases.

My 2009 leadoff post was linked to a Cardinals message board, and this metric was the cause of a lot of confusion (this was mostly because the poster in question was thick-headed as could be, but it's still worth addressing). ROBA, like several other methods that follow, is not really a quality metric, it is a descriptive metric. A high ROBA is a good thing, but it's not necessarily better than a slightly lower ROBA plus a higher home run rate (which would produce a higher OBA and more runs). Listing ROBA is not in any way, shape or form a statement that hitting home runs is bad for a leadoff hitter. It is simply a recognition of the fact that a batter that hits a home run is not a baserunner.

ROBA also removes CS, and so the formula is (H + W - HR - CS)/(AB + W):

1. SEA (Suzuki), .337
2. NYA (Jeter/Gardner), .328
3. DET (Jackson), .323
5. LA (Furcal/Podsednik), .323
Leadoff average, .296
ML average, .290
26. TOR (Lewis/Snider/Wise), .276
28. SF (Torres/Rowand), .269
29. CIN (Phillips/Cabrera/Stubbs), .267
30. WAS (Morgan), .260

Arizona's leadoff hitters, who led the majors in OBA, rank sixth in ROBA because they were second in the majors with 25 homers (Milwaukee hit 28 from the leadoff spot). Washington loses ground on the list (although they were just 29th in OBA) not because their leadoff hitters were driving the ball out of the park (their 5 homers tied for fifth-fewest), but because they led the majors by being caught stealing 19 times.

I will also include what I've called Literal OBA here--this is just ROBA with HR subtracted from the denominator so that a homer does not lower LOBA, it simply has no effect. You don't really need ROBA and LOBA (or either, for that matter), but this might save some poor message board out there twenty posts, so here goes. LOBA = (H + W - HR - CS)/(AB + W - HR):

1. SEA (Suzuki), .340
2. NYA (Jeter/Gardner), .332
3. DET (Jackson), .330
4. ARI (Johnson/Drew/Young), .330
Leadoff average, .301
ML average, .298
26. CLE (Brantley/Crowe/Cabrera), .280
28. SD (Hairston/Venable/Gwynn/Eckstein), .275
29. CIN (Phillips/Cabrera/Stubbs), .273
30. WAS (Morgan), .262
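The three on-base measures differ only in how they treat home runs and caught stealing, which is easier to see side by side. This is a minimal sketch using the formulas exactly as given above; the two batting lines are hypothetical and chosen only to show that a hitter can have the higher OBA and the lower ROBA at the same time.

```python
def oba(h, w, ab):
    """On Base Average as used in the post (no HB or SF)."""
    return (h + w) / (ab + w)


def roba(h, w, hr, cs, ab):
    """Runners On Base Average: homers and caught stealing leave no runner on."""
    return (h + w - hr - cs) / (ab + w)


def loba(h, w, hr, cs, ab):
    """Literal OBA: like ROBA, but a homer is also removed from the denominator."""
    return (h + w - hr - cs) / (ab + w - hr)


# Two hypothetical leadoff lines (not real players).
power = dict(ab=600, h=160, w=70, hr=25, cs=3)
speed = dict(ab=650, h=190, w=45, hr=5, cs=10)

for name, p in (("power", power), ("speed", speed)):
    print(
        name,
        round(oba(p["h"], p["w"], p["ab"]), 3),
        round(roba(p["h"], p["w"], p["hr"], p["cs"], p["ab"]), 3),
        round(loba(p["h"], p["w"], p["hr"], p["cs"], p["ab"]), 3),
    )
```

The power line wins on OBA and loses on ROBA, which is the descriptive (not qualitative) distinction the metric is meant to capture.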

The next two categories are most definitely categories of shape, not value. The first is the ratio of runs scored to RBI. Leadoff hitters as a group score many more runs than they drive in, partly due to their skills and partly due to lineup dynamics. Those with low ratios don’t fit the traditional leadoff profile as closely as those with high ratios:

1. FLA (Coghlan/Maybin/Bonifacio/Ramirez), 2.5
2. TEX (Andrus), 2.4
3. DET (Jackson), 2.1
Leadoff average, 1.7
27. KC (Podsednik/Blanco/DeJesus), 1.4
28. SD (Hairston/Venable/Gwynn/Eckstein), 1.4
29. SF (Torres/Rowand), 1.3
30. PHI (Victorino/Rollins), 1.2
ML average, 1.1

Florida's leadoff hitters scored a lot of runs (as we saw earlier), so it's no surprise they had a high R/RBI ratio. Texas ranks second because they drove in just 42 runs (tied with Cleveland for fewest), and with a ML-low .290 SLG it's not hard to see why (they trailed in SLG by a wide margin; CHA was next at .310).

A similar gauge, but one that doesn't rely on the teammate-dependent R and RBI totals, is Bill James' Run Element Ratio. RER was described by James as the ratio between those things that were especially helpful at the beginning of an inning (walks and stolen bases) to those that were especially helpful at the end of an inning (extra bases). It is a ratio of "setup" events to "cleanup" events. Singles aren't included because they often function in both roles.

Of course, there are RBI walks, and doubles can be a great way to start an inning, but RER classifies events based on when they have the highest relative value, at least from a simple analysis:

1. CHA (Pierre), 4.1
2. TEX (Andrus), 2.4
3. HOU (Bourn/Bourgeois), 2.6
Leadoff average, 1.1
23. TOR (Lewis/Snider/Wise), .9
ML average, .8
28. NYN (Reyes/Pagan), .7
29. ATL (Prado/Infante), .7
30. SF (Torres/Rowand), .6

The influence of stolen bases is pretty strong in RER, which is why the White Sox rank so highly--their 67 swipes led all leadoff spots, with the Astros next at 61. Atlanta's 10 steals only beat out two teams (Boston and St. Louis) that stole nine.
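The post describes RER in words rather than as a formula. A minimal sketch, assuming the commonly cited form (W + SB)/(TB - H), where TB - H is the count of extra bases on hits; that exact form is an assumption on my part here, and the sample totals are hypothetical.

```python
def run_element_ratio(w, sb, tb, h):
    """Setup events (walks, steals) divided by cleanup events (extra bases).

    Assumes the form (W + SB) / (TB - H); singles cancel out of TB - H.
    """
    extra_bases = tb - h
    return (w + sb) / extra_bases


# Hypothetical speed-first leadoff line: few extra bases, lots of steals.
print(round(run_element_ratio(w=50, sb=60, tb=220, h=180), 2))
```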

Speaking of stolen bases, I decided it would be worthwhile this year to look at a pure measure of base stealing. Obviously there's a lot more that goes into being a leadoff hitter than simply stealing bases, but it is one of the areas that is often cited as important. So I've included the ranking for what some analysts call net steals, SB - 2*CS. I'm not going to worry about the precise breakeven rate, which is probably closer to 75% than 67%, but is also variable based on situation. The ML and leadoff averages in this case are per team lineup slot:

1. OAK (Crisp/Davis/Pennington), 45
2. CHA (Pierre), 33
3. HOU (Bourn/Bourgeois), 31
Leadoff average, 11
ML average, 3
28. ATL (Prado/Infante), -2
29. LAA (Aybar/Abreu), -4
30. COL (Fowler/Gonzalez/Young), -10

It is really quite mind-boggling that a playoff contender playing in the majors' most offense-friendly park would allow its leadoff men to attempt 41 steals with a 59% success rate. And there's those Moneyball A's...never mind, too easy.

Oakland and Philadelphia tied for the lead in SBA at 87.5%; the Phillies were 35/40, and fourth in net steals. The only teams below the 2/3 success rate implied by net steals were BOS (9/14), LAA (22/35), ATL (10/16), and of course COL (24/41). Leadoff hitters' composite SBA was 75.7%, compared to the overall major league rate of 72.4%.
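The link between net steals and the 2/3 breakeven rate can be checked directly: SB - 2*CS is negative exactly when the success rate is under 2/3. A minimal sketch using the SB/attempt figures quoted above (the function and variable names are my own):

```python
# (stolen bases, attempts) for leadoff slots, as quoted in the post.
LEADOFF_STEALS = {
    "PHI": (35, 40),
    "BOS": (9, 14),
    "LAA": (22, 35),
    "ATL": (10, 16),
    "COL": (24, 41),
}


def net_steals(sb, attempts):
    """SB - 2*CS, which is zero at exactly a 2/3 success rate."""
    cs = attempts - sb
    return sb - 2 * cs


def success_rate(sb, attempts):
    return sb / attempts


for team, (sb, att) in LEADOFF_STEALS.items():
    print(team, net_steals(sb, att), f"{success_rate(sb, att):.1%}")
```

The four teams under 2/3 are exactly the four with negative net steals, and the ATL, LAA and COL totals match the bottom of the net steals list.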

Let's shift gears back to quality measures, beginning with one that David Smyth proposed when I first wrote this annual leadoff review. Since the optimal weight for OBA in a x*OBA + SLG metric is generally something like 1.7, David suggested figuring 2*OBA + SLG for leadoff hitters, as a way to give a little extra boost to OBA while not distorting things too much, or even suffering an accuracy decline from standard OPS. Since this is a unitless measure anyway, I multiply it by .7 to approximate the standard OPS scale and call it 2OPS:

1. ARI (Johnson/Drew/Young), 850
2. LA (Furcal/Podsednik), 794
3. MIL (Weeks), 791
4. SEA (Suzuki), 776
ML average, 733
Leadoff average, 722
28. SD (Hairston/Venable/Gwynn/Eckstein), 638
29. WAS (Morgan), 633
30. CLE (Brantley/Crowe/Cabrera), 629
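For reference, a minimal sketch of the 2OPS calculation as described: 2*OBA + SLG, multiplied by .7, then shown without the decimal point like the list above. The .322 OBA is the ML average quoted earlier in the post; the .403 SLG is an assumed input for illustration, not a figure from the post.

```python
def two_ops(oba, slg):
    """.7 * (2*OBA + SLG), expressed on the three-digit scale used in the list."""
    return round(0.7 * (2 * oba + slg) * 1000)


# OBA from the post's ML average; SLG is an assumed illustrative value.
print(two_ops(0.322, 0.403))
```

Because OBA gets twice the weight of SLG, ten points of OBA moves the result by 14 points, while ten points of SLG moves it by 7.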

Along the same lines, one can also evaluate leadoff hitters in the same way I'd go about evaluating any hitter, and just use Runs Created per Game with standard weights (this will include SB and CS, which are ignored by 2OPS):

1. ARI (Johnson/Drew/Young), 6.2
2. LA (Furcal/Podsednik), 5.4
3. MIL (Weeks), 5.3
4. SEA (Suzuki), 5.2
ML average, 4.5
Leadoff average, 4.4
28. TEX (Andrus), 3.1
29. CLE (Brantley/Crowe/Cabrera), 3.1
30. WAS (Morgan), 3.1

Not surprisingly, this list is extremely similar to the 2OPS list.

Finally, allow me to close with a crude theoretical measure of linear weights supposing that the player always led off an inning (that is, batted in the bases empty, no outs state). There are weights out there (see The Book) for the leadoff slot in its average situation, but this variation is much easier to calculate (although also based on a silly and impossible premise).

The weights I used were based on the 2010 run expectancy table from Baseball Prospectus. Ideally I would have used multiple seasons but this is a seat-of-the-pants metric. Here are the relevant lines of that RE table:

To calculate the value of a single or walk in (---, 0), simply subtract .492 from .859 to get .367. Similarly, the value of a double is 1.101 - .492 = .610 and a triple is 1.358 - .492 = .866. A home run is worth one run, as the state remains (---, 0) but there is a run on the board.

Assuming (conveniently and inaccurately) that all stolen base attempts occur with 0 out and are of second base, the value of a steal is 1.101 - .859 = .242, which necessarily is the same as the extra value of a double over that of a single or walk.
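The event values above are just differences between run expectancy states, which a few lines of code make explicit. This is a minimal sketch using only the four zero-out RE values quoted in the text; the dictionary layout and names are my own. Note that the three-decimal inputs give .609 for the double, a hair under the .610 quoted, presumably a rounding difference in the underlying table.

```python
# Zero-out run expectancy by base state, as quoted in the post.
RE_0_OUT = {"---": 0.492, "1--": 0.859, "-2-": 1.101, "--3": 1.358}

empty = RE_0_OUT["---"]

# Value of each event when it occurs with the bases empty and nobody out.
EVENT_VALUES = {
    "single_or_walk": RE_0_OUT["1--"] - empty,
    "double": RE_0_OUT["-2-"] - empty,
    "triple": RE_0_OUT["--3"] - empty,
    "home_run": 1.0,  # state stays (---, 0), one run scores
    # Steal of second with 0 out: runner on first becomes runner on second.
    "steal": RE_0_OUT["-2-"] - RE_0_OUT["1--"],
}

for event, value in EVENT_VALUES.items():
    print(f"{event}: {value:.3f}")
```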

I will deal with outs in such a manner so as to force the average leadoff hitter to zero RAA. They will not come out to zero without special treatment; after all, this is a theoretical construct. Leadoff hitters are not perfectly average nor are events evenly distributed across base/out states.

First, a caught stealing costs the team the value of the baserunner previously earned (-.367), plus the cost of the out itself, which also applies to (AB - H). So to calculate the out value, solve this equation for x:

0 = .367(S + W - CS) + .61D + .866T + HR + .242SB + x*(AB - H + CS)

For 2010 leadoff hitters, x = -.230, and so our theoretical leadoff RAA (which I'll call raw Leadoff Efficiency because I was already using that name for a different metric in the past) is:

rLE = .367(S + W) + .61D + .866T + HR + .242SB - .583CS - .23(AB - H)

To convert this to a rate (it is a RAA total in its current form), I divided by PA (AB + W) and multiplied by the average number of PA for leadoff hitters in 2010 (742). This yields Leadoff Efficiency:

1. ARI (Johnson/Drew/Young), 27
2. LA (Furcal/Podsednik), 15
3. MIL (Weeks), 15
4. SEA (Suzuki), 12
ML average, 1
Leadoff average, 0
28. SD (Hairston/Venable/Gwynn/Eckstein), -19
29. CLE (Brantley/Crowe/Cabrera), -21
30. WAS (Morgan), -22

The fact that this list is very similar to the lists based on metrics designed to apply generic weights to all batters illustrates how the relative values of offensive events are fairly stable.
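The out value and the rate conversion are easier to follow as a calculation. Below is a minimal sketch that solves the zero-sum equation for x and then scales to 742 PA. The event weights and the 742 figure come from the text; the composite totals are hypothetical, so the x it produces will not match the post's -.230, and the caught-stealing term simply follows the equation as written (loss of the baserunner plus one out at x).

```python
# Event weights from the post; x (the out value) is solved for below.
WEIGHTS = {"s": 0.367, "w": 0.367, "d": 0.61, "t": 0.866, "hr": 1.0, "sb": 0.242}
AVG_LEADOFF_PA = 742  # average PA (AB + W) per leadoff slot in 2010, per the post


def positive_runs(line):
    """Run value of all non-out events; each CS erases a baserunner (-.367)."""
    total = sum(WEIGHTS[k] * line[k] for k in WEIGHTS)
    return total - WEIGHTS["s"] * line["cs"]


def outs(line):
    hits = line["s"] + line["d"] + line["t"] + line["hr"]
    return line["ab"] - hits + line["cs"]


def solve_out_value(group):
    """Choose x so the whole group sums to zero runs above average."""
    return -positive_runs(group) / outs(group)


def raw_le(line, x):
    return positive_runs(line) + x * outs(line)


def leadoff_efficiency(line, x):
    """raw LE per PA, scaled to the average leadoff slot's PA."""
    pa = line["ab"] + line["w"]
    return raw_le(line, x) / pa * AVG_LEADOFF_PA


# Hypothetical composite totals for all leadoff slots (not the real 2010 data).
GROUP = dict(ab=20000, s=3700, d=950, t=150, hr=400, w=2000, sb=700, cs=220)
x = solve_out_value(GROUP)
print(round(x, 3))
```

By construction the group itself comes out at zero, and any individual team's line can be passed to `leadoff_efficiency` with the same x.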

One thing I noticed when writing this article was how many teams were using multiple players in their leadoff spot. Compared to 2007-2009, there were indeed a lot of different players used in the role:

The first column is the average number of games in the leadoff spot for the team leader; the second column is the number of teams that had a player appear in 100 or more games as a leadoff man, and the third is the total number of players with 20 or more appearances in the leadoff spot.

What was unusual in 2010 was not the number of players appearing in twenty or more games, but rather the lack of players that led off in the bulk of their team's games. For now it is just a blip; it will be interesting to see if it remains that, or is indicative of a trend. My guess is the former, but it caught my eye and so I mentioned it here.

Assuming for the sake of discussion that it is the beginning of a trend, one would have to question whether the new approach is working. Leadoff hitters' composite OBA was just two points better than the major league average, the smallest margin since I started tracking it in 2005 (the previous low was six points in 2006). Leadoff hitters were also below-average in a generic RC analysis (4.4 RG versus a ML average of 4.5), and it's tough to believe that represents optimal lineup construction.

Here is a link to a Google Spreadsheet with the data used in this post.

Monday, December 06, 2010

Statistical Meanderings, 2010

This is about as close as I get to writing a Jayson Stark-style piece throughout the course of the year. Since I hate that format, hopefully there will be something of greater interest than sheer trivia here. Most of the statistics mentioned come from my End of Season Stats and are explained in that post:

* Last year the AL/NL scoring gap in terms of R/G was the largest it had been since 1998; this year, at .12 (4.45 to 4.33) it was the narrowest it had been since 1990 (4.30 to 4.20). The overall scoring average of 4.38 was the lowest for the majors since 1992 (4.12); 1992 was also the last time that either of the leagues individually had as low of a scoring rate.

The offensive difference between the AL and NL was largely due to a difference in league batting averages. As a group, the AL and NL had nearly identical walk rates (.095 and .096 walks/at bat) and isolated power (.147 to .144), but the AL BA was five points higher (.260 to .255). The NL slugged just .399, the first sub-.400 league figure since the 1993 NL.

* I list two different winning percentage estimators in my team report. EW% is based on actual runs scored and allowed, while PW% is based on runs created and runs created allowed. Teams whose actual W% were very similar to both of the estimates included (W%, EW%, PW%): Atlanta (.562, .567, .564), Cincinnati (.562, .567, .564), Florida (.494, .501, .495), and Texas (.556, .564, .557).

An interesting group of teams is those whose PW% tracked their actual W% much better than EW% did. These are teams that may be over/underrated for 2011 by those that put a great deal of stock in Pythagorean record as an indicator. Such people are largely strawmen, but regardless, some of the teams in this group are Baltimore (.407, .386, .411), the Cubs (.463, .447, .468), Pittsburgh (.352, .324, .351), and St. Louis (.531, .564, .542). I'll leave it to the reader to find the more conventional Pythagorean watch teams, those whose EW% and PW% are in general agreement and diverge from actual W%.
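A quick way to see which estimator tracked a team better is to compare the absolute misses. This is a minimal sketch using the (W%, EW%, PW%) triples quoted above; it does not compute EW% or PW% themselves, whose formulas are in the End of Season Stats post, and the names are my own.

```python
# (W%, EW%, PW%) as quoted in the post.
TEAMS = {
    "Baltimore": (0.407, 0.386, 0.411),
    "Cubs": (0.463, 0.447, 0.468),
    "Pittsburgh": (0.352, 0.324, 0.351),
    "St. Louis": (0.531, 0.564, 0.542),
}


def estimator_misses(w, ew, pw):
    """Absolute gap between actual W% and each estimate."""
    return abs(w - ew), abs(w - pw)


for team, triple in TEAMS.items():
    ew_miss, pw_miss = estimator_misses(*triple)
    print(f"{team}: EW% off by {ew_miss:.3f}, PW% off by {pw_miss:.3f}")
```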

* Last year, SF games were the lowest scoring in MLB at 7.83 RPG, which was the lowest figure since the 2003 Dodgers. In 2010, 7.83 RPG would have ranked just third-lowest, as Seattle (7.48) and San Diego (7.69) each exhibited a lower scoring context. The Mariners' 7.48 still couldn't touch the 2003 Dodgers at 6.98, but it was the lowest RPG for an AL team since the 1981 Yankees (7.14). Of course, Seattle's RPG was lowered by the .97 park factor, but even after park-adjusting the figure to 7.71, it was still the lowest AL scoring context since the 1989 Angels (7.70, not park-adjusted).
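The park-adjusted figure quoted above is consistent with simply dividing the raw RPG by the park factor. A minimal sketch under that assumption (the function name is my own):

```python
def park_adjusted_rpg(rpg, park_factor):
    """Scale a team's scoring context to a neutral park by dividing by the PF."""
    return rpg / park_factor


# Seattle's 2010 figures from the post: 7.48 RPG with a .97 park factor.
print(round(park_adjusted_rpg(7.48, 0.97), 2))
```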

Commenting on Seattle's offensive ineptitude can be considered hitting after the whistle at this point, but allow me to indulge. Their 3.17 runs/game was the lowest since the 1981 Blue Jays averaged 3.10. Seattle's 2.95 R/G at home was the lowest since 1972, when both the Padres (2.71) and Angels (2.76) scored less. While Safeco had a large home/road split in 2010, the five-year PF is .97--a pitcher's park, yes, but not an extreme one.

They were more respectable on the road, averaging 3.38 R/G, a mark which the Pirates (3.14) managed to keep from being even the lowest in 2010 (although outside of the Pirates' showing, it was the fewest since the 1994 Pirates scored 3.20 away from home). I did not run the numbers relative to league average, but it probably wouldn't do too much to help Seattle; while the AL's average of 4.45 R/G is low relative to recent seasons, it's still a perfectly normal league scoring level in historical context.

* Unfortunately, we never got to see the playoff matchup between New York and Tampa Bay. While the concerns about the Rays running wild in such a series were likely overstated, it is true that the Yankees struggled at controlling the stolen base game. The 85.2% SBA against them was easily the highest in MLB, with the Red Sox (80.1%) next. The Cardinals led baseball with only 58.9% of opposition attempts successful.

* It will come as no surprise that Pittsburgh had a terrible defense in 2010. The degree of their anti-dominance may be a little jarring though: last in BA (by ten points, .283), last in OBA (by eight points, .347), last in SLG (by fourteen points, .451). The only team offense that exceeded any of the Pirates' allowed figures was Toronto, which slugged .456.

The Pirates were also last in innings/start (by .11 innings, 5.38), starters' eRA (by a whopping .62 runs, 5.86), and DER (.659). Their bullpen ranked only fifth-worst in eRA (4.82), and their modified fielding average was third-worst (.962). All of this predictably resulted in allowing 5.4 R/G (more than any team managed to score).

* In 2009, playoff teams averaged +72 runs above average on offense and just +44 on defense. In 2010, the teams exhibited more balance, as you can see:

I'd usually snark about defense winning championships at this point.

* You're probably aware that the long-term trend in MLB, pretty much dating all the way back to 1871, has been for fielding averages to increase. For the most part this holds, but there was an odd blip in 2009. The all-time high ML mFA is .9704, set in 2007. In 2008, the mFA rounded to four decimal places was the same but actually was a bit lower. In 2009, however, mFA dropped all the way to .9669, the lowest since 2001. The decline of .36% was the largest in the post-war era.

In 2010, mFA rebounded to .9693, an increase of .25%. That is still the lowest average (excluding 2009) since 2004. I am not claiming that fielding average is an important metric, or that there is a meaningful explanation for the fluctuations, but in looking at league fielding totals it caught my eye.
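The percentage changes quoted for mFA are relative changes from the prior year's figure, which can be verified in a couple of lines. A minimal sketch using the three values given in the text (names are my own):

```python
# ML modified fielding average by season, as quoted in the post.
MFA = {2007: 0.9704, 2009: 0.9669, 2010: 0.9693}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# 2008 rounded to the same .9704 as 2007, so 2007 stands in for it here.
print(round(pct_change(MFA[2007], MFA[2009]), 2))
print(round(pct_change(MFA[2009], MFA[2010]), 2))
```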

* Major league teams had a .559 W% at home in 2010, the highest mark since 1978 (.573). 93% of major league teams (28/30) had better records at home than on the road, which sounds like a lot, but while high it isn't extraordinary. (San Diego had the same record home and away). The average for 1961-2010 is 83%, but as recently as 2007-08, 29/30 teams have had better home records. In both 1978 and 1989 all teams had better records at home.

Much was made about the Pirates' .210 road W% (17-64), the worst since the identical showing by the 1963 Mets. Also notable was Detroit's home/road split of .642/.358, which was of equal magnitude to that of the Pirates and was the largest by a .500 or better team since the 1996 Rockies (.679/.346). The Rockies and Braves chipped in to give 2010 four of the 23 highest differentials since 1961.

* Cleveland fans seem to be pretty happy with new closer Chris Perez, and given his performance (7th in the AL among relievers with 20 RAR) that's understandable, but it would be a mistake to assume that he's proven himself as the long-term answer at the end of the game. He allowed a low .234 %H, so his 3.72 dRA is well above his 2.10 RRA or 2.86 eRA. The batted ball metrics are even less impressed--4.45 cRA, 4.74 sRA.

* Jonathan Papelbon pitched 67 innings with a 4.24 RRA, which results in 5 RAR; Scott Atchison pitched 60 innings with a 4.16 RRA, for 5 RAR as well. Of course, the similarities end there, as Papelbon's peripherals were much better than Atchison's, but it's never a good thing when your pricy closer is no more effective than the seventh man out of the pen.

* Bobby Jenks has been non-tendered, and obviously I have no insight to offer on his health or his PitchF/x data or anything like that. What I can tell you is that his peripherals were pretty good in 2010: 2.89 dRA (his %H was very high at .365), 3.34 cRA, 2.91 sRA. If he's healthy, he might be a good buy.

* Chad Qualls gave up a massive .397 %H; he actually looks serviceable in dRA (4.26) and the batted ball metrics (4.33 and 3.89).

* Trevor Hoffman was terrible in his swan song, but at least he was consistent across the board in RA estimators: 5.95 RRA, 5.82 eRA, 5.89 dRA, 5.45 cRA, 6.10 sRA. Ryan Madson was consistent in a good way: 2.47, 2.80, 2.88, 2.83, 2.90.

* Last year I pointed out that Francisco Rodriguez didn't pitch very well in the first year of his big contract, so I feel obligated to point out that he was pretty good in his 57 innings in 2010: 2.34 RRA, 3.16 eRA, 3.10 dRA, tied for fifteenth among NL relievers with 16 RAR. Of course, his off-the-field performance took a corresponding nose dive...

* Who is Wilton Lopez? I probably saw fewer Houston games than any other team's this season, so I never saw him pitch. The 27 year-old Nicaraguan rookie ranked among the ten most valuable relievers in the NL (not considering leverage) with a 2.02 RRA in 67 innings and solid, consistent peripherals (3.41 eRA, 3.24 dRA, 3.24 cRA, 3.22 sRA). He was shelled last season in 19 innings, and his strikeout rate (6.7) leaves a lot to be desired, and for the season as a whole he wasn't trusted by Brad Mills, with a below-average Leverage Index. He did inherit .49 runners/appearance, but sometimes a high IR/G goes hand-in-hand with a mop-up role. As if that wasn't enough cold water, his minor league numbers don't look like anything special from a quick glance. It was a nice season in any event.

* I don't list Inherited Runs and Bequeathed Runs Saved on the reports themselves, but if you download the spreadsheets, they are included. The AL leaders in IRSV were Matt Thornton (5.7), Randy Choate (5.1), and Joaquin Benoit (5.0). Dan Wheeler (4.0) also had a particularly good showing from Tampa's pen. Eddie Bonine trailed the AL at -8.4.

Among AL relievers, Dusty Hughes got the least support from subsequent relievers, as 15/25 of his bequeathed runners scored (7.2 BRSV). Lance Cormier benefited the most, as only 1/30 bequeathed runners came around to score (-8.1 BRSV).

In the NL, Wilton Lopez (9.2), Javier Lopez (9.2), and Santiago Casilla (8.7) were the leaders in stranding runners; Lopez allowed just 1/33 to score. A pair of Dodgers were the trailers: George Sherrill (-5.7) and Ramon Troncoso (-10.5 on 22/37).

Apparently Ronald Belisario was one of their victims, as he got the least support from subsequent relievers (4.6 BRSV on 12/24). Joe Thatcher was the most fortunate, as his Padre penmates prevented all 35 runners he bequeathed from scoring (-10.5).

* The flip side to the last bullet point is bequeathed runs saved for starting pitchers. In the AL, Rich Harden got the most support (2/20, -4.2) followed by Jake Westbrook and Jon Lester (both -3.7). Jason Vargas got the least help (12/20, 6.0).

In the NL, Jonathan Sanchez (3/25, -4.9) and a pair of Braves (Derek Lowe, -4.3 and Tommy Hanson, -3.0) were the best-supported by their pens. Scott Olsen (10/16, 5.0), Chris Narveson (4.8), and Kevin Correia (4.6 despite pitching for San Diego with their excellent bullpen) were the least supported.

* Presented without comment: Max Scherzer 37 RAR, Edwin Jackson 25. Clayton Richard 31 RAR, Jake Peavy 17.

* I usually only include pitchers with 15 starts on the starters report, but I had to throw Stephen Strasburg into the mix. Among NL pitchers with 15 starts (plus himself), he ranked twelfth in RA, fourth in eRA, and first in dRA, cRA, and sRA. Obviously that was in just 68 innings, but it was fun while it lasted.

* It's a shame there is no LVP award, as it would be a runaway in both leagues--Ryan Rowland-Smith, -26 RAR in the AL and Charlie Morton -31 in the NL. Rowland-Smith was 1-10 with a 7.83 RRA in 109 innings, and none of his peripheral RAs were much better (6.37 sRA was his best). He was last in the AL in all of the run averages, plus QS% (20).

Morton's season was more respectable, as he pitched 80 innings and gave up a .367 %H, which means his dRA (5.50) and batted ball RAs (5.10 and 4.56) weren't terrible. Teammate Zach Duke was next on the RAR trailer list (-20) and matched Morton with -42 RAA thanks to being allowed to pitch 159 innings.

* Cliff Lee and Jon Lester are very close in most of the categories I list, in addition to both having names that start with "L" and being left-handed. Lee pitched 4 1/3 more innings, with essentially the same RRA (3.55 to 3.52), eRA (3.19), and cRA (3.55 to 3.52). Lee was a little better in dRA (3.08 to 3.26), Lester better in sRA (3.77 to 3.36). Lee's %H was .300 to Lester's .295; Lee made 106 pitches/start, Lester 105; Lee made 64% quality starts, Lester 63%. Both get credit for 21 RAA, while Lee gets one more RAR (51 to 50).

* The bottom 12 starting pitchers in the AL in RAR are: Ryan Rowland-Smith, Brian Bannister, David Huff, Scott Kazmir, Rich Harden, Josh Beckett, Scott Feldman, Jamie Shields, Nick Blackburn, Jeremy Bonderman, Tim Wakefield, and AJ Burnett. The NL's bottom dozen is a much more conventional list of lousy pitchers: Charlie Morton, Zach Duke, Kyle Lohse, Nate Robertson, Manny Parra, Jeff Suppan, Paul Maholm, Bud Norris, Kevin Correia, Craig Stammen, Dave Bush, and John Ely.

* We all know that Texas owes its success largely to pitchers going deep in games, right? Wait, they only averaged 5.87 IP/S, ninth-lowest in the majors? Well, surely that must be because of some bad starts from Rich Harden or something.

Well, here are the P/S (counting relief appearances as half-starts) for the Rangers' starters with 15 or more starts (including starts made for other teams in the case of Cliff Lee): Lee 106, Wilson 104, Lewis 103, Feldman 95, Harden 93, Hunter 85.

Tampa Bay, their playoff opponents: Price 107, Garza 101, Shields 100, Davis 96, Niemann 89.

A few other teams: White Sox: Danks 106, Peavy 100, Buehrle 100, Floyd 97, Garcia 88.

Boston: Lackey 109, Lester 105, Matsuzaka 105, Beckett 103, Buchholz 100, Wakefield 85

Oakland: Gonzalez 102, Cahill 100, Mazzaro 99, Sheets 98, Braden 95, Anderson 95

Angels: Weaver 109, Santana 108, Saunders 100, Pineiro 100, Kazmir 98

Detroit: Verlander 113, Scherzer 106, Porcello 96, Galarraga 95, Bonderman 93

Of course, facts aren't particularly important when you need to rhetorically twist a playoff series into a morality play.

* Earlier in the season, I posted about the number of no-hitters thrown in 2010 and how it compared to expectation based on both the historical frequency of no-hitters and a theoretical probability of a no-hitter. The details are in that post and will not be repeated here.

There were six no-hitters pitched in 2010 out of 4,924 possible games (double-counting games since of course each pitcher has an opportunity to throw a no-hitter). Given the long-term frequency of no-hitters figured by Tom Flesher (.06%), we'd have expected 2.954 no-hitters. The Poisson distribution yields this expected distribution:

I also offered a distribution based on the theoretical probability derived from the overall major league BA (.2573) with 2.734 expected no-hitters (.056%):

Whichever model you choose, there's nothing particularly shocking about six no-hitters in one season, something that has about a 6% chance of occurring by chance. If I wanted to get cute, I'd point out that 6% is about once every twenty years, and the last big no-hitter season was in 1990, but that would be sabermetric malpractice.
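For anyone who wants to rebuild the two distributions, here is a minimal Python sketch of the Poisson calculation, assuming only the figures quoted above (4,924 pitcher-games, the .06% historical rate, and 2.734 expected no-hitters for the BA-based model). The function names are mine, and the exact percentage you get depends on which model you use and on whether you ask for exactly six no-hitters or six or more.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def prob_at_least(k, lam):
    """Probability of k or more events (one minus the lower tail)."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))

GAMES = 4924  # pitcher-games, double-counting each game as in the text

# Expected no-hitters under each model, taken from the text.
models = {
    "historical (.06%)": GAMES * 0.0006,
    "theoretical (BA-based)": 2.734,
}

for name, lam in models.items():
    print(name, "expected no-hitters:", round(lam, 3))
    for k in range(8):
        print("  P(%d) = %.3f" % (k, poisson_pmf(k, lam)))
    print("  P(6 or more) = %.3f" % prob_at_least(6, lam))
```

The loop prints the full table for zero through seven no-hitters, so it can stand in for either distribution.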

* Speed Score leaders and trailers by position (with the caveat that these are based on just 2010 data when Bill James intended them to be based on multiple years):

C: Miguel Olivo (6.1)/Chris Snyder (.7)
1B: Kevin Youkilis (5.5)/Adrian Gonzalez (1.0)
2B: Sean Rodriguez (6.2)/Luis Valbuena (1.7)
3B: Evan Longoria (5.4)/Wilson Betemit (.9)
SS: Rafael Furcal (7.8)/Juan Uribe (2.2)
LF: Carl Crawford (9.1)/Pat Burrell (1.0)
CF: Dexter Fowler (9.1)/Torii Hunter (2.5)
RF: Will Venable (8.6)/Ryan Ludwick (1.9)
DH: Johnny Damon (5.5)/Willy Aybar (.7)

* RGs for Yankee hitters with 300+ PA: 6.8, 6.1, 5.9, 5.8, 5.6, 5.4, 5.2, 4.3, 4.1. Without doing any formal checks, I have to assume that's one of the more balanced league-leading lineups you are likely to see. The Yankees led in team RG with 5.14; the Red Sox were second at 5.08. Their breakdown for 300+ PA hitters was: 7.6, 6.7, 6.5, 6.2, 5.9, 5.1, 4.9, 4.9, 4.2. The range is only about one run wider, but it's more top-heavy.

* Ryan Braun and Prince Fielder had different shapes to their production (Braun had a .305 BA and .289 SEC while Fielder was at .263/.409), but the outcomes were very similar. Braun created 110 runs while making 434 outs for 6.45 RG, while Fielder created 109 runs in 427 outs for 6.49 RG; both were at +36 HRAA. Position adjustments put Braun several runs ahead in the categories where they are included, but it would be tough to find two stars on a team better matched.

* Park-adjusted stats: Matt LaPorta .222/.307/.364, 3.84 RG, 0 RAR. Justin Smoak .215/.305/.365, 3.87, 0. These two make an interesting pairing since both were the centerpiece of a trade of a former Indians left-handed ace. Of course it has been two years since LaPorta was traded to Cleveland while Smoak was traded this summer, but they both will need to improve on those performances to avoid the trades becoming Santana II and III. Personally, I think Smoak is a much better bet--he's younger and LaPorta was nagged by several injuries during 2010.

* The Indians got nearly identical offensive production out of their two primary center fielders, Trevor Crowe and Michael Brantley. Crowe hit .252, Brantley .247. Each had a .299 OBA. Crowe slugged .334, Brantley .328. Add in steals and they both created 3.42 runs/game and were essentially replacement level (1 RAR each). This was actually an improvement over Grady Sizemore, who hit even worse (.212/.264/.291) in his 137 plate appearances.

* Alex Rodriguez had the worst full-time season of his career in 2010. His hitting relative to the league average was a little worse in 1997, but he was playing shortstop at the time and had 40+ more plate appearances than he did in 2010. This is not meant as a condemnation of A-Rod in any way, as he was still a valuable asset and after all is 35 years old.

However, I was surprised by the lack of media excitement about this. Certainly it was pointed out that he was not his old self, but usually anything negative about the man is blown completely out of proportion, and the media could have had a field day if they'd framed the story in a certain light. Why didn't they? Speculation about motivation aside, RBI offer a statistical explanation. Rodriguez batted in 125 runs, most since his MVP campaign in 2007, and the sixth-highest total of his career.

At the same time, Rodriguez' ratio of RBI to RC (using no park adjustment in the latter) was the highest it had ever been at 125/92 = 1.38. His previous full-season high had come in 1999 (111/102, 1.09). For his career, A-Rod's RBI and RC are very close (1855 RBI, 1831 RC), and so he is not a hitter that has a RBI > RC tendency.

Among batters with 300 or more PA, only Pedro Feliz (40/26, 1.51) and Willy Aybar (43/30, 1.45) had a higher ratio than Rodriguez. Ichiro turned in the majors' lowest ratio (43/91, .47). A-Rod's relative lack of production was masked by his RBI count, and if it has anything to do with preventing a media freakout, I'm personally pleased it was.

Matt Klaassen of Fangraphs has written about A-Rod's RC/RBI ratio as well.

* I hadn't noticed how poorly AL shortstops hit until the Silver Sluggers were announced. When I heard "Alexei Ramirez", I was a little surprised. Of course, then I looked at the numbers a little more closely, and it's ugly. These are all AL players with 300+ PA who were primarily shortstops:

None of them managed to match even the AL average RG, which leads to this amusing chart of Silver Slugger winners together with their HRAA:

Admittedly, using average as a baseline is a cheat for shock value in this case.

* Another hitter of historical note whose 2010 wasn't up to his own previous standards was Albert Pujols, at least if you believe some accounts. It has been seemingly common to claim that 2010 was Pujols' worst season, but I beg to differ.

Looking at unadjusted (for league or park) RG, it was only Pujols' fourth-worst season, as he had lower RC rates in 2001, 2002, and 2007. Factoring in league and park, I have his ARG at 206, which was actually a smidge better than his pre-2010 career mark of 204, and puts the season well ahead of the three years already discussed and in the same general pool as 2004-2006. Factor in that Pujols set a career high with 690 PA, and my fielding-less WAR pegs it as the fifth-best season of his career--right in the middle. And still a season strong enough to be worthy of NL MVP honors.

Friday, November 26, 2010

The Curious Case of Brooks Robinson's Batting Runs (rWAR)

Colin Wyers of Baseball Prospectus pointed this out to me, and neither he nor I have an explanation for it. Rally's WAR estimates have become the most widely used on the internet, especially since they are available at Baseball-Reference. However, some of the batting runs figures don't make a whole lot of sense, and the specific player that Colin brought to my attention was Brooks Robinson.

Rally lists Robinson with a career total of 20 batting runs (above average). That figure does not include baserunning (0 runs) or GDP runs (-35 runs) or reached on error runs (-2 runs), so I will omit those areas of the game from my estimates which follow. The 20 batting runs seems awfully low. My own crude ERP-based estimate is 154 batting runs with park adjustment, 113 without (I estimate Robinson's career season-weighted PF to be .97, meaning he played in moderate pitcher's parks on average). Colin's estimate is 84 batting runs without park adjustment. Pete Palmer's estimate from the 2005 ESPN Baseball Encyclopedia (which does include base stealing) is 53 batting runs. Wyers used OPS+ to generate a crude estimate of +57.

When I was initially discussing this on Twitter, it completely slipped my mind that my figures were comparing Robinson to all league hitters (including pitchers for 1955-1972). Palmer's estimate and OPS+ exclude pitchers from league totals, and they are the closest to Rally's. Even so, a thirty-run difference is fairly large when dealing with offense.

In order to understand why we see discrepancies, it makes sense to attempt to replicate Rally's approach. His explanation of batting runs allows us to get a sense of his process:

Bat runs - This is park adjusted linear weights batting runs, using customized weights at the team level to ensure that total runs credited to players will equal the actual runs scored for that team.

While it is not specified in the quoted entry, Rally has explained elsewhere that he uses Base Runs to generate the linear weights. Since the weights are set so as to ensure that team BsR is equal to actual team runs, I'm going to assume that he's achieving this through the use of a custom B multiplier for each team-season.

I attempted to mimic this through use of a BsR formula that only considered the basic batting events--singles, doubles, triples, homers, walks, and at bats. This formula is far from the most accurate BsR equation ever devised, but it should perform well enough in this role:

A = H + W - HR
B = 2TB - H - 4HR + .05W
C = AB - H
D = HR

Using this equation, I calculated the B multiplier needed for each Orioles season to make BsR = actual runs scored. Then I calculated the corresponding intrinsic weights for each team-season, and used these to estimate Robinson's runs created. From there, I estimated Batting Runs by taking RC - Avg(RC/Out)*Robinson outs.

Using this approach, I estimate that Robinson contributed 107 batting runs (without accounting for pitchers and without a park adjustment).
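The procedure described above can be sketched in Python. This is a rough illustration rather than the spreadsheet actually used: it assumes the standard Base Runs form, BsR = A*B/(B + C) + D, applies the team multiplier to the whole B factor, and takes the intrinsic weights as the partial derivatives of that formula. The team line, the player line, and the league RC/Out figure are all made up for the example.

```python
# Contribution of one event to the (A, B, C, D) factors, read off the
# formula in the text: A = H + W - HR, B = 2TB - H - 4HR + .05W,
# C = AB - H, D = HR. "OUT" means an at bat without a hit.
EVENTS = {
    "1B": (1, 1.0, 0, 0),
    "2B": (1, 3.0, 0, 0),
    "3B": (1, 5.0, 0, 0),
    "HR": (0, 3.0, 0, 1),
    "W": (1, 0.05, 0, 0),
    "OUT": (0, 0.0, 1, 0),
}

def factors(counts, b_mult=1.0):
    """Return the A, B, C, D factors, with B scaled by b_mult."""
    a, b, c, d = (sum(EVENTS[e][i] * n for e, n in counts.items())
                  for i in range(4))
    return a, b_mult * b, c, d

def base_runs(counts, b_mult=1.0):
    a, b, c, d = factors(counts, b_mult)
    return a * b / (b + c) + d

def fit_b_multiplier(counts, actual_runs):
    """Solve A*m*B/(m*B + C) + D = actual_runs for the multiplier m."""
    a, b, c, d = factors(counts)
    target = actual_runs - d
    return target * c / (b * (a - target))

def intrinsic_weights(counts, b_mult):
    """Partial derivative of BsR with respect to each event."""
    a, b, c, d = factors(counts, b_mult)
    weights = {}
    for event, (ea, eb, ec, ed) in EVENTS.items():
        eb *= b_mult
        weights[event] = (ea * b * (b + c) + a * (c * eb - b * ec)) / (b + c) ** 2 + ed
    return weights

def batting_runs(player, weights, lg_rc_per_out):
    """RC from the custom weights, less league-average RC for the same outs."""
    rc = sum(weights[e] * n for e, n in player.items())
    return rc - lg_rc_per_out * player["OUT"]

# Hypothetical team-season and player line (not real Orioles data).
team = {"1B": 1020, "2B": 250, "3B": 30, "HR": 150, "W": 550, "OUT": 4050}
m = fit_b_multiplier(team, actual_runs=800)
w = intrinsic_weights(team, m)
player = {"1B": 110, "2B": 28, "3B": 3, "HR": 18, "W": 50, "OUT": 430}
print(round(m, 3), {e: round(v, 3) for e, v in w.items()})
print(round(batting_runs(player, w, lg_rc_per_out=0.0), 1))
```

One useful check on the weights: because the formula is homogeneous in the event counts, the team's own events multiplied by its intrinsic weights add back up to the team's BsR, which the fitted multiplier has forced to equal actual runs.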

In order to better mimic Rally's approach, I needed to remove pitcher hitting from the league total. To do this, I used the BsR formula to estimate intrinsic linear weights for each league-season, then figured the league RC/O for non-pitchers (I used a spreadsheet published by Terpsfan101 to get the non-pitcher totals). Using those figures as the baseline for Robinson, I got an estimate of 30 batting runs, which isn't that far off of Rally's. However, when a park adjustment is applied, it shoots back up to 71 runs, which is much closer to the Palmer and Wyers estimates.

More concerning was another curiosity that Wyers noted--the 1969 season. Robinson's Orioles are credited with a team total of 40 batting runs. However, they average 4.81 R/G in a league with an average of 4.09 runs, which means that they scored about 117 runs more than average. That's nearly an 80 run discrepancy!
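The arithmetic behind that discrepancy is short enough to write out. The sketch below assumes a 162-game schedule (my assumption; the post gives only the per-game rates) and uses the 40 batting runs listed for the team.

```python
# Figures quoted in the text for the 1969 Orioles.
team_runs_per_game = 4.81
league_runs_per_game = 4.09
listed_batting_runs = 40

GAMES = 162  # assumed schedule length, not stated in the post

# Runs scored above a league-average team over the season.
runs_above_average = (team_runs_per_game - league_runs_per_game) * GAMES

# Gap between that figure and the listed team batting runs.
discrepancy = runs_above_average - listed_batting_runs

print(round(runs_above_average), round(discrepancy))
```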

It gets even more confusing when one looks at the league total listed at Baseball-Reference for the 1969 AL--a whopping -685 batting runs. I have no idea whether this is a problem with B-R's implementation of Rally's method or something else, but it obviously is an error of some sort.

What is different between Rally's figures and my attempts to replicate them? Obviously, if we knew for sure this exercise wouldn't be necessary, but it's safe to assume that:

1. Rally is using a different (and probably better) BsR equation than I am
2. His park factors and mine are probably similar, but surely there are differences
3. Rally may be incorporating some additional categories that I've ignored (intentional walks and sacrifices)

However, the likelihood that all of those differences work against Robinson and account for the difference is not that great (not to mention that the 1969 league figures are illogical). I feel a little guilty posting this without first consulting Rally about it, but I did not have his contact information. He's a good sabermetrician and it is quite possible that I am missing something here--but I do think there is enough smoke to warrant some further explanation.

Monday, November 22, 2010

Meanderings

What follows is a very lightweight post, even for one of this nature.

* I have a half-written post somewhere about the generation gap in sabermetrics between people who got into the discipline prior to the explosion of online sources and those who started at some time after that. I've never finished it or posted it because it's not about baseball--it's about sabermetrics, and because one could easily read it as self-aggrandizing (and perhaps even as a sign of old fogeyism). But the themes have manifested themselves a little bit in the reaction to Felix Hernandez winning the Cy Young.

I'm not crazy about looking at the BBWAA votes for an award as any kind of triumph or defeat for sabermetrics, but if you are inclined to view it in those terms, it's tough to see how Hernandez' win was anything but a victory for the discipline. The win craze in Cy Young voting may have passed its zenith once the Stone/Vuckovich/Hoyt selections stopped in the early 1980s, but it never fully died, not with Jack McDowell in 1993 or John Smoltz in 1996 or Bartolo Colon in 2005. The shiniest W-L may not have been the strong Cy indicator it once was, but a good W-L record was still necessary to get a seat at the BBWAA table (provided the pitcher in question was a starter). It was unprecedented that a 13-12 pitcher would get serious consideration.

It's absolutely true that one didn't need FIP or xFIP or SIERA to make a case for Felix Hernandez; ERA, innings pitched, and strikeouts, which have been kept for the last century, were sufficient to make one consider that Hernandez might have been the league's outstanding hurler. Still, it should not be forgotten that the notion that ERA and strikeouts and the like were useful indicators is one embraced by sabermetrics, that had many fewer adherents pre-James than it did in 1990, and many fewer adherents in 1990 than it did in...well, you get the idea.

But for certain members of the community (largely peripheral members, i.e. not the people authoring sabermetric blogs or engaging in their own research), generally those that fall into what could be called (uncharitably to the site) the "Fangraphs generation" of saberites, the notion that actual runs allowed is an acceptable tool by which to evaluate starting pitchers is foreign, as foreign as the notion that W-L was the key evidence was to my generation of sabermetricians.

* Any skirmishes about the baseball awards are a garden party compared to the battle being waged over Horse of the Year between Zenyatta and Blame. I would vote for the latter without a moment's hesitation, and I've yet to see a coherent argument for Zenyatta that is based solely on her 2010 performance. The Zenyatta crowd talks about her "transcending racing" (it's not a popularity contest), or about how she should have won in 2008 or 2009 (arguable, but wrong I believe, and irrelevant to a 2010 award), or about her accomplishments in 2008 and 2009 (beyond irrelevant). Blame ran a more ambitious campaign, beat better horses more times, beat Zenyatta head-to-head, had better speed figures, won more money, and ran exclusively on dirt and at classic distances.

Hernandez/Sabathia is actually not a bad comparison--Sabathia pitched well and would hardly have been the worst selection in the award's history--but outside of W-L record, it was hard to find an area in which he had Hernandez beat. Outside of the fact that she's Zenyatta, it's hard to find an area where she had Blame beat. To the same degree that I was reasonably confident that Felix Hernandez was the best AL pitcher in 2010, I'm reasonably confident that Blame was the best North American thoroughbred of 2010.

* It now looks as if the expanded playoff format is an unstoppable train. Writing on the idea in an earlier post, I said "In this case, not only do I consider the idea stupid, but it would seriously dampen my own enthusiasm for the playoffs."

Reading it back, I realize that was an overreaction. I don't like the idea of an extra wildcard team any more today than I did then, but I do realize that the likelihood of my enthusiasm for the playoffs being dampened is next to zero. If anything, I'll probably be happy to have a few extra games to watch. The allure of the game is too strong, and to make bold statements about my own ability to resist is self-flattery. I'll object with my head, but I'll tune in and I'll like watching the games if not agreeing with their existence--and so will others, and everyone will make money.

Also, it is worth noting that even with ten playoff teams, MLB will still have the lowest proportion of playoff teams among the big four US leagues.

Tuesday, November 16, 2010

IBA Ballot: MVP

I don't see any slam dunk choice for the AL MVP. My initial RAR numbers have Miguel Cabrera at 74, Jose Bautista 71, Josh Hamilton 68, and Robinson Cano 64. Adding in a crude fielding estimate ((UZR + Dewan's RS)/4) puts Hamilton in the lead at 72, followed by Cabrera 70, Bautista 70, Cano 66, and Longoria 62. Hamilton is also hurt by the fact that the initial RAR considers him a left fielder, but he actually played 22% of his innings in center. Refiguring his position adjustment to take this into account, his offense-only RAR is bumped up by a run, leaving him at 73 total.

It also stands to reason that Hamilton contributed as much or more on the bases than his competitors--BP's EqBRR less stolen base runs (steals are already accounted for in my RC formula) has Hamilton +2, Cabrera 0, Cano +1, Bautista -1, and Longoria +3, and thus only increases Hamilton's insignificant edge. It's not a factor that I consider, but Hamilton will almost certainly win the BBWAA award as he played for a playoff team and Cabrera did not.

There's one player left to consider before handing the award to Hamilton--Felix Hernandez. Hernandez' 76 RAR is definitely comparable to Hamilton's grand total of 75 RAR. However, Hernandez' peripherals are not quite as brilliant as his actual runs allowed, and while I have no qualms about choosing a pitcher as MVP, I like it to be a somewhat clear choice. Since the one run difference in RAR is meaningless and the evidence suggests that Hernandez is getting credit for a fair/favorable runs allowed rate, I can't justify going with him.

The bottom of the ballot is just a matter of mixing in the top starting pitchers with the position players, for whom I see little reason to deviate from RAR ranking. The exception is Paul Konerko, who is at 55 RAR but frowned upon by the fielding metrics (-8) and is in front of a bunch of guys who I think most people would agree bring a lot more to the table in every area except batting (Adrian Beltre, Joe Mauer, Shin-Soo Choo, Carl Crawford). I would love to be able to justify getting Choo onto my ballot, but Carl Crawford ranks as his equal at the plate and adds more on the field and the basepaths:

1) LF Josh Hamilton, TEX
2) 1B Miguel Cabrera, DET
3) SP Felix Hernandez, SEA
4) RF Jose Bautista, TOR
5) 2B Robinson Cano, NYA
6) 3B Evan Longoria, TB
7) 3B Adrian Beltre, BOS
8) SP CC Sabathia, NYA
9) SP Jered Weaver, LAA
10) LF Carl Crawford, TB

The battle for top position player in the National League can be fairly safely restricted to three first basemen: Albert Pujols (82 RAR), Joey Votto (71), and Adrian Gonzalez (69). Next on the RAR list is Matt Holliday (61). Pujols has a sizeable lead over Votto in my RAR figures, one that may surprise a lot of readers at first glance, and even I was surprised at the margin.

Looking at their unadjusted batting lines, Votto (.324/.420/.600, 8.8 RG) appears to have the slight offensive edge over Pujols (.312/.414/.596, 8.6 RG). However, Pujols still has a four-run cushion in RAR thanks to an extra nine games played and 52 PA. When park is taken into account, Votto (.319/.414/.591, 8.6) and Pujols (.317/.421/.605, 8.9) essentially exchange raw stat lines with one another.

Consider that over the last five seasons, St. Louis's average RPG is 8.8 at home and 9.4 on the road. Cincinnati's split is 9.6/9.1. The parks have played as something close to mirror images of one another. Of course park factors can't capture all of the potential influences on those figures--team construction, year-to-year weather fluctuations, chance, etc.--but I don't think it's outlandish to suggest, as my park factors do, that the overall run environment in which Cincinnati plays its schedule is 6% higher than that of St. Louis.
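To see how home/road splits like these can translate into a figure around 6%, here is a back-of-the-envelope Python sketch. It is not the park factor method used for the numbers in this post; it simply assumes that each team plays half its games at home, so the home/road ratio is averaged with 1.00, and it skips any regression or adjustment for the other parks in the league.

```python
def half_home_factor(home_rpg, road_rpg):
    """Crude run environment factor for a team's whole schedule.

    Assumes half the games are at home (weighted by the home/road ratio)
    and half on the road (weighted as a neutral 1.0).
    """
    return (home_rpg / road_rpg + 1.0) / 2.0

# Five-year home and road RPG figures quoted in the text.
st_louis = half_home_factor(8.8, 9.4)
cincinnati = half_home_factor(9.6, 9.1)

# Cincinnati's run environment relative to St. Louis's.
relative = cincinnati / st_louis
print(round(st_louis, 3), round(cincinnati, 3), round(relative, 3))
```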

Maybe you don't trust the park adjustment. Maybe you'd prefer to look at each player's performance in the actual run context of his team in 2010, rather than the idealized league average context offered by park adjustments. There are drawbacks to such an approach, most notably that it assumes that each team is equally strong offensively and defensively, but there's an argument to be made that it captures value more effectively than does the neutralization approach. (Bill James made this argument using a fictional Jim Rice as an example in the original Historical Baseball Abstract, and it's something that I intend to ruminate on at some point).

Cincinnati games saw an average of 9.1 runs in 2010 (or 4.55 per team); St. Louis 8.5 (4.25); and throwing in San Diego for good measure, 7.69 (3.85). Using those figures as the new league average, and refiguring HRAA, RAR, and ARG (RG relative to average), the three come out:

Pujols: 70 HRAA, 79 RAR, 203 ARG
Votto: 63, 72, 194
Gonzalez: 53, 61, 184

I have no choice but to conclude that Pujols was the superior offensive player--to the extent that the tools being used capture reality. You can knock a few runs off of Pujols' figure for excess intentional walks, if you'd like, but it's not enough to make the gap disappear. Factoring in other areas of the game doesn't figure to do much to boost Votto--Pujols has a good fielding reputation and a track record of good performance in metrics, although this year the two are both rated as just about average by both UZR and RS, with a one run edge for Votto. BP's figures have Pujols as a +5 baserunner, Votto average.

To swing the comparison in Votto's favor, you either need to put stock in a metric like WPA (Votto was +7, Pujols +5.4) or give Votto a bonus because his team bested Pujols' for the division crown. I do neither.

The other interesting comparison is Pujols v. Halladay. Both have 82 RAR initially, but Pujols would actually pick up a few runs for fielding and baserunning, while Halladay would have to lose a tick for his hitting (-1 RC). Factor in the peripheral issue discussed re: Hernandez, and I favor Pujols. This is the second time in three years that I have listed Halladay second on a MVP ballot (last time, in the 2008 AL, he was ahead of the position players but lost out to Cliff Lee).

Adam Wainwright and Ubaldo Jimenez are also deserving of prominent positions on the ballot. Among the down ballot position players, I allow fielding to have just enough influence to push Troy Tulowitzki ahead of Carlos Gonzalez for Most Valuable Rockie, and to put Ryan Zimmerman ahead of some others (Dan Uggla, Jayson Werth, Hanley Ramirez, David Wright, not Ryan Howard):

1) 1B Albert Pujols, STL
2) SP Roy Halladay, PHI
3) 1B Joey Votto, CIN
4) SP Adam Wainwright, STL
5) SP Ubaldo Jimenez, COL
6) 1B Adrian Gonzalez, SD
7) LF Matt Holliday, STL
8) 3B Ryan Zimmerman, WAS
9) SS Troy Tulowitzki, COL
10) LF Carlos Gonzalez, COL

Monday, November 08, 2010

IBA Ballot: Cy Young

For the Cy Young award, I generally do not consider hitting, although this is more sheer laziness than any strongly held belief that non-pitching aspects of the game shouldn't count towards the Cy. For the majority of pitchers it doesn't really matter (and fielding is at least included in Run Average, even if jumbled up with the other eight guys' glove work).

My suspicion is that the AL Cy will be the award for which my choices most differ from the sabermetric consensus, as I don't make DIPS metrics a primary consideration. My #1 choice is not one of those differences, though. I'll get into the other candidates a bit below, but assume for the sake of argument that the top two candidates are Felix Hernandez and CC Sabathia. Hernandez bests Sabathia in every single category I list on my pitcher report, albeit not always by significant margins:

* Hernandez pitched more innings (249.2 to 237.2)
* Hernandez had a lower ERA (2.34 to 3.12) and a lower RRA (2.96 to 3.39)
* switching to the more traditional RA estimators, Hernandez had a lower eRA (2.97 to 3.48) and a lower dRA (3.50 to 3.82)
* using batted ball inputs, Hernandez had a lower cRA (3.56 to 3.61) and a lower sRA (3.36 to 3.91)
* Hernandez also had a higher percentage of quality starts, which considering it's quality starts and doesn't include a park adjustment isn't something I'd stress, but he leads Sabathia 85% to 76%.
* So of course Hernandez has the margin in RAA (41 to 28) and RAR (76 to 61)

After Felix, it gets a little less clear--Sabathia leads a pack of five pitchers (Jered Weaver, David Price, Clay Buchholz, and Justin Verlander) separated by just five RAR, with two other pitchers cited as candidates (Cliff Lee and Jon Lester) within another five runs of them. Since a lot of saber-minded people consider similar metrics, I added a dRAR column, based on dRA (my BsR application of DIPS). It requires a new innings pitched figure, dIP, which can be figured as (1 - e%H - %W - %HR)*PA/2.84 (see this post for an explanation of the inputs):
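As a quick illustration of the dIP formula just given, here is a small Python sketch. I'm treating e%H, %W, and %HR as rates per plate appearance, which is my reading of the formula rather than something spelled out here, and the pitcher line is invented.

```python
def dips_innings(pa, e_h_rate, w_rate, hr_rate):
    """dIP = (1 - e%H - %W - %HR) * PA / 2.84, as given in the text.

    The three rates are assumed to be per plate appearance, so the
    parenthesized term is the share of PA that end in outs.
    """
    return (1.0 - e_h_rate - w_rate - hr_rate) * pa / 2.84

# Hypothetical pitcher: 1,000 PA, 21% expected hits, 7% walks, 2% homers.
print(round(dips_innings(1000, 0.21, 0.07, 0.02), 1))
```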

I'm sure you'll see a lot of sabermetric ballots that list Felix #1, but then turn to Lee and Liriano due to their strong showing in the DIPS metrics. For me, they are a secondary consideration, enough to move a pitcher ahead of one a few RAR better, but not enough to turn the ballot upside down. Actual runs allowed contain many biases, but they also carry real and important data (at least from a retrospective value perspective) about sequencing (in addition to the more muffled signals about BABIP). It is also worth noting that when batted ball data is considered (another potential minefield, certainly), a pitcher might give back the advantage dRA indicates (Verlander is the best example here, as his sRA (SIERA-style) is 4.09). Weighing all of the metrics very unscientifically, but giving preference to RAR based on actual runs allowed, this is how I see it:

1) Felix Hernandez, SEA
2) CC Sabathia, NYA
3) Jered Weaver, LAA
4) Cliff Lee, TEX
5) David Price, TB

In the National League, I expect to see much more of a consensus as many of the top candidates have peripherals less impressive than their actual runs allowed rate. Unlike the AL in which pitchers like Lee and Liriano have much better DIPS numbers, in the NL a lot of the top starters move in the same direction. Roy Halladay's dRA may be .84 runs higher than his RRA, but Adam Wainwright's is .72, Ubaldo Jimenez's .52, Tim Hudson's a whopping 1.87, Roy Oswalt's .84, Matt Cain's .87, Cole Hamels' .83...this allows us to sideline the ideological debates to a greater extent.

Roy Halladay is the obvious #1 choice, trailing only Josh Johnson in RRA while pitching twenty innings more than anyone else and 67 more innings than Johnson. Not that he should care, but this is actually the first time I've personally ranked Halladay as the top starter in his league. When he won the Cy in 2003, I favored Pedro Martinez or Tim Hudson. In 2005 he was on his way to another Cy Young when he was injured; pitching just 141 innings he still would have ranked second on my ballot. In 2006 he lost a few starts in September and was again second to Johan Santana by my reckoning (although unlike in 2005, Santana was still on pace to earn my vote without Halladay's injury). In 2008 he was second by a slim margin to Cliff Lee, a pitcher he'd eventually become inextricably linked to. In 2009 he had a season so good it would almost always win my vote, but Zack Greinke had to go and turn in the season of the decade.

None of that is to put down Halladay, or say that the 2003 award the BBWAA bestowed upon him was a poor choice (while I favored Pedro, Halladay was a thoroughly defensible choice as well). Rather, it should serve to illustrate how consistently good he's been, and how close he has come to winning three or four Cy Youngs.

After park adjustments, Wainwright and Jimenez are impossibly close--they have the same RA (2.74), nearly the same RRA (2.74 to 2.68), similar ERAs (2.50 to 2.67), the same QS% (76), the same RAA (41) and essentially the same RAR (72 to 71). Jimenez looks a little better in the traditional peripheral RAs, while Wainwright looks better in the (not park-adjusted) batted ball RAs. Flip a coin, because you can't go wrong.

Tim Hudson actually has a below-average dRA thanks to a .249 BABIP, but his batted ball metrics look a little better and there's no obvious candidate to replace him. Josh Johnson had an outstanding year, but pitching forty innings less than his competitors consigns him to fifth place:

1) Roy Halladay, PHI
2) Adam Wainwright, STL
3) Ubaldo Jimenez, COL
4) Tim Hudson, ATL
5) Josh Johnson, FLA

Monday, November 01, 2010

IBA Ballot: Rookie of the Year

Over the next few weeks I'll be posting the ballots I submitted to the Internet Baseball Awards, hosted by Baseball Prospectus. While I think too much is made about the post-season awards in general, I also can't deny that they are fun to discuss. Additionally, they present the opportunity to put theories about how to compare players' performance into action, and thus have the potential to stimulate a lot of interesting research and philosophical discussion that can be applied to more general questions. (To be clear, that potential has not been fulfilled here.)

I approach the ROY the same way I do the MVP, except limited to rookies of course. I don't consider age, expected future production, or any related factor. I'm perfectly happy to vote for a 34 year old Japanese reliever if they were one of the five top-performing rookies.

Throughout my award posts, the fielding numbers I use are based on the average of Dewan's Runs Saved and Lichtman's UZR, regressed 50% towards zero; or (RS + UZR)/4. Looking at multiple metrics and regressing does not completely alleviate concerns about fielding metrics, but I would not feel comfortable throwing them out completely.

In the AL, the top position player is Austin Jackson, who I have at 27 RAR before fielding. Dewan's system loves his fielding (+21); UZR is not as enthusiastic (+4), but it's enough to push him past Brian Matusz for the top spot on my ballot. Matusz had 29 RAR, while Wade Davis had 30, but Matusz' DIPS/batted ball estimators are stronger, and so that puts him ahead on my ballot.
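As a quick check on the arithmetic, here is a minimal sketch of the fielding blend described above, applied to Jackson's two ratings. The function name is made up for illustration, and adding the result to his 27 RAR assumes that figure does not already include fielding (which would square with the 33 RAR quoted for him below).

```python
def blended_fielding_runs(runs_saved, uzr):
    """Average the two metrics, then regress 50% toward zero: (RS + UZR) / 4."""
    average = (runs_saved + uzr) / 2
    return average * 0.5


jackson_fielding = blended_fielding_runs(21, 4)
print(jackson_fielding)  # 6.25

# Assumption: the 27 RAR quoted for Jackson excludes fielding.
print(27 + jackson_fielding)  # 33.25
```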

Neftali Feliz is getting some buzz as a candidate, thanks to his saves. I have seen Andrew Bailey's victory last year cited as a reason to support Feliz, but of course, that's a red herring. The comparison should not be of Feliz to a similar past winner, but of Feliz to this year's crop. Without considering leverage, it's not entirely clear that he deserves to rank ahead of another rookie reliever, Daniel Bard. With Jackson at 33 RAR, one would have to give Feliz a leverage multiplier of 1.57 to get them even. His 1.74 LI would suggest a multiplier of about 1.37, which brings him to 29 RAR, roughly equal to Matusz and Davis. I'm still uncomfortable with ascribing that much weight to LI and boosting a reliever who pitched 69.1 IP over a batter with 398 PA playing a demanding position (John Jaso).
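A rough sketch of that leverage arithmetic follows. Two things in it are assumptions rather than figures taken from the text: the multiplier is taken to be (1 + LI)/2, which happens to reproduce the 1.37 above, and Feliz's unleveraged RAR is backed out of the 1.57 break-even number rather than quoted directly.

```python
def leverage_multiplier(li):
    # Assumed convention: credit the reliever with half of his extra leverage.
    return (1 + li) / 2


jackson_rar = 33
breakeven_multiplier = 1.57

# Implied unleveraged RAR for Feliz (inferred from the break-even figure).
feliz_base_rar = jackson_rar / breakeven_multiplier

multiplier = leverage_multiplier(1.74)
feliz_leveraged_rar = feliz_base_rar * multiplier

print(round(feliz_base_rar, 1))    # roughly 21
print(round(multiplier, 2))        # 1.37
print(round(feliz_leveraged_rar))  # 29
```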

Jaso will probably be overlooked by a lot of people, but a catcher with a .376 OBA is nothing to sneeze at. Danny Valencia played well, but he had nearly 70 fewer PA than Jaso, didn't have a significant offensive rate advantage (5.6 to 5.3 RG), and while it doesn't matter retrospectively, his offensive value was largely BA-driven (.313 BA, .205 SEC). I have it:

1) CF Austin Jackson, DET
2) SP Brian Matusz, BAL
3) SP Wade Davis, TB
4) C John Jaso, TB
5) RP Neftali Feliz, TEX

In the National League, the race comes down to Heyward and Posey, so I'll set them aside for a moment to discuss other candidates. Neil Walker checks in at 31 RAR, but -5 fielding and the possible over-adjustment for second basemen in my RAR methodology knock him off the ballot. Ike Davis was the best rookie first baseman, as far as I can tell, on the basis of his superior OBA to Gaby Sanchez (.358 to .337) and high fielding marks. Chris Johnson's fielding is estimated at -5, which is enough to knock him out of contention, while Starlin Castro's season is more impressive for his age (20) than his performance (albeit not bad at all, 10 RAA and 23 RAR).


Among pitchers, Jaime Garcia stands out at 35 RAR. He will be somewhat overrated by mainstream analysis as just 77% of his runs allowed were earned, the lowest percentage of any NL starter. A 3.60 RRA is quite respectable, though, and his peripherals are similar. Madison Bumgarner was very good as well, turning in 28 RAR in just 110 IP.

I side with Heyward over Posey, largely on the basis of playing time: Heyward played 142 games and batted 611 times, while Posey played 108 games and batted 436 times. It also is important to note that Posey played 35% of his innings at first, which lowers his RAR to 33 versus Heyward's 42. After making that adjustment, Heyward's RG relative to position is 131 versus Posey's 140, which really cuts into Posey's rate stat advantage. Yes, it would have been nice if Posey had spent the whole season in the majors, but Brian Sabean prevented him from contributing for two months, and thus made this an easier choice for me than it seems to be for many others:

1) RF Jason Heyward, ATL
2) C Buster Posey, SF
3) SP Jaime Garcia, STL
4) 1B Ike Davis, NYN
5) SP Madison Bumgarner, SF

Tuesday, October 26, 2010

The Two Best Events in Sports

You can have the Super Bowl, the Stanley Cup, and the NCAA Tournament (except for the games involving my alma mater). Take the Olympics, the Masters, and the World Cup (please take the World Cup, I beg you). Just leave me the World Series and the Breeders' Cup.

Those two events are by far the most compelling (IMO should have gone without saying) in all of sports. Coincidentally, they both occur in autumn, sometimes even overlapping. This year, barring a horrible streak of rainouts, the World Series will have concluded before the horses hit the track at Churchill Downs, but with both events coming up I will bore you with a few of my stray thoughts.

* I do not have a rooting interest when it comes to who wins the World Series, seeing as I don't particularly care for either club. I do have a rooting interest in the series though--rooting for seven games. There has not been a World Series game seven since 2002--also a series that matched San Francisco against an AL West club, apropos of nothing.

If one assumes that the outcomes of each game of a series are independent, and that both teams are equally matched with constant strength and no home field advantage, then the probability of a series of length N is as follows (negative binomial distribution):

4 = 12.5%
5 = 25%
6 = 31.25%
7 = 31.25%

The probability of going seven years without a seventh game is (1 - .3125)^7 = 7.3%; it's not a particularly likely streak, but it's not remarkable either. I'd still like to see it come to an end in 2010.
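For anyone who wants to verify those figures, here is a minimal sketch under the same assumptions (independent games, constant strength, no home field advantage). The function name and parameters are just for illustration.

```python
from math import comb


def series_length_prob(n, p=0.5, wins_needed=4):
    """Probability that a best-of-seven ends in exactly n games.

    The eventual winner takes game n after winning wins_needed - 1
    of the first n - 1 games; either team can be the winner.
    """
    q = 1 - p
    k = wins_needed
    ways = comb(n - 1, k - 1)
    return ways * (p**k * q**(n - k) + q**k * p**(n - k))


for n in range(4, 8):
    print(n, series_length_prob(n))

# Chance of seven straight World Series without a seventh game.
p_seven = series_length_prob(7)
print(round((1 - p_seven) ** 7, 3))  # 0.073
```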

* I don't believe there's much value in handicapping a seven-game series, but I would give Texas an edge, something like a 55% chance of winning. There's even less value in doing so on the basis of full-season team records, but I'll proceed for the sake of discussion.

San Francisco had a better actual W% (.568 to .556) and expected W% (based on runs scored and allowed, .581 to .564), but Texas had the edge in predicted W% (based on runs created and runs created allowed, .557 to .543). However, these comparisons don't take into account strength of schedule, which can be a significant factor between the unbalanced schedule and the AL/NL imbalance.

My crude ranking system (yet to be published, as it will take a long, boring post to explain it) gives the Rangers the edge on two of three comparisons when SOS is taken into account. Based on W/L, Texas has a rating of 121 to San Francisco's 118 (or a 51% chance of winning a seven-game series with no HFA). Based on R/RA, San Fran leads 129 to 119 (54%). Based on RC/RCA, it's Texas 123 to 110 (56%). Considering that, I think 55% is a reasonable estimate.
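The ranking system itself is not described here, so the sketch below leans on an assumption: that the ratio of two ratings can be read as the ratio of the teams' expected wins against each other. Under that assumption, the standard best-of-seven calculation lands on the series percentages quoted above. The function names are hypothetical.

```python
from math import comb


def game_win_prob(rating_a, rating_b):
    # Assumption: rating ratio = expected win ratio in a head-to-head game.
    return rating_a / (rating_a + rating_b)


def series_win_prob(p, wins_needed=4):
    """Chance of winning a best-of-seven with constant per-game probability p."""
    q = 1 - p
    # Win the clinching game after dropping k of the earlier games.
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
        for k in range(wins_needed)
    )


comparisons = {
    "W/L (TEX over SF)": (121, 118),
    "R/RA (SF over TEX)": (129, 119),
    "RC/RCA (TEX over SF)": (123, 110),
}
for label, (leader, trailer) in comparisons.items():
    p = game_win_prob(leader, trailer)
    print(label, round(series_win_prob(p), 2))
```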

* From a preseason perspective, when's the last time there was a more surprising World Series than TEX/SF? I intend that as a rhetorical question, as the answer depends on your own perspectives on the teams before the season. For me, it's probably the most surprising since 2005. I picked Texas second in the AL West and San Francisco fourth in the NL West.

There have been other pennant winners that I did not pick for the playoffs (a long list, in fact, owing to both my misjudgments and the inherent inaccuracy of the exercise), but I had picked both 2005 pennant winners to finish fourth in their division, so that one stands out.

While record in the preceding season is far from a perfect measure of preseason expectations, it might be instructive to look at the combined previous season W% of the two World Series participants. In the expansion era (1961-), the average pennant winner played .550 ball in year X-1. Both TEX (.537) and SF (.543) were below average, although not by a huge margin. Combined, their .540 W% ranks 31st out of the 49 World Series.
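A small sketch of the combined figure for this year's matchup, using the two clubs' 2009 records (87-75 for Texas, 88-74 for San Francisco). Pooling wins and losses is an assumption about how the combination is figured; a simple average of the two W% values gives nearly the same answer.

```python
def combined_w_pct(records):
    """Pool wins and losses across teams and return the combined W%."""
    wins = sum(w for w, _ in records)
    losses = sum(l for _, l in records)
    return wins / (wins + losses)


# 2009 records: TEX 87-75, SF 88-74
tex_sf = combined_w_pct([(87, 75), (88, 74)])
print(f"{tex_sf:.3f}")  # 0.540
```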

Several series in the twenty-first century have been lower, including 2001 NYA/ARI (.533), 2006 DET/STL (.528), 2002 ANA/SF (.509), 2007 BOS/COL (.500), and 2008 TB/PHI (.478). This should not be too surprising, since the expanded playoffs have had the effect of reducing the same season W% of pennant winning teams.

The highest previous season W% of the era was on display in the 1999 NYA/ATL series, by a huge margin; the two teams had combined for a .678 W% in 1998. 1962 NYA/SF (.614), 1970 BAL/CIN (.611), 1978 NYA/LA (.611, and a World Series rematch), and 1964 NYA/STL (.610) are the other high points. The highest of this decade was surprisingly 2003 NYA/FLA; the Marlins were below .500 in 2002, but the Yankees' 103-58 carried the combination to .563.

The two lowest X-1 W% combinations both involve the Twins. Not surprisingly, the worst-to-first MIN/ATL series of 1991 is last at .429, with the 1987 MIN/STL series at .464. The other series featuring teams that had combined to be sub-.500 in the previous season were 1988 OAK/LA (.475), the aforementioned 2008 TB/PHI, 1967 BOS/STL (.478), and 1965 MIN/LA (.491).

The Twins also account for another dubious distinction; their three World Series are the only ones in the expansion era in which both teams were sub-.500 the previous season. Both the 1965 and 1987 series saw Minnesota playing an opponent that had won the NL pennant in year X-2, but struggled in year X-1 before rebounding and taking the flag back.

I'm now descending from "vaguely interesting trivia" to "absolutely worthless drivel", but it's something I noticed looking over the data. This series features one of the closest year X-1 matches between the two participants, with just one game difference between them (TEX was 87-75, SF 88-74 in 2009). The only perfect match of the era is the 1985 STL/KC series (both were 84-78), with 1965 MIN/LA (79-83, 80-82) and 1980 KC/PHI (85-77, 84-78) also off by just one game.

* I have to admit feeling a twinge of happiness with the Phillies' defeat. The Phillies were, both in my estimation and the conventional wisdom, the strongest NL team in 2010. But when a national baseball writer (even if it is a demonstrated fool like Tracy Ringolsby) picks a team to sweep through the playoffs 11-0, it's hard for me to not root against them. It wasn't just Ringolsby--the Vegas notion that the Phillies were 2-1 favorites to win the World Series is tough to defend logically.

The recent Phillies are among the more overhyped teams of recent memory. Their regular season records have been good, certainly, but not historically special. Winning two straight pennants (combined with the third that some members of the media awarded to them) caused a lot of people to downplay the regular season record.

I looked at the team with the best record in the NL over each three-year period beginning in 1961. Obviously, there is nothing special about this approach, no reason to think that looking at three years is better than looking at two or four or using a different approach altogether. It is a timeframe that fits the Phillies' record, as it captures their world title, their pennant, and their best regular season record.

The Phillies' three-year regular season record of 282-204 (.580) is the best in the NL over the last three seasons, but it ranks 28th of 48 in the expansion era, hardly the record of a historically great club. Recent NL leaders with better marks include several combinations of Cardinal seasons (2000-02, 2003-05, 2004-06) and all of the three-year groups formed from Atlanta's 1990s run.

At least to this point, the Phillies' would-be-dynasty is certainly no better than St. Louis' 2004-2006. The Cards' record over that period was 288-197 (.594). Their postseason results were the same as the Phillies: a World Series win, a World Series loss, and an NLCS loss.

Absolutely worthless drivel: the best three-year NL record during the period was 310-176 (.638), compiled by the 1997-99 Braves. If you want one which includes a World Series win, it's the 1974-76 Reds (308-178, .634). The lowest three-year mark by a team which led the NL over that stretch is 260-226 (.535) by the 1982-84 Phillies.

* The main storyline for the Breeders' Cup revolves around Zenyatta. For those of you who may be unfamiliar with horse racing, Zenyatta is a six-year old mare that has raced nineteen times in her career and has never been beaten. Nineteen straight is the longest winning streak in major North American horse racing, surpassing the streaks of sixteen compiled by Citation and Cigar. Most of Zenyatta's victories have come in races against other fillies and mares, but after winning the Breeders' Cup Distaff in 2008 (I refuse to refer to this as the "Ladies' Classic" as is now proper), she became the first female ever to win the Breeders' Cup Classic in 2009.

Zenyatta will be retired after the race, and so there would be obvious interest in the final start of a legend, let alone the fact that she could finish her career perfect and become just the second horse to win the Classic twice (Tiznow repeated in 2000-01). She also could become the first horse to win three Breeders' Cup races.

This will be a tough task, however. The Zenyatta-doubters (a group which I admittedly would include myself in) will point out that she has run most of her races over a synthetic surface and in California, and that her only race against males was the 2009 Classic (which, in fairness, is the premier race in North America).

Zenyatta certainly has a good chance to win, and I can even get behind the idea that she's the deserving favorite. However, if you let me have a choice between Zenyatta and the field, that's easy. Quality Road, Blame, and Lookin' at Lucky all should get support, and there are some other horses of intrigue that may run (like Japanese star Espoir City, the usual European invaders, and second line three-year olds First Dude, Fly Down, and Paddy O'Prado).

* A related Zenyatta storyline that some racing writers have begun to wring their hands over is whether Zenyatta will be Horse of the Year or not. Horse of the Year is voted on by a group of turf writers, similar to the MVP award. However, it's held in a little higher esteem in the horse racing world than the MVP is in baseball circles. A better comparison is college football's national championship, particularly prior to the BCS.

Like the MNC (the mythical national championship), the winner can be viewed as the overall champion of the season. In college football, you had conference champions (think divisional awards for horse racing, like Champion Three-Year Old Male, Champion Older Female, or Champion Sprinter) and bowl game champions (think Breeders' Cup race winners). There was no unified way to pick an overall champion, so journalists got together and took a poll. The strange thing is that people found themselves intensely emotionally invested in the outcome of that poll, but so be it.

Zenyatta, whose career accomplishments pretty straightforwardly place her among the all-time greats, has never been voted HOTY. Some folks seem to be concerned about this apparent contradiction, a great performer in an individual sport never voted as the best in a given year.

Historically, it is very difficult for a filly or mare to get the nod as HOTY. Since 1971, when the current honors (the Eclipse Awards) were introduced, only four female horses have won the honor: All Along in 1983, Lady's Secret in 1986, Azeri in 2002, and Rachel Alexandra in 2009. Generally, HOTY goes to the top older male horse (which is logical since older male horses are generally the best horses. It's similar to the MNC usually going to the most impressive champion of a major conference). A three-year old male also has a clear path to the award, by scoring impressive victories over older horses (as done by Tiznow in 2000 or Curlin in 2007) or by dominating races against other horses of his generation (Point Given in 2001).

Generally, horses from other groups will only get consideration if there is no clear choice among the top males. Even fillies/mares having undefeated seasons will get passed over in favor of a worthy male (see Personal Ensign in 1988; she even won the Whitney against males but couldn't beat out Breeders' Cup Classic winner Alysheba for HOTY).

That's pretty much what happened to Zenyatta in 2008. She won all seven of her races, but didn't face males or run off of a California synthetic track. Curlin had an impressive season, winning the Dubai World Cup, the Stephen Foster, the Woodward, and the Jockey Club Gold Cup, becoming the all-time earnings leader in the process. He got HOTY.

The 2009 HOTY race was a little less conventional as three-year old filly Rachel Alexandra won the award. Rachel Alexandra had won the prestigious Kentucky Oaks and Mother Goose against other three-year old fillies, but also defeated three-year old males in the Preakness and Haskell and older males in the Woodward. She was not entered in the Breeders' Cup because her owners did not want her to run over a synthetic track; her detractors claimed it was because they wanted to duck Zenyatta.

Zenyatta certainly had a case for HOTY, and would have been a reasonable choice, but I agreed with the decision to side with Rachel Alexandra. Rachel Alexandra won races all over the country; Zenyatta ran only in California. Rachel Alexandra was a dirt horse; Zenyatta stuck to synthetic surfaces, which simply have not yet reached the same level of importance in American racing. Zenyatta's win over males was admittedly in the most impressive race possible, but Rachel Alexandra's three wins over males were all in Grade I races. Perhaps the best point in Zenyatta's favor was that she won at the classic 1 1/4 mile distance, while Rachel Alexandra's longest race was the 1 3/16 mile Preakness.

Looking at the 2010 HOTY race, Zenyatta will win by acclamation if she can repeat in the Classic. She will also be an easy choice if any horse other than Blame, Lookin' at Lucky, or Quality Road wins the race. In the event that one of those colts wins, though, it would be tough to deny them the award. Each would have a head-to-head win over Zenyatta (and the rest of the field) in the most important race of the year. Blame boasts victories in the Stephen Foster and Whitney; Quality Road in the Donn, Met Mile, and Woodward; and Lookin' at Lucky in the Preakness and the Haskell.

It is possible that Zenyatta could be voted HOTY even in the event that one of her top three challengers wins the Classic. However, I would hope that voters would do this out of a belief that she was the top horse in 2010, not out of a desire to right the historical record. (*) A horse, especially a mare, can easily be considered an all-time great through finishing second three times in the HOTY voting.

(*) If this post wasn't already too long, I would advance a half-baked theory about how this is exactly what happened in 1998, when HOTY voters went for Skip Away instead of Awesome Again, after passing over Skip Away for Favorite Trick in 1997.

Monday, October 18, 2010

Even More Mundane Comments on the Playoff Structure

You certainly don't need me to point out to you that run scoring is down in the playoffs compared to the regular season. I am just going to give you some data on the matter, and half-heartedly explore one possible explanation for why that is.

I figured the RPG (total runs in the game by both teams) for every World Series (through 2008) and compared it to the overall RPG for that season (figured as a simple average of the AL and NL RPG) and to the RPG for the two World Series participants (again, a simple average). I have limited the scope to the World Series so that the cross-era comparisons are on more of a level footing.

I've averaged the data by decade (a simple average for 1900-1909, 1910-1919, etc.) so that we can see how it has changed over time:

Frankly, this chart surprised me. I had expected that in recent years the disparity between regular season and World Series RPG levels would have increased, but in fact recent decades are the closest matches for regular season scoring.

Since I assumed this to be the case, I was going to put forth the argument that the playoff structure coupled with changes in the game (specifically, the increased use of relief pitchers) has caused the post-season to become a different game from the regular season, to a greater extent than in the past. My personal take on this phenomenon was to be that it was unfortunate--that the run scoring levels, pitcher usage, strategy choices, etc. should ideally be as close to the same as possible for the regular season and the post-season. I don't like the idea of playing 162 games under one set of conditions and switching to very different conditions to crown a champion.

But my assumption was unfounded. While run scoring declines in the World Series (you can pick your explanation--colder weather, reduced usage of marginal pitchers, increased usage of one-run strategies, or whatever other theory you'd like to advance), the decline has not grown over time. Today's World Series are generally as close to regular season scoring levels as they have ever been.

One other little tidbit to note is that generally, with the 1990s and 2000s actually being the most obvious exceptions, the pennant winners combine for a higher RPG than the majors as a whole. Obviously we expect that pennant winners are very good teams, and will likely both score more runs than the league average and allow fewer. If a team were equally good offensively and defensively in terms of runs above average, then their RPG would be equal to the league average.

However, if pennant winners were especially strong on defense relative to offense, then their RPG should be lower than the league average. Of course, you can rightly point out that runs scored and allowed have different win values, dependent on their unique combination of runs scored and allowed, and so you don't want to draw too much of a conclusion from this one way or another. Park factors are also ignored by this crude comparison. But if pitching and defense were everything as a minority of traditionalists would have you believe, then pennant winners should certainly have lower RPG than the league average.
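To make the RPG logic concrete, here is a toy illustration. Every number in it is hypothetical (a league averaging 4.5 runs per team per game, and three made-up teams that are each one run per game better than average in total); none of it comes from the World Series data.

```python
LEAGUE_RUNS_PER_TEAM_GAME = 4.5  # hypothetical league average


def team_rpg(runs_scored_per_game, runs_allowed_per_game):
    """Total runs per game in a team's games (both sides combined)."""
    return runs_scored_per_game + runs_allowed_per_game


league_rpg = 2 * LEAGUE_RUNS_PER_TEAM_GAME

# Each team is +1.0 runs per game overall, split differently.
balanced = team_rpg(5.0, 4.0)        # +0.5 offense, +0.5 defense
pitching_heavy = team_rpg(4.7, 3.7)  # +0.2 offense, +0.8 defense
hitting_heavy = team_rpg(5.3, 4.3)   # +0.8 offense, +0.2 defense

print("league:", league_rpg)
print("balanced:", balanced)
print("pitching-heavy below league:", pitching_heavy < league_rpg)
print("hitting-heavy above league:", hitting_heavy > league_rpg)
```

The balanced team matches the league RPG exactly, while the team whose edge comes mostly from run prevention falls below it, which is the pattern the "pitching and defense are everything" view would predict for pennant winners.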

Moving along, one possible explanation for lower scoring levels in the post-season is increased usage of top pitchers. I did a little crude investigating on this front by figuring the percentage of regular season innings thrown by a team's top three pitchers (in terms of innings), and comparing that to the percentage of World Series innings thrown by the top three pitchers (again, in terms of innings). Please note that I did not consider the same three pitchers--the group under consideration is the three pitchers with the most innings in the games being considered (regular season or World Series).

The reason I chose three pitchers is because presumably the top three in IP will be the front three starters, which is all teams have traditionally needed to use in a seven-game series (of course in today's game four starters are usually employed). There are a number of weaknesses to this approach, including but by no means limited to:

1. It doesn't include the effect of relief aces, who have a disproportionate impact on win probability thanks to working in high leverage situations, and are often employed differently in the playoffs. They are also a relatively modern phenomenon that will damage cross-era comparisons of IP%.

2. It doesn't account for injuries and other factors that alter pitching workloads. If, for instance, a top pitcher is out for the Series, IP% will likely be lower than it might have been, but only because of the absence of the pitcher, not because of any intentional alteration in strategy.

3. IP% can be highly influenced by series length. If a series only goes four games, then it is likely that a larger percentage of the workload can be borne by the key pitchers of the staff.

4. Rainouts or other delays in the series can greatly skew the results by allowing pitchers to pitch more than they would have. This is particularly evident in the 1989 World Series and its earthquake delay; the A's IP% for the series was 88%, the highest in twenty years.

5. I am only considering the World Series; presumably managers are more conservative with the usage of their pitching staffs in earlier playoff rounds, or at least no more aggressive.

So I'm not claiming that these results are particularly informative. Nonetheless, I broke them up by decade as I did for the RPG data. Reg IP% is the simple average of IP% for the two pennant winners, simply averaged for the decade; WS IP% is the same for the World Series; and RAT is the ratio of WS IP% to Reg IP%, expressed as a percentage:

Again, I have to admit this is not what I expected to see. I expected that teams of earlier eras, heavily concentrating their workload on a few pitchers to begin with, would show a more even IP% between the regular season and World Series. The opposite appears to be true; earlier teams ratcheted up the workload for frontline pitchers in the Series to a greater extent than today's pennant winners. Of course, the weakness in using three pitchers is illustrated by the fact that the ratio was fairly stable until the 1970s, around which time the trend towards larger starting staffs was accelerating.
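For reference, the IP% and RAT calculations described above can be sketched for a single team as follows. The innings totals are invented purely for illustration, and the function name is hypothetical; note that the top three are picked separately for each sample, so they need not be the same pitchers.

```python
def top_three_ip_pct(innings_by_pitcher):
    """Share of a staff's innings thrown by its three busiest pitchers."""
    innings = sorted(innings_by_pitcher.values(), reverse=True)
    return sum(innings[:3]) / sum(innings)


# Hypothetical innings totals, for illustration only.
regular_season = {
    "A": 240, "B": 220, "C": 200, "D": 180, "E": 150, "F": 90,
    "G": 80, "H": 75, "I": 70, "J": 65, "K": 60,
}
world_series = {"A": 16, "B": 15, "C": 9, "F": 6, "G": 5, "D": 3}

reg_ip_pct = top_three_ip_pct(regular_season)
ws_ip_pct = top_three_ip_pct(world_series)
rat = 100 * ws_ip_pct / reg_ip_pct  # WS IP% relative to Reg IP%, as a percentage

print(f"Reg IP%: {reg_ip_pct:.1%}")
print(f"WS IP%:  {ws_ip_pct:.1%}")
print(f"RAT:     {rat:.0f}")
```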

Again, the data here is by no means conclusive or even particularly insightful. However, I expected to find support for my seat-of-the-pants belief that style of play in the playoffs had become more removed from the regular season over time. Instead, I have no solid ground to stand on to make such a claim (there may well be data out there that would support such a position, but it isn't here).

My argument would have been that because 1) teams could now get a higher percentage of innings from front-line pitchers in the World Series than in the regular season, and 2) runs scored declined more precipitously in the World Series, changes should be made to the playoff series format to bring it closer to regular season conditions. The most obvious alteration would be to eliminate off-days, which would eliminate the possibility of using a three-man rotation and possibly even encourage the use of five starters as in the regular season.

Leaving aside the practical problems with such a change (chief among them revenue concerns), I personally believe that such modifications would make the playoff series a better test of team strength since they would more closely track the conditions of the regular season. But I didn't find any evidence that the disparity between regular season play and World Series play has increased over time--to the limited extent that the data here addresses the issue, the disparity has actually lessened. Any push for changes that would close the gap is undermined by the fact that larger disparities (at least in terms of these two measures) were accepted throughout the twentieth century.
