Monday, December 06, 2010

Statistical Meanderings, 2010

This is about as close as I get to writing a Jayson Stark-style piece throughout the course of the year. Since I hate that format, hopefully there will be something of greater interest than sheer trivia here. Most of the statistics mentioned come from my End of Season Stats and are explained in that post:

* Last year the AL/NL scoring gap in terms of R/G was the largest it had been since 1998; this year, at .12 (4.45 to 4.33), it was the narrowest it had been since 1990 (4.30 to 4.20). The overall scoring average of 4.38 was the lowest for the majors since 1992 (4.12); 1992 was also the last time that either of the leagues individually had as low a scoring rate.

The offensive difference between the AL and NL was largely due to a difference in league batting averages. As a group, the AL and NL had nearly identical walk rates (.095 and .096 walks/at bat) and isolated power (.147 to .144), but the AL BA was five points higher (.260 to .255). The NL slugged just .399, the first sub-.400 league figure since the 1993 NL.

* I list two different winning percentage estimators in my team report. EW% is based on actual runs scored and allowed, while PW% is based on runs created and runs created allowed. Teams whose actual W% were very similar to both of the estimates included (W%, EW%, PW%): Atlanta (.562, .567, .564), Cincinnati (.562, .567, .564), Florida (.494, .501, .495), and Texas (.556, .564, .557).

An interesting group of teams is those whose PW% tracked their actual W% much better than EW% did. These are teams that may be over/underrated for 2011 by those that put a great deal of stock in Pythagorean record as an indicator. Such people are largely strawmen, but regardless, some of the teams in this group are Baltimore (.407, .386, .411), the Cubs (.463, .447, .468), Pittsburgh (.352, .324, .351), and St. Louis (.531, .564, .542). I'll leave it to the reader to find the more conventional Pythagorean watch teams, those whose EW% and PW% are in general agreement and diverge from actual W%.
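
For readers who want to see the mechanics behind a runs-based estimate like EW%, here is a minimal sketch of the classic Pythagorean formula. The fixed exponent of 2 is an assumption made for illustration; the estimators in the team report may use a different exponent, so the result need not match the published EW%. The inputs are Seattle's 3.17 runs scored per game and 7.48 total runs per game, both quoted in the next bullet, which imply 4.31 runs allowed per game. The function name is hypothetical.

```python
def pythagorean_wpct(runs_for, runs_against, exponent=2.0):
    """Classic Pythagorean estimate of winning percentage.

    The exponent of 2 is an illustrative assumption, not necessarily
    the exponent behind the EW% and PW% figures quoted in the post.
    """
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


# Seattle, per-game figures taken from the post:
# 3.17 runs scored, 7.48 total runs in their games.
seattle_scored = 3.17
seattle_allowed = 7.48 - seattle_scored

print(f"Seattle, exponent 2: {pythagorean_wpct(seattle_scored, seattle_allowed):.3f}")
```

Feeding runs created and runs created allowed into the same function gives the analogue of PW%.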

* Last year, SF games were the lowest scoring in MLB at 7.83 RPG, which was the lowest figure since the 2003 Dodgers. In 2010, 7.83 RPG would have ranked just third-lowest, as Seattle (7.48) and San Diego (7.69) each exhibited a lower scoring context. The Mariners' 7.48 still couldn't touch the 2003 Dodgers at 6.98, but it was the lowest RPG for an AL team since the 1981 Yankees (7.14). Of course, Seattle's RPG was lowered by the .97 park factor, but even after park-adjusting the figure to 7.71, it was still the lowest AL scoring context since the 1989 Angels (7.70, not park-adjusted).

Commenting on Seattle's offensive ineptitude can be considered hitting after the whistle at this point, but allow me to indulge. Their 3.17 runs/game was the lowest since the 1981 Blue Jays averaged 3.10. Seattle's 2.95 R/G at home was the lowest since 1972, when both the Padres (2.71) and Angels (2.76) scored less. While Safeco had a large home/road split in 2010, the five-year PF is .97--a pitcher's park, yes, but not an extreme one.

They were more respectable on the road, averaging 3.38 R/G, a mark which the Pirates (3.14) managed to keep from being even the lowest in 2010 (although outside of the Pirates' showing, it was the fewest since the 1994 Pirates scored 3.20 away from home). I did not run the numbers relative to league average, but it probably wouldn't do too much to help Seattle; while the AL's average of 4.45 R/G is low relative to recent seasons, it's still a perfectly normal league scoring level in historical context.
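
The park adjustment applied to Seattle's scoring context above is a simple division by the park factor. This short sketch reproduces the 7.48 to 7.71 step using only figures from the post; the function name is made up for illustration.

```python
def park_adjust(rpg, park_factor):
    """Convert a raw runs-per-game figure to a park-neutral one."""
    return rpg / park_factor


# Seattle: 7.48 RPG in a .97 park.
seattle_adjusted = park_adjust(7.48, 0.97)
print(round(seattle_adjusted, 2))  # 7.71
```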

* Unfortunately, we never got to see the playoff matchup between New York and Tampa Bay. While the concerns about the Rays running wild in such a series were likely overstated, it is true that the Yankees struggled at controlling the stolen base game. The 85.2 SBA against them was easily the highest in MLB, with the Red Sox (80.1) next. The Cardinals led baseball with only 58.9% of opposition attempts successful.

* It will come as no surprise that Pittsburgh had a terrible defense in 2010. The degree of their anti-dominance may be a little jarring though: last in BA (by ten points, .283), last in OBA (by eight points, .347), last in SLG (by fourteen points, .451). The only team offense that exceeded any of the Pirates' allowed figures was Toronto, which slugged .456.

The Pirates were also last in innings/start (by .11 innings, 5.38), starters' eRA (by a whopping .62 runs, 5.86), and DER (.659). Their bullpen ranked only fifth-worst in eRA (4.82), and their modified fielding average was third-worst (.962). All of this predictably resulted in allowing 5.4 R/G (more than any team managed to score).

* In 2009, playoff teams averaged +72 runs above average on offense and just +44 on defense. In 2010, the teams exhibited more balance, as you can see:

I'd usually snark about defense winning championships at this point.

* You're probably aware that the long-term trend in MLB, pretty much dating all the way back to 1871, has been for fielding averages to increase. For the most part this holds, but there was an odd blip in 2009. The all-time high ML mFA is .9704, set in 2007. In 2008, the mFA rounded to four decimal places was the same but actually was a bit lower. In 2009, however, mFA dropped all the way to .9669, the lowest since 2001. The decline of .36% was the largest in the post-war era.

In 2010, mFA rebounded to .9693, an increase of .25%. That is still the lowest average (excluding 2009) since 2004. I am not claiming that fielding average is an important metric, or that there is a meaningful explanation for the fluctuations, but in looking at league fielding totals it caught my eye.

* Major league teams had a .559 W% at home in 2010, the highest mark since 1978 (.573). 93% of major league teams (28/30) had better records at home than on the road, which sounds like a lot, but while high it isn't extraordinary. (San Diego had the same record home and away.) The average for 1961-2010 is 83%, but as recently as 2007-08, 29/30 teams had better home records. In both 1978 and 1989 all teams had better records at home.

Much was made about the Pirates' .210 road W% (17-64), the worst since the identical showing by the 1963 Mets. Also notable was Detroit's home/road split of .642/.358, which was of equal magnitude to that of the Pirates and was the largest by a .500 or better team since the 1996 Rockies (.679/.346). The Rockies and Braves chipped in to give 2010 four of the 23 highest differentials since 1961.

* Cleveland fans seem to be pretty happy with new closer Chris Perez, and given his performance (7th in the AL among relievers with 20 RAR) that's understandable, but it would be a mistake to assume that he's proven himself as the long-term answer at the end of the game. He allowed a low .234 %H, so his 3.72 dRA is well above his 2.10 RRA or 2.86 eRA. The batted ball metrics are even less impressed--4.45 cRA, 4.74 sRA.

* Jonathan Papelbon pitched 67 innings with a 4.24 RRA, which results in 5 RAR; Scott Atchison pitched 60 innings with a 4.16 RRA, for 5 RAR as well. Of course, the similarities end there, as Papelbon's peripherals were much better than Atchison's, but it's never a good thing when your pricy closer is no more effective than the seventh man out of the pen.

* Bobby Jenks has been non-tendered, and obviously I have no insight to offer on his health or his PitchF/x data or anything like that. What I can tell you is that his peripherals were pretty good in 2010: 2.89 dRA (his %H was very high at .365), 3.34 cRA, 2.91 sRA. If he's healthy, he might be a good buy.

* Chad Qualls gave up a massive .397 %H; he actually looks serviceable in dRA (4.26) and the batted ball metrics (4.33 and 3.89).

* Trevor Hoffman was terrible in his swan song, but at least he was consistent across the board in RA estimators: 5.95 RRA, 5.82 eRA, 5.89 dRA, 5.45 cRA, 6.10 sRA. Ryan Madson was consistent in a good way: 2.47, 2.80, 2.88, 2.83, 2.90.

* Last year I pointed out that Francisco Rodriguez didn't pitch very well in the first year of his big contract, so I feel obligated to point out that he was pretty good in his 57 innings in 2010: 2.34 RRA, 3.16 eRA, 3.10 dRA, tied for fifteenth among NL relievers with 16 RAR. Of course, his off-the-field performance took a corresponding nose dive...

* Who is Wilton Lopez? I probably saw fewer Houston games than any other team's this season, so I never saw him pitch. The 27-year-old Nicaraguan rookie ranked among the ten most valuable relievers in the NL (not considering leverage) with a 2.02 RRA in 67 innings and solid, consistent peripherals (3.41 eRA, 3.24 dRA, 3.24 cRA, 3.22 sRA). He was shelled last season in 19 innings, his strikeout rate (6.7) leaves a lot to be desired, and for the season as a whole he wasn't trusted by Brad Mills, with a below-average Leverage Index. He did inherit .49 runners/appearance, but sometimes a high IR/G goes hand-in-hand with a mop-up role. As if that wasn't enough cold water, his minor league numbers don't look like anything special from a quick glance. It was a nice season in any event.

* I don't list Inherited Runs and Bequeathed Runs Saved on the reports themselves, but if you download the spreadsheets, they are included. The AL leaders in IRSV were Matt Thornton (5.7), Randy Choate (5.1), and Joaquin Benoit (5.0). Dan Wheeler (4.0) also had a particularly good showing from Tampa's pen. Eddie Bonine trailed the AL at -8.4.

Among AL relievers, Dusty Hughes got the least support from subsequent relievers as 15/25 scored (7.2 BRSV). Lance Cormier benefited the most as only 1/30 bequeathed runners came around to score (-8.1 BRSV).

In the NL, Wilton Lopez (9.2), Javier Lopez (9.2), and Santiago Casilla (8.7) were the leaders in stranding runners; Lopez allowed just 1/33 to score. A pair of Dodgers were the trailers: George Sherrill (-5.7) and Ramon Troncoso (-10.5 on 22/37).

Apparently Ronald Belisario was one of their victims, as he got the least support from subsequent relievers (4.6 BRSV on 12/24). Joe Thatcher was the most fortunate, as his Padre penmates prevented all 35 runners he bequeathed from scoring (-10.5).

* The flip side to the last bullet point is bequeathed runs saved for starting pitchers. In the AL, Rich Harden got the most support (2/20, -4.2) followed by Jake Westbrook and Jon Lester (both -3.7). Jason Vargas got the least help (12/20, 6.0).

In the NL, Jonathan Sanchez (3/25, -4.9) and a pair of Braves (Derek Lowe, -4.3 and Tommy Hanson, -3.0) were the best-supported by their pens. Scott Olsen (10/16, 5.0), Chris Narveson (4.8), and Kevin Correia (4.6 despite pitching for San Diego with their excellent bullpen) were the least supported.

* Presented without comment: Max Scherzer 37 RAR, Edwin Jackson 25. Clayton Richard 31 RAR, Jake Peavy 17.

* I usually only include pitchers with 15 starts on the starters report, but I had to throw Stephen Strasburg into the mix. Among NL pitchers with 15 starts (plus himself), he ranked twelfth in RA, fourth in eRA, and first in dRA, cRA, and sRA. Obviously that was in just 68 innings, but it was fun while it lasted.

* It's a shame there is no LVP award, as it would be a runaway in both leagues--Ryan Rowland-Smith, -26 RAR in the AL and Charlie Morton -31 in the NL. Rowland-Smith was 1-10 with a 7.83 RRA in 109 innings, and none of his peripheral RAs were much better (6.37 sRA was his best). He was last in the AL in all of the run averages, plus QS% (20).

Morton's season was more respectable, as he pitched 80 innings and gave up a .367 %H, which means his dRA (5.50) and batted ball RAs (5.10 and 4.56) weren't terrible. Teammate Zach Duke was next on the RAR trailer list (-20) and matched Morton with -42 RAA thanks to being allowed to pitch 159 innings.

* Cliff Lee and Jon Lester are very close in most of the categories I list, in addition to both having names that start with "L" and being left-handed. Lee pitched 4 1/3 more innings, with essentially the same RRA (3.55 to 3.52), eRA (3.19), and cRA (3.55 to 3.52). Lee was a little better in dRA (3.08 to 3.26), Lester better in sRA (3.77 to 3.36). Lee's %H was .300 to Lester's .295; Lee made 106 pitches/start, Lester 105; Lee made 64% quality starts, Lester 63%. Both get credit for 21 RAA, while Lee gets one more RAR (51 to 50).

* The bottom 12 starting pitchers in the AL in RAR are: Ryan Rowland-Smith, Brian Bannister, David Huff, Scott Kazmir, Rich Harden, Josh Beckett, Scott Feldman, Jamie Shields, Nick Blackburn, Jeremy Bonderman, Tim Wakefield, and AJ Burnett. The NL's bottom dozen is a much more conventional list of lousy pitchers: Charlie Morton, Zach Duke, Kyle Lohse, Nate Robertson, Manny Parra, Jeff Suppan, Paul Maholm, Bud Norris, Kevin Correia, Craig Stammen, Dave Bush, and John Ely.

* We all know that Texas owes its success largely to pitchers going deep in games, right? Wait, they only averaged 5.87 IP/S, ninth-lowest in the majors? Well, surely that must be because of some bad starts from Rich Harden or something.

Well, here are the P/S (counting relief appearances as half-starts) for the Rangers' starters with 15 or more starts (including starts made for other teams in the case of Cliff Lee): Lee 106, Wilson 104, Lewis 103, Feldman 95, Harden 93, Hunter 85.

Tampa Bay, their playoff opponents: Price 107, Garza 101, Shields 100, Davis 96, Niemann 89.

A few other teams: White Sox: Danks 106, Peavy 100, Buehrle 100, Floyd 97, Garcia 88.

Boston: Lackey 109, Lester 105, Matsuzaka 105, Beckett 103, Buchholz 100, Wakefield 85.

Oakland: Gonzalez 102, Cahill 100, Mazzaro 99, Sheets 98, Braden 95, Anderson 95.

Angels: Weaver 109, Santana 108, Saunders 100, Pineiro 100, Kazmir 98.

Detroit: Verlander 113, Scherzer 106, Porcello 96, Galarraga 95, Bonderman 93.

Of course, facts aren't particularly important when you need to rhetorically twist a playoff series into a morality play.

* Earlier in the season, I posted about the number of no-hitters thrown in 2010 and how it compared to expectation based on both the historical frequency of no-hitters and a theoretical probability of a no-hitter. The details are in that post and will not be repeated here.

There were six no-hitters pitched in 2010 out of 4,924 possible games (double-counting games since of course each pitcher has an opportunity to throw a no-hitter). Given the long-term frequency of no-hitters figured by Tom Flesher (.06%), we'd have expected 2.954 no-hitters. The Poisson distribution yields this expected distribution:

I also offered a distribution based on the theoretical probability derived from the overall major league BA (.2573) with 2.734 expected no-hitters (.056%):

Whichever model you choose, there's nothing particularly shocking about six no-hitters in one season, something that has about a 6% probability of occurring by chance. If I wanted to get cute, I'd point out that 6% is about once every twenty years, and the last big no-hitter season was in 1990, but that would be sabermetric malpractice.
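
The two Poisson distributions described above can be regenerated from the figures in the text. The following is a minimal sketch that assumes only the numbers quoted here: 4,924 opportunities, the .06% historical frequency, and the 2.734 expected no-hitters from the theoretical model. The function names are illustrative, and the sketch prints both the probability of exactly k no-hitters and the probability of at least six, since the two can be quite different.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def prob_at_least(k, lam):
    """Probability of k or more events (one minus the lower tail)."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))


OPPORTUNITIES = 4924  # team-games, as counted in the post

# Expected no-hitters under each model, both taken from the text.
models = {
    "historical frequency (.06%)": OPPORTUNITIES * 0.0006,
    "theoretical (from league BA)": 2.734,
}

for name, lam in models.items():
    print(f"{name}: expected = {lam:.3f}")
    for k in range(9):
        print(f"  P(exactly {k}) = {poisson_pmf(k, lam):.4f}")
    print(f"  P(6 or more) = {prob_at_least(6, lam):.4f}")
```

Changing `OPPORTUNITIES` or the per-game probability shows how sensitive the tail is to the assumed no-hitter rate.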

* Speed Score leaders and trailers by position (with the caveat that these are based on just 2010 data when Bill James intended them to be based on multiple years):

C: Miguel Olivo (6.1)/Chris Snyder (.7)
1B: Kevin Youkilis (5.5)/Adrian Gonzalez (1.0)
2B: Sean Rodriguez (6.2)/Luis Valbuena (1.7)
3B: Evan Longoria (5.4)/Wilson Betemit (.9)
SS: Rafael Furcal (7.8)/Juan Uribe (2.2)
LF: Carl Crawford (9.1)/Pat Burrell (1.0)
CF: Dexter Fowler (9.1)/Torii Hunter (2.5)
RF: Will Venable (8.6)/Ryan Ludwick (1.9)
DH: Johnny Damon (5.5)/Willy Aybar (.7)

* RGs for Yankee hitters with 300+ PA: 6.8, 6.1, 5.9, 5.8, 5.6, 5.4, 5.2, 4.3, 4.1. Without doing any formal checks, I have to assume that's one of the more balanced league-leading lineups you are likely to see. The Yankees led in team RG with 5.14; the Red Sox were second at 5.08. Boston's breakdown for 300+ PA hitters was: 7.6, 6.7, 6.5, 6.2, 5.9, 5.1, 4.9, 4.9, 4.2. The range is only about one run wider, but it's more top-heavy.

* Ryan Braun and Prince Fielder had different shapes to their production (Braun had a .305 BA and .289 SEC while Fielder was at .263/.409), but the outcomes were very similar. Braun created 110 runs while making 434 outs for 6.45 RG, while Fielder created 109 runs in 427 outs for 6.49 RG; both were at +36 HRAA. Position adjustments put Braun several runs ahead in the categories where they are included, but it would be tough to find two stars on a team better matched.
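
As a rough check on the Braun and Fielder comparison, RG can be approximated as runs created per out, scaled to a game's worth of outs. The 25.5 outs-per-game constant below is an assumption (the post does not state the value used), and the runs created totals are the rounded figures from the text, so the results land near the published 6.45 and 6.49 rather than matching them exactly.

```python
OUTS_PER_GAME = 25.5  # assumed constant; not stated in the post


def runs_per_game(runs_created, outs, outs_per_game=OUTS_PER_GAME):
    """Runs created per game's worth of outs."""
    return runs_created / outs * outs_per_game


# Rounded runs created and outs, as quoted in the post.
braun = runs_per_game(110, 434)
fielder = runs_per_game(109, 427)

print(f"Braun:   {braun:.2f} RG")
print(f"Fielder: {fielder:.2f} RG")
```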

* Park-adjusted stats: Matt LaPorta .222/.307/.364, 3.84 RG, 0 RAR. Justin Smoak .215/.305/.365, 3.87, 0. These two make an interesting pairing since both were the centerpiece of a trade of a former Indians left-handed ace. Of course it has been two years since LaPorta was traded to Cleveland while Smoak was traded this summer, but they both will need to improve on those performances to avoid the trades becoming Santana II and III. Personally, I think Smoak is a much better bet--he's younger and LaPorta was nagged by several injuries during 2010.

* The Indians got nearly identical offensive production out of their two primary center fielders, Trevor Crowe and Michael Brantley. Crowe hit .252, Brantley .247. Each had a .299 OBA. Crowe slugged .334, Brantley .328. Add in steals and they both created 3.42 runs/game and were essentially replacement level (1 RAR each). This was actually an improvement over Grady Sizemore, who hit even worse (.212/.264/.291) in his 137 plate appearances.

* Alex Rodriguez had the worst full-time season of his career in 2010. His hitting relative to the league average was a little worse in 1997, but he was playing shortstop at the time and had 40+ more plate appearances than he did in 2010. This is not meant as a condemnation of A-Rod in any way, as he was still a valuable asset and after all is 35 years old.

However, I was surprised by the lack of media excitement about this. Certainly it was pointed out that he was not his old self, but usually anything negative about the man is blown completely out of proportion, and the media could have had a field day if they'd framed the story in a certain light. Why didn't they? Speculation about motivation aside, RBI offer a statistical explanation. Rodriguez batted in 125 runs, most since his MVP campaign in 2007, and the sixth-highest total of his career.

At the same time, Rodriguez' ratio of RBI to RC (using no park adjustment in the latter) was the highest it had ever been at 125/92 = 1.38. His previous full-season high had come in 1999 (111/102, 1.09). For his career, A-Rod's RBI and RC are very close (1855 RBI, 1831 RC), and so he is not a hitter who has an RBI > RC tendency.

Among batters with 300 or more PA, only Pedro Feliz (40/26, 1.51) and Willy Aybar (43/30, 1.45) had a higher ratio than Rodriguez. Ichiro turned in the majors' lowest ratio (43/91, .47). A-Rod's relative lack of production was masked by his RBI count, and if it has anything to do with preventing a media freakout, I'm personally pleased it was.

Matt Klaassen of Fangraphs has written about A-Rod's RC/RBI ratio as well.

* I hadn't noticed how poorly AL shortstops hit until the Silver Sluggers were announced. When I heard "Alexei Ramirez", I was a little surprised. Of course, then I looked at the numbers a little more closely, and it's ugly. These are all AL players with 300+ PA who were primarily shortstops:

None of them managed to match even the AL average RG, which leads to this amusing chart of Silver Slugger winners together with their HRAA:

Admittedly, using average as a baseline is a cheat for shock value in this case.

* Another hitter of historical note whose 2010 wasn't up to his own previous standards was Albert Pujols, at least if you believe some accounts. It has been seemingly common to claim that 2010 was Pujols' worst season, but I beg to differ.

Looking at unadjusted (for league or park) RG, it was only Pujols' fourth-worst season, as he had lower RC rates in 2001, 2002, and 2007. Factoring in league and park, I have his ARG at 206, which was actually a smidge better than his pre-2010 career mark of 204, and puts the season well ahead of the three years already discussed and in the same general pool as 2004-2006. Factor in that Pujols set a career high with 690 PA, and my fielding-less WAR pegs it as the fifth-best season of his career--right in the middle. And still a season strong enough to be worthy of NL MVP honors.

