Walk Like a Sabermetrician: March 2006

Monday, March 27, 2006

2006 Predictions

Predictions like this are, as always, offered up as a “fun” item and not as serious analysis. Last year I made a point about predictions that I think is worth recapping. When you say “I pick the Yankees to win the AL East”, you usually do not mean that you think it is a sure thing that the Yankees will win the AL East. What you mean is that you think that they have the best chance. This chance might be very high, say 80%, or it may be much lower, say 40%. Of course, at any chance under 50%, what you actually think is that the odds are that the Yankees will NOT win the AL East. But since you gauge them as having the highest chance, you predict that they will win.

This problem gets even worse as you move up to higher and higher championships. It would be silly to suggest that any of the thirty teams is an even-money choice to win the World Series. And yet I will go ahead and pick a World Series winner, even though I think it more likely than not that the team will not win it all.

Of course, I think most people inherently recognize this, and don’t feel the need to explicitly state it when they make their predictions. Maybe I am particularly gun shy because, to be honest, my predictions usually suck. I have been doing this for many years now, and just going from memory, have not gotten an impressive percentage right. Last year’s happen to be preserved on this blog, so reviewing them, I successfully predicted the AL West (LAA), the NL East (ATL), and the NL Central (STL). I had the Red Sox and Yankees in the playoffs, but reversed the order (actually, since they finished in a mathematical tie, I guess you could consider these two predictions correct). I thought the Twins would repeat in the AL Central, and had the White Sox in fourth, which is not slick at all. I picked the Phillies to win the NL Wildcard, which they contended for until the last day of the season. I had the NL champs from Houston in fourth in their division as well. So if you are a Tigers or Astros fan, take heart. I picked the Dodgers in the NL West, and embarrassingly predicted a Red Sox WS repeat over them.

I also predicted the three major awards, and got ARod as MVP, although that is a really absurd thing to predict, because ARod could have easily won the AL MVP five times now, and you just get lucky in the years he actually wins. How much courage would it take to predict Pujols to win the NL MVP this year? So I will try to make my award picks at least a little daring. The other predictions at the bottom are meant to be taken somewhat TIC.

AL EAST
1. New York
2. Toronto
3. Boston
4. Baltimore
5. Tampa Bay
The Yankees remain a ticking time bomb in my mind, but I don’t expect it to explode this year, although the pitching staff could wind up the same way it did last year, as the Yankees are now expecting last year’s patches like Wang and Chacon to be key guys. I’m not crazy about Toronto’s moves, but they could provide enough oomph to overtake a Red Sox team that has its own issues. Baseball Prospectus made a very good point in their book this year about the Jays that should temper some of the enthusiasm; while they had something like a .545 EW% last year, based on RC and RC Allowed (or what I call PW%), it was more like .490, which is how they actually played. So while it may appear that they only need to improve their standing against Boston or New York by three or four games based on Pythagorean record, the actual gap is probably a lot larger. Baltimore is not a bad team, and could surprise if Mazzone can work his magic there; they already have a fairly solid group of pitchers to work with in Benson, Cabrera, Bedard, and Chen. I remain skeptical on Mazzone’s magical abilities, and so will pick the Orioles in fourth. The Devil Rays should not change their name, at least not to “Rays”.

AL CENTRAL
1. Cleveland
2. Minnesota
3. Chicago
4. Detroit
5. Kansas City
I intend to write a little more in-depth thing on the Indians before Opening Day on Sunday, but I think they can win this, with the caveat that I think the top three are all essentially equals and would be surprised not one bit if the order was reversed. The Indians’ bullpen will likely be far short of where it was last year, and there won’t be a repeat ERA leader in the starting rotation. And the offense has a number of players who could be due for some regression, particularly Sizemore and Peralta. But the team had the highest PW% in the game last year (.617), so I think they could lose a lot of ground and still win here. The Twins would be a lot more interesting if they could hit, but they still have Mr. Santana and I would take Francisco Liriano over any rookie pitcher in the game. The White Sox are widely hailed as favorites, which I can’t help but think is shortsighted. Their W% was around seven games better than their EW%, which in turn was about four games better than their PW%, which indicates they should have won 87 games last year. Yes, adding Jim Thome and Javier Vazquez seems to improve the team on paper, but I don’t think you can expect their starters to repeat their performances of a year ago. The Tigers are making progress, but remain a year or two away from contending. The Royals’ signings of veteran flotsam may make them better, but I’m not sure of it. They will likely improve a bit just because nobody’s really that bad.

AL WEST
1. Oakland
2. Los Angeles (wildcard)
3. Texas
4. Seattle
The A’s look really, really good to me. The biggest concern to me is that the young pitching will not repeat their performances, but I think the offense could improve, as Crosby ideally will stay healthy and Frank Thomas, even for 300 PA, is a big bat. Plus they have Nick Swisher, the only Buckeye in the major leagues at this moment. That has to count for something. The Angels are still a solid team, though. I’m not terribly excited about the Rangers adding solid pitchers to their rotation, although they could be better, and they weren’t that far out of this thing to begin with. There are three teams in each AL division that really wouldn’t surprise me if they were to be playing come October. The Mariners have King Felix, so he should keep them contented for another poor year at least. I realize that most people are older than the ballplayers they watch, and so I don’t expect a lot of sympathy, but it is kind of odd to now be watching some guys younger than myself. Felix Hernandez is the first player to have a star-caliber performance in the majors who is younger than me, so I guess my window of opportunity at being a professional baseball player is closing rapidly, as if my lack of talent hadn’t slammed it shut years ago.

NL EAST
1. New York
2. Philadelphia (wildcard)
3. Atlanta
4. Washington
5. Florida
The Mets are the kind of team I hate to love, but in this division, that can get you picked to win. They have made additions that could improve their team, although I don’t for the life of me understand the Seo and Benson trades, and Pedro’s health is a big concern. I have been picking the Phillies for years, and have always gotten burned--I think they have less potential than they’ve had in the past, but they’ve been underperforming for years, so maybe some day they’ll exceed expectations. The Braves did a remarkable job to win again last year, and you pick against them at your own risk, but what better time to take that risk than when Mazzone leaves and they do little to improve the club in the off-season? The Nationals are an embarrassment to baseball, and desperately need an owner and a plan of how to build their organization, and preferably not one that includes Jim Bowden. The Marlins should be just plain awful.

NL CENTRAL
1. St. Louis
2. Milwaukee
3. Chicago
4. Houston
5. Pittsburgh
6. Cincinnati
The Cardinals have the potential for implosion, but there’s nobody in this division who I see as ready to fill the void, even if the Cards drop to around 88 wins. I was on the Brewers’ train last year, picking them third, and I am downright bullish on their future prospects, but I don’t think it is their time yet. The Cubs could win if the stars align and Prior, Wood, and Zambrano all pitch a full season like they have the potential to (I realize Zambrano has a record of good health, but given the way pitchers go and fate’s seeming opposition to the Cubs, would anyone truly be surprised to see him break down with the other two having Cy Young-type seasons? I know I wouldn’t), but the odds of that aren’t that great and they sure won’t win because of their offense. You can’t expect Pettitte and Oswalt to be as spectacular as they were a year ago; then you take away Clemens, do nothing to improve a weak offense, and the Astros don’t look so hot. But keep in mind that I picked them in fourth last year as well. The Pirates are picked fifth only because the Reds are no good either. The trade of Pena, who while far from a sure thing has had some decent major league success at a young age and seems to have worlds of talent, for a middling pitcher like Bronson Arroyo, makes me unenthusiastic about the Wayne Krivsky era being any better than that of Dan O’Brien.

NL WEST
1. San Diego
2. Los Angeles
3. San Francisco
4. Arizona
5. Colorado
I thought about this one a lot--are the superhuman powers of Barry Bonds enough to overcome the myriad of problems with the Giants? Are the Padres improved over last year (I’m inclined to think yes), but if so, does that mean they’ll win 85 games? Will the Dodgers, with many of the same players I picked to win the NL pennant last year, rebound? Can the Diamondbacks defy Pythagoras again? Can a division look this bad two years in a row? As an aside, I mentioned earlier that I was bullish on the Brewers’ future. I will disclose here, for the first time, that I was once bullish on the Rockies’ future. The fact that you do not need me to tell you when exactly I thought that to know that I was colossally wrong says a lot about the Rockies’ organization, and why I consider them the most boring team in baseball.

WORLD SERIES
Oakland over St. Louis
Although you could spin the bottle for the NL pennant if you wanted to.

AL ROY: Craig Hansen, BOS
The theory here is that Hansen is thrown in as the closer and gets thirty saves and gets tons of pub. You may not buy it; I’m not sure I do.
AL CY: C.C. Sabathia, CLE
Sabathia has never emerged as the ace the Indians and their fans have hoped for, but he is still just 26 and had a 4.02 eRA last year; improve a bit, get some run support, win 20 games, team wins the division, win the Cy? Unlikely, but possible.
AL MVP: Eric Chavez, OAK
Chavez bounces back, A’s look like best team in league, he gets MVP. Again, I’m trying to avoid the real obvious picks here.
NL ROY: Ryan Zimmerman, WAS
Would Jeremy Hermida get ignored because he plays for the league’s worst team, or get sympathy votes because of it?
NL CY: Jake Peavy, SD
Not an out-on-a-limb pick at all, but he’s good and he hasn’t won it yet, so why not?
NL MVP: Albert Pujols, STL
Despite what I said above, he’s too good and too consistent not to go with the obvious pick.
Most annoying story of the year: Barry Bonds, home run record, BALCO, steroids, Bud Selig, Fay Vincent, etc.
Funniest story of the year: Alfonso Soriano, Jim Bowden, Jose Vidro, left field, second base, etc.
Most predictable headlines of the year: “Rockies remain mired in mediocrity”; “Rays name change met with yawns”; “Pujols records 120th RBI”; “Guillen’s comments spark controversy”; “Selig to remain commissioner for an additional two years”; “World Baseball Classic blamed for [fill in the blank]”
First manager fired: Buck Showalter, TEX
Over/under on number of games NL West winner wins: 87
Over/under on number of Bonds homers: 28
Over/under on number of Marlins losses: 100
Over/under on number of posts on this blog between now and offseason: 15

Wednesday, March 22, 2006

Review of "Baseball Prospectus 2006"

I assume that most of the readers of this blog are familiar with the Baseball Prospectus, and know whether it is something you want to read or not, and don’t really need somebody to tell you what they think. So this review will simply point out a few changes in the book and discuss some of the sabermetric articles in it.

They have cut back on the number of player comments this year ever so slightly, which is actually IMO a good thing. Personally, I got tired of reading somebody try to come up with something insightful to say about another middling pseudo-prospect. Instead, they have added a little section at the end of each team’s chapter which lists stat lines for several other batters and pitchers and a little sentence or two about each of them.

One slightly annoying thing is that they have cut back on the number of stat lines displayed for a player that only played with a team for a brief period, like a cup of coffee in AAA or the majors. It is kind of annoying to read a comment about some guy that says something like "He showed good command in a brief September call-up", and then look up and not see the numbers for yourself. Also, if a guy split time at a level between a couple of teams, it would make sense to combine the numbers if one of the stat lines falls below your stand-alone standard.

Also interesting to note is that EQA has reclaimed its position in the book, while MLVr is only shown in its non-translated form. I am not crazy about either of these metrics, as EQA distorts the scale and MLVr is based on the Basic RC model, but it is interesting how their fortunes in print at BP have varied over the years.

The articles in the back of the book are a mixed bag. The article on the off-field business aspects of the game in 2005, written by Andrew Baharlias, is excellent. According to his bio in the back of the book, he is a former counsel to the Yankees, so he knows his stuff.

Then there is "Injury Accounting", by Thomas Gorman. This piece attempts to quantify the effects of injuries, but is quite dull for my money’s worth. It simply assumes that the player will be replaced by a replacement-level player while ignoring chaining, uses a combination of actual performance and PECOTA projection to estimate how the player would have performed if healthy, and uses days on the DL to estimate games missed due to injury. I am not saying that any of the stuff is bad or wrong, just that it is pretty straightforward and doesn’t break any new ground. Chaining is a difficult thing to quantify, and so I appreciate the fact that it was ignored for simplicity’s sake, but it is too big of a problem to ignore on the injury front I think. If they could have modeled chaining, even rudimentarily, then this would be a grand slam piece.

Keith Woolner has a piece on Win Expectancy, expanding upon his work from last year. This article is interesting, but WE is hardly an unexplored topic, so it is most interesting for some of the results that are generated, like breakeven percentages for stolen bases in different leagues. Definitely a worthwhile article, but there is lots of other good work being done with WE by others as well.

The other "fungo" is Gary Huckabay’s article entitled "Where Does Statistical Analysis Fall Down? Reality and Perception". This piece did not sit well with me at all. To be fair, one must keep in mind that the main focus of it is the usefulness and implementation of "performance analysis" in Major League front offices. So some of the comments are directed towards the use of statistical/sabermetric/performance analysis in that context.

Still, the piece comes off as dismissive of much of the sabermetric community. For example, "First, throw the term ‘sabermetrics’ out the window. It’s slippery, doesn’t describe anything of substance, and trivializes the nature of serious analysis." Okay, then. Sabermetrics describes nothing of substance. Now perhaps it is true, a front office may not really care about the exact Pythagorean exponent that should be used, or some similar thing that sabermetricians like myself care about. I accept that, and quite frankly don’t care. Maybe another couple of quotes will help me explain further:

"The baseball analysis ‘community’ lacks standards; people self-publish their work and feel confident that they’re qualified to offer advice on multi-million dollar transactions."

Again, there is an element of truth to this, but is this not true of just about everything in life, not just "baseball analysis"? Aren’t there people who’ve never spent a day in the military who feel compelled to give military advice, people who’ve never taken an economics course who decry "price gouging" every time the gas price goes up by a nickel? Now there is no reason why one should heed the advice of many of these self-appointed experts, but such a dismissive attitude towards all commentary by non-experts insulates the industry in question from any criticism. Perhaps only soldiers should comment on the military, only gas station owners on gas prices, and only actors and professional movie critics on movies. This all sounds fine and dandy until you realize that nobody is an expert on everything, and you will be silencing yourself on some matter that interests you, be it politics or military strategy or whether Ben Roethlisberger made it across the goal line in the Super Bowl.

Of course, the bit about self-publishing is sort of funny as well, because self-published outlets like Baseball Think Factory or blogs have more interaction and out-in-the-open peer review than does the Baseball Prospectus. Who has ever been allowed to review PECOTA? Heck, in their Baseball Between the Numbers book, they don’t even give formulas for things as elementary as EQR and the Pythagenpat exponent! I understand the need to keep things proprietary, but if you’re going to do that, don’t turn around and lecture others who publish their results in the open and solicit rebuttals and debate as "lacking standards".

"There is excessive attention paid to the ‘academic’ race, refining a model to another 1% of precision, without regard to its utility for making decisions that will actually help a ballclub, or the enormous error bars inherent in the entire exercise."

This one could have been aimed at me (note I am not suggesting that it is, because it's obviously not, and there are plenty of other people, some of whom people have actually heard of, to whom it could apply; what I mean is that it is aimed at people with sabermetric interests similar to mine). It presumes that the only purpose of doing sabermetric-type research would be to help a ballclub. I’ve never made any pretense that using RPG^.285 as a pyth exponent rather than "2", or similar things, will have any tangible consequences for running a team. Some things are worth knowing for extreme theoretical situations, to people who are interested in theoreticals. Knowing how many runs it will take to win a game in a 25 RPG context may not interest a major league team. But neither will knowing pi to one thousand digits interest someone who needs to know the area of a circle. Yes, that is an academic pursuit, but that is what academics do. Maybe I should pretentiously call myself an academic sabermetrician in order to avoid confusion and avoid giving the impression that I secretly want total control over the Rangers organization.
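For readers unfamiliar with the exponent in question, here is a minimal sketch of how a run-environment exponent like RPG^.285 differs from a fixed exponent of 2. The function names and the sample scoring levels are made up for illustration; only the .285 power and the 25 RPG context come from the discussion above, and RPG is assumed to mean runs per game by both teams combined.

```python
def pyth_exponent(runs_per_game, power=0.285):
    """Run-environment exponent: RPG ** .285 (RPG = both teams' runs per game)."""
    return runs_per_game ** power


def expected_w_pct(runs, runs_allowed, exponent):
    """Pythagorean expected winning percentage for a given exponent."""
    return runs ** exponent / (runs ** exponent + runs_allowed ** exponent)


# Hypothetical per-game scoring: an ordinary context and an extreme 25 RPG one.
for r, ra in [(5.0, 4.5), (13.5, 11.5)]:
    x = pyth_exponent(r + ra)
    print(f"RPG={r + ra:4.1f}  x={x:.3f}  "
          f"W%(x)={expected_w_pct(r, ra, x):.3f}  "
          f"W%(2)={expected_w_pct(r, ra, 2):.3f}")
```

At an ordinary scoring level the exponent comes out near 1.9, so it hardly matters which version you use; at 25 RPG it is about 2.5, which is the kind of extreme theoretical case where the difference shows up.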

Now the point about error bars is one that I agree with, to a point--there will be sizeable error in all estimates, but that is no reason that those with the patience and interest should not endeavor to make the initial estimate as precise as is possible, and as well-reasoned as is possible, and as applicable across a wide range of contexts as is possible.

Of course, this is yet another funny criticism to be coming from Baseball Prospectus, since it is them who published an accuracy study showing EQR to be slightly more accurate since 1871 than BsR, XR, or other choices of run estimators. It is them who have a six page article in the very same book about optimizing PECOTA, their projection method, which is the most error-prone activity of all--trying to forecast future performance of individual human beings trying to hit a little white sphere going 90 miles an hour with a thirty-four ounce piece of wood. It is them who in the next article (the Woolner article on WE) print results of a regression slope to five places and the intercept to six places. I realize that Baseball Prospectus is made up of individuals and is not monolithic--however, there is no acknowledgement that some of these criticisms could easily be applied to members of his group and their work.

Some readers have always found the BP to be unduly arrogant. I have never really shared that opinion, but I think Gary Huckabay’s piece is. However, if you have read BP in the past and enjoyed it, you should not let that stop you from reading it this year.

In closing though, the book has a quote from an Esquire review on the cover calling it the "heir" to the Baseball Abstract. For many years, this may have been true, because since the BBBA went under in 2001 there have been no other sabermetric annuals published, so the title would fall to BP by default. But that is no longer the case. While not as true to the Abstract format as the BBBA was, there is no doubt in my mind that the Hardball Times Annual is the current book that best encapsulates the spirit of the Abstract. This is not an inherently good or bad thing; the books have different purposes and different target audiences. But if you’re looking for which is more like the Abstract, it’s not even close.

Saturday, March 11, 2006

WBC and Other Thoughts

I have been enjoying the World Baseball Classic immensely...thanks to a local station picking up some of the games shown on tape delay on ESPN, I have been able to watch and score thirteen games. While many of the games have been clunkers, I have enjoyed watching players from foreign leagues who nobody had ever heard of before. The games should be better as the bottom feeders have been weeded out.

There were no real shockers in the eight teams that advanced--I correctly predicted six of them, not that this is any great feat--it is probably below average in fact. I thought for some reason that Canada would edge out Mexico, which wasn’t a horrific pick as both went 2-1, although Mexico trounced Canada head-to-head. And I also thought that the Cubans were over-hyped and not at the caliber of the other power countries--and you could glean support for that viewpoint from their drubbing at the hands of Puerto Rico, but I overestimated the Panamanian team and had no idea they would get no-hit and mercied by the Dutch. Unfortunately that game was not one of the ones I got to see, so I have yet to see a no-hitter from start to finish.

The favorites to advance from this round would have to be the US, Japan, DR, and Venezuela, although Canada showed us that anything can happen. It is only a three game pool, so nothing would be shocking. But I think Korea, Mexico, and Cuba are clearly inferior on paper to their pool opponents. Puerto Rico could surprise, though, although their pitching is not as deep as the Dominican and nowhere near the caliber of Venezuela. But I think they definitely have the best chance to beat out one of the favorites.

On Cuba, I think that the love of Yulieski Gourriel is just a bit excessive. Sure, he looked awesome against Panama and the Netherlands, but Puerto Rico held him in check. He certainly looks like a very talented player, and a great prospect, but some have made claims like he could be a major league star right now, and I just don’t see it. Logically, without considering Gourriel in particular, he is twenty-one years old. How many 21 year olds have ever been among the best players in baseball? Sure, there are some: ARod, Mantle, Kaline, Pujols, and others. But most of those guys are among the truly elite players in the history of the game. The likelihood of any particular player being in this group is small. Like I said, I would think Gourriel would be a top major league prospect, but not a star in the majors. Yet.

I think that what you will see in the next round is that Cuba has no idea what they are up against when playing elite players. Their dominance of international competition has come against amateurs and minor leaguers. I think that their isolation from the rest of the baseball world severely hurts them. Cuba has a population of 11 million, while the Dominican has 9 million. There is no doubt that Cuba is as baseball-mad as the Dominican is, and so I have little doubt that Cuba could be a talent source like the Dominican Republic is. But I also think that their relative isolation from the rest of the baseball world can not be good for their players’ development. This also showed with the bush league jawing of their catcher at every call and the batter, I believe it was Urrutia, who felt the need to make a hand signal to the umpire telling him the pitch was outside. I would have drilled him in the ear.

Also, I would not want to be a Cuban pitcher under any circumstance. They have had them on an incredibly short leash. Against Puerto Rico, they ran through four pitchers in the fourth inning. The one poor guy, Suarez, came in with the bases loaded, issued a four pitch walk, and was yanked. That said, the Cuba/Panama game was as gripping and exciting as any game you are likely to see this season.

On other baseball issues, I still do not care about Barry Bonds and steroids. Sorry.

Cuban government officials think that they can control fans’ signs in San Juan. Sorry.

The Book by Tango, MGL, and Andy Dolphin is good. You should get it. I’ll probably post a review eventually, probably eight months after it came out like most of my other reviews have been.

Al Leiter being on the national team of the United States is like...I don’t know, I can’t think of a bad enough analogy. I’ll leave it at, it’s real bad.

Brian Kenny, who did play-by-play for some of the games in San Juan, is one of the worst baseball commentators I have ever heard. He makes me plead for John Kruk and Harold Reynolds. First, he seems to have a chip on his shoulder, a real negative outlook on just about everything and everybody. Second, he several times referred to sabermetrics while making outlandish statements that are not supported by sabermetrics. The one that sticks out was when he complained about the Netherlands batting Andruw Jones fourth instead of third, saying that this does not make “sabermetric sense” because over the course of a season, the third-place hitter will get 50 more plate appearances. Now leave aside the issue of batting order construction for a moment, and just consider the claim. 50 more plate appearances? For one lineup slot? That does not pass any logical check, let alone actually looking it up like a sabermetrician would do. Thanks to The Book, though, I have a handy table that shows me that the third place hitter gets .11 more PA/game than the cleanup man, and that this difference is pretty constant down the lineup. .11 PA/game is about 18 over the course of a season--it would take nearly 3 lineup slots to get a 50 PA difference.
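The arithmetic behind that check is short enough to spell out. This is just a sketch of the calculation; the 162-game schedule is my assumption for "a season", and the variable names are made up.

```python
GAMES_PER_SEASON = 162   # assumed full major league schedule
PA_GAP_PER_SLOT = 0.11   # PA/game between adjacent lineup slots, per the table cited above

# Extra plate appearances over a season from moving up one lineup slot.
season_gap = PA_GAP_PER_SLOT * GAMES_PER_SEASON

# How many lineup slots you would have to move up to gain 50 PA.
slots_for_50 = 50 / season_gap

print(f"PA gained per season by moving up one slot: {season_gap:.1f}")
print(f"Lineup slots needed for a 50 PA gap: {slots_for_50:.1f}")
```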

The last piece on rate stats is flawed because I did not use custom LW for the players I added to theoretical teams. I was fully aware of how to do this, of course, I just figured it wouldn’t be necessary. What I failed to think of is that changing the LW value of a home run by even .05 makes a huge difference when your player hits 200 home runs. So whenever I get around to part seven, I will start by going back over those examples.

Friday, March 10, 2006

Rate Stat Series, pt. 6

We have seen previously that R/PA does not properly take into account the effect of avoiding outs or creating more PA, and that R/O overstates the importance of avoiding outs for an individual by treating an individual as a complete lineup. With the two most intuitive candidates for a proper individual rate stat disqualified, where do we turn next?

It seems only natural that some sabermetricians decided to look back at where they started, with R/PA, and try to make adjustments to it to correct the problem that it has. The first published work that I personally saw that took this approach was done by the poster “Sibelius” in 2000 on FanHome. Sibelius saw the problems in both R/PA and R/O, but was frustrated that others did not share his concerns with R/O. So he published his own method which was based on a modification of R/PA to include the effect of extra PA.

His approach began with the truism that each additional out a player avoided would save the team average runs/out from being lost. So by simply comparing the out rate of the player to the out rate of the team, and multiplying by the number of PA for the player, you have a measure of how many outs he has avoided. Then each of these is valued at the team runs/out figure:
Runs Saved = (NOA - TmNOA)*PA*TmR/O

In Part 3 of the series, I used a hypothetical team with a .330 NOA, .12 R/PA, and .179 R/O. Suppose we had a player on this team with a .400 NOA in 550 PA. He would make (.4-.33)*550 = 38.5 fewer outs than an average hitter, and these would be worth 38.5*.179 = 6.89 runs.
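The Runs Saved calculation for that hypothetical player can be sketched as a small function. The function and argument names are mine, not Sibelius's; the inputs are the figures from the example above.

```python
def runs_saved(noa, tm_noa, pa, tm_r_per_out):
    """Runs Saved = (NOA - TmNOA) * PA * TmR/O."""
    outs_avoided = (noa - tm_noa) * pa   # outs avoided relative to a team-average hitter
    return outs_avoided * tm_r_per_out   # each avoided out valued at team runs per out


outs_avoided = (0.400 - 0.330) * 550
saved = runs_saved(noa=0.400, tm_noa=0.330, pa=550, tm_r_per_out=0.179)

print(f"Outs avoided: {outs_avoided:.1f}")
print(f"Runs saved:   {saved:.2f}")
```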

To incorporate these into a rate stat, Sibelius simply added them to the basic Runs Created figure, and divided by Plate Appearances. So this stat is just R/PA PLUS the effect of avoiding outs. And so I will call it R+/PA.

Incidentally, I independently developed this approach shortly after Sibelius posted it. Independently is probably a bit of a stretch because I had read his work and agreed with his ideas--I just did not realize that the specific approach I developed was mathematically equivalent to his. My approach was to calculate the number of extra PA the player had generated (through a technique like that described in Part 2 of this series) rather than the number of outs he avoided, and then to value each extra PA at the team R/PA. But as Sibelius pointed out to me, this produced identical results to his simpler approach.

So how does this do with the hypothetical players we have looked at before? In Part 1, we found that R/O rated a player who, when added to an otherwise average team, would score 5.046 R/G ahead of a player whose team would score 5.523 R/G. That first player draws 200 walks and makes 100 outs, while the second player hits 150 homers and makes 350 outs. They are added to a team with a .330 OBA, .12 R/PA, and .179 R/O as above.

In this case, Player A has a .667 OBA, and will save (.667-.33)*300*.179 = 18.08 runs, while Player B with his .300 OBA will save (.3-.33)*500*.179 = -2.69 runs. Player A had 54 RC to begin with, so he has 72.08 R+, or .240 R+/PA. Player B had 184 RC to begin with, for 181.31 R+, or .363 R+/PA. This is the “right” decision, as Player B’s team scored more runs. R/O comes to the opposite conclusion, that Player A was more valuable.
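Here is a minimal sketch of the R+/PA calculation for Players A and B, set next to their R/O. The helper name and constants are my own labels; the team figures and player lines are the ones used above.

```python
TM_NOA = 0.330        # team not-out average (OBA) from the hypothetical team
TM_R_PER_OUT = 0.179  # team runs per out


def r_plus_per_pa(rc, pa, outs):
    """R+/PA: runs created plus the value of outs avoided, per plate appearance."""
    noa = (pa - outs) / pa
    saved = (noa - TM_NOA) * pa * TM_R_PER_OUT
    return (rc + saved) / pa


# Player A: 200 walks, 100 outs, 54 RC. Player B: 150 homers, 350 outs, 184 RC.
player_a = r_plus_per_pa(rc=54, pa=300, outs=100)
player_b = r_plus_per_pa(rc=184, pa=500, outs=350)

print(f"Player A: R+/PA = {player_a:.3f}, R/O = {54 / 100:.3f}")
print(f"Player B: R+/PA = {player_b:.3f}, R/O = {184 / 350:.3f}")
```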

I do not want to give the impression that because R+/PA meshes with our logic in this case, it will do so in all cases. Take the case of a batter who draws 499 walks in 500 PA. His team will have an OBA of around .404 and score 6.075 R/G. This player, who I’ll call Player C, has a “+” figure of 59.79 runs, plus 499*(1/3) = 166.33 RC, for a R/PA of .333, R/O of 166.3, and a R+/PA of .452.

Suppose that we have another player, D, who hits 170 home runs and makes 330 outs in 500 PA. At 1.4 runs, we’ll credit him with 238 RC, but his generation of PA is worth just .895 runs. He winds up with .476 R/PA, .721 R/O, and .478 R+/PA. But his team will have an OBA of “only” .331 and we expect them to score about 6.011 R/G.

So Player C is more valuable in this case, but has a lower R+/PA, although admittedly both the R/G and R+/PA differences are fairly small. His R/O, though, is wildly ahead of Player D’s, to an extent that does not at all reflect the impact they have on their team’s scoring. R/O comes to the “right” decision here, but again, the difference between the two players is way out of proportion with the impact they have on their team’s offense.
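To see the three rate stats side by side for Players C and D, here is a rough sketch that recomputes them from the inputs given above. The function name and layout are mine; the team R/G figures quoted in the text come from the earlier installments and are not recomputed here.

```python
TM_NOA = 0.330        # team not-out average (OBA) from the hypothetical team
TM_R_PER_OUT = 0.179  # team runs per out


def rate_stats(rc, pa, outs):
    """Return (R/PA, R/O, R+/PA) for a batter line."""
    noa = (pa - outs) / pa
    saved = (noa - TM_NOA) * pa * TM_R_PER_OUT  # the "+" figure
    return rc / pa, rc / outs, (rc + saved) / pa


# Player C: 499 walks at 1/3 run each, 1 out. Player D: 170 homers at 1.4 runs, 330 outs.
player_c = rate_stats(rc=499 / 3, pa=500, outs=1)
player_d = rate_stats(rc=170 * 1.4, pa=500, outs=330)

for name, (r_pa, r_o, r_plus_pa) in [("C", player_c), ("D", player_d)]:
    print(f"Player {name}: R/PA = {r_pa:.3f}  R/O = {r_o:.3f}  R+/PA = {r_plus_pa:.3f}")
```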

From these results, perhaps you will agree with me if I state that R+/PA is a sort of third way between R/PA and R/O, that combines strengths and weaknesses. But I would not claim that it is the “correct” rate stat. We would expect a correct stat to always agree with the result of adding a player to a team, because that is how I defined the term “correct” in part 5.

But then again, the rate stat is just one component of our evaluation of a batter. The other is our value stat, which we have assumed is Runs Above Average for the sake of this discussion. So how do the RAA figures based on R+/PA differ from those based on R/O? RAA based on R/O is, in this case looking at the team as the base entity, (R/O - TmR/O)*O. RAA based on R+/PA is (R+/PA - TmR/PA)*PA. So, based on R/O:
Player A has RAA = (54/100 - .179)*100 = +36.1
Player B has RAA = (184/350 - .179)*350 = +121.35

Based on R+/PA, we have:
Player A: RAA = (.240 - .12)*300 = +36
Player B: RAA = (.363 - .12)*500 = +121.5

As you can see, the figures are nearly identical, for two pretty extreme players! They would be even closer, if not identical, had I not rounded the figures off in the process. So the only difference between rating players on R/O and R+/PA, at least against average, is the form and value that the rate stat takes--the value portions are equivalent.
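The reason the two RAA figures match is that TmR/PA = TmR/O * (1 - TmNOA), so both expressions reduce to RC - TmR/O * Outs. The sketch below checks this numerically; it deliberately uses the unrounded team R/O implied by .12 R/PA and a .330 NOA (about .1791) instead of the rounded .179, which is my choice and not something done in the text.

```python
TM_NOA = 0.330
TM_R_PER_PA = 0.12
TM_R_PER_OUT = TM_R_PER_PA / (1 - TM_NOA)  # unrounded; the text rounds this to .179


def raa_from_r_per_out(rc, outs):
    """RAA = (R/O - TmR/O) * O."""
    return (rc / outs - TM_R_PER_OUT) * outs


def raa_from_r_plus_per_pa(rc, pa, outs):
    """RAA = (R+/PA - TmR/PA) * PA."""
    noa = (pa - outs) / pa
    r_plus = rc + (noa - TM_NOA) * pa * TM_R_PER_OUT
    return (r_plus / pa - TM_R_PER_PA) * pa


for name, rc, pa, outs in [("A", 54, 300, 100), ("B", 184, 500, 350)]:
    via_outs = raa_from_r_per_out(rc, outs)
    via_plus = raa_from_r_plus_per_pa(rc, pa, outs)
    print(f"Player {name}: RAA via R/O = {via_outs:+.3f}, via R+/PA = {via_plus:+.3f}")
```

With the unrounded team figure, the two columns agree to within floating-point error for both players.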

But if two procedures yield identical values, shouldn’t they yield identical rates as well? The player has been to the plate the same number of times and made the same number of outs whether we calculate his value based on R/O or R+/PA. So why should his rate stat be different?

If you agree with this line of thinking, then you are forced to reach the conclusion that we are using the wrong rate stat. Of course, you could argue that neither R/O nor R+/PA forms the proper framework for assessing value. But even if we accept that these frameworks are flawed, we can still accept that within that faulty framework, there is a better way to express the rate stat. This is the road that we will go down in the next installment.
