Tuesday, October 05, 2010

Playoff Meanderings

When I write a post to give some thoughts/impressions on the playoffs, I always like to work in a numerical example of why I feel that formally projecting the outcomes is largely a waste of time. These are always repackaged versions of the same idea (assuming independence of outcomes, constant team strength from game-to-game, no home field advantage), namely using the binomial distribution to estimate some sort of probability. Last year I estimated the probability that the Nationals would win the World Series, if only they somehow managed to make the playoffs. This year, I'm going to look at it from the perspective of "How good do you have to be to have an X% chance of winning the pennant?"

I'm going to assume that a team has a constant W% against its playoff opponents, whatever that W% might be, and that the other seven playoff participants are of equal quality. Neither will change from round-to-round. Suppose we have a .500 team. What is the probability they win the pennant given those conditions? That's easy--25%. So how good does a team have to be in order to have a 40% chance of winning the pennant? 10%? 50%?

This chart uses trial and error to estimate the W% needed to reach a given probability. Given the initial assumptions, one can use the binomial distribution to estimate the probability of winning a five-game series, then a seven-game series. I won't bother reviewing that math because I think most of you probably already know about the binomial distribution, and it is incidental to the point:



The W% listed on the chart is the one that Excel Goal Seek provides, rounded to three decimal places. The takeaway from this chart is that it's just not credible to be supremely confident about which team is going to win the pennant--and yet mainstream fans and pundits will be. To have even a 50/50 chance (given the stated assumptions, of course), a team must be a .606 (98 win) team--relative to its playoff opponents. Even if the playoff opponents are of just .525 true quality, Log5 estimates that the .606 team would be a true quality .630 (102 win) team.
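For anyone who wants to reproduce these figures without Excel, here is a minimal Python sketch under the same assumptions (independent games, constant W%, no home field advantage). The function names are made up for illustration, and a simple bisection search stands in for Goal Seek; it is not the spreadsheet itself.

```python
from math import comb

def series_win_prob(p, games):
    """Chance of winning a best-of-`games` series with per-game win probability p."""
    need = games // 2 + 1          # 3 wins for best-of-5, 4 for best-of-7
    q = 1.0 - p
    # Sum over k = number of losses suffered before the clinching win.
    return sum(comb(need - 1 + k, k) * p**need * q**k for k in range(need))

def pennant_prob(p):
    """Win a five-game series, then a seven-game series."""
    return series_win_prob(p, 5) * series_win_prob(p, 7)

def required_wpct(target, tol=1e-9):
    """Bisection stand-in for Goal Seek: W% that yields `target` pennant odds."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pennant_prob(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def true_quality(w_vs_opp, opp_quality):
    """Invert Log5: overall W% of a team that wins `w_vs_opp` of its games
    against opponents whose own true W% is `opp_quality`."""
    odds = (w_vs_opp / (1 - w_vs_opp)) * (opp_quality / (1 - opp_quality))
    return odds / (1 + odds)

for target in (0.10, 0.25, 0.40, 0.50):
    print(f"{target:.0%} pennant chance needs a W% of about {required_wpct(target):.3f}")

print(f"Log5 true quality of a .606 team vs .525 opponents: {true_quality(0.606, 0.525):.3f}")
```

The 25% row should come back as .500, which is a handy sanity check on the series formula.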

The necessary W% to achieve a given probability only increases, of course, when the goal is changed from pennant to World Series, and the team must win one five-game and two seven-game series:



I have been amazed by the number of people who should know better who seem to want to hand the Phillies the pennant. The Phillies are clearly the strongest NL team entering the post-season, but I don't for a second believe they have the average 60% chance to win each game that they would need in order to be even a 50-50 shot.
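As a rough check on that 60% figure, the sketch below plugs a flat .600 per-game W% into the same binomial setup (independent games, constant W%, no home field advantage) for both the pennant and the World Series. The helper names are illustrative only.

```python
from math import comb, prod

def series_win_prob(p, games):
    """Chance of winning a best-of-`games` series with per-game win probability p."""
    need = games // 2 + 1
    q = 1.0 - p
    return sum(comb(need - 1 + k, k) * p**need * q**k for k in range(need))

def title_prob(p, rounds):
    """Chance of winning every series in `rounds` (a tuple of series lengths)."""
    return prod(series_win_prob(p, games) for games in rounds)

p = 0.600
pennant = title_prob(p, (5, 7))          # division series, then LCS
world_series = title_prob(p, (5, 7, 7))  # add the World Series

print(f"Pennant chance at .600:      {pennant:.3f}")
print(f"World Series chance at .600: {world_series:.3f}")
```

Even a team that wins 60% of its games against playoff-caliber opponents comes out just under a coin flip for the pennant and at roughly one in three for the title.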

Over the last couple of weeks I've been working on a rating system that takes strength of schedule into account. There's nothing unique about it; countless other people have published similar systems (or the results of said systems) that are based on similar and quite likely better-conceived methodology. It is creatively called crude team ranking (CTR).

Allow me to use these as-yet unpublished ratings (I need to write up a formal explanation, and that should appear some time during the offseason) to estimate the playoff probabilities. Please don't take the results too seriously, as they have many flaws (they don't consider home field advantage, they don't estimate true talent by using regression or projected performance, and they don't consider specific team personnel). They are based 50% on actual record and 50% on expected record from R/RA and RC/RCA. Still, they offer a reasonable estimate of the various probabilities, and they are very easy to figure with the spreadsheet:



The first column is the team's overall ranking among the 30 MLB teams in CTR. The playoff teams are pretty close to being the top eight in CTR; #4 Boston and #8 Toronto did not make the playoffs. Cincinnati has the lowest odds to advance to the second round or win the World Series, but Texas has a lower estimated probability of winning the pennant (since they are guaranteed to play a very highly ranked team in the LCS). Similarly, Philly has the highest first round and pennant odds, but the prospect of a strong AL World Series opponent knocks their championship odds just below that of New York and Tampa Bay.

It's worth noting that even the team with the worst rating in the playoffs (CIN) playing the best team in its circuit is still estimated to have a 41% chance of winning the five-game series, illustrating again the intended takeaway for this post.
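To put that 41% in per-game terms, one can run the series calculation backwards. The sketch below is an illustration under the same simplifying assumptions (independent games, constant per-game probability, no home field advantage), and `implied_game_prob` is a made-up helper name, not part of CTR.

```python
from math import comb

def series_win_prob(p, games):
    """Chance of winning a best-of-`games` series with per-game win probability p."""
    need = games // 2 + 1
    q = 1.0 - p
    return sum(comb(need - 1 + k, k) * p**need * q**k for k in range(need))

def implied_game_prob(series_prob, games, tol=1e-9):
    """Per-game win probability that produces `series_prob` over a best-of-`games`.
    Uses bisection, which works because series_win_prob rises steadily with p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if series_win_prob(mid, games) < series_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

underdog = implied_game_prob(0.41, 5)
print(f"A 41% shot at a five-game series implies about {underdog:.3f} per game")
```

The answer lands at roughly .45 per game, so a short series only modestly magnifies the single-game gap between the two teams.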

Pushing the system a little bit further still, here are the estimated probabilities of each LCS matchup, along with the probability of each team winning such a series:



Again, the biggest on-paper mismatch possible (NYA/TEX) results in a series in which the underdog has a 40% chance to win, which undersells it just a little because I haven't considered HFA, which would go to Texas in such a series. For the World Series:



Now we have a series in which one team has an approximately 2/3 chance of winning (NYA/CIN). If you'd like to see a very evenly matched series, pull for Texas to pull two upsets and make it out of the AL, as the rankings consider them to be about equal with Atlanta and San Francisco, while still being close enough to Philadelphia on the plus side and Cincinnati on the down side to offer an even matchup. The highest individual World Series win probability for the NL in non-PHI series is Atlanta's estimated 46% chance to beat Minnesota.

Other miscellaneous (and equally crude) probabilities:

* Both NYA and TB eliminated in first round: 19%
* NYA, TB, PHI all eliminated in first round: 8%
* All favorites win in first round (TB, NYA, PHI, SF): 9%
* All favorites lose in first round: 4%
* World Series does not feature TB, NYA, or PHI: 26%
* American League wins World Series: 56%

(The rankings favor ATL over SF, but the consensus is definitely the reverse. The two teams' ratings are very close, and ATL has suffered key injuries that the system, based on composite regular season performance alone, can't account for.)

Some random observations:

* BP points out in their series preview that the Reds and Phillies are pretty close in third-order W% (based on RC and RC allowed and adjusted for strength of schedule). My figures agree that it's fairly close, but still have the Phillies in front. However, it is worth pointing out that it is not a matter of the Reds looking better when using what I call PW%; it's the Phillies looking worse. The Reds' W% is .562 and their PW% is .564. They won as many games as you'd expect from their RC/RCA.

* I think the impact of Philadelphia's three starters is being overstated. The Braves should be a pretty good reminder that a collection of great starters is no guarantee of post-season glory (which is not to suggest that winning five pennants is anything to sneeze at). The Astros' staff, which is being cited as the last comparable front three, lost in the LCS in '04, then was swept in the World Series in '05, outpitched by a group of lesser pitchers who just happened to be having an amazing run. The Astros of 1998 didn't have the same kind of consistent stars that the Braves, later Astros, or Phillies possess, but they did have Randy Johnson plus Mike Hampton, Shane Reynolds, and Jose Lima, and they were quickly dispatched by San Diego.

I'm not saying that great starting pitching is going to harm the Phillies' chances, but it will cause them to be overstated.

* Jon Heyman posted an article with his odds for winning the World Series, which I'll use as an illustration. I don't mean to pick on Heyman, because there are a lot of folks out there who do that already. I've presented the odds based on the charts above as well:



Heyman's odds for most of the teams are fairly reasonable; as you can see, there are three teams we disagree on significantly. Heyman gives Philadelphia 2-1 odds, which is equivalent to having a 1/3 probability. I realize I'm beating a dead horse here, but I can't stress this enough. As demonstrated above, in order to have a 1/3 chance of winning the World Series, a team would need to have an expected W% of close to .600 against their playoff opponents. It doesn't make any sort of logical sense to suggest that any recent team short of the '98 Yankees is even close to that level.

Minnesota at 20-1 is not quite as silly, but is close, for the same reasons. That's a 4.76% chance of winning, and a team would need to have a .425 expected W% versus their opponents to be that big of a longshot. This is not unique to Heyman at all--in general, people tend to overstate the probabilities of the most likely sports outcomes and understate the probabilities of the least likely.

One nice thing about Heyman's odds is that they sum to 100%, almost exactly. I'm actually impressed by that.
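The odds-to-probability conversion used above is simple enough to sketch in a few lines of Python. The function names here are illustrative, and no actual odds list is included; only the two figures quoted in the text are checked.

```python
from fractions import Fraction

def odds_to_prob(against, in_favor=1):
    """Convert 'against-in_favor' odds (e.g. 20-1) to an implied probability."""
    return Fraction(in_favor, against + in_favor)

def total_implied(odds_against):
    """Sum of implied probabilities for a list of X-1 odds.
    A self-consistent set of odds should total 1 (i.e. 100%)."""
    return sum(odds_to_prob(x) for x in odds_against)

print(odds_to_prob(2))                    # 2-1 odds
print(f"{float(odds_to_prob(20)):.2%}")   # 20-1 odds
```

Feeding a full set of eight odds into `total_implied` is a quick way to check whether they add up to 100%.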

* I will close by noting my order of preference for the world championship--you don't care, nor should you, and it has nothing to do with sabermetrics at all. But after seeing the team I was pulling against the most win four out of five World Series from 2001-2005, I find it cathartic to pre-vent:

1. NYA
2. TB
3. ATL
(I would be happy if any of those three teams won)
4. MIN
5. TEX
(I would be unhappy if any of these three won)
6. CIN
7. SF
8. PHI
