Tuesday, September 25, 2012

Playoff Probabilities

I am not a fan of the two wildcard playoff format, but my protestations were not considered and so it will soon be upon us. One thing I meant to look at eventually was how the second wildcard would impact the probability of teams of a certain presumed strength winning in the playoffs. I’ve never gotten around to it, so I’m pretty much forced to look into it now or forever hold my peace.

The model that I will use to discuss this is admittedly simplified. It makes assumptions that are clearly simpler than reality:

* Teams have a constant strength from game to game. Even the strictest believer in baseball games as an expression of random chance disagrees with this as the identity of the starting pitcher obviously matters.

* Game outcomes are completely independent of one another. While game outcomes are largely independent, the argument for independence is weakened in the playoffs where decisions about how to manage the game (particularly pitcher usage) are clearly influenced by the status of the series.

* Home field advantage is uniform for all teams.

With these assumptions (along with others that have gone unstated), it is easy to construct a model of a playoff series. If one ignores home field advantage, the binomial distribution makes it very easy. That is the way I’ve approached these problems in the past, but I’ve decided to consider home field advantage and make the computations a tad more arduous this time. (Of course, as you incorporate HFA into analysis of playoff series, you realize how inconsequential it is barring some intangible psychological force.)
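For the simpler case that ignores home field advantage, the binomial calculation can be sketched in a few lines of Python. This is only an illustration of the idea, not the spreadsheet itself; the function name `series_win_prob` and its parameters are made up for this example, and it leans on the same assumptions listed above (constant strength, independent games).

```python
from math import comb


def series_win_prob(p, best_of=7):
    """Chance of winning a best-of-N series given a constant per-game win probability p.

    Hypothetical helper: assumes independent games and no home field advantage.
    """
    need = best_of // 2 + 1  # wins required to clinch (4 in a best-of-seven)
    # Pretend all N games are played even after the series is decided.
    # The team wins the series exactly when it wins at least `need` of them,
    # so the answer is the upper tail of a binomial distribution.
    return sum(
        comb(best_of, k) * p**k * (1 - p) ** (best_of - k)
        for k in range(need, best_of + 1)
    )


# Example call with a made-up per-game probability
favorite_in_seven = series_win_prob(0.55, best_of=7)
```

Playing out the unneeded games does not change the answer, which is why the plain binomial tail is enough here. Once the per-game probability varies with the site of the game, that shortcut no longer applies directly.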

I have set up an (excessively) clumsy spreadsheet to do the math. The spreadsheet allows you to enter the playoff teams in order of seeding (i.e. A1 is the AL #1 seed down to N5, the NL’s second wildcard, with the National League assumed to have home field advantage--if the AL does, just enter the AL teams as N1-5 and the NL teams as A1-5) and enter a strength rating for each team (in the form of a win ratio as I use here). It then calculates the probability of each potential playoff series and their outcomes. The probability of each series outcome is figured on the other tabs, and if you are so inclined you could alter the home field pattern for each round. You can also enter the average W% for the home team. I’ve set this to .573, which is the World Series average for 1922-2008. The regular season average is usually around .540, so I think this is a fairly generous assumption in terms of strength of HFA.
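The spreadsheet's formulas are not reproduced here, so the following Python sketch is just one way to do the same kind of calculation. It assumes the single-game probability comes from the ratio of the two teams' win ratios multiplied by the home team's odds, hfa/(1 - hfa). That matchup formula, the function names, and the "H"/"A" pattern strings are all assumptions for illustration and may not match what the spreadsheet actually does.

```python
def win_ratio(w_pct):
    """Convert a winning percentage to a win ratio, W% / (1 - W%)."""
    return w_pct / (1 - w_pct)


def game_prob(ctr_a, ctr_b, hfa, a_is_home):
    """Chance that team A wins one game (assumed odds-ratio matchup formula)."""
    home_odds = hfa / (1 - hfa)  # odds for an average home team
    odds = (ctr_a / ctr_b) * (home_odds if a_is_home else 1 / home_odds)
    return odds / (1 + odds)


def series_prob(ctr_a, ctr_b, hfa=0.573, pattern="HHAAAHH"):
    """Chance that team A wins a series; pattern gives A's site for each game.

    "H" means A is at home, "A" means A is away. The default is a 2-3-2
    best-of-seven with A holding home field; pass "H" for a one-game playoff.
    """
    need = len(pattern) // 2 + 1
    states = {(0, 0): 1.0}  # (A wins, B wins) -> probability, series undecided
    a_wins_series = 0.0
    for site in pattern:
        p = game_prob(ctr_a, ctr_b, hfa, site == "H")
        next_states = {}
        for (wa, wb), prob in states.items():
            if wa + 1 == need:
                a_wins_series += prob * p  # A clinches with this game
            else:
                key = (wa + 1, wb)
                next_states[key] = next_states.get(key, 0.0) + prob * p
            if wb + 1 < need:  # otherwise B clinches and the path ends
                key = (wa, wb + 1)
                next_states[key] = next_states.get(key, 0.0) + prob * (1 - p)
        states = next_states
    return a_wins_series


# Hypothetical strengths, not the averages discussed in this post
example = series_prob(win_ratio(0.600), win_ratio(0.550))
```

Tracking the series score game by game, rather than using a single binomial, is what lets the per-game probability change with the site. Changing the home field pattern for a round is then just a matter of passing a different string.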

The yellow cells are where the user should input custom data. I’ve not provided full documentation for every step as I doubt anyone will actually use this spreadsheet, but if you do and have any questions I will be happy to expound on the documentation. The spreadsheet can be accessed here (change html at the end to xls to download in Excel format).

In the post linked above, I looked at the winning percentages for all playoff teams and theoretical second wildcards for 1995-2010. I’m going to use these averages to set up a theoretical “typical” playoff scenario, and see how the probability of each team advancing to a certain round varies with and without the second wildcard team. Using the actual W%s without adjustment to represent the strength of the teams is wrong for a couple of reasons, most notably that some regression is needed to estimate true quality and that no adjustment has been made for the unbalanced schedule, which is a big concern when performing interleague comparisons (and a smaller but still present concern for intraleague comparisons). However, exaggerating the differences in quality between teams will produce a liberal estimate of the differences between playoff formats, which may not be terrible for the sake of discussion. Also note that I’ve not made any adjustment for the fact that under the old format, the wildcard could be matched up with the #2 seed if the #1 seed came from their division.

Here are those average W%s and the resulting CTR (simply W%/(1 - W%) in this case) for each seed:

To rehash that earlier post, much of my antipathy towards the second wildcard and the fetishization of division titles is on display here. The AL wildcard has typically been one of the strongest playoff teams, while the wildcard in both leagues has a better average record than the third division winner. If MLB is hellbent on allowing a fifth team into the playoffs, then I would propose making the playoff between the two qualifying teams with the worst record rather than the two that failed to win their division. This complaint is water under the bridge at this point, though.

First, let’s run through the playoffs with a standard home field advantage (I’m using .543) and just one wildcard:

Now the same scenario, but with a special increased playoff HFA of .573. The last column is the marginal number of World Series victories per 1000 seasons relative to the .543 HFA assumption:

Remember, I’ve given the National League World Series HFA, so the NL teams get more of a boost from assuming a stronger HFA than do the AL teams. The NL picks up 9.6 World Series victories per 1000 seasons as a result of this stronger HFA assumption.

More relevant to the point of this post, here is the effect of adding the second wildcard to the mix. Again, let me emphasize that I’m assuming that there is no difference in expected W% from game-to-game. This is particularly relevant for the wildcard teams as one of the purported benefits of the extra playoff is that it will put the winner at a disadvantage entering the Division Series in terms of pitcher availability, since they will clearly have an incentive to use their best available pitching in the wildcard game. I’m not saying that you shouldn’t attempt to model this and other game-to-game factors when assessing playoff probabilities--but doing so complicates the exercise considerably. Instead, think of what I’m doing here as simply an analysis of the format itself rather than the consequences of that format upon the teams. These probabilities are based on the .573 HFA assumption. The last column is the marginal number of World Series victories per 1000 seasons relative to the single wildcard:

Keeping in mind that this analysis is not a true comparison of the previous format to the current one as it doesn’t account for the wildcard being unable to face a divisional opponent in the Division Series, this actually makes me feel a little better about the two wildcard format, as it increases the probability of the best teams (#1 and #2 seeds, although the AL wildcard is generally in that class as well) winning the World Series. #1 seeds get an easier Division Series matchup by getting the second wildcard roughly 40% of the time. Obviously the #5 seed benefits the most, going from out of the picture to being a long shot. Equally obviously, the first wildcards take a huge hit.

The interesting result is the decrease in the probability of winning the World Series for the #3 seed, the reason for which is not immediately obvious. The cause is the increased likelihood of facing the #1 seed in the LCS (and the increased likelihood of facing the other league’s best teams in the World Series).

Still, the effects on the division winners’ odds are relatively small, not much different from the effect of assuming that the typical playoff HFA is .03 wins greater. The brunt of the impact is felt by the first wildcard.

Tuesday, September 11, 2012


* There is always some grumbling about September roster expansion, and the supposed ills it inflicts on the game, but it seems to have reached a fever pitch in September 2012. There are a lot of calls for some kind of reform, whether it involves severely curtailing the practice or (most popularly) forcing a manager to declare 25 active players at the start of every game.

I don’t have a problem with roster expansion, myself--my preference would be to keep the status quo. I will also admit to not having read all of the pieces that have been written about this, so it is quite possible that someone has prominently beaten me to the punch with the following suggestion. Rather than having the manager declare a 25 man active roster, why not simply limit the manager to using 25 players in a particular game?

Such a rule would give a manager in-game flexibility that would be absent in the case of a pre-declared scratch list. In a close game, he would be free to use extra players in situational roles. In a blowout, he would be able to use his mop up relievers and get young players into the game. But he would not be able to use any more players than he could during the rest of the season.

There are a couple of drawbacks to this rule that come to mind. One is that 25 players is still an increase over the number that is typically deployed during the rest of the season, even in fairly unusual cases. The four excess starting pitchers usually are excluded from game action, especially in the American League. I’d argue that this is a good thing--it allows for some additional substitutions while preventing abuse, but if one is concerned about any change in the behavior of managers, it’s a valid criticism.

The other is a logistical issue rather than a baseball issue: the limit would be a little harder to keep track of. Announcers would be completely bewildered as a team approached the substitution limit, and while the omnipresent lineup cards should be sufficient for managers and umpires to keep up, it’s not hard to imagine some confusion arising.

* Everyone has an opinion on Stephen Strasburg. I don’t, really--I certainly agree with the principle of being cautious with pitchers, particularly young pitchers, those who have prior injury histories, and those of extraordinary talent--all three of which fit Strasburg. What I do have an opinion about is the intestinal fortitude of Mike Rizzo and anyone else who took responsibility for the final decision. I consider it a pretty bold stance to take, given that there is almost no outcome in which they do not receive heavy criticism.

If Washington fails to win the World Series, the question of how they might have performed with Strasburg will be raised incessantly. Even a quick sweep in the Division Series in which one win would not have stemmed the tide will not get them off the hook, because psychological factors will be raised (“Strasburg could have won game one and completely changed the momentum”, “The Nationals players would have been more confident with Strasburg available”, etc.). And if they lose a seven game World Series in which Edwin Jackson gets roughed up a couple of times--well, if I was Rizzo, I’d consider hiring a food taster at that point. So far I’ve only discussed the 2012 on-field consequences. A bigger outcry will come if Strasburg gets hurt again, particularly if it’s in the next two years.

What’s remarkable about this decision is the near certainty with which it will be judged as a failure by mainstream observers. Perhaps my imagination is too limited, or my faith in sound reasoning on behalf of mainstream observers artificially low, but it’s difficult for me to imagine a scenario in which the shutdown is considered to be a success. The odds are very good that Washington will not win it all, with or without Strasburg. Stephen Strasburg is quite unlikely to have an injury-free career. The takeaway for me is that Rizzo must really believe he’s made the right call.

* I am an unabashed supporter of the World Baseball Classic, so I’m taking it as my duty to update you about the upcoming qualifying tournaments. The existence of these has largely gone unremarked upon.

For the first time, four spots in the sixteen team field will be up for grabs. The twelve countries that won games in the 2009 WBC are automatic qualifiers (Japan, Korea, China, United States, Mexico, Italy, Netherlands, Dominican Republic, Venezuela, Cuba, Australia, and Puerto Rico). The first two qualifiers open up next week.

The qualifiers are four-team double elimination tournaments, the format of which will be familiar to those who watched the 2009 WBC or follow the NCAA Tournament. In Jupiter, FL, South Africa will play Israel and Spain will play France. In Regensburg, Germany, Canada will play Great Britain and Germany will play the Czech Republic. In November, the other two spots will be decided. In Panama City, Panama will play Brazil and Colombia will play Nicaragua. In Taipei City, the Philippines will play Thailand and Taiwan will play New Zealand.

Handicapping these tournaments is silly (after all, the Netherlands beat the Dominican Republic twice in the 2009 WBC). Speaking broadly without knowledge of the actual makeup of the teams, there are clear favorites in the Germany and Taiwan qualifiers, as Canada and Taiwan are much stronger baseball nations than the second-tier European and Asian countries, respectively.

The other two are fairly wide open. WBC rules allow people who could be but are not citizens of a country to play, which allows Israel access to Jewish players. Israel will be managed by Brad Ausmus, and faces a very weak field, so they may be the favorite. Panama played in the first two WBCs without much success. Colombia and Nicaragua have both produced their share of major leaguers, and even Brazil now has their own big leaguer in Yan Gomes.

Just to provide a general sense of how the participating countries have performed in recent international tournaments, here is each country’s rank in the IBAF World Rankings divided by qualifier. (These rankings don’t do justice to countries with strong baseball that don’t field teams in many international tournaments such as the Dominican Republic, and obviously are based on tournament results and don’t tell one anything about the quality of the WBC team being fielded. Nonetheless, it’s kind of fun to look at--and it’s even more fun to look at the full ranking list, which includes countries such as Bolivia, Myanmar, and New Caledonia, which I had to look up on Wikipedia. Also note that the top-ranked non-participant, Netherlands Antilles, is included with the Netherlands for the WBC):

It is also worth noting that the IBAF website states that it will now bestow the title of world champion on the WBC winner rather than the soon to be defunct World Cup winner.