## Thursday, February 09, 2017

### Simple Extra Inning Game Length Probabilities

With the recent news that MLB will test starting extra innings with a runner on second in the low minors, it might be worthwhile to crunch some numbers and estimate how the average length of extra-inning games would change under various inning-starting base/out situations. I used empirical data on the probability of scoring X runs in an inning given the base/out situation, taken from a nifty calculator created by Greg Stoll. Stoll’s description says it is based on MLB games from 1957-2015, including postseason.

Obviously, using empirical data doesn’t allow you to vary the run environment: the expected runs for the rest of the inning with no outs and the bases empty is .466, so the average R/G here is around 4.2 (.466 runs per inning times nine innings). It also doesn’t account for any behavioral changes due to game situation, as strategy can obviously differ in extra innings as opposed to a more mundane point in the game. Plus, any quirks in the data are not smoothed over. Still, I think it is a fun exercise to quickly estimate the outcome of various extra inning setups.

These results will be presented in terms of average number of extra innings and probability of Y extra innings assuming that the rule takes effect in the tenth inning (i.e. each extra inning is played under the same rules).

If you know the probability of scoring X runs, assume the two teams are of equal quality, and assume independence between their runs scored (all significant assumptions), then it is very simple to calculate the probabilities of various outcomes in extra innings. If Pa(x) is the probability that team A scores x runs in an inning, and Pb(x) is the probability that team B scores x runs in an inning, then the probability that team A outscores team B in the inning (i.e. wins the game this inning) is:

P(A > B) = Pa(1)*Pb(0) + Pa(2)*[Pb(0) + Pb(1)] + Pa(3)*[Pb(0) + Pb(1) + Pb(2)] + ….

Since we’ve assumed the teams are of equal quality, the probability that team B outscores team A is the same, just switching the Pas and Pbs. We can calculate the probability of them scoring the same number of runs (i.e. the probability the game extends an additional inning) by taking 1 – P(A > B) – P(B > A) = 1 – 2*P(A > B) since the teams are even, or directly as:

P(A = B) = Pa(0)*Pb(0) + Pa(1)*Pb(1) + Pa(2)*Pb(2) + … = Pa(0)^2 + Pa(1)^2 + Pa(2)^2 + … since the teams are even
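The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `inning_outcome_probs` and the run distribution `runs_dist` are made up for the example and are not taken from Stoll's data.

```python
def inning_outcome_probs(pa, pb):
    """pa[x] and pb[x] are the probabilities that team A and team B
    score exactly x runs in the inning (assumed independent)."""
    # P(A > B): sum over every pair of run totals where A outscores B
    p_a_wins = sum(a * b for i, a in enumerate(pa)
                   for j, b in enumerate(pb) if i > j)
    # P(B > A): same sum with the inequality reversed
    p_b_wins = sum(a * b for i, a in enumerate(pa)
                   for j, b in enumerate(pb) if i < j)
    # P(A = B): both teams score the same number of runs
    p_tie = sum(a * b for a, b in zip(pa, pb))
    return p_a_wins, p_b_wins, p_tie


# Hypothetical distribution of runs in an inning (0, 1, 2, 3, 4 runs);
# illustrative numbers only, not empirical data.
runs_dist = [0.70, 0.15, 0.08, 0.04, 0.03]

p_a_wins, p_b_wins, p_tie = inning_outcome_probs(runs_dist, runs_dist)
print(p_a_wins, p_b_wins, p_tie)
```

With evenly matched teams the first two values come out equal, and the third is the sum of squared probabilities.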

I called this P. The probability that the game continues past the tenth is equal to P. The probability that the game terminates after the tenth is 1 – P. The probability that the game continues past the eleventh is P^2; the probability that the game terminates after the eleventh is P*(1 – P). Continue recursively from here. The average length of the game is 10*P(terminates after 10) + 11*P(terminates after 11) + …
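As a rough sketch of that recursion: the game ends after inning 10 + k with probability P^k*(1 – P), which is a geometric distribution, so the infinite sum for the average collapses to 10 + P/(1 – P). The function names below are my own, and the value of P used is arbitrary rather than derived from Stoll's data.

```python
def game_length_probs(p_tie, max_inning=30):
    """Probability that the game ends after exactly inning n,
    for n = 10 .. max_inning, when every extra inning has the
    same tie probability p_tie."""
    return {10 + k: p_tie ** k * (1 - p_tie)
            for k in range(max_inning - 10 + 1)}


def average_length(p_tie):
    """Closed form of 10*P(ends after 10) + 11*P(ends after 11) + ...
    using the mean of a geometric distribution."""
    return 10 + p_tie / (1 - p_tie)


# Arbitrary illustrative value of P (not an empirical estimate)
p = 0.5
probs = game_length_probs(p)
print(probs[10], probs[11], probs[12])
print(average_length(p))
```

Summing n times the probability over a long enough range of innings should agree with the closed form to many decimal places.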

I used Stoll’s data to estimate a few probabilities of game length for a rule that would start each extra inning with the teams in each of the 24 base/out situations. For a given inning-initial base/out situation, P(10) is the probability that the game is over after 10 innings, P(11) the probability it is over after 11 or fewer innings, etc. “average” is the average number of innings in an extra inning game played under that rule, and R/I is the average number of runs scored in the remainder of the inning from Stoll’s data for teams in that base/out situation.

It will come as no surprise that, generally, the higher the R/I, the lower the probability of the game continuing. In a low scoring environment, the teams are more likely to each score zero or one run; as the scoring environment increases, so does the variance, and differences in inning run totals between the two teams are what end extra inning games. (I should have calculated the variance of runs per inning from Stoll’s data to really drive this point home, but I didn’t think of it until after I’d made the tables.)
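To illustrate the variance point without Stoll's data, here is a minimal sketch comparing two made-up run distributions, one low scoring and one high scoring. Both distributions and the helper names are assumptions for the example only.

```python
def mean_and_variance(dist):
    """Mean and variance of runs per inning, where dist[x] is the
    probability of scoring exactly x runs."""
    mean = sum(x * p for x, p in enumerate(dist))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(dist))
    return mean, var


def tie_probability(dist):
    """P(A = B) for two evenly matched, independent teams."""
    return sum(p * p for p in dist)


# Hypothetical distributions (illustrative only, not empirical)
low_scoring = [0.80, 0.12, 0.05, 0.03]
high_scoring = [0.35, 0.25, 0.18, 0.12, 0.10]

for name, dist in [("low", low_scoring), ("high", high_scoring)]:
    mean, var = mean_and_variance(dist)
    print(name, round(mean, 3), round(var, 3), round(tie_probability(dist), 4))
```

In this toy comparison the higher scoring distribution has both the larger variance and the smaller tie probability, which is the pattern described above.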

The highlighted rows are bases empty, nobody out (i.e. the status quo); runner at second, nobody out (the proposed MLB rule); runners at first and second, nobody out (the international rule, which starts from the eleventh inning; this chart assumes all innings starting with the tenth are played under the same rules, so it doesn’t let you compare these two rules directly); and bases loaded, nobody out, which maximizes the run environment and minimizes the duration of extra innings (making games beyond 12 innings as theoretically rare as games beyond 15 innings are under traditional rules). Of course, these higher scoring innings would take longer to play, so simply looking at the duration of the game in innings doesn’t fully address the alleged problems that tinkering with the rules would be intended to solve.

I did separately calculate these probabilities for the international rule: play the tenth inning under standard rules, then start subsequent innings with runners on first and second. It produces longer games than starting with a runner at second in the tenth, which is not surprising.
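A rule like this, with one tie probability in the tenth and a different one from the eleventh on, can be handled with a small change to the same geometric reasoning. The sketch below is my own formulation under the same independence and equal-quality assumptions, with hypothetical names and arbitrary input values: if the tie probability is `p_tenth` in the tenth and `p_later` afterward, the game continues past inning n with probability `p_tenth * p_later**(n - 10)`, and the average length is 10 + p_tenth/(1 – p_later).

```python
def prob_continues_past(n, p_tenth, p_later):
    """Probability the game is still tied after inning n (n >= 10):
    a tie in the tenth, then ties in innings 11 through n."""
    return p_tenth * p_later ** (n - 10)


def average_length_two_phase(p_tenth, p_later):
    """Average game length when the tenth uses one tie probability
    and every later inning uses another."""
    # With probability p_tenth the game reaches the eleventh, after
    # which the expected number of further innings is 1/(1 - p_later).
    return 10 + p_tenth / (1 - p_later)


# Arbitrary illustrative tie probabilities (not empirical estimates)
print(prob_continues_past(10, 0.5, 0.25))
print(prob_continues_past(12, 0.5, 0.25))
print(average_length_two_phase(0.5, 0.25))
```

When the two probabilities are equal, this reduces to the single-rule average of 10 + P/(1 – P).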