Monday, December 18, 2017

Crude Team Ratings, 2017

For the last several years I have published a set of team ratings that I call "Crude Team Ratings". The name was chosen to reflect the nature of the ratings--they have a number of limitations, of which I documented several when I introduced the methodology.

I explain how CTR is figured in the linked post, but in short:

1) Start with a win ratio figure for each team. It could be actual win ratio, or an estimated win ratio.

2) Figure the average win ratio of the team’s opponents.

3) Adjust for strength of schedule, resulting in a new set of ratings.

4) Begin the process again. Repeat until the ratings stabilize.
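The four steps above can be sketched roughly in Python. This is only an illustration under stated assumptions: the function name, the three-team schedule, and in particular the adjustment rule (multiplying each team's win ratio by the games-weighted average rating of its opponents) are assumptions of this sketch, since the exact formula lives in the linked post rather than in this summary.

```python
def crude_team_ratings(win_ratio, schedule, tol=1e-9, max_iter=10_000):
    """Iterate schedule-adjusted win ratios until they stabilize.

    win_ratio: dict mapping team -> W/L ratio (actual or estimated).
    schedule:  dict mapping team -> {opponent: games played}.
    The adjustment rule (win ratio times schedule strength) is an
    assumption made for this sketch, not a documented formula.
    """
    def rescale(r):
        # Force the arithmetic average rating to 100.
        mean = sum(r.values()) / len(r)
        return {t: 100.0 * v / mean for t, v in r.items()}

    # Step 1: start from each team's win ratio.
    ratings = rescale(dict(win_ratio))
    for _ in range(max_iter):
        new = {}
        for team, opponents in schedule.items():
            games = sum(opponents.values())
            # Step 2: games-weighted average rating of the team's opponents.
            sos = sum(ratings[o] * g for o, g in opponents.items()) / games
            # Step 3: adjust the win ratio for strength of schedule.
            new[team] = win_ratio[team] * sos
        new = rescale(new)
        # Step 4: repeat until the ratings stop changing.
        if max(abs(new[t] - ratings[t]) for t in new) < tol:
            return new
        ratings = new
    return ratings


# Made-up three-team league: A went 14-6, B went 10-10, C went 6-14.
win_ratio = {"A": 14 / 6, "B": 10 / 10, "C": 6 / 14}
schedule = {
    "A": {"B": 10, "C": 10},
    "B": {"A": 10, "C": 10},
    "C": {"A": 10, "B": 10},
}
ratings = crude_team_ratings(win_ratio, schedule)
```

With a balanced schedule like this one the adjustment matters little; the iteration earns its keep when schedules are unbalanced, as they are across divisions and leagues.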

The resulting rating, CTR, is an adjusted win/loss ratio rescaled so that the majors’ arithmetic average is 100. The ratings can be used to directly estimate W% against a given opponent (without home field advantage for either side); a team with a CTR of 120 should win 60% of games against a team with a CTR of 80 (120/(120 + 80)).
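The head-to-head estimate is simple enough to write as a one-line function. The function name below is made up for illustration; the formula is the one given in the paragraph above.

```python
def win_probability(ctr_team, ctr_opponent):
    """Estimated W% for a team against an opponent, with no home field advantage."""
    return ctr_team / (ctr_team + ctr_opponent)


# The example from the text: a 120 team against an 80 team.
p = win_probability(120, 80)  # 120 / (120 + 80)
```

Against an average (100) opponent this reduces to CTR/(100 + CTR).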

First, CTR based on actual wins and losses. In the table, “aW%” is the winning percentage equivalent implied by the CTR and “SOS” is the measure of strength of schedule--the average CTR of a team’s opponents. The rank columns provide each team’s rank in CTR and SOS:



The top ten teams were the playoff participants, with the two pennant winners coming from the group of three teams that formed a clear first-tier. The #9 and #10 teams lost the wildcard games. Were it not for the identity of the one of those three that did not win the pennant, it would have been about as close to perfect a playoff outcome as I could hope for. What stood out the most among the playoff teams to me is that Arizona ranked slightly ahead of Washington. As we’ll see in a moment, the NL East was bad, and as the best team in the worst division, the Nationals had the lowest SOS in the majors, with their average opponent roughly equivalent to the A’s, while the Diamondbacks’ average opponent was roughly equivalent to the Royals.

Next are the division averages. Originally I gave the arithmetic average CTR for each division, but that’s mathematically wrong--you can’t average ratios like that. Then I switched to geometric averages, but really what I should have done all along is just give the arithmetic average aW% for each division/league. aW% converts CTR back to an “equivalent” W-L record, such that the average across the major leagues will be .500. I do this by taking CTR/(100 + CTR) for each team, then applying a small fudge factor to force the average to .500. In order to maintain some basis for comparison to prior years, I’ve provided the geometric average CTR alongside the arithmetic average aW%, and the equivalent CTR by solving for CTR in the equation:

aW% = CTR/(100 + CTR)*F, where F is the fudge factor (it was 1.0005 for 2017, lest you be concerned there is a massive behind-the-scenes adjustment taking place).
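Solving that equation for CTR gives CTR = 100*aW%/(F - aW%). A minimal sketch of the conversion in both directions follows. The function names and the sample ratings are hypothetical, and computing F as .500 divided by the average unadjusted value is an assumption about how the average is forced to .500.

```python
def fudge_factor(ctrs):
    """Multiplier F that makes the average aW% exactly .500 (assumed method)."""
    raw = [c / (100 + c) for c in ctrs]
    return 0.5 / (sum(raw) / len(raw))


def aw_pct(ctr, fudge):
    """aW% = CTR/(100 + CTR)*F, as in the equation above."""
    return ctr / (100 + ctr) * fudge


def ctr_from_aw_pct(aw, fudge):
    """Invert the equation above: CTR = 100*aW%/(F - aW%)."""
    return 100 * aw / (fudge - aw)


# Hypothetical ratings, not the 2017 values.
sample_ctrs = [160, 120, 100, 80, 60]
F = fudge_factor(sample_ctrs)
sample_aw = [aw_pct(c, F) for c in sample_ctrs]
recovered = [ctr_from_aw_pct(a, F) for a in sample_aw]
```

The same inversion, applied to a division's average aW%, gives the equivalent CTR for that division.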



The league gap narrowed after widening in 2016, but the AL maintained superiority, with only the NL West having a higher CTR than any AL division. It was a good bounceback for the NL West after being the worst division in 2016, especially when you consider that the team that had been second-best for several years wound up as the second-worst team in the majors. The NL East was bad, but not as bad as it was just two years ago.

I also figure CTRs based on various alternate W% estimates. The first is based on Expected W% (Pythagenpat based on actual runs scored and allowed):



The second is CTR based on Predicted W% (Pythagenpat based on runs created and allowed, actually Base Runs):



Usually I include a version based on Game Expected Winning %, but this year I’m finally switching to using the Enby distribution so it’s going to take a little bit more work, and I’d like to get one of these two posts up before the end of the year. So I will include the CTRs based on gEW% in the Run Distribution post.
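For readers who want to reproduce the inputs to the EW% and PW% versions, here is a hedged sketch of a Pythagenpat estimate and its conversion to the win ratio that the CTR process starts from. The function names are made up, and the exponent constant z = 0.29 is an assumption (a commonly used value); this post does not state which constant is used.

```python
def pythagenpat_wpct(runs, runs_allowed, games, z=0.29):
    """Pythagenpat W% estimate; z = 0.29 is an assumed constant."""
    rpg = (runs + runs_allowed) / games  # total runs per game, both teams
    x = rpg ** z                         # exponent floats with the run environment
    return runs ** x / (runs ** x + runs_allowed ** x)


def win_ratio_from_wpct(wpct):
    """Convert a winning percentage to a W/L ratio."""
    return wpct / (1 - wpct)


# Hypothetical team: 800 runs scored, 700 allowed over 162 games.
ew = pythagenpat_wpct(800, 700, 162)
starting_ratio = win_ratio_from_wpct(ew)
```

For PW%, the same function would take runs created and runs created allowed in place of actual runs.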

A few seasons ago I started including a CTR version based on actual wins and losses, but including the postseason. I am not crazy about this set of ratings, the reasoning behind which I tried very poorly to explain last year. A shorter attempt follows: Baseball playoff series have different lengths depending on how the series go. This has a tendency to exaggerate the differences between the teams exhibited by the series, and thus have undue influence on the ratings. When the Dodgers sweep the Diamondbacks in the NLDS, this is certainly additional evidence that we did not previously have which suggests that the Dodgers are a stronger team than the Diamondbacks. But counting this as 3 wins to 0 losses exaggerates the evidence. I don’t mean this in the (equally true) sense that W% over a small sample size will tend to be more extreme than a W% estimate based on components (R/RA, RC/RCA, etc.). This we could easily account for by using EW% or PW%. What I’m getting at is that the number of games added to the sample is dependent on the outcomes of the games that are played. If series were played through in a non-farcical manner (i.e. ARI/LA goes five games regardless of the outcomes), then this would be a moot point.

I doubt that argument swayed even one person, so the ratings including playoff performance are:



With the Dodgers holding a 161 to 156 lead over the Astros before the playoffs, romping through the NL playoffs at 7-1 while the Astros went 7-4 in the AL playoffs, and taking the World Series to seven games, they actually managed to increase their position as the #1 ranked team. I’m not sure I’ve seen that before--certainly it is common for the World Series winner to not be ranked #1, but usually they get closer to it than further away.

And the differences between ratings including playoffs (pCTR) and regular season only (rCTR):
