Monday, December 23, 2019

Crude Team Ratings, 2019

Crude Team Rating (CTR) is my name for a simple methodology of ranking teams based on their win ratio (or estimated win ratio) and their opponents’ win ratios. A full explanation of the methodology is here, but briefly:

1) Start with a win ratio figure for each team. It could be actual win ratio, or an estimated win ratio.

2) Figure the average win ratio of the team’s opponents.

3) Adjust for strength of schedule, resulting in a new set of ratings.

4) Begin the process again. Repeat until the ratings stabilize.

The resulting rating, CTR, is an adjusted win/loss ratio rescaled so that the majors’ arithmetic average is 100. The ratings can be used to directly estimate W% against a given opponent (without home field advantage for either side); a team with a CTR of 120 should win 60% of games against a team with a CTR of 80 (120/(120 + 80)).

First, CTR based on actual wins and losses. In the table, “aW%” is the winning percentage equivalent implied by the CTR and “SOS” is the measure of strength of schedule--the average CTR of a team’s opponents. The rank columns provide each team’s rank in CTR and SOS:

The ten playoff teams almost occupied the top ten spots, Cleveland just barely edging out Milwaukee.

I’ve switched how I aggregate for division/league ratings over time, but I think I’ve settled on the right approach, which is just to take the average aW% for each:

It was finally the NL’s year to top the AL, with the latter dragged down by the three worst teams, including Detroit which I believe turned in the lowest CTR since I’ve been calculating these. Amazingly, the AL Central was actually better than in 2019, increasing their average aW% from .431. This was because the Indians were slightly better in 2019 (116 to 113 CTR), the Twins shot from 85 to 140, and the White Sox graduated from horrible to merely bad (58 to 74).

The CTRs can also use theoretical win ratios as a basis, and so the next three tables will be presented without much comment. The first uses gEW%, which is a measure I calculate that looks at each team’s runs scored distribution and runs allowed distribution separately to calculate an expected winning percentage given average runs allowed or runs scored, and then uses Pythagorean logic to combine the two and produce a single estimated W% based on the empirical run distribution:

Next EW% based on R and RA:

And PW% based on RC and RCA:

The final set of ratings is based on actual wins and losses, but includes the playoffs. I am not crazy about this view; while it goes without saying that playoff games provide additional data regarding the quality of teams, I believe that the playoff format biases the ratings against teams that lose in the playoffs, particularly in series that end well before the maximum number of games. It’s not that losing three straight games in a division series shouldn’t hurt a team’s rating, it’s that terminating the series after three games and not playing out the remaining creates bias. Imagine what would happen to CTRs based on regular season games if a team’s regular season terminated when they fell ten games out of a playoff spot. Do you think this would increase the variance in team ratings? The difference between the playoffs and regular season on this front is that the length of the regular season is independent of team performance, but the length of the playoffs is not.

My position is not that the playoffs should be ignored altogether, but I don’t have a satisfactory suggestion on how to correct the playoff-inclusive ratings for this bias without injecting a tremendous amount of my own subjective approach into the mix (one idea would be to add in the expected performance over the remaining games of the series based on the odds implied from the regular season CTRs, but of course this is begging the question to a degree). So I present here the ratings including playoff performance, with each team’s regular season actual CTR and the difference between the two:

Friday, December 13, 2019

Hitting by Position, 2019

The first obvious thing to look at is the positional totals for 2018, with the data coming from "MLB” is the overall total for MLB, which is not the same as the sum of all the positions here, as pinch-hitters and runners are not included in those. “POS” is the MLB totals minus the pitcher totals, yielding the composite performance by non-pitchers. “PADJ” is the position adjustment, which is the position RG divided by the total for all positions, including pitchers (but excluding pinch hitters). “LPADJ” is the long-term positional adjustment that I am now using, based on 2010-2019 data (see more below). The rows “79” and “3D” are the combined corner outfield and 1B/DH totals, respectively:

There’s nothing too surprising here, although third basemen continue to hit above their historical norm and corner outfielders outhit 1B/DH ever so slightly.

All team figures from this point forward in the post are park-adjusted. The RAA figures for each position are baselined against the overall major league average RG for the position, except for left field and right field which are pooled. NL pitching staffs by RAA (note that the runs created formula I use doesn’t account for sacrifice hits, which matters more when looking at pitcher’s offensive performance than any other breakout you can imagine):

This range is a tad narrower than the norm which is around +/- 20 runs; no teams cost themselves a full win at the plate. This is the second year in a row in which this is the case; of course as innings pitched by starters decline, the number of plate appearances for pitchers does as well.

The teams with the highest RAA at each position were:


Usually the leaders are pretty self-explanatory, although I did a double-take on Seattle catchers (led by Omar Narvaez with a quietly excellent 260 plate appearances from Tom Murphy) and Milwaukee second basemen (combination of Keston Hiura and Mike Moustakas). I always find the list of positional trailers more interesting (the player listed is the one who started the most games at that position; they usually are not solely to blame for the debacle ):

Four teams hogged eight spots to themselves, kindly leaving one leftover for another AL Central bottom feeder. Moustakas featured prominently in Milwaukee’s successful second base and dreadful third base performances, but unfortunately Travis Shaw was much more responsible for the latter (503 OPS in 66 games! against Moustakas’ solid 815 in 101 games). Also deserving of a special shoutout for his contributions to the two moribund White Sox positions is Daniel Palka, who was only 0-7 as a DH but had a 421 OPS in 78 PA as a right fielder. His total line for the season was 372 OPS in 93 PA; attending a September White Sox/Indians series, it was hard to take one’s eyes off his batting line of the scoreboard (his pre-September line was .022/.135/.022 in 52 PA).

The next table shows the correlation (r) between each team’s RG for each position (excluding pitchers) and the long-term position adjustment (using pooled 1B/DH and LF/RF). A high correlation indicates that a team’s offense tended to come from positions that you would expect it to:

The following tables, broken out by division, display RAA for each position, with teams sorted by the sum of positional RAA. Positions with negative RAA are in red, and positions that are +/-20 RAA are bolded:

A few observations:

* The Tigers were below-average at every position; much could (and has) been written about Detroit’s historic lack of even average offensive players, but a positional best of -9 kind of sums it up

* The Indians had only one average offensive position, which was surprising to me as I would have thought that even while not having his best season, Franciso Lindor would have salvaged shortstop (he had 17 RAA personally). Non-Lindor Indian shortstops had only 92 PA, but they hit .123/.259/.173 (unadjusted).

* That -30 at third base for the Angels, wonder what they’ll do to address that

* Houston had 109 infield RAA, the next closest team was the Dodgers with 73. The Dodgers had the best outfield RAA with 77; the Astros were fifth with 46.

Finally, I alluded to an update to the long-term positional adjustments I use above. You can see my end of season stats post for some discussion about why I use offensive positional adjustments in my RAR estimates. A quick summary of my thinking:

* There are a lot of different RAR/WAR estimates available now. If I can offer a somewhat valid but unique perspective, I think that adds more value than a watered down version of the same thing everyone else is publishing.

* When possible, I like to publish metrics that I have had some role in developing (please note, I’m not saying that any of them are my own ideas, just that it’s nice to be able to develop your own version of a pre-existing concept). I don’t publish my own defensive metrics and while defensive positional adjustments are based on more than simply player’s comparative performance across positions using fielding metrics, they are basic starting point for that type of analysis.

* While I do not claim that the relationship is or should be perfect, at the level of talent filtering that exists to select major leaguers, there should be an inverse relationship between offensive performance by position and the defensive responsibilities of the position. Not a perfect one, but a relationship nonetheless. An offensive positional adjustment than allows for a more objective approach to setting a position adjustment. Again, I have to clarify that I don’t think subjectivity in metric design is a bad thing - any metric, unless it’s simply expressing some fundamental baseball quantity or rate (e.g. “home runs” or “on base average”) is going to involve some subjectivity in design (e.g linear or multiplicative run estimator, any myriad of different ways to design park factors, whether to include a category like sacrifice flies that is more teammate-dependent)

I use the latest ten years of data for the majors (2010 – 2019), which should smooth out some of the randomness in positional performance. Than I simply calculate RG for each position and divide by the league average of positional performance (i.e. excluding pinch-hitters and pinch-runners). I then pool 1B/DH and LF/RF. Only looking at positional performance is necessary because the goal is not to express the position average relative to the league, but rather to the other positions for the purpose of determining their relative performance. If pinch-hitters perform worse than position players, I don’t want them to bring down the league average and thus raise the offensive positional adjustment, because pinch-hitters will not be subject to the offensive positional adjustment when calculating their RAR. (I suppose if you were so inclined, you could include them, and use that as your backdoor way of accounting for the pinch-hitting penalty in a metric, but I assign each player to a primary position (or some weighted average of their defensive positions) and so this wouldn’t really make sense, and would result in positional adjustments that are too high when they are applied to the league average RG.

For 2010-2019, the resulting PADJs are:

Wednesday, December 04, 2019

Leadoff Hitters, 2019

In the past I’ve wasted time writing in a structured format, instead of just explaining how the metrics are calculated and noting anything that stands out to me. I’m opting for the latter approach this year, both in this piece and in other “end of season” statistics summaries.

I’ve always been interested in leadoff hitter performance, despite not being particularly not claiming that it held any particular significance beyond the obvious. The linked spreadsheet includes a number of metrics, and there are three very important caveats:

1. The data from Baseball-Reference and includes the performance of anyone who hit in the leadoff spot during a game. I’ve included the names and number of games starting at leadoff for all players with twenty or more starts.

2. Many of the metrics shown are descriptive, not quality metrics

3. None of this is park-adjusted

The metrics shown in the spreadsheet are:

* Runs Scored per 25.5 outs = R*25.5/(AB – H + CS)

Runs scored are obviously influenced heavily by the team, but it’s a natural starting point when looking at leadoff hitters.

* On Base Average (OBA) = (H + W + HB)/(AB + W + HB)

If you need this explained, you’re reading the wrong blog.

* Runners On Base Average (ROBA) = (H + W + HB – HR – CS)/(AB + W + HB)

This is not a quality metric, but it is useful when thinking about the run scoring process as it’s essentially a rate for the Base Runs “A” component, depending on how you choose to handle CS in your BsR variation. It is the rate at which a hitter is on base for a teammate to advance.

* “Literal” On Base Average (LOBA) = (H + W + HB – HR – CS)/(AB + W + HB – HR)

This is a metric I’ve made up for this series that I don’t actually consider of any value; it is the same as ROBA except it doesn’t “penalize” homers by counting them in the denominator. I threw scare quotes around “penalize” because I don’t think ROBA penalizes homers; rather it recognizes that homers do not result in runners on base. It’s only a “penalty” if you misuse the metric.

* R/RBI Ratio (R/BI) = R/RBI

A very crude way of measuring the shape of a hitter’s performance, with much contextual bias.

* Run Element Ratio (RER) = (W + SB)/(TB – H)

This is an old Bill James shape metric which is a ratio between events that tend to be more valuable at the start of an inning to events that tend to be more valuable at the end of an inning. As such, leadoff hitters historically have tended to have high RERs, although recently they have just barely exceeded the league average as is the case here. Leadoff hitters were also just below the league average in Isolated Power (.180 to .183) and HR/PA (.035 to .037)

* Net Stolen Bases (NSB) = SB – 2*CS

A crude way to weight SB and CS, not perfectly reflecting the run value difference between the two

* 2OPS = 2*OBA + SLG

This is a metric that David Smyth suggested for measuring leadoff hitters, just an OPS variant that uses a higher weight for OBA than would be suggested by maximizing correlation to runs scored (which would be around 1.8). Of course, 2OPS is still closer to ideal than the widely-used OPS, albeit with the opposite bias.

* Runs Created per Game – see End of Season Statistics post for calculation

This is the basic measure I would use to evaluate a hitter’s rate performance.

* Leadoff Efficiency – This is a theoretical measure of linear weights runs above average per 756 PA, assuming that every plate appearance occurred in the quintessential leadoff situation of no runners on, none out. 756 PA is the aveage PA/team for the leadoff spot this season. See this post for a full explanation of the formula; the 2019 out & CS coefficients are -.231 and -.598 respectively.

A couple things that jumped out at me:

* Only six teams had just one player with twenty or more starts as a leadoff man. Tampa Bay was one of those teams; Austin Meadows led off 53 times, while six other players lead off (this feels like it should be one word) between ten and twenty times.

* Chicago was devoid of quality leadoff performance in either circuit, but the Cubs OBA woes really stand out; at .296, they were fourteen points worse than the next-closest team, which amazingly enough was the champion of their divison. The opposite was true in Texas, where the two best teams in OBA reside.

See the link below for the spreadsheet; if you change the end of the URL from “HTML” to “XLSX”, you can download an Excel version:

2019 Leadoff Hitters