Monday, August 10, 2009

Silly OPS Tricks

I'm not really sure why I'm writing this post, since it uses a metric that I don't particularly wish to propagate and doesn't offer any serious analytical application.

Nonetheless, herein is a (relatively) simple and reasonably accurate way to predict a team's win-loss record from its OPS and OPS Allowed. It combines two simple rules of thumb (one relating OPS to runs, and one relating runs to wins) into a single OPS to wins conversion. And that is the reason I am sharing it here despite my antipathy towards OPS--the two rules of thumb both take the same form, and so their combination is fairly elegant. I guess what I'm trying to say is that it is "neat", even if I don't think you should use it.

This will be a metric of the type that I call predicted winning percentage, which is based on component statistics, as opposed to expected winning percentage, which is based on actual runs scored and allowed.

The first rule of thumb is the conversion between OPS and runs. OPS has, roughly, a 2:1 relationship with runs scored. If a team has an OPS 5% better than league average, then we expect them to score about 10% more runs than league average. I have noted this relationship before, but of course it is well-known.

Above and in the linked post, I used this in relation to the league average, but it can be applied to a team and its opponents as well. So we can estimate a team's run ratio (R/RA) as:

RR = 2*OPS/OPS Allowed - 1

Run Ratio can be related to wins in any number of ways, some more accurate than others--the most notable application is the Pythagorean formula. A linearization of the Pythagorean formula for the normal range of run scoring is what Bill James called "Double the Edge"--a team that scores 5% more runs than its opponents should win about 10% more games. So we can estimate a team's win ratio (W/L) as:

WR = 2*RR - 1
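To see how close the linearization stays to the thing it approximates, here is a quick Python sketch. It assumes the classic Pythagorean exponent of 2 (so win ratio = RR^2); the function names are made up for illustration.

```python
def pythag_win_ratio(rr, exponent=2.0):
    """Win ratio implied by Pythagorean: W/L = (R/RA)^exponent.
    The exponent of 2 is an assumption (the classic version)."""
    return rr ** exponent


def double_the_edge(rr):
    """Linearized win ratio: WR = 2*RR - 1."""
    return 2 * rr - 1


# Compare the two over a normal range of team run ratios
for rr in (0.80, 0.90, 1.00, 1.05, 1.10, 1.20):
    print(f"RR={rr:.2f}  Pythagorean WR={pythag_win_ratio(rr):.3f}  "
          f"Double the Edge WR={double_the_edge(rr):.3f}")
```

The two agree exactly at RR = 1 and drift apart as the run ratio moves away from 1, with the linear version always at or below the Pythagorean one.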

We can substitute the OPS relationship in, and get:

WR = 2*(2*OPS/OPS Allowed - 1) - 1

which simplifies to:

WR = 4*OPS/OPS Allowed - 3

As you can see, this is a steep function, and its steepness is a consequence of combining the pair of 2:1 functions. A team with an OPS 5% better than its opponents figures to have a W/L ratio of 4*1.05 - 3 = 1.2.

Win Ratio can be converted to a more familiar form, W%, very simply, as WR/(WR + 1). If we substitute the OPS relationship into that equation, we get this formula that takes us directly from OPS and OPS Allowed to W%:

W% = (4*OPS/OPS Allowed - 3)/(4*OPS/OPS Allowed - 2)
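For anyone who wants to plug in numbers, the two formulas above can be sketched in Python roughly as follows. The function names are my own inventions for illustration, nothing standard.

```python
def ops_win_ratio(ops, ops_allowed):
    """Estimated W/L ratio: WR = 4*OPS/OPS Allowed - 3."""
    return 4 * ops / ops_allowed - 3


def ops_wpct(ops, ops_allowed):
    """Estimated W%: WR/(WR + 1)."""
    wr = ops_win_ratio(ops, ops_allowed)
    return wr / (wr + 1)


# A team whose OPS is 5% better than its opponents'
print(round(ops_win_ratio(1.05, 1.00), 3))  # 1.2
print(round(ops_wpct(1.05, 1.00), 3))       # 0.545
```

Only the ratio of OPS to OPS Allowed matters, so it makes no difference whether you enter OPS as .806 or 806.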

How well does this work? Not too shabby...over the past two seasons (not the largest sample size in the world, but nothing in this post is meant to be rigorous in any way, shape, or form) it has an RMSE in predicting actual W% of 5.64. My PW% estimate using Base Runs and Pythagenpat has a similar RMSE over the same period (5.52). Using it to estimate expected W% (Pythagenpat record, based on actual runs scored and allowed), the OPS knockoff has an RMSE of 4.06, while PW% has an RMSE of 3.46.

When you estimate W% from component statistics (in other words, without the benefit of R and RA), you have three areas where errors can occur:

1. error in predicting runs scored
2. error in predicting runs allowed
3. error in converting between runs and wins

If you estimate W% from runs and runs allowed, you obviously only have to worry about the third type of error. But with so much going on in figuring PW% (as I have defined it), it doesn't really matter whether you use "state of the art" methods (BsR + Pythagenpat), or use chicken scratchings based on OPS. You're going to have some fairly significant error either way. The theoretical superiority of the "state of the art" approach is hinted at by its better tracking of EW%.

Anyway, there are still a number of weaknesses with the OPS method (I'll give it a name just for convenience--let's call it the Reynolds estimate, since we all know Harold loves his OPS). These include, but are not necessarily limited to:

1. The simple fact that it's based on OPS. OPS has a lot of problems, but they don't manifest themselves too much when you deal with real teams in the normal performance range, so it's not too much of a concern here.

2. It can't be used with OPS+. OPS+ is no great shakes either, but given its prominence in Total Baseball and later the ESPN Encyclopedia and Baseball-Reference, it gets used just as much in the sabermetric community as ordinary OPS. The Reynolds estimate is incompatible with OPS+, as OPS+ does not have a 2:1 relationship with runs (it has a 1:1 relationship--the misunderstanding of the OPS and OPS+ relationships with runs is a never-ending frustration of mine). This is a selling point for OPS+, but it means it doesn't work here (you can of course work out a PW% estimate based on OPS+, but that's beside the point).

3. It breaks down at the extremes. The OPS to runs relationship, particularly when using outs, will cause you all sorts of problems if you attempt to use it to estimate how many runs Babe Ruth created in 1920. The double the edge estimate of W% is fine in the normal performance range, but you don't want to use it to figure individual Offensive W% or anything. Combining those two issues, you don't want to figure a pitcher's estimated W% or a hitter's OW% with this method.

4. It does not have the property of reciprocity between a team and its opponents. For example, the 2007 Red Sox had an OPS of 806 and allowed an OPS of 705. That gives them a Reynolds estimate of .611.

But if you plug in the Red Sox opponents (a team with an OPS of 705 and an OPS Allowed of 806), you get a Reynolds estimate of .333. In order for this to make theoretical sense (unless you know something about run distributions that the rest of us don't), the Red Sox and their opponents need to add up to 1.
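The reciprocity failure is easy to reproduce. Here is a small Python sketch using the 2007 Red Sox figures quoted above; the function name is just a label I picked for illustration.

```python
def reynolds_wpct(ops, ops_allowed):
    """W% = (4*OPS/OPS Allowed - 3)/(4*OPS/OPS Allowed - 2)."""
    x = ops / ops_allowed
    return (4 * x - 3) / (4 * x - 2)


red_sox = reynolds_wpct(806, 705)    # team's point of view
opponents = reynolds_wpct(705, 806)  # same games, opponents' point of view

print(round(red_sox, 3))              # 0.611
print(round(opponents, 3))            # 0.333
print(round(red_sox + opponents, 3))  # 0.944, not 1
```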

Why does this happen? Well, for one thing I played fast and loose by equating OPS Allowed with League OPS when plugged into the regression equation. In fact, this is a shortcut that works fine for average teams but will cause problems at extremes. Let me reintroduce an equation for estimating runs from OPS and outs:

Runs = (.496*OPS - .182)*(AB - H)

The Red Sox OPS of 806 means they should score about .218 runs/out, and their OPS Allowed of 705 means they should allow about .168 runs/out, for a run ratio of 1.299. Our shortcut (2*OPS/OPS Allowed - 1) yields an estimated run ratio of 1.287. Not a huge difference, but a small source of error, and due entirely to a shortcut.

It's worse for the Red Sox opponents, who should be estimated with a run ratio of .77 (.168/.218). But the shortcut estimates a run ratio of .749. To make matters worse, the shortcut estimates a 1.287 run ratio for the Red Sox, which has a reciprocal of .777, not .749. In other words, the use of the shortcut eliminates reciprocity between the run ratio of a team and its opponents.
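If you want to check the arithmetic, the comparison between the regression-based run ratio and the shortcut can be sketched in Python like this. The function names are hypothetical, and OPS is entered as a decimal (.806 rather than 806) because that is what the regression equation expects.

```python
def runs_per_out(ops):
    """Runs per out from the regression: .496*OPS - .182 (OPS as a decimal)."""
    return 0.496 * ops - 0.182


def shortcut_run_ratio(ops, ops_allowed):
    """Shortcut run ratio: 2*OPS/OPS Allowed - 1."""
    return 2 * ops / ops_allowed - 1


ops, ops_allowed = 0.806, 0.705  # 2007 Red Sox

# Run ratio from estimating runs scored and runs allowed separately
regression_team = runs_per_out(ops) / runs_per_out(ops_allowed)
regression_opp = 1 / regression_team  # reciprocal by construction

# Run ratio from the shortcut, computed from each side's point of view
shortcut_team = shortcut_run_ratio(ops, ops_allowed)
shortcut_opp = shortcut_run_ratio(ops_allowed, ops)

print(f"regression: team {regression_team:.3f}, opponents {regression_opp:.3f}")
print(f"shortcut:   team {shortcut_team:.3f}, opponents {shortcut_opp:.3f}")
print(f"reciprocal of shortcut team ratio: {1 / shortcut_team:.3f}")
```

The regression-based ratios are reciprocals of each other automatically, since they are just the same two runs/out figures divided in opposite order; the shortcut ratios are not.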

To state it again, the reason this happens is that 2*OPS/LgOPS - 1 relates to runs scored by a team, and is centered around LgOPS. It really should be applied separately to estimate runs scored from OPS and runs allowed from OPS Allowed.

An even bigger reciprocity problem arises from the use of WR = 2*RR - 1. This is why analysts who have worked with that equation (like Bill Kross) have used a different formula for teams whose run ratio < 1. We could invert OPS and OPS Allowed and subtract from one:

W% = 1 - (4*OPS Allowed/OPS - 3)/(4*OPS Allowed/OPS - 2)

Which can be simplified to:

W% = 1/(4*OPS Allowed/OPS - 2)

Doing it this way, with separate equations, the RMSE against actual W% drops to 5.40, which is actually a tad better than the Base Runs/Pythagenpat estimate (remember, this is only a small sample of sixty teams, and I'm not using the most accurate BsR formula available). The entire approach is a shortcut itself, and so I'm not advocating using separate formulas; that would defeat the purpose of a quick and dirty estimate. If you want something deeper than a quick and dirty estimate, you shouldn't be using OPS at all.
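A rough Python sketch of the separate-equations version, for the curious. It assumes the switch happens when OPS drops below OPS Allowed, which is the same thing as the shortcut run ratio dropping below 1; the function name is made up.

```python
def reynolds_wpct_split(ops, ops_allowed):
    """Two-formula version: the original equation when OPS >= OPS Allowed,
    and the inverted one (W% = 1/(4*OPS Allowed/OPS - 2)) otherwise."""
    if ops >= ops_allowed:
        x = ops / ops_allowed
        return (4 * x - 3) / (4 * x - 2)
    y = ops_allowed / ops
    return 1 / (4 * y - 2)


red_sox = reynolds_wpct_split(806, 705)
opponents = reynolds_wpct_split(705, 806)

print(round(red_sox, 3), round(opponents, 3))  # 0.611 0.389
print(round(red_sox + opponents, 3))           # 1.0
```

With the split, a team and its opponents always add up to 1, because the second branch is just one minus the first branch applied to the other side.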

Anyway, the reason I got to thinking about this at all was that in the Bill James Gold Mine, the statistical summary for each team includes OPS and OPS Allowed. I certainly don't go out of my way to look up team OPS. Then it dawned on me that it was a neat coincidence that the conversion could be made by combining the pair of 2:1 functions, and that it would at least look nice. But make no mistake--like anything involving OPS, it's an "accident" that it works out so nicely. The 2:1 relationship between runs and wins is well documented, and it is the basis for a few W% estimators (including Pythagorean). But OPS is not a meaningful, real-life baseball number; it's a made-up statistic that happens to relate to runs on the team level at 2:1.

To end on a *truly* frivolous note, the inclusion of OPS and OPS Allowed in the Gold Mine caused me to notice something I hadn't before--that a team's raw run total over 162 games is relatively close to its OPS without the decimal place (in mathematical terms, OPS*1000). For 2008-2009, the RMSE of this direct estimate (looking at OPS-->Runs and OPS Allowed-->Runs Allowed) is 43.94. Of course, a real estimate based on OPS will have a much lower RMSE, somewhere in the general vicinity of 26 runs. But that involves applying a formula like:

Runs = (.496*OPS - .182)*(AB - H)

Just looking at a team's OPS over the course of a 162 game season, without any sort of mathematical manipulation, gives you an estimate of team runs that is in the same accuracy ballpark as running a regression for runs based on batting average. This is not any great shakes, of course, and you'd be a fool to estimate that because the Rangers allowed an 817 OPS last year, they should have allowed 817 runs (they actually allowed 967). But in many other cases, it will put you in the right ballpark, although you will be stuck in the nosebleed seats.

The reason this "works" can be seen by looking at the regression equation. The average team will make about 4080 outs (AB-H) per season (25.2 outs/game * 162 games). Substituting 4080 into the equation for outs, and writing OPS without the decimal place (OPS*1000), you can simplify it to roughly:

Runs = 2*OPS - 743

Over the past two years, the average major league team has scored 765 runs and compiled a 753 OPS. So for an average team, there isn't much difference between figuring 2*OPS - 743 or just taking OPS, since their OPS is pretty close to 743 as it is. As you move away from the average, this "formula's" accuracy will take a nose dive (exemplified by the Rangers example above).
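Here is a small Python sketch that lines up the three versions side by side: the regression with 4080 outs plugged in, the rounded 2*OPS - 743, and just reading OPS as runs. The function names are hypothetical, and OPS is entered in "points" (OPS*1000).

```python
AVG_OUTS = 4080  # about 25.2 outs/game * 162 games, per the text


def runs_from_regression(ops_points, outs=AVG_OUTS):
    """Runs = (.496*OPS - .182)*(AB - H), with OPS given as OPS*1000."""
    return (0.496 * ops_points / 1000 - 0.182) * outs


def runs_simplified(ops_points):
    """The rounded version: Runs = 2*OPS - 743."""
    return 2 * ops_points - 743


def runs_naive(ops_points):
    """Just read OPS*1000 as a run total."""
    return ops_points


for label, ops_points in (("average team", 753), ("Rangers OPS allowed", 817)):
    print(f"{label}: regression {runs_from_regression(ops_points):.0f}, "
          f"simplified {runs_simplified(ops_points)}, "
          f"naive {runs_naive(ops_points)}")
```

Note that the rounding in "2*OPS - 743" is itself fairly loose: .496 * 4080 is closer to 2.02 than 2, so the simplified version runs a bit below the full regression. The naive reading and the simplified formula cross at an OPS of 743 and separate by one run for every point of OPS beyond that.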

This is the part where I set off the secret beacon in the Statue of Liberty and perform a mind-wipe, and you forget everything you just read and never, ever actually use the Reynolds estimate, okay?
