Wednesday, April 15, 2015

Reinventing the Wheel, Now With Win Estimators!

It is in my nature to snark about bad baseball analysis. Maybe more of it is nurture, as much of my early sabermetric reading was the younger Bill James, with later exposure to early BP and other r.s.bb derivatives, where snark was an integral part of the culture.

That is not really intended to be an excuse, although it may well read that way. As I have grown older I believe that I have generally become more aware of how little I actually know, but more consequently to snarking, less interested in engaging. I have lost almost any desire I ever had to evangelize about sabermetrics to the “unwashed masses” (now there’s a snarky, loaded term). Instead I am content to write to my very small audience, which even so is almost entirely based on what I want to write rather than what I think anyone might want to read, and take passive-aggressive potshots on Twitter. This probably still tilts me more towards the jackass side of the scale than the average sabermetrician, but so be it.

Every once in a while, though, I run across something that irks me so much that I have to respond to it in full. Against my better judgment, I feel compelled to draft a polemic in response, even though I know there’s nothing good that can possibly come of it. That is the case with an article that appeared in the Fall 2014 issue of SABR’s The Baseball Research Journal entitled “A New Formula to Predict a Team’s Winning Percentage” and written by Stanley Rothman, Ph.D.

Historically, the quality of sabermetric articles in the BRJ has been a mixed bag. Early BRJ editions included seminal research by pioneers of the field like Pete Palmer and Dick Cramer. Eventually the quality of such articles significantly dropped off, and BRJ was a leading purveyor of the rehashing of bases/X metrics that I rail against, and other equally banal statistical pieces, with notable but rare exceptions. (That is particularly amusing since in the heyday of BRJ as a place where sabermetric research was published, Barry Codell introduced Base-Out Percentage, one of only a few times that metric could legitimately have been said to have been “invented”).

In recent years, the quality of the statistical pieces in BRJ has been significantly improved, so I hope that my mockery of this particular piece is not taken as an indictment of the entirety of the body of work the editors (now Cecilia Tan) have been doing on this front. In fact, the Fall 2014 issue features a couple of sabermetric pieces I enjoyed greatly, both based on Log5 and other predictors of head-to-head matchups (John A. Richards’ piece “Probabilities of Victory in Head-to-Head Matchups” covered the theoretical basis for Log5 and a comparison of Log5 estimates to empirical results, and Matt Haechrel did likewise for individual batter-pitcher matchups in “Matchup Probabilities in Major League Baseball”).

Dr. Rothman’s piece is an unfortunate exception. And since I consider myself (perhaps incorrectly so) to be something of a subject-matter expert in winning percentage estimators, I feel compelled to point out areas in which Rothman’s findings bury obvious, well-established principles in a barrage of linear regressions.

Rothman opens his paper by discussing Bill James’ ubiquitous and groundbreaking Pythagorean method, and then asks “Why not just use the quantity (RS-RA) to calculate EXP(W%)”? Why not indeed? This question is never satisfactorily answered in the paper. Nor is it even addressed henceforth.

Rothman proceeds to set up a W% estimator that he christens the Linear Formula as:

EXP(W%) = m*(RS-RA) + b

Note that Rothman’s terms RS and RA are just that--runs scored and runs allowed by a team. Not per game, per inning, or on any other sensible rate basis--raw, unadulterated seasonal totals.

Next, he provides the standard equations for m and b, and makes some simplifying assumptions. His regressions are run separately for each MLB season, so each team’s number of games is 162 (obviously there are some limited and non-material exceptions) and there are 30 observations in each regression (Rothman uses 1998-2012 data in his analysis). After these substitutions, the intercept b is equal to .5 and the slope m is:

m = SUM[(RS - RA)*W%]/SUM[(RS - RA)^2]
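For anyone who wants the skipped step, here is a sketch of how the textbook least-squares formulas collapse to that result. It rests on two assumptions: every team plays the same number of games (so the mean W% is .5), and league-wide runs scored equal runs allowed (so the mean run differential is zero). The symbols x_i and y_i are my shorthand, not Rothman's.

```latex
x_i = RS_i - RA_i, \qquad y_i = W\%_i, \qquad \bar{x} = 0, \qquad \bar{y} = .5

m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
  = \frac{\sum_i x_i y_i - .5 \sum_i x_i}{\sum_i x_i^2}
  = \frac{\sum_i (RS_i - RA_i)\, W\%_i}{\sum_i (RS_i - RA_i)^2}

b = \bar{y} - m\,\bar{x} = .5
```

The second term in the numerator vanishes because the run differentials sum to zero across the league.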

Rothman notes that for major league seasons viewed in aggregate, there is a strong correlation between SUM[(RS - RA)*W%] and SUM[(RS - RA)^2], and so he develops a formula to predict the latter from the former:

EXP[SUM(RS - RA)^2] = 1464.4*SUM[(RS - RA)*W%] + 32710

This is substituted into the regression formula for expected W%, with the 32710 intercept dropped since it has little impact, to get the following equation:

EXP(W%) = SUM[(RS - RA)*W%]/{1464.4*SUM[(RS - RA)*W%]}*(RS - RA) + .5

= .000683*(RS - RA) + .5

This is the final formula that Rothman refers to as the Linear Formula. At this point, I will offer a few of my own comments:

1) There is nothing novel about presenting a W% estimator based on some relationship between run differential and W%. The rule of thumb that ten runs equals one win is just that. One of the earliest published W% estimators, from Arnold Soolman, was based on a regression that used RS/G and RA/G as separate variables but could have just as easily used the difference (and the insignificant difference in regression coefficients for the terms back that up).

2) The author’s choice to express this equation on a team-seasonal basis is, frankly, bizarre. It results in the formula being much less easy to apply to anything other than team seasonal totals, and it obscures the nature of the relationship between runs and wins, hiding the fact that this is little different than assuming ten runs per win. If you divide 1464.4 by 162 games/season, you find that the formula implies 9.04 runs per win and would be more conveniently expressed as .1106*(RS - RA)/G + .5.

3) I don’t understand the rationale for using a separate equation for each league-season, then developing a single slope by running another regression of various league quantities. It would be much more straightforward to combine all teams from the data set together and run a regression. Such an approach would also result in a higher R^2 for the team W% estimates. I don’t think that maximizing R^2 should be paramount in constructing a W% estimator, but in this case I fail to see the advantage of not studying the relationship between runs and wins directly at the team level rather than aggregating team-level regressions across multiple seasons.
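The arithmetic behind point 2 is easy to check. The snippet below is only a sketch: the function names are mine, the 162-game schedule is the same simplifying assumption Rothman makes, and the 750/650 team is made up for illustration.

```python
# Rothman's Linear Formula as quoted above, restated on a per-game basis.
SEASON_SLOPE = 1 / 1464.4  # about .000683 per run of seasonal differential
GAMES = 162                # assumed full MLB schedule


def linear_wpct_season(rs, ra):
    """EXP(W%) from raw seasonal run totals, as Rothman expresses it."""
    return SEASON_SLOPE * (rs - ra) + 0.5


def linear_wpct_per_game(rs, ra, games):
    """The same estimate on a per-game basis, usable for any number of games."""
    per_game_slope = SEASON_SLOPE * GAMES  # about .1106
    return per_game_slope * (rs - ra) / games + 0.5


runs_per_win = 1 / (SEASON_SLOPE * GAMES)  # about 9.04

# Hypothetical team: 750 runs scored, 650 allowed over 162 games.
print(round(SEASON_SLOPE * GAMES, 4))
print(round(runs_per_win, 2))
print(round(linear_wpct_season(750, 650), 3))
print(round(linear_wpct_per_game(750, 650, 162), 3))
```

Both functions give the same answer for a full season; only the per-game version gives a sensible one for a partial season.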

Returning to the article, Rothman uses a Chi-Square test on 2013 data to compare the Linear Formula to Pythagorean. Setting aside the silliness of using thirty data points for an accuracy test when hundreds are available, I must give Rothman credit for not using the Linear Formula’s better test statistic to trumpet its superiority--instead he writes that “there is no reason to believe that both of these formulas cannot be used.”

The article then includes a digression on applying this approach to the NBA and NFL. The conclusion and “additional points” sections of the article provide a handful of interesting contentions:

* Rothman suggests that one of the chief advantages of the Linear Formula is that it is “easier for a general manager to understand and use”. The premise is that GMs can use the Linear Formula to calculate the marginal wins from player transactions.

While there is certainly nothing wrong with these types of back of the envelope estimates, this comment would have been less bizarre twenty years ago. Now it seems incredibly naïve to suggest that the majority of major league front offices could improve their planning by using a dumbed down win estimator. It’s hard to determine which is sillier--the notion that front offices that would entertain such analysis would not be using more advanced models (the outcome suggested by which would depend much more on the projection of player performance than how that performance is translated into wins), or the notion that front offices who were so inclined and needed to do back of the envelope calculations would not be able to grasp Pythagorean.

* Apparently referring to the approximation used to derive the multi-year version of the formula above, Rothman asks “Why is there a strong positive correlation between SUM[(RS - RA)^2] and SUM[W%*(RS - RA)] in MLB?”

I might be accused of under-thinking this, but my response is “Why wouldn’t there be?” The key quantity in each sum is run differential. We know that run differential is positively correlated with W% (if it were not, this article would never have been written), so it should follow that the square of run differential (or the square root, the cube, the logarithm, any defined function) should have some relationship to the winning percentage times the run differential. And since the quantities Rothman is comparing are sums on the league level, both should increase as the differences between teams increase (i.e. if all teams were .500 and had zero run differentials, both quantities would be zero. As teams move away from the mean, both quantities increase).
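One way to make this concrete uses nothing but the slope formula quoted earlier. Within any single season, the two sums are tied together by that season's fitted slope (the subscript s and the symbol m_s are my notation):

```latex
m_s = \frac{\sum (RS - RA)\, W\%}{\sum (RS - RA)^2}
\quad\Longrightarrow\quad
\sum (RS - RA)^2 = \frac{1}{m_s} \sum (RS - RA)\, W\%
```

So if the seasonal slopes are roughly stable from year to year, the season-level points have to fall near a line through the origin with slope 1/m, and a coefficient like 1464.4 is roughly that reciprocal. Take this as a sketch of the intuition and not a reconstruction of Rothman's regression, which also carried an intercept.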

* Rothman notes that if a team’s run differential is greater than 732, then the linear formula will produce an estimated W% in excess of 1.00. “However, this is not a problem because for the years 1998-2012 the maximum value for (RS - RA) is 300.”

Note that Rothman does not discuss the opposite problem, which is that a run differential of -732 will produce an equally implausible negative W%. But the hand-waving away of this as a potential issue coupled with the posed but unaddressed question “Why not just use the quantity (RS-RA) to calculate EXP(W%)?” is why this article got under my skin.
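The 732 figure and its mirror image fall straight out of the formula. A quick sketch follows; the function name and the +/-800 run differentials are mine, chosen only to land outside the sensible range.

```python
SLOPE = 0.000683  # Linear Formula slope quoted above


def linear_wpct(run_diff):
    """Linear Formula estimate from a seasonal run differential."""
    return SLOPE * run_diff + 0.5


# Run differential at which the estimate hits 1.0; by symmetry, the
# negative of this value is where it hits 0.0.
limit = 0.5 / SLOPE

print(round(limit))                  # roughly 732
print(round(linear_wpct(800), 4))    # above 1
print(round(linear_wpct(-800), 4))   # below 0
```

No real team gets anywhere near these totals, which is the point Rothman makes, but the formula itself does nothing to prevent the nonsense on either side.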

If Dr. Rothman has taken five seconds to consider the advantages and disadvantages of how to construct a W% estimator, scant evidence of it has manifested itself in his paper (and given as this is a commentary on the paper and not Dr. Rothman himself or whatever unpublished consideration he gave to these matters, that is all I have to go on). There is certainly nothing wrong with experimenting with different estimators, but these experiments should not rise to the level of publication in a printed research journal unless they yield new insight in some way. Nothing in Rothman’s piece did--in fact, given the bizarre manner in which he chose to express the equation, I would suggest that if anything the piece regresses the field’s knowledge on W% estimators.

So allow me the liberty of answering Rothman’s question and the hand-waved problem for him.

Q: Why not use run differential to estimate W%?

A: Because doing so, at least through the simple linear regression approach, does not bound W% between zero and one, does not recognize that the marginal value of runs is variable, and does not recognize that the value of a run is dependent on the scoring environment.


Other than that, it’s great!

“Why not?” is a great reason to experiment, but it’s not a great reason to formally propose a new method (well, really, recycle existing methods, but I’m piling on as it is). There is also nothing wrong with using a model with certain deficiencies that other models avoid, whether due to computation restrictions, ease of use, a lack of deleterious effect for the task at hand, etc. But it should be incumbent on the analyst and the publisher to acknowledge them.

Finally, anyone publishing sabermetric research in this day and age should recognize that whatever new approach you believe you have developed for a common problem (like win estimation, or measuring offensive performance), it’s probably not new at all. This is certainly the case here given the work of Soolman, the rule of thumb that ten runs equals one win, the dynamic runs per win formula used in The Hidden Game of Baseball and Total Baseball by Pete Palmer, and other related approaches. All of these are based on the basic construct W% = m*run differential + b.

Personal anecdote: I don’t remember when this was exactly, maybe when I was in the eighth grade, but in our math class we were learning about linear equations of the form y = mx + b and there was an example in the textbook that showed how one could eyeball a line through a scatterplot and develop the equation for that line. In other words, a manual, poor man’s linear regression.

So I did just that with a few years of team data, plotting run differential per game against W% (I want to say I used 1972-74 data), and came up with W% = .1067*RD + .5. Foolishly, I actually used this for W% estimates for a period of time. Thankfully, I was cognizant that it was not a new approach but rather just a specific implementation of one developed by others, and I did not attempt to/no one permitted me to publish it as if it was. Years later, W% = .1106*RD + .5 appeared in the pages of the Baseball Research Journal.

So that this post might have some smidgeon of lasting value, I will close by reiterating the three conditions of an ideal win estimator that such linear constructs fail to satisfy. I have written plenty about win estimators in the past (and will doubtlessly rehash much of it again in the future), but I don’t believe I’ve explicitly singled out those properties. An ideal W% estimator would satisfy all three, which is not to say there is no use for an estimator that satisfies only two or even zero. The Linear Formula satisfies none. I will discuss how three of the common approaches perform: Pythagorean (with fixed exponent), Pythagenpat, and Palmer (RPW = 10*sqrt(runs per inning by both teams)). Palmer can serve as a stand-in for any method that allows RPW to vary as the scoring level varies, and of course there are other constructs that I am not discussing.

1. The estimate should fall in the range [0,1]

The reason for this is self-explanatory. Pythagorean and Pythagenpat pass, while Palmer does not. Obviously this is not really an issue when you apply the method to normal major league teams. It can become an issue when extrapolating to individual/extreme performances, though.

2. The formula should recognize that the marginal value of runs is variable.

This is somewhat related to #1--the construct of Pythagorean results in it passing both tests. However, there are other constructs that are bounded but fail here. Palmer fails here, which is inevitable for a linear formula. The gist here is that each additional run scored is less valuable in terms of buying wins and each additional run prevented is more valuable. This is also the hardest to articulate and the hardest to prove if one has not bought into a Pythagorean-based approach (or examined other W% models such as those based on run distributions).

3. The formula should recognize that as more runs are scored, the number of marginal runs needed to earn a win increases.

This could be confused with #2, but #2 is true regardless of the scoring level in question--it's true in 1930 and in 1968. In this case, the relationship between runs and wins changes as the run environment changes. This is where a fixed exponent Pythagorean approach falls short, while both Pythagenpat and Palmer take this into account.
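For anyone who wants to poke at these properties, here is a rough sketch of the three estimators on a per-game basis. Several things in it are my assumptions and not anything established above: a fixed exponent of 2, a Pythagenpat constant of .29, nine innings per game to convert runs per game into runs per inning, and all of the input values, which are made up for illustration.

```python
import math

INNINGS_PER_GAME = 9   # assumption, to turn runs per game into runs per inning
FIXED_EXPONENT = 2.0   # assumption: the classic fixed Pythagorean exponent
PAT_Z = 0.29           # assumption: one commonly used Pythagenpat constant


def pythagorean(rs, ra, x=FIXED_EXPONENT):
    """Fixed-exponent Pythagorean W%; rs and ra are runs per game."""
    return rs**x / (rs**x + ra**x)


def pythagenpat_exponent(rs, ra, z=PAT_Z):
    """Exponent that floats with total runs per game."""
    return (rs + ra) ** z


def pythagenpat(rs, ra):
    return pythagorean(rs, ra, pythagenpat_exponent(rs, ra))


def palmer_rpw(rs, ra):
    """RPW = 10*sqrt(runs per inning by both teams), as given in the text."""
    return 10 * math.sqrt((rs + ra) / INNINGS_PER_GAME)


def palmer(rs, ra):
    return 0.5 + (rs - ra) / palmer_rpw(rs, ra)


# 1. Range: a made-up extreme "team" scoring 10 and allowing 1 run per game.
extreme = {f.__name__: f(10.0, 1.0) for f in (pythagorean, pythagenpat, palmer)}

# 2. Marginal value: Pythagorean gain from the first vs. second extra run scored.
first_gain = pythagorean(5.5, 4.5) - pythagorean(4.5, 4.5)
second_gain = pythagorean(6.5, 4.5) - pythagorean(5.5, 4.5)

# 3. Scoring environment: a low (7 R/G by both teams) vs. high (11 R/G) context.
low, high = (3.5, 3.5), (5.5, 5.5)
exponents = pythagenpat_exponent(*low), pythagenpat_exponent(*high)
rpws = palmer_rpw(*low), palmer_rpw(*high)

print(extreme)
print(first_gain, second_gain)
print(exponents, rpws)
```

With the extreme inputs, the two Pythagorean variants stay inside [0,1] while Palmer runs past 1; the second extra run scored buys less than the first under Pythagorean; and both the Pythagenpat exponent and Palmer's RPW rise with the scoring level, while the fixed exponent by definition stays put.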

Saturday, April 04, 2015

2015 Predictions

No involved disclaimer this year; I will just point you to this article and point out that it applies even to much more formal predictions than those displayed here. This is my opinion and it is in the spirit of fun rather than analysis:

AL EAST
1. Boston
2. Toronto
3. New York
4. Baltimore
5. Tampa Bay

I am less confident in my order here than for any other division. The whole AL is something of a tossup, though, as many others have noted. I’ve settled on Boston for the East and the pennant. Their starting pitching is mediocre on paper, but at least they should have the resources to improve it (perhaps in a Cole Hamels type way) and a fair number of competent bodies to cycle through the back of the rotation in case of injury or ineffectiveness. But their offense projects as the best in the league. Toronto in many ways is the same team, but with an offense more dependent on its stars and a rotation that, outside Drew Hutchison, may lack the upside of Boston’s. New York looks like a middle-of-the-pack team to me; they may be better than recent years but have that masked by their recent Pythagorean outperformance. Either way I think they would need a lot of old players to stay healthy to win it. Baltimore is a team that I’ve missed on repeatedly, but I don’t just assume that this year’s equivalent of Steve Pearce or Miguel Gonzalez is bound to materialize. Plus I have a natural distrust of all things Ubaldo Jimenez is even tangentially associated with. Tampa Bay is a team that PECOTA loves, but I tend to agree with the mainstream on. Although they do seem like a high variance team and could surprise, plus Drew Smyly.

AL CENTRAL
1. Detroit
2. Cleveland (wildcard)
3. Chicago
4. Kansas City
5. Minnesota

I thought the Tigers were vulnerable last year; that is even more the case in 2015. Verlander’s status as an ace has been seriously impaired, Price for Scherzer from a preseason perspective is ok except that Drew Smyly and Austin Jackson are replaced by Alfredo Simon and Anthony Gose. Or is that Shane Greene and Rajai Davis? Does it really matter? The only reason I am picking them to win is because I can’t bring myself to pick Cleveland. The history of me picking the Shapiro era Indians to win is not a good one--I think 2007 is the only time I got it right. Plus the Indians are too popular among prognosticators for comfort. It’s easy to imagine a scenario where they are really good--Kluber approaches his 2014 performance, a couple of Carrasco/Salazar/Bauer/House are really good, Kipnis or Swisher bounces back, Gomes and Brantley don’t regress too much…but it’s also not that hard to picture multiple issues resulting in a catastrophic failure. While it’s hard to predict bullpens, the Indians’ looks a little precarious thanks to how hard it was worked last year and that the third and fourth righties are Scott Atchison and Anthony Swarzak. The White Sox have frontline players to compete with Detroit and Cleveland for sure, but I question whether the other pieces are strong enough. Tyler Flowers, catcher and Hector Noesi, any role don’t inspire confidence. Kansas City will be one of the great sabermetric/mainstream divergence cases, but other than Yordano Ventura, who am I supposed to like in their rotation? Other than Alex Gordon, who am I supposed to really like in their lineup? The Royals could contend again but it’s hard to pick it. There’s not much to say about the Twins, but I’m sure it’s all Joe Mauer’s fault anyway.

AL WEST
1. Los Angeles
2. Seattle (wildcard)
3. Oakland
4. Houston
5. Texas

My crude numbers have it as too close to call between Los Angeles and Seattle. The Angels would appear to have the stronger offense, the Mariners better pitching. All things being equal I’ll bet on the team with Mike Trout, although the lineup looks below average except for him. It would be fun if Oakland could hang around in the race and bust up some narratives, but I’ve never been a believer in the 2012-2014 A’s in making preseason predictions, so I’m certainly not going to start now. I think this will be the year that Houston safely clears the bar of respectability, although that bar seems to be set higher for them as they have become a lightning rod, often for sabermetrically-inclined people who want to prove that they are not part of the herd. Such is the price of a touch of self-promotion and the stronger “Billy Beane should have never written that book” effect. I was all set to pick Texas as some kind of dark horse bounceback contender, and then my perennial Cy Young pick Yu Darvish went down and I took it as a sign to banish them to the bottom of the league.

NL EAST
1. Washington
2. New York
3. Miami
4. Atlanta
5. Philadelphia

There’s no reason to get off the Washington bandwagon now, as this looks like the safest division pick in MLB. With the teardown of the Braves, there is no credible threat on paper. I do have to balance my backlash impulses against the tendency to overrate supposed “Super Rotations” and the notion that they somehow guarantee playoff success, as if Washington sans Scherzer wasn’t a darn good group or the experience of the Halladay/Lee/Hamels/Oswalt Phillies and the Maddux/Glavine/Smoltz/(Avery/Neagle) Braves shouldn’t have disabused that notion long ago. But bonus likability points for the fact that so many people want to make Bryce Harper into a villain. The Mets were a tempting wildcard pick for me, but the loss of Wheeler made it easier to push them down a little bit. I personally like them better than the crude numbers I run, which only estimate 79 wins. Miami has a fun young core with Stanton, Yelich, Ozuna, Fernandez, etc. but I think they’ve jumped the gun on trying to win and at least one of those moves will be exposed as a big misstep (Dee Gordon). They are the best bet at the moment to be the next non-Washington winner of this division, though. In the span of two years, Atlanta has gone from a team I irrationally liked to one I thought was good but disliked (thanks Brian McCann!) to one that actively appears to court my dislike. It may not matter because they might have the worst offense in the majors. But that distinction may go to the Phillies, who may also have one of the worst pitching staffs. But they have Ryan Howard, franchise icon.

NL CENTRAL
1. St. Louis
2. Chicago (wildcard)
3. Pittsburgh
4. Milwaukee
5. Cincinnati

St. Louis is an easy pick in a different way than Washington--they don’t tower over the field to the same extent, but they are the only thing resembling a safe pick in the Central. And they have Jason Heyward now, which is good for multiple brownie points that didn’t contribute to this pick. I feel like a sucker for picking Chicago to win a wildcard. It’s really easy to let the prospect hype run wild in one’s mind and jump the gun. But what’s one to do? On paper they do appear to be the second-best team in the division, a good offense supporting a bad pitching staff. My crude workup (based on Fangraphs’ composite projections) isn’t counting on too much from Javier Baez or a full season from Kris Bryant, although it does assume Jorge Soler is an excellent player right now. My point is that it’s not a terribly over-exuberant projection. While I don’t put a whole lot of stock in it, Joe Maddon has experience with the quick turnaround, although this time everyone is watching for it. The East offers nothing special in the way of wildcard material and San Diego also carries potential for serious overhype. Pittsburgh should also be right in the mix, looking on paper to be pretty average on both sides of the ball. I overrated Milwaukee last year--I may be too quick to cast them aside in 2015, they also look like a .500 team on paper which in reality means they are a serious wildcard contender. It’s hard to imagine I would be less impressed with Cincinnati’s management post-Baker, and yet here we are. Moving every possible starter to the bullpen to go with Jason Marquis (Jason Marquis is still in the league?!!) at the back of the rotation does not inspire confidence, nor does the continued sniping (and more importantly, loss of skill) of Brandon Phillips. That Raisel Iglesias, who many felt would be a reliever, somehow escaped the Aroldis Chapman Memorial Black Hole, is a mystery that may never be fully explained.

NL WEST

1. Los Angeles
2. San Francisco (wildcard)
3. San Diego
4. Arizona
5. Colorado

I remain befuddled at why PECOTA loves the Dodgers so much; they are again clear favorites but a midpoint expectation of 98 wins doesn’t make any sense. Their offense is far from the sure thing I would expect to predict such a record, although they have intriguing Cubans on call in case of problems. Their bullpen also is far from a sure thing. San Francisco’s offense looks to be below-average, with strong pitching; if you ignore park effects one might say the same about San Diego. The Padres made a splash on the offensive side to be sure, but the left side of the infield is still spotty and it looks to me like pitching is their strength. Flip a coin between the Giants and the Padres; I’ve picked the former simply because it’s the less desirable outcome in my eyes. Arizona and Colorado are not just the two worst teams in this division on paper, they are both contenders for the worst team in the majors, with Philadelphia and perhaps Minnesota and Texas in on the game. When the moves made by Dave Stewart make more immediate intuitive sense than those by new GM Jeff Bridich (seriously, what’s the deal with Jorge De La Rosa?), it’s time to fear for the non-California wing of the NL West.

WORLD SERIES

Washington over Boston

I picked Washington last year and see no reason to stop now. Boston has the potential to make this post look absurd by August but that will happen one way or the other regardless.

AL Rookie of the Year: SP Carlos Rodon, CHA

AL Cy Young: Yovani Gallardo, TEX
Ok, ok, that’s a joke…I picked Gallardo to win the NL Cy more times than I would care to admit.
Serious pick: Chris Sale, CHA
I’m tempted to pick Drew Smyly but that wouldn’t be serious either. I do really like Drew Smyly though.

AL MVP: 2B Robinson Cano, SEA

NL Rookie of the Year: RF Jorge Soler, CHN

NL Cy Young: Stephen Strasburg, WAS
I’m sticking with this until it happens.

NL MVP: RF Bryce Harper, WAS

Worst team in each league: MIN, PHI

Most likely to go .500 in each league: OAK, MIL