Walk Like a Sabermetrician<br />Occasional commentary on baseball and sabermetrics<br /><br />Hypothetical Award Ballots, 2019 (posted 2019-11-11)<br /><br />In the past I’ve split these up into three separate posts, but it’s dawned on me that maybe if combined they will be long enough to actually merit a post. I should note that this is something I write not because I think anyone will be interested in it, but because I enjoy having a record of what I thought about these things years later. In reviewing some of those posts from prior years, I’ve concluded that they had way too many numbers in an attempt to justify every ballot spot. I publish the RAR figures that are the starting point for any retrospective player valuation exercise I engage in -- I no longer see a need to regurgitate them all unless it’s important to a point. <br /><br />AL ROY:<br /><br />1. DH Yordan Alvarez, HOU<br />2. SP John Means, BAL<br />3. SP Zach Plesac, CLE<br />4. 2B Brandon Lowe, TB<br />5. 2B Cavan Biggio, TOR<br /><br />Alvarez is an easy choice – while he only had 367 PA, the only AL hitter with a better RG was Mike Trout. The only real competition is John Means, who turned in a fine season pitching for Baltimore, although his peripherals were far less impressive than his actual results, which was also true for Zach Plesac. I slid Brandon Lowe just ahead of Cavan Biggio on the basis of fielding, which is also why they got the nod over Eloy Jimenez and Luis Arraez.<br /><br />NL ROY:<br /><br />1. 1B Pete Alonso, NYN<br />2. SP Mike Soroka, ATL<br />3. SS Fernando Tatis, SD<br />4. LF Bryan Reynolds, PIT<br />5. SP Chris Paddack, SD<br /><br />Any of the first three would top my AL ballot. 
On a pure RAR basis, Soroka would edge out Alonso, but Soroka’s peripherals were not as strong as his actual runs allowed which drops him a bit. It’s worth noting that on a rate basis Fernando Tatis was better than Alonso -- he had 40 RAR in 84 games, which over a 150 game season would have put him squarely in the MVP race. Of course, he was unlikely to have kept up that pace, and his underlying performance may not have been the equals of those numbers. But on the other hand, he is four years younger than Alonso and much more likely to be a long-term star. Bryan Reynolds had a quietly good season, but there were other strong position player candidates including Keston Hiura, Kevin Newman, Tommy Edman, and Christian Walker, any of whom would have edged out the second basemen on my AL ballot. The same is also true of pitchers -- I went with Chris Paddack over Sandy Alcantara, Dakota Hudson, and Zac Gallen. Gallen was brilliant over 80 innings (2.63 RRA with lesser but still strong peripherals like a 3.70 dRA), but it’s not enough when Paddack tossed 140 innings with 10.6 K/2.1 W per game.<br /><br />AL Cy Young:<br /><br />1. Justin Verlander, HOU<br />2. Gerrit Cole, HOU<br />3. Shane Bieber, CLE<br />4. Lance Lynn, TEX<br />5. Charlie Morton, TB<br /><br />I expect Cole to win, but my vote would go to Verlander. Verlander threw ten more innings with a better RRA and the same eRA, although Cole does better in dRA as Verlander’s BABIP was low (.226 to Cole’s .279). I give that some weight, but not enough to overcome Verlander’s lead, and one could argue that Verlander’s high home run rate should offset his low BABIP when making adjustments for peripherals. Sam Miller pointed out on <U>Effectively Wild</u> that Verlander has had a disproportionate number of second-place finishes in Cy voting. I concur, and while none of them were cases in which the actual choice was a poor one, for my money Verlander was the AL’s top pitcher in 2011, 2012, 2016, 2018, and 2019. 
Mike Minor’s high dRA knocked him off my ballot in favor of teammate Lance Lynn and Charlie Morton.<br /><br />NL Cy Young:<br /><br />1. Jacob deGrom, NYN<br />2. Stephen Strasburg, WAS<br />3. Max Scherzer, WAS<br />4. Jack Flaherty, STL<br />5. Hyun-Jin Ryu, LA<br /><br />deGrom was an easy choice for the top of the ballot, but after that I used a fair amount of judgment. Strasburg had the most consistent RAR figures, whether using RRA, eRA, or dRA; Flaherty and Ryu both had significantly worse dRAs, which dropped them behind the Nationals on my ballot. There also should be some recognition of Zack Greinke; had he spent his entire season in the NL he would have ranked second here, but if it’s an NL award I don’t think AL performance should get any credit, and so he doesn’t rank in the top five.<br /><br />AL MVP:<br /><br />1. CF Mike Trout, LAA<br />2. 3B Alex Bregman, HOU<br />3. SP Justin Verlander, HOU<br />4. SP Gerrit Cole, HOU<br />5. SP Shane Bieber, CLE<br />6. SP Lance Lynn, TEX<br />7. SP Charlie Morton, TB<br />8. SS Marcus Semien, OAK<br />9. SP Mike Minor, TEX<br />10. CF George Springer, HOU<br /><br />Had Mike Trout not been sidelined by a foot issue in September, this wouldn’t even be a question. I still think Trout is the clear (if not inarguable) choice; he starts ahead of Bregman by just a single run in RAR, and if you give full credit to fielding metrics, Bregman could be ahead as Trout’s BP/UZR/DRS fielding runs saved were (7, -1, -1) compared to Bregman’s (11, 2, 7). However, I only give half-credit as the uncertainty regarding fielding performance means an estimated fielding run saved is not as conclusive of value as an estimated offensive run contributed. The other major area of the game not taken into account in my RAR estimates is baserunning, and using BP’s figures, Trout was +3 runs and Bregman -4 (removing basestealing runs, which I already take into account). 
That wipes out any advantage Bregman might have in the field, and all things being equal I would take the player who contributes equal RAR in less playing time - just because I think that if I’ve erred in setting replacement level, I’ve erred by setting it too low. The slotting of position players otherwise follows RAR except that Xander Bogaerts had dreadful fielding metrics (-21, 1, -21) which knocks him out.<br /><br />If you just look at RAR, Verlander could rank ahead of either of the hitters, but while I have absolutely no problem supporting a pitcher as MVP, I do think in such a case that they should have better RAR not just when using their actual runs allowed, but using peripherals as well. Verlander has 91, 83, or 64 RAR depending on the inputs you use; I have Trout as 80 when considering fielding and baserunning, and that sixteen run gap using Verlander’s dRA is too large for me to put him on top.<br /><br />I’ve never put six pitchers on a hypothetical MVP ballot before, and as you’ll see with the NL, a full half of my MVP ballot spots went to pitchers. One thing I should revisit is the replacement level I’m using for starters, which is 128% of the league average RA; I had previously used 125%, and with the continual decline in the share of innings borne by starters and the 2019 development that starters had a better overall eRA than relievers, it’s worth revisiting the replacement level I’m using for starters and considering adjusting it downward. <br /><br />NL MVP:<br /><br />1. CF Cody Bellinger, LA<br />2. RF Christian Yelich, MIL<br />3. SP Jacob deGrom, NYN<br />4. 3B Anthony Rendon, WAS<br />5. SP Stephen Strasburg, WAS<br />6. SP Max Scherzer, WAS<br />7. 1B Pete Alonso, NYN<br />8. CF Ronald Acuna, ATL<br />9. LF Juan Soto, WAS<br />10. SP Jack Flaherty, STL<br /><br />Bellinger and Yelich were very close in RAR, but this is a case where fielding gives Bellinger (15, 10, 19) a clear edge over Yelich (-1, 0, -3). 
That’s pretty much the only place that needs explanation beyond just perusing the RAR figures, except that Starling Marte’s (-12, -1, -1) fielding puts him behind the young outfielders of the NL East.<br /><br />End of Season Statistics, 2019 (posted 2019-10-04)<br /><br />The spreadsheets are published as Google Spreadsheets, which you can download in Excel format by changing the extension in the address from "=html" to "=xlsx", or in open format as "=ods". That way you can download them and manipulate things however you see fit.<br /><br />The data comes from a number of different sources. Most of the data comes from Baseball-Reference. KJOK's park database is extremely helpful in determining when park factors should reset. <br /><br />The basic philosophy behind these stats is to use the simplest methods that have acceptable accuracy. Of course, "acceptable" is in the eye of the beholder, namely me. I use Pythagenpat not because other run/win converters, like a constant RPW or a fixed exponent, are not accurate enough for this purpose, but because it's mine and it would be kind of odd if I didn't use it.<br /><br />If I seem to be a stickler for purity in my critiques of others' methods, I'd contend it is usually in a theoretical sense, not an input sense. So when I exclude hit batters, I'm not saying that hit batters are worthless or that they *should* be ignored; it's just easier not to mess with them and not that much less accurate (note: hit batters are actually included in the offensive statistics now).<br /><br />I also don't really have a problem with people using sub-standard methods (say, Basic RC) as long as they acknowledge that they are sub-standard. 
If someone pretends that Basic RC doesn't undervalue walks or cause problems when applied to extreme individuals, I'll call them on it; if they explain its shortcomings but use it regardless, I accept that. Take these last three paragraphs as my acknowledgment that some of the statistics displayed here have shortcomings as well, and I've at least attempted to describe some of them in the discussion below.<br /><br />The League spreadsheet is pretty straightforward--it includes league totals and averages for a number of categories, most or all of which are explained at appropriate junctures throughout this piece. The advent of interleague play has created two different sets of league totals--one for the offense of league teams and one for the defense of league teams. Before interleague play, these two were identical. I do not present both sets of totals (you can figure the defensive ones yourself from the team spreadsheet, if you desire), just those for the offenses. The exception is for the defense-specific statistics, like innings pitched and quality starts. The figures for those categories in the league report are for the defenses of the league's teams. However, I do include each league's breakdown of basic pitching stats between starters and relievers (denoted by "s" or "r" prefixes), and so summing those will yield the totals from the pitching side. The one abbreviation you might not recognize is "N"--this is the league average of runs/game for one team, and it will pop up again.<br /><br />The Team spreadsheet focuses on overall team performance--wins, losses, runs scored, runs allowed. 
The columns included are: Park Factor (PF), Home Run Park Factor (PFhr), Winning Percentage (W%), Expected W% (EW%), Predicted W% (PW%), wins, losses, runs, runs allowed, Runs Created (RC), Runs Created Allowed (RCA), Home Winning Percentage (HW%), Road Winning Percentage (RW%) [exactly what they sound like--W% at home and on the road], Runs/Game (R/G), Runs Allowed/Game (RA/G), Runs Created/Game (RCG), Runs Created Allowed/Game (RCAG), and Runs Per Game (the average number of runs scored and allowed per game). Ideally, I would use outs as the denominator, but for teams, outs and games are so closely related that I don’t think it’s worth the extra effort.<br /><br />The runs and Runs Created figures are unadjusted, but the per-game averages are park-adjusted, except for RPG which is also raw. Runs Created and Runs Created Allowed are both based on a simple Base Runs formula. The formula is:<br /><br />A = H + W - HR - CS<br />B = (2TB - H - 4HR + .05W + 1.5SB)*.76<br />C = AB - H<br />D = HR<br />Naturally, A*B/(B + C) + D.<br /><br />I have explained the methodology used to figure the PFs before, but the cliff’s notes version is that they are based on five years of data when applicable, include both runs scored and allowed, and they are regressed towards average (PF = 1), with the amount of regression varying based on the number of years of data used. There are factors for both runs and home runs. The initial PF (not shown) is:<br /><br />iPF = (H*T/(R*(T - 1) + H) + 1)/2<br />where H = RPG in home games, R = RPG in road games, T = # teams in league (15 for both the AL and NL). Then the iPF is converted to the PF by taking x*iPF + (1-x), where x = .6 if one year of data is used, .7 for 2, .8 for 3, and .9 for 4+.<br /><br />It is important to note, since there always seems to be confusion about this, that these park factors already incorporate the fact that the average player plays 50% on the road and 50% at home. 
That is what the adding one and dividing by 2 in the iPF is all about. So if I list Fenway Park with a 1.02 PF, that means that it actually increases RPG by 4%.<br /><br />In the calculation of the PFs, I did not take out “home” games that were actually at neutral sites (of which there were a rash this year).<br /><br />There are also Team Offense and Defense spreadsheets. These include the following categories:<br /><br />Team offense: Plate Appearances, Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Walks and Hit Batters per At Bat (WAB), Isolated Power (SLG - BA), R/G at home (hR/G), and R/G on the road (rR/G). BA, OBA, SLG, WAB, and ISO are park-adjusted by dividing by the square root of park factor (or the equivalent; WAB = (OBA - BA)/(1 - OBA), ISO = SLG - BA, and SEC = WAB + ISO).<br /><br />Team defense: Innings Pitched, BA, OBA, SLG, Innings per Start (IP/S), Starter's eRA (seRA), Reliever's eRA (reRA), Quality Start Percentage (QS%), RA/G at home (hRA/G), RA/G on the road (rRA/G), Battery Mishap Rate (BMR), Modified Fielding Average (mFA), and Defensive Efficiency Record (DER). BA, OBA, and SLG are park-adjusted by dividing by the square root of PF; seRA and reRA are divided by PF.<br /><br />The three fielding metrics I've included are limited to metrics that a) I can calculate myself and b) are based on the basic available data, not specialized PBP data. The three metrics are explained in this post, but here are quick descriptions of each:<br /><br />1) BMR--wild pitches and passed balls per 100 baserunners = (WP + PB)/(H + W - HR)*100<br /><br />2) mFA--fielding average removing strikeouts and assists = (PO - K)/(PO - K + E)<br /><br />3) DER--the Bill James classic, using only the PA-based estimate of plays made. Based on a suggestion by Terpsfan101, I've tweaked the error coefficient. 
Plays Made = PA - K - H - W - HR - HB - .64E and DER = PM/(PM + H - HR + .64E)<br /><br />Next are the individual player reports. I defined a starting pitcher as one with 15 or more starts. All other pitchers are eligible to be included as a reliever. If a pitcher has 40 appearances, then they are included. Additionally, if a pitcher has 50 innings and less than 50% of his appearances are starts, he is also included as a reliever (this allows some swingmen type pitchers who wouldn’t meet either the minimum start or appearance standards to get in). This would be a good point to note that I didn't do much to adjust for the opener--I made some judgment calls (very haphazard judgment calls) on which bucket to throw some pitchers in. This is something that I should definitely give some more thought to in coming years.<br /><br />For all of the player reports, ages are based on simply subtracting their year of birth from 2019. I realize that this is not compatible with how ages are usually listed and so “Age 27” doesn’t necessarily correspond to age 27 as I list it, but it makes everything a heckuva lot easier, and I am more interested in comparing the ages of the players to their contemporaries than fitting them into historical studies, and for the former application it makes very little difference. The "R" category records rookie status with a "R" for rookies and a blank for everyone else; I've trusted Baseball Prospectus on this. 
Also, all players are counted as being on the team with whom they played/pitched (IP or PA as appropriate) the most.<br /><br />For relievers, the categories listed are: Games, Innings Pitched, estimated Plate Appearances (PA), Run Average (RA), Relief Run Average (RRA), Earned Run Average (ERA), Estimated Run Average (eRA), DIPS Run Average (dRA), Strikeouts per Game (KG), Walks per Game (WG), Guess-Future (G-F), Inherited Runners per Game (IR/G), Batting Average on Balls in Play (%H), Runs Above Average (RAA), and Runs Above Replacement (RAR).<br /><br />IR/G is per relief appearance (G - GS); it is an interesting thing to look at, I think, in lieu of actual leverage data. You can see which closers come in with runners on base, and which are used nearly exclusively to start innings. Of course, you can’t infer too much; there are bad relievers who come in with a lot of people on base, not because they are being used in high leverage situations, but because they are long men being used in low-leverage situations already out of hand.<br /><br />For starting pitchers, the columns are: Wins, Losses, Innings Pitched, Estimated Plate Appearances (PA), RA, RRA, ERA, eRA, dRA, KG, WG, G-F, %H, Pitches/Start (P/S), Quality Start Percentage (QS%), RAA, and RAR. RA and ERA you know--R*9/IP or ER*9/IP, park-adjusted by dividing by PF. The formulas for eRA and dRA are based on the same Base Runs equation and they estimate RA, not ERA.<br /><br />* eRA is based on the actual results allowed by the pitcher (hits, doubles, home runs, walks, strikeouts, etc.). It is park-adjusted by dividing by PF.<br /><br />* dRA is the classic DIPS-style RA, assuming that the pitcher allows a league average %H, and that his hits in play have a league-average S/D/T split. 
It is park-adjusted by dividing by PF.<br /><br />The formula for eRA is:<br /><br />A = H + W - HR<br />B = (2*TB - H - 4*HR + .05*W)*.78<br />C = AB - H = K + (3*IP - K)*x (where x is figured as described below for PA estimation and is typically around .93) = PA (from below) - H - W<br />eRA = (A*B/(B + C) + HR)*9/IP<br /><br />To figure dRA, you first need the estimate of PA described below. Then you calculate W, K, and HR per PA (call these %W, %K, and %HR). Percentage of balls in play (BIP%) = 1 - %W - %K - %HR. This is used to calculate the DIPS-friendly estimate of %H (H per PA) as e%H = Lg%H*BIP%.<br /><br />Now everything has a common denominator of PA, so we can plug into Base Runs:<br /><br />A = e%H + %W<br />B = (2*(z*e%H + 4*%HR) - e%H - 5*%HR + .05*%W)*.78<br />C = 1 - e%H - %W - %HR<br />dRA = (A*B/(B + C) + %HR)/C*a<br /><br />z is the league average of total bases per non-HR hit (TB - 4*HR)/(H - HR), and a is the league average of (AB - H) per game.<br /><br />Also shown are strikeout and walk rate, both expressed as per game. By game I mean not nine innings but rather the league average of PA/G. I have always been a proponent of using PA and not IP as the denominator for non-run pitching rates, and now the use of per PA rates is widespread. Usually these are expressed as K/PA and W/PA, or equivalently, percentage of PA with a strikeout or walk. I don’t believe that any site publishes these as K and W per equivalent game as I am here. This is not better than K%--it’s simply applying a scalar multiplier. I like it because it generally follows the same scale as the familiar K/9.<br /><br />To facilitate this, I’ve finally corrected a flaw in the formula I use to estimate plate appearances for pitchers. Previously, I’ve done it the lazy way by not splitting strikeouts out from other outs. 
I am now using this formula to estimate PA (where PA = AB + W):<br /><br />PA = K + (3*IP - K)*x + H + W<br />Where x = league average of (AB - H - K)/(3*IP - K)<br /><br />Then KG = K/PA*Lg(PA/G) and WG = W/PA*Lg(PA/G).<br /><br />G-F is a junk stat, included here out of habit because I've been including it for years. It was intended to give a quick read of a pitcher's expected performance in the next season, based on eRA and strikeout rate. Although the numbers vaguely resemble RAs, it's actually unitless. As a rule of thumb, anything under four is pretty good for a starter. G-F = 4.46 + .095(eRA) - .113(K*9/IP). It is a junk stat. JUNK STAT JUNK STAT JUNK STAT. Got it?<br /><br />%H is BABIP, more or less--%H = (H - HR)/(PA - HR - K - W), where PA was estimated above. Pitches/Start includes all appearances, so I've counted relief appearances as one-half of a start (P/S = Pitches/(.5*G + .5*GS)). QS% is just QS/GS; I don't think it's particularly useful, but Doug's Stats include QS so I include it.<br /><br />I've used a stat called Relief Run Average (RRA) in the past, based on Sky Andrecheck's article in the August 1999 By the Numbers; that one only used inherited runners, but I've revised it to include bequeathed runners as well, making it equally applicable to starters and relievers. One thing that's become more problematic as time goes on for calculating this expanded metric is the sketchy availability of bequeathed runner data for relievers. As a result, only bequeathed runners left by starters (and "relievers" when pitching as starters) are taken into account here. I use RRA as the building block for baselined value estimates for all pitchers. I explained RRA in this article, but the bottom line formulas are:<br /><br />BRSV = BRS - BR*i*sqrt(PF)<br />IRSV = IR*i*sqrt(PF) - IRS<br />RRA = ((R - (BRSV + IRSV))*9/IP)/PF<br /><br />The two baselined stats are Runs Above Average (RAA) and Runs Above Replacement (RAR). 
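<br /><br />As an aside before the baselines, here is a minimal Python sketch of the pitcher PA estimate, the per-game strikeout and walk rates, and %H described above. It reads KG and WG as events per estimated PA scaled by the league PA/G, which is an assumption on my part about the intended calculation. The function names, the sample pitching line, and the league constants (x = .93, 38 PA/G) are hypothetical values chosen for illustration, not figures from the spreadsheets.

```python
def estimate_pa(ip, h, w, k, x):
    """Estimated PA (AB + W): strikeouts, other outs scaled by x, hits, walks."""
    return k + (3 * ip - k) * x + h + w


def per_game_rate(events, pa, lg_pa_per_game):
    """Events per PA, scaled to a league-average game's worth of PA."""
    return events / pa * lg_pa_per_game


def pct_h(h, hr, k, w, pa):
    """%H (roughly BABIP): non-HR hits per ball in play."""
    return (h - hr) / (pa - hr - k - w)


# Hypothetical pitching line and league constants, for illustration only.
ip, h, w, k, hr = 200.0, 160, 50, 220, 25
x = 0.93               # assumed league (AB - H - K)/(3*IP - K)
lg_pa_per_game = 38.0  # assumed league PA/G

pa = estimate_pa(ip, h, w, k, x)
kg = per_game_rate(k, pa, lg_pa_per_game)
wg = per_game_rate(w, pa, lg_pa_per_game)
babip = pct_h(h, hr, k, w, pa)

print(f"PA={pa:.1f} KG={kg:.2f} WG={wg:.2f} %H={babip:.3f}")
```

Because KG and WG are just K% and W% times a league constant, swapping in a different league PA/G rescales every pitcher by the same factor and leaves their ordering unchanged.<br /><br />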
Starting in 2015 I revised RAA to use a slightly different baseline for starters and relievers as described here. The adjustment is based on patterns from the last several seasons of league average starter and reliever eRA. Thus it does not adjust for any advantages relief pitchers enjoy that are not reflected in their component statistics. This could include runs allowed scoring rules that benefit relievers (although the use of RRA should help even the scales in this regard, at least compared to raw RA) and the talent advantage of starting pitchers. The RAR baselines do attempt to take the latter into account, and so the difference in starter and reliever RAR will be more stark than the difference in RAA.<br /><br />RAA (relievers) = (.951*LgRA - RRA)*IP/9<br />RAA (starters) = (1.025*LgRA - RRA)*IP/9<br />RAR (relievers) = (1.11*LgRA - RRA)*IP/9<br />RAR (starters) = (1.28*LgRA - RRA)*IP/9<br /><br />All players with 250 or more plate appearances (official, total plate appearances) are included in the Hitters spreadsheets (along with some players close to the cutoff point who I was interested in). Each is assigned one position, the one at which they appeared in the most games. The statistics presented are: Games played (G), Plate Appearances (PA), Outs (O), Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Runs Created (RC), Runs Created per Game (RG), Speed Score (SS), Hitting Runs Above Average (HRAA), Runs Above Average (RAA), Hitting Runs Above Replacement (HRAR), and Runs Above Replacement (RAR).<br /><br />Starting in 2015, I'm including hit batters in all related categories for hitters, so PA is now equal to AB + W+ HB. Outs are AB - H + CS. BA and SLG you know, but remember that without SF, OBA is just (H + W + HB)/(AB + W + HB). Secondary Average = (TB - H + W + HB)/AB = SLG - BA + (OBA - BA)/(1 - OBA). 
I have not included net steals as many people (and Bill James himself) do, but I have included HB which some do not.<br /><br />BA, OBA, and SLG are park-adjusted by dividing by the square root of PF. This is an approximation, of course, but I'm satisfied that it works well (I plan to post a couple articles on this some time during the offseason). The goal here is to adjust for the win value of offensive events, not to quantify the exact park effect on the given rate. I use the BA/OBA/SLG-based formula to figure SEC, so it is park-adjusted as well.<br /><br />Runs Created is actually Paul Johnson's ERP, more or less. Ideally, I would use a custom linear weights formula for the given league, but ERP is just so darn simple and close to the mark that it’s hard to pass up. I still use the term “RC” partially as a homage to Bill James (seriously, I really like and respect him even if I’ve said negative things about RC and Win Shares), and also because it is just a good term. I like the thought put in your head when you hear “creating” a run better than “producing”, “manufacturing”, “generating”, etc. to say nothing of names like “equivalent” or “extrapolated” runs. None of that is said to put down the creators of those methods--there just aren’t a lot of good, unique names available.<br /><br />For 2015, I refined the formula a little bit to:<br /><br />1. include hit batters at a value equal to that of a walk<br />2. value intentional walks at just half the value of a regular walk<br />3. recalibrate the multiplier based on the last ten major league seasons (2005-2014)<br /><br />This revised RC = (TB + .8H + W + HB - .5IW + .7SB - CS - .3AB)*.310<br /><br />RC is park adjusted by dividing by PF, making all of the value stats that follow park adjusted as well. RG, the Runs Created per Game rate, is RC/O*25.5. I do not believe that outs are the proper denominator for an individual rate stat, but I also do not believe that the distortions caused are that bad. 
(I still intend to finish my rate stat series and discuss all of the options in excruciating detail, but alas you’ll have to take my word for it now).<br /><br />Several years ago I switched from using my own "Speed Unit" to a version of Bill James' Speed Score; of course, Speed Unit was inspired by Speed Score. I only use four of James' categories in figuring Speed Score. I actually like the construct of Speed Unit better as it was based on z-scores in the various categories (and amazingly a couple other sabermetricians did as well), but trying to keep the estimates of standard deviation for each of the categories appropriate was more trouble than it was worth.<br /><br />Speed Score is the average of four components, which I'll call a, b, c, and d:<br /><br />a = ((SB + 3)/(SB + CS + 7) - .4)*20<br />b = sqrt((SB + CS)/(S + W))*14.3<br />c = ((R - HR)/(H + W - HR) - .1)*25<br />d = T/(AB - HR - K)*450<br /><br />James actually uses a sliding scale for the triples component, but it strikes me as needlessly complex and so I've streamlined it. He looks at two years of data, which makes sense for a gauge that is attempting to capture talent and not performance, but using multiple years of data would be contradictory to the guiding principles behind this set of reports (namely, simplicity. Or laziness. Your pick.) I also changed some of his division to mathematically equivalent multiplications.<br /><br />There are a whopping four categories that compare to a baseline; two for average, two for replacement. Hitting RAA compares to a league average hitter; it is in the vein of Pete Palmer’s Batting Runs. RAA compares to an average hitter at the player’s primary position. Hitting RAR compares to a “replacement level” hitter; RAR compares to a replacement level hitter at the player’s primary position. 
The formulas are:<br /><br />HRAA = (RG - N)*O/25.5<br />RAA = (RG - N*PADJ)*O/25.5<br />HRAR = (RG - .73*N)*O/25.5<br />RAR = (RG - .73*N*PADJ)*O/25.5<br /><br />PADJ is the position adjustment, and it has now been updated to be based on 2010-2019 offensive data. For catchers it is .92; for 1B/DH, 1.14; for 2B, .99; for 3B, 1.07; for SS, .95; for LF/RF, 1.09; and for CF, 1.05. As positional flexibility takes hold, fielding value is better quantified, and the long-term evolution of the game continues, it's right to question whether offensive positional adjustments are even less reflective of what we are trying to account for than they were in the past. I have a general discussion about the use of offensive positional adjustments below that I wrote a decade ago, but I will also have a bit more to say about this and these specific adjustments in my annual post on Hitting by Position which hopefully will actually be published this year. <br /><br />That was the mechanics of the calculations; now I'll twist myself into knots trying to justify them. If you only care about the how and not the why, stop reading now.<br /><br />The first thing that should be covered is the philosophical position behind the statistics posted here. They fall on the continuum of ability and value in what I have called "performance". Performance is a technical-sounding way of saying "Whatever arbitrary combination of ability and value I prefer".<br /><br />With respect to park adjustments, I am not interested in how any particular player is affected, so there is no separate adjustment for lefties and righties for instance. The park factor is an attempt to determine how the park affects run scoring rates, and thus the win value of runs.<br /><br />I apply the park factor directly to the player's statistics, but it could also be applied to the league context. 
The advantage to doing it my way is that it allows you to compare the component statistics (like Runs Created or OBA) on a park-adjusted basis. The drawback is that it creates a new theoretical universe, one in which all parks are equal, rather than leaving the player grounded in the actual context in which he played and evaluating how that context (and not the player's statistics) was altered by the park.<br /><br />The good news is that the two approaches are essentially equivalent; in fact, they are precisely equivalent if you assume that the Runs Per Win factor is equal to the RPG. Suppose that we have a player in an extreme park (PF = 1.15, approximately like Coors Field pre-humidor) who has an 8 RG before adjusting for park, while making 350 outs in a 4.5 N league. The first method of park adjustment, the one I use, converts his value into a neutral park, so his RG is now 8/1.15 = 6.957. We can now compare him directly to the league average:<br /><br />RAA = (6.957 - 4.5)*350/25.5 = +33.72<br /><br />The second method would be to adjust the league context. If N = 4.5, then the average player in this park will create 4.5*1.15 = 5.175 runs. Now, to figure RAA, we can use the unadjusted RG of 8:<br /><br />RAA = (8 - 5.175)*350/25.5 = +38.77<br /><br />These are not the same, as you can obviously see. The reason for this is that they take place in two different contexts. The first figure is in a 9 RPG (2*4.5) context; the second figure is in a 10.35 RPG (2*4.5*1.15) context. Runs have different values in different contexts; that is why we have RPW converters in the first place. If we convert to WAA (using RPW = RPG, which is only an approximation, so it's usually not as tidy as it appears below), then we have:<br /><br />WAA = 33.72/9 = +3.75<br />WAA = 38.77/10.35 = +3.75<br /><br />Once you convert to wins, the two approaches are equivalent. 
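<br /><br />The worked example above can be checked with a short Python sketch. It uses the same inputs as the text (PF = 1.15, 8 RG, 350 outs, N = 4.5) and the same simplifying assumption that runs per win equals RPG; the function and variable names are my own, introduced only for illustration.

```python
def raa(rg, baseline_rg, outs, outs_per_game=25.5):
    """Runs above a baseline, given runs per game and outs made."""
    return (rg - baseline_rg) * outs / outs_per_game


pf, raw_rg, outs, n = 1.15, 8.0, 350, 4.5

# Method 1: park-adjust the player, compare to the neutral league.
raa_player_adj = raa(raw_rg / pf, n, outs)

# Method 2: leave the player alone, park-adjust the league baseline.
raa_league_adj = raa(raw_rg, n * pf, outs)

# Convert to wins, assuming RPW equals the RPG of each context.
waa_player_adj = raa_player_adj / (2 * n)
waa_league_adj = raa_league_adj / (2 * n * pf)

print(f"RAA: {raa_player_adj:.2f} vs {raa_league_adj:.2f}")
print(f"WAA: {waa_player_adj:.2f} vs {waa_league_adj:.2f}")
```

Under the RPW = RPG assumption the two WAA figures agree exactly, since dividing the second RAA by the extra factor of PF reproduces the first method algebraically; with a different run-to-win converter the match would only be approximate.<br /><br />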
The other nice thing about the first approach is that once you park-adjust, everyone in the league is in the same context, and you can dispense with the need for converting to wins at all. You still might want to convert to wins, and you'll need to do so if you are comparing the 2019 players to players from other league-seasons (including between the AL and NL in the same year), but if you are only looking to compare Christian Yelich to Matt Carpenter, it's not necessary. WAR is somewhat ubiquitous now, but personally I prefer runs when possible--why mess with decimal points if you don't have to?<br /><br />The park factors used to adjust player stats here are run-based. Thus, they make no effort to project what a player "would have done" in a neutral park, or account for the different effects parks have on specific events (walks, home runs, BA) or types of players. They simply account for the difference in run environment that is caused by the park (as best I can measure it). As such, they don't evaluate a player within the actual run context of his team's games; they attempt to restate the player's performance as an equivalent performance in a neutral park.<br /><br />I suppose I should also justify the use of sqrt(PF) for adjusting component statistics. The classic defense given for this approach relies on basic Runs Created--runs are proportional to OBA*SLG, and OBA*SLG/PF = OBA/sqrt(PF)*SLG/sqrt(PF). While RC may be an antiquated tool, you will find that the square root adjustment is fairly compatible with linear weights or Base Runs as well. I am not going to take the space to demonstrate this claim here, but I will some time in the future.<br /><br />Many value figures published around the sabersphere adjust for the difference in quality level between the AL and NL. I don't, but this is a thorny area where there is no right or wrong answer as far as I'm concerned. 
I also do not make an adjustment in the league averages for the fact that the overall NL averages include pitcher batting and the AL does not (not quite true in the era of interleague play, but you get my drift).<br /><br />The difference between the leagues may not be precisely calculable, and it certainly is not constant, but it is real. If the average player in the AL is better than the average player in the NL, it is perfectly reasonable to expect the average AL player to have more RAR than the average NL player, and that will not happen without some type of adjustment. On the other hand, if you are only interested in evaluating a player relative to his own league, such an adjustment is not necessarily welcome.<br /><br />The league argument only applies cleanly to metrics baselined to average. Since replacement level compares the given player to a theoretical player that can be acquired on the cheap, the same pool of potential replacement players should by definition be available to the teams of each league. One could argue that if the two leagues don't have equal talent at the major league level, they might not have equal access to replacement level talent--except such an argument is at odds with the notion that replacement level represents talent that is truly "freely available".<br /><br />So it's hard to justify the approach I take, which is to set replacement level relative to the average runs scored in each league, with no adjustment for the difference in the leagues. The best justification is that it's simple and it treats each league as its own universe, even if in reality they are connected.<br /><br />The replacement levels I have used here are very much in line with the values used by other sabermetricians. 
This is based on my own "research", my interpretation of other people's research, and a desire to not stray from consensus and make the values unhelpful to the majority of people who may encounter them.<br /><br />Replacement level is certainly not settled science. There is always going to be room to disagree on what the baseline should be. Even if you agree it should be "replacement level", any estimate of where it should be set is just that--an estimate. Average is clean and fairly straightforward, even if its utility is questionable; replacement level is inherently messy. So I offer the average baseline as well.<br /><br />For position players, replacement level is set at 73% of the positional average RG (since there's a history of discussing replacement level in terms of winning percentages, this is roughly equivalent to .350). For starting pitchers, it is set at 128% of the league average RA (.380), and for relievers it is set at 111% (.450).<br /><br />I am still using an analytical structure that makes the comparison to replacement level for a position player by applying it to his hitting statistics. This is the approach taken by Keith Woolner in VORP (and some other earlier replacement level implementations), but the newer metrics (among them Rally and Fangraphs' WAR) handle replacement level by subtracting a set number of runs from the player's total runs above average in a number of different areas (batting, fielding, baserunning, positional value, etc.), which for lack of a better term I will call the subtraction approach.<br /><br />The offensive positional adjustment makes the inherent assumption that the average player at each position is equally valuable. I think that this is close to being true, but it is not quite true. The ideal approach would be to use a defensive positional adjustment, since the real difference between a first baseman and a shortstop is their defensive value. 
When you bat, all runs count the same, whether you create them as a first baseman or as a shortstop.<br /><br />That being said, using "replacement hitter at position" does not cause too many distortions. It is not theoretically correct, but it is practically powerful. For one thing, most players, even those at key defensive positions, are chosen first and foremost for their offense. Empirical research by Keith Woolner has shown that the replacement level hitting performance is about the same for every position, relative to the positional average.<br /><br />Figuring what the defensive positional adjustment should be, though, is easier said than done. Therefore, I use the offensive positional adjustment. So if you want to criticize that choice, or criticize the numbers that result, be my guest. But do not claim that I am holding this up as the correct analytical structure. I am holding it up as the most simple and straightforward structure that conforms to reality reasonably well, and because while the numbers may be flawed, they are at least based on an objective formula that I can figure myself. If you feel comfortable with some other assumptions, please feel free to ignore mine.<br /><br />That still does not justify the use of HRAR--hitting runs above replacement--which compares each hitter, regardless of position, to 73% of the league average. Basically, this is just a way to give an overall measure of offensive production without regard for position with a low baseline. It doesn't have any real baseball meaning.<br /><br />A player who creates runs at 90% of the league average could be above-average (if he's a shortstop or catcher, or a great fielder at a less important fielding position), or sub-replacement level (DHs that create 3.5 runs per game are not valuable properties). Every player is chosen because his total value, both hitting and fielding, is sufficient to justify his inclusion on the team. 
HRAR fails even if you try to justify it with a thought experiment about a world in which defense doesn't matter, because in that case the absolute replacement level (in terms of RG, without accounting for the league average) would be much higher than it is currently.<br /><br />The specific positional adjustments I use are based on 2002-2011 data. I stick with them because I have not seen compelling evidence of a change in the degree of difficulty or scarcity between the positions between now and then, and because I think they are fairly reasonable. The positions for which they diverge the most from the defensive position adjustments in common use are 2B, 3B, and CF. Second base is considered a premium position by the offensive PADJ (.99), while third base and center field have larger adjustments in the opposite direction (1.05 and 1.07).<br /><br />Another flaw is that the PADJ is applied to the overall league average RG, which is artificially low for the NL because of pitchers' batting. When using the actual league average runs/game, it's tough to just remove pitchers--any adjustment would be an estimate. If you use the league total of runs created instead, it is a much easier fix.<br /><br />One other note on this topic is that since the offensive PADJ is a stand-in for average defensive value by position, ideally it would be applied by tying it to defensive playing time. I have done it by outs, though.<br /><br />The reason I have taken this flawed path is that 1) it ties the position adjustment directly into the RAR formula rather than leaving it as something to subtract on the outside and, more importantly, 2) there’s no straightforward way to do it. The best would be to use defensive innings--set the full-time player to X defensive innings, figure how Derek Jeter’s innings compared to X, and adjust his PADJ accordingly. Games in the field or games played are dicey because they can cause distortion for defensive replacements. 
Plate Appearances avoid the problem that outs have of being highly related to player quality, but they still carry the illogic of basing it on offensive playing time. And of course the differences here are going to be fairly small (a few runs). That is not to say that this way is preferable, but it’s not horrible either, at least as far as I can tell.<br /><br />To compare this approach to the subtraction approach, start by assuming that a replacement level shortstop would create .86*.73*4.5 = 2.825 RG (or would perform at an overall level of equivalent value to being an average fielder at shortstop while creating 2.825 runs per game). Suppose that we are comparing two shortstops, each of whom compiled 600 PA and played an equal number of defensive games and innings (and thus would have the same positional adjustment using the subtraction approach). Alpha made 380 outs and Bravo made 410 outs, and each ranked as dead-on average in the field.<br /><br />The difference in overall RAR between the two using the subtraction approach would be equal to the difference between their offensive RAA compared to the league average. Assuming the league average is 4.5 runs, and that both Alpha and Bravo created 75 runs, their offensive RAAs are:<br /><br />Alpha = (75*25.5/380 - 4.5)*380/25.5 = +7.94<br /><br />Similarly, Bravo is at +2.65, and so the difference between them will be 5.29 RAR.<br /><br />Using the flawed approach, Alpha's RAR will be:<br /><br />(75*25.5/380 - 4.5*.73*.86)*380/25.5 = +32.90<br /><br />Bravo's RAR will be +29.58, a difference of 3.32 RAR, which is two runs off of the difference using the subtraction approach.<br /><br />The downside to using PA is that you really need to consider park effects if you do, whereas outs allow you to sidestep park effects. Outs are constant; plate appearances are linked to OBA. Thus, they not only depend on the offensive context (including park factor), but also on the quality of one's team. 
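<br /><br />The Alpha and Bravo comparison above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the example: the function names are made up for illustration, and the .86 shortstop PADJ, 73% replacement level, and 4.5 league RG are simply the values assumed in the example.

```python
OUTS_PER_GAME = 25.5


def offensive_raa(runs_created, outs, league_rg):
    """Offensive runs above average: the only part of the subtraction
    approach that differs between two equally-used, average-fielding
    players at the same position."""
    rg = runs_created * OUTS_PER_GAME / outs
    return (rg - league_rg) * outs / OUTS_PER_GAME


def padj_rar(runs_created, outs, league_rg, padj, repl_pct=0.73):
    """RAR with the offensive positional adjustment applied by outs."""
    rg = runs_created * OUTS_PER_GAME / outs
    baseline = league_rg * repl_pct * padj  # replacement hitter at position
    return (rg - baseline) * outs / OUTS_PER_GAME


LEAGUE_RG, SS_PADJ = 4.5, 0.86
alpha_outs, bravo_outs = 380, 410

# Gap between the two players under each approach.
subtraction_gap = (offensive_raa(75, alpha_outs, LEAGUE_RG)
                   - offensive_raa(75, bravo_outs, LEAGUE_RG))
padj_gap = (padj_rar(75, alpha_outs, LEAGUE_RG, SS_PADJ)
            - padj_rar(75, bravo_outs, LEAGUE_RG, SS_PADJ))
print(round(subtraction_gap, 2), round(padj_gap, 2))
```

The gap between the two approaches comes entirely from Bravo's 30 extra outs being charged against a 4.5 RG baseline in one case and a 2.825 RG baseline in the other.<br /><br />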
Of course, attempting to adjust for team PA differences opens a huge can of worms which is not really relevant; for now, the point is that using outs for individual players causes distortions, sometimes trivial and sometimes bothersome, but almost always makes one's life easier.<br /><br />I do not include fielding (or baserunning outside of steals, although that is a trivial consideration in comparison) in the RAR figures--they cover offense and positional value only. This in no way means that I do not believe that fielding is an important consideration in player evaluation. However, two of the key principles of these stat reports are 1) not incorporating any data that is not readily available and 2) not simply including other people's results (of course I borrow heavily from other people's methods, but only adapting methodology that I can apply myself).<br /><br />Any fielding metric worth its salt will fail to meet either criterion--they use zone data or play-by-play data which I do not have easy access to. I do not have a fielding metric that I have stapled together myself, and so I would have to simply lift other analysts' figures.<br /><br />Setting the practical reason for not including fielding aside, I do have some reservations about lumping fielding and hitting value together in one number because of the obvious differences in reliability between offensive and fielding metrics. In theory, they absolutely should be put together. But in practice, I believe it would be better to regress the fielding metric to a point at which it would be roughly equivalent in reliability to the offensive metric.<br /><br />Offensive metrics have error bars associated with them, too, of course, and in evaluating a single season's value, I don't care about the vagaries that we often lump together as "luck". 
Still, there are errors in our assessment of linear weight values and players that collect an unusual proportion of infield hits or hits to the left side, errors in estimation of park factor, and any number of other factors that make their events more or less valuable than an average event of that type.<br /><br />Fielding metrics offer up all of that and more, as we cannot be nearly as certain of true successes and failures as we are when analyzing offense. Recent investigations, particularly by Colin Wyers, have raised even more questions about the level of uncertainty. So, even if I was including a fielding value, my approach would be to assume that the offensive value was 100% reliable (which it isn't), and regress the fielding metric relative to that (so if the offensive metric was actually 70% reliable, and the fielding metric 40% reliable, I'd treat the fielding metric as .4/.7 = 57% reliable when tacking it on, to illustrate with a simplified and completely made up example presuming that one could have a precise estimate of nebulous "reliability").<br /><br />Given the inherent assumption of the offensive PADJ that all positions are equally valuable, once RAR has been figured for a player, fielding value can be accounted for by adding on his runs above average relative to a player at his own position. If there is a shortstop that is -2 runs defensively versus an average shortstop, he is without a doubt a plus defensive player, and a more valuable defensive player than a first baseman who was +1 run better than an average first baseman. Regardless, since it was implicitly assumed that they are both average defensively for their position when RAR was calculated, the shortstop will see his value docked two runs. This DOES NOT MEAN that the shortstop has been penalized for his defense. 
The whole process of accounting for positional differences, going from hitting RAR to positional RAR, has benefited him.<br /><br />I've found that there is often confusion about the treatment of first basemen and designated hitters in my PADJ methodology, since I consider DHs as in the same pool as first basemen. The fact of the matter is that first basemen outhit DHs. There are any number of potential explanations for this; DHs are often old or injured, players hit worse when DHing than they do when playing the field, etc. This actually helps first basemen, since the DHs drag the average production of the pool down, thus resulting in a lower replacement level than I would get if I considered first basemen alone.<br /><br />However, this method does assume that a 1B and a DH have equal defensive value. Obviously, a DH has no defensive value. What I advocate to correct this is to treat a DH as a bad defensive first baseman, and thus knock another five or so runs off of his RAR for a full-time player. I do not incorporate this into the published numbers, but you should keep it in mind. However, there is no need to adjust the figures for first basemen upwards--the only necessary adjustment is to take the DHs down a notch.<br /><br />Finally, I consider each player at his primary defensive position (defined as where he appears in the most games), and do not weight the PADJ by playing time. This does shortchange a player like Ben Zobrist (who saw significant time at a tougher position than his primary position), and unduly boost a player like Buster Posey (who logged a lot of games at a much easier position than his primary position). For most players, though, it doesn't matter much. 
I find it preferable to make manual adjustments for the unusual cases rather than add another layer of complexity to the whole endeavor.<br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vR_acLogCW22NoOw7_rfiCXn85Ft6oDHP5ZXC7sr9tklxAwNhWeIbr84IH9f9Mk3lHqFRCjBRVlQM8Y/pub?output=HTML">2019 League</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vTrHwr8h9xH7oiyrGBLuoFuqfkRjMopi_0LYN_y8k6nWwyWMds73hLsgwQ6vMXcqHd4GQUix3Tmqjp-/pub?output=html">2019 PF</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vTfJPLaAzwXiO6AfXEcVN0YgmdM2-eiiwh4iNr4GVufqa09lF5tiNh1AR4aEO8g4rGGOQNjoaz81Ect/pub?output=html">2019 Teams</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vQILe7gamRtgQpEcuVtbgnzyj2rlzM40Ne9fLDxfK5nTBXYYzepoHqwNfg3SYrOy6W1VkYAM4fZXVes/pub?output=html">2019 Team Defense</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vRL6QYSzpgFmrBojCCtOy6NqF6jyrRHzFIPKZtJpP3H6MfpCNdGs78ast1_VjAQ3Sv_OOGUFWGqj1ic/pub?output=html">2019 Team Offense</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vRnyG9VJ7woC_5a5-rNzsZ2XADRCBH1JhoCi4-WXKyyX1dPJEz8O4l0TQ-Lq6hM_v2YY5B0OkUh9jkj/pub?output=html">2019 AL Relievers</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vSmV5PxNU0T3RP_0_XVH5MeTOQlQvvsryjxNLmmdjRrXeoG4IyIXFNOJwPogxvgMnhghT-oMoPeFUxh/pub?output=html">2019 NL Relievers</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vQdulwbcP7GxDHbbGxRcNRgcKwyNlNcHU4B6tVZ5HYeWgym6DmjJYreo-M78EaG9fU20J6KrQkCJUDW/pub?output=html">2019 AL Starters</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vRd3bNTV7Lq4w8QVv5dq3EnEGFWpKN4VIoB3vW-ECF_b_W4N5iR2X-mxlySY4bcepO50hVmwdtiWXwb/pub?output=html">2019 NL Starters</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vS9oiEWKUlISU-d43WyWHdiWfyD3taSGlyUTz6MA9MAuLIP576KIbAPQstoYaexCrfimnODFRwZ-Jah/pub?output=html">2019 AL Hitters</a><br /><br /><a 
href="https://docs.google.com/spreadsheets/d/e/2PACX-1vRGzl7_LApCqbiLCJ51sRTpzfZ2jRVbEjdf--OFeXN5_p1J41jBZka1ryRxbXJnE7-pSBGstZrd3GF8/pub?output=html">2019 NL Hitters</a>phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-18538015217922491452019-09-30T18:33:00.000-04:002019-09-30T18:33:44.772-04:00Crude Playoff Odds -- 2019These are very simple playoff odds, based on my crude rating system for teams using an equal mix of W%, EW% (based on R/RA), PW% (based on RC/RCA), and 69 games of .500. They account for home field advantage by assuming a .500 team wins 54.2% of home games (major league average 2006-2015). They assume that a team's inherent strength is constant from game-to-game. They do not generally account for any number of factors that you would actually want to account for if you were serious about this, including but not limited to injuries, the current construction of the team rather than the aggregate seasonal performance, pitching rotations, estimated true talent of the players, etc.<br /><br />The CTRs that are fed in are:<br /><br /><a href="https://4.bp.blogspot.com/-qmpxo6zrJ_A/XZKA7iYiHwI/AAAAAAAACsA/O8Tyw6wAKj0_B6UKuTK3LA0nubbZIH-fACLcBGAsYHQ/s1600/19odds1.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-qmpxo6zrJ_A/XZKA7iYiHwI/AAAAAAAACsA/O8Tyw6wAKj0_B6UKuTK3LA0nubbZIH-fACLcBGAsYHQ/s400/19odds1.JPG" width="368" height="400" data-original-width="173" data-original-height="188" /></a><br /><br />Wild card game odds (the least useful since the pitching matchups aren’t taken into account, and that matters most when there is just one game):<br /><br /><a href="https://4.bp.blogspot.com/-epVSBrupjD4/XZKBAW_Wk2I/AAAAAAAACsE/Qh-MGUqEML0HFE_PXTNl4yxHYMKZoAkqwCLcBGAsYHQ/s1600/19odds2.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" 
src="https://4.bp.blogspot.com/-epVSBrupjD4/XZKBAW_Wk2I/AAAAAAAACsE/Qh-MGUqEML0HFE_PXTNl4yxHYMKZoAkqwCLcBGAsYHQ/s400/19odds2.JPG" width="400" height="81" data-original-width="291" data-original-height="59" /></a><br /><br />LDS:<br /><br /><a href="https://3.bp.blogspot.com/-2_D1KQYFBu8/XZKBLiktZpI/AAAAAAAACsQ/7YnYkdfmpTAr3R7Tj3S8vT35VbI2F0laQCLcBGAsYHQ/s1600/19odds3.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-2_D1KQYFBu8/XZKBLiktZpI/AAAAAAAACsQ/7YnYkdfmpTAr3R7Tj3S8vT35VbI2F0laQCLcBGAsYHQ/s400/19odds3.JPG" width="400" height="108" data-original-width="501" data-original-height="135" /></a><br /><br />LCS:<br /><br /><a href="https://1.bp.blogspot.com/-QCM7IBnM3X0/XZKBF0jitgI/AAAAAAAACsI/uq9KHrSn_7Ez2UGEE5cee8FizcNRgx_FQCLcBGAsYHQ/s1600/19odds4.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-QCM7IBnM3X0/XZKBF0jitgI/AAAAAAAACsI/uq9KHrSn_7Ez2UGEE5cee8FizcNRgx_FQCLcBGAsYHQ/s400/19odds4.JPG" width="400" height="197" data-original-width="501" data-original-height="247" /></a><br /><br />WS:<br /><br /><a href="https://4.bp.blogspot.com/-RdDxmKHj4-A/XZKBQjuQ0WI/AAAAAAAACsU/iu5tXqUtuBIhy9f0hOvPco_yttiIoAN7QCLcBGAsYHQ/s1600/19odds5.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-RdDxmKHj4-A/XZKBQjuQ0WI/AAAAAAAACsU/iu5tXqUtuBIhy9f0hOvPco_yttiIoAN7QCLcBGAsYHQ/s400/19odds5.JPG" width="400" height="394" data-original-width="502" data-original-height="495" /></a><br /><br />It was easier to run this when World Series home field advantage was determined by league rather than team record. The record approach is not as arbitrary as alternating years or as silly as using the All-Star game result, but it does produce its own share of undesirable outcomes. 
Houston would have home field over Los Angeles, but given that the NL was finally stronger than the AL this year, the Astros' one game edge suggests an inferior record to that of the Dodgers, not a superior one. Even worse are the tiebreakers - after head-to-head, the edge goes to the team with the better intradivisional record, which favors teams from weak divisions, who likely performed less well than their raw win-loss record would suggest. The same is true of intraleague record, which is the next tiebreaker. If some division/league breakout is the criterion of choice, it should be inter-, not intra-.<br /><br />Putting it all together:<br /><br /><a href="https://2.bp.blogspot.com/-3LkQKQ5-UUE/XZKCnIAwUFI/AAAAAAAACsk/wopO19Fxo20a9WNnVdWI6-HnoyOdxfQsQCLcBGAsYHQ/s1600/19odds6.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-3LkQKQ5-UUE/XZKCnIAwUFI/AAAAAAAACsk/wopO19Fxo20a9WNnVdWI6-HnoyOdxfQsQCLcBGAsYHQ/s400/19odds6.JPG" width="400" height="248" data-original-width="337" data-original-height="209" /></a><br /><br />phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-32236641781504133352019-09-25T08:43:00.000-04:002019-09-30T18:44:30.673-04:00Enby Distribution, pt. 11--Game Expected W%This is (finally!) the last post in this series, at least for now.<br /><br />In the Mets essay for the 1986 <u>Baseball Abstract</u>, Bill James focused on data he was sent by a man named Jeffrey Eby on the frequency of teams scoring and allowing X runs in a game, and their winning percentage when doing so. After some discussion of this data, and a comparison of the Mets and Dodgers offenses (the latter was much more efficient at clustering its runs scored in games to produce wins), he wrote:<br /><br />“One way to formalize this approach would be to add up the ‘win expectations’ for each game. 
That is, since teams which score one run will win 14.0% of the time, then for any game in which a team scores exactly one run, we can consider them to have an ‘offensive winning percentage’ for that game of .140. For any game in which the team scores five runs, they have an offensive winning percentage of .695. Their offensive winning percentage for the season is the average of their offensive wining [sic] percentages for all the games.”<br /><br />It struck James at the time, and me reading it many years later, as a very good way to boil the data we have about team runs scored by game down into a single number that gets to the heart of the matter – how efficient was a team at clustering their runs to maximize their expected wins? James (in the essay) and I (for the last eight seasons or so on this blog) used the empirical data on the average winning percentage of teams when scoring or allowing X runs to calculate the winning percentage he described. I have called these gOW% and gDW%, for “game” offensive and defensive W%. However, there are a number of drawbacks to using empirical data.<br /><br />To repeat myself from my 2016 review of the data, these include:<br /><br />1. The empirical distribution is subject to sample size fluctuations. In 2016, all 58 times that a team scored twelve runs in a game, they won; meanwhile, teams that scored thirteen runs were 46-1. Does that mean that scoring 12 runs is preferable to scoring 13 runs? Of course not--it's a quirk in the data. Additionally, the marginal values (i.e. the change in winning percentage from scoring X runs to X+1 runs) don’t necessarily make sense even in cases where W% increases from one runs scored level to another.<br /><br />2. Using the empirical distribution forces one to use integer values for runs scored per game. 
Obviously the number of runs a team scores in a game is restricted to integer values, but not allowing theoretical fractional runs makes it very difficult to apply any sort of park adjustment to the team frequency of runs scored.<br /><br />3. Related to #2 (really its root cause, although the park issue is important enough from the standpoint of using the results to evaluate teams that I wanted to single it out), when using the empirical data there is always a tradeoff that must be made between increasing the sample size and losing context. One could use multiple years of data to generate a smoother curve of marginal win probabilities, but in doing so one would lose centering at the season’s actual run scoring rate. On the other hand, one could split the data into AL and NL and more closely match context, but you would lose sample size and introduce more quirks into the data.<br /><br />Given these constraints, I have always promised to use Enby to develop estimated rather than empirical probabilities of winning a game when scoring X runs, given some fixed average runs allowed per game (or the complement from the defensive perspective). Suppose that the major league average is 4.5 runs/game. Given this, we can use Enby to estimate the probability of scoring X runs in a game (since the goal here is to estimate W%, I am using Enby with a Tango Distribution c parameter = .852, which is used for head-to-head matchups):<br /><br /><a href="https://1.bp.blogspot.com/-l193ny9CSQI/XYrjE1A-FyI/AAAAAAAACrQ/Xf9OGMS9XLogi6jy73ChgDDfU0xMb3uZwCLcBGAsYHQ/s1600/gow1.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-l193ny9CSQI/XYrjE1A-FyI/AAAAAAAACrQ/Xf9OGMS9XLogi6jy73ChgDDfU0xMb3uZwCLcBGAsYHQ/s400/gow1.JPG" width="139" height="400" data-original-width="131" data-original-height="376" /></a><br /><br />From here, the logic to estimate the probability of winning is fairly straightforward. 
If you score zero runs, you always lose. If you score one run, you win if you allow zero runs. If you allow one run, then the game goes to extra innings, in which case we’ll assume you have a 50% chance to win. (I’m assuming that Enby represents per nine inning run distributions, just as we did for the Cigol estimates; since the major league average innings/game is pretty close to nine, this is a reasonable if slightly imprecise assumption. Unlike in Cigol, we’re not building in any assumptions about team quality, which would necessitate an estimate of winning in extra innings that reflects expected runs and expected runs allowed.) So a team that scores 1 run should win 5.39% + 10.11%/2 = 10.44% of those games.<br /><br />If you score two runs, you win all of the games where you allow zero or one, and half of the games where you allow 2, so 5.39% + 10.11% + 13.53%/2 = 22.26%. This can be very easily generalized:<br /><br />P(win given scoring X runs) = sum (from n = 0 to n = X - 1) of P(n) + P(X)/2<br /><br />Where P(n) = probability of allowing n runs<br /><br />Thus we get this chart:<br /><br /><a href="https://4.bp.blogspot.com/-k0SNZ2pDkw4/XYrjp_3HEdI/AAAAAAAACrY/mli98Gy_cT0owXelYRkVvuOADHAdw176gCLcBGAsYHQ/s1600/gow2.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-k0SNZ2pDkw4/XYrjp_3HEdI/AAAAAAAACrY/mli98Gy_cT0owXelYRkVvuOADHAdw176gCLcBGAsYHQ/s400/gow2.JPG" width="380" height="400" data-original-width="356" data-original-height="375" /></a><br /><br />It should be evident that the probability of winning when allowing X runs is the complement of the probability of winning when scoring X runs, although this could also be calculated directly from the estimated run distribution.<br /><br />Now, instead of using the empirical data for any given league/season to calculate gOW%, we can use Enby to generate the expected W%s, eliminating the sample size concerns and enabling us to customize the run environment 
under consideration. I did just that for the 2016 majors, where the average was 4.479 R/G (Enby distribution parameters are r = 4.082, B = 1.1052, z = .0545):<br /><br /><a href="https://2.bp.blogspot.com/-_rxB8XZ8QQ0/XYrj6l5ZvJI/AAAAAAAACrg/1KDvlUdSXoIhv5QiVXRvpM8-_T5QBwLDQCLcBGAsYHQ/s1600/gow3.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-_rxB8XZ8QQ0/XYrj6l5ZvJI/AAAAAAAACrg/1KDvlUdSXoIhv5QiVXRvpM8-_T5QBwLDQCLcBGAsYHQ/s400/gow3.JPG" width="200" height="400" data-original-width="196" data-original-height="392" /></a><br /><br />The first two columns compare the actual 2016 run distribution to Enby. The next set compares the empirical probability of winning when scoring X runs (I modified it to use a uniform value for games in which 12+ runs were scored, for the purpose of calculating gOW% and gDW%) to the Enby estimated probability. The Enby probabilities are generally consistent with the observed probabilities for 2016, but as expected there are some differences. Note too that Enby assumes independence of runs scored and runs allowed in a single game, an assumption that, given environmental conditions alone, can be most positively described as “simplifying”.<br /><br />The resulting gOW% and gDW% from using the Enby estimated probabilities:<br /><br /><a href="https://1.bp.blogspot.com/-VQxjpG1ELPQ/XYrklgpWcfI/AAAAAAAACro/v30lbAHmbMcqrkZKuIMCo138Rc1DDx7TACLcBGAsYHQ/s1600/gow4.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-VQxjpG1ELPQ/XYrklgpWcfI/AAAAAAAACro/v30lbAHmbMcqrkZKuIMCo138Rc1DDx7TACLcBGAsYHQ/s400/gow4.JPG" width="333" height="400" data-original-width="327" data-original-height="393" /></a><br /><br />There is not a huge difference between these and the empirical figures. One thing that is lost by switching to theoretical values is that the league does not necessarily balance to .500. 
In 2016 the average gOW% was .497 and the average gDW% was .502.<br /><br />However, the real value of this approach is that we are no longer forced to pretend that runs are equally valuable in every context. Note that Colorado had the second-highest gOW% and third-lowest gDW% in the majors. Anyone reading this blog knows that this is mostly a park illusion. If you look at park-adjusted R/G and RA/G, Colorado ranked seventeenth- and nineteenth-best respectively, with 4.42 and 4.50 (again the league average R/G was 4.48), so the Rockies were slightly below average offensively and defensively. While we certainly don’t expect our estimate of their offensive or defensive quality using aggregate season runs to precisely match our estimate when considering their run distributions on a game basis (if they did, this whole exercise would be a complete waste of time), it would be quite something if a single team managed to be wildly efficient on offense and wildly inefficient on defense. <br /><br />When we consider that Colorado’s park factor was 1.18, in order to compute gOW%/gDW% in the run environment in which they played, we need to take the league average of 4.479 R/G x 1.18 = 5.29. (We could of course use the NL average R/G here as well; I’m intending this post as an example of how to do the calculations, not a full implementation of the metrics. For the same reason, I will round that park adjusted average up a tick to 5.3 R/G, since I already have the Enby distribution parameters handy at increments of .05 R/G.) With c = .852, we have an Enby distribution with r = 5.673, B = .9363, z = .0257. 
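<br /><br />For readers who want to see how three parameters like these could turn into a run distribution, here is a rough Python sketch. It rests on an assumption this post does not restate: that Enby is a negative binomial in r and B whose zero term is replaced by z, with the remaining terms rescaled to sum to 1 - z. The function name, the truncation at 60 runs, and the parameterization are illustrative choices, so treat the whole block as a hedged reconstruction rather than the definitive formula from earlier in the series.

```python
def enby_pmf(r, B, z, max_runs=60):
    """Assumed Enby form: zero-modified negative binomial, truncated at max_runs.

    Unmodified negative binomial terms:
        q(k) = C(r + k - 1, k) * (B / (1 + B))**k * (1 + B)**(-r)
    """
    q = [(1 + B) ** -r]
    for k in range(1, max_runs + 1):
        # Ratio of consecutive terms: q(k)/q(k-1) = (r + k - 1)/k * B/(1 + B)
        q.append(q[-1] * (r + k - 1) / k * B / (1 + B))
    # Force P(0) = z and rescale the non-zero terms to share 1 - z.
    scale = (1 - z) / (1 - q[0])
    return [z] + [scale * qk for qk in q[1:]]


def mean_runs(pmf):
    return sum(k * p for k, p in enumerate(pmf))


coors = enby_pmf(r=5.673, B=0.9363, z=0.0257)   # parameters quoted above
majors = enby_pmf(r=4.082, B=1.1052, z=0.0545)  # 2016 majors parameters
print(round(mean_runs(coors), 2), round(mean_runs(majors), 2))
```

Under that assumed form, the means implied by the two parameter sets land within about .01 of the 5.3 and 4.479 R/G targets, which is at least consistent with the construction.<br /><br />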
The resulting Enby estimates of scoring frequency and W% scoring/allowing X runs are:<br /><br /><a href="https://2.bp.blogspot.com/-HaGoxvLWUcI/XYrk0BtbViI/AAAAAAAACrs/Apz1ZwLJpzwyi09mHLr0YEV4rHaGKzzsACLcBGAsYHQ/s1600/gow5.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-HaGoxvLWUcI/XYrk0BtbViI/AAAAAAAACrs/Apz1ZwLJpzwyi09mHLr0YEV4rHaGKzzsACLcBGAsYHQ/s400/gow5.JPG" width="379" height="400" data-original-width="356" data-original-height="376" /></a><br /><br />Using these estimated W%s, the Rockies’ gOW% drops from .560 to .485 and their gDW% increases from .437 to .508. As suggested by their park-adjusted R/G figures, Colorado’s offense and defense were both about average; their defense fares a little better when looking at the game distribution than when using aggregate totals, and the opposite for the offense.<br /><br />Some readers are doubtless thinking that by aggregating at the season level, we’ve lost some key detail. We could have looked at Colorado home and road games separately, each with a distinct set of Enby parameters and corresponding probabilities of winning when scoring X runs, rather than lumping it all together and applying a park factor that considers that half of the games are on the road. This of course is true; you can slice and dice however you’d like. I find the team seasonal level to be a reasonable compromise.<br /><br />This is beyond the scope of this series, so I will mention it briefly and move on. I have previously combined gOW% and gDW% into a single W% estimate by converting each into an equivalent run ratio using Pythagenpat math, then using Pythagenpat to convert those ratios into a W% estimate. This makes theoretical sense, although it loses sight of using the actual runs scored and allowed distributions of a team in a season and rearranging them (“bootstrapping” if you must). 
It occurred to me in writing this post that I could just use the same logic I use to convert Enby probabilities of scoring X runs into an estimated W% for the team. For example, we could use the Rockies’ runs scored distribution to estimate how often they would win when allowing x runs and use this in conjunction with their runs allowed distribution to estimate a W% given their runs allowed distribution. Then we could do the same in reverse, using their runs allowed distribution to estimate how often they would win when scoring x runs and combining that with their runs scored distribution to estimate a W% given their runs scored distribution. Averaging these two estimates would, in essence, put together every possible combination of their actual runs distribution from the season and calculate the average expected wins. For a simple example that avoids “ties”, if a team played two games, winning one 3-1 and the other 7-5, we would make every possible combination (3-1, 3-5, 7-1, 7-5) and estimate a .750 gEW%, compared to a 1.000 W% and a .720 Pythagenpat W%.<br /><br />Here’s an example for the 2016 Rockies:<br /><br /><a href="https://2.bp.blogspot.com/-cVpvlHjc3DE/XYrlc4T23MI/AAAAAAAACr0/wIM6k070OZsvt4yLhCQCveuaB13NaCu-ACLcBGAsYHQ/s1600/gow6.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-cVpvlHjc3DE/XYrlc4T23MI/AAAAAAAACr0/wIM6k070OZsvt4yLhCQCveuaB13NaCu-ACLcBGAsYHQ/s400/gow6.JPG" width="400" height="299" data-original-width="456" data-original-height="341" /></a><br /><br />The first two columns tell us that the Rockies scored two runs in 16 games and allowed two in 15 games. After converting these to frequencies, we can easily calculate the probability of winning given that the team scores X runs in the same manner as we did above with Enby probabilities. For example, when the Rockies score two runs, they will win if they allowed zero (5.56%) or one (8.64%), and half of games in which they allow two (9.26%), for a win probability of 5.56% + 8.64% + 9.26%/2 = 18.8%. 
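<br /><br />The same bookkeeping can be sketched in a few lines of Python. This is a minimal illustration with made-up function names, taking a team's game-by-game runs scored and runs allowed as plain lists and counting ties as half a win, as in the two-run example.

```python
from collections import Counter

def frequencies(run_totals):
    """Share of games in which each run total occurred."""
    counts = Counter(run_totals)
    n = len(run_totals)
    return {runs: c / n for runs, c in counts.items()}

def win_prob_scoring(x, allowed_freq):
    """P(win | score x): games allowing fewer than x, plus half the games allowing x."""
    fewer = sum(p for ra, p in allowed_freq.items() if ra < x)
    return fewer + allowed_freq.get(x, 0.0) / 2

def win_prob_allowing(y, scored_freq):
    """P(win | allow y): games scoring more than y, plus half the games scoring y."""
    more = sum(p for rs, p in scored_freq.items() if rs > y)
    return more + scored_freq.get(y, 0.0) / 2

def g_ew_pct(scored, allowed):
    """Average of the W% given the scored distribution and given the allowed one."""
    s, a = frequencies(scored), frequencies(allowed)
    w_given_scored = sum(p * win_prob_scoring(x, a) for x, p in s.items())
    w_given_allowed = sum(p * win_prob_allowing(y, s) for y, p in a.items())
    return (w_given_scored + w_given_allowed) / 2

# The two-game team from the text: 3-1 and 7-5 wins.
two_game_team = g_ew_pct([3, 7], [1, 5])
```

In this uncapped sketch the two component estimates are algebraically identical, so averaging changes nothing; they can only differ if the distributions are bucketed or truncated somewhere, which the sketch does not do.<br /><br />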
Figured this way, Colorado’s gOW% is .494, their gDW% is .496, and thus their gEW% is .495. Please note that I’m not suggesting that using the team’s actual frequencies of scoring/allowing X runs is preferable to using league averages or Enby. Furthermore, the gOW% and gDW% components are not useful, since they make the estimate of the quality of the offense or defense dependent on its counterpart. <br /><br /><br />A Most Pyrrhic Victory (2019-08-21)<br /><br />It’s never fun to be in a position where you feel like your team’s short-term success will hamper its long-term prospects. For one, it is inherently an arrogant thought - holding that you can perceive something that often the professionals that run the team cannot (although one of the most common occurrences of this phenomenon in sports, rooting against current wins with an eye to draft position, doesn’t fit). It feels like a betrayal of the loyalty you supposedly hold as a fan, specifically with the players that you like who are caught in the crossfire. Most significantly, it’s just not fun - sports are fun when your team wins, not when they lose, even if you rationalize those losses as just one piece of a grander design.<br /><br />It is even harder when the team in question represents your alma mater, an institution to which you feel an immense loyalty and pride, one far deeper than anything you feel towards any kind of social or religious institution, professional organization, or (of course) a government. Such is the predicament that I find myself in when following the fortunes of Ohio State baseball. It is a position I have never been in before as a fan of OSU sports - I have rarely been part of the rabble calling for a regime change in any sport, and in the one case I can recall in which I was, it wasn’t with any kind of glee or malice. 
I believed that the coach in question wanted to win, was trying their best, was a worthy representative of the university, might even succeed in turning it around if given the opportunity - but that it was probably time to reluctantly pull the plug.<br /><br />None of this holds when considering the position of Greg Beals. Beals’ tenure at OSU now stretches, incredibly, over nine seasons, nine seasons that are much worse than any nine season stretch that preceded it in the last thirty years of OSU baseball. A stretch of nine seasons in which a Big Ten regular season title has rarely been more than a pipe dream. I don’t feel like recounting the depressing details in this space - the season preview posts for the next four seasons will provide ample opportunity. That’s right - Beals now holds a three-year extension that takes him through 2023.<br /><br />How has he managed to pull this off? Apparently with another well-timed run in the Big Ten Tournament, winning the event for the second time and thus earning an automatic bid to the NCAA tournament. It’s not as if the Buckeyes were on the bubble before the tournament - well actually, they were. They were squarely on the bubble for the <i>Big Ten</i> tournament. OSU’s overall season record ended up at 36-27, but if you look deeper it was worse than that. At 12-12 in the Big Ten, they finished in a three-way tie for sixth place, needing help on the final day to qualify for the eight team field. Then they turned around and won it.<br /><br />In the NCAA tournament, the Buckeyes were thumped by Vanderbilt, eked out a thirteen-inning victory over McNeese to stay alive, then fell to Indiana State. To add insult to injury, another Big Ten team, the one from the heart of darkness, also had an unlikely tournament run. 
Except that outfit, channeling the spirit of departed basketball coach/practitioner of the dark arts John Beilein, made their run in the NCAA tournament, all the way to a 1-0 lead in the final series before the aforementioned Commodores restored order to the universe. <br /><br />The Buckeyes were actually outscored by one run on the season, averaging 5.56 runs scored and 5.57 runs allowed per game. Compared to the average Big Ten team, the Bucks were +10 runs offensively and -15 runs defensively. However, this obscures some promising developments on the pitching side. The weekend rotation of Seth Lonsway (9 RAA, 12.3 K/5.8 W), Garrett Burhenn (10, 6.8/3.1), and Griffan Smith (3, 8.9/3.8) was surprisingly effective given its youth (sophomore, sophomore, freshman respectively). Relief ace Andrew Magno was absolutely brilliant (22, 10.4/5.0) with some heroic and perhaps ill-advised extended appearances in the tournaments; he was popped in the fifteenth round by Detroit. Outside of them, there was a group of relievers clustered between 2 and -3 RAA (Joe Gahm, Thomas Waning, Will Pfenig, and TJ Root) and a few rough lines - midweek starter Jake Vance had a 7.90 RA in 41 innings for -11 RAA, and three relievers (Mitch Milheim, TJ Brock, and usual position player Brady Cherry) combined for 57 innings and a whopping 65 runs allowed for -31 RAA. Thankfully most of these were low-leverage innings. <br /><br />The pitching was also not done any favors by the defense, as Ohio State recorded a DER of just .641 compared to a conference average of .656. The good news is that the offense made up for it at the plate; the bad news is that the best hitters have exhausted or foregone their remaining eligibility. The biggest exception was sophomore Dillon Dingler, who returned to his natural position behind the plate after a freshman year spent in center and hit .291/.391/.424 for 9 RAA. 
Junior Connor Pohl was just an average hitter playing first base, but is a solid defender and was durable. Senior Kobie Foppe lost the second base job as he struggled mightily over his 118 PA (.153/.284/.194); junior utility man Matt Carpenter assumed the role but only hit .257/.300/.324 himself. Sophomore Noah West started the season at shortstop and was much improved offensively (.284/.318/.420), but his injury led to a reshuffling of the defensive alignment, with freshman Zack Dezenzo moving over from third (he hit a solid .256/.316/.440 with 10 longballs) and classmate Marcus Ernst assuming the hot corner (.257/.316/.300 over 76 PA) before yielding to yet another freshman, Nick Erwin (.235/.288/.272 over 147 PA). <br /><br />Senior Brady Cherry finally fulfilled his potential, mashing .314/.385/.563 for 23 RAA in left field. Little-used fifth-year senior Ridge Winand wound up as the regular center fielder, although his bat did not stand out (.243/.335/.347). In right field, junior Dominic Canzone had one of the finest seasons ever by a Buckeye hitter, parlaying a .345/.432/.620 (37 RAA) line into an eighth-round nod from Arizona. Sophomore backup catcher Brent Todys eventually assumed DH duties thanks to his power (.256/.345/.462); his .206 ISO trailed only Canzone and Cherry, who each blasted sixteen homers.<br /><br />So the Beals era rolls on, and at least another Big Ten tournament title has been added to the trophy case. When official SID releases after the season-ending NCAA tournament loss to Indiana State say “Buckeyes Championship Season Comes to an End”, you wonder whether there is some sarcasm even amongst people who are paid to provide favorable coverage. And then you realize no, it’s not even spin, they really believe it. 
Once #BealsBall takes root, it is nigh impossible to make it just go away.<br /><br /><br />Enby Distribution, pt. 10: Behavior Near 1 RPG (2019-05-29)<br /><br />Even for this series, this is an esoteric topic, but I wanted to specifically explore how Enby, Cigol, runs per win, Pythagorean exponent, etc. behaved around 1 RPG. 1 RPG is not a particularly interesting point from a real-world baseball perspective. Take 20 RPG. This is an outlandish level of scoring for teams, but one can easily imagine a theoretical scenario constructed from real players, and using the types of constructs that have sometimes been used by sabermetricians (for instance, a team of Babe Ruths with average pitching playing a team of Ty Cobbs with average pitching) in which 20 RPG would be the context. But 1 RPG? Maybe if you have a team of Rey Ordonezes facing Pedro Martinez 1999, but Pedro Martinez 1999 is backed by a team of Bill Bergens and they have to face Lefty Grove 1931?<br /><br />Still, 1 RPG is of interest in the world of win estimators, as it is the point that led to Pythagenpat (and thus my own intense interest in win estimators). As you know, 1 RPG is the minimum possible scoring level since a game doesn’t end until at least one run is scored. This insight, which to my knowledge was first proffered by David Smyth, led to my discovery of the Pythagenpat exponent (and I believe Smyth’s as well). So it will always hold a special interest to me, regardless of how impractical any application may be.<br /><br />In order to facilitate this, I expanded my list of Enby and Cigol parameters (the difference is that Enby uses c = .767 in the Tango Distribution and Cigol uses c = .852) to look at each .05 RPG interval from .05 - 1.95. 
First, using the Enby parameters, here is a graph of the estimated probability of scoring X runs for teams that average .5, 1, 1.5, and 2 R/G:<br /><br /><a href="https://1.bp.blogspot.com/-HXkaNGw3qAY/XO3PJSPsgsI/AAAAAAAACpw/jxT452PUB8Y-jGDWWEUFjjjzwubL0GsQgCLcBGAs/s1600/enby10-1.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-HXkaNGw3qAY/XO3PJSPsgsI/AAAAAAAACpw/jxT452PUB8Y-jGDWWEUFjjjzwubL0GsQgCLcBGAs/s400/enby10-1.JPG" width="400" height="287" data-original-width="995" data-original-height="713" /></a><br /><br />I deliberately cut off the .5 R/G team’s probability of being shut out, which is 68.7%, in order to increase the space available for other points by about 40%. One thing that should stand out if you’ve looked at any of the other graphs of this type I’ve posted is that the distinctive shape (which for the lack of a more precise term I’ll call left tail truncated, extremely elongated right tail bell curve) is not present. For all of these teams except the 2 R/G, the probability of scoring x+1 runs is always lower than the probability of scoring x runs. The 2 R/G team is actually the first at .05 intervals that achieves this modest success; teams that average 1.95 R/G are expected to be shut out in 25.1% of games and score one run in 25.0%. At 2, it is 24.3% and 24.7% respectively.<br /><br />My real interest with these teams is how RPW and Pythagenpat exponent might behave at such low levels of scoring. In order to test this, I generated a Cigol W% for each possible matchup between teams averaging .05 - 2 R/G at intervals of .05. I included inverse matchups (e.g. 1.25 R/G and 2 RA/G as well as 2 R/G and 1.25 RA/G), but eliminated cases where R = RA (obviously W% is .500 at these points). 
I also eliminated cases in which R + RA < 1, since these are impossible: <a href="https://3.bp.blogspot.com/-Rf9YPahtFEU/XO3P3H4E-YI/AAAAAAAACp4/be2QrwgkZlMEoESdQwZz1Is27cRtDQC0QCLcBGAs/s1600/enby10-2.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-Rf9YPahtFEU/XO3P3H4E-YI/AAAAAAAACp4/be2QrwgkZlMEoESdQwZz1Is27cRtDQC0QCLcBGAs/s400/enby10-2.JPG" width="400" height="287" data-original-width="989" data-original-height="709" /></a><br /><br />The relationship between RPG and RPW, even in this extremely low scoring context, is generally as we’d expect. The power regression line is a decent fit and takes a very satisfying form, as Pythagenpat RPW <a href="https://walksaber.blogspot.com/2009/01/runs-per-win-from-pythagenpat.html">can be shown</a> to be equal to 2*RPG^(1 - z). The implied z value here is lower than the .27 - .29 used for more normal environments, but close enough to suggest that Pythagenpat, which is correct by definition at 1 RPG, remains a useful tool at slightly higher RPGs.<br /><br />To test that more directly, we can look at the required Pythagorean exponents for these teams plotted against RPG as well:<br /><br /><a href="https://4.bp.blogspot.com/-Jtc0MjZOH6g/XO3Qee3C2ZI/AAAAAAAACqA/4_tvawj0bio1VJdGc2Zo4TFsBrxE0rvDwCLcBGAs/s1600/enby10-3.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-Jtc0MjZOH6g/XO3Qee3C2ZI/AAAAAAAACqA/4_tvawj0bio1VJdGc2Zo4TFsBrxE0rvDwCLcBGAs/s400/enby10-3.JPG" width="400" height="287" data-original-width="989" data-original-height="709" /></a><br /><br />This graph is less encouraging. At first glance the most disturbing thing is that the power regression doesn’t do a great job of fitting the data, as it produces Pythagorean exponents too low for the higher scoring contexts. 
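<br /><br />For reference, the quantities being plotted can be sketched in Python as follows. This is a minimal illustration: the function names are invented here, z = .28 is just an assumed default from the .27 - .29 range, and the "required" exponent is simply the x that makes R^x/(R^x + RA^x) reproduce a given W%.

```python
import math

def pythagenpat_exponent(rpg, z=0.28):
    """Pythagenpat exponent x = RPG^z (z = .28 is an assumed default)."""
    return rpg ** z

def pythag_w_pct(r, ra, x):
    """Pythagorean W% for runs scored r, runs allowed ra, exponent x."""
    return r ** x / (r ** x + ra ** x)

def required_exponent(w_pct, r, ra):
    """Exponent that makes the Pythagorean formula return w_pct (needs r != ra)."""
    return math.log(w_pct / (1 - w_pct)) / math.log(r / ra)

def pythagenpat_rpw(rpg, z=0.28):
    """Runs per win implied by Pythagenpat, 2*RPG^(1 - z)."""
    return 2 * rpg ** (1 - z)

# At 1 RPG the exponent is 1 for any z, so W% is just R/(R + RA) = R/G.
x_at_one_rpg = pythagenpat_exponent(1.0)
w_pct_at_one_rpg = pythag_w_pct(0.6, 0.4, x_at_one_rpg)
```

Solving for the exponent this way is why matchups with R = RA have to be dropped: the log of the run ratio is zero and any exponent gives .500.<br /><br />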
The only way to achieve a RPG approaching 4 given how I defined this dataset is to have teams that are fairly evenly matched, while wide gaps in team quality can pop up at low RPG (for example, we could get 1 RPG from .05 R/.95 RA at one extreme of imbalance or .5 R/.5 RA at the other). This again suggests that the imbalance between the two teams has a material impact on the needed Pythagorean exponent, but one that I’ve as of yet been unable to successfully capture in a satisfactory equation.<br /><br />The more alarming thing about these results is they show a fraying of the Cigol W% estimates from Smyth’s logical conclusion that underpins Pythagenpat--namely that a 1 RPG team will win the same number of games as runs they score. For the nine unique pairs of R/RA (not counting their inverses), the Cigol W% is off slightly, as you can see the needed Pythagorean exponents at 1 RPG are not equal to 1:<br /><br /><a href="https://1.bp.blogspot.com/-xRLwBhp2UmQ/XO3QsIGda_I/AAAAAAAACqE/IUqJvB2K-CMqADuyqAEj21eeFbIpoiVJACLcBGAs/s1600/Enby10-4.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-xRLwBhp2UmQ/XO3QsIGda_I/AAAAAAAACqE/IUqJvB2K-CMqADuyqAEj21eeFbIpoiVJACLcBGAs/s400/Enby10-4.JPG" width="400" height="175" data-original-width="507" data-original-height="222" /></a><br /><br />True W% is equal to R/G, and the error/162 is (Cigol W% - True W%)*162. The errors are not horrible, all well within one standard deviation of the typical Pythagenpat error for normal major league teams, but they still call into question the theoretical validity of the Cigol estimates in extremely low scoring contexts.<br /><br />I redid the graph by replacing the Cigol estimates for these nine teams and their inverses with the True W%. 
This only corrects the W% for cases where we think for the moment that by definition Cigol is wrong; if that is so, Cigol is likely causing significant distortions at scoring levels just above 1 RPG as well, which are not corrected. I never expected Cigol to be a perfect model (or, to phrase it more precisely, I never expected any actual implementation of Cigol to be a perfect model; the mathematical underpinnings of Cigol, given the assumption of independence of runs scored and allowed, are true by definition), but I have written much of this series as if Cigol and the previously unnamed “True W%” were one and the same. This is not the case, but it is always a bit disappointing when you find a blemish in your model.<br /><br />With these corrections, we have this graph and regression equations:<br /><br /><a href="https://2.bp.blogspot.com/-z8cLKvKCZ0s/XO3Q8XGpHGI/AAAAAAAACqQ/B-HL6Tfp3c4SSjnHvLxxIOjwBf-x2iWeQCLcBGAs/s1600/Enby10-5.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-z8cLKvKCZ0s/XO3Q8XGpHGI/AAAAAAAACqQ/B-HL6Tfp3c4SSjnHvLxxIOjwBf-x2iWeQCLcBGAs/s400/Enby10-5.JPG" width="400" height="286" data-original-width="987" data-original-height="706" /></a><br /><br />This doesn’t do much to change the regression equations (changing eighteen observations out of 1,398 generally will not), but at least it looks better to have observations at (1, 1). I don’t have any correction to offer to Enby/Cigol itself to solve this problem; my inclination is to assume there are two problems at play:<br /><br />1) that the estimated probability of being shut out, the Enby parameter z, which I use the Tango Distribution to estimate, doesn’t hold up at these extremely low scoring levels. 
Maybe the Tango Distribution c parameter, which varies based on whether the question revolves around one team’s runs per inning scoring distribution or a matchup between two teams, inherently assumes covariance between R and RA that doesn’t hold when only one team scores in a game by definition (at 1 RPG, and many other games between teams for which RPG is slightly greater than 1 would end 1-0 as well). But that is just a guess, and one that might appear to a reader to throw the other method under the bus. I don’t mean it in that way at all, of course; the Tango Distribution was not developed to be an input into a runs/game distribution. <br /><br />2) Regardless of the z parameter, Cigol assumes that runs scored and runs allowed are independent between the two teams and from game to game. But when I say that a team that scores .6 R/G and allows .4 must have a .600 W%, I am referring to a team that has actually put up those figures over some period of time. This is still not the same as saying that the team is a true .6/.4 team. And so there is not necessarily a flaw in Cigol at all. Enby (using the c = .852 parameters) expects a true talent .6 R/G team to score more than one run in 13.9% of their games. So it would be extremely unlikely that any team, even at these ridiculously low scoring levels, could ever produce a 1 RPG over a period of several games or longer.<br /><br />But redefining the question in terms of true talent means that you could have a true talent .3 R/.4 RA team, for instance. I unceremoniously tossed these teams out of the dataset earlier, but they should have been included. 
So I will quickly look at Cigol’s estimate of the necessary Pythagorean exponent for these teams (these are teams scoring and allowing .05 - .9 runs per game at intervals of .05 with a total R+RA < 1): <a href="https://3.bp.blogspot.com/-mu7Sk5c88do/XO3RnVIAbbI/AAAAAAAACqY/OkF3ovCXWx83V9Wm7vf0upYQuZSjJnhxwCLcBGAs/s1600/Enby10-6.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-mu7Sk5c88do/XO3RnVIAbbI/AAAAAAAACqY/OkF3ovCXWx83V9Wm7vf0upYQuZSjJnhxwCLcBGAs/s400/Enby10-6.JPG" width="400" height="294" data-original-width="983" data-original-height="722" /></a><br /><br />This isn’t interesting except as confirmation that the lower bound for the exponent is 1, which means that Pythagenpat fails for these teams. Pythagenpat will allow these teams to have exponents below 1. For example, at .5 RPG the Pythagenpat exponent is around .5^.28 = .824.<br /><br />For the sake of the rest of this discussion, I will no longer hew to a strict requirement that the exponent be equal to 1 at any point (only that it never dip below 1). In its place, let me propose an alternate set of rules for an equation to estimate the Pythagorean exponent to be valid:<br /><br />1) the exponent must always increase with RPG if R = RA (or, since the equation need not be strictly limited to using RPG, it must at least strictly increase with RPG for a theoretically average team). I don’t know for sure that this is a theoretical imperative, but I want to preclude the use of a quadratic model that might appear to be a good fit but with a negative coefficient for the x^2 term which results in a negative derivative when x is large.<br /><br />2) the exponent must be close to 1 at 1 RPG. If we came up with a power regression that said the exponent = 1.02*RPG^.272, for instance, that would be fine. 
It’s close to 1.<br />Once I decided that I didn’t need to adhere to the constraint that x = 1 when RPG = 1, I tried a number of forms of x = RPG^z plus some other term that incorporated run differential. Here are a handful of the more promising ones:<br /><br />x = 1.03841*RPG^.265 + .00114*RD^2 (RMSE = 4.0084)<br />x = 1.04567*RPG^.2625 + .00113*RD^2 (RMSE = 4.0082)<br />x = 1.05299*RPG^.26 + .00113*RD^2 (RMSE = 4.0080)<br />x = 1.05887*RPG^.258 + .00113*RD^2 (RMSE = 4.0077)<br />x = 1.03059*RPG^.27 + .16066*(RD/RPG)^2 (RMSE = 4.0076)<br />x = 1.04561*RPG^.265 + .15274*(RD/RPG)^2 (RMSE = 4.0076)<br />x = 1.01578*RPG^.275 + .16862*(RD/RPG)^2 (RMSE = 4.0080)<br /><br />I must have run thirty regressions, looking for some formula that would beat 4.0067 (the minimum RMSE for an optimized Pythagenpat for 1961-2014 major league teams). Just to give you an idea of how silly I got, I tried this equation to estimate x (the Pythagorean exponent, eschewing the Pythagenpat construct):<br /><br />x = 10^(.30622 * log(RPG) + .0091*log(RD^2/RPG) - .01342) (RMSE = 4.011)<br /><br />Abandoning for a moment the attempt to get a lower RMSE with major league teams, how do those equations fare with the full Cigol dataset compared to Pythagenpat? In this case the RMSE is comparing the estimated W% from the formula in question to the Cigol estimate. Using z = .2867 (the value that optimizes RMSE for the 1961-2014 major league teams), the RMSE (per 162 games) is .46784. Using z = .2852 (the value that optimized RMSE for the full Cigol dataset), the RMSE is .46537. 
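<br /><br />The comparison being described can be sketched in Python as follows. This is an illustration under stated assumptions: R, RA and RD are taken to be per-game figures (RD = R - RA), the function names are invented here, and `teams` is a hypothetical list of (R/G, RA/G, reference W%) tuples, where the reference is actual W% for major league teams or the Cigol estimate for the theoretical dataset.

```python
import math

def pythagenpat_x(r, ra, z=0.2867):
    """Plain Pythagenpat exponent, RPG^z."""
    return (r + ra) ** z

def rd_adjusted_x(r, ra, a=1.03841, z=0.265, b=0.00114):
    """One of the alternative forms: a*RPG^z + b*RD^2 (RD assumed per game)."""
    rd = r - ra
    return a * (r + ra) ** z + b * rd ** 2

def est_w_pct(r, ra, exponent_fn):
    """Pythagorean W% using whichever exponent formula is passed in."""
    x = exponent_fn(r, ra)
    return r ** x / (r ** x + ra ** x)

def rmse_per_162(teams, exponent_fn):
    """RMSE of estimated vs. reference W%, scaled to wins per 162 games."""
    sq_errors = [(est_w_pct(r, ra, exponent_fn) - w) ** 2 for r, ra, w in teams]
    return 162 * math.sqrt(sum(sq_errors) / len(sq_errors))
```

Swapping `pythagenpat_x` for `rd_adjusted_x` (or another formula with the same signature) in `rmse_per_162` is all it takes to compare the candidates on the same set of teams.<br /><br />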
For each of the equations above:<br /><br />x = 1.03841*RPG^.265 + .00114*RD^2 (RMSE = .37791)<br />x = 1.04567*RPG^.2625 + .00113*RD^2 (RMSE = .40180)<br />x = 1.05299*RPG^.26 + .00113*RD^2 (RMSE = .42551)<br />x = 1.05887*RPG^.258 + .00113*RD^2 (RMSE = .44487)<br />x = 1.03059*RPG^.27 + .16066*(RD/RPG)^2 (RMSE = .56590)<br />x = 1.04561*RPG^.265 + .15274*(RD/RPG)^2 (RMSE = .60852)<br />x = 1.01578*RPG^.275 + .16862*(RD/RPG)^2 (RMSE = .52524)<br /><br />At least we can do better with the full Cigol dataset with a more esoteric construct than just using a fixed z value. But the practical impact is very small, and as we’ve seen these formulas add nothing to the accuracy of estimates for normal major league teams and sacrifice a bit of theoretical grounding. <br /><br /><br />2019 Predictions (2019-03-27)<br /><br />This is a blog intended for sabermetrically-inclined readers. I shouldn’t have to spell out a list of caveats about the for-entertainment-purposes-only content that follows, and I won’t.<br /><br />AL EAST<br /><br />1. New York<br />2. Boston (wildcard)<br />3. Tampa Bay<br />4. Toronto<br />5. Baltimore<br /><br />I usually don’t actually show the numbers that come out of my “system” such as it is - it is not as robust a system as PECOTA or Fangraphs’ or Clay Davenport’s predictions, simplifying where the others are more rigorous and fed by other people’s player projections, because why bother reinventing that wheel when others have already done it so well? But in the case of the 2019 AL I think the estimates for the top four teams are illustrative of my failure to commit to any of this:<br /><br />NYA 822/653, 100<br />HOU 814/653, 99<br />BOS 850/683, 99<br />CLE 783/634, 98<br /><br />That’s (R/RA, Wins) in case it wasn’t obvious. 
So I can make bland statements like “the Red Sox appear to have a little better offense but worse defense than the Yankees”, but beyond that there’s not much to say other than it should be another entertaining season. It does appear to me that the Yankees and Astros have more surplus arms sitting around than the other contenders, and that’s certainly not a bad thing and something that the crude projection approach I take ignores. I’d expect Tampa Bay to take a step back from 2018 with a subpar offense. The Blue Jays are interesting as a sleeper, especially if the prospects show up and play more to their potential than their 2019 baseline expectation. Baltimore has two things going for them - I have Miami as worse on paper, and at least they’re trying a new approach. Actually three, because Camden Yards is awesome.<br /><br />AL CENTRAL<br /><br />1. Cleveland<br />2. Minnesota <br />3. Detroit<br />4. Kansas City<br />5. Chicago<br /><br />The Indians are still the easy divisional favorite, to an extent that surprised me when I actually put the numbers to it. They are closer to the big three in the AL (in fact, right behind by my reckoning) than they are to the Twins. It’s easy to look at the negatives – a borderline embarrassing outfield, an unsettled bullpen with little attempt to add high upside depth, a clustering of the team’s excellence in starting pitching which is more prone to uncertainty. But it’s worth keeping in mind that Cleveland underplayed their peripherals last year (although less their PW% than their EW%) - they have some room to decline while still projecting to win 90 as they did last year. Names like Sano and Buxton both make the Twins offense look better than it actually figures to be while also giving it more upside than a typical team, but they look like a slightly above average offense and slightly below average defense. 
You can throw a blanket over the three teams at the bottom - the order I’ve picked them for 2019 is the reverse order of the optimism I’d hold for 2020 as a fan of those teams.<br /><br />AL WEST<br /><br />1. Houston<br />2. Los Angeles (wildcard)<br />3. Oakland<br />4. Seattle<br />5. Texas<br /><br />Houston is an outstanding team once again, a World Series contender with room for misfortune. The Angels are my tepid choice for second wildcard - the Rays are in a tough division, the Twins could feast on the Central underlings but look like about as .500 of a team on paper as you can get, while the A’s can expect some regression on both offense and the bullpen. The Angels have huge rotation question marks, but all of these teams are flawed. The Mariners and Rangers both strike me as teams that could easily outplay projections; alas, it would take a surfeit of that to get into the race.<br /><br />NL EAST<br /><br />1. Philadelphia<br />2. Washington (wildcard)<br />3. New York<br />4. Atlanta<br />5. Miami<br /><br />This should be interesting. It’s easy to overrate the Phillies given that they were in the race last year when they really shouldn’t have been as close. It would be easy to overrate the Braves, who arrived early. It would be easy to underrate the Nationals, losing their franchise icon while bringing in another ace and graduating another potential outfield star. It would be easy to underrate the Mets, who are generally a disaster but still have talent. The only thing that wouldn’t be easy to do is trick yourself into thinking the Marlins are going to win.<br /><br />NL CENTRAL<br /><br />1. Chicago<br />2. Milwaukee (wildcard)<br />3. St. Louis<br />4. Cincinnati<br />5. Pittsburgh<br /><br />I have this about dead even on paper, but I give a slight edge to the Cubs with a bounce back from Kris Bryant and a more settled (if aging) rotation. 
The Brewers are legit, and their rotation should benefit from some arms that were used as swingmen last year getting a shot at starting. But the bullpen will likely be worse and some offensive regression shouldn’t come as a surprise. The Cardinals and Reds are a bit further back on paper, but close enough that it wouldn’t be that surprising if they played themselves into the mix. As a semi-Reds fan I’m a little skeptical about the chances of the quick transitional rebuild actually paying off. The Pirates look easily like the best team that I’ve picked last; the start of 2018 is a good reminder that teams like this can find themselves in the race.<br /><br />NL WEST<br /><br />1. Los Angeles<br />2. Colorado<br />3. Arizona<br />4. San Diego<br />5. San Francisco<br /><br />The Dodgers’ run in the NL West is underappreciated due to their failure to win the World Series and people inclined to write it off because of their payroll. I like their divisional chances better in 2019 as only the Rockies are real challengers. I’d put Colorado in the second tier of NL contenders with Cincinnati, St. Louis, New York, and Atlanta. If you can figure out if Arizona is starting a rebuild or trying to do one of those on-the-fly retools, let me know. Maybe let Mike Hazen know too. The Padres are interesting in that the prospects that have shown up so far haven’t lived up to expectations yet, but there are more and LOOK MANNY MACHADO. The Giants with Machado or Harper would have been the opposite of the Padres, more or less, which is considerably less interesting.<br /><br />WORLD SERIES<br /><br />Houston over Los Angeles<br /><br />Or Houston or Boston. They’re basically interchangeable. 
<br /><br />AL MVP: CF Mike Trout, LAA<br />AL Cy Young: Trevor Bauer, CLE<br />AL Rookie of the Year: LF Eloy Jimenez, CHA<br />NL MVP: 3B Nolan Arenado, COL<br />NL Cy Young: Aaron Nola, PHI<br />NL Rookie of the Year: CF Victor Robles, WAS<br />phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-77697849969953912452019-02-09T16:34:00.000-05:002019-02-09T16:34:17.247-05:00Pitching Optional?What happens when you take a team that got into the NCAA tournament despite finishing in the middle of the pack in its conference and relying on a makeshift pitching staff and remove the few reliable pitchers while leaving much of the offense intact? Does this sound interesting to you, like an experiment cooked up in the lab of a mad sabermetrician (or more likely a resident of that state up north)? If so, you may be interested in the 2019 Buckeyes.<br /><br />In the ninth season of the seemingly never-ending Greg Beals regime, he once again has an entire unit with next to no returning experience. Sometimes this is unavoidable in college sports, but it happens to Beals with regularity as player development does not appear to be a strong suit of the program. Players typically either make an impact as true freshmen or are never heard from, while JUCO transfers are a roster staple to paper over the holes. The only difference with this year’s pitching situation is that holes are largely being plugged with freshmen rather than transfers.<br /><br />The three pitchers penciled in as the rotation have precious little experience, with two true freshmen and a junior with 24 appearances and 11 starts in his career. Lefty Seth Lonsway was a nineteenth-round pick of Cincinnati and will be joined by classmate Garrett Burhenn, with Jake Vance as the junior veteran. 
Vance was +3 RAA in 36 innings last year, which doesn’t sound like much until you consider the dearth of returning performers on the rest of the staff.<br /><br />Midweek starts and long relief could fall to sophomore lefty Griffan Smith, who was not effective as a freshman (-7 RAA in 32 innings). The other veteran relievers are junior Andrew Magno (sidelined much of last season with an injury, but Beals loves his lefty specialists so if healthy he will see the mound) and senior sidearmer Thomas Waning, who was promising as a sophomore but coughed up 18 runs in 16 frames in 2018. A trio of freshman righties said to throw 90+ MPH (Bayden Root, TJ Brock, Will Pfenning) will be joined by other freshmen in Cole Niekamp and lefty Mitch Milheim. Joe Gahm is a junior transfer from Auburn via Chattahoochee Valley Community College and given his experience and BA ranking as a top 30 Big Ten draft prospect should find a role. Senior Brady Cherry will also apparently get a chance to pitch this season, something he has yet to do in his Buckeye career.<br /><br />The Buckeye offense is more settled, and unless the pitchers exceed reasonable expectations will have to carry the team in 2019. Sophomore Dillon Dingler moves in from center field (that’s nothing, as recent OSU catcher Jalen Washington moved to shortstop) to handle the catching duties and was raved about by the coaches last season so big things are expected despite a .244/.325/.369 line. He’ll be backed up by sophomore transfer Brent Todys from Andrew College, with senior Andrew Fishel, junior Sam McClurg and freshman Mitchell Smith rounding out the roster.<br /><br />First base will belong to junior Conner Pohl after he switched corners midway through 2018; he also played the keystone as a freshman so he’s been all over the infield. While his production was underwhelming for first base, at 3 RAA he was a contributor last season and looks like a player who should add power as he matures. 
Senior Kobie Foppe got off to a slow start last year, flipped from shortstop to second base, and became an ideal leadoff man (.335/.432/.385); even with some BABIP regression he should be solid. Third base will go to true freshman Zach Dezenzo, while junior shortstop Noah West needs to add something besides walks to his offensive game (.223/.353/.292). The main infield backups are freshman Nick Erwin at short, sophomore Scottie Seymour and freshmen Aaron Hughes and Marcus Ernst at the corners, and junior Matt Carpenter everywhere just like his MLB namesake (albeit without the offensive ability).<br /><br />I’ll describe the outfield backwards from right to left, since junior right fielder Dominic Canzone is the team’s best offensive player (.323/.396/.447 which was a step back from his freshman campaign) and will be penciled in as the #3 hitter. The other two spots are not as settled as one would hope given the imperative of productive offense for this team. A pair of seniors will battle for center: Malik Jones did nothing at the plate as a JUCO transfer last year besides draw walks (.245/.383/.286 in 63 PA) while Ridge Winand has barely seen the field. In left, senior Nate Romans has served as a utility man previously, although he did contribute in 93 PA last year (.236/.360/.431). Senior Brady Cherry completes his bounce around the diamond which has included starting at third and second; in 2018 he hit just .226/.321/.365, a step back from 2017. While he could get time in left, it’s more likely he’ll DH since the plan is to use him out of the bullpen as well. Other outfield backups are freshman Nolan Clegg in the corners and Alec Taylor in center.<br /><br />OSU opens the season this weekend with an odd three-game series against Seton Hall in Pt. Charlotte, Florida. 
It is the start of a very lackluster non-conference schedule that doesn’t figure to help the Buckeyes’ cause come tournament time as the schedule did last year (although unfortunately as you can probably tell I tend to think the resume will be beyond help). There are no games against marquee names, although OSU will play MSU in a rare non-conference Big Ten matchup. The home schedule opens March 15 with a three-game series against Lipscomb, a one-off with Northern Kentucky, and a four-game series against Hawaii, whose players will probably be wondering what they did to wind up in Columbus in mid-March when they could be home.<br /><br />Big Ten play opens March 29 at Rutgers, with the successive weekends home to Northwestern and the forces of darkness, at Maryland, home to Iowa, at Minnesota, home to PSU, and at Purdue. Midweek opponents are the typical fare of local nines, including Toledo, Cincinnati, Ohio University (away), Dayton, Xavier, Miami (away), Wright State, and Youngstown State (away). The Big Ten tournament will be played May 22-26 in Omaha.<br /><br />It’s hard to be particularly optimistic that another surprise trip to the NCAA tournament is in the cards. Even some of the best pitchers who have come through OSU have struggled as freshmen so it’s hard to project the starting pitching to be good, and while there are productive returnees at multiple positions, only Canzone is a proven excellent hitter and a couple of positions are occupied by players who must make serious improvement to be average. The non-conference schedule may be soft enough to keep the record respectable, but there are few opportunities to grab wins that will help come selection time. Aspiring to qualify for the Big Ten tournament seems a more realistic goal. 
Beals is the longest-tenured active coach at OSU in any of the four sports that I follow rabidly, which on multiple levels is concerning (although two of the three other programs have coaches in place who have demonstrated their value at OSU, and the third did well in a three-game trial). Yet somehow Beals marches on, floating aimlessly in the middle of an improved Big Ten.<br /><br />Note: This preview is always a combination of my own knowledge and observation along with the <a href="https://ohiostatebuckeyes.com/2019-baseball-season-outlook/">official season outlook</a> released by the program, especially as pertains to position changes and newcomers about which I have next to no direct knowledge. That reliance was even greater this year due to the turnover on the mound.phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-64645848492143471322019-02-04T08:03:00.000-05:002019-02-04T08:03:12.227-05:00Enby Distribution, pt. 9: Cigol at the Extremes--Pythagenpat ExponentIn the last installment, I explored using the Cigol dataset to estimate the Pythagorean exponent. Alternatively, we could sidestep the attempt to estimate the exponent and try to directly estimate the z parameter in the Pythagenpat equation x = RPG^z.<br /><br />The positives of this approach include avoiding the scalar multipliers that move the estimator away from a result of 1 at 1 RPG, and maintaining a form that has been found useful by sabermetricians in the last decade or so. The latter is also the biggest drawback to this approach--it assumes that the form x = RPG^z is correct, and forgoes the opportunity of finding a form that provides a better fit, particularly with extreme datapoints. It’s also fair to question my objectivity in this matter, given that a plausible case could be made that I have a vested interest in “re-proving” the usefulness of Pythagenpat. 
That’s not my intent, but I would be remiss in not raising the possibility of my own (unintentional) bias influencing this discussion.<br /><br />Given that we know the Pythagorean exponent x as calculated in the last post, it is quite simple to compute the corresponding z value:<br /><br />z = log(x)/log(RPG)<br /><br />For the full dataset I’ve used throughout these posts, a plot of z against RPG looks like this:<br /><br /><a href="https://3.bp.blogspot.com/-0joxK_2uIkE/XFDQijU2SoI/AAAAAAAACos/2Fzg5hJoEWEe9xZJJs3SHV7f9Ei4ouvYwCLcBGAs/s1600/cigol9a.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-0joxK_2uIkE/XFDQijU2SoI/AAAAAAAACos/2Fzg5hJoEWEe9xZJJs3SHV7f9Ei4ouvYwCLcBGAs/s400/cigol9a.JPG" width="400" height="267" data-original-width="1048" data-original-height="699" /></a><br /><br />A quick glance suggests that it may be difficult to fit a clean function to this plot, as there is no clear relationship between RPG and z. It appears that in the 15-20 RPG range, there are a number of R/RA pairs for which a higher z is necessary than for the pairs at 20-30 RPG. While I have no particular reason to believe that the z value should necessarily increase as RPG increases, I have strong reason to doubt that the dataset I’ve put together allows us to conclude otherwise. Based on the way the pairs were chosen, extreme quality differences are overrepresented in this range. For example, there are pairs in which a team scores 14 runs per game and allows only 3. The more extreme high RPG levels are only reached when both teams are extremely high scoring; the most extreme difference captured in my dataset at 25 RPG is 15 R/10 RA.<br /><br />The best fit to this graph comes from a quadratic regression equation, but the negative coefficient for RPG^2 (the equation is z = -.0002*RPG^2 + .0062*RPG + .2392) makes it unpalatable from a theoretical perspective. 
The apparent quadratic shape may well be an accident of the data points used as described in the preceding paragraph. Power and logarithmic functions fail to produce the upward slope from 5-10 RPG, as does a linear equation. The latter has a very low r^2 (just .022) but results in an aesthetically pleasing gently increasing exponent as RPG increases (equation of .2803 + .00025*RPG). The slope is so gentle as to result in no meaningful difference when applying the equation to actual major league teams, leaving it as useless as the r^2 suggests it would be (RMSE of 4.008 for 1961-2014, with same result if using the z value based on plugging in the average of RPG of 8.805 for that period).<br /><br />It’s tempting to assume that z is higher in cases in which there is a large difference in runs scored and runs allowed. This could potentially be represented in an equation by run differential or run ratio, and such a construct would not be without sabermetric precedent, as other win estimators have been proposed that explicitly consider the discrepancy between the two teams (explicitly as in beyond the obvious truth that as you score more runs than you allow, you will win more games). 
(See the discussion of Tango’s old win estimator in <a href="https://walksaber.blogspot.com/2018/05/enby-distribution-pt-7-cigol-at.html">part 7</a>).<br /><br />First, let’s take a quick peek at the z versus RPG plot we’d get for the limited dataset I’ve used throughout the series (W%s between .3 and .7 with R/G and RA/G between 3 and 7):<br /><br /><a href="https://4.bp.blogspot.com/-_-dNMBMbDqY/XFDRffss4II/AAAAAAAACo4/5Tp2YkOz7JQPC7748mC0_FrzQIPOdiNNgCLcBGAs/s1600/cigol9b.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-_-dNMBMbDqY/XFDRffss4II/AAAAAAAACo4/5Tp2YkOz7JQPC7748mC0_FrzQIPOdiNNgCLcBGAs/s400/cigol9b.JPG" width="400" height="273" data-original-width="1038" data-original-height="709" /></a><br /><br />The relationship here is more in line with what we might have expected--z levels out as RPG increases, but there is no indication that z decreases with RPG (which, assuming my reasoning above is correct, reflects the fact that the teams in this dataset are much more realistic and matched in quality than are the oddballs in the full dataset). Again, the best fit comes from a quadratic regression, but the negative coefficient for RPG^2 is disqualifying. A logarithmic equation fits fairly well (r^2 = .884), but again fails to capture the behavior at lower levels of RPG, though this is not as damaging to the fit here because of the more limited data set. The logarithmic equation is z = .2484 + .0132*ln(RPG), but this produces a worse RMSE with the 1961-2014 teams (4.012) than simply using a fixed z.<br /><br />Returning to the full dataset, what happens if we run a regression that includes abs(R - RA) as a variable alongside RPG? 
We get this equation for z:<br /><br />z = .26846 + .00025*RPG + .00246*abs(R - RA)<br /><br />This is interesting as it is the same slope for RPG as seen in the equation that did not include abs(R - RA), but the intercept is much lower, which means that for average (R = RA) teams, the estimated z will be lower. This equation implies that differences between a team and its opponents really drive the behavior of z in the data.<br /><br />Applying this equation to the 1961-2014 data fails to improve RMSE, raising it to 4.018. So while this may be a nice idea and seem to fit the theoretical data better, it is not particularly useful in reality. I also tried a form with an RPG^2 term (and for some reason liked it when initially sketching out this series), but the negative RPG^2 coefficient dooms the equation to theoretical failure (and with a 4.017 RMSE it does little better with empirical data):<br /><br />z = .24689 - .00011*RPG^2 + .00378*RPG + .00183*abs(R - RA)<br /><br />One last idea I tried was using (R - RA)^2 as a variable rather than abs(R - RA). Squaring run differential eliminates any issue with negative numbers, and perhaps it is extreme quality imbalances that really drive the behavior of z. Alas, an RMSE of 4.014 is only slightly better than the others:<br /><br />z = .27348 + .00025*RPG + .00020*(R - RA)^2<br /><br />If you are curious, using the 1961-2014 team data, the minimum RMSE for Pythagenpat is achieved when z = .2867 (4.0067). The z value that minimized RMSE for the full dataset is .2852. 
This may be noteworthy in its own right -- a dataset based on major league team seasons and one based on theoretical teams of wildly divergent quality and run environment coming to the same result may be an indication that extreme efforts to refine z may be a fool's errand.<br /> <br />You may be wondering why, after an entire series built upon my belief in the importance of equations that work well for theoretical data, I’ve switched in this installment to largely measuring accuracy based on empirical data. My reasoning is as follows: in order for a more complex Pythagenpat equation to be worthwhile, it has to have a material and non-harmful effect in situations in which Pythagenpat is typically used. If no such equation is available (which is admittedly a much higher hurdle to clear than me simply not being able to find a suitable equation in a week or so of messing around with regressions), then it is best to stick with the simple Pythagenpat form. If one a) is really concerned with accuracy in extreme circumstances and b) thinks that Cigol is a decent “gold standard” against which to attempt to develop a shortcut that works in those circumstances, then one should probably just use Cigol and be done with it. Without a meaningful “real world” difference, and as the functions needed become more and more complex, it makes less sense to use any sort of shortcut method rather than just using Cigol. 
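<br /><br />For readers who want to experiment with the forms discussed above, here is a minimal Python sketch. It assumes the usual Pythagorean structure W% = R^x/(R^x + RA^x) with x = RPG^z; the function names are hypothetical, and the default z of .2867 is simply the 1961-2014 minimum-RMSE value quoted above. It is an illustration under those assumptions, not the code used to produce any of the figures in this series.

```python
import math


def pythagenpat_wpct(r_per_game, ra_per_game, z=0.2867):
    """Pythagenpat estimate: x = RPG^z, then W% = R^x / (R^x + RA^x).

    The default z is the fixed value quoted in the post; pass another
    value (or one of the variable-z forms below) to compare.
    """
    rpg = r_per_game + ra_per_game
    x = rpg ** z
    return r_per_game ** x / (r_per_game ** x + ra_per_game ** x)


def implied_z(x, rpg):
    """Back out z from a known Pythagorean exponent: z = log(x)/log(RPG).

    Undefined at RPG = 1, where log(RPG) is zero.
    """
    return math.log(x) / math.log(rpg)


def z_with_run_diff(r_per_game, ra_per_game):
    """Regression form quoted in the post:
    z = .26846 + .00025*RPG + .00246*abs(R - RA)
    """
    rpg = r_per_game + ra_per_game
    return 0.26846 + 0.00025 * rpg + 0.00246 * abs(r_per_game - ra_per_game)
```

Passing `z=z_with_run_diff(r, ra)` into `pythagenpat_wpct` gives the variable-z version, which makes it easy to see how little the estimate moves for realistic major league teams.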
<br /><br />Thus I will for the moment leave the Pythagenpat z function as a humble constant, and hold Cigol in reserve if I’m ever really curious to make my best guess at what the winning percentage would be for a team that scores 1.07 runs and allows 12.54 runs per game (probably something around .0051).<br /><br />The “full” dataset I’ve used in the last few posts is available <a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vSmpX1FeQjcDjUWYYnZZBgaKOzRprmz8SD89L7qTNM3cn3GSh8BT95yc3VYIECMt3NFr0jRa2cb79U9/pub?output=xlsx">here</a>.phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-57326166150378299662019-01-19T13:10:00.001-05:002019-01-19T13:11:20.432-05:00Run Distribution and W%, 2018I always start this post by looking at team records in blowout and non-blowout games. I define blowouts as games in which the margin of victory is six runs or more (rather than five, the definition used by Baseball-Reference). I settled on this last year after a Twitter discussion with Tom Tango and a poll that he ran. This definition results in 19.4% of major league games in 2018 being classified as blowouts; using five as the cutoff, it would be 28.0%, and using seven it would be 13.2%. Of course, using one standard ignores a number of factors, like the underlying run environment (the higher the run scoring level, the less impressive a fixed margin of victory) and park effects (which have a similar impact but in a more dramatic way when comparing teams in the same season). 
For the purposes here, around a fifth of games being blowouts feels right; it’s worth monitoring each season to see if the resulting percentage still makes sense.<br /><br />Team records in non-blowouts:<br /><br /><a href="https://3.bp.blogspot.com/-KbDwgO8h3ug/XENnJKh1vmI/AAAAAAAACnw/839ds_rf1dw9eARAD8bEh-vVLnxfbk6WgCLcBGAs/s1600/rd18a.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-KbDwgO8h3ug/XENnJKh1vmI/AAAAAAAACnw/839ds_rf1dw9eARAD8bEh-vVLnxfbk6WgCLcBGAs/s400/rd18a.JPG" width="178" height="400" data-original-width="262" data-original-height="589" /></a><br /><br />With over 80% of major league games being non-blowouts (as we’ll see in a moment, the highest blowout % for any team was 26% for Cleveland), it’s no surprise that all of the playoff teams were above .500 in these games, although the Indians and Dodgers just barely so. The Dodgers compensated in a big way:<br /><br /><a href="https://1.bp.blogspot.com/-6_1cDJbnRK8/XENnL9kmaLI/AAAAAAAACn0/eZFbKfjbCbc2nAxkbIo7eEs3W57qjRcfQCLcBGAs/s1600/rd18b.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-6_1cDJbnRK8/XENnL9kmaLI/AAAAAAAACn0/eZFbKfjbCbc2nAxkbIo7eEs3W57qjRcfQCLcBGAs/s400/rd18b.JPG" width="178" height="400" data-original-width="262" data-original-height="589" /></a><br /><br />There was very little middle ground in blowout games, with just three teams having a W% between .400 - .500. This isn’t too surprising since strong teams usually perform very well in blowouts, and the bifurcated nature of team strength in 2018 has been much discussed. 
This also shows up when looking at each team’s percentage of blowouts and difference between blowout and non-blowout W%:<br /><br /><a href="https://3.bp.blogspot.com/-RoDT0eTQyjs/XENnNfE9NVI/AAAAAAAACn4/WIu9Ucbb0IYmVVZEcvUvZqL3dSEo7lzBwCLcBGAs/s1600/rd18c.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-RoDT0eTQyjs/XENnNfE9NVI/AAAAAAAACn4/WIu9Ucbb0IYmVVZEcvUvZqL3dSEo7lzBwCLcBGAs/s400/rd18c.JPG" width="179" height="400" data-original-width="263" data-original-height="589" /></a><br /><br />A more interesting way to consider game-level data is to look at how teams perform when scoring or allowing a given number of runs. For the majors as a whole, here are the counts of games in which teams scored X runs:<br /><br /><a href="https://3.bp.blogspot.com/-rMkMBMvpofw/XENnOwImOhI/AAAAAAAACn8/rsOJDcbMylIjnELO6d4wrxIe6xOt_XmkgCLcBGAs/s1600/rd18d.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-rMkMBMvpofw/XENnOwImOhI/AAAAAAAACn8/rsOJDcbMylIjnELO6d4wrxIe6xOt_XmkgCLcBGAs/s400/rd18d.JPG" width="386" height="400" data-original-width="457" data-original-height="474" /></a><br /><br />The “marg” column shows the marginal W% for each additional run scored. In 2018, three was the mode of runs scored, while the second run resulted in the largest marginal increase in W%. The distribution is fairly similar to 2017, with the most obvious difference being an increase in W% in one-run games from .057 to .103; not surprisingly, the proportion of shutouts increased as well, from 5.4% to 6.4%. 
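<br /><br />As a rough illustration of how a “marg” column like the one above can be built, here is a small Python sketch. It assumes the marginal W% for X runs is simply W% when scoring X minus W% when scoring X - 1; the function name and the tallies are made up for the example and are not the 2018 figures.

```python
def marginal_wpct(wins_by_runs, games_by_runs):
    """Return (runs, W%, marginal W%) rows in order of runs scored.

    Marginal W% is the change in W% from the previous run total in the
    table; the first row has no previous total, so its value is None.
    Assumes the run totals in the table are consecutive.
    """
    rows = []
    prev_wpct = None
    for runs in sorted(games_by_runs):
        wpct = wins_by_runs.get(runs, 0) / games_by_runs[runs]
        marg = None if prev_wpct is None else wpct - prev_wpct
        rows.append((runs, wpct, marg))
        prev_wpct = wpct
    return rows


# Made-up tallies purely for illustration (not 2018 data).
games = {0: 100, 1: 200, 2: 300}
wins = {0: 0, 1: 20, 2: 90}
table = marginal_wpct(wins, games)
```

The run total with the largest marginal value is the one where an additional run buys the most in W%.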
<br /><br />The major league average dipped from 4.65 to 4.44 runs/game; this is the run distribution anticipated by Enby for that level (actually, 4.45) of R/G for fifteen or fewer runs:<br /><br /><a href="https://3.bp.blogspot.com/-CUm07HQFszM/XENnTZAl55I/AAAAAAAACoE/DLbI66yzMuwBg77nVqUsOHnMwkAYMN20QCLcBGAs/s1600/rd18f.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-CUm07HQFszM/XENnTZAl55I/AAAAAAAACoE/DLbI66yzMuwBg77nVqUsOHnMwkAYMN20QCLcBGAs/s400/rd18f.JPG" width="306" height="400" data-original-width="247" data-original-height="323" /></a><br /><br />Shutouts ran almost one percentage point above Enby’s estimate; that stands out in graph form along with Enby’s compensation by over-estimating the frequency of games with two or three runs scored. Still, a zero-modified negative binomial distribution (which is what the distribution I call Enby is) does a decent job:<br /><br /><a href="https://3.bp.blogspot.com/-BtsYCQMVNwE/XENnQXK-JjI/AAAAAAAACoA/306Oz3z8sIoPqljB60t3mV4Ri-PIFpsNQCLcBGAs/s1600/rd18e.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-BtsYCQMVNwE/XENnQXK-JjI/AAAAAAAACoA/306Oz3z8sIoPqljB60t3mV4Ri-PIFpsNQCLcBGAs/s400/rd18e.JPG" width="400" height="235" data-original-width="1078" data-original-height="632" /></a><br /><br />One way that you can use Enby to examine team performance is to use the team’s actual runs scored/allowed distributions in conjunction with Enby to come up with an offensive or defensive winning percentage. The notion of an offensive winning percentage was first proposed by Bill James as an offensive rate stat that incorporated the win value of runs. An offensive winning percentage is just the estimated winning percentage for an entity based on their runs scored and assuming a league average number of runs allowed. 
While later sabermetricians have rejected restating individual offensive performance as if the player were his own team, the concept is still sound for evaluating team offense (or, flipping the perspective, team defense). <br /><br />In 1986, James sketched out how one could use data regarding the percentage of the time that a team wins when scoring X runs to develop an offensive W% for a team using their run distribution rather than average runs scored as used in his standard OW%. I’ve been applying that concept since I’ve written this annual post, and last year was finally able to implement an Enby-based version. I will point you to <a href="https://walksaber.blogspot.com/2018/01/run-distribution-and-w-2017.html">last year’s post</a> if you are interested in the details of how this is calculated, but there are two main advantages to using Enby rather than the empirical distribution:<br /><br />1. While Enby may not perfectly match how runs are distributed in the majors, it sidesteps sample size issues and data oddities that are inherent when using empirical data. Use just one year of data and you will see things like teams that score ten runs winning less frequently than teams that score nine. Use multiple years to try to smooth it out and you will no longer be centered at the scoring level for the season you’re examining.<br /><br />2. There’s no way to park adjust unless you use a theoretical distribution. These are now park-adjusted by using a different assumed distribution of runs allowed given a league-average RA/G for each team based on their park factor (when calculating OW%; for DW%, the adjustment is to the league-average R/G).<br /><br />I call these measures Game OW% and Game DW% (gOW% and gDW%). 
One thing to note about the way I did this, with park factors applied on a team-by-team basis and rounding park-adjusted R/G or RA/G to the nearest .05 to use the table of Enby parameters that I’ve calculated, is that the league averages don’t balance to .500 as they should in theory. The average gOW% is .495 and the average gDW% is .505. <br /><br />For most teams, gOW% and OW% are very similar. Teams whose gOW% is higher than OW% distributed their runs more efficiently (at least to the extent that the methodology captures reality); the reverse is true for teams with gOW% lower than OW%. The teams that had differences of +/- 2 wins between the two metrics were (all of these are the g-type less the regular estimate, with the teams in descending order of absolute value of the difference):<br /><br />Positive: None<br />Negative: LA, WAS, CHN, NYN, CLE, LAA, HOU<br /><br />It doesn’t help here that the league average is .495, but it’s also possible that team-level deviations from Enby are greater given the unusual distribution of offensive events (e.g. low BA, high K, high HR) that currently dominates in MLB. One of the areas I’d like to study given the time and a coherent approach to the problem is how Enby parameters may vary based on component offensive statistics. The Enby parameters are driven by the variance of runs per game and the frequency of shutouts; for both, it’s not too difficult to imagine changes in the shape of offense having a significant impact. <br /><br />Teams with differences of +/- 2 wins (note: this calculation uses 162 games for all teams even though a handful played 161 or 163 in 2018) between gDW% and standard DW%: <br /><br />Positive: MIA, PHI, PIT, NYN, CHA<br />Negative: HOU<br /><br />Miami’s gDW% was .443 while their DW% was .406, a difference of 5.9 wins which was the highest in the majors for either side of the ball (their offense displayed no such difference, with .449/.444). 
That makes them a good example to demonstrate what having an unusual run distribution relative to Enby looks like and how that can change the expected wins:<br /><br /><a href="https://4.bp.blogspot.com/-tZU5khZxRHg/XENnU2Ut5SI/AAAAAAAACoI/ugzgDoL0Uw88Bp_2zS9DtHGx-awxyzcEQCLcBGAs/s1600/rd18g.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-tZU5khZxRHg/XENnU2Ut5SI/AAAAAAAACoI/ugzgDoL0Uw88Bp_2zS9DtHGx-awxyzcEQCLcBGAs/s400/rd18g.JPG" width="400" height="228" data-original-width="1052" data-original-height="599" /></a><br /><br />This graph excludes two games in which the Marlins coughed up 18 and 20 runs, which themselves do much to explain the huge discrepancy--giving up twenty runs kills your RA/G but from a win perspective is scarcely different than giving up thirteen (given their run environment, Enby expected that Miami would win 1.1% of the time allowing thirteen and 0.0% allowing twenty).<br /><br />Miami allowed two and three runs much less frequently than Enby expected; given that they should have won 79% of games when allowing two and 59% when allowing three, that explains much of the difference. They allowed eight or more runs 23.6% of the time compared to just 12.5% estimated by Enby, but all those extra runs weren’t particularly costly in terms of wins since the Marlins were only expected to win 6.4% of such games (calculated by taking the weighted average of the expected W% when allowing 8, 9, … runs with the expected frequency of allowing 8, 9, … runs given that they allowed 8+ runs).<br /><br />I don’t have a good clean process for combining gOW% and gDW% into an overall gEW%; instead I use Pythagenpat math to convert the gOW% and gDW% into equivalent runs and runs allowed and calculate an EW% from those. 
This can be compared to EW% figured using Pythagenpat with the average runs scored and allowed for a similar comparison of teams with positive and negative differences between the two approaches:<br />Positive: MIA, PHI, KC, PIT, CHA, MIN, SF, SD<br />Negative: LA, WAS, HOU, CHN, CLE, LAA, BOS, ATL<br /><br />Despite their huge defensive difference, Miami was edged out for the largest absolute value of difference by the Dodgers (6.08 to -6.11). The Dodgers were -4.8 on offense and -1.7 on defense (astute readers will note these don’t sum to -6.11, but they shouldn’t given the nature of the math), while the Marlins 5.9 on defense was only buffeted by .9 on offense (as we’ve seen before, there was only a .005 discrepancy between their gOW% and OW%). <br /><br />The table below has the various winning percentages for each team:<br /><br /><a href="https://4.bp.blogspot.com/-P12HbLGu31M/XENnWZS-g6I/AAAAAAAACoM/3eVblPz1hQA6jI1CtwF7xGN5o-RJvwkmACLcBGAs/s1600/rd18h.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-P12HbLGu31M/XENnWZS-g6I/AAAAAAAACoM/3eVblPz1hQA6jI1CtwF7xGN5o-RJvwkmACLcBGAs/s400/rd18h.JPG" width="400" height="360" data-original-width="658" data-original-height="592" /></a>phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-70881401725672764992018-12-18T09:03:00.000-05:002018-12-18T09:03:03.696-05:00Crude Team Ratings, 2018For the last several years I have published a set of team ratings that I call "Crude Team Ratings". The name was chosen to reflect the nature of the ratings--they have a number of limitations, of which I documented several when I introduced the <a href="http://walksaber.blogspot.com/2011/01/crude-team-ratings.html">methodology</a>.<br /><br />I explain how CTR is figured in the linked post, but in short:<br /><br />1) Start with a win ratio figure for each team. 
It could be actual win ratio, or an estimated win ratio.<br /><br />2) Figure the average win ratio of the team’s opponents.<br /><br />3) Adjust for strength of schedule, resulting in a new set of ratings.<br /><br />4) Begin the process again. Repeat until the ratings stabilize.<br /><br />The resulting rating, CTR, is an adjusted win/loss ratio rescaled so that the majors’ arithmetic average is 100. The ratings can be used to directly estimate W% against a given opponent (without home field advantage for either side); a team with a CTR of 120 should win 60% of games against a team with a CTR of 80 (120/(120 + 80)).<br /><br />First, CTR based on actual wins and losses. In the table, “aW%” is the winning percentage equivalent implied by the CTR and “SOS” is the measure of strength of schedule--the average CTR of a team’s opponents. The rank columns provide each team’s rank in CTR and SOS:<br /><br /><a href="https://3.bp.blogspot.com/-Eb71ynDLU_E/XBgetRAGzoI/AAAAAAAACms/fJW00TpzXuwG92KhKIGdySBa7c3GMdeCQCLcBGAs/s1600/ctr18a.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-Eb71ynDLU_E/XBgetRAGzoI/AAAAAAAACms/fJW00TpzXuwG92KhKIGdySBa7c3GMdeCQCLcBGAs/s400/ctr18a.JPG" width="281" height="400" data-original-width="393" data-original-height="560" /></a><br /><br />The playoff teams all finished in the top twelve, with the third-place teams from the top-heavy AL East and West being denied spots in the dance despite having the fifth/sixth most impressive records in the majors (and damn the .3 CTR that separated us from #6org). The AL also had four of the bottom five teams; the bifurcated nature of the AL is something that was well observed and noted from the standings but also is evident when adjusting for strength of schedule. Note the hellish schedule faced by bad AL teams; Baltimore, with the worst CTR in MLB, had the toughest SOS at 118 - an average opponent at the level of the Cubs. 
Those Cubs had the easiest schedule, playing an average opponent roughly equivalent to the Pirates.<br /><br />Next are the division averages. Originally I gave the arithmetic average CTR for each division, but that’s mathematically wrong--you can’t average ratios like that. Then I switched to geometric averages, but really what I should have done all along is just give the arithmetic average aW% for each division/league. aW% converts CTR back to an “equivalent” W-L record, such that the average across the major leagues will be .50000:<br /><br /><a href="https://2.bp.blogspot.com/-2JxoaOxbbP4/XBgeutyGlmI/AAAAAAAACmw/9aKHBtnmvIY-FhvLsMuWctHSKY0bKI4yQCLcBGAs/s1600/ctr18b.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-2JxoaOxbbP4/XBgeutyGlmI/AAAAAAAACmw/9aKHBtnmvIY-FhvLsMuWctHSKY0bKI4yQCLcBGAs/s400/ctr18b.JPG" width="400" height="266" data-original-width="244" data-original-height="162" /></a><br /><br />The AL once again was markedly superior to the NL; despite the sorry showing of the Central, the West was almost as good as the Central was bad, and the East was strong as well. Given the last fifteen years of AL dominance, you may have glossed over the last sentence, but if you are familiar with the results of 2018 interleague play, it may give you pause. The NL went 158-142 against the AL, so how does the average AL team rank ahead? It may be counter-intuitive, but one can easily argue that the NL should have performed better than it did. The NL’s best division got the benefit of matching up with the AL’s worst division (the Centrals). The AL Central went 38-62 (.380), but the East went 54-46 (.540) and the West 50-50 (.500). <br /><br />Of course, the CTRs can also use theoretical win ratios as a basis, and so the next three tables will be presented without much comment. 
The first uses gEW%, which is a measure I calculate that looks at each team’s runs scored distribution and runs allowed distribution separately to calculate an expected winning percentage given average runs allowed or runs scored, and then uses Pythagorean logic to combine the two and produce a single estimated W% based on the empirical run distribution:<br /><br /><a href="https://2.bp.blogspot.com/-44XPjGg5o1Q/XBgev2dvNVI/AAAAAAAACm0/qujPULN_4QY8w4tg6_PLh9WViJcxYyxLgCLcBGAs/s1600/ctr18c.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-44XPjGg5o1Q/XBgev2dvNVI/AAAAAAAACm0/qujPULN_4QY8w4tg6_PLh9WViJcxYyxLgCLcBGAs/s400/ctr18c.JPG" width="282" height="400" data-original-width="395" data-original-height="560" /></a><br /><br />The next version utilizes EW%, which is to say standard Pythagenpat based on actual runs scored and allowed:<br /><br /><a href="https://1.bp.blogspot.com/-4f9tkV6mPaQ/XBgew4Ht8jI/AAAAAAAACm4/g38T2ckvNgs1ppJGJj_O44kA1vjYh8IrgCLcBGAs/s1600/ctr18d.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-4f9tkV6mPaQ/XBgew4Ht8jI/AAAAAAAACm4/g38T2ckvNgs1ppJGJj_O44kA1vjYh8IrgCLcBGAs/s400/ctr18d.JPG" width="281" height="400" data-original-width="394" data-original-height="561" /></a><br /><br />And one based on PW%, which is Pythagenpat but using runs created and runs created allowed in place of actual runs totals:<br /><br /><a href="https://1.bp.blogspot.com/-upSNIK1dp4k/XBgex8qibMI/AAAAAAAACm8/uDny0MDFpfUrpD_wNYcC7leDD62TxfCcQCLcBGAs/s1600/ctr18e.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-upSNIK1dp4k/XBgex8qibMI/AAAAAAAACm8/uDny0MDFpfUrpD_wNYcC7leDD62TxfCcQCLcBGAs/s400/ctr18e.JPG" width="282" height="400" data-original-width="395" data-original-height="560" /></a><br /><br />Everything that I’ve shared so far has been based on regular season data only. 
Of course, ten teams play additional games for keeps, and we could consider these in ratings as well. Some points to consider when it comes to incorporating postseason data:<br /><br />1. It would be silly to pretend that these additional games don’t give us any insight on team quality. Of course every opportunity we have to observe a team in a competitive situation increases our sample size and informs our best estimate of that team’s quality.<br /><br />2. Postseason games are played under a different set of conditions and constraints than regular season games, particularly when it comes to how pitchers are used. This is not a sufficient reason to ignore the data, in my opinion.<br /><br />3. A bigger problem, and one that causes me to present ratings that include postseason performance only half-heartedly, is the bias introduced to the ratings by the playoff structure. The nature of postseason series serves to exaggerate the difference in team performance observed during the series. Take the Astros/Indians ALDS. The Astros dominated the Indians over three games, which certainly provides additional evidence vis-a-vis the relative strength of the two teams. Based on regular season performance, Houston looked like a superior club (175 to 113 CTR, implying that they should win 61% of their games against the Indians), and the sweep provided additional evidence. However, because the series terminated after three games, all won by the Astros, it overstated the difference. 
If the series were played out to completion (assuming you can imagine a non-farcical way in which this could be done), we would expect to see Cleveland pull out a win, and even adding 1-4 and 4-1 to these two teams’ ratings would decrease the CTR gap between the two (although still correctly increasing it compared to considering only the regular season).<br /><br />This is one of those concepts that seems very clear to me when I think about it, and yet is extremely difficult to communicate in a coherent manner, so let me simply assert that I think bias is present when the number of observations is dependent on the outcome of the previous observations (like in a playoff series) that is not present when the number of observations is independent of the outcome of previous observations (as is the case for a regular season in which all teams play 162 games regardless of whether they are mathematically eliminated in August).<br /><br />Still, I think it’s worth looking at CTRs including the post-season; I only will present these for actual wins and losses, but of course if you were so inclined you could base them on estimated winning percentages as well:<br /><br /><a href="https://4.bp.blogspot.com/-S90oZmQp5g4/XBgezHusEfI/AAAAAAAACnA/jUgw37_OMKAK5YVmEHubnonJ7t38vDCTQCLcBGAs/s1600/ctr18f.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-S90oZmQp5g4/XBgezHusEfI/AAAAAAAACnA/jUgw37_OMKAK5YVmEHubnonJ7t38vDCTQCLcBGAs/s400/ctr18f.JPG" width="282" height="400" data-original-width="395" data-original-height="560" /></a><br /><br />Here is a comparison of CTR including postseason (pCTR) to the regular season-only version, and the difference between the two:<br /><br /><a href="https://4.bp.blogspot.com/-2h45K73e6Bc/XBge0OtOs0I/AAAAAAAACnE/YEu8_A6zL38uOLMJDfyMqR83ixNl3shdACLcBGAs/s1600/ctr18g.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" 
src="https://4.bp.blogspot.com/-2h45K73e6Bc/XBge0OtOs0I/AAAAAAAACnE/YEu8_A6zL38uOLMJDfyMqR83ixNl3shdACLcBGAs/s400/ctr18g.JPG" width="142" height="400" data-original-width="198" data-original-height="559" /></a><br /><br />I’ve been figuring CTRs since 2011 and playoff-inclusive versions since 2013, so Boston’s rating in both stood out when I saw it. I thought it might be interesting to look at the leader in each category each season. The 2018 Red Sox are the highest-rated team of the past eight seasons by a large margin (of course, such ratings do nothing to estimate any kind of underlying differences in MLB-wide quality between seasons):<br /><br /><a href="https://1.bp.blogspot.com/-Z8U16xoMtSU/XBgfyZyet1I/AAAAAAAACnk/n9H1cm-oP7w8jxWKvCcpmV9GuaKUcgJ9gCLcBGAs/s1600/ctr18h.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-Z8U16xoMtSU/XBgfyZyet1I/AAAAAAAACnk/n9H1cm-oP7w8jxWKvCcpmV9GuaKUcgJ9gCLcBGAs/s400/ctr18h.JPG" width="400" height="269" data-original-width="247" data-original-height="166" /></a><br /><br />I didn’t realize that the 2017 Dodgers were the previous leaders; I would have guessed it was the 2016 Cubs, although they would be much farther down the list. It is also worth noting that this year’s Astros would have been ranked #1 of the period were it not for the Red Sox. 
Boston’s sixteen point improvement including the playoffs was easily the best for any team that had been ranked #1, and that does make sense intuitively: 3-1 over the #3 ranked (regular season) Yankees, 4-1 over #2 ranked Houston, and 4-1 over the #9 Dodgers is one impressive playoff showing.<br /><br />Hitting by Position -- 2018 (posted 2018-12-10)<br /><br />Of all the annual repeat posts I write, this is the one which most interests me--I have always been fascinated by patterns of offensive production by fielding position, particularly trends over baseball history and cases in which teams have unusual distributions of offense by position. I also contend that offensive positional adjustments, when carefully crafted and appropriately applied, remain a viable and somewhat more objective competitor to the defensive positional adjustments often in use, although this post does not really address those broad philosophical questions.<br /><br />The first obvious thing to look at is the positional totals for 2018, with the data coming from Baseball-Reference.com. “MLB” is the overall total for MLB, which is not the same as the sum of all the positions here, as pinch-hitters and runners are not included in those. “POS” is the MLB totals minus the pitcher totals, yielding the composite performance by non-pitchers. “PADJ” is the position adjustment, which is the position RG divided by the overall major league average (this is a departure from past posts; I’ll discuss this a little at the end). “LPADJ” is the long-term positional adjustment that I use, based on 2002-2011 data. 
The rows “79” and “3D” are the combined corner outfield and 1B/DH totals, respectively:<br /><br /><a href="https://1.bp.blogspot.com/-7cfsEPtIzUc/XAmti0z8VkI/AAAAAAAAClo/rZ-ZRjzpw18RPc-eYOwRzo414oqaB2OGwCLcBGAs/s1600/pos18a.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-7cfsEPtIzUc/XAmti0z8VkI/AAAAAAAAClo/rZ-ZRjzpw18RPc-eYOwRzo414oqaB2OGwCLcBGAs/s400/pos18a.JPG" width="400" height="121" data-original-width="898" data-original-height="271" /></a><br /><br />An annual review of this data is problematic because it can lead to seeing trends where there are actually just blips. Two years ago second basemen smashed their way to unprecedented heights; this year they were right back at their long-term average. In 2017, DHs were 4% worse than the league average -- in 2018 they were part of what one could call the left side convergence of the defensive spectrum, as DH, 1B, LF, RF, and 3B all basically hit the same. Shortstops were above league average, which is of note, while catchers and center fielders also ended up right at their normal levels (yes, I really should update my “long-term” positional adjustments; I promise to do that for next year). <br /><br />Moving on to looking at more granular levels of performance, I always start by looking at the NL pitching staffs and their RAA. I need to stress that the runs created method I’m using here does not take into account sacrifices, which usually is not a big deal but can be significant for pitchers. Note that all team figures from this point forward in the post are park-adjusted. 
The RAA figures for each position are baselined against the overall major league average RG for the position, except for left field and right field which are pooled.<br /><br /><a href="https://3.bp.blogspot.com/-jypCyenUG6U/XAmtsFs44xI/AAAAAAAACls/PNCWAOb_oYclgH1KS8yaXxEFXXHZTkdKQCLcBGAs/s1600/pos18b.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-jypCyenUG6U/XAmtsFs44xI/AAAAAAAACls/PNCWAOb_oYclgH1KS8yaXxEFXXHZTkdKQCLcBGAs/s400/pos18b.JPG" width="400" height="372" data-original-width="311" data-original-height="289" /></a><br /><br />While positions relative to the league bounce around each year, it seems that the most predictable thing about this post is that the difference between the best and worst NL pitching staffs will be about twenty runs at the plate. Sixteen is a narrower spread than typical, but pitchers also fell to an all-time low -5 positional adjustment.<br /><br />I don’t run a full chart of the leading positions since you will very easily be able to go down the list and identify the individual primarily responsible for the team’s performance and you won’t be shocked by any of them, but the teams with the highest RAA at each spot were:<br /><br />C--LA, 1B--LA, 2B--HOU, 3B--CLE, SS--BAL, LF--MIL, CF--LAA, RF--BOS, DH--NYA<br /><br />I don’t know about “shocked”, but I was surprised to see that Baltimore had the most productive shortstops. Not that I didn’t know that Manny Machado had a great “first half” of the season for the O’s, but I was surprised that whoever they threw out there for the rest of the year didn’t drag their overall numbers down further. 
In fact Tim Beckham and Jonathan Villar were perfectly cromulent (offensively at least, although Machado wasn’t lighting up any defensive metrics himself) and Baltimore finished two RAA ahead of the Red Sox (Bogaerts) and Indians (Lindor), and three runs ahead of the Rockies (Story).<br /><br />More interesting are the worst performing positions; the player listed is the one who started the most games at that position for the team:<br /><br /><a href="https://2.bp.blogspot.com/-Xl26tTieGa4/XAmtuF8nxmI/AAAAAAAAClw/BNMA_jA9q8cNNG8Le5--WPXQCTKnRroQwCLcBGAs/s1600/pos18c.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-Xl26tTieGa4/XAmtuF8nxmI/AAAAAAAAClw/BNMA_jA9q8cNNG8Le5--WPXQCTKnRroQwCLcBGAs/s400/pos18c.JPG" width="400" height="175" data-original-width="416" data-original-height="182" /></a><br /><br />Boston’s catchers weren’t the worst relative to their position, but they were the worst hitting regular position in MLB on a rate basis; teams can certainly overcome a single dreadful position, but they usually don’t do it to the tune of 108 wins. The most pathetic position was definitely the Orioles’ first basemen, with a healthy lead for lowest RAA thanks to having the fifth-worst RG of any regular position (only Red Sox catchers, Tigers second basemen, Diamondbacks catchers, and their own catchers were worse).<br /><br />I like to attempt to measure each team’s offensive profile by position relative to a typical profile. I’ve found it frustrating as a fan when my team’s offensive production has come disproportionately from “defensive” positions rather than offensive positions (“Why can’t we just find a corner outfielder who can hit?”). The best way I’ve yet been able to come up with to measure this is to look at the correlation between RG at each position and the long-term positional adjustment. 
A positive correlation indicates a “traditional” distribution of offense by position--more production from the positions on the right side of the defensive spectrum. (To calculate this, I use the long-term positional adjustments that pool 1B/DH as well as LF/RF, and because of the DH I split it out by league.) There is no value judgment here--runs are runs whether they are created by first basemen or shortstops:<br /><br /><a href="https://1.bp.blogspot.com/-_kaPc6J1BW8/XAmtvrzZZsI/AAAAAAAACl0/tOlvDj2gNV4khOyB5nSFPBb0gjcyvSHMQCLcBGAs/s1600/pos18d.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-_kaPc6J1BW8/XAmtvrzZZsI/AAAAAAAACl0/tOlvDj2gNV4khOyB5nSFPBb0gjcyvSHMQCLcBGAs/s400/pos18d.JPG" width="306" height="400" data-original-width="222" data-original-height="290" /></a><br /><br /><br />We’ve already seen that Milwaukee’s shortstops were the least productive in the majors and their left fielders the most productive, which helps explain their high correlation. Baltimore’s low correlation likewise makes sense as they had the least productive first basemen and the most productive shortstops.<br /><br />The following tables, broken out by division, display RAA for each position, with teams sorted by the sum of positional RAA. Positions with negative RAA are in red, and positions that are +/-20 RAA are bolded:<br /><br /><a href="https://1.bp.blogspot.com/-3KXFQagQaSs/XAmtxfPsVlI/AAAAAAAACl4/MMYVdXf2PPgszPRwIg060VytALfTykDLACLcBGAs/s1600/pos18e.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-3KXFQagQaSs/XAmtxfPsVlI/AAAAAAAACl4/MMYVdXf2PPgszPRwIg060VytALfTykDLACLcBGAs/s400/pos18e.JPG" width="400" height="91" data-original-width="487" data-original-height="111" /></a><br /><br /><br />Boston’s monstrously productive outfield easily led the majors in RAA (as did their corner outfielders), but the catchers dragged them down just behind New York. 
Baltimore was below-average at every position except shortstop, so after they dealt Machado it was really ugly. They were the worst in MLB at both the infield and outfield corners.<br /><br /><br /><a href="https://3.bp.blogspot.com/-wM5abGoDgKM/XAmtzY3GT0I/AAAAAAAACmA/44nNyztDJegFTtwa71Ld2zab9_fSb9hcwCLcBGAs/s1600/pos18f.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-wM5abGoDgKM/XAmtzY3GT0I/AAAAAAAACmA/44nNyztDJegFTtwa71Ld2zab9_fSb9hcwCLcBGAs/s400/pos18f.JPG" width="400" height="91" data-original-width="486" data-original-height="110" /></a><br /><br />Cleveland’s offense had issues all over the place, but a pair of MVP candidates on the infield can cover that up, at least in the AL Central where every other team was below average. Chicago had the majors’ worst outfield RAA while Detroit was last in the AL for middle infielders.<br /><br /><br /><a href="https://2.bp.blogspot.com/-ZPJ91kW90SU/XAmt1TyuWII/AAAAAAAACmE/-xHJBX8ykEENN6XqfrilZZsv6iDeIQ9PwCLcBGAs/s1600/pos18g.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-ZPJ91kW90SU/XAmt1TyuWII/AAAAAAAACmE/-xHJBX8ykEENN6XqfrilZZsv6iDeIQ9PwCLcBGAs/s400/pos18g.JPG" width="400" height="80" data-original-width="487" data-original-height="98" /></a><br /><br />For a second straight season, Houston’s infield and middle infield were tops in the AL despite injuries slowing down Altuve and Correa. Oakland’s Matts led the majors in corner infield RAA. Los Angeles had the majors’ least productive infield, which for all its badness still couldn’t cancel out the Trout-led centerfielders. 
<br /><br /><a href="https://2.bp.blogspot.com/-FJYx4dP-LTo/XAmt3oTPeII/AAAAAAAACmI/U_2xG_mikioNd-lCEMMQ5ZaQjfuxW5hjgCLcBGAs/s1600/pos18h.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-FJYx4dP-LTo/XAmt3oTPeII/AAAAAAAACmI/U_2xG_mikioNd-lCEMMQ5ZaQjfuxW5hjgCLcBGAs/s400/pos18h.JPG" width="400" height="89" data-original-width="443" data-original-height="99" /></a><br /><br />Washington led the NL in corner outfield RAA. New York was last in the NL in corner infield RAA. Last year Miami led the majors in outfield RAA; this year they trailed. This comes as no surprise of course, but is still worthy of a sad note before chuckling at the dark, monochromatic threads they will wear in 2019.<br /><br /><a href="https://4.bp.blogspot.com/-S7kYtrEuz7w/XAmt5aOWfKI/AAAAAAAACmM/sy4sEh3tjII1_HaCbtlVL41_SBT2B6bdQCLcBGAs/s1600/pos18i.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-S7kYtrEuz7w/XAmt5aOWfKI/AAAAAAAACmM/sy4sEh3tjII1_HaCbtlVL41_SBT2B6bdQCLcBGAs/s400/pos18i.JPG" width="400" height="90" data-original-width="442" data-original-height="100" /></a><br /><br />This division was nearly the opposite of the AL Central, as every team had a positive RAA with room to spare. Chicago led the majors in middle infield RAA; Milwaukee was the worst in the same category, but covered it over with the NL’s most productive outfield. I will admit to being as confused by their trade deadline maneuverings as the next guy, but when you see so starkly how little they were getting out of their middle infield, the shakeup makes more sense. 
Of course one of the chief regular season offenders, Orlando Arcia, raked in the playoffs.<br /><br /><a href="https://3.bp.blogspot.com/-QIFzqM6RPeY/XAmt7eEkovI/AAAAAAAACmQ/MXDd_cv_IXYqTf_01966H1eKPyijh4CtgCLcBGAs/s1600/pos18j.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-QIFzqM6RPeY/XAmt7eEkovI/AAAAAAAACmQ/MXDd_cv_IXYqTf_01966H1eKPyijh4CtgCLcBGAs/s400/pos18j.JPG" width="400" height="88" data-original-width="442" data-original-height="97" /></a><br /><br />Los Angeles’ corner infielders led the majors in RAA, and dragged their middle infielders (well, really just the second basemen) to the best total infield mark as well. The rest of the division was below average, and Colorado’s corner outfielders were last in the NL which should provide a juicy career opportunity for someone.<br /><br />The full spreadsheet with data is available <a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vSDQA7ysXIGcI32GTev5WsqebP4r1aAeDFROIRjAVqcD1qJmu9ASlH-YjZBobEvDszDickXBwXX7uhO/pub?output=html">here</a>.<br /><br />Leadoff Hitters, 2018 (posted 2018-11-19)<br /><br />I will try to make this as clear as possible: the statistics are based on the players that hit in the #1 slot in the batting order, whether they were actually leading off an inning or not. It includes the performance of all players who batted in that spot, including substitutes like pinch-hitters. <br /><br />Listed in parentheses after a team are all players that started in twenty or more games in the leadoff slot--while you may see a listing like "HOU (Springer)" this does not mean that the statistic is based solely on Springer's performance; it is the total of all Houston batters in the #1 spot, of which Springer was the only one to start in that spot in twenty or more games. 
I will list the top and bottom three teams in each category (plus the top/bottom team from each league if they don't make the ML top/bottom three); complete data is available in a spreadsheet linked at the end of the article. There are also no park factors applied anywhere in this article.<br /><br />That's as clear as I can make it, and I hope it will suffice. I always feel obligated to point out that as a sabermetrician, I think that the importance of the batting order is often overstated, and that the best leadoff hitters would generally be the best cleanup hitters, the best #9 hitters, etc. However, since the leadoff spot gets a lot of attention, and teams pay particular attention to the spot, it is instructive to look at how each team fared there.<br /><br />The conventional wisdom is that the primary job of the leadoff hitter is to get on base, and most simply, score runs. It should go without saying on this blog that runs scored are heavily dependent on the performance of one’s teammates, but when writing on the internet it’s usually best to assume nothing. So let's start by looking at runs scored per 25.5 outs (AB - H + CS):<br /><br />1. BOS (Betts/Benintendi), 9.0<br />2. STL (Carpenter/Pham), 7.1<br />3. NYA (Gardner/Hicks/McCutchen), 6.8<br />Leadoff average, 5.4<br />ML average, 4.4<br />28. SF (Hernandez/McCutchen/Blanco/Panik), 4.1<br />29. SD (Jankowski/Margot), 4.1<br />30. BAL (Mancini/Mullins/Beckham), 4.0<br /><br />In the years I’ve been writing this post, I’m not sure I’ve seen the same player show up as a member of a leading team and a trailing team, but there is Andrew McCutchen, part-time leadoff hitter for both the group that scored runs at the third-highest clip and the group that scored at the third-lowest. 
Leading off just 28 times for the Giants and 21 times for the Yankees, he wasn’t the driving force behind either performance.<br /><br />The most basic team independent category that we could look at is OBA (figured as (H + W + HB)/(AB + W + HB)):<br /><br />1. BOS (Betts/Benintendi), .421<br />2. CHN (Almora/Rizzo/Murphy/Zobrist), .367<br />3. KC (Merrifield/Jay), .365<br />Leadoff average, .335<br />ML average, .320<br />28. BAL (Mancini/Mullins/Beckham), .297<br />29. DET (Martin/Candelario), .296<br />30. SF (Hernandez/McCutchen/Blanco/Panik), .294<br /><br />I’m still lamenting the loss of “Esky Magic” as a punchline for every leaderboard in this post, even though it’s been two years since the Royals’ leadoff spot was making outs in bunches thanks to their magical shortstop. Luckily Whit Merrifield gives off scrappy player vibes that media narrative makers can get behind...well, could if anyone still cared about the Royals.<br /><br />The next statistic is what I call Runners On Base Average. The genesis for ROBA is the A factor of Base Runs. It measures the number of times a batter reaches base per PA--excluding homers, since a batter that hits a home run never actually runs the bases. It also subtracts caught stealing here because the BsR version I often use does as well, but BsR versions based on initial baserunners rather than final baserunners do not. Here ROBA = (H + W + HB - HR - CS)/(AB + W + HB).<br /><br />This metric has caused some confusion, so I’ll expound. ROBA, like several other methods that follow, is not really a quality metric, it is a descriptive metric. A high ROBA is a good thing, but it's not necessarily better than a slightly lower ROBA plus a higher home run rate (which would produce a higher OBA and more runs). Listing ROBA is not in any way, shape or form a statement that hitting home runs is bad for a leadoff hitter. It is simply a recognition of the fact that a batter that hits a home run is not a baserunner. 
Base Runs is an excellent model of offense and ROBA is one of its components, and thus it holds some interest in describing how a team scored its runs. As such it is more a measure of shape than of quality:<br /><br />1. BOS (Betts/Benintendi), .360<br />2. KC (Merrifield/Jay), .338<br />3. CHN (Almora/Rizzo/Murphy/Zobrist), .334<br />Leadoff average, .298<br />ML average, .285<br />28. BAL (Mancini/Mullins/Beckham), .264<br />29. SF (Hernandez/McCutchen/Blanco/Panik), .259<br />30. LAA (Calhoun/Kinsler/Cozart), .257<br /><br />The Angels are the only change from the top/bottom three on the OBA list; they were fourth-last at .298 but their 26 homers ranked eighth and drove their ROBA down to the bottom.<br /><br />I also include what I've called Literal OBA--this is just ROBA with HR subtracted from the denominator so that a homer does not lower LOBA, it simply has no effect. It “literally” (not really, thanks to errors, out stretching, caught stealing after subsequent plate appearances, etc.) is the proportion of plate appearances in which the batter becomes a baserunner able to be advanced by his teammates. You don't really need ROBA and LOBA (or either, for that matter), but this might save some poor message board out there twenty posts, by not implying that I think home runs are bad. LOBA = (H + W + HB - HR - CS)/(AB + W + HB - HR):<br /><br />1. BOS (Betts/Benintendi), .379<br />2. KC (Merrifield/Jay), .343<br />3. CHN (Almora/Rizzo/Murphy/Zobrist), .343<br />Leadoff average, .306<br />ML average, .294<br />28. BAL (Mancini/Mullins/Beckham), .271<br />29. LAA (Calhoun/Kinsler/Cozart), .267<br />30. SF (Hernandez/McCutchen/Blanco/Panik), .266<br /><br />The next two categories are most definitely categories of shape, not value. The first is the ratio of runs scored to RBI. Leadoff hitters as a group score many more runs than they drive in, partly due to their skills and partly due to lineup dynamics. 
Those with low ratios don’t fit the traditional leadoff profile as closely as those with high ratios (at least in the way their seasons played out, and of course using R and RBI incorporates the quality and style of the hitters in the adjacent lineup spots rather than attributes of the leadoff hitters’ performance in isolation):<br /><br />1. SEA (Gordon/Haniger), 2.0<br />2. MIL (Cain/Thames), 1.9<br />3. MIA (Dietrich/Castro/Ortega), 1.9<br />Leadoff average, 1.6<br />28. WAS (Eaton/Turner), 1.3<br />29. ATL (Acuna/Inciarte/Albies), 1.3<br />30. TOR (Granderson/McKinney), 1.2<br />ML average, 1.0<br /><br />I don’t know about you, but if you’d told me that leadoff spots led by Cain/Thames and Eaton/Turner would both appear as extreme on this list, I would have guessed that the former would be the one tilted to RBI.<br /><br />A similar gauge, but one that doesn't rely on the teammate-dependent R and RBI totals, is Bill James' Run Element Ratio. RER was described by James as the ratio between those things that were especially helpful at the beginning of an inning (walks and stolen bases) to those that were especially helpful at the end of an inning (extra bases). It is a ratio of "setup" events to "cleanup" events. Singles aren't included because they often function in both roles. <br /><br />Of course, there are RBI walks and doubles are a great way to start an inning, but RER classifies events based on when they have the highest relative value, at least from a simple analysis:<br /><br />1. PHI (Hernandez), 1.4<br />2. KC (Merrifield/Jay), 1.3<br />3. TB (Smith/Kiermaier/Span), 1.1<br />Leadoff average, .8<br />ML average, .7<br />28. LA (Taylor/Pederson), .6<br />29. PIT (Frazier/Harrison/Dickerson), .5<br />30. 
TOR (Granderson/McKinney), .5<br /><br />I should note that in the context-neutral RER, the two teams with seemingly backwards placement on the R/RBI list are closer to where you’d expect--the Nats were sixth at 1.0 while the Brewers were still forwardly placed but much closer to average (ranking tenth with .9).<br /><br />Since stealing bases is part of the traditional skill set for a leadoff hitter, I've included the ranking for what some analysts call net steals, SB - 2*CS. I'm not going to worry about the precise breakeven rate, which is probably closer to 75% than 67%, but is also variable based on situation. The ML and leadoff averages in this case are per team lineup slot:<br /><br />1. WAS (Eaton/Turner), 26<br />2. KC (Merrifield/Jay), 25<br />3. TB (Smith/Kiermaier/Span), 17<br />Leadoff average, 5<br />ML average, 2<br />27. BAL (Mancini/Mullins/Beckham), -6<br />27. CHN (Almora/Rizzo/Murphy/Zobrist), -6<br />27. LA (Taylor/Pederson), -6<br />30. CIN (Peraza/Schebler/Winker/Hamilton), -9<br /><br />Shifting back to quality measures, first up is one that David Smyth proposed when I first wrote this annual leadoff review. Since the optimal weight for OBA in a x*OBA + SLG metric is generally something like 1.7, David suggested figuring 2*OBA + SLG for leadoff hitters, as a way to give a little extra boost to OBA while not distorting things too much, or even suffering an accuracy decline from standard OPS. Since this is a unitless measure anyway, I multiply it by .7 to approximate the standard OPS scale and call it 2OPS:<br /><br />1. BOS (Betts/Benintendi), 1017<br />2. CLE (Lindor), 847<br />3. STL (Carpenter/Pham), 842<br />Leadoff average, 762<br />ML average, 735<br />28. SF (Hernandez/McCutchen/Blanco/Panik), 669<br />29. DET (Martin/Candelario), 652<br />30. 
BAL (Mancini/Mullens/Beckham), 649<br /><br />Along the same lines, one can also evaluate leadoff hitters in the same way I'd go about evaluating any hitter, and just use Runs Created per Game with standard weights (this will include SB and CS, which are ignored by 2OPS):<br /><br />1. BOS (Betts/Benintendi), 8.7<br />2. CLE (Lindor), 5.9<br />3. STL (Carpenter/Pham), 5.7<br />Leadoff average, 4.7<br />ML average, 4.3<br />28. SF (Hernandez/McCutchen/Blanco/Panik), 3.5<br />29. DET (Martin/Candelario), 3.3<br />30. BAL (Mancini/Mullens/Beckham), 3.1<br /><br />Seeing the same six extreme teams in perfect order overstates the correlation, but naturally there is a very strong relationship between the last two metrics. The biggest difference in any team’s ranks in the two was four spots. <br /><br />Allow me to close with a crude theoretical measure of linear weights supposing that the player always led off an inning (that is, batted in the bases empty, no outs state). There are weights out there (see The Book) for the leadoff slot in its average situation, but this variation is much easier to calculate (although also based on a silly and impossible premise). <br /><br />The weights I used were based on the 2010 run expectancy table from Baseball Prospectus. Ideally I would have used multiple seasons but this is a seat-of-the-pants metric. The 2010 post goes into the detail of how this measure is figured; this year, I’ll just tell you that the out coefficient was -.234, the CS coefficient was -.601, and for other details refer you to that post. I then restate it per the number of PA for an average leadoff spot (748 in 2017):<br /><br />1. BOS (Betts/Benintendi), 57<br />2. STL (Carpenter/Pham), 20<br />3. CLE (Lindor), 17<br />Leadoff average, 0<br />ML average, -7<br />28. SF (Hernandez/McCutchen/Blanco/Panik), -22<br />29. DET (Martin/Candelario), -24<br />30. 
BAL (Mancini/Mullens/Beckham), -27<br /><br />Boston completely dominated the quality metrics for leadoff hitters in 2018, due mostly of course to the superlative season of Mookie Betts, who was the second-best offensive player in the game in 2018. Put one of the top hitters in the leadoff spot and you can expect to lead in a lot of categories - BOS led easily not just in categories that reflect quality without any shape distortion (not necessarily context-free) like R/G, OBA, 2OPS, RG, and LE, but also in ROBA and LOBA, which are designed to not measure value but rather the rate of leadoff hitters reaching base for their teammates to drive in. What’s more, it’s not as if Boston achieved this by having a really good OBA out of the leadoff spot without a lot of power - the Red Sox and Cardinals tied for the ML lead with 38 homers out of the leadoff spot. The Indians ranked third with 37, and those teams ranked 1-2-3 in all of the overall quality measures. One can argue about optimal lineup construction, but in 2018, leadoff hitters hit dingers like everyone else. Every team was in double digits in homers out of the leadoff spot; in 2017 only two teams fell short of that mark, but one of those teams hit just three homers. <br /><br />The spreadsheet with full data is available <a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vT5rkQgVLZWlXsTkPN-vzPsaN3lGK2VZu9FEPoGEGZrW6kK5FhDYdaIJkf7L6OB9vOYFp9_mBDyazou/pub?output=html">here</a>. phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-23074946271726134622018-11-12T07:07:00.000-05:002018-11-12T07:07:07.155-05:00Hypothetical Ballot: MVPI tend to think I’m pretty objective when it comes to baseball analysis. Someone reading my blog or Twitter feed (RIP, mostly) with a critical eye might beg to differ: I like the Indians, players accused of using steroids, hate the Royals, and oh yeah I really love Mike Trout. 
The latter is certainly not unique to me -- how could you not like Mike Trout? -- but it is pronounced enough that my objectivity could be called into question when (for once) Mike Trout is engaged in a close race for AL MVP.<br /><br />I think Mike Trout was most likely the most valuable player in baseball in 2018, and I firmly believe I would say that even if I was not a huge fan. While Baseball-Reference and Fangraphs’ WAR would disagree, Baseball Prospectus’ WARP agrees, so I’m not completely on an island.<br /><br />The key consideration for me is that Trout was markedly superior offensively to Mookie Betts once you properly weight offensive events (read: more credit to Trout for his walks than metrics of the OPS family would allow) and adjust for the big difference in park factors between Angels Stadium and Fenway Park (97 and 105 PF respectively). I estimate that, adjusting for park, Trout created six more runs than Betts while making twenty fewer outs. That’s about a nine run difference. Then there is the position adjustment, which is worth another four.<br /><br />Betts does cut into this lead with his defensive value: going in the order FRAA/UZR/DRS, Betts (11/15/20) has an average twelve runs higher than Trout (-2/4/8). I don’t credit the full difference, but even if I did, Trout would still have a one run edge. Give Betts a couple extra runs for baserunning (a debatable point)? I’m still going with the player with a clear advantage in offensive value. Regress the defense 50%? It’s close but the choice is much clearer.<br /><br />The rest of my ballot is pretty self-explanatory if you look at my RAR estimates. I could justify just about any order of 6-9; I’m not at all convinced that JD Martinez was more valuable than Jose Ramirez, but chalk that one up to avoiding the indication of bias. Francisco Lindor rises based on excellent fielding metrics (6/14/14):<br /><br />1. CF Mike Trout, LAA<br />2. RF Mookie Betts, BOS<br />3. 
SP Justin Verlander, HOU<br />4. 3B Alex Bregman, HOU<br />5. SP Chris Sale, BOS<br />6. DH JD Martinez, BOS<br />7. SP Blake Snell, TB<br />8. SS Francisco Lindor, CLE<br />9. 3B Jose Ramirez, CLE<br />10. SP Corey Kluber, CLE<br /><br />The NL MVP race is weird. Christian Yelich had an eighteen RAR lead over the next closest position player (Javier Baez), which is typically an indication of a historically great season. Triple crown bid aside, Yelich did not have a historically great season, “merely” a typical MVP-type season. In the AL, he would have been well behind Trout and Betts with Bregman and Martinez right on his heels.<br /><br />Thus the only meaningful comparison for the top of the ballot is the top hitter (Yelich) against the top pitcher (Jacob deGrom). When it comes to an MVP race between a hitter and a pitcher, I usually try to give the former the benefit of the doubt. Specifically, while there is one primary way in which I evaluate the offensive contribution of a hitter (runs created based on their statistics, converted to RAR), there are three obvious ways using the traditional stat line to calculate RAR for a pitcher. The first is based on actual runs allowed; the second on peripheral statistics (this one is most similar to the comparable calculation for batters); the third based on DIPS principle. In order for me to support a pitcher for MVP, ideally he would be more valuable using each of these perspectives on evaluating performance. deGrom achieved this, with his lowest RAR total (72 based on DIPS principles) exceeding Yelich’s 69 RAR (and with Yelich’s -5/-2/4 fielding metrics, 69 is as good as it gets). <br /><br />Given the huge gap between Yelich and Baez, starting pitchers dominate the top of my ballot. 
The movers upward when considering fielding are a pair of first basemen (Freddie Freeman and Paul Goldschmidt) and Nolan Arenado, while Bryce Harper’s fielding metrics were dreadful (-12/-14/-26) and drop him all the way off the ballot:<br /><br />1. SP Jacob deGrom, NYN<br />2. LF Christian Yelich, MIL<br />3. SP Max Scherzer, WAS<br />4. SP Aaron Nola, PHI<br />5. SP Kyle Freeland, COL<br />6. SP Patrick Corbin, ARI<br />7. SS Javier Baez, CHN<br />8. 1B Freddie Freeman, ATL<br />9. 1B Paul Goldschmidt, ARI<br />10. 3B Nolan Arenado, COLphttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-51829950949151912202018-11-08T07:05:00.000-05:002018-11-08T07:05:02.073-05:00Hypothetical Ballot: Cy YoungThe AL Cy Young race is extremely close due to the two candidates who appeared to be battling it out for the award much of the season missing significant time in the second half. Despite their injuries, Chris Sale and Trevor Bauer had logged enough innings preventing enough runs on a rate basis to still be legitimate contenders in the end. Justin Verlander and Blake Snell each tied with 74 RAR based on actual runs allowed adjusted for bullpen support, an eight run lead over Sale in third. But when you look at metrics based on eRA (based on “components”) and dRA (based on DIPS concepts), Sale, Bauer, Corey Kluber, and Gerrit Cole all cut into that gap.<br /><br />In fact, using a crude weighting of 50% RA-based, 25% eRA-based, and 25% dRA-based RAR, there are six pitchers separated by seven RAR. A seventh, Mike Clevinger, had 65 standard RAR but worse peripherals to drop four runs behind the bottom of that pack.<br /><br />There are any number of reasonable ways to fill out one’s ballot, but I think the best choice for across-the-board excellence is Verlander. 
He pitched just one fewer inning than league leader Kluber, tied for the league lead in standard RAR, was second, one run behind Kluber, in eRA-based RAR, and was third by five runs to Sale in dRA-based RAR. Chris Sale sneaks into second for me as he led across the board in RA; even pitching just 158 innings, seventeen fewer than even Bauer, his excellence allowed him to accrue a great deal of value. Snell and the Indians round out my ballot; I’ve provided the statistics I considered below as evidence of how close this is:<br /><br />1. Justin Verlander, HOU<br />2. Chris Sale, BOS<br />3. Blake Snell, TB<br />4. Corey Kluber, CLE<br />5. Trevor Bauer, CLE<br /><br /><a href="https://4.bp.blogspot.com/-Xjpfx3nkKV8/W-Nh3kfayhI/AAAAAAAAClc/-yr5VwohcqEY7PvdBsgw7z9e4od0TqnsACLcBGAs/s1600/alcy18.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-Xjpfx3nkKV8/W-Nh3kfayhI/AAAAAAAAClc/-yr5VwohcqEY7PvdBsgw7z9e4od0TqnsACLcBGAs/s400/alcy18.JPG" width="400" height="73" data-original-width="713" data-original-height="130" /></a><br /><br />The NL race is not nearly as close, as Jacob deGrom was second in innings (by just three to Max Scherzer) and led in all of the RA categories, plus Quality Start % and probably a whole bunch of equally suspect measures of performance. <br /><br />Behind him I see no particular reason to deviate from the order suggested by RAR; Scherzer over Aaron Nola is an easy choice due to the former’s superior peripherals, and while Patrick Corbin had superior peripherals to Kyle Freeland, the latter’s 13 RAR lead is a lot to ignore, although Corbin should be recognized for having an eRA and dRA quite similar to Max Scherzer and otherwise lapping the rest of the field. With the exception of course of Jacob deGrom, the author of a season that is worthy of considerable discussion in the next installment of “meaningless hypothetical award ballots”:<br /><br />1. Jacob deGrom, NYN<br />2. 
Max Scherzer, WAS<br />3. Aaron Nola, PHI<br />4. Kyle Freeland, COL<br />5. Patrick Corbin, ARIphttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-86569774105154479982018-10-31T07:54:00.000-04:002018-10-31T07:54:02.781-04:00Hypothetical Ballot: Rookie of the YearI would expect to see some fairly wide variations in Rookie of the Year rankings even among sabermetric-minded people this season, especially in the American League where nuances in player value methodology can result in significant differences in how one ranks the candidates. <br /><br />I think that Shohei Ohtani was the most valuable AL rookie, by a decent margin. Offensively, I have him at 28 RAR; you may want to cut a few runs off of that if you think there should be a more punitive DH penalty. That ranks behind Miguel Andujar (36) and Joey Wendle (31), and even with Gleyber Torres (27). However, Andujar’s fielding marks are truly dreadful (-11 BP FRAA is the most generous evaluation; UZR at -16 and DRS at -25 are even more down on his performance). Wendle consistently gets around 5 RAA, and evaluations of Torres are varied (7, -8, -1). <br /><br />Ohtani also contributed as a pitcher. While he only pitched 52 innings, his 3.41 park-adjusted RA over that work is good for 14 RAR. I see no reason why he shouldn’t be viewed separately against replacement level for his offensive and pitching work; this isn’t the same situation as evaluating a batter against separate replacement levels for offense and fielding. Ohtani’s role can be bifurcated by his manager; if he was not contributing value offensively, he would lose his opportunities in that space while still being permitted to take the mound. A player’s performance as a batter and a fielder cannot be similarly divided, except if the DH role is available. 
If anything, Ohtani should get a bonus for only taking up one roster spot (can we use that to offset any docking for the DH positional adjustment and call it even?)<br /><br />Ohtani at 42 RAR outshines Wendle, even with full credit for fielding, as well as the top pitching candidate, Brad Keller (36 RAR with a good eRA but only 28 RAR if evaluated on a DIPS basis). The other top pitching candidate by standard RAR, Jaime Barria (33), had worse peripherals and a very poor dRA (5.24). Regressing the fielding stats a little, I give Andujar the nod over Torres with offense as tiebreaker, but they should be the bottom of the ballot, not the top:<br /><br />1. DH/SP Shohei Ohtani, LAA<br />2. 3B Joey Wendle, TB<br />3. SP Brad Keller, KC<br />4. 3B Miguel Andujar, NYA<br />5. 2B Gleyber Torres, NYA<br /> <br />In the NL, the old pull to bestow RoY upon the transcendent prospect rather than the most valuable rookie comes into play a little bit. With two young hitters the caliber of Ronald Acuna and Juan Soto to choose from, it is very tempting to put them on top. I think Walker Buehler deserves better. At 43 RAR, Buehler is ahead of Acuna and Soto (38) before taking fielding into account, and neither Acuna (-2 to -9) nor Soto (3 to -5) shines in those metrics. <br /><br />I think the three can be placed in any ballot order quite reasonably; while most people (including me) would take Acuna’s future, Soto and Acuna were nearly even this season, with essentially the same park-adjusted batting averages supplemented by Soto’s amazing walk rate and Acuna’s superior power. Give Soto some credit as a fielder and Acuna some as a baserunner and it’s still very close. Buehler was not as strong in RAR if using a DIPS approach (30), which drops him back to their level. 
Acuna may be the better prospect, but Soto’s younger, and while I don’t like to give extra credit for performance by time in the season, Buehler came up huge in a regular season game that would conclusively decide a division title. Somewhat arbitrarily, I have it:<br /><br />1. LF Juan Soto, WAS<br />2. SP Walker Buehler, LA<br />3. LF Ronald Acuna, ATL<br />4. SP Jack Flaherty, STL<br />5. RF Brian Anderson, MIAphttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-70817081757431695812018-10-05T10:25:00.000-04:002018-10-05T10:25:49.534-04:00End of Season Statistics, 2018The spreadsheets are published as Google Spreadsheets, which you can download in Excel format by changing the extension in the address from "=html" to "=xlsx", or in open format as "=ods". That way you can download them and manipulate things however you see fit. <br /><br />The data comes from a number of different sources. Most of the data comes from Baseball-Reference. KJOK's park database is extremely helpful in determining when park factors should reset. Data on bequeathed runners comes from Baseball Prospectus. <br /><br />The basic philosophy behind these stats is to use the simplest methods that have acceptable accuracy. Of course, "acceptable" is in the eye of the beholder, namely me. I use Pythagenpat not because other run/win converters, like a constant RPW or a fixed exponent are not accurate enough for this purpose, but because it's mine and it would be kind of odd if I didn't use it. <br /><br />If I seem to be a stickler for purity in my critiques of others' methods, I'd contend it is usually in a theoretical sense, not an input sense. 
So when I exclude hit batters, I'm not saying that hit batters are worthless or that they *should* be ignored; it's just easier not to mess with them and not that much less accurate (note: hit batters are actually included in the offensive statistics now).<br /><br />I also don't really have a problem with people using sub-standard methods (say, Basic RC) as long as they acknowledge that they are sub-standard. If someone pretends that Basic RC doesn't undervalue walks or cause problems when applied to extreme individuals, I'll call them on it; if they explain its shortcomings but use it regardless, I accept that. Take these last three paragraphs as my acknowledgment that some of the statistics displayed here have shortcomings as well, and I've at least attempted to describe some of them in the discussion below.<br /><br />The League spreadsheet is pretty straightforward--it includes league totals and averages for a number of categories, most or all of which are explained at appropriate junctures throughout this piece. The advent of interleague play has created two different sets of league totals--one for the offense of league teams and one for the defense of league teams. Before interleague play, these two were identical. I do not present both sets of totals (you can figure the defensive ones yourself from the team spreadsheet, if you desire), just those for the offenses. The exception is for the defense-specific statistics, like innings pitched and quality starts. The figures for those categories in the league report are for the defenses of the league's teams. However, I do include each league's breakdown of basic pitching stats between starters and relievers (denoted by "s" or "r" prefixes), and so summing those will yield the totals from the pitching side. 
The one abbreviation you might not recognize is "N"--this is the league average of runs/game for one team, and it will pop up again.<br /><br />The Team spreadsheet focuses on overall team performance--wins, losses, runs scored, runs allowed. The columns included are: Park Factor (PF), Home Run Park Factor (PFhr), Winning Percentage (W%), Expected W% (EW%), Predicted W% (PW%), wins, losses, runs, runs allowed, Runs Created (RC), Runs Created Allowed (RCA), Home Winning Percentage (HW%), Road Winning Percentage (RW%) [exactly what they sound like--W% at home and on the road], Runs/Game (R/G), Runs Allowed/Game (RA/G), Runs Created/Game (RCG), Runs Created Allowed/Game (RCAG), and Runs Per Game (the average number of runs scored and allowed per game). Ideally, I would use outs as the denominator, but for teams, outs and games are so closely related that I don’t think it’s worth the extra effort.<br /><br />The runs and Runs Created figures are unadjusted, but the per-game averages are park-adjusted, except for RPG which is also raw. Runs Created and Runs Created Allowed are both based on a simple Base Runs formula. The formula is:<br /><br />A = H + W - HR - CS<br />B = (2TB - H - 4HR + .05W + 1.5SB)*.76<br />C = AB - H<br />D = HR<br />Naturally, A*B/(B + C) + D.<br /><br />I have explained the methodology used to figure the PFs before, but the cliff’s notes version is that they are based on five years of data when applicable, include both runs scored and allowed, and they are regressed towards average (PF = 1), with the amount of regression varying based on the number of years of data used. There are factors for both runs and home runs. The initial PF (not shown) is:<br /><br />iPF = (H*T/(R*(T - 1) + H) + 1)/2<br />where H = RPG in home games, R = RPG in road games, T = # teams in league (14 for AL and 16 for NL). Then the iPF is converted to the PF by taking x*iPF + (1-x), where x = .6 if one year of data is used, .7 for 2, .8 for 3, and .9 for 4+. 
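<br /><br />For anyone who wants to check the arithmetic, here is a rough Python sketch of the two park factor steps just described. The function and variable names are my own shorthand for illustration, not anything from the spreadsheets, and the inputs in the usage note are made up.

```python
# Regression weight x by number of years of data; 4 or more years use .9.
REGRESSION_WEIGHT = {1: 0.6, 2: 0.7, 3: 0.8}


def initial_park_factor(home_rpg, road_rpg, teams):
    """iPF = (H*T/(R*(T - 1) + H) + 1)/2, with H and R as RPG at home and on the road."""
    h, r, t = home_rpg, road_rpg, teams
    return (h * t / (r * (t - 1) + h) + 1) / 2


def park_factor(home_rpg, road_rpg, teams, years):
    """PF = x*iPF + (1 - x), regressing the initial factor toward 1."""
    if years < 1:
        raise ValueError("need at least one year of data")
    x = REGRESSION_WEIGHT.get(years, 0.9)
    return x * initial_park_factor(home_rpg, road_rpg, teams) + (1 - x)
```

As a sanity check, a park with identical home and road RPG comes out at exactly 1 no matter how many years are used, e.g. park_factor(9.0, 9.0, 15, 5).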
<br /><br />It is important to note, since there always seems to be confusion about this, that these park factors already incorporate the fact that the average player plays 50% on the road and 50% at home. That is what the adding one and dividing by 2 in the iPF is all about. So if I list Fenway Park with a 1.02 PF, that means that it actually increases RPG by 4%. <br /><br />In the calculation of the PFs, I did not take out “home” games that were actually at neutral sites (of which there were a rash this year).<br /><br />There are also Team Offense and Defense spreadsheets. These include the following categories:<br /><br />Team offense: Plate Appearances, Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Walks and Hit Batters per At Bat (WAB), Isolated Power (SLG - BA), R/G at home (hR/G), and R/G on the road (rR/G) BA, OBA, SLG, WAB, and ISO are park-adjusted by dividing by the square root of park factor (or the equivalent; WAB = (OBA - BA)/(1 - OBA), ISO = SLG - BA, and SEC = WAB + ISO).<br /><br />Team defense: Innings Pitched, BA, OBA, SLG, Innings per Start (IP/S), Starter's eRA (seRA), Reliever's eRA (reRA), Quality Start Percentage (QS%), RA/G at home (hRA/G), RA/G on the road (rRA/G), Battery Mishap Rate (BMR), Modified Fielding Average (mFA), and Defensive Efficiency Record (DER). BA, OBA, and SLG are park-adjusted by dividing by the square root of PF; seRA and reRA are divided by PF.<br /><br />The three fielding metrics I've included are limited it only to metrics that a) I can calculate myself and b) are based on the basic available data, not specialized PBP data. 
The three metrics are explained in this post, but here are quick descriptions of each:<br /><br />1) BMR--wild pitches and passed balls per 100 baserunners = (WP + PB)/(H + W - HR)*100<br /><br />2) mFA--fielding average removing strikeouts and assists = (PO - K)/(PO - K + E)<br /><br />3) DER--the Bill James classic, using only the PA-based estimate of plays made. Based on a suggestion by Terpsfan101, I've tweaked the error coefficient. Plays Made = PA - K - H - W - HR - HB - .64E and DER = PM/(PM + H - HR + .64E)<br /><br />Next are the individual player reports. I defined a starting pitcher as one with 15 or more starts. All other pitchers are eligible to be included as a reliever. If a pitcher has 40 appearances, then they are included. Additionally, if a pitcher has 50 innings and less than 50% of his appearances are starts, he is also included as a reliever (this allows some swingmen type pitchers who wouldn’t meet either the minimum start or appearance standards to get in). This would be a good point to note that I didn't do much to adjust for the opener--I made the decision to classify Ryan Yarbrough as a starter and Ryne Stanek as a reliever, but maybe next year I can implement some <a href="http://tangotiger.com/index.php/site/comments/does-war-need-to-be-adjusted-for-the-opener">good ideas</a> into the RAA/RAR methodology. <br /><br />For all of the player reports, ages are based on simply subtracting their year of birth from 2017. I realize that this is not compatible with how ages are usually listed and so “Age 27” doesn’t necessarily correspond to age 27 as I list it, but it makes everything a heckuva lot easier, and I am more interested in comparing the ages of the players to their contemporaries than fitting them into historical studies, and for the former application it makes very little difference. The "R" category records rookie status with a "R" for rookies and a blank for everyone else; I've trusted Baseball Prospectus on this. 
Also, all players are counted as being on the team with whom they played/pitched (IP or PA as appropriate) the most. <br /><br />For relievers, the categories listed are: Games, Innings Pitched, estimated Plate Appearances (PA), Run Average (RA), Relief Run Average (RRA), Earned Run Average (ERA), Estimated Run Average (eRA), DIPS Run Average (dRA), Strikeouts per Game (KG), Walks per Game (WG), Guess-Future (G-F), Inherited Runners per Game (IR/G), Batting Average on Balls in Play (%H), Runs Above Average (RAA), and Runs Above Replacement (RAR).<br /><br />IR/G is per relief appearance (G - GS); it is an interesting thing to look at, I think, in lieu of actual leverage data. You can see which closers come in with runners on base, and which are used nearly exclusively to start innings. Of course, you can’t infer too much; there are bad relievers who come in with a lot of people on base, not because they are being used in high leverage situations, but because they are long men being used in low-leverage situations already out of hand.<br /><br />For starting pitchers, the columns are: Wins, Losses, Innings Pitched, Estimated Plate Appearances (PA), RA, RRA, ERA, eRA, dRA, KG, WG, G-F, %H, Pitches/Start (P/S), Quality Start Percentage (QS%), RAA, and RAR. RA and ERA you know--R*9/IP or ER*9/IP, park-adjusted by dividing by PF. The formulas for eRA and dRA are based on the same Base Runs equation and they estimate RA, not ERA.<br /><br />* eRA is based on the actual results allowed by the pitcher (hits, doubles, home runs, walks, strikeouts, etc.). It is park-adjusted by dividing by PF.<br /><br />* dRA is the classic DIPS-style RA, assuming that the pitcher allows a league average %H, and that his hits in play have a league-average S/D/T split. 
It is park-adjusted by dividing by PF.<br /><br />The formula for eRA is:<br /><br />A = H + W - HR<br />B = (2*TB - H - 4*HR + .05*W)*.78<br />C = AB - H = K + (3*IP - K)*x (where x is figured as described below for PA estimation and is typically around .93) = PA (from below) - H - W<br />eRA = (A*B/(B + C) + HR)*9/IP<br /><br />To figure dRA, you first need the estimate of PA described below. Then you calculate W, K, and HR per PA (call these %W, %K, and %HR). Percentage of balls in play (BIP%) = 1 - %W - %K - %HR. This is used to calculate the DIPS-friendly estimate of %H (H per PA) as e%H = Lg%H*BIP%.<br /><br />Now everything has a common denominator of PA, so we can plug into Base Runs:<br /><br />A = e%H + %W<br />B = (2*(z*e%H + 4*%HR) - e%H - 5*%HR + .05*%W)*.78<br />C = 1 - e%H - %W - %HR<br />dRA = (A*B/(B + C) + %HR)/C*a<br /><br />z is the league average of total bases per non-HR hit (TB - 4*HR)/(H - HR), and a is the league average of (AB - H) per game.<br /><br />In the past I presented a couple of batted ball RA estimates. I’ve removed these, not just because batted ball data exhibits questionable reliability but because these metrics were complicated to figure, required me to collate the batted ball data, and were not personally useful to me. I figure these stats for my own enjoyment and have in some form or another going back to 1997. I share them here only because I would do it anyway, so if I’m not interested in certain categories, there’s no reason to keep presenting them.<br /><br />Instead, I’m showing strikeout and walk rate, both expressed as per game. By game I mean not nine innings but rather the league average of PA/G. I have always been a proponent of using PA and not IP as the denominator for non-run pitching rates, and now the use of per PA rates is widespread. Usually these are expressed as K/PA and W/PA, or equivalently, percentage of PA with a strikeout or walk. 
I don’t believe that any site publishes these as K and W per equivalent game as I am here. This is not better than K%--it’s simply applying a scalar multiplier. I like it because it generally follows the same scale as the familiar K/9.<br /><br />To facilitate this, I’ve finally corrected a flaw in the formula I use to estimate plate appearances for pitchers. Previously, I’ve done it the lazy way by not splitting strikeouts out from other outs. I am now using this formula to estimate PA (where PA = AB + W):<br /><br />PA = K + (3*IP - K)*x + H + W<br />Where x = league average of (AB - H - K)/(3*IP - K)<br /><br />Then KG = K/PA*Lg(PA/G) and WG = W/PA*Lg(PA/G).<br /><br />G-F is a junk stat, included here out of habit because I've been including it for years. It was intended to give a quick read of a pitcher's expected performance in the next season, based on eRA and strikeout rate. Although the numbers vaguely resemble RAs, it's actually unitless. As a rule of thumb, anything under four is pretty good for a starter. G-F = 4.46 + .095(eRA) - .113(K*9/IP). It is a junk stat. JUNK STAT JUNK STAT JUNK STAT. Got it?<br /><br />%H is BABIP, more or less--%H = (H - HR)/(PA - HR - K - W), where PA was estimated above. Pitches/Start includes all appearances, so I've counted relief appearances as one-half of a start (P/S = Pitches/(.5*G + .5*GS)). QS% is just QS/GS; I don't think it's particularly useful, but Doug's Stats include QS so I include it.<br /><br />I've used a stat called Relief Run Average (RRA) in the past, based on Sky Andrecheck's article in the August 1999 By the Numbers; that one only used inherited runners, but I've revised it to include bequeathed runners as well, making it equally applicable to starters and relievers. I use RRA as the building block for baselined value estimates for all pitchers. 
I explained RRA in this article, but the bottom line formulas are:<br /><br />BRSV = BRS - BR*i*sqrt(PF)<br />IRSV = IR*i*sqrt(PF) - IRS<br />RRA = ((R - (BRSV + IRSV))*9/IP)/PF<br /><br />The two baselined stats are Runs Above Average (RAA) and Runs Above Replacement (RAR). Starting in 2015 I revised RAA to use a slightly different baseline for starters and relievers as described here. The adjustment is based on patterns from the last several seasons of league average starter and reliever eRA. Thus it does not adjust for any advantages relief pitchers enjoy that are not reflected in their component statistics. This could include runs allowed scoring rules that benefit relievers (although the use of RRA should help even the scales in this regard, at least compared to raw RA) and the talent advantage of starting pitchers. The RAR baselines do attempt to take the latter into account, and so the difference in starter and reliever RAR will be more stark than the difference in RAA.<br /><br />RAA (relievers) = (.951*LgRA - RRA)*IP/9<br />RAA (starters) = (1.025*LgRA - RRA)*IP/9<br />RAR (relievers) = (1.11*LgRA - RRA)*IP/9<br />RAR (starters) = (1.28*LgRA - RRA)*IP/9<br /><br />All players with 250 or more plate appearances (official, total plate appearances) are included in the Hitters spreadsheets (along with some players close to the cutoff point who I was interested in). Each is assigned one position, the one at which they appeared in the most games. The statistics presented are: Games played (G), Plate Appearances (PA), Outs (O), Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Runs Created (RC), Runs Created per Game (RG), Speed Score (SS), Hitting Runs Above Average (HRAA), Runs Above Average (RAA), Hitting Runs Above Replacement (HRAR), and Runs Above Replacement (RAR).<br /><br />Starting in 2015, I'm including hit batters in all related categories for hitters, so PA is now equal to AB + W+ HB. Outs are AB - H + CS. 
BA and SLG you know, but remember that without SF, OBA is just (H + W + HB)/(AB + W + HB). Secondary Average = (TB - H + W + HB)/AB = SLG - BA + (OBA - BA)/(1 - OBA). I have not included net steals as many people (and Bill James himself) do, but I have included HB which some do not.<br /><br />BA, OBA, and SLG are park-adjusted by dividing by the square root of PF. This is an approximation, of course, but I'm satisfied that it works well (I plan to post a couple articles on this some time during the offseason). The goal here is to adjust for the win value of offensive events, not to quantify the exact park effect on the given rate. I use the BA/OBA/SLG-based formula to figure SEC, so it is park-adjusted as well.<br /><br />Runs Created is actually Paul Johnson's ERP, more or less. Ideally, I would use a custom linear weights formula for the given league, but ERP is just so darn simple and close to the mark that it’s hard to pass up. I still use the term “RC” partially as a homage to Bill James (seriously, I really like and respect him even if I’ve said negative things about RC and Win Shares), and also because it is just a good term. I like the thought put in your head when you hear “creating” a run better than “producing”, “manufacturing”, “generating”, etc. to say nothing of names like “equivalent” or “extrapolated” runs. None of that is said to put down the creators of those methods--there just aren’t a lot of good, unique names available. <br /><br />For 2015, I refined the formula a little bit to:<br /><br />1. include hit batters at a value equal to that of a walk<br />2. value intentional walks at just half the value of a regular walk<br />3. recalibrate the multiplier based on the last ten major league seasons (2005-2014)<br /><br />This revised RC = (TB + .8H + W + HB - .5IW + .7SB - CS - .3AB)*.310<br /><br />RC is park adjusted by dividing by PF, making all of the value stats that follow park adjusted as well. 
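<br /><br />A quick Python sketch of the revised RC formula with the park adjustment applied, in case it helps to see it spelled out. The function name and argument order are my own choices for illustration, and the stat line in the usage note is invented rather than any real player's.

```python
def runs_created(tb, h, w, hb, iw, sb, cs, ab, pf=1.0):
    """Revised RC = (TB + .8H + W + HB - .5IW + .7SB - CS - .3AB)*.310, divided by PF."""
    raw = (tb + 0.8 * h + w + hb - 0.5 * iw + 0.7 * sb - cs - 0.3 * ab) * 0.310
    # Park-adjust by dividing by the park factor; pf=1.0 leaves RC unadjusted.
    return raw / pf
```

For a made-up line of 600 AB, 180 H, 300 TB, 60 W, 5 HB, 4 IW, 10 SB, and 5 CS, the bracketed sum is 329, so unadjusted RC is about 102; in a 1.02 PF park that drops to about 100.<br /><br />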
RG, the Runs Created per Game rate, is RC/O*25.5. I do not believe that outs are the proper denominator for an individual rate stat, but I also do not believe that the distortions caused are that bad. (I still intend to finish my rate stat series and discuss all of the options in excruciating detail, but alas you’ll have to take my word for it now).<br /><br />Several years ago I switched from using my own "Speed Unit" to a version of Bill James' Speed Score; of course, Speed Unit was inspired by Speed Score. I only use four of James' categories in figuring Speed Score. I actually like the construct of Speed Unit better as it was based on z-scores in the various categories (and amazingly a couple other sabermetricians did as well), but trying to keep the estimates of standard deviation for each of the categories appropriate was more trouble than it was worth.<br /><br />Speed Score is the average of four components, which I'll call a, b, c, and d:<br /><br />a = ((SB + 3)/(SB + CS + 7) - .4)*20<br />b = sqrt((SB + CS)/(S + W))*14.3<br />c = ((R - HR)/(H + W - HR) - .1)*25<br />d = T/(AB - HR - K)*450<br /><br />James actually uses a sliding scale for the triples component, but it strikes me as needlessly complex and so I've streamlined it. He looks at two years of data, which makes sense for a gauge that is attempting to capture talent and not performance, but using multiple years of data would be contradictory to the guiding principles behind this set of reports (namely, simplicity. Or laziness. Your pick.) I also changed some of his division to mathematically equivalent multiplications.<br /><br />There are a whopping four categories that compare to a baseline; two for average, two for replacement. Hitting RAA compares to a league average hitter; it is in the vein of Pete Palmer’s Batting Runs. RAA compares to an average hitter at the player’s primary position. 
Hitting RAR compares to a “replacement level” hitter; RAR compares to a replacement level hitter at the player’s primary position. The formulas are:<br /><br />HRAA = (RG - N)*O/25.5<br />RAA = (RG - N*PADJ)*O/25.5<br />HRAR = (RG - .73*N)*O/25.5<br />RAR = (RG - .73*N*PADJ)*O/25.5<br /><br />PADJ is the position adjustment, and it is based on 2002-2011 offensive data. For catchers it is .89; for 1B/DH, 1.17; for 2B, .97; for 3B, 1.03; for SS, .93; for LF/RF, 1.13; and for CF, 1.02. I had been using the 1992-2001 data as a basis for some time, but finally updated for 2012. I’m a little hesitant about this update, as the middle infield positions are the biggest movers (higher positional adjustments, meaning less positional credit). I have no qualms for second base, but the shortstop PADJ is out of line with the other position adjustments widely in use and feels a bit high to me. But there are some decent points to be made in favor of offensive adjustments, and I’ll have a bit more on this topic in general below.<br /><br />That was the mechanics of the calculations; now I'll twist myself into knots trying to justify them. If you only care about the how and not the why, stop reading now. <br /><br />The first thing that should be covered is the philosophical position behind the statistics posted here. They fall on the continuum of ability and value in what I have called "performance". Performance is a technical-sounding way of saying "Whatever arbitrary combination of ability and value I prefer".<br /><br />With respect to park adjustments, I am not interested in how any particular player is affected, so there is no separate adjustment for lefties and righties for instance. The park factor is an attempt to determine how the park affects run scoring rates, and thus the win value of runs.<br /><br />I apply the park factor directly to the player's statistics, but it could also be applied to the league context. 
The advantage to doing it my way is that it allows you to compare the component statistics (like Runs Created or OBA) on a park-adjusted basis. The drawback is that it creates a new theoretical universe, one in which all parks are equal, rather than leaving the player grounded in the actual context in which he played and evaluating how that context (and not the player's statistics) was altered by the park.<br /><br />The good news is that the two approaches are essentially equivalent; in fact, they are precisely equivalent if you assume that the Runs Per Win factor is equal to the RPG. Suppose that we have a player in an extreme park (PF = 1.15, approximately like Coors Field pre-humidor) who has an 8 RG before adjusting for park, while making 350 outs in a 4.5 N league. The first method of park adjustment, the one I use, converts his value into a neutral park, so his RG is now 8/1.15 = 6.957. We can now compare him directly to the league average:<br /><br />RAA = (6.957 - 4.5)*350/25.5 = +33.72<br /><br />The second method would be to adjust the league context. If N = 4.5, then the average player in this park will create 4.5*1.15 = 5.175 runs. Now, to figure RAA, we can use the unadjusted RG of 8:<br /><br />RAA = (8 - 5.175)*350/25.5 = +38.77<br /><br />These are not the same, as you can obviously see. The reason for this is that they take place in two different contexts. The first figure is in a 9 RPG (2*4.5) context; the second figure is in a 10.35 RPG (2*4.5*1.15) context. Runs have different values in different contexts; that is why we have RPW converters in the first place. If we convert to WAA (using RPW = RPG, which is only an approximation, so it's usually not as tidy as it appears below), then we have:<br /><br />WAA = 33.72/9 = +3.75<br />WAA = 38.77/10.35 = +3.75<br /><br />Once you convert to wins, the two approaches are equivalent. 
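<br /><br />For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two park adjustment approaches using the numbers from the example above. The function and variable names are my own for illustration, and it leans on the same RPW = RPG approximation.<br /><br />

```python
OUTS_PER_GAME = 25.5


def raa(rg, baseline_rg, outs):
    """Runs above a baseline RG over the given number of outs."""
    return (rg - baseline_rg) * outs / OUTS_PER_GAME


pf = 1.15        # park factor
rg_raw = 8.0     # RG before any park adjustment
outs = 350
n = 4.5          # league average runs per game

# Approach 1: adjust the player's RG to a neutral park
raa_adjust_player = raa(rg_raw / pf, n, outs)

# Approach 2: leave the player alone and inflate the league context
raa_adjust_league = raa(rg_raw, n * pf, outs)

# Convert to wins assuming RPW equals RPG in each context
waa_adjust_player = raa_adjust_player / (2 * n)
waa_adjust_league = raa_adjust_league / (2 * n * pf)

print(round(raa_adjust_player, 2), round(raa_adjust_league, 2))
print(round(waa_adjust_player, 2), round(waa_adjust_league, 2))
```

The run figures differ, but under the RPW = RPG assumption the win figures match exactly, since the second calculation is just the first with both the numerator and the denominator multiplied by PF.<br /><br />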
The other nice thing about the first approach is that once you park-adjust, everyone in the league is in the same context, and you can dispense with the need for converting to wins at all. You still might want to convert to wins, and you'll need to do so if you are comparing the 2015 players to players from other league-seasons (including between the AL and NL in the same year), but if you are only looking to compare Christian Yelich to Matt Carpenter, it's not necessary. WAR is somewhat ubiquitous now, but personally I prefer runs when possible--why mess with decimal points if you don't have to? <br /><br />The park factors used to adjust player stats here are run-based. Thus, they make no effort to project what a player "would have done" in a neutral park, or account for the different effects parks have on specific events (walks, home runs, BA) or types of players. They simply account for the difference in run environment that is caused by the park (as best I can measure it). As such, they don't evaluate a player within the actual run context of his team's games; they attempt to restate the player's performance as an equivalent performance in a neutral park.<br /><br />I suppose I should also justify the use of sqrt(PF) for adjusting component statistics. The classic defense given for this approach relies on basic Runs Created--runs are proportional to OBA*SLG, and OBA*SLG/PF = OBA/sqrt(PF)*SLG/sqrt(PF). While RC may be an antiquated tool, you will find that the square root adjustment is fairly compatible with linear weights or Base Runs as well. I am not going to take the space to demonstrate this claim here, but I will some time in the future. <br /><br />Many value figures published around the sabersphere adjust for the difference in quality level between the AL and NL. I don't, but this is a thorny area where there is no right or wrong answer as far as I'm concerned. 
I also do not make an adjustment in the league averages for the fact that the overall NL averages include pitcher batting and the AL does not (not quite true in the era of interleague play, but you get my drift). <br /><br />The difference between the leagues may not be precisely calculable, and it certainly is not constant, but it is real. If the average player in the AL is better than the average player in the NL, it is perfectly reasonable to expect the average AL player to have more RAR than the average NL player, and that will not happen without some type of adjustment. On the other hand, if you are only interested in evaluating a player relative to his own league, such an adjustment is not necessarily welcome.<br /><br />The league argument only applies cleanly to metrics baselined to average. Since replacement level compares the given player to a theoretical player that can be acquired on the cheap, the same pool of potential replacement players should by definition be available to the teams of each league. One could argue that if the two leagues don't have equal talent at the major league level, they might not have equal access to replacement level talent--except such an argument is at odds with the notion that replacement level represents talent that is truly "freely available".<br /><br />So it's hard to justify the approach I take, which is to set replacement level relative to the average runs scored in each league, with no adjustment for the difference in the leagues. The best justification is that it's simple and it treats each league as its own universe, even if in reality they are connected.<br /><br />The replacement levels I have used here are very much in line with the values used by other sabermetricians. 
This is based on my own "research", my interpretation of other people's research, and a desire to not stray from consensus and make the values unhelpful to the majority of people who may encounter them.<br /><br />Replacement level is certainly not settled science. There is always going to be room to disagree on what the baseline should be. Even if you agree it should be "replacement level", any estimate of where it should be set is just that--an estimate. Average is clean and fairly straightforward, even if its utility is questionable; replacement level is inherently messy. So I offer the average baseline as well.<br /><br />For position players, replacement level is set at 73% of the positional average RG (since there's a history of discussing replacement level in terms of winning percentages, this is roughly equivalent to .350). For starting pitchers, it is set at 128% of the league average RA (.380), and for relievers it is set at 111% (.450). <br /><br />I am still using an analytical structure that makes the comparison to replacement level for a position player by applying it to his hitting statistics. This is the approach taken by Keith Woolner in VORP (and some other earlier replacement level implementations), but the newer metrics (among them Rally and Fangraphs' WAR) handle replacement level by subtracting a set number of runs from the player's total runs above average in a number of different areas (batting, fielding, baserunning, positional value, etc.), which for lack of a better term I will call the subtraction approach.<br /><br />The offensive positional adjustment makes the inherent assumption that the average player at each position is equally valuable. I think that this is close to being true, but it is not quite true. The ideal approach would be to use a defensive positional adjustment, since the real difference between a first baseman and a shortstop is their defensive value. 
When you bat, all runs count the same, whether you create them as a first baseman or as a shortstop. <br /><br />That being said, using "replacement hitter at position" does not cause too many distortions. It is not theoretically correct, but it is practically powerful. For one thing, most players, even those at key defensive positions, are chosen first and foremost for their offense. Empirical research by Keith Woolner has shown that the replacement level hitting performance is about the same for every position, relative to the positional average.<br /><br />Figuring what the defensive positional adjustment should be, though, is easier said than done. Therefore, I use the offensive positional adjustment. So if you want to criticize that choice, or criticize the numbers that result, be my guest. But do not claim that I am holding this up as the correct analytical structure. I am holding it up as the most simple and straightforward structure that conforms to reality reasonably well, and because while the numbers may be flawed, they are at least based on an objective formula that I can figure myself. If you feel comfortable with some other assumptions, please feel free to ignore mine.<br /><br />That still does not justify the use of HRAR--hitting runs above replacement--which compares each hitter, regardless of position, to 73% of the league average. Basically, this is just a way to give an overall measure of offensive production without regard for position with a low baseline. It doesn't have any real baseball meaning. <br /><br />A player who creates runs at 90% of the league average could be above-average (if he's a shortstop or catcher, or a great fielder at a less important fielding position), or sub-replacement level (DHs that create 3.5 runs per game are not valuable properties). Every player is chosen because his total value, both hitting and fielding, is sufficient to justify his inclusion on the team. 
HRAR fails even if you try to justify it with a thought experiment about a world in which defense doesn't matter, because in that case the absolute replacement level (in terms of RG, without accounting for the league average) would be much higher than it is currently. <br /><br />The specific positional adjustments I use are based on 2002-2011 data. I stick with them because I have not seen compelling evidence of a change in the degree of difficulty or scarcity between the positions between now and then, and because I think they are fairly reasonable. The positions for which they diverge the most from the defensive position adjustments in common use are 2B, 3B, and CF. Second base is considered a premium position by the offensive PADJ (.97), while third base and center field have similar adjustments in the opposite direction (1.03 and 1.02).<br /><br />Another flaw is that the PADJ is applied to the overall league average RG, which is artificially low for the NL because of pitchers' batting. When using the actual league average runs/game, it's tough to just remove pitchers--any adjustment would be an estimate. If you use the league total of runs created instead, it is a much easier fix.<br /><br />One other note on this topic is that since the offensive PADJ is a stand-in for average defensive value by position, ideally it would be applied by tying it to defensive playing time. I have done it by outs, though.<br /><br />The reason I have taken this flawed path is because 1) it ties the position adjustment directly into the RAR formula rather than leaving it as something to subtract on the outside and more importantly 2) there’s no straightforward way to do it. The best would be to use defensive innings--set the full-time player to X defensive innings, figure how Derek Jeter’s innings compared to X, and adjust his PADJ accordingly. Games in the field or games played are dicey because they can cause distortion for defensive replacements. 
Plate Appearances avoid the problem that outs have of being highly related to player quality, but they still carry the illogic of basing it on offensive playing time. And of course the differences here are going to be fairly small (a few runs). That is not to say that this way is preferable, but it’s not horrible either, at least as far as I can tell.<br /><br />To compare this approach to the subtraction approach, start by assuming that a replacement level shortstop would create .86*.73*4.5 = 2.825 RG (or would perform at an overall level of equivalent value to being an average fielder at shortstop while creating 2.825 runs per game). Suppose that we are comparing two shortstops, each of whom compiled 600 PA and played an equal number of defensive games and innings (and thus would have the same positional adjustment using the subtraction approach). Alpha made 380 outs and Bravo made 410 outs, and each ranked as dead-on average in the field.<br /><br />The difference in overall RAR between the two using the subtraction approach would be equal to the difference between their offensive RAA compared to the league average. Assuming the league average is 4.5 runs, and that both Alpha and Bravo created 75 runs, their offensive RAAs are:<br /><br />Alpha = (75*25.5/380 - 4.5)*380/25.5 = +7.94<br /><br />Similarly, Bravo is at +2.65, and so the difference between them will be 5.29 RAR.<br /><br />Using the flawed approach, Alpha's RAR will be:<br /><br />(75*25.5/380 - 4.5*.73*.86)*380/25.5 = +32.90<br /><br />Bravo's RAR will be +29.58, a difference of 3.32 RAR, which is two runs off of the difference using the subtraction approach.<br /><br />The downside to using PA is that you really need to consider park effects if you do, whereas outs allow you to sidestep park effects. Outs are constant; plate appearances are linked to OBA. Thus, they not only depend on the offensive context (including park factor), but also on the quality of one's team. 
Of course, attempting to adjust for team PA differences opens a huge can of worms which is not really relevant; for now, the point is that using outs for individual players causes distortions, sometimes trivial and sometimes bothersome, but almost always makes one's life easier.<br /><br />I do not include fielding (or baserunning outside of steals, although that is a trivial consideration in comparison) in the RAR figures--they cover offense and positional value only. This in no way means that I do not believe that fielding is an important consideration in player evaluation. However, two of the key principles of these stat reports are 1) not incorporating any data that is not readily available and 2) not simply including other people's results (of course I borrow heavily from other people's methods, but only adapting methodology that I can apply myself).<br /><br />Any fielding metric worth its salt will fail to meet either criterion--they use zone data or play-by-play data which I do not have easy access to. I do not have a fielding metric that I have stapled together myself, and so I would have to simply lift other analysts' figures. <br /><br />Setting the practical reason for not including fielding aside, I do have some reservations about lumping fielding and hitting value together in one number because of the obvious differences in reliability between offensive and fielding metrics. In theory, they absolutely should be put together. But in practice, I believe it would be better to regress the fielding metric to a point at which it would be roughly equivalent in reliability to the offensive metric.<br /><br />Offensive metrics have error bars associated with them, too, of course, and in evaluating a single season's value, I don't care about the vagaries that we often lump together as "luck". 
Still, there are errors in our assessment of linear weight values and players that collect an unusual proportion of infield hits or hits to the left side, errors in estimation of park factor, and any number of other factors that make their events more or less valuable than an average event of that type. <br /><br />Fielding metrics offer up all of that and more, as we cannot be nearly as certain of true successes and failures as we are when analyzing offense. Recent investigations, particularly by Colin Wyers, have raised even more questions about the level of uncertainty. So, even if I was including a fielding value, my approach would be to assume that the offensive value was 100% reliable (which it isn't), and regress the fielding metric relative to that (so if the offensive metric was actually 70% reliable, and the fielding metric 40% reliable, I'd treat the fielding metric as .4/.7 = 57% reliable when tacking it on, to illustrate with a simplified and completely made up example presuming that one could have a precise estimate of nebulous "reliability").<br /><br />Given the inherent assumption of the offensive PADJ that all positions are equally valuable, once RAR has been figured for a player, fielding value can be accounted for by adding on his runs above average relative to a player at his own position. If there is a shortstop that is -2 runs defensively versus an average shortstop, he is without a doubt a plus defensive player, and a more valuable defensive player than a first baseman who was +1 run better than an average first baseman. Regardless, since it was implicitly assumed that they are both average defensively for their position when RAR was calculated, the shortstop will see his value docked two runs. This DOES NOT MEAN that the shortstop has been penalized for his defense. 
The whole process of accounting for positional differences, going from hitting RAR to positional RAR, has benefited him.<br /><br />I've found that there is often confusion about the treatment of first basemen and designated hitters in my PADJ methodology, since I consider DHs as in the same pool as first basemen. The fact of the matter is that first basemen outhit DHs. There are any number of potential explanations for this; DHs are often old or injured, players hit worse when DHing than they do when playing the field, etc. This actually helps first basemen, since the DHs drag the average production of the pool down, thus resulting in a lower replacement level than I would get if I considered first basemen alone.<br /><br />However, this method does assume that a 1B and a DH have equal defensive value. Obviously, a DH has no defensive value. What I advocate to correct this is to treat a DH as a bad defensive first baseman, and thus knock another five or so runs off of his RAR for a full-time player. I do not incorporate this into the published numbers, but you should keep it in mind. However, there is no need to adjust the figures for first basemen upwards--the only necessary adjustment is to take the DHs down a notch. <br /><br />Finally, I consider each player at his primary defensive position (defined as where he appears in the most games), and do not weight the PADJ by playing time. This does shortchange a player like Ben Zobrist (who saw significant time at a tougher position than his primary position), and unduly boost a player like Buster Posey (who logged a lot of games at a much easier position than his primary position). For most players, though, it doesn't matter much. 
I find it preferable to make manual adjustments for the unusual cases rather than add another layer of complexity to the whole endeavor.<br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vTaQg3nAA_tGoNdBKcsrZSFHdA93ufeoJqYbW02bJqYJf71LW-9ZCzVzMOUe15hq09tkaQWoJ2pPQVE/pub?output=html">2018 League</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vSZatkqGzxneYwjOQTGWj4Kcb3pRz3GcogFZz1VRAGWQenQ16MM32ecxG8KjZAQDMRs5lrukKLAWW49/pub?output=html">2018 Park Factors</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vRMiROBj_Ahf3ZuiU-3OZgeA89isrv_hixWMQeNil4p5QU731-ZZitmgwAUOELGpFmh5VboR5LynbOS/pub?output=html">2018 Team</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vTt0LeIrM1bIvPY1CGbSHS2dILWea3ykmfPw42FGi06cizDFT2Pz-5TWLcszmBN9qLbBEiJMVRChCup/pub?output=html">2018 Team Defense</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vQzJ7glb-W_tig3v3G39NejcqwIAeU3qnajGXKY_gzDaRaSOARrW4Zt6nm0N36IZL0EgYjLezIw4TEk/pub?output=html">2018 Team Offense</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vR6nEhkq1Tx_BoSK2dMSK76AcqOBrR7mhPDJePH68-OzBPWsPKOVg44P9P29rHeryq8Fjvfq8dr0kTl/pub?output=html">2018 AL Relievers</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vTd6v7TylIIKhtTwm8qeoBWY1KyQ5C3WCRSDG2ph9D00YnT5n-7o-djwr9mFBiYo8yeTCQl_31907b7/pub?output=html">2018 NL Relievers</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vQo2p8b-H-cb69VPuVOyBMExPOSonRAUEwFqJpUbdSCAgENBrRL-671yF2voOIAIWxoXIGlkQNzis4m/pub?output=html">2018 AL Starters</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vTbJse1wdTr-FjoYwY6i9pVRWfVKNplWoQUcllTcJqyHogsPdJcCz0Y1Fl7hxcJZyGc9DlCSct8yOhe/pub?output=html">2018 NL Starters</a><br /><br /><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vRvO1qrhfeBsPhX4nXI7ia533ht1fwAdxzPW28GhMdSAZG63Ehp-5og5ws-YPTKQKTIr3-lJqPRY4R3/pub?output=html">2018 AL Hitters</a><br /><br 
/><a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vQ-kbPs35IY6dVt7KY2N4HvhsNaGOiLfRKioIf2qVGYHa2NEJ5kn2ASXTL3iwpfDwytYie_ax6FVcmq/pub?output=html">2018 NL Hitters</a>phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-86412208128464157742018-10-01T20:26:00.000-04:002018-10-01T20:26:34.045-04:00Crude Playoff Odds -- 2018These are very simple playoff odds, based on my crude rating system for teams using an equal mix of W%, EW% (based on R/RA), PW% (based on RC/RCA), and 69 games of .500. They account for home field advantage by assuming a .500 team wins 54.2% of home games (major league average 2006-2015). They assume that a team's inherent strength is constant from game-to-game. They do not generally account for any number of factors that you would actually want to account for if you were serious about this, including but not limited to injuries, the current construction of the team rather than the aggregate seasonal performance, pitching rotations, estimated true talent of the players, etc.<br /><br />The CTRs that are fed in are:<br /><br /><a href="https://2.bp.blogspot.com/-8XABHpmben8/W7K5EKfXfsI/AAAAAAAACkk/BlvIXsWVFr83z6ticTaE27xr_Va9JJoAwCLcBGAs/s1600/podd18a.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-8XABHpmben8/W7K5EKfXfsI/AAAAAAAACkk/BlvIXsWVFr83z6ticTaE27xr_Va9JJoAwCLcBGAs/s400/podd18a.JPG" width="400" height="336" data-original-width="238" data-original-height="200" /></a><br /><br />Wild card game odds (the least useful since the pitching matchups aren’t taken into account, and that matters most when there is just one game):<br /><br /><a href="https://2.bp.blogspot.com/-8_D38aKp8gs/W7K5RXn1ksI/AAAAAAAACko/qT09Bx8_Hgk5AGY_XWfXF_fDgucqlzllQCLcBGAs/s1600/podd18b.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" 
src="https://2.bp.blogspot.com/-8_D38aKp8gs/W7K5RXn1ksI/AAAAAAAACko/qT09Bx8_Hgk5AGY_XWfXF_fDgucqlzllQCLcBGAs/s400/podd18b.JPG" width="400" height="79" data-original-width="278" data-original-height="55" /></a><br /><br />LDS:<br /><br /><a href="https://3.bp.blogspot.com/-I4JzHlkQFx4/W7K5sUwyBtI/AAAAAAAACkw/ZWgyPzwPaE8IVkgRbmvKFwNgFA7_AO-KQCLcBGAs/s1600/podd18c.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-I4JzHlkQFx4/W7K5sUwyBtI/AAAAAAAACkw/ZWgyPzwPaE8IVkgRbmvKFwNgFA7_AO-KQCLcBGAs/s400/podd18c.JPG" width="400" height="93" data-original-width="548" data-original-height="127" /></a><br /><br />LCS:<br /><br /><a href="https://1.bp.blogspot.com/-b_zvnknJ3Gw/W7K58ngw77I/AAAAAAAACk4/-kK6N8G_C88DeafubQeLZL2oQ8M36aq7QCLcBGAs/s1600/podd18d.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-b_zvnknJ3Gw/W7K58ngw77I/AAAAAAAACk4/-kK6N8G_C88DeafubQeLZL2oQ8M36aq7QCLcBGAs/s400/podd18d.JPG" width="400" height="172" data-original-width="548" data-original-height="236" /></a><br /><br />WS:<br /><br /><a href="https://4.bp.blogspot.com/-r_iSHBViTHc/W7K6OuF3xCI/AAAAAAAAClA/pVlPKpg_WxUIz7DZ9K3-jzLpI0E5oYzAwCLcBGAs/s1600/podd18e.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-r_iSHBViTHc/W7K6OuF3xCI/AAAAAAAAClA/pVlPKpg_WxUIz7DZ9K3-jzLpI0E5oYzAwCLcBGAs/s400/podd18e.JPG" width="400" height="342" data-original-width="549" data-original-height="470" /></a><br /><br />Because I set this spreadsheet up when home field advantage went to a particular league (as it has been for the entire history of the World Series prior to this year), all of the AL teams are listed as the home team. But the probabilities all consider which team would actually have the home field advantage in each matchup. 
Incidentally, the first tiebreaker after overall record is intra-divisional record, which if anything should favor the team with the worse record but would amusingly give Cleveland home field advantage in a series against Los Angeles or Colorado.<br /><br />Putting it all together:<br /><br /><a href="https://4.bp.blogspot.com/-g04joo3gwxw/W7K6ocElKVI/AAAAAAAAClI/OD9owVkWh8Ms4kdgB2Emv8W-wOQGOxQtQCLcBGAs/s1600/podd18f.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-g04joo3gwxw/W7K6ocElKVI/AAAAAAAAClI/OD9owVkWh8Ms4kdgB2Emv8W-wOQGOxQtQCLcBGAs/s400/podd18f.JPG" width="400" height="206" data-original-width="393" data-original-height="202" /></a>phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-25837257236214134972018-09-26T08:58:00.000-04:002018-09-26T08:58:14.267-04:00Enby Distribution, pt. 8: Cigol at the Extremes--Pythagorean ExponentAmong the possible choices, the Pythagorean family of W% estimators is by far the dominant species in the win estimator genus. While I’m sure that anyone reading this is aware, just to be sure, the Pythagorean family takes the form:<br /><br />W% = R^x/(R^x + RA^x)<br /><br />While as a co-purveyor of one of the variants I am not exactly unbiased in this matter, here are a few reasons as to why this family dominates sabermetric usage:<br /><br />1. The Bill James effect--The number one reason why Pythagorean estimators are widely used is because of Bill James. Had James used a RPW method as Pete Palmer did, I would still be writing soapbox-y blog posts about why some other form made more sense (as I still do from time to time on the matter of run estimation, in which Palmer’s form has finally won the day over James). 
Had James not used Pythagorean, it is possible that whatever non-linear win estimator was widely used in sabermetrics (and one doubtlessly would have been developed) would take on a different form than Pythagorean.<br /><br />2. Naturally bounded at zero and one--Winning percentage is by its nature bounded at zero and one. The Pythagorean form inherently captures this reality. Had the trail in this area been blazed by statisticians rather than James, we might have gotten a logit or probit regression equation that did the same, just to name a couple of common functions that also are bounded by zero and one. In order to have a theoretically justifiable formula, the bounds must be observed, and Pythagorean is a fairly straightforward way to do it.<br /><br />3. Non-linearity reflects reality--It can be demonstrated even with “extreme” but actual major league teams that the run-to-wins relationship is non-linear. Pythagorean may not capture this perfectly, but it seems right to account for it in some way. This is one reason why people still cling to Runs Created after it was shown to be inaccurate (particularly before Base Runs, which fills the void, had been popularized)--people inherently realize that run scoring is a non-linear process, and are more comfortable with a method that recognizes that, even if it captures the effect in a very flawed manner.<br /><br />James’ original versions used fixed exponents (x = 2, refined to x = 1.83) but the <a href="https://www.baseballprospectus.com/news/article/342/revisiting-the-pythagorean-theorem-putting-bill-james-pythagorean-theorem-to-the-test/">breakthrough research</a> on factoring scoring context into the equation was performed by Clay Davenport and Keith Woolner at <u>Baseball Prospectus</u>, who found that an exponent x = 1.5*log(RPG) + .45 worked well when RPG was greater than 4. This variant is known as Pythagenport. 
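<br /><br />To make the mechanics concrete, here is a minimal Python sketch of the Pythagorean form with a Pythagenport exponent. The function names and the sample team (5 runs scored and 4 allowed per game) are hypothetical, and the log is taken as base-10.<br /><br />

```python
import math


def pythagenport_exponent(rpg):
    """Exponent x = 1.5*log10(RPG) + .45, intended for RPG greater than 4."""
    return 1.5 * math.log10(rpg) + 0.45


def pythagorean_wpct(r, ra, x):
    """Generic Pythagorean estimate: R^x / (R^x + RA^x)."""
    return r ** x / (r ** x + ra ** x)


# Hypothetical team, per-game figures
r_per_game, ra_per_game = 5.0, 4.0

x = pythagenport_exponent(r_per_game + ra_per_game)
wpct = pythagorean_wpct(r_per_game, ra_per_game, x)

print(round(x, 3), round(wpct, 3))
```

Whatever exponent is plugged in, a team that scores as many runs as it allows comes out at exactly .500, and the estimate can never leave the zero-to-one range.<br /><br />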
A couple years later, David Smyth realized that the minimum possible RPG was one, since a game will continue indefinitely until one side wins (which requires scoring one run), and that if a team had a RPG of one, their exponent would have to be equal to one. Based on this insight, Smyth and I were able to both independently find a form that returned an estimate of x = 1 at RPG = 1 and also estimates similar to Pythagenport for normal teams. This form has become known as Pythagenpat.<br /><br />Let’s begin by trying to find an equation to estimate the exponent based on RPG from our Cigol estimates. In order to do this, we first need to be able to solve for the exponent x from W% and Run Ratio:<br /><br />W/L = (R/RA)^x is a restatement of the generic Pythagorean equation W% = R^x/(R^x + RA^x)<br /><br />thus x = log(W/L)/log(R/RA)<br /><br />which when working with W% can be expressed equivalently as:<br /><br />x = log(W%/(1 - W%))/log(R/RA)<br /><br />We can now attempt to fit a regression line to predict x from RPG based on the Cigol estimates. For illustration, I’ll start with the full data discussed last time rather than what I’ll call the limited set (the limited set is limited to normal-ish major league teams--W%s between .3 and .7 with R/G and RA/G between 3 and 7):<br /><br /><a href="https://4.bp.blogspot.com/-RnVYwwjqZGQ/W6uB728VWFI/AAAAAAAACkA/DARCDyMHRrQ6uWUsCSg_oeXUv4bjRrjpACLcBGAs/s1600/cigol8a.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-RnVYwwjqZGQ/W6uB728VWFI/AAAAAAAACkA/DARCDyMHRrQ6uWUsCSg_oeXUv4bjRrjpACLcBGAs/s400/cigol8a.JPG" width="400" height="269" data-original-width="1219" data-original-height="819" /></a><br /><br />This graph is not very helpful, but one can see the general shape, which can be approximated by a logarithmic curve as noted by Davenport and Woolner. 
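<br /><br />The step of backing the exponent out of W% and run ratio can be sketched in a few lines of Python. This is only an illustration under my own naming; the round-trip inputs are hypothetical rather than Cigol output, and because x is a ratio of two logs, the base of the logarithm does not matter.

```python
import math

def implied_exponent(wpct, r, ra):
    """Solve W/L = (R/RA)^x for x; requires R != RA and 0 < W% < 1."""
    return math.log(wpct / (1 - wpct)) / math.log(r / ra)

def pythagorean_wpct(r, ra, x):
    """Generic Pythagorean estimate: W% = R^x / (R^x + RA^x)."""
    return r ** x / (r ** x + ra ** x)

# Round trip on hypothetical inputs: build a W% from a known exponent,
# then recover that exponent from the W% and the run ratio.
w = pythagorean_wpct(5.0, 4.0, 1.83)
print(round(implied_exponent(w, 5.0, 4.0), 3))  # recovers 1.83
```

Applying the same calculation to each Cigol estimate is what produces the x values plotted against RPG in the graph above.<br /><br />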
I’ve gone ahead and included the logarithmic regression line per Excel, but you’ll note that it uses natural log rather than base-10 log as in Pythagenport. Running a regression on log(RPG) results in this equation:<br /><br />x = 1.346*log(RPG) + .596<br /><br />That is a relatively decent match for Pythagenport--the two equations produce essentially the same exponent at normal RPGs (for example, for 9 RPG the Pythagenport exponent is 1.881 and the regression exponent is 1.880). At lower RPGs, the higher intercept in the regression equation allows the estimate to be closer to one at the known point, but it still falls well short of matching the known point value of one.<br /><br />Just to be complete, we can also look at how this relationship plays out in the limited set:<br /><br /><a href="https://2.bp.blogspot.com/-mCAyKyzC2RA/W6uCCa1jjHI/AAAAAAAACkE/w5Pe0QQHz10WP9opFeOiBYTXq9d9s1PYQCLcBGAs/s1600/cigol8b.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-mCAyKyzC2RA/W6uCCa1jjHI/AAAAAAAACkE/w5Pe0QQHz10WP9opFeOiBYTXq9d9s1PYQCLcBGAs/s400/cigol8b.JPG" width="400" height="265" data-original-width="1235" data-original-height="817" /></a><br /><br />Here, the base-10 equation is:<br /><br />x = 1.324*log(RPG) + .580<br /><br />One thing that is interesting to note is that in the last installment, when we focused on estimating Runs Per Win, the regression equations were quite different depending on which dataset was being used. Here, whether looking at the full scope of teams or the limited set, the regression equations are quite close. This implies that the manner in which we are expressing W% (Pythagorean) is closer to capturing the real relationship between scoring context and W% than is the RPW model. If there existed a perfect model, it would have the same equation regardless of which data was used to calibrate it. 
While Pythagorean is not a perfect model, it exhibits a consistency that the run differential-based model does not.<br /><br />As the graphs illustrate, the relationship between RPG and x appears to follow a logarithmic curve, and so it is quite understandable that Davenport and Woolner chose this form. However, Smyth and I both found that a power regression also provided a nice fit for the curve (for example, this is the result for the all teams Cigol estimate):<br /><br /><a href="https://4.bp.blogspot.com/-ApMpK5kvFt0/W6uCLKNf3iI/AAAAAAAACkI/5REMCkm8qWAbCjbIHPlytfd4k2UlWWOVgCLcBGAs/s1600/cigol8c.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-ApMpK5kvFt0/W6uCLKNf3iI/AAAAAAAACkI/5REMCkm8qWAbCjbIHPlytfd4k2UlWWOVgCLcBGAs/s400/cigol8c.JPG" width="400" height="263" data-original-width="1217" data-original-height="799" /></a><br /><br />The power estimate does an excellent job of matching the Cigol-implied exponent at very low levels of RPG. Mathematically, it works well for this task since one raised to any power is equal to one. Since the logarithm of one is zero, the logarithmic form would only be able to match reality at 1 RPG by setting the intercept equal to one, which would distort results at higher RPG values. <br /><br />As RPG grows large, though, the power model begins overestimating the exponent, while the logarithmic model provides a tighter fit. From a practical standpoint, performance at low levels of RPG is much more important than performance at high levels of RPG, since extremely low RPGs are much more likely to exist in the majors. 
As a stickler for theoretical accuracy, though, it is a bit troubling to see that the power regression and Cigol are not a great match at the right tail.<br /><br />If we restrict the sample to the limited set, we find:<br /><br /><a href="https://3.bp.blogspot.com/-AWXZNbnK_pM/W6uCQ46k3YI/AAAAAAAACkQ/CDO61x-viZIJcf_A0teCIyqMIjQAxcOXgCLcBGAs/s1600/cigol8d.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-AWXZNbnK_pM/W6uCQ46k3YI/AAAAAAAACkQ/CDO61x-viZIJcf_A0teCIyqMIjQAxcOXgCLcBGAs/s400/cigol8d.JPG" width="400" height="262" data-original-width="1219" data-original-height="798" /></a><br /><br />Here the power model also provides a decent fit, although it appears to be overfitted to moderate RPG levels more so than the version based on the full dataset. <br /><br />It should be noted that the regression includes a multiplicative coefficient (.979 for the full dataset) which serves to dampen the effect of the exponent. However, any multiplier other than one will result in a non-one result at one RPG, which means that Smyth’s fundamental insight that led to Pythagenpat is lost. I believe that when I originally came up with Pythagenpat, I simply ignored the multiplicative coefficient from the regression and made no offsetting adjustment. <br /><br />While neither approach is precise mathematically, another crude option is to modify the exponent to force the x estimate to match the with-coefficient equation at a certain RPG. At the normal 9 RPG level, the full dataset equation above suggests a Pythagorean exponent of 1.863. 
With a multiplier of 1, you would need the following equation to match that:<br /><br />x = RPG^.2831<br /><br />Such a result fits comfortably within our expectation for Pythagenpat.phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-19960120111874475812018-07-21T15:37:00.001-04:002018-07-21T15:37:48.468-04:00A Mildly Pleasant SurpriseGiven the low expectations that this author held for the 2018 Buckeyes, the season that was can be seen as quite successful. Rebounding from a dreadful 2017, OSU went 36-23 and 14-10 in the Big Ten. Although finishing in the exact middle of the conference (seventh), strength of schedule and a solid non-conference showing that included wins over Southern Miss and Coastal Carolina earned Ohio a second NCAA tournament bid under eighth-year coach Greg Beals. In the tournament, OSU’s bullpen buckled in the opener against South Carolina (8-3 loss), and a mid-game delay bifurcated the 4-3 thirteen inning loss to UNC-Wilmington that ended the season.<br /><br />While the season itself was a success, the 2019 roster would appear to have a number of question marks, and in this corner, while rooting against the team is never an option, further job security for Beals is problematic.<br /><br />Coming into the season it appeared that starting pitching would be a major issue after a disastrous 2017. It still was; the team succeeded despite, not because, of its starting pitching. Junior Connor Curlis was the only reliable starter, turning in 8 RAA while tying for the team lead with 17 appearances and 16 starts. Curlis was drafted by Cincinnati in the 24th round. Tying Curlis in appearances/starts and pitching 92 innings to lead him by three was classmate Ryan Feltner. Despite possessing good enough stuff to be a fourth round pick of Colorado, Feltner was -13 RAA with a 6.06 eRA that suggested he really pitched that poorly. 
Feltner’s ERA was held down as he allowed a whopping 21 unearned runs that drove his RA just over two runs higher than his ERA. (As an aside, OSU did not field well at all, last in the Big Ten with a .924 mFA and .647 DER). The other weekend starter, senior Adam Niemeyer, fared even worse at -15 RAA.<br /><br />The bullpen was the saving grace, led by senior Seth Kinker who capped his career as one of the finest relievers in school history by walking just five batters in 63 innings, fanning 60, and leading the team with 13 RAA. The only real blemish on Kinker’s season was that he was unable to hold the lead in the NCAA opener against the Gamecocks. OSU also got solid work from senior Austin Woodby (9 RAA in 45 innings) and sophomore Jake Vance (3 RAA in 36 innings). Senior Kyle Michalik was slightly below average but still reliable, while classmate and erstwhile closer Yianni Pavlopulous somehow matched him at -2 RAA despite a ghastly 28/33 K/W ratio over 36 innings. His career came to an unfairly ignominious end when he was walked off by UNCW. Beals always relies on lefty specialists but sophomore Andrew Magno was injured early and freshman Griffin Smith (-7 RAA in 32 innings over 25 appearances) wasn’t ready for prime time. Junior Thomas Waning, who showed promise in 2017 as Michalik’s heir apparent as a sidearming middle reliever, was rocked for 18 runs in 16 innings.<br /><br />It was offense that drove OSU’s success, as the Bucks averaged 6.5 runs per game (good for second in the conference). Junior Jacob Barnwell was again productive enough (-3 RAA) given his solid catch/throw game, and Colorado concurred, plucking him in the 22nd round of the draft. Freshman Dillon Dingler started the year as his backup before eventually becoming the starting center fielder (that’s nothing, as prior backstop Jalen Washington went to shortstop between 2016 - 2017); his .244/.325/.369 line definitely understates the future that the coaching staff sees for him. 
Junior Andrew Fishel got only 39 PA and slugged just .294, leaving backstop as a huge question mark for 2019.<br /><br />Dingler wasn’t the only Buckeye who played at multiple positions, as the defensive alignment was in flux for much of the season. After getting hurt in his first season, senior JUCO transfer Noah McGowan was a monster, mashing .351/.433/.561 for 25 RAA, one of the best offensive outbursts by a Buckeye in recent years. McGowan played primarily first, third, and DH, where classmate Bo Coolen did not have as impressive of a second year bounce with an ISO of just .086 en route to -3 RAA. Junior JUCO transfer Kobie Foppe started at short but eventually moved to second coinciding with turning around his season at the plate. Foppe filled the leadoff role perfectly with a .335/.432/.385 line that produced 11 RAA. He took the spot lost by junior Brady Cherry, who failed to build on a promising sophomore season (.260/.336/.410 in 2017 to .226/.321/.365 in 2018). <br /><br />Sophomore Connor Pohl started at third but eventually swapped corners with McGowan; his production was underwhelming for the latter role (.279/.377/.393 for 3 RAA) but still quite playable. Foppe’s replacement at short was sophomore Noah West, who improved on his 2017 offensive showing by taking walks but still has much room for improvement in other areas (.223/.353/.292). Junior Nate Romans was good in his utility role (.236/.360/.431 over 91 PA).<br /><br />Senior Tyler Cowles also followed the Noah McGowan career path (although Cowles really struggled in 2017 as opposed to being derailed by injuries); Cowles was second on the team with 13 RAA from a .322/.381/.582 line. The aforementioned Dingler took centerfield after JUCO transfer Malik Jones struggled mightily outside of patience (.245/.383/.286 in 63 PA). 
Sophomore Dominic Canzone took a slight step back but was still excellent (.323/.396/.447 for 11 RAA) and will be counted on to anchor the 2019 attack.<br /><br />It’s too early to draw many conclusions about the outlook for 2019, especially given Beals’ penchant for supplementing his roster through the JUCO ranks. But it is striking to note how the entire already mediocre starting rotation and most of the high-performing relievers are gone, along with the starting catcher and two of the top three offensive performers at the corners. As has usually been the case through his tenure, Beals will look to his modest past successes to ward off the heat that can result from a roster short on homegrown replacements. At a school where the demand for winning can sometimes be cutthroat, Beals has survived almost a decade by skirting by doing the bare minimum needed. <br />phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-79604386149974536462018-05-23T09:35:00.000-04:002018-05-23T23:19:47.775-04:00Enby Distribution, pt. 7: Cigol at the Extremes--Runs Per WinNow that you presumably have some confidence in Cigol’s ability to do something fairly easy by the standards of classical sabermetrics, you may have some more interest in what Cigol says about a much harder question--how does W% vary by runs scored and runs allowed in extreme situations? This is the area in which Cigol (whether powered by Enby or any other run distribution model) has the potential to enhance our understanding of the relationship between runs and wins. Unfortunately, it is difficult to tell whether these results are reasonable, since we don’t have empirical data regarding extreme teams. If Cigol deviates from Pythagenpat, we won’t know which one to trust. Throughout this post, I am going to discuss these issues as if Cigol is in fact the “true” or “correct” estimate. 
This is simply for the sake of discussion--it would be unwieldy to have to issue a disclaimer every time we compare Cigol and Pythagenpat. Please note that I am not asserting that this is demonstrably the case.<br /><br />For a first look at how the two compare at the extremes, let’s assume that a team’s runs scored are fixed at an average 4.5, and look at their estimated W% at each interval of .5 in runs allowed from 1-15 RA/G using Cigol and Pythagenpat with three different exponents (.27, .28, and .29; I’ve always called this Pythagenpat constant z and will stick with that notation here, hoping that it will not be confused with the Enby z parameter):<br /><br /><a href="https://4.bp.blogspot.com/-IgUNAsJB8vQ/WwTc1SUaZ0I/AAAAAAAAChc/AaMj1YN4EZYfif_lrXL2ZUU9kQA75BgxgCLcBGAs/s1600/enby8-2.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-IgUNAsJB8vQ/WwTc1SUaZ0I/AAAAAAAAChc/AaMj1YN4EZYfif_lrXL2ZUU9kQA75BgxgCLcBGAs/s400/enby8-2.JPG" width="240" height="400" data-original-width="307" data-original-height="511" /></a><br /><br />Just eyeballing the data, two things are evident. The first is that Pythagenpat with any of the exponent choices is a fairly decent match at any RA value. The largest differences come at the extremes, as you’d expect, but the maximum difference is .013 between the Cigol and z = .27 estimates for the 4.5 R/15 RA team. This is a difference of a little over 2 wins over the course of a 162 game schedule, which isn’t terrible since it represents close to the maximum discrepancy. While I have not figured Enby parameters past 15 RG, at some point the differences would begin to decline as both the Cigol and Pythagenpat estimates converge on a .000 W%. 
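<br /><br />The Pythagenpat columns of a table like the one above can be reproduced with a short Python sketch. The function name and the handful of RA/G values are my own choices for illustration; the Cigol column is not reproduced here, since it requires the full Enby run distribution.

```python
def pythagenpat_wpct(r, ra, z=0.28):
    """Pythagenpat: Pythagorean with exponent x = RPG^z, where RPG = R + RA."""
    x = (r + ra) ** z
    return r ** x / (r ** x + ra ** x)

r = 4.5  # runs scored per game, held fixed as in the table
for ra in (1.0, 3.0, 4.5, 8.0, 15.0):
    estimates = [round(pythagenpat_wpct(r, ra, z), 3) for z in (0.27, 0.28, 0.29)]
    print(ra, estimates)
```

At R = RA every choice of z returns .500, and the three estimates spread apart as the run ratio moves away from one.<br /><br />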
For comparison, a Pythagorean fixed exponent of 1.83 predicts a W% of .099 for the 4.5/15 team, almost 8 wins/162 off of the Cigol estimate.<br /><br />The second thing that becomes apparent is that Cigol implies that as scoring increases, the Pythagenpat z constant is not fixed. For the lowest RPGs on the table (1-3 RA/G, which when combined with the 4.5 R/G is 5.5-7.5 RPG), .27 performs the best relative to Cigol. Once we cross 3.5 RA/G, .28 performs best, and maintains that advantage from 3.5-8 RA/G (8-12.5 RPG). Past that point (>8.5 RA/G, >13 RPG), .29 is the top-performer. This explains why studies have tended to peg z somewhere in the .28-.29 range, as such a value represents the best fit at normal major league scoring levels.<br /><br />A nice way to see the relationship is to plot the difference (Pythagenpat - Cigol) relative to RA/G for each exponent:<br /><br /><a href="https://1.bp.blogspot.com/-yDE-efbijos/WwTcT3nfEmI/AAAAAAAAChU/MROGsbkSmY0a0DurRnfnddA_QR4Vgf6jgCLcBGAs/s1600/enby8-1.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-yDE-efbijos/WwTcT3nfEmI/AAAAAAAAChU/MROGsbkSmY0a0DurRnfnddA_QR4Vgf6jgCLcBGAs/s400/enby8-1.JPG" width="400" height="273" data-original-width="1227" data-original-height="838" /></a><br /><br />The point at which all converge is 4.5 RA/G, where R = RA and all estimators predict .500. As you can see, the differences converge as we approach either a .000 or 1.000 W%, since there is a hard cap on the linear difference at those points. <br /><br />This exercise gives us some direction on where to go, but it is not comprehensive enough to draw any conclusions. In order to do that, we need a more comprehensive set of data than simply fixing R/G at 4.5. To do so, I figured the Cigol W% for each interval of .25 runs scored and runs allowed between 1-15 RPG (removing all points at which R = RA). 
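<br /><br />The grid itself is easy to enumerate. The Python below is a small sketch of my own (it builds only the R/RA pairs, not the Cigol estimates), using integers scaled by four so that the .25 steps are exact:

```python
# Per-game rates from 1 to 15 in steps of .25 (q/4 for q = 4..60).
rates = [q / 4 for q in range(4, 61)]

# Every (R, RA) combination except those with R = RA.
pairs = [(r, ra) for r in rates for ra in rates if r != ra]

print(len(rates), len(pairs))  # 57 3192
```
<br /><br />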
This yields 3,192 R/RA pairs, many of which are so extreme as to be absurd, which is the point.<br /><br />In order to make sense of this data, we will need to simplify the scope of what we are considering, so let’s start by trying to ascertain the relationship between runs and wins if we assume that a linear model should be used. Basically, the idea here is that we should be able to determine a runs per win (RPW) factor such that:<br /><br />W% = (R - RA)/RPW + .5<br /><br />From this, we can calculate RPW given W%, R, and RA as:<br /><br />RPW = (R - RA)/(W% - .5)<br /><br />In its simplest form, this type of equation assumes a fixed number of runs per win; for standard scoring contexts, 10 is a nice, round number that does the job and of course has become famous as a sabermetric rule of thumb. But it has long been known that RPW varies with the scoring context, and usually sabermetricians have attempted to express this by making RPW a function of RPG. So let’s graph our data in that manner:<br /><br /><a href="https://2.bp.blogspot.com/-eto0k7zXVgA/WwTdj4e6T9I/AAAAAAAACho/LE6hTxyYSjUthm0DZFhPFWNCseMKvU78gCLcBGAs/s1600/enby8-3.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-eto0k7zXVgA/WwTdj4e6T9I/AAAAAAAACho/LE6hTxyYSjUthm0DZFhPFWNCseMKvU78gCLcBGAs/s400/enby8-3.JPG" width="400" height="277" data-original-width="1208" data-original-height="838" /></a><br /><br />As you can see, RPW is not even close to being a linear function of RPG when extreme teams are considered. The bulk of the observations are scattered around a nice, linear-looking function, but the outliers are such that the linear function will fail horrifically at the extremes. And when I say extremes, I really mean extremes. For instance, a 15 R/1 RA team is at 16 RPG, but would need much more than 16 marginal runs for a marginal win--Cigol estimates that such a team would need 28.11 marginal runs (as would its 1/15 counterpart). 
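<br /><br />To make the 15 R/1 RA example concrete, here is a minimal Python sketch of the two linear relationships given above. The function names are mine, and 28.11 is simply the Cigol figure quoted in the text, not something this code derives.

```python
def linear_wpct(r, ra, rpw):
    """Linear estimate: W% = (R - RA)/RPW + .5 (not bounded to [0, 1])."""
    return (r - ra) / rpw + 0.5

def implied_rpw(r, ra, wpct):
    """RPW implied by a known W%: (R - RA)/(W% - .5); undefined at W% = .5."""
    return (r - ra) / (wpct - 0.5)

print(round(linear_wpct(15, 1, 10), 3))     # fixed 10 RPW: 1.9, an impossible W%
print(round(linear_wpct(15, 1, 28.11), 3))  # 28.11 RPW: 0.998
```

With the rule-of-thumb 10 RPW the estimate blows past 1.000, while the 28.11 figure corresponds to a W% of roughly .998.<br /><br />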
This should make sense to you logically--the team’s W% is already so high, and so many of the games blowouts, that you need to scatter a large number of runs around to move the win needle. This point represents the maximum RPW for the points I’ve included--the minimum is 3.69 at 1.25/1.<br /><br />This is not to say that a linear model cannot be used to estimate W%; it is simply the case that one linear model cannot be used to estimate W% over a wide range of possible scoring contexts and/or disparities in team strength. Let’s suppose that we limit the scope of our data in each of these manners. First, let’s consider only cases in which a team’s runs are between 3-7 and its runs allowed are between 3-7. This essentially captures the range of teams in modern major league baseball and limits the sample to 272 data points:<br /><br /><a href="https://4.bp.blogspot.com/-k99A4re2GZY/WwYvRA6ZO4I/AAAAAAAACic/qCEV1CkNNRIUlB9dDsHi9ngQWFWEV6-6gCLcBGAs/s1600/enby8-8.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://4.bp.blogspot.com/-k99A4re2GZY/WwYvRA6ZO4I/AAAAAAAACic/qCEV1CkNNRIUlB9dDsHi9ngQWFWEV6-6gCLcBGAs/s400/enby8-8.JPG" width="400" height="242" data-original-width="1323" data-original-height="801" /></a><br /><br />I’ve taken the liberty of including a linear regression line, which now has the slope we’d expect (recall that Tango’s formula for RPW is .75*RPG + 3, and that this is <a href="http://walksaber.blogspot.com/2009/01/runs-per-win-from-pythagenpat.html">consistent with Pythagenpat</a>). The line is shifted up more than the best fit using normal teams or centering Pythagenpat at 9 RPG indicates, as there are still some extreme combinations here (for example, a 7 R/3 RA team is expected by Cigol to play .815 ball, well beyond anything we’ll ever see in modern MLB).<br /><br />We can also try limiting the data in another way--only looking at cases in which the resulting records are feasible in modern MLB. 
For simplicity, I’ll define this as cases in which the Cigol W% is between .300 and .700 (yes, I realize the 2001 Mariners and 2003 Tigers fall outside of this range in terms of actual W%, but in fact it’s probably too wide a band if we consider only expected W% based on R and RA). Here are the results from our Cigol data points, including all intervals of R and RA between 1-15 (this leaves us with 1,126 cases):<br /><br /><a href="https://3.bp.blogspot.com/-suxTjnsQIEE/WwTfDg2rOEI/AAAAAAAACiE/DWDmjOUiv5AqvTbiA1_dVwDEvVSiQmIFwCLcBGAs/s1600/enby8-6.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-suxTjnsQIEE/WwTfDg2rOEI/AAAAAAAACiE/DWDmjOUiv5AqvTbiA1_dVwDEvVSiQmIFwCLcBGAs/s400/enby8-6.JPG" width="400" height="249" data-original-width="1333" data-original-height="830" /></a><br /><br />Once again, the slope of the line is in the ballpark of what we observe with normal teams, but the intercept is still off, shifting the line up to get closer to the extreme cases. If we make both adjustments simultaneously (look only at cases between 3-7 R, 3-7 RA, and .3-.7 Cigol W%), we are left with 202 data points and this graph:<br /><br /><a href="https://3.bp.blogspot.com/-Ku5lHCnGEEU/WwTeqBA7BXI/AAAAAAAACh8/0NiZMBYlT8MWF_yKPaUkybVmUEqMB784gCLcBGAs/s1600/enby8-5.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://3.bp.blogspot.com/-Ku5lHCnGEEU/WwTeqBA7BXI/AAAAAAAACh8/0NiZMBYlT8MWF_yKPaUkybVmUEqMB784gCLcBGAs/s400/enby8-5.JPG" width="400" height="248" data-original-width="1317" data-original-height="817" /></a><br /><br />Closer still, with the slope now essentially exactly where we expect it to be, but the intercept still shifting the line upwards. Why is this happening? 
We know that it’s not because of a breakdown of Cigol when estimating W% for normal teams--as we saw in the previous post, Cigol is of comparable accuracy to Pythagenpat and RPW = .75*RPG + 3 with normal teams. What’s happening is that we are not biasing our sample with near-.500 teams as happens when we observe real major league data. All of our hypothetical teams have a run differential of at least +/- .25. In 1996-2006, about one quarter of teams had run differentials of less than +/- .25.<br />The standard deviation of W% for 1996-2006 was .073; the standard deviation of Cigol W% for this data is .111. This illustrates the point that I and other sabermetricians who seek theoretical soundness make repeatedly--using normal major league full season data, the variance is small enough that any halfway intelligible model will come close to predicting whatever it is you’re predicting. Anything that centers estimated W% at .500 and allows it to vary as run differential varies from zero will work just fine. But if you run into a sample that includes a lot of unusual cases, or you start looking at smaller sample sizes, or a higher variance league, or try to extrapolate results to individual player data, then many formulas that work just fine normally will begin to break down.<br /><br />A linear conversion between runs and wins breaks down in extreme cases for a few main reasons, including the absence of the [0,1] bounds that apply to real world W% and the fact that the declining value of marginal runs depends on not one but two determinants--scoring context and differential between the two teams. There are some things we could attempt to do to salvage it, such as introducing run differential as a variable. If we did this, we could allow RPW to increase not only as RPG increases, but also as the absolute value of RD increases.<br /><br />Let’s use the data set pared down in both dimensions to find a RPW estimator using both RPG and abs(RD) as predictors. 
I simply ran a multiple regression and got this equation:<br /><br />RPW = .732*RPG + .204*abs(R - RA) + 3.081<br /><br />If we assume that a team has R = RA, then this equation is a very good match for our expected .75*RPG + 3, as it would reduce to .732*RPG + 3.081. This is encouraging, since it should work with normal teams and offers the prospect of better performance with extreme teams. <br /><br />Remember, though, that “extreme” in the context of this dataset is a lot more restrictive than in the broader set--we've limited the data to only 3-7 R, 3-7 RA, and .3-.7 Cigol W%. If we step outside of that range, the equation will break down again. For example, a 10 R/5 RA team has a RPW of 15.081 according to this equation, which suggests a .832 W% versus the .819 expected by Cigol. While this is not a catastrophic error (and much better than the .851 suggested by .75*RPG + 3), don’t lose sight of the fact that the W% function is non-linear. <br /><br />If we use this equation on the rounded to nearest .05 1996-2006 major league data discussed in the last post, the RMSE times 162 is 3.858--just a tad worse than the RPW version that does not account for RD, but still comparable to (in fact, slightly lower RMSE than) the heavy hitters Pythagenpat and Cigol. It produces a very good match for Cigol over this dataset, in fact closer to Cigol than is Pythagenpat with z = .28.<br /><br />A similar equation to this one was previously developed by Tango Tiger (which is where I got the idea to use abs(R - RA) as the second variable; there might be some other ways one could construct the equation and achieve a similar outcome) and posted on FanHome in 2001:<br /><br />RPW = .756*RPG + .403*abs(R - RA) + 2.645<br /><br />In this version, the lower intercept is offset by the higher coefficient on RD. <br /><br />We can also attempt to improve the RPW estimate by using a non-linear equation. 
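<br /><br />Before turning to the non-linear form, the arithmetic behind the 10 R/5 RA example above can be checked with a short Python sketch. The function names are my own; the coefficients are the ones quoted in this post.

```python
def rpw_with_rd(r, ra):
    """RD-adjusted regression from this post: .732*RPG + .204*abs(R - RA) + 3.081."""
    return 0.732 * (r + ra) + 0.204 * abs(r - ra) + 3.081

def rpw_rpg_only(r, ra):
    """Tango's RPW = .75*RPG + 3."""
    return 0.75 * (r + ra) + 3

def linear_wpct(r, ra, rpw):
    """Linear estimate: W% = (R - RA)/RPW + .5."""
    return (r - ra) / rpw + 0.5

r, ra = 10, 5
print(round(rpw_with_rd(r, ra), 3))                       # 15.081
print(round(linear_wpct(r, ra, rpw_with_rd(r, ra)), 3))   # 0.832
print(round(linear_wpct(r, ra, rpw_rpg_only(r, ra)), 3))  # 0.851
```

Both linear estimates sit above the .819 that Cigol expects for this team, with the RD-adjusted version closing a bit more than half of the gap.<br /><br />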
The best fit comes from a power regression, and again I will limit this to the 3-7 RPG, .300-.700 Cigol W% set of teams to produce this estimate:<br /><br /><a href="https://1.bp.blogspot.com/-Xf741cUmE5U/WwTg2qFoSaI/AAAAAAAACiQ/j7eYnu6gMuMFcvCe7kz4emxhmK1bkYMnQCLcBGAs/s1600/enby8-7.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-Xf741cUmE5U/WwTg2qFoSaI/AAAAAAAACiQ/j7eYnu6gMuMFcvCe7kz4emxhmK1bkYMnQCLcBGAs/s400/enby8-7.JPG" width="400" height="248" data-original-width="1323" data-original-height="820" /></a><br />RPW = 2.171*RPG^.691<br /><br />This may look familiar, because as I have <a href="http://walksaber.blogspot.com/2009/01/runs-per-win-from-pythagenpat.html">demonstrated in the past</a>, the Pythagenpat implied RPW at a given RPG for a .500 team is 2*RPG^(1 - z). Here the implied z value of .309 is higher than we typically see (.27 - .29), but the form is essentially the same.<br /><br />Any linear approximation might work well near the RPG/team quality level where it was constructed, but will falter outside of that range. We could develop an equation based on teams similar to the 10/5 example that would work well for them, but we’d necessarily lose accuracy when looking at normal teams. Non-linear W% functions allow us to capture a wider range of contexts with one particular equation. We can push the envelope a little bit by using a non-linear estimate of RPW, but we’d still have to be very careful as we varied the scoring context and skill difference between the teams.<br /><br />Assuming we are not just satisfied with an equation to use for normal teams, all of this caution is a lot to go through to salvage a functional form that still allows for sub-zero or greater than one W% estimates. Instead, it makes more sense to attempt to construct a W% estimate that bounds W% between 0 and 1 and builds-in non-linearity. 
This of course is why Bill James and many sabermetricians who have followed have turned to the Pythagorean family of estimators.<br />phttp://www.blogger.com/profile/18057215403741682609noreply@blogger.com0tag:blogger.com,1999:blog-12133335.post-89658350754085586402018-04-26T08:33:00.000-04:002018-04-26T17:41:03.229-04:00Enby Distribution, pt. 6: Accuracy of Enby W% EstimateIn the last post, I demonstrated how one can estimate W% from any runs per game and runs per inning distribution by using the basic principles of how baseball games are decided. This model is simple conceptually, but a bear to implement computationally when compared to the other W% estimators that have been developed by sabermetricians over the last fifty years. As such, it is not a practical tool to use for common sabermetric applications of a winning percentage estimator. If you want to know how many games a team that scores 828 runs and allows 753 runs in a season can expect to win, there are any number of formulas that are better practical options than Enby. <br /><br />However, it is important to verify that Enby is able to hold its own when estimating W% for normal teams. If it does not work as well as our other tools for normal situations, it will be harder to put any stock in its results when looking at extreme situations.<br /><br />To check if Enby was up to the challenge, I performed a limited accuracy test based on 1996-2006 data (a sample of 326 teams). This was in no way intended to be a comprehensive accuracy test, but rather one with a sufficiently large sample to determine if Enby can predict normal teams with comparable accuracy to other approaches. <br /><br />Since I have only calculated Enby distribution parameters at intervals of .05 RG, I rounded all teams’ R/G and RA/G to the nearest .05 and used these figures as the inputs for all of the estimators. 
This ensured that they were all on equal footing, rather than Enby only having some imprecision in terms of the actual R and RA counts. In addition to Enby, I tested four other estimators:<br /><br />* A simple assumption of 10 RPW <br />* Tango’s formula that varies RPW by RPG (runs per game for both teams): RPW = .75*RPG + 3. This formula (or at least something very close to it) can be derived by <a href="http://walksaber.blogspot.com/2009/01/runs-per-win-from-pythagenpat.html">using Pythagenpat</a>.<br />* Pythagorean with a fixed exponent of 1.83<br />* Pythagenpat using x = RPG^.28<br /><br />The resulting RMSE for each estimator (W% RMSE multiplied by 162 for ease of interpretation):<br /><br /><a href="https://2.bp.blogspot.com/-k8vOK6458EI/WuJFrA1dTuI/AAAAAAAACg8/eU4GBHDiHS0KvxFax9ORoNBSLL_0ccwOACLcBGAs/s1600/rmse.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://2.bp.blogspot.com/-k8vOK6458EI/WuJFrA1dTuI/AAAAAAAACg8/eU4GBHDiHS0KvxFax9ORoNBSLL_0ccwOACLcBGAs/s400/rmse.JPG" width="400" height="247" data-original-width="167" data-original-height="103" /></a><br /><br />The three methods which allow the relationship between runs and wins to vary by scoring context (either by explicitly changing the RPW factor or Pythagorean exponent, or by estimating the scoring distribution as Enby does) come out on top. The linear RPW formula wins here, although the best performer would be Pythagenpat with x = RPG^.29, edging it out at a 3.850 RMSE. Of course, we could also find the coefficients in Tango’s RPW formula that minimize error, and quite possibly push that method back ahead of Pythagenpat.<br /><br />In any event, the three formulas allowing for customization are close enough that we can safely conclude that none is grossly deficient for the task of estimating W% for normal teams. 
That means that Enby has passed the first hurdle towards being taken seriously as a model for W% based on average runs scored and allowed.<br /><br />I also thought it would be interesting to test the RMSE of using each W% estimator to predict Pythagenpat. This is obviously a biased approach, assuming that Pythagenpat is the standard by which other estimators should be compared. The real reason to do this is to see how closely Enby tracks Pythagenpat with normal teams, since Pythagenpat is the closest W% estimator in theory to Enby. Both attempt to dynamically model the relationship between runs and wins; the other approaches, even the dynamic RPW estimator, assume that there is a fixed relationship between runs and wins. We should expect Pythagenpat and Enby to be in general agreement. And they are (RMSE once again multiplied by 162):<br /><br /><a href="https://1.bp.blogspot.com/-7kLwUQb3vr0/WuJGHE2J9JI/AAAAAAAAChE/PBCJYUE3Tf8NpdESG3aLaUW0lFODHamUACLcBGAs/s1600/rmse2.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://1.bp.blogspot.com/-7kLwUQb3vr0/WuJGHE2J9JI/AAAAAAAAChE/PBCJYUE3Tf8NpdESG3aLaUW0lFODHamUACLcBGAs/s400/rmse2.JPG" width="400" height="206" data-original-width="167" data-original-height="86" /></a><br /><br />Enby and Pythagenpat are essentially in lockstep. In fact, the largest discrepancy between the two is for the 2002 Braves, who scored 4.40 and allowed 3.50 runs per game (rounded). Pythagenpat expects that such a team would have a W% of .6007, while Enby predicts a .5997 W%, a difference of .15 wins over the course of a season. <br /><br />The minimum RMSE between Pythagenpat and Enby occurs when the Pythagenpat exponent is dropped slightly to .279 (.026 RMSE). As the exponent varies, the discrepancy increases; with a Pythagenpat exponent of .29, the RMSE is .274. <br /><br />At this point, I’d like to pause for a moment and change the name of the Enby estimate of W%. 
This is just for my own sanity as I write and hopefully use these tools in the future, but I want to draw a distinction between the Enby distribution, which is used to estimate the probability of scoring k runs in a game, and the methodology described for estimating W%. I’m a little hesitant to put a name on it, since I haven’t earned that right--the logic is based in reality, not my insight, and has been used by many sabermetricians long before me. Plus, I’m not very good at making up these kinds of names--if you don’t believe me, re-examine the name of the blog.<br /><br />This methodology is compatible with any means of estimating the probability of scoring k runs a game, whether empirically, through the Enby distribution, solely through the Tango Distribution (as Enby itself borrows from the Tango Distribution), the Weibull distribution (as implemented by <a href="http://www.hardballtimes.com/main/article/consistency-is-key/">Sal Baxamusa</a> or <a href="http://web.williams.edu/Mathematics/sjmiller/public_html/399/handouts/PythagWonLoss_Paper.pdf">Steven Miller</a>), or any other approach that may be developed in the future. Going forward, I will be referring to this as the Cigol method. As Toirtap can attest, I like to spell things backwards when I am flummoxed. Since the W% estimator is based on simple logic, Cigol it is.<br /><br />2018 Predictions (2018-03-27)<br /><br />See the <a href="http://sportsdataresearch.com/difficulties-associated-with-preseason-projections/">standard disclaimers</a>. This is an exercise in fun more than analysis, although hopefully there's a touch of the latter or you're just wasting your time.<br /><br />AL EAST<br /><br />1. Boston<br />2. New York (wildcard)<br />3. Toronto (wildcard)<br />4. Baltimore<br />5. 
Tampa Bay<br /><br />Picking the Red Sox is something of a tradition in this space. I don’t do it on purpose, it’s just that my “model” (such as it is) has tended to pick them consistently. This year it’s a virtual tie with the Yankees; some projections agree with that, but others (notably PECOTA) see a huge advantage for the latter. The Yankees arguably were more impressive last season given their component statistics, and yet Boston’s offense should bounce back, their starting pitching should be better, their bullpen could benefit from some healthy pieces coming back...and if Aaron Judge and Giancarlo Stanton combine for even ninety homers the takes will be hot. The top-heaviness of the AL, with four teams that really stand out, leaves a team like the Blue Jays a stealthy wildcard contender. Incidentally, I have them +9 runs on both offense and defense. The Orioles added enough late and the Rays subtracted enough to make me flip-flop their places, but it would be surprising if either gets into this race.<br /><br />AL CENTRAL<br /><br />1. Cleveland<br />2. Minnesota<br />3. Kansas City<br />4. Detroit <br />5. Chicago<br /><br />As a partisan, I am always queasy about picking the Indians, but last year it worked out okay, and again this year the on-paper gap is just too large to superstitiously pick someone else. But it’s easier to see how the Indians might lose to the Twins in 2018 than it was to compare them to the field in 2017. While it’s easy to overstate the impact of the Twins’ pitching additions (one could argue that Jake Odorizzi and Lance Lynn would be no more than #4 starters for the Tribe, even #5 if Danny Salazar could get it together), Cleveland’s bullpen is showing signs of vulnerability without a lot of clear candidates to step in, there are still injury questions surrounding Jason Kipnis and Michael Brantley, the outfield is unsettled...but it’s also sometimes easier to worry about these things as a fan. 
The Twins’ true quality for 2016-2017 might be matched by the win total, but the distribution was all off. A plexiglass principle year would not surprise. The Royals kept just enough of the band together to a) still be annoying and b) provide some measure of optimism for their partisans, but probably more of the former. I’ve been calling for the Tigers to dead cat bounce for a couple years; I’m surrendering and just expecting it for Miguel Cabrera. The White Sox have a lot of prospects and could well be the future of this division, but it’s still a year or two away.<br /><br />AL WEST<br /><br />1. Houston<br />2. Los Angeles<br />3. Seattle<br />4. Texas<br />5. Oakland<br /><br />The Astros, in my crude system, are the second-best team in the AL...on offense and defense. Just slightly behind the Yankees and the Indians, respectively; combined, that’s enough to declare them the best and most well-rounded team on paper. Prior to Shohei Ohtani’s rough showing in spring training, I was set to pick the Angels as the second wildcard. Is dropping them a small sample size overreaction? Quite possibly, yes, but there wasn’t much separating teams like the Angels, Blue Jays, and Twins to begin with. You have to feel bad (unless you’re a fan of the…wait, do they even have a rival of note?) for the Mariners - they now have the longest playoff drought in North American sports. Longer than the Cleveland Browns (this has been true for years but it is a miscarriage of justice that it’s not the Browns that hold this dubious distinction). They’ve been good enough to squeak out a second wildcard for a few years, but it never came together, and the window may be closing. The Rangers’ franchise history from 2010 - 2017 will make a fascinating case study some day, but I don’t think 2018 will add another dramatic return from the dead to the story. 
I still like the A’s players and think they could contend in the coming years, but the starting pitching is too shaky to predict good things this season.<br /><br />NL EAST<br /><br />1. Washington<br />2. New York<br />3. Philadelphia<br />4. Atlanta <br />5. Miami<br /><br />The Nationals are basically what they have been for the last six years -- the clear favorite in the NL East. This is probably the last year for them to enjoy that status, but that’s a pretty impressive run in a division that features two big markets and a Braves franchise that until some point in the Washington run had basically contended for 25 years. As a neutral observer, it would be nice to at least see them get an NLCS out of the deal. Everyone talks about the health of the Mets rotation, but I think scoring runs might be a bigger question mark. I like the Phillies over the Braves this season, but over the next five years I’d flip that. Philly is a popular second wildcard pick--while that’s certainly within the realm of possibility, it will take better-than-forecast performances from some of the rookies (JP Crawford, Jorge Alfaro, Nick Williams) and Maikel Franco to make that happen. The Marlins are obviously a sad team to ponder, but the fact that Derek Jeter’s halo is being tarnished in the process makes it more entertaining than the usual Miami teardown.<br /><br />NL CENTRAL<br /><br />1. Chicago<br />2. St. Louis (wildcard)<br />3. Milwaukee<br />4. Pittsburgh<br />5. Cincinnati<br /><br />The Cubs have the best offense in the NL by my estimation (although they distributed their runs across games so unfortunately last season that it wasn’t evident in the standings), their rotation is stronger entering this season (relative to last April) with the acquisitions of Jose Quintana and Yu Darvish, and I think they’re ready to re-challenge the Dodgers for NL superiority. 
The Cardinals look like a solid 86-win team, which is enough to make them a wildcard favorite; if they win with it, it’s a departure from the Pujols-era Cardinal teams which always had big stars, although maybe Carlos Martinez will take a step forward or Marcell Ozuna will hold his level and people will recognize how good he is outside of Miami and Stanton’s shadow. I look at four sources for team win projections when writing these up: my own crude version (fueled by the Steamer projections published at Fangraphs and some manual overrides on my part), Fangraphs, PECOTA from Baseball Prospectus, and Clay Davenport’s. The Brewers’ projected wins range from 76 - 86, which is tied with PECOTA darling Tampa Bay for the largest spread. Mine is on the low end of the spectrum--it just doesn’t seem like they have the pitching, and they have an outfield/corners logjam that’s good for depth but bad for allowing all of their name hitters to fully contribute. Last year I held on to hope for the Pirates; now I think it’s safe to say their 2012 - 2015 revival is over (come here for the bold statements). Amazingly, they would have been better off to have been in the NL East. If the only baseball I was allowed to watch this year was the games of one of the teams I picked last, I’d go with the Reds. Joey Votto, Luis Castillo, some interesting bullpen pieces, Billy Hamilton as a side show…it’s a fun team if not a good one.<br /><br />NL WEST<br /><br />1. Los Angeles<br />2. Arizona (wildcard)<br />3. San Francisco<br />4. Colorado<br />5. San Diego<br /><br />I might be shortchanging the Dodgers by not picking them as the best team in the NL. They are still really good, they still have good depth, they still have the resources to address issues, but you know all that. There’s not much to say other than to tip one’s cap to the machine. I’m not bullish on the Diamondbacks, per se, but I’ll see your Zack Greinke decline concerns and raise you Zack Godley. 
I was surprised at how well the Giants came out when I put my forecast spreadsheet together; I was expecting 78-82 wins. A few more put them in prime wildcard contention position, but that was before Bumgarner and Shark became huge injury concerns. I don’t think the Rockies offense is all that good. I don’t think you can expect Charlie Blackmon to be as good, I am still skeptical of DJ LeMahieu, and catcher and first base aren’t exactly settled. The Padres are definitely intriguing going forward, but it’s too soon to expect contention.<br /><br />WORLD SERIES<br /><br />Houston over Chicago<br /><br />AL ROY: RF Austin Hays, BAL<br />AL Cy Young: Trevor Bauer, CLE<br />I don’t actually think this is the most likely outcome, I just love Trevor Bauer.<br />AL MVP: SS Carlos Correa, HOU<br />NL ROY: SP Alex Reyes, STL<br />NL Cy Young: Stephen Strasburg, WAS<br />NL MVP: RF Bryce Harper, WAS<br /><br />Enby Distribution, pt. 5: W% Estimate (2018-02-28)<br /><br />While an earlier post contained the full explanation of the methodology used to estimate W%, it’s an important enough topic to repeat in full here. The methodology is not unique to Enby; it could be implemented with any estimate of the frequency of runs scored per game (and in fact I first implemented it with the Tango Distribution). As I discussed last time, the math may look complicated and require a computer to implement, but the model itself is arguably the simplest conceptually because it is based on the simple logic of how games are decided.<br /><br />Let p(k) be the probability of scoring k runs in a game and q(m) be the probability of allowing m runs a game. If k is greater than m, then the team will win; if k is less than m, then the team will lose. If k and m are equal, then the game will go to extra innings. 
In setting it up this way, I am implicitly assuming that p(k) is the probability of scoring k runs in nine innings rather than in a game. This is not a horrible way to go about it since the average major league game has about 27 outs once the influences that cause shorter games (not batting in the ninth, rain) are balanced with the longer games created by extra innings. Still, it should be noted that the count of runs scored from a particular game does not necessarily arise from an equivalent opportunity context (as defined by innings or outs) of another game.<br /><br />Given this notation, we can express the probability of winning a game in the standard nine innings as:<br /><br />P(win 9) = p(1)*q(0) + p(2)*[q(0) +q(1)] +p(3)*[q(0) + q(1) + q(2)] + p(4)*[q(0) + q(1) + q(2) + q(3)] + ...<br /><br />Extra innings will occur whenever k and m are equal:<br /><br />P(X) = p(0)*q(0) + p(1)*q(1) + p(2)*q(2) + p(3)*q(3) + p(4)*q(4) + ...<br /><br />When the game goes to extra innings, it becomes an inning by inning contest. Let n(k) be the probability of scoring k runs in an inning and r(m) be the probability of allowing m runs in an inning. If k is greater than m, the team wins; if k is less than m, the team loses; and if k is equal to m, then the process will repeat until a winner is determined. <br /><br />To find the probability of each of the three possible outcomes of an extra inning, we can follow the same logic as used above for P(win 9). 
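<br /><br />That shared logic can be sketched as one small Python helper that works for either a per-game pair p, q or a per-inning pair n, r. The function name win_and_tie and the three-outcome toy distributions are made up purely for illustration; they are not Enby or Tango output:

```python
def win_and_tie(p, q):
    """Return (P(win), P(tie)) for two independent run distributions.

    p[k] is the probability of scoring k runs and q[m] is the probability
    of allowing m runs, over the same unit (nine innings or one inning).
    """
    p_win = 0.0
    p_tie = 0.0
    cum_q = 0.0  # running total q(0) + q(1) + ... + q(k - 1)
    for k, pk in enumerate(p):
        p_win += pk * cum_q        # score k, allow fewer than k
        if k < len(q):
            p_tie += pk * q[k]     # score k, allow exactly k
            cum_q += q[k]
    return p_win, p_tie


# Toy distributions over 0, 1, and 2 runs (illustrative values only).
p = [0.2, 0.3, 0.5]
q = [0.5, 0.3, 0.2]
p_win, p_tie = win_and_tie(p, q)
print(p_win, p_tie, 1 - p_win - p_tie)  # roughly 0.55, 0.29, 0.16
```

Passing full-game distributions gives P(win 9) and P(X); passing per-inning distributions gives P(win inning) and P(tie inning).<br /><br />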
The probability of winning the inning is:<br /><br />P(win inning) = n(1)*r(0) + n(2)*[r(0) + r(1)] + n(3)*[r(0) + r(1) + r(2)] + n(4)*[r(0) + r(1) + r(2) + r(3)] + ...<br /><br />The probability of the game continuing (equivalent to tying the inning) is similar to P(X) above:<br /><br />P(tie inning) = n(0)*r(0) + n(1)*r(1) + n(2)*r(2) + n(3)*r(3) + n(4)*r(4) + ...<br /><br />The probability of winning in extra innings [P(win X)] is:<br /><br />P(win X) = P(win inning) + P(tie inning)*P(win inning) + P(tie inning)^2*P(win inning) + P(tie inning)^3*P(win inning) + ...<br /><br />This is a geometric series that simplifies to:<br /><br />P(win X) = P(win inning)*[1 + P(tie inning) + P(tie inning)^2 + P(tie inning)^3 + ...] = P(win inning)*1/[1 - P(tie inning)] = P(win inning)/[1 - P(tie inning)]<br /><br />This could also be expressed in a very clever way using the <a href="https://en.wikipedia.org/wiki/Craps_principle">Craps Principle</a> if we had also computed P(lose inning); I did it that way last time, but it doesn’t really cut down on the amount of calculation necessary in this case.<br /><br />Since I want these last few posts to serve as a comprehensive explanation of how to calculate the Enby run and win estimates, it is necessary to take a moment to review how to use the Tango Distribution to estimate the runs per inning distribution. c, of course, is the constant, set at .852 when looking at a head-to-head matchup. RI is runs/inning, which I’ve defined as RG/9:<br /><br />a = c*RI^2<br />n(0) = RI/(RI + a)<br />d = 1 - c*n(0)<br />n(1) = (1 - n(0))*(1 - d)<br />n(k) = n(k - 1)*d for k >= 2<br /><br />Once we have these three key probabilities [P(win 9), P(X), and P(win X)], the formula for W% is obvious:<br /><br />W% = P(win 9) + P(X)*P(win X)<br /><br />We will use the Enby Distribution to determine p(k) and q(m), and the Tango Distribution to determine n(k) and r(m). 
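<br /><br />To make the per-inning pieces concrete, here is a hedged Python sketch of the Tango Distribution recursion, the extra-inning win probability, and the final combination. The function names, the 30-run truncation, and the example inputs of 4.40 and 3.50 runs per game are all assumptions made for this illustration, and RG is assumed to be greater than zero. P(win 9) and P(X) still have to come from a per-game distribution such as Enby, whose parameters are not reproduced here, so cigol_wpct simply takes them as arguments:

```python
def tango_inning(rg, c=0.852, max_runs=30):
    """Per-inning run probabilities n(0)..n(max_runs) for a team scoring rg per game."""
    ri = rg / 9.0                  # runs per inning, RG/9 (rg > 0 assumed)
    a = c * ri ** 2
    n0 = ri / (ri + a)             # probability of a scoreless inning
    d = 1 - c * n0                 # ratio between successive probabilities
    probs = [n0, (1 - n0) * (1 - d)]
    while len(probs) <= max_runs:  # truncate the geometric tail at max_runs
        probs.append(probs[-1] * d)
    return probs


def p_win_extras(rg, rag, c=0.852):
    """P(win X) = P(win inning) / [1 - P(tie inning)]."""
    n = tango_inning(rg, c)
    r = tango_inning(rag, c)
    win_inning = sum(n[k] * sum(r[:k]) for k in range(len(n)))
    tie_inning = sum(nk * rk for nk, rk in zip(n, r))
    return win_inning / (1 - tie_inning)


def cigol_wpct(p_win9, p_x, p_win_x):
    """W% = P(win 9) + P(X)*P(win X)."""
    return p_win9 + p_x * p_win_x


# Extra-inning win probability for a team at 4.40 R/G and 3.50 RA/G.
print(round(p_win_extras(4.40, 3.50), 4))
```

Truncating at 30 runs per inning drops only a negligible sliver of probability at normal scoring levels; a longer tail may be worth keeping for very high RG inputs.<br /><br />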
In both cases, we’ll use the Tango Distribution constant c = .852 since this works best when looking at a head-to-head matchup, which certainly is the applicable context when discussing W%.<br /><br />I have put together a <a href="https://docs.google.com/spreadsheets/d/e/2PACX-1vRUts4tT5khcH1bBUjw5buCRdmn3xDnJoQEPDDN5xHu91_AUdJKOtIvyVtZwt1ZY2i-xzVH2PITeMJy/pub?output=xlsx">spreadsheet</a> that will handle all of the calculations for you. The yellow cells are the ones that you can edit, with the most important being R (cell B1) and RA (cell L1), which naturally are where you enter the average R/G and RA/G for the team whose W% you’d like to estimate. The other yellow cell is for the c value of the Tango Distribution. Please note that editing this cell will do nothing to change the Enby Distribution parameters--those are fixed based on using c = .852. Editing c in this cell (B8) will only change the estimates of the per inning scoring probabilities estimated by the Tango Distribution. I don’t advise changing this value, since .852 has been found to work best for head-to-head matchups and leaving it there keeps the Tango Distribution estimates consistent with the Enby Distribution estimates. The sheet also calculates Pythagenpat W% for a given exponent (which you can change in cell B15). <br /><br />The calculator supports the same range of values as the one for single team run distribution introduced in part 9--RG at intervals of .25 between 0-3 and 7-15 runs, and at intervals of .05 between 3-7 runs. The vlookup function will round down to the nearest lower R/G value on the parameter sheet. For example, the two highest values supported are 14.75 and 15.00; you can enter 14.93 if you want, but the Enby calculation will be based on 14.75 (the Pythagenpat calculation will still be based on 14.93). 
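<br /><br />The round-down lookup behaves roughly like the following Python sketch. The grid is rebuilt here from the intervals described above rather than read from the spreadsheet, so whether 0 itself appears on the parameter sheet is an assumption, and the function name snap_down is hypothetical:

```python
import math
from bisect import bisect_right

# Supported R/G values in hundredths of a run (integers avoid float drift).
GRID = sorted(
    set(range(0, 301, 25))        # 0.00 to 3.00 in steps of .25
    | set(range(300, 701, 5))     # 3.00 to 7.00 in steps of .05
    | set(range(700, 1501, 25))   # 7.00 to 15.00 in steps of .25
)


def snap_down(rg):
    """Return the largest supported R/G value that does not exceed rg."""
    hundredths = math.floor(rg * 100 + 1e-9)
    idx = bisect_right(GRID, hundredths) - 1
    if idx < 0:
        raise ValueError("R/G must be non-negative")
    return GRID[idx] / 100


print(snap_down(14.93))  # the example from the text, which lands on 14.75
```

Like an approximate-match lookup, this sketch returns the top grid value (15.00) for any input above the supported range.<br /><br />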
Have some fun playing around with it, and next time we’ll look at how accurate the Enby estimate is compared to other W% models.<br /><br />Doubles or Nothing (2018-02-13)<br /><br />In previewing the season to come for any team, it is customary (for good reason) to start by taking a look back at the previous season. Sometimes this is a pleasant or at least unobjectionable experience. On some occasions, though, it forces one to review an absolute disaster of a season, as was turned in by the 2017 Ohio State Buckeyes.<br /><br />OSU went 22-34, which was the lowest W% by a Buckeye club since 1974. Their 8-16 Big Ten record was the worst since 1987. The seven years in which Beals has been at the helm have produced a .564 W%, which, excepting the largely overlapping span of 2008-2014, is the worst since 1986-1992. Beals has taken the program built by Bob Todd, who inherited the late 80s malaise, and driven it right back into mediocrity.<br /><br />Yet merrily he rolls along, untroubled by the pressures of coaching at a school that fired its all-time winningest basketball coach for having two straight NCAA tournament misses, despite compiling a .500 record in Big Ten play over those two seasons. Beals and his unenlightened brand of baseball may be too small fry to draw the ire of AD Gene Smith, but tell that to the track, gymnastics, and women’s hockey coaches who have been pushed out in recent years. Beals’ record of doing less with a historically strong program is unmatched at the University.<br /><br />When one peruses the likely lineup for 2018, it’s hard to think that a turnaround is imminent. 
Stranger things have happened, of course, but eight years into his tenure in Columbus, enough time to have nearly turned over two whole recruiting classes with no overlap, he is still plugging roster holes with unproven JUCO transfers, failing to develop the high school recruits he’s brought in. It’s gotten to the point that if a player doesn’t find a role as a freshman, you can basically write him off as a future contributor.<br /><br />Junior Jacob Barnwell is firmly ensconced at catcher; he was an average hitter last year and appears to have the coach’s seal of approval as a receiver, so he’s golden for playing time over the next two seasons. True freshman Dillon Dingler may be the heir apparent, with junior Andrew Fishel and redshirt freshman Scottie Seymour providing depth.<br /><br />Seniors Bo Coolen and Noah McGowan, both JUCO transfers a year ago, will compete for first base; Coolen was bad offensively in 2017 with no power (.074 ISO), McGowan a little better but still below average. Junior Brady Cherry will move from the hot corner to the keystone, a curious move to this observer; Cherry flashed power as a freshman but was middling with the bat last year. That opens up third for sophomore Connor Pohl, who filled in admirably at second last year but does look more like a third baseman; on a rate basis he was the second most productive returning hitter, although it wasn’t a huge sample size (89 PA and it was very BA-heavy with a .325 BA/.225 SEC). JUCO transfer junior Kobie Foppe is penciled in at shortstop. The utility infielders are both sophomores; Noah West played more as a freshman, getting starts at second base (he didn’t hit at .213/.278/.303) and serving as a defensive replacement for Pohl, while Carpenter had 14 hitless (one walk) PAs. True freshman Aaron Hughes rounds out the roster.<br /><br />Senior Tyler Cowles has the inside track at left field, coming off a first season as a JUCO transfer in which he hit .190/.309/.314 over 129 PA. 
McGowan could also contend for this spot, with backup outfielders Nate Romans and Ridge Winand, both redshirt juniors, also in the mix. JUCO transfer Malik Jones has been anointed as the centerfielder, with true freshman Jake Ruby as an understudy. Right field, along with catcher, is the only spot on the roster that features an established starter at the same position; sophomore Dominic Canzone is OSU’s best returning hitter, although his production was BA-heavy (.343 BA/.205 SEC). Some combination of Cowles, McGowan, and Fishel would appear to have the first crack at DH.<br /><br />OSU’s pitching was an utter disaster last year, partly due to injury and partly because, well, Greg Beals. The only sure bet for the rotation appears to be senior Adam Niemeyer, with junior lefty Connor Curlis and senior Yianni Pavlopoulos (who closed as a sophomore) most likely to join him. Their RAs were 6.23, 5.03, and 7.65 respectively in 2017, although only Curlis had good health. Junior Ryan Feltner pitched poorly last year (7.32 RA over 62 IP despite 8.2 K/9), then went to the Cape Cod league and was named Reliever of the Year. Sophomore Jake Vance had a 6.92 RA over 26 innings, largely thanks to 20 walks, and is the fifth rotation candidate.<br /><br />The perennial bright spot of the pitching staff is senior righty Seth Kinker, who easily led the team with 13 RAA over 58 innings, even getting 3 starts when everything fell to pieces. He figures to be the go-to reliever, with fifth-year senior righties Kyle Michalik, Austin Woody, and Curtiss Irving in middle relief. You’re not going to believe this, but their RAs ranged between 6.85 and 7.94 over a combined 66 innings. Sophomore Thomas Waning will follow Kinker and Michalik in one of Beals’ good traits, which is an affinity for sidearmers; Waning was effective (11 K, 4 W) in a 12 inning injury-shortened debut season. 
Junior Dustin Jourdan will be in the mix as well.<br /><br />Beals also has an affinity for lefty specialists, which he will have to cultivate anew from sophomore Andrew Magno (4 appearances in 2016) and true freshmen Luke Duermit, Griffan Smith, and Alex Theis.<br /><br />The schedule is fairly typical, with the opening weekend (starting Friday) featuring a pair of games with both Canisius and UW-Milwaukee in Florida. The following weekend will see the Bucks in Arizona for the Big Ten/Pac-12 Challenge where they’ll play two each against Utah and Oregon State. Another trip to Florida to play low-level opponents (Nicholls State, Southern Miss, and Eastern Michigan) comes next, followed by a trip to the Carolinas that will feature two games each against High Point, Coastal Carolina, and UNC-Wilmington.<br /><br />Bizarrely, the home schedule opens March 16 with a weekend series against Cal St-Northridge; usually any home dates with non-Northern opponents come later in the calendar. Another non-conference weekend series against Georgetown follows, and then Big Ten play: Nebraska, @ Iowa, @ Penn St, Indiana, Minnesota, Illinois, Purdue, @ Michigan St. Mixed in will be a typically home-heavy mid-week slate (Eastern Michigan, Toledo, Kent St, Ohio University, Miami, Campbell) with road games at Ball St and Cincinnati.<br /><br />As I wrote the roster outlook (which relied on my own knowledge and guesses but also heavily on the season preview released by the athletic department), two things that I already thought I knew struck me even more plainly.<br /><br />1) This team does not appear to be very good. One can construct a rosy scenario where the pitching woes of 2017 were due largely to injury, but we’re talking about pitcher injuries. It takes extra tint on those glasses. It has to be better than last year, when nine pitchers started at least three games, but this team was 22-34; “better” isn’t going to cut it. 
<br /><br />2) The offense has a couple solid returnees, but in the eighth year of Beals’ tenure, major positions on the diamond are still being papered over with JUCO transfers. There is no pipeline of young players getting their feet wet in utility roles and transitioning into starting as you would expect in a healthy program. There are no freshman studs to come in and commandeer lineup positions as you would expect in a strong program. It is quite easy to imagine a scenario in which five of the nine lineup spots are held by first or second-year JUCO transfers.<br /><br />Beals has failed in recruiting, he has failed in player development, and most importantly he has failed to win at the level to which an OSU program should aspire. I’ve devoted many words in previous season previews and recaps (and the hashtag #BealsBall) to his asinine tactics. I won’t rehash that here, but I will end with a quote from the Meet the Team Dinner that program icon Nick Swisher was roped into headlining, which makes one seriously question in what decade Mr. Beals thinks he coaches:<br /><br /><i>“Our goal in 2018 is to hit a lot of doubles,” said Beals on Saturday night.<br /></i>