Wednesday, February 28, 2018

Enby Distribution, pt. 5: W% Estimate

While an earlier post contained the full explanation of the methodology used to estimate W%, it’s an important enough topic to repeat in full here. The methodology is not unique to Enby; it could be implemented with any estimate of the frequency of runs scored per game (and in fact I first implemented it with the Tango Distribution). As I discussed last time, the math may look complicated and require a computer to implement, but the model itself is arguably the simplest conceptually because it is based on the simple logic of how games are decided.

Let p(k) be the probability of scoring k runs in a game and q(m) be the probability of allowing m runs in a game. If k is greater than m, then the team will win; if k is less than m, then the team will lose. If k and m are equal, then the game will go to extra innings. In setting it up this way, I am implicitly assuming that p(k) is the probability of scoring k runs in nine innings rather than in a game. This is not a horrible way to go about it since the average major league game has about 27 outs once the influences that cause shorter games (not batting in the ninth, rain) are balanced with the longer games created by extra innings. Still, it should be noted that the count of runs scored from a particular game does not necessarily arise from an equivalent opportunity context (as defined by innings or outs) of another game.

Given this notation, we can express the probability of winning a game in the standard nine innings as:

P(win 9) = p(1)*q(0) + p(2)*[q(0) + q(1)] + p(3)*[q(0) + q(1) + q(2)] + p(4)*[q(0) + q(1) + q(2) + q(3)] + ...

Extra innings will occur whenever k and m are equal:

P(X) = p(0)*q(0) + p(1)*q(1) + p(2)*q(2) + p(3)*q(3) + p(4)*q(4) + ...
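
The two sums above can be sketched in a few lines of Python. This is only an illustration: the function name `win_and_tie_probs` is made up here, the distributions are assumed to be lists truncated at the same maximum run total, and the toy numbers are invented rather than taken from any real run distribution.

```python
def win_and_tie_probs(p, q):
    """Return (P(win 9), P(X)) from run distributions p (scored) and q (allowed).

    p[k] and q[m] are the probabilities of exactly k or m runs. Both lists are
    assumed to be truncated at the same maximum run total.
    """
    p_win = 0.0
    p_tie = 0.0
    q_below = 0.0  # running total q(0) + ... + q(k - 1)
    for k in range(len(p)):
        p_win += p[k] * q_below  # score exactly k, allow fewer than k
        p_tie += p[k] * q[k]     # both teams finish with k runs
        q_below += q[k]
    return p_win, p_tie


# Toy three-outcome distributions, invented purely for illustration.
p = [0.2, 0.3, 0.5]
q = [0.5, 0.3, 0.2]
p_win, p_tie = win_and_tie_probs(p, q)
```

With these toy inputs the win probability comes out to about .55 and the tie probability to about .29, leaving about .16 for a loss in regulation. Keeping a running total of q avoids re-summing the bracketed terms for every k.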

When the game goes to extra innings, it becomes an inning by inning contest. Let n(k) be the probability of scoring k runs in an inning and r(m) be the probability of allowing m runs in an inning. If k is greater than m, the team wins; if k is less than m, the team loses; and if k is equal to m, then the process will repeat until a winner is determined.

To find the probability of each of the three possible outcomes of an extra inning, we can follow the same logic as used above for P(win 9). The probability of winning the inning is:

P(win inning) = n(1)*r(0) + n(2)*[r(0) + r(1)] + n(3)*[r(0) + r(1) + r(2)] + n(4)*[r(0) + r(1) + r(2) + r(3)] + ...

The probability of the game continuing (equivalent to tying the inning) is similar to P(X) above:

P(tie inning) = n(0)*r(0) + n(1)*r(1) + n(2)*r(2) + n(3)*r(3) + n(4)*r(4) + ...

The probability of winning in extra innings [P(win X)] is:

P(win X) = P(win inning) + P(tie inning)*P(win inning) + P(tie inning)^2*P(win inning) + P(tie inning)^3*P(win inning) + ...

This is a geometric series that simplifies to:

P(win X) = P(win inning)*[1 + P(tie inning) + P(tie inning)^2 + P(tie inning)^3 + ...] = P(win inning)*1/[1 - P(tie inning)] = P(win inning)/[1 - P(tie inning)]
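
As a quick numerical check, the closed form can be compared with a long partial sum of the series. The sketch below uses made-up per-inning probabilities (.25 to win the inning, .45 to tie it), and the function name `p_win_extras` is hypothetical.

```python
def p_win_extras(p_win_inning, p_tie_inning):
    """Closed form of the geometric series for winning in extra innings."""
    return p_win_inning / (1.0 - p_tie_inning)


# Made-up per-inning probabilities, for illustration only.
w, t = 0.25, 0.45

closed_form = p_win_extras(w, t)

# Win in the first extra inning, or tie once and then win, or tie twice...
partial_sum = sum(w * t**i for i in range(200))
```

The two values agree to many decimal places because each additional tied inning shrinks the remaining terms by a factor of P(tie inning).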

This could also be expressed in a very clever way using the Craps Principle if we had also computed P(lose inning); I did it that way last time, but it doesn’t really cut down on the amount of calculation necessary in this case.

Since I want these last few posts to serve as a comprehensive explanation of how to calculate the Enby run and win estimates, it is necessary to take a moment to review how to use the Tango Distribution to estimate the runs per inning distribution. c, of course, is the constant, set at .852 when looking at a head-to-head matchup. RI is runs/inning, which I’ve defined as RG/9:

a = c*RI^2
n(0) = RI/(RI + a)
d = 1 - c*n(0)
n(1) = (1 - n(0))*(1 - d)
n(k) = n(k - 1)*d for k >= 2
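
A minimal Python sketch of these per-inning formulas follows. The function name `tango_inning_dist` and the `max_runs` cutoff are assumptions made for illustration; the formulas themselves extend indefinitely, and d is computed from the zero-run probability n(0).

```python
def tango_inning_dist(rg, c=0.852, max_runs=30):
    """Per-inning run probabilities n(0)..n(max_runs) from the Tango Distribution.

    rg is runs per game and must be positive. max_runs is an arbitrary cutoff
    for this sketch, chosen so the omitted tail is negligible.
    """
    ri = rg / 9.0                        # runs per inning, RI = RG/9
    a = c * ri**2
    n0 = ri / (ri + a)                   # probability of a scoreless inning
    d = 1.0 - c * n0
    probs = [n0, (1.0 - n0) * (1.0 - d)]
    for _ in range(2, max_runs + 1):
        probs.append(probs[-1] * d)      # n(k) = n(k - 1)*d for k >= 2
    return probs


n = tango_inning_dist(4.5)
total = sum(n)
mean_runs = sum(k * prob for k, prob in enumerate(n))
```

For a 4.5 R/G team, the probabilities sum to essentially 1 and the implied mean is essentially .5 runs per inning, which is a useful sanity check that the distribution reproduces the RI it was built from.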

Once we have these three key probabilities [P(win 9), P(X), and P(win X)], the formula for W% is obvious:

W% = P(win 9) + P(X)*P(win X)
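
Putting the pieces together might look like the sketch below. The helper names `win_tie` and `win_pct` are hypothetical, and the input lists are invented toy distributions rather than Enby or Tango output; any per-game and per-inning distributions could be passed in.

```python
def win_tie(scored, allowed):
    """Probability of outscoring, and of tying, an opponent."""
    win = sum(scored[k] * sum(allowed[:k]) for k in range(len(scored)))
    tie = sum(s * a for s, a in zip(scored, allowed))
    return win, tie


def win_pct(p, q, n, r):
    """W% = P(win 9) + P(X)*P(win X).

    p, q are per-game distributions of runs scored and allowed;
    n, r are the per-inning equivalents used for extra innings.
    """
    p_win9, p_x = win_tie(p, q)
    p_win_inning, p_tie_inning = win_tie(n, r)
    p_win_x = p_win_inning / (1.0 - p_tie_inning)
    return p_win9 + p_x * p_win_x


# Toy distributions, invented for illustration only.
game_a = [0.1, 0.2, 0.3, 0.4]
game_b = [0.4, 0.3, 0.2, 0.1]
inning_a = [0.6, 0.3, 0.1]
inning_b = [0.8, 0.15, 0.05]

even = win_pct(game_a, game_a, inning_a, inning_a)
better = win_pct(game_a, game_b, inning_a, inning_b)
```

A team facing an identical opponent should come out at exactly .500, which the `even` case confirms; the `better` case, in which the team scores more than it allows at both the game and inning level, lands well above .500.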

We will use the Enby Distribution to determine p(k) and q(m), and the Tango Distribution to determine n(k) and r(m). In both cases, we’ll use the Tango Distribution constant c = .852 since this works best when looking at a head-to-head matchup, which certainly is the applicable context when discussing W%.

I have put together a spreadsheet that will handle all of the calculations for you. The yellow cells are the ones that you can edit, with the most important being R (cell B1) and RA (cell L1), which naturally are where you enter the average R/G and RA/G for the team whose W% you’d like to estimate. The other yellow cell is for the c value of Tango Distribution. Please note that editing this cell will do nothing to change the Enby Distribution parameters--those are fixed based on using c = .852. Editing c in this cell (B8) will only change the estimates of the per inning scoring probabilities estimated by the Tango Distribution. I don’t advise changing this value, since .852 has been found to work best for head-to-head matchups and leaving it there keeps the Tango Distribution estimates consistent with the Enby Distribution estimates. The sheet also calculates Pythagenpat W% for a given exponent (which you can change in cell B15).
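
For a comparison point outside the spreadsheet, Pythagenpat can be sketched as follows. This assumes the usual formulation, in which the Pythagorean exponent x is total runs per game raised to a power z; the default z = .29 is a placeholder chosen for this example, not necessarily the value in cell B15.

```python
def pythagenpat(rg, rag, z=0.29):
    """Pythagenpat W% from runs scored and allowed per game.

    z = 0.29 is an assumed default for illustration.
    """
    x = (rg + rag) ** z              # exponent grows with the run environment
    return rg**x / (rg**x + rag**x)


even_team = pythagenpat(4.5, 4.5)
good_team = pythagenpat(5.0, 4.0)
```

A team that scores and allows the same number of runs comes out at .500 regardless of z, while the 5.0/4.0 team is comfortably above it.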

The calculator supports the same range of values as the one for single team run distribution introduced in part 9--RG at intervals of .25 between 0-3 and 7-15 runs, and at intervals of .05 between 3-7 runs. The vlookup function will round down to the nearest supported R/G value on the parameter sheet. For example, the two highest values supported are 14.75 and 15.00; you can enter 14.93 if you want, but the Enby calculation will be based on 14.75 (the Pythagenpat calculation will still be based on 14.93). Have some fun playing around with it, and next time we’ll look at how accurate the Enby estimate is compared to other W% models.
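
The round-down lookup can be mimicked outside a spreadsheet with a sorted grid and a binary search. The sketch below is an assumption-laden illustration: the names `GRID` and `lookup_rg` are made up, the grid is rebuilt from the intervals described above (including 0, which may not actually appear on the parameter sheet), and values are stored in hundredths of a run to avoid floating-point surprises.

```python
import math
from bisect import bisect_right

# Supported R/G values in hundredths of a run:
# .25 steps on 0-3, .05 steps on 3-7, .25 steps on 7-15.
GRID = (
    list(range(0, 300, 25))
    + list(range(300, 700, 5))
    + list(range(700, 1501, 25))
)


def lookup_rg(rg):
    """Largest supported R/G value that does not exceed rg."""
    key = math.floor(rg * 100 + 1e-9)  # small epsilon guards against float error
    i = bisect_right(GRID, key) - 1
    if i < 0:
        raise ValueError("R/G must be non-negative")
    return GRID[i] / 100


enby_rg = lookup_rg(14.93)
```

Here `enby_rg` is 14.75, matching the example in the text; anything above 15.00 would simply fall back to 15.00.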

Tuesday, February 13, 2018

Doubles or Nothing

In previewing the season to come for any team, it is customary (for good reason) to start by taking a look back at the previous season. Sometimes this is a pleasant or at least unobjectionable experience. On some occasions, though, it forces one to review an absolute disaster of a season, as was turned in by the 2017 Ohio State Buckeyes.

OSU went 22-34, which was the lowest W% by a Buckeye club since 1974. Their 8-16 Big Ten record was the worst since 1987. The seven years in which Beals has been at the helm have produced a .564 W%, which, excepting the largely overlapping span of 2008-2014, is the worst since 1986-1992. Beals has taken the program built by Bob Todd, who inherited the late 80s malaise, and driven it right back into mediocrity.

Yet merrily he rolls along, untroubled by the pressures of coaching at a school that fired its all-time winningest basketball coach for having two straight NCAA tournament misses, despite compiling a .500 record in Big Ten play over those two seasons. Beals and his unenlightened brand of baseball may be too small fry to draw the ire of AD Gene Smith, but tell that to the track, gymnastics, and women’s hockey coaches who have been pushed out in recent years. Beals’ record of doing less with a historically strong program is unmatched at the University.

When one peruses the likely lineup for 2018, it’s hard to think that a turnaround is imminent. Stranger things have happened, of course, but eight years into his tenure in Columbus, enough time to have nearly turned over two whole recruiting classes with no overlap, he is still plugging roster holes with unproven JUCO transfers, failing to develop the high school recruits he’s brought in. It’s gotten to the point that if a player doesn’t find a role as a freshman, you can basically write him off as a future contributor.

Junior Jacob Barnwell is firmly ensconced at catcher; he was an average hitter last year and appears to have the coach seal of approval as a receiver, so he’s golden for playing time over the next two seasons. True freshman Dillon Dingler may be the heir apparent, with junior Andrew Fishel and redshirt freshman Scottie Seymour providing depth.

Seniors Bo Coolen and Noah McGowan, both JUCO transfers a year ago, will compete for first base; Coolen was bad offensively in 2017 with no power (.074 ISO), McGowan a little better but still below average. Junior Brady Cherry will move from the hot corner to the keystone, a curious move to this observer; Cherry flashed power as a freshman but was middling with the bat last year. That opens up third for sophomore Connor Pohl, who filled in admirably at second last year but does look more like a third baseman; on a rate basis he was the second most productive returning hitter, although it wasn’t a huge sample size (89 PA and it was very BA-heavy with a .325 BA/.225 SEC). JUCO transfer junior Kobie Foppe is penciled in at shortstop. The utility infielders are both sophomores; Noah West played more as a freshman, getting starts at second base (he didn’t hit at .213/.278/.303) and serving as a defensive replacement for Pohl, while Carpenter had 14 hitless (one walk) PAs. True freshman Aaron Hughes rounds out the roster.

Senior Tyler Cowles has the inside track at left field, coming off a first season as a JUCO transfer in which he hit .190/.309/.314 over 129 PA. McGowan could also contend for this spot, with backup outfielders Nate Romans and Ridge Winand, both redshirt juniors, also in the mix. JUCO transfer Malik Jones has been anointed as the centerfielder, with true freshman Jake Ruby as an understudy. Right field along with catcher is the only spot on the roster that features an established starter at the same position; sophomore Dominic Canzone is OSU’s best returning hitter, although it was BA heavy (.343 BA/.205 SEC). Some combination of Cowles, McGowan, and Fishel would appear to have the first crack at DH.

OSU’s pitching was an utter disaster last year, partly due to injury and partly because, well, Greg Beals. The only sure bet for the rotation appears to be senior Adam Niemeyer, with junior lefty Connor Curlis and senior Yianni Pavlopoulos (who closed as a sophomore) most likely to join him. Their RAs were 6.23, 5.03, and 7.65 respectively in 2017, although only Curlis had good health. Junior Ryan Feltner pitched poorly last year (7.32 RA over 62 IP despite 8.2 K/9), then went to the Cape Cod league and was named Reliever of the Year. Sophomore Jake Vance had a 6.92 RA over 26 innings, largely thanks to 20 walks, and is the fifth rotation candidate.

The perennial bright spot of the pitching staff is senior righty Seth Kinker, who easily led the team with 13 RAA over 58 innings, even getting 3 starts when everything fell to pieces. He figures to be the go-to reliever, with fifth-year senior righties Kyle Michalik, Austin Woody, and Curtiss Irving in middle relief. You’re not going to believe this, but their RAs ranged between 6.85 and 7.94 over a combined 66 innings. Sophomore Thomas Waning will follow Kinker and Michalik in one of Beals’ good traits, which is an affinity for sidearmers; Waning was effective (11 K, 4 W) in a 12 inning injury-shortened debut season. Junior Dustin Jourdan will be in the mix as well.

Beals also has an affinity for lefty specialists, which he will have to cultivate anew from sophomore Andrew Magno (4 appearances in 2016) and true freshmen Luke Duermit, Griffan Smith, and Alex Theis.

The schedule is fairly typical, with the opening weekend (starting Friday) featuring a pair of games with both Canisius and UW-Milwaukee in Florida. The following weekend will see the Bucks in Arizona for the Big Ten/Pac-12 Challenge where they’ll play two each against Utah and Oregon State. Another trip to Florida to play low-level opponents (Nicholls State, Southern Miss, and Eastern Michigan) comes next, followed by a trip to the Carolinas that will feature two games each against High Point, Coastal Carolina, and UNC-Wilmington.

Bizarrely, the home schedule opens March 16 with a weekend series against Cal St-Northridge; usually any home dates with non-Northern opponents come later in the calendar. Another non-conference weekend series against Georgetown follows, and then Big Ten play: Nebraska, @ Iowa, @ Penn St, Indiana, Minnesota, Illinois, Purdue, @ Michigan St. Mixed in will be a typically home-heavy mid-week slate (Eastern Michigan, Toledo, Kent St, Ohio University, Miami, Campbell) with road games at Ball St and Cincinnati.

As I wrote the roster outlook (which relied on my own knowledge and guesses but also heavily on the season preview released by the athletic department), two things that I already thought I knew struck me even more plainly.

1) This team does not appear to be very good. One can construct a rosy scenario where the pitching woes of 2017 were due largely to injury, but we’re talking about pitcher injuries. It takes extra tint on those glasses. It has to be better than last year, when nine pitchers started at least three games, but this team was 22-34; “better” isn’t going to cut it.

2) The offense has a couple of solid returnees, but in the eighth year of Beals’ tenure, major positions on the diamond are still being papered over with JUCO transfers. There is no pipeline of young players getting their feet wet in utility roles and transitioning into starting as you would expect in a healthy program. There are no freshman studs to come in and commandeer lineup positions as you would expect in a strong program. It is quite easy to imagine a scenario in which five of the nine lineup spots are held by first or second-year JUCO transfers.

Beals has failed in recruiting, he has failed in player development, and most importantly he has failed to win at the level to which an OSU program should aspire. I’ve devoted many words in previous season previews and recaps (and the hashtag #BealsBall) to his asinine tactics. I won’t rehash that here, but I will end with a quote from the Meet the Team Dinner that program icon Nick Swisher was roped into headlining, which makes one seriously question in what decade Mr. Beals thinks he coaches:

“Our goal in 2018 is to hit a lot of doubles,” said Beals on Saturday night.