Walk Like a Sabermetrician: June 2011

Tuesday, June 28, 2011

Once in a Lifetime

What is the probability that your favorite team will win the World Series during your remaining lifetime? Obviously, the best estimate of this probability depends on the expected quality of your team each year, as well as a specific estimate of your mortality, which is dependent upon any number of factors. This post generalizes this question to estimate the probability of an average team winning the Series during an average person's lifetime.

Before delving into that, I must note that the subject matter is simultaneously frivolous (from a rigid sabermetric standpoint) and a tad morbid. After all, no one really wants to think about their own mortality, and thinking about the chances that your team will break your heart until your heart breaks isn't exactly fun.

I will assume here that the probability of winning the World Series each year takes one of three values, which are independent from year-to-year: 1/30 (all teams have an equal chance), 1/60 (50% as likely to win as average), or 1/15 (twice as likely to win). The last value is also essentially equal to the average chance of winning a pennant (with 14 and 16 team leagues it splits the difference).

What's trickier is figuring the probability of survival at each age. I have used a life table from the Social Security Administration as the basis for these estimates. The table is a projected life table for Americans born in 1980. As such, it doesn't directly apply to people in other age groups, but for the sake of simplicity I have assumed that it does. The effect of this will be to overstate the life expectancy and championship probabilities for people born prior to 1980, as the SSA models assume that the force of mortality will decrease.

Even if the assumptions about mortality prove to be accurate, the probability of your team winning will probably be lower than assumed here thanks to expansion, which may not be imminent but will almost certainly occur at some point.

You can skip ahead a couple of paragraphs if that explanation of the life functions satisfies your curiosity, because this has absolutely nothing to do with baseball. The charts show data for age at five year intervals, starting at age five and running through age 100. They are based only on the male life expectancy chart; females have higher life expectancy, but most fans are male and presenting two separate charts or combining the two would add a lot of trouble to a silly exercise.

There are three pieces of data presented in the chart for each age:

1. e(x)--curtate life expectancy

The curtate life expectancy only considers full-year survival. For example, the life table tells us that at age five there are 98,357 survivors. At age six, there are 98,324. The 33 deaths between age five and six do not contribute to the curtate life expectancy for age five. The complete life expectancy does include partial-years and is included in the life table, but requires the use of assumptions about mortality at fractional ages to estimate. The curtate expectancy is easily calculated from the life table:

e(5) = l(6)/l(5) + l(7)/l(5) + l(8)/l(5) + ... + l(117)/l(5)

= 98324/98357 + 98294/98357 + 98264/98357 + ... + 1/98357 = 73.38
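
For anyone who wants to reproduce the arithmetic, here is a minimal sketch of the curtate calculation in Python. The function name and the dictionary layout are my own assumptions for illustration. The four l(x) values are the ones quoted above, and the table is deliberately cut off at age 8, so the output is nowhere near the full-table figure of 73.38.

```python
def curtate_life_expectancy(survivors, age):
    """Return e(age): the sum of l(age + k) / l(age) over all later ages in the table."""
    base = survivors[age]
    return sum(count / base for later_age, count in survivors.items() if later_age > age)


# l(5) through l(8) are the values quoted in the text. The table is cut off at
# age 8 for illustration only, so this is NOT the real e(5).
truncated_table = {5: 98357, 6: 98324, 7: 98294, 8: 98264}
e5_truncated = curtate_life_expectancy(truncated_table, 5)
print(round(e5_truncated, 4))
```

Feeding in the complete life table (ages 5 through 117) in the same format should give the full curtate expectancy.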

2. Exp Wins--expected championships won by the team during one's lifetime

The probability of a team winning in any given year is assumed to be one of the constants explained above (1/15, 1/30, 1/60). The probability of a person age x surviving to age x + 1 is l(x + 1)/l(x). The product of these two is the probability that a person age x survives to see their team win a title in the year of age x + 1.

So the expected wins for a life age 5 is figured thusly (assuming a 1/30 probability of team winning):

1/30*l(6)/l(5) + 1/30*l(7)/l(5) + 1/30*l(8)/l(5) + ... + 1/30*l(117)/l(5)

= 1/30*e(5)

The assumption here is that one needs to survive the full year in order to see one's team win during that year. To put it in baseball terms, we could say that these calculations assume that each person is born on October X and the World Series always concludes on October X. I'm also assuming that survival means one is able to enjoy a victory by their team, but sadly this is not always the case.

The use of curtate functions makes computations easier but it also undersells the life expectancy and the championship expectations and probabilities a little bit.

3. >=1 win--the probability that the team will win at least one championship during one's life

For any given age x, the probability that one's team wins and that they survive to see it is 1/30*l(x + 1)/l(x). The complement of this result is the probability that the team does not win the title in this year plus the probability that the team does win the title but life (x) does not survive to see it.

Multiplying all of the complements for a lifetime results in the probability that the team never wins while the subject survives. The complement of this is therefore the probability that the team does win during the lifetime of (x).
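
The two championship figures can be sketched the same way. This is only an illustration under the assumptions already stated (a constant yearly win probability and full-year survival); the function name and the truncated table are hypothetical stand-ins, not the full SSA table behind the charts.

```python
def title_outlook(survivors, age, p_win):
    """Return (expected titles seen, probability of seeing at least one title)."""
    base = survivors[age]
    expected_wins = 0.0
    p_never = 1.0
    for later_age in sorted(a for a in survivors if a > age):
        # Chance the team wins that year AND the fan survives to see it
        p_see_title = p_win * survivors[later_age] / base
        expected_wins += p_see_title
        # Multiply the complements to get the chance of never seeing a title
        p_never *= 1.0 - p_see_title
    return expected_wins, 1.0 - p_never


# Truncated at age 8 for illustration; only these four l(x) values appear in the text.
truncated_table = {5: 98357, 6: 98324, 7: 98294, 8: 98264}
exp_wins, p_at_least_one = title_outlook(truncated_table, 5, 1 / 30)
print(round(exp_wins, 4), round(p_at_least_one, 4))
```

The expected wins figure is just the win probability times the curtate expectancy, while the at-least-one probability is always a bit smaller because multiple titles are only counted once.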

Here is the chart based on an average (1/30) chance of winning the title each year:

This paints a pretty rosy picture for most people. Keeping in mind that the life functions are based on expected mortality for those born in 1980, from fifty and younger the expected number of titles is still one or greater, and from 60 and younger the probability of seeing a title is still greater than 50%. If you are indoctrinating your ten-year-old son into fandom of your favorite team, you have about a 90% chance of not making his fan life unrewarding, so you can feel good about that.

Of course, the odds aren't as favorable if your team is only half as likely to win as an average franchise:

Still, even if your team can only expect to win once every sixty years, there's no reason to despair. If you are only concerned about winning the pennant, and your team can manage an average probability of doing so, you can feel pretty good regardless of your age:

Sunday, June 26, 2011

Eric Fryer, #50

Today (Sunday June 26, 2011) may have been the best day for OSU major leaguers in years. Nick Swisher hit a home run in the Yankees’ win over the Rockies, while Cory Luebke got his first start of the year for San Diego (on the roster all season, he’s made 29 relief appearances) and pitched five shutout innings, allowing one hit and two walks while fanning six. The third Buckeye in the majors had the least impressive game (0-3 with a walk, plus getting flattened by David Ortiz on a play at the plate), but his is the most significant performance of the day since it was his major league debut.

Eric Fryer became the fiftieth Buckeye to play in the majors (although the unconfirmed list I maintain now numbers 59) when he started for Pittsburgh at catcher. Fryer was the starting catcher for OSU from the day he set foot on campus, playing from 2005-07 until he was a 10th round pick by Milwaukee. The Brewers dealt him to the Yankees for Chase Wright in the 2009 pre-season, and they passed him along to the Pirates that summer for Eric Hinske. Both Milwaukee and New York tried him in the outfield as a minor leaguer, so his long-term prognosis as a catcher is unclear, and to have a real major league career he’ll almost certainly have to stick behind the plate. But Pittsburgh has shown a commitment to using him as a catcher, and a strong minor league showing in 2011 (.320/.408/.520 in 201 PA between AA and AAA) opened the door for a major league roster spot.

Monday, June 20, 2011

Freshman Blues

I really am struggling with how to write this post, because it will probably come off as fairly critical, and that is far from my intent. I support the OSU baseball program fully, and any time I make a prediction or say anything less than fully complimentary of a player, I would like nothing more than to be proven wrong.

On top of that, it’s also necessary to note that the length of the college season is such that statistics are compiled over much smaller samples and with much less predictive value than what I am accustomed to when looking at the major leagues. Of course, caution must be taken with regard to sample sizes and predictability when dealing with statistics of any kind, but it’s more important than usual here. More weight and deference must be given to observation and to the coaches who see the players in practice every day.

A final disclaimer that I need to throw in is that I didn’t really expect the 2011 Bucks to be very good. I thought qualifying for the Big Ten Tournament would be a challenge, and Ohio wound up as the #4 seed, so they actually exceeded my expectations, at least in conference play. It’s difficult to put too much blame on Coach Greg Beals, as he was in his first year and had nothing to do with recruiting the players he had to work with.

OSU lost its first two games of the season, then trailed St. John’s in the third game 7-0 before rallying for an 8-7 win. The team was 6-5 against a not particularly tough schedule before embarking on a California trip that saw them be hammered to the tune of 1-5. From that point on, the team had to work hard to stay near .500.

Mid-week games against non-conference opponents saw OSU go 5-4, which was actually an improvement over 2010 (and included a win over Oklahoma State in a rare matchup with a national program in a game of that type). In Big Ten play, the Buckeyes went 13-11 to earn the #4 seed in the conference tournament. They defeated #5 Minnesota, lost to #1 Illinois, and lost a rematch with Minnesota to bow out with a final season record of 26-27. It was the first sub-.500 season for Ohio State since 1987--the season before Bob Todd arrived in Columbus. It wasn’t just unadjusted record that suggested it was a very poor team by OSU standards--since Boyd Nation’s ISR ratings started in 1997, the lowest national rank for the program had been #130 in 1998, but in 2011 OSU could manage no better than #160.

OSU’s .491 W% ranked seventh in the Big Ten (Purdue led at .649); their .464 EW% ranked seventh (Purdue led at .668); and their .477 PW% ranked sixth (MSU led at .643). The Big Ten averaged 5.38 runs scored and 5.30 runs allowed per game; OSU ranked fifth with 5.6 runs scored and ninth with 6.0 runs allowed. Ohio had a .930 modified Fielding Average, which ranked eighth (the conference average was .939). OSU was also eighth in DER, converting approximately 66% of balls in play into outs compared to an average of 67%.

Offensively, the Buckeyes were almost perfectly average in the major components of offense: average (.280 compared to a .279 B10 average); walks (.100 per at bat versus .102); and power (.104 isolated power matched the mean). Looking at the individual players, catching was the biggest weakness. Greg Solomon got the vast majority of the playing time, but was ice cold from the mid-point of the B10 schedule on, winding up at -5 RAA for the season. Solomon’s strikeout to walk ratio was a dreadful 42/4. First baseman Josh Dezse copped B10 Freshman of the Year honors thanks to his team-leading +15 RAA, fueled by a team-leading .332 BA but a decent .280 SEC as well. Second baseman Ryan Cypret hit .323, which despite a .208 SEC was enough to finish second on the team with +9 RAA. Third baseman Matt Streng drew just five walks, leaving him with a putrid .138 SEC despite a solid .115 ISO, and was two runs below average as a result. Tyler Engle bounced back to something resembling his sophomore form, drawing enough walks to make himself an average overall offensive player. Streng and Engle were both seniors.

Left field began as a platoon between David Corna and Joe Ciamocco, but Corna’s doubles power (he led the team with 16) coupled with Ciamocco’s failure to hit at all quickly left the former getting all the playing time. Unfortunately, Corna didn’t hit for a high enough average or draw enough walks to rank any better than average (-1 RAA). Freshman center fielder Tim Wetzel showed decent promise, but his lack of power (just three doubles and two triples in 176 at bats) left him at -4 RAA. Senior right fielder Brian DeLucia didn’t hit for power as he had in the past, and thus ended up at just +2 RAA with a .276/.359/.381 line. DH Brad Hallberg led the team with 28 walks, which was enough to make him an average contributor despite hitting .254 with an ISO of just .042. The Bucks did not have much depth, and there were no particularly notable performances by non-regulars; they combined for a .202/.248/.287 line in 137 plate appearances.

On the pitching side, senior Drew Rucinski was the clear ace of the staff, leading the team in innings (82), Run Average (4.29), and RAA (+9), with an even better 3.86 eRA. Sophomore Brett McKinney settled into the Saturday starter role, and turned in average results (5.18 RA) with a promising 49/20 K/W ratio. Freshman Greg Greve got better as the Big Ten season went on, but his overall performance (6.72 RA, -11 RAA) left much to be desired for a third starter. Fellow freshman John Kuchno got a number of mid-week starting assignments (10 appearances, 7 starts).

McKinney and Greve were bumped up in the pecking order because of a complete collapse in performance from senior Dean Wolosiansky, penciled in as the #2 but quickly shuffled off to low-leverage work with an 8.46 RA in 55 innings for -19 RAA.

Beals brought a major change in philosophy to the bullpen. Todd’s teams tended to have a closer, a middle reliever that he trusted, and a bunch of other guys. Beals managed more as a professional manager, with a closer and a number of pitchers who he mixed and matched to get the platoon advantage. Left-hander Andrew Armstrong was used in a LOOGY-esque capacity (29 innings in 33 appearances, with an essentially average 4.91 RA despite 39/16 K/W), a new experience for OSU baseball fans. Fellow southpaw Theron Minimum was not used in as much of a matchup role (he started once and pitched 29 innings in 21 appearances), and was used in lower leverage situations than Armstrong. He recorded a 6.28 RA for -3 RAA.

The situational usage of the lefties was no doubt enhanced by the fact that the Bucks’ top two right-handed setup men were somewhat underhanded. Senior Jared Strayer raised his arm slot to 3/4 this year, while junior former catcher David Fathalikhani delighted this fan with his more pure sidearm approach. Both turned in similar performances, as Strayer worked 29 innings in 27 outings with a 4.66 RA (+2 RAA); Fathalikhani matched that RAA by working 26 innings in 26 appearances with a 4.56 RA.

The other reliever of note (Brian Bobinski and Paul Geuy worked 21 innings between them) was Dezse, the freshman first baseman who doubled as a closer. Dezse throws hard, in the mid-to-high nineties, but he left a lot to be desired in the polish department with 32/22 K/W in 28 IP (plus two hit batters and eight wild pitches) with a 7.48 RA. I’ll have more to say about Dezse in a moment, which will be a little critical, but I intend that to be aimed at the way he was utilized, not at the pitcher himself.

Given that OSU did not return much talent and that could not be pinned on Beals, the most interesting aspect to evaluation of the new coach was his strategy. What I saw I did not like. OSU *seemed* to make more baserunning mistakes than usual, including a couple of horrible little league style delayed double steals of home that were sniffed out with ease by the opposition. OSU’s SB% was 63%, an improvement from last year’s 56%, but well below 2009 (82%) and 2008 (74%). More disturbing was that OSU’s stolen base attempt frequency (measured here as (SB + CS)/(S + W)) increased to 9.9% (2008-10: 11.9%, 8.9%, 7.4%).

Beals also called for many more sacrifices than Todd had over the last three seasons. While he had a less potent offense to work with, the difference was stark enough to suggest that perhaps Beals is a bigger believer in the bunt than Todd. OSU’s SH/(S + W) was .073, while Todd’s final three teams had ratios of .028, .047, and .033.
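
The rate stats used in the last two paragraphs are simple ratios, sketched below. I am reading S as singles and W as walks, and the counts passed in are invented purely to show the arithmetic; they are not OSU's actual totals.

```python
def stolen_base_pct(sb, cs):
    """Share of steal attempts that succeeded."""
    return sb / (sb + cs)


def attempt_frequency(sb, cs, singles, walks):
    """Steal attempts per time on first base, approximated as (SB + CS)/(S + W)."""
    return (sb + cs) / (singles + walks)


def sacrifice_rate(sh, singles, walks):
    """Sacrifice hits per time on first base, SH/(S + W)."""
    return sh / (singles + walks)


# Hypothetical season totals, made up for illustration only
sb, cs, sh, singles, walks = 30, 18, 35, 350, 150
print(stolen_base_pct(sb, cs))                    # 0.625
print(attempt_frequency(sb, cs, singles, walks))  # 0.096
print(sacrifice_rate(sh, singles, walks))         # 0.07
```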

What really befuddled this observer, though, was Beals’ handling of Dezse. For one, Beals never gave him a chance to start. One of the most bizarre discussions by a TV announcer I can recall occurred during the Buckeyes’ Big Ten Tournament game against Illinois. The announcer lamented the difficult job college coaches have in balancing player development with winning games, and his example was that Dezse could develop more starting but was more valuable to his team as a reliever.

In fairness, it might have been for the best that Dezse was limited to short mound outings, because he showed little command and really was not effective. There were a couple games in which Dezse came in and absolutely blew away the opposition, but he also had several games that can only be described as meltdowns.

On April 3, OSU led Northwestern 14-10 entering the top of the ninth. Dezse gave up four hits and a walk, was bailed out by an idiotic piece of Wildcat baserunning…and still yielded a game-tying three run blast. The Bucks did pull it out with a tally in the bottom of the frame. On May 14, OSU led Iowa 8-4 entering the bottom of the ninth. Dezse gave up three hits, three walks, and the lead as the Hawkeyes wound up prevailing in ten.

But the costliest such appearance came in the second round of the Big Ten Tournament. Fresh off a win over #5 Minnesota, #4 OSU had a chance to knock off #1 Illinois and stay in the winner’s bracket. Dezse was summoned to start the eighth with OSU up 4-1. He issued two walks and a wild pitch in the eighth, but kept the Illini off the board. In the ninth he surrendered a leadoff double, uncorked a wild pitch, yielded a single, got a groundout, uncorked a second wild pitch, gave up another single, got the second out on a fly to left, then issued a walk and a single that tied the game. Andrew Armstrong relieved him but his first pitch was hit back up the middle for a single that essentially ended Ohio’s season (the Bucks bowed out with a whimper against Minnesota the next day).

Beals *seemed* to be locked into the mindset that he had to have a closer, and that it had to be his hardest throwing option, regardless of whether he was able to throw strikes or pitch efficiently. Dezse may have outstanding potential, and I’ll grant that it’s possible that he truly was the best option--but as a fan, it was beyond frustrating to watch the same movie play out three times.

I do not want to close this on a down note, so it’s important to point out that Beals has by all accounts done a terrific job in recruiting, illustrated by the fact that two OSU signees went in the first ten rounds of the June draft. There’s a lot more to the job of being a major league manager than just strategy, and that applies even more in a collegiate setting in which procuring talent is also the coach’s responsibility. The true test of Beals’ success will not be bunt frequencies, but wins and losses, and that test begins in 2012.

Friday, June 17, 2011

Great Moments in Yahoo! Box Scores

Thanks to a reader for pointing this one out.

Tuesday, June 14, 2011

Comments on Bill James' “Solid Fool’s Gold”

For the past three seasons, Bill James had published an annual book called the Bill James Gold Mine. The book included a sampling of some of the material available to subscribers of his Bill James Online website, including some unique split data (unique in the sense that it’s not commonly in print on actual pieces of paper), like breakdowns of pitches thrown to left-handed and right-handed batters. There were little boxes with “nuggets” (gold is a theme needless to say) pointing out various oddities.

Those elements of the book really took up a lot of space, but were woefully incomplete (they clearly weren’t intended to be complete, but the point is that the book had no utility as a reference). The most interesting aspect of the book was that it reprinted a number of full-length essays that James had written for his website over the course of the year. I generally enjoyed those essays as you will see if you look back at my comments on the previous editions of the book.

This year, there was no Gold Mine. Instead, ACTA and James released a slimmer, smaller volume entitled Solid Fool’s Gold, which includes only essays. I didn’t bother to count, but I’d guess that the new book has about as many essays as the Gold Mine, it is sold for a lower price, and personally I won’t miss the hodgepodge of charts that much.

If a fairly quick read of non-technical essays by Bill James is something you think you’d enjoy, you’ll probably like Solid Fool’s Gold, unless you already subscribe to Bill James Online. The new format is much better, as it includes a bunch of essays that you could pick up and read in five years rather than some of the more limited shelf-life aspects of the Gold Mine and it does it more cheaply and compactly. I certainly enjoyed it, which you should keep in mind as I now launch into a more critical review of some specific essays in the book.

The essay that has probably gotten the most attention (outside of the much-panned essay on Shakespeare and Topeka which was published on Slate) is called “Minor League Pyramid”, and it includes James’ outline of a way to reform the minor league structure to make it resemble a pyramid rather than a tube (his description) as it does now. I won’t repeat his argument here, but there are a couple of points that I feel strongly about:

1. I agree with James that talent is choked out of the game by the limited number of openings at the entry level. This is offset somewhat by the existence of college baseball as an alternative means to improve baseball, but scholarship restrictions make it a less reliable means of keeping quality talent engaged than college football or basketball. The scholarship restrictions also leave college baseball as next to useless for attracting low-income players to the game.

2. The proposal that James offers includes limits on the rate at which a prospect can be advanced through the minors, with the bottom line being that a player could not reach the majors until he’d played three seasons in the minors. This is impossible to square with the existence of college baseball, and it also removes the illusion of a meritocracy which I think is very important, even if it is only an illusion. We all know that teams play service time games, and that there are few twenty-year-olds ready to be major league contributors anyway, but the potential for a wunderkind to reach the majors, even if there are only a handful each year, is something that should be preserved.

3. James doesn’t seem to think his system would have much of an impact on the ability of baseball to attract talented athletes with other options. In fact, he claims that a minor league pyramid would reduce the pressure on signing bonuses. While I’m sure MLB CFOs would like that (as they would like the destruction of college as an alternative path), I can’t fathom how it wouldn’t make MLB a much less attractive option.

He doubles down on this by saying that it would reduce “pressure” on teams to scout internationally to find talent, since they would have a larger supply of homegrown players. I can’t for the life of me understand why that would be considered a positive. The piece does make some good points, but that one is a real head-scratcher.

Another essay in the book is called “Stink-O-Meter”; it discusses a fairly simple method of tracking the persistency of losing for a franchise. The article is good in that it reinforces something that the average fan with no historical perspective constantly needs to be reminded--the state of even sorry franchises like the Pirates and Royals is nothing like that of the terrible franchises of the game’s past.

The article is way too long, though--James feels compelled to run a chart every few paragraphs listing the top five or ten losing teams at a given moment in time. This works better online than in print where it just wastes space, but it also helped me to see what I think is a pattern in James’ more recent work. James has realized that it’s more difficult (not just for him, but for anyone) to produce cutting edge technical work. Rather than introducing more rigorous means of analysis, James has decided to play show-and-tell; his more recent essays are filled with tables that in the past would have been left to the imagination or determination of the reader. I could be off base, but it seems that increased comprehensiveness represents James’ attempt to keep up with the Joneses.

Another essay is a reprinting of a speech James gave called “Battling Expertise with the Power of Ignorance”. It’s a good read, but there was a portion that mentioned Pythagorean record and Runs Created that, shockingly, I can’t help but comment on.

James describes the two methods as the best-known of the “large number of heuristic rules” he developed during what could loosely be defined as his Abstract years. About the Pythagorean theorem, James said: “Later research has demonstrated that it works better still if you modify the exponent for the level of scoring”.

The relationship between the slope of a run-to-win converter and the run environment has been known for many years (dating back at least to Pete Palmer), but Clay Davenport’s Pythagenport was the first well-known modification to James’ Pythagorean theorem. Later, this was refined further into Pythagenpat by recognizing that the minimum theoretical exponent is one.

James freely acknowledges the refinements to Pythagorean and in fact has used Pythagenpat in at least one of his own studies. That only makes it all the more strange that he continues to cling to Runs Created, even when RC has been demonstrated to be a less accurate tool than Pythagorean with a fixed exponent. RC is subject to complete meltdown under theoretical extreme conditions; while Pythagorean incorrectly handles the known point of 1 RPG, it correctly imposes a range of [0, 1] on all of its estimates.

Discussing RC, James made no mention of subsequent work on other run estimators. Understand that I am not trying to claim that he had any obligation to do so in the context of this speech--only that his continuing clinging to RC while recognizing other refinements to his original tools grows more bizarre as time passes. I suppose one could argue that at least the Pythagorean refinements maintain the original R^x/(R^x + RA^x) model, and James always pointed out that an exponent other than two could result in more accurate estimates. In any event, it’s extremely hard for me not to comment on a run estimator when the opportunity arises.

While James recognized the existence of variable exponent refinements to Pythagorean record, he unfortunately missed a golden opportunity to utilize them in another essay in the book. There is an article in which James examines the performance of starting pitchers when supported by X runs--basically, an attempt to examine the mystical phenomenon of “pitching to the score”. I won’t steal his thunder by discussing his conclusions, but I will point out a methodological shortcoming in his approach.

When Whitey Ford’s teams scored one run with him pitching, their record (not Ford’s record) was 10-28. James converts this to an effective rate of runs allowed using Pythagorean math. In this case we know that Ford’s teams scored 38 runs (one for each game), so the equivalent number of runs Ford allowed to produce a Pythagorean record of 10-28 is x in the following equation:

10/(10 + 28) = 38^2/(38^2 + x^2)

This eventually simplifies to sqrt(L/W)*R, or sqrt(28/10)*38 = 63.6. With two runs, Ford’s teams were 19-22, which is equivalent to sqrt(22/19)*82 = 88.2 runs. Adding these up and dividing by the total number of games produces an “Effective Runs Allowed Rate” for Ford in games in which his team scored one or two runs: (63.6 + 88.2)/(38 + 41) = 1.92. Continuing in this manner for scoring three, four, five, … runs (while ignoring shutouts which are always losses), James has a measure of pitching effectiveness given the level of offensive support on a discrete game-by-game basis.
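
A minimal sketch of that conversion, for readers who want to check the arithmetic. The function name is my own; the only inputs are the Ford records quoted above. Solving W/(W + L) = R^x/(R^x + RA^x) for RA gives RA = R*(L/W)^(1/x), which reduces to the square root form when x = 2.

```python
def equivalent_runs_allowed(wins, losses, runs_scored, exponent=2.0):
    """Solve W/(W + L) = R^x / (R^x + RA^x) for RA."""
    return runs_scored * (losses / wins) ** (1.0 / exponent)


# Ford's teams: 10-28 when scoring one run (38 runs), 19-22 when scoring two (82 runs)
one_run = equivalent_runs_allowed(10, 28, 38)
two_run = equivalent_runs_allowed(19, 22, 82)
effective_rate = (one_run + two_run) / (38 + 41)

print(round(one_run, 1), round(two_run, 1), round(effective_rate, 2))
```

Keeping the exponent as a parameter makes it easy to swap in something other than 2.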

However, the use of a fixed exponent severely distorts things by essentially assuming an average run scoring environment (an exponent of 2 corresponds to a RPG of around 10.9 using Pythagenpat), when we know the scoring output of one of the teams involved. If one team scored only one run, the expected RPG is going to be lower than average.
We could assume that an average number of runs would be scored by the other team involved in the game, and instead say that the RPG is 5.5, and the Pythagorean exponent should be around 1.64. In that case, the equivalent runs allowed would be (28/10)^(1/1.64)*38 = 71.2 runs, a 12% difference from James’ estimate.
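
Here is a rough sketch of the variable-exponent version. Pythagenpat sets the exponent to RPG^z; the z of 0.29 used here is an assumption on my part, chosen because it reproduces the correspondence quoted above between an exponent of 2 and an RPG of about 10.9. The function names are hypothetical.

```python
def pythagenpat_exponent(rpg, z=0.29):
    """Pythagenpat exponent for a given runs-per-game level (z = 0.29 is assumed)."""
    return rpg ** z


def equivalent_runs_allowed(wins, losses, runs_scored, exponent):
    """Solve W/(W + L) = R^x / (R^x + RA^x) for RA."""
    return runs_scored * (losses / wins) ** (1.0 / exponent)


# RPG level at which the assumed Pythagenpat exponent equals exactly 2
rpg_for_two = 2 ** (1 / 0.29)

# One run of support plus an assumed average 4.5 from the opponent = 5.5 RPG
x = pythagenpat_exponent(5.5)
ra_equiv = equivalent_runs_allowed(10, 28, 38, x)

print(round(rpg_for_two, 1), round(x, 2), round(ra_equiv, 1))
```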

That approach assumes that we know nothing about the “other” team’s run scoring rate--but of course, we know a great deal about it, because we know the identity of the starting pitcher: Whitey Ford. For his career, Whitey Ford had a Run Average of 3.14 and averaged 6.94 innings/start in a league that averaged about 4.31 runs/game, so we could estimate that his team’s RA is (6.94*3.14 + (9 - 6.94)*4.31)/9 = 3.41, and that the expected RPG for a game in which his offense scores one run is 4.41, producing a Pythagorean exponent of 1.54 and (28/10)^(1/1.54)*38 = 74.2 run equivalent. This new estimate is approximately 17% higher than James’ original estimate.
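
The customized estimate can be sketched along the same lines. The starter is assumed to allow runs at his career rate for his average innings per start, with a league-average bullpen covering the rest of a nine-inning game. As before, the Pythagenpat z of 0.29 and the function name are my assumptions.

```python
def blended_team_ra(starter_ra, innings_per_start, league_ra, game_innings=9.0):
    """Team runs allowed per game: starter's rate for his innings, league rate for the rest."""
    bullpen_innings = game_innings - innings_per_start
    return (innings_per_start * starter_ra + bullpen_innings * league_ra) / game_innings


# Whitey Ford: 3.14 RA, 6.94 innings/start, league average of 4.31 runs/game
ford_team_ra = blended_team_ra(3.14, 6.94, 4.31)

# Expected RPG when the offense scores exactly one run
rpg = ford_team_ra + 1
exponent = rpg ** 0.29  # Pythagenpat with an assumed z of 0.29

# Equivalent runs allowed for a 10-28 record with 38 runs scored
ra_equiv = 38 * (28 / 10) ** (1 / exponent)

print(round(ford_team_ra, 2), round(exponent, 2), round(ra_equiv, 1))
```

The same function should work for any other pitcher given his career RA, innings per start, and league scoring level.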

The good news is that when you extend things across the entire spectrum of the run distribution, much of the distortion is canceled out. James presents complete breakouts for several pitchers, but the two of historical interest are Ford and Tom Seaver. Setting aside the adjustments he introduces to smooth the data and restate the effective runs allowed rate on the actual RA scale, let me just run the crude totals for those two under three assumptions: the James approach of a Pythagorean exponent of 2, an assumption that the RPG at each scoring level is 4.5 plus the number of runs the pitcher’s team scored, and the customized type assumption I described for Ford above (Seaver had a career RA of 3.15 and averaged 7.38 innings/start in a league that averaged approximately 4.11 runs per game, resulting in a customized team RA of 3.32 runs/game):

These differences are small, in the neighborhood of 1%, and thus not worth getting too worked up about. However, it’s important to keep in mind that fixed Pythagorean will not work particularly well at the extremes, and it would be a mistake to put a lot of confidence in the isolated application of the fixed exponent Pythagorean estimate to an extreme RPG.

Wednesday, June 08, 2011

Great Moments in Yahoo! Box Scores

Screencap grabbed at 5:22 for a game that ended around 3:00.

Tuesday, June 07, 2011

Scoring Self-Indulgence, pt. 3: Outs in Play

The scoring of outs in play is of course heavily dependent on the traditional numbering system for the fielders. Since the use of those familiar position numbers is a part of just about every scoring system I’ve seen (LL Bean’s pictorial system as the exception), the way I score outs in play is very similar to the way everyone else scores outs in play. The only real difference is the particular field location modifiers I use, but those are easier to discuss in their own separate post near the end of this series.

The two principles that I attempt to adhere to when scoring an out in play (these are batters retired without reaching safely, or hitting into fielder’s choices that force their teammates out, not baserunning outs) that are a little different than the systems that some people use are:

1. No dashes to indicate throws. All of the position codes are one number (a convenient result of having only nine fielders). If two of them end up next to each other, I can deduce that there was a throw 99% of the time.

Instead, I use a dash on those occasions in which the ball goes between two fielders involved in recording an out for their team without a throw. The most common is a deflection by the pitcher, but you can have crazy scenarios where balls are deflected between outfielders and caught and the like.

2. If there’s a throw involved, it’s an infield groundout, unless otherwise noted. If it’s an unassisted out, it’s in the air, unless otherwise noted. (I make an exception for first basemen: a “3” with no elaboration is always a groundout on my scoresheet, which admittedly violates this rule.) If it’s in the air, it’s a flyball/popup rather than a line drive unless otherwise noted.
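For anyone who thinks better in code, the two principles can be sketched as a tiny Python interpreter. This is only an illustration: the plain-text strings ("63", "1-43", "8") are a hypothetical stand-in for handwritten scorebox entries, the function name is invented, and only the default readings are covered, with none of the "unless otherwise noted" modifiers.

```python
POSITIONS = {
    "1": "pitcher", "2": "catcher", "3": "first baseman",
    "4": "second baseman", "5": "third baseman", "6": "shortstop",
    "7": "left fielder", "8": "center fielder", "9": "right fielder",
}


def describe_out(code):
    """Read a plain-text out code using the default rules only."""
    steps = [POSITIONS[code[0]]]
    has_throw = False
    i = 1
    while i < len(code):
        if code[i] == "-":
            # A dash means the ball went between fielders without a throw.
            steps.append("deflection to " + POSITIONS[code[i + 1]])
            i += 2
        else:
            # Adjacent position numbers imply a throw.
            steps.append("throw to " + POSITIONS[code[i]])
            has_throw = True
            i += 1
    # A throw implies a groundout; a lone "3" is the first baseman exception.
    if has_throw or code == "3":
        kind = "groundout"
    else:
        kind = "out in the air"
    return kind + ": " + ", ".join(steps)
```

Under these assumptions, `describe_out("63")` reads as a groundout from the shortstop to the first baseman, while `describe_out("8")` reads as a ball caught in the air by the center fielder.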

Based on these rules, here are some sample scoreboxes for certain outs. For all of the examples, I’ve excluded any pitch scoring, because it would just distract from the main focus of the post. Just pretend all these outs are first pitch swinging:

Here is a simple groundout to short, scored the way everyone does except for those who insist on using dashes.

Here’s where the dash comes in--a deflected ball. The dash comes after the player doing the deflecting, in this case the pitcher. The second baseman recovers it and throws to first for the out.

In this case, the T indicates tag; the first baseman recorded the out not by stepping on the base but by physically tagging the batter-runner. On certain baserunning plays, the tag is implied (caught stealings are an obvious example), but I’ll deal with that in the appropriate section. You could also have “T3”, which would be a groundout fielded by the first baseman, who tags the runner himself. Sometimes you’ll see “T1” as well.

The “SH” indicates that this will be scored as a sacrifice hit, catcher to first base. I do not include the bunt symbol because the SH designation communicates that.

If it’s a bunt, but not a sacrifice hit, then I use the squiggly line modifier that indicates bunt. In this case, the third baseman threw out the batter at first on the bunt attempt.

I use a “DP” modifier to indicate double plays either on groundballs, line drives, or flyballs (we’ll see those later)--basically, double plays in which there was a force out (I realize my usage of that term is not necessarily in full compliance with the rule book definition). I do not note a double play on a strikeout/caught stealing or on a runner thrown out attempting to tag, even if these technically are double plays as well. The scoring is done in such a manner that someone reading the sheet can ascertain the double play, but there is no “DP” code employed.

My symbol for a groundout is a straight line underneath the fielder’s number, but the only situation in which I actually end up using this is an unassisted play in which the pitcher tags first base for the out. Other groundballs are implied by the use of throws to record the outs; even an unusual event in which an outfielder fields a groundball and throws the batter-runner out at first or records a fielder’s choice is clearly suggested to be a groundball by the indicated throw. If for some reason a second baseman or someone else ended up running to first and tagging the base unassisted, they’d get the underline modifier as well.

Fielder’s choices involve the batter reaching base safely, so I’ll cover them in the section on scoring plays where the batter reaches. However, a special case is the two out fielder’s choice. While the batter is not himself retired, he also never actually takes his place as a runner, and so I don’t make it look as if he does in the scoresheet. In that case, I write the scoring of the play large across the box as I would for any other out. This one was third to second, obviously. You’ll note that the dot for the out is omitted, because the batter was not retired.

I don’t use this one much, but the caret below the groundball indicates a chopper. I only use it for serious ones that John McGraw would approve of. Here, the third baseman was able to make the play anyway.

On to flyouts. This is your garden variety fly ball to center field.

If a fly is caught in foul territory, “`” is the symbol I use to indicate it. This is a foul to left field.

I use a curved line segment over the position number to indicate a fly ball. I waive this for all positions except first base, where “3” alone indicates a groundout. For any other position, an unassisted putout is always assumed to be a popup for an infielder and a flyball for an outfielder (in any event, I don’t distinguish between pops and flies).

Since flies get a curved line above the fielder’s number, a line drive gets a straight line in the same location. This one was caught by the shortstop.

Here is a regular flyout to right that is scored a sacrifice fly, indicated by a “SF” prefix. Any of the out modifiers could be combined when sensible--one could have a line drive sacrifice fly in foul territory, I suppose.

Some balls are somewhere in between popups and line drives. When I used letters rather than symbols to indicate ball trajectory, I called these “loopers (LP)”. Now I use the flyball curve plus the line drive line to indicate them. I never score outfield loopers, and I never score base hits as loopers. Only infield outs; obviously this one was snagged by the second baseman.

The “IF” prefix here indicates “infield fly”; that is, the infield fly rule has been invoked and the catch was not completed cleanly. This makes it a pretty rare code, but I have had to use it a couple of times, and you’d encounter it a lot more at lower levels of the game. Obviously the first baseman was credited with the putout on this one.

Here is a popped up bunt snagged by the catcher in foul territory (none of the examples here are sacrifices or else the bunt would be indicated by “SH” and not by the bunt symbol). You can tell it’s not a grounder because that would involve a throw or a tag (and in this case because it’s marked as a foul). The exception is a first baseman. If I record this:

It indicates a groundball bunt with the play made unassisted at the bag by the first baseman. If it was a popup to the first baseman, the fly arch would be above the number.

This is an example of a line drive double play where the shortstop catches a liner and flips to second for the out, ending the inning. There is no solid out dot because the batter was not retired.

In this case, the batter flies to right, and subsequently a runner is doubled off his base. That portion of the play is recorded in the runner’s box; the DP here just lets us know that there was a double play somewhere. Again, I do not record a DP symbol when the out is recorded on a runner attempting to advance on a non-force play. If a runner is thrown out after tagging at third and attempting to score, there is no record of this made in the batter’s scorebox--you'll just see a regular flyout symbol.