<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-12133335</id><updated>2012-01-28T14:33:57.862-05:00</updated><category term='Horse Racing'/><category term='Pennant Races'/><category term='Meanderings'/><category term='Positional Adjustments'/><category term='Predictions'/><category term='Rankings'/><category term='Indians'/><category term='Current Events'/><category term='Sabermetrics'/><category term='Scorekeeping'/><category term='History--Other'/><category term='Whimsy'/><category term='Hall of Fame'/><category term='Leadoff Hitters'/><category term='Win Shares'/><category term='Run Estimators'/><category term='Pitching'/><category term='Playoffs'/><category term='Book Reviews'/><category term='World Baseball Classic'/><category term='Steroids'/><category term='Win Estimators'/><category term='Awards'/><category term='OSU'/><category term='Park Factors'/><category term='Home Field Advantage'/><category term='NFL'/><category term='Managers'/><category term='Fielding'/><category term='19th Century'/><category term='Offense'/><category term='Statistical Reports'/><category term='Baselines'/><category term='Yahoo Box Scores'/><title type='text'>Walk Like a Sabermetrician</title><subtitle type='html'>Occasional commentary on baseball and sabermetrics</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/'/><link rel='hub' 
href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default?start-index=101&amp;max-results=100'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>387</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-12133335.post-6422790476594629395</id><published>2012-01-28T14:33:00.001-05:00</published><updated>2012-01-28T14:33:57.871-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Statistical Reports'/><title type='text'>Crude Team Ratings, 2011</title><content type='html'>Anyone can throw together a spreadsheet and declare that they have a ranking system for teams.  It’s not particularly hard to construct a reasonable method by which to take an initial estimate of team strength, adjust for strength of schedule, recalculate each team’s rating, adjust for SOS again, rinse, repeat.  I have done &lt;a href="http://walksaber.blogspot.com/2011/01/crude-team-ratings.html"&gt;just that&lt;/a&gt;, and will present the 2011 ratings here.&lt;br /&gt;&lt;br /&gt;If you want the full details, please refer to the linked post.  The gist of the system is:&lt;br /&gt;&lt;br /&gt;1) Start with a win ratio figure for each team.  
It could be actual win ratio, or an estimated win ratio.&lt;br /&gt;&lt;br /&gt;2) Figure the average win ratio of the team’s opponents.&lt;br /&gt;&lt;br /&gt;3) Adjust for strength of schedule, resulting in a new set of ratings.&lt;br /&gt;&lt;br /&gt;4) Begin the process again.  Repeat until the ratings stabilize.&lt;br /&gt;&lt;br /&gt;The resulting figure is in the form of an adjusted win ratio; I force the average team to a rating of 100.  The ratings can be plugged directly into an odds ratio--a team with a rating of 120 should win about 60% of the time against a team with a rating of 80 (120/(120 + 80)).&lt;br /&gt;&lt;br /&gt;I’ll present four different sets of ratings here, each using a different win ratio as the input.  It’s overkill to run this many, but if for some reason you prefer a certain estimate of win ratio, it may be represented.&lt;br /&gt;&lt;br /&gt;Since 2011 is in the past, there’s no particular value in predictive ratings, so I’ll focus on the CTR based on actual wins and losses:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-i2pEUemwf6U/TyRMod1LAOI/AAAAAAAABAM/0h7-syDFdB0/s1600/ctr11A.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="278" src="http://3.bp.blogspot.com/-i2pEUemwf6U/TyRMod1LAOI/AAAAAAAABAM/0h7-syDFdB0/s400/ctr11A.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;aW% is the adjusted W% based on CTR; SOS is the weighted average CTR of the team’s opponents; rk is the team’s ranking among the thirty teams; and s rk is the SOS rank.&lt;br /&gt;&lt;br /&gt;The results aren’t particularly surprising; the teams are ranked pretty close to how they would be in W%.  
In some recent years, the results would favor AL teams much more than just looking at pure W%, but the National League held its own with the AL in 2011 as seen from the league/division ratings (simply the average rating for each member team):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-AOZ6N_tOUPo/TyRMtgv_UBI/AAAAAAAABAY/JQyrjRHu4fk/s1600/ctr11B.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="118" width="91" src="http://4.bp.blogspot.com/-AOZ6N_tOUPo/TyRMtgv_UBI/AAAAAAAABAY/JQyrjRHu4fk/s400/ctr11B.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;That makes for a nice rank order of divisions, with East &gt; West &gt; Central, and AL &gt; NL in each case.  Still, the overall AL/NL rating difference of 103/97 is a lot smaller than previous seasons, including 108/93 in 2010.  While the NL Central remained the weakest division, 89 was an improvement over the 82 rating in 2010.  If Houston was in the AL rather than the NL (and assuming all the ratings stayed constant), the leagues would have each had a CTR of 100.&lt;br /&gt;&lt;br /&gt;The next set of CTRs is based on Game Expected W% as described in &lt;a href="http://walksaber.blogspot.com/2012/01/run-distribution-and-w-2011.html"&gt;this post&lt;/a&gt;.  Basically, gEW% assumes independence between runs scored and runs allowed in a given game, and uses the 2011 empirical W% for teams scoring or allowing X runs in conjunction with each team’s actual game-by-game distribution of runs scored and runs allowed to estimate their W%.  
The resulting CTRs:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-kqEcUyOig0g/TyRM5NFW8TI/AAAAAAAABAk/Vh1wxP1M6R4/s1600/ctr11C.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="279" src="http://1.bp.blogspot.com/-kqEcUyOig0g/TyRM5NFW8TI/AAAAAAAABAk/Vh1wxP1M6R4/s400/ctr11C.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Using classic Pythagenpat as the input:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-WXZtQFNRlso/TyRM9XVOVuI/AAAAAAAABAw/NvfMpA8tKd4/s1600/ctr11D.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="278" src="http://2.bp.blogspot.com/-WXZtQFNRlso/TyRM9XVOVuI/AAAAAAAABAw/NvfMpA8tKd4/s400/ctr11D.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Finally, using Pythagenpat estimated win ratios based on runs created and runs created allowed:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-Q6Xc3fWL3Mg/TyRNCRhGt4I/AAAAAAAABA8/_A54S3HHwAg/s1600/ctr11E.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="279" src="http://2.bp.blogspot.com/-Q6Xc3fWL3Mg/TyRNCRhGt4I/AAAAAAAABA8/_A54S3HHwAg/s400/ctr11E.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Obviously there exist any number of possible combinations of win ratio estimates one could use, regression can be mixed in, etc.  
What I’ve presented here is just the most straightforward ratings based on obvious single inputs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-6422790476594629395?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/6422790476594629395/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2012/01/crude-team-ratings-2011.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6422790476594629395'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6422790476594629395'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2012/01/crude-team-ratings-2011.html' title='Crude Team Ratings, 2011'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-i2pEUemwf6U/TyRMod1LAOI/AAAAAAAABAM/0h7-syDFdB0/s72-c/ctr11A.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-318565063484394635</id><published>2012-01-17T00:20:00.000-05:00</published><updated>2012-01-17T00:20:00.506-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Win Estimators'/><category scheme='http://www.blogger.com/atom/ns#' term='Statistical Reports'/><title type='text'>Run Distribution and W%, 2011</title><content type='html'>A couple of caveats apply to everything that 
follows in this post. The first is that there are no park adjustments anywhere. There's obviously a difference between scoring 5 runs at Petco and scoring 5 runs at Coors, but if you're using discrete data there's not much that can be done about it unless you want to use a different distribution for every possible context. Similarly, it's necessary to acknowledge that games do not always consist of nine innings; again, it's tough to do anything about this while maintaining your sanity. &lt;br /&gt;&lt;br /&gt;All of the conversions of runs to wins are based only on 2011 data. Ideally, I would use an appropriate distribution for runs per game based on average R/G, but I've taken the lazy way out and used the empirical data for 2011 only. &lt;br /&gt;&lt;br /&gt;This post also contains little in the way of "analysis" and a lot of tables. This is probably a good thing for you as the reader, but I felt obliged to warn you anyway.  I’ve cut out a lot of what I listed last year simply because I don’t have that much free time right now.  The data was not particularly useful in any event--knowing how many runs teams scored and allowed in their wins and losses, or what percentage of their games fell into arbitrarily defined classes, might offer some trivia but is not exactly essential material.&lt;br /&gt;&lt;br /&gt;The first breakout is record in blowouts versus non-blowouts.  I define a blowout as a game decided by a margin of five or more runs.  This is not really a satisfactory definition of a blowout, as many five-run games are quite competitive--“blowout” is just a convenient label to use, and expresses the point succinctly.  I use these two categories with wide ranges rather than more narrow groupings like one-run games because the frequency and results of one-run games are highly biased by the home field advantage.  
Drawing the focus back a little allows us to identify close games and not so close games with a margin built in to allow a greater chance of capturing the true nature of the game in question rather than a disguised situational effect.&lt;br /&gt;&lt;br /&gt;In 2011, 75.8% of games were non-blowouts and 24.2% were blowouts.  The teams sorted by non-blowout record:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-GMw1SOuuCG0/TxIGJhnhzZI/AAAAAAAAA_Q/JvtDIXD34Ec/s1600/rdA.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="140" src="http://4.bp.blogspot.com/-GMw1SOuuCG0/TxIGJhnhzZI/AAAAAAAAA_Q/JvtDIXD34Ec/s400/rdA.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The standard deviation of W% in non-blowouts was .064, which as expected is less than the standard deviation for blowouts (.114) and all games (.070).&lt;br /&gt;&lt;br /&gt;Records in blowouts:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-7yxznPSiJkE/TxIGJ-LjynI/AAAAAAAAA_Y/ZHLKWWInauI/s1600/rdB.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="139" src="http://3.bp.blogspot.com/-7yxznPSiJkE/TxIGJ-LjynI/AAAAAAAAA_Y/ZHLKWWInauI/s400/rdB.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Obviously the sample size on these games is pretty small, but Kansas City and Oakland at .500 in blowouts caught my eye. 
&lt;br /&gt;&lt;br /&gt;This chart shows blowout W% less non-blowout W%, along with the percentage of games that were blowouts and non-blowouts for each team:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-BHrOb2Zf4Gc/TxIGJ-eWyWI/AAAAAAAAA_k/DpzWqBNqIGw/s1600/rdC.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="139" src="http://2.bp.blogspot.com/-BHrOb2Zf4Gc/TxIGJ-eWyWI/AAAAAAAAA_k/DpzWqBNqIGw/s400/rdC.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This is the second year in a row in which San Diego has ranked high in terms of difference between blowout and non-blowout record.  Usually teams with large differences are the better teams; that description may have fit the Padres in 2010 but not in 2011.  Cleveland was the most extreme team in either direction in the majors.  Florida played in the smallest proportion of blowouts while Texas played in the most.&lt;br /&gt;&lt;br /&gt;A more interesting way to consider game-level results is to look at how teams perform when scoring or allowing a given number of runs.  For the majors as a whole, here are the counts of games in which teams scored X runs:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-wgj73rc4TR8/TxIGKJzjlTI/AAAAAAAAA_s/huTdOQlBYXg/s1600/rdD.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="300" width="317" src="http://4.bp.blogspot.com/-wgj73rc4TR8/TxIGKJzjlTI/AAAAAAAAA_s/huTdOQlBYXg/s400/rdD.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The “marg” column shows the marginal W% for each additional run scored.  The second and third runs were each worth about .15 wins on average in 2011, while scoring four runs was the cutoff point between winning and losing (on average, of course).&lt;br /&gt;  &lt;br /&gt;I use these figures to calculate a measure I call game Offensive W% (or Defensive W% as the case may be), which was suggested by Bill James in an old &lt;u&gt;Abstract&lt;/u&gt;.  
It is a crude way to use each team’s actual runs per game distribution to estimate what their W% should have been by using the overall empirical W% by runs scored for the majors in the particular season.&lt;br /&gt;&lt;br /&gt;Using the empirical distribution rather than a theoretical distribution has the upside of being simple (modeling the runs per game distribution is fairly messy), but the benefits are outnumbered by the drawbacks.  A non-comprehensive list of said drawbacks:&lt;br /&gt;&lt;br /&gt;1. The empirical distribution is subject to sample size fluctuations. In 2011, at least, each additional run increased W%.  This is often not the case given the low frequency of high scoring games.  Even so, the marginal values don’t necessarily make sense--for instance, the marginal value of a tenth run is implied to be .006 wins while the marginal value of an eleventh run is implied to be .040.&lt;br /&gt;&lt;br /&gt;2. Using the empirical distribution forces one to use integer values for runs scored per game.  Obviously the number of runs a team scores in a game is restricted to integer values, but not allowing theoretical fractional runs makes it very difficult to apply any sort of park adjustment to the team frequency of runs scored.&lt;br /&gt;&lt;br /&gt;3. Related to #2 (really its root cause, although the park issue is important enough from the standpoint of using the results to evaluate teams that I wanted to single it out), when using the empirical data there is always a tradeoff that must be made between increasing the sample size and losing context.  One could use multiple years of data to generate a smoother curve of marginal win probabilities, but in doing so one would lose centering at the season’s actual run scoring rate.  
On the other hand, one could split the data into AL and NL and more closely match context, but you would lose sample size and introduce quirks into the data.&lt;br /&gt;&lt;br /&gt;I will not go into the full details of how gOW%, gDW%, and gEW% (which combines both into one measure of team quality) are calculated here, but full details were disclosed in &lt;a href="http://walksaber.blogspot.com/2009/01/perfunctory-look-at-run-distribution.html"&gt;this post&lt;/a&gt;.  The “use” column here is the coefficient applied to each game to calculate gOW% while the “invuse” is the coefficient used for gDW%.  For comparison, I have looked at OW%, DW%, and EW% (Pythagenpat record) for each team; none of these have been adjusted for park to maintain consistency with the g-family of measures which are not park-adjusted.&lt;br /&gt;&lt;br /&gt;For most teams, gOW% and OW% are very similar. Teams whose gOW% is higher than OW% distributed their runs more efficiently (at least to the extent that the methodology captures reality); the reverse is true for teams with gOW% lower than OW%. The teams that had differences of +/- 2 wins between the two metrics were (all of these are the g-type less the regular estimate):&lt;br /&gt;&lt;br /&gt;Positive: BAL, PIT, ATL, FLA, HOU, SEA&lt;br /&gt;Negative: BOS, NYA, TEX, COL&lt;br /&gt;&lt;br /&gt;You'll note that the positive differences tended to belong to bad offenses; this is a natural result of the nature of the game, and is reflected in the marginal value of each run as discussed above. In the four years that I’ve been looking at these figures, I can’t recall a difference as large as the Red Sox’ deviation in 2011--a standard OW% of .610 and a gOW% of .572, a 6.2 win difference.  Boston led the majors in OW%; their gOW% was still excellent and good enough for third in the majors, but they did not spread their runs across games in an efficient fashion.  
The Sox scored ten or more runs 25 times; Toronto was second with 19 and the major league average was 9.  Boston scored 36% of their runs in that 15% subset of games; the major league average was 15%, and next on the list was Texas at 28%.&lt;br /&gt;&lt;br /&gt;Differences for gDW%:&lt;br /&gt;&lt;br /&gt;Positive: DET, BAL&lt;br /&gt;Negative: PHI, SD, TB&lt;br /&gt;&lt;br /&gt;I combine gOW% and gDW% through some Pythagorean math to produce gEW%, which can then be compared to a team’s standard Pythagorean record (EW%).  Of course, it could also be compared to actual W%, but I think the comparison to a method that also uses runs is more interesting than a comparison to the actual win totals:&lt;br /&gt;&lt;br /&gt;Positive: BAL, PIT, CHA, DET, MIN, HOU, OAK, FLA&lt;br /&gt;Negative: BOS, PHI, COL, NYA, SD, TB, LA, KC&lt;br /&gt;&lt;br /&gt;There are so many large differences that I’m a little worried that I may have made a spreadsheet error somewhere along the way, although I have double-checked and can’t find anything.  
Below is a table with all of the metrics discussed in this post for each team, sorted by gEW%:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-gn8qjHo_CMw/TxIGKLTND0I/AAAAAAAABAE/sMykAVrHnLI/s1600/rdE.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="313" src="http://2.bp.blogspot.com/-gn8qjHo_CMw/TxIGKLTND0I/AAAAAAAABAE/sMykAVrHnLI/s400/rdE.jpg" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-318565063484394635?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/318565063484394635/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2012/01/run-distribution-and-w-2011.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/318565063484394635'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/318565063484394635'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2012/01/run-distribution-and-w-2011.html' title='Run Distribution and W%, 2011'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-GMw1SOuuCG0/TxIGJhnhzZI/AAAAAAAAA_Q/JvtDIXD34Ec/s72-c/rdA.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-173226479710230079</id><published>2012-01-04T19:59:00.000-05:00</published><updated>2012-01-04T19:59:51.802-05:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='NFL'/><title type='text'>Crude NFL Ratings, 2011</title><content type='html'>The NFL is a distant third on my list of pro sports interests (baseball is #1, of course, and horse racing ranks #2), but I’m interested enough to run the teams through my crude rating system (see explanation &lt;a href="http://walksaber.blogspot.com/2011/01/crude-team-ratings.html"&gt;here&lt;/a&gt;) and figure I might post the ratings here.  They are based on points/points allowed, adjusted for strength of schedule.  100 represents a win/loss ratio of 1, and so the resulting ratings are adjusted win ratios and can very easily be used to estimate the probability of a team winning a particular game.  A team with a rating of 100 should beat a team with a rating of 50 2/3 of the time (100/(100 + 50)).&lt;br /&gt;&lt;br /&gt;Actually, let me first run a list based on actual wins and losses.  I’ve actually calculated W/L ratio as (W + .5)/(L + .5) here just to avoid the (real in the NFL) possibility of a 16-0 team crashing the system:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-91kp39s-fpI/TwTz970dCDI/AAAAAAAAA9w/sqeI4dXbKuo/s1600/nfl1.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="229" src="http://3.bp.blogspot.com/-91kp39s-fpI/TwTz970dCDI/AAAAAAAAA9w/sqeI4dXbKuo/s400/nfl1.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In the chart, aW% is an adjusted W%; it averages to .500 for the NFL and will produce the same list in rank order as the CTR; I prefer the latter because of its Log5 readiness, but aW% is a more meaningful unit.  SOS is the weighted average CTR of the team’s opponents.  “rk” is the team’s rank in CTR, while “s rk” is the team’s rank in the SOS estimate.&lt;br /&gt;&lt;br /&gt;I really do not care for the actual W% presentation for the NFL due to the short season magnifying differences in the teams.  
The Packers tower over the league here, which is appropriate given a 15-1 record against a decent schedule, but it doesn’t have any predictive value.  You will notice in the table above that the NFC does quite well, which will carry through to the points-based ratings:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-5iiMasau8KU/TwT0FITSwYI/AAAAAAAAA98/3i1m9saMbLA/s1600/nfl2.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="229" src="http://1.bp.blogspot.com/-5iiMasau8KU/TwT0FITSwYI/AAAAAAAAA98/3i1m9saMbLA/s400/nfl2.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Green Bay does not even rank #1 in the league; both New Orleans and San Francisco rank ahead of them.  The top nine teams, and eleven of the top fourteen, made the playoffs, which is pretty good I think.  &lt;br /&gt;&lt;br /&gt;The aggregate ratings for the divisions (simply the average rating of the four teams) illustrate the superiority of the NFC and why I don’t care for micro-divisions:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-TefA_e4R20Y/TwT0xMai5aI/AAAAAAAAA-I/RCdQ_fMkysU/s1600/nfl3.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="165" width="123" src="http://1.bp.blogspot.com/-TefA_e4R20Y/TwT0xMai5aI/AAAAAAAAA-I/RCdQ_fMkysU/s400/nfl3.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Last year, the NFC West turned in a ghastly 29 rating.  Led by San Francisco, this year they were far from the worst in the league, a distinction that went to their AFC brethren.&lt;br /&gt;&lt;br /&gt;This whole exercise would be devoid of a great deal of entertainment value if I did not use the results to estimate Super Bowl probabilities.  The disclaimer list here is lengthy enough that I will skip it lest I leave anything out.  A credibility adjustment would be pretty simple to implement (adding 12 games of a 100 rating would do the trick), but this is just NFL stats, not something important.  
The playoff odds do consider home field advantage; the home team’s rating is multiplied by 57/43 to reflect a fairly average NFL home field advantage.  I feel bad about listing the probabilities to the thousandth place, but there are so many possible combinations for the championship games and Super Bowl that those tables would look silly without it:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-iO3pkHQkPIM/TwT1DW_FNeI/AAAAAAAAA-U/kH6Jnaci4kA/s1600/nfl4.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="106" width="169" src="http://1.bp.blogspot.com/-iO3pkHQkPIM/TwT1DW_FNeI/AAAAAAAAA-U/kH6Jnaci4kA/s400/nfl4.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Two road favorites on the first weekend is probably pretty typical given the quality of teams that often win micro-divisions (particularly those like the AFC West).  The Denver Broncos simply aren’t a very good football team (it is tough for me to leave it at that, but piling on more snark re: you-know-who is beyond excessive at this point).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-lne8gbklv7Y/TwT1L94SgOI/AAAAAAAAA-g/HfH0ORJ1Mjw/s1600/nfl5.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="165" width="379" src="http://4.bp.blogspot.com/-lne8gbklv7Y/TwT1L94SgOI/AAAAAAAAA-g/HfH0ORJ1Mjw/s400/nfl5.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I like reseeding in theory, but when your initial seeding insists that Denver ranks #4 in the AFC because they are the sharpest scissors in the kindergarten classroom, it loses some of its luster.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-RBS8uxUfYSw/TwT1V5unEgI/AAAAAAAAA-s/uMYNeT616q8/s1600/nfl6.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="376" width="211" src="http://1.bp.blogspot.com/-RBS8uxUfYSw/TwT1V5unEgI/AAAAAAAAA-s/uMYNeT616q8/s400/nfl6.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Life is tough 
enough as a Browns fan without having to worry about horrors like a Denver/Cincinnati AFC title game, but thankfully there’s a 99.8% chance that will not come to pass.  Pittsburgh/Baltimore, on the other hand, is the most likely championship game scenario that doesn’t involve either conference’s #1 seed.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-wJeGUxFRy-E/TwT1hm4kpVI/AAAAAAAAA-4/EuuFIHh8F20/s1600/nfl7.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="143" src="http://1.bp.blogspot.com/-wJeGUxFRy-E/TwT1hm4kpVI/AAAAAAAAA-4/EuuFIHh8F20/s400/nfl7.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Combining all of these, here are the playoff probabilities for each team:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-ywh6DMHHTcw/TwT1of0pkfI/AAAAAAAAA_E/GmGmkz4k20U/s1600/nfl8.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="195" width="216" src="http://1.bp.blogspot.com/-ywh6DMHHTcw/TwT1of0pkfI/AAAAAAAAA_E/GmGmkz4k20U/s400/nfl8.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The system still considers Green Bay the Super Bowl favorites even though they rank below New Orleans and San Francisco, thanks to favorable second round matchups and home field advantage, which is much more significant in the NFL playoffs than in MLB.  Ratings and home field aside, if the NFC title game turns out to be Packers/Saints, I’m picking the latter to win it all.  
These probabilities add up to a 57% chance of the NFC representative winning the Super Bowl.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-173226479710230079?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/173226479710230079/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2012/01/crude-nfl-ratings-2011.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/173226479710230079'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/173226479710230079'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2012/01/crude-nfl-ratings-2011.html' title='Crude NFL Ratings, 2011'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-91kp39s-fpI/TwTz970dCDI/AAAAAAAAA9w/sqeI4dXbKuo/s72-c/nfl1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-3302134648586630036</id><published>2011-12-28T23:07:00.000-05:00</published><updated>2011-12-28T23:07:00.927-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Offense'/><category scheme='http://www.blogger.com/atom/ns#' term='Statistical Reports'/><category scheme='http://www.blogger.com/atom/ns#' term='Positional Adjustments'/><title type='text'>Hitting by Position, 2011</title><content 
type='html'>Offensive performance by position (and the closely related topic of positional adjustments) has always interested me, and so each year I like to examine the most recent season's totals. I believe that offensive positional averages can be an important tool for approximating the defensive value of each position, but they certainly are not a magic bullet and need to include more than one year of data if they are to be utilized in that capacity.&lt;br /&gt;&lt;br /&gt;The first obvious thing to look at is the positional totals for 2011, with the data coming from Baseball-Reference.com. “MLB” is the overall total for MLB, which is not the same as the sum of all the positions here, as pinch-hitters and runners are not included in those. “POS” is the MLB totals minus the pitcher totals, yielding the composite performance by non-pitchers. “PADJ” is the position adjustment, which is the position’s RG divided by the overall non-pitcher (“POS”) RG. “LPADJ” is the long-term positional adjustment that I use, based on 1992-2001 data. The rows “79” and “3D” are the combined corner outfield and 1B/DH totals, respectively:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-hd3t-cQ0NqA/Tvva07Xs49I/AAAAAAAAA74/yhiPvP4JySU/s1600/pos1.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="125" width="400" src="http://1.bp.blogspot.com/-hd3t-cQ0NqA/Tvva07Xs49I/AAAAAAAAA74/yhiPvP4JySU/s400/pos1.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The 2011 results were most notable for the poor performance by third basemen and the pathetic effort by left fielders, who were slightly less productive than the average non-pitcher.  After a down 2010, DHs rebounded to a respectable 110.  The other positions were fairly close to their historical norms, and pitchers avoided setting a new all-time low, although the difference between 7 and 5 is negligible. 
&lt;br /&gt;&lt;br /&gt;Speaking of pitchers, here are the aggregate park-adjusted totals for NL pitching teams.  This analysis is based on simple ERP, and thus ignores sacrifices and the other situational goodness that makes pitcher hitting such an exciting and integral part of our national pastime:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-xCU1X6yca3M/TvvbKbom6WI/AAAAAAAAA8E/CRmkVUR-2qU/s1600/pos2.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="225" width="270" src="http://1.bp.blogspot.com/-xCU1X6yca3M/TvvbKbom6WI/AAAAAAAAA8E/CRmkVUR-2qU/s400/pos2.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Milwaukee ranked second and Arizona first last year, but on the other hand the Mets were third in 2010 and dead last in 2011.  AL pitchers don’t get enough opportunities to bother with a chart, but for trivia’s sake, Baltimore’s pitchers raked .405/.405/.630, while Kansas City’s failed to reach base in eighteen plate appearances.&lt;br /&gt;&lt;br /&gt;Moving on to positions that are actually expected to hit, I figured park-adjusted RAA for each position. The baseline for average is the overall 2011 MLB average RG for each position, with left and right field pooled.  The leading team at each position was as follows (these are generally unsurprising so I’ll spare you a big chart):&lt;br /&gt;&lt;br /&gt;C--DET, 1B--DET, 2B--BOS, 3B--CHN, SS--NYN, LF--MIL, CF--LA, RF--TOR, DH--BOS&lt;br /&gt;&lt;br /&gt;The only one of these that was a bit surprising to me even after looking at the final stats for individuals was the Cubs’ third basemen (led of course by Aramis Ramirez).  
But a lot of the usual suspects at third base had injuries and other issues this year (Longoria, Zimmerman, Wright, Youkilis).&lt;br /&gt;&lt;br /&gt;Now the worst performance at each position, along with a column displaying the team leader in games played at that spot:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-oXT9ZLOwGgY/TvvbeW51DSI/AAAAAAAAA8Q/OVu9KSHB_hY/s1600/pos3.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="133" width="337" src="http://4.bp.blogspot.com/-oXT9ZLOwGgY/TvvbeW51DSI/AAAAAAAAA8Q/OVu9KSHB_hY/s400/pos3.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It’s mostly a coincidence that all of the worst-hitting positions were from AL teams, although they do generally get more PA in which to drive down their RAA.  I wrote about the Twins and Angels catchers a little in the previous post, but note here that Houston’s catchers were second-to-last with -31 RAA and the Angels managed -29.  The continuing inability of Seattle to generate offense is a marvel, and Juan Pierre is an appropriate banner carrier for 2011’s crop of poor-hitting left fielders.&lt;br /&gt;&lt;br /&gt;The following charts give the RAA at each position for each team, split up by division. The charts are sorted by the sum of RAA for the listed positions. As mentioned earlier, the league totals will not sum to zero since the overall ML average is being used and not the specific league average. Positions with negative RAA are in red; positions with +/- 20 RAA are bolded:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-JhPDX9pGylw/Tvvcu5oPxWI/AAAAAAAAA8c/4G6l7K2Ae2A/s1600/pos4.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="81" width="351" src="http://4.bp.blogspot.com/-JhPDX9pGylw/Tvvcu5oPxWI/AAAAAAAAA8c/4G6l7K2Ae2A/s400/pos4.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Third base and shortstop led the Mets to the highest infield RAA in the NL.  
Atlanta tied for the lowest outfield RAA in the NL.  There must be something wrong with my spreadsheet as surely the Phillies’ first basemen combined for more than 8 RAA, led by their perennial MVP candidate.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-cJSotL3qlXY/Tvvc11QuYpI/AAAAAAAAA8o/iKxPFq9tGzI/s1600/pos5.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="94" width="350" src="http://3.bp.blogspot.com/-cJSotL3qlXY/Tvvc11QuYpI/AAAAAAAAA8o/iKxPFq9tGzI/s400/pos5.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;St. Louis was the only team in the game to be above average at every position, and really stood out at the three biggest offensive positions.  Their outfield combined to lead MLB in RAA.  Milwaukee’s offense was structured similarly, although right field did not stand out and they gave a lot of it back with a black hole at third base.   The Cubs’ outfield production was evenly distributed and combined to tie Atlanta for the lowest mark in the NL.  Pittsburgh’s infield tied for the NL’s trailer spot.  Houston got decent production in the outfield but nowhere else.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-D5bOrcDhvoM/Tvvc7RHoFoI/AAAAAAAAA80/gy38S7_azw4/s1600/pos6.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="81" width="351" src="http://3.bp.blogspot.com/-D5bOrcDhvoM/Tvvc7RHoFoI/AAAAAAAAA80/gy38S7_azw4/s400/pos6.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The fact that the Los Angeles infield tied for the fewest RAA in the NL and yet the offense combined to lead the division should give you a quick idea of the offensive character of the NL West.  
While the World Series title makes it easy for some to overlook, San Francisco’s offensive struggles are persistent and pitching can only take you so far.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-y6G9TpplgsY/TvvdA3VznlI/AAAAAAAAA9A/decL1gbeVrU/s1600/pos7.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="80" width="387" src="http://4.bp.blogspot.com/-y6G9TpplgsY/TvvdA3VznlI/AAAAAAAAA9A/decL1gbeVrU/s400/pos7.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Boston’s offense was terrific despite right field, leading the majors in infield RAA.  Toronto pulled a neat trick by combining for -17 RAA from the outfield despite having Jose Bautista.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-U8PfceDaUm0/TvvdG9hs3iI/AAAAAAAAA9M/Dbq4kqgkVu8/s1600/pos8.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="80" width="385" src="http://1.bp.blogspot.com/-U8PfceDaUm0/TvvdG9hs3iI/AAAAAAAAA9M/Dbq4kqgkVu8/s400/pos8.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Kansas City led the AL in outfield RAA, which not many would have predicted from Alex Gordon, Melky Cabrera, and Jeff Francoeur.  Cleveland’s outfield was second-worst in the majors, and under normal circumstances -62 from the outfield would stick out more.  The best thing that can be said about Chicago’s -98 RAA is that it was balanced -49/-49 between infield and outfield, with catcher and DH nearly average (+2/-2).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-oAzCPpQ7Ysk/TvvdL4hCO2I/AAAAAAAAA9Y/gUQrCuWPxg4/s1600/pos9.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="69" width="387" src="http://3.bp.blogspot.com/-oAzCPpQ7Ysk/TvvdL4hCO2I/AAAAAAAAA9Y/gUQrCuWPxg4/s400/pos9.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Texas kept the AL West from looking like its NL counterpart.  Chris Iannetta and some guy whose name I can’t remember should do wonders for LAA.  
Oakland’s -50 runs from the infield was the worst in the majors, almost all driven by dreadful production at first base.  And then there’s Seattle.  What can one say about Seattle?  Every outfield position was at -20 or worse (only five other outfield spots across the other 29 teams were that low).  Catcher, third base, and DH also stood out for the hapless Mariners.&lt;br /&gt;&lt;br /&gt;Earlier I displayed some long-term positional adjustments that I’ve used over the years.  It dawned on me in September that those were based on the ten-year period from 1992-2001, and that at this point, none of the most recent ten years are included in the sample.  So I figured it would be an opportune time to recalibrate my position adjustments, using the ten years from 2002-2011 as the basis.&lt;br /&gt;&lt;br /&gt;I figured two sets of PADJs: one that compared each position to the overall league average (including pitchers), and one that compared it to the league average less pitchers.  There is very little difference, of course--the ones compared to the average including pitchers tend to be one or two points higher.  This table compares the 1992-2001 and the 2002-2011 adjustments:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-aAGnnIp3Qko/Tvvd2hxBoqI/AAAAAAAAA9k/zZW9oeeJEz0/s1600/pos10.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="172" width="193" src="http://1.bp.blogspot.com/-aAGnnIp3Qko/Tvvd2hxBoqI/AAAAAAAAA9k/zZW9oeeJEz0/s400/pos10.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The big movers relative to 1992-2001 were the middle infield positions, improving offensively as first base/DH declined a little.  
In the end, though, the defensive spectrum one would draw based on offense doesn’t change at all, except for third base switching places with center field (and the differences were minuscule in both decades) to match Bill James’ spectrum.&lt;br /&gt;&lt;br /&gt;A longer digression about the application of position adjustments, and some reasons why one might want to consider using offensive adjustments, will have to wait for another time, but would be appropriate here.&lt;br /&gt;&lt;br /&gt;This &lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;hl=en_US&amp;key=0AnPJbQnlHhRHdFlyb3lNbkZaeFB3ZkVwMk43eEtiM0E&amp;output=html"&gt;spreadsheet&lt;/a&gt; includes the 2011 data by position.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-3302134648586630036?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/3302134648586630036/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/12/hitting-by-position-2011.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3302134648586630036'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3302134648586630036'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/12/hitting-by-position-2011.html' title='Hitting by Position, 2011'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://1.bp.blogspot.com/-hd3t-cQ0NqA/Tvva07Xs49I/AAAAAAAAA74/yhiPvP4JySU/s72-c/pos1.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-540079777178559474</id><published>2011-12-19T01:15:00.000-05:00</published><updated>2011-12-19T01:15:03.384-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Offense'/><category scheme='http://www.blogger.com/atom/ns#' term='Statistical Reports'/><title type='text'>Hitting by Lineup Slot, 2011</title><content type='html'>I devoted a whole post to leadoff hitters, whether justified or not, so it's only fair to have a post about hitting by batting order position in general. I certainly consider this piece to be more trivia than sabermetrics, since there’s no analytical content.&lt;br /&gt;&lt;br /&gt;The data in this post was taken from Baseball-Reference. The figures for each team's runs are &lt;b&gt;not&lt;/b&gt; park-adjusted--I intended to do so, but unfortunately I had already written the body of the post before I realized that they’d been omitted.  The Padres having the worst 2, 3, and 4 production in the NL should have alerted me to this sooner.  Then I had to go back and remove some comments that make no sense when ignoring park effects, so now the post is just a skeleton.  Oh well.  RC is ERP, including SB and CS, as used in my end of season stat posts. 
The weights used are constant across lineup positions; there was no attempt to apply specific weights to each position, although they are out there and would certainly make this a little bit more interesting.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-TKeI_CgPLU8/Tu4TuOJPdPI/AAAAAAAAA6M/orP6Kb90qSQ/s1600/batord1.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="266" width="211" src="http://4.bp.blogspot.com/-TKeI_CgPLU8/Tu4TuOJPdPI/AAAAAAAAA6M/orP6Kb90qSQ/s400/batord1.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This marks a third straight season that the most productive lineup slot in the majors was the NL’s #3 hitters…Pujols, Votto, Braun and company.  Despite all of the seemingly silly things managers do with their batting orders, it is comforting to know that, from the cleanup spot down, each subsequent spot is less productive.  Of course, that doesn’t excuse the feeble performance of NL #2 hitters, who just edged out the #8 hitters as the least productive NL spot filled by real hitters.&lt;br /&gt;&lt;br /&gt;Next, here are the team leaders in RG at each lineup position. The player listed is the one who appeared in the most games in that spot (which can be misleading as the presence of Mitch Moreland demonstrates):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-fog2c3Kq3YI/Tu4UB4K_i4I/AAAAAAAAA6Y/mHCdyuwk7UU/s1600/batord2.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="266" width="306" src="http://3.bp.blogspot.com/-fog2c3Kq3YI/Tu4UB4K_i4I/AAAAAAAAA6Y/mHCdyuwk7UU/s400/batord2.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Houston actually had the NL’s most productive hitters at two spots; of course, they were two bottom of the batting order spots in which nobody contributes anyway.  
The least productive lineup spots:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-pPSMRq9vCsc/Tu4X8cGDrLI/AAAAAAAAA6k/Ka5Ig6xa9nw/s1600/batord3.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="268" width="308" src="http://1.bp.blogspot.com/-pPSMRq9vCsc/Tu4X8cGDrLI/AAAAAAAAA6k/Ka5Ig6xa9nw/s400/batord3.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As you can see, Minnesota had the worst production out of both the #8 and #9 spots.  What makes this truly impressive, though, is that Drew Butera was the leader in games played in both spots.  One thing I had meant to include in my meanderings post but forgot was a comparison of Mathis and Butera’s basic batting lines as I present them in my end of season stats.  Neither had enough PA to qualify for those lists, but their seasons were too bad to just ignore:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-M6YNIsRshlU/Tu4YNGrbYzI/AAAAAAAAA6w/TjkDrOxR45Q/s1600/batord4.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="26" width="400" src="http://4.bp.blogspot.com/-M6YNIsRshlU/Tu4YNGrbYzI/AAAAAAAAA6w/TjkDrOxR45Q/s400/batord4.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Mathis was intentionally walked twice; both came in a June 17 game at the Mets.  No word on whether or not Ron Washington temporarily replaced Terry Collins.&lt;br /&gt;&lt;br /&gt;Note that Houston’s #9 hitters (the best in the NL at 2.3 RG) almost managed to outhit their #8 hitters (worst in the NL at 2.5 RG).&lt;br /&gt;&lt;br /&gt;The next chart displays the top ten positions in terms of RAA, compared to their league’s average for each spot.  
A lot of the same suspects pop up, of course:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-58aKJbGYuXI/Tu4YcnS9QpI/AAAAAAAAA68/7OtiyPheqV4/s1600/batord5.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="155" width="328" src="http://4.bp.blogspot.com/-58aKJbGYuXI/Tu4YcnS9QpI/AAAAAAAAA68/7OtiyPheqV4/s400/batord5.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And the ten worst positions:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-tSaqhyYVADk/Tu4YpLlkHxI/AAAAAAAAA7I/LzOEPK6SwYg/s1600/batord6.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="157" width="327" src="http://1.bp.blogspot.com/-tSaqhyYVADk/Tu4YpLlkHxI/AAAAAAAAA7I/LzOEPK6SwYg/s400/batord6.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Finally, this table has each team’s RG rank among the lineup slots in their league.  The top and bottom three in each league have been noted, which makes Boston and Seattle stand out (for opposite reasons, of course).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-3ClNR2OFrBc/Tu4Y6UkYnqI/AAAAAAAAA7U/qxavWCd09Hk/s1600/batord7.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="240" width="332" src="http://2.bp.blogspot.com/-3ClNR2OFrBc/Tu4Y6UkYnqI/AAAAAAAAA7U/qxavWCd09Hk/s400/batord7.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-zmgZcmhLs4Y/Tu4ZMFGGy5I/AAAAAAAAA7g/BnmHzC17hLU/s1600/batord8.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="211" width="333" src="http://3.bp.blogspot.com/-zmgZcmhLs4Y/Tu4ZMFGGy5I/AAAAAAAAA7g/BnmHzC17hLU/s400/batord8.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here is a link to a &lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;hl=en_US&amp;key=0AnPJbQnlHhRHdG8wdFlOQ0ZrMEtDdlNybHdkN1hUWlE&amp;output=html"&gt;Google spreadsheet&lt;/a&gt; with the underlying data.  
The RG and RAA figures in this one are park-adjusted as should have been done throughout this post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-540079777178559474?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/540079777178559474/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/12/hitting-by-lineup-slot-2011.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/540079777178559474'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/540079777178559474'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/12/hitting-by-lineup-slot-2011.html' title='Hitting by Lineup Slot, 2011'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-TKeI_CgPLU8/Tu4TuOJPdPI/AAAAAAAAA6M/orP6Kb90qSQ/s72-c/batord1.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-8157668036312076447</id><published>2011-12-08T00:15:00.004-05:00</published><updated>2011-12-08T00:15:00.743-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Leadoff Hitters'/><category scheme='http://www.blogger.com/atom/ns#' term='Offense'/><category scheme='http://www.blogger.com/atom/ns#' term='Statistical Reports'/><title type='text'>2011 Leadoff 
Hitters</title><content type='html'>This post kicks off a series of posts that I write every year, and therefore struggle to infuse with any sort of new perspective.  However, they're a tradition on this blog and hold some general interest, so away we go.&lt;br /&gt;&lt;br /&gt;This post looks at the offensive performance of teams' leadoff batters.  I will try to make this as clear as possible: the statistics are based on the players that hit in the #1 slot in the batting order, whether they were actually leading off an inning or not.  It includes the performance of all players who batted in that spot, including substitutes like pinch-hitters. &lt;br /&gt;&lt;br /&gt;Listed in parentheses after a team are all players that appeared in twenty or more games in the leadoff slot--while you may see a listing like “BOS (Ellsbury)” this does not mean that the statistic is based solely on Ellsbury's performance; it is the total of all Boston batters in the #1 spot, of which Ellsbury was the only one to appear in that spot in twenty or more games.  I will list the top and bottom three teams in each category (plus the top/bottom team from each league if they don't make the ML top/bottom three); complete data is available in a spreadsheet linked at the end of the article.  There are also no park factors applied anywhere in this article.&lt;br /&gt;&lt;br /&gt;That's as clear as I can make it, and I hope it will suffice.  I always feel obligated to point out that as a sabermetrician, I think that the importance of the batting order is often overstated, and that the best leadoff hitters would generally be the best cleanup hitters, the best #9 hitters, etc.  However, since the leadoff spot gets a lot of attention, and teams pay particular attention to the spot, it is instructive to look at how each team fared there. &lt;br /&gt; &lt;br /&gt;The conventional wisdom is that the primary job of the leadoff hitter is to get on base and, most simply, score runs.  
So let's start by looking at runs scored per 25.5 outs (AB - H + CS):&lt;br /&gt;&lt;br /&gt;1. TEX (Kinsler), 6.8&lt;br /&gt;2. MIL (Weeks/Hart), 6.5&lt;br /&gt;3. BOS (Ellsbury), 6.4&lt;br /&gt;Leadoff average, 5.0&lt;br /&gt;ML average, 4.3&lt;br /&gt;28. LAA (Izturis/Aybar), 4.0&lt;br /&gt;29. STL (Theriot/Furcal), 3.9&lt;br /&gt;30. WAS (Bernadina/Desmond/Espinosa), 3.9&lt;br /&gt;&lt;br /&gt;Obviously you all know the biases inherent in looking at actual runs scored.  It is odd to see St. Louis near the bottom as they had a good offense overall.  Usually the leadoff hitters will manage to score some runs when they have Pujols, Holliday and Berkman coming up behind them whether they get on base that much or not.&lt;br /&gt;&lt;br /&gt;Speaking of getting on base, the other obvious measure to look at is On Base Average. The figures here exclude HB and SF to be directly comparable to earlier versions of this article, but those categories are available in the spreadsheet if you'd like to include them:&lt;br /&gt;&lt;br /&gt;1. CHN (Castro/Fukudome), .364&lt;br /&gt;2. NYN (Reyes/Pagan), .364&lt;br /&gt;3. BOS (Ellsbury), .362&lt;br /&gt;Leadoff average, .324&lt;br /&gt;ML average, .317&lt;br /&gt;28. BAL (Hardy/Roberts/Andino), .287&lt;br /&gt;29. SF (Torres/Rowand), .282&lt;br /&gt;30. WAS (Bernadina/Desmond/Espinosa), .277&lt;br /&gt;&lt;br /&gt;I would not have correctly identified the Cubs as having the highest OBA out of the leadoff spot in my first fifteen guesses, I don’t think.  The seven point difference between the overall major league OBA and the OBA of leadoff men is a little smaller than it usually is, but last year the gap was just two points.&lt;br /&gt;&lt;br /&gt;The next statistic is what I call Runners On Base Average.  The genesis of it is from the A factor of Base Runs.  It measures the number of times a batter reaches base per PA--excluding homers, since a batter that hits a home run never actually runs the bases.  
It also subtracts caught stealing here because the BsR version I often use does as well, but BsR versions based on initial baserunners rather than final baserunners do not.&lt;br /&gt;&lt;br /&gt;My 2009 leadoff post was linked to a Cardinals message board, and this metric was the cause of a lot of confusion (this was mostly because the poster in question was thick-headed as could be, but it's still worth addressing).  ROBA, like several other methods that follow, is not really a quality metric, it is a descriptive metric.  A high ROBA is a good thing, but it's not necessarily better than a slightly lower ROBA plus a higher home run rate (which would produce a higher OBA and more runs).  Listing ROBA is not in any way, shape or form a statement that hitting home runs is bad for a leadoff hitter.  It is simply a recognition of the fact that a batter that hits a home run is not a baserunner.  Base Runs is an excellent model of offense and ROBA is one of its components, and thus it holds some interest in describing how a team scored its runs, rather than how many it scored:&lt;br /&gt;&lt;br /&gt;1. CHN (Castro/Fukudome), .339&lt;br /&gt;2. NYN (Reyes/Pagan), .336&lt;br /&gt;3. PIT (Tabata/McCutchen/Presley), .315&lt;br /&gt;Leadoff average, .291&lt;br /&gt;ML average, .285&lt;br /&gt;28. SF (Torres/Rowand), .253&lt;br /&gt;29. BAL (Hardy/Roberts/Andino), .253&lt;br /&gt;30. WAS (Bernadina/Desmond/Espinosa), .247&lt;br /&gt;&lt;br /&gt;You are probably starting to notice a lot of repetition in the leaders and trailers.  Obviously a lot of these metrics measure the same thing in slightly different ways or measure similar things, so it’s to be expected.  &lt;br /&gt;&lt;br /&gt;I will also include what I've called Literal OBA here--this is just ROBA with HR subtracted from the denominator so that a homer does not lower LOBA, it simply has no effect.  
You don't really need ROBA and LOBA (or either, for that matter), but this might save some poor message board out there twenty posts, so here goes.  LOBA = (H + W - HR - CS)/(AB + W - HR):&lt;br /&gt;&lt;br /&gt;1. CHN (Castro/Fukudome), .344&lt;br /&gt;2. NYN (Reyes/Pagan), .341&lt;br /&gt;3. PIT (Tabata/McCutchen/Presley), .321&lt;br /&gt;Leadoff average, .297&lt;br /&gt;ML average, .292&lt;br /&gt;28. BAL (Hardy/Roberts/Andino), .261&lt;br /&gt;29. SF (Torres/Rowand), .257&lt;br /&gt;30. WAS (Bernadina/Desmond/Espinosa), .252&lt;br /&gt;&lt;br /&gt;In this presentation, the rank difference between ROBA and LOBA is barely noticeable.&lt;br /&gt;&lt;br /&gt;The next two categories are most definitely categories of shape, not value.  The first is the ratio of runs scored to RBI.  Leadoff hitters as a group score many more runs than they drive in, partly due to their skills and partly due to lineup dynamics.  Those with low ratios don’t fit the traditional leadoff profile as closely as those with high ratios (at least in the way their seasons played out):&lt;br /&gt;&lt;br /&gt;1. LA (Gordon/Gwynn/Carroll/Furcal), 2.5&lt;br /&gt;2. HOU (Bourn/Bourgeois/Schafer), 2.1&lt;br /&gt;3. DET (Jackson), 2.0&lt;br /&gt;Leadoff average, 1.6&lt;br /&gt;26. WAS (Bernadina/Desmond/Espinosa), 1.2&lt;br /&gt;28. KC (Gordon/Getz), 1.2&lt;br /&gt;29. BOS (Ellsbury), 1.2&lt;br /&gt;30. BAL (Hardy/Roberts/Andino), 1.2&lt;br /&gt;ML average, 1.1&lt;br /&gt;&lt;br /&gt;The presence of the Red Sox in the bottom three on this list should drive home the point about this not being a quality metric.  The leadoff hitters that rank the lowest in R/BI are those that drive in almost as many runs as they score.  
If you had a leadoff hitter that was driving in many more runs than he scored, that might be cause for some reconsideration of your batting order, but having some scored/batted in parity is not inherently a bad thing.&lt;br /&gt;&lt;br /&gt;A similar gauge, but one that doesn't rely on the teammate-dependent R and RBI totals, is Bill James' Run Element Ratio.  RER was described by James as the ratio between those things that were especially helpful at the beginning of an inning (walks and stolen bases) to those that were especially helpful at the end of an inning (extra bases).  It is a ratio of "setup" events to "cleanup" events.  Singles aren't included because they often function in both roles.  &lt;br /&gt;&lt;br /&gt;Of course, there are RBI walks and doubles are a great way to start an inning, but RER classifies events based on when they have the highest relative value, at least from a simple analysis:&lt;br /&gt;&lt;br /&gt;1. CHA (Pierre), 2.4&lt;br /&gt;2. MIN (Revere/Span), 1.9&lt;br /&gt;3. LA (Gordon/Gwynn/Carroll/Furcal), 1.8&lt;br /&gt;Leadoff average, 1.0&lt;br /&gt;ML average, .8&lt;br /&gt;28. BAL (Hardy/Roberts/Andino), .6&lt;br /&gt;29. BOS (Ellsbury), .6&lt;br /&gt;30. MIL (Weeks/Hart), .6&lt;br /&gt;&lt;br /&gt;Last year, the White Sox led handily in RER, due in large part to Pierre’s steals.  This year, Pierre didn’t steal as many bases but still managed to slap his team to the top. &lt;br /&gt;&lt;br /&gt;Speaking of stolen bases, last year I started including a measure that considered only base stealing.  Obviously there's a lot more that goes into being a leadoff hitter than simply stealing bases, but it is one of the areas that is often cited as important.  So I've included the ranking for what some analysts call net steals, SB - 2*CS.  I'm not going to worry about the precise breakeven rate, which is probably closer to 75% than 67%, but is also variable based on situation.  
The ML and leadoff averages in this case are per team lineup slot:&lt;br /&gt;&lt;br /&gt;1. HOU (Bourn/Bourgeois/Schafer), 29&lt;br /&gt;1. NYN (Reyes/Pagan), 29&lt;br /&gt;3. SEA (Suzuki), 26&lt;br /&gt;Leadoff average, 11&lt;br /&gt;ML average, 3&lt;br /&gt;28. CHA (Pierre), -3&lt;br /&gt;29. STL (Theriot/Furcal), -6&lt;br /&gt;29. CLE (Brantley/Sizemore/Carrera), -6&lt;br /&gt;&lt;br /&gt;The Indians have just missed the trailer spots on a number of these lists.  At least Cleveland and St. Louis are at the bottom largely because their leadoff hitters didn’t attempt that many steals.  Only Milwaukee and Baltimore leadoff hitters (16 and 21 respectively) attempted fewer steals than Cleveland (24) and St. Louis (18).  Neither the Tribe (58%) nor the Redbirds (56%) had much success when they did steal, but they weren’t trying it all that much.  The White Sox, on the other hand, were 31-48 (65%), a poor percentage and the eleventh-most attempts.&lt;br /&gt;&lt;br /&gt;Let's shift gears back to quality measures, beginning with one that David Smyth proposed when I first wrote this annual leadoff review.  Since the optimal weight for OBA in a x*OBA + SLG metric is generally something like 1.7, David suggested figuring 2*OBA + SLG for leadoff hitters, as a way to give a little extra boost to OBA while not distorting things too much, or even suffering an accuracy decline from standard OPS.  Since this is a unitless measure anyway, I multiply it by .7 to approximate the standard OPS scale and call it 2OPS:&lt;br /&gt;&lt;br /&gt;1. BOS (Ellsbury), 882&lt;br /&gt;2. NYN (Reyes/Pagan), 835&lt;br /&gt;3. MIL (Weeks/Hart), 834&lt;br /&gt;Leadoff average, 733&lt;br /&gt;ML average, 723&lt;br /&gt;28. CHA (Pierre), 669&lt;br /&gt;29. SF (Torres/Rowand), 645&lt;br /&gt;30. 
WAS (Bernadina/Desmond/Espinosa), 630&lt;br /&gt;&lt;br /&gt;Along the same lines, one can also evaluate leadoff hitters in the same way I'd go about evaluating any hitter, and just use Runs Created per Game with standard weights (this will include SB and CS, which are ignored by 2OPS):&lt;br /&gt;&lt;br /&gt;1. BOS (Ellsbury), 6.7&lt;br /&gt;2. NYN (Reyes/Pagan), 6.2&lt;br /&gt;3. TEX (Kinsler), 6.1&lt;br /&gt;Leadoff average, 4.6&lt;br /&gt;ML average, 4.4&lt;br /&gt;28. CHA (Pierre), 3.4&lt;br /&gt;29. SF (Torres/Rowand), 3.4&lt;br /&gt;30. WAS (Bernadina/Desmond/Espinosa), 3.4&lt;br /&gt;&lt;br /&gt;Finally, allow me to close with a crude theoretical measure of linear weights supposing that the player always led off an inning (that is, batted in the bases empty, no outs state).  There are weights out there (see &lt;U&gt;The Book&lt;/u&gt;) for the leadoff slot in its average situation, but this variation is much easier to calculate (although also based on a silly and impossible premise).  &lt;br /&gt;&lt;br /&gt;The weights I used were based on the 2010 run expectancy table from &lt;U&gt;Baseball Prospectus&lt;/u&gt;.  Ideally I would have used multiple seasons but this is a seat-of-the-pants metric.  Last year’s post went into the detail of how I figured it; this year, I’ll just tell you that the out coefficient was -.22, the CS coefficient was -.587, and for other details refer you to that post.  I then restate it per the number of PA for an average leadoff spot (741 in 2011):&lt;br /&gt;&lt;br /&gt;1. BOS (Ellsbury), 29&lt;br /&gt;2. TEX (Kinsler), 26&lt;br /&gt;3. NYN (Reyes/Pagan), 25&lt;br /&gt;Leadoff average, 0&lt;br /&gt;ML average, -3&lt;br /&gt;28. CHA (Pierre), -20&lt;br /&gt;29. WAS (Bernadina/Desmond/Espinosa), -20&lt;br /&gt;30. SF (Torres/Rowand), -21&lt;br /&gt;&lt;br /&gt;From an overview of all of these metrics, I think it’s safe to say that Red Sox and Mets leadoff hitters were pretty effective while White Sox, Nationals and Giants were not. 
 I was a little disappointed that the Braves and Astros didn’t make any lists together here as each team used both Michael Bourn and Jordan Schafer in twenty or more games out of the #1 spot.  Obviously that’s a possibility when players are traded for each other, but it would have been particularly amusing had one team been on the leader list and the other on the trailer list.&lt;br /&gt;&lt;br /&gt;A spreadsheet with all of the data and the full lists is &lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;hl=en_US&amp;key=0AnPJbQnlHhRHdEpaM19yd25fa3RpSW5wTHA1RTVhVWc&amp;output=html"&gt;available&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-8157668036312076447?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/8157668036312076447/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/12/2011-leadoff-hitters.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/8157668036312076447'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/8157668036312076447'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/12/2011-leadoff-hitters.html' title='2011 Leadoff Hitters'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-5150667484628897402</id><published>2011-12-01T03:14:00.020-05:00</published><updated>2011-12-01T03:14:00.044-05:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='Meanderings'/><title type='text'>Statistical Meanderings 2011</title><content type='html'>I have to apologize in advance for this--it sort of resembles a bad Jayson Stark piece with better metrics but less interesting tidbits.  &lt;br /&gt;&lt;br /&gt;* The discrepancy in R/G between the AL and NL (for the offenses) expanded to .33 (4.46 to 4.13) after a one-year blip that saw the two circuits only .12 runs apart.  The leagues were equal in walk rate (.090 and .091 per at bat), but the AL hit for a higher BA (.258 to .253) and with more power (.150 to .139 ISO).&lt;br /&gt;&lt;br /&gt;* I certainly do not intend to dispute the notion that Houston was the worst team in baseball, but Minnesota actually had a lower EW% and PW%.  Based on runs and runs allowed, Houston “should have” won 61.8 games to Minnesota’s 61.5, and runs created expected a wider gap, 63.3 to 59.8.  Obviously this does not consider strength of schedule, but it does put into perspective just how disastrous the Twins’ season was.  &lt;br /&gt;&lt;br /&gt;* Tampa Bay led the majors in converting balls in play into outs by a wide margin; their DER of .712 was as far ahead of second place LAA as the Angels were ahead of twentieth place STL.  The Rays also led the majors in modified fielding average, albeit not by a runaway margin.&lt;br /&gt;&lt;br /&gt;As a brief aside, “modified” fielding average is no more complex or accurate than regular old fielding average, except I remove strikeouts and assists from the formula.  It would actually be easier to work with if I looked at the complement (errors/(putouts less strikeouts + errors)), but fielding average has been expressed that way forever and it’s not a particularly telling metric in any event.  &lt;br /&gt;&lt;br /&gt;* In 2010, major league teams had an unusually high W% at home (.559) and 28 teams had a higher W% at home than on the road.  
This led to some speculation about whether there was something afoot.  &lt;br /&gt;&lt;br /&gt;2011 did not provide any such conspiracy fodder.  Home teams had an abnormally low W% (.526), and only 23/30 teams (77%) won with a greater frequency at home.  It was the lowest HW% for MLB since 2001 (.524), and 2005 was the last time that only 23 teams were better at home (only 20 were in 2001).&lt;br /&gt;&lt;br /&gt;* The Giants scored 2.91 runs per game at home, the lowest output since 1972.  They had to do the near impossible to achieve this by scoring less than the legendary 2010 Mariners (2.95).  Offensive ineptitude combined with their good defense resulted in San Francisco playing in the lowest overall scoring context (7.09 RPG) in the majors since the 2003 Dodgers (6.98).&lt;br /&gt;&lt;br /&gt;* Don’t tell anyone, but the two teams that struck out the fewest times were the Rangers (930) and the Cardinals (978).  Both did unsurprisingly ground into a lot of double plays--Texas was sixth in MLB with 135 and St. Louis’ 169 was sixteen more than second place Baltimore.&lt;br /&gt;&lt;br /&gt;* I always like to run a chart showing each playoff team’s RAA broken down by offense and defense:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-M-gNMQh7Unw/Tta19_RaHpI/AAAAAAAAA5E/xUdf1rvhCV0/s1600/playoffraa.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="170" src="http://2.bp.blogspot.com/-M-gNMQh7Unw/Tta19_RaHpI/AAAAAAAAA5E/xUdf1rvhCV0/s400/playoffraa.jpg" width="193" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As you can see, the average playoff team was fairly balanced.  The only subpar unit in the group was the defense of the World Champion St. 
Louis Cardinals.&lt;br /&gt;&lt;br /&gt;* At first glance, there was nothing remarkable about the Kansas City bullpen:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-Yo414lP2Jt8/Tta2wAym7jI/AAAAAAAAA5Q/C5RF20E09_U/s1600/kcpen.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="135" width="368" src="http://3.bp.blogspot.com/-Yo414lP2Jt8/Tta2wAym7jI/AAAAAAAAA5Q/C5RF20E09_U/s400/kcpen.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Their 4.26 relief eRA was equal to the American League average.  But the interesting thing is that all of them were rookies except for Joakim Soria.  I’ve already said nice things about Greg Holland in my Rookie of the Year post, so I won’t repeat that here.&lt;br /&gt;&lt;br /&gt;* Someone &lt;a href="http://www.beyondtheboxscore.com/2011/10/16/2492501/most-specialized-relievers-by-outing-length-2011"&gt;beat me to it&lt;/a&gt;, but it is worth pointing out how low Trever Miller’s innings to appearance ratio was, particularly during his time in St. Louis.  Miller recorded 47 outs in 39 appearances (1.21 O/G) with the Cards.  I cannot state this absolutely, but I believe that is the lowest ratio in ML history for a pitcher with 20 or more appearances.  The previous low I can find is Randy Flores with the 2009 Rockies (36 outs/27 games, 1.33).  Miller’s complete season line was a yeoman 64 outs in 48 games, tying Flores’ record.  A fitting achievement for Tony LaRussa’s final season if I may say so myself.&lt;br /&gt;&lt;br /&gt;* One of the stats I track for relievers is inherited runners/game.  In an era where leverage index is readily available, it doesn’t yield much marginal value, but I always like looking at closer usage through IR/G.  Closers usually dominate the bottom of the IR/G list (I believe Mariano Rivera led full-time AL closers at .31, which was 71st out of 85 relievers), but it’s always fun to see which closers were never brought in with runners on base.  
If a manager never calls on his closer with runners on, he’s either really locked into bullpen roles, or he really doesn’t trust him.  I’d assume the latter was the case with Kevin Gregg, who inherited zero runners in 2011.  The former was the case for John Axford (1 in 74 appearances).&lt;br /&gt;&lt;br /&gt;* Brian Wilson has taught us that a quirky personality, a ridiculous beard, and a World Series ring can get you a lot of commercials with 7 RAR.  Who was the last closer so marginal who got so much publicity?&lt;br /&gt;&lt;br /&gt;* Which Yankee reliever is which?&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-TVmDJ4KWtJI/Tta3facv3II/AAAAAAAAA5c/01HA-JUxUNs/s1600/drmr.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="51" width="229" src="http://1.bp.blogspot.com/-TVmDJ4KWtJI/Tta3facv3II/AAAAAAAAA5c/01HA-JUxUNs/s400/drmr.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The point here is not to compare the two, but to point out that David Robertson had a really great season.&lt;br /&gt;&lt;br /&gt;* You wouldn’t know it from watching the playoffs (and Ron Washington and the Rangers’ reluctance to use him, which eventually turned into dropping him from the roster outright), but Koji Uehara ranked fifth in RAR among AL relievers and was seventeenth last year.  Of course, if all you went by was Washington’s managing, you would be shocked to learn where Nick Punto tends to rank on RAR lists.&lt;br /&gt;&lt;br /&gt;* Five major league starters averaged 110 or more pitches per start this year, which has to be the most in some time.  I’m pretty sure that hasn’t happened since I’ve been including P/S in my year end stat reports, although I didn’t go back and check to make sure.  
The five were:  Verlander (117), Weaver (113), Halladay (111), Shields (111) and Sabathia (110).&lt;br /&gt;&lt;br /&gt;* At the risk of cherry picking (as I’m sure I’m leaving out some pitchers that were talked about similarly but have had continued success, plus one season is obviously insufficient to draw conclusions in any event), I always find it a little satisfying when pitchers that were said to be DIPS beaters have either terrible or high BABIP seasons.  Trevor Cahill is in the latter category--he wasn’t horrible by any means, and a .306 BABIP is not that high, but it still is not the kind of season a good DIPS beater should have.  JA Happ, on the other hand, was atrocious and gave up an identical .306 BABIP.  Even Charlie Morton sort of fits--even looking at his entire season, he wound up at -3 RAA with a .323 BABIP.  Along those lines, what are the odds that Josh Tomlin is in the major leagues in five years?  They can’t be that good.&lt;br /&gt;&lt;br /&gt;* JoJo Reyes seemed to get a lot of attention for his lengthy (by time, especially) losing streak early in the year.  Or perhaps my impression of that is off, magnified by the fact that I watched him get his first win pitching against Cleveland.  In any event, Reyes may have had some bad luck along the way, but a lot of it evened out in 2011.  A pitcher with a 6.45 RRA, 6.24 eRA and 5.21 dRA should consider himself darn lucky to wind up 7-11.&lt;br /&gt;&lt;br /&gt;* PSA: David Freese is 28 and ranked 6th in RG among NL third basemen.  I overlooked it, but Chase Headley actually had a .393 OBA and created 6.1 runs per game, second to Pablo Sandoval among NL third basemen.  So postseason hardware aside, Padres fans shouldn’t feel too terrible about which of their possible third basemen they actually have.&lt;br /&gt;&lt;br /&gt;* AL players with negative RAR who at one time were actually good included Vernon Wells, Magglio Ordonez, JD Drew, Justin Morneau, Alex Rios, Chone Figgins and Adam Dunn.  
Morneau went from first among AL first basemen in RG in his concussion-shortened 2010 to last in 2011.&lt;br /&gt;&lt;br /&gt;* AL players who had an OBA greater than their SLG were: Ryan Sweeney, Chris Getz, JD Drew and Adam Dunn.  But for as bad as Dunn’s season was, Chone Figgins’ was actually worse on a rate basis.  Figgins only played in 81 games to Dunn’s 122, but still held just a -15 to -17 RAR lead.  Figgins created 1.75 runs per game, lowest among all major league players with 300 PA, lower even than Paul Janish (1.90).&lt;br /&gt;&lt;br /&gt;* Which of these teammates would you assume was more valuable, based on the statistics presented here?&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-SPU749u0Cmw/Tta-GiVjL7I/AAAAAAAAA5o/ait6T_llduE/s1600/nlteammates.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="51" width="268" src="http://2.bp.blogspot.com/-SPU749u0Cmw/Tta-GiVjL7I/AAAAAAAAA5o/ait6T_llduE/s400/nlteammates.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Of course, any opinion you’d form would be woefully incomplete, because I’ve only given you offensive statistics, without telling you anything about position or defense.  Offensively, though, they are nearly indistinguishable.  So what if I tell you that one of these players is a slow first baseman and the other one is a center fielder?  
Surely, the center fielder must have been more valuable, right?&lt;br /&gt;&lt;br /&gt;How about these two teammates?&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-0wIWKSAERYA/Tta-twbxJeI/AAAAAAAAA6A/NN5fwO3DHo8/s1600/teammates2.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="54" width="266" src="http://1.bp.blogspot.com/-0wIWKSAERYA/Tta-twbxJeI/AAAAAAAAA6A/NN5fwO3DHo8/s400/teammates2.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;They both play the same position, but one of them was signed as a free agent and took the other’s spot at their common position (third base)--so the one who was pushed off played 105 games at 1B/DH and 55 games at the other infield positions.  The one who got the fielding job was more likely the more valuable player, right?&lt;br /&gt;&lt;br /&gt;One would think.  But the first baseman finished 10th in the MVP voting and the center fielder finished 13th.  The third baseman finished fifteenth while the 1B/DH finished 8th and got a first place vote.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-5150667484628897402?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/5150667484628897402/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/12/statistical-meanderings-2011.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5150667484628897402'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5150667484628897402'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/12/statistical-meanderings-2011.html' title='Statistical Meanderings 
2011'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-M-gNMQh7Unw/Tta19_RaHpI/AAAAAAAAA5E/xUdf1rvhCV0/s72-c/playoffraa.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-1212532603095364354</id><published>2011-11-15T23:10:00.000-05:00</published><updated>2011-11-15T23:10:00.381-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Awards'/><title type='text'>IBA Ballot: MVP</title><content type='html'>My position on pitchers as MVP candidates is pretty simple: I think they absolutely should be considered.  However, that doesn’t mean it’s a common occurrence for me to conclude that a pitcher was the MVP of his league.  In general, I think that given modern workloads, it is much more likely for a batter to be the MVP than a pitcher.  Additionally, when I conclude that a pitcher and a position player are indistinguishable in terms of value, I will usually hedge my bets and go with the batter.  A corollary to this is that I’d like the pitcher’s peripheral statistics to indicate that he is equally or more valuable than his batting rivals, not just his actual runs allowed.  This is a higher hurdle to clear, since the best pitchers in terms of runs allowed are more likely than not to have outpitched their peripherals.&lt;br /&gt;&lt;br /&gt;The end result of this thinking is that somewhere around 2-4 pitchers are sprinkled through my MVP ballot, but rarely is one listed at #1.  
I’ve been formally writing up my ballots for this blog since 2006, which gives me ten league-seasons with which to quantify my thought process:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-Qh7TmfPJZ84/TsM2JmxubtI/AAAAAAAAA44/YZwJCKVgtQ8/s1600/pitchermvp.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="206" width="258" src="http://2.bp.blogspot.com/-Qh7TmfPJZ84/TsM2JmxubtI/AAAAAAAAA44/YZwJCKVgtQ8/s400/pitchermvp.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As you can see, on average I list three pitchers on my ballot, with the leading pitcher placed fourth.  Obviously I’m biased, but I think this is a very fair treatment of pitchers.&lt;br /&gt;&lt;br /&gt;All of this bloviating and laughably in-depth analysis of my own previous ballots is necessary because, for the first time since I’ve been doing this, there is a popular movement to vote a starting pitcher as MVP.  I want to make it clear, and I think I have, that if I don’t feel that Justin Verlander was the AL MVP, it’s not because of some bias against pitchers, but simply that I felt other player(s) were more valuable in 2011.&lt;br /&gt;&lt;br /&gt;Verlander has gained traction as a candidate for two reasons.  One, he pitched for a playoff team, and heaven knows that mainstream types will bend over backwards to try to give the MVP to a player whose contributions were “actually valuable”, or whatever argument they’d like to use to dismiss players whose teammates just weren’t that good.  It also helps that of the AL playoff teams, Detroit was something of a surprise (they were certainly the most surprising to me, although the voters would probably give that nod to Tampa Bay), and they made a strong surge in August and September to run away with their division.  
That’s a good narrative.&lt;br /&gt;&lt;br /&gt;Secondly, Verlander’s W-L record is very impressive (24-5), and we all know that the mainstream still is easily distracted by a shiny W-L record.  And oh yeah, third, he pitched very well by any measure.&lt;br /&gt;&lt;br /&gt;That last point, though, is where I’m not as enthusiastic about Verlander.  The mainstream view is that Verlander was obviously the AL’s best pitcher in 2011--my view is that he was a solid #1, but Jered Weaver can’t just be laughed off.  Verlander’s season is not historic by any means when viewed through the lens of RAR--for the last five seasons, the AL pitching RAR leaders’ totals have been:&lt;br /&gt;&lt;br /&gt;72, 84, 95, 76, 84&lt;br /&gt;&lt;br /&gt;Verlander’s 84 is very good, but the average of the previous four AL leaders was 82.  It’s a fairly typical league-leading type of performance, a very solid Cy Young-type season, but not one for the ages either.&lt;br /&gt;&lt;br /&gt;However, I have Jose Bautista at 82 RAR/63 RAA, I don’t see any compelling reason to penalize him for his defense or baserunning (UZR doesn’t think much of him, but Dewan’s DRS and Wyers’ FRAA don’t share that evaluation), and I don’t care that his team finished in fourth place.  Verlander does not look nearly as good when evaluated by dRA, and so when there’s reasonable doubt that the pitcher was more valuable than the position player, I side with the position player.&lt;br /&gt;&lt;br /&gt;I also have placed Verlander’s teammate Miguel Cabrera ahead of him, albeit with much less conviction.  Cabrera’s offensive value is essentially indistinguishable from Bautista’s--I estimate that Cabrera created 137 runs in 376 outs while Bautista created 134 runs in 363 outs (9.3 to 9.4 RG, 71 to 70 HRAA).  However, Cabrera played first base and there’s reason to believe he’s a below-average fielder, putting Bautista ahead.  
Compared to Verlander, though, I think the case can be made that he was a little more valuable.&lt;br /&gt;&lt;br /&gt;Among the other position player candidates to fill out the ballots, Jacoby Ellsbury ranks first in RAR, plus fielding and baserunning would seem to work in his favor.  Adrian Gonzalez was right behind his teammate in RAR, and has a good fielding reputation and a decent showing in fielding metrics.  &lt;br /&gt;&lt;br /&gt;The other three spots all go to second basemen.  I suppose one can argue that the positional adjustments I use are too kind to second basemen, but I just happen to think there is a collection of very talented second basemen in the AL at this time.  Dustin Pedroia was just behind Ellsbury and Gonzalez in RAR.  Curtis Granderson (56 RAR) and Mike Napoli (56) rank ahead of the trio of Robinson Cano (53), Ben Zobrist (52), and Ian Kinsler (50), but Granderson’s fielding raises at least a little concern.  Napoli’s RAR gives him a full catcher position adjustment, but he actually played nearly as many games between first base and DH (53) as he did as a catcher (61).  While his 8.5 RG ranked third in the AL behind Bautista and Cabrera, he also logged just 427 PA.&lt;br /&gt;&lt;br /&gt;Among the three remaining second basemen, the offensive differences are small enough to throw a bone to Kinsler’s well-regarded fielding (at least by the various metrics) and baserunning, while keeping in mind that Zobrist, like Napoli, also played a fair amount at less demanding positions.  
Evan Longoria will probably get a lot more love from others, but he ranks 16th on my RAR list and would require more fielding credit than I am comfortable with (or a repudiation of the position adjustment for 3B relative to 2B) to make the ballot:&lt;br /&gt;&lt;br /&gt;1) RF Jose Bautista, TOR&lt;br /&gt;2) 1B Miguel Cabrera, DET&lt;br /&gt;3) SP Justin Verlander, DET&lt;br /&gt;4) SP Jered Weaver, LAA&lt;br /&gt;5) CF Jacoby Ellsbury, BOS&lt;br /&gt;6) 1B Adrian Gonzalez, BOS&lt;br /&gt;7) 2B Dustin Pedroia, BOS&lt;br /&gt;8) SP James Shields, TB&lt;br /&gt;9) SP CC Sabathia, NYA&lt;br /&gt;10) 2B Ian Kinsler, TEX&lt;br /&gt;&lt;br /&gt;In the National League, there is no need for philosophical reflection about the value of a pitcher versus a position player, or any need for intricate comparisons of multiple players.  There is only one question that needs to be answered: Can you make a case against Matt Kemp?&lt;br /&gt;&lt;br /&gt;Kemp led NL hitters in RAR by 12, and was tied with Ryan Braun for the league lead with an 8.5 RG.  His fielding is probably not great, but since no one else was particularly close in RAR, you’d have to think he was pretty bad and that Ryan Braun or Prince Fielder or Jose Reyes was really good in the field to close the gap.  I don’t see any reason to believe that, so Kemp is my runaway choice as NL MVP.&lt;br /&gt;&lt;br /&gt;Filling out the rest of the ballot, Ryan Braun is a very strong candidate for #2.  The three pitchers (Halladay, Kershaw, and Lee) that were very close for the Cy Young are all strong mid-ballot choices.  Prince Fielder was very good, but inferior to his teammate at the plate and he’s not a strong candidate for fielding and baserunning credit.  Jose Reyes and Joey Votto are also in the mix.  &lt;br /&gt;&lt;br /&gt;As you can see, I’m having trouble finding much to say about the NL ballot.  
My RAR list actually makes it pretty straightforward; obviously small differences are not meaningful, but I don’t see a lot of compelling reasons to step in and make changes.  The only player who drops far below his RAR is Lance Berkman, who obviously is not much of a fielder at this point and who I would be loath to argue was more valuable than teammate Pujols.  And that leaves him without a spot:&lt;br /&gt;&lt;br /&gt;1) CF Matt Kemp, LA&lt;br /&gt;2) LF Ryan Braun, MIL&lt;br /&gt;3) SP Roy Halladay, PHI&lt;br /&gt;4) 1B Prince Fielder, MIL&lt;br /&gt;5) SS Jose Reyes, NYN&lt;br /&gt;6) 1B Joey Votto, CIN&lt;br /&gt;7) SP Clayton Kershaw, LA&lt;br /&gt;8) SP Cliff Lee, PHI&lt;br /&gt;9) 1B Albert Pujols, STL&lt;br /&gt;10) SS Troy Tulowitzki, COL&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-1212532603095364354?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/1212532603095364354/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/11/iba-ballot-mvp.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/1212532603095364354'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/1212532603095364354'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/11/iba-ballot-mvp.html' title='IBA Ballot: MVP'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://2.bp.blogspot.com/-Qh7TmfPJZ84/TsM2JmxubtI/AAAAAAAAA44/YZwJCKVgtQ8/s72-c/pitchermvp.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-3398639323353669059</id><published>2011-11-09T01:16:00.000-05:00</published><updated>2011-11-09T01:16:00.061-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Awards'/><title type='text'>IBA Ballot: Cy Young</title><content type='html'>In retroactively evaluating starting pitchers, I start with their actual runs allowed (crudely adjusted for bequeathed runners to produce what I call RRA).  I consider peripherals, primarily what I call eRA (basically a component RA) and dRA (a DIPS RA).  However, I do not start with either of those, and if there is a substantial difference in RRA, I usually don’t override it lightly.  I’m not sure that this stance makes much of a difference in this year’s Cy Young vote, at least at the top of the ballot--the top guys fare well however you slice it, but it does put me at odds with anyone following what could be called the Fangraphs school of pitcher evaluation.&lt;br /&gt;&lt;br /&gt;Everyone has handed the AL Cy Young to Justin Verlander, but consider this:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-9-0XCoDsRdY/TrmvkgVtJJI/AAAAAAAAA4g/nj2H6aVpoc8/s1600/weaververlander.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="52" width="321" src="http://3.bp.blogspot.com/-9-0XCoDsRdY/TrmvkgVtJJI/AAAAAAAAA4g/nj2H6aVpoc8/s400/weaververlander.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I don’t think that, looking at these categories, you can come to any sort of clear conclusion about who was the better pitcher.  The first guy pitched sixteen more innings, but he allowed .15 more runs per game, so when you compare them to a baseline, they are just about even.  
The first pitcher had a better eRA, which is a positive, but the second pitcher didn’t grossly outpitch his peripherals.  All things outside of this chart being equal, I’d give the edge to the first pitcher, but I would hardly consider it a landslide.&lt;br /&gt;&lt;br /&gt;As you probably know, the first pitcher is Justin Verlander; the second pitcher is Jered Weaver.  Weaver also trailed Verlander in dRA (3.44 to 3.75), which I purposely omitted in order to make a point, and obviously Verlander has the win-loss angle going for him in the mainstream.  I have no qualms about putting Verlander first on my ballot, but Weaver ensured that he didn’t run away from the rest of the AL field.&lt;br /&gt;&lt;br /&gt;James Shields has a fairly large lead for the third spot on the RAR list at 77, with Sabathia fourth at 66 and three pitchers tightly clustered just below (Romero 62, Haren 62, Beckett 61).  Shields and Romero both benefitted from low BABIPs (.259 and .247).  Shields’ Rays teammates did lead the majors in DER by a wide margin; it wasn’t just him who was getting great defensive support.  Still, as discussed above, given Shields’ sizeable RAR lead over the others, I’m more comfortable giving him the nod.  It is enough to drop Romero out of the running for the last spot on the ballot, which comes down to Haren and Beckett.&lt;br /&gt;&lt;br /&gt;Haren worked 45 more innings, but Beckett’s RRA was .51 runs lower and his eRA was .42 runs lower.  However, Haren’s dRA was .24 runs lower, and since the peripherals are a split decision, I’m more comfortable going with the guy who worked a lot more.  Fried chicken was not a factor in this decision:&lt;br /&gt;&lt;br /&gt;1) Justin Verlander, DET&lt;br /&gt;2) Jered Weaver, LAA&lt;br /&gt;3) James Shields, TB&lt;br /&gt;4) CC Sabathia, NYA&lt;br /&gt;5) Dan Haren, LAA&lt;br /&gt;&lt;br /&gt;The NL Cy Young is very close.  
Consider these two pitchers:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-54gk3LTW5t4/TrmvlCCQK6I/AAAAAAAAA4s/06RHOY1kgEs/s1600/kershawlee.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="53" width="385" src="http://4.bp.blogspot.com/-54gk3LTW5t4/TrmvlCCQK6I/AAAAAAAAA4s/06RHOY1kgEs/s400/kershawlee.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It would be tough to get much closer than that, wouldn’t it?  While it appears that Clayton Kershaw will win the award and that Roy Halladay is the consensus #2, the top line on that table is Kershaw and the second line is Cliff Lee.  Lee’s season is nearly indistinguishable from Kershaw’s in the categories that drive my decision.  Halladay’s same categories line: 234, 2.48, 2.70, 2.71, 43, 73.&lt;br /&gt;&lt;br /&gt;This race is close enough that I decided to take a look at each pitcher’s performance on a game-by-game basis, using the relatively crude gW% I discussed in &lt;a href="http://walksaber.blogspot.com/2011/08/completely-unnecessary-pitching-metric.html"&gt;this post&lt;/a&gt;.  However, looking at each game on its own does little more than verify that these pitchers were all very close: Lee leads the way at .685, but Halladay at .680 and Kershaw at .679 are right behind.&lt;br /&gt;&lt;br /&gt;We could consider strength of schedule.  On the team level, and considering just the opponent’s overall quality rather than isolating opposing offense as would be more useful for comparing pitchers, my crude team rankings indicate that PHI and LA played nearly the same caliber of opposition--PHI has a 95 SOS and LA a 94.  Baseball Prospectus’ data on quality of opposing hitter reveals that Halladay’s average opponent hit .260/.330/.413, Kershaw’s .263/.327/.416, and Lee’s .266/.332/.423.  Respectively, those lines translate to approximate runs/game of 4.69, 4.67, and 4.83.  
But over 233 innings, even the difference between the high (Lee, 4.83) and the low (Kershaw, 4.67) is just 4 runs, and those figures probably shouldn’t be applied without any sort of regression.  &lt;br /&gt;&lt;br /&gt;From the game-by-game analysis, I can also compute the pitcher’s personal park factor weighted by innings pitched in each park rather than assuming that each pitcher logged a 50/50 home/road innings split.  My standard park factor for PHI is 101 versus 97 for LA.  Halladay and Lee’s personal park factors are both 101, while Kershaw’s is 96, making any sort of deviation from just using the team PFs an exercise in futility.&lt;br /&gt;&lt;br /&gt;I put next to zero stock in win-loss record, but Kershaw’s 21-5 mark is obviously more impressive than Halladay’s 19-6 and Lee’s 17-8 when compared to their teams’ winning percentages.  The pitchers’ run support figures (from ESPN.com) were 5.89, 5.52, and 4.95 respectively, which helps explain why Lee’s record lags behind the other two, but does next to nothing to help us sort out how effective they all were.&lt;br /&gt;&lt;br /&gt;In the end, I give the nod to Halladay--he led the three pitchers in all three run averages, and he does have a 5 RAR lead.  That doesn't prove he was better, but I have no reason to override it.  I think that a reasonable person could easily conclude that Kershaw or Lee deserved the award as well.  &lt;br /&gt;&lt;br /&gt;For the other two spots on the ballot, the RAR list highly recommends Ian Kennedy (61 RAR) and Cole Hamels (59), as Tim Lincecum is sixth on the list a full eight runs behind Hamels.  Hamels has slightly better peripherals than Kennedy, and while batted ball metrics are of questionable value, he does much better in those categories than Kennedy.  
In this case it confirms my default position (Hamels &gt; Kennedy), and so I’ll give in and fill out my ballot as follows:&lt;br /&gt;&lt;br /&gt;1) Roy Halladay, PHI&lt;br /&gt;2) Clayton Kershaw, LA&lt;br /&gt;3) Cliff Lee, PHI&lt;br /&gt;4) Cole Hamels, PHI&lt;br /&gt;5) Ian Kennedy, ARI&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-3398639323353669059?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/3398639323353669059/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/11/iba-ballot-cy-young.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3398639323353669059'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3398639323353669059'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/11/iba-ballot-cy-young.html' title='IBA Ballot: Cy Young'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-9-0XCoDsRdY/TrmvkgVtJJI/AAAAAAAAA4g/nj2H6aVpoc8/s72-c/weaververlander.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-284044475919693415</id><published>2011-10-31T19:08:00.000-04:00</published><updated>2011-10-31T19:08:52.225-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Awards'/><title type='text'>IBA Ballot: Rookie of the Year</title><content type='html'>I will admit up 
front that I have not paid much attention this year to the award debates, either in the mainstream or the sabersphere.  This is good in the sense that I am coming into this cold, without having read many other perspectives that might bias me one way or another.  It’s also bad for the same reason--while I don’t think I’ve ever found mainstream commentary on player value particularly useful, there are a lot of others out there worth reading.&lt;br /&gt;&lt;br /&gt;I simply decided this year that I wasn’t going to waste any time thinking about awards until the season was over.  Not that I ever obsessed over them previously, but I pretty much completely shut them out of my mind this year.  So much so that when an acquaintance who knows I’m a baseball nut asked me who I thought should be the NL Cy Young winner a couple weeks ago, he was shocked when the best I could offer was “uh, probably either Halladay or Kershaw”.&lt;br /&gt;&lt;br /&gt;In any event, let me start in the AL.  This rookie crop belongs to the pitchers.  My top three candidates are all starting pitchers.  Michael Pineda got off to the best start, Ivan Nova had the flashiest win-loss record, but Jeremy Hellickson was the AL’s most valuable rookie pitcher.  Hellickson led the trio in innings (189 to Pineda’s 171 and Nova’s 165) and RRA (3.24 to 3.81 and 4.10).  Combining the two, I have Hellickson at 52 RAR, Pineda 36, and Nova 30.&lt;br /&gt;&lt;br /&gt;Hellickson’s BABIP was just .229, so from a strict DIPS perspective one could make a case for Pineda (or even Nova) ahead of Hellickson.  But for a retrospective award, I stick to actual runs allowed and first-order component RA for the most part.  If Pineda and Hellickson were close, I would consider moving the former ahead, but the gap is too big in this case.&lt;br /&gt;&lt;br /&gt;For the remaining two spots on the ballot, the top position players are Dustin Ackley, Eric Hosmer, and Jemile Weeks.  
Ackley was the most productive hitter of the three, while Hosmer had 130 more PA than either of them.  I have Ackley and Weeks both at 23 RAR with Hosmer at 21.  Fielding and baserunning would seem to favor Weeks.&lt;br /&gt;&lt;br /&gt;Greg Holland deserves at least a mention as a reliever.  Holland stranded 31 of 33 baserunners, the second-best performance of any AL reliever, and his peripherals were terrific as well.  However, his 26 RAR is thanks in large part to the inherited runner performance, and thanks to Hosmer I wouldn’t be comfortable naming him the most valuable rookie on his own team.  So I see it as:&lt;br /&gt;&lt;br /&gt;1) SP Jeremy Hellickson, TB&lt;br /&gt;2) SP Michael Pineda, SEA&lt;br /&gt;3) SP Ivan Nova, NYA&lt;br /&gt;4) 2B Jemile Weeks, OAK&lt;br /&gt;5) 2B Dustin Ackley, SEA&lt;br /&gt;&lt;br /&gt;You’ll note that I consider Mark Trumbo an afterthought.  Yes, he hit 29 homers, but he also drew just 25 walks.  His .290 OBA was second-lowest among AL first basemen with 300 PA, so despite the power, he ranks in the middle of the pack offensively at his position.  He wouldn’t crack my top ten.&lt;br /&gt;&lt;br /&gt;If Trumbo is the biggest source of divergence between my take on the award and the mainstream’s, his NL counterpart will certainly be Craig Kimbrel.  Kimbrel was terrific by any measure, but in the end you have 77 innings pitched.  I don’t believe in extreme leverage bonuses--or much of a leverage bonus at all.  I’ll give him an arbitrary 25% boost to get to 25 RAR, but no more.&lt;br /&gt;&lt;br /&gt;Among position players, the three standouts are Kimbrel’s teammate Freddie Freeman and Washington teammates Wilson Ramos and Danny Espinosa.  I have them all essentially even in terms of RAR at 27.  BP’s FRAA likes Espinosa’s fielding and baserunning, and that’s enough to put him in the lead.  
I suspect Freeman will get more support than Ramos, but the two aren’t that far apart as hitters, with Freeman creating 5.3 runs per game and Ramos 5.0.  Freeman had nearly 200 more PA, but Ramos is a catcher.  Freeman’s fielding reputation is good, but his FRAA was -5.  It can go either way, but I prefer Ramos.&lt;br /&gt;&lt;br /&gt;Josh Collmenter and Vance Worley were the top starters, with apologies to Cory Luebke, who I could certainly make a ballot case for, but will refrain lest I be accused of favoritism.  Collmenter worked 23 more innings than Worley, which puts him 5 RAR ahead (36 to 31).  Collmenter did have a BABIP of just .263 to Worley’s .293, but the dRA difference is not large enough (4.06 to 3.72) to convince me to put Worley ahead.&lt;br /&gt;&lt;br /&gt;Depending on how you value Espinosa’s fielding, you certainly could conclude that he was more valuable than Collmenter--conservatively, I’ll stick with the latter, and so my ballot is:&lt;br /&gt;1) SP Josh Collmenter, ARI&lt;br /&gt;2) 2B Danny Espinosa, WAS&lt;br /&gt;3) SP Vance Worley, PHI&lt;br /&gt;4) C Wilson Ramos, WAS&lt;br /&gt;5) RP Craig Kimbrel, ATL&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-284044475919693415?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/284044475919693415/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/10/iba-ballot-rookie-of-year.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/284044475919693415'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/284044475919693415'/><link rel='alternate' type='text/html' 
href='http://walksaber.blogspot.com/2011/10/iba-ballot-rookie-of-year.html' title='IBA Ballot: Rookie of the Year'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-7007062362735416736</id><published>2011-10-28T17:17:00.003-04:00</published><updated>2011-10-30T01:43:37.841-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Meanderings'/><category scheme='http://www.blogger.com/atom/ns#' term='Whimsy'/><title type='text'>Baseball</title><content type='html'>It has been a part of my life for almost as long as I can remember and it will remain so for as long as I live.  For seven months of the year, it is as familiar a part of my life as brushing my teeth or eating dinner, and so it is easy to take for granted.  But then one day I wake up and suddenly it is gone, and in the void there is malaise.  When the weather is nice, it is played; when it is dark and cold, it moves towards the tropics and away from focus.  While it can be used to tell seasons, it scoffs at time while it is played.  The competitors dictate the endpoint through their play.&lt;br /&gt;&lt;br /&gt;It is a team game, but in many ways it allows the individual to stand and be judged on his own merits.  It is a game that, through its variants and offshoots, is quite playable by a large number of people.  It is the great American pastime, but it is also the great Cuban passion, the great Dominican pastime, perhaps the most popular import Japan has ever known.  We call it baseball, but it is equally beisbol, yakyu, honkbal, pelota.  
&lt;br /&gt;&lt;br /&gt;It is a game simple enough that it can be described (and recorded, on nothing more complex than a piece of paper) discretely--by inning, by score, by out, by baserunner, by count--yet complex enough that there are hundreds and hundreds of people like me who are fascinated by it and spend much of our free time thinking about it, yet we still discover new things about it. &lt;br /&gt;&lt;br /&gt;And if you are wired to view the world in a certain way, to try to find and verify patterns, to quantify when possible, and sometimes to find meaning and order through randomness and chance--then sabermetrics is a vessel for enjoying it, understanding it, and celebrating it.  To know that what we have seen over the last month is not just unlikely--but rather to have a systematic way of thinking that allows us to estimate just how unlikely--does not detract from it.  &lt;br /&gt;&lt;br /&gt;Once in a while we are presented with just one more game--one game that is, without question, the end.  It almost goes against the spirit of the game to be pettily constrained by a set limit of games that cannot be cheated, unlike the nine innings that often become ten, and sometimes become twelve, and on glorious occasions become twenty, and in theory can be infinite.  
The potential is often greater than the payoff--but either way, the journey was incredible.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-7007062362735416736?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/7007062362735416736/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/10/baseball.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7007062362735416736'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7007062362735416736'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/10/baseball.html' title='Baseball'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-7234152592618296495</id><published>2011-10-09T23:54:00.000-04:00</published><updated>2011-10-09T23:54:44.224-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Playoffs'/><title type='text'>Brief Playoff Meanderings</title><content type='html'>* There have been eighteen postseasons in which the Division Series has been held (I’m counting the 1981 playoffs between the half-season winners as Division Series).  2011 set the new record for the most aggregate games played in the round, with nineteen.  
The maximum is twenty, and had the Rays managed to take an additional game from the Rangers it would have been reached.  The previous high was eighteen, which occurred in 1981, 2001 and 2003. &lt;br /&gt;&lt;br /&gt;The record for most total games played in the postseason (since 1995; in this case I’m excluding 1981 because the LCS was only a five-game series at that point) is 38 in 2003--two LDS went four and the World Series went six, but all other series went the distance.  The ALCS and NLCS are both well-remembered (I can just say Grady Little or Aaron Boone and Steve Bartman and you’ll remember the circumstances).&lt;br /&gt;&lt;br /&gt;No other postseason has come particularly close; the runner-up is 2001, which saw 35 total games played despite each LCS only lasting five games.  The fewest games played in a postseason is 28 in 2007--every series was a sweep except for the two involving Cleveland, who beat New York in four in the ALDS then lost to Boston in seven in the ALCS.  To put 2007 in perspective, every series from here on out in 2011 could be a sweep, and the total games played would be 31.&lt;br /&gt;&lt;br /&gt;A natural follow-up question is “What is the expected number of postseason games?”  If you assume that each game is a 50/50 proposition (equally matched teams, no home field advantage, no variation in team quality from day to day, etc.), then it’s very straightforward to estimate series length with the geometric distribution.&lt;br /&gt;&lt;br /&gt;For a five-game series under those assumptions, there is a 25% chance for a sweep and a 37.5% chance each for a four-game or five-game series.  For a seven-game series, there is a 12.5% chance for four games, 25% for five games, and 31.25% each for six or seven games.  Thus, the expected length of a five-game series is 4.125 games, the expected length of a seven-game series is 5.8125 games, and the expected number of games in the postseason is 33.9375.  
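&lt;br /&gt;&lt;br /&gt;For anyone who wants to check that arithmetic, here is a minimal Python sketch of the series-length calculation under the same assumptions (independent games, with a fixed win probability p for one of the teams).  The function names are hypothetical, chosen only for illustration, and the count of four five-game series plus three seven-game series simply reflects the current playoff format.&lt;br /&gt;&lt;pre&gt;
```python
from math import comb


def series_length_distribution(wins_needed, p=0.5):
    """Map each possible series length g to the probability the series ends in g games.

    wins_needed is 3 for a best-of-five and 4 for a best-of-seven;
    p is the assumed per-game win probability for one of the teams.
    """
    q = 1.0 - p
    dist = {}
    for g in range(wins_needed, 2 * wins_needed):
        # The series winner takes game g plus (wins_needed - 1) of the first g - 1 games.
        ways = comb(g - 1, wins_needed - 1)
        losses = g - wins_needed
        dist[g] = ways * (p ** wins_needed * q ** losses + q ** wins_needed * p ** losses)
    return dist


def expected_length(wins_needed, p=0.5):
    """Expected number of games in the series."""
    dist = series_length_distribution(wins_needed, p)
    return sum(g * prob for g, prob in dist.items())


five = expected_length(3)            # best-of-five
seven = expected_length(4)           # best-of-seven
postseason = 4 * five + 3 * seven    # four Division Series, two LCS, one World Series
print(five, seven, postseason)       # 4.125 5.8125 33.9375
```
&lt;/pre&gt;Passing a value of p other than .5 shows how a mismatch between the teams shortens the expected series.&lt;br /&gt;&lt;br /&gt;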
1997, 2002 and 2004 all met expectations with 34 games.&lt;br /&gt;&lt;br /&gt;However, if one compares the expected series lengths to the observed series lengths in the divisional era (1969 and forward), he will find that five-game series do not conform to expectations:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-p4b00Lyl7_Q/TpJsD-qQkCI/AAAAAAAAA4Q/WgRpaG00wFY/s1600/serieslength5.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="86" width="193" src="http://4.bp.blogspot.com/-p4b00Lyl7_Q/TpJsD-qQkCI/AAAAAAAAA4Q/WgRpaG00wFY/s400/serieslength5.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Five-game series tend to be resolved in fewer games than one would expect assuming an equal probability of each outcome.  The difference is statistically significant by reasonable standards.  The average is just 3.86 games.  Assuming that one of the teams has a .716 expected winning percentage comes close to minimizing the error under the geometric distribution framework:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-WK3NnMrRSLY/TpJsDu2qyII/AAAAAAAAA4I/9DhsZNw0ewM/s1600/serieslength5theo.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="69" width="194" src="http://1.bp.blogspot.com/-WK3NnMrRSLY/TpJsDu2qyII/AAAAAAAAA4I/9DhsZNw0ewM/s400/serieslength5theo.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I’m presenting this as a curiosity, and I’m certainly not suggesting that the assumptions I described are useless when thinking about the Division Series.  
And on the other hand, seven-game series since 1969 conform almost as well as one could hope for:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-eDY0Kz0Oelk/TpJsEBqtTwI/AAAAAAAAA4Y/eBEtAoS0jvo/s1600/serieslength7.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="103" width="194" src="http://2.bp.blogspot.com/-eDY0Kz0Oelk/TpJsEBqtTwI/AAAAAAAAA4Y/eBEtAoS0jvo/s400/serieslength7.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There is a slight tendency for series to be resolved more quickly than one would expect, but it isn’t particularly significant, and the average of 5.75 is not far off the expected 5.81.&lt;br /&gt;&lt;br /&gt;* What I’m going to say here is not in any way novel; many fans, both sabermetrically-inclined and not, have expressed the same opinion over the years.  But there were two instances that I considered so egregious in Game Five of the Arizona/Milwaukee series that I can’t help but comment on them here. &lt;br /&gt;&lt;br /&gt;I have always thought that many managers are way too eager to make substitutions that sacrifice offense for baserunning or defense or the pitcher’s slot in the lineup, but I’m not sure I’ve ever seen a better display of it than in the aforementioned Game Five.  In the eighth inning, Arizona trailed 2-1 with runners at first and third and two out.  Chris Young drew a walk to load the bases and advance Miguel Montero from first to second, bringing Ryan Roberts up with the bases loaded.&lt;br /&gt;&lt;br /&gt;At this point, Kirk Gibson decided to pinch-run for Montero, sending Collin Cowgill in.  Montero occupied the #4 spot in the order, while Roberts was #7.  
Thus, it doesn’t take a rocket scientist to realize that, with an additional inning to go, there was a pretty good chance that Montero’s vacated spot would come up to bat again, and barring Arizona scoring at least two runs and holding Milwaukee in the bottom of the eighth, it would come with the Diamondbacks still needing a run (when I say needing a run, I mean it in the sense that Gibson apparently considered, since I would never say you don’t “need” more runs at any point in the game).&lt;br /&gt;&lt;br /&gt;One would have to evaluate the marginal value of Cowgill’s baserunning very highly to see that as a winning move, especially considering that Montero would be off with contact given that there were two outs.  Of course, as it played out, Roberts grounded into a fielder’s choice, and Montero’s spot did come up in the ninth, with the game now tied but runners at the corners and two outs.  Henry Blanco hit into a fielder’s choice, and Arizona did not mount a threat in the tenth before allowing the game-winning run in the bottom of the frame.&lt;br /&gt;&lt;br /&gt;The second move was not nearly as egregious, but it was still quite puzzling to me.  With a 2-1 lead in the top of the ninth, Ron Roenicke summoned his closer, John Axford.  The pitcher’s spot was due up fourth in the bottom of the ninth, so he double-switched Axford into Rickie Weeks’ #5 spot since he’d made the last out of the eighth.&lt;br /&gt;&lt;br /&gt;Given that Roenicke wanted to make a double switch, Weeks was the only obvious candidate to be replaced--removing Braun or Fielder would be worse, especially since they were closer to coming to the plate, and Nyjer Morgan’s second spot was due up sixth in the bottom of the ninth.  
(One could make a case that Morgan would be the best candidate, but given that he got the walkoff hit in the tenth it wouldn’t be an argument that would go over well with the “results not process” crowd).&lt;br /&gt;&lt;br /&gt;What I find interesting about the double-switch for the home team taking the lead into the top of the ninth is that the only way the batting order matters at all is if Axford surrenders the lead.  Thus, while you preserve Axford’s ability to pitch the tenth without sabotaging your offense in the ninth, you also know that if he does so, it will be only after he yielded a run in the ninth.  You know that you will “need” runs if the #5 spot ever comes to the plate again.&lt;br /&gt;&lt;br /&gt;Of course, this all worked out for Roenicke, since Axford pitched a 1-2-3 tenth, Morgan got the game-winning hit, and the #5 spot never batted again.  And Roenicke does apparently like to bring Counsell in as a defensive replacement for Weeks, so if Weeks is going to come out of the game anyway, the double switch is the way to do it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-7234152592618296495?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/7234152592618296495/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/10/brief-playoff-meanderings.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7234152592618296495'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7234152592618296495'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/10/brief-playoff-meanderings.html' title='Brief Playoff 
Meanderings'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-p4b00Lyl7_Q/TpJsD-qQkCI/AAAAAAAAA4Q/WgRpaG00wFY/s72-c/serieslength5.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-4731549834699660468</id><published>2011-10-02T12:30:00.003-04:00</published><updated>2011-10-08T17:27:41.045-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Park Factors'/><category scheme='http://www.blogger.com/atom/ns#' term='Offense'/><category scheme='http://www.blogger.com/atom/ns#' term='Run Estimators'/><category scheme='http://www.blogger.com/atom/ns#' term='Baselines'/><category scheme='http://www.blogger.com/atom/ns#' term='Statistical Reports'/><category scheme='http://www.blogger.com/atom/ns#' term='Positional Adjustments'/><category scheme='http://www.blogger.com/atom/ns#' term='Pitching'/><title type='text'>End of Season Statistics 2011</title><content type='html'>The spreadsheets are published as Google Spreadsheets, which you can download in Excel format by changing the extension in the address from "=html" to "=xls". That way you can download them and manipulate things however you see fit.&lt;br /&gt;&lt;br /&gt;The data comes from a number of different sources. Most of the basic data comes from Doug's Stats, which is a very handy site. KJOK's park database provided some of the data used in the park factors, but for recent seasons park data comes from anywhere that has it--Doug's Stats, or Baseball-Reference, or ESPN.com, or MLB.com. 
Data on pitchers' batted ball types allowed, doubles/triples allowed, and inherited/bequeathed runners comes from Baseball Prospectus. &lt;br /&gt;&lt;br /&gt;The basic philosophy behind these stats is to use the simplest methods that have acceptable accuracy. Of course, "acceptable" is in the eye of the beholder, namely me. I use Pythagenpat not because other run/win converters, like a constant RPW or a fixed exponent, are not accurate enough for this purpose, but because it's mine and it would be kind of odd if I didn't use it. &lt;br /&gt;&lt;br /&gt;If I seem to be a stickler for purity in my critiques of others' methods, I'd contend it is usually in a theoretical sense, not an input sense. So when I exclude hit batters, I'm not saying that hit batters are worthless or that they *should* be ignored; it's just easier not to mess with them and not that much less accurate. &lt;br /&gt;&lt;br /&gt;I also don't really have a problem with people using sub-standard methods (say, Basic RC) as long as they acknowledge that they are sub-standard. If someone pretends that Basic RC doesn't undervalue walks or cause problems when applied to extreme individuals, I'll call them on it; if they explain its shortcomings but use it regardless, I accept that. Take these last three paragraphs as my acknowledgment that some of the statistics displayed here have shortcomings as well.&lt;br /&gt;&lt;br /&gt;The League spreadsheet is pretty straightforward--it includes league totals and averages for a number of categories, most or all of which are explained at appropriate junctures throughout this piece. The advent of interleague play has created two different sets of league totals--one for the offense of league teams and one for the defense of league teams. Before interleague play, these two were identical. I do not present both sets of totals (you can figure the defensive ones yourself from the team spreadsheet, if you desire), just those for the offenses. 
The exception is for the defense-specific statistics, like innings pitched and quality starts. The figures for those categories in the league report are for the defenses of the league's teams. However, I do include each league's breakdown of basic pitching stats between starters and relievers (denoted by "s" or "r" prefixes), and so summing those will yield the totals from the pitching side. The one abbreviation you might not recognize is "N"--this is the league average of runs/game for one team, and it will pop up again.&lt;br /&gt;&lt;br /&gt;The Team spreadsheet focuses on overall team performance--wins, losses, runs scored, runs allowed. The columns included are: Park Factor (PF), Home Run Park Factor (PFhr), Winning Percentage (W%), Expected W% (EW%), Predicted W% (PW%), wins, losses, runs, runs allowed, Runs Created (RC), Runs Created Allowed (RCA), Home Winning Percentage (HW%), Road Winning Percentage (RW%) [exactly what they sound like--W% at home and on the road], Runs/Game (R/G), Runs Allowed/Game (RA/G), Runs Created/Game (RCG), Runs Created Allowed/Game (RCAG), and Runs Per Game (the average number of runs scored and allowed per game). Ideally, I would use outs as the denominator, but for teams, outs and games are so closely related that I don’t think it’s worth the extra effort.&lt;br /&gt;&lt;br /&gt;The runs and Runs Created figures are unadjusted, but the per-game averages are park-adjusted, except for RPG which is also raw. Runs Created and Runs Created Allowed are both based on a simple Base Runs formula. 
The formula is:&lt;br /&gt;&lt;br /&gt;A = H + W - HR - CS&lt;br /&gt;B = (2TB - H - 4HR + .05W + 1.5SB)*.76&lt;br /&gt;C = AB - H&lt;br /&gt;D = HR&lt;br /&gt;Naturally, the estimate is A*B/(B + C) + D.&lt;br /&gt;&lt;br /&gt;I have explained the methodology used to figure the PFs before, but the Cliff’s Notes version is that they are based on five years of data when applicable, include both runs scored and allowed, and they are regressed towards average (PF = 1), with the amount of regression varying based on the number of years of data used. There are factors for both runs and home runs. The initial PF (not shown) is:&lt;br /&gt;&lt;br /&gt;iPF = (H*T/(R*(T - 1) + H) + 1)/2&lt;br /&gt;where H = RPG in home games, R = RPG in road games, T = # teams in league (14 for AL and 16 for NL). Then the iPF is converted to the PF by taking x*iPF + (1-x), where x = .6 if one year of data is used, .7 for 2, .8 for 3, and .9 for 4+. &lt;br /&gt;&lt;br /&gt;It is important to note, since there always seems to be confusion about this, that these park factors already incorporate the fact that the average player plays 50% on the road and 50% at home. That is what the adding one and dividing by 2 in the iPF is all about. So if I list Fenway Park with a 1.02 PF, that means that it actually increases RPG by 4%. &lt;br /&gt;&lt;br /&gt;In the calculation of the PFs, I did not get picky and take out “home” games that were actually at neutral sites, like the Astros/Cubs series that was moved to Milwaukee in 2008. &lt;br /&gt;&lt;br /&gt;There are also Team Offense and Defense spreadsheets. 
These include the following categories:&lt;br /&gt;&lt;br /&gt;Team offense: Plate Appearances, Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Walks per At Bat (WAB), Isolated Power (ISO), R/G at home (hR/G), and R/G on the road (rR/G). BA, OBA, SLG, WAB, and ISO are park-adjusted by dividing by the square root of park factor (or the equivalent; WAB = (OBA - BA)/(1 - OBA) and ISO = SLG - BA).&lt;br /&gt;&lt;br /&gt;Team defense: Innings Pitched, BA, OBA, SLG, Innings per Start (IP/S), Starter's eRA (seRA), Reliever's eRA (reRA), RA/G at home (hRA/G), RA/G on the road (rRA/G), Battery Mishap Rate (BMR), Modified Fielding Average (mFA), and Defensive Efficiency Record (DER). BA, OBA, and SLG are park-adjusted by dividing by the square root of PF; seRA and reRA are divided by PF.&lt;br /&gt;&lt;br /&gt;The three fielding metrics I've included are limited to metrics that a) I can calculate myself and b) are based on the basic available data, not specialized PBP data. The three metrics are explained in &lt;a href="http://walksaber.blogspot.com/2010/08/rudimentary-team-fielding-metrics.html"&gt;this post&lt;/a&gt;, but here are quick descriptions of each:&lt;br /&gt;&lt;br /&gt;1) BMR--wild pitches and passed balls per 100 baserunners = (WP + PB)/(H + W - HR)*100&lt;br /&gt;&lt;br /&gt;2) mFA--fielding average removing strikeouts and assists = (PO - K)/(PO - K + E)&lt;br /&gt;&lt;br /&gt;3) DER--the Bill James classic, using only the PA-based estimate of plays made. Based on a suggestion by Terpsfan101, I've tweaked the error coefficient. Plays Made = PA - K - H - W - HR - HB - .64E and DER = PM/(PM + H - HR + .64E)&lt;br /&gt;&lt;br /&gt;Next are the individual player reports. I defined a starting pitcher as one with 15 or more starts. All other pitchers are eligible to be included as a reliever. If a pitcher has 40 appearances, then they are included. 
Additionally, if a pitcher has 50 innings and less than 50% of his appearances are starts, he is also included as a reliever (this allows some swingmen type pitchers who wouldn’t meet either the minimum start or appearance standards to get in).&lt;br /&gt;&lt;br /&gt;For all of the player reports, ages are based on simply subtracting their year of birth from 2011. I realize that this is not compatible with how ages are usually listed and so “Age 27” doesn’t necessarily correspond to age 27 as I list it, but it makes everything a heckuva lot easier, and I am more interested in comparing the ages of the players to their contemporaries, for which case it makes very little difference. The "R" category records rookie status with a "R" for rookies and a blank for everyone else; I've trusted Baseball Prospectus on this. Also, all players are counted as being on the team with whom they played/pitched (IP or PA as appropriate) the most. &lt;br /&gt;&lt;br /&gt;For relievers, the categories listed are: Games, Innings Pitched, Run Average (RA), Relief Run Average (RRA), Earned Run Average (ERA), Estimated Run Average (eRA), DIPS Run Average (dRA), Batted Ball Run Average (cRA), SIERA-style Run Average (sRA), Guess-Future (G-F), Inherited Runners per Game (IR/G), Batting Average on Balls in Play (%H), Runs Above Average (RAA), and Runs Above Replacement (RAR).&lt;br /&gt;&lt;br /&gt;IR/G is per relief appearance (G - GS); it is an interesting thing to look at, I think, in lieu of actual leverage data. You can see which closers come in with runners on base, and which are used nearly exclusively to start innings. 
Of course, you can’t infer too much; there are bad relievers who come in with a lot of people on base, not because they are being used in high leverage situations, but because they are long men being used in low-leverage situations already out of hand.&lt;br /&gt;&lt;br /&gt;For starting pitchers, the columns are: Wins, Losses, Innings Pitched, RA, RRA, ERA, eRA, dRA, cRA, sRA, G-F, %H, Pitches/Start (P/S), Quality Start Percentage (QS%), RAA, and RAR. RA and ERA you know--R*9/IP or ER*9/IP, park-adjusted by dividing by PF. The formulas for eRA, dRA, cRA, and sRA are in &lt;a href="http://walksaber.blogspot.com/2010/09/simple-bsr-component-ras.html"&gt;this article&lt;/a&gt;; I'm not going to copy them here, but all of them are based on the same Base Runs equation and they all estimate RA, not ERA:&lt;br /&gt;&lt;br /&gt;* eRA is based on the actual results allowed by the pitcher (hits, doubles, home runs, walks, strikeouts, etc.). It is park-adjusted by dividing by PF.&lt;br /&gt;&lt;br /&gt;* dRA is the classic DIPS-style RA, assuming that the pitcher allows a league average %H, and that his hits in play have a league-average S/D/T split. It is park-adjusted by dividing by PF.&lt;br /&gt;&lt;br /&gt;* cRA is based on batted ball type (FB, GB, POP, LD) allowed, using the actual estimated linear weight value for each batted ball type. It is not park-adjusted.&lt;br /&gt;&lt;br /&gt;* sRA is a SIERA-style RA, based on batted balls but broken down into just groundballs and non-groundballs. It is not park-adjusted either.&lt;br /&gt;&lt;br /&gt;Both cRA and sRA are running a little high when compared to actual RA for 2010. Both measures are very sensitive and need to be recalibrated in order to overcome batted ball-type definition differences, frequencies of hit types on each kind of batted ball, and other factors, so keep in mind that they may not perfectly track RA without those adjustments (which I have not made in this case).  
I’ll let you make your own determination as to whether you find this data useful at all.  Personally, I prefer to look at RRA, eRA, and dRA.&lt;br /&gt;&lt;br /&gt;G-F is a junk stat, included here out of habit because I've been including it for years. It was intended to give a quick read of a pitcher's expected performance in the next season, based on eRA and strikeout rate. Although the numbers vaguely resemble RAs, it's actually unitless. As a rule of thumb, anything under four is pretty good for a starter. G-F = 4.46 + .095(eRA) - .113(K*9/IP). It is a junk stat. JUNK STAT JUNK STAT JUNK STAT. Got it?&lt;br /&gt;&lt;br /&gt;%H is BABIP, more or less; I use an estimate of PA (IP*x + H + W, where x is the league average of (AB - H)/IP). %H = (H - HR)/(IP*x + H - HR - K). Pitches/Start includes all appearances, so I've counted relief appearances as one-half of a start (P/S = Pitches/(.5*G + .5*GS)). QS% is just QS/GS; I don't think it's particularly useful, but Doug's Stats include QS so I include it.&lt;br /&gt;&lt;br /&gt;I've used a stat called Relief Run Average (RRA) in the past, based on Sky Andrecheck's article in the August 1999 By the Numbers; that one only used inherited runners, but I've revised it to include bequeathed runners as well, making it equally applicable to starters and relievers. I am using RRA as the building block for baselined value estimates for all pitchers this year. I explained RRA in &lt;a href="http://walksaber.blogspot.com/2010/09/relief-run-average.html"&gt;this article&lt;/a&gt;, but the bottom line formulas are:&lt;br /&gt;&lt;br /&gt;BRSV = BRS - BR*i*sqrt(PF)&lt;br /&gt;IRSV = IR*i*sqrt(PF) - IRS&lt;br /&gt;RRA = ((R - (BRSV + IRSV))*9/IP)/PF&lt;br /&gt;&lt;br /&gt;The two baselined stats are Runs Above Average (RAA) and Runs Above Replacement (RAR). RAA uses the league average runs/game (N) for both starters and relievers, while RAR uses separate replacement levels for starters and relievers. 
Thus, RAA and RAR will be pretty close for relievers:&lt;br /&gt;&lt;br /&gt;RAA = (N - RRA)*IP/9&lt;br /&gt;RAR (relievers) = (1.11*N - RRA)*IP/9&lt;br /&gt;RAR (starters) = (1.28*N - RRA)*IP/9&lt;br /&gt;&lt;br /&gt;All players with 285 or more plate appearances are included in the Hitters spreadsheets. (I usually use 300 as a cutoff, but this year when I had the list sorted there were a number of players just below 300 that I was interested in, so I chose an arbitrarily lower threshold). Each is assigned one position, the one at which they appeared in the most games. The statistics presented are: Games played (G), Plate Appearances (PA), Outs (O), Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Runs Created (RC), Runs Created per Game (RG), Speed Score (SS), Hitting Runs Above Average (HRAA), Runs Above Average (RAA), Hitting Runs Above Replacement (HRAR), and Runs Above Replacement (RAR).&lt;br /&gt;&lt;br /&gt;I do not bother to include hit batters, so take note of that for players who do get plunked a lot. Therefore, PA are simply AB + W. Outs are AB - H + CS. BA and SLG you know, but remember that without HB and SF, OBA is just (H + W)/(AB + W). Secondary Average = (TB - H + W)/AB = SLG - BA + (OBA - BA)/(1 - OBA). I have not included net steals as many people (and Bill James himself) do--it is solely hitting events.&lt;br /&gt;&lt;br /&gt;BA, OBA, and SLG are park-adjusted by dividing by the square root of PF. This is an approximation, of course, but I'm satisfied that it works well. The goal here is to adjust for the win value of offensive events, not to quantify the exact park effect on the given rate. I use the BA/OBA/SLG-based formula to figure SEC, so it is park-adjusted as well.&lt;br /&gt;&lt;br /&gt;Runs Created is actually Paul Johnson's ERP, more or less. 
Ideally, I would use a custom linear weights formula for the given league, but ERP is just so darn simple and close to the mark that it’s hard to pass up. I still use the term “RC” partially as a homage to Bill James (seriously, I really like and respect him even if I’ve said negative things about RC and Win Shares), and also because it is just a good term. I like the thought put in your head when you hear “creating” a run better than “producing”, “manufacturing”, “generating”, etc. to say nothing of names like “equivalent” or “extrapolated” runs. None of that is said to put down the creators of those methods--there just aren’t a lot of good, unique names available. Anyway, RC = (TB + .8H + W + .7SB - CS - .3AB)*.322.&lt;br /&gt;&lt;br /&gt;RC is park adjusted by dividing by PF, making all of the value stats that follow park adjusted as well. RG, the rate, is RC/O*25.5. I do not believe that outs are the proper denominator for an individual rate stat, but I also do not believe that the distortions caused are that bad. (I still intend to finish my rate stat series and discuss all of the options in excruciating detail, but alas you’ll have to take my word for it now).&lt;br /&gt;&lt;br /&gt;I have decided to switch to a watered-down version of Bill James' Speed Score this year; I only use four of his categories. Previously I used my own knockoff version called Speed Unit, but trying to keep it from breaking down every few years was a wasted effort. &lt;br /&gt;&lt;br /&gt;Speed Score is the average of four components, which I'll call a, b, c, and d:&lt;br /&gt;&lt;br /&gt;a = ((SB + 3)/(SB + CS + 7) - .4)*20&lt;br /&gt;b = sqrt((SB + CS)/(S + W))*14.3&lt;br /&gt;c = ((R - HR)/(H + W - HR) - .1)*25&lt;br /&gt;d = T/(AB - HR - K)*450&lt;br /&gt;&lt;br /&gt;James actually uses a sliding scale for the triples component, but it strikes me as needlessly complex and so I've streamlined it. 
I also changed some of his division to mathematically equivalent multiplications.&lt;br /&gt;&lt;br /&gt;There are a whopping four categories that compare to a baseline; two for average, two for replacement. Hitting RAA compares to a league average hitter; it is in the vein of Pete Palmer’s Batting Runs. RAA compares to an average hitter at the player’s primary position. Hitting RAR compares to a “replacement level” hitter; RAR compares to a replacement level hitter at the player’s primary position. The formulas are:&lt;br /&gt;&lt;br /&gt;HRAA = (RG - N)*O/25.5&lt;br /&gt;RAA = (RG - N*PADJ)*O/25.5&lt;br /&gt;HRAR = (RG - .73*N)*O/25.5&lt;br /&gt;RAR = (RG - .73*N*PADJ)*O/25.5&lt;br /&gt;&lt;br /&gt;PADJ is the position adjustment, and it is based on 1992-2001 offensive data. For catchers it is .89; for 1B/DH, 1.19; for 2B, .93; for 3B, 1.01; for SS, .86; for LF/RF, 1.12; and for CF, 1.02.  It dawned on me when re-reading this before posting that the timeframe means that I’ve been using the same PADJ for ten years--which means two things:&lt;br /&gt;&lt;br /&gt;1) I’m getting old&lt;br /&gt;2) It’s probably time for an update.  I’ll look at 2002-2011 in my forthcoming annual “Offense by Position” post&lt;br /&gt;&lt;br /&gt;That was the mechanics of the calculations; now I'll twist myself into knots trying to justify them. If you only care about the how and not the why, stop reading now. &lt;br /&gt;&lt;br /&gt;The first thing that should be covered is the philosophical position behind the statistics posted here. They fall on the continuum of ability and value in what I have called "performance". Performance is a technical-sounding way of saying "Whatever arbitrary combination of ability and value I prefer".&lt;br /&gt;&lt;br /&gt;With respect to park adjustments, I am not interested in how any particular player is affected, so there is no separate adjustment for lefties and righties for instance. 
The park factor is an attempt to determine how the park affects run scoring rates, and thus the win value of runs.&lt;br /&gt;&lt;br /&gt;I apply the park factor directly to the player's statistics, but it could also be applied to the league context. The advantage to doing it my way is that it allows you to compare the component statistics (like Runs Created or OBA) on a park-adjusted basis. The drawback is that it creates a new theoretical universe, one in which all parks are equal, rather than leaving the player grounded in the actual context in which he played and evaluating how that context (and not the player's statistics) was altered by the park.&lt;br /&gt;&lt;br /&gt;The good news is that the two approaches are essentially equivalent; in fact, they are equivalent if you assume that the Runs Per Win factor is equal to the RPG. Suppose that we have a player in an extreme park (PF = 1.15, approximately like Coors Field pre-humidor) who has an 8 RG before adjusting for park, while making 350 outs in a 4.5 N league. The first method of park adjustment, the one I use, converts his value into a neutral park, so his RG is now 8/1.15 = 6.957. We can now compare him directly to the league average:&lt;br /&gt;&lt;br /&gt;RAA = (6.957 - 4.5)*350/25.5 = +33.72&lt;br /&gt;&lt;br /&gt;The second method would be to adjust the league context. If N = 4.5, then the average player in this park will create 4.5*1.15 = 5.175 runs. Now, to figure RAA, we can use the unadjusted RG of 8:&lt;br /&gt;&lt;br /&gt;RAA = (8 - 5.175)*350/25.5 = +38.77&lt;br /&gt;&lt;br /&gt;These are not the same, as you can obviously see. The reason for this is that they take place in two different contexts. The first figure is in a 9 RPG (2*4.5) context; the second figure is in a 10.35 RPG (2*4.5*1.15) context. Runs have different values in different contexts; that is why we have RPW converters in the first place. 
If we convert to WAA (using RPW = RPG), then we have:&lt;br /&gt;&lt;br /&gt;WAA = 33.72/9 = +3.75&lt;br /&gt;WAA = 38.77/10.35 = +3.75&lt;br /&gt;&lt;br /&gt;Once you convert to wins, the two approaches are equivalent. The other nice thing about the first approach is that once you park-adjust, everyone in the league is in the same context, and you can dispense with the need for converting to wins at all. You still might want to convert to wins, and you'll need to do so if you are comparing the 2011 players to players from other league-seasons (including between the AL and NL in the same year), but if you are only looking to compare Jose Bautista to Miguel Cabrera, it's not necessary. WAR is somewhat ubiquitous now, but personally I prefer runs when possible--why mess with decimal points if you don't have to? &lt;br /&gt;&lt;br /&gt;The park factors used to adjust player stats here are run-based. Thus, they make no effort to project what a player "would have done" in a neutral park, or account for the different effects parks have on specific events (walks, home runs, BA) or types of players. They simply account for the difference in run environment that is caused by the park (as best I can measure it). As such, they don't evaluate a player within the actual run context of his team's games; they attempt to restate the player's performance as an equivalent performance in a neutral park.&lt;br /&gt;&lt;br /&gt;I suppose I should also justify the use of sqrt(PF) for adjusting component statistics. The classic defense given for this approach relies on basic Runs Created--runs are proportional to OBA*SLG, and OBA*SLG/PF = OBA/sqrt(PF)*SLG/sqrt(PF). While RC may be an antiquated tool, you will find that the square root adjustment is fairly compatible with linear weights or Base Runs as well. I am not going to take the space to demonstrate this claim here, but I will some time in the future. 
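&lt;br /&gt;&lt;br /&gt;For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two park-adjustment routes from the worked example above (PF = 1.15, 8 RG, 350 outs, N = 4.5), followed by the square-root identity for OBA and SLG. The function name raa and the sample OBA and SLG values are made up for illustration, and RPW is assumed equal to RPG as in the text:

```python
import math

def raa(rg, baseline, outs, outs_per_game=25.5):
    """Runs above a baseline RG over the given number of outs."""
    return (rg - baseline) * outs / outs_per_game

# Figures from the worked example in the text.
pf, rg, outs, n = 1.15, 8.0, 350, 4.5

# Route 1: park-adjust the player, compare to the neutral league (9 RPG).
raa_player_adj = raa(rg / pf, n, outs)
waa_player_adj = raa_player_adj / (2 * n)

# Route 2: park-adjust the league, leave the player alone (10.35 RPG).
raa_league_adj = raa(rg, n * pf, outs)
waa_league_adj = raa_league_adj / (2 * n * pf)

# Square-root adjustment: dividing OBA and SLG each by sqrt(PF)
# divides their product (the basic RC core) by PF.
oba, slg = 0.350, 0.450  # hypothetical unadjusted rates
product_adj = (oba / math.sqrt(pf)) * (slg / math.sqrt(pf))

print(round(raa_player_adj, 2), round(raa_league_adj, 2))
print(round(waa_player_adj, 2), round(waa_league_adj, 2))
print(math.isclose(product_adj, oba * slg / pf))
```

The two RAA figures differ, the two WAA figures agree, and the last line confirms that the component adjustment reproduces a straight division of OBA*SLG by PF.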
&lt;br /&gt;&lt;br /&gt;Many value figures published around the sabersphere adjust for the difference in quality level between the AL and NL. I don't, but this is a thorny area where there is no right or wrong answer as far as I'm concerned. I also do not make an adjustment in the league averages for the fact that the overall NL averages include pitcher batting and the AL does not (not quite true in the era of interleague play, but you get my drift). &lt;br /&gt;&lt;br /&gt;The difference between the leagues may not be precisely calculable, and it certainly is not constant, but it is real. If the average player in the AL is better than the average player in the NL, it is perfectly reasonable to expect the average AL player to have more RAR than the average NL player, and that will not happen without some type of adjustment. On the other hand, if you are only interested in evaluating a player relative to his own league, such an adjustment is not necessarily welcome.&lt;br /&gt;&lt;br /&gt;The league argument only applies cleanly to metrics baselined to average. Since replacement level compares the given player to a theoretical player that can be acquired on the cheap, the same pool of potential replacement players should by definition be available to the teams of each league. One could argue that if the two leagues don't have equal talent at the major league level, they might not have equal access to replacement level talent--except such an argument is at odds with the notion that replacement level represents talent that is truly "freely available".&lt;br /&gt;&lt;br /&gt;So it's hard to justify the approach I take, which is to set replacement level relative to the average runs scored in each league, with no adjustment for the difference in the leagues. 
The best justification is that it's simple and it treats each league as its own universe, even if in reality they are connected.&lt;br /&gt;&lt;br /&gt;The replacement levels I have used here are very much in line with the values used by other sabermetricians. This is based on my own "research", my interpretation of other people's research, and a desire to not stray from consensus and make the values unhelpful to the majority of people who may encounter them.&lt;br /&gt;&lt;br /&gt;Replacement level is certainly not settled science. There is always going to be room to disagree on what the baseline should be. Even if you agree it should be "replacement level", any estimate of where it should be set is just that--an estimate. Average is clean and fairly straightforward, even if its utility is questionable; replacement level is inherently messy. So I offer the average baseline as well.&lt;br /&gt;&lt;br /&gt;For position players, replacement level is set at 73% of the positional average RG (since there's a history of discussing replacement level in terms of winning percentages, this is roughly equivalent to .350). For starting pitchers, it is set at 128% of the league average RA (.380), and for relievers it is set at 111% (.450). &lt;br /&gt;&lt;br /&gt;I am still using an analytical structure that makes the comparison to replacement level for a position player by applying it to his hitting statistics. 
This is the approach taken by Keith Woolner in VORP (and some other earlier replacement level implementations), but the newer metrics (among them Rally and Fangraphs' WAR) handle replacement level by subtracting a set number of runs from the player's total runs above average in a number of different areas (batting, fielding, baserunning, positional value, etc.), which for lack of a better term I will call the subtraction approach.&lt;br /&gt;&lt;br /&gt;The offensive positional adjustment makes the inherent assumption that the average player at each position is equally valuable. I think that this is close to being true, but it is not quite true. The ideal approach would be to use a defensive positional adjustment, since the real difference between a first baseman and a shortstop is their defensive value. When you bat, all runs count the same, whether you create them as a first baseman or as a shortstop. &lt;br /&gt;&lt;br /&gt;That being said, using “replacement hitter at position” does not cause too many distortions. It is not theoretically correct, but it is practically powerful. For one thing, most players, even those at key defensive positions, are chosen first and foremost for their offense. Empirical work by Keith Woolner has shown that the replacement level hitting performance is about the same for every position, relative to the positional average.&lt;br /&gt;&lt;br /&gt;Figuring what the defensive positional adjustment should be, though, is easier said than done. Therefore, I use the offensive positional adjustment. So if you want to criticize that choice, or criticize the numbers that result, be my guest. But do not claim that I am holding this up as the correct analytical structure. I am holding it up as the most simple and straightforward structure that conforms to reality reasonably well, and because while the numbers may be flawed, they are at least based on an objective formula that I can figure myself. 
If you feel comfortable with some other assumptions, please feel free to ignore mine.&lt;br /&gt;&lt;br /&gt;That still does not justify the use of HRAR--hitting runs above replacement--which compares each hitter, regardless of position, to 73% of the league average. Basically, this is just a way to give an overall measure of offensive production without regard for position with a low baseline. It doesn't have any real baseball meaning. &lt;br /&gt;&lt;br /&gt;A player who creates runs at 90% of the league average could be above-average (if he's a shortstop or catcher, or a great fielder at a less important fielding position), or sub-replacement level (DHs that create 4 runs a game are not valuable properties). Every player is chosen because his total value, both hitting and fielding, is sufficient to justify his inclusion on the team. HRAR fails even if you try to justify it with a thought experiment about a world in which defense doesn't matter, because in that case the absolute replacement level (in terms of RG, without accounting for the league average) would be much higher than it is currently. &lt;br /&gt;&lt;br /&gt;The specific positional adjustments I use are based on 1992-2001 data. There's no particular reason for not updating them; at the time I started using them, they represented the ten most recent years. I have stuck with them because I have not seen compelling evidence of a change in the degree of difficulty or scarcity between the positions between now and then, and because I think they are fairly reasonable. The positions for which they diverge the most from the defensive position adjustments in common use are 2B, 3B, and CF. Second base is considered a premium position by the offensive PADJ (.94), while third base and center field are both neutral (1.01 and 1.02).&lt;br /&gt;&lt;br /&gt;Another flaw is that the PADJ is applied to the overall league average RG, which is artificially low for the NL because of pitchers' batting. 
When using the actual league average runs/game, it's tough to just remove pitchers--any adjustment would be an estimate. If you use the league total of runs created instead, it is a much easier fix.&lt;br /&gt;&lt;br /&gt;One other note on this topic is that since the offensive PADJ is a proxy for average defensive value by position, ideally it would be applied by tying it to defensive playing time. I have done it by outs, though.&lt;br /&gt;&lt;br /&gt;The reason I have taken this flawed path is because 1) it ties the position adjustment directly into the RAR formula rather than leaving it as something to subtract on the outside and more importantly 2) there’s no straightforward way to do it. The best would be to use defensive innings--set the full-time player to X defensive innings, figure how Derek Jeter’s innings compared to X, and adjust his PADJ accordingly. Games in the field or games played are dicey because they can cause distortion for defensive replacements. Plate Appearances avoid the problem that outs have of being highly related to player quality, but they still carry the illogic of basing it on offensive playing time. And of course the differences here are going to be fairly small (a few runs). That is not to say that this way is preferable, but it’s not horrible either, at least as far as I can tell.&lt;br /&gt;&lt;br /&gt;To compare this approach to the subtraction approach, start by assuming that a replacement level shortstop would create .86*.73*4.5 = 2.825 RG (or would perform at an overall level of equivalent value to being an average fielder at shortstop while creating 2.825 runs per game). Suppose that we are comparing two shortstops, each of whom compiled 600 PA and played an equal number of defensive games and innings (and thus would have the same positional adjustment using the subtraction approach). 
Alpha made 380 outs and Bravo made 410 outs, and each ranked as dead-on average in the field.&lt;br /&gt;&lt;br /&gt;The difference in overall RAR between the two using the subtraction approach would be equal to the difference between their offensive RAA compared to the league average. Assuming the league average is 4.5 runs, and that both Alpha and Bravo created 75 runs, their offensive RAAs are:&lt;br /&gt;&lt;br /&gt;Alpha = (75*25.5/380 - 4.5)*380/25.5 = +7.94&lt;br /&gt;&lt;br /&gt;Similarly, Bravo is at +2.65, and so the difference between them will be 5.29 RAR.&lt;br /&gt;&lt;br /&gt;Using the flawed approach, Alpha's RAR will be:&lt;br /&gt;&lt;br /&gt;(75*25.5/380 - 4.5*.73*.86)*380/25.5 = +32.90&lt;br /&gt;&lt;br /&gt;Bravo's RAR will be +29.58, a difference of 3.32 RAR, which is two runs off of the difference using the subtraction approach.&lt;br /&gt;&lt;br /&gt;The downside to using PA is that you really need to consider park effects if you do, whereas outs allow you to sidestep park effects. Outs are constant; plate appearances are linked to OBA. Thus, they not only depend on the offensive context (including park factor), but also on the quality of one's team. Of course, attempting to adjust for team PA differences opens a huge can of worms which is not really relevant; for now, the point is that using outs for individual players causes distortions, sometimes trivial and sometimes bothersome, but almost always makes one's life easier.&lt;br /&gt;&lt;br /&gt;I do not include fielding (or baserunning outside of steals, although that is a trivial consideration in comparison) in the RAR figures--they cover offense and positional value only. This in no way means that I do not believe that fielding is an important consideration in player valuation. 
However, two of the key principles of these stat reports are 1) not incorporating any data that is not readily available and 2) not simply including other people's results (of course I borrow heavily from other people's methods, but only adapting methodology that I can apply myself).&lt;br /&gt;&lt;br /&gt;Any fielding metric worth its salt will fail to meet either criterion--they use zone data or play-by-play data which I do not have easy access to. I do not have a fielding metric that I have stapled together myself, and so I would have to simply lift other analysts' figures. &lt;br /&gt;&lt;br /&gt;Setting the practical reason for not including fielding aside, I do have some reservations about lumping fielding and hitting value together in one number because of the obvious differences in reliability between offensive and fielding metrics. In theory, they absolutely should be put together. But in practice, I believe it would be better to regress the fielding metric to a point at which it would be roughly equivalent in reliability to the offensive metric.&lt;br /&gt;&lt;br /&gt;Offensive metrics have error bars associated with them, too, of course, and in evaluating a single season's value, I don't care about the vagaries that we often lump together as "luck". Still, there are errors in our assessment of linear weight values and players that collect an unusual proportion of infield hits or hits to the left side, errors in estimation of park factor, and any number of other factors that make their events more or less valuable than an average event of that type. &lt;br /&gt;&lt;br /&gt;Fielding metrics offer up all of that and more, as we cannot be nearly as certain of true successes and failures as we are when analyzing offense. Recent investigations, particularly by Colin Wyers, have raised even more questions about the level of uncertainty. 
So, even if I was including a fielding value, my approach would be to assume that the offensive value was 100% reliable (which it isn't), and regress the fielding metric relative to that (so if the offensive metric was actually 70% reliable, and the fielding metric 40% reliable, I'd treat the fielding metric as .4/.7 = 57% reliable when tacking it on, to illustrate with a simplified and completely made up example presuming that one could have a precise estimate of nebulous "reliability").&lt;br /&gt;&lt;br /&gt;Given the inherent assumption of the offensive PADJ that all positions are equally valuable, once RAR has been figured for a player, fielding value can be accounted for by adding on his runs above average relative to a player at his own position. If there is a shortstop that is -2 runs defensively versus an average shortstop, he is without a doubt a plus defensive player, and a more valuable defensive player than a first baseman who was +1 run better than an average first baseman. Regardless, since it was implicitly assumed that they are both average defensively for their position when RAR was calculated, the shortstop will see his value docked two runs. This DOES NOT MEAN that the shortstop has been penalized for his defense. The whole process of accounting for positional differences, going from hitting RAR to positional RAR, has benefited him.&lt;br /&gt;&lt;br /&gt;I've found that there is often confusion about the treatment of first basemen and designated hitters in my PADJ methodology, since I consider DHs as in the same pool as first basemen. The fact of the matter is that first basemen outhit DHs. There is any number of potential explanations for this; DHs are often old or injured, players hit worse when DHing than they do when playing the field, etc. 
This actually helps first basemen, since the DHs drag the average production of the pool down, thus resulting in a lower replacement level than I would get if I considered first basemen alone.&lt;br /&gt;&lt;br /&gt;However, this method does assume that a 1B and a DH have equal defensive value. Obviously, a DH has no defensive value. What I advocate to correct this is to treat a DH as a bad defensive first baseman, and thus knock another five or ten runs off of his RAR for a full-time player. I do not incorporate this into the published numbers, but you should keep it in mind. However, there is no need to adjust the figures for first basemen upwards--the only necessary adjustment is to take the DHs down a notch. &lt;br /&gt;&lt;br /&gt;Finally, I consider each player at his primary defensive position (defined as where he appears in the most games), and do not weight the PADJ by playing time. This does shortchange a player like Ben Zobrist (who saw significant time at a tougher position than his primary position), and unduly boost a player like Buster Posey (who logged a lot of games at a much easier position than his primary position). For most players, though, it doesn't matter much. 
I find it preferable to make manual adjustments for the unusual cases rather than add another layer of complexity to the whole endeavor.&lt;br /&gt;&lt;br /&gt;Player spreadsheets should be coming by the middle of the week.&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0AnPJbQnlHhRHdFBqVWpmWm1TaWlyRGlydlVKZlYxZUE&amp;amp;output=html"&gt;2011 Park Factors&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0AnPJbQnlHhRHdGFWTjZBQ2JFNFlHYUVOWkRWQ3h0RFE&amp;amp;output=html"&gt;2011 Leagues&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0AnPJbQnlHhRHdHhZSEhGVjM5NlJKeE85YVVtR0ZDaHc&amp;amp;output=html"&gt;2011 Teams&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0AnPJbQnlHhRHdHVJUHhUVnUxM3BQNkJlSVoxTVFvNHc&amp;amp;output=html"&gt;2011 Team Offense&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0AnPJbQnlHhRHdEo5X3RJM0FwcWhQa0RLZkVac292ZUE&amp;amp;output=html"&gt;2011 Team Defense&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0AnPJbQnlHhRHdE9DUDFHUkZPS1VFMXpMNER1NzZhT0E&amp;amp;output=html"&gt;2011 AL Relievers&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0AnPJbQnlHhRHdGVqUl80VnBLOVZMYVgyUXNyQUVqYkE&amp;amp;output=html"&gt;2011 NL Relievers&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?key=0AnPJbQnlHhRHdGl4Q29mZVk5V3RBbTM5YkhoWFFUcnc&amp;amp;output=html"&gt;2011 AL Starters&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a 
href="https://docs.google.com/spreadsheet/pub?key=0AnPJbQnlHhRHdC0zZXFQaXJwY1ZXMWtMRS1SODY1Y0E&amp;amp;output=html"&gt;2011 NL Starters&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?key=0AnPJbQnlHhRHdDZDLU9HTkYxeTlwQllOVUV1aXdKRnc&amp;amp;output=html"&gt;2011 AL Hitters&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="https://docs.google.com/spreadsheet/pub?key=0AnPJbQnlHhRHdGRZZ242UXk2bGlWX1J6VERidmUwWEE&amp;amp;output=html"&gt;2011 NL Hitters&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-4731549834699660468?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/4731549834699660468/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/10/end-of-season-statistics-2011.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/4731549834699660468'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/4731549834699660468'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/10/end-of-season-statistics-2011.html' title='End of Season Statistics 2011'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-6631122983223371621</id><published>2011-09-29T22:53:00.000-04:00</published><updated>2011-09-29T22:53:20.152-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Current Events'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Playoffs'/><category scheme='http://www.blogger.com/atom/ns#' term='Predictions'/><title type='text'>Playoff Meanderings</title><content type='html'>I always like to put down some of my thoughts about the playoffs each year, but it’s a challenge to say anything even remotely close to being meaningful.  Predicting the outcome of short series is folly (although I’ll engage in a little of this folly later), and you can read that anywhere.  So I always try to come up with a different angle to illustrate why the playoffs are subject to such uncertainty.&lt;br /&gt;&lt;br /&gt;I’ve certainly had some more interesting illustrations in the past; this one is pretty lame, but for some reason when it crossed my mind in August I thought it was a lot more interesting than I do now.  What is the value of each playoff game or series in terms of a regular season game?  In asking this, I’m not talking about the weight that should be applied to playoff performance for evaluating individual value, or any such thing...I’m just asking what the implied value is, given the assumption that the regular season standings carry over to the playoffs.&lt;br /&gt;&lt;br /&gt;Of course, that’s not how it works--it's a cliché, but every team starts every series out at 0-0.  However, I’ll assume that regular season standings carry over (resetting with the start of each additional round to keep things manageable) and the playoff games are weighted in a manner such that at the end of the series, the team that wins the series has a better overall record than its opponent.&lt;br /&gt;&lt;br /&gt;There are at least two different ways to approach this increasingly silly scenario, which will be best illustrated by example--treating the series outcome as a binary, or considering the games individually. Suppose that the Alphas enter a five-game division series with a record of 92-70 while their opponents the Betas are 90-72.  
&lt;br /&gt;&lt;br /&gt;First, from the series outcome perspective, if the Alphas win, the series was unnecessary since the Alphas already led in the standings.  If the Betas win, however, the series must be given a weight of a number of games such that adding that many wins to the Betas and losses to the Alphas gives the Betas a better record.  Leaving things in terms of whole games, the answer in this case is three.  Giving the Betas three additional wins leaves them at 93-72; three additional losses for the Alphas would make them 92-73.  The series could have gone three, four, or five games, making the effective value of those games equal to either 1, .75, or .6 regular season games.&lt;br /&gt;&lt;br /&gt;You can also consider this from the game perspective, that is, actually looking at the outcome of each game in the series rather than treating the series as a binary win or loss.  If the Betas win the above series 3-0, this is pretty straightforward given the two-game margin--treating playoff games as equivalent to regular season games leaves the Alphas 92-73 and the Betas 93-72.  Suppose the Betas had been 88-74 instead of 90-72, though.  In order to bring the Betas ahead of the Alphas (on a whole wins basis), they need five, so each win (and thus each game) has to be worth 5/3 = 1.67 times a regular season game.  Now the Alphas have 3*1.67 + 70 = 75 losses and the Betas have 3*1.67 + 88 = 93 wins, so that the Alphas' record is 92-75 and the Betas' is 93-74.&lt;br /&gt;&lt;br /&gt;You can see that if the final margin of the series is 3-2 in favor of the Betas, the weight on each playoff game would have to be roughly four times that of a regular season game since the Betas only pick up a net of one win when the playoff series is considered.  
A 4x weight brings the Alphas and Betas together at 100-82, so anything above 4x puts the Betas ahead.&lt;br /&gt;&lt;br /&gt;This is all just a silly digression, but given the assumptions it is a simple way to think about how the implied value of a playoff game compares to that of a regular season game.&lt;br /&gt;&lt;br /&gt;Getting to the 2011 playoffs, let me offer some quick thoughts.  I’ll leave the detailed handicapping to those who are better suited for it and also like quixotic quests.  The marginal value of more in-depth analysis is limited, but if that’s what you seek, you won’t find it here.&lt;br /&gt;&lt;br /&gt;The probabilities that follow assume nothing about home field advantage or pitching matchups, or even true talent for that matter.  They are simply based on my crude team rankings, fueled by 25% actual W%, 25% expected W% (from R/RA), 25% predicted W% (from RC/RCA), and 25% from .500.  &lt;br /&gt;&lt;br /&gt;That formula is also arbitrary.  The results should be fairly reasonable, but I’m also eager to disown them at the same time, as something of a commentary on the futility of the exercise...and most especially the bloviating that is done without any logic at all.  I’m sure that there are many scribes across the country furiously writing about how certain teams have no chance, never learning the lesson that the differences between major league teams simply aren’t that great, especially after eight of the best have been selected from a 162-game sample.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-Bjg9PTAIqTA/ToUu0N0SJsI/AAAAAAAAA4A/VKsa2jPjNw0/s1600/11playoffs.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="154" width="307" src="http://2.bp.blogspot.com/-Bjg9PTAIqTA/ToUu0N0SJsI/AAAAAAAAA4A/VKsa2jPjNw0/s400/11playoffs.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This method considers all of the playoff teams to be in the top ten in MLB; only Boston (#3) and the Angels (#8) are on the outside looking in.  
The Yankees, Phillies, and Rangers are near co-favorites to win it all; NYA and TEX are ranked about evenly, while PHI benefits from the weaker NL field and has the highest odds of winning a first-round series and the pennant.  Overall, the AL has an estimated 57% chance of winning the World Series.  The most likely matchup in the Series is NYA/PHI (11%); the least likely is DET/STL (4%).  The rankings imply that the worst playoff team (ARI) would beat the best playoff team (NYA) 43% of the time, which over 162 games is about seventy wins.  Strictly equating true probability to actual 2011 record, the odds that the Padres could win a seven-game series against the Indians are roughly the same as the likelihood of the Diamondbacks winning a seven-game series against the Yankees.&lt;br /&gt;&lt;br /&gt;As far as my personal rooting interests go, New York and Tampa Bay are my top two choices, followed by Milwaukee and St. Louis.  I would be happy to see any of those teams win, have no particularly strong feelings about Arizona or Detroit, and would be mildly disappointed if it’s Philadelphia or Texas.  
But there are no White Sox in this group.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-6631122983223371621?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/6631122983223371621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/09/playoff-meanderings.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6631122983223371621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6631122983223371621'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/09/playoff-meanderings.html' title='Playoff Meanderings'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-Bjg9PTAIqTA/ToUu0N0SJsI/AAAAAAAAA4A/VKsa2jPjNw0/s72-c/11playoffs.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-3004567196388684666</id><published>2011-09-20T01:36:00.003-04:00</published><updated>2011-09-20T01:36:00.116-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Pitching'/><category scheme='http://www.blogger.com/atom/ns#' term='History--Other'/><title type='text'>A Quick Look at Negro League W-L Records</title><content type='html'>&lt;i&gt;I wrote this about a year ago and wasn’t sure if I’d ever post it.  
With the recent publication of some Negro League data at &lt;a href="http://www.seamheads.com/NegroLgs/index.php"&gt;Seamheads&lt;/a&gt;, I figured I’d better post it now before it became completely dated.  The data I used was compiled by Chris Cobb and posted on the &lt;a href="http://www.baseballthinkfactory.org/files/hall_of_merit/"&gt;Hall of Merit site&lt;/a&gt;, with John Holway's research as his source data.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I need to admit up front that I know very little about the Negro Leagues.  My knowledge level of the Negro Leagues peaked at about age eleven when I read &lt;u&gt;Only the Ball Was White&lt;/u&gt;, and has only gone downhill since then.  That is one of the reasons for this post--as a (very limited) education for me on the great pitchers of the Negro Leagues.&lt;br /&gt;&lt;br /&gt;I am going to be applying the Neutral Win-Loss record approach introduced by Rob Wood, which I have written about several times.  It is a way to contextualize a pitcher's W-L record using only the win-loss record of the pitcher's team.  This post applies it to several Negro League pitchers.&lt;br /&gt;&lt;br /&gt;The basic idea behind Wood's approach is that an average team's deviation from .500 is due in equal parts to its offense and defense.  The portion of a team's deviation from .500 that arises from the defense (with the exception of the fielders and the relievers in the pitcher's own games) doesn't do anything to increase a pitcher's expected W% in reality, but if you compare his W% directly to that of his teammates, he will suffer for it.&lt;br /&gt;&lt;br /&gt;The formula is simple and linear; instead of comparing a pitcher to his team's W% when he does not get a decision (Mate), the comparison is to the average of Mate and .500.  
The neutral W% is easy to figure:&lt;br /&gt;&lt;br /&gt;NW% = W% - (Mate + .5)/2 + .5 = W% - Mate/2 + .25&lt;br /&gt;&lt;br /&gt;From NW%, one can figure Neutral Wins and Losses:&lt;br /&gt;&lt;br /&gt;NW = NW%*(W + L), NL = W + L - NW&lt;br /&gt;&lt;br /&gt;It is also very easy to combine NW% and the number of decisions into wins above some baseline.  Wins Above Team is traditionally defined as wins above .500:&lt;br /&gt;&lt;br /&gt;WAT = (NW% - .5)*(W + L)&lt;br /&gt;&lt;br /&gt;I also use Wins Compared to Replacement, with the assumption that a replacement-level starter will have a .380 W%:&lt;br /&gt;&lt;br /&gt;WCR = (NW% - .38)*(W + L)&lt;br /&gt;&lt;br /&gt;There are a number of weaknesses to the Neutral W-L approach, and there are a number of additional complications that arise when applying it to the Negro League data.  This is an incomplete list of the methodological issues that are present even when looking at major league data:&lt;br /&gt;&lt;br /&gt;* It does not isolate performance when the pitcher actually pitches; some will receive lousy run support despite pitching for good offensive teams.&lt;br /&gt;&lt;br /&gt;* While the approach assumes that the team is balanced between offense and defense, this is not always the case.  It is a decent assumption for a pitcher's entire career, but there are still going to be cases in which a pitcher is predominantly on teams skewed one way or the other.  Those on offensive teams will benefit unfairly in the metric, while those who are on teams with otherwise strong starting pitching staffs will be hurt.&lt;br /&gt;&lt;br /&gt;* All of the problems with the definition and concept behind pitcher wins and losses themselves are still present.&lt;br /&gt;&lt;br /&gt;With respect to the Negro League results included in this post, the data I have used was compiled by Chris Cobb and posted on the Hall of Merit site, with John Holway's research as his source data.  
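&lt;br /&gt;&lt;br /&gt;For anyone who wants to check the arithmetic, here is a minimal Python sketch of the NW%, NW, NL, WAT, and WCR formulas given earlier.  The function name, the dictionary keys, and the 60-40 pitcher with a .600 Mate are made up purely for illustration; none of them come from the Cobb/Holway data, and the sketch assumes the pitcher has at least one decision.&lt;br /&gt;&lt;br /&gt;

```python
def neutral_record(wins, losses, mate, replacement=0.38):
    """Neutralize a pitcher's W-L record using the formulas in this post.

    mate is the team's W% in games where the pitcher got no decision.
    replacement is the assumed replacement-level W% (.380 in this post).
    """
    decisions = wins + losses                 # assumes decisions is not zero
    w_pct = wins / decisions
    nw_pct = w_pct - mate / 2 + 0.25          # NW% = W% - Mate/2 + .25
    nw = nw_pct * decisions                   # NW = NW%*(W + L)
    nl = decisions - nw                       # NL = W + L - NW
    wat = (nw_pct - 0.5) * decisions          # WAT = (NW% - .5)*(W + L)
    wcr = (nw_pct - replacement) * decisions  # WCR = (NW% - .38)*(W + L)
    return {"NW%": nw_pct, "NW": nw, "NL": nl, "WAT": wat, "WCR": wcr}


# Hypothetical pitcher: 60-40 for teams that played .600 ball without him.
example = neutral_record(60, 40, 0.600)
print({key: round(value, 3) for key, value in example.items()})
```

&lt;br /&gt;&lt;br /&gt;With those made-up inputs the neutral W% works out to .550, which is a 55-45 neutral record, 5 WAT, and 17 WCR.  Passing a different value for replacement is all it takes to try another baseline.&lt;br /&gt;&lt;br /&gt;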
Among the problems that arise from the data:&lt;br /&gt;&lt;br /&gt;* The records themselves are incomplete (missing seasons, team records only published for half seasons, etc.) and sometimes contradictory (individual totals that don't add up to the team total, etc.).  These kinds of errors exist even in major league data from the period, so it's no surprise that they are present in the more chaotic, less organized Negro League data.  &lt;br /&gt;&lt;br /&gt;To deal with the gaps in the specific data I used, if I couldn't find the team's record, I assumed that the team was .500 when the pitcher in question's decisions were removed.  If a pitcher split time between teams and there was no breakdown of his W-L record with the two teams provided, I used the average of the two teams' records.  For seasons in which Cobb did not include the team's record and I had to look it up from another source, I used the &lt;u&gt;ESPN Baseball Encyclopedia&lt;/u&gt;.  In that case, if the team's record was only available for a half-season, I assumed that the full season record was double the half-season record.&lt;br /&gt;&lt;br /&gt;* I only used the results from domestic Negro League games.  The world of the Negro Leagues encompassed a lot more than that; players went to the Caribbean to play, teams barnstormed extensively, played games against major league opponents, etc.  Limiting the analysis to league games makes it workable, but it does omit a lot of relevant performances.&lt;br /&gt;&lt;br /&gt;In this regard the Negro Leagues were similar to the early NA/NL days, in which the league schedule constituted only a small fraction of total games played, and independent teams often compared favorably to league opponents.&lt;br /&gt;&lt;br /&gt;* I am way out of my area of knowledge, but even I feel comfortable asserting that the NeL pitching rotations looked more like those of the early majors than those of the contemporary majors.  
Pitchers got a higher percentage of their team's decisions, reducing the sample size from which Mate is drawn and weakening the assumption that the other pitchers are average.  I have also read that teams would purposefully match their aces against one another to create gate attractions, whereas our normal assumption is that teams will try to match their pitchers up in whatever manner creates the highest number of expected wins.&lt;br /&gt;&lt;br /&gt;* The league structure was less stable from year to year, which makes it harder to compare NeL pitchers from one time period to another.  For twentieth-century major league pitchers, we can be confident that, regardless of when they pitched, they were facing the highest level of competition available (with the obvious exception of the players locked out of the majors due to their skin color).  We also know that they pitched in seasons of roughly equal length, and so their career records represent a fair sample of their performance at different ages.&lt;br /&gt;&lt;br /&gt;We don't have that confidence when dealing with the NeL data.  For example, Satchel Paige gets no credit for 1935 here, but the adjacent seasons of 1934 and 1936 appear to be among his best.  Then he gets no credit for 1937-39, as he was not pitching in official league games.  You will see that Paige doesn't come out as impressively as might be expected in the career totals, but the gaps in league play might well be the major cause.&lt;br /&gt;&lt;br /&gt;* I have listed WCR figures using a .380 replacement level, but in actuality I have no idea where the NeL replacement level should be set.  &lt;br /&gt;&lt;br /&gt;From all of the caveats, it may seem as if I am declaring the NW-L statistics to be useless.  That is not my intention; I simply don't want to oversell them or fail to acknowledge their biases.  Many of the issues with the NW-L records are issues that would arise with any statistical analysis of Negro League pitchers.  
Consider what a logistical nightmare it would be to try to look at runs allowed, which would require innings, league averages, and park factors.&lt;br /&gt;&lt;br /&gt;As sabermetricians we all know the flaws of pitcher W-L records, but there are a few benefits.  Among them is how easy they are to determine, at least if complete games are the norm.  All you need to know is who the starting pitcher was and which team won the game, and you've got it.  No need for box scores or play-by-play.  No need for park factors or league averages--the average in every league and every park for all of time is .500.&lt;br /&gt;&lt;br /&gt;These properties are most useful when dealing with incomplete data, and we can refine the records further by incorporating team record and producing NW-L.  Are the results perfect?  Absolutely not.  Are they likely to give us a better indication of the quality of these pitchers than raw W-L record or uncontextualized ERAs?  I say yes.&lt;br /&gt;&lt;br /&gt;The pitchers for whom data was available were: Chet Brewer, Dave Brown, Ray Brown, Bill Byrd, Andy Cooper, Leon Day, Willie Foster, Leroy Matlock, Satchel Paige, Dick Redding, Bullet Joe Rogan, Hilton Smith, Smokey Joe Williams, and Nip Winters.&lt;br /&gt;&lt;br /&gt;Since I am out of my area of knowledge when discussing the Negro League stars, I'm not going to make a lot of comments--I'll leave interpretation up to the reader.  Here are the actual career W-L records for the pitchers, along with Mate.  The list is sorted by career wins above .380:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-nFIjTgfIZBQ/TnYg7EEEp9I/AAAAAAAAA30/APLNaF_NTvQ/s1600/newl1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-nFIjTgfIZBQ/TnYg7EEEp9I/AAAAAAAAA30/APLNaF_NTvQ/s1600/newl1.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Only one of the pitchers (Chet Brewer) had a worse record than that of his teams.  
If one figures Wins Above Team by the traditional method, Brewer rates as a below-average pitcher.  It's far more likely, though, that a pitcher with a .591 W% who was regarded as an excellent pitcher was in fact an excellent pitcher.  The fact that his teams played .624 baseball without him indicates that they probably had above-average pitching, which while good for the team did absolutely nothing to increase Brewer's expected W%.  Brewer still takes a hit, of course, when his record is neutralized by the Wood approach, but he still rates as an above-average performer.&lt;br /&gt;&lt;br /&gt;Here are the career neutral W-L records for the pitchers, sorted by WCR:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-WkOduaiYC9A/TnYg85Iu4cI/AAAAAAAAA34/Kh7E7ZgZPUE/s1600/newl2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-WkOduaiYC9A/TnYg85Iu4cI/AAAAAAAAA34/Kh7E7ZgZPUE/s1600/newl2.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here is a link to the spreadsheet containing the complete yearly breakdowns for each pitcher.  
You can see exactly what I inputted and which seasons I didn't have team records for (you'll see blanks in the TW and TL columns):&lt;br /&gt;&lt;br /&gt;https://docs.google.com/spreadsheet/pub?hl=en_US&amp;hl=en_US&amp;key=0AnPJbQnlHhRHdEl5TjRzUEVscjNQVy1naDY0ODVtZlE&amp;output=html&lt;br /&gt;&lt;br /&gt;Again, this is obviously a very incomplete examination of the careers of a limited number of Negro League stars, and I certainly would not advocate placing too much stock in the results.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-3004567196388684666?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/3004567196388684666/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/09/quick-look-at-negro-league-w-l-records.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3004567196388684666'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3004567196388684666'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/09/quick-look-at-negro-league-w-l-records.html' title='A Quick Look at Negro League W-L Records'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-nFIjTgfIZBQ/TnYg7EEEp9I/AAAAAAAAA30/APLNaF_NTvQ/s72-c/newl1.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-2330552485899059511</id><published>2011-09-12T21:21:00.000-04:00</published><updated>2011-09-12T21:21:36.574-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scorekeeping'/><title type='text'>Scoring Self-Indulgence, pt. 5: Baserunner Advances</title><content type='html'>Last time I covered my scoring codes to recognize a batter reaching base; this time I’ll discuss what I record once he gets there.  I’ll start with advances made by the runner independent of the actions of a subsequent batter on his team--things like stolen bases and advancing on wild pitches.  Most of the codes that follow are pretty straightforward.  In each case, I’ll show the advance as a runner going from first to second, but the same concepts apply to advancing to third and scoring.  In each case, I’ll also assume that the batter reached first by being hit by a pitch.&lt;br /&gt;&lt;br /&gt;For every advancement that occurs during the course of a plate appearance, I record both the lineup slot of the batter at the plate and the pitch on which the event occurs (or which pitches it is between if applicable).  The pitch is indicated by the same letter used in the batter’s scorebox, except in lower case--the first pitch of a PA is “a”, the second pitch is “b”, etc.  &lt;br /&gt;&lt;br /&gt;There are several exceptions.  The last pitch (which is never given a letter in the batter’s scorebox) is labeled “lp”.  If an event happens before the first pitch of a plate appearance, I use “bfp”.  Finally, if an event occurs between pitches, it is labeled “a!”, where ! is replaced by the pitch letter for the last pitch before the event.  Suppose the event occurs between pitches two and three of the plate appearance; in this case, the pitch code for the event is “ab”, because the second pitch (b) was the last one thrown.  
“ab” can be read as “after b”.&lt;br /&gt;&lt;br /&gt;The code for a stolen base is the obvious “SB”.  If it occurred on the third pitch of a plate appearance taken by the #6 batter in the lineup, the scoring would be:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-TMGuloTzGPg/Tm6iU590VCI/AAAAAAAAA2Q/S0a181T5a54/s1600/scoreFifty.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="217" src="http://4.bp.blogspot.com/-TMGuloTzGPg/Tm6iU590VCI/AAAAAAAAA2Q/S0a181T5a54/s400/scoreFifty.jpg" width="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As you can see from the example, the pitch information is written above the advancement symbol, in smaller type.&lt;br /&gt;&lt;br /&gt;Wild pitches and passed balls are separated by a distinction I’d wipe out of the rule book if given the chance, but I do record them differently: “WP” and “PB” are the obvious codes.  In the examples, the wild pitch occurs on the last pitch to the #6 batter, and the passed ball occurs on the first pitch to the #2 hitter:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-F5Ya_A1C00g/Tm6iVKP7dvI/AAAAAAAAA2Y/nrbzgxOC1hc/s1600/scoreFiftyOne.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="217" src="http://4.bp.blogspot.com/-F5Ya_A1C00g/Tm6iVKP7dvI/AAAAAAAAA2Y/nrbzgxOC1hc/s400/scoreFiftyOne.jpg" width="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-Mjd4fKmYczA/Tm6iVefEp8I/AAAAAAAAA2g/umQ6ew_SO9g/s1600/scoreFiftyTwo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="217" src="http://2.bp.blogspot.com/-Mjd4fKmYczA/Tm6iVefEp8I/AAAAAAAAA2g/umQ6ew_SO9g/s400/scoreFiftyTwo.jpg" width="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The code for a balk is “BK”; this one comes between the third and fourth pitches to the cleanup hitter:&lt;br /&gt;&lt;br /&gt;&lt;a 
href="http://3.bp.blogspot.com/-fdmWNxyqLTw/Tm6iVu4sXKI/AAAAAAAAA2o/AZh6fgzHN_k/s1600/scoreFiftyThree.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="217" src="http://3.bp.blogspot.com/-fdmWNxyqLTw/Tm6iVu4sXKI/AAAAAAAAA2o/AZh6fgzHN_k/s400/scoreFiftyThree.jpg" width="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I don’t like the scoring distinction between a stolen base and defensive indifference, but I do make note of it on my scoresheets because SB is such a common category, and it’s easier to keep track of the ones that are really scored as steals and add in fielder’s indifference if one chooses than it is to try to divine after the fact what the scoring was.  I refer to it as Fielder’s Indifference (FI), because it is a subset of fielder’s choice by definition, and the symbol seems more consistent.  This one occurs on the sixth pitch to the #5 hitter:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-n_pexLDWKvg/Tm6iV6Vg0yI/AAAAAAAAA2w/H4x66lAuCEo/s1600/scoreFiftyThreeX.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-n_pexLDWKvg/Tm6iV6Vg0yI/AAAAAAAAA2w/H4x66lAuCEo/s320/scoreFiftyThreeX.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A runner could advance on an error between pitches, which almost always would be a throwing error.  
If the pitcher throws the ball away on a pickoff attempt before the first pitch to the #2 hitter, the scoring looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-elc1XWYnVRQ/Tm6p1eC6eTI/AAAAAAAAA24/KydJYymT9M8/s1600/scoreFiftyFour.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-elc1XWYnVRQ/Tm6p1eC6eTI/AAAAAAAAA24/KydJYymT9M8/s320/scoreFiftyFour.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Sometimes, the extra bases are gained before the batter-runner becomes a runner; that is, on the same play on which he reaches base.  Suppose a batter dribbles a hit to the pitcher, but in his haste to make the play, the pitcher hurls the ball down the right field line, allowing the batter to move up to second.  The scoring looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-DdzwPSAOGag/Tm6p1lCwMUI/AAAAAAAAA3A/TBOy1Sv5YvE/s1600/scoreFiftyFive.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-DdzwPSAOGag/Tm6p1lCwMUI/AAAAAAAAA3A/TBOy1Sv5YvE/s320/scoreFiftyFive.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If there is no additional information included with the notation, it is assumed that the advance occurred on the same play as the on-base event.  The other common way a batter-runner moves up is when he is able to advance on a throw to another base made in an attempt to retire another runner.  In this example, the batter singles to right, then advances to second on a throw home.  
The code “ATx” means advanced on throw, with x standing in for the base to which the throw is made (2 for second, 3 for third, and H for home):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-pulA-888twU/Tm6tZ_ZVFMI/AAAAAAAAA3g/eo68oIAJwxw/s1600/scoreFiftySix.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="217" width="400" src="http://1.bp.blogspot.com/-pulA-888twU/Tm6tZ_ZVFMI/AAAAAAAAA3g/eo68oIAJwxw/s400/scoreFiftySix.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I have yet to touch on the means by which most bases are gained: advances on plays initiated by subsequent batters.  I mark these by writing and circling the batting order position of the batter responsible in the quadrant of the runner’s scorebox corresponding to the base he wound up at.  Suppose the runner from first advances to third on a play initiated by the #7 hitter.  I would score it:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-3IltEMj72vQ/Tm6p2eoaa5I/AAAAAAAAA3Q/jT0qhCidkoQ/s1600/scoreFiftySeven.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://3.bp.blogspot.com/-3IltEMj72vQ/Tm6p2eoaa5I/AAAAAAAAA3Q/jT0qhCidkoQ/s320/scoreFiftySeven.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If the runner scores, then I use a box instead of a circle, so that it’s easy to distinguish how many runs a team has scored.  
In this case, the runner who moved to third on a play initiated by the #7 hitter ends up scoring on a play initiated by the #9 hitter:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-okdD_vucEVQ/Tm6p2uZ2y1I/AAAAAAAAA3Y/CEG4QhdlGzw/s1600/scoreFiftyEight.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-okdD_vucEVQ/Tm6p2uZ2y1I/AAAAAAAAA3Y/CEG4QhdlGzw/s320/scoreFiftyEight.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A runner can also score due to an event not initiated by another batter.  The most common is scoring on a wild pitch.  In this example, the runner from third scores on a wild pitch, with the wild pitch coming on the second pitch to the #9 hitter:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-GQDyPERH6YM/Tm6taDSiQ0I/AAAAAAAAA3o/sSPu6HelRbw/s1600/scoreFiftyNine.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="217" width="400" src="http://3.bp.blogspot.com/-GQDyPERH6YM/Tm6taDSiQ0I/AAAAAAAAA3o/sSPu6HelRbw/s400/scoreFiftyNine.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As you can see, I allow the box that indicates a run scored to vary in shape and size as appropriate to allow the necessary space for recording the event.&lt;br /&gt;&lt;br /&gt;Sometimes, the event that advances the baserunner occurs while the ball is in play, but referring to the relevant batter’s scorebox will not reveal how the advancement occurred.  Suppose that there is a runner on first, and the batter singles to right, advancing the runner to second.  Then the right fielder boots the ball, allowing both to move up one base (the batter-runner to second and the runner to third).  In this case, I would simply record the appropriate batter number circled in the runner’s third base quadrant.  
The batter’s scorebox will include the error by the right fielder, which implies that it occurred during his plate appearance and thus that the single plus the error enabled the runner to advance from first to third.&lt;br /&gt;&lt;br /&gt;However, there are also cases in which the runner advances but the batter stays put.  Suppose that the same play occurs as described above, except the batter-runner (who happens to be the #4 hitter) stops at first base.  Now I would score the runner’s advancement as:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-YMA_G5zuziY/Tm6talxLbbI/AAAAAAAAA3w/2sRD0aHLgR8/s1600/scoreSixty.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="217" width="400" src="http://2.bp.blogspot.com/-YMA_G5zuziY/Tm6talxLbbI/AAAAAAAAA3w/2sRD0aHLgR8/s400/scoreSixty.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In this case, the use of a small circled 4 above the error indicates that the error occurred during the PA of the cleanup hitter.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-2330552485899059511?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/2330552485899059511/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/09/scoring-self-indulgence-pt-5-baserunner.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2330552485899059511'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2330552485899059511'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/09/scoring-self-indulgence-pt-5-baserunner.html' title='Scoring Self-Indulgence, pt. 
5: Baserunner Advances'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-TMGuloTzGPg/Tm6iU590VCI/AAAAAAAAA2Q/S0a181T5a54/s72-c/scoreFifty.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-5328581172995774572</id><published>2011-08-30T00:04:00.000-04:00</published><updated>2011-08-30T00:04:00.163-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Pitching'/><title type='text'>A Completely Unnecessary Pitching Metric</title><content type='html'>There are a number of methods available to evaluate pitcher’s starts on a game-by-game basis rather than the more traditional full season method.  There are Game Scores, Support Neutral records, Win Values, and a number of other approaches.  There really is no need to add another approach to the mix, and I’m not really going to here--I'm simply going to take a conventional approach for evaluating a full season pitching line and apply it to individual starts.&lt;br /&gt;&lt;br /&gt;Of course, I’m not going to claim that this approach is better than the others, because it’s not.  It is relatively easy for me to implement, though, and I thought it would be nice to be able to offer a category on my year end stat report for starting pitchers that would consider distribution of performance rather than just aggregate performance as is the case for the rest of the metrics.&lt;br /&gt;&lt;br /&gt;The idea is basically to estimate the winning percentage that a team should have over the long haul given the runs allowed and innings pitched of the starting pitcher.  
I am not using any sort of component RA estimate, and will not bother to explain the implications of this, which I’ll assume you’re well aware of (and thus can also decide for yourself whether you still have any interest in the results).  To do this, I use Pythagenpat and assume that the performance of everyone other than the starting pitcher is league average.  That is, the offense scores an average number of runs, and the bullpen allows an average number of runs.  For the latter, I’m not going to account for the difference between the RA allowed by relievers and the overall league average.&lt;br /&gt;&lt;br /&gt;That is not at all an inevitable choice, and if I was trying to construct a perfect metric I wouldn’t do it.  But this is so obviously not a perfect metric that the extra effort would be of questionable value.  Making that adjustment would also highlight the fact that this approach does not attempt to account for the effect of the number of innings the starter is able to log on the subsequent performance of the bullpen, despite &lt;a href="http://www.baseballprospectus.com/article.php?articleid=11839"&gt;research&lt;/a&gt; that suggests there is such an effect.  Of course, using the league average doesn’t do anything to address that issue, but it also keeps things simple.&lt;br /&gt;&lt;br /&gt;An additional benefit of not making any adjustment for the lower RA of the bullpen is that it allows this metric to be more easily comparable to other metrics that compare the performance of starting pitchers directly to the overall league average--which is a sizeable number.  The overall expected winning percentage for the team of a league average starting pitcher at the end of this road will be sub-.500, which while obviously false does in fact match the results of many full season-type metrics.&lt;br /&gt;&lt;br /&gt;One thing that cannot be ignored is park effects; the question is how to apply them.  
One option is to only apply them to the elements of the team other than the pitcher--the bullpen and the offense.  I’ll call that option A; option B is to apply the park adjustment only to the pitcher himself.&lt;br /&gt;&lt;br /&gt;Option A is a little harder to implement, since there are two adjustments that need to be made.  On the other hand, it has some appeal because it allows us to keep the actual run environment of the game rather than recasting it in an imaginary neutral park.  I’ve decided to go with Option B because simplicity is a guiding principle here, and because it is more consistent with the way I apply park adjustments to full season metrics.  Again, it’s far from an inevitable choice.  I’m also assuming that all games are nine innings.&lt;br /&gt;&lt;br /&gt;With the thought process out of the way, this isn’t a particularly hard metric to demonstrate.  I’ll start simple, with a pitcher throwing a complete game in a neutral park in which he allows zero runs.  His expected winning percentage for the game is 1.000.&lt;br /&gt;&lt;br /&gt;Seriously, let’s consider a pitcher in a neutral park working seven innings and allowing two runs.  I’ll assume it’s an AL pitcher, so we need to know that the 2010 AL average R/G was 4.45 (this is the constant N later).  The pitcher’s team can thus be expected to allow 2 + (9 - 7)*4.45/9 = 2.99 runs and score 4.45 runs.  
This is a 2.99 + 4.45 = 7.44 RPG environment, which has a Pythagenpat exponent of 7.44^.29 = 1.79, and thus the pitcher’s team has an expected W% of 4.45^1.79/(4.45^1.79 + 2.99^1.79) = .671.&lt;br /&gt;&lt;br /&gt;We could go through and count up the wins (.671) and the losses (1 - .671 = .329), but I’d rather keep it in rate terms, so the final result will just be the average expected winning percentage across a pitcher’s starts.&lt;br /&gt;&lt;br /&gt;To generalize the formula, let N be the league average R/G with R and IP as the runs allowed and innings pitched for the starting pitcher in a particular game.  Let dPF be the Park Factor without any adjustment so that it can be applied to full season statistics combined for home and road games.  For example, the park factors I publish (which are the ones I’ll use here naturally) adjust for this.  A 1.03 PF does not mean that the park inflates scoring by 3%--it means that the park inflates scoring by 6%, and is diluted by averaging with 1.00 (neutral park) so that it can be applied to full season statistics which are, at least in theory, comprised of one-half home games and one-half road games.&lt;br /&gt;&lt;br /&gt;Then:&lt;br /&gt;A (team RA for game) = R/dPF + (9 - IP)*N/9&lt;br /&gt;X (Pythagenpat exponent) = (A + N)^.29&lt;br /&gt;gW% = N^X/(N^X + A^X)&lt;br /&gt;&lt;br /&gt;There’s really not much to it when you write it in math rather than English.&lt;br /&gt;&lt;br /&gt;I’m very tempted to cap it off by unscrambling it from a W% back into an estimated run average, but I’d rather not deal with the implications of aggregating multiple Pythagorean exponents.  
One of the advantages of a game-by-game approach is that you’re able to better match performance with the run environment in which it actually occurred, and thus avoid some of the distortions that are inevitable when performance is &lt;a href="http://www.3-dbaseball.net/2010/04/more-esoteric-ramblings-about-era.html"&gt;aggregated&lt;/a&gt; across different run environments.&lt;br /&gt;&lt;br /&gt;I intend to implement this fully for 2011 starting pitchers (although I might change my mind on that depending on how I feel about the effort/usefulness tradeoff in October), but for now I ran the top five AL starting pitchers from 2010 (IMHO) through the process.  For comparison, I’ve included a column called sW% which is based on a traditional use of a pitcher’s full season line to estimate the theoretical W% of his team (albeit without making any adjustment for innings/start):&lt;br /&gt;&lt;br /&gt;X = (RA/PF + N)^.29&lt;br /&gt;sW% = N^X/(N^X  + (RA/PF)^X)&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-oGsJ8Nkmvdo/TlwtxBiDtpI/AAAAAAAAA2I/E25opAGjZz0/s1600/gwpitchart.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="103" width="198" src="http://2.bp.blogspot.com/-oGsJ8Nkmvdo/TlwtxBiDtpI/AAAAAAAAA2I/E25opAGjZz0/s400/gwpitchart.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If you use R and IP as the criteria, Felix Hernandez turned in the best performance of any AL starter, whether you aggregate or consider each game separately.  Sabathia and Weaver come out about the same either way, but Lee and Price move in opposite directions when you look at the game level.   This implies that Lee’s distribution of runs allowed and innings was such that it would figure to produce more wins than the averages would suggest, with Price the opposite.  
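&lt;br /&gt;&lt;br /&gt;The per-game formulas above can be sketched in Python along the following lines.  This is only an illustration under stated assumptions: the function and parameter names are hypothetical, innings must be passed as true fractions (6 1/3 innings as 6.333 rather than the box score 6.1), and the park factor argument is the dPF described earlier.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
def game_win_pct(runs, innings, league_rpg, park_factor=1.0):
    """Expected team W% for one start, following the A / X / gW% formulas.

    All names here are illustrative assumptions, not part of the original post.
    """
    # A: the starter's park-adjusted runs, plus league-average runs allowed
    # over the innings he did not pitch (every game is assumed to go nine).
    team_ra = runs / park_factor + (9 - innings) * league_rpg / 9
    # X: Pythagenpat exponent for the run environment of this game.
    exponent = (team_ra + league_rpg) ** 0.29
    # gW%: the offense is assumed to score the league average N.
    offense = league_rpg ** exponent
    return offense / (offense + team_ra ** exponent)


def average_game_win_pct(starts, league_rpg, park_factor=1.0):
    """Average gW% over a list of (runs, innings) pairs, one pair per start."""
    values = [game_win_pct(r, ip, league_rpg, park_factor) for r, ip in starts]
    return sum(values) / len(values)


# Worked example from the text: 7 IP, 2 R, neutral park, N = 4.45.
print(round(game_win_pct(2, 7, 4.45), 3))  # prints 0.671
```
&lt;/pre&gt;Averaging the per-game values, rather than summing wins and losses, keeps the result in rate terms as described above.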
&lt;br /&gt;&lt;br /&gt;This is more of a freak show stat than anything else, but it does provide a relatively simple way to compare starters at the game level on their bottom line results, and if a Cy Young race is particularly close, you may want to consider it.  Or you may not; I don’t have a lot of conviction about this, and there are more rigorous approaches available, but there you go.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-5328581172995774572?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/5328581172995774572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/08/completely-unnecessary-pitching-metric.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5328581172995774572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5328581172995774572'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/08/completely-unnecessary-pitching-metric.html' title='A Completely Unnecessary Pitching Metric'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-oGsJ8Nkmvdo/TlwtxBiDtpI/AAAAAAAAA2I/E25opAGjZz0/s72-c/gwpitchart.jpg' height='72' 
width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-93053972688664209</id><published>2011-08-16T23:49:00.001-04:00</published><updated>2011-08-17T00:10:43.258-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Offense'/><category scheme='http://www.blogger.com/atom/ns#' term='Run Estimators'/><title type='text'>Ramblings on the Percentage of Runs Scored via Home Run</title><content type='html'>During the early part of the season, the perceived high dependence of the Yankees’ offense on home runs was often fodder for discussion in the mainstream baseball media.  The implication was that while the Yankees were scoring a lot of runs, the fact that many of the runs were being scored on home runs was a sign that the performance was either unsustainable or would not be duplicable against quality pitching.  The statistic most commonly cited in these discussions was the percentage of runs that scored on home runs.  This post is not intended to comment on the sustainability or quality of pitching issues, but rather to offer a quick critique of the “percentage of runs scored on home runs” figure.  &lt;br /&gt;&lt;br /&gt;If one broadly divides approaches to credit runs to various individuals or events into two categories, there are those that only consider the final outcome (whether the run scored or not, and who did the scoring or the driving in) and those that attempt to assign credit at all steps in the process, even incremental steps that don’t directly push a run across the plate.  The former class includes runs scored and RBI, of course, while the second class includes methods like linear weights and Base Runs.&lt;br /&gt;&lt;br /&gt;The percentage of runs a team scores on homers obviously falls into the first class, and in fact when you think about it you will realize that the statistic is founded on a RBI perspective.  
The way a run gets tossed into the “resulting from a home run” bucket is to score on a home run--that is, to be driven in by a home run.  Of course, one could also look at the question from the runs scored perspective--what percentage of the runners that score reached base on a home run?  This is also very easy to compute, as it is simply the ratio between home runs and runs scored, data that is readily available for any team (whereas the number of runs actually scored on homers is harder to come by). &lt;br /&gt;&lt;br /&gt;The RBI-based approach is subject to possible distortions in a manner somewhat similar to the issues with earned runs.   If a home run is involved at all, the entire run is chalked up to the homer.  Often, the home run is the key event enabling a run, but outside of the batter that actually hits the home run, it is never the only contributing event.  In some cases, it is even relatively insignificant.  If a home run scores a runner from third with no one out, it really didn’t have a large marginal value with respect to scoring the runner from third--the probability of that runner scoring was already very high, and any number of other events would have allowed him to score.  On the flip side, when a runner on first base with two outs is driven home by a home run, the home run is much more vital to scoring the runner.&lt;br /&gt;&lt;br /&gt;The discussion in the last paragraph is making the case for transitioning from the outcome perspective to the run expectancy perspective of linear weights.  Using play-by-play data, one can calculate the actual linear weight value of the home runs hit by a team.  Such an approach will still be subject to sequencing fluctuations and arguably may not be as predictive as a more context-neutral approach.&lt;br /&gt;&lt;br /&gt;One obvious context-neutral approach is to use standard linear weight values applied uniformly to all events to estimate the number of runs contributed by home runs.  
This figure can then be compared to the total number of runs scored.  Using fixed linear weight values, though, this approach ends up boiling down to the ratio of home runs to runs scored, times a constant.  For example, if the linear weight value of a home run is 1.4 runs, the result of that calculation will just be 1.4 times the simple home run to run ratio.  &lt;br /&gt;&lt;br /&gt;The next refinement is to not use actual runs scored at all; this post is going to be way too dry as is, so I won’t even bother trying to explain why mixing actual runs scored with estimated run contributions is a bad idea--it should be relatively obvious.  Instead, you can compare the estimated run value of the team’s home runs to the run value of all of its offensive events.&lt;br /&gt;&lt;br /&gt;There is a complicating factor in using linear weights (or intrinsic weights derived from a dynamic run estimator as I will in a moment) in this manner--the negative run value of the out.  Simply taking Number of Event * Coefficient of Event for every event and dividing by the estimate of runs scored will result in percentages that sum to more than 100%, until outs are subtracted (and outs will have a negative percentage).  This means that you can’t use the value literally--if the ratio of 1.4*HR/estimated runs scored is 25%, it doesn’t mean that 25% of the runs were scored because of home runs.   Alternatively, one could look only at positive events, but then the denominator is no longer runs at all.  As long as the number is viewed as a ratio and not a true percentage contribution, the result can still be useful in measuring the contribution of the home run to the offense.&lt;br /&gt;&lt;br /&gt;Using a dynamic run estimator like Base Runs has the advantage of attempting to take into account the interaction between the offensive events rather than just assuming a fixed value.  
However, in the case of the home run, the additional value of considering dynamism is less than it might be for some other events because the value of a home run stays relatively fixed.  The intrinsic value of a home run in BsR is: &lt;br /&gt;&lt;br /&gt;((B + C)*A*b - A*B*b)/(B + C)^2 + 1&lt;br /&gt;&lt;br /&gt;Where A, B, and C are the total A, B, and C factors for the team, and b and c are the respective B and C coefficients for the home run.&lt;br /&gt;&lt;br /&gt;Take this BsR equation:&lt;br /&gt;&lt;br /&gt;A  = H + W - HR&lt;br /&gt;B = .82S + 2.24D + 3.67T + 2.04HR + .1W&lt;br /&gt;C = AB - H&lt;br /&gt;D = HR&lt;br /&gt;&lt;br /&gt;The formula for the intrinsic weight of the HR is:&lt;br /&gt;&lt;br /&gt;((B + C)*A*2.04 - A*B*2.04)/(B + C)^2 + 1&lt;br /&gt;&lt;br /&gt;I’ve also figured the intrinsic weights for the other events so that I can also show you the percentage of the positive intrinsic linear weight total contributed by home runs (“POS” in the chart below).&lt;br /&gt;&lt;br /&gt;With this, we can look at the four different approaches I’ve discussed for 2010.  In the chart below, “hr” is the intrinsic LW of the home run, “RonHR” is the number of runs that actually scored on home runs, “%onHR” is the RBI-perspective figure that gets a lot of media play (RonHR/R), HR/R is the run-scored perspective figure (HR/R), BsR% is (hr*HR/BsR), and Pos% is hr*HR divided by the sum of the other products of positive event counts and their respective intrinsic weights.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-caaXkwPLw44/Tkrle3fQeDI/AAAAAAAAA2A/bjGx2XmNdHQ/s1600/runsonHR.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="347" src="http://4.bp.blogspot.com/-caaXkwPLw44/Tkrle3fQeDI/AAAAAAAAA2A/bjGx2XmNdHQ/s400/runsonHR.jpg" width="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I’m not going to add much comment on these figures.  
This list is sorted by BsR%, which I think is the best measure of how large a share of the offense the home run represented.  Toronto was in its own world, of course, with respect to home runs hit and the share of offense contributed by the homer no matter how one estimates it.  Also note the fact that the estimated linear weight value of the home run for every major league team falls in the [1.401, 1.444] range except for the Jays, 3.7 standard deviations below the mean at 1.355.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-93053972688664209?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/93053972688664209/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/08/ramblings-on-percentage-of-runs-scored.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/93053972688664209'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/93053972688664209'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/08/ramblings-on-percentage-of-runs-scored.html' title='Ramblings on the Percentage of Runs Scored via Home Run'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-caaXkwPLw44/Tkrle3fQeDI/AAAAAAAAA2A/bjGx2XmNdHQ/s72-c/runsonHR.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-1746572644766865788</id><published>2011-08-10T22:09:00.000-04:00</published><updated>2011-08-10T22:09:40.262-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Run Estimators'/><category scheme='http://www.blogger.com/atom/ns#' term='Pitching'/><title type='text'>Sample Simple Limited Input BsR ERA Estimator</title><content type='html'>In my last post on ERA estimators, I described how my philosophy towards constructing those metrics is predicated on starting with a solid model to estimate runs.  By using a solid foundation, you can be confident that, at the very least, your metric will adhere to the fundamental constraints of the run scoring process.   The designer retains freedom to experiment and estimate when it comes to selecting the inputs into the model (i.e., what gets filled in for hits, walks, home runs, etc.)&lt;br /&gt;&lt;br /&gt;In the article I sort of asserted that this could be done, and while I granted that it might be a more difficult process, I didn’t demonstrate how it could be done.  This post will offer an (admittedly simple) estimator using BsR with limited inputs and a lot of estimation.  The point is not to develop a metric that anyone will actually use.&lt;br /&gt;&lt;br /&gt;I’m going to define plate appearances as AB + W, which can be approximated by IP*2.84 + H + W (it can also of course be calculated from the horribly named BFP column), but I’ll just refer to it as PA in the equations.  The BsR equation I’ll be using as a basis is:&lt;br /&gt;&lt;br /&gt;A = H + W - HR&lt;br /&gt;B = (2TB - H - 4HR + .05W)*.78 = 1.56TB - .78H - 3.12HR + .039W &lt;br /&gt;C = AB - H&lt;br /&gt;D = HR&lt;br /&gt;&lt;br /&gt;We only have direct knowledge of walks.  Everything else will have to be filled in using estimation, for which I’ll use the 2010 major league totals.  
I’m not going to attempt to state any interrelationships between strikeouts, walks, and the events to be estimated--everything will simply be based on a scalar times (PA - W - K), a quantity which I’ll call N (the estimate of N based on IP is IP*2.84 + H - K).&lt;br /&gt;&lt;br /&gt;In 2010, the ratio of hits to N was .369; the ratio of homers to N was .04; the ratio of total bases to N was .578; and the ratio of (AB - H - K) to N was .768.  Thus:&lt;br /&gt;&lt;br /&gt;A = .369N + W - .04N = W + .329N&lt;br /&gt;B = 1.56(.578N) - .78(.369N) - 3.12(.04N) + .039W = .039W + .489N&lt;br /&gt;C = K + .768N&lt;br /&gt;D = .04N&lt;br /&gt;BsR = (W + .329N)(.039W + .489N)/(.039W + .489N + K + .768N) + .04N&lt;br /&gt;= (W + .329N)(.039W + .489N)/(.039W + K + 1.257N) + .04N&lt;br /&gt;&lt;br /&gt;To convert to RA, multiply by 9 and divide by (C/2.84), which is a rough estimate of total outs (ideally, you would separate strikeouts from outs in play for this estimate).  This is equivalent to multiplying by 25.56 and dividing by C:&lt;br /&gt;&lt;br /&gt;Estimated RA = ((W + .329N)(.039W + .489N)/(.039W + K + 1.257N) + .04N)*25.56/(K + .768N)&lt;br /&gt;&lt;br /&gt;The range for the estimated RA when applied is not as wide as the range for actual RA, which shouldn’t be a surprise since I intentionally took everything except strikeouts and walks out of the equation and didn’t do anything to amplify their value.  For example, the top five starters in the AL in 2010 according to this formula were:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-qoSIartxLtw/TkM5zTPa5_I/AAAAAAAAA14/u5VATsZJKL4/s1600/estraleaders.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="103" width="129" src="http://1.bp.blogspot.com/-qoSIartxLtw/TkM5zTPa5_I/AAAAAAAAA14/u5VATsZJKL4/s400/estraleaders.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Again, the point is not to offer this as an equation that should be used.  
It’s simply an illustration of constructing a Base Runs equation while restricting the list of available inputs, yet still estimating each component separately.  This same idea can be expanded upon (adding home runs to walks and strikeouts, for instance, would result in a standard DIPS-style estimator, and there are many other possible combinations of inputs), though, to produce a metric that is grounded in the foundation of the Base Runs model.  As I mentioned in the previous post, one need not tie themselves to the “dumb” kind of estimation on display here (i.e. assuming that the allowed variables have no ability to improve the prediction of the missing variables).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-1746572644766865788?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/1746572644766865788/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/08/sample-simple-limited-input-bsr-era.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/1746572644766865788'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/1746572644766865788'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/08/sample-simple-limited-input-bsr-era.html' title='Sample Simple Limited Input BsR ERA Estimator'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://1.bp.blogspot.com/-qoSIartxLtw/TkM5zTPa5_I/AAAAAAAAA14/u5VATsZJKL4/s72-c/estraleaders.jpg' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-7804681248773687677</id><published>2011-08-07T00:58:00.002-04:00</published><updated>2011-08-07T00:58:53.503-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OSU'/><title type='text'>JB Shuck, #52</title><content type='html'>On Friday night, JB Shuck made his major league debut with the Astros.  He entered the game as part of a double switch in the top of the fifth, singling in the bottom of the inning and later grounding out to second.  On Saturday night, he again entered in a double switch, drawing a walk in the seventh and singling off (literally) John Axford in the ninth.  Unfortunately, he also made the last out at third base, trying to advance after Axford’s throw to first ended up in right field.&lt;br /&gt;&lt;br /&gt;Shuck is not a top prospect in any respect; he’s a 24-year-old left-handed hitting outfielder who would be stretched in center and doesn’t have any power (career .085 minor league ISO).  Were he not in an organization with little talent to begin with that just traded 2/3 of its outfield, he wouldn’t be getting a chance at the majors, and he probably won’t have much of a career.  Of course, I would love to be wrong about all that.&lt;br /&gt;&lt;br /&gt;Shuck was notable during his OSU career (2006-08) as a two-way player, a type that was always quite rare on Bob Todd coached teams.  Shuck was a very good left-handed pitcher for OSU (as you can probably guess, he didn’t have great stuff, but for a Big Ten left-hander he had plenty) and played left or center as well, often batting third.&lt;br /&gt;&lt;br /&gt;However Shuck’s career ends up, he has helped to make this a banner year for Buckeyes in the major leagues.  
Two of his former teammates, Eric Fryer and Matt Angle, also broke in this year, making it the first season with three OSU debuts since 1969 (Steve Arlin, Chuck Brinkman and Fred Scherman).  Three Bucks also debuted in 1961 (Galen Cisco, Johnny Edwards and Ron Nischwitz) and 1927 (Arlie Tarbert, Marty Karow and Russ Miller).  To put the size of this crop into perspective, during 2000-2009 only three Buckeyes made the majors in total (Nick Swisher, Josh Newman and Scott Lewis).&lt;br /&gt;&lt;br /&gt;Along with Nick Swisher (2004 debut) and Cory Luebke (2010), five OSU products have appeared in the majors in 2011.  The last time that many Buckeyes played in the majors was 1974; we have a lot of work left to do to reach the highwater mark of nine in 1969.  While the three newbies are not really prospects (Fryer probably has the best prospectus on the basis of being a catcher), Swisher is an established quality contributor and Luebke is on his way to establishing himself as such, and the streak of at least one major league Buckeye (which dates to 1990, re-established after a three year drought from 1987-89) appears to be safe for some time to come.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-7804681248773687677?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/7804681248773687677/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/08/jb-shuck-52.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7804681248773687677'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7804681248773687677'/><link rel='alternate' type='text/html' 
href='http://walksaber.blogspot.com/2011/08/jb-shuck-52.html' title='JB Shuck, #52'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-6931989486805974307</id><published>2011-07-23T20:04:00.000-04:00</published><updated>2011-07-23T20:04:19.705-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Run Estimators'/><category scheme='http://www.blogger.com/atom/ns#' term='Pitching'/><title type='text'>Saying Nothing About ERA Estimators</title><content type='html'>If you follow the sabermetric blog/Twittersphere at all (and if you don’t, why on earth are you wasting time here?), I’m sure you can figure out what prompted this post.  However, I’m not going to name the metric that has generated discussion about this general topic because this post is not meant to be targeted at anyone, or to be a debunking of a particular metric, or anything other than me expressing my opinion about the construction of ERA estimators.  Others have different philosophies and they are welcome to them.  This is mine:&lt;br /&gt;&lt;br /&gt;First, I find it helpful to &lt;a href="http://walksaber.blogspot.com/2010/07/flavors-of-component-era.html"&gt;classify&lt;/a&gt; the inputs and construction of each metric.  This is not necessary, but the reason I find it helpful is that the ERA estimators out there are relatively diverse.  Compared to the sabermetric metrics that exist for evaluating offense, they are extremely diverse.  Almost all offensive rates are built around an estimate of runs created and divided by either outs or plate appearances.  
Almost all of them start with the traditional results-based batting line.&lt;br /&gt;&lt;br /&gt;ERA estimators, on the other hand, are all over the place.  Some follow the lead of their batting cousins and use a run estimator as their base, but some are regression-based.  Some use actual results, while some use batted ball data.  Some use batted ball data but decide to combine the four standard categories (flyballs, line drives, groundballs, and popups) in some manner.  Some assume that the pitcher has no control over what happens once the ball is put into play.  Some have implicit or explicit regression built-in with regard to balls in play.  Some limit themselves only to what happens when the ball is not put into play.  Some estimate ERA, and some estimate total runs allowed.  &lt;br /&gt;&lt;br /&gt;You probably don’t personally need more than one overall batting metric.  That doesn’t mean there shouldn’t be diversity across the sabermetric community--there's nothing wrong with having a number of intelligently designed choices, but as an individual you don’t need both wOBA and True Average--one will suffice.  That is not necessarily the case with ERA estimators--sometimes you might be interested in one that is results-based, sometimes you might be interested in DIPS, sometimes you might want to venture out into the uncertain world of batted ball metrics…even when using a common construction (BsR or LW for example), there is arguably a place for two or three or more different variations based on the inputs.&lt;br /&gt;&lt;br /&gt;I believe that the most logical place to start with an ERA estimator is estimating runs.  That is intentionally written to sound a little silly but it is not a philosophy shared by all developers of these metrics.  Some put formulas down on the page that they would never consider using to try to estimate how many runs a team would score.  I say that the place to start is with a logical run estimator.  
Given the team-level nature of the task, that suggests to me the use of Base Runs or another dynamic estimator, but I’m not going to argue too strenuously if you start with linear weights.&lt;br /&gt;&lt;br /&gt;This is a path which is not necessarily going to minimize your RMSE, or give the best correlation with future ERA.  With respect to the latter, if your goal is to provide the best possible estimate of future ERA, your metric is not attempting to measure how well the pitcher actually performed; it’s trying to forecast how well he will perform in the future.  Certain constructions will by their nature be less accurate at estimating ERA in the same period.  Every step you take down the path from outcome inputs (hits, walks, home runs, etc.) to component-based inputs (ignoring the actual outcomes of balls in play, or looking at batted ball types, etc.) will cost you accuracy when the standard is same-period ERA.  However, one can still use accuracy at predicting same-period ERA to compare methods within the same class.&lt;br /&gt;&lt;br /&gt;Beginning the construction of the metric with a model of run scoring avoids some of the problems inherent in using actual pitcher runs allowed.  I’m going to gloss over the fact that the number of runs a pitcher allows, regardless of whether it’s from a base period or a future period, is always dependent upon his defense and other factors outside of his control.  There are still other concerns that do not apply when looking at true team-level data.  The way runs are charged to individual pitchers is biased towards pitchers who inherit baserunners at the expense of those who bequeath baserunners.  
In practice, that means favoring relievers at the expense of starters, although depending on the performance of the relievers who inherit baserunners, individual bequeathers might actually benefit.&lt;br /&gt;&lt;br /&gt;Thus, whenever an approach detects a reliever ERA advantage, some of it is attributable to the way runs are assigned and not to the actual effectiveness of the pitcher.  It might even be possible to increase the accuracy of a metric by giving a bonus to relievers.  It is entirely unclear to me what benefit this provides other than lowering RMSE.  It doesn’t tell you anything about how well the pitchers performed, and it certainly doesn’t help you measure “true talent” any better--if that is the objective, an adjustment in the opposite direction could be warranted.&lt;br /&gt;&lt;br /&gt;Another advantage of modeling runs is that you can easily move between RA and ERA.  Most sabermetricians prefer RA because of the biases present in ERA and the distortions created by reconstructing imaginary innings sans errors.  It’s easy to rescale from RA to ERA by multiplying by a constant like .91.  While it’s also easy to divide by .91 to go the other way, if the metric has been tailored to match ERA, you’ve baked the biases of ERA into your metric.  This could potentially be most problematic for a regression-based estimator that uses batted ball data.  Even if this bias is small, it’s still completely unnecessary.&lt;br /&gt;&lt;br /&gt;Finally, the issue of dynamism is one that is often misunderstood with respect to ERA estimators.  SIERA trumpets its “interactive” nature in its name (which does distinguish it from FIP and other linear methods) but any metric based on the foundation of a dynamic run estimator is by nature interactive.  Instead of the interactivity being limited to target categories, though, every event interacts with every other event.  
Singles interact with triples, walks interact with home runs, doubles interact with triples, home runs interact with outs, outs interact with themselves...you get the idea (and I think that’s enough talk of events interacting with themselves).&lt;br /&gt;&lt;br /&gt;Building your metric around a run estimator does not necessarily restrict you to simply plugging in the numbers in the appropriate place.  Suppose you wanted to construct a metric based on batted ball types, strikeouts, and walks.  One way to go about it would be to simply go through and estimate singles, doubles, triples, homers, and outs in play based on the percentage of each batted ball type that winds up as each.  So, you would end up with equations that might look something like &lt;a href="http://www.insidethebook.com/ee/index.php/site/comments/the_best_of_this_week_at_bpro/#28"&gt;this&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;Singles = .057FB + .217GB + .516LD + .017PU&lt;br /&gt;&lt;br /&gt;However, if you believe that you have gleaned some other insights into the relationship between events that could improve your metric (such as strikeout pitchers having &lt;a href="http://www.fangraphs.com/blogs/index.php/new-siera-part-two-of-five-unlocking-underrated-pitching-skills/"&gt;lower HR/FB rates&lt;/a&gt;), you could still build that into your formula for estimated home runs, and plug those into the run estimator.  It’s more difficult than running a regression, and a more delicate balancing act (at least in terms of developing the formula), but it allows you to stay grounded in a model that estimates runs by taking a first step of, well, estimating runs.&lt;br /&gt;&lt;br /&gt;Again, I want to make it clear that I was attempting to explain where I’m coming from when I examine metrics of this type.  There is room for legitimate philosophical differences and I’m not trying to state that sabermetricians who deviate from the way I’d do it are engaging in poor practice.  
It would certainly be possible to develop a lousy metric based on a run estimator and following some of the other suggestions.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-6931989486805974307?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/6931989486805974307/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/07/saying-nothing-about-era-estimators.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6931989486805974307'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6931989486805974307'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/07/saying-nothing-about-era-estimators.html' title='Saying Nothing About ERA Estimators'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-2069990521328854742</id><published>2011-07-19T00:39:00.000-04:00</published><updated>2011-07-19T00:39:00.187-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scorekeeping'/><title type='text'>Scoring Self-Indulgence, pt. 4: Reaching Base</title><content type='html'>Before I begin demonstrating how plays that result in a batter reaching base are scored, I need to make clear exactly how I divide the scorebox.  
It’s nothing special; the box is divided into quadrants, one for each base, with first base beginning in the lower right, and the trip around the bases is recorded counter-clockwise from that point (just as in the game itself).  The areas for balls and strikes (as well as two-strike fouls, which are not included in the diagram) are ignored if not needed, as are the base quadrants--there are no actual lines in the scorebox, it’s just a way to organize the way the boxes are filled in:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-ouQNjzdq7T4/TiSo1wi5RXI/AAAAAAAAA0M/UUJJj7i7Pz0/s1600/scoreTwentySix.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-ouQNjzdq7T4/TiSo1wi5RXI/AAAAAAAAA0M/UUJJj7i7Pz0/s320/scoreTwentySix.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I’ll begin with the on base events that are very simple to score--those where the ball is not put into play.  My favorite baseball event, and the reason for the silly name of this blog:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-hocSz469P4w/TiSo7Qgjy0I/AAAAAAAAA0Q/2-cnBvWTIKQ/s1600/scoreTwentySeven.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-hocSz469P4w/TiSo7Qgjy0I/AAAAAAAAA0Q/2-cnBvWTIKQ/s320/scoreTwentySeven.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If it happens to be an intentional walk, then I circle the walk symbol (this matches how the ball symbol is circled for an intentional ball):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-ck1Sm1P84fg/TiSo8cvWoRI/AAAAAAAAA0U/h6gel9oftjc/s1600/scoreTwentyEight.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-ck1Sm1P84fg/TiSo8cvWoRI/AAAAAAAAA0U/h6gel9oftjc/s320/scoreTwentyEight.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The close cousin 
of the walk is the hit batter, which I record so:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-1EQPOK7x9Nc/TiSo91ewhYI/AAAAAAAAA0Y/yxv4arAaSEI/s1600/scoreTwentyNine.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://3.bp.blogspot.com/-1EQPOK7x9Nc/TiSo91ewhYI/AAAAAAAAA0Y/yxv4arAaSEI/s320/scoreTwentyNine.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Their distant and often overlooked cousin is catcher’s interference.  I simply score that play as “INT”:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-qWcKdUXSci4/TiSpASXNGOI/AAAAAAAAA0c/5kqz05ynpQc/s1600/scoreThirty.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-qWcKdUXSci4/TiSpASXNGOI/AAAAAAAAA0c/5kqz05ynpQc/s320/scoreThirty.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This is as good a time as any to discuss one of the minor tenets of my scoring philosophy.  As you know, catcher’s interference is also scored as an error on the catcher, which I do not make note of on my sheet.  I do not generally see the need to take up space repeating information that can be inferred from other markings.  Interference is always an E2, and so interference suffices for me.  &lt;br /&gt;&lt;br /&gt;Now that stance can definitely be a pain in the butt if you want to go back through the scoresheet and count errors, and in doing so you might overlook the “INT”.  But my concern is not in data compilation after the game.  If it were, I would use a completely different system of scoring than this one.&lt;br /&gt;&lt;br /&gt;I use one of the most common symbolic means of recording hits--the use of dashes in proportion to the number of bases the hit is worth.  Thus, the base symbol for a single is a simple dash.  I complicate matters a bit by including a hit location code and a symbol for trajectory after the hit; I won’t discuss those here just yet. 
 Suffice it to say that the following is a flyball single to right field:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-RkVxpGP_mf8/TiSpBSrRWsI/AAAAAAAAA0g/XuSFLysMzHU/s1600/scoreThirtyOne.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-RkVxpGP_mf8/TiSpBSrRWsI/AAAAAAAAA0g/XuSFLysMzHU/s320/scoreThirtyOne.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I use a slightly different symbol for an infield hit; it looks like a plus sign, but really it’s supposed to be the standard horizontal dash for a single with a vertical line of equal length running through it.  I use this vertical line to modify the other hit symbols for special cases, as you’ll see below.  This is an infield hit on a groundball in the vicinity of the second baseman:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-v1XM_dfb3M0/TiSqCac45eI/AAAAAAAAA0k/_V94us7350o/s1600/scoreThirtyTwo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-v1XM_dfb3M0/TiSqCac45eI/AAAAAAAAA0k/_V94us7350o/s320/scoreThirtyTwo.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Standard doubles feature two horizontal dashes.  This one happens to be a flyball to right-center field:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-cq1JfdSCeoI/TiSqDdfjs1I/AAAAAAAAA0o/LHETO8kqy5Y/s1600/scoreThirtyThree.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-cq1JfdSCeoI/TiSqDdfjs1I/AAAAAAAAA0o/LHETO8kqy5Y/s320/scoreThirtyThree.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A vertical dash through the standard double symbol indicates that it is a ground-rule double.  
This one came on a flyball to left field:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-cMy6bpIqS7s/TiSqE9Qk7XI/AAAAAAAAA0s/i07HWvov3SA/s1600/scoreThirtyFour.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-cMy6bpIqS7s/TiSqE9Qk7XI/AAAAAAAAA0s/i07HWvov3SA/s320/scoreThirtyFour.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If the reason for the batter being awarded second base was fan interference, I draw a little flag at the top of the ground-rule double symbol, creating an “F” for fan interference; this example is on a fly ball down the left field line:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-rOTsMRbCMuM/TiSqFkrzsVI/AAAAAAAAA0w/kvrUclDZVXs/s1600/scoreThirtyFive.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-rOTsMRbCMuM/TiSqFkrzsVI/AAAAAAAAA0w/kvrUclDZVXs/s320/scoreThirtyFive.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;You can figure out the symbol for a triple; this one is on a flyball to left-center:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-8g4nlZjvFE4/TiSqHB1jHWI/AAAAAAAAA00/QpBISPZ9Fn8/s1600/scoreThirtySix.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-8g4nlZjvFE4/TiSqHB1jHWI/AAAAAAAAA00/QpBISPZ9Fn8/s320/scoreThirtySix.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;For the very rare occasion on which an automatic triple occurs, I’d draw a vertical line through the three horizontal lines, but that’s not even worth rendering.  Moving on to home runs, they feature four horizontal lines.  Since a home run means there won’t be any stops at the bases, the symbol is written large enough to take up the whole box.  I denote a run scored by boxing the event that allows the runner to score in his box, so the home run is boxed.  
I also denote an RBI with an empty circle, so we’ll assume this is a two-run homer on a flyball to right:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-_TyUwRsfXBM/TiSqIeutBPI/AAAAAAAAA04/ESSF8kKeubY/s1600/scoreThirtySeven.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-_TyUwRsfXBM/TiSqIeutBPI/AAAAAAAAA04/ESSF8kKeubY/s320/scoreThirtySeven.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A vertical line added to the home run symbol indicates an inside-the-park home run.  This example is a solo inside-the-parker on a flyball to right field:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-_ycFo0ZaZIk/TiSqJGHKNyI/AAAAAAAAA08/AuaPbYMPZ_s/s1600/scoreThirtyEight.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-_ycFo0ZaZIk/TiSqJGHKNyI/AAAAAAAAA08/AuaPbYMPZ_s/s320/scoreThirtyEight.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The other main way to reach base is on errors.  My basic symbol for an error is the letter “E”, preceded by one of four letter codes: “F” for fielding, “T” for throwing, “C” for catching and “R” for receiving.  The quadrant in which the error is recorded is the one for the base on which the batter-runner ends up.  
A fielding error by the first baseman that results in the batter-runner stopping at first is marked as:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-PqJrVC_teP0/TiS3CpO-nqI/AAAAAAAAA1A/XhPNX3BXS-4/s1600/scoreThirtyNine.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-PqJrVC_teP0/TiS3CpO-nqI/AAAAAAAAA1A/XhPNX3BXS-4/s320/scoreThirtyNine.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A throwing error by the third baseman which allows the batter to reach second base:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-F7i682LSlqU/TiS3IX436sI/AAAAAAAAA1E/7m4ofVxpPS0/s1600/scoreForty.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-F7i682LSlqU/TiS3IX436sI/AAAAAAAAA1E/7m4ofVxpPS0/s320/scoreForty.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If there is no indication to the contrary, you can assume that the throw was intended for first base.  However, sometimes, a batter will reach on a throwing error when the intent of the fielder was to make a play on some other runner.  In such a case, I use an arrow and the number of the base (2, 3, or H for home) that the fielder was trying to throw to.  In this case a third baseman tried for a force at second, but threw the ball away instead.  
The batter may have reached anyway, and technically he is considered to have reached on a fielder’s choice, but this is a perfect example of the scoring legalese that I endeavor to avoid:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-d-yjqMhZFqM/TiS3LVy4VaI/AAAAAAAAA1M/VUSGRAiM6WU/s1600/scoreFortyX.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://3.bp.blogspot.com/-d-yjqMhZFqM/TiS3LVy4VaI/AAAAAAAAA1M/VUSGRAiM6WU/s320/scoreFortyX.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A catching error by the center fielder which allows the batter-runner to advance to third base:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-ttD3bkm5F1A/TiS3J6ghDMI/AAAAAAAAA1I/yatfoUd5RwU/s1600/scoreFortyOne.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-ttD3bkm5F1A/TiS3J6ghDMI/AAAAAAAAA1I/yatfoUd5RwU/s320/scoreFortyOne.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A receiving error occurs when a fielder mishandles a throw from another.  When allowing a batter to reach base, this almost always means that the fielder who made the throw is given credit for an assist.  I note this by recording his position number first, then the position number of the player who (literally) dropped the ball.  
In this case, the shortstop gets credit for an assist and the first baseman is charged with an error:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-PtQD5RTXrNc/TiS3MJsnF5I/AAAAAAAAA1Q/9uC-ui_5LDs/s1600/scoreFortyTwo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-PtQD5RTXrNc/TiS3MJsnF5I/AAAAAAAAA1Q/9uC-ui_5LDs/s320/scoreFortyTwo.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In the rare event of a four-base error, it would be written across the scorebox and boxed in a similar fashion as the home run.&lt;br /&gt;&lt;br /&gt;I will now look at the miscellaneous means of reaching base.  One is a strikeout plus a wild pitch or passed ball.  While such strikeouts are almost always swinging, it is possible to have a passed ball on a called strike three, in which case the K is backwards:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-vr06QnawV2Q/TiS3NJk1IZI/AAAAAAAAA1U/fPSiZvGRh2I/s1600/scoreFortyThree.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-vr06QnawV2Q/TiS3NJk1IZI/AAAAAAAAA1U/fPSiZvGRh2I/s320/scoreFortyThree.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-PU-E-M0gGxo/TiS3O-RxcEI/AAAAAAAAA1Y/IxD6ewxaPhc/s1600/scoreFortyFour.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://3.bp.blogspot.com/-PU-E-M0gGxo/TiS3O-RxcEI/AAAAAAAAA1Y/IxD6ewxaPhc/s320/scoreFortyFour.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If a batter reaches on a fielder’s choice, I use the obvious code “FC”.  Some people make scoring legalese distinctions between fielder’s choices and forceouts, but as you can guess by now, I don’t consider that necessary.  It’s helpful to record the initial fielder, since that indicates where the ball was hit.  
This example is a fielder’s choice initiated by the shortstop:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-yA0pBoab1zg/TiS3QU5WAgI/AAAAAAAAA1c/J6HvEHgMT_A/s1600/scoreFortyFive.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-yA0pBoab1zg/TiS3QU5WAgI/AAAAAAAAA1c/J6HvEHgMT_A/s320/scoreFortyFive.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If appropriate, a hit trajectory modifier (like bunt or chop) can be added to the fielder’s choice code above the fielder’s number.&lt;br /&gt;&lt;br /&gt;A similar case is the rare double play that allows a batter to reach base.  The most common type of this play is a failed attempt to turn a triple play on a groundball to third.  If the third baseman tags the bag, throws to second for the force, and the batter still reaches at first, you could have:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-Yzq334puapc/TiS3RgBn-tI/AAAAAAAAA1g/e2FPQWTUxms/s1600/scoreFortySix.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-Yzq334puapc/TiS3RgBn-tI/AAAAAAAAA1g/e2FPQWTUxms/s320/scoreFortySix.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Sacrifice hits and sacrifice flies can also occur in tandem with a runner reaching base, sometimes without an error in the case of a sacrifice hit.  
Suppose the pitcher makes an unsuccessful attempt at retiring the runner at second on a bunt attempt, but no error is charged and the scorer credits a sacrifice:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-2fwkvEcYh9g/TiS3SYiX23I/AAAAAAAAA1k/6UjWUn92nq8/s1600/scoreFortySeven.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-2fwkvEcYh9g/TiS3SYiX23I/AAAAAAAAA1k/6UjWUn92nq8/s320/scoreFortySeven.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There could be an error as well; suppose that the catcher attempts to make a play on the lead runner and throws the ball into center field, with an error charged for allowing the batter to reach second and the runner to reach third, but the SH credited as well:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-f047jtx60Bs/TiS3UNZbH-I/AAAAAAAAA1o/Lps6PTUoRZk/s1600/scoreFortyEight.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-f047jtx60Bs/TiS3UNZbH-I/AAAAAAAAA1o/Lps6PTUoRZk/s320/scoreFortyEight.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Finally, a batter might get credit for a sacrifice fly and reach safely when an outfielder fails to make a catch.  
Suppose that happened with the right fielder:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-J5de8d9ehOU/TiTFaSdSPaI/AAAAAAAAA1w/AMMSGpGDaEo/s1600/scoreFortyNine.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="217" width="400" src="http://2.bp.blogspot.com/-J5de8d9ehOU/TiTFaSdSPaI/AAAAAAAAA1w/AMMSGpGDaEo/s400/scoreFortyNine.jpg" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-2069990521328854742?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/2069990521328854742/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/07/scoring-self-indulgence-pt-4-reaching.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2069990521328854742'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2069990521328854742'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/07/scoring-self-indulgence-pt-4-reaching.html' title='Scoring Self-Indulgence, pt. 
4: Reaching Base'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-ouQNjzdq7T4/TiSo1wi5RXI/AAAAAAAAA0M/UUJJj7i7Pz0/s72-c/scoreTwentySix.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-7958568600834248621</id><published>2011-07-17T19:09:00.001-04:00</published><updated>2011-08-06T15:59:23.500-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OSU'/><title type='text'>Matt Angle, #51</title><content type='html'>The Orioles beat the Indians 8-3 today, but overall the game was great for me as Buckeye product Matt Angle made his major league debut, leading off and playing left field.  Angle grounded out three times against Jeanmar Gomez, then drew a walk against Joe Smith.  Angle is now the fifty-first OSU product to play in the majors, although the unconfirmed list I maintain is now up to sixty.  Angle played at OSU from 2005-2007 and was a seventh round pick in '07 by the Orioles.&lt;br /&gt;&lt;br /&gt;Unfortunately, Angle's upside is probably fifth outfielder.  At OSU, he was a good center fielder and leadoff hitter, getting on base a ton but not hitting for much power.  In the minors, he has been a similar type of player with a career .285/.372/.350 line.  Angle can certainly help a team as a pinch-runner/defensive replacement, but he's yet to display a consistent ability to get on base at the highest levels of the minors (his AAA OBA is .336 in 765 PA).&lt;br /&gt;&lt;br /&gt;Whether he has much of a career or not, he's in the encyclopedia forever now, and that's an awesome thing.  
Jack Shuck might be the next best hope for a Buckeye in the majors, but his AAA line is similar to Angle's without the speed (.267/.375/.321).  Of course, in the Astros system...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-7958568600834248621?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/7958568600834248621/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/07/matt-angle-51.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7958568600834248621'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7958568600834248621'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/07/matt-angle-51.html' title='Matt Angle, #51'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-5246951439536008115</id><published>2011-07-12T19:01:00.000-04:00</published><updated>2011-07-12T19:01:12.977-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Statistical Reports'/><title type='text'>Crude Team Ratings at the All-Star Break</title><content type='html'>Crude Team Ratings are a system I put together &lt;a href="http://walksaber.blogspot.com/2011/01/crude-team-ratings.html"&gt;last year&lt;/a&gt; to adjust team records for strength of schedule.  
The resulting value is expressed on a scale where an average team gets 100, and the numbers themselves can be plugged directly into an odds ratio calculation.  If a team with a rating of 120 plays a team with a rating of 90, they should win about 120/(120 + 90) = 57% of the time.  Because of the way the ratings are calculated (explained in the linked article), a rating of 100 does not mean a .500 team--a .500 team will actually be a shade below 100 in a normal league.&lt;br /&gt;&lt;br /&gt;The ratings are similar in theory to those published elsewhere, and so there’s nothing particularly unique or interesting about them.  But I felt that the All-Star break was a logical point to stop and take a look at the ratings as they stand, both because interleague play is now complete and we can get a better read on the difference between the leagues, and because having some idea about strength of schedule to date (and in the future, although I haven’t figured that here) is helpful when handicapping the pennant races.&lt;br /&gt;&lt;br /&gt;I will run through three sets of CTRs--one based on win/loss record (CTR), one based on R/RA (eCTR), and one based on RC/RC Allowed (pCTR).  I prefer the latter two, especially at this point in the season, but you could also do a combination or factor in projections.  
I slapped the “crude” label on them in the name for a reason.&lt;br /&gt;&lt;br /&gt;First, here are the CTRs based on actual win/loss record:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-9cqGLofkYEA/ThzRDQA_wUI/AAAAAAAAAzg/VATTsVE8hgQ/s1600/asb1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://3.bp.blogspot.com/-9cqGLofkYEA/ThzRDQA_wUI/AAAAAAAAAzg/VATTsVE8hgQ/s400/asb1.jpg" width="221" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-Nqk2t2_kZjI/ThzRDhDFhJI/AAAAAAAAAzo/bCaDhx0MQCg/s1600/asb2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="126" src="http://2.bp.blogspot.com/-Nqk2t2_kZjI/ThzRDhDFhJI/AAAAAAAAAzo/bCaDhx0MQCg/s400/asb2.jpg" width="81" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The league ratings (which are simply the average rating of the division or league’s members) show the AL with a much smaller advantage over the NL than in recent years.  However, as you’ll see, the AL advantage grows as we move further away from actual record towards component record.&lt;br /&gt;CTRs based on expected record (runs scored and allowed):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-3Ib1sk8YMDM/ThzREKLWg2I/AAAAAAAAAzw/AU6bTbnNF0A/s1600/asb3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://1.bp.blogspot.com/-3Ib1sk8YMDM/ThzREKLWg2I/AAAAAAAAAzw/AU6bTbnNF0A/s400/asb3.jpg" width="222" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The top three teams remain the same for the three approaches, but each time a different club is ranked #1.  
Houston actually starts to look a little better as you go and only shares last place on the predicted list:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-SBagztBf4qE/ThzRE9QmmVI/AAAAAAAAAz4/wVjr_6ZgEuM/s1600/asb4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://4.bp.blogspot.com/-SBagztBf4qE/ThzRE9QmmVI/AAAAAAAAAz4/wVjr_6ZgEuM/s400/asb4.jpg" width="223" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-QnsItkPEG3o/ThzRFSjI8XI/AAAAAAAAA0A/rxvaRwrGul4/s1600/asb5.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="127" src="http://4.bp.blogspot.com/-QnsItkPEG3o/ThzRFSjI8XI/AAAAAAAAA0A/rxvaRwrGul4/s400/asb5.jpg" width="81" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Boston’s component record has easily been the most impressive in MLB to date when adjusted for schedule.  The Giants have played the weakest schedule by any measure and here only appear to be an average club.  Pittsburgh has both exceeded its component record and benefitted from a weak schedule.  Cleveland comes out better, as essentially an average team, and I was surprised to see that the Tribe has actually played a tough schedule.  &lt;br /&gt;&lt;br /&gt;Actual record gave the NL East the distinction of best division, but here the AL East returns to its customary position.  The Centrals and the NL West are the weak divisions, the most notable element of which is that the NL Central is not alone at the bottom of the barrel as they were in 2010.&lt;br /&gt;&lt;br /&gt;Finally, here is a freak show way of comparing the performances of teams so far in 2011 to what they did in 2010.  The first column shows each team’s 2011 pCTR to date; the second column is their 2010 pCTR; and the third column is the implied winning percentage of the 2011 team against their 2010 predecessors.   
Of course, I’m comparing full seasons to (slightly more than) half seasons, not applying any regression, and presenting it as a W% is just a cute device.   (Doing so, if one takes the result seriously, also implies that an average team is equally good in 2010 and 2011):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-fLA3F2QvmYE/ThzRn44cuoI/AAAAAAAAA0I/mDtEdJyxntw/s1600/asb6.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="188" src="http://2.bp.blogspot.com/-fLA3F2QvmYE/ThzRn44cuoI/AAAAAAAAA0I/mDtEdJyxntw/s400/asb6.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The Pirates and Indians being near the top of the list won’t surprise anyone, but the Red Sox, while really good last year, have been great so far.  The range (if not the standard deviation; I’m not going to bother) of theoretical this year v. last year W% is pretty close to what you’d expect for a range of team W% in the current season.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-5246951439536008115?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/5246951439536008115/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/07/crude-team-ratings-at-all-star-break.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5246951439536008115'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5246951439536008115'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/07/crude-team-ratings-at-all-star-break.html' title='Crude Team Ratings at the All-Star 
Break'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-9cqGLofkYEA/ThzRDQA_wUI/AAAAAAAAAzg/VATTsVE8hgQ/s72-c/asb1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-8920240073082771128</id><published>2011-06-28T23:46:00.000-04:00</published><updated>2011-06-28T23:46:00.420-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><title type='text'>Once in a Lifetime</title><content type='html'>What is the probability that your favorite team will win the World Series during your remaining lifetime?  Obviously, the best estimate of this probability depends on the expected quality of your team each year, as well as a specific estimate of your mortality, which is dependent upon any number of factors.  This post generalizes this question to estimate the probability of an average team winning the Series during an average person's lifetime.&lt;br /&gt;&lt;br /&gt;Before delving into that, I must note that the subject matter is simultaneously frivolous (from a rigid sabermetric standpoint) and a tad morbid.  After all, no one really wants to think about their own mortality, and thinking about the chances that your team will break your heart until your heart breaks isn't exactly fun.  &lt;br /&gt;&lt;br /&gt;I will assume here that the probability of winning the World Series each year takes one of three values, which are independent from year-to-year: 1/30 (all teams have an equal chance), 1/60 (50% as likely to win as average), or 1/15 (twice as likely to win).  
The last value is also essentially equal to the average chance of winning a pennant (with 14 and 16 team leagues it splits the difference).  &lt;br /&gt;&lt;br /&gt;What's trickier is figuring the probability of survival at each age.  I have used a &lt;a href="http://www.ssa.gov/OACT/NOTES/as120/LifeTables_Tbl_7_1980.html"&gt;life table&lt;/a&gt; from the Social Security Administration as the basis for these estimates.  The table is a projected life table for Americans born in 1980.  As such, it doesn't directly apply to people in other age groups, but for the sake of simplicity I have assumed that it does. The effect of this will be to overstate the life expectancy and championship probabilities for people born prior to 1980, as the SSA models assume that the force of mortality will decrease. &lt;br /&gt;&lt;br /&gt;Even if the assumptions about mortality prove to be accurate, the probability of your team winning will probably be lower than assumed here thanks to expansion, which may not be imminent but will almost certainly occur at some point.&lt;br /&gt;&lt;br /&gt;You can skip ahead a couple of paragraphs if that explanation of the life functions satisfies your curiosity, because this has absolutely nothing to do with baseball.  The charts show data for age at five year intervals, starting at age five and running through age 100.  They are based only on the male life expectancy chart; females have higher life expectancy, but most fans are male and presenting two separate charts or combining the two would add a lot of trouble to a silly exercise.&lt;br /&gt;&lt;br /&gt;There are three pieces of data presented in the chart for each age:&lt;br /&gt;&lt;br /&gt;1. e(x)--curtate life expectancy&lt;br /&gt;&lt;br /&gt;The curtate life expectancy only considers full-year survival.  For example, the life table tells us that at age five there are 98,357 survivors.  At age six, there are 98,324.  
The 33 deaths between age five and six do not contribute to the curtate life expectancy for age five.  The complete life expectancy does include partial-years and is included in the life table, but requires the use of assumptions about mortality at fractional ages to estimate.  The curtate expectancy is easily calculated from the life table:&lt;br /&gt;&lt;br /&gt;e(5) = l(6)/l(5) + l(7)/l(5) + l(8)/l(5) + ... + l(117)/l(5)&lt;br /&gt;&lt;br /&gt;= 98324/98357 +  98294/98357 + 98264/98357 + ... + 1/98357 = 73.38&lt;br /&gt;&lt;br /&gt;2. Exp Wins--expected championships won by the team during one's lifetime&lt;br /&gt;&lt;br /&gt;The probability of a team winning in any given year is assumed to be one of the constants explained above (1/15, 1/30, 1/60).  The probability of a person age x surviving to age x + 1 is l(x + 1)/l(x).  The product of these two is the probability that a person age x survives to see their team win a title in the year of age x + 1.&lt;br /&gt;&lt;br /&gt;So the expected wins for a life age 5 is figured thusly (assuming a 1/30 probability of team winning):&lt;br /&gt;&lt;br /&gt;1/30*l(6)/l(5) + 1/30*l(7)/l(5) + 1/30*l(8)/l(5) + ... + 1/30*l(117)/l(5)&lt;br /&gt;&lt;br /&gt;= 1/30*e(5)&lt;br /&gt;&lt;br /&gt;The assumption here is that one needs to survive the full year in order to see one's team win during that year.  To put it in baseball terms, we could say that these calculations assume that each person is born on October X and the World Series always concludes on October X. &amp;nbsp;I'm also assuming that survival means one is able to enjoy a victory by their team, but sadly this is not always the case.&lt;br /&gt;&lt;br /&gt;The use of curtate functions makes computations easier but it also undersells the life expectancy and the championship expectations and probabilities a little bit.&lt;br /&gt;&lt;br /&gt;3. 
&amp;gt;=1 win--the probability that the team will win at least one championship during one's life&lt;br /&gt;&lt;br /&gt;For any given age x, the probability that one's team wins and that they survive to see it is 1/30*l(x + 1)/l(x).  The complement of this result is the probability that the team does not win the title in this year plus the probability that the team does win the title but life (x) does not survive to see it. &lt;br /&gt;&lt;br /&gt;Multiplying all of the complements for a lifetime results in the probability that the team never wins while the subject survives.  The complement of this is therefore the probability that the team does win during the lifetime of (x).&lt;br /&gt;&lt;br /&gt;Here is the chart based on an average (1/30) chance of winning the title each year:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-0uEJVvdB7vw/Tgdqcu0t0YI/AAAAAAAAAzQ/SuFu13yJcIc/s1600/lifex1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-0uEJVvdB7vw/Tgdqcu0t0YI/AAAAAAAAAzQ/SuFu13yJcIc/s1600/lifex1.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This paints a pretty rosy picture for most people.  Keeping in mind that the life functions are based on expected mortality for those born in 1980, from fifty and younger the expected number of titles is still one or greater, and from 60 and younger the probability of seeing a title is still greater than 50%.  
If you are indoctrinating your ten-year old son into fandom of your favorite team, you have about a 90% chance of not making his fan life unrewarding, so you can feel good about that.&lt;br /&gt;&lt;br /&gt;Of course, the odds aren't as favorable if your team is only half as likely to win as an average franchise:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-swDesx_u8pk/TgdqeBjYGFI/AAAAAAAAAzU/LxrKDWoHdQE/s1600/lifex2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-swDesx_u8pk/TgdqeBjYGFI/AAAAAAAAAzU/LxrKDWoHdQE/s1600/lifex2.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Still, even if your team can only expect to win once every sixty years, there's no reason to despair.  If you are only concerned about winning the pennant, and your team can manage an average probability of doing so, you can feel pretty good regardless of your age:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-45OnyAt8Zj0/TgdqfvuORdI/AAAAAAAAAzY/gtnQEY6uEjU/s1600/lifex3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-45OnyAt8Zj0/TgdqfvuORdI/AAAAAAAAAzY/gtnQEY6uEjU/s1600/lifex3.jpg" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-8920240073082771128?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/8920240073082771128/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/06/once-in-lifetime.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/8920240073082771128'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/12133335/posts/default/8920240073082771128'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/06/once-in-lifetime.html' title='Once in a Lifetime'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-0uEJVvdB7vw/Tgdqcu0t0YI/AAAAAAAAAzQ/SuFu13yJcIc/s72-c/lifex1.jpg' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-3049363447479291251</id><published>2011-06-26T19:47:00.002-04:00</published><updated>2011-06-26T19:47:18.438-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OSU'/><title type='text'>Eric Fryer, #50</title><content type='html'>Today (Sunday June 26, 2011) may have been the best day for OSU major leaguers in years.  Nick Swisher hit a home run in the Yankees’ win over the Rockies, while Cory Luebke got his first start of the year for San Diego (on the roster all season, he’s made 29 relief appearances) and pitched five shutout innings, allowing one hit and two walks while fanning six.  The third Buckeye in the majors had the least impressive game (0-3 with a walk, plus getting flattened by David Ortiz on a play at the plate), but his is the most significant performance of the day since it was his major league debut.  &lt;br /&gt;&lt;br /&gt;Eric Fryer became the fiftieth Buckeye to play in the majors (although the unconfirmed list I maintain now numbers 59) when he started for Pittsburgh at catcher.  Fryer was the starting catcher for OSU from the day he set foot on campus, playing from 2005-07 until he was a 10th round pick by Milwaukee.  
The Brewers dealt him to the Yankees for Chase Wright in the 2009 pre-season, and they passed him along to the Pirates that summer for Eric Hinske.  Both Milwaukee and New York tried him in the outfield as a minor leaguer, so his long-term prognosis as a catcher is unclear, and to have a real major league career he’ll almost certainly have to stick behind the plate.  But Pittsburgh has shown a commitment to using him as a catcher, and a strong minor league showing in 2011 (.320/.408/.520 in 201 PA between AA and AAA) opened the door for a major league roster spot.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-3049363447479291251?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/3049363447479291251/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/06/eric-fryer-50.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3049363447479291251'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3049363447479291251'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/06/eric-fryer-50.html' title='Eric Fryer, #50'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-2427159377437072327</id><published>2011-06-20T22:15:00.000-04:00</published><updated>2011-06-20T22:15:10.181-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OSU'/><title type='text'>Freshman 
Blues</title><content type='html'>I really am struggling with how to write this post, because it will probably come off as fairly critical, and that is far from my intent.  I support the OSU baseball program fully, and any time I make a prediction or say anything less than fully complimentary of a player, I would like nothing more than to be proven wrong.&lt;br /&gt;&lt;br /&gt;On top of that, it’s also necessary to note that the length of the college season is such that statistics are compiled over much smaller samples and with much less predictive value than what I am accustomed to when looking at the major leagues.  Of course, caution must be taken with regard to sample sizes and predictability when dealing with statistics of any kind, but it’s more important than usual here.  More weight and deference must be given to observation and to the coaches who see the players in practice every day.&lt;br /&gt;&lt;br /&gt;A final disclaimer that I need to throw in is that I didn’t really expect the 2011 Bucks to be very good.  I thought qualifying for the Big Ten Tournament would be a challenge, and Ohio wound up as the #4 seed, so they actually exceeded my expectations, at least in conference play.  It’s difficult to put too much blame on Coach Greg Beals, as he was in his first year and had nothing to do with recruiting the players he had to work with.&lt;br /&gt;&lt;br /&gt;OSU lost its first two games of the season, then trailed St. John’s in the third game 7-0 before rallying for an 8-7 win.  The team was 6-5 against a not particularly tough schedule before embarking on a California trip that saw them hammered to the tune of 1-5.  From that point on, the team had to work hard to stay near .500.&lt;br /&gt;&lt;br /&gt;Mid-week games against non-conference opponents saw OSU go 5-4, which was actually an improvement over 2010 (and included a win over Oklahoma State in a rare matchup with a national program in a game of that type).  
In Big Ten play, the Buckeyes went 13-11 to earn the #4 seed in the conference tournament.  They defeated #5 Minnesota, lost to #1 Illinois, and lost a rematch with Minnesota to bow out with a final season record of 26-27.  It was the first sub-.500 season for Ohio State since 1987--the season before Bob Todd arrived in Columbus.  It wasn’t just unadjusted record that suggested it was a very poor team by OSU standards--since Boyd Nation’s ISR ratings started in 1997, the lowest national rank for the program had been #130 in 1998, but in 2011 OSU could manage no better than #160.&lt;br /&gt;&lt;br /&gt;OSU’s .491 W% ranked seventh in the Big Ten (Purdue led at .649); their .464 EW% ranked seventh (Purdue led at .668); and their .477 PW% ranked sixth (MSU led at .643).  The Big Ten averaged 5.38 runs scored and 5.30 runs allowed per game; OSU ranked fifth with 5.6 runs scored and ninth with 6.0 runs allowed.  Ohio had a .930 modified Fielding Average, which ranked eighth (the conference average was .939).  OSU was also eighth in DER, converting approximately 66% of balls in play into outs compared to an average of 67%.&lt;br /&gt;&lt;br /&gt;Offensively, the Buckeyes were almost perfectly average in the major components of offense: batting average (.280 compared to a .279 B10 average); walks (.100 per at bat versus .102); and power (.104 isolated power matched the mean).  Looking at the individual players, catching was the biggest weakness.  Greg Solomon got the vast majority of the playing time, but was ice cold from the mid-point of the B10 schedule on, winding up at -5 RAA for the season. Solomon’s strikeout to walk ratio was a dreadful 42/4.  First baseman Josh Dezse copped B10 Freshman of the Year honors thanks to his team-leading +15 RAA, fueled by a team-leading .332 BA and a decent .280 SEC as well.  Second baseman Ryan Cypret hit .323, which despite a .208 SEC was enough to finish second on the team with +9 RAA.  
Third baseman Matt Streng drew just five walks, leaving him with a putrid .138 SEC despite a solid .115 ISO, and was two runs below average as a result.  Tyler Engle bounced back to something resembling his sophomore form, drawing enough walks to make himself an average overall offensive player. Streng and Engle were both seniors.&lt;br /&gt;&lt;br /&gt;Left field began as a platoon between David Corna and Joe Ciamocco, but Corna’s doubles power (he led the team with 16) coupled with Ciamocco’s failure to hit at all quickly left the former getting all the playing time.  Unfortunately, Corna didn’t hit for a high enough average or draw enough walks to rank any better than average (-1 RAA).  Freshman center fielder Tim Wetzel showed decent promise, but his lack of power (just three doubles and two triples in 176 at bats) left him at -4 RAA.  Senior right fielder Brian DeLucia didn’t hit for power as he had in the past, and thus ended up at just +2 RAA with a .276/.359/.381 line.  DH Brad Hallberg led the team with 28 walks, which was enough to make him an average contributor despite hitting .254 with an ISO of just .042.  The Bucks did not have much depth, and there were no particularly notable performances by non-regulars; they combined for a .202/.248/.287 line in 137 plate appearances.&lt;br /&gt;&lt;br /&gt;On the pitching side, senior Drew Rucinski was the clear ace of the staff, leading the team in innings (82), Run Average (4.29), and RAA (+9), with an even better 3.86 eRA.  Sophomore Brett McKinney settled into the Saturday starter role, and turned in average results (5.18 RA) with a promising 49/20 K/W ratio.  Freshman Greg Greve got better as the Big Ten season went on, but his overall performance (6.72 RA, -11 RAA) left much to be desired for a third starter.  
Fellow freshman John Kuchno got a number of mid-week starting assignments (10 appearances, 7 starts).&lt;br /&gt;&lt;br /&gt;McKinney and Greve were bumped up in the pecking order because of a complete collapse in performance from senior Dean Wolosiansky, penciled in as the #2 but quickly shuffled off to low-leverage work with an 8.46 RA in 55 innings for -19 RAA.&lt;br /&gt;&lt;br /&gt;Beals brought a major change in philosophy to the bullpen.  Todd’s teams tended to have a closer, a middle reliever that he trusted, and a bunch of other guys.  Beals managed more like a professional manager, with a closer and a number of pitchers whom he mixed and matched to get the platoon advantage.  Left-hander Andrew Armstrong was used in a LOOGY-esque capacity (29 innings in 33 appearances, with an essentially average 4.91 RA despite 39/16 K/W), a new experience for OSU baseball fans.  Fellow southpaw Theron Minimum was not used in as much of a matchup role (he started once and pitched 29 innings in 21 appearances), and was used in lower leverage situations than Armstrong.  He recorded a 6.28 RA for -3 RAA.&lt;br /&gt;&lt;br /&gt;The situational usage of the lefties was no doubt enhanced by the fact that the Bucks’ top two right-handed setup men were somewhat underhanded.  Senior Jared Strayer raised his arm slot to 3/4 this year, while junior former catcher David Fathalikhani delighted this fan with his more pure sidearm approach.  Both turned in similar performances, as Strayer worked 29 innings in 27 outings with a 4.66 RA (+2 RAA); Fathalikhani matched that RAA by working 26 innings in 26 appearances with a 4.56 RA.&lt;br /&gt;&lt;br /&gt;The other reliever of note (Brian Bobinski and Paul Geuy worked 21 innings between them) was Dezse, the freshman first baseman who doubled as a closer.  Dezse throws hard, in the mid-to-high nineties, but he left a lot to be desired in the polish department with 32/22 K/W in 28 IP (plus two hit batters and eight wild pitches) with a 7.48 RA.  
I’ll have more to say about Dezse in a moment, which will be a little critical, but I intend that to be aimed at the way he was utilized, not at the pitcher himself.&lt;br /&gt;&lt;br /&gt;Given that OSU did not return much talent and that could not be pinned on Beals, the most interesting aspect to evaluation of the new coach was his strategy.  What I saw I did not like.  OSU *seemed* to make more baserunning mistakes than usual, including a couple of horrible little league style delayed double steals of home that were sniffed out with ease by the opposition.  OSU’s SB% was 63%, an improvement from last year’s 56%, but well below 2009 (82%) and 2008 (74%).  More disturbing was that OSU’s stolen base attempt frequency (measured here as (SB + CS)/(S + W)) increased to 9.9% (2008-10: 11.9%, 8.9%, 7.4%).&lt;br /&gt;&lt;br /&gt;Beals also called for many more sacrifices than Todd had over the last three seasons.  While he had a less potent offense to work with, the difference was stark enough to suggest that perhaps Beals is a bigger believer in the bunt than Todd.  OSU’s SH/(S + W) was .073, while Todd’s final three teams had ratios of .028, .047, and .033.&lt;br /&gt;&lt;br /&gt;What really befuddled this observer, though, was Beals’ handling of Dezse.  For one, Beals never gave him a chance to start.  One of the most bizarre discussions by a TV announcer I can recall occurred during the Buckeyes’ Big Ten Tournament game against Illinois.  The announcer lamented the difficult job college coaches have in balancing player development with winning games, and his example was that Dezse could develop more starting but was more valuable to his team as a reliever.&lt;br /&gt;&lt;br /&gt;In fairness, it might have been for the best that Dezse was limited to short mound outings, because he showed little command and really was not effective.  
There were a couple games in which Dezse came in and absolutely blew away the opposition, but he also had several games that can only be described as meltdowns.&lt;br /&gt;&lt;br /&gt;On April 3, OSU led Northwestern 14-10 entering the top of the ninth.  Dezse gave up four hits and a walk, was bailed out by an idiotic piece of Wildcat baserunning…and still yielded a game-tying three run blast.  The Bucks did pull it out with a tally in the bottom of the frame.  On May 14, OSU led Iowa 8-4 entering the bottom of the ninth.  Dezse gave up three hits, three walks, and the lead as the Hawkeyes wound up prevailing in ten.  &lt;br /&gt;&lt;br /&gt;But the costliest such appearance came in the second round of the Big Ten Tournament.  Fresh off a win over #5 Minnesota, #4 OSU had a chance to knock off #1 Illinois and stay in the winner’s bracket.  Dezse was summoned to start the eighth with OSU up 4-1.  He issued two walks and a wild pitch in the eighth, but kept the Illini off the board.  In the ninth he surrendered a leadoff double, uncorked a wild pitch, yielded a single, got a groundout, uncorked a second wild pitch, gave up another single, got the second out on a fly to left, then issued a walk and a single that tied the game.  Andrew Armstrong relieved him but his first pitch was hit back up the middle for a single that essentially ended Ohio’s season (the Bucks bowed out with a whimper against Minnesota the next day).&lt;br /&gt;&lt;br /&gt;Beals *seemed* to be locked into the mindset that he had to have a closer, and that it had to be his hardest throwing option, regardless of whether he was able to throw strikes or pitch efficiently.  
Dezse may have outstanding potential, and I’ll grant that it’s possible that he truly was the best option--but as a fan, it was beyond frustrating to watch the same movie play out three times.&lt;br /&gt;&lt;br /&gt;I do not want to close this on a down note, so it’s important to point out that Beals has by all accounts done a terrific job in recruiting, illustrated by the fact that two OSU signees went in the first ten rounds of the June draft.  There’s a lot more to the job of being a major league manager than just strategy, and that applies even more in a collegiate setting in which procuring talent is also the coach’s responsibility.  The true test of Beals’ success will not be bunt frequencies, but wins and losses, and that test begins in 2012.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-2427159377437072327?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/2427159377437072327/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/06/freshman-blues.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2427159377437072327'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2427159377437072327'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/06/freshman-blues.html' title='Freshman Blues'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-2585937836820067889</id><published>2011-06-17T22:05:00.000-04:00</published><updated>2011-06-17T22:05:05.463-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Yahoo Box Scores'/><category scheme='http://www.blogger.com/atom/ns#' term='Whimsy'/><title type='text'>Great Moments in Yahoo! Box Scores</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-qtoKy1Oz46U/TfwHtzsM1PI/AAAAAAAAAzM/61wKJSNnmzg/s1600/reimold.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="80" width="400" src="http://2.bp.blogspot.com/-qtoKy1Oz46U/TfwHtzsM1PI/AAAAAAAAAzM/61wKJSNnmzg/s400/reimold.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Thanks to a reader for pointing this one out.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-2585937836820067889?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/2585937836820067889/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/06/great-moments-in-yahoo-box-scores.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2585937836820067889'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2585937836820067889'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/06/great-moments-in-yahoo-box-scores.html' title='Great Moments in Yahoo! 
Box Scores'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-qtoKy1Oz46U/TfwHtzsM1PI/AAAAAAAAAzM/61wKJSNnmzg/s72-c/reimold.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-4288728603886805406</id><published>2011-06-14T00:17:00.004-04:00</published><updated>2011-06-14T00:17:00.180-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Win Estimators'/><category scheme='http://www.blogger.com/atom/ns#' term='Book Reviews'/><category scheme='http://www.blogger.com/atom/ns#' term='Run Estimators'/><title type='text'>Comments on Bill James' “Solid Fool’s Gold”</title><content type='html'>For the past three seasons, Bill James had published an annual book called the &lt;u&gt;Bill James Gold Mine&lt;/u&gt;.  The book included a sampling of some of the material available to subscribers of his Bill James Online website, including some unique split data (unique in the sense that it’s not commonly in print on actual pieces of paper), like breakdowns of pitches thrown to left-handed and right-handed batters.  There were little boxes with “nuggets” (gold is a theme needless to say) pointing out various oddities.&lt;br /&gt;&lt;br /&gt;Those elements of the book really took up a lot of space, but were woefully incomplete (they clearly weren’t intended to be complete, but the point is that the book had no utility as a reference).  The most interesting aspect of the book was that it reprinted a number of full-length essays that James had written for his website over the course of the year.  
I generally enjoyed those essays, as you will see if you look back at my comments on the previous editions of the book.&lt;br /&gt;&lt;br /&gt;This year, there was no &lt;u&gt;Gold Mine&lt;/u&gt;.  Instead, ACTA and James released a slimmer, smaller volume entitled &lt;u&gt;Solid Fool’s Gold&lt;/u&gt;, which includes only essays.  I didn’t bother to count, but I’d guess that the new book has about as many essays as the &lt;u&gt;Gold Mine&lt;/u&gt;; it is sold for a lower price, and personally I won’t miss the hodgepodge of charts that much.&lt;br /&gt;&lt;br /&gt;If a fairly quick read of non-technical essays by Bill James is something you think you’d enjoy, you’ll probably like &lt;u&gt;Solid Fool’s Gold&lt;/u&gt;, unless you already subscribe to Bill James Online.  The new format is much better, as it includes a bunch of essays that you could pick up and read in five years rather than some of the more limited shelf-life aspects of the &lt;u&gt;Gold Mine&lt;/u&gt;, and it does it more cheaply and compactly.  I certainly enjoyed it, which you should keep in mind as I now launch into a more critical review of some specific essays in the book.&lt;br /&gt;&lt;br /&gt;The essay that has probably gotten the most attention (outside of the much-panned essay on Shakespeare and Topeka which was published on Slate) is called “Minor League Pyramid”, and it includes James’ outline of a way to reform the minor league structure to make it resemble a pyramid rather than a tube (his description) as it does now.  I won’t repeat his argument here, but there are a couple of points that I feel strongly about:&lt;br /&gt;&lt;br /&gt;1. I agree with James that talent is choked out of the game by the limited number of openings at the entry level.  This is offset somewhat by the existence of college baseball as an alternative means to improve baseball, but scholarship restrictions make it a less reliable means of keeping quality talent engaged than college football or basketball.  
The scholarship restrictions also leave college baseball as next to useless for attracting low-income players to the game.&lt;br /&gt;&lt;br /&gt;2. The proposal that James offers includes limits on the rate at which a prospect can be advanced through the minors, with the bottom line being that a player could not reach the majors until he’d played three seasons in the minors.  This is impossible to square with the existence of college baseball, and it also removes the illusion of a meritocracy which I think is very important, even if it is only an illusion.  We all know that teams play service time games, and that there are few twenty-year olds ready to be major league contributors anyway, but the potential for a wunderkind to reach the majors, even if there are only a handful each year, is something that should be preserved.&lt;br /&gt;&lt;br /&gt;3. James doesn’t seem to think his system would have much of an impact on the ability of baseball to attract talented athletes with other options.  In fact, he claims that a minor league pyramid would reduce the pressure on signing bonuses.  While I’m sure MLB CFOs would like that (as they would like the destruction of college as an alternative path), I can’t fathom how it wouldn’t make MLB a much less attractive option. &lt;br /&gt;&lt;br /&gt;He doubles down on this by saying that it would reduce “pressure” on teams to scout internationally to find talent, since they would have a larger supply of homegrown players.  I can’t for the life of me understand why that would be considered a positive.  The piece does make some good points, but that one is a real head-scratcher.&lt;br /&gt;&lt;br /&gt;Another essay in the book is called “Stink-O-Meter”; it discusses a fairly simple method of tracking the persistency of losing for a franchise.  
The article is good in that it reinforces something that the average fan with no historical perspective constantly needs to be reminded of--the state of even sorry franchises like the Pirates and Royals is nothing like that of the terrible franchises of the game’s past.&lt;br /&gt;&lt;br /&gt;The article is way too long, though--James feels compelled to run a chart every few paragraphs listing the top five or ten losing teams at a given moment in time.  This works better online than in print, where it just wastes space, but it also helped me to see what I think is a pattern in James’ more recent work.  James has realized that it’s more difficult (not just for him, but for anyone) to produce cutting edge technical work.  Rather than introducing more rigorous means of analysis, James has decided to play show-and-tell; his more recent essays are filled with tables that in the past would have been left to the imagination or determination of the reader.  I could be off base, but it seems that increased comprehensiveness represents James’ attempt to keep up with the Joneses.&lt;br /&gt;&lt;br /&gt;Another essay is a reprinting of a speech James gave called “Battling Expertise with the Power of Ignorance”.  It’s a good read, but there was a portion that mentioned Pythagorean record and Runs Created that, shockingly, I can’t help but comment on.  &lt;br /&gt;&lt;br /&gt;James describes the two methods as the best-known of the “large number of heuristic rules” he developed during what could loosely be defined as his &lt;u&gt;Abstract&lt;/u&gt; years.  About the Pythagorean theorem, James said: “Later research has demonstrated that it works better still if you modify the exponent for the level of scoring”.&lt;br /&gt;&lt;br /&gt;The relationship between the level of scoring and the slope of a run-to-win converter has been known for many years (dating back at least to Pete Palmer), but Clay Davenport’s Pythagenport was the first well-known modification to James’ Pythagorean theorem.  
Later, this was refined further into Pythagenpat by recognizing the minimum theoretical exponent was one.&lt;br /&gt;&lt;br /&gt;James freely acknowledges the refinements to Pythagorean and in fact has used Pythagenpat in at least one of his own studies.  That only makes it all the more strange that he continues to cling to Runs Created, even when RC has been demonstrated to be a less accurate tool than Pythagorean with a fixed exponent.  RC is subject to complete meltdown under theoretical extreme conditions; while Pythagorean incorrectly handles the known point of 1 RPG, it correctly imposes a range of [0, 1] on all of its estimates.&lt;br /&gt;&lt;br /&gt;Discussing RC, James made no mention of subsequent work on other run estimators.  Understand that I am not trying to claim that he had any obligation to do so in the context of this speech--only that his continuing clinging to RC while recognizing other refinements to his original tools grows more bizarre as time passes.  I suppose one could argue that at least the Pythagorean refinements maintain the original R^x/(R^x + RA^x) model, and James always pointed out that an exponent other than two could result in more accurate estimates.  In any event, it’s extremely hard for me not to comment on a run estimator when the opportunity arises.&lt;br /&gt;&lt;br /&gt;While James recognized the existence of variable exponent refinements to Pythagorean record, he unfortunately missed a golden opportunity to utilize them in another essay in the book.  There is an article in which James examines the performance of starting pitchers when supported by X runs--basically, an attempt to examine the mystical phenomenon of “pitching to the score”.  I won’t steal his thunder by discussing his conclusions, but I will point out a methodological shortcoming in his approach.  &lt;br /&gt;&lt;br /&gt;When Whitey Ford’s teams scored one run with him pitching, their record (not Ford’s record) was 10-28.  
James converts this to an effective rate of runs allowed using Pythagorean math.  In this case we know that Ford’s teams scored 38 runs (one for each game), so the equivalent number of runs Ford allowed to produce a Pythagorean record of 10-28 is x in the following equation:&lt;br /&gt;&lt;br /&gt;10/(10 + 28) = 38^2/(38^2 + x^2)&lt;br /&gt;&lt;br /&gt;This eventually simplifies to x = sqrt(L/W)*R, or sqrt(28/10)*38 = 63.6.  With two runs, Ford’s teams were 19-22, which is equivalent to sqrt(22/19)*82 = 88.2 runs.  Adding these up and dividing by the total number of games produces an “Effective Runs Allowed Rate” for Ford in games in which his team scored one or two runs: (63.6 + 88.2)/(38 + 41) = 1.92.  Continuing in this manner for scoring three, four, five, … runs (while ignoring shutouts, which are always losses), James has a measure of pitching effectiveness given the level of offensive support on a discrete game-by-game basis.&lt;br /&gt;&lt;br /&gt;However, the use of a fixed exponent severely distorts things by essentially assuming an average run scoring environment (an exponent of 2 corresponds to an RPG of around 10.9 using Pythagenpat), when we know the scoring output of one of the teams involved.  If one team scored only one run, the expected RPG is going to be lower than average.&lt;br /&gt;We could assume that an average number of runs would be scored by the other team involved in the game, and instead say that the RPG is 5.5, and the Pythagorean exponent should be around 1.64.  In that case, the equivalent runs allowed would be (28/10)^(1/1.64)*38 = 71.2 runs, a 12% difference from James’ estimate.&lt;br /&gt;&lt;br /&gt;That approach assumes that we know nothing about the “other” team’s run scoring rate--but of course, we know a great deal about it, because we know the identity of the starting pitcher: Whitey Ford.  
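&lt;br /&gt;&lt;br /&gt;The two Ford calculations above (a fixed exponent of 2, then an assumed 5.5 RPG) can be reproduced with a short Python sketch.  The function names are invented for illustration, and the 0.29 power in the Pythagenpat exponent is an assumption, chosen because it matches the figures quoted above (an exponent of 2 at roughly 10.9 RPG, and 1.64 at 5.5 RPG).  Treat it as a rough illustration rather than a definitive implementation.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
def pythagenpat_exponent(rpg, power=0.29):
    """Pythagenpat exponent for a given runs per game (both teams combined).

    The 0.29 power is an assumption inferred from the figures in the text.
    """
    return rpg ** power


def equivalent_runs_allowed(wins, losses, runs_scored, exponent=2.0):
    """Solve W/(W + L) = R^x/(R^x + RA^x) for RA, giving RA = (L/W)^(1/x) * R."""
    return (losses / wins) ** (1.0 / exponent) * runs_scored


# Ford's teams went 10-28 when scoring exactly one run (38 runs in 38 games).
fixed = equivalent_runs_allowed(10, 28, 38)  # fixed exponent of 2

# Assume the opponent scores an average 4.5 runs, so RPG = 1 + 4.5 = 5.5.
x = pythagenpat_exponent(1 + 4.5)
variable = equivalent_runs_allowed(10, 28, 38, x)

print(round(fixed, 1), round(x, 2), round(variable, 1))
```
&lt;/pre&gt;Passing a different RPG to the hypothetical pythagenpat_exponent helper gives the equivalent runs allowed under any other assumed scoring level.&lt;br /&gt;&lt;br /&gt;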
For his career, Whitey Ford had a Run Average of 3.14 and averaged 6.94 innings/start in a league that averaged about 4.31 runs/game, so we could estimate that his team’s RA is (6.94*3.14 + (9 - 6.94)*4.31)/9 = 3.41, and that the expected RPG for a game in which his offense scores one run is 4.41, producing a Pythagorean exponent of 1.54 and (28/10)^(1/1.54)*38 = 74.2 run equivalent.  This new estimate is approximately 17% higher than James’ original estimate.&lt;br /&gt;&lt;br /&gt;The good news is that when you extend things across the entire spectrum of the run distribution, much of the distortion is canceled out.  James presents complete breakouts for several pitchers, but the two of historical interest are Ford and Tom Seaver.  Setting aside the adjustments he introduces to smooth the data and restate the effective runs allowed rate on the actual RA scale, let me just run the crude totals for those two under three assumptions: the James approach of a Pythagorean exponent of 2, an assumption that the RPG at each scoring level is 4.5 plus the number of runs the pitcher’s team scored, and the customized type assumption I described for Ford above (Seaver had a career RA of 3.15 and averaged 7.38 innings/start in a league that averaged approximately 4.11 runs per game, resulting in a customized team RA of 3.32 runs/game):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-PELg8tWKGrI/TfU6483RTAI/AAAAAAAAAzE/Iizd0FN1jGM/s1600/fordseaver.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-PELg8tWKGrI/TfU6483RTAI/AAAAAAAAAzE/Iizd0FN1jGM/s1600/fordseaver.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;These differences are small, in the neighborhood of 1%, and thus not worth getting too worked up about.  
However, it’s important to keep in mind that fixed Pythagorean will not work particularly well at the extremes, and it would be a mistake to put a lot of confidence in the isolated application of the fixed exponent Pythagorean estimate to an extreme RPG.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-4288728603886805406?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/4288728603886805406/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/06/comments-on-bill-james-solid-fools-gold.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/4288728603886805406'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/4288728603886805406'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/06/comments-on-bill-james-solid-fools-gold.html' title='Comments on Bill James&apos; “Solid Fool’s Gold”'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-PELg8tWKGrI/TfU6483RTAI/AAAAAAAAAzE/Iizd0FN1jGM/s72-c/fordseaver.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-3367598134906631274</id><published>2011-06-08T17:24:00.001-04:00</published><updated>2011-06-08T17:24:53.023-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Yahoo Box Scores'/><category scheme='http://www.blogger.com/atom/ns#' 
term='Whimsy'/><title type='text'>Great Moments in Yahoo! Box Scores</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-Os_mkrLrXKU/Te_oilyM-3I/AAAAAAAAAzA/V59UdW3mg5Y/s1600/missingone.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="253" width="400" src="http://4.bp.blogspot.com/-Os_mkrLrXKU/Te_oilyM-3I/AAAAAAAAAzA/V59UdW3mg5Y/s400/missingone.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Screencap grabbed at 5:22 for a game that ended around 3:00.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-3367598134906631274?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/3367598134906631274/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/06/screencap-grabbed-at-522-for-game-that.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3367598134906631274'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3367598134906631274'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/06/screencap-grabbed-at-522-for-game-that.html' title='Great Moments in Yahoo! 
Box Scores'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-Os_mkrLrXKU/Te_oilyM-3I/AAAAAAAAAzA/V59UdW3mg5Y/s72-c/missingone.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-4088646115311689497</id><published>2011-06-07T00:03:00.051-04:00</published><updated>2011-06-07T00:03:00.356-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scorekeeping'/><title type='text'>Scoring Self-Indulgence, pt. 3: Outs in Play</title><content type='html'>The scoring of outs in play is of course heavily dependent on the traditional numbering system for the fielders.  Since the use of those familiar position numbers is a part of just about every scoring system I’ve seen (&lt;a href="http://walksaber.blogspot.com/2010/06/keeping-score-with-ll-bean.html"&gt;LL Bean’s pictorial system&lt;/a&gt; as the exception), the way I score outs in play is very similar to the way everyone else scores outs in play.  The only real difference is the particular field location modifiers I use, but those are easier to discuss in their own separate post near the end of this series.&lt;br /&gt;&lt;br /&gt;The two principles that I attempt to adhere to when scoring an out in play (these are batters retired without reaching safely, or hitting into fielder’s choices that force their teammates out, not baserunning outs) that are a little different than the systems that some people use are:&lt;br /&gt;&lt;br /&gt;1. No dashes to indicate throws.  All of the position codes are one number (a convenient result of having only nine fielders).  
If two of them end up next to each other, I can deduce that there was a throw 99% of the time.&lt;br /&gt;&lt;br /&gt;Instead, I use a dash on those occasions in which the ball goes between two fielders involved in recording an out for their team without a throw.  The most common is a deflection by the pitcher, but you can have crazy scenarios where balls are deflected between outfielders and caught and the like.&lt;br /&gt;&lt;br /&gt;2. If there’s a throw involved, it’s an infield groundout, unless otherwise noted.  If it’s an unassisted out, it’s in the air, unless otherwise noted (I make an exception for first basemen.  Three with no elaboration is always a groundout on my scoresheet, which admittedly violates this rule).  If it’s in the air, it’s a flyball/popup rather than a line drive unless otherwise noted.&lt;br /&gt;&lt;br /&gt;Based on these rules, here are some sample scoreboxes for certain outs.  For all of the examples, I’ve excluded any pitch scoring, because it would just distract from the main focus of the post.  
Just pretend all these outs are first pitch swinging:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-WZrh1VXhKYg/TeqGtNA06MI/AAAAAAAAAxs/nNVZ-cAKca8/s1600/scoreNine.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-WZrh1VXhKYg/TeqGtNA06MI/AAAAAAAAAxs/nNVZ-cAKca8/s320/scoreNine.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here is a simple groundout to short, scored the way everyone does except for those who insist on using dashes.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-DWkHhvFKdg0/TeqGunUwZYI/AAAAAAAAAxw/tEzzZTvssGE/s1600/scoreTen.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-DWkHhvFKdg0/TeqGunUwZYI/AAAAAAAAAxw/tEzzZTvssGE/s320/scoreTen.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here’s where the dash comes in--a deflected ball.  The dash comes after the player doing the deflecting, in this case the pitcher.  The second baseman recovers it and throws to first for the out.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-KBUxZzKjUvU/TeqGwwagI3I/AAAAAAAAAx0/WQWPIHpxdnE/s1600/scoreEleven.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-KBUxZzKjUvU/TeqGwwagI3I/AAAAAAAAAx0/WQWPIHpxdnE/s320/scoreEleven.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In this case, the T indicates tag; the first baseman recorded the out not by stepping on the base but by physically tagging the batter-runner.  On certain baserunning plays, the tag is implied (caught stealings are an obvious example), but I’ll deal with that in the appropriate section.  You could also have “T3”, which would be a groundout fielded by the first baseman, who tags the runner himself.  
Sometimes you’ll see “T1” as well.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-WobhSHCHcQ0/TeqGzYunl-I/AAAAAAAAAx4/qPMsHe2zwFg/s1600/scoreTwelve.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-WobhSHCHcQ0/TeqGzYunl-I/AAAAAAAAAx4/qPMsHe2zwFg/s320/scoreTwelve.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The “SH” indicates that this will be scored as a sacrifice hit, catcher to first base.  I do not include the bunt symbol because the SH designation communicates that.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-j3B9reo1ntM/TeqG1GGLASI/AAAAAAAAAx8/c8mikn5sWGA/s1600/scoreThirteen.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-j3B9reo1ntM/TeqG1GGLASI/AAAAAAAAAx8/c8mikn5sWGA/s320/scoreThirteen.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If it’s a bunt, but not a sacrifice hit, then I use the squiggly line modifier that indicates bunt.  In this case, the third baseman threw out the batter at first on the bunt attempt.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-sDq-epBXs5M/TeqIJKyrHeI/AAAAAAAAAyM/CKyS8Cuspzo/s1600/scoreFourteen.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-sDq-epBXs5M/TeqIJKyrHeI/AAAAAAAAAyM/CKyS8Cuspzo/s320/scoreFourteen.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I use a “DP” modifier to indicate double plays either on groundballs, line drives, or flyballs (we’ll see those later)--basically, double plays in which there was a force out (I realize my usage of that term is not necessarily in full compliance with the rule book definition).  I do not note a double play on a strikeout/caught stealing or on a runner thrown out attempting to tag, even if these technically are double plays as well.  
The scoring is done in such a manner that someone reading the sheet can ascertain the double play, but there is no “DP” code employed.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-rQPGkEErwQc/TeqG5H4GCPI/AAAAAAAAAyA/D8mu2dB72U4/s1600/scoreFourteenX.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-rQPGkEErwQc/TeqG5H4GCPI/AAAAAAAAAyA/D8mu2dB72U4/s320/scoreFourteenX.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;My symbol for a groundout is a straight line underneath the fielder’s number, but the only situation in which I actually end up using this is an unassisted play in which the pitcher tags first base for the out.  Other groundballs are implied by the use of throws to record the outs; even an unusual event in which an outfielder fields a groundball and throws the batter-runner out at first or records a fielder’s choice is clearly suggested to be a groundball by the indicated throw.  If for some reason a second baseman or someone else ended up running to first and tagging the base unassisted, they’d get the underline modifier as well.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-N5tNQ2s_wdw/TeqG6tq_K8I/AAAAAAAAAyE/xer5C4nzGWE/s1600/scoreFourteenY.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-N5tNQ2s_wdw/TeqG6tq_K8I/AAAAAAAAAyE/xer5C4nzGWE/s320/scoreFourteenY.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Fielder’s choices involve the batter reaching base safely, so I’ll cover them in the section on scoring plays where the batter reaches.  However, a special case is the two-out fielder’s choice.  While the batter is not himself retired, he also never actually takes his place as a runner, and so I don’t make it look as if he does in the scoresheet.  In that case, I write the scoring of the play large across the box as I would for any other out.  
This one was third to second, obviously.  You’ll note that the dot for the out is omitted, because the batter was not retired.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-84-38dkAQug/TeqG8Bs0sBI/AAAAAAAAAyI/_Zd5qSpDZv8/s1600/scoreFourteenZ.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-84-38dkAQug/TeqG8Bs0sBI/AAAAAAAAAyI/_Zd5qSpDZv8/s320/scoreFourteenZ.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I don’t use this one much, but the caret below the groundball indicates a chopper.  I only use it for serious ones that John McGraw would approve of.  Here, the third baseman was able to make the play anyway.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-UmswFC5Godo/TeqIM7ToSOI/AAAAAAAAAyQ/WTxY7Y3yvHc/s1600/scoreFifteen.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://3.bp.blogspot.com/-UmswFC5Godo/TeqIM7ToSOI/AAAAAAAAAyQ/WTxY7Y3yvHc/s320/scoreFifteen.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;On to flyouts.  This is your garden variety fly ball to center field.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-EYhQGF7lLrE/TeqIODVJ9pI/AAAAAAAAAyU/CSA6rvtyEWU/s1600/scoreSixteen.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-EYhQGF7lLrE/TeqIODVJ9pI/AAAAAAAAAyU/CSA6rvtyEWU/s320/scoreSixteen.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If a fly is caught in foul territory, “`” is the symbol I use to indicate it.  
This is a foul to left field.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-Hjxvy7HVe5o/TeqIQXgV-hI/AAAAAAAAAyY/hWAaHtrE-II/s1600/scoreSeventeen.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-Hjxvy7HVe5o/TeqIQXgV-hI/AAAAAAAAAyY/hWAaHtrE-II/s320/scoreSeventeen.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I use a curved line segment over the position number to indicate a fly ball. I waive this for all positions except first base, where “3” alone indicates a groundout.  For any other position, an unassisted putout is always assumed to be a popup for an infielder and a flyball for an outfielder (in any event, I don’t distinguish between pops and flies).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-4QCuqs4pLhg/TeqIT78Ff1I/AAAAAAAAAyc/xZMRANQc6HU/s1600/scoreEighteen.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-4QCuqs4pLhg/TeqIT78Ff1I/AAAAAAAAAyc/xZMRANQc6HU/s320/scoreEighteen.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Since flies get a curved line above the fielder’s number, a line drive gets a straight line in the same location.  This one was caught by the shortstop.  &lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-k1Pqf0WvxQM/TeqIXnHbzgI/AAAAAAAAAyg/U8ONZn6ogm0/s1600/scoreNineteen.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://3.bp.blogspot.com/-k1Pqf0WvxQM/TeqIXnHbzgI/AAAAAAAAAyg/U8ONZn6ogm0/s320/scoreNineteen.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here is a regular flyout to right that is scored a sacrifice fly, indicated by an “SF” prefix.  
Any of the out modifiers could be combined when sensible--one could have a line drive sacrifice fly in foul territory, I suppose.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-eURsMWXa6wA/TeqJHoOFE0I/AAAAAAAAAyk/Z-gDtb6zAPE/s1600/scoreTwenty.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-eURsMWXa6wA/TeqJHoOFE0I/AAAAAAAAAyk/Z-gDtb6zAPE/s320/scoreTwenty.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Some balls are somewhere in between popups and line drives.  When I used letters rather than symbols to indicate ball trajectory, I called these “loopers (LP)”.  Now I use the flyball curve plus the line drive line to indicate them.  I never score outfield loopers, and I never score base hits as loopers.  Only infield outs; obviously this one was snagged by the second baseman.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-LD_YsBxM6Ag/TeqJI_QA5MI/AAAAAAAAAyo/IoR9ppjkCCg/s1600/scoreTwentyOne.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-LD_YsBxM6Ag/TeqJI_QA5MI/AAAAAAAAAyo/IoR9ppjkCCg/s320/scoreTwentyOne.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The “IF” prefix here indicates “infield fly”; that is, the infield fly rule has been invoked &lt;b&gt;and&lt;/b&gt; the catch was not completed cleanly.  This makes it a pretty rare code, but I have had to use it a couple of times, and you’d encounter it a lot more at lower levels of the game.  
Obviously the first baseman was credited with the putout on this one.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-yT2WUwYskt8/TeqJKCBsTEI/AAAAAAAAAys/KQeZcnj0LSw/s1600/scoreTwentyTwo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-yT2WUwYskt8/TeqJKCBsTEI/AAAAAAAAAys/KQeZcnj0LSw/s320/scoreTwentyTwo.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here is a popped up bunt snagged by the catcher in foul territory (none of the examples here are sacrifices or else the bunt would be indicated by “SH” and not by the bunt symbol).  You can tell it’s not a grounder because that would involve a throw or a tag (and in this case because it’s marked as a foul).  The exception is a first baseman. If I record this:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-yC7gXVEWfOY/TeqJMhrWCCI/AAAAAAAAAyw/lwFqTTMlxn8/s1600/scoreTwentyThree.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-yC7gXVEWfOY/TeqJMhrWCCI/AAAAAAAAAyw/lwFqTTMlxn8/s320/scoreTwentyThree.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It indicates a groundball bunt with the play made unassisted at the bag by the first baseman.  If it was a popup to the first baseman, the fly arch would be above the number.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/--57l7fzbGfw/TeqJOdp1YQI/AAAAAAAAAy0/Zz4_PccVeEA/s1600/scoreTwentyFour.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/--57l7fzbGfw/TeqJOdp1YQI/AAAAAAAAAy0/Zz4_PccVeEA/s320/scoreTwentyFour.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This is an example of a line drive double play where the shortstop catches a liner and flips to second for the out, ending the inning.  
There is no solid out dot because the batter was not retired.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-oEt50rAx_0Q/TeqJPfIFjYI/AAAAAAAAAy4/xNqouA3CCaM/s1600/scoreTwentyFive.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-oEt50rAx_0Q/TeqJPfIFjYI/AAAAAAAAAy4/xNqouA3CCaM/s320/scoreTwentyFive.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In this case, the batter flies to right, and subsequently a runner is doubled off his base.  That portion of the play is recorded in the runner’s box; the DP here just lets us know that there was a double play somewhere.  Again, I do not record a DP symbol when the out is recorded by a runner attempting to advance on a non-force play.  If a runner is thrown out after tagging at third and attempting to score, there is no record of this made in the batter’s scorebox--you'll just see a regular flyout symbol.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-4088646115311689497?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/4088646115311689497/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/06/scoring-self-indulgence-pt-3-outs-in.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/4088646115311689497'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/4088646115311689497'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/06/scoring-self-indulgence-pt-3-outs-in.html' title='Scoring Self-Indulgence, pt. 
3: Outs in Play'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-WZrh1VXhKYg/TeqGtNA06MI/AAAAAAAAAxs/nNVZ-cAKca8/s72-c/scoreNine.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-2839037654905756438</id><published>2011-05-26T18:20:00.002-04:00</published><updated>2011-05-26T18:20:44.631-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Yahoo Box Scores'/><category scheme='http://www.blogger.com/atom/ns#' term='Whimsy'/><title type='text'>Great Moments in Yahoo! Box Scores</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-0EfB0hZ_tOM/Td7SLeW9cAI/AAAAAAAAAxk/KVpavO5oizA/s1600/oaklaa.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="233" width="400" src="http://3.bp.blogspot.com/-0EfB0hZ_tOM/Td7SLeW9cAI/AAAAAAAAAxk/KVpavO5oizA/s400/oaklaa.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-2839037654905756438?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/2839037654905756438/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/05/great-moments-in-yahoo-box-scores_26.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2839037654905756438'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/12133335/posts/default/2839037654905756438'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/05/great-moments-in-yahoo-box-scores_26.html' title='Great Moments in Yahoo! Box Scores'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-0EfB0hZ_tOM/Td7SLeW9cAI/AAAAAAAAAxk/KVpavO5oizA/s72-c/oaklaa.jpg' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-5199910779293157351</id><published>2011-05-19T03:18:00.000-04:00</published><updated>2011-05-19T03:18:00.406-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Meanderings'/><category scheme='http://www.blogger.com/atom/ns#' term='Book Reviews'/><title type='text'>Meanderings</title><content type='html'>&lt;i&gt;Print Baseball Encyclopedias&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;As I grow older, I try to stay alert to warning signs of old-fogeyism.  One or two such signs are not particularly concerning--they can just be written off as personal quirks/eccentricities, which we all possess to one degree or another.  A prime example for me is cell phones.  I hate the things, and I always have.  I finally got one, only because it was cheaper than paying for a landline, and if there's one thing I hate more than cell phones, it's spending money on any type of phone.&lt;br /&gt;&lt;br /&gt;When it comes to baseball, one of the possible signs I've noticed is my continuing love for print encyclopedias.  
I think it's great that we have Baseball-Reference, Retrosheet, the National Pastime Almanac, the Baseball-Databank, and the like, and obviously there are countless advantages to computerized data that you and I take advantage of every day.  Still, I have yet to warm up to the idea of going to Baseball-Reference, clicking on a page, following a link somewhere else, and wasting an hour or two just wandering in the statistical record of the game.  I still do this all the time with print encyclopedias.  This post is a tribute/review of them.&lt;br /&gt;&lt;br /&gt;Of course, the print encyclopedia is a dinosaur.  It always was a bit of a wonder that one could publish a multi-thousand page book, carrying a hefty hardcover price, and sell enough of them to make it a worthwhile business endeavor, especially with annual or semi-annual editions.  Perhaps they never really earned their keep anyway, but they should have.&lt;br /&gt;&lt;br /&gt;The advent of computerized equivalents has driven the print encyclopedia out of existence (although apparently the erstwhile &lt;u&gt;ESPN Baseball Encyclopedia&lt;/u&gt; is still being shopped to publishers).  If that is the inevitable cost of progress, then so be it--I wouldn't give up my Lahman database to get a new edition of &lt;u&gt;Total Baseball&lt;/u&gt; if that was what it would take.  
Still, I miss the print encyclopedias--and it seems as if other people do too.&lt;br /&gt;&lt;br /&gt;As I write this (New Year's Eve), the current cheapest prices listed on Amazon.com for a copy of the final edition of each of the printed encyclopedias (new or used) are:&lt;br /&gt;&lt;br /&gt;* &lt;u&gt;Macmillan&lt;/u&gt; (10th edition, 1996): $44.99&lt;br /&gt;&lt;br /&gt;The 9th edition is available for as little as $25.&lt;br /&gt;&lt;br /&gt;* &lt;u&gt;Sports Encyclopedia: Baseball&lt;/u&gt; (2007 edition): $123.08&lt;br /&gt;&lt;br /&gt;The 2006 edition is available for as little as $3.31.&lt;br /&gt;&lt;br /&gt;* &lt;u&gt;Total Baseball&lt;/u&gt; (8th edition, 2004): $99.65&lt;br /&gt;&lt;br /&gt;The 7th edition is available for as little as $3.58.&lt;br /&gt;&lt;br /&gt;* &lt;u&gt;ESPN Baseball Encyclopedia&lt;/u&gt; (5th edition, 2008): $95.80&lt;br /&gt;&lt;br /&gt;The 4th edition is available for as little as $1.73.&lt;br /&gt;&lt;br /&gt;* &lt;u&gt;STATS All-Time Major League Handbook&lt;/u&gt; (2nd edition, 2000): $3.99&lt;br /&gt;&lt;br /&gt;This one is the exception; it is not really an iconic book, as it only went through two editions and presumably had the smallest print run of any of the five.&lt;br /&gt;&lt;br /&gt;I'm not sure if these prices reflect actual demand for the books in question, or whether sellers think they have something valuable and are setting the price above the intersection of the demand and supply curves.  Assuming that it is a real phenomenon, it suggests that there are a fair number of people who miss the print encyclopedias so much that they are willing to pay a high price just to have the final update.&lt;br /&gt;&lt;br /&gt;I have at least one copy of each of the big four (excluding the STATS book from that designation) on my bookshelf at all times.  Of the four, the two that I use most are &lt;u&gt;ESPN&lt;/u&gt; and &lt;u&gt;Sports Encyclopedia: Baseball&lt;/u&gt;.  Of all of the encyclopedias, I have to count SE:BB as my favorite.  
It's certainly not the most statistically complete or the best-edited, but it's the only one of the four that breaks from the career register format and instead presents a season rosters format. &lt;br /&gt;&lt;br /&gt;I've always felt that the season rosters lend themselves better to browsing than the career registers.  (This is the part where the readers scream, "With a computer you can have both!")  Not only does it allow one to look at team composition and track changes from year to year, it allows one to view an entire league-season on 2-4 pages, making it much easier to get the big picture for a season.&lt;br /&gt;&lt;br /&gt;The SE:BB is not without flaws, of course.  The book is filled with typos, many of which were presumably there from the first edition to the last.  Two quick examples, both from the 1994 edition (although I'd be very surprised if they were corrected in later updates):&lt;br /&gt;&lt;br /&gt;* Johnny Kling is listed as "Johnny King" with the roster for the 1901 Cubs, and in the 1901-19 Batter Register (later Cub seasons correctly list him as "Kling").&lt;br /&gt;&lt;br /&gt;* The header for the 1972 NLCS says "Cincinnati (west) 3 Pittsburg (East) 2".  Perhaps if this was a listing for 1882, it could be considered authentic to the times.&lt;br /&gt;&lt;br /&gt;There have to be dozens of similar errors throughout the book, none of which are damning to its utility as a baseball reference but all of which do build up to an uneasy feeling of neglect.  Still, the charms of the book overcome that for my money.&lt;br /&gt;&lt;br /&gt;Like its cousin, the &lt;u&gt;Macmillan&lt;/u&gt;, the statistical selection in SE:BB was formed at its first publication (1969 for Big Mac, 1974 for SE:BB).  OBA is nowhere to be found, nor is CS or pitcher home runs allowed.  Fractional innings pitched are rounded, and the typesetting varies throughout the book, making some sections more difficult to read.  
Sometimes space requires severe truncating of batting lines--Dick McAuliffe went 7-27 as a 20 year old left-handed hitter for the 1960 Tigers, but that's all you can find out.&lt;br /&gt;&lt;br /&gt;The &lt;u&gt;ESPN&lt;/u&gt; encyclopedia, edited by Gary Gillette and Pete Palmer, is my favorite of the three career register works.  Mostly this is because it is the most recent, superseding &lt;u&gt;Total Baseball&lt;/u&gt;.  For the most part, the statistical selection is the same as TB.  In both cases, I'd love to have a better offensive rate than OPS+, and I think they tried too hard with respect to fielding categories, but both give the basic categories necessary to build standard statistics.  &lt;br /&gt;&lt;br /&gt;&lt;u&gt;Total Baseball&lt;/u&gt; is unique because of the volume of the text that accompanies the statistics--short biographies of notable players, team histories, a history of sabermetrics, and a bunch of other articles that changed from edition to edition.  More than any of the other encyclopedias, the article turnover created a reason to buy each new edition (other than, of course, the updated statistics).  &lt;br /&gt;&lt;br /&gt;The &lt;u&gt;Macmillan&lt;/u&gt; must be given respect due to its status as the pioneer; the research that went into producing it has been incorporated by every serious baseball historical work of any stripe since that time.  As an encyclopedia, though, its heyday was the first edition.  It soon had SE:BB as a competitor, and with the two including essentially the same basic data, the (IMO) superior format of SE:BB made it an unfair fight.  Macmillan also played fast and loose with changing statistics for silly ends.  
Later editions cut this out, and added some interesting data like team home/road splits and sketchy Negro League records, but by that time &lt;u&gt;Total Baseball&lt;/u&gt; was on the scene.&lt;br /&gt;&lt;br /&gt;The STATS &lt;u&gt;All-Time Major League Handbook&lt;/u&gt; was the most thorough encyclopedia for individual statistics, but as such it is the one that has taken the biggest hit from the existence of Baseball-Reference.  No other encyclopedia offered complete batting, pitching, and fielding data (including all of the minor categories like GDP and sacrifice hits allowed), but the sheer volume of data sapped the book of any character it might have otherwise had.  While Big Mac had standings and playoff records and the like, and &lt;u&gt;Total Baseball&lt;/u&gt; had all of that and the articles, there was no room in the &lt;u&gt;Handbook&lt;/u&gt; for anything other than the player career register.  The ancillary material was shuffled off into an equally large &lt;u&gt;All-Time Sourcebook&lt;/u&gt;.&lt;br /&gt;&lt;br /&gt;While the massive print encyclopedia may be something of a relic, I do think it would be wonderful if it could live on.  Obviously I know nothing about the real-world feasibility of what I am about to spout, but it would be great to see an organization like SABR step up to the plate and subsidize an updated print encyclopedia (even if it had to be in PDF format, as SABR has done with the &lt;u&gt;Emerald Guide&lt;/u&gt;) every half-decade or so.  Eventually the desire for such a tome might be foreign to even the crustiest old baseball historians, but I think it's safe to say that day is still several decades off into the future.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Standard Deviation of Franchise W%&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Speaking of electronic encyclopedias, this is the type of exercise that they make a breeze, which previously would have been an arduous chore.  
I figured these a while ago with the intent of using them in some other discussion, but that never materialized so I'll dump them here.&lt;br /&gt;&lt;br /&gt;These charts simply show the standard deviation of full-decade W% for each major league franchise.  I have criticized the use of decades as a line of demarcation for baseball statistics in the past, but this is not a thorough analytical endeavor and they do provide an easy, straightforward manner of categorization.  I have defined the decade here as 1901-1910, 2001-2010, etc., not because I have any particularly strong feelings on the matter of decade division but because it works better since 1) it includes 2010 and 2) the first decade thus defined corresponds with the American League's 1901 ascension to major league status.  &lt;br /&gt;&lt;br /&gt;There are four different standard deviations shown for each decade--"whole" is the StD for teams that completed the entire decade.  This is fairly arbitrary, as it allows the 1961 AL expansion teams but excludes the 1962 NL expansion teams (the four 1969 expansion teams are obviously excluded as well).  "All" is the StD for all franchises that played in the decade, even if it was for as little as one season (actually, the shortest in-decade tenure is two years for the 1969 expansion teams).  "1901s" is the StD for the sixteen franchises that have played continuously since 1901.  While they now make up just over half of MLB, they at least provide a constant frame of reference throughout the century.  "Expan", as you might figure, is the StD for whichever of the fourteen expansion franchises competed in a given decade.  
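&lt;br /&gt;&lt;br /&gt;For anyone who wants to reproduce the four groupings, here is a minimal Python sketch of the calculation described above.  It is an illustration under stated assumptions rather than the spreadsheet actually used for the charts: the input rows, the function names, the choice of the population standard deviation, and the use of pooled decade W% (total wins over total decisions) are all assumptions on my part.

```python
from collections import defaultdict
from statistics import pstdev


def decade_of(year):
    """Label a season by the first year of its decade (1901-1910, 1911-1920, ...)."""
    return (year - 1) // 10 * 10 + 1


def decade_wpct(seasons):
    """Pool (franchise, year, wins, losses) rows into decade totals.

    Returns {decade: {franchise: (w_pct, seasons_played)}}.
    Pooled W% (total W over total W + L) is an assumption, not a sourced choice.
    """
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0, 0]))
    for franchise, year, wins, losses in seasons:
        entry = totals[decade_of(year)][franchise]
        entry[0] += wins
        entry[1] += losses
        entry[2] += 1
    return {
        decade: {f: (w / (w + l), n) for f, (w, l, n) in teams.items()}
        for decade, teams in totals.items()
    }


def decade_spreads(seasons, originals):
    """Standard deviation of decade W% for the four groups named in the post.

    originals is the set of franchises treated as the sixteen "1901s".
    Population StD (pstdev) is assumed; an empty group yields None.
    """
    out = {}
    for decade, teams in decade_wpct(seasons).items():
        groups = {
            "whole": [p for p, n in teams.values() if n == 10],
            "all": [p for p, n in teams.values()],
            "1901s": [p for f, (p, n) in teams.items() if f in originals],
            "expan": [p for f, (p, n) in teams.items() if f not in originals],
        }
        out[decade] = {k: pstdev(v) if v else None for k, v in groups.items()}
    return out
```

Feeding it one row per franchise-season (from the Lahman database, say) and the set of sixteen original franchises gives one dictionary of four values per decade.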
&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-p4H1AMyFNEg/TdQ8WjXj0cI/AAAAAAAAAxQ/PyohLphDjFY/s1600/ency1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-p4H1AMyFNEg/TdQ8WjXj0cI/AAAAAAAAAxQ/PyohLphDjFY/s1600/ency1.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;By this measure, the 1980s and 90s stand out as very competitive periods in the game, and the 2000s were a step back from that.  However, the standard deviation of franchise W% in the last decade was essentially the same as in the 1950s and 60s, and still well under the norm for most of history.&lt;br /&gt;&lt;br /&gt;The next chart gives the average W% for teams by decade broken down into 1901s and expansion teams.  It also lists the best and worst franchise W%s for the decade, but those lists include only the teams that played ten seasons in each decade:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-pLCrdY3QXwY/TdQ8YusjIII/AAAAAAAAAxU/UPHttK5hGWw/s1600/ency2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-pLCrdY3QXwY/TdQ8YusjIII/AAAAAAAAAxU/UPHttK5hGWw/s1600/ency2.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In the 1980s, expansion teams actually had a slightly better record than the 1901s, but they have lost ground in the last twenty years.  Of course, most of the big city teams are 1901s, with the major exception being the Angels.  The spread between the best team W% and worst was higher in the 2000s than it had been since the 1960s, but I wouldn't attempt to make anything out of it.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Two Team Cities&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;During a bout of encyclopedia browsing, I noticed that the two Boston teams both had dreadful 1906 seasons.  The Braves were 49-102, but the now-Red Sox were even worse, losing three more games (49-105).  
I made the mistake of pointing this out on Twitter and saying that it "had to be the worst" such record.&lt;br /&gt;&lt;br /&gt;Of course, it didn't have to be anything, and it isn't.  It is only the third-worst combined record by teams in the same city since 1901.  While I'm sure someone has done this before, a quick search turned up nothing.  I considered Brooklyn to be New York (meaning that from 1903-1957 New York had three teams), and I considered the Angels/Dodgers and Giants/A's as sharing a city (when applicable).  The ten worst single-season records for the two or three teams combined:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-5rGBm5cfdp4/TdQ8ZoTWuzI/AAAAAAAAAxY/eKm3dWUpMDc/s1600/ency3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-5rGBm5cfdp4/TdQ8ZoTWuzI/AAAAAAAAAxY/eKm3dWUpMDc/s1600/ency3.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;At least Boston 1906 was the worst in something, as it was the worst non-Philadelphia combined record.  Philly has seen some bad records over the years, but none worse than 1919 when the Phillies were 47-90 and the A's were 36-104.  The worst years for each of the two-team cities other than Boston and Philadelphia were St. Louis 1913 (108-195, .356), Chicago 1948 (115-191, .376), Bay Area 1979 (125-199, .386), New York 1965 (127-197, .392), and Los Angeles 1992 (135-189, .417).&lt;br /&gt;&lt;br /&gt;The best records are:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-NHyZPyq6QR0/TdQ8a7qPlVI/AAAAAAAAAxc/odIJ7nyMNms/s1600/ency4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-NHyZPyq6QR0/TdQ8a7qPlVI/AAAAAAAAAxc/odIJ7nyMNms/s1600/ency4.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Four of these top ten featured a crosstown World Series, led by the 1906 victory by the White Sox over the Cubs; the others are St. 
Louis 1944, New York 1951 (Giants/Yankees as the Dodgers dropped the three-game NL playoff), and New York 1952 (this time Dodgers/Yankees).  The banner years for the other cities were Boston 1915 (184-119, .607), Philadelphia 1913 (184-120, .605) and Los Angeles 2009 (192-132, .593).&lt;br /&gt;&lt;br /&gt;The overall records for each city (for years in which they had multiple teams) are:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-1OFJpL2EXzg/TdQ8bjT1dyI/AAAAAAAAAxg/TUKbN-Jy_SI/s1600/ency5.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-1OFJpL2EXzg/TdQ8bjT1dyI/AAAAAAAAAxg/TUKbN-Jy_SI/s1600/ency5.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The cities in which the combined record has been good still have two teams; the ones in which they were poor do not.  Shocking but true.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-5199910779293157351?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/5199910779293157351/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/05/meanderings.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5199910779293157351'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5199910779293157351'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/05/meanderings.html' title='Meanderings'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-p4H1AMyFNEg/TdQ8WjXj0cI/AAAAAAAAAxQ/PyohLphDjFY/s72-c/ency1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-6950367148082923129</id><published>2011-05-07T13:49:00.000-04:00</published><updated>2011-05-07T13:49:21.565-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Horse Racing'/><category scheme='http://www.blogger.com/atom/ns#' term='Whimsy'/><title type='text'>Great Moments in New York Post Horse Racing Info</title><content type='html'>&lt;a href="http://1.bp.blogspot.com/-ZO2Hd8wTh5g/TcWGDGINVWI/AAAAAAAAAxA/l6KWm_sRQDs/s1600/notlal.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="369" width="272" src="http://1.bp.blogspot.com/-ZO2Hd8wTh5g/TcWGDGINVWI/AAAAAAAAAxA/l6KWm_sRQDs/s400/notlal.jpg" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-6950367148082923129?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/6950367148082923129/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/05/great-moments-in-new-york-post-horse.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6950367148082923129'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6950367148082923129'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/05/great-moments-in-new-york-post-horse.html' title='Great Moments in New York Post Horse Racing 
Info'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-ZO2Hd8wTh5g/TcWGDGINVWI/AAAAAAAAAxA/l6KWm_sRQDs/s72-c/notlal.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-4860213615479739194</id><published>2011-05-07T01:13:00.001-04:00</published><updated>2011-05-07T01:13:33.299-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Yahoo Box Scores'/><category scheme='http://www.blogger.com/atom/ns#' term='Whimsy'/><title type='text'>Great Moments in Yahoo! Box Scores</title><content type='html'>&lt;a href="http://1.bp.blogspot.com/-Z4JEmoFQ9mA/TcTUfzrfrqI/AAAAAAAAAw4/9hMh6W3kE5A/s1600/stanton.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="200" width="400" src="http://1.bp.blogspot.com/-Z4JEmoFQ9mA/TcTUfzrfrqI/AAAAAAAAAw4/9hMh6W3kE5A/s400/stanton.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;They'd better make sure this issue doesn't carry over to my fantasy team.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-4860213615479739194?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/4860213615479739194/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/05/great-moments-in-yahoo-box-scores_07.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/4860213615479739194'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/12133335/posts/default/4860213615479739194'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/05/great-moments-in-yahoo-box-scores_07.html' title='Great Moments in Yahoo! Box Scores'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-Z4JEmoFQ9mA/TcTUfzrfrqI/AAAAAAAAAw4/9hMh6W3kE5A/s72-c/stanton.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-1459687217831697466</id><published>2011-05-04T22:31:00.000-04:00</published><updated>2011-05-04T22:31:50.644-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Yahoo Box Scores'/><category scheme='http://www.blogger.com/atom/ns#' term='Whimsy'/><title type='text'>Great Moments in Yahoo! 
Box Scores</title><content type='html'>&lt;a href="http://1.bp.blogspot.com/-MPrYQacU0UU/TcILyWTLu-I/AAAAAAAAAww/tMoWaXvXBVs/s1600/nododgers.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="150" width="400" src="http://1.bp.blogspot.com/-MPrYQacU0UU/TcILyWTLu-I/AAAAAAAAAww/tMoWaXvXBVs/s400/nododgers.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I guess the Dodgers' trouble meeting payroll came to fruition sooner than anticipated.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-1459687217831697466?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/1459687217831697466/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/05/great-moments-in-yahoo-box-scores.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/1459687217831697466'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/1459687217831697466'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/05/great-moments-in-yahoo-box-scores.html' title='Great Moments in Yahoo! 
Box Scores'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-MPrYQacU0UU/TcILyWTLu-I/AAAAAAAAAww/tMoWaXvXBVs/s72-c/nododgers.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-403679746219483824</id><published>2011-05-03T00:20:00.001-04:00</published><updated>2011-06-04T15:19:08.637-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scorekeeping'/><title type='text'>Scoring Self-Indulgence, pt. 2: Scoring Pitches and Strikeouts</title><content type='html'>When I score a game, I almost always keep a pitch-by-pitch record of the game, unless for some reason I have to juggle watching the game with some other task, and will not have the ability to accurately record each and every pitch.  Even when I set out with this as my intention, I often find myself unconsciously scoring the pitches anyway.&lt;br /&gt;&lt;br /&gt;My system for tracking pitches only records the basics--whether it is a ball or a strike, and any of the basic subgroups contained within those two categories (intentional balls/pitchouts, swinging strikes, called strikes, fouls).  Some people attempt to keep track of pitch locations or pitch types; of course, Pitchf/x has rendered this even more of a chore than it was previously, and some people (hello, nice to meet you) just aren’t good enough at observing pitch locations and distinguishing pitch types to even attempt to put in this level of effort.&lt;br /&gt;&lt;br /&gt;The final pitch of a plate appearance is not recorded separately--it is implied by whatever event follows.  
For example, if a batter draws a walk, I don’t record the fourth ball independently of noting the walk.  If a pitch is hit into play, you’ll see a symbol for a base hit, or a groundout, or whatever the case may be.  I don’t see any reason to waste another pencil stroke on spelling this out.&lt;br /&gt;&lt;br /&gt;The left side of the empty scorebox is used to record balls; the right side is reserved for strikes, and the very top (and on the very rare occasions when necessary, the very bottom), with much smaller letters, is where two-strike fouls are recorded.  The order of pitches is indicated by letters of the alphabet--the first pitch is “A”, the second pitch “B”, and so forth.&lt;br /&gt;&lt;br /&gt;Balls usually don’t need any elaboration--intentional balls/pitchouts are the only common subcategory.  I do not distinguish between the two; it is usually pretty obvious which is being employed if you consider the context of the plate appearance and the pitch sequence.  An intentional ball of any stripe is simply circled.&lt;br /&gt;&lt;br /&gt;The other, much less common alteration needed to balls is the automatic ball, on the rare occasion that the umpire makes that call.  The symbol for this is simply a lower case “a” in front of the usual symbol.  For example, “aD” would indicate an automatic ball called on what would have been the fourth pitch.&lt;br /&gt;&lt;br /&gt;There are more modifiers needed for strikes.  Called strikes receive no alteration, while a left bracket “[” is put around the outside of a foul and a left brace “{” is put around the outside of a swinging strike.  The foul symbol is not used with two-strike fouls, since by definition they could be nothing else.  Another modifier I use which can be applied to strikes of all kinds (except two-strike fouls) is circling the letter, which is used in case of a bunt attempt.  
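&lt;br /&gt;&lt;br /&gt;The placement rules above can also be written out as a short Python sketch.  This is only an illustration of the rules as described, not part of my scoring: the function name, the kind labels, and the use of parentheses as a plain-text stand-in for a circled letter are all assumptions, and the final pitch of the plate appearance is left out because it is implied by the result.

```python
from string import ascii_uppercase


def chart_pitches(pitches):
    """Place each non-final pitch in the scorebox region described in the post.

    Each pitch is a (kind, flag) pair. kind is one of "ball", "auto_ball",
    "called", "swinging" or "foul". flag is True for an intentional
    ball/pitchout (on balls) or a bunt attempt (on strikes).
    Parentheses are a hypothetical text stand-in for circling a letter.
    """
    box = {"balls": [], "strikes": [], "two_strike_fouls": []}
    strikes = 0
    for letter, (kind, flag) in zip(ascii_uppercase, pitches):
        mark = "(" + letter + ")" if flag else letter
        if kind == "ball":
            box["balls"].append(mark)  # left side of the box
        elif kind == "auto_ball":
            box["balls"].append("a" + letter)  # lower case "a" prefix
        elif kind == "foul" and strikes == 2:
            box["two_strike_fouls"].append(letter)  # top of the box, no bracket
        else:
            prefix = {"called": "", "foul": "[", "swinging": "{"}[kind]
            box["strikes"].append(prefix + mark)  # right side of the box
            strikes += 1
    return box
```

A sequence of ball, called strike, automatic ball, foul, foul, ball, foul would land A, aC and F on the left, B and [D on the right, and E and G along the top.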
&lt;br /&gt;A couple of examples will hopefully make this pretty clear:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-q-zIR_9V1Eg/Tb2I3_Ogx3I/AAAAAAAAAwQ/GMbQP3Yn-LU/s1600/scoreOne.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-q-zIR_9V1Eg/Tb2I3_Ogx3I/AAAAAAAAAwQ/GMbQP3Yn-LU/s320/scoreOne.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The first pitch (A) is a garden variety ball.  The second pitch (B) is a called strike.  The pitcher is called for a rare automatic ball (aC) before what would have been the third pitch.  The fourth pitch (albeit the third actually delivered) is a foul.  The fifth pitch (E) is another foul.  The sixth pitch (F) is a ball, and the seventh pitch (G) another foul.  Finally, the batter flies to right on the eighth pitch, for which the pitch is not explicitly noted--the occurrence of a flyout is sufficient to demonstrate its existence.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-wvv60jkLvzk/Tb2I7b41FnI/AAAAAAAAAwU/HaF1Ow_O0Xg/s1600/scoreTwo.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://3.bp.blogspot.com/-wvv60jkLvzk/Tb2I7b41FnI/AAAAAAAAAwU/HaF1Ow_O0Xg/s320/scoreTwo.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In this plate appearance, the batter shows bunt on the first pitch (A) but takes a strike.  The second pitch is a standard ball (B), but the third pitch is a pitchout (C).  
The batter swings and misses on the fourth pitch (D), and does it again on the fifth pitch for a strikeout.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-klqyBs3kY8E/Tb2I_V_j6VI/AAAAAAAAAwY/mMoJ1KEn1Qk/s1600/scoreThree.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-klqyBs3kY8E/Tb2I_V_j6VI/AAAAAAAAAwY/mMoJ1KEn1Qk/s320/scoreThree.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The batter attempts to bunt the first pitch, but he bunts through it for a swinging strike (A).  He attempts to bunt again on the second pitch, but this time he fouls it off (B).  He then takes a ball (C), fouls off a pitch (D), and eventually grounds back to the box.&lt;br /&gt;&lt;br /&gt;There are several different possible symbols for a strikeout that actually becomes an out in my system--a swinging strikeout, a called strikeout, a strikeout where the putout is something other than catcher unassisted, a strikeout on a missed bunt (that is a swinging strikeout on a bunt), and a strikeout on a two-strike foul bunt.  In these examples, I will not include the pitch sequence, since that has already been explained and it would just clutter the scorebox and distract from the out itself.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-95QOHN-c1PM/Tb2JCpZQWAI/AAAAAAAAAwc/SMu_qdNmYig/s1600/scoreFour.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://3.bp.blogspot.com/-95QOHN-c1PM/Tb2JCpZQWAI/AAAAAAAAAwc/SMu_qdNmYig/s320/scoreFour.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This is a standard swinging strikeout.  The solid dot is my universal symbol for an out; any time the batter-runner is retired at any point, the dot will appear somewhere within his scorebox.  
This makes it much easier to quickly see how many outs there are in the inning, and also eliminates some potential confusion in cases in which a certain code could indicate an out or could indicate something else.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-0aJ2VSpbYfI/Tb2JEpPJjpI/AAAAAAAAAwg/7fDEUtKncRw/s1600/scoreFive.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-0aJ2VSpbYfI/Tb2JEpPJjpI/AAAAAAAAAwg/7fDEUtKncRw/s320/scoreFive.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And the inscrutable backwards K for a called third strike.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/-GYgBIbHmVuY/Tb2JG-N4W5I/AAAAAAAAAwk/M7RsS4_x1ls/s1600/scoreSix.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://1.bp.blogspot.com/-GYgBIbHmVuY/Tb2JG-N4W5I/AAAAAAAAAwk/M7RsS4_x1ls/s320/scoreSix.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Sometimes the scoring on a strikeout is something other than the standard catcher unassisted.  By far the most common is the catcher throwing to first for the out (23), although there are other possible and weird ways for this to occur.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-HsB7gtGXAr0/Tb2JJPCNCUI/AAAAAAAAAwo/iN2NAagNRuc/s1600/scoreSeven.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://4.bp.blogspot.com/-HsB7gtGXAr0/Tb2JJPCNCUI/AAAAAAAAAwo/iN2NAagNRuc/s320/scoreSeven.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This is the symbol I use for a foul bunt with a two-strike count, resulting in a strikeout.  
As I’ll show later, the squiggly line is my symbol for a bunt on a ball in play, so the symbol applied to that of the strikeout has a clear meaning.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-PmwQqptSOkc/Tb2JKsAX-UI/AAAAAAAAAws/aEQ1YvpbhAo/s1600/scoreEight.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="173" src="http://2.bp.blogspot.com/-PmwQqptSOkc/Tb2JKsAX-UI/AAAAAAAAAws/aEQ1YvpbhAo/s320/scoreEight.jpg" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If a bunt is attempted but missed for a third strike (that is, the batter offered but did not make any contact), then the brace that indicates a swing for a non-third strike is included in the symbol above to distinguish it from  the more common third strike bunted foul.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-403679746219483824?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/403679746219483824/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/05/scoring-self-indulgence-pt-2-scoring.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/403679746219483824'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/403679746219483824'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/05/scoring-self-indulgence-pt-2-scoring.html' title='Scoring Self-Indulgence, pt. 
2: Scoring Pitches and Strikeouts'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-q-zIR_9V1Eg/Tb2I3_Ogx3I/AAAAAAAAAwQ/GMbQP3Yn-LU/s72-c/scoreOne.jpg' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-5972062327195457228</id><published>2011-04-21T07:14:00.001-04:00</published><updated>2011-04-21T18:02:07.137-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='NFL'/><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Book Reviews'/><title type='text'>Wayne Winston's Mathletics</title><content type='html'>The "book reviews" on this blog are almost always a day late and a dollar short.  They are written and published long after the book itself, and my comments usually amount not to a review but to a springboard from which to discuss other topics.  This one is no different.&lt;br /&gt;&lt;br /&gt;Wayne Winston is a professor of Decision Sciences at Indiana University's business school and a former consultant to the NBA's Dallas Mavericks.  He published &lt;u&gt;Mathletics&lt;/u&gt; in 2009 with the tagline "How Gamblers, Managers, and Sports Enthusiasts Use Mathematics in Baseball, Basketball, and Football."&lt;br /&gt;&lt;br /&gt;If you are a regular reader of this blog or similar material, do not buy this book expecting to learn a lot of new things about sabermetrics.  The sabermetric material is fairly standard and rudimentary--introductory-level discussion of run estimators, park factors, replacement level, the base/out table, win expectancy, and the like.  
I would also not recommend it to a novice, not because it is poor (there are elements I like and dislike, as I'll discuss below), but because there are better resources out there--internet primers, Bennett and Fluck's &lt;u&gt;Curve Ball&lt;/u&gt;, and Lee Panas' &lt;u&gt;Beyond Batting Average&lt;/u&gt; among others.&lt;br /&gt;&lt;br /&gt;I am not particularly well-read on either football or basketball quantitative analysis, so I cannot definitively state the level of Winston's discussion on those topics.  My guess is that the football discussion is fairly basic (with the caveat that football analysis as a field lags behind APBRmetrics), but that the basketball material is much stronger.  It is certainly obvious from the writing that basketball is Winston's passion, and that the adjusted plus/minus ratings are a particular favorite.&lt;br /&gt;&lt;br /&gt;Winston's writing is not particularly strong--he writes like someone whose favorite class was math (as do I).  There are some minor slip-ups in the baseball discussion; these won't mislead the reader, but they also reflect the pedestrian nature of the material:&lt;br /&gt;&lt;br /&gt;* Winston includes a formula for estimating batting outs that accounts for ROE by putting a multiplier on at bats.  But this applies the adjustment to all at bats, including those in which we know a batter did not reach on an error (hits) and those in which the likelihood was very small (strikeouts).&lt;br /&gt;&lt;br /&gt;* He refers to Keith Woolner's statistic as VORPP--Value Over Replacement Player Points.  This makes sense in that he applies the replacement level concept to WPA points, but he also refers to Woolner's run based version as VORPP.  Additionally, he credits the concept of replacement level to Woolner.  In reality, Woolner did much to popularize replacement level, but the concept did not originate with him.&lt;br /&gt;&lt;br /&gt;* Similarly, he credits the concept of park factors to Bill James.  
James had much to do with popularizing the notion that statistics could be corrected for park effect, but if any single person is to be credited with the concept, Pete Palmer would be an easy choice.&lt;br /&gt;&lt;br /&gt;* There is a chapter that discusses player improvement over time by comparing annual performance, but it does so without even really addressing aging and survivor bias.&lt;br /&gt;&lt;br /&gt;* The discussion of strategy is fairly bare-bones and deals only with basic estimates based on a standard run expectancy table.&lt;br /&gt;&lt;br /&gt;There are positive things of similar magnitude to the list of negatives--for example, while he uses Runs Created, he explains that a theoretical team construct is necessary to make accurate player comparisons.  As a whole, the baseball portion of the book is adequate without being excellent for a novice and a yawn for those well-versed in sabermetrics.&lt;br /&gt;&lt;br /&gt;Being a novice myself when it comes to football and basketball analysis, I found the discussion in those chapters much more interesting.  Focusing on a couple of interesting football tidbits, Winston offers a version of the famed two-point conversion chart that incorporates the expected number of possessions remaining in the game.  There is also a formula for the probability of a successful field goal in the NFL based on distance that I found interesting, although the model produces results that are clearly too high for very long kicks.&lt;br /&gt;&lt;br /&gt;There is also a discussion of quarterback ratings, which have always interested me.  Like every other sane person, Winston has little use for the NFL system, focusing his discussion on Berri's rating from &lt;u&gt;Wages of Wins&lt;/u&gt; and his own adaptation of Brian Burke's regression of team categories against team wins.  
Isolating the categories from Burke's equation that can be related directly to individual quarterbacks, Winston offers the following as a quarterback rating:&lt;br /&gt;&lt;br /&gt;1.543*(Yards - Sack Yards)/(Attempts + Sacks) - 50.0957*(Interceptions/Attempts)&lt;br /&gt;&lt;br /&gt;If you factor out and ignore the 1.543 coefficient, and change the second quantity's denominator to (Attempts + Sacks), this can be rewritten as:&lt;br /&gt;&lt;br /&gt;(Yards - Sack Yards - 32.47*Interceptions)/(Attempts + Sacks)&lt;br /&gt;&lt;br /&gt;In this form, Winston's rating is very similar to a number of rating formulas, including the NEWS rating published by Bob Carroll, John Thorn, and Pete Palmer in &lt;u&gt;The Hidden Game of Football&lt;/u&gt;:&lt;br /&gt;&lt;br /&gt;NEWS = (Yards - Sack Yards - 45*Interceptions + 10*Touchdowns)/(Attempts + Sacks)&lt;br /&gt;&lt;br /&gt;Breaking into editorial mode and stepping away from &lt;u&gt;Mathletics&lt;/u&gt; for a moment, the treatment of a touchdown pass can be thought of as somewhat analogous to the sacrifice fly in baseball.  The comparison is strained, as a touchdown pass is always a positive play from any perspective, while a sacrifice fly might actually reduce run expectancy. &lt;br /&gt;&lt;br /&gt;A fairly large number of touchdown passes occur on short passes.  Suppose a quarterback completes a three-yard touchdown pass.  This will actually reduce his rating in Winston's formula, as the quarterback's rating prior to the touchdown will be higher than three.  By giving a positive weight to all passing touchdowns, one could ensure that a touchdown pass always increases the rating.&lt;br /&gt;&lt;br /&gt;However, in doing so, one gives special treatment to the touchdown because it is a tracked category (like sacrifice flies).  One could just as well track "sacrifice grounders" or "first down completions".  
These theoretical categories would also be cases in which a positive or somewhat positive outcome was achieved, but the statistics treat it as a negative (a batting out or a reduction of the passer's rating, assuming the completion was short).  Giving special treatment to the recorded categories can thus be seen as unhelpful and biased toward particular types of players that might be predisposed to one or the other.&lt;br /&gt;&lt;br /&gt;Moving back to the book, most of my comments to this point have focused on the negatives.  However, there are three things that Winston does really well:&lt;br /&gt;&lt;br /&gt;1. Winston provides downloadable spreadsheets for many of the examples.  This allows the reader to follow along with the work and to learn how to carry it out in Excel.  Many of the Excel steps are explained in the text as well.&lt;br /&gt;&lt;br /&gt;The drawback to this is that some of the why behind the math is glossed over in favor of a quick Excel solution.  Winston's rating systems for NBA and NFL teams basically boil down to finding the best-fitting solution for a system of linear equations to predict the point margin in each game.  Winston doesn't explain the math in that manner, though, instead just explaining that the Excel solver is used to minimize error.  While this gives the reader enough detail to produce their own ratings, and no one is actually going to solve hundreds of equations by hand, I personally prefer a stronger emphasis on the underlying math.&lt;br /&gt;&lt;br /&gt;2. The bibliography is excellent, as it includes not just a list of sources but descriptions of what they offer.  For example, this is the description of Phil Birnbaum's &lt;a href="http://sabermetricresearch.blogspot.com/"&gt;Sabermetric Research&lt;/a&gt; blog:&lt;br /&gt;&lt;br /&gt;&lt;i&gt; This is perhaps the best mathletics blog on the Internet.  
Sabermetrician Phil Birnbaum gives his cogent review and analysis of the latest mathletics research in hockey, baseball, football, and basketball.  This is a must-read that often gives you clear and accurate summaries of complex and long research papers.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;3. Winston's description of Birnbaum's blog provides a nice transition into discussing the best thing about his approach.  Winston has excellent academic credentials (he is a professor of Decision Sciences at Indiana and earned a PhD at Yale in Operations Research), but he does not beat you over the head with them.  In fact, I don't think that his doctorate is ever explicitly referenced.&lt;br /&gt;&lt;br /&gt;In any event, Winston mixes the research of other academics into his text, but he gives plenty of space to amateurs as well.  Some academics that enter the sports arena seem to look down their noses at anyone who doesn't hold an advanced degree or a teaching position.  Winston is not one of them.  He even used one of Birnbaum's posts to offer a counterpoint to an academic paper on the NFL draft.&lt;br /&gt;&lt;br /&gt;Winston's book provides a great example of how sabermetric knowledge generated by academics, amateurs, and everyone in between can be integrated, and how all parties can respect and learn from each other.  It also gives analysts specializing in each sport a window into the work being done on other sports.  
Thanks to those attributes, &lt;u&gt;Mathletics&lt;/u&gt; is a worthwhile read.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-5972062327195457228?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/5972062327195457228/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/04/wayne-winstons-mathletics.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5972062327195457228'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5972062327195457228'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/04/wayne-winstons-mathletics.html' title='Wayne Winston&apos;s Mathletics'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-3627697620487751467</id><published>2011-04-13T00:25:00.005-04:00</published><updated>2011-04-14T18:30:35.272-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Offense'/><category scheme='http://www.blogger.com/atom/ns#' term='Book Reviews'/><title type='text'>Comments on Baseball Prospectus 2011</title><content type='html'>At some point it becomes bad sport to write the same thing about an annual book--if there’s a certain characteristic of the book that you find yourself dissatisfied with several years running, it might be a you problem.  
It’s one thing to decide that a certain book is not for you; it’s another to continue to believe that it will be when it’s obvious that the writers have something else in mind.&lt;br /&gt;&lt;br /&gt;Much of what I could say about the &lt;u&gt;Baseball Prospectus&lt;/u&gt; annual for 2011 is the same as what I said about the 2010 and 2009 editions…and so I’ll try to avoid saying it again.  By now, it’s clear that BP is what it is, and that can either be a great thing or a bad thing or a mostly good thing, depending on your perspective.  My perspective is that it’s mostly a good thing--the redeeming qualities of the book outweigh its flaws fairly easily.&lt;br /&gt;&lt;br /&gt;I still felt compelled to jot down a few comments on the book this year because I might have been a little unfair in nitpicking a few things in the past.  Now that there is a lot of new blood on board, it’s more apparent that some of the issues (like stats not matching up between the comments and the data directly above) are systematic, and probably endemic to producing a book of this kind.  To put together a tome of that size in a few months is a massive undertaking, and there are thousands of moving parts, so expecting them all to be dialed in to the same setting is unrealistic.&lt;br /&gt;&lt;br /&gt;The cover still has the infamous phrase that I will not repeat about PECOTA; this is obviously out of the hands of the writers.  They do redeem the cover with a great caption under the little photo of Albert Pujols.&lt;br /&gt;&lt;br /&gt;That being said, I do have one major bone to pick with the new, slimmed down statistical offerings.  It’s great that they stopped doubling up on metrics that measure the same thing (in the past, there have been simultaneous displays of VORP and WARP, or EqA and MLVr), and with one glaring exception the new stat lines still manage to give you most of the key metrics.  
That glaring exception is the lack of any kind of component ERA (or RA, which I’d prefer anyway) figure for pitchers.  &lt;br /&gt;&lt;br /&gt;It’s not simply a matter of limiting your choice to vanilla, while having to leave chocolate, strawberry, and cookies and cream aside (after all, there are a lot of &lt;a href="http://walksaber.blogspot.com/2010/07/flavors-of-component-era.html"&gt;flavors&lt;/a&gt; of component ERA).  There is none whatsoever.  Instead, BP has listed Fair RA, which is a fine metric constructed by Colin Wyers and the primary input for pitcher WARP.  But if the choice is between having Fair RA and a component ERA in a book that is largely aimed toward predicting performance in 2011, it’s not a choice at all.  Sticking with metrics under the BP umbrella, Peripheral ERA and SIERA would fit the bill.&lt;br /&gt;&lt;br /&gt;Of course, if I could strike any category from the pitcher stat line to clear space, it wouldn’t be Fair RA--it would be W-L or saves or WHIP.  But since a big target audience for the book is fantasy players, that is not an option.  However, it leaves everyone (including fantasy players) without a backwards looking metric that gives us the best estimate of the pitcher’s overall effectiveness in the past.  I certainly hope that they will figure out a way to include Peripheral ERA or SIERA or something similar in the 2012 edition.&lt;br /&gt;&lt;br /&gt;PECOTA is in good hands with Colin Wyers, and I’m sure there are still some bugs to be worked out, so please take this comment as more amusement than criticism: some of the PECOTA comps seem way off.  I’m sure this happened in the past, and I didn’t bother to make note of it, but two players that really stood out to me were Gregor Blanco and Nick Franklin.  Blanco’s top comps are Richie Ashburn, Kenny Lofton and Freddy Guzman.  
One of these things is not like the other, and two of them are nothing like Gregor Blanco (Lofton was still in the process of breaking out, but had already established himself as clearly better).  The Franklin comps are more understandable since he’s a younger player with less of a track record, but it’s still an odd juxtaposition to see a player ranked as the #44 prospect in MLB while his top comps are identified as Adrian Beltre (ok), Hank Aaron and Willie Mays.&lt;br /&gt;&lt;br /&gt;There are only a few team entries that have extensive sabermetric (as opposed to applied sabermetric) content.  One of these is the Arizona entry, and sadly I have a bone to pick with it.  The author accepts the mainstream view that Arizona’s copious strikeout totals in recent campaigns had doomed their offense.   He (or she; I still maintain it would be more interesting to know which author is responsible for the team entry) asserts that “when the majority of the lineup falls prey to empty at-bats of this sort, highly volatile run-scoring can result.”&lt;br /&gt;&lt;br /&gt;While there have been some studies done on the relationship between shape of offense and scoring distribution, I am personally unaware of any comprehensive or well-established enough to make a statement like that without the need for supporting evidence.  The only statistic brought in to support that position is that Arizona scored three or more runs in an inning as often as the NL average, but scored two or fewer more often.&lt;br /&gt;&lt;br /&gt;That is a very odd and not particularly helpful way to break down innings, because it lumps scoreless innings in with one and two run innings.  To be absurd for a moment, if an offense never scored three or more runs an inning, and scored 0-2 in 100% of their innings, but 40% of those were one run and 10% were two runs, they would average a healthy 5.4 runs per game.  
It is true that Arizona scored in a smaller proportion of their innings than did the average NL offense--25.9% of Arizona innings resulted in a run scored compared to 26.5% for the league as a whole.  But Arizona was more likely to have a multi-run inning (12.4%) than the average NL team (12.2%).  &lt;br /&gt;&lt;br /&gt;Another odd thing about this perspective is that it makes the inning the unit by which scoring volatility is measured.  It’s true that the best perspective from which to understand how runs are scored is the inning level, since the events that transpire in each inning are independent of those that occurred in previous innings in terms of run scoring (I hope it’s clear that I’m talking about baserunners and outs from one inning affecting each other, not lineups turning over and pitchers being removed and the like, but you never can tell), but from a win/loss perspective, it is the run distribution per game that is crucial.  Admittedly, the two are very closely related, but any time you extend the time period over which such volatility is projected, its impact is reduced. &lt;br /&gt;&lt;br /&gt;One crude but simple and reasonably sensible way to consider the win value of a team’s per game scoring distribution is a method that I call Game Offensive Winning Percentage (gOW%) and have published here for the last three years.  It is based on a Bill James idea; instead of estimating an OW% from average runs scored per game, use the team’s actual distribution of runs scored.  If in a given season teams that score one run win 11.8% of the time (as they did in 2010), then credit the offense with .118 wins for each game in which they score exactly one run.  Repeat for all scoring levels and average and you have an alternative OW%.&lt;br /&gt;&lt;br /&gt;There are of course flaws with this method--the unit of games doesn’t always represent the same things (i.e. 
there are not always 27 outs per game), the use of the actual W% by runs scored in any given season is subject to sample size fluctuations, there is no adjustment for park, etc.--yet it’s still reasonable to think that if a team’s run distribution was particularly unusual, it would manifest itself in a comparison of gOW% to standard OW% based on average runs per game (in this case, without a park adjustment so as to better match gOW%).&lt;br /&gt;&lt;br /&gt;The Diamondbacks led the NL in strikeouts in 2009 and 2010 and were second in 2008.  In 2007, they ranked eleventh (and made the playoffs, see!), so those three seasons are the relevant high strikeout seasons for the team.  In 2008, Arizona’s gOW% was .485 while their OW% was .479--considering their run distribution rather than just their average suggests an additional win.  In 2009, it was .484/.483--no difference.  In 2010, the split was .492/.502, which is -1.6 wins.  So for the three years considered together, the net total is -.5 wins.&lt;br /&gt;&lt;br /&gt;Of course, this does not conclusively demonstrate that Arizona’s offense was as efficient as a typical offense with their scoring average, and it certainly doesn’t allow us to make any statements about the effect of high strikeout offenses generally.  However, neither does anything offered or referenced in the BP essay, yet the author chose to make much stronger assertions than I would dare to here.&lt;br /&gt;&lt;br /&gt;My comments on strikeouts should not be taken as a negative judgment of the book as a whole--my book “reviews”, such as they are, generally serve as an opportunity to discuss issues raised by the author rather than to offer a summary judgment on the book itself.  
By now, you already know whether BP is a book for you or not.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-3627697620487751467?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/3627697620487751467/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/04/comments-on-baseball-prospectus-2011.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3627697620487751467'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3627697620487751467'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/04/comments-on-baseball-prospectus-2011.html' title='Comments on Baseball Prospectus 2011'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-5714378935954774480</id><published>2011-04-05T01:06:00.000-04:00</published><updated>2011-04-05T01:06:00.537-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Scorekeeping'/><title type='text'>Scoring Self-Indulgence, pt. 1</title><content type='html'>When I have occasion to write something on paper, I usually use a pen.  It’s easier that way--ball-point pens are ubiquitous and cheap; you can sign things with them; and now that the hideous scourge of blue ink has faded a bit, they no longer result in an assault on one’s sensibilities every time they are used (okay, that last one should say “my sensibilities”).  
In truth, I like pencil better, specifically a mechanical pencil with .5 lead.  I use the real cheap Bic ones exclusively, and have for years--you know, the ones that are supposed to be disposable, but you can hold the clicker down and push the replacement lead in through the top.  You can get ten of them at Wal-Mart for $2.&lt;br /&gt;&lt;br /&gt;The pluses of the ball-point pen allow me to save that favorite writing utensil for only the most important tasks, ones that just can’t be entrusted to the terrifying permanence of ink.  For most of the winter, it sits undisturbed on my bookshelf or in a pencil holder or wherever--but sometime in March, I have occasion to take it out and put it to use, and I don’t stop until mid-autumn.  &lt;br /&gt;&lt;br /&gt;You have probably surmised by now that the important task to which I refer is scorekeeping.  Yes, the existence of internet gametrackers has made the collection of data for one’s own perusal something less than a necessity if one would like access to real-time information on a game, and to the extent that people do want to keep their own score, electronic applications are pushing pencil and paper aside.  And admittedly, those of us who keep score not just at the ballpark but in the privacy of our own homes have always been a rare breed and prime targets for the nerd label.  &lt;br /&gt;&lt;br /&gt;Still, I have no intention of giving up scorekeeping in the foreseeable future.  It is still true that if you want something done right, you have to do it yourself.  GameDay may have all of the information I need, but it cannot (yet, at least) be customized to display it in the exact manner I have become accustomed to.  If you want to save it for posterity, a GameDay printout lacks any sort of sentimentality whatsoever.  
And I might be part of a dying breed, but if I want to give my full undivided attention to the ballgame, the last thing I need to be doing is puttering around on the computer between pitches.&lt;br /&gt;&lt;br /&gt;If this reads as a half-hearted defense of scorekeeping, I have accomplished what I set out to do with this post.  For one thing, I don’t really need to justify my hobby to you; I just feel compelled to put in a good word for the practice every once in a while.  I’ve never understood why announcers sometimes feel compelled to give you basic information about the sequence of plays in a game--information that they are tasked with providing--by prefacing it with “If you are keeping score at home…”  Of course, this is a hanging curveball set up for the announcing partner, who gets to jump in and make a snide comment about what kind of deviants would be doing that.  Considering that those of us that keep score are the least likely subset of fans to turn the game off when it’s 14-2 in the bottom of the eighth...&lt;br /&gt;&lt;br /&gt;But the other reason that it can be difficult to espouse the virtues of scorekeeping is that scorekeeping is a very personal pursuit.  Everyone has their own technique, their own special symbols built around the familiar position numbers that have united the vast majority of scorecards from the 1890s or so on.  (Except for the occasional early twentieth century flip-flopping of 5 and 6 for third base and shortstop.)  This makes it difficult to generalize--I might say that I love keeping score because I could quickly get a precise count of how many balls the hapless Ranger pitchers had thrown while Neftali Feliz waited in the bullpen…but your scoresheet might not tell you that.  
Instead, it might tell you who won the sausage race.&lt;br /&gt;&lt;br /&gt;There are displays of the variety and innovation in individual scorekeeping out there online, but not to an extent that I consider sufficient, so last year I asked people to send me their scoresheets for posting on my scorekeeping blog, &lt;a href="http://weeklyscoresheet.blogspot.com/"&gt;Weekly Scoresheet&lt;/a&gt;.  Several people graciously accepted my invitation, but I was foolish enough to make the initial request during the offseason, when even compulsive scorekeepers weren’t particularly likely to have an example sitting around. (*)  So if you’re interested in sharing now, please send me an email.&lt;br /&gt;&lt;br /&gt;Weekly Scoresheet has a whopping total of six subscribers on Google Reader, which is completely understandable--a personal scorekeeping blog is a vanity blog, plain and simple.  Unfortunately, I haven’t been updating it recently because I no longer have a scanner at home, and it just isn’t a big enough priority for me to buy one even though this is 2011 and they are cheap.  Eventually, Weekly Scoresheet will be back in full swing to bore the five people who read it with my own chicken-scratched records of ballgames.&lt;br /&gt;&lt;br /&gt;In the meantime, though, I’ll be using this space to occasionally run a tutorial on my scoring system and walk through a sample game from the 2010 season.  Calling it “my scoring system” is a misnomer--there's certainly nothing groundbreaking about it and most of the symbols are drawn from other people’s systems--but the great thing about scorekeeping is that the precise combination of data you record, codes you use, and the like is fairly unique.  I would not encourage anyone to learn to score a game in the way I do, not because I don’t think it’s a decent system but because I would encourage you to organically develop your own that fits your needs and interests as a baseball fan.  
As such, an explanation and tutorial is ultimately just a way to fill up space and pad the post count.  I’ll enjoy writing it; if you enjoy reading it, so much the better.&lt;br /&gt;&lt;br /&gt;(*) Sadly, I have to admit that I spend a not insignificant amount of free time in February doodling scoring of imaginary games, using Excel to design some new scoresheets that I’ll never use because I continue to use the same basic sheet (and I do mean basic) that I have used for over a decade, wondering why the calendar can’t turn to March so that spring training games can be scored, and other such pursuits.&lt;br /&gt;&lt;br /&gt;I will begin simple, with a look at one of my blank scoresheets--it's just a bunch of empty blocks in a 9x9 grid.  I originally made this in the DOS Brief text editor prior to the 1998 season, using a lot of “_” and “|” symbols.  The lines weren’t solid, so I eventually traced them down for 1999 or so.  Later I would scan it as a PDF and touch it up a little bit in Photoshop, but it still has non-perfect lines which to me grants it a little character that you don’t get from using a computer to draw the lines precisely.  I have a couple facsimile versions created in Excel (which I use now for any new scoresheets I make--it might not be as capable graphically as some other programs but making grids is something that spreadsheet software does very well), but there’s nothing like the original article for me.  &lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-YD97O-HUdLw/TZiZfo8lWQI/AAAAAAAAAwI/5_FkW7g2-Gk/s1600/blanksheet.jpg" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="400" width="303" src="http://2.bp.blogspot.com/-YD97O-HUdLw/TZiZfo8lWQI/AAAAAAAAAwI/5_FkW7g2-Gk/s400/blanksheet.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;* Why a 9x9 grid?  
You do realize that the average team doesn’t even use half of those scoreboxes in a game, right?&lt;br /&gt;&lt;br /&gt;One of the great things about Project Scoresheet is that it pioneered the use of numbered boxes rather than a box for every batter to hit in every inning.  This was a great way to conserve space, but it also makes it harder to view the inning as a standalone unit.  I prefer to see each inning on its own.  Yes, a team batting around is a mild inconvenience, and splits up an inning into two columns, but I’ve never understood why some people freak out about it and start crossing off the inning headings and pushing each inning down a column.&lt;br /&gt;&lt;br /&gt;* No room for statistical lines (AB-R-H-BI and the like)?&lt;br /&gt;&lt;br /&gt;Nope. I don’t think that standard compiled statistics for a single game tell you much of anything, and the sort of boilerplate box score is something that is very easy to obtain online (although some of them &lt;a href="http://walksaber.blogspot.com/search/label/Yahoo%20Box%20Scores"&gt;aren’t so accurate&lt;/a&gt;).  I’m already sacrificing space by using the 9x9 format; I don’t want to waste any more on stat lines.  Plus, if you fill them in as the game goes on, it’s another distraction and you have a bunch of ugly tallies on your sheet.  If you wait until the game is over to finish your scoresheet, then it’s just work.&lt;br /&gt;&lt;br /&gt;* No diamonds?&lt;br /&gt;&lt;br /&gt;Nope, I’ve never liked them.  They’re great if you just want a quick snapshot of where the runners are, but if you’re trying to record a lot of detail on how runners advanced, they get in the way.  
I split the box up into four corners for each of the bases, but I don’t think a visual aid like a diamond is necessary to accomplish this.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-5714378935954774480?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/5714378935954774480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/04/scoring-self-indulgence-pt-1.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5714378935954774480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/5714378935954774480'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/04/scoring-self-indulgence-pt-1.html' title='Scoring Self-Indulgence, pt. 
1'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-YD97O-HUdLw/TZiZfo8lWQI/AAAAAAAAAwI/5_FkW7g2-Gk/s72-c/blanksheet.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-6868558622508619671</id><published>2011-03-30T00:02:00.004-04:00</published><updated>2011-03-30T00:02:00.786-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Predictions'/><title type='text'>2011 Predictions</title><content type='html'>There are many great things to be said about internet publication.&amp;nbsp; It’s free, it’s instantaneous and any boring person like myself can get his thoughts out there and have them found by interested parties.&amp;nbsp; The unfortunate thing about it is that any idiot can find it too.&amp;nbsp; If you write in print or for a price, idiots still read your work (or worse yet, get secondhand accounts that substitute for reading), but there is a higher barrier to entry than a lucky Google search.&lt;br /&gt;&lt;br /&gt;No matter how many disclaimers you put on a post, the reader can always ignore them.&amp;nbsp; The reader can quote you out of context by means of a simple copy and paste that any semi-coordinated five-year old could manage.&amp;nbsp; No matter how careful you are, there is a decent chance that your disclaimers will be ignored and your words will be turned on their head.&lt;br /&gt;&lt;br /&gt;Making the disclaimer more strenuous will do no good, naturally, but it’s the only option available to the blogger.&amp;nbsp; If you find this site via a Google search and fail to read and understand the next few sentences, that's on you, not me.&lt;br /&gt;&lt;br /&gt;These 
predictions are offered in the spirit of fun, not science or anything resembling science.&amp;nbsp; They are one man’s opinion, one man’s wild guess at an unknowable future.&amp;nbsp; If you do not believe that the future is unknowable, if you think that you are blessed with some special insight that allows you to predict the outcome of pennant races, you are likely incapable of understanding this point, but nevertheless: even if you could predict exactly how many runs a team would score and allow over the course of the upcoming season, you still could do no better than predict their win total to within a standard error of four games.&lt;br /&gt;&lt;br /&gt;Of course, some folks’ predictions will be more accurate for a single season.&amp;nbsp; Some will be better over the long haul, too, although expecting a persistent and consistent performance is folly.&amp;nbsp; It is quite possible that I am poorer than most at this exercise.&lt;br /&gt;&lt;br /&gt;Understand, though, that the picks that follow are the product of one man’s feeble mind.&amp;nbsp; They do not in any way, shape or form reflect the predictions of “sabermetrics”.&amp;nbsp; They are not based on any sabermetrically rigorous procedure; I do some very crude estimates of team runs scored and allowed based on freely available projections, but then I substitute my own opinion wherever I feel like it.&amp;nbsp; If you want predictions based on a more rigorous sabermetric procedure, you need to visit &lt;a href="http://www.baseballprospectus.com/fantasy/dc/"&gt;Baseball Prospectus&lt;/a&gt;&amp;nbsp;or the &lt;a href="http://www.rlyw.net/index.php/RLYW/direct/the_2011_diamond_mind_projection_blowout_-_american_league_edition"&gt;Replacement Level Yankee Weblog&lt;/a&gt;&amp;nbsp;or somewhere, anywhere else--not this blog.&lt;br /&gt;&lt;br /&gt;Another point I make every year but which is hopelessly lost on the mental midgets who invariably stumble upon this page is that the format distorts my true feelings.&amp;nbsp; I 
have limited myself to the format of ordinal standings rank not because it is a better format, but because to do what is more telling, offering expected wins and probabilities, requires the use of a rigorous procedure to be remotely credible, and then it becomes harder to shrug it off as a fun, seat-of-the-pants exercise.&lt;br /&gt;&lt;br /&gt;Given the constraints of the ordinal prediction format, one is forced to predict which team will win the NL Central, and which will finish second, and all down the line.&amp;nbsp; In reality, I have no idea who will win the NL Central.&amp;nbsp; I believe that three clubs--Cincinnati, Milwaukee, and St. Louis--are essentially equal in their chances.&amp;nbsp; A fourth, Chicago, could certainly win if things broke their way.&amp;nbsp; And while it may be unlikely, one cannot give true zero odds to the possibility of a miracle in Houston or Pittsburgh.&amp;nbsp; Add it all up, and I wouldn’t be willing to give you better than a 35% chance that any single team wins.&amp;nbsp; But making an ordinal prediction gives the impression that the predictor believes that the team listed first has an expected outcome of first place. &amp;nbsp;The intelligent reader and prognosticator understand the difference and don’t need to have it spelled out.&lt;br /&gt;&lt;br /&gt;Anyway, I’ve done all I could, spilling 650 words on why what follows should not be taken seriously.&amp;nbsp; Someone with Google will invariably stumble upon this and be unwilling or unable to understand anyway.&amp;nbsp; Oh well:&lt;br /&gt;&lt;br /&gt;AL EAST&lt;br /&gt;&lt;br /&gt;1. Boston&lt;br /&gt;2. New York (wildcard)&lt;br /&gt;3. Tampa Bay&lt;br /&gt;4. Baltimore&lt;br /&gt;5. Toronto&lt;br /&gt;&lt;br /&gt;I agree with the consensus view--the Red Sox look to be the best team in baseball.  
Their offense should be great, their starting pitching might have only one truly safe bet in Jon Lester but also features four guys with the potential to pitch really well, and their bullpen has three big arms (albeit perhaps with two empty heads) at the back.  Too much is being made of the Yankees' pitching concerns, I believe.  There are a lot of teams expected to contend with question marks at the back of their rotation, but since we're conditioned to see New York run out four guys with track records, it looks worse than it is.  I was very tempted to pick the Rays second, but it would be a dishonest pick for shock value.  This should still be a very competitive ballclub, and there are a couple of divisions I probably would pick them to win.  The mainstream is spending too much time fretting about losing Pena, Garza and Bartlett and not enough recognizing that their performances in 2010 were far from irreplaceable.  Crawford is obviously a much bigger loss, as is their bullpen's splendid 2010 performance, which couldn't be replicated even if they were all still in St. Pete.  The Orioles are not nearly as close as they seem to think they are, but I thought they'd be better last year and meeting those expectations could be enough for fourth place.  I like Alex Anthopoulos' moves as much as the next guy, but I liked much of what Jack Zduriencik did last winter too.  I don't think the Blue Jays project to be a better team than they were last year, and their homer-happy offense is almost surely unsustainable.  If you had to fall into the silly cliché trap and pick “this year's Mariners”, you could do worse than Toronto.&lt;br /&gt;&lt;br /&gt;AL CENTRAL&lt;br /&gt;&lt;br /&gt;1. Chicago&lt;br /&gt;2. Minnesota&lt;br /&gt;3. Detroit&lt;br /&gt;4. Cleveland&lt;br /&gt;5. Kansas City&lt;br /&gt;&lt;br /&gt;I've had it in for the White Sox organization for a long time, mostly focused on their manager.  
It's more a personality thing than a baseball thing, but I'd be lying if I said that I've always been able to successfully separate the two.&amp;nbsp; So it is not with any sort of relish that I predict they'll win the AL Central.  But their starting pitching is excellent, and while I wouldn't want their 2013 offense, their 2011 offense should be fine.  The Twins did nothing, which is usually not a great sign.  Even if Justin Morneau is healthy, I'm not sure he's a great bet to provide much more value than he did in 2010, when he was arguably the best hitter in the league for half a season.  My crude spreadsheet actually gives Minnesota an insignificant edge over Chicago, so I see these two clubs as quite close.  The Tigers are spending money without any particular target in mind, or at least that is the impression they give off.  They are a contender in this division, but they represent the second tier.  The second tier is small because it does not include the Indians or the Royals.  Neither poses much of a threat, and I won't rehash my thoughts on the Tribe here.  Dayton Moore's farm system is a universally-acknowledged jewel, but the man has yet to provide much evidence that he is capable of constructing a winning major league team, although this wasn't really the year to try.  One can easily imagine five-star prospects surrounded with the Jeff Francoeurs and Melky Cabreras and Pedro Felizs of 2014 (that would be a fun exercise--who is the next Jeff Francoeur?  Who was the last Jeff Francoeur?).&lt;br /&gt;&lt;br /&gt;AL WEST&lt;br /&gt;&lt;br /&gt;1. Texas&lt;br /&gt;2. Oakland&lt;br /&gt;3. Los Angeles&lt;br /&gt;4. Seattle&lt;br /&gt;&lt;br /&gt;The Rangers are my reluctant pick--I love to pick against pennant winners without sterling regular season records, because the mainstream halo effect for playoff success is far too strong.  
Like the talk of the Twins dangling Liriano, the decision to keep Feliz in the bullpen gives off a sense of complacency and overconfidence.  However, Texas does look like the strongest club on paper. Of course, games aren't played on paper, and so if the great clubhouse influence of the sainted team-player Michael Young is dispatched, they will plummet to 100 losses.  The A's made an effort, but their offense still makes them hard to pick.  They also don't have a strong Buster Posey candidate to emulate the 2010 trick of their baymates.  The big splash Tony Reagins promised for the Angels was fulfilled if you think about “big splash” in terms of Olympic diving.  The Pythagorean fairy better bring some extra pixie dust.  The Mariners have to score more runs, don't they?  Don't they?&lt;br /&gt;&lt;br /&gt;NL EAST&lt;br /&gt;&lt;br /&gt;1. Atlanta&lt;br /&gt;2. Philadelphia (wildcard)&lt;br /&gt;3. New York&lt;br /&gt;4. Florida&lt;br /&gt;5. Washington&lt;br /&gt;&lt;br /&gt;Four aces = unbeatable!  The pitching fetish that still looms large in the conventional narrative demands that tribute be paid to the Phillies, but would it be sacrilege to point out that pitchers often get hurt, and an offense built on a bunch of players on the wrong side of thirty whose best player is a major injury question might be a little vulnerable?  Just checking.  The Braves have been my irrational NL pick for some time now; since they finally rewarded my faith with a playoff berth, there's no reason to stop now.  The bullpen is due for some regression, but the starting pitching looks fine and they should score some runs.  The Mets look like a .500 team, which means the ratio of lamenting how bad they are to reality will be way out of whack.  The Marlins seem stuck in neutral, even if they hadn't traded Uggla; even if you believe in success cycle theory, it's no one's birthright to win the World Series on a six-year cycle.  
This might be the MLB division with the most delusional owners; the Mets apparently put all their eggs in one basket, Loria thinks his world title is two years overdue, and the Nationals think that Jayson Werth + Stephen Strasburg + Bryce Harper = 2012 contention without a lot of downside risk.  Good luck to all.&lt;br /&gt;&lt;br /&gt;NL CENTRAL&lt;br /&gt;&lt;br /&gt;1. St. Louis&lt;br /&gt;2. Milwaukee&lt;br /&gt;3. Chicago&lt;br /&gt;4. Cincinnati&lt;br /&gt;5. Pittsburgh&lt;br /&gt;6. Houston&lt;br /&gt;&lt;br /&gt;My crude projections have some kind of borderline irrational love for the Cardinals.  Perhaps it's the fact that they ignore fielding; perhaps there's some truth there.  Picking them to win after the Wainwright injury in an already close division may seem foolish, but what the heck?  At least I can point to the fact that they're not alone--PECOTA also puts St. Louis on top, albeit by an insignificant margin.  The Brewers have shown a tendency to go for it with gusto by trading for starting pitchers, and while I wouldn't want my team doing the same, it could very well pay off in this division; and for the moment, fate has smiled upon them.  The Cubs probably aren't as close to winning this division as the Garza trade suggests they think, but a lot went wrong for them last year (some of it of their own volition, mind you), obscuring the fact that in 2009 they were kind of in the mix.  My own intuition tells me that the picks for this division are scrambled, but the spreadsheet really doesn't like the Reds that much, and I can see why.  Their offense still has issues at short and in left, Scott Rolen was a big contributor in 2010 but at this stage isn't the most reliable guy around, Joey Votto isn't really Albert Pujols Jr., and I'm not really a Drew Stubbs believer.  
They don't really have a leadoff hitter (which I point out not because the leadoff role is particularly important but because they either don't have or don't trust high OBA guys without power), and you still have to wonder about Dusty's ability to push the right buttons if things don't go according to plan.  The starting pitching has solid depth, but they also lack front line starters barring a Cueto or Volquez breakout.  They are certainly the team I've picked fourth in a division that I think has the best chance to win it--again, that's the inherent peril in using an ordinal standing prediction approach.  It should go without saying that they're much closer to St. Louis on paper than they are to Pittsburgh.  The Pirates should not lose 100 games again, and the Astros might be as good a bet as any team in the game to do so.&lt;br /&gt;&lt;br /&gt;NL WEST&lt;br /&gt;&lt;br /&gt;1. San Francisco&lt;br /&gt;2. Los Angeles&lt;br /&gt;3. Colorado&lt;br /&gt;4. San Diego&lt;br /&gt;5. Arizona&lt;br /&gt;&lt;br /&gt;I really don't like picking the Giants.  Winning the World Series doesn't wipe the fact that their playoff berth was not secured until the final Sunday of the season out of existence, and their offense is still quite suspect.  However, the Rockies didn't do much to improve; I wouldn't be surprised to see serious regression from Carlos Gonzalez, and my crude projection attempt sees them as a .500 team.  The Dodgers could surprise some people since expectations are not particularly high.  Their pitching staff should be one of the best of the NL (at least before park adjustments), but there are a number of weak/questionable spots on offense.  The Padres obviously traded away their best player without much coming back to help them in 2011; still, they should be respectable.  
The Diamondbacks may think that shedding strikeouts is the magic elixir to increased offense, but you still should expect to score more runs with Mark Reynolds as your third baseman than Melvin Mora.&lt;br /&gt;&lt;br /&gt;WORLD SERIES&lt;br /&gt;&lt;br /&gt;Boston over Atlanta&lt;br /&gt;&lt;br /&gt;While I think they are far from a juggernaut (just as I thought during last year’s postseason), I would have to admit that the Phillies are the team most likely to win the NL pennant and the NL East.&amp;nbsp; However, my estimation of their chance to do so is so much lower than that of the mainstream that to pick them to win and thus replicate everybody's prediction would be boring.  I almost feel the same way about Boston--the scattered talk about winning 110 games is irrational.  However, I think Boston stands out from the pack more than Philly, and picking against your head twice is just a waste of everyone's time.  My crude standings projections have the Giants second in the NL behind the Phillies, but picking them would be even worse in terms of overvaluing the previous postseason and assuming repeats, so the mantle falls to the Braves.&lt;br /&gt;&lt;br /&gt;AL Rookie of the Year: SP Jeremy Hellickson, TB&lt;br /&gt;It doesn't feel like he should be eligible, but he is.&lt;br /&gt;AL Cy Young: Jon Lester, BOS&lt;br /&gt;Let's try this again.  I picked him last year and he had a fine season, but not a spectacular one.  
&lt;br /&gt;AL MVP: 1B Adrian Gonzalez, BOS&lt;br /&gt;I always like to pick a newcomer for MVP if possible (voters love shiny things), and Fenway should help the mainstream perception of his performance if nothing else.&lt;br /&gt;NL Rookie of the Year: 1B Freddie Freeman, ATL&lt;br /&gt;Having a job is half the battle.&lt;br /&gt;NL Cy Young: Clayton Kershaw, LA&lt;br /&gt;I was going to pick Zack Greinke, but missing a few starts is a hurdle to overcome.&lt;br /&gt;NL MVP: SS Troy Tulowitzki, COL&lt;br /&gt;You know that the writers are just dying for an excuse to give this to Ryan Howard.&lt;br /&gt;&lt;br /&gt;First manager fired: Jim Riggleman, WAS&lt;br /&gt;Spending all that money on Jayson Werth leads me to conclude that the powers that be in Washington think they're a stronger club than they are.&lt;br /&gt;Best pennant race: NL Central&lt;br /&gt;Worst pennant race: AL West&lt;br /&gt;Worst team in each league: KC, HOU&lt;br /&gt;Most likely to go .500 in each league: OAK, CHN&lt;br /&gt;Team in each league most likely to disappoint mainstream consensus: DET, CIN&lt;br /&gt;Team in each league most likely to surprise mainstream consensus: TB, LA&lt;br /&gt;Most obnoxious stories of the year: Michael Young, Pujols' contract status, Nolan Ryan taught the Rangers pitchers to do X, Josh Hamilton, whether various starters turned relievers should be used in a sane manner (Chapman, Joba, Feliz)...basically the Texas Rangers, win or lose.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-6868558622508619671?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/6868558622508619671/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/03/2011-predictions.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6868558622508619671'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6868558622508619671'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/03/2011-predictions.html' title='2011 Predictions'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-378387840095977254</id><published>2011-03-22T01:21:00.008-04:00</published><updated>2011-03-22T01:21:00.095-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Park Factors'/><title type='text'>Historical Park Factors, 1901-2008</title><content type='html'>I have posted an updated spreadsheet with park factors for all teams, 1901-2008, as a &lt;a href="https://spreadsheets.google.com/pub?hl=en&amp;hl=en&amp;key=0AnPJbQnlHhRHdFRJRHFTTzM5X1BHajQyRkE3d29XZGc&amp;output=html"&gt;Google Spreadsheet&lt;/a&gt;. These are five-year park factors, calculated in the same manner I describe on &lt;a href="http://gosu02.tripod.com/id103.html"&gt;this page&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The guiding philosophy was to try to include as much data as possible. If there are five possible years of data to be used for a park, they will all be used, even if four of the seasons were in the past or in the future. 
The source of the raw data was KJOK’s excellent park database for past seasons and various sources (most notably Baseball-Reference) for recent seasons.&lt;br /&gt;&lt;br /&gt;I treat a park as new if there are major changes to the dimensions, but I did not by any means do a complete historical survey to find out when those changes have taken place, so some that probably should have been treated differently are not. If you have specific data on when a change should have (or shouldn’t have) been made, feel free to leave a comment and I will try to incorporate these changes when I update the chart some time in the future.&lt;br /&gt;&lt;br /&gt;Additionally, when a team moves, and a new team immediately moves in (for example, the Senators of ’60 and ’61), this is treated as a new team. Also, in cases in which teams have played a significant (which I defined as around ten or more) number of games in a different stadium in the same year, those years are treated as being a new park (an example is the Dodgers playing games in New Jersey the two years before they moved from Brooklyn). Whenever a “new park” of this sort is established, when the old order is restarted it is treated as another new park.&lt;br /&gt;&lt;br /&gt;The reason the park factors are only shown through 2008 is that my ideal data set is two previous years, the year in question, and two future years. For most of the parks active in 2009, we will after 2011 be able to fill this dataset, and so I don’t want to publish a park factor now and change it later. However, there are parks where the 2008 or 2007 factors are not yet settled because they are new and there are not yet five years of data available. In these cases, I have listed a PF but marked it as one that will change in the future (this is indicated with an orange shading; park factors for the first year after a switch are in pink text).&lt;br /&gt;&lt;br /&gt;Now I will give an example of how I chose the years to be considered in figuring the PF. 
Suppose we look at the Diamondbacks, who have played in Bank One Ballpark since 1998. In 1998, we have no previous data, but there are four future years of data, so the sample is 1998-2002. For 1999, there is one previous year, so we also look at three future years, and get 1998-2002. For 2000, there are two previous years, so we use two future years, and have a sample of 1998-2002. This is now in the ideal format--the year in question, plus the two immediately prior and future years. Of course, in 2001, we use the two previous years (1999 and 2000), and two future years (2002 and 2003), making the total sample 1999-2003, and it will continue in that manner until something changes.&lt;br /&gt;&lt;br /&gt;Let’s also consider the end of the Braves’ tenure in Fulton County Stadium. The last season there was 1996. For 1994, we have two previous years (‘92 and ‘93) as well as two future years (‘95 and ‘96), so we use 1992-1996. For 1995, we have just one future year, so we use three previous years, and also use 1992-1996, and the same for 1996.&lt;br /&gt;&lt;br /&gt;In the previous iteration of these park factors, there were three recent parks for which I needlessly inserted a changepoint and thus changed the factors for the surrounding seasons.  These teams were pointed out by Terpsfan101, and I have corrected their PFs in this edition.  
They are Detroit, 1994-1999; Minnesota, 1989-1995; and Seattle, 1993-1996.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-378387840095977254?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/378387840095977254/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/03/historical-park-factors-1901-2008.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/378387840095977254'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/378387840095977254'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/03/historical-park-factors-1901-2008.html' title='Historical Park Factors, 1901-2008'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-361446397488136474</id><published>2011-03-08T01:02:00.002-05:00</published><updated>2011-03-27T14:15:46.186-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Indians'/><category scheme='http://www.blogger.com/atom/ns#' term='Predictions'/><title type='text'>2011 Indians “Preview”</title><content type='html'>If you close your eyes and dream just a little bit, you can see the Indians contending in a division without a clear favorite.  A healthy Grady Sizemore and a healthy Carlos Santana teaming up with the overlooked Shin-Soo Choo to form a secondary average all-star wrecking crew at the heart of the offense.  
Maybe Matt LaPorta lives up to his potential, just a little, and provides average first base output.  Michael Brantley is able to get on base, and Asdrubal Cabrera fields well, and second and third base are not complete black holes.  Meanwhile, the bullpen is anchored by Chris Perez, with Rafael Perez and Tony Sipp combining to give the Tribe two solid left-handed relievers.  A couple of other somewhat promising righties take bullpen spots and run with them.  That team is no threat to win 100 games, but it could win 87 games.&lt;br /&gt;&lt;br /&gt;“But wait”, you say.  “Your rosy scenario said nothing about the starting pitching.”  Oops.  And that is the problem with the 2011 Indians in a nutshell; they might not wind up having a good offense or a good bullpen, but it's not that hard to envision how it might happen.  On the other hand, it's very hard to see the starting pitching performing in a manner that could make Cleveland a legitimate threat in the AL Central.&lt;br /&gt;&lt;br /&gt;Carlos Santana will be the catcher, and is said to be back to normal after recovering from the knee injury that prematurely ended his rookie season.  Even assuming that to be reality, it's easy to see Santana being something of a disappointment to fans in 2011 after hitting .261/.408/.469 in 187 major league PA.  Whether those expectations are met or not, Santana should be one of the Tribe's best offensive players and have the bat to carry first base, where he is slated to see some time.&lt;br /&gt;&lt;br /&gt;The competition to be his backup will continue throughout the spring.  Lou Marson started last season as the Indians catcher, but my money would be on him beginning the season in Columbus.  He's still young enough that the organization would presumably like to get him some regular playing time rather than the rare chances that figure to come behind Santana.  That leaves an assortment of uninspiring choices (veterans Paul Phillips and Luke Carlin and youngster Juan Apodaca).  
I would be very surprised if it was Apodaca, and the fact that Marson is already on the 40-man roster may wind up winning him the position after all.&lt;br /&gt;&lt;br /&gt;At first base, the hope that Matt LaPorta could be the impact hitter that would really make the CC Sabathia trade pay off has essentially been extinguished--it's now just a question of whether he can be an adequate major league hitter at the position.  The jury is still out on that, but his former prospect status still has at least another year or so with which to tease observers.  LaPorta has seen time in left field in the past, but appears anchored to first base now.&lt;br /&gt;&lt;br /&gt;Second base figures to be a black hole for the team as it has been since Asdrubal Cabrera moved across the bag for good.  Luis Valbuena played himself out of Cleveland's plans last season, and while he is in camp and ostensibly competing for the job, the odds against him are staggering.  If the team wouldn't promote Santana at the outset of the 2010 season, you can forget about seeing prospect Jason Kipnis (or Lonnie Chisenhall at the hot corner, although he is considered to be farther away anyway).  That leaves Jason Donald, who'll likely instead be slotted at third base, and non-roster invitee Adam Everett as options (and let's be honest--if you're not going to use Everett's shortstop glove, you're not going to put him in your everyday lineup). &lt;br /&gt;&lt;br /&gt;Thanks to this grim outlook, the Indians decided to bring in a veteran to at least give a steady presence at the position--Orlando Cabrera.  Cabrera really doesn't offer much more than his name at this point, but given the choice of watching Luis Valbuena or just about any human being other than Muammar Gaddafi, things could be worse.&lt;br /&gt;&lt;br /&gt;Third base is the other position where the team has thrown their hands up.  
Jayson Nix was thought to be the favorite, and will almost certainly be on the team one way or the other, but his late season trial at the hot corner was scary from a fielding standpoint, and the man has a career 77 OPS+ in over 700 PA, so it's not as if there's a tradeoff being made between fielding and offense.  Donald now appears to have the inside track on the position, but he too leaves much to be desired offensively.  Jared Goedert, who hit 27 homers between Akron and Columbus last year, will also get a look, but at 26 and without much of a previous track record, he's more suspect than prospect.  Jack Hannahan is also in camp as an option of last resort.&lt;br /&gt;&lt;br /&gt;Cord Phelps, like Chisenhall and Kipnis, could be a midsummer or later addition to the team.  Phelps is being trained to play some third as Kipnis has surpassed him as the second baseman in waiting, but third base puts him in competition with Chisenhall, so he may really be preparing to take a utility job down the line.  The Indians will have decent flexibility, as Cabrera and Donald can both back up short and second; there will be no need for an Everett type to be tacked on the roster solely for Asdrubal Cabrera's days off.&lt;br /&gt;&lt;br /&gt;The outfield picture is much clearer than the infield, particularly if Sizemore is able to perform.  Right field belongs to Shin-Soo Choo, who is not really appreciated by the home fans any more than he is by the general public.  If you tried to tell a typical Tribe fan that Choo was roughly comparable in value to Carl Crawford in 2010, they'd laugh you off, as would most people from Chicago or San Francisco.  Sizemore will likely not be ready for Opening Day; apparently two weeks into the season is the target, but personally I don't expect to see him before May.  The brass seems to be committed to playing him in center when he returns, but left is a possibility with Michael Brantley sliding over to center as he will in Sizemore's absence.  
I remain a Brantley skeptic, as he has yet to display a hint of power; he will need a .350ish OBA to be an asset.  His .270 BABIP suggests some bad fortune, but given that his major league OBA is .313 in over 400 PA, it's going to take more than a few hits dropping in to make him a legitimate major league outfielder.&lt;br /&gt;&lt;br /&gt;I was not particularly pleased that Austin Kearns was brought back; his hot start obscured the fact that he was a below average hitter in a corner position.  He does provide a right-handed bat to platoon with Brantley, and he's borderline playable in center (as is Choo), so he brings a bit more to the table than Shelley Duncan.  Duncan is left-handed, more of a liability in the field, and has performed no better at the plate.  He'll probably make the team anyway and also see time at DH, particularly on days Hafner is scheduled to sit and a righty is pitching.  Travis Buck, signed to a minor league deal, is a more intriguing option, similar but more useful in the field than Jordan Brown (a LF/1B).  Both are left-handed, which is problematic given Brantley's presence.  &lt;br /&gt;&lt;br /&gt;Trevor Crowe would be another candidate for the bench, but he is held back by an injury and may not be ready for some time.  Ezequiel Carrera, Chad Huffman, and Nick Weglarz are also in camp but don't figure to win a spot on the team.&lt;br /&gt;&lt;br /&gt;Travis Hafner is the DH, and it's worth noting that he still contributes to the offense.  5.7 RG is still an acceptable and even slightly above-average performance from a DH.  His production was similar in 2009--5.9  RG, which was actually a little worse relative to the league.  The problem with Hafner as a player is that he requires frequent days off due to nagging elbow problems (only 118 games in 2010 and 94 in 2009).  
The real problem with Hafner as an asset is his massive contract, which still has two years and $26 million to go (one could argue that his platoon splits make him a liability against southpaws, too, I suppose).  However, the typical Tribe fan carries on about Hafner the hitter as if he's completely useless, unable to separate judgment of the contract and the memory of the hitter he was from a fair assessment of the hitter he is.&lt;br /&gt;&lt;br /&gt;Nick Johnson was a late addition to the first base/DH mix on a minor league deal.  Usually I'd be very excited about Johnson joining my team, but with the Tribe not going anywhere there's really no reason to throw obstacles in the path of letting LaPorta take 600 PA.  Johnson can also fill in at DH, and gives the Indians the potential to have a number of Secondary Average beasts (joining Sizemore, Choo, and Santana); of course, concerns about his effect on opportunities for other players have a high probability of being rendered moot given Nick the Stick's injury history.&lt;br /&gt;&lt;br /&gt;The Indians' starting rotation is pretty well set, barring injury; this can be a good thing if you're the Phillies or the Giants, and a bad thing if you have Mitch Talbot and Carlos Carrasco locked into starting jobs.  Fausto Carmona will take the ball on opening day; there has been so much written about him that  I feel completely incapable of providing any analytical insight.  Carmona stands out from other pitchers that have had flukish Cy Young type seasons (Joe Mays and Esteban Loaiza come to mind in the last decade) because he does have the stuff that makes you want to overlook the low strikeout rates and believe that he could be an exceptional pitcher with an unusual profile.  
But believing doesn't make it so, and Carmona is a groundball pitcher typecast as an ace by a team desperate for an ace that doesn't have very good infield defense.&lt;br /&gt;&lt;br /&gt;Left-handed batters have remained the scourge of Justin Masterson, and as such he might well be best suited for a relief role.  However, of all teams Cleveland needs to try to resist that temptation as long as possible.  While as a rule I'd usually support not giving up on a pitcher's rotation potential until absolutely necessary, it is even more imperative in the case of Masterson and the Indians.  The team is thin in starters, and three of the team's top pitching prospects (Alex White, Jason Knapp, and Nick Hagadone) are considered to be potential future relievers.&lt;br /&gt;&lt;br /&gt;Mitch Talbot is a perfectly acceptable back of the rotation filler type who'll likely slot third in this club's rotation because his 159 innings and 14 RAR in 2010 constitute a track record, relatively speaking.  Carlos Carrasco's stuff has always outranked his results, but he pitched well in September and will thus be given a rotation spot with little resistance this spring.&lt;br /&gt;&lt;br /&gt;Fifth starter options are numerous, but most lack potential to ever be much more.  Josh Tomlin and Jeanmar Gomez are forgettable righties, and with the trade of Aaron Laffey, David Huff is the last man standing of the Indians' once-prodigious collection of left-handed soft-tossers.  Hector Rondon was one of the team's top prospects but has been stopped cold by injuries, while deadline trade swag Zach McAllister and Corey Kluber figure to be in the mix to fill in during the summer, but not break camp with the club.  The aforementioned White has more upside and will also be a possibility late in the campaign.&lt;br /&gt;&lt;br /&gt;The wildcard in the fifth starter competition is Anthony Reyes, who will be given every opportunity to win the job.  
Reyes was acquired in 2008 and started 2009 in the rotation before requiring Tommy John surgery.  The Indians would love to see him physically able to pitch thanks to his experience if nothing else, but it is pretty clear that he will not be ready for Opening Day.  The team was linked to veteran free agents Kevin Millwood, Jeremy Bonderman, and Bartolo Colon; the latter signed with the Yankees, sadly denying Tribe fans the chance to see a former contributor return home and perhaps fall flat on his face with a flair not seen in these parts since Juan Gonzalez' one-game appearance in 2005.  The former two remain on the market but each day diminishes the likelihood that they will turn up in Cleveland.  &lt;br /&gt;&lt;br /&gt;Recent developments have made the bullpen picture clearer than one might expect under the circumstances.  Chris Perez will be the closer, and is a good bet to disappoint; his .234 BABIP should go up, and even without using DIPS-inspired metrics, his peripheral RA was a good three-quarters of a run above the actual figure.  None of this is to say that Perez will be bad, only that one should not expect a repeat performance of the second half, in which he was a lockdown closer.&lt;br /&gt;&lt;br /&gt;Rafael Perez and Tony Sipp give the bullpen a pair of left-handers who tease their potential to be more than specialists while often proving combustible on the mound.  Chad Durbin, briefly an Indian in 2003-04 (~ 60 innings), has been signed to a major league contract and thus will be the team's primary middle innings reliever.  The Durbin signing may seem unnecessary given the club's outlook, but it appears as if giving Manny Acta some hope of a stable, experienced right-handed middle reliever outweighed other considerations.&lt;br /&gt;&lt;br /&gt;Those four are locks; two others, Jensen Lewis and Joe Smith, are strong possibilities.  
Both were signed to contracts rather than being non-tendered, which leads one to believe that they will be pitching in Jacobs Field this year.  Smith is a pleasure to watch if nothing else, and Lewis' childhood Indians fandom allows the fans a degree of warm fuzzies.  &lt;br /&gt;&lt;br /&gt;The final spot appears to be reserved for a long reliever; if that is the case, two pitchers jump to the front: Justin Germano and Joe Martinez.  Both have started and relieved in their careers; Germano has the upper hand as his BABIP gave the impression of effectiveness in 35 innings with the Tribe in 2010.  &lt;br /&gt;&lt;br /&gt;If the team is willing to eschew a long man requirement, Frank Herrmann and Vinnie Pestano will be very much in the mix.  Herrmann is a 27 year old righty without good stuff; his performance over 45 big league innings was average but he lived on a low walk rate (1.8 per nine) and a low strikeout rate (4.8).  The fall off the high wire could be ugly.  Pestano throws hard and impressed in his September callup; his AA and AAA performance was impressive as well (1.81 ERA, 77/16 K/W in 59 innings).  It would be nice to see the team go with Pestano, as there really aren't any right-handers with power arms to set up Perez (I don't want to give the impression I have a fetish for power arms--“effective” arms would have worked too).&lt;br /&gt;&lt;br /&gt;Other relievers who could appear as the season progresses include Josh Judy, Jess Todd (definitely stuck in neutral at this point), Nick Hagadone (if moved to the pen), Bryce Stowell and Zach Putnam (who'll immediately become my least favorite Indian of all-time).  There are a number of arms with potential in that group, and so it's possible that the Tribe bullpen could be a lot more exciting by September and perhaps positioned to be more effective in 2012.&lt;br /&gt;&lt;br /&gt;Last year I predicted a 74-88 season, which was five wins too many; I've learned my lesson and will go with 72-90 this year.  
It's dangerous to attempt to project further down the road, but I could see 2012 as a similar season to 2011 with some more young players breaking in, with 2013 as a season in which to make a push.  But by that point the contract status of players like Sizemore and Choo will be an issue, and so it's best to keep an even keel and say that it's not at all clear when Cleveland's next contender will take the field. &lt;br /&gt;&lt;br /&gt;A best case scenario (really more like 90th percentile)?  Let's say 81-81.  The starting pitching makes it tough to go much higher even if one assumes favorable health for the offense and an effective bullpen.&lt;br /&gt;&lt;br /&gt;Worst case?  Any time you expect to win 72 games, you could lose 100 if things don't go your way.  I don't think this is the worst team in the majors, but only a homer could argue that the median expectation wouldn't put them in the bottom third or quarter of the league.&lt;br /&gt;&lt;br /&gt;Sadly, this is an organization that needs a break from a public relations standpoint.  The performance of the organization has been a disappointment over the last three seasons without question, but the fickleness of the fanbase has been a disproportionate response.  Indians fans have every right to demand better from the organization, but when just four years ago your team was one win away from the World Series, it is unbecoming to act as fans of a club that has spent a generation planted in the second division.  Perhaps more than any other fanbase, Indians rooters have swallowed the vision of baseball as hopeless for small markets hook, line, and sinker--even when they could simply look to the city's other major league franchises and see that salary caps can neither compel “homegrown” players to stay home (even when the team wins a lot of games and the player is actually from the area) nor ensure that an organization will put a decent team on the field even once in a decade.  
The pathetic, feeble whining about a ballclub by denizens of a declining, corrupt city is a bit much for me.&lt;br /&gt;&lt;br /&gt;C: Carlos Santana, Luke Carlin&lt;br /&gt;IF: Matt LaPorta, Orlando Cabrera, Jayson Nix, Asdrubal Cabrera, Jason Donald (Shelley Duncan in lieu of Johnson)&lt;br /&gt;OF: Michael Brantley, Grady Sizemore, Shin-Soo Choo, Austin Kearns, Shelley Duncan (Travis Buck in lieu of Sizemore) &lt;br /&gt;DH: Travis Hafner&lt;br /&gt;SP: Fausto Carmona, Justin Masterson, Mitch Talbot, Carlos Carrasco, Josh Tomlin&lt;br /&gt;RP: Chris Perez, Rafael Perez, Tony Sipp, Chad Durbin, Jensen Lewis, Joe Smith, Vinnie Pestano&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-361446397488136474?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/361446397488136474/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/03/2011-indians-preview.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/361446397488136474'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/361446397488136474'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/03/2011-indians-preview.html' title='2011 Indians “Preview”'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-7331003910423998587</id><published>2011-03-01T00:08:00.002-05:00</published><updated>2011-03-01T17:39:31.769-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Fielding'/><category scheme='http://www.blogger.com/atom/ns#' term='Win Shares'/><category scheme='http://www.blogger.com/atom/ns#' term='Book Reviews'/><category scheme='http://www.blogger.com/atom/ns#' term='Positional Adjustments'/><category scheme='http://www.blogger.com/atom/ns#' term='Pitching'/><title type='text'>Comments on Bill James Gold Mine 2010, pt. 2</title><content type='html'>2. Defensive Win Shares and Loss Shares&lt;br /&gt;&lt;br /&gt;James has revamped Win Shares over the last couple of years to include Loss Shares.  I think this is a very good thing, although I look forward to when (if?) the entire methodology is published.  Without the full explanation, it's dangerous to comment about isolated details, but James' essay on "Explaining Defensive Win Shares to a Dead Sportswriter" is tough to ignore.  My Twitter-friendly take on it: He's going to have trouble explaining it to a lot of people, not just dead sportswriters.&lt;br /&gt;&lt;br /&gt;Again, it's impossible to evaluate the method while knowing so little about it, but James makes this extraordinary statement:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Making outs increases the team's responsibility to play defense.  When you make more outs, that increases the team's responsibility to play defense.  
Therefore, if two players are the same in the field but one of them makes more outs, the one who makes fewer outs has to come out ahead when you compare the player's defense contribution to his defensive responsibility.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Lest you think that was just a slip, he doubles down:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;While we are in the habit of thinking of offense and defense in baseball as un-connected, they are in fact not un-connected.  There is a very important connect between them, which is the rule that for every out you make on offense, you must record an out on defense.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Bill James is obviously a very intelligent man, and you a very intelligent reader, so I am hesitant to respond to this--the response should write itself.  Limiting myself to a paragraph or less, I suppose it is technically true that each out on offense is matched by a defensive out, barring walkoffs and rainouts and the like.  But there is no causation between the two.  The rules of the game require three outs per inning and nine innings per team.  Each team makes 27 outs regardless of the rate at which they use them (think OBA) or any other factor.&lt;br /&gt;&lt;br /&gt;An individual who makes outs at a higher rate than some comparison player does not increase the number of outs that his defense must record.  The defense must record 27 outs regardless of what an individual does at bat.  What does happen is that by consuming excess outs, the individual batter leaves fewer outs to be consumed by the other eight members of his lineup, and fails to generate additional plate appearances for them.  &lt;br /&gt;&lt;br /&gt;James later seems to suggest that the revamped DWS-LS system assigns the same responsibility to field to each position, regardless of where it stands on the defensive spectrum.  
He then states his objection to offensive-based positional adjustments, and so it seems as if the stuff about making outs might be a backdoor way of applying positional adjustments.  It's unclear, though, and still doesn't follow logically.&lt;br /&gt;&lt;br /&gt;James’ discussion of positional adjustments also seems to gloss over the use of defense-based positional adjustments or the fact that most of us who still use offensive positional adjustments do so because we believe they provide a ballpark estimate of the defensive differences between the positions.  When I use an offensive positional adjustment, I'm not saying that I think a shortstop with a 5 RG is a better hitter than a first baseman with a 5 RG.  What I am saying is that the difference in aggregate offensive performance between shortstops and first basemen (when considered carefully and over a long period of time) approximates the inherent difference in defensive value.&lt;br /&gt;&lt;br /&gt;You are certainly free to reject that argument (and many sabermetricians that I respect very much do just that), but please recognize that the sabermetrician using an OPADJ is likely not making the claim that a player's offensive contribution is altered by his fielding position.&lt;br /&gt;&lt;br /&gt;More important than my own positional adjustment folly is an apparent failure by James to recognize that the positional adjustments that are now used most prominently in the community (generally Tango's, which have made their mark on the PADJs used in WAR figures from both Chone and Fangraphs) are based on estimates of the defensive difference between positions, sometimes informed by offensive averages.  Furthermore, the sources do not lump the positional adjustment into the offensive ranking--they break everything (offense, fielding, baserunning, position, etc.) into smaller components, which are then summed to produce RAA, WAR, or some other total value metric.  
&lt;br /&gt;&lt;br /&gt;Again, it is possible that I have misunderstood James' point, or that he has done a poor job of expressing himself, and that DWS is completely logical.  However, I think it is going to take a much more thorough explanation of the system to give people that read the &lt;u&gt;Gold Mine&lt;/u&gt; piece a lot of confidence in his methodology.&lt;br /&gt;&lt;br /&gt;3. Strikeout rate&lt;br /&gt;&lt;br /&gt;One of the most thought-provoking essays is "Whiff 7", which discusses the phenomenon of strikeout rates continuing to reach all-time highs.  James argues that there is no end to this in sight under current conditions, as teams have an incentive to find power pitchers but no disincentive to find batters that avoid striking out.  James argues that the standard deviation of power (he doesn't use that terminology) has decreased over time, and so league homer rates have gone up while the top individual performers hit about as many homers as they did in previous eras.  &lt;br /&gt;&lt;br /&gt;James then offers some suggestions of rule changes that would slow or reverse the trend.  It's an interesting piece, and it didn't prod me to respond to it directly, but rather to make a tangential and mostly unrelated point about how we measure strikeout rates--a wholly unoriginal and stale one at that.&lt;br /&gt;&lt;br /&gt;I have for a long time advocated using K/PA rather than K/IP as the measure of pitcher strikeout proficiency (I’m not claiming this is unique, as others have carried that banner with  much more vigor and coherent arguments than I have offered).  Through no effort of mine, the use of K/PA has increased in the sabermetric community, with sites like The Hardball Times and Fangraphs prominently utilizing K/PA.  &lt;br /&gt;&lt;br /&gt;As an example of how the different denominators can change perception, consider the point that most long-term successful pitchers have at least average strikeout rates.  
This is a point that the average fan still mystifyingly misses a great deal of the time.  Take Greg Maddux for example.  Maddux is apparently seen by some as a non-strikeout pitcher.  Here is a table with his K/9 versus the league average, with KAA being strikeouts above average per inning:&lt;br /&gt;&lt;br /&gt;&lt;a href="https://lh3.googleusercontent.com/-m57p14HNsRc/TWm_JX5wI3I/AAAAAAAAAvg/3pZZ-mDskyk/s1600/maddux1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="https://lh3.googleusercontent.com/-m57p14HNsRc/TWm_JX5wI3I/AAAAAAAAAvg/3pZZ-mDskyk/s320/maddux1.jpg" width="204" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;For his career, Maddux struck out 6.1 per nine, while the league average was 6.4.  He struck out 206 fewer batters than an average pitcher would have in the same number of innings.  Without seeing the same figure for a lot of pitchers, it's hard to contextualize that, admittedly.&lt;br /&gt;&lt;br /&gt;Suppose that instead you look at Maddux through K/PA:&lt;br /&gt;&lt;br /&gt;&lt;a href="https://lh4.googleusercontent.com/-Vt32j-wuFTU/TWm_LA9yD3I/AAAAAAAAAvk/wo8v9NG4WyU/s1600/maddux2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="https://lh4.googleusercontent.com/-Vt32j-wuFTU/TWm_LA9yD3I/AAAAAAAAAvk/wo8v9NG4WyU/s320/maddux2.jpg" width="205" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Now Maddux' strikeout rate is essentially average--he struck out 17% of opposing batters, the same as the league average.  
Maddux' career rate is lower (it's actually 16.5% to 16.6%), but just barely so, and by this metric he only recorded 22 fewer strikeouts than average.&lt;br /&gt;&lt;br /&gt;In Maddux' peak years (I think 1992-98 stand out), he was above-average even by K/9--+90 KAA, while he was an even more robust +196 when K/PA is the standard.&lt;br /&gt;&lt;br /&gt;This is not intended to recast Maddux as a strikeout fiend--certainly he was not, even at his best.  Still, Maddux' strikeout rate is more impressive when viewed in light of the number of opposing batters he actually faced rather than in terms of innings pitched, which really is just a measure of the percentage of outs a pitcher gets via the K (this is obscured by displaying strikeouts per 9 innings rather than strikeouts per 27 outs).&lt;br /&gt;&lt;br /&gt;In addition to K/9, there are several other per-inning pitching ratios in common usage--H/9, W/9, HR/9, WHIP.  What all of those have in common is that they are ratios of bad things (offensive successes) to good things (outs recorded).  K/9 is a ratio of really good things (outs recorded by strikeout) to another set of plain old good things that includes the really good things (total outs recorded).  
As such, it's best viewed as a measure of a pitcher's reliance on strikeouts.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-7331003910423998587?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/7331003910423998587/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/03/comments-on-bill-james-gold-mine-2010.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7331003910423998587'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7331003910423998587'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/03/comments-on-bill-james-gold-mine-2010.html' title='Comments on Bill James Gold Mine 2010, pt. 
2'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh3.googleusercontent.com/-m57p14HNsRc/TWm_JX5wI3I/AAAAAAAAAvg/3pZZ-mDskyk/s72-c/maddux1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-7814481187859687362</id><published>2011-02-21T03:06:00.000-05:00</published><updated>2011-02-21T03:06:00.529-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Rankings'/><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Book Reviews'/><category scheme='http://www.blogger.com/atom/ns#' term='Pitching'/><title type='text'>Comments on Bill James Gold Mine 2010, pt. 1</title><content type='html'>I quite enjoyed the third edition of the Bill James &lt;U&gt;Gold Mine&lt;/u&gt;, even though I didn't get around to reading it until a few months after it was published.  It jogged some thoughts, which led to this post, which is not fully based on James' essays but on the semi-related paths they sent my mind down.  To me, that is one of the tests of a really good sabermetric work--does it get you thinking, even if not about the exact topics covered?  James' book passed that test for me.&lt;br /&gt;&lt;br /&gt;However, I do think that the book would be stronger if it contained more of James' essays and fewer "statistical nuggets".  The nuggets were of less interest to me, and seemed to be present in lesser quantity than they were in the first two editions of the book.  The reverse was true for the essays, and those are what compel me to buy the book.  
Not being a subscriber to Bill James Online, I'm not positive about this, but I believe that James writes a number of additional essays in each year that are not included in the book.  &lt;br /&gt;&lt;br /&gt;If that is indeed the case, I believe that they'd be much better off to collect all of Bill's essays in the &lt;u&gt;Gold Mine&lt;/u&gt;, and leave the nuggets for the individual to dredge up themselves online.  Not only does the website lend itself more to the statistics (the data there is much more extensive than what can be printed in the book even if the book were the size of one of the old &lt;U&gt;Great American Baseball Stat Books&lt;/U&gt;) and the essays to the printed page, but if there are any folks out there who still refuse to use the Internet and are interested in James, I'd think they'd be more enticed by the essays.  A book of just the essays, with some other filler of some sort, would have a character not unlike that of the 1990-1992 &lt;U&gt;Baseball Books&lt;/u&gt;, which I liked very much.&lt;br /&gt;&lt;br /&gt;Of course, since it appears that the book is not even being published in 2011, those suggestions are for naught.&lt;br /&gt;&lt;br /&gt;I have three subjects to touch on, two of which could be considered critiques and one of which is just a good old-fashioned tangent.  This post went a lot longer than I originally intended so it's been broken up into two portions:&lt;br /&gt;&lt;br /&gt;1. Starting pitcher rankings&lt;br /&gt;&lt;br /&gt;The longest essay in the book deals with a system to rate starting pitchers based on where they place among other starters in each of their league seasons.  James first ranks pitchers by Season Score (*), and then assigns points based on the pitcher's standing in the league.  Each league season has 5.5 points per team available.  In a fourteen team league, the top ranked pitcher gets 12 points, the #2 ranked pitcher gets 11 points, and so on down to the #11 pitcher who gets 2 points.  
There are also three-point bonuses, up to nine points per season, available for truly historic seasons.  The resulting metric is called Strong Season points.  &lt;br /&gt;&lt;br /&gt;(*) James does not give the formula for Season Score in the article, but explains that it is based on W, L, IP, ERA, K, W, and SV.  "The point of the system is to evaluate a pitcher's record without context"..."This was a way of trying to say 'How good are the numbers themselves?', rather than 'How good was the pitcher who compiled these numbers?'".&lt;br /&gt;&lt;br /&gt;Personally, I'm not sure that I have a whole lot of interest in rankings of pitchers based on a method that deliberately ignores context (and James certainly does not deny the importance of context).  Setting my objection aside, though, it seems to me as if the Season Score is yet another result of a process that James has repeated over the course of his career: the re-invention of Approximate Value.  Of all of his methods, my impression is that there is none that James personally likes more than AV.  Even Win Shares is in some respects a return to AV--while it attempts to adjust for everything, it still expresses the result in an integer.  The scale is higher than that of AV (a 20 AV would be an extraordinary season, while 20 WS is good but ordinary).&lt;br /&gt;&lt;br /&gt;And so after attempting to adjust for everything, it seems James still had a void in his own toolkit, and so he filled it with the Season Score.  &lt;br /&gt;&lt;br /&gt;Digression aside, James found that a career total of 43 strong season points marks a fairly clear line for the Hall of Fame in retrospect.  Only five pitchers retired for a significant length of time have more than 44 points and are not in the Hall--Vida Blue, Bert Blyleven, Ron Guidry, Carl Mays and Billy Pierce.  James says that Blyleven and Guidry (60 points) are the only two pitchers that were far above 43 yet are excluded from Cooperstown.  
(Blyleven has been elected since James wrote the book and I wrote this post, obviously). &lt;br /&gt;&lt;br /&gt;Since Guidry's is the most surprising result of James' survey, I'll take a closer look at him.  I do not intend the discussion about Guidry to be a commentary on his Hall worthiness or even his value, but rather as a means of discussing the issue I have with the strong season method.  It is important to note that James does not in any way claim that the strong season method must be used in ranking pitchers, that it is better than any methods X, Y, and Z, or any such thing.  James does not argue that Guidry should be in the Hall of Fame because of his showing in the system.&lt;br /&gt;&lt;br /&gt;Guidry earned points for six seasons in James' analysis--1977-79, 1982-83, and 1985.  Suppose we apply James' method, but use a different metric--a simple Runs Above Replacement, figured using total runs allowed and adjusted for park.  How many points would Guidry earn under such a system?&lt;br /&gt;&lt;br /&gt;* James ranked Guidry #6 in the AL in 1977, which is worth seven points.  I have him #7, worth six points.&lt;br /&gt;&lt;br /&gt;* James and I both have Guidry #1 in 1978 with an extraordinary season for 12 points (James awards the 9 point bonus, and I'll do so as well to keep things comparable).  Guidry turned in 101 RAR, seventeen more than the next closest pitcher and nine more than any other AL pitcher in any of these six years.&lt;br /&gt;&lt;br /&gt;* James had Guidry #3 in 1979 for ten points.  I have him second, for eleven points.  &lt;br /&gt;&lt;br /&gt;* James ranks Guidry #11 in 1982 for two points.  I have him all the way down at #26.  His RA was 4.22 in a league in which 4.5 runs were scored per game, and he pitched in a moderate pitchers' park (.97 PF).  At 34 RAR, he is eleven runs behind the eleventh-place pitcher (Geoff Zahn, 45).  
Presumably Season Score gives Guidry a boost because of his 14-8 record, one of the most impressive in the league (seventh in the league in Win Points).&lt;br /&gt;&lt;br /&gt;* James ranks Guidry #4 in 1983 for nine points; I have him #6 for seven points.&lt;br /&gt;&lt;br /&gt;* James ranks Guidry #2 in 1985 for eleven points; I have him #11 for two points.  This is another season in which Guidry's W-L record seems to give him a huge season score boost (22-6).&lt;br /&gt;&lt;br /&gt;Add it all up, and I have Guidry at 47 points--suddenly not that far above the Hall of Fame line James observed.  I followed his scoring method exactly, but the results changed significantly simply by changing ranking methods.&lt;br /&gt;&lt;br /&gt;More interesting, IMO, is how the use of in-season rank elevates the importance of very small performance differences.  In 1979, Guidry ranked second in RAR at 71.  However, Tommy John (71) and Jerry Koosman (70) were right behind him.  Given that Guidry relied much less on his fielders, I strongly support the notion that he had a better season than the other lefties. Still, negligible differences in actual performance are given much greater impact when one uses a points system like James'.  &lt;br /&gt;&lt;br /&gt;Another example is 1985, in which Guidry ranks eleventh on my list at 61.  Jimmy Key ranks sixth at 62--there are six pitchers within two RAR of each other.  Guidry could very easily rank sixth in this season, which would be worth an additional five points.  That would vault him from 47 points to 52 points, and give him a great deal more clearance over the HOF line.&lt;br /&gt;&lt;br /&gt;This is not to say that James' ranking system is without its strong points with respect to its aims--it values peak performance and it sets an equal total value relative to the size of the league, which depending on one's perspective might be very good properties.  
My contention is that such a system is very sensitive to small changes in statistics, ones that would have no impact on a career-based evaluation.  If Guidry had been evaluated at 62 RAR and thus sixth in 1985, the extra run saved would have zero impact on your evaluation of his career RAR total--and rightfully so.  Allowing one run to exert a significant difference in a player's rank on an all-time list strikes me as utterly illogical and unsatisfactory.&lt;br /&gt;&lt;br /&gt;You may object and say that I am using RAR rather than Season Score, and that Season Score is not subject to minute differences in performance having a large effect on rank order as is the case for RAR.  While it is true that RAR and Season Score are very different methods, and that their application to Guidry might be very different as well, any metric is going to be subject to the same concerns when making a rank order over one season.  There is always the potential that a very small margin could be the difference between a batting title and third place, between fifth in the league on a list and out of the top ten.  
That is true for any metric you want to pick, from BA to home runs to ERA to Season Score to RAR.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-7814481187859687362?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/7814481187859687362/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/02/comments-on-bill-james-gold-mine-2010.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7814481187859687362'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7814481187859687362'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/02/comments-on-bill-james-gold-mine-2010.html' title='Comments on Bill James Gold Mine 2010, pt. 1'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-3682584795427207918</id><published>2011-02-14T02:14:00.002-05:00</published><updated>2011-02-14T02:14:00.302-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OSU'/><title type='text'>Into the Great Wide Open</title><content type='html'>For as long as I have been writing this blog, I have done an annual preview of the OSU baseball team.  I have never before been at such a loss for words or offered so little insight about the team that will appear on the field in the upcoming season.  
While I have never possessed nor claimed any sort of insider status, any sort of knowledge about the thinking of the coaching staff and the players’ performance in off-season drills and the like, I’ve always felt that I had a pretty good handle on what to expect the lineup to look like.  The tiny scraps of information available through the media about the team, coupled with my knowledge of the personnel from the previous season and years of observation of Bob Todd’s decision-making, made it a relatively simple guessing game.&lt;br /&gt;&lt;br /&gt;The 2011 Ohio State team is anything but a simple guessing game.  There has been massive roster turnover, with the team’s two best players (ace Alex Wimmers and catcher Dan Burkhart) now playing professionally, and, worst of all from a prognostication standpoint, the fact that Coach Todd has retired, and Greg Beals is now in charge of the program.  Beals had moderate success at Ball State, but his Ohio roots and purported recruiting skill landed him what is likely the most attractive coaching position in Midwest baseball.  He has filled out his staff with Chris Holick, his pitching coach from Ball State (and former OSU pitcher) Mike Stafford, and volunteer coach Josh Newman (another former Buckeye hurler and briefly a major leaguer with the Rockies and Royals).&lt;br /&gt;&lt;br /&gt;Holick returns to coaching after spending two years in private industry that came on the heels of a seven-year assistant coaching career (Kent State, Arizona State, and Florida International).  Stafford served as pitching coach under Beals at Ball State since 2003; prior to that he had been a bullpen catcher for the Columbus Clippers and pitched four years in the minors after ending his OSU closing career in 1998.  Newman is entering his first season as a collegiate coach.&lt;br /&gt;&lt;br /&gt;The talent they have to work with does not inspire a great deal of confidence for 2011, as there is very little in the way of proven performers.  
My fears about this were enhanced when Beals, speaking on the radio halftime show of an OSU football game this fall, answered a question about expectations for the season with vapid generalities about “playing hard”, “playing the game the right way”, “learning how to win”…the sort of talk you hear from a coach who knows he’s building for the future.  At a recent meet-the-team event, Beals said that due to new NCAA rules on bats, "It's going to be a faster paced game, but a safer game with less runs scored.  Our game is going to become a game of the finer skills - defense...throwing strikes...aggressive on the base paths - and we want to be ahead of this curve."  Uh-oh.&lt;br /&gt;&lt;br /&gt;Even if the entire roster from 2010 was back, it would be difficult to get a good read on the team’s prospects.  2009 was a great season for the Buckeyes, with a Big Ten regular season title and a second-place regional finish, but 2010 was a disaster despite high expectations.  OSU was in first place (although by a very slim margin in a very tight conference race) in late April when Wimmers went down with a hamstring injury.  The few starts he missed might have been enough to mark the difference between finishing near the top of the conference and failing to finish in the top six to qualify for the Big Ten Tournament--the first time that fate had befallen an OSU club since 1996.  Ohio’s 11-13 conference record was the first and only sub-.500 record in Big Ten play in Coach Todd’s twenty-two year career.&lt;br /&gt;&lt;br /&gt;Usually I write my preview and lineup expectation a month or so before the season starts; this time I’m doing it less than a week prior to Opening Day.  
As such, the ensuing discussion is based largely on the &lt;a href="http://www.ohiostatebuckeyes.com/ViewArticle.dbml?SPSID=87801&amp;amp;SPID=10418&amp;amp;ATCLID=205093976&amp;amp;DB_OEM_ID=17300"&gt;preview posted at the official athletic website&lt;/a&gt;; I’ve had to rely on such accounts to get any sort of feeling for how the lineup would shake out.&lt;br /&gt;&lt;br /&gt;Burkhart’s successor behind the plate will be Greg Solomon, a sophomore-eligible transfer from Paradise Valley Community College in Arizona.  Solomon missed most of last season with a knee injury and did not hit particularly well in his 2009 freshman season, so this position is a huge question mark (you’ll note questions as a recurring theme).  Beals’ comments on him praise his defensive skill with little to say about his bat.  Burkhart was such an ironman for the past three seasons that none of the other catchers on the roster have any experience.  Redshirt freshman Steel Russell, son of former Pittsburgh skipper John, would seem to be the logical choice as backup.  True freshman Josh Bokor, junior-eligible JUCO transfer Brad Hutton, and his brother Blake, a true freshman, could also be options.  The Hutton brothers are both listed as C/IF on the roster, so it is possible they could see time at the corners as well.&lt;br /&gt;&lt;br /&gt;For the past two seasons, first base belonged to senior Matt Streng, but he has flipped corners with sophomore Brad Hallberg.  I was quite surprised to learn of this move, as Streng has never struck me as particularly athletic, but he will man the hot corner in 2011 after a very disappointing 2010 campaign that saw him hit just one home run (coincidentally, it came in the only game I was able to attend) and turn in a -13 RAA performance.  
Hallberg was -4 runs in 112 PA in his debut season.&lt;br /&gt;&lt;br /&gt;Hallberg will split first base and DH duties with true freshman Josh Dezse, the most impressive member of the OSU recruiting class and a 28th round pick of the Yankees.  Given the fact that Dezse is also expected to be a key part of OSU’s bullpen, it would probably make more sense to have him at first base more often than not, but we’ll have to wait and see.&lt;br /&gt;&lt;br /&gt;Second base is another position where the Bucks have to fill a big hole as one of my favorite players, Cory Kovanda, completed his eligibility.  Kovanda was a consistent on base threat, slapping infield hits, drawing walks, and getting plunked.  His replacement will be redshirt sophomore Ryan Cypret, whose father was a member of the previous coaching staff.  Cypret served something of a utility infield role last season, but had just one extra base hit in 51 PA and thus has much to prove with the stick.&lt;br /&gt;&lt;br /&gt;Shortstop is the field position with the most continuity, as Tyler Engle has spent most of his three years in Columbus playing good defense but producing little at the plate.  Last year he posted an ugly .224/.342/.376 line; his 28 walks were second on the team and the only thing that made him playable.  If Engle could regain something approaching his 2009 form (.285/.411/.423), it would be a huge help as the infield looks as if it will struggle to create runs.  Two true freshmen Indiana natives, Derek Hannahs and Jacob Hayes, and local product Phil Jaskot figure to provide depth, with Hayes also being trained for left field.&lt;br /&gt;&lt;br /&gt;The outfield features one returning starter, as senior Brian DeLucia will play right field.  DeLucia is easily OSU’s top returning hitter, with a .320/.395/.503 line in 2010, and he’s a very good outfielder as well.  
The rest of the Buckeye outfield will have big shoes to fill, as left fielder Zach Hurley (.385/.438/.602) and center fielder Michael Stephens (.360/.395/.556) were the team’s two most potent hitters outside of Burkhart in 2010.  Also gone to graduation is DH Ryan Dew (.348/.420/.498), leaving OSU short six of its top seven offensive performers--this from a team that averaged 6.6 runs to the Big Ten’s 6.8.&lt;br /&gt;&lt;br /&gt;It was expected that sophomore Hunter Mayfield would get one of the open positions, but he transferred to Rollins College (a Division II school that managed to beat the Bucks last year, perhaps costing them more than just the game), so there is no returning experience of which to speak outside of DeLucia.  Left field will be a platoon between junior David Corna and sophomore Joe Ciamocco.  Despite having burned three years of eligibility combined, they have a total of one collegiate at bat between them, so it’s impossible to count on much offense out of them.  Center field will belong to true freshman Tim Wetzel, described by the web account as having--you guessed it--"speed and defensive prowess”.  &lt;br /&gt;&lt;br /&gt;One can only hope that the unproven players will be better batters than the advance billing would suggest, because otherwise it seems as if Ohio will have significant trouble putting runs on the scoreboard.  This is particularly unfortunate since the 2010 pitching staff was Wimmers and anyone who could stay healthy, and Wimmers, perhaps the best pitcher at OSU since Steve Arlin, is now a Twins farmhand.&lt;br /&gt;&lt;br /&gt;The Friday starter should be senior Drew Rucinski, who had been constantly shuttled back and forth between the bullpen and the rotation by the previous staff.  Rucinski was a good complement to Wimmers, but does not figure to match up well with the other top starters in the conference.  
Another senior, Dean Wolosiansky, has been a rotation stalwart throughout his career and should get the Saturday starts.  Wolosiansky has always fit the profile of a league-average innings muncher, which is a good thing to have, but in a perfect world he would be the Sunday starter.  That role apparently will go to true freshman Greg Greve, a 45th round pick by San Francisco in last year’s draft.  &lt;br /&gt;&lt;br /&gt;Greve’s spot in the rotation may not hold up if Brad Goldberg is granted a waiver to pitch in 2011.  The Coastal Carolina transfer is a junior in eligibility, but may not be able to use it until 2012.  He pitched just five innings for Coastal last year after pitching fourteen effective innings out of the pen as a freshman.  If eligible, he figures to be a big boost to the Buckeye mound staff.&lt;br /&gt;&lt;br /&gt;Josh Dezse, slated to start at 1B/DH, is also the favorite for the closer role.  While two-way players are fairly common in the college game, especially as closer, the only prominent OSU two-way player of recent years was JB Shuck, now a solid prospect in the Astros organization.  However, Shuck was used as a starting pitcher, not a reliever, and rarely was in the lineup when he pitched.  It appears as if Beals is more comfortable with hybrid players than Todd was.&lt;br /&gt;&lt;br /&gt;Senior Jared Strayer’s role and effectiveness increased as the season went on last year, and he figures to be the key middle reliever in 2011.  Unfortunately, it seems as if the new staff has taught Strayer a more conventional delivery--last year he adopted a three-quarters/sidearm delivery that was a pleasure to watch and a rare sight on the Bill Davis Stadium mound.  &lt;br /&gt;&lt;br /&gt;Sophomore Brett McKinney showed some promise last year, but was way too wild and way too hittable (32 walks and 78 hits in 59 innings).  
He figures to be the Wednesday starter/weekend long man at this point, but he has a chance to be a quality contributor at some point during his career.  Two left-handed pitchers--senior Theron Minium and junior Andrew Armstrong--figure to be the other candidates for Wednesday starts and weekend middle relief options.  Armstrong showed much promise as a freshman in 2008, but was injured during the ’09 season and missed all of last season.&lt;br /&gt;&lt;br /&gt;Junior Brian Bobinski and sophomore Cole Brown combined for eighteen ineffective innings last year and in a perfect world will be mopup pitchers this season, provided they have shown no improvement.  Three other pitchers on the roster do not figure to see significant action: true freshman lefty Ben Bokor (twin brother of backup catcher Josh), junior walkon Paul Guey, and a freshman walkon who was unsuccessful in attempting to make Wake Forest’s roster last year, John Kuchno.&lt;br /&gt;&lt;br /&gt;The OSU pre-conference schedule is significantly different than those of past years, showing a change in philosophy at the top.  The schedule opens the weekend of February 18 with the Big Ten/Big East challenge in Florida, where the opponents will be Cincinnati, Louisville, and St. John’s--the latter two have both been problems for OSU in recent years, with Louisville burying the Bucks in five games over the past three seasons by a combined 62-32 score, and St. John’s having knocked Ohio out of the NCAA Tournament in 2005.&lt;br /&gt;&lt;br /&gt;The weekend of February 25th sees the Bucks back in Florida for a four-game series with Western Michigan, a team with which OSU has a rich tradition--the teams played ten times in the NCAA Tournament between 1951 and 1967.  
The weekend of March 3 will be a trip to North Carolina to face Army, Western Carolina, and Akron, and the weekend of March 10 is another Florida trip to play Illinois State, Bradley, and Army.&lt;br /&gt;&lt;br /&gt;The annual spring break trip, which this year will run from March 18-26, is the one in which the philosophical differences emerge.  Coach Todd used the spring break trip as a week of tuneup for conference play, going to Florida to face mostly northern teams that OSU should expect to go 6-2 or so against.  He had not taken his team to the West Coast since a 2002 trip to Albuquerque that featured a crazy 38-15 game against Toledo, and the last trip to California came around twenty years ago.  In his first season, Beals will take his charges to play a three game series in Berkeley (in what sadly appears to be the final campaign for Golden Bear baseball), two games at Fresno State, and three at Cal St.-Bakersfield.&lt;br /&gt;&lt;br /&gt;The home opener will be Tuesday, March 29 against Xavier.  Other standard one game mid-week home opponents will be Miami, Akron, Bowling Green, and Toledo.  The Buckeyes will travel to play at Ohio University on one Wednesday, and have two out-of-region opponents coming in for two game series: North Florida and Oklahoma State.&lt;br /&gt;&lt;br /&gt;Big Ten play begins April 1 against Northwestern; in subsequent weeks, OSU goes to Indiana, hosts Michigan State, travels to Penn State, hosts the forces of evil, goes to Illinois, hosts Iowa, and goes to Minnesota.  If OSU is able to make the top six, they’ll be able to stay home and play downtown at Huntington Park; the Big Ten Tournament will be held there May 25 - 28.&lt;br /&gt;&lt;br /&gt;Sadly, if I had to hazard a guess, I would say that for the second straight season six other Big Ten teams will be playing in Columbus that weekend.  While I thought that Todd’s program was healthy enough, 2011 would have loomed as a possible rebuilding year in any case.  
Coaching changes often excite fans at the outset, but first seasons tend to be rough.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-3682584795427207918?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/3682584795427207918/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/02/into-great-wide-open.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3682584795427207918'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/3682584795427207918'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/02/into-great-wide-open.html' title='Into the Great Wide Open'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-7068197084267035184</id><published>2011-02-09T00:14:00.000-05:00</published><updated>2011-02-09T00:14:00.644-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Win Estimators'/><category scheme='http://www.blogger.com/atom/ns#' term='Statistical Reports'/><title type='text'>Crude Team Ratings 2010</title><content type='html'>In a &lt;a href="http://walksaber.blogspot.com/2011/01/crude-team-ratings.html"&gt;previous post&lt;/a&gt; I explained the methodology behind these rankings, and acknowledged a fairly decent number of shortcomings they possess.  
I will not harp on either of those topics again here, but that is simply to avoid being repetitive--these rankings are far from perfect and should be taken in that spirit.&lt;br /&gt;&lt;br /&gt;I will present four different sets of rankings based on four different inputs.  The manner of calculating the rankings is identical for all four; the only difference is which initial win ratio is used.  I tend to believe the final set based on Runs Created/Runs Created Allowed is the most indicative of true talent, but any season aggregate metric is going to have obvious deficiencies when it comes to estimating true talent, and in reality some combination would probably be superior for that purpose.  &lt;br /&gt;&lt;br /&gt;The first ranking is what I call actual CTR, since it is based on the actual W/L ratio of each team.  This is an attempt to evaluate teams based on their actual game outcomes, just adjusted for strength of schedule.  Some might argue that even if actual record is used for the team, some component-estimated record should be used to gauge SOS, and they have a point, but this approach equates team strength to W/L ratio.&lt;br /&gt;&lt;br /&gt;In the chart below, aW% is adjusted W%, and "s rk" is the rank of the team's SOS (#1 = toughest schedule):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_moz5d-MCh8g/TVIPvgAJ7nI/AAAAAAAAAvI/69xPDW5nacE/s1600/ctr6.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://3.bp.blogspot.com/_moz5d-MCh8g/TVIPvgAJ7nI/AAAAAAAAAvI/69xPDW5nacE/s320/ctr6.jpg" width="147" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;You can see from the chart that a 100 CTR does not correspond to a .500 aW%.  This is because the rankings are designed to give the average team a 100 CTR; since the properties of W/L ratio ensure that the average W/L will be &amp;gt; 1, an average W/L ratio is not the same thing as the W/L ratio of an average team (which is 1, of course).  
If this scale distortion bothers you, use aW%--and stop using ERA+, because the distortion is similar.  I am unconcerned about the issue because I want the average rating (but not necessarily the median) to be 100.&lt;br /&gt;&lt;br /&gt;The three toughest schedules are the bottom three in the AL East (BAL, TOR, BOS, although it's hardly fair to refer to the latter two teams as being at the bottom of anything). Of course, this is a consequence of playing in the stronger league and having to play a bunch of games against the two highest-rated teams.  What I don't like about this definition of SOS is that one can make the case that Tampa Bay's schedule was equally as tough on paper as Toronto's (assuming an equal distribution of games against non-AL East opponents)--but Tampa Bay's success in winning games makes Toronto's harder in practice.  It would be difficult to devise a SOS technique that took that point of view, however.&lt;br /&gt;&lt;br /&gt;The more notable weakness of the schedule-adjustment (and thus the ratings themselves) is that they make no correction for the influence that a team has upon the win-loss record of its opponents, and thus might be acting in a distortive manner at the extremes.&lt;br /&gt;&lt;br /&gt;I also have some division/league ratings; these are simply the average CTR of all of the teams in the division:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_moz5d-MCh8g/TVIPw19vluI/AAAAAAAAAvM/mfrcI2j3P1M/s1600/ctr7.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_moz5d-MCh8g/TVIPw19vluI/AAAAAAAAAvM/mfrcI2j3P1M/s1600/ctr7.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Four of the six divisions were relatively equal, with an average CTR in the 96-102 range.  However, one extremely good division and one extremely bad division result in the AL having a 108 ranking to the NL's 93.  
Those overall league rankings imply that the average AL team would be expected to have a .537 W% against the average NL team.  The 2010 interleague record for the AL was .532, which of course was not generated through balanced schedule, neutral-field meetings and covers a sample of 252 games.&lt;br /&gt;&lt;br /&gt;While I developed the rankings this year, I ran 2009 through the spreadsheet and the AL/NL disparity was estimated to be greater (113/89, .559).  The AL West was the top-rated division (126), and the NL Central fared a tick worse than they would in 2010 (80).&lt;br /&gt;&lt;br /&gt;The weak NL Central allowed the Reds to have the worst CTR of any playoff team (108) in 2010, although that figure was also better than the Cardinals' 2009 low of 103.  &lt;br /&gt;&lt;br /&gt;Switching gears, here are the gCTR figures.  These are based on gEW% (described in &lt;a href="http://walksaber.blogspot.com/2011/01/run-distribution-and-w-2010.html"&gt;this post&lt;/a&gt;), which takes into account the distribution of team runs scored and allowed per game (but does so independently of the other):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_moz5d-MCh8g/TVIPx-CQwpI/AAAAAAAAAvQ/-3tiQUifwc0/s1600/ctr8.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_moz5d-MCh8g/TVIPx-CQwpI/AAAAAAAAAvQ/-3tiQUifwc0/s320/ctr8.jpg" width="147" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I'm not going to have a lot to say about the charts for each input, since they track the differences between their inputs and actual W%, which I have already written about in one form or another.  
Next is eCTR, which uses standard Pythagenpat W-L record as its starting point:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_moz5d-MCh8g/TVIPywrWDEI/AAAAAAAAAvU/uXlO9XyggHA/s1600/ctr9.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://2.bp.blogspot.com/_moz5d-MCh8g/TVIPywrWDEI/AAAAAAAAAvU/uXlO9XyggHA/s320/ctr9.jpg" width="147" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Finally, pCTR, based on Runs Created/Runs Created Allowed used to fuel Pythagenpat:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_moz5d-MCh8g/TVIPzgLtLqI/AAAAAAAAAvY/UVeRnD-O8_4/s1600/ctr10.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_moz5d-MCh8g/TVIPzgLtLqI/AAAAAAAAAvY/UVeRnD-O8_4/s320/ctr10.jpg" width="148" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Here the AL East really looks good, with the top four teams in baseball.  The potential for this type of clustering of strong teams (and, in the case of the NL Central, lousy teams) in one division is one of the reasons I oppose treating winners of small divisions playing with unbalanced schedules as sacrosanct.&lt;br /&gt;&lt;br /&gt;pCTR and the associated aW% are the closest in construction to other popular ratings that account for strength of schedule and use component rather than actual W-L inputs, namely the third-order records published by &lt;a href="http://www.baseballprospectus.com/statistics/standings.php"&gt;Baseball Prospectus&lt;/a&gt;&amp;nbsp;and the TPI rankings figured by Justin at &lt;a href="http://www.beyondtheboxscore.com/section/power-rankings-2"&gt;Beyond the Box Score&lt;/a&gt;.  
Here are the most comparable winning percentages for each methodology--my aW% based on pCTR, the third-order winning percentage from BP, and the TPI from Justin:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_moz5d-MCh8g/TVIShysCVCI/AAAAAAAAAvc/WrZpbQZ46cw/s1600/ctr11.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://3.bp.blogspot.com/_moz5d-MCh8g/TVIShysCVCI/AAAAAAAAAvc/WrZpbQZ46cw/s320/ctr11.jpg" width="152" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;You can see that there is general agreement between all of the methods, which is a good sign.  The best correlation is between CTR and BP (+.983); the worst between CTR and Justin (+.942), with the BP/Justin correlation falling in the middle (+.965).  The methods are all in general agreement about the proper spread of the teams--the standard deviation of the BP and Justin figures is .059 compared to .060 for the CTR-based figures, .060 for my estimate of PW% without any schedule adjustments, and .068 for actual W%.&lt;br /&gt;&lt;br /&gt;The BP approach and my own to generating the underlying W% estimate are essentially the same, except for the use of different run estimators (BP uses EqR and I use Base Runs).  Justin’s approach is a little different, and I’m personally not wild about it--it breaks defense down into pitching and fielding, making use of FIP and defensive metrics like UZR and Dewan’s Runs Saved.  The approach used by BP and myself looks at the actual total component statistics surrendered by the defense.  
This does not allow one to split defense into pitching and fielding, but it also makes use of the actual observed interaction between the two on the field rather than using estimates that might make sense in isolation but leave something missing when the two are considered as one unit.&lt;br /&gt;&lt;br /&gt;In any event, it is encouraging to see that CTR is able to produce similar results to systems developed by others that have been around a little or a lot longer as the case may be.  If CTR returned very different results, I would probably conclude that it had a serious methodological error rather than the minor though not insignificant flaws that I am already aware of.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-7068197084267035184?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/7068197084267035184/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/02/crude-team-ratings-2010.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7068197084267035184'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/7068197084267035184'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/02/crude-team-ratings-2010.html' title='Crude Team Ratings 2010'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_moz5d-MCh8g/TVIPvgAJ7nI/AAAAAAAAAvI/69xPDW5nacE/s72-c/ctr6.jpg' 
height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-2276508931122597889</id><published>2011-02-06T23:17:00.000-05:00</published><updated>2011-02-06T23:17:29.600-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Whimsy'/><title type='text'>Great Moments in Hulu Sports Information</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_moz5d-MCh8g/TU9yU-hNyAI/AAAAAAAAAvE/QR602KxBajI/s1600/hulu.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="97" src="http://1.bp.blogspot.com/_moz5d-MCh8g/TU9yU-hNyAI/AAAAAAAAAvE/QR602KxBajI/s320/hulu.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-2276508931122597889?l=walksaber.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/2276508931122597889/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/02/great-moments-in-hulu-sports.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2276508931122597889'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/2276508931122597889'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/02/great-moments-in-hulu-sports.html' title='Great Moments in Hulu Sports Information'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_moz5d-MCh8g/TU9yU-hNyAI/AAAAAAAAAvE/QR602KxBajI/s72-c/hulu.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-12133335.post-6085721443482584043</id><published>2011-01-31T01:15:00.000-05:00</published><updated>2011-01-31T01:15:00.215-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Sabermetrics'/><category scheme='http://www.blogger.com/atom/ns#' term='Win Estimators'/><title type='text'>Crude Team Ratings</title><content type='html'>This methodology is really unnecessary, and very possibly has a flaw somewhere inside that I was unable to spot.  However, one night in September the Rays/Yankees game I was watching went into a rain delay, and not wanting to do any real work I used this time to fiddle around with a rating system for teams that incorporated strength of schedule.  Never one to simply let an idle experiment rest in peace without milking a couple of blogposts out of it, I am compelled to describe the Crude Team Rating (CTR) here.&lt;br /&gt;&lt;br /&gt;There is nothing novel about the idea--similar ratings which are likely based on better theory are published by Baseball Prospectus, Beyond the Box Score, Andy Dolphin, Baseball-Reference, and others.  The concept is simple; the execution in this case is likely muddled.&lt;br /&gt;&lt;br /&gt;Let me offer an example of how this works with a hypothetical four-team league composed of the Alphas, Bravos, Charlies, and Ekos.  The 90-game schedule is not balanced; the Alphas/Bravos and Charlies/Ekos are in "divisions" and play each other 40 times, playing their cross-divisional foes 25 times each.  The Alphas go 25-15 against the Bravos and 17-8 against each of the Charlies and Ekos; the Bravos go 14-11 against each of the Charlies and Ekos; the Charlies go 24-16 against the Ekos.  
We wind up with these standings:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_moz5d-MCh8g/TUXVI7A53tI/AAAAAAAAAus/rokwYUenhLs/s1600/ctr1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_moz5d-MCh8g/TUXVI7A53tI/AAAAAAAAAus/rokwYUenhLs/s1600/ctr1.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The Bravos and Charlies are equal in the standings, but we have every reason to believe that the Bravos are a better team--they won their season series against both teams from the other division, and had to play forty games against the Alphas, who are clearly the league's dominant team.  Obviously strength of schedule worked against the Bravos.&lt;br /&gt;&lt;br /&gt;Let's start by looking at each team's win ratio (W/L, which of course is also W%/(1 - W%)):&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_moz5d-MCh8g/TUXVJWyGYVI/AAAAAAAAAuw/zDlrs2eAYBU/s1600/ctr2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_moz5d-MCh8g/TUXVJWyGYVI/AAAAAAAAAuw/zDlrs2eAYBU/s1600/ctr2.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;These do not average to 1, because of the nature of working with ratios rather than percentages (the average W% is .500 in this league--don't worry, I made sure my standings added up).  The average win ratio for the league will always be greater than one, unless all of the teams are .500.  What we can do now is calculate the average win ratio for each team's opponents.  The Alphas' opponents had an average win ratio of (40*.915 + 25*.915 + 25*.636)/90 = .838.  This is the initial strength of schedule, S1.&lt;br /&gt;&lt;br /&gt;Next, we can adjust each team's win ratio for their strength of schedule.  First, though, I need to point out a flaw that is inherent in the CTR approach--it does not recognize the degree to which a team's SOS is affected by their quality.  
The Alphas will take a hit on this count, because all of the teams they play against have poor records.  Of course, based on what we know, the Alphas actually play the second-toughest schedule in the league because of the forty games with the Bravos.  To the algorithm used here, they have a weak schedule because all the teams they play have losing records.  That is true, but the reason the Bravos have a losing record is &lt;i&gt;because they play the Alphas so often&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;I could attempt to adjust for this in some way, but I've chosen to let it slide--that's one of many reasons why these are self-proclaimed crude ratings.  Of course, the effect is much more pronounced in this hypothetical, where teams play 44% of their games against the same opponent, and is much less pronounced in a 30-team league in which the most frequent opponents meet in only 12% of their games.&lt;br /&gt;&lt;br /&gt;Once we have the initial strength of schedule S1, we need to find avg(S1)--the average S1 for all teams in the league.  In this example it is 1.092.  To adjust each team's initial win ratio (W0) for SOS, we multiply it by the ratio of S1 to avg(S1).  This gives us the first-iteration adjusted win ratio, which I'll call W1:&lt;br /&gt;&lt;br /&gt;W1 = W0*(S1/Avg(S1))&lt;br /&gt;&lt;br /&gt;For the Alphas: W1 = 1.903*.838/1.092 = 1.459&lt;br /&gt;&lt;br /&gt;Now that we have an adjusted win ratio, we can re-estimate SOS in the same manner as before, producing S2.  This is necessary because we now have a better estimate of the quality of each team, and that knowledge should be reflected in the SOS estimate.  S2 for the Alphas is .916.  &lt;br /&gt;&lt;br /&gt;In order to find W2, we need to compare S2 to S1.  We can't simply apply S2 to W1, because W1 already includes an adjustment for SOS.  S2 supersedes S1; it doesn't act upon it as a multiplier or addition.  We also can't apply S2 to W0, the initial win ratio, because the schedule adjustment of S2 is based on each team's W1. 
 At each step in the process, the previous iteration has to be seen as the new starting point, and the adjustment has to be a comparison between the new iteration and the one that directly preceded it.&lt;br /&gt;&lt;br /&gt;So W2 is figured in the same manner as W1, except the ratio of S2 to the average is compared to the same for S1:&lt;br /&gt;&lt;br /&gt;W2 = W1*(S2/Avg(S2))/(S1/Avg(S1))&lt;br /&gt;&lt;br /&gt;For the Alphas, W2 = 1.459*(.916/1.029)/(.838/1.092) = 1.692&lt;br /&gt;&lt;br /&gt;Then we use W2 to figure S3, and use S3 to figure W3, and the process continues through as many iterations as you feel like setting up in Excel (at least for me).  For the purpose of this example, I did nine; for the actual spreadsheet for ML teams, I went a little overboard and did thirty iterations.  That results in a "final" estimate of win ratio, W9.  It is not of course truly final as that would be W(infinity).  The "final" estimate of SOS is S10, so that the last estimate of SOS is based on the last estimate of team quality.&lt;br /&gt;&lt;br /&gt;One undesirable effect of the iterative process is that the average W9 is no longer equal to the actual average win ratio for the league, and the distribution is not the same.  Thus, when the adjusted win ratios are converted into winning percentages (W% = WR/(1 + WR)), they are not guaranteed to average to .500, which of course is a logical must.&lt;br /&gt;&lt;br /&gt;In order to convert W9 into an adjusted winning percentage, first figure an initial W% for each team:&lt;br /&gt;&lt;br /&gt;iW% = W9/(1 + W9)&lt;br /&gt;&lt;br /&gt;Dividing by the average of iW% will force the new adjusted W% to average to .500 for the league:&lt;br /&gt;&lt;br /&gt;aW% = iW%/Avg(iW%)*.5&lt;br /&gt;&lt;br /&gt;The results for the theoretical league are illustrative of the strength of schedule problem I touched on earlier--the Alphas' aW% is lower than their actual W%, as they did not have to play against themselves.  
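&lt;br /&gt;&lt;br /&gt;For anyone who would rather follow the iteration in code than in spreadsheet columns, here is a rough Python sketch of the steps described above.  It is an illustration under stated assumptions rather than the actual spreadsheet: the function names (schedule_strength, crude_team_ratings) are made up for this example, and the overall records (59-31, 43-47, 43-47, 35-55) are inferred from the win ratios quoted in the example (1.903, .915, .915, .636).&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
def mean(values):
    """Plain arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)


def schedule_strength(ratings, games):
    """Average opponent rating for each team, weighted by games played."""
    sos = {}
    for team, opponents in games.items():
        total_games = sum(opponents.values())
        weighted = sum(n * ratings[opp] for opp, n in opponents.items())
        sos[team] = weighted / total_games
    return sos


def crude_team_ratings(w0, games, iterations=9):
    """Iterate win ratios against schedule strength; return (CTR, SOS) on a 100 scale."""
    s_prev = schedule_strength(w0, games)                 # S1
    avg_prev = mean(s_prev.values())
    w = {t: w0[t] * s_prev[t] / avg_prev for t in w0}     # W1
    for _ in range(iterations - 1):                       # W2 through W(iterations)
        s = schedule_strength(w, games)
        avg = mean(s.values())
        # Compare the new schedule factor to the one already baked into w.
        w = {t: w[t] * (s[t] / avg) / (s_prev[t] / avg_prev) for t in w}
        s_prev, avg_prev = s, avg
    s_final = schedule_strength(w, games)                 # S(iterations + 1)
    avg_w = mean(w.values())
    avg_s = mean(s_final.values())
    ctr = {t: 100 * w[t] / avg_w for t in w}
    sos = {t: 100 * s_final[t] / avg_s for t in w}
    return ctr, sos


# Assumed inputs: overall records inferred from the quoted win ratios.
records = {"Alphas": (59, 31), "Bravos": (43, 47),
           "Charlies": (43, 47), "Ekos": (35, 55)}
# Games against each opponent: 40 within a division, 25 across.
games = {
    "Alphas": {"Bravos": 40, "Charlies": 25, "Ekos": 25},
    "Bravos": {"Alphas": 40, "Charlies": 25, "Ekos": 25},
    "Charlies": {"Ekos": 40, "Alphas": 25, "Bravos": 25},
    "Ekos": {"Charlies": 40, "Alphas": 25, "Bravos": 25},
}
w0 = {team: wins / losses for team, (wins, losses) in records.items()}

ctr, sos = crude_team_ratings(w0, games)
for team in sorted(ctr, key=ctr.get, reverse=True):
    print(f"{team:9s} CTR {ctr[team]:6.1f}  SOS {sos[team]:6.1f}")
```
&lt;/pre&gt;With these inputs the first pass matches the figures quoted above for the Alphas (S1 of about .838 and W1 of about 1.459), and the ratings average to 100 by construction.&lt;br /&gt;&lt;br /&gt;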
It really doesn't make sense to consider what a team's record would be if they played themselves, but thankfully this distortion becomes smaller as the number of teams in a league increases.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_moz5d-MCh8g/TUXVJ2jB6FI/AAAAAAAAAu0/H_RYgd7mzEo/s1600/ctr3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_moz5d-MCh8g/TUXVJ2jB6FI/AAAAAAAAAu0/H_RYgd7mzEo/s1600/ctr3.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;aW% might seem like a logical measure to use as the final outcome of the system, but I actually prefer a scale which remains grounded in win ratio.  So I convert W9 into the final product, Crude Team Rating, by dividing W9 by the average W9.  This is different from aW%, which forces the league's average winning percentage back to .500; this adjustment forces the average rating to 1 (or 100 when the decimal point is dropped), which is a nice property to have for the overall rating:&lt;br /&gt;&lt;br /&gt;CTR = W9/Avg(W9)&lt;br /&gt;&lt;br /&gt;In the same manner, a final strength of schedule metric that is centered at 1 is:&lt;br /&gt;&lt;br /&gt;SOS = S10/Avg(S10)&lt;br /&gt;&lt;br /&gt;The CTR and SOS for the four hypothetical teams are:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_moz5d-MCh8g/TUXVK4_gG_I/AAAAAAAAAu4/jxtjnBtEiAo/s1600/ctr4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_moz5d-MCh8g/TUXVK4_gG_I/AAAAAAAAAu4/jxtjnBtEiAo/s1600/ctr4.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This post is already a mess to read, so at this point I'm going to break it up into sections on particular aspects of the methodology that I wish to expound upon:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Why is win ratio used throughout the process rather than W%?&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I use win ratio because it is much easier to work with; for instance, the 
ratios between the various SOS estimates can be multiplied by the win ratio to find an adjusted win ratio.  The math would not be anywhere near as straightforward if W% were used.  Using win ratio also allows for a larger spread in the final CTRs; I could use CTR = aW%/Avg(aW%), but the range would be narrower.&lt;br /&gt;&lt;br /&gt;Most importantly, though, is that win ratios can be plugged directly into Odds Ratio (equivalent to Log5) to estimate W% for matchups between teams.  If a team with a win ratio of 1.2 plays a team with a win ratio of .9, they can be expected to win 57.1% of the time--1.2/(1.2 + .9).  There would be equivalent but messier math if working with W%.&lt;br /&gt;&lt;br /&gt;Due to this property, we can use CTR to estimate head-to-head winning percentages.  CTR is not a true win ratio, since it has been re-centered at 100, but the re-centering is done with a common scalar and so it has no effect on ratios of team CTRs.  So a team with a CTR of 130 can be expected to win 52% of the time against a team with a 120 CTR--130/(130 + 120).  &lt;br /&gt;&lt;br /&gt;I estimated W% for the Alphas against each of their opponents using CTR.  The fact that they are close to the actual results should not be taken as any type of indication that the head-to-head estimates are accurate, as I obviously came up with the numbers to follow logically.  They are offered just to show how the ratings can be used to estimate W% in a head-to-head matchup:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_moz5d-MCh8g/TUXVLWKWmeI/AAAAAAAAAu8/mf9eH9NYaBU/s1600/ctr5.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_moz5d-MCh8g/TUXVLWKWmeI/AAAAAAAAAu8/mf9eH9NYaBU/s1600/ctr5.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Your ranking system is novel and unique, right?&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Wrong.  There's nothing new about it.  
It is really quite similar to the Simple Ranking System used by the Sports-Reference sites, except it operates on ratios rather than differentials.  As mentioned above, there are a number of similar and likely more refined approaches utilized by other analysts.&lt;br /&gt;&lt;br /&gt;It's really quite simple:&lt;br /&gt;&lt;br /&gt;1. Assign each team an initial ranking&lt;br /&gt;2. Use those initial rankings to estimate SOS for each team&lt;br /&gt;3. Compare SOS to the average schedule and adjust initial ranking accordingly&lt;br /&gt;4. Repeat until rankings stabilize&lt;br /&gt;&lt;br /&gt;&lt;i&gt;You haven't adjusted for home-field advantage, have you?&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;No, I haven't, either in the process of estimating team strength (I don't break down a team's schedule into home and road games) or by producing adjustments for CTR at home and on the road (i.e. having a formula that tells you that a 130 overall CTR team has an equivalent 110 CTR on the road and 150 at home).  The former falls outside the scope of an admittedly crude rating system; the latter is something that is easy enough to account for on the fly.&lt;br /&gt;&lt;br /&gt;While there is a lot that could be said about incorporating home field advantage (writing about some of it is on my to-do list), the simplest thing to do is to incorporate a home field edge into the odds ratio calculation.  The long-term major league average is for the home team to win 54% of the time, which is a win ratio of 1.174.  I call the square root of that win ratio "h" (1.083).  If a team is away, divide their CTR by h; if they are home, multiply it by h.&lt;br /&gt;&lt;br /&gt;Suppose that we have a 125 CTR team hosting a 110 CTR team.  Ignoring home field advantage, we'd expect the home team to win 125/(125 + 110) = 53.2% of the time.  
But the 125 team is now an effective 135.4 team (125*1.083), and the 110 team is now an effective 101.6 team (110/1.083), and so the expected outcome is now a home win 57.1% of the time.&lt;br /&gt;&lt;br /&gt;Equivalently, one can figure the odds ratio probability by first dividing the two CTRs (125/110 = 1.136), and then dividing that ratio by itself plus one (1.136/2.136 = 53.2%).  If you approach it in this manner, you need to multiply the ratio by h^2 (1.174) rather than h (125/110*1.174 = 1.334 and 1.334/2.334 = 57.1%).  I prefer the former method, because it produces a distinct new rating for each team based on whether they are home or away rather than accounting for it all in one step, but it is a matter of preference and has no computational impact.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;CTR is based on actual win ratio.  Why don't you use expected win ratio from runs and runs allowed or runs created and runs created allowed?&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;One can very easily substitute expected win ratio for actual win ratio.  I have three variations: eCTR, gCTR, and pCTR.  eCTR uses win ratio estimated from actual runs scored and allowed, while gCTR uses "Game EW%" (based on the distributions of runs scored and runs allowed, each taken independently, as described in &lt;a href="http://walksaber.blogspot.com/2011/01/run-distribution-and-w-2010.html"&gt;this post&lt;/a&gt;) and pCTR uses win ratio estimated from runs created and runs created allowed.&lt;br /&gt;&lt;br /&gt;The discussion of what inputs to use helps to illustrate another flaw in the methodology--there is no regression built into the system.  For the purpose of ranking a team's actual W-L results, there is no need for regression, but if one is using the system to estimate a team's W% against an opponent, it is incorrect to assume that a team's sample W% is equal to the true probability of them winning a game.  
Even if one did not want to regress the W/L ratio of the team being rated, it would make sense to regress the records of their opponents in figuring strength of schedule.  I've done neither.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Are there any other potential applications of CTR?&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;One can always think up different ways to use a method like this (and since methods similar in spirit but superior in execution to CTR already exist many of them have been implemented already); the question is whether the results are robust enough to provide value in a given function.  I'll offer one possible use here, which is using a team's strength of schedule adjustment to estimate an adjustment factor for their players' performance.&lt;br /&gt;&lt;br /&gt;There are some perfectly good reasons why one would not want to adjust individual performance for the strength of his opponents, but if that is something in which you are interested, CTR might be useful.  If a team's opponents have an expected win ratio of .9, then based on the Pythagorean formula with an exponent of two, their equivalent run ratio should be sqrt(.9) = .949.  Custom exponents could be used as well, of course, but two will suffice for this example.&lt;br /&gt;&lt;br /&gt;So a pitcher on a team with a .9 SOS could have his ERA adjusted by dividing by .949 to account for the weaker quality of opposition.  This approach assumes that the team's opponents are evenly balanced between offense and defense.  One could put together a CTR-like system that broke down runs scored and allowed separately, but that would require the use of park factors or home/road breakdowns, and would greatly complicate matters.&lt;br /&gt;&lt;br /&gt;Speaking of park factors, one could use an iterative approach to calculate park factors (Pete Palmer's PF method takes this path).  
Instead of simply comparing a team's home and road RPG as I do, you could look at the team's road games in each park, calculate an initial adjustment, and iterate until the final park factors stabilized.  At some point I’ll cover this application in detail.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Wait, there's something messed up with the 100 scale.  A .500 team will not get a 100 CTR.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;You're right, they won't, but it's not a problem with the scale--it's a feature of it.  (You are free to prefer a different scale, of course).  This issue is hinted at in the section about why win ratio is used, but I didn't address it explicitly there.  The scale is designed so that a team with an average win ratio gets a 100 rating, not so that an average team (which would by definition play .500) gets a 100 rating.&lt;br /&gt;&lt;br /&gt;Others have pointed out the potential dangers in working with ratios rather than percentages--assuming that ratios work as percentages can result in mathematical blunders.  Suppose we have two football teams that comprise a league, one which goes 1-15 and the other that goes 15-1.  Obviously, the average record for a team in this league is 8-8 with a winning percentage of .500.  Such a team would have a win ratio of 1.&lt;br /&gt;&lt;br /&gt;But what is the average win ratio for a team in this league?  Not the win ratio for a hypothetical average team--the arithmetic mean of the win ratios of the teams in this league.  It is (15/1 + 1/15)/2 = 7.533.  It's not even close to 1.&lt;br /&gt;&lt;br /&gt;Obviously this is an extreme example, but the principle holds--teams that are an equal distance from .500 will not see their win ratios balance to 1.  The average win ratio for a real league will always be &amp;gt;= 1.  The effect is stronger in leagues in which win ratios deviate more from 1.  In the 2009 majors, for instance, the average win ratio was 1.039.  
In the 2009 NFL, it was 1.386.&lt;br /&gt;&lt;br /&gt;I could have set up the ratings so that a team with a .500 record (i.e. a win ratio of 1) was assured of receiving a ranking of 100 by simply not dividing what is called W(f) below by Avg[W(f)], but it also would have ensured that the average of all of the team ratings would not be 100.  In the first spreadsheet I put together, that's exactly what I did, but I decided that it was more annoying to have to remember what the league average of the ratings was (especially when looking at aggregate rankings for divisions and leagues) than it was to remember that a .500 team would have a ranking of 96 or so.  It's purely a matter of aesthetics and personal preference.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Generic Formulas&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;W0 = W/L for aCTR; gEW%/(1 - gEW%) for gCTR; (R/RA)^x for eCTR; (RC/RCA)^y for pCTR&lt;br /&gt;where x = ((R + RA)/G)^z and y = ((RC + RCA)/G)^z, where z is the Pythagenpat exponent (I use .29 out of habit)&lt;br /&gt;&lt;br /&gt;for a league of t teams, where G is total games played for a team and g(i) is the number of games against a particular opponent i:&lt;br /&gt;&lt;br /&gt;S1 = (1/G)*{SUM(i = 1 to t)[g(i)*W0(i)]}&lt;br /&gt;&lt;br /&gt;W1 = W0*(S1/Avg(S1))&lt;br /&gt;&lt;br /&gt;S(n) for n &amp;gt; 1 = (1/G)*{SUM(i = 1 to t)[g(i)*W(n-1)(i)]}&lt;br /&gt;&lt;br /&gt;W(n) for n &amp;gt; 1 = W(n-1)*(S(n)/Avg(S(n)))/(S(n-1)/Avg(S(n-1)))&lt;br /&gt;&lt;br /&gt;For final win iteration (f) in the specific implementation (f = 30 in my spreadsheet):&lt;br /&gt;&lt;br /&gt;S(f + 1) = (1/G)*{SUM(i = 1 to t)[g(i)*W(f)(i)]}&lt;br /&gt;&lt;br /&gt;iW% = W(f)/[1 + W(f)]&lt;br /&gt;&lt;br /&gt;aW% = iW%/Avg(iW%)*.5&lt;br /&gt;&lt;br /&gt;CTR = W(f)/Avg(W(f))&lt;br /&gt;&lt;br /&gt;SOS = S(f + 1)/Avg(S(f + 1))&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/12133335-6085721443482584043?l=walksaber.blogspot.com' alt='' 
/&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://walksaber.blogspot.com/feeds/6085721443482584043/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://walksaber.blogspot.com/2011/01/crude-team-ratings.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6085721443482584043'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/12133335/posts/default/6085721443482584043'/><link rel='alternate' type='text/html' href='http://walksaber.blogspot.com/2011/01/crude-team-ratings.html' title='Crude Team Ratings'/><author><name>p</name><uri>http://www.blogger.com/profile/18057215403741682609</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='htt
