Saturday, October 04, 2014

End of Season Statistics, 2014

These reports will be trickling out over the next month as I don’t have as much time as I’d like to devote to them right now. I wanted to get the park factors out as soon as possible and everything else will be added later.

The spreadsheets are published as Google Spreadsheets, which you can download in Excel format by changing the extension in the address from "=html" to "=xls". That way you can download them and manipulate things however you see fit.

The data comes from a number of different sources. Most of the basic data comes from Doug's Stats, which is a very handy site, or Baseball-Reference. KJOK's park database provided some of the data used in the park factors, but for recent seasons park data comes from B-R. Data on pitchers' batted ball types allowed, doubles/triples allowed, and inherited/bequeathed runners comes from Baseball Prospectus.

The basic philosophy behind these stats is to use the simplest methods that have acceptable accuracy. Of course, "acceptable" is in the eye of the beholder, namely me. I use Pythagenpat not because other run/win converters, like a constant RPW or a fixed exponent, are not accurate enough for this purpose, but because it's mine and it would be kind of odd if I didn't use it.

If I seem to be a stickler for purity in my critiques of others' methods, I'd contend it is usually in a theoretical sense, not an input sense. So when I exclude hit batters, I'm not saying that hit batters are worthless or that they *should* be ignored; it's just easier not to mess with them and not that much less accurate.

I also don't really have a problem with people using sub-standard methods (say, Basic RC) as long as they acknowledge that they are sub-standard. If someone pretends that Basic RC doesn't undervalue walks or cause problems when applied to extreme individuals, I'll call them on it; if they explain its shortcomings but use it regardless, I accept that. Take these last three paragraphs as my acknowledgment that some of the statistics displayed here have shortcomings as well.

The League spreadsheet is pretty straightforward--it includes league totals and averages for a number of categories, most or all of which are explained at appropriate junctures throughout this piece. The advent of interleague play has created two different sets of league totals--one for the offense of league teams and one for the defense of league teams. Before interleague play, these two were identical. I do not present both sets of totals (you can figure the defensive ones yourself from the team spreadsheet, if you desire), just those for the offenses. The exception is for the defense-specific statistics, like innings pitched and quality starts. The figures for those categories in the league report are for the defenses of the league's teams. However, I do include each league's breakdown of basic pitching stats between starters and relievers (denoted by "s" or "r" prefixes), and so summing those will yield the totals from the pitching side. The one abbreviation you might not recognize is "N"--this is the league average of runs/game for one team, and it will pop up again.

The Team spreadsheet focuses on overall team performance--wins, losses, runs scored, runs allowed. The columns included are: Park Factor (PF), Home Run Park Factor (PFhr), Winning Percentage (W%), Expected W% (EW%), Predicted W% (PW%), wins, losses, runs, runs allowed, Runs Created (RC), Runs Created Allowed (RCA), Home Winning Percentage (HW%), Road Winning Percentage (RW%) [exactly what they sound like--W% at home and on the road], Runs/Game (R/G), Runs Allowed/Game (RA/G), Runs Created/Game (RCG), Runs Created Allowed/Game (RCAG), and Runs Per Game (the average number of runs scored and allowed per game). Ideally, I would use outs as the denominator, but for teams, outs and games are so closely related that I don’t think it’s worth the extra effort.

The runs and Runs Created figures are unadjusted, but the per-game averages are park-adjusted, except for RPG which is also raw. Runs Created and Runs Created Allowed are both based on a simple Base Runs formula. The formula is:

A = H + W - HR - CS
B = (2TB - H - 4HR + .05W + 1.5SB)*.76
C = AB - H
D = HR
Naturally, A*B/(B + C) + D.
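As a rough sketch, the Base Runs calculation above could be coded like this in Python. The function name and the sample team line are invented for illustration; only the A/B/C/D factors come from the formula above.

```python
def base_runs(ab, h, tb, hr, w, sb, cs):
    """Simple team Base Runs estimate, using the A/B/C/D factors given above."""
    a = h + w - hr - cs                                      # baserunners
    b = (2 * tb - h - 4 * hr + 0.05 * w + 1.5 * sb) * 0.76   # advancement
    c = ab - h                                               # outs
    d = hr                                                   # automatic runs
    return a * b / (b + c) + d


# Hypothetical team totals, made up purely to exercise the function.
team_rc = base_runs(ab=5500, h=1400, tb=2200, hr=150, w=500, sb=100, cs=40)
print(round(team_rc, 1))
```

The same function works for Runs Created Allowed if you feed it the totals a team's pitchers allowed.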

I have explained the methodology used to figure the PFs before, but the Cliff’s Notes version is that they are based on five years of data when applicable, include both runs scored and allowed, and they are regressed towards average (PF = 1), with the amount of regression varying based on the number of years of data used. There are factors for both runs and home runs. The initial PF (not shown) is:

iPF = (H*T/(R*(T - 1) + H) + 1)/2
where H = RPG in home games, R = RPG in road games, T = # teams in league (14 for AL and 16 for NL). Then the iPF is converted to the PF by taking x*iPF + (1-x), where x = .6 if one year of data is used, .7 for 2, .8 for 3, and .9 for 4+.
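Here is a minimal sketch of the iPF and regression steps just described. The function names and the sample home/road RPG values are assumptions for illustration; the weights (.6, .7, .8, .9) are the ones listed above.

```python
def initial_pf(home_rpg, road_rpg, teams):
    """iPF = (H*T/(R*(T - 1) + H) + 1)/2, as defined above."""
    return (home_rpg * teams / (road_rpg * (teams - 1) + home_rpg) + 1) / 2


def park_factor(home_rpg, road_rpg, teams, years):
    """Regress the iPF toward 1 based on how many years of data are used."""
    if years < 1:
        raise ValueError("need at least one year of data")
    x = {1: 0.6, 2: 0.7, 3: 0.8}.get(years, 0.9)   # .9 for four or more years
    return x * initial_pf(home_rpg, road_rpg, teams) + (1 - x)


# Hypothetical park: 10 RPG at home, 9 RPG on the road, 14-team league, 5 years.
print(round(park_factor(10.0, 9.0, 14, 5), 3))
```

A park with identical home and road RPG comes out at exactly 1 regardless of the regression weight, which is a handy sanity check.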

It is important to note, since there always seems to be confusion about this, that these park factors already incorporate the fact that the average player plays 50% on the road and 50% at home. That is what the adding one and dividing by 2 in the iPF is all about. So if I list Fenway Park with a 1.02 PF, that means that it actually increases RPG by 4%.

In the calculation of the PFs, I did not get picky and take out “home” games that were actually at neutral sites.

There are also Team Offense and Defense spreadsheets. These include the following categories:

Team offense: Plate Appearances, Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Walks per At Bat (WAB), Isolated Power (ISO), R/G at home (hR/G), and R/G on the road (rR/G). BA, OBA, SLG, WAB, and ISO are park-adjusted by dividing by the square root of park factor (or the equivalent; WAB = (OBA - BA)/(1 - OBA) and ISO = SLG - BA).

Team defense: Innings Pitched, BA, OBA, SLG, Innings per Start (IP/S), Starter's eRA (seRA), Reliever's eRA (reRA), Quality Start Percentage (QS%), RA/G at home (hRA/G), RA/G on the road (rRA/G), Battery Mishap Rate (BMR), Modified Fielding Average (mFA), and Defensive Efficiency Record (DER). BA, OBA, and SLG are park-adjusted by dividing by the square root of PF; seRA and reRA are divided by PF.

The three fielding metrics I've included are limited to metrics that a) I can calculate myself and b) are based on the basic available data, not specialized PBP data. The three metrics are explained in this post, but here are quick descriptions of each:

1) BMR--wild pitches and passed balls per 100 baserunners = (WP + PB)/(H + W - HR)*100

2) mFA--fielding average removing strikeouts and assists = (PO - K)/(PO - K + E)

3) DER--the Bill James classic, using only the PA-based estimate of plays made. Based on a suggestion by Terpsfan101, I've tweaked the error coefficient. Plays Made = PA - K - H - W - HR - HB - .64E and DER = PM/(PM + H - HR + .64E)
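The three fielding metrics translate directly into code. This is only a sketch: the function names and sample totals are made up, and the formulas are copied as written above, including the .64 error coefficient.

```python
def bmr(wp, pb, h, w, hr):
    """Battery Mishap Rate: wild pitches and passed balls per 100 baserunners."""
    return (wp + pb) / (h + w - hr) * 100


def mfa(po, k, e):
    """Modified Fielding Average: putouts less strikeouts, no assists."""
    return (po - k) / (po - k + e)


def der(pa, k, h, w, hr, hb, e, err_coef=0.64):
    """Defensive Efficiency Record from the PA-based plays-made estimate above."""
    plays_made = pa - k - h - w - hr - hb - err_coef * e
    return plays_made / (plays_made + h - hr + err_coef * e)


# Hypothetical team defense totals, for illustration only.
print(round(bmr(50, 10, 1400, 500, 150), 2))
print(round(mfa(4300, 1200, 100), 4))
print(round(der(6100, 1200, 1400, 500, 150, 50, 100), 3))
```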

Next are the individual player reports. I defined a starting pitcher as one with 15 or more starts. All other pitchers are eligible to be included as a reliever. If a pitcher has 40 appearances, then they are included. Additionally, if a pitcher has 50 innings and less than 50% of his appearances are starts, he is also included as a reliever (this allows some swingmen type pitchers who wouldn’t meet either the minimum start or appearance standards to get in).

For all of the player reports, ages are based on simply subtracting their year of birth from 2014. I realize that this is not compatible with how ages are usually listed and so “Age 27” doesn’t necessarily correspond to age 27 as I list it, but it makes everything a heckuva lot easier, and I am more interested in comparing the ages of the players to their contemporaries, for which case it makes very little difference. The "R" category records rookie status with a "R" for rookies and a blank for everyone else; I've trusted Baseball Prospectus on this. Also, all players are counted as being on the team with whom they played/pitched (IP or PA as appropriate) the most.

For relievers, the categories listed are: Games, Innings Pitched, estimated Plate Appearances (PA), Run Average (RA), Relief Run Average (RRA), Earned Run Average (ERA), Estimated Run Average (eRA), DIPS Run Average (dRA), Strikeouts per Game (KG), Walks per Game (WG), Guess-Future (G-F), Inherited Runners per Game (IR/G), Batting Average on Balls in Play (%H), Runs Above Average (RAA), and Runs Above Replacement (RAR).

IR/G is per relief appearance (G - GS); it is an interesting thing to look at, I think, in lieu of actual leverage data. You can see which closers come in with runners on base, and which are used nearly exclusively to start innings. Of course, you can’t infer too much; there are bad relievers who come in with a lot of people on base, not because they are being used in high leverage situations, but because they are long men being used in low-leverage situations already out of hand.

For starting pitchers, the columns are: Wins, Losses, Innings Pitched, Estimated Plate Appearances (PA), RA, RRA, ERA, eRA, dRA, KG, WG, G-F, %H, Pitches/Start (P/S), Quality Start Percentage (QS%), RAA, and RAR. RA and ERA you know--R*9/IP or ER*9/IP, park-adjusted by dividing by PF. The formulas for eRA and dRA are based on the same Base Runs equation and they estimate RA, not ERA.

* eRA is based on the actual results allowed by the pitcher (hits, doubles, home runs, walks, strikeouts, etc.). It is park-adjusted by dividing by PF.

* dRA is the classic DIPS-style RA, assuming that the pitcher allows a league average %H, and that his hits in play have a league-average S/D/T split. It is park-adjusted by dividing by PF.

The formula for eRA is:

A = H + W - HR
B = (2*TB - H - 4*HR + .05*W)*.78
C = AB - H = K + (3*IP - K)*x (where x is figured as described below for PA estimation and is typically around .93) = PA (from below) - H - W
eRA = (A*B/(B + C) + HR)*9/IP
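A sketch of the eRA calculation follows. The function name, the pf argument, and the sample pitching line are assumptions; x defaults to .93 only because that is the typical value mentioned above, and in practice it would be figured from the league totals. Innings should be passed as a true decimal (200 1/3 innings as 200.333, not 200.1).

```python
def era_estimate(ip, h, tb, hr, w, k, x=0.93, pf=1.0):
    """eRA from Base Runs, with C estimated from innings and strikeouts."""
    a = h + w - hr
    b = (2 * tb - h - 4 * hr + 0.05 * w) * 0.78
    c = k + (3 * ip - k) * x          # stands in for AB - H
    runs = a * b / (b + c) + hr
    return runs * 9 / ip / pf         # park-adjust by dividing by PF


# Hypothetical starter line, invented for illustration.
print(round(era_estimate(ip=200, h=180, tb=280, hr=20, w=50, k=180), 2))
```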

To figure dRA, you first need the estimate of PA described below. Then you calculate W, K, and HR per PA (call these %W, %K, and %HR). Percentage of balls in play (BIP%) = 1 - %W - %K - %HR. This is used to calculate the DIPS-friendly estimate of %H (H per PA) as e%H = Lg%H*BIP%.

Now everything has a common denominator of PA, so we can plug into Base Runs:

A = e%H + %W
B = (2*(z*e%H + 4*%HR) - e%H - 5*%HR + .05*%W)*.78
C = 1 - e%H - %W - %HR
dRA = (A*B/(B + C) + %HR)/C*a

z is the league average of total bases per non-HR hit (TB - 4*HR)/(H - HR), and a is the league average of (AB - H) per game.
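The per-PA version of Base Runs used for dRA can be sketched as below. The function and argument names are invented, and the league inputs in the example (a .300 Lg%H, z of 1.3, and 25.5 outs per game) are placeholder values rather than actual 2014 figures.

```python
def dra(pa, w, k, hr, lg_pct_h, z, lg_outs_per_game, pf=1.0):
    """DIPS-style RA: league-average %H and hit mix on balls in play."""
    pct_w, pct_k, pct_hr = w / pa, k / pa, hr / pa
    bip = 1 - pct_w - pct_k - pct_hr            # BIP%
    e_h = lg_pct_h * bip                        # e%H, non-HR hits per PA
    a = e_h + pct_w
    b = (2 * (z * e_h + 4 * pct_hr) - e_h - 5 * pct_hr + 0.05 * pct_w) * 0.78
    c = 1 - e_h - pct_w - pct_hr
    return (a * b / (b + c) + pct_hr) / c * lg_outs_per_game / pf


# Hypothetical pitcher with 800 estimated PA.
print(round(dra(pa=800, w=50, k=180, hr=20, lg_pct_h=0.30, z=1.3,
                lg_outs_per_game=25.5), 2))
```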

In the past couple years I’ve presented a couple of batted ball RA estimates. I’ve removed these this year, not just because batted ball data exhibits questionable reliability but because these metrics were complicated to figure, required me to collate the batted ball data, and were not personally useful to me. I figure these stats for my own enjoyment and have in some form or another going back to 1997. I share them here only because I would do it anyway, so if I’m not interested in certain categories, there’s no reason to keep presenting them.

Instead, I’m showing strikeout and walk rate, both expressed as per game. By game I mean not 9 innings but rather the league average of PA/G. I have always been a proponent of using PA and not IP as the denominator for non-run pitching rates, and now the use of per PA rates is widespread. Usually these are expressed as K/PA and W/PA, or equivalently, percentage of PA with a strikeout or walk. I don’t believe that any site publishes these as K and W per equivalent game as I am here. This is not better than K%--it’s simply applying a scalar multiplier. I like it because it generally follows the same scale as the familiar K/9.

To facilitate this, I’ve finally corrected a flaw in the formula I use to estimate plate appearances for pitchers. Previously, I’ve done it the lazy way by not splitting strikeouts out from other outs. I am now using this formula to estimate PA (where PA = AB + W):

PA = K + (3*IP - K)*x + H + W
Where x = league average of (AB - H - K)/(3*IP - K)

Then KG = K/PA*Lg(PA/G) and WG = W/PA*Lg(PA/G).
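To make the PA estimate and the per-game rates concrete, here is a small sketch. The names are invented, and both the .93 value for x and the league figure of 38 PA per game are placeholders; the rate is strikeouts (or walks) per estimated PA, scaled by the league average of PA/G as described above.

```python
def estimate_pa(ip, h, w, k, x):
    """PA = K + (3*IP - K)*x + H + W, with x taken from the league totals."""
    return k + (3 * ip - k) * x + h + w


def per_game_rate(events, pa, lg_pa_per_game):
    """Events per PA, scaled to the league-average number of PA in a game."""
    return events / pa * lg_pa_per_game


# Hypothetical pitcher; x = .93 and 38 PA/G are placeholder league values.
pa = estimate_pa(ip=200, h=180, w=50, k=180, x=0.93)
kg = per_game_rate(180, pa, 38.0)
wg = per_game_rate(50, pa, 38.0)
print(round(pa, 1), round(kg, 2), round(wg, 2))
```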

G-F is a junk stat, included here out of habit because I've been including it for years. It was intended to give a quick read of a pitcher's expected performance in the next season, based on eRA and strikeout rate. Although the numbers vaguely resemble RAs, it's actually unitless. As a rule of thumb, anything under four is pretty good for a starter. G-F = 4.46 + .095(eRA) - .113(K*9/IP). It is a junk stat. JUNK STAT JUNK STAT JUNK STAT. Got it?

%H is BABIP, more or less--%H = (H - HR)/(PA - HR - K - W), where PA was estimated above. Pitches/Start includes all appearances, so I've counted relief appearances as one-half of a start (P/S = Pitches/(.5*G + .5*GS)). QS% is just QS/GS; I don't think it's particularly useful, but Doug's Stats include QS so I include it.
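A short sketch of %H and Pitches/Start, with invented function names and sample inputs. The P/S denominator .5*G + .5*GS is the same thing as GS plus half of the relief appearances (G - GS).

```python
def pct_h(h, hr, pa, k, w):
    """%H: non-HR hits per ball in play, using the estimated PA."""
    return (h - hr) / (pa - hr - k - w)


def pitches_per_start(pitches, g, gs):
    """P/S, counting each relief appearance as half of a start."""
    return pitches / (0.5 * g + 0.5 * gs)


# Hypothetical pitcher: 40 games, 20 of them starts, 3000 pitches.
print(round(pct_h(180, 20, 800, 180, 50), 3))
print(round(pitches_per_start(3000, 40, 20), 1))
```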

I've used a stat called Relief Run Average (RRA) in the past, based on Sky Andrecheck's article in the August 1999 By the Numbers; that one only used inherited runners, but I've revised it to include bequeathed runners as well, making it equally applicable to starters and relievers. I am using RRA as the building block for baselined value estimates for all pitchers this year. I explained RRA in this article, but the bottom line formulas are:

BRSV = BRS - BR*i*sqrt(PF)
IRSV = IR*i*sqrt(PF) - IRS
RRA = ((R - (BRSV + IRSV))*9/IP)/PF
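The RRA formulas can be sketched as follows. The function and argument names are invented. The text above does not define i; the code treats it as the league-average rate at which inherited runners score, which is an assumption on my part, and the sample reliever is hypothetical.

```python
from math import sqrt


def rra(r, ip, br, brs, ir, irs, i, pf=1.0):
    """Relief Run Average from the BRSV/IRSV formulas above.

    br/brs: bequeathed runners and how many of them scored
    ir/irs: inherited runners and how many of them scored
    i:      assumed to be the league rate of inherited runners scoring
    """
    brsv = brs - br * i * sqrt(pf)     # bequeathed runs saved
    irsv = ir * i * sqrt(pf) - irs     # inherited runs saved
    return ((r - (brsv + irsv)) * 9 / ip) / pf


# Hypothetical reliever: 30 R in 60 IP, with i = .30 as a placeholder.
print(round(rra(r=30, ip=60, br=10, brs=2, ir=20, irs=4, i=0.30), 2))
```

With no inherited or bequeathed runners, the result collapses to ordinary park-adjusted RA.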

The two baselined stats are Runs Above Average (RAA) and Runs Above Replacement (RAR). RAA uses the league average runs/game (N) for both starters and relievers, while RAR uses separate replacement levels for starters and relievers. Thus, RAA and RAR will be pretty close for relievers:

RAA = (LgRA - RRA)*IP/9
RAR (relievers) = (1.11*LgRA - RRA)*IP/9
RAR (starters) = (1.28*LgRA - RRA)*IP/9
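In code, the pitcher baselines are one-liners. This is a sketch with invented function names and a made-up pitcher; the 1.11 and 1.28 multipliers are the ones in the formulas above.

```python
def pitcher_raa(rra, ip, lg_ra):
    """Runs Above Average, using RRA against the league RA."""
    return (lg_ra - rra) * ip / 9


def pitcher_rar(rra, ip, lg_ra, starter):
    """Runs Above Replacement: 128% of league RA for starters, 111% for relievers."""
    multiplier = 1.28 if starter else 1.11
    return (multiplier * lg_ra - rra) * ip / 9


# Hypothetical 3.50 RRA starter with 180 IP in a 4.5 RA league.
print(round(pitcher_raa(3.5, 180, 4.5), 1))
print(round(pitcher_rar(3.5, 180, 4.5, starter=True), 1))
```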

All players with 300 or more plate appearances are included in the Hitters spreadsheets (along with some players close to the cutoff point who I was interested in). Each is assigned one position, the one at which they appeared in the most games. The statistics presented are: Games played (G), Plate Appearances (PA), Outs (O), Batting Average (BA), On Base Average (OBA), Slugging Average (SLG), Secondary Average (SEC), Runs Created (RC), Runs Created per Game (RG), Speed Score (SS), Hitting Runs Above Average (HRAA), Runs Above Average (RAA), Hitting Runs Above Replacement (HRAR), and Runs Above Replacement (RAR).

I do not bother to include hit batters, so take note of that for players who do get plunked a lot. Therefore, PA are simply AB + W. Outs are AB - H + CS. BA and SLG you know, but remember that without HB and SF, OBA is just (H + W)/(AB + W). Secondary Average = (TB - H + W)/AB = SLG - BA + (OBA - BA)/(1 - OBA). I have not included net steals as many people (and Bill James himself) do--it is solely hitting events.

BA, OBA, and SLG are park-adjusted by dividing by the square root of PF. This is an approximation, of course, but I'm satisfied that it works well. The goal here is to adjust for the win value of offensive events, not to quantify the exact park effect on the given rate. I use the BA/OBA/SLG-based formula to figure SEC, so it is park-adjusted as well.

Runs Created is actually Paul Johnson's ERP, more or less. Ideally, I would use a custom linear weights formula for the given league, but ERP is just so darn simple and close to the mark that it’s hard to pass up. I still use the term “RC” partially as a homage to Bill James (seriously, I really like and respect him even if I’ve said negative things about RC and Win Shares), and also because it is just a good term. I like the thought put in your head when you hear “creating” a run better than “producing”, “manufacturing”, “generating”, etc. to say nothing of names like “equivalent” or “extrapolated” runs. None of that is said to put down the creators of those methods--there just aren’t a lot of good, unique names available. Anyway, RC = (TB + .8H + W + .7SB - CS - .3AB)*.322.

RC is park adjusted by dividing by PF, making all of the value stats that follow park adjusted as well. RG, the rate, is RC/O*25.5. I do not believe that outs are the proper denominator for an individual rate stat, but I also do not believe that the distortions caused are that bad. (I still intend to finish my rate stat series and discuss all of the options in excruciating detail, but alas you’ll have to take my word for it now).
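Here is a sketch of the hitter RC and RG calculations. The function names and sample batting line are invented; the coefficients, the outs definition (AB - H + CS), and the 25.5 outs per game come from the text above.

```python
def runs_created(ab, h, tb, w, sb, cs, pf=1.0):
    """ERP-style Runs Created, park-adjusted by dividing by PF."""
    return (tb + 0.8 * h + w + 0.7 * sb - cs - 0.3 * ab) * 0.322 / pf


def outs(ab, h, cs):
    """Outs = AB - H + CS."""
    return ab - h + cs


def rg(rc, o):
    """Runs Created per game, at 25.5 outs per game."""
    return rc / o * 25.5


# Hypothetical hitter in a neutral park.
rc = runs_created(ab=550, h=160, tb=280, w=60, sb=10, cs=4)
print(round(rc, 1), round(rg(rc, outs(550, 160, 4)), 2))
```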

I have decided to switch to a watered-down version of Bill James' Speed Score this year; I only use four of his categories. Previously I used my own knockoff version called Speed Unit, but trying to keep it from breaking down every few years was a wasted effort.

Speed Score is the average of four components, which I'll call a, b, c, and d:

a = ((SB + 3)/(SB + CS + 7) - .4)*20
b = sqrt((SB + CS)/(S + W))*14.3
c = ((R - HR)/(H + W - HR) - .1)*25
d = T/(AB - HR - K)*450

James actually uses a sliding scale for the triples component, but it strikes me as needlessly complex and so I've streamlined it. I also changed some of his division to mathematically equivalent multiplications.
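The four-component Speed Score can be sketched like this. The function name and sample line are invented, S is singles and T is triples as in the formulas above, and no capping of the individual components is applied since none is described here.

```python
from math import sqrt


def speed_score(sb, cs, s, w, r, hr, h, t, ab, k):
    """Average of the four Speed Score components a, b, c, d given above."""
    a = ((sb + 3) / (sb + cs + 7) - 0.4) * 20    # stolen base percentage
    b = sqrt((sb + cs) / (s + w)) * 14.3         # stolen base attempts
    c = ((r - hr) / (h + w - hr) - 0.1) * 25     # runs scored per time on base
    d = t / (ab - hr - k) * 450                  # triples
    return (a + b + c + d) / 4


# Hypothetical hitter, for illustration only.
print(round(speed_score(sb=20, cs=5, s=100, w=60, r=90, hr=20,
                        h=160, t=5, ab=550, k=100), 2))
```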

There are a whopping four categories that compare to a baseline; two for average, two for replacement. Hitting RAA compares to a league average hitter; it is in the vein of Pete Palmer’s Batting Runs. RAA compares to an average hitter at the player’s primary position. Hitting RAR compares to a “replacement level” hitter; RAR compares to a replacement level hitter at the player’s primary position. The formulas are:

HRAA = (RG - N)*O/25.5
RAA = (RG - N*PADJ)*O/25.5
HRAR = (RG - .73*N)*O/25.5
RAR = (RG - .73*N*PADJ)*O/25.5

PADJ is the position adjustment, and it is based on 2002-2011 offensive data. For catchers it is .89; for 1B/DH, 1.17; for 2B, .97; for 3B, 1.03; for SS, .93; for LF/RF, 1.13; and for CF, 1.02. I had been using the 1992-2001 data as a basis for the last ten years, but finally have done an update. I’m a little hesitant about this update, as the middle infield positions are the biggest movers (higher positional adjustments, meaning less positional credit). I have no qualms for second base, but the shortstop PADJ is out of line with the other position adjustments widely in use and feels a bit high to me. But there are some decent points to be made in favor of offensive adjustments, and I’ll have a bit more on this topic in general below.
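The four baselined hitter stats can be computed together. This is a sketch: the dictionary keys, function name, and sample player are invented, while the PADJ values and the .73 replacement multiplier are the ones listed above.

```python
PADJ = {"C": 0.89, "1B": 1.17, "DH": 1.17, "2B": 0.97, "3B": 1.03,
        "SS": 0.93, "LF": 1.13, "RF": 1.13, "CF": 1.02}


def hitter_values(rg, o, n, position):
    """HRAA, RAA, HRAR, and RAR for a hitter with the given RG and outs."""
    games = o / 25.5                  # outs expressed in games
    padj = PADJ[position]
    return {
        "HRAA": (rg - n) * games,
        "RAA": (rg - n * padj) * games,
        "HRAR": (rg - 0.73 * n) * games,
        "RAR": (rg - 0.73 * n * padj) * games,
    }


# Hypothetical shortstop: 6.0 RG over 382.5 outs in a 4.5 N league.
print(hitter_values(6.0, 382.5, 4.5, "SS"))
```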

That was the mechanics of the calculations; now I'll twist myself into knots trying to justify them. If you only care about the how and not the why, stop reading now.

The first thing that should be covered is the philosophical position behind the statistics posted here. They fall on the continuum of ability and value in what I have called "performance". Performance is a technical-sounding way of saying "Whatever arbitrary combination of ability and value I prefer".

With respect to park adjustments, I am not interested in how any particular player is affected, so there is no separate adjustment for lefties and righties for instance. The park factor is an attempt to determine how the park affects run scoring rates, and thus the win value of runs.

I apply the park factor directly to the player's statistics, but it could also be applied to the league context. The advantage to doing it my way is that it allows you to compare the component statistics (like Runs Created or OBA) on a park-adjusted basis. The drawback is that it creates a new theoretical universe, one in which all parks are equal, rather than leaving the player grounded in the actual context in which he played and evaluating how that context (and not the player's statistics) was altered by the park.

The good news is that the two approaches are essentially equivalent; in fact, they are equivalent if you assume that the Runs Per Win factor is equal to the RPG. Suppose that we have a player in an extreme park (PF = 1.15, approximately like Coors Field pre-humidor) who has an 8 RG before adjusting for park, while making 350 outs in a 4.5 N league. The first method of park adjustment, the one I use, converts his value into a neutral park, so his RG is now 8/1.15 = 6.957. We can now compare him directly to the league average:

RAA = (6.957 - 4.5)*350/25.5 = +33.72

The second method would be to adjust the league context. If N = 4.5, then the average player in this park will create 4.5*1.15 = 5.175 runs. Now, to figure RAA, we can use the unadjusted RG of 8:

RAA = (8 - 5.175)*350/25.5 = +38.77

These are not the same, as you can obviously see. The reason for this is that they take place in two different contexts. The first figure is in a 9 RPG (2*4.5) context; the second figure is in a 10.35 RPG (2*4.5*1.15) context. Runs have different values in different contexts; that is why we have RPW converters in the first place. If we convert to WAA (using RPW = RPG, which is only an approximation, so it's usually not as tidy as it appears below), then we have:

WAA = 33.72/9 = +3.75
WAA = 38.77/10.35 = +3.75
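The equivalence is easy to check numerically. In this sketch the function names are invented and RPW is taken to equal RPG (2*N in the neutral context, 2*N*PF in the park context), which is the same approximation used in the example above.

```python
def waa_adjust_player(rg, o, n, pf):
    """Park-adjust the player's RG, then convert to wins in a neutral context."""
    raa = (rg / pf - n) * o / 25.5
    return raa / (2 * n)


def waa_adjust_context(rg, o, n, pf):
    """Leave RG raw, inflate the league context by PF, then convert to wins."""
    raa = (rg - n * pf) * o / 25.5
    return raa / (2 * n * pf)


# The example from the text: 8 RG, 350 outs, N = 4.5, PF = 1.15.
print(round(waa_adjust_player(8, 350, 4.5, 1.15), 2))
print(round(waa_adjust_context(8, 350, 4.5, 1.15), 2))
```

Carrying full precision rather than the rounded 6.957 gives a first RAA a hair under 33.72, but the two win figures still agree.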

Once you convert to wins, the two approaches are equivalent. The other nice thing about the first approach is that once you park-adjust, everyone in the league is in the same context, and you can dispense with the need for converting to wins at all. You still might want to convert to wins, and you'll need to do so if you are comparing the 2014 players to players from other league-seasons (including between the AL and NL in the same year), but if you are only looking to compare Jose Bautista to Miguel Cabrera, it's not necessary. WAR is somewhat ubiquitous now, but personally I prefer runs when possible--why mess with decimal points if you don't have to?

The park factors used to adjust player stats here are run-based. Thus, they make no effort to project what a player "would have done" in a neutral park, or account for the different effects parks have on specific events (walks, home runs, BA) or types of players. They simply account for the difference in run environment that is caused by the park (as best I can measure it). As such, they don't evaluate a player within the actual run context of his team's games; they attempt to restate the player's performance as an equivalent performance in a neutral park.

I suppose I should also justify the use of sqrt(PF) for adjusting component statistics. The classic defense given for this approach relies on basic Runs Created--runs are proportional to OBA*SLG, and OBA*SLG/PF = OBA/sqrt(PF)*SLG/sqrt(PF). While RC may be an antiquated tool, you will find that the square root adjustment is fairly compatible with linear weights or Base Runs as well. I am not going to take the space to demonstrate this claim here, but I will some time in the future.

Many value figures published around the sabersphere adjust for the difference in quality level between the AL and NL. I don't, but this is a thorny area where there is no right or wrong answer as far as I'm concerned. I also do not make an adjustment in the league averages for the fact that the overall NL averages include pitcher batting and the AL does not (not quite true in the era of interleague play, but you get my drift).

The difference between the leagues may not be precisely calculable, and it certainly is not constant, but it is real. If the average player in the AL is better than the average player in the NL, it is perfectly reasonable to expect the average AL player to have more RAR than the average NL player, and that will not happen without some type of adjustment. On the other hand, if you are only interested in evaluating a player relative to his own league, such an adjustment is not necessarily welcome.

The league argument only applies cleanly to metrics baselined to average. Since replacement level compares the given player to a theoretical player that can be acquired on the cheap, the same pool of potential replacement players should by definition be available to the teams of each league. One could argue that if the two leagues don't have equal talent at the major league level, they might not have equal access to replacement level talent--except such an argument is at odds with the notion that replacement level represents talent that is truly "freely available".

So it's hard to justify the approach I take, which is to set replacement level relative to the average runs scored in each league, with no adjustment for the difference in the leagues. The best justification is that it's simple and it treats each league as its own universe, even if in reality they are connected.

The replacement levels I have used here are very much in line with the values used by other sabermetricians. This is based on my own "research", my interpretation of other people's research, and a desire to not stray from consensus and make the values unhelpful to the majority of people who may encounter them.

Replacement level is certainly not settled science. There is always going to be room to disagree on what the baseline should be. Even if you agree it should be "replacement level", any estimate of where it should be set is just that--an estimate. Average is clean and fairly straightforward, even if its utility is questionable; replacement level is inherently messy. So I offer the average baseline as well.

For position players, replacement level is set at 73% of the positional average RG (since there's a history of discussing replacement level in terms of winning percentages, this is roughly equivalent to .350). For starting pitchers, it is set at 128% of the league average RA (.380), and for relievers it is set at 111% (.450).

I am still using an analytical structure that makes the comparison to replacement level for a position player by applying it to his hitting statistics. This is the approach taken by Keith Woolner in VORP (and some other earlier replacement level implementations), but the newer metrics (among them Rally and Fangraphs' WAR) handle replacement level by subtracting a set number of runs from the player's total runs above average in a number of different areas (batting, fielding, baserunning, positional value, etc.), which for lack of a better term I will call the subtraction approach.

The offensive positional adjustment makes the inherent assumption that the average player at each position is equally valuable. I think that this is close to being true, but it is not quite true. The ideal approach would be to use a defensive positional adjustment, since the real difference between a first baseman and a shortstop is their defensive value. When you bat, all runs count the same, whether you create them as a first baseman or as a shortstop.

That being said, using "replacement hitter at position" does not cause too many distortions. It is not theoretically correct, but it is practically powerful. For one thing, most players, even those at key defensive positions, are chosen first and foremost for their offense. Empirical work by Keith Woolner has shown that the replacement level hitting performance is about the same for every position, relative to the positional average.

Figuring what the defensive positional adjustment should be, though, is easier said than done. Therefore, I use the offensive positional adjustment. So if you want to criticize that choice, or criticize the numbers that result, be my guest. But do not claim that I am holding this up as the correct analytical structure. I am holding it up as the most simple and straightforward structure that conforms to reality reasonably well, and because while the numbers may be flawed, they are at least based on an objective formula that I can figure myself. If you feel comfortable with some other assumptions, please feel free to ignore mine.

That still does not justify the use of HRAR--hitting runs above replacement--which compares each hitter, regardless of position, to 73% of the league average. Basically, this is just a way to give an overall measure of offensive production without regard for position with a low baseline. It doesn't have any real baseball meaning.

A player who creates runs at 90% of the league average could be above-average (if he's a shortstop or catcher, or a great fielder at a less important fielding position), or sub-replacement level (DHs that create 4 runs a game are not valuable properties). Every player is chosen because his total value, both hitting and fielding, is sufficient to justify his inclusion on the team. HRAR fails even if you try to justify it with a thought experiment about a world in which defense doesn't matter, because in that case the absolute replacement level (in terms of RG, without accounting for the league average) would be much higher than it is currently.

The specific positional adjustments I use are based on 2002-2011 data. I stick with them because I have not seen compelling evidence of a change in the degree of difficulty or scarcity between the positions between now and then, and because I think they are fairly reasonable. The positions for which they diverge the most from the defensive position adjustments in common use are 2B, 3B, and CF. Second base is considered a premium position by the offensive PADJ (.97), while third base and center field have similar adjustments in the opposite direction (1.03 and 1.02).

Another flaw is that the PADJ is applied to the overall league average RG, which is artificially low for the NL because of pitchers' batting. When using the actual league average runs/game, it's tough to just remove pitchers--any adjustment would be an estimate. If you use the league total of runs created instead, it is a much easier fix.

One other note on this topic is that since the offensive PADJ is a proxy for average defensive value by position, ideally it would be applied by tying it to defensive playing time. I have done it by outs, though.

The reason I have taken this flawed path is because 1) it ties the position adjustment directly into the RAR formula rather than leaving it as something to subtract on the outside and more importantly 2) there’s no straightforward way to do it. The best would be to use defensive innings--set the full-time player to X defensive innings, figure how Derek Jeter’s innings compared to X, and adjust his PADJ accordingly. Games in the field or games played are dicey because they can cause distortion for defensive replacements. Plate Appearances avoid the problem that outs have of being highly related to player quality, but they still carry the illogic of basing it on offensive playing time. And of course the differences here are going to be fairly small (a few runs). That is not to say that this way is preferable, but it’s not horrible either, at least as far as I can tell.

To compare this approach to the subtraction approach, start by assuming that a replacement level shortstop would create .86*.73*4.5 = 2.825 RG (or would perform at an overall level of equivalent value to being an average fielder at shortstop while creating 2.825 runs per game). Suppose that we are comparing two shortstops, each of whom compiled 600 PA and played an equal number of defensive games and innings (and thus would have the same positional adjustment using the subtraction approach). Alpha made 380 outs and Bravo made 410 outs, and each ranked as dead-on average in the field.

The difference in overall RAR between the two using the subtraction approach would be equal to the difference between their offensive RAA compared to the league average. Assuming the league average is 4.5 runs, and that both Alpha and Bravo created 75 runs, their offensive RAAs are:

Alpha = (75*25.5/380 - 4.5)*380/25.5 = +7.94

Similarly, Bravo is at +2.65, and so the difference between them will be 5.29 RAR.

Using the flawed approach, Alpha's RAR will be:

(75*25.5/380 - 4.5*.73*.86)*380/25.5 = +32.90

Bravo's RAR will be +29.58, a difference of 3.32 RAR, which is two runs off of the difference using the subtraction approach.
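The Alpha and Bravo comparison can be reproduced with a few lines. The function names are invented; the inputs (75 runs created, 380 and 410 outs, N = 4.5, a shortstop adjustment of .86, and the .73 replacement multiplier) are the ones used in the example above.

```python
def offensive_raa(rc, o, n):
    """Offensive RAA versus the league average (subtraction approach input)."""
    return (rc * 25.5 / o - n) * o / 25.5


def offensive_rar(rc, o, n, padj, repl=0.73):
    """RAR versus a replacement-level hitter at the position."""
    return (rc * 25.5 / o - n * repl * padj) * o / 25.5


alpha_raa, bravo_raa = offensive_raa(75, 380, 4.5), offensive_raa(75, 410, 4.5)
alpha_rar = offensive_rar(75, 380, 4.5, 0.86)
bravo_rar = offensive_rar(75, 410, 4.5, 0.86)
print(round(alpha_raa - bravo_raa, 2))   # gap under the subtraction approach
print(round(alpha_rar - bravo_rar, 2))   # gap under the outs-based approach
```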

The downside to using PA is that you really need to consider park effects if you do, whereas outs allow you to sidestep park effects. Outs are constant; plate appearances are linked to OBA. Thus, they not only depend on the offensive context (including park factor), but also on the quality of one's team. Of course, attempting to adjust for team PA differences opens a huge can of worms which is not really relevant; for now, the point is that using outs for individual players causes distortions, sometimes trivial and sometimes bothersome, but almost always makes one's life easier.

I do not include fielding (or baserunning outside of steals, although that is a trivial consideration in comparison) in the RAR figures--they cover offense and positional value only. This in no way means that I do not believe that fielding is an important consideration in player valuation. However, two of the key principles of these stat reports are 1) not incorporating any data that is not readily available and 2) not simply including other people's results (of course I borrow heavily from other people's methods, but only adapting methodology that I can apply myself).

Any fielding metric worth its salt will fail to meet either criterion--they use zone data or play-by-play data which I do not have easy access to. I do not have a fielding metric that I have stapled together myself, and so I would have to simply lift other analysts' figures.

Setting the practical reason for not including fielding aside, I do have some reservations about lumping fielding and hitting value together in one number because of the obvious differences in reliability between offensive and fielding metrics. In theory, they absolutely should be put together. But in practice, I believe it would be better to regress the fielding metric to a point at which it would be roughly equivalent in reliability to the offensive metric.

Offensive metrics have error bars associated with them, too, of course, and in evaluating a single season's value, I don't care about the vagaries that we often lump together as "luck". Still, there are errors in our assessment of linear weight values and players that collect an unusual proportion of infield hits or hits to the left side, errors in estimation of park factor, and any number of other factors that make their events more or less valuable than an average event of that type.

Fielding metrics offer up all of that and more, as we cannot be nearly as certain of true successes and failures as we are when analyzing offense. Recent investigations, particularly by Colin Wyers, have raised even more questions about the level of uncertainty. So, even if I was including a fielding value, my approach would be to assume that the offensive value was 100% reliable (which it isn't), and regress the fielding metric relative to that (so if the offensive metric was actually 70% reliable, and the fielding metric 40% reliable, I'd treat the fielding metric as .4/.7 = 57% reliable when tacking it on, to illustrate with a simplified and completely made up example presuming that one could have a precise estimate of nebulous "reliability").
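As a rough illustration of that made-up example, the relative-reliability idea might be sketched as follows. The function name, the choice to shrink toward zero (an average fielder), and the sample +10 fielding figure are all assumptions for illustration, not anything used in the published numbers.

```python
def regress_fielding(fielding_raa, fielding_rel, offense_rel):
    """Shrink fielding runs toward zero (average) by relative reliability.

    Offense is treated as the yardstick, so fielding keeps only the share
    fielding_rel / offense_rel of its measured value.
    """
    relative_rel = fielding_rel / offense_rel
    return fielding_raa * relative_rel


# Made-up reliabilities from the text: offense 70%, fielding 40%.
# A hypothetical +10 fielding runs would be tacked on as about +5.7.
print(round(regress_fielding(10.0, 0.4, 0.7), 1))
```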

Given the inherent assumption of the offensive PADJ that all positions are equally valuable, once RAR has been figured for a player, fielding value can be accounted for by adding on his runs above average relative to a player at his own position. If there is a shortstop that is -2 runs defensively versus an average shortstop, he is without a doubt a plus defensive player, and a more valuable defensive player than a first baseman who was +1 run better than an average first baseman. Regardless, since it was implicitly assumed that they are both average defensively for their position when RAR was calculated, the shortstop will see his value docked two runs. This DOES NOT MEAN that the shortstop has been penalized for his defense. The whole process of accounting for positional differences, going from hitting RAR to positional RAR, has benefited him.

I've found that there is often confusion about the treatment of first basemen and designated hitters in my PADJ methodology, since I consider DHs as in the same pool as first basemen. The fact of the matter is that first basemen outhit DHs. There is any number of potential explanations for this; DHs are often old or injured, players hit worse when DHing than they do when playing the field, etc. This actually helps first basemen, since the DHs drag the average production of the pool down, thus resulting in a lower replacement level than I would get if I considered first basemen alone.

However, this method does assume that a 1B and a DH have equal defensive value. Obviously, a DH has no defensive value. What I advocate to correct this is to treat a DH as a bad defensive first baseman, and thus knock another five or ten runs off of his RAR for a full-time player. I do not incorporate this into the published numbers, but you should keep it in mind. However, there is no need to adjust the figures for first basemen upwards--the only necessary adjustment is to take the DHs down a notch.

Finally, I consider each player at his primary defensive position (defined as where he appears in the most games), and do not weight the PADJ by playing time. This does shortchange a player like Ben Zobrist (who saw significant time at a tougher position than his primary position), and unduly boost a player like Buster Posey (who logged a lot of games at a much easier position than his primary position). For most players, though, it doesn't matter much. I find it preferable to make manual adjustments for the unusual cases rather than add another layer of complexity to the whole endeavor.

2014 Park Factors

2014 League

2014 Team

2014 Team Offense

2014 Team Defense

2014 AL Relievers

2014 NL Relievers

2014 AL Starters

2014 NL Starters

Monday, September 29, 2014

Crude Playoff Odds--2014

In a world where there are plenty of sources for playoff odds that actually take into account the personnel currently available for each team, use projected rather than 2014-only performance, consider pitching matchups, and the like, there is no real reason for me to post this. Nonetheless, here are some very crude playoff odds. The key assumptions:

• Team strength is constant and is measured by my Crude Team Ratings, using an equal weight of W%, EW%, and PW% regressed with 69 games of .500
• Home field advantage is uniform and the home team wins 54.5% of the time
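The regression step in the first assumption can be sketched briefly. This is only an illustration: the function names and the sample team are hypothetical, and how the blended figure is then turned into a rating is not shown here.

```python
REGRESSION_GAMES = 69  # games of .500 added, per the assumption above


def regress_pct(pct, games, extra=REGRESSION_GAMES):
    """Regress a winning percentage toward .500 by adding `extra` games."""
    return (pct * games + 0.5 * extra) / (games + extra)


def blended_strength(w_pct, ew_pct, pw_pct, games):
    """Equal-weight average of W%, EW%, and PW%, then regressed."""
    avg = (w_pct + ew_pct + pw_pct) / 3
    return regress_pct(avg, games)


# Hypothetical team: .600 W%, .580 EW%, .570 PW% over 162 games.
print(round(blended_strength(0.600, 0.580, 0.570, 162), 3))
```

Since the regression is linear, averaging the three percentages first and regressing each one separately before averaging give the same answer.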

From there, the math is pretty simple and I will present it with little explanation. First, the ratings which are used to feed the estimates:

These ratings don’t know or care that Oakland has stopped scoring runs (or that the magic influence of Cespedes is gone); they don’t know that Garrett Richards is injured; they don’t know anything other than these teams’ schedules, their wins and losses, runs and runs allowed, and BA/OBA/SLG and allowed.

From here, it’s just plug and chug. Wildcard game odds:

Home field offsets OAK’s perceived strength advantage over KC.

Division series:

Here, “P” is the probability that the series occurs; P(H win) is the probability that the home team wins should the series occur; and P(H) is the probability that the series occurs and that the home team wins [P*P(H win)].
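For anyone who wants to plug and chug on their own, here is a minimal sketch of a series calculation. It rests on assumptions that are not spelled out in this post: that the ratings behave like win ratios (so a neutral-site win probability is A/(A+B)), that home field is applied as an odds multiplier, and that the division series follows a 2-2-1 home pattern. The function names and the sample inputs are hypothetical.

```python
HOME_WIN = 0.545  # home team's win rate between equal teams


def game_prob(rating_a, rating_b, a_is_home):
    """P(team A wins one game), assuming ratings act as win ratios."""
    home_odds = HOME_WIN / (1 - HOME_WIN)
    odds = rating_a / rating_b
    odds = odds * home_odds if a_is_home else odds / home_odds
    return odds / (1 + odds)


def series_prob(rating_a, rating_b, a_home_pattern):
    """P(team A wins the series); the pattern says whether A hosts each game."""
    need = len(a_home_pattern) // 2 + 1  # wins needed to clinch

    def walk(game, a_wins, b_wins):
        if a_wins == need:
            return 1.0
        if b_wins == need:
            return 0.0
        p = game_prob(rating_a, rating_b, a_home_pattern[game])
        return (p * walk(game + 1, a_wins + 1, b_wins)
                + (1 - p) * walk(game + 1, a_wins, b_wins + 1))

    return walk(0, 0, 0)


# Two equal teams in a best-of-five where team A hosts games 1, 2, and 5.
p_h_win = series_prob(100, 100, [True, True, False, False, True])
p_occurs = 0.53  # hypothetical chance that this matchup happens at all
print(f"P(H win) = {p_h_win:.3f}, P(H) = {p_occurs * p_h_win:.3f}")
```

The one-game wildcard case is just `series_prob` with a single-entry pattern, which reduces to `game_prob`.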


World Series:

The probability of one of the “rivalry” matchups (OAK/SF, LAA/LA, BAL/WAS, or KC/STL) occurring is 21.5%, which is not bad at all. LAA/LA is the second-most likely series; the least likely is KC/SF, which is good because that is the one that I would least like to see.

Putting it all together:

One might be surprised by OAK having better odds to win a Division Series and each subsequent round than KC even though the latter is favored in their wildcard game, but the ratings (perhaps incorrectly) think the A’s are one of the strongest teams in the playoffs, and thus the Royals’ wildcard game edge, solely due to home field, is insufficiently large to keep them ahead. KC would need a rating of around 122 to have an equal World Series win probability to the A’s.

The odds suggest a 57.6% chance that the junior circuit wins it all, not a surprise given that the AL teams rank 1-2-3-6-7 in strength and they have home field in the World Series to boot. In fact, the AL is favored in 22 of 25 potential matchups, with the only exceptions being Washington against Detroit or Kansas City and Los Angeles against Kansas City.

Saturday, September 06, 2014

Saturday, August 23, 2014

This House is Falling Apart

On May 4, Ohio State was suffering its worst ever loss in the eighteen-year history of Bill Davis Stadium, trailing Iowa 17-2. Only once had OSU lost a home game by a larger margin than the fifteen that they would wind up losing the game by--and that was in 1899. On two other occasions OSU had lost a home game by fifteen runs (to add insult to injury, both were against Michigan, in 1934 and 1989).

The music choice on the PA was unintentionally appropriate. Although ostensibly an upbeat song by an Ohio band, Walk the Moon’s “Anna Sun” features the infectious chorus:

What do you know? This house is falling apart
What can I say? This house is falling apart
We got no money, but we got heart
We’re gonna rattle this ghost town
This house is falling apart

I found it quite apropos for the moment, because the house that is the OSU baseball program is falling apart under the direction of Coach Greg Beals, and the historic home beatdown was the nadir of the season.

Or was it? Beals’ tactics present any number of other crazy stupid baserunning events that encapsulate a program being run by a coach with the mind of a junior high coach more concerned with winning games than teaching kids how to actually play baseball. At least in that case, outlandish baserunning can produce victories, whereas at the collegiate level it simply produces embarrassment. (Search hashtag #BealsBall if you are interested in an account of these--the 2014 highlight was a batter who hit a home run being called out for passing the runner at first.)

I usually am not this cynical about OSU sports; I am a true believer, a fan who tends towards homerism. Baseball is a little tougher on this front for me, though, since I can’t turn off my analytical approach to the game just because the uniforms are scarlet and gray. But my sarcasm is aimed squarely at the professional coaches, not at the student-athletes on partial scholarships who bust their tail for the greater glory of The Ohio State University.

The Buckeyes got off to a pretty solid start to the season, going 8-6 before playing their first home game; while that record does not sound particularly impressive, it included wins over Auburn, Oklahoma, and Oregon, with three of the losses coming in the final stretch of four games in Oregon (three against the Ducks and one against the Beavers). OSU then won its first five home games to fatten the record to 13-6 before Big Ten play opened.

OSU’s first Big Ten series was scheduled to be played at East Lansing, but it was moved to Columbus on account of weather and the Bucks took two of three from the Spartans. Defending champs Indiana came in the next weekend and unceremoniously swept the series, and a trip to Lincoln the following weekend resulted in a Cornhusker sweep on three one-run wins, the latter two walkoffs including a three-run ninth inning implosion in the middle game. When OSU lost the first game of its next home series against Penn State, the seven-game conference losing streak was the first for the program since 1987.

1987 was a year that came up repeatedly in 2014. While the Bucks won the final two games from PSU and two of three at Purdue and hosting Iowa (the debacle described to open this post notwithstanding), they closed the conference campaign by losing two of three at Ann Arbor and hosting Northwestern. The end result was a 10-14 Big Ten record, OSU’s worst since 1987. And while all this was going on in Big Ten play, OSU was not exactly tearing it up in mid-week games, going 7-6 in such games (including taking two of three in a weekend series from Murray State). OSU was just 5-12 on the road (.294), the program’s worst showing since 1972 (3-14).

In my season preview post, I made a grievous error, stating that the top six finishers in the conference qualified for the Big Ten tournament. I was mistaken--the field was expanded to eight this year in preparation for the thirteen-team Big Ten of 2015. It was a fortuitous change, since otherwise the seventh-seeded Bucks would have been on the outside looking in. It was little matter, though, as one-run losses to Nebraska and Illinois ended OSU’s season.

The final tally was a 30-28 (.517) record, seventh among Big Ten teams. OSU was also seventh in the conference in EW% (.545) and PW% (.523); Indiana led in all three with figures of .746, .803, .733. OSU averaged 4.79 runs to a conference average of 4.87, while the Buckeyes allowed just 4.36 runs per game versus an average of 4.66 (although in comparing these figures it’s worth considering that Bill Davis Stadium is a fairly strong pitcher’s park). OSU’s offense was pretty much in line with the Big Ten average not only in terms of output but shape as well, with a .267 batting average, .103 walk/AB ratio, and .096 isolated power compared to averages of .271, .101, and .089. It seems difficult to believe that OSU actually had a greater than average power output, but power is significantly down in college baseball, and the Big Ten, never a strong power league (talent and weather likely being contributing factors), is no exception.
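For reference, the EW% figure can be roughly checked from the run averages with Pythagenpat. This is a sketch under one assumption: the exponent is taken as total runs per game raised to the .29 power, a value that is not stated anywhere in this post. The function name is my own.

```python
def pythagenpat(runs_per_game, runs_allowed_per_game, z=0.29):
    """Expected W% with a Pythagorean exponent that floats with scoring.

    The exponent is x = RPG ** z, where RPG is total runs per game by both
    teams. z = 0.29 is an assumed value.
    """
    rpg = runs_per_game + runs_allowed_per_game
    x = rpg ** z
    ratio = runs_per_game / runs_allowed_per_game
    return ratio ** x / (ratio ** x + 1)


# OSU's 4.79 runs scored and 4.36 runs allowed per game
print(round(pythagenpat(4.79, 4.36), 3))
```

With those inputs the estimate lands at roughly .545, in line with the EW% quoted above.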

Offensively, OSU had a couple bright spots, but they were outweighed by disappointments or puzzling coaching decisions. At catcher, Aaron Gretz was one of the team’s more productive hitters, creating 6.3 runs per game on the strength of a team-high .156 W/AB ratio, but got only 129 PA to 102 for his backup Conor Sabanosh. To be fair, Sabanosh created 5.7 RG himself and also had a good walk rate, but hit for less power than Gretz. Beals has never been willing to let Gretz run with the catching duties despite him appearing to be an adequate defender and consistently outperforming the other catcher with whom he competes for playing time.

First baseman Zach Ratcliff was a disappointment in his sophomore season, failing to show the power he had as a freshman by hitting just 2 longballs in 99 AB and turning in the least effective overall performance of any OSU hitter (.232/.262/.313). 1B/DH Josh Dezse enjoyed a bounceback season, although his injuries prevented him from returning to the Buckeye bullpen as expected. Still, his 5.5 RG and 5 homers were an offensive bright spot and he improved as the season went on.

OSU’s other infield positions were subject to some interesting coaching decisions. Sophomore Troy Kuhn started the year at second base and led the team with 6 home runs and was second with 6.3 RG and +9 RAA. Third base was manned in the early part of the season by sophomore Jacob Bosiokovic, who was expected to be a key offensive contributor and power source, but hit just one homer and turned in a perfectly average 5.0 RG. Shortstop was supposed to be the province of sophomore Craig Nennig, but an early season injury left him sidelined and opened the door for sophomore Nick Sergakis, who provided a needed jolt to the offense and became the leadoff hitter despite a team-low .053 W/AB. His overall line of 5.4 RG was a definite upgrade for the middle infield. But when Nennig returned from injury, Beals shuffled the infield to get Nennig back on the field. While Nennig is a superior fielder, he has yet to show any indication of being a productive hitter (.231/.331/.256 in 146 PA was a step up from his debut campaign). This shifted Sergakis to second and Kuhn to third, leaving Bosiokovic to come off the bench and eventually get a look in left field.

Bosiokovic’s chance in left field came because senior Tim Wetzel failed to reverse his junior year offensive collapse, struggling to a .223/.284/.285 line in 185 PA. Also getting significant playing time in both left and center were a pair of freshmen, Ronnie Dawson and Troy Montgomery. Dawson emerged as OSU’s offensive star, a .337/.385/.454 (7.0 RG, +12 RAA) hitter with flair and a certain fan favorite. Montgomery’s debut was less captivating (.235/.297/.353). And in right field, junior Pat Porter caught the Wetzel junior curse, tumbling to a .229/.311/.329 line after being penciled in as a reliable #3 hitter. The only other Buck to get significant plate appearances was sophomore Ryan Leffer, who had a solid offensive line (.303/.355/.343) while earning time at third base and DH.

OSU’s pitching staff wound up with a surprising ace, as freshman Tanner Tully was named Big Ten Freshman of the Year with a team-high 93 innings with +14 RAA and team-low 2.22 ERA and .7 W/9. While Tully’s 3.20 eRA and 5.1 KG suggest that he is a strong regression candidate, performance-wise he was the clear leader of the OSU staff. Junior Ryan Riga started the year as #1, but struggled through injuries and was not particularly effective (4.85 RA and 6.01 eRA over 68 innings and 11 starts). Senior Greg Greve turned in the best season of his career, winning a team-high seven games and contributing 12 RAA over 85 innings. The two most common mid-week starters were sophomore Jake Post and freshman Zach Farmer. Post was inconsistent but showed flashes of being an effective starter when filling in for Riga in the weekend rotation (+1 RAA but a 5.17 eRA). Farmer pitched solidly (4.01 RA over 49 innings) but was diagnosed with leukemia and will miss the 2015 campaign, although all indications are that his prognosis is good which of course is paramount.

OSU’s bullpen took a big step back from 2013, largely due to Trace Dempsey’s regression from stud closer to wild and ineffective (-7 RAA and 4.9 W/9). Freshman Travis Lakins was the bright spot of OSU’s pen, pitching himself into a starting job for 2015 with +12 RAA (second only to Tully) and a team-leading 9.0 K/9. Senior Tyler Giannonatti moved into higher leverage innings and responded well with +3 RAA over 33 innings.

The rest of OSU’s reserve pitchers were hit hard by injuries (particularly to promising freshman reliever Adam Niemeyer) and none logged enough innings to really evaluate their potential to help the team in 2015. As the season wore on, freshman Curtiss Irving got more work, but a 6.66 eRA with 5.7 W and 5.2 K per nine over nineteen frames means the jury is still out. Of the others, it’s worth noting that freshman Shea Murray had the lowest RAA on the team (-8), on the basis of getting lit up for six walks and nine runs in just 2 1/3 innings.

2014 was not an encouraging year for the program. While Beals’ recruiting efforts continue to draw praise, after four seasons there have yet to be any tangible on-field results. Beals’ four-year record is 124-105 (.541), the worst four-year stretch for the program since 1987-1990 (117-111, .513), and that period featured an upward trajectory as Beals’ predecessor Bob Todd took command of the program (you will note that 1987 came up multiple times as a low point for the program). In 1991, the Buckeyes exploded onto the national scene by going 52-13, coming within a game of reaching the College World Series and finishing as high as #13 in the national polls. Beals’ four-year Big Ten record of 49-47 (.510) is the worst since 1986-89 (40-48, .455).

At some point, Beals’ recruits need to start producing on the field, and the direction they are given needs to improve. OSU’s baserunning is an atrocity. It is impossible for me to convey just how embarrassing the team-wide effort to give away outs is, and it has only gotten worse as #BealsBall has become the culture of the program. As a proud alumnus, it is infuriating that the university has discarded great men like Jim Tressel, Jon Waters, and Gordon Gee, as well as a promising coach in Mark Osiecki while athletic director Gene Smith is allowed to retain his job and retains the services of Beals. OSU baseball is a program that has demonstrated the ability to dominate the Big Ten, exhibit national relevance, draw fans to the park, and even occasionally turn a profit for the athletic department, a rarity in northern baseball. The powers that be would be wise to keep that in mind before allowing Beals to make all that a distant memory.

Sunday, July 13, 2014

Drew Rucinski, #53

On Thursday, twenty-five year old right-handed pitcher Drew Rucinski made his major league debut in relief for the Angels against the Rangers. His performance was not memorable, as he came on in the ninth inning with a 15-4 lead, and allowed 4 hits and 2 runs while recording a strikeout. It was also a short stay in the majors for Rucinski, who was optioned back to AA two days later. My interest in Rucinski stems from the fact that he pitched at Ohio State, and is at least the 53rd former Buckeye to play in the majors as well as the fourth this season (Nick Swisher, Eric Fryer, and fellow Angel farmhand JB Shuck).

Rucinski’s road to the majors has been interesting, both as an amateur and a professional. He came to OSU from Oklahoma, an oddity for a program that generally draws on Ohio and adjacent states for all its talent. He spent his first two seasons in the bullpen, including a 2009 sophomore campaign in which he did yeoman work in middle relief for a staff that really only had three reliable pitchers: ace Alex Wimmers, closer Jake Hale, and Rucinski. That team won OSU’s first regular season Big Ten title in eight years and finished second at the Tallahassee regional.

Wimmers was a dominant pitcher at OSU, arguably the best in program history, and a first round pick of Minnesota, but has seen his career bog down first with control issues and then with injuries; he’s still in high-A in his age 25 season. Hale was a 27th round pick from Arizona, stuck in the minors for two seasons, and currently is pitching in the Atlantic League. Had you told me that the first (and quite possibly only) of OSU’s big three who would reach the majors would be Rucinski, I would have been surprised.

It wouldn’t have been the first time I underestimated Rucinski. He moved into OSU’s rotation for his junior season, a move I was all for, but when he was slated to be the ace in his senior season I expressed skepticism that he was up to the typical standards of an OSU #1 pitcher. Rucinski pitched very well, though, leading to someone calling me out in the comments for having been wrong (I was thrilled to have been wrong!).

Rucinski was not drafted, but did sign with Cleveland as a free agent and spent 2011 in their system. He was subsequently released and pitched in the Frontier League before being signed by the Angels mid-season 2013. Last year he pitched well in five starts at high-A Inland Empire, and this year had a 2.35 ERA and 85 K/28 W in 95 innings at AA Arkansas. At the risk of underestimating him again, his prospect status is marginal, but it’s great to see that his perseverance paid off with a cup of coffee, and hopefully much more.

Tuesday, June 17, 2014

1882 AA

The American Association was the first association of clubs to seriously challenge the National League as another major league. Previous organizations, like the International Association, were based on a fundamentally different notion of how to organize professional baseball teams. While everyone who is reading this knows that the NL survives today and the AA does not (unless you give much weight to its eventual shotgun merger with the NL), the AA certainly made its mark on the game. It lives on most obviously through the Cincinnati Reds, Los Angeles Dodgers, Pittsburgh Pirates, and St. Louis Cardinals.

As the AA was organized in 1881, the National League fielded eight teams in Buffalo, Boston, Chicago, Cleveland, Detroit, Providence, Troy, and Worcester. That left major cities like Cincinnati, New York, Philadelphia, and St. Louis out of luck. While independent clubs were still in existence and of high quality (although not in the numbers they once had been, particularly in the east), there was an opening available for a rival league to enter the fray.

William Hulbert’s decision to start the NL was a consequence of his attempt to wrest the power in baseball from the eastern clubs to his own Chicago team, and the NL generally took on a midwestern feel. When the New York and Philadelphia clubs refused to make their final road trip of the 1876 campaign, they were expelled from the league. Neither city would get another NL team for some time; although some sources claim that this is because Hulbert held a grudge, others dismiss this theory and point out that neither city had a strong independent club worthy of admission to the NL. Regardless, two major eastern cities were without NL teams.

Cincinnati was dropped after 1880 because they allowed alcohol to be sold in their park. The NL also generally insisted on a fifty cent admission fee. The combination of high ticket price and no booze encouraged a more cosmopolitan atmosphere, but it also forsook a great number of potential customers.

In October of 1881 backers of the newly formed independent Cincinnati team (sportswriter O.P. Caylor and Justus Thorner) met in Pittsburgh with Denny McKnight, an area businessman who had managed (in the financial sense) the Allegheny club a few years earlier.

The group decided to send telegrams to other teams, inviting them to a second meeting. This meeting, held on November 2 in Cincinnati, was a success and brought together six clubs (Again, I am using the modern [City Name] [Nickname] format even if it is not exactly applicable. In the case of “Pittsburgh” and “Philadelphia”, it is closer to flat-out balderdash): the Brooklyn Atlantics, the Cincinnati Reds, the Louisville Eclipse, the Philadelphia Athletics, the Pittsburgh Alleghenys, and the St. Louis Browns. McKnight was named president.

The AA allowed twenty five cent admission, Sunday baseball, and alcohol sales. Critics of the league and NL snobs derided it as the “beer and whiskey league”, because in addition to allowing sales, most of the teams were in some way backed by beer money. Before the season could start, the AA had to adjust on the fly as the Atlantics withdrew in March due to financial problems. The AA had also attempted to attract the New York Metropolitans, a strong independent club, but they were courted by the NL as well and decided to wait the situation out.

Raiding National League rosters was not a priority for AA clubs, and the NL did not initially react to the new circuit with hostility, just indifference. Some NL and AA teams even met in pre-season exhibitions. NL players were not a necessity for AA teams; several of them had been independent teams and already had a base of talent. However, two backup, unreserved infielders from Detroit became the center of controversy. Dasher Troy signed with Philadelphia and Sam Wise signed with Cincinnati. Troy backed out, claiming that he was not aware of the AA’s intentions to play on Sunday when he signed, and went back to the Wolverines. Wise wound up signing with Boston, which infuriated the AA, and the association decided to drop its policy of honoring the NL blacklist.

The playing rules of the AA were largely similar to those of the NL. The AA did not fine pitchers for hitting batters, used the Mahn ball, and continued the use of courtesy runners. The association decided to determine its standings by winning percentage rather than total wins. Beginning in July, the association itself employed the umpires rather than the teams, a move that seemed downright prescient in light of the NL’s Higham scandal.

The first AA game was played in Cincinnati on May 2; Allegheny defeated the Reds 10-9. Louisville pitcher Tony Mullane worked ambidextrously in a July 18 game against Baltimore, the first to do so in a major league game (while doubtful, Mullane’s handsomeness is credited with starting the custom of Ladies’ Day at the ballpark). Mullane added the AA’s first no-hitter to his accomplishments on September 11, winning 2-0 at Cincinnati. His teammate Guy Hecker tossed his own no-no eight days later, 3-1 at Allegheny.

The Athletics got out to the early lead, but Cincinnati was 20-10 at the end of June and rolled home from there, going 35-15 the rest of the way. Their final margin over the Athletics was 11.5 games. After the season, they met the NL pennant winners from Chicago in a two game engagement. The homestanding Reds took the first game 4-0 on October 6, but they were defeated 2-0 in Chicago the next day. Some histories claim that the series was stopped when Denny McKnight threatened to expel Cincinnati, but the more credible explanation is that both teams had other engagements to attend to (in the White Stockings’ case, their series with Providence) and that it was never intended as a championship series.

Even though the pennant race was anti-climactic, Sunday ball, alcohol, and quarter admission proved a winning combination for the AA. The Association, which boasted a combined population in its markets around half a million greater than the NL, outdrew the league despite having two fewer teams (with obvious caveats regarding attendance figures). All six AA teams claimed to be in the black.

Relatively flush with success, the AA teams made runs at NL talent. Detroit’s star catcher Charlie Bennett accepted a $100 advance from the Alleghenys in August, but then refused to sign in October as expected. Pud Galvin and Ed Williamson also were said to have backed out of AA deals. The club sued Bennett, but the court ruled that there was no valid contract. Regardless of the outcome of any particular skirmish, it was now clear that there were two major leagues.


Cincinnati was clearly the class of the league, leading in both runs scored and allowed with an impressive EW%. Louisville looked like a contender on paper but not on the field, and Philadelphia was the opposite. The league was pretty well-balanced except for Cincinnati on top and, unsurprisingly, the last-minute replacement team in Baltimore on the other end of the spectrum.

In 1882, the AA hit .244/.271/.312 for a .105 SEC, 5.21 runs and 23.66 outs per game. Compared to the NL, the AA scored .2 runs less per game, with a batting average seven points lower, the same OBA, and a thirty point deficiency in SLG. As you can see, the major difference in offense between the two circuits was power, with more of it in the NL.


This incarnation of the Reds was founded in 1881 by Caylor to play a weekend series in St. Louis against the Browns. Apparently, it was a successful gate attraction and illustrated the desire for high-quality baseball in non-League cities. While this is the franchise that carries on today, it is not the same club as the original Red Stockings or the two or three different incarnations of Cincinnati NL teams between 1876-1880. In my experience, many fans of the modern day Reds think that their team is a direct descendant of the 1869 juggernaut--they are wrong, and they don't like to be told that they are wrong.

Every regular player on the team had previous major league experience with the exception of Bid McPhee; Joe Sommer is considered a rookie for my purposes as well, but he had 88 PA for the 1880 Reds. Snyder (BSN), Stearns (DET), Carpenter (WOR), White (DET), and McCormick (WOR) had all played in the NL in 1881. Fulmer had not played in the league since 1880 (BUF), Macullar since 1879 (SYR), and Wheeler was another 1880 Red. None of the other five clubs had such a wealth of established talent, so it can’t be considered much of a surprise that Cincinnati won the pennant.

Prior to playing the two games with Chicago, the Reds had played their in-state NL counterpart from Cleveland at home, but had won just one out of three, and the win came when Dave Rowe, an outfielder by trade, pitched for Cleveland.

Reserve first baseman Henry Luff was apparently fined $5 for making a catch one-handed, and quit the team in response.


The Athletics had played in the Eastern Championship Association in 1881, and were picked to join the AA over another Philadelphia club, a new outfit being organized by sporting goods magnate Al Reach and Horace Phillips. The AA went with the more established club, backed by Bill Sharsig, a theatre producer (thanks to Richard Hershberger for sharing his research on the potential membership of the Philadelphias).

Athletics regulars with NL experience (last team and year) were: Dorgan (WOR, 1880), Latham (LOU, 1877), Lou Say (CIN, 1880), and Sam Weaver (MIL, 1878). All others plus Dorgan were rookies. Say’s younger brother Jimmy was a reserve shortstop.


The Eclipse was made up of players who had already been together over the preceding seasons. Only two of the regulars had any NL experience: Denny Mack (BUF, 1880) and Tony Mullane (DET, 1881). Only Mack is not considered a rookie by my standard. It is noteworthy then that he was the team’s least valuable regular in terms of WAR.

Louisville may have had the most balanced combination of good-hitting pitchers yet seen. Mullane and 1B/P Guy Hecker each hit at 110 ARG or better. Among teams using two pitchers, there are none from 1876 until this point that had two pitchers each hit at such a high level. I’m not saying that they are the best hitting combination (as, say, Jim Whitney plus a marginal hitter would still create more runs than Mullane/Hecker), just that they both were solid contributors at the plate.


This is one of the more blatant contemporary recastings of a team name as this team was the “Allegheny” club and a lot of people still spelled the name of the city in which they played “Pittsburg”. Just so you know.

Regulars with NL experience: Taylor (CLE, 1881), Strief (CLE, 1879), Leary (DET, 1881 but still a rookie), Mansell (CIN, 1880), Swartwood (BUF, 1881 but still a rookie), and Salisbury (TRO, 1879).

John Peters may have been the biggest “name” player in the association, at least based on previous NL exploits. I suppose that one could make an argument for Will White, but Peters was one of the better players in the first few seasons of the senior circuit.


After St. Louis dropped out of the NL during the 1877-78 offseason, independent teams billing themselves as the Browns continued to play in the city. This particular incarnation was backed by Chris Von der Ahe, a German immigrant who had succeeded in the beer business. He is one of the most interesting characters in nineteenth century baseball, and I will not be able to do him justice. He was prone to saying silly-sounding things, but what would you expect from someone speaking his non-native tongue? He was also criticized as lacking baseball knowledge (he supposedly boasted that his club’s infield was the largest in the country). Whether that particular one is true or not, one needn’t understand the nuances of the game on the field to succeed as a team owner. My point is that he gets a bad rap from some circles--he was a self-made man who was heavily involved in the operation of a successful major league and within a few years his team would be the AA's greatest dynasty.

In The Ball Clubs, Dewey and Acocella described him as “A German immigrant with a comic strip accent, a comic book physiognomy, and comic wardrobe of diamond stickpins, checkered pants, and spats.” That gives you a decent summary of the common view of the man. For more balanced accounts, read some of the entries on him at This Game of Games, and go there if you want to know anything else about the 19th century game in St. Louis.

Browns regulars with NL experience were: Sullivan (BUF, 1881), Bill Gleason (STL, 1877 but still a rookie), Ned Cuthbert (CIN, 1877 and the manager), and Seward (NYN, 1876). According to Cliff Blau's article on this team, regulars Schappert and Walker and reserves Smiley and Fusselback were signed away from the Athletics and Atlantics, while Charlie Comiskey was signed from a Dubuque club, and Harry McCaffery was signed away from an area independent club in June. Jack Gleason missed April with injuries sustained during his off-season job as a fireman.


The Orioles, a late replacement for Brooklyn, were easily the least seasoned team in the AA, at least in terms of NL experience. Just three regulars had played in the NL: Henry Myers (PRO, 1881 and the manager of the Baltimores), Waitt (CHN, 1877), and Nichols (WOR, 1880). Only Nichols fails to qualify under my standard as a rookie, and one source claimed that Myers was the only player with NL experience on the opening day roster. Not surprisingly, the team was overmatched, finishing in the cellar by 14.5 games. The pitching duo of Landis and Nichols was particularly bad, coming up 1-2 in the cellar in WAA (although this may also reflect heavily on the fielding in this day and age). Landis had been released by the Athletics after losing to Baltimore on May 4.

Twice (against Cincinnati in June and St. Louis in July) opposing catchers purposefully dropped Oriole third strikes with the bases loaded to turn triple plays. A June 28 game against the Reds set a still-standing record for the most runs scored in extra innings, as each club scored four in the tenth before Cincinnati won it with seven in the eleventh.

Most secondary sources claim that the Von der Horst family owned the team. Apparently they used baseball as the drawing card for their other interests--surrounding real estate that included an entertainment complex complete with restaurant, concert grounds, and dance hall.

However, Cliff Blau's research indicates that the Von der Horsts did not come to own the team until 1883 or 1884. There were rumors that this team would be replaced in the AA by a new AA club representing Baltimore and Washington, but in the end, this team withdrew from the circuit and was replaced by a new Baltimore entry. Blau's article was a source for some of the other information in this post.

Leaders and trailers:
1. Pete Browning, LOU (.378)
2. Hick Carpenter, CIN (.342)
3. Ed Swartwood, PIT (.329)
Trailer: Charlie Waitt, BAL (.156)
1. Pete Browning, LOU (.430)
2. Ed Swartwood, PIT (.370)
3. Hick Carpenter, CIN (.360)
Trailer: Chappy Lane, PIT (.196)
1. Pete Browning, LOU (.510)
2. Ed Swartwood, PIT (.489)
3. Billy Taylor, PIT (.452)
Trailer: Charlie Waitt, BAL (.172)
1. Ed Swartwood, PIT (.225)
2. Pete Browning, LOU (.222)
3. Billy Taylor, PIT (.194)
Trailer: Will White, CIN (.043)
Trailing non-pitcher: Bill Smiley, STL (.047)
1. Pete Browning, LOU (78)
2. Ed Swartwood, PIT (77)
3. Hick Carpenter, CIN (72)
4. Joe Sommer, CIN (66)
5. Mike Mansell, PIT (64)
1. Pete Browning, LOU (238)
2. Hick Carpenter, CIN (180)
3. Ed Swartwood, PIT (179)
4. Joe Sommer, CIN (152)
5. Jack O’Brien, PHA (149)
Trailer: Charlie Waitt, BAL (48)
1. Pete Browning, LOU (+4.5)
2. Hick Carpenter, CIN (+3.4)
3. Ed Swartwood, PIT (+3.2)
4. Joe Sommer, CIN (+2.4)
5. Chicken Wolf, LOU (+1.7)
Trailer: Charlie Waitt, BAL (-2.0)
1. Pete Browning, LOU (+5.8)
2. Hick Carpenter, CIN (+4.9)
3. Ed Swartwood, PIT (+4.4)
4. Joe Sommer, CIN (+3.8)
5. Pop Snyder, CIN (+3.1)
Trailer: Charlie Waitt, BAL (-.8)
1. Denny Driscoll, PIT (61)
2. Will White, CIN (65)
3. Harry McCormick, CIN (76)
4. Sam Weaver, PHA (83)
5. Tony Mullane, LOU (83)
Trailer: Tricky Nichols, BAL (162)
1. Will White, CIN (+4.7)
2. Denny Driscoll, PIT (+2.2)
3. Tony Mullane, LOU (+2.1)
4. Sam Weaver, PHA (+1.7)
5. Harry McCormick, CIN (+1.5)
Trailer: Doc Landis, BAL (-2.5)
1. Will White, CIN (+6.3)
2. Tony Mullane, LOU (+4.7)
3. Sam Weaver, PHA (+2.7)
4. Guy Hecker, LOU (+2.4)
5. Denny Driscoll, PIT (+2.1)
Trailer: Doc Landis, BAL (-2.7)

My all-star team:
C: Pop Snyder, CIN
1B: Guy Hecker, LOU
2B: Pete Browning, LOU
3B: Hick Carpenter, CIN
SS: Chick Fulmer, CIN
LF: Joe Sommer, CIN
CF: Oscar Walker, STL
RF: Ed Swartwood, PIT
P: Will White, CIN
P: Tony Mullane, LOU
MVP: 2B Pete Browning, LOU
Rookie Hitter: 2B Pete Browning, LOU
Rookie Pitcher: Tony Mullane, LOU

These choices were all pretty straightforward. The only position where I did not end up going with the highest WAR was center, where Walker was +1.6 with 6 FR and John Reccius was +1.8 with -3 FR.

Monday, June 09, 2014

Great Moments in Yahoo! Standings

As of midnight on June 9, this is how Yahoo! renders the NL wildcard standings:

WAS and ATL at 32-29 are somehow half a game ahead of 33-30 MIA, although if you were to look at the NL East standings on Yahoo!, they are all listed as tied (yes, I realize that W% is what really defines standings, not GB, but there is no dispute that by the rules of GB, those teams are tied).

It gets even more ridiculous, though, when you see that 33-31 LA and STL are tied with MIA. Obviously, they are .5 game behind MIA.
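For anyone who wants to check the arithmetic, here is a minimal sketch of the games-behind calculation, assuming the conventional definition (half the sum of the difference in wins and the difference in losses). The function names and the `records` dictionary are my own for illustration; only the won-lost records come from the standings discussed above.

```python
from fractions import Fraction

def games_behind(leader, team):
    """Games behind `leader` for `team`; each is a (wins, losses) tuple.

    Assumes the conventional definition:
    GB = ((leader W - team W) + (team L - leader L)) / 2
    """
    lead_w, lead_l = leader
    team_w, team_l = team
    return Fraction((lead_w - team_w) + (team_l - lead_l), 2)

def win_pct(team):
    """Winning percentage, W / (W + L)."""
    w, l = team
    return w / (w + l)

# Won-lost records quoted in the post (as of June 9).
records = {
    "WAS": (32, 29),
    "ATL": (32, 29),
    "MIA": (33, 30),
    "LA": (33, 31),
    "STL": (33, 31),
}

# Measure everyone against MIA and show W% alongside.
for name, rec in records.items():
    gb = games_behind(records["MIA"], rec)
    print(f"{name}: GB vs MIA = {float(gb):.1f}, W% = {win_pct(rec):.4f}")
```

By this definition, WAS and ATL come out exactly even with MIA (one fewer win, one fewer loss), while LA and STL come out half a game back, even though W% puts WAS and ATL a hair ahead of MIA.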