## Tuesday, September 20, 2011

### A Quick Look at Negro League W-L Records

I wrote this about a year ago and wasn’t sure if I’d ever post it. With the recent publication of some Negro League data at Seamheads, I figured I’d better post it now before it became completely dated. The data I used was compiled by Chris Cobb and posted on the Hall of Merit site, with John Holway's research as his source data.

I need to admit upfront that I know very little about the Negro Leagues. My knowledge level of the Negro Leagues peaked at about age eleven when I read Only the Ball Was White, and has only gone downhill since then. That is one of the reasons for this post--as a (very limited) education for me on the great pitchers of the Negro Leagues.

I am going to be applying the Neutral Win-Loss record approach introduced by Rob Wood, which I have written about several times. It is a way to contextualize a pitcher's W-L record using only the win-loss record of the pitcher's team. This post applies it to several Negro League pitchers.

The basic idea behind Wood's approach is that an average team's deviation from .500 is due in equal parts to its offense and defense. The portion of a team's deviation from .500 that arises from the defense (with the exception of fielders and of relievers in the pitcher's own games) does nothing to increase a pitcher's expected W%, but if you compare his W% directly to that of his teammates, he will suffer for it.

The formula is simple and linear; instead of comparing a pitcher to his team's W% when he does not get a decision (Mate), the comparison is to the average of Mate and .500. The neutral W% is easy to figure:

NW% = W% - Mate/2 + .25

From NW%, one can figure Neutral Wins and Losses:

NW = NW%*(W + L), NL = W + L - NW
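As a minimal sketch of the arithmetic above: the function names and the 20-10 pitcher on an 80-60 team are made up for illustration, and Mate is computed here by simply removing the pitcher's decisions from the team record.

```python
def mate_pct(w, l, team_w, team_l):
    """Team W% in games where the pitcher did not get the decision (Mate)."""
    other_w = team_w - w
    other_l = team_l - l
    return other_w / (other_w + other_l)


def neutral_record(w, l, mate):
    """Return (NW%, NW, NL) from a pitcher's W-L record and Mate."""
    decisions = w + l
    nw_pct = w / decisions - mate / 2 + 0.25  # NW% = W% - Mate/2 + .25
    nw = nw_pct * decisions                   # NW = NW%*(W + L)
    nl = decisions - nw                       # NL = W + L - NW
    return nw_pct, nw, nl


# Hypothetical season: a 20-10 pitcher on a team that finished 80-60
mate = mate_pct(20, 10, 80, 60)
nw_pct, nw, nl = neutral_record(20, 10, mate)
print(f"Mate={mate:.3f}  NW%={nw_pct:.3f}  NW={nw:.1f}  NL={nl:.1f}")
```

Note that neutral wins and losses always add back up to the pitcher's actual number of decisions; only the split between them changes.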

It is also very easy to combine NW% and the number of decisions into wins above some baseline. Wins Above Team is traditionally defined as wins above .500:

WAT = (NW% - .5)*(W + L)

I also use Wins Compared to Replacement, with the assumption that a replacement level starter will have a .380 W%:

WCR = (NW% - .38)*(W + L)
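Both baselines are the same calculation with a different constant, which a short sketch makes plain. The function names and the .600 NW% over 200 decisions are hypothetical; the .380 default is the replacement level assumed above.

```python
def wins_above(nw_pct, decisions, baseline):
    """Wins above an arbitrary baseline W%."""
    return (nw_pct - baseline) * decisions


def wat(nw_pct, decisions):
    """Wins Above Team: the baseline is .500."""
    return wins_above(nw_pct, decisions, 0.5)


def wcr(nw_pct, decisions, replacement=0.38):
    """Wins Compared to Replacement: the baseline is the assumed .380."""
    return wins_above(nw_pct, decisions, replacement)


# Hypothetical career: .600 NW% over 200 decisions
print(f"WAT={wat(0.6, 200):.1f}  WCR={wcr(0.6, 200):.1f}")
```

Because the two baselines differ by .120, WCR always exceeds WAT by .12*(W + L), so WCR gives more credit to pitchers with long careers.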

There are a number of weaknesses to the Neutral W-L approach, and there are a number of additional complications that arise when applying it to the Negro League data. This is an incomplete list of the methodological issues that are present even when looking at major league data:

* It does not isolate performance when the pitcher actually pitches; some will receive lousy run support despite pitching for good offensive teams.

* While the approach assumes that the team is balanced between offense and defense, this is not always the case. It is a decent assumption for a pitcher's entire career, but there are still going to be cases in which a pitcher is predominantly on teams skewed one way or the other. Those on offensive teams will benefit unfairly in the metric, while those who are on teams with otherwise strong starting pitching staffs will be hurt.

* All of the problems with the definition and concept behind pitcher wins and losses themselves are still present.

With respect to the Negro League results included in this post, the data I have used was compiled by Chris Cobb and posted on the Hall of Merit site, with John Holway's research as his source data. Among the problems that arise from the data:

* The records themselves are incomplete (missing seasons, team records only published for half seasons, etc.) and sometimes contradictory (individual totals that don't add up to the team total, etc.). These kinds of errors exist even in major league data from the period, so it's no surprise that they are present in the more chaotic, less-organized Negro League data.

To deal with the gaps in the specific data I used, if I couldn't find the team's record, I assumed that they were .500 when the pitcher in question's decisions were removed. If a pitcher split time between teams and there was no breakdown of his W-L record with the two teams provided, I used the average of the two teams' records. For seasons in which Cobb did not include the team's record and I had to look it up from another source, I used the ESPN Baseball Encyclopedia. In that case, if the team's record was only available for a half-season, I assumed that the full season record was double the half-season record.

* I only used the results from domestic Negro League games. The world of the Negro Leagues encompassed a lot more than that; players went to the Caribbean to play, teams barnstormed extensively, played games against major league opponents, etc. Limiting the analysis to league games makes it workable, but it does omit a lot of relevant performances.

In this regard the Negro Leagues were similar to the early NA/NL days, in which the league schedule constituted only a small fraction of total games played, and independent teams often compared favorably to league opponents.

* I am way out of my area of knowledge, but even I feel comfortable asserting that the NeL pitching rotations looked more like the early majors than the contemporary majors. Pitchers got a higher percentage of their team's decisions, reducing the sample size from which Mate is drawn and weakening the assumption that the other pitchers are average. I have also read that teams would purposefully match their aces against one another to create gate attractions, whereas our normal assumption is that teams will try to match their pitchers up in whatever manner creates the highest number of expected wins.

* The league structure was less stable from year to year, which makes it harder to compare NeL pitchers from one time period to another. For twentieth-century major league pitchers, we can be confident that, regardless of when they pitched, they were facing the highest level of competition available (with the obvious exception of the players locked out of the majors due to their skin color). We also know that they pitched in seasons of roughly equal length, and so their career records represent a fair sample of their performance at different ages.

We don't have that confidence when dealing with the NeL data. For example, Satchel Paige gets no credit for 1935 here, but the adjacent seasons of 1934 and 1936 appear to be among his best. Then he gets no credit for 1937-39, as he was not pitching in official league games. You will see that Paige doesn't come out as impressively as might be expected in the career totals, but the gaps in league play might well be the major cause.

* I have listed WCR figures using a .380 replacement level, but in actuality I have no idea where the NeL replacement level should be set.
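The gap-filling rules described in the first item of this list can be sketched roughly as follows. This is an illustration, not the spreadsheet logic itself: the function names are hypothetical, "the average of the two teams' records" is read here as averaging their wins and losses (one possible interpretation), and the guard against a non-positive game count is an extra assumption added for contradictory records.

```python
from typing import Optional, Tuple

Record = Tuple[float, float]  # (wins, losses)


def full_season_from_half(half: Record) -> Record:
    """Assume the full-season record is double the half-season record."""
    return (2 * half[0], 2 * half[1])


def average_record(team_a: Record, team_b: Record) -> Record:
    """Split season with no breakdown: average the two teams' records.

    Averaging wins and losses is one reading of the rule (an assumption).
    """
    return ((team_a[0] + team_b[0]) / 2, (team_a[1] + team_b[1]) / 2)


def mate_with_fallback(w: float, l: float, team: Optional[Record]) -> float:
    """Mate, falling back to .500 when the team record is unknown."""
    if team is None:
        return 0.5
    other_w, other_l = team[0] - w, team[1] - l
    if other_w + other_l <= 0:
        # Added guard (assumption): unusable totals are treated as missing
        return 0.5
    return other_w / (other_w + other_l)


# Hypothetical 10-5 pitcher under each of the three rules
print(mate_with_fallback(10, 5, None))
print(mate_with_fallback(10, 5, full_season_from_half((20, 15))))
print(mate_with_fallback(10, 5, average_record((40, 30), (30, 40))))
```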

From all of the caveats, it may seem as if I am declaring the NW-L statistics to be useless. That is not my intention; I simply don't want to oversell them or fail to acknowledge their biases. Many of the issues with the NW-L records are issues that would arise with any statistical analysis of Negro League pitchers. Consider what a logistical nightmare it would be to try to look at runs allowed, which would require innings, league averages, and park factors.

As sabermetricians we all know the flaws of pitcher W-L records, but there are a few benefits. Among them is the ease in determining them, at least if complete games are the norm. All you need to know is who the starting pitcher was and which team won the game, and you've got it. No need for box scores or play-by-play. No need for park factors or league averages--the average in every league and every park for all of time is .500.
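To show how little information that requires, here is a rough sketch that tallies W-L records from nothing more than the two starters and the winning team. It assumes every starter goes the distance and takes the decision; the team and pitcher names, and the tuple layout, are invented for the example.

```python
from collections import defaultdict


def tally_decisions(games):
    """Tally starters' W-L records, assuming complete games throughout.

    Each game is (team_1, starter_1, team_2, starter_2, winning_team).
    """
    records = defaultdict(lambda: [0, 0])  # pitcher -> [wins, losses]
    for team_1, sp_1, team_2, sp_2, winner in games:
        for team, starter in ((team_1, sp_1), (team_2, sp_2)):
            # Index 0 counts wins, index 1 counts losses
            records[starter][0 if team == winner else 1] += 1
    return {pitcher: tuple(wl) for pitcher, wl in records.items()}


# Hypothetical three-game log
games = [
    ("Team A", "Pitcher X", "Team B", "Pitcher Y", "Team A"),
    ("Team A", "Pitcher X", "Team B", "Pitcher Z", "Team B"),
    ("Team B", "Pitcher Y", "Team A", "Pitcher X", "Team A"),
]
print(tally_decisions(games))
```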

These properties are most useful when dealing with incomplete data, and we can refine the results further by incorporating team record and producing NW-L. Are the results perfect? Absolutely not. Are they likely to give us a better indication of the quality of these pitchers than raw W-L record or uncontextualized ERAs? I say yes.

The pitchers for whom data was available were: Chet Brewer, Dave Brown, Ray Brown, Bill Byrd, Andy Cooper, Leon Day, Willie Foster, Leroy Matlock, Satchel Paige, Dick Redding, Bullet Joe Rogan, Hilton Smith, Smokey Joe Williams, and Nip Winters.

Since I am out of my area of knowledge when discussing the Negro League stars, I'm not going to make a lot of comments--I'll leave interpretation up to the reader. Here are the actual career W-L records for the pitchers, along with Mate. The list is sorted by career wins above .380:

Only one of the pitchers had a worse record than that of his teams (Chet Brewer). If one figures Wins Above Team by the traditional method, Brewer would rate as a below-average pitcher. It's far more likely, though, that a pitcher with a .591 W% who was regarded as excellent was in fact an excellent pitcher. The fact that his teams played .624 baseball without him indicates that they probably had above-average pitching, which, while good for the team, did absolutely nothing to increase Brewer's expected W%. Brewer still takes a hit, of course, when neutralizing his record by the Wood approach, but is assumed to be an above-average performer.
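Plugging the two rounded figures quoted above into the NW% formula shows the size of the adjustment. The direct comparison to Mate is used here only as a stand-in for the traditional method, which this post does not spell out.

```python
# Brewer's career figures as quoted above (rounded to three places)
w_pct, mate = 0.591, 0.624

vs_mates = w_pct - mate              # direct comparison to his teams
nw_pct = w_pct - mate / 2 + 0.25     # NW% = W% - Mate/2 + .25

print(f"W% - Mate = {vs_mates:+.3f}")   # -0.033
print(f"NW%       = {nw_pct:.3f}")      # 0.529
```

So the same record that sits 33 points below his teammates comes out 29 points above .500 once only half of the team's deviation is charged against him.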

Here are the career neutral W-L records for the pitchers, sorted by WCR:

Here is a link to the spreadsheet containing the complete yearly breakdowns for each pitcher. You can see exactly what I inputted and which seasons I didn't have team records for (you'll see blanks in the TW and TL columns):