Print Baseball Encyclopedias

As I grow older, I try to stay alert to warning signs of old-fogeyism. One or two such signs are not particularly concerning--they can just be written off as personal quirks/eccentricities, which we all possess to one degree or another. A prime example for me is cell phones. I hate the things, and I always have. I finally got one, only because it was cheaper than paying for a landline, and if there's one thing I hate more than cell phones, it's spending money on any type of phone.

When it comes to baseball, one of the possible signs I've noticed is my continuing love for print encyclopedias. I think it's great that we have Baseball-Reference, Retrosheet, the National Pastime Almanac, the Baseball-Databank, and the like, and obviously there are countless advantages to computerized data that you and I take advantage of every day. Still, I have yet to warm up to the idea of going to Baseball-Reference, clicking on a page, following a link somewhere else, and wasting an hour or two just wandering in the statistical record of the game. I still do this all the time with print encyclopedias. This post is a tribute/review of them.

Of course, the print encyclopedia is a dinosaur. It always was a bit of a wonder that one could publish a multi-thousand page book, carrying a hefty hardcover price, and sell enough of them to make it a worthwhile business endeavor, especially with annual or semi-annual editions. Perhaps they never really earned their keep anyway, but they should have.

The advent of computerized equivalents has driven the print encyclopedia out of existence (although apparently the erstwhile ESPN Baseball Encyclopedia is still being shopped to publishers). If that is the inevitable cost of progress, then so be it--I wouldn't give up my Lahman database to get a new edition of Total Baseball if that was what it would take. Still, I miss the print encyclopedias--and it seems as if other people do too.

As I write this (New Year's Eve), the cheapest prices currently listed for a copy of the final edition of each of the printed encyclopedias (new or used) are:

* Macmillan (10th edition, 1996): $44.99

The 9th edition is available for as little as $25.

* Sports Encyclopedia: Baseball (2007 edition): $123.08

The 2006 edition is available for as little as $3.31.

* Total Baseball (8th edition, 2004): $99.65

The 7th edition is available for as little as $3.58.

* ESPN Baseball Encyclopedia (5th edition, 2008): $95.80

The 4th edition is available for as little as $1.73.

* STATS All-Time Major League Handbook (2nd edition, 2000): $3.99

The STATS book is the exception, and it is not really an iconic book, as it only went through two editions and presumably had the most limited print run of any of the five.

I'm not sure if these prices reflect actual demand for the books in question, or whether sellers think they have something valuable and are setting the price above the intersection of the demand and supply curves. Assuming that it is a real phenomenon, it suggests that there are a fair number of people who miss the print encyclopedias so much that they are willing to pay a high price just to have the final update.

I have at least one copy of each of the big four (excluding the STATS book from that designation) on my bookshelf at all times. Of the four, the two that I use most are ESPN and Sports Encyclopedia: Baseball. Of all of the encyclopedias, I have to count SE:BB as my favorite. It's certainly not the most statistically complete or the best-edited, but it's the only one of the four that breaks from the career register format and instead presents a season rosters format.

I've always felt that the season rosters lend themselves better to browsing than the career registers. (This is the part where the readers scream, "With a computer you can have both!") Not only does it allow one to look at team composition and track changes from year-to-year, it allows one to view an entire league-season on 2-4 pages, making it much easier to get the big picture for a season.

The SE:BB is not without flaws, of course. The book is filled with typos, many of which were presumably there from the first edition to the last. Two quick examples, both from the 1994 edition (although I'd be very surprised if they were corrected in later updates):

* Johnny Kling is listed as "Johnny King" with the roster for the 1901 Cubs, and in the 1901-19 Batter Register (later Cub seasons correctly list him as "Kling").

* The header for the 1972 NLCS says "Cincinnati (west) 3 Pittsburg (East) 2". Perhaps if this were a listing for 1882, it could be considered authentic to the times.

There have to be dozens of similar errors throughout the book, none of which are damning to its utility as a baseball reference but all of which do build up to an uneasy feeling of neglect. Still, the charms of the book overcome that for my money.

Like its cousin, the Macmillan, the statistical selection in SE:BB was formed at its first publication (1969 for Big Mac, 1974 for SE:BB). OBA is nowhere to be found, nor are CS or pitcher home runs allowed. Fractional innings pitched are rounded, and the typesetting varies throughout the book, making some sections more difficult to read. Sometimes space requires severe truncating of batting lines--Dick McAuliffe went 7-27 as a 20-year-old left-handed hitter for the 1960 Tigers, but that's all you can find out.

The ESPN encyclopedia, edited by Gary Gillette and Pete Palmer, is my favorite of the three career register works. Mostly this is because it is the most recent, superseding Total Baseball. For the most part, the statistical selection is the same as TB. In both cases, I'd love to have a better offensive rate than OPS+, and I think they tried too hard with respect to fielding categories, but both give the basic categories necessary to build standard statistics.

Total Baseball is unique because of the volume of the text that accompanies the statistics--short biographies of notable players, team histories, a history of sabermetrics, and a bunch of other articles that changed from edition-to-edition. More than any of the other encyclopedias, the article turnover created a reason to buy each new edition (other than, of course, the updated statistics).

The Macmillan must be given respect due to its status as the pioneer; the research that went into producing it has been incorporated by every serious baseball historical work of any stripe since that time. As an encyclopedia, though, its heyday was the first edition. It soon had SE:BB as a competitor, and with the two including essentially the same basic data, the (IMO) superior format of SE:BB made it an unfair fight. Macmillan also played fast and loose with changing statistics for silly ends. Later editions cut this out, and added some interesting data like team home/road splits and sketchy Negro League records, but by that time Total Baseball was on the scene.

The STATS All-Time Major League Handbook was the most thorough encyclopedia for individual statistics, but as such it is the one that has taken the biggest hit from the existence of Baseball-Reference. No other encyclopedia offered complete batting, pitching, and fielding data (including all of the minor categories like GDP and sacrifice hits allowed), but the sheer volume of data sapped the book of any character it might have otherwise had. While Big Mac had standings and playoff records and the like, and Total Baseball had all of that and the articles, there was no room in the Handbook for anything other than the player career register. The ancillary material was shuffled off into an equally large All-Time Sourcebook.

While the massive print encyclopedia may be something of a relic, I do think it would be wonderful if it could live on. Obviously I know nothing about the real-world feasibility of what I am about to spout, but it would be great to see an organization like SABR step up to the plate and subsidize an updated print encyclopedia (even if it had to be in PDF format, as SABR has done with the Emerald Guide) every half-decade or so. Eventually the desire for such a tome might be foreign to even the crustiest old baseball historians, but I think it's safe to say that day is still several decades off into the future.

Standard Deviation of Franchise W%

Speaking of electronic encyclopedias, this is the type of exercise that they make a breeze, which previously would have been an arduous chore. I figured these a while ago with the intent of using them in some other discussion, but that never materialized so I'll dump them here.

These charts simply show the standard deviation of full-decade W% for each major league franchise. I have criticized the use of decades as a line of demarcation for baseball statistics in the past, but this is not a thorough analytical endeavor and they do provide an easy, straightforward manner of categorization. I have defined the decade here as 1901-1910, 2001-2010, etc., not because I have any particularly strong feelings on the matter of decade division but because it works better since 1) it includes 2010 and 2) the first decade thus defined corresponds with the American League's 1901 ascension to major league status.
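For anyone who wants to reproduce the decade buckets from a season-level database, a minimal sketch of the 1901-1910 style grouping follows. The function names are hypothetical and not taken from any particular database or tool.

```python
def decade_start(year: int) -> int:
    """Return the first season of the decade containing `year`,
    where decades run 1901-1910, 1911-1920, and so on."""
    return (year - 1) // 10 * 10 + 1


def decade_label(year: int) -> str:
    """Return a label such as '2001-2010' for the decade containing `year`."""
    start = decade_start(year)
    return f"{start}-{start + 9}"


if __name__ == "__main__":
    # 1910 closes the first decade and 1911 opens the second.
    for season in (1901, 1910, 1911, 2010):
        print(season, decade_label(season))
```

Subtracting one before the integer division is what shifts the boundary so that years ending in zero fall at the end of a decade rather than the start.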

There are four different standard deviations shown for each decade--"whole" is the StD for teams that completed the entire decade. This is fairly arbitrary, as it allows the 1961 AL expansion teams but excludes the 1962 NL expansion teams (the four 1969 expansion teams are obviously excluded as well). "All" is the StD for all franchises that played in the decade, even if it was for as little as one season (actually, the shortest in-decade tenure is two years for the 1969 expansion teams). "1901s" is the StD for the sixteen franchises that have played continuously since 1901. While they now make up just over half of MLB, they at least provide a constant frame of reference throughout the century. "Expan", as you might figure, is the StD for whichever of the fourteen expansion franchises competed in a given decade.
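The four groupings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the decade W% is taken as total wins over total decisions, the population standard deviation (`statistics.pstdev`) is used since the post does not say which form was computed, and the record fields (`original`, `seasons`, `wins`, `losses`) and the sample teams are made up for the example.

```python
from statistics import pstdev


def decade_wpct(wins: int, losses: int) -> float:
    """Decade W%, assumed here to be total wins over total decisions."""
    return wins / (wins + losses)


def spread_by_group(records):
    """Return the standard deviation of decade W% for each grouping.

    Each record is a dict with hypothetical keys:
      original (bool) - one of the sixteen franchises dating to 1901
      seasons (int)   - seasons played within the decade (1 to 10)
      wins, losses    - decade totals
    """
    groups = {
        "whole": [r for r in records if r["seasons"] == 10],
        "all": list(records),
        "1901s": [r for r in records if r["original"]],
        "expan": [r for r in records if not r["original"]],
    }
    result = {}
    for label, members in groups.items():
        pcts = [decade_wpct(r["wins"], r["losses"]) for r in members]
        # pstdev raises on an empty list, so report None for empty groups.
        result[label] = pstdev(pcts) if pcts else None
    return result


if __name__ == "__main__":
    # Made-up records, not real franchise totals.
    sample = [
        {"original": True, "seasons": 10, "wins": 600, "losses": 400},
        {"original": True, "seasons": 10, "wins": 400, "losses": 600},
        {"original": False, "seasons": 10, "wins": 500, "losses": 500},
        {"original": False, "seasons": 2, "wins": 60, "losses": 140},
    ]
    for label, value in spread_by_group(sample).items():
        print(label, value)
```

Swapping `pstdev` for `statistics.stdev` would give the sample form instead; with only a dozen or two franchises per decade the two differ noticeably, so it is worth knowing which one a given chart uses.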

By this measure, the 1980s and 90s stand out as very competitive periods in the game, and the 2000s were a step back from that. However, the standard deviation of franchise W% in the last decade was essentially the same as in the 1950s and 60s, and still well under the norm for most of history.

The next chart gives the average W% for teams by decade broken down into 1901s and expansion teams. It also lists the best and worst franchise W%s for the decade, but those lists include only the teams that played ten seasons in each decade:

In the 1980s, expansion teams actually had a slightly better record than the 1901s, but they have lost ground in the last twenty years. Of course, most of the big city teams are 1901s, with the major exception being the Angels. The spread between the best team W% and worst was higher in the 2000s than it had been since the 1960s, but I wouldn't attempt to make anything out of it.

Two Team Cities

During a bout of encyclopedia browsing, I noticed that the two Boston teams both had dreadful 1906 seasons. The Braves were 49-102, but the now-Red Sox were even worse, losing three more games (49-105). I made the mistake of pointing this out on Twitter and saying that it "had to be the worst" such record.

Of course, it didn't have to be anything, and it isn't. It is only the third-worst combined record by teams in the same city since 1901. While I'm sure someone has done this before, a quick search turned up nothing. I considered Brooklyn to be New York (meaning that from 1903-1957 New York had three teams), and I considered the Angels/Dodgers and Giants/A's as sharing a city (when applicable). The ten worst single season records for the two or three teams combined:

At least Boston 1906 was the worst in something, as it was the worst non-Philadelphia combined record. Philly has seen some bad records over the years, but none worse than 1919 when the Phillies were 47-90 and the A's were 36-104. The worst years for each of the two-team cities other than Boston and Philadelphia were St. Louis 1913 (108-195, .356), Chicago 1948 (115-191, .376), Bay Area 1979 (125-199, .386), New York 1965 (127-197, .392), and Los Angeles 1992 (135-189, .417).
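The combined records are just summed wins and losses, which a short sketch makes concrete. The function name is hypothetical; the inputs are the Boston 1906 and Philadelphia 1919 team records quoted above.

```python
def combined_record(*team_records):
    """Sum (wins, losses) pairs for teams sharing a city and
    return (wins, losses, combined W%)."""
    wins = sum(w for w, _ in team_records)
    losses = sum(l for _, l in team_records)
    return wins, losses, wins / (wins + losses)


if __name__ == "__main__":
    boston_1906 = combined_record((49, 102), (49, 105))
    philadelphia_1919 = combined_record((47, 90), (36, 104))
    for city, (w, l, pct) in (("Boston 1906", boston_1906),
                              ("Philadelphia 1919", philadelphia_1919)):
        print(f"{city}: {w}-{l}, {pct:.3f}")
```

Boston 1906 works out to 98-207 (.321) and Philadelphia 1919 to 83-194 (.300). Because the function takes any number of pairs, the three-team New York seasons can be handled the same way.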

The best records are:

Four of these top ten featured a crosstown World Series, led by the 1906 victory by the White Sox over the Cubs; the others are St. Louis 1944, New York 1951 (Giants/Yankees as the Dodgers dropped the three-game NL playoff), and New York 1952 (this time Dodgers/Yankees). The banner years for the other cities were Boston 1915 (184-119, .607), Philadelphia 1913 (184-120, .605) and Los Angeles 2009 (192-132, .593).

The overall records for each city (for years in which they had multiple teams) are:

The cities in which the combined record has been good still have two teams; the ones in which they were poor do not. Shocking but true.

