Tuesday, August 25, 2009

My Personal Path to Sabermetrics

This post is quite self-indulgent, and I don't really expect it to be of any interest to anyone. I was writing a post about a related topic and this wound up being a lengthy digression, so I took it out to use as a stand-alone post.

In the aforementioned post, which I may or may not ever get around to publishing, I offer up definitions for three generations of sabermetric practitioners and disciples (for lack of a better word): pioneers (folks like Pete Palmer and Bill James, just to name a couple), the second wave (those who were brought into sabermetrics through the work of the pioneers), and the internet generation. Which group one falls into (to the extent that such classifications are valid and worthwhile, which is questionable) is not so much a question of age but of when one's interest in sabermetrics germinated.

I fall right on the bubble between the second wave and internet generation, and in case I ever do publish that post, you might want to understand where I am coming from based on my own experiences.

I was not much of a baseball fan as a kid (nine years old and under). I never played any sport at any level above friendly neighborhood games (no athletic ability--see, I meet the stereotype for a stat nerd already), and we mostly played football and basketball at that age. My general impression of baseball was that it was boring. I remember my dad watching the World Series one time, and I couldn't sit there and watch it for more than one or two innings despite being able to watch entire football games contentedly.

The first major league game I attended was in 1993; it was the fourth-last game played at old Municipal Stadium, Brewers v. Indians. The Indians won 6-4, but I couldn't have told you that if I hadn't looked it up on Retrosheet. It was a chilly fall Sunday, and the Browns were playing in Indianapolis, and a lot of the people there were listening to the game on the radio, and there was a guy with a little portable TV that people were huddled around. The Browns lost--that much I remember. This experience did nothing to lure me into baseball fandom. Looking at the box score, I feel remorseful about this, as a young Jim Thome went deep and Bill Wertz pitched in relief for the Indians, something that would make me incredibly excited were it to happen today.

The Opening Day game at Jacobs Field in 1994, despite the fact that I only experienced it through the radio, made me a baseball fan. By that time I was listening to sports talk radio, mostly to hear football talk, but the opening of the Jake and the cautious optimism surrounding the Indians' 1994 campaign were popular topics. And I certainly had a sense of history, so when I got home from school I listened to the end of the game, which the Indians won in extra innings after Randy Johnson had taken a no-no fairly deep into the game. That alone got me hooked, but it might not have stuck had the Indians not continued to play well. Fortunately, they did, and by the time summer came around I was a born-again baseball nut. Even the strike did nothing to deter me, and the next spring I was happily listening to Joe Slusarski, Joe Biasucci, Eric Yelding, and the other Indian replacement players on the radio.

To really get into sabermetrics, you need to possess two traits. First and foremost is a deep interest in baseball, but second is the desire to quantify things and to understand them. I may have been woefully lacking in the first department, but I already possessed the latter trait. As long as I can remember I was always interested in learning facts and reading. My favorite book when I was in the second grade was the World Almanac. When I was in kindergarten or first grade, I kept notecards with data about the planets on them--distance from the sun, length of day and year, diameter, etc.--even though I was obviously too young to really understand what it meant.

So once I did catch the baseball spark, it was only a matter of time until I became interested in records and statistics. And since I was predisposed to like that sort of thing, the wealth of records and statistics in baseball only strengthened my interest in the game. I did take a short detour into the world of baseball cards, but that only lasted through the spring of 1995, and I was always reading the numbers on the back.

By 1995, I was filling up sheets and sheets of notebook paper with lists of the World Series winners, home run leaders, team nickname histories--basically, all of the information that you get in an average sports almanac. Fortunately, I was aware of two offensive stats called On Base Average and Slugging Average, and wanted to include lists of those in my folders, and to collect the lifetime stats for all of the great players.

The yearly SLG leaders were easy enough to find, but lifetime stats and OBA leaders were harder to find in conventional sources. I needed a specialized reference, a baseball encyclopedia, to answer my questions. And it just so happened that I had an older friend in the neighborhood who had a copy of Total Baseball II, and he lent it to me.

So I got the data I needed, but I also saw a whole world of new categories, and naturally I wanted to understand those. So I read the glossary, and when I saw The Hidden Game of Baseball at the library, I checked it out. And from there, it was pretty much over. I was into sabermetrics. Bill James followed and that was all she wrote, by the summer of 1996.

Based on my generational definitions, that account falls solidly as second wave. However, while we didn't have the internet in our house yet, I had friends who did. When school started in the fall, I told one of my friends who was also a baseball fan about sabermetrics, and he googled it (actually, he didn't, as Google didn't exist yet), and printed out some stuff for me, mostly from Keith Woolner's site (which was not yet called Stathead.com--it was still called Baseball Engineering). Anyway, I distinctly remember the article on Marginal Lineup Value being one of the things he gave me.

So while I came to sabermetrics primarily through the work of the pioneers, and not the internet (which is my definition of second wave, more or less), it was only a matter of time until I was getting sabermetric content from the internet. Furthermore, I had only been a baseball fan for about a year and a half before I got into sabermetrics. That means I had very little time to be indoctrinated into the conventional wisdom and the conventional statistics, and I was very open to persuasion by sabermetric arguments as they didn't challenge any long-held beliefs. That is a trait that I generally associate with the internet generation of sabermetric devotees.

If I pride myself on anything about my formative sabermetric experience, it's that I didn't get sucked into Bill James' anti-linear weights stance expressed in the Historical Baseball Abstract. I always kept an open mind between the two positions and tried to see how they could be reconciled (by some rudimentary "+1"-type analysis of Runs Created). It wasn't until much later that I understood the full range of benefits of linear weights, but at least I never reflexively rejected them. Perhaps it was fortunate that I read The Hidden Game before I read the Historical Abstract.

Anyway, if I ever get around to publishing that post, it is somewhat critical of certain elements of the internet generation while showing a bit of a bias towards pioneers and second wavers. Hopefully, this account admitting my own biases and my straddling of the fence between the second wave and internet generation will enable my comments to be taken in the constructive nature in which they are intended, rather than viewed as self-aggrandizing.


  1. I'm part of the James Gang. I had already been following baseball since 1975 and understood the trad stats; knew how to calculate ERA and batting average. I was in high school when the Abstracts were being published and found the 1985 one in a local department store. So I dug Runs Created (I liked how it was on the same scale as RBI and Runs Scored) and Range Factor. But more importantly, the HBA got me into the history of the game more so than I had been before. I'm more of a baseball history buff than a sabermetrician. Never really figured out how to manipulate databases. Had this still been the pencil and paper era, I might be alright, but I'm at a competitive disadvantage when it comes to doing anything that's derived from the Lahman database and its descendants.

  2. I am really not a database guy myself. I'm good with Excel and standard spreadsheets, but when it comes to SQL and Access I need my hand held, although I do have a basic understanding of what's going on.

