Tuesday, March 22, 2011

Historical Park Factors, 1901-2008

I have posted an updated spreadsheet with park factors for all teams, 1901-2008, as a Google Spreadsheet. These are five-year park factors, calculated in the same manner I describe on this page.

The guiding philosophy was to try to include as much data as possible. If there are five possible years of data to be used for a park, they will all be used, even if four of the seasons were in the past or in the future. The source of the raw data was KJOK’s excellent park database for past seasons and various sources (most notably Baseball-Reference) for recent seasons.

I treat a park as new if there are major changes to the dimensions, but I did not by any means do a complete historical survey to find out when those changes have taken place, so some that probably should have been treated differently are not. If you have specific data on when a change should have (or shouldn’t have) been made, feel free to leave a comment and I will try to incorporate these changes when I update the chart some time in the future.

Additionally, when a team moves, and a new team immediately moves in (for example, the Senators of ’60 and ’61), this is treated as a new team. Also, when a team has played a significant number of games (which I defined as around ten or more) in a different stadium in the same year, those years are treated as a new park (for example, the Dodgers played games in New Jersey in the two years before they moved from Brooklyn). Whenever a “new park” of this sort is established, the return to the old arrangement is treated as yet another new park.

The reason the park factors are only shown through 2008 is that my ideal data set is two previous years, the year in question, and two future years. For most of the parks active in 2009, we will after 2011 be able to fill this dataset, and so I don’t want to publish a park factor now and change it later. However, there are parks where the 2008 or 2007 factors are not yet settled because they are new and there are not yet five years of data available. In these cases, I have listed a PF but marked it as one that will change in the future (this is indicated with an orange shading; park factors for the first year after a switch are in pink text).

Now I will give an example of how I chose the years to be considered in figuring the PF. Suppose we look at the Diamondbacks, who have played in Bank One Ballpark since 1998. In 1998, we have no previous data, but there are four future years of data, so the sample is 1998-2002. For 1999, there is one previous year, so we also look at three future years, and get 1998-2002. For 2000, there are two previous years, so we use two future years, and have a sample of 1998-2002. This is now in the ideal format: the year in question, plus the two immediately prior and future years. Of course, in 2001, we use the two previous years (1999 and 2000), and two future years (2002 and 2003), making the total sample 1999-2003, and it will continue in that manner until something changes.

Let’s also consider the end of the Braves’ tenure in Atlanta-Fulton County Stadium. The last season there was 1996. For 1994, we have two previous years (’92 and ’93) as well as two future years (’95 and ’96), so we use 1992-1996. For 1995, we have just one future year, so we use three previous years, and again use 1992-1996. For 1996, there are no future years, so we use four previous years, and the sample is once more 1992-1996.
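
The two examples above can be sketched in a few lines of Python. This is only an illustration of the year-selection rule as described here, not the actual procedure used to build the spreadsheet. The function name pf_window and its parameters are made up for this sketch, and the last=2010 and first=1966 values in the usage lines are assumptions (any sufficiently late or early bound gives the same answers for these years).

```python
def pf_window(year, first, last, size=5):
    """Return (start, end) of the seasons used for `year`'s park factor.

    `first` and `last` are the first and last seasons in which the park
    is treated as unchanged (hypothetical inputs for this sketch).
    """
    if not first <= year <= last:
        raise ValueError("year must fall within the park's tenure")
    half = size // 2
    # Ideal sample: the year in question plus two years on either side.
    start, end = year - half, year + half
    # Too close to the park's first season: borrow extra future years.
    if start < first:
        end += first - start
        start = first
    # Too close to the park's last season: borrow extra previous years.
    if end > last:
        start -= end - last
        end = last
    # A park with fewer than `size` seasons simply uses all of them.
    start = max(start, first)
    return start, end


# Diamondbacks, park opened in 1998 (last=2010 is an assumed bound)
for y in (1998, 1999, 2000, 2001):
    print(y, pf_window(y, first=1998, last=2010))

# Braves, last season in the park was 1996 (first=1966 is an assumed bound)
for y in (1994, 1995, 1996):
    print(y, pf_window(y, first=1966, last=1996))
```

The window is shifted rather than shrunk, which matches the guiding philosophy of using five years of data whenever five are available.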

In the previous iteration of these park factors, there were three recent parks for which I needlessly inserted a changepoint and thus changed the factors for the surrounding seasons. These teams were pointed out by Terpsfan101, and I have corrected their PFs in this edition. They are Detroit, 1994-1999; Minnesota, 1989-1995; and Seattle, 1993-1996.

4 comments:

  1. "Additionally, when a team moves, and a new team immediately moves in (for example, the Senators of ’60 and ’61), this is treated as a new team."

    Why?

  2. I don't have any particularly compelling reason behind that decision. In the Senators case, they changed parks in 1962, moving into what would become known as RFK, so it only affects the 1961 factor, and it's the only such move post-1900.

    I suppose I could offer that it's a recognition that park factors don't truly isolate the effect of the park, although that would be a poor excuse and wouldn't speak well for the validity of park factors when teams stay put.

  3. Are these numbers the raw data or are they halved to take into account full season stats?

  4. They are halved and regressed.

