Comments on Walk Like a Sabermetrician: End of Season Statistics, 2007

2007-10-04 11:28 (-04:00)
Yes, all five years are equally weighted. Here I use, of course, the last five years, but in the future I would use the two years on either side (so for 2003, 2001-2005) when applicable, and five years if at all possible.

If I were going to do it exactly, I would find the simple ratio for each team (11.42/9.16 for the Rockies). Then, if they played 162 games, I would do (81/162)*11.42/9.16 + (games played in ARI/162)*ARI factor + (games played in SF/162)*SF factor + ...

As for the "+1", yes, it should really be the road PF. So that would be (16-1.228)/15 = .9848, and the new factor would be (.9848+1.228)/2 = 1.1064. When you regress that and round to two decimal places, you get 1.10. I don't sweat the small stuff in this instance because by the time you regress and truncate to two decimal places, it makes very little difference (a couple of parks move by one point).
Posted by p

2007-10-04 11:09 (-04:00)
You're lucky you didn't have your credit card info in a hidden column.

So all five years are weighted equally? If you were to do an analysis of 2003, would you use the years 1999 to 2003 or 2001 to 2005?

If I wanted to weight the away parks exactly according to the unbalanced schedule, I'd do that in the "initial factor" -- where the 15*whatever term is located, right?
Why is the final environment adjustment (1.2+1)/2 -- shouldn't the 1 really be the away park factor, excluding Coors?
Posted by Anonymous

2007-10-04 10:58 (-04:00)
Cool, I didn't realize that if you copied you would get all the hidden columns.

The PFs are five years (when applicable, of course), equally weighted, regressed towards 1, and

Let me walk through a sample calculation. The Rockies and their opponents over the last 5 years have averaged 11.42 runs/game in Coors and 9.16 on the road. The initial factor is 16*11.42/(15*9.16 + 11.42) = 1.228. The "16" and "15" weights reflect the fact that the "neutral" context is actually 15/16 road and 1/16 Coors, since everyone plays some games there (of course, with the unbalanced schedule, this is not precise, as you don't play equally in each park).

Then, to convert it into a straight adjustment factor, we have to account for the fact that 1/2 the games are at home and 1/2 on the road, and so we do (1.228 + 1)/2 = 1.114.

Finally, I regress that towards one, based on regression factors posted by MGL several years ago (.6 for 1 year of data, .7 for 2, ..., .9 for 4+). So the final PF will be 90% of 1.114 and 10% of 1, or 1.103, which I round to the second decimal place for 1.10.
Posted by p

2007-10-04 10:16 (-04:00)
Can you explain the park factors a bit more? When I copy the data from the GoogleDoc, I get additional columns, which imply the data's from multiple seasons. How many?
How are the seasons weighted?
Posted by Anonymous

2007-10-04 10:02 (-04:00)
Thanks, it should be fixed now.
Posted by p

2007-10-04 09:37 (-04:00)
Thanks for the data. I'm not able to access the 2007 Teams sheet.
Posted by Anonymous
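
For anyone who wants to reproduce the Rockies numbers from this thread, the arithmetic can be sketched in a few lines of Python. This is only an illustration of the steps described in the comments, not the spreadsheet's actual formulas: the function names, the `use_road_factor` switch, and the default of 16 teams and a 0.9 regression weight are assumptions chosen to match the worked example (four or more years of data).

```python
def initial_factor(home_rpg, road_rpg, n_teams=16):
    """Home runs/game relative to a 'neutral' context that is
    1/n_teams home park and (n_teams - 1)/n_teams road parks."""
    return n_teams * home_rpg / ((n_teams - 1) * road_rpg + home_rpg)


def road_factor(initial, n_teams=16):
    """Average factor of the other parks, assuming all parks average to 1."""
    return (n_teams - initial) / (n_teams - 1)


def park_factor(home_rpg, road_rpg, weight=0.9, n_teams=16,
                use_road_factor=False):
    """Blend home and away contexts 50/50, then regress towards 1.

    weight is the share kept from the observed factor (0.9 is the value
    quoted in the thread for 4+ years of data).
    use_road_factor=False reproduces the simpler "(x + 1)/2" shortcut.
    """
    init = initial_factor(home_rpg, road_rpg, n_teams)
    away = road_factor(init, n_teams) if use_road_factor else 1.0
    blended = (init + away) / 2
    return weight * blended + (1 - weight) * 1.0


# Rockies figures quoted in the thread: 11.42 R/G at Coors, 9.16 on the road.
simple = park_factor(11.42, 9.16)
exact = park_factor(11.42, 9.16, use_road_factor=True)
print(round(simple, 2), round(exact, 2))
```

Both versions round to the same two-decimal factor for Coors, which is the point made in the thread about the shortcut mattering very little after regression and rounding. The unbalanced-schedule weighting is left out here because it needs per-park game counts that are not given in the comments.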