Tuesday, April 03, 2007

What is Greatness, or What is Peak?

This is a non-sabermetric question; greatness, unlike “value” or its slightly more vague cousin “ability”, cannot be tied up in a neat numerical package. After all, the word itself implies some sort of transcendent quality that leads to bromides such as “I know it when I see it.”

The only reason I bring this up is that “great” and its synonyms are often used in Hall of Fame discussions, or the ever-present “100 Greatest Players of All-Time”. Even though it is difficult to define greatness, everyone has some internal definition that guides their thoughts when answering these questions.

So suppose you have your personal definition of greatness, and it involves “peak value” or some such thing. How would you go about convincing me that I should care about peak value as it relates to who should be in the HOF or whether Sandy Koufax was “greater” than Don Sutton? Truth be told, I have never seen a satisfying explanation. I suppose this could be because I am not in that camp, and my definition of terms is different than yours, and this different viewpoint is an insurmountable obstacle to my accepting your argument.

This may well be true; however, I will now try to justify my position, which is that the concept of peak value is unimportant for anything other than the question of who had the highest peak value. In other words, I don’t think that looking at a player’s five best seasons or best three consecutive seasons or eight best seasons or what have you is very useful in answering the HOF or greatness questions. There is nothing wrong with asking “Who was the most valuable player over a five-year period?”, but I simply do not see the relevance of this to the greater questions.

My position starts with the premise that what is truly important in the baseball world is helping your team win games. We could now digress and argue about whether this means the actual wins you contributed, including the situation in which your performance occurred (WPA); or whether this means the actual wins you would have contributed had your performance been distributed across possible situations in an average distribution (TPR); or whether this means the actual wins you would have contributed if you had played in ballparks that were neutral in every way, and again your performance had been distributed across possible situations in an average distribution (XW); or any other number of possible ways you could quantify “value”, “performance”, or “ability”.

But I’m not going there, not now, because one can agree with the principle that contributing to team wins is the important thing without agreeing exactly on how to quantify the contribution.

From my perspective (and again that’s all I can offer here, and I’m not passing this off as a black and white issue), the only way I can see giving extra credit for peak performance is if it results in tangible team success, more so than scattered performance would.

In my mind, this standard immediately throws out any construction of “best 3 years” or “best 5 consecutive years”. There’s no logical reason why some arbitrary standard like that would be directly related to the extra team success resulting from clustered performance. It *may* be correlated with it, but it certainly in and of itself does not represent it.

Therefore, it all comes down to the extra team success. What value is it to a team to have ten wins in one year rather than five wins in each of two different seasons? There are two possible answers that I can see. One is that since the number of roster slots on a team is constrained, a player who contributes +1 WAR may have positive value, but if you have a whole team of guys like that, you’re only going to win 82 games. In other words, there is a “roster slot filling cost” that delinearizes the relationship between two players’ WAR. This I can accept, but I don’t think it would make a big difference.

The other answer, which is much more common and potentially has a bigger impact, is that concentrated performance helps the team to win more pennants. Bill James and others, particularly Michael Wolverton and Dan Levitt, have modeled this and found that it is true. One 10 WAR season will have a bigger “pennants added” value than two 5 WAR seasons. However, the difference is not great. Under a career “pennants added” approach, Sandy Koufax is still unable to crack a list of the forty most valuable pitchers of all-time. Also, these approaches are fairly hard to do properly, as you need to make a number of assumptions about the distribution of team winning percentage, what exactly the “pennant” threshold is, how the player’s impact on the team should be valued, etc. Of course, one could argue that this means that the existing approaches are inadequate and that this is why the results have not saved Koufax. I do not think this is the case; I am only weakly trying to justify why I have not gone all out in trying to model it myself. The most user-friendly approach, I believe, would be to compare the rigorous Pennants Added figures to more conventional ones, and either find a baseline that approximates the impact, or maybe a formula that uses an exponent to increase the value of big seasons, etc.
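
As a rough illustration of that exponent idea, here is a minimal Python sketch. It is not a Pennants Added model. The function name and the handling of sub-replacement seasons are my own assumptions, and the 1.16 exponent is only the figure one commenter below recalls from memory, not a fitted value.

```python
# Sketch: exponent-weighted career value as a crude stand-in for Pennants Added.
# The 1.16 default is the exponent recalled in the comments, not a fitted value.

def weighted_career_value(season_wars, exponent=1.16):
    """Sum each season's WAR raised to `exponent`.

    Assumption: sub-replacement seasons count as zero, because a negative
    number raised to a fractional power is not a real number.
    """
    return sum(max(war, 0.0) ** exponent for war in season_wars)


one_big_season = weighted_career_value([10.0])
two_good_seasons = weighted_career_value([5.0, 5.0])

# Both careers total 10 WAR; the exponent credits the concentrated one more.
premium = one_big_season / two_good_seasons - 1.0
print(f"premium for concentration: {premium:.1%}")
```

With that exponent the single 10 WAR season comes out roughly 12 percent ahead of the two 5 WAR seasons, which is a real premium but not a large one.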

From my perspective, “peak” is a fuzzy word, since so many people have their own definition for it. Now “peak” is fine for answering the question of “who was the best over a five-year span”. I don’t feel that question is particularly relevant to the HOF, but I have conceded that it is a legitimate one to ask, if only for the heck of it. I think that “clustered” or “concentrated” performance is a better way to describe the potential impact of high “peak” performance as it pertains to a discussion of the player’s career as a whole.

But what nobody has ever been able to explain, at least persuasively enough to change my outlook, is why I should, say, evaluate Hall of Famers based on “peak” value. It seems to come down to a belief that the Hall of Fame is about “greatness”, and that Sandy Koufax was great and maybe Don Sutton was and maybe he wasn’t. Koufax just should be in the Hall of Fame because that’s how they feel.

Well, if that’s how you feel then knock yourself out. But it ain’t me, babe. I hereby issue a friendly challenge for any peak supporter who happens to be reading this to write a comment explaining why I or anyone else out there should evaluate Hall of Fame candidates in terms of peak value rather than in terms of career value, with considerations for the extra value of “concentrated” performance. I just explained why I don't buy into the peak value mindset, so I would love to know why I'm all wet, if you feel that way. And I certainly wouldn't mind giving your view exposure. If you want to write an article-length response, I'll even post it here, assuming you have no better place to post, which is doubtful because in five minutes you could set up your own blog which would probably beat posting here.

15 comments:

  1. I remember several years ago devising an exponent method to approximate pennants added. When I applied it to Koufax and Sutton, it reduced the impact of Sutton's 63% edge in WAR to (the relative impact of) *only* a 54% edge.

  2. David, do you remember what you came up with as the approximate exponent?

    That's good info, though, and shows that even in extreme cases, it's tough to make up ground by looking at pennants.

  3. This comment has been removed by the author.

  4. Tango has posted his definition of greatness, and why it matters:

    here

  5. I think there are two fundamental reasons we care about peak. First, the essential unit of baseball is the season. We care much more about winning divisions, pennants, and WS than how many wins our team averaged over a 10 or 15-year period. A high peak performance makes those kinds of successful seasons more likely. The pennants added metrics capture some of that, but I'm not sure they get all of it.

    Second, sports fans value individual excellence. Team wins are the single most important thing, but not the only thing we care about. We admire players who are the very best at what they do, and enjoy watching them demonstrate their skill. We admire a no-hitter more than a shutout. Both games were wins, but does that make the performances equal? Most of us would say no.

    It really comes down to being the best (or one of the best). And that means the best in one or more SEASONS. We end up talking about 5- or 7-year peaks because one season isn't quite enough to prove anything, while career measures lose sight of the individual seasons that comprise our baseball-enjoying experience.

  6. "The pennants added metrics capture some of that, but I'm not sure they get all of it."

    If this is in fact true, then I still don't see it as a justification for these arbitrary criteria like "best 5 years" or "best three consecutive years", etc. If our methods are inadequate, then people should try to develop better ones that model reality better, rather than throw their hands up and start making up criteria with no sabermetric justification whatsoever.

    I can certainly accept that great individual performances may be fun to watch. Certainly Koufax's career was more exciting than Sutton's. But if it doesn't have a significantly different real world value, then I just can't care about it when it comes to ranking players.

  7. I didn't mean to cast doubt on the pennant-added metrics -- I just haven't looked at them closely enough to have an opinion. They may well be a big improvement over 5-year, etc. peak calculations. (BTW, looking at Levitt's article, a high peak is valued at about 20% more than a good season -- not a trivial difference.)

    But our real disagreement lies here: "I can certainly accept that great individual performances may be fun to watch....But if it doesn't have a significantly different real world value, then I just can't care about it when it comes to ranking players." You have decided that "real world value" is different from "fun to watch" (or "exciting"), and have defined the former exclusively in terms of games won. That's fine, but recognize that this is your subjective judgment, not an objective sabermetric truth. ML baseball is a form of entertainment for baseball fans. The fans get to decide what matters to them, and it may not only be team wins (and that is especially true if the question at hand is qualifying for the Hall of Fame). And in fact -- this is an empirical reality -- many fans, writers, and players clearly DO place value on players being the very best at what they do, at a seasonal or even game level.

  8. "The fans get to decide what matters to them, and it may not only be team wins"

    Taking this to its logical conclusion, if fans decide that they value scrappy white guys who hustle over surly black guys who don't get along well with the media, are we forced to conclude that Pete Rose is a more valuable player than Barry Bonds?

    If "value" is defined as added team revenue, then maybe this is something you would have to accept.

    The real issue is that we are using different definitions of value. I have no idea what yours is; mine is that value comes in winning games and pennants. Value in this sense does not depend in any way on perception or revenue or anything else. You can disagree with this definition all you want, but that is how I am using it here, and so all of my comments must be viewed in that light.

    You say that it's not an "objective sabermetric truth", which is true, I suppose. But the traditional focus of sabermetrics has been to evaluate what leads to wins, losses, and pennants in baseball. Sabermetricians can address the question of what fans value, or what generates revenue for teams, but these are all distinct questions. I don't think that any definition of value that incorporates the subjective valuation of fans or owners or anyone else is particularly enlightening or useful when compared to the standings that they print in black and white every morning.

  9. Good discussion, Patriot. Since two separate threads is a pain, I'll limit my comments to Tango's blog.

  10. Pat, I think the exponent formula was WAR^1.16. That is only for seasonal totals, to be added up to a career total.

  11. Thanks, David.

    Guy mentioned that Levitt's method in BTN showed that a high peak was valued at 20% more than a good season. I don't know exactly what he was defining those as, but if we say that 10 WAR is a great season and 5 WAR is a good season, we have a 2:1 ratio in WAR but 2.23 in PA (2^1.16 with David's exponent). So that's giving around 12%.

    Going back to Guy's comment:
    "I didn't mean to cast doubt on the pennant-added metrics -- I just haven't looked at them closely enough to have an opinion."

    I agree that the PA metrics are not a closed case; we should continue to examine those approaches and see which are better, etc. There may even be a lot of room for improvement. But I do think that those kinds of approaches are the "proper" way to credit clustered performance.

  12. In Levitt's article he compares Bench's peak years to the more prosaic years that followed. The ratio of pennants added to wins above average was about 1.2:1. He does another comparison where extreme peak performance is 19% more valuable -- don't remember the details.

    I suppose that would mean that a comparison to WAR would give you an even greater ratio, maybe something like 1.3:1?

  13. A comparison to WAR shouldn't give a greater ratio, it should give a lesser ratio. Suppose you have two players who each make 300 outs. The league average is .18 r/o and we assume a replacement hits at 75% of that or .135. One guy creates .25 r/o and the other creates .4.

    RAA(1) = (.25-.18)*300 = +21
    RAA(2) = (.4-.18)*300 = +66

    RAR(1) = (.25-.135)*300 = +35
    RAR(2) = (.4-.135)*300 = +80

    The gap in each case is 45 runs, because that is the direct difference between Player 1 and 2. But in the RAR case, we add in the roughly 14 run difference between the replacement player and the average player, for each player, diluting the ratio.

  14. Sorry, wasn't clear. What I meant was this:
    A pennants-added metric values a very high peak about 20% more than the WAA metric, relative to a merely good performance. If we instead compared the pennant metric to what you get with a replacement-based metric, the "premium" for extremely good performance should be even more than 20% (since, as you say, a replacement baseline minimizes the difference between players).

    * *

    BTW, do you favor using a WAA standard, or WAR, for HOF admission? To me, using WAA seems like a backdoor way to introduce the peak concept. Do you agree?

  15. Crap, I got myself all confused there. Anyway, this all goes to illustrate that the pennants added methods are not all in agreement, and that even though I feel that is the best framework for evaluating the value of clustered performances, there is still work that needs to be done on those.

    As to what baseline I would choose for the HOF, it would be the same baseline that I would use to evaluate a player's career value for any question. I have decided to go with WAR in most stuff that I publish just to avoid baseline arguments. Honestly, I have never made up my mind 100% on what baseline to use. I go back and forth from average to chained replacement to multi-tiered to progressive to straight replacement, etc.

    Comparing to average certainly does benefit the prototypical peak players. I have Sutton beating Koufax 102-65 in WAR but Koufax wins 33-28 in WAA. I stated above that I don't have a firm baseline preference, but I do believe that .500 is the highest possible justifiable baseline for general questions of player value, and so I think the truth at worst for Sutton lies somewhere in between.

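
The baseline arithmetic from comment 13 can be reproduced with a short Python sketch. The inputs (300 outs, .18 runs per out for the league, replacement at 75% of that, and players at .25 and .4) come straight from that comment; the function and variable names are hypothetical, chosen only for illustration.

```python
# Sketch of the baseline arithmetic in comment 13: the same two players
# measured against an average baseline (RAA) and a replacement baseline (RAR).

OUTS = 300
LEAGUE_RUNS_PER_OUT = 0.18
REPLACEMENT_RUNS_PER_OUT = 0.75 * LEAGUE_RUNS_PER_OUT  # .135, per the comment


def runs_above(baseline_rpo, player_rpo, outs=OUTS):
    """Runs a player creates beyond what the baseline would in the same outs."""
    return (player_rpo - baseline_rpo) * outs


player_rates = (0.25, 0.40)
raa_1, raa_2 = (runs_above(LEAGUE_RUNS_PER_OUT, r) for r in player_rates)
rar_1, rar_2 = (runs_above(REPLACEMENT_RUNS_PER_OUT, r) for r in player_rates)

# The absolute gap is the same under either baseline; only the ratio changes.
print(f"RAA: {raa_1:+.1f} vs {raa_2:+.1f}, ratio {raa_2 / raa_1:.2f}")
print(f"RAR: {rar_1:+.1f} vs {rar_2:+.1f}, ratio {rar_2 / rar_1:.2f}")
```

The unrounded RAR figures are 34.5 and 79.5, which the comment rounds to 35 and 80. The 45-run gap is identical under both baselines, but the ratio falls from about 3.1 to 1 against average to about 2.3 to 1 against replacement, which is the dilution described there.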
