Tuesday, January 28, 2020

Tripod: Common Fallacies

See the first paragraph of this post for an explanation of this series.

Here I deal with some misinformation that is sometimes spread about sabermetrics, and with poorly designed statistical methods that go against sabermetric principles. The most important things to remember about sabermetrics are 1) that it is not the numbers themselves that matter, it is what the numbers mean, and 2) that the only thing that matters is wins, and the only things that lead to wins are runs and outs. Those two principles serve to explain most of the folly behind these fallacies.

The "Bases" Fallacy

There are many methods proposed, by many different people, that use bases and outs as the two main components. These include Boswell's Total Average, Offense Ratio, and Codell's Base Out Percentage. There are others too, looking at either bases/out or bases/PA. Not all of the people who have designed these methods fall into the fallacy. Specifically, I'll look at John McCarthy and his 1994 book from Betterway Books, Baseball's All-Time Dream Team.

McCarthy rates the great players of all time by what he calls the Earned Bases Average. EBA = (TB + W + SB - CS)/(AB + W). McCarthy mentions that he has read the sabermetric research, but that the sabermetric work is too difficult for the average fan to understand. He goes on to talk about how Linear Weights rates a HR as 3.15 times as valuable as a single, a triple 2.2 times, a double 1.7 times, and so on. He then says, "I believe that the value of a baseball game is more than just runs and winning. Winning is the player's aim, but there is also a transcendent beauty to great hits. It is that beauty that puts fans into the seats and visions of grandeur into kids' fantasies. A home [run] can immeasurably lift the spirits of the team, or take the wind out of opponents. So I challenge a mathematical concept which devalues the extra bases earned by sluggers and speedsters."
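To make the disagreement concrete, here is a minimal sketch of the EBA formula as given above, next to the relative hit values it implies. The season line is invented purely for illustration, and the function and variable names are my own; the Linear Weights ratios are the ones McCarthy is described as citing.

```python
def earned_bases_average(tb, w, sb, cs, ab):
    """EBA = (TB + W + SB - CS) / (AB + W), per the formula in the text."""
    opportunities = ab + w
    if opportunities == 0:
        raise ValueError("AB + W must be positive")
    return (tb + w + sb - cs) / opportunities

# Hypothetical season line (not a real player), for illustration only.
example_eba = earned_bases_average(tb=300, w=80, sb=20, cs=5, ab=550)

# Relative value of each hit type, scaled so that a single = 1.0.
EBA_RATIOS = {"1B": 1.0, "2B": 2.0, "3B": 3.0, "HR": 4.0}   # implied by total bases
LW_RATIOS = {"1B": 1.0, "2B": 1.7, "3B": 2.2, "HR": 3.15}   # ratios quoted in the text

# How much EBA inflates each hit relative to the Linear Weights ratios.
overvaluation = {hit: EBA_RATIOS[hit] / LW_RATIOS[hit] for hit in EBA_RATIOS}

print(round(example_eba, 3))
for hit, factor in overvaluation.items():
    print(hit, round(factor, 2))
```

Under these ratios, counting bases inflates every extra-base hit relative to a single, and the triple most of all.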

Now, Mr. McCarthy may indeed have a point when he speaks of "grandeur" and stuff like that. It is OK if you want to design a method to measure the grandeur of players. Just don't get that confused with what actually wins baseball games. He later explains that the estimated values are not "tangible or real", and that "they are too complicated and many times are just clearly wrong." Sorry, buddy, it is you who are clearly wrong. A baseball game is not played in a vacuum. A player must interact with his teammates. The situations created by runners and outs affect the value of offensive events. Sure, those values are not always constant. That is why you must decide what you are measuring, be it ability or value, and choose value added runs or context neutral runs. But the fact is, a home run is not four times more valuable than a single. It just isn't. And a stolen base is clearly not as valuable as a single, because it advances just one baserunner by one base, whereas a single advances the hitter by one base, and advances most runners by at least one and sometimes two bases. Plus, the single gives an extra plate appearance to the team's offense. A stolen base does none of this.

The basic problem with McCarthy's thinking is that bases are not what matters. The game may be called baseball, but the winner is not the one with the most bases but the one with the most runs. You must relate everything to run scoring eventually if you want to really approximate its value. And TA and EBA and the like can be decent estimators of runs. But all bases are not created equal. A SB is always worth at least one base and a HR at least four. But a SB can only be worth one base, and a HR can be worth as many as ten bases. The EBA concept assumes that the only bases that matter are the ones that an individual generates for himself, but again, no player is an island. Everything eventually comes down to runs and outs, not bases.
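The "four to ten bases" range for a home run can be checked by enumerating the eight possible base states. This is a small sketch of that arithmetic; the function name is my own.

```python
from itertools import product

def bases_gained_on_home_run(on_first, on_second, on_third):
    """Total bases advanced by the batter and all runners on a home run."""
    batter = 4
    # A runner on first advances 3 bases, on second 2, on third 1.
    runners = 3 * on_first + 2 * on_second + 1 * on_third
    return batter + runners

# Enumerate all eight base states: (on_first, on_second, on_third).
totals = {
    state: bases_gained_on_home_run(*state)
    for state in product([False, True], repeat=3)
}

print(min(totals.values()), max(totals.values()))
```

A solo home run advances four bases and a grand slam advances ten, while a steal by a single runner advances exactly one base in every base state. EBA credits the home run with four bases regardless.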

The Right-Handed Hitter Adjustment Fallacy

This is one that you can try to sneak by people. After all, sabermetricians seemingly like to adjust for everything, whether or not it needs to be adjusted for, right? So, since there are more right-handed pitchers than southpaws, and righties hit worse versus righties, shouldn't they get credit for dealing with this disadvantage? No way, Jose.

Well, I suppose that if you want to measure literal ability, you want a right-handed adjustment. But literal ability has nothing to do with winning baseball games. It has to do with batting practice and skills competitions, and jaw dropping, but not winning. Just as, because of the dynamics of baseball, not all bases are created equal, a lefty hitter is worth more than a righty of the same literal ability, assuming the normal left/right effect holds for them both. I view this extra credit for righties as tantamount to giving credit for ability to play the banjo. I mean, if I had a clone, the same as me in every way, except he could play the banjo and I couldn't, that would make him a more interesting guy than me, no? Sure. What does playing the banjo have to do with winning baseball games? About the same amount as being right-handed.

Seriously, being a right-handed hitter in baseball is a small handicap, just as being unable to hit home runs is a handicap, and having an 85 mph fastball is not as good as having a 90 mph fastball. It is a great deal like giving Muggsy Bogues extra credit for being 5'3". That certainly hurts his stats, so why don't we adjust for it? Because it's a fact of life that these things are disadvantages, and the goal of baseball is to win games, not to look good.

Here is an example of a biased man who manipulates the numbers in this way. Giving Jim Rice 73% of his PAs vs. lefties is stupid, because lefty pitchers do not account for 73% of the plate appearances in baseball.

The Fallacy of the Ecological Fallacy

From time to time, someone who has a background in formal statistics will claim that applying various measures tested at the team level (usually a run estimator) to individual players is falling prey to the Ecological Fallacy and is thus invalid.

Since I do not have a formal statistics background, it may be hazardous for me to talk about something that I don’t fully understand. But I can tell you that to the extent that I understand the ecological fallacy, the idea that it applies to individual runs created estimates is hokum.

According to this link, the ecological fallacy occurs when “making an unsupported generalization from group data to individual behavior”. They then use an example of voting. One community has 25% who make over $100K a year, and 25% who vote Republican. Another has 75% who make over $100K and 75% who vote Republican. To use this data to conclude that there is a perfect correlation between individuals voting Republican and making over $100K would be the ecological fallacy. In fact, they show how the data could be distributed so that the correlation between individuals voting Republican and making over $100K is actually negative.
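The voting example can be reproduced with a small constructed data set. The arrangement below is my own (not necessarily the one used at the link) and assumes two communities of 100 people each; it matches the stated percentages while making the within-community relationship negative.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def community(rich_rep, rich_other, poor_rep, poor_other):
    """Build a list of (earns_over_100k, votes_republican) 0/1 pairs."""
    return ([(1, 1)] * rich_rep + [(1, 0)] * rich_other
            + [(0, 1)] * poor_rep + [(0, 0)] * poor_other)

# Hypothetical arrangement: 25%/25% in town A, 75%/75% in town B.
town_a = community(rich_rep=0, rich_other=25, poor_rep=25, poor_other=50)
town_b = community(rich_rep=50, rich_other=25, poor_rep=25, poor_other=0)

def shares(people):
    """Fraction earning over $100K and fraction voting Republican."""
    rich, rep = zip(*people)
    return sum(rich) / len(people), sum(rep) / len(people)

def individual_corr(people):
    rich, rep = zip(*people)
    return pearson(rich, rep)

# Community-level correlation, using only the two pairs of percentages.
(rich_a, rep_a), (rich_b, rep_b) = shares(town_a), shares(town_b)
group_level = pearson([rich_a, rich_b], [rep_a, rep_b])

within_a = individual_corr(town_a)
within_b = individual_corr(town_b)
pooled = individual_corr(town_a + town_b)

print(group_level, within_a, within_b, pooled)
```

With this arrangement the community-level correlation is perfect, the individual-level correlation inside each community is negative, and the pooled individual-level correlation is zero. The percentages alone tell you nothing about which of these holds.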

People will then go on to claim that since Runs Created methods are tested on teams, it is wrong to apply them to individuals and assume accuracy. It is true that multiplicative methods like Runs Created and Base Runs make assumptions about how runs are created that are true when applied to teams but cannot be applied to individuals (the well-documented problem of driving yourself in; Barry Bonds’ high on base factor interacts with his high advancement factor in RC, but in reality interacts with the production of his teammates). It is also true that regression equations have many potential pitfalls when applied to teams, let alone taking team regressions and applying them to individuals. However, these limitations are well known by most sabermetricians (although some stubbornly continue to use James’ RC for individual hitters).

The ecological fallacy claim, though, is extended by some to every run estimator that is verified against team data. The claim is that there “need not be little to no connection between team-level functions and player-level functions”. I also saw a critic point out once that run estimators did not do a good job of predicting individual runs scored.

My retort was that the low temperature today in Mozambique did not do a good job of predicting individual runs scored either. To assume that the team runs scored function and the individual runs scored function are the same is to be ignorant of the facts of baseball. A walk and a single have an equal run-scoring value for an individual, and a home run will always have an individual run-scoring value of 1. This is not true for a team, because, except in the case of the home run, it takes another player to come along and drive his teammate in. In the team case, all of these individual stats are aggregated. The home run by one batter not only scores him, it scores any teammates on base. And therefore the act of scoring runs, for a team, incorporates advancement value as well. A single will create more runs, in average circumstances, than will a walk.

Therefore, when we have a formula that estimates runs scored for a team, it does not estimate the same function as runs scored for a player. It instead approximates another function that we choose to call “runs created” or “runs produced” or what have you. Now it could be claimed, I suppose, that the runs created function cannot be applied to individuals. But why not? If a double creates .8 runs for a team, and a hitter hits a double, why can’t we credit him with creating .8 of the team’s runs? All we are doing is assigning what we know are properly generated coefficients for the team to the player who actually delivered them. Or you can look at it, in the case of theoretical team RC, that we are isolating the player’s contribution by comparing team runs scored with him to team runs scored without him.
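The crediting argument is just the additivity of a linear estimator, which a short sketch can show. Only the .8 runs per double comes from the example above. The other hit weights are my assumption, derived by scaling from the Linear Weights ratios quoted earlier (single = 1, double = 1.7, triple = 2.2, HR = 3.15). The player lines are invented, and a real estimator would also include walks, outs, and other events.

```python
from collections import Counter

DOUBLE_RUNS = 0.8                      # from the example in the text
SINGLE_RUNS = DOUBLE_RUNS / 1.7        # assumption: scaled from the quoted ratios
WEIGHTS = {
    "1B": SINGLE_RUNS,
    "2B": DOUBLE_RUNS,
    "3B": 2.2 * SINGLE_RUNS,           # assumption, same scaling
    "HR": 3.15 * SINGLE_RUNS,          # assumption, same scaling
}

def runs_credited(events):
    """Linear estimate: sum of (event count * run weight)."""
    return sum(WEIGHTS[event] * count for event, count in events.items())

# Hypothetical hitters (not real players), hits only for brevity.
players = {
    "hitter_a": {"1B": 100, "2B": 30, "3B": 5, "HR": 20},
    "hitter_b": {"1B": 120, "2B": 25, "3B": 2, "HR": 8},
    "hitter_c": {"1B": 90, "2B": 40, "3B": 1, "HR": 35},
}

# Aggregate the individual lines into a team line.
team_totals = sum((Counter(line) for line in players.values()), Counter())

team_estimate = runs_credited(team_totals)
player_credits = {name: runs_credited(line) for name, line in players.items()}

print(round(team_estimate, 1), round(sum(player_credits.values()), 1))
```

Because the estimator is linear, the individual credits add up to the team estimate, so crediting a hitter with .8 runs for his double is simply splitting the team total among the players who produced the events.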

Furthermore, the individual runs created function and the team runs scored function are the same function. They have to be. Who causes the team to score runs, the tooth fairy? In the case of the voting situation which was said to be the ecological fallacy, you are artificially forming groups of people that don’t actually interact with each other. I can vote Republican, and you can vote Republican, but we’re not working together in that. You can vote Democrat and I can still vote Republican; our choices are independent. Then you make this group that voted Republican, and look at their income, and yes, you can reach misleading conclusions.

The point I’m trying to make is that voting is not a community-level function, and therefore it is wrong to attribute the community-level data pattern to individuals. People vote as individuals, not as communities. But scoring runs is a team-level function. People create runs as teams, each contributing. If we use a different voting analogy, that of the electoral college, people cast electoral votes as states. And therefore we can break down how much of the electoral vote of Montana each citizen was responsible for (one share of however many if they voted for the winning candidate, zero if they did not). And that’s what we are doing by looking at individual runs created.

I think the problem, and I don’t mean this to apply to all statisticians who dabble in sabermetrics, but to some, particularly those who don’t have a strong traditional sabermetric background to go along with their statistical knowledge, is that they tend to take all of the things they know can often happen in statistical practice and apply them to sabermetrics, without seeing whether the conditions are in place. In the same way, they will use statistical methods like regression when they are not necessary. If you are studying a phenomenon that you don’t have a good theory on, then regression can be a great tool. But if you are studying a baseball offense, you’re better off constructing a logical expression of the run scoring process like Base Runs or using the base/out table to construct Linear Weights. You don’t need a regression to ascertain the run values of events--baseball offenses are complex, but they are not nearly as complex as many of the other phenomena in the world.


Explanation of Ecological Fallacy


Ec. Fallacy claim applied to RC
