Monday, December 17, 2007

Providing Zero Insight, but Filling Space Nonetheless

In Bill James’ early Abstracts, he had a little box entitled “Talent Analysis” for each team that estimated the composite value of all of its players (as estimated by the Approximate Value method), what percentage of it was acquired through various means (trades, free agency, development), what percentage of it fell into defined age categories (“young”, “prime”, “past-prime”, and “old”), and how much total value had been produced by the team’s farm system. This article is going to be in the spirit of those, but it looks solely at the offensive players for 2007, and without regard to how players were acquired, only whether they were products of the farm system or not. Also, I have omitted the age breakdowns, although I may look at that in the future.

I should note that I do not consider my examination here to be particularly insightful, and certainly it is not unique. This is the kind of stuff that I sometimes figure myself and keep to myself, but since I still can’t (or, more accurately, don’t want to go through the effort to) access my pre-written articles, I have to fill this space up with something.

I also would be remiss if I did not point out that this strain of analysis did not die with the Abstract, but is in fact being practiced in other places, most notably by Steve Treder at the Hardball Times. So not only am I just ripping off James’ idea, I’m covering old ground.

Now, for a long list of caveats. One is that I have only considered hitters, and furthermore only hitters with 300 PA. So there are guys who were injured or who were part-time players or what have you and really did have value that are being ignored here.

Second is that I have used my own personal WAR figures, except I have multiplied them all by seven, and put a floor of zero on them. I did this because I didn’t want to get caught up in the numbers as WAR, but wanted something with a direct linear relationship to WAR. The result is a number that looks kind of like a fantasy dollar value--it's not, don’t go out and bid $65 for ARod in your fantasy league because he gets 65 points here, but the scale at least resembles fantasy dollar values.
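A minimal sketch of that conversion in Python, assuming nothing beyond what is described above (multiply WAR by seven, floor at zero). The function name `war_to_points` is made up for illustration, and the WAR inputs in the usage lines are arbitrary values, not any player's actual figure.

```python
def war_to_points(war: float, scale: float = 7.0) -> float:
    """Convert WAR to the point scale: scale linearly, then floor at zero."""
    return max(0.0, scale * war)


# Arbitrary illustrative inputs (not real player figures).
print(war_to_points(3.0))   # a 3-WAR season lands at 21 points
print(war_to_points(-1.2))  # below-replacement seasons are floored at 0
```

Going the other way, dividing a point total by seven recovers WAR for anyone who was not floored at zero, so 65 points works out to a little over 9 WAR.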

If you read all of my stuff, you might be thinking “That’s awfully hypocritical, considering he doesn’t like unitless numbers like OPS or EQA”. Duly noted. However, the distinction as I see it is that WAR*7 has a direct, one-step relationship to a meaningful unit (just as Win Shares does, in theory). OPS needs an addition and a multiplication to be a decent run estimator, while EQA differences and ratios are unitless.

Third, I did not factor in defensive value, so everyone is assumed to be an average fielder at his position, thus making Hanley Ramirez the most valuable player in the NL, which I obviously don’t agree with.

Fourth, the system for classifying the producing organization of each player is not optimal. I credited each player to the franchise for which he made his major league debut. Obviously, players often change hands in the minors, and thus you might want to credit Grady Sizemore to the Nationals rather than the Indians. First major league organization is easy to do, though, and I don’t think it’s too much worse than going by signing organization. I believe that in Treder’s analysis, he tries to identify the organization most responsible for the player’s development, which could be either. This is a better approach, but I kept it simple here.

Fifth, the method of assigning players to teams was the same I used in my end of season stat reports, which means each player is credited to just one team even if he played a significant amount with two teams (e.g. Saltalamacchia, whose name I find easier to spell than the star first baseman he was traded for).

Finally, I already slipped this in, but I only considered offensive players. So pitching, both on the team and produced by the system, has been completely ignored.

There are a number of different ways to look at the data, and what I am going to do is discuss a few of the interesting things, and then post a big chart at the bottom with all of the data.

Looking at the total (TOT) is not very interesting, because on the team level this just points out teams with talented position players. More interesting is the HG column, which measures homegrown talent retained by the team this year (actually, it includes anyone who made their major league debut with the team and played for it in 2007, so Sammy Sosa is considered homegrown for the Rangers despite the fact that he had been gone from the organization for more than fifteen years).

The leader in homegrown value was the Marlins, with 193 points. The next four teams on the list (Brewers, Phillies, Rockies, Braves) are all Neanderthal League outfits as well, while the Yankees top the AL, with a wide gap from those six teams back to the Indians.

The fact that the Yankees have a lot of homegrown value (James called it talent, and I may slip up and use that term too, but I want to stress that this is a measure of 2007 value and not talent) at first seems surprising, but consider Jorge Posada, Derek Jeter, and Robinson Cano all contributed significant value. Their total is boosted by the presence of Hideki Matsui, who in reality is a free agent signing, but here is treated as a Yankee product since he debuted in the majors with them.

There are a lot of good teams at the top of the homegrown list, but there are some pretty solid teams near the bottom too. One of those is the Cubs, with just 13 points of homegrown value (all contributed by Ryan Theriot). They are beaten out by the Giants, though, whose seven points were all contributed by Pedro Feliz.

A logical jump is HG%, which is the percentage of total value produced by the team’s system. As you would expect, this has a strong correlation with the raw HG figure. Milwaukee led the way with 94% homegrown, with Florida, Minnesota, Colorado, and Philadelphia rounding out the top five. The Brewers got just 12 points from imported players (Johnny Estrada and Kevin Mench).

On the flip side, the Cubs (9%) and the Giants (7%) are the trailers. Teams as a whole got 55% of their offensive production from players they had developed.

Moving on, we have the “PROD” column, which measures the amount of value produced by the system. The leaders are the Indians with 278, just edging out the Marlins’ 273. I’ll look at the Tribe more closely in a bit, but Florida has produced six 20-point players (~3 WAR each), of which they retain(ed) five (Miguel Cabrera is now a goner). Only Edgar Renteria is gone. This may seem surprising considering the fire sales they have held, but a lot of the players they gave up in those trades were imports to begin with (Sheffield, Alou, Lowell), or no longer are around to produce any value.

The mean production is 161, pegging the Astros (162) and the Mets (158) as the most average organizations. The standard deviation is 66. I say this to set up that the z-scores range from -1.77 (Cubs) to +1.77 (Indians), with one exception that sits another half a standard deviation (-2.26) away from any other team. That team is the Giants.
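The z-scores can be checked with a few lines of Python. This is only a sketch: it uses the rounded mean (161) and standard deviation (66) quoted above rather than the full 30-team data, and the PROD totals are the ones that appear in the text. The names `MEAN_PROD`, `SD_PROD`, and `z_score` are made up for illustration.

```python
# Rounded league figures as quoted in the text (not recomputed from all 30 teams).
MEAN_PROD = 161
SD_PROD = 66

# PROD totals quoted in the article for five teams.
prod = {"Indians": 278, "Astros": 162, "Mets": 158, "Cubs": 44, "Giants": 12}


def z_score(value: float, mean: float = MEAN_PROD, sd: float = SD_PROD) -> float:
    """Number of standard deviations a value sits above or below the mean."""
    return (value - mean) / sd


z = {team: round(z_score(points), 2) for team, points in prod.items()}

# Print from the worst system to the best.
for team, score in sorted(z.items(), key=lambda item: item[1]):
    print(f"{team:8s} {score:+.2f}")
```

The Giants come out roughly half a standard deviation below the Cubs, which matches the gap described above.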

When you see that the Cubs have only produced 44 points (around 6 WAR) of value, you can see that this is pretty bad. Their most notable contribution came from Brendan Harris (22), with the aforementioned Theriot next and just Corey Patterson and Ross Gload to chip in. But the Giants are on a whole different plane with a pitiful 12. Only two San Fran products batted 300 times with positive WAR this season--Feliz (7 points) and Yorvit Torrealba (5). I realize that this analysis overlooks a lot, especially pitchers, of which the Giants have a promising crop and a few good exiles out there. But it still strikes me as absurd that they rewarded Brian Sabean with a contract extension. Sabean got just two mediocre (for a playoff team) playoff teams out of four seasons of the greatest offensive force in baseball history, and his team has not been a real factor for a few years now. He has built an impossibly old team (although in his defense he has traded no prospects of offensive value to get it). You’re going to tell him “Nice job, we’d like another five years of this?”

Moving on, I have a column “%Retain”, which is the percentage of value produced by the system retained by the system (HG/PROD). The Rockies lead the way at 91%--only the Juans, Pierre and Uribe, are no longer members of the organization. They are followed by Philadelphia, Cincinnati, Detroit, and Milwaukee. The Tigers’ system has not produced much (61), but they retain 47 of it, and I doubt they’re too broken up about not having Juan Encarnacion, Frank Catalanotto, and Nook Logan. The major league average is 55%, which if you think about it makes sense--it has to be the same as the HG% on the league level.
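Here is a rough Python sketch of why the league-wide retention rate has to equal the league-wide HG%. Every player's points are counted once in his current team's TOT and once in his debut team's PROD, so the two league totals are the same number. The Tigers line uses the figures quoted above; the three-team league is entirely hypothetical (team names "A", "B", "C" and all point values are invented), and the helper name `pct` is made up for illustration.

```python
from collections import defaultdict


def pct(part: float, whole: float) -> int:
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


# From the text: the Tigers' system produced 61 points and they retain 47.
tigers_retain = pct(47, 61)
print(tigers_retain)  # %Retain = HG / PROD

# HYPOTHETICAL toy league: (current team, debut team, points). Invented numbers.
players = [
    ("A", "A", 30),
    ("A", "B", 10),
    ("B", "B", 20),
    ("B", "C", 25),
    ("C", "A", 15),
]

tot = defaultdict(int)   # value on each team's 2007 roster
prod = defaultdict(int)  # value produced by each team's system
hg = defaultdict(int)    # value both produced and retained

for current, debut, points in players:
    tot[current] += points
    prod[debut] += points
    if current == debut:
        hg[current] += points

league_tot = sum(tot.values())
league_prod = sum(prod.values())
league_hg = sum(hg.values())

# Same numerator, same denominator, so the two league rates must match.
print(pct(league_hg, league_tot), pct(league_hg, league_prod))
```

The team-level numbers can differ wildly (team C in the toy league has no homegrown value at all), but the league-level figures cannot.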

On the other side of things, the White Sox stick out like a sore thumb with just 12% retention (the Padres are next at 26%). Magglio Ordonez, Carlos Lee, Aaron Rowand, Mike Cameron, and Frank Thomas are all 20 point players who have taken their services elsewhere by whatever means, while their most valuable retained product is a Japanese exception, Tadahito Iguchi. Josh Fields (10) is the most valuable true White Sock standing.

The “#” column gives the total number of players in the sample produced by each team, and “per #” is the per player average value produced by a system. The top three in producing players are Atlanta with 14, then Cleveland and Florida with 13. The average is eight and a third. The bottom three are San Francisco with 2, the Cubs with 3, and Detroit and Baltimore with five. Of course these lists are similar to the value produced lists.

In terms of value per farm product, Florida leads the way at 30, followed by the Yankees (27), Seattle, Philadelphia, and Colorado (26). Detroit has just 12 per player, Chicago 11, and San Francisco 6. So again, not only are the Cubs and Giants last in total value and players produced, the players that they have come up with are the least valuable.

You can play around with a lot of different combinations of the columns, but the last I will present is “Surplus”, which is the raw difference between Total and Production. A positive surplus means that the team had more value in 2007 than its system had produced. The average of course is zero, with the Astros (-4) being the closest. They, most notably, have lost Bobby Abreu, Luis Gonzalez, and Kenny Lofton, but they have also brought in Carlos Lee, Mark Loretta, and Mike Lamb with offsetting value (at least for 2007--the three that got away would have been a much bigger drain in, say, 2001).
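As a quick sketch of the arithmetic in Python: surplus is just TOT minus PROD, and because each player's points show up once in some team's TOT and once in some team's PROD, the surpluses across the league have to sum to zero. The function names are made up for illustration. The Indians line uses figures quoted in this article (154 total, 278 produced); the Tigers and Astros totals are derived by me from the quoted PROD and Surplus values rather than read off the chart.

```python
def surplus(total: int, produced: int) -> int:
    """Value on the 2007 roster minus value produced by the team's system."""
    return total - produced


def implied_total(produced: int, surplus_value: int) -> int:
    """Back out TOT when only PROD and Surplus are quoted."""
    return produced + surplus_value


# Quoted directly in the article: Indians TOT 154, PROD 278.
indians_surplus = surplus(154, 278)

# Derived (an assumption, not read from the chart): PROD and Surplus are quoted.
tigers_total = implied_total(61, 149)
astros_total = implied_total(162, -4)

print(indians_surplus, tigers_total, astros_total)
```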

The team with the biggest surplus (149) is Detroit, which has imported all of its notable offensive players except Curtis Granderson (Ordonez, Guillen, Polanco, Sheffield, Rodriguez). The flip side of the coin is their divisional foes, Cleveland. The Indians are short 124 points of value. You could make a pretty good team out of Indian exports (C: Josh Bard 1B: Sean Casey 2B: Brandon Phillips 3B: Kevin Kouzmanoff LF: Manny Ramirez CF: Coco Crisp RF: Brian Giles DH: Jim Thome). Even without a shortstop (and John McDonald is probably at least close to replacement level when you consider his defense), this team would have 157 points, which would rank it eighth in baseball (just ahead of the real Indians at 154).

Which feat do you find more impressive: that the Tigers have built a playoff contender on the basis of players brought in from elsewhere, or that the Indians have built a playoff team despite losing all of those players? It helps, I guess, that both teams have significant home grown pitching (on one hand Verlander, Zumaya, Bonderman, Robertson; on the other, Sabathia, Carmona, Betancourt).

Here’s a frivolous question for you: which team possessed the most value produced by another team? My off the cuff guess is the Yankees from the Mariners, on the strength of Alex Rodriguez. And that is indeed the answer. However, the second place finish is based on three players instead of just one--the Padres have 58 points of value produced by the Indians in Josh Bard, Kevin Kouzmanoff, and Brian Giles.

Here is the complete chart, which I sorted by total value produced:
