Walk Like a Sabermetrician: January 2007

Tuesday, January 30, 2007

Career WAT Data, 1961-1972

Another small group of pitchers to consider using the Neft/Cohen period breakdowns:

In The Politics of Glory, Bill James discussed the similar W-L records of Drysdale and Pappas. As you can see, Drysdale had two more losses and pitched for teams with a W% 25 points higher. Of course, as always, there are other factors at play, and I don’t think there are many people who consider Pappas superior to Drysdale, and rightfully so. Another interesting pair that James discussed is Hunter and Tiant. Tiant had five more wins but six more losses, and pitched for teams ten points worse. He comes out with a three win advantage through the Oliver thinking, which is what James used, but they are essentially even in the Deane methodology. How do they turn out with the Wood approach?

Tiant and Hunter come out as equal in percentage, but Tiant’s extra decisions allow him to hold a slight advantage in the value categories. Still, the two are very evenly matched. Marichal’s W%, viewed in context, is remarkable. Among 200 NW pitchers that we’ve seen so far, only Grove, Ford, Young, Johnson, Alexander, and Mathewson rank ahead of the Dominican Dandy. Marichal’s gap in NW% over Gibson is bigger than Koufax’s lead over Marichal, and Koufax had 133 fewer decisions than Marichal.

Saturday, January 27, 2007

Delusional AP Writer of the Day

From the AP's Alan Robinson:

Utley's $85 million, seven-year contract was settled just before the Pirates started talking to Sanchez about a two-year contract. While Utley hits for more power -- he had 32 homers to Sanchez's six -- Sanchez is a better all-around hitter and his 85 RBIs last season compare favorably to Utley's 102.

I'm not exactly sure how 85 RBIs are supposed to compare "favorably" to 102. Context could possibly make that the case; but Mr. Robinson does not deign to tell us how.

As for Sanchez being a better all-around hitter:
Sanchez: 613 PA, 344/377/473, 94 RC, 6.2 RG, .182 SEC
Utley: 721 PA, 305/364/521, 119 RC, 6.6 RG, .310 SEC

Uh-huh. And of course, career-wise:
Sanchez: 313/352/428
Utley: 290/362/509

Do they have editors at the AP?

Friday, January 26, 2007

Tilting at Windmills (Or How I Learned to Stop Worrying and Love the B10 Title)

The Ohio State baseball team will open up its season in a little over a month (Feb. 23) in Tampa, playing James Madison. With the recent release of the roster and a season outlook at ohiostatebuckeyes.com, it is high time to do my own preview of the upcoming campaign.

The Buckeyes are coming off a season in which they finished third in the Big Ten in both the regular season and the tournament. At OSU, it is not really a successful season unless you come away with a B10 title of some sort, but it was a pretty good team. This year’s team will return, essentially, the entire pitching staff intact, and that is a good thing, as OSU paced the conference in runs allowed last year. In fact, the Buckeyes were also tops in runs scored, runs created, and runs created allowed. This sweep of major categories makes it a little disappointing that third was all that resulted, but one could argue that the team played better than that.

Unfortunately, two key pieces of the offense have moved on, both drafted in the first ten rounds (3B Ronnie Bourquin went to Detroit in the second, while SS Jeddidiah Stephen went to Baltimore in the eighth). Most of the other key offensive players are back, though, and so the Bucks look on paper to have a formidable squad.

Behind the plate, junior Eric Fryer will be in his third year as the starter, and he is a fine hitter (+21 RAA), batting third most of the time a year ago. He will occasionally play first or DH to get a rest from catching, but OSU needs his bat in the lineup. That will be a little more difficult to do in 2007 as his backup, Josh Hula, transferred. However, Hula did not hit well at all (-11 RAA), so redshirt freshman Nick Stepanovich should be able to step in as the backup backstop. Redshirt freshman Shawn Forsythe could be an option as well.

At first base, sophomore ex-catcher Justin Miller took over the job during the season last year and would appear to come into this year with the position locked down. Miller was not outstanding last year (-5 RAA), but one would assume that freshmen may have the most room for improvement. Senior Kris Moorman will back up the two infield corner positions.

At second base, senior Jason Zoeller will return, bringing his +11 RAA and surprising power (second on the team in ISO, .202). Zoeller is not a great defensive player, but could play shortstop if the need arose. Apparently, the inside track at shortstop belongs to true freshman Cory Rupert, whom I have never seen play. Junior Tony Kennedy will be the key middle infield backup--he rode a fluky .378 BA in 41 PA to an ok RG figure last year, but the small sample size and lack of secondary offense make it difficult to get too excited about his bat. He also may not have the arm to play short consistently and so Zoeller could slide over to short with Kennedy at second if Rupert is unable to win the job. Redshirt freshmen Ben Curran and Ben Toussant will also be in the mix for time.

At third base, the loss of B10 MVP Bourquin leaves a large gap. The first crack at filling it may fall to junior Chris Macke, who hit fairly well in 2005 when he was forced to take the position when Bourquin injured his thumb. Macke only got 14 PA last year, though, so it seems the coaching staff is not sold on him. I believe that third base could easily wind up in the hands of true freshman Brian DeLucia, the younger brother of mound ace Dan.

In the outfield, two of the three starters return. In left, senior Jacob Howell will get the nod, with junior Matt Angle in center. These two will bat second and lead off respectively, and did a fine job of it last year. Both are fast and can bunt for a hit, but they also do a good job of getting on base (.448 and .449 OBAs respectively). Howell’s was fueled by his .402 average, but even with a drop off he will be a solid contributor at the top of the order; he was also hampered by hamstring problems a year ago. In right field, Wes Schirtzinger, who would have been a fifth-year senior, apparently decided to forego a final season, and since he only hit 257/321/296 last year, his production should be easy to replace. One option is sophomore JB Shuck, the reigning B10 freshman of the year, primarily for his mound work. Shuck can play either outfield corner or first, and was a fairly average bat last year (+2 RAA). My guess is that he will DH a lot when not pitching. Near the end of the season last year, he was allowed to bat for himself in games that he worked; I would expect to see this continue in 2007.

Other outfield options include sophomores Michael Arp (-1 RAA in 52 PA) and Jonathon Zizzo (-6 in 83). Redshirt freshmen Chris Griffin and Zach Hurley could earn playing time as well.

At DH, JB Shuck will get a lot of time; senior Adam Schneider has regressed since his freshman year, and doesn’t have a lot of defensive value (he is listed as a catcher on the roster), but can really rake left-handed pitchers. Outside of them, Arp or Zizzo will likely fill the spot. True freshmen who may or may not be in the mix include catchers Nathan Grove and DJ Hanlin; infielders Cory Kovanda and Matt Streng; and outfielders Brad Brookbank and Ryan Dew.

The entire starting rotation returns intact, anchored by senior lefty Dan DeLucia. DeLucia led the Bucks at 10-2, 108 IP, and a 3.67 RA (+27 RAA). He is not overpowering, but has good control. He was slightly better as a sophomore, and his younger brother Brian could wind up manning the hot corner for OSU. Junior lefty Cory Luebke will be the #2 starter, coming off a solid season (4.34 RA, +15 RAA in 85 innings). He completed seven of his thirteen starts, as he gets the call in one of the Saturday seven inning doubleheader games. He was drafted by the Rangers in the 22nd round, but elected to return to school. He is a cousin of OSU quarterback Todd Boeckman.

The third spot goes to sophomore Jake Hale, who doesn’t throw as hard as you would envision a 6’7” right-hander to, but has a lot of frame to fill out. Hale had a 4.92 RA, but his 4.38 eRA is better, and he was +7 runs regardless. I would not be surprised to see him take a big step forward this year, but that is just conjecture. Aforementioned sophomore JB Shuck will likely be the fourth starter, coming off a 4.56 RA, 79 IP, +12 freshman campaign. Shuck’s performance last year was valuable, but was overrated by casual observers, as only 55% of his runs allowed were earned (versus 76% for the team as a whole); his 2.51 ERA is more sparkling.

The Buckeyes’ rotation should be very solid; however, the bullpen is much less so. Luckily, the bullpen is often not that big of a factor in Big Ten play. Junior closer Rory Meister struggles with his command (28 walks in 33 IP), but also blows batters away, striking out 43. While his save situations were often difficult to watch, he wound up with a 4.36 RA, albeit with a 5.76 eRA. On the other hand, Meister’s balls in play fell in for hits at a 38.2% clip, so one would expect that to fall.

The other key reliever is senior Trey Fausnaugh, who just hasn’t pitched that well the last two years after emerging as the closer midway through his freshman campaign. Fausnaugh was crushed for a 6.11 RA and 8.13 eRA last year, and a remarkable 40.9% hits in play (remarkable in isolation; not so much in light of the fact that he only pitched 28 innings). It is difficult to envision Fausnaugh recapturing his freshman form at this point, but if he can, then Ohio State will have a solid pair of relievers.

Would-be junior (and former Spanish classmate of the author) Dan Barker transferred; he had served admirably as the long reliever for two years, but I would guess that he was disappointed at being passed by two freshmen for the open rotation spots last year. That leaves sophomore Josh Barerra as the next man out of the pen; he was average as a freshman, but did strike out 36 batters in 38 innings, and provides a left-handed option. The only other pitcher returning who pitched last year is sophomore lefty Matthew Selhorst, who pitched in just two games and was injured warming up in the bullpen during an April game. He did not pitch in the Scarlet and Gray World Series, so I am unaware if he is ready to go or not.

Beyond them, only sophomore Darren Sizemore has any collegiate experience; he led Georgetown with 13 starts last year (7.22 RA with 59 Ks in 86 IP), so the transfer figures to be in the mix. There are four redshirt freshmen: lefties Eric Best and Brad Hays, righties Taylor Barnes and Jake Weber. Among true freshmen, North Carolinian RHP Eric Shinn, Pennsylvania LHPs Theron Minium and Josh Edgin, and RHP Jared Strayer worked in the Scarlet and Gray World Series, while righty Dean Wolosiansky did not. The official preview singles out Hays, Best, Edgin, and Minium as the top bullpen candidates.

The non-conference schedule is very weak, so it will be difficult for OSU to reach the NCAA Tournament without a Big Ten Tournament win. Some fans are upset about this, but I decided to stop tilting at windmills a long time ago. Even with a good non-conference slate, the odds of an at-large bid are slim. While it might benefit the team to play stiffer competition, I’m not going to get worked up about it.

Even if you make the NCAA Tournament, there is essentially no chance of winning the College World Series. Just making it to Omaha would be a huge achievement for the program. But the fundamental thing that the Ohio State baseball team should shoot for in my opinion is the Big Ten regular season title. If the idiots that run the Big Ten and the NCAA think that having a double elimination conference tournament to determine the conference representative is the way to go, they are free to do so. But that doesn’t mean that I have to care about it. Don’t get me wrong, I’d love for the Bucks to win the tournament, but the regular season is more telling of the true quality of the team.

Anyway, the much-maligned schedule opens in Tampa, with James Madison, a doubleheader v. Kansas State, and Seton Hall. A week later (March 2), OSU will play three straight days in Clearwater against Georgetown, Duquesne, and Lehigh. The next week it will be Jacksonville, to play at North Florida, then against Western Michigan and UConn. Finally, on March 17th, the annual spring break trip will see the Bucks take the field against Bucknell, Jacksonville State, Dartmouth, Yale (twice), Northwestern (for the second straight year, the Buckeyes will not face the Wildcats in the regular season, as this is a non-conference game), and Harvard.

On March 28, Bill Davis Stadium will open the gates for the 2007 season as Toledo appears in the home opener for the third year in a row. That weekend, Iowa comes in. The B10 weekends that follow are @Illinois, Michigan, Indiana, @Purdue, Michigan State, @Minnesota, and @Penn State. With the B10 home schedule heavily tilted to the beginning of the season, the Buckeyes will play their last conference home game on May 6th.

Wednesday games will feature the aforementioned home opener with Toledo, Cleveland State, Miami, Ball State, Xavier, and Akron. North Florida will reciprocate Ohio State’s visit when they close out the home schedule May 15th and 16th in a rare non-conference series.

The Buckeyes certainly figure to be in the Big Ten mix. One has to anticipate that the offense will not be as effective as a year ago, and will lack power even more acutely. But the pitching projects to be outstanding again, and with some better breaks than a year ago, Bob Todd could easily be celebrating yet another Big Ten crown. Time will tell, and baseball will come back, but today there is snow on the ground in Columbus, so that time remains a bit in the future.

Monday, January 22, 2007

You Don't Have to Live Like a Refugee

The current incarnation of the Strategy and Sabermetrics forum frequented by myself and others has been shut down. While the presence of other sites like Inside the Book and BTF had made the forum less busy than it had been in the past, it would be good if the posters from that community had a place to go.

There are several choices; there is the new Scout board run by the same people as the current board; there is the Baseball-Fever forum; and there is a new, independent FanHome.com. Of these, I prefer the last--the Scout people have yanked the boards around a few times already and don't seem particularly concerned about archiving posts and other important features. The Baseball-Fever board has some good posters, but the ratio of good posts to inane posts is too low for my liking. Serious discussions would be cluttered with people who know little about sabermetrics. I'm not trying to put down those who are eager to learn, and it's good that there's a forum that largely caters to them, but it's not my cup of tea.

The new FanHome, which I have linked on the side of the page (or click here), seems to be the best option, as it is relatively unpopulated at this time, and it looks like the old board, if that carries any sentimental value for you :).

Anyway, I'm not affiliated with any of them, and I'm not trying to tell people where they should post or if they should post or anything of the sort. Do what you will.

Career WAT Data, 1946-1960

Actual W-L along with Oliver and Deane WAT:

I bet if you polled knowledgeable baseball fans they wouldn’t guess that Bob Lemon’s teammates had a higher W% than Whitey Ford’s. I certainly wouldn’t have guessed it. Ned Garver is included here not because he was a great pitcher, but because his teammates were so awful that it is worth quantifying.

NW-NL with Wood/Patriot WAT and WCR:

We lose a second 300 game winner, as Gus Wynn drops eleven neutral wins to 289. Robin Roberts inches six games closer to the magic threshold, but is still eight short. Dizzy Trout gets the dishonor of being the first pitcher with a sub-.500 NW% that I figured. You may wonder why I included him despite an unimpressive raw record of 170-161. Well, he ranks fairly well in methodologies based on runs allowed. As of Total Baseball VI, circa 1998, he had an ERA+ of 124, tied for fortieth all-time, and a TPI of 34.5, good for twenty-sixth place. Again, don’t let these (NW%, WAT, etc.) figures influence your opinions of pitchers’ historical standing too much. I myself would put a lot more weight on the run-based methods than on comparisons of win-loss record.

Monday, January 15, 2007

Career WAT Data, 1920-1945

NOTE: For some reason the charts don't look as nice this time and are tough to read. If you can't read them, click on them and they will come up in full size.

Again, we’ll begin with actual W-L record and percentage, along with Mate and WAT as figured by Oliver and Deane:

Here we have the first pitchers we have encountered who were worse than their teammates, two of them Hall of Famers: Waite Hoyt and Red Ruffing, along with Dolf Luque and Charlie Root. The reason that you can be worse than your teammates and still have a positive WAT figure according to Deane’s methodology is that the method is non-linear, and considering each season separately as I have here produces a different result than figuring the career as a whole. Both Ruffing and Hoyt would have negative WAT(D) if figured for the career as a whole.

Hoyt is an interesting case for other reasons; before doing this study, I had read that Ruffing was the only Hall of Famer with a worse record than his teammates. That is true, if you figure it as most people do. Waite Hoyt’s teams, when he did not pitch, were 2004-1654, for a .548 W%, while Hoyt was 237-182 for a .566 W%. But what I do is weight each season’s Mate by the percentage of career decisions the pitcher earned that year.

I think it is obvious that this is the logically correct way to do it, but I’ll do an example just to illustrate. Suppose there was a pitcher who went 10-10 on an 86-76 team one season, and 1-0 on a 70-92 team the next. His teammates without him would be 86 + 70 - 10 - 1 = 145 and 76 + 92 - 10 = 158, for a .479 W%, while our pitcher is 11-10 for a .524.

But what sense does it make to treat equally a season in which he recorded 20 decisions with one in which he recorded 1? Absolutely none. So I would say that his career Mate is 20/21*(86/162) + 1/21*(70/162) = .526. And so this pitcher is actually worse than his teammates.
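For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two ways of figuring career Mate for the made-up pitcher above. The variable and function names are my own for illustration, and the weighted version follows the calculation exactly as written in the text (it uses each team's full-season record, 86/162 and 70/162).

```python
# Each tuple: (pitcher wins, pitcher losses, team wins, team losses).
# These are the two hypothetical seasons from the example above.
seasons = [(10, 10, 86, 76), (1, 0, 70, 92)]


def win_pct(wins, losses):
    """Winning percentage as a fraction of decisions."""
    return wins / (wins + losses)


# Pooled approach: remove the pitcher's decisions, then lump all seasons together.
mate_wins = sum(tw - pw for pw, pl, tw, tl in seasons)    # 145
mate_losses = sum(tl - pl for pw, pl, tw, tl in seasons)  # 158
pooled_mate = win_pct(mate_wins, mate_losses)

# Weighted approach: weight each season by its share of career decisions.
# As in the text, the seasonal figure is the full team record (86/162, 70/162).
career_decisions = sum(pw + pl for pw, pl, tw, tl in seasons)  # 21
weighted_mate = sum(
    (pw + pl) / career_decisions * win_pct(tw, tl)
    for pw, pl, tw, tl in seasons
)

pitcher_pct = win_pct(
    sum(pw for pw, pl, tw, tl in seasons),
    sum(pl for pw, pl, tw, tl in seasons),
)

print(f"pitcher W%:    {pitcher_pct:.3f}")
print(f"pooled Mate:   {pooled_mate:.3f}")
print(f"weighted Mate: {weighted_mate:.3f}")
```

The pitcher's .524 sits above the pooled .479 but below the weighted .526, which is the whole point of the example.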

In the first edition of the Historical Baseball Abstract, Bill James figured WAT by the Oliver method for a number of pitchers, and listed Career Mate as well. I don’t know why our figures diverge in many cases, but they do. For instance, he lists Hoyt’s Mate as .587, while I say it is .570 by my method and .548 by the incorrect method. Perhaps he was weighting by games or innings or something, I don’t know. Or it could be a data entry error on my end. Anyway, we do agree for a number of pitchers, so I think he was using a logical process, but I’d like to find the cause of this divergence. Anyway, it does appear that he did think about this problem the same way I did, and came to the same (IMO common sense) conclusion.

And the Wood/Patriot methodology results:

Here we lose our first 300 game winner, with Lefty Grove dropping out of the club, while still towering over his contemporaries and putting up the best NW% we’ve yet encountered. As a matter of fact, Grove has the highest NW% for any retired pitcher I’ve checked (limiting it to pitchers with 150 neutral wins--Spud Chandler’s .662 in just 152 decisions is higher), but is currently behind three active major league pitchers--it remains to be seen whether they can maintain their standing as they age (I will save their identities for a later installment, but I would think that you can figure out for yourself who they are, or at least list a small group of current pitchers that they must be drawn from).

Dolf Luque skates on thin ice, narrowly avoiding the dishonor of being the first sub-.500 NW% pitcher I included in my list.

Thursday, January 11, 2007

Career WAT Data, 1900-1919

See the intro to this series for a fuller explanation of the terminology and purpose of this data.

First, here are the actual W-L records of each pitcher, along with Mate, and Wins Above Team as figured by the Oliver and Deane approaches:

Career Mate is figured by weighting each year’s Mate by the percentage of lifetime decisions the pitcher earned in that year. You can see that very few of these pitchers were with bad teams on balance.

A little note on figuring these for a career. Since Oliver and Wood are strictly linear formulas, the result of figuring WAT or WCR is the same if one looks at each year separately and sums them up, or if the Career W% is compared to Career Mate. However, for Deane, the result will not be the same because a different formula is used depending on whether or not W% > Mate. I have figured career NW% for Deane by comparing to Career Mate, but have figured career Deane WAT by summing each year, because I believe this is how Total Baseball does it, and I want to match the published figures. However, my figures do not match TB in all cases, which is likely the result of data entry error on my part, although I certainly tried to be careful and have corrected any errors that I have caught. You can decide whose figures you want to go by.
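To illustrate the linear versus non-linear point, here is a rough Python sketch. The two seasons are invented, the function names are mine, and the piecewise form of the Deane formula is my assumption about how it works (it does reproduce the -3.4 figure for McGinnity's 1905 season quoted in the intro post); treat it as a sketch rather than the published definition.

```python
def oliver_wat(wins, losses, mate):
    """Linear WAT: the gap between W% and Mate, times decisions."""
    decisions = wins + losses
    return (wins / decisions - mate) * decisions


def deane_wat(wins, losses, mate):
    """Assumed Deane form: the gap is scaled by the room available
    above Mate (1 - Mate) or below it (Mate), then halved."""
    decisions = wins + losses
    pct = wins / decisions
    room = (1 - mate) if pct >= mate else mate
    return (pct - mate) / (2 * room) * decisions


# Hypothetical seasons: (wins, losses, Mate for that season).
seasons = [(20, 10, 0.600), (10, 15, 0.450)]

career_w = sum(w for w, l, m in seasons)
career_l = sum(l for w, l, m in seasons)
# Career Mate weighted by each season's share of career decisions.
career_mate = sum((w + l) * m for w, l, m in seasons) / (career_w + career_l)

oliver_by_season = sum(oliver_wat(w, l, m) for w, l, m in seasons)
oliver_career = oliver_wat(career_w, career_l, career_mate)

deane_by_season = sum(deane_wat(w, l, m) for w, l, m in seasons)
deane_career = deane_wat(career_w, career_l, career_mate)

print(f"Oliver: {oliver_by_season:.2f} by season, {oliver_career:.2f} career")
print(f"Deane:  {deane_by_season:.2f} by season, {deane_career:.2f} career")
```

Under these assumptions the Oliver totals agree either way, while the two Deane totals differ, which is why the choice between summing seasons and comparing career figures matters only for Deane.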

Now Neutral Wins and Losses, NW%, WAT, and WCR under the Wood assumptions (list is sorted by WCR). Neutral Wins and Losses assume that the pitcher has the same number of decisions as he actually did, but wins them at his NW% instead of his actual W%:

I don’t want to get too carried away with discussion of larger truths based on these figures, as they are just a very minor part, if any, of how I would go about a full evaluation of pitchers, but a few points are worth noting. Rube Marquard’s W-L record is not particularly impressive for a pitcher of this class. But in the Neutral measures, you can really see that he stands out as sub-par. He is over nine wins v. replacement behind the nearest HOF pitcher (Waddell), and nearly seven wins v. average behind the nearest HOF pitcher (Willis). His NW%, just like his regular W%, is the lowest here among the Hall of Famers. Addie Joss is a guy who ranks ahead of six pitchers in WAT whom he trails in WCR, fitting his short but brilliant career. We don’t kick any of the 300 game winners out of the club, and we don’t add any new members. Mathewson and Plank lose ground for pitching on good teams, while Young and Alexander barely move as they pitched on average teams, and Walter Johnson does well as he pitched for poor teams.

However, the Big Train loses almost seven wins above team versus the Deane approach, as I move the team towards .500 a lot more than Deane does. The traditional methods overstate the effect, be it giving too much credit for those on bad teams or not enough for those on good teams. Miner Brown takes a big hit here, but the Cubs teams he pitched for were probably not very well balanced between offense and defense. I believe he would do better if we adjusted based solely on the quality of offense, and he would also do better in a run/earned run/estimated run evaluation.

Babe Ruth and Joe Wood don’t really meet my criteria for this project, but I was curious about them so they are here. BTW, Wood’s ridiculous 34-5 season in 1912 comes out here as 31-8, good for +16.3 WCR, topped by only a handful of seasons in this period (including Chesbro in 1904, +19.8; Johnson in 1913, +19.5; Walsh in 1908, +18.7; Mathewson in 1908, +16.5; and matched by Cy Young in 1901).

Jack Powell is here as an interesting case, a pitcher with a .489 W% on .462 teams. I didn’t list NW% under the Oliver and Deane assumptions, but Oliver would convert that to .527, while my approach is much more conservative and puts him at .508. You will see that the more conservative approach, in many cases, is not that far removed from a pitcher’s actual record.
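The Powell comparison can be sketched in a few lines of Python. The Wood-style formula is the one given in the intro post (NW% = W% - Mate/2 + .25); the Oliver-style neutral W% of W% - Mate + .5 is my assumption, chosen because it reproduces the .527 quoted above. Function names are made up for the illustration.

```python
def neutral_pct_oliver(pct, mate):
    """Assumed Oliver-style neutral W%: the full gap from Mate, centered on .500."""
    return pct - mate + 0.5


def neutral_pct_wood(pct, mate):
    """Wood-style neutral W%: only half of Mate's deviation from .500 is removed."""
    return pct - mate / 2 + 0.25


# Jack Powell: a .489 W% on .462 teams, per the text.
powell_pct, powell_mate = 0.489, 0.462

print(f"Oliver NW%: {neutral_pct_oliver(powell_pct, powell_mate):.3f}")
print(f"Wood NW%:   {neutral_pct_wood(powell_pct, powell_mate):.3f}")
```

The more conservative formula moves Powell only 19 points from his actual record, versus 38 points for the Oliver-style adjustment.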

I don’t consider this a surprise. If you think about it, when a pitcher is in the game, he is basically half of the team (again, at least while he is actually pitching and not relieved). Even if you pitch with a bunch of stiffs for offensive teammates, you have a great deal of control over whether you win or lose. Likewise, even if you pitch for a good offense, you have to hold up your end of the bargain to put up a good record.

Wednesday, January 10, 2007

Hall of Fame and Steroids

This is a lightweight post, borrowing heavily from other people's ideas, but the hubbub about Mark McGwire and the recent Hall of Fame election really does not interest me. The Hall of Fame, or at least who is inducted in the Hall of Fame, really does not matter to me either.

Now don't get me wrong; I like to amuse myself just like many other baseball fans by rating the players, making lists of the top 10 this or that, setting an all-time Indians starting lineup, and other such pursuits. In fact, sometime in the not too distant future I'm going to pick my top fifty starting pitchers of all-time, just for the heck of it. And I am interested in the question of who SHOULD be in the HOF. The problem is that this question applies not so much to the worthy candidates of today like a Bert Blyleven as it does to a Rube Marquard, who is in the HOF but should not be.

Bill James once wrote something to the effect of "The Hall of Fame has lost the ability to honor a truly great player, they can only dishonor him." And this is the reason why I am uninterested. If Rube Marquard is a Hall of Famer, then what honor is there in allowing Bert Blyleven to be in the same honored company as Rube Marquard? They can only dishonor him by temporarily pretending that he does not belong.

Of course, if it was just Rube Marquard, this would not be a problem. But the Hall of Fame has so many mistakes that it is beyond salvaging. And so the tribulations of the real HOF matter little to me. That said, since the steroids debate has been rekindled, I thought I would link to this nice piece by Russ Roberts at one of my favorite non-baseball blogs, Cafe Hayek, on the issue of Big Mac and steroids and "cheating". He makes some points that I whole-heartedly agree with, and he's much smarter than I am, so he makes them eloquently and ties in Friedrich Hayek.

Tuesday, January 09, 2007

Career WAT Data, Intro

This post is the introduction to a series I will be doing that presents Wins Above Team statistics for each primarily twentieth century Hall of Fame pitcher (except those who were primarily relievers, so Eck is here but Sutter, Fingers, etc. are not) and some other assorted pitchers that I have chosen. Some of these pitchers were chosen by me for idiosyncratic reasons--it should not be construed as an attempt to identify the top 100 pitchers or something, just those that I was interested enough in running the numbers for. It will be broken down into historical periods roughly corresponding to those that are used to divide the history of baseball post-1900 up in the Neft & Cohen Sports Encyclopedia: Baseball. In the modern era (as divided by Neft & Cohen, 1973 to the present), I have split it up into earlier and more recent pitchers, but rather arbitrarily as you will see later. For example, Dave Stieb is a “recent” pitcher in that period and Nolan Ryan an “early” one, despite the fact that both of their careers essentially ended in 1993. I didn’t try to divide them precisely into groups, and really, the only reason I have divided them into groups is so that there is a workable number in each installment.

I am doing this not because I feel that Wins Above Team is a particularly important statistic--I would definitely consider measures based on runs or earned runs allowed, or even estimated runs allowed, to be the primary way that the value of pitchers of today and the past should be assessed. However, won-loss records can be a decent measure, if they are interpreted properly, and might possibly be able to provide some insight. Plus, W-L records will always be a primary part of the discussion by non-sabermetricians when evaluating pitchers, so we may as well be able to utilize them in the most sabermetrically sound way.

I believe that the figures that are often published, either based on comparing a pitcher to his teammates’ W-L record with his decisions removed, or Bill Deane’s modification that is used in Total Baseball, are not the most logically sound way to evaluate W-L record. I discussed this in a three part series, and will not rehash those posts here, except to give a brief explanation of the method I prefer.

Simply comparing a pitcher’s W% to that of his teammates, which I’ll call Mate as Rob Wood does, implicitly assumes that his team’s deviation from .500 is solely the product of pitching. After all, if a staff had Greg Maddux, John Smoltz, Tom Glavine, and Denny Neagle, what shame would it be if the fifth starter had a worse record than Mate, which would largely be composed of the records of the four aces? The point of considering his team’s record when evaluating a pitcher’s won-loss record is to account for the support that he received. Having Denny Neagle on the staff does not make it any easier for the #5 starter to win a game, so why compare to him?

Now of course when the standard of comparison is Mate, we cannot completely remove the influence of other pitchers, because we do not know how much of the Mate’s deviation from .500 is due to pitching or offense or any other facet of the game, without additional data, which would defeat the purpose. But we can make an assumption that will be most accurate in more situations than any other.

And that is to assume that the deviation of Mate from .500 is equally a result of offensive and defensive (or, to keep it simple, pitching) efforts. Obviously, this will not be true in all or even most cases, and sometimes will be more incorrect than assuming that the deviation is solely a product of offense, but it will have a lower average degree of error than any other assumption.

For an example of this in action, let’s look at the case of Iron Joe McGinnity, pitching for the 1905 New York Giants. His record was 21-15 (.583), fairly impressive on its face, but his teammates were 84-33 (.718) without him. McGinnity recorded just 24% of the team’s decisions, but 31% of its losses.

The traditional Wins Above Team method will look at the direct comparison between .718 and .583, extrapolate it over McGinnity’s 36 decisions, and proclaim that he was 4.8 wins worse then an average pitcher would have been. Bill Deane’s modified method would account for the fact that it is hard to improve on a .718 mark, with just .282 potential wins to improve, and instead estimate that McGinnity was -3.4 wins.

My method, which is essentially the same as those that have been proposed by Rob Wood and Tango Tiger (with Wood being the formative influence in my thinking on this matter), will assume that the .218 extra wins/game compared to average are half the result of offense and half of pitching, and will therefore credit .109 wins to the offense. Therefore, an average pitcher coupled with this offense would record a .609 W%. Comparing McGinnity’s .583 to this lower standard, we conclude that he was only .9 wins worse then an average pitcher.

Continuing along the same logic, I can also compare to a replacement level pitcher (the standard Oliver and Deane approaches can do this as well, although their creators did not go down this path). I assume that a replacement level pitcher is a .390 pitcher on a .500 team, and conclude that he would be a .499 pitcher with this Giants team. Comparing McGinnity’s .583 to the .499 replacement, we conclude that he was 3 wins better than replacement.

The formulas for the Wood-inspired methods are:
NW% = W% - Mate/2 + .25
WAT = (NW% - .5)*(W+L)
WCR = (NW% - .39)*(W+L)
NW = NW%*(W+L)
NL = W + L - NW

Where NW% is Neutral W%, the W% we would expect for this pitcher on a .500 team; WAT is Wins Above Team, the number of wins over what a .500 pitcher on a .500 team would have won with this team; and WCR is Wins Compared to Replacement, the number of wins above and beyond those of a hypothetical replacement level pitcher, who is assumed to be a .390 pitcher on a .500 team. NW and NL are Neutral Wins and Neutral Losses; these are the win/loss totals a pitcher would have had if he recorded the same number of decisions that he actually did, but won them at his Neutral W% rather than his actual percentage.
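As a minimal sketch, the Wood-inspired formulas can be applied to the McGinnity example from earlier in the post. The function and constant names are illustrative only, and the replacement level is set to the .390 figure given in the prose; the output lines up with the -.9 and +3 wins worked out by hand.

```python
REPLACEMENT_NW_PCT = 0.39  # the ".390 pitcher on a .500 team" from the text


def neutral_record(w, l, mate_pct):
    """Apply the Wood-inspired formulas to one pitcher's record.

    mate_pct is the W% of the pitcher's teammates without his decisions.
    """
    decisions = w + l
    nw_pct = w / decisions - mate_pct / 2 + 0.25   # NW% = W% - Mate/2 + .25
    nw = nw_pct * decisions                        # Neutral Wins
    return {
        "NW%": nw_pct,
        "WAT": (nw_pct - 0.5) * decisions,
        "WCR": (nw_pct - REPLACEMENT_NW_PCT) * decisions,
        "NW": nw,
        "NL": decisions - nw,                      # Neutral Losses
    }


# McGinnity, 1905: 21-15, with teammates at 84-33
mcginnity = neutral_record(21, 15, 84 / 117)
for name, value in mcginnity.items():
    print(f"{name}: {value:.3f}")
```

Note that comparing W% to the team-specific standard (.500 plus half of Mate's deviation from .500) and comparing NW% to .500 are the same calculation, which is why the formula and the worked example agree.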

Next time, I will begin with a look at pitchers primarily active in the 1900-1919 period.

Monday, January 01, 2007

Hitting by Position, 2006

This is another “recycled” piece from last year, and I will not go into as much detail this time, but I do find this stuff interesting if not particularly enlightening. As with the leadoff piece, data comes from STATS, via the Baseball Direct Scoreboard.

My opinion on one-year positional hitting data being used in any sort of analysis is similar to my view on one-year park factors: not a good idea. So these figures are presented for a look at what happened in 2006, not as harbingers of future trends or anything of the sort. The positional adjustments I use when calculating runs versus a hitter at a given position are based on a ten-year sample from 1992-2001. Ideally, I should update this for the five new years of data that we have since 2001, and someday I will get around to that. But I don’t believe that the older figures are seriously flawed for today’s game, and I certainly place a lot more trust in ten-year figures outdated by five years than I do in one-year figures.

Here are the BA/OBA/SLG and RG for each position in 2006:
C: 269/323/421/4.65
1B: 285/359/488/6.07
2B: 276/330/409/4.63
3B: 276/344/458/5.43
SS: 277/330/410/4.65
LF: 278/350/464/5.62
CF: 269/329/427/4.82
RF: 277/342/460/5.43
DH: 263/345/463/5.52
P: 131/164/175/.18

Left field was finally able to outhit right field by a significant margin, while catcher, short, and second converged to the bottom together. Pitchers still can’t hit (take note, Neanderthals). The top walk rate belonged to the DHs, while first basemen led in Isolated Power but just slightly over their DH cousins, allowing the DHs to better them .325 to .318 in Secondary Average. Pitchers, dynamic and exciting element of the offensive game that they are, hit for a .083 SEC.

I always like to consider 1B and DH a group, as well as corner outfielders:
1B&DH: 278/354/480/5.89
LF&RF: 277/346/462/5.52

Here is how the positions stack up as a percentage of the overall RG, with the 1992-2001 figure in parentheses:
C: 93 (89)
1B&DH: 118 (119)
2B: 93 (93)
3B: 109 (101)
SS: 93 (86)
LF&RF: 111 (112)
CF: 97 (102)

I would give you lists of the teams with the best and worst hitting at each position, but that would insult your intelligence. It would be very similar to a list of the best individual hitters at each position. Shockingly, Phillies first basemen hit pretty well, and so did Mets centerfielders. Don’t have a clue as to why.

For pitchers, though, the best production came from St. Louis, where they scorched the ball to the tune of 173/219/222. Of course, I am throwing out the AL pitchers, who didn’t get many chances and were often (relatively) very good or very bad in their limited samples. The worst was Milwaukee, at 097/114/113. But Texas and Oakland hurlers deserve mention for going a combined 0-32 with no walks, making them the only AL teams to not get a baserunner from their pitchers, let alone a hit.

It is dangerous to use the ERP formula at extremely low levels of offense like pitchers, because negative runs become unavoidable, but compared to the pitcher average of .18 RG, the Cards were +16 runs and the Brewers -18. So that’s a 3 1/2 game swing in the standings of the NL Central based on pitchers’ offense, and of course that is the largest gap in the game.
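The arithmetic behind that swing can be sketched as follows. The conversion of roughly ten runs per win is an assumption here (a common rule of thumb, not a figure stated in this post), and the variable names are illustrative.

```python
RUNS_PER_WIN = 10.0  # assumption: rule-of-thumb conversion, not from the post

cardinals_runs = 16   # St. Louis pitchers' offense vs. the pitcher average
brewers_runs = -18    # Milwaukee pitchers' offense vs. the pitcher average

swing_runs = cardinals_runs - brewers_runs   # gap between the two staffs
swing_wins = swing_runs / RUNS_PER_WIN       # approximate gap in wins

print(swing_runs, swing_wins)
```

Under that assumption the gap works out to 34 runs, or about 3.4 wins, in line with the 3 1/2 games cited.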

The fun part of this article a year ago was when I took the correlation, for each team, between the ten-year PADJ and the actual RG they got out of each position. Last year, only the Indians and Orioles had a negative correlation, meaning that they got better hitting out of the positions that usually are poor. In the Indians’ 2005 case, they got great production at shortstop, catcher, and center field, but horrible production at the corner positions. Here’s how it turned out in 2006 (DHs are only considered here for AL teams, pitchers not at all):
HOU…+.91
WAS…+.85
COL…+.81
BOS…+.75
CHA…+.75
STL…+.73
KC……+.68
CIN…+.66
TOR…+.66
TB……+.65
PHI…+.60
MIL…+.56
CLE…+.53
PIT…+.53
OAK…+.50
CHN…+.47
ARI…+.44
SEA…+.40
All players…+.40
LA…+.33
LAA…+.32
SD……+.27
NYN…+.21
SF……+.19
MIN…+.15
FLA…+.11
TEX……0
ATL…-.16
BAL…-.27
DET…-.29
NYA…-.34

So four teams had negative correlations, with Baltimore as a repeater. You can see that the correlations are not clearly tied to team offense, as the Yankees got an unusual positional distribution of offense but still had the best offense in the league. In Baltimore’s case, though, it is frustrating when you get a lot of production out of shortstop (Tejada), and then are punchless at positions on the right side of the defensive spectrum like first (Conine/Millar) and DH (Gibbons/Lopez).

I will chart three teams here: Houston, with the strongest (and most positive) correlation; Texas, with no correlation whatsoever; and New York, with the strongest negative correlation. What I have done is list the positional adjustments (PADJ) as a baseline, then expressed each position’s RG as a percentage of the composite team RG for the positions considered (the ARG column). Houston, for example, got a composite 4.95 RG out of the positions considered (C, 1B, 2B, 3B, SS, LF, CF, and RF, plus DH for AL clubs). Their first basemen had a 7.94 RG, so 7.94/4.95 = 1.61:
Houston
POS……PADJ……ARG
C……… 89…………62
1B………119………161
2B………93…………83
3B………101………114
SS………86…………69
LF………112………107
CF………102………84
RF………112………127
As you can see, every position trended in the same direction as the PADJ (by this I mean either above or below average), except for CF. In fact, most of the positions are more extreme than PADJ would guess.

Then you have the strongest negative correlation, from the Yankees:
New York(A)
POS……..PADJ……..ARG
C………89…………92
1B………119………91
2B………93…………95
3B………101………119
SS………86……… 119
LF………112……… 90
CF………102……… 93
RF………112……… 97
DH………119………104
The Yankees’ top hitting position was shortstop, which is generally the weakest position. Their first basemen were poor, and the other positions are pretty well clustered around 100. You can see that their offensive contributions by position were much more balanced than Houston’s.

Finally, the Rangers, for whom there is no correlation either way:
Texas
POS……PADJ………ARG
C……… 89………… 88
1B………119…………122
2B………93………… 109
3B………101…………99
SS………86………… 103
LF………112…………105
CF………102…………105
RF………112…………87
DH………119…………81
Here, many of the positions match expectations, but the RFs and DHs were well below what you would expect while shortstop was well above, and it adds up to very little correlation when considered as a whole.
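The three correlations charted above can be checked with a small sketch that runs a plain Pearson correlation on the PADJ and ARG columns of the tables. Since ARG is just RG divided by a team constant, correlating against ARG gives the same result as correlating against raw RG. The helper name and list names are illustrative, and the inputs are the rounded table values, so the last decimal could differ slightly from figures computed on unrounded data.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Order: C, 1B, 2B, 3B, SS, LF, CF, RF (plus DH for AL clubs)
PADJ_NL = [89, 119, 93, 101, 86, 112, 102, 112]
PADJ_AL = PADJ_NL + [119]

ARG = {
    "HOU": (PADJ_NL, [62, 161, 83, 114, 69, 107, 84, 127]),
    "NYA": (PADJ_AL, [92, 91, 95, 119, 119, 90, 93, 97, 104]),
    "TEX": (PADJ_AL, [88, 122, 109, 99, 103, 105, 105, 87, 81]),
}

for team, (padj, arg) in ARG.items():
    print(f"{team}: {pearson(padj, arg):+.2f}")
```

Run on the tables as printed, this gives roughly +.91 for Houston, -.34 for New York, and essentially zero for Texas, matching the list of team correlations.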
