Sunday, April 18, 2010

The Account-Form Scoresheet

Cross-posted at Weekly Scoresheet, for which I still want your scoresheets.

Most scoring systems endeavor to keep a chronological account of the game (at least in theory--only the Project Scoresheet system is purely linear, but one can reconstruct the game in order from a traditional scoresheet). However, what if one were only interested in recording the basic box score statistics of individual players, and not in preserving the play-by-play of the game?

For starters, you could certainly save a lot of paper. You could eliminate a lot of codes--there would be no need to record wild pitches, passed balls, balks, and the like (remember, I said basic box score statistics). It would be possible to describe each batter's plate appearance in just a few characters.

As is the case with so many other baseball topics, one can turn to Bill James for a creative idea about how to do this. In the 1983 Abstract, he introduced what he called the account-form box score. It was a box score that retained much of the data from a scoresheet, yet was compact enough to be printed in a space no larger than that devoted to traditional box scores.

I recently threw together a new scoresheet that records only the information necessary to create an account-form box score of a game. I made a few tweaks to James' coding system, but for the most part I just used it as is.

First, the ways to reach base:

S = single
D = double
T = triple
H = home run
W = walk
I = intentional walk
B = hit batter
E = error
f = fielder's choice (at end of code)

Three baserunning events are noted, by use of a lowercase letter after the on-base code:

s = steal
c = caught stealing
x = other out on base (That is charged to the runner, like a pickoff or out advancing when not forced. If the runner is retired on a force, no "x" is added).

Outs are recorded in typical scorekeeping notation--a ground out to short is "63", a fly to right is "9", a strikeout is "K". Double plays get a D prefix--"D643". A sacrifice hit gets a b suffix (for bunt); a sac fly needs no modifier because it will involve a RBI. If a fielder's choice results in a batter-runner not reaching base (i.e. it is the third out of the inning, so the batter-runner never actually gains first), then it is recorded with a "f" suffix (for example, 46f).

When a batter gets an RBI for driving in someone other than himself, James uses a ', so D' is an RBI double. D" would be a two-run double. When three other runners score, an exclamation point is used, so a bases-clearing double is D!. Thus, a solo homer is just H, a three-run homer is H", and a grand slam is H!. A sac fly to left is 7'.
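The RBI markers are simple enough to decode mechanically. Here is a rough Python sketch; the function name `rbi_count` is made up for illustration, and it assumes that any code beginning with a capital H is a home run, so the batter drives himself in as well.

```python
# Markers for runners OTHER than the batter, as described above:
# ' = one, " = two, ! = three.
RBI_MARKS = {"'": 1, '"': 2, "!": 3}


def rbi_count(code: str) -> int:
    """Return the total RBI implied by one plate-appearance code."""
    code = code.strip()
    others = sum(RBI_MARKS.get(ch, 0) for ch in code)
    # Assumption: a code starting with "H" is a home run, which also
    # scores the batter himself.
    own = 1 if code.startswith("H") else 0
    return others + own


if __name__ == "__main__":
    for code in ["H", 'H"', "H!", "D'", "7'", "63"]:
        print(code, rbi_count(code))
```

So H" counts as three RBI (two runners plus the batter), while D" counts as two.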

In James' system, batter-runners that reached base and do not end up scoring are given lower case symbols...s, d, t, etc. If you are keeping a paper scoresheet in this style, this is problematic, as you'll have to do a lot of erasing. So what I did instead was embolden the codes of batter-runners who scored. When paper scoring, this is easily accomplished by tracing a pencil marking with black ink. When keeping a paper scoresheet, it is also helpful to mark off the innings, which I did with a number in the far left portion of the box.

The biggest drawback to the account-form box as James envisioned it was that pitcher records are hard to read. James simply noted the pitchers by the number of batters they faced each time through the order. Suppose for instance that CC Sabathia faced 32 hitters, Phil Hughes faced one, and Mariano Rivera faced five. James' notation would be:
Sabathia (9995), Hughes (xxx1), Rivera (xxx32)

Sabathia faced nine hitters each of the first three times through the order, then five the fourth. Hughes faced no batters the first three times through, and one on the fourth. Rivera faced no batters the first three times, the final three on the fourth, and two on the fifth.

This makes it a chore to read through the account and tally up a pitcher's statistics (the same issue exists for batters of course, but it is a lot easier to read across a row to take in the result of three to six plate appearances than it is to read up and down a column to take in the result of potentially dozens of plate appearances). There's no easy solution--the best option might be to just give each pitcher a basic line, perhaps in the format of (IP H R ER W K Dec), like:

Sabathia (7.2 8 3 2 2 6 W) or Sabathia (7.2, 8, 3, 2, 2, 6, W)

Also, I tend to think that a better notation to indicate batters faced for the above example would be something like:

Sabathia (1-32), Hughes (33), Rivera (34-38)

This does not solve the problem with reading a pitcher's performance I described above, and it may force you to count off by nine (9, 18, 27, 36, etc.) in order to figure out where to start reading the scoresheet to find a particular pitcher, but I think it's more intuitive and it saves a little space that is being given back by listing out pitcher lines. Of course, counting off by nine shouldn't be a problem for those versed in systems (such as Project Scoresheet) that use numbered boxes.
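Converting James' per-trip notation into the batter-range notation, and counting off by nine to find where a pitcher starts, can both be sketched in a few lines of Python. The function names here are hypothetical. The sketch assumes the pitchers of one team are listed in order of appearance, that each character of James' string covers one trip through a nine-man order ("x" meaning none), and that every pitcher faced at least one batter.

```python
def batter_ranges(staff):
    """staff: list of (name, trips) pairs such as ("Sabathia", "9995").

    Returns (name, first_batter, last_batter) tuples, numbering the
    opposing batters consecutively from 1.
    """
    ranges = []
    next_batter = 1
    for name, trips in staff:
        # Each digit is the number faced on one trip; "x" adds nothing.
        faced = sum(int(ch) for ch in trips if ch.isdigit())
        last = next_batter + faced - 1
        ranges.append((name, next_batter, last))
        next_batter = last + 1
    return ranges


def format_ranges(ranges):
    """Render ranges as 'Name (first-last)', or 'Name (n)' for one batter."""
    parts = []
    for name, first, last in ranges:
        span = str(first) if first == last else f"{first}-{last}"
        parts.append(f"{name} ({span})")
    return ", ".join(parts)


def locate(batter_number, lineup_size=9):
    """Return (time through the order, lineup slot) for a batter number."""
    trip = (batter_number - 1) // lineup_size + 1
    slot = (batter_number - 1) % lineup_size + 1
    return trip, slot


if __name__ == "__main__":
    staff = [("Sabathia", "9995"), ("Hughes", "xxx1"), ("Rivera", "xxx32")]
    print(format_ranges(batter_ranges(staff)))
    print(locate(33))  # where Hughes' lone batter sits on the scoresheet
```

For the example above, batter 33 works out to the sixth lineup slot on the fourth time through the order, which is where you would start reading for Hughes.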

Another issue is how to list substitutions. When printing a box score, it is easy to insert an extra line for substitutes and indent it. But with a pre-printed scoresheet, it's trickier. What I've done is use letters (a, b, c, d, etc.) to show substitutions. The timing of the substitutions is noted by using the first time through the batting order that the player appeared. For example:

[a Giambi 4] indicates that Giambi pinch hit, with his first PA coming the fourth time through the order in the batting order slot where a is listed.
[d PR Guzman 5] indicates that Guzman pinch-ran the fifth time through the order.
[c 6 Everett 3] indicates that Everett was a defensive replacement at shortstop, and that his first PA (if in fact he got one) would come the third time through the order. The exact time he entered the game defensively is not recorded, as it isn't on a typical basic box score.

This allows the scoresheet to be more compact, but it does make it a little tougher to quickly read batting lines for lineup slots in which multiple players appeared.

Honestly, I don't have a lot of use for an account-form scoresheet myself; it's not detailed enough for my tastes as a scorekeeper. It's the box score that I want--I want to be able to get on the internet in the morning and read a mini-scoresheet rather than a box score. Which line tells you more, and more importantly IMO tells you a better story:

63, K, H", 9, W

or

4 1 1 3 1 1
with a note below saying HR: Player X

or

4 1 1 3 1 1 HR
as the more helpful Baseball-Reference box score does?

It's true that the account-form style requires you to do a little counting in your head if you want to know H-AB, or runs scored, or the other box score categories. However, I'd argue that it's very easy to do that sort of addition, and that it also is a cosmetic improvement--there are an awful lot of zeros in box scores.
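The counting can also be done by machine. Below is a minimal Python sketch that turns a comma-separated account line into a box score line; it is an illustration, not part of James' system or of the scoresheet itself. Several things in it are assumptions: the function name `box_line`, the column order AB R H RBI BB K, a trailing `*` standing in for the bold type that marks a batter who scored, and the rule that a lone 7, 8, or 9 carrying an RBI marker is a sac fly.

```python
RBI_MARKS = {"'": 1, '"': 2, "!": 3}   # runners other than the batter
HITS = {"S", "D", "T", "H"}


def box_line(account: str) -> dict:
    """Tally AB, R, H, RBI, BB, K from a line like '63, K, H", 9, W'."""
    line = {"AB": 0, "R": 0, "H": 0, "RBI": 0, "BB": 0, "K": 0}
    for raw in account.split(","):
        code = raw.strip()
        if not code:
            continue
        # Assumption: a trailing "*" replaces bold type for "scored".
        scored = code.endswith("*")
        code = code.rstrip("*")
        rbi = sum(RBI_MARKS.get(ch, 0) for ch in code)
        core = "".join(ch for ch in code if ch not in RBI_MARKS)
        if not core:
            continue
        first = core[0]
        # "D643" is a double play; a bare "D" (or "Ds", "Dx") is a double.
        double_play = first == "D" and core[1:2].isdigit()
        if first in HITS and not double_play:
            line["AB"] += 1
            line["H"] += 1
            if first == "H":          # a homer scores the batter too
                scored = True
                rbi += 1
        elif first in ("W", "I"):     # walk or intentional walk
            line["BB"] += 1
        elif first == "B":            # hit batter: no at bat
            pass
        else:
            sac_bunt = first.isdigit() and core.endswith("b")
            # Assumption: an outfield fly with an RBI is a sac fly.
            sac_fly = core in ("7", "8", "9") and rbi > 0
            if not (sac_bunt or sac_fly):
                line["AB"] += 1
            if first == "K":
                line["K"] += 1
        line["R"] += int(scored)
        line["RBI"] += rbi
    return line


if __name__ == "__main__":
    result = box_line('63, K, H", 9, W')
    print(" ".join(str(v) for v in result.values()))
```

Run on the sample line above, this reproduces the 4 1 1 3 1 1 box line. Errors and fielder's choices fall through to the last branch and are charged as at bats.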

Incidentally, this is the same reason that all of the scoresheets I design myself omit columns in which to record batting lines. I'd rather devote that space to larger scoreboxes and do some math in my head than waste a bunch of space on columns I'll have to fill in to create a box score that I could just as easily read on the internet. Pitchers are a different story, as it is much harder to get a quick read on their performance as the number of scoreboxes under consideration is much larger (and more spread out) than for a batter.

Any site that implemented an account-form box score would instantly get all of my morning box score hits. I'm a little surprised that someone hasn't implemented it or something similar online just to stand out, as there are so many places that you can go to get box scores. As it stands now, only Baseball-Reference offers something truly unique.

Below is an account-form scoresheet for Cincinnati @ Milwaukee, 4/10/2008, copied from a scoresheet I kept in a different format. I simply typed the account into Excel, although ideally I would have paper-scored it and scanned. It will be easier to read this way though. A drawback is that since I am using the cells in the spreadsheet, if a leadoff hitter scored, the inning number is in bold as well. I did not include pitcher lines, just the batters they faced:




Let me go through batter-by-batter and write what they did in English:

Patterson: grounded to second, grounded to second, grounded to second, grounded to second, flied to center
Keppinger: grounded to short, popped to short, grounded to pitcher, intentional walk, grounded to pitcher
Griffey: struck out, flied to center, walk, hit into double play
Phillips: flied to right, flied to right, struck out, grounded to second
Dunn: grounded to first (a pop to first would have been marked 3^; although James used 3 for a pop to first and 3- for a grounder to first), grounded to first (pitcher covering), single (scored), grounded to short
Encarnacion: grounded to third, walk (scored), two-run home run, single
Hatteberg: walk, single (put out on base), double (scored), grounded to second
Bako: struck out, RBI double, RBI single, walk
Harang: sac bunt to first, struck out, single
Votto: flied to center

Weeks: popped to third, flied to center, hit into fielder's choice at second, popped to second
Gross: double, flied to center, grounded to second
Counsell: hit into fielder's choice at short
Fielder: grounded to short, flied to left, flied to right, popped to third
Braun: grounded to short, popped to third, grounded to third, struck out
Hall: double (scored), grounded to short, grounded to short, grounded to second (via pitcher)
Hart: sac bunt to first, grounded to short, struck out
Villanueva: sac bunt to pitcher, struck out
Kapler: grounded to first (pitcher covering)
Kendall: grounded to short, single, single

Here is a link to download the account-form scoresheet in Excel format (scroll down to Account-Form scoresheet).

Great Moments in Yahoo! Box Scores


Hide your women and children...the R. Ortiz's are multiplying.

Thursday, April 15, 2010

Great Moments in Yahoo! Box Scores


At first I thought they might be thinking of the Dodgers' other washed-up R. Ortiz, but he has a loss too.  And don't ask me which is which; Yahoo! box scores don't bother to elaborate.

Saturday, April 03, 2010

2010 Predictions

As always, these are offered in the spirit of fun and not serious analysis. If you want the latter, there are plenty of smart folks in the saber-sphere that do W-L projections based on their projection systems. Even with fancy tools, the margin of error inherent in this exercise is enormous.

So I don't do anything special. I put together a very rudimentary spreadsheet and then I just make my best guess as to how the teams will finish and put it out on the internet for the world to see.

One thing to always keep in mind about predictions: picking a team first doesn't mean that I think they'll win. It just means that I think they have the highest probability of winning. If you don't understand what I mean, don't waste your time by reading on.

I'll keep the rambling introduction short this year and get to the point.

AL EAST

1. New York
2. Tampa Bay (wildcard)
3. Boston
4. Baltimore
5. Toronto

On paper, it's extremely hard to pick against the Yankees. They have the most potent projected offense and strong pitching to boot. That being said, any notion that the Yankees are some sort of unbeatable juggernaut is silly, as such notions usually are. It's an old team, there are multiple players that have noteworthy lingering injury concerns, and one wonders how sure even their brass is about the structure of their pitching staff. They are an easy choice as favorite for me, but not a super-team.

The Rays would probably be my pick to win any other division save the NL East and perhaps the NL Central, but we all know they're in a tough spot in the East. It would be foolish to write them off. The Red Sox have retooled around pitching and defense, but it seems as if the mainstream take is underestimating the offensive prowess of a healthy Beltre and Cameron. Picking them third is just a means to deviate a little bit from the conventional wisdom; they could easily end up with the best record in MLB. The Orioles are improving, and I expect big things from Matt Wieters, but it's not their year to get back in the hunt and even if it were, they'd be buried in this division. The Blue Jays may be a challenger...for the Royals, that is.

AL CENTRAL

1. Minnesota
2. Chicago
3. Detroit
4. Cleveland
5. Kansas City

My track record at predicting AL Central results is horrible. Some of it has to do with the fact that I'm a Cleveland fan, and am probably a little biased towards them in my picks (although when I've picked the Indians I've hardly been the only one doing so, I've admittedly tended to pick them whenever it is reasonable to do so).

Joe Nathan's elbow injury is obviously a blow to the Twins, and it does cut into the gap between them and the rest of the division, but I still think they stand out as the clear on-paper favorite. Chicago figures to have excellent pitching, but their offense is among the worst in the AL. The additions of JJ Putz and Juan Pierre and the subtraction of Jim Thome just serve to increase my (somewhat irrational) antipathy towards the team. The Tigers didn't seem to have much of a plan; apparently they really like Max Scherzer, because otherwise the Granderson trade is very puzzling given that they acted like contenders by signing Damon and Valverde (Just to make it clear, I don't have a problem with the trade for the Tigers, except that it doesn't make a whole lot of sense for winning in 2010). I've written about the Tribe in detail; suffice it to say, the pitching is a major weakness but they should score some runs and aren't as bad as many seem to think. The Royals...I don't even know where to begin.

AL WEST

1. Los Angeles
2. Texas
3. Seattle
4. Oakland

On paper, this should be the best pennant race of the year. It's easy to see any of the first three winning it, and Oakland can't be dismissed out of hand. I had a hard time deciding which team to pick, as the Angels and Mariners are just about even in my crude spreadsheet. On one hand, it's boring to pick LA; on the other hand, Seattle has gotten so much love from certain segments of the saber-sphere that it almost feels like groupthink to pick them.

I'm thoroughly unimpressed by the Angels' off-season maneuvers and like just about everyone else, I love what Seattle did. However, the Mariners are still short on offense, and carrying Ken Griffey and Mike Sweeney does not help. Their spring training moves have not impressed as much as their winter deals, to say the least. The Rangers have high hopes, but I'd be surprised if they matched the 87 games they won last year; fewer could be enough, though, and while I would have pegged them third when spring training started, I've moved them up a notch. I'm probably a little behind the curve in admitting this, but Oakland does seem to be floundering about without a long-term plan. Whether they're in a holding pattern until their stadium situation is resolved, changing their minds on the strength of the rest of the division, or just plain confused is a different question. There's some talent here, but the offense also leaves a lot to be desired.

NL EAST

1. Atlanta
2. Philadelphia (wildcard)
3. Florida
4. New York
5. Washington

Just after the turn of the century, I was always picking against the Braves--it had gotten so boring to pick them that anytime I thought they had a halfway-serious challenger, I'd pick the challenger. However, in the last three seasons I've gotten in the habit of picking Atlanta when I probably should know better.

This season, my crude analysis indicates that the Phillies are the best team in the NL. But picking them would mean picking a World Series rematch, which is beyond boring. If my goal here was to be as painstaking as possible and strive for the highest possible chance of being right, I'd do it. But this whole exercise is for kicks and nothing else, so I'm going with Atlanta.

I hope it doesn't sound as if I think Atlanta really doesn't have a chance--of course they do, and they'd be my wildcard pick regardless. Their offense is still shaky (in Jason Heyward I trust, apparently), the back end of the bullpen is a ticking time bomb of age and injury, and trading one of the NL's top starters of 2009 may make business sense but it certainly doesn't improve their 2010 hopes. Meanwhile, the Phillies have gotten too much of a halo in my opinion from the mainstream based on their playoff successes over the past two seasons. They have been remarkably consistent since 2003 (86 wins, 86, 88, 85, 89, 92, 93), but 93 wins doesn't make you a super-team and I see no real reason to expect improvement. It wouldn't be shocking to see them on the outside looking in during October.

The Marlins are interesting in their own right; they could very well slip in as the wildcard or challenge for the division with the right set of circumstances. What's kind of funny about a team that's consistently had good young talent is they've never had that one magical year in which everything comes together and they win 95+ games--their highwater mark is still 92 in 1997. The Rays, Twins, A's, and Indians have all managed to put together at least one season of that type in the same time period. The Mets' offense should bounce back to some extent, but the starting pitching is awfully thin to expect a strong contender. The Nationals' 2010 should vaguely resemble the Orioles' 2009--hoping for mediocrity while the primary drama is the debut of a much-hyped prospect.

NL CENTRAL

1. St. Louis
2. Milwaukee
3. Chicago
4. Cincinnati
5. Houston
6. Pittsburgh

The Cardinals look like a clear favorite amongst a pretty motley group. Of course the proper candles must be lit for the health of Pujols and the twin aces. The Brewers look like a thoroughly average team, which makes them the safest pick for second here, but I think the Cubs might have a better chance of winning it. On the other hand, when your rotation includes Carlos Silva and your outfield is filled with two albatross contracts and a third that may end up as one, it's hard to be too excited. Reds fans seem to think it's their year; contention is certainly possible, particularly if St. Louis comes back to the pack, but this team is still a year or two away from being obvious contenders. The Astros are spinning their wheels as they have since they won the pennant, which may have been the worst thing that ever happened to them. They have the potential to be the worst team in the NL, but there's enough reliable-if-healthy veteran talent to make me conservatively pick them ahead of the Pirates.

NL WEST

1. Colorado
2. Los Angeles
3. Arizona
4. San Francisco
5. San Diego

This is another division where any of the top three winning would not surprise me in the least. The Rockies get the nod just to shake things up a little bit; I actually have Los Angeles ahead on paper. Colorado probably can't count on repeat performances from Tulo and de la Rosa among others, and I'd like to take this opportunity to again advance my Free Chris Iannetta! campaign. The Dodgers are being written off all too quickly by a lot of mainstream writers; no, they didn't improve themselves in the off-season, but I don't see why their starting pitching is invoking hand-wringing. The Diamondbacks' starting pitching looks shaky, but they have a solid offense and bullpen and with any luck should stay in the race. The Giants are seemingly an injury to one of the big two away from disaster; the incompetence in assembling an offense shouldn't be surprising any more, but it makes me shake my head. The Padres should challenge the Pirates, Astros, and Nationals for league cellar position--I'm looking forward to the potential of a Cory Luebke sighting later in the year.

WORLD SERIES

New York over Atlanta

It's boring to pick the Yankees, and of course I think the odds against any particular team winning the pennant are fairly formidable.

AL Rookie of the Year: 1B Chris Carter, OAK
I realize he's been optioned; these picks are anything but scientific and are more or less the first name that pops into my head.
AL Cy Young: Jon Lester, BOS
AL MVP: 3B Evan Longoria, TB
NL Rookie of the Year: OF Jason Heyward, ATL
It's way too easy to pick Heyward, but he'll have the hype machine on his side from the start if nothing else.
NL Cy Young: Yovani Gallardo, MIL
I hate making the obvious Cy Young picks, so I usually look like a moron.
NL MVP: 2B Chase Utley, PHI
The World Series seems to have finally convinced a lot of media-types how good Utley really is. Unfortunately, it may have come too late to earn him any hardware.

First manager fired: Ron Washington, TEX
Worst pennant race: NL Central
Best pennant race: AL West
Worst teams in each league: KC, PIT
Most likely to go .500 in each league: SEA, MIL
Team in each league most likely to disappoint mainstream consensus: DET, SF
Team in each league most likely to surprise mainstream consensus: OAK, ARI
Most obnoxious stories of the year: This is no longer fun to predict, as it's always and forever steroids. This year Mark McGwire as Cards hitting coach seems to be the leading cause for hand-wringing by the insufferable steroid crusaders. Others: Joba and Phil as relievers or starters, Jeter's contract, the Twins' closer situation, the various Twitter-fueled controversies that are bound to pop up (Ozzie and Oney could just be the beginning)
How stupid am I likely to look if one reviews these predictions after the season and ignores the disclaimers: Very