Tuesday, November 21, 2006

Pujols vs. Howard
It's a close call, but Ryan Howard was the wrong choice for MVP. It's an interesting MVP battle because Howard and Albert Pujols are so similar. This wasn't a choice between a slugger and a speedster, a shortstop and a DH, or a pitcher and a hitter; this was a choice between two slugging first basemen. Forget all the hype about Howard's home runs; the only real difference between the offensive lines for these two players was their playing time. Pujols missed a large chunk of June with an injury. He finished the season with 143 games and 634 plate appearances. Howard totaled 159 games and 704 plate appearances. Otherwise, their offensive numbers were very similar. The following two lines are Howard's actual stats, compared to projections for Pujols if he had also had 704 plate appearances:


Player    R    H   2B  HR  RBI   BB    K  SB   Avg   OBP   Slg    OPS
Howard  104  182   25  58  149  108  181   0  .313  .425  .659  1.084
Pujols  132  197   37  54  152  102   56   8  .331  .431  .671  1.102


There's one huge difference: the strikeouts. Other than that, the two lines are very similar. Howard has a few more walks and a few more homers. Pujols has a few more singles and a dozen more doubles. Overall, Pujols's rate stats are slightly superior; he has a slight edge in both on-base percentage and slugging percentage.
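
For anyone who wants to check the projection, here's a rough sketch in Python. It assumes the projection is nothing fancier than scaling each counting stat by 704/634. Pujols's raw season totals aren't listed in this post; the ones below come from the public record and should be treated as an assumption. The rate stats (Avg, OBP, Slg) don't change under this kind of linear scaling, so they aren't included.

```python
# Pujols's actual 2006 counting stats. These raw totals are NOT given in the
# post; they are taken from the public record and are an assumption here.
PUJOLS_ACTUAL = {"R": 119, "H": 177, "2B": 33, "HR": 49,
                 "RBI": 137, "BB": 92, "K": 50, "SB": 7}
PUJOLS_PA = 634   # plate appearances, from the post
HOWARD_PA = 704   # plate appearances, from the post


def project(counts, actual_pa, target_pa):
    """Scale each counting stat linearly to a different number of plate appearances."""
    factor = target_pa / actual_pa
    return {stat: round(value * factor) for stat, value in counts.items()}


projected = project(PUJOLS_ACTUAL, PUJOLS_PA, HOWARD_PA)
for stat, value in projected.items():
    print(f"{stat:>3}: {value}")
```

Under those assumptions, the rounded output matches the Pujols line in the table.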
Obviously, that slight edge doesn't make up for the 16-game edge Howard has on Pujols. 16 extra games from your best player has a lot of Value. And that's probably where the analysis ended for a lot of the voters. The problem is the offensive numbers are a little misleading. One reason: park factors. My guess is that most voters ignore park factors unless a Rockie is in the running. Here are the park factors (numbers over 100 indicate that the park improves run scoring by the percentage over 100) for Philly and St. Louis according to three different sources (they all base their numbers on different time spans):


              Phi  StL
Prospectus    103   99
BB Reference  103   98
ESPN          106   95
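
To put the table in percentage terms, here's a small Python sketch. It only applies the reading given above (a factor of 103 means run scoring is up 3%) and then compares the two parks to each other. The function names are made up for this example, and none of this is how any of the three sources actually applies its park adjustments.

```python
# Park factors as listed in the post: (Philadelphia, St. Louis).
PARK_FACTORS = {
    "Prospectus": (103, 99),
    "BB Reference": (103, 98),
    "ESPN": (106, 95),
}


def run_effect_pct(park_factor):
    """Percent by which a park raises (+) or lowers (-) run scoring."""
    return park_factor - 100


def relative_gap_pct(phi, stl):
    """How much more run-friendly Philadelphia is than St. Louis, in percent."""
    return (phi / stl - 1) * 100


gaps = {src: relative_gap_pct(phi, stl) for src, (phi, stl) in PARK_FACTORS.items()}
for src, gap in gaps.items():
    phi, stl = PARK_FACTORS[src]
    print(f"{src:<13} Phi {run_effect_pct(phi):+d}%  StL {run_effect_pct(stl):+d}%  gap {gap:.1f}%")

# The source with the smallest gap between the two parks.
smallest = min(gaps, key=gaps.get)
print("Smallest gap:", smallest)
```

Depending on the source, Philadelphia comes out somewhere between about 4% and 12% friendlier to run scoring than St. Louis.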


Doesn't look huge, but it's a big enough factor that it needs to be taken into account. So, now we're at this point: (1) Pujols's slightly better rate stats need to be adjusted for the park factors and (2) the resulting edge for Pujols needs to somehow be weighed against Howard's significant edge in playing time. Fortunately, Baseball Prospectus has statistics that already do this hard work. I'll note that of the three pairs of park factors listed above, Prospectus's find the least difference between the parks, so if anything, an argument can be made that Prospectus's stats are all skewed slightly in Howard's favor. Equivalent Average (EqA) is one of their park-adjusted measures of overall offensive production. It's a rate stat that is calibrated to look sort of like batting average (i.e., over .300 is good, below .250 is bad). Equivalent Runs (EqR) is a counting stat that shows how many runs a player produced based on his EqA. RARP is based on EqR and is a measure of how many more runs a player produced than a replacement-level player at his position. Before I go any further, here are the numbers:


         EqA   EqR  RARP
Pujols  .346   128  73.1
Howard  .337   134  72.7


As you can see, Pujols has a decent edge in EqA (which is greater than his edge in OPS because of the park factors). Howard has a slight edge in EqR (due to his edge in playing time), but Pujols has an edge in RARP. One can quibble with the way Prospectus determines what a replacement player would produce, but it's clear that doing such a calculation is necessary. Essentially, using EqR to determine Value would be flawed because it assumes that for the 16 games Pujols missed, the Cardinals received no offensive production from first base. RARP assumes that the team could've received some minimal amount of output and therefore reduces the edge given to Howard due to playing time. The difference in RARP is only 0.4 runs, which is pretty much meaningless. (By way of comparison, VORP is a similar stat produced by Prospectus; it gives Pujols an edge of 3.9 runs.) So, based on their offensive production, it's pretty much a tie. You can either flip a coin or look at some possible tie-breakers. If you do the latter, it's clear that Pujols deserved the MVP.
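
The replacement-level reasoning above can be sketched with a little arithmetic in Python. This assumes RARP is simply EqR minus what a replacement-level first baseman would have produced in the same playing time, which is a simplification and not necessarily how Prospectus computes it. The inputs are the published, rounded figures from the table, and the function name is made up for this example.

```python
# Figures from the post: Equivalent Runs, RARP, and plate appearances.
PLAYERS = {
    "Pujols": {"eqr": 128, "rarp": 73.1, "pa": 634},
    "Howard": {"eqr": 134, "rarp": 72.7, "pa": 704},
}


def implied_replacement_runs(eqr, rarp):
    """Runs a replacement-level player is implied to produce in the same playing time.

    Assumes RARP = EqR - replacement runs, which is a simplification.
    """
    return eqr - rarp


replacement = {
    name: implied_replacement_runs(p["eqr"], p["rarp"]) for name, p in PLAYERS.items()
}
eqr_gap = PLAYERS["Howard"]["eqr"] - PLAYERS["Pujols"]["eqr"]      # Howard's raw edge
rarp_gap = PLAYERS["Pujols"]["rarp"] - PLAYERS["Howard"]["rarp"]   # Pujols's edge

for name, p in PLAYERS.items():
    per_pa = replacement[name] / p["pa"]
    print(f"{name}: replacement runs {replacement[name]:.1f} ({per_pa:.3f} per PA)")
print(f"EqR gap (Howard): {eqr_gap}, RARP gap (Pujols): {rarp_gap:.1f}")
```

Under that assumption, the implied baseline works out to roughly the same rate per plate appearance for both players. Howard's 6-run edge in EqR is about what a replacement-level player would have produced in his 70 extra plate appearances, which is why it disappears in RARP.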

1. Fielding
This really should be more than just a tie-breaker. Fielding is an important part of the game. But, when we're talking about first basemen, I think fielding was probably an afterthought for most voters. Fielding stats aren't wholly reliable, and there are particular problems for first basemen because a lot of their value is hard to measure. But, most of the numbers agree with the general observations of scouts and experts that Pujols is one of the best defensive first basemen in baseball. Howard's general reputation is that he's bad in the field, and the numbers indicate that he's average at best.

Pujols won the Gold Glove award this year. He was also named the best defensive 1B in all of baseball by a panel of experts put together by Bill James. Prospectus calculated Pujols's defense as being worth 17 runs more than an average defender, and Howard's as being 15 runs worse than an average defender. That's huge. I think their defensive numbers are flawed, but it's at least another data point in Albert's favor. Chris Dial's defensive numbers posted on Baseball Think Factory show them both to be about average, giving Pujols a mere 3-run edge. Probably the best publicly available numbers (although still probably somewhat flawed for first basemen) are the +/- numbers calculated by John Dewan (who published the Fielding Bible last year). The new Bill James Handbook shows the top 10 according to Dewan at each position for the season. Pujols is #1, with 19 more plays made than the average 1B. Howard isn't in the top 10, which means he has no more (and probably fewer) than 5. Howard made 14 errors (one behind Nick Johnson for the league lead); Pujols made 6.
What does all this mean? Well, it's hard to put a firm number on defensive impact, but all evidence indicates that Pujols is a superior defensive player, and that certainly needs to be factored into the discussion.

2. Base Running
Base running is generally a relatively minor factor. Some studies I've read indicate that most players fall within the range of being worth plus or minus 5 runs a season compared to an average baserunner. That's only going to impact the discussion when two players are very close otherwise and have a large difference in baserunning ability. In other words, this is one of the rare times when it is a factor. The Bill James Handbook has a number called Running Rating, which is based on a number of factors, such as how many times a player advances from first to third on a single, scores from second on a single, etc. Howard has a -21, ranking as one of the worst runners in baseball (amongst unsurprising names like Giambi and Thomas). Pujols is well above average with a +13.

3. Clutchiness
I'm generally not one to talk about clutch hitting. All of the studies I've read indicate that past "clutch" performance has almost no predictive value about future "clutch" performance. So, like most statheads, I usually ignore it. But, the fact that it has little predictive value doesn't mean that it didn't actually happen in the past. So, I think it has a place in MVP talks. Here are the OPS figures for both candidates with runners in scoring position and in situations defined by ESPN as "close and late."

        ScoringPos  Close/Late
Pujols       1.337       1.249
Howard       0.941       1.049


Obviously, it's hard to define which situations are "clutch," but both of the above measures seem to fit into the general area of what most people are talking about when they talk about clutch hitting, and Pujols has a large edge in both. A much more comprehensive analysis is available through a stat called WPA (Win Probability Added). This article explains it, but the short version is that it measures the probability of a player's team winning before and after each at-bat of the season, and calculates how much the player improved his team's chances of winning in each of those at-bats. So, a walk-off homer would have a very high value and a solo homer when the team is down by 10 would have a very tiny value. Of course, this formula assumes that all games are equally important, but otherwise, it should capture the total offensive value, including "clutchiness," over the course of the season. Pujols has the edge here: 9.2 to 8.2, meaning that he was worth one more win to his team according to this calculation.
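
Here's a toy version of the WPA idea in Python, just to make the mechanics concrete. The win probabilities below are made up for illustration (real ones come from historical win-expectancy tables), and the function name is an assumption for this sketch.

```python
def win_probability_added(plate_appearances):
    """Sum the change in the team's win probability across plate appearances.

    Each entry is (win_prob_before, win_prob_after), both between 0 and 1.
    """
    return sum(after - before for before, after in plate_appearances)


# Hypothetical win probabilities, invented for illustration only.
walk_off_homer = (0.35, 1.00)         # swings the game outright
solo_homer_down_ten = (0.002, 0.004)  # barely moves the needle
strikeout_tie_game = (0.50, 0.46)     # outs count against the hitter

sample = [walk_off_homer, solo_homer_down_ten, strikeout_tie_game]
total = win_probability_added(sample)
print(f"WPA over the sample: {total:+.3f}")
```

The two homers count the same in the box score, but here one is worth 0.65 wins and the other 0.002. Summed over a full season, that's the kind of calculation behind the 9.2 and 8.2 figures.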

4. Playoffs
There's a lot of disagreement about how much weight should be given to whether a player's team made the playoffs, but I think most people would agree that if everything else were truly equal, this factor should at least be a tie-breaker. The Cardinals won their division by 1.5 games. Clearly, without Pujols, they would've gone home on October 1. The Phillies missed out on the wild card by 3 games.

I think both measures are flawed, but it's worth pointing out that both WARP and Win Shares (two measures that take both offense and defense into account) gave Pujols an edge of approximately 3 wins over Howard, which means that if you'd traded the two players, the Cards would've gone home and the Phillies would've advanced to the playoffs.
