Saber-Terms: WAR Pt. 2 (Defense)
By Michael Jong
Last time on Saber-Terms, we began discussing the framework for WAR by looking at the offensive side of the ball for position players. Of course, with the advent of better defensive statistics, more and more people have come to recognize the value of defense and its importance in evaluating players. Look no further than last offseason, when numerous premium bats such as Adam Dunn, Manny Ramirez, and Bobby Abreu struggled to find work in a difficult economic climate because of growing concern about their defensive shortcomings. Examples can also be found this season, with teams signing premium defensive players to deals they would never have seen five or six years ago.
So how do we use defense in the measurement of WAR? Well, it’s fairly simple, and it is incorporated in a way very similar to how we counted offense.
In a previous Saber-Terms piece, I talked about how defense is measured nowadays and how far it has come from the old days of fielding percentage. In today’s world, defensive metrics are all the rage. Even though they are still imperfect, they are currently among the best tools we have for evaluating defense. Now, instead of just knowing or seeing that a player is good or bad at his position, we have a set of numbers that can back up those assertions.
How do they get input into WAR? Well, the problem is that many of these metrics are built on proprietary raw data that costs a lot of money to acquire, which makes them difficult for most of us to reproduce. Sure, we could all dig through and run the calculations to recreate much of TotalZone based on Retrosheet data released at the end of the season, but it won’t be easy and Rally already does it for us. Things may change when Colin Wyers rolls out more of his work on a defensive metric based on MLB Gameday data, which is easily available for anyone to use.
Nevertheless, this part of the calculation is fairly simple: since most of these metrics are already expressed in runs above or below average, we simply need to pick one and have that be our “defense” input. There are a few metrics of interest out there, the most available and widely known being FanGraphs’ bUZR (UZR using Baseball Info Solutions’ raw data). Of course, there are also other sources such as TotalZone (using Retrosheet) and Plus/Minus (using BIS data and video). Whatever methodology you find best (and they’re all fairly close) can be your choice. For reference, here are three Marlins at different points on the performance spectrum in 2009 (all numbers are bUZR from FanGraphs).
Brett Carroll: 14 runs above average
Hanley Ramirez: 0 runs above average
Dan Uggla: 10 runs below average
However, as I’ve said before, defensive metrics can be troublesome, especially in one-year samples. Not only are we dealing with sample sizes too small to support a sound statistical conclusion (a season of defensive data corresponds to something around one-third of a season’s worth of plate appearances), we’re also dealing with metrics that carry a good deal of measurement error. To address this, we can use scouting information to regress the data and perhaps get a better estimate of how a player performed. Here is where scouting can be critical to analyzing a player’s defensive performance.
Here are a few examples of how this can be done. Justin Inaz mentions in the defense article in his Player Value series how to use Tango’s Fans Scouting Report and convert it into runs. Steve Sommer did work in combining FSR data into UZR projections and came up with an excellent set of projections. And of course, over the last few days I’ve been doing something similar to Steve’s work by coming up with a list of comparable players and getting a weighted average of their UZR/150.
From there, it is a simple matter of how much you want to weight the numbers and the scouts. For a WAR-type calculation, I usually go 75/25 numbers/scouts, just as a personal preference. The choice here is yours. Based on that weighting system, the UZR data above, and my own comparison methodology with the FSR, here are the defensive contributions of those three players.
Brett Carroll: 11 runs above average
Hanley Ramirez: 0 runs above average
Dan Uggla: 8 runs below average
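The 75/25 weighting described above is just a weighted average, and a minimal Python sketch may make it concrete. The function name and the sample inputs are assumptions made up for illustration; the scouting-side run values behind the three Marlins numbers are not listed in this piece, so none are reproduced here.

```python
def blend_defense(metric_runs, scouting_runs, metric_weight=0.75):
    """Weighted average of a defensive metric and a scouting-based
    estimate, both expressed in runs above or below average."""
    if not 0.0 <= metric_weight <= 1.0:
        raise ValueError("metric_weight must be between 0 and 1")
    return metric_weight * metric_runs + (1.0 - metric_weight) * scouting_runs


# Hypothetical player (not from the article): the metric says +12 runs,
# a scouting-based estimate says +4 runs.
print(blend_defense(12.0, 4.0))  # 10.0
```

Changing `metric_weight` is how you would express a different preference between the numbers and the scouts.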
Now we have a number in mind for those Marlins players. However, each of those players plays a different position, and as we all know, not all positions are created equal. A +5 shortstop is much more difficult to find than a +5 right fielder because the pool of players who can play right field is much larger than the pool of players capable of playing shortstop. In other words, this inequality is a matter of positional scarcity.
So how can we adjust for that? Tom Tango and others have worked on positional adjustments using defensive metrics. The adjustments used by FanGraphs are the ones I will refer to here (Sean Smith also has slightly different adjustments based on his TotalZone metric).
C: +12.5 runs
1B: -12.5 runs
2B: +2.5 runs
3B: +2.5 runs
SS: +7.5 runs
Corner OF: -7.5 runs
CF: +2.5 runs
DH: -17.5 runs
All of those adjustments are rates per 162 games played at a position. Essentially, these adjustments even out the pools of players available at a position. As mentioned, the pool of +5 shortstops is much smaller than the pool of +5 corner outfielders. However, because the difference between the positions per 162 games is 15 runs, we can use that number to even the pools. Thus, the pool of players who are +5 shortstops (per 162 games) is about the same as the pool of players who are +20 corner outfielders. In other words, guys like Rafael Furcal (UZR/150 of 0.4 the last three seasons) are about as valuable and scarce as guys like Carl Crawford (UZR/150 of 13.1 the last three seasons).
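As a rough sketch of how those per-162 rates get applied, the Python below prorates the adjustment by games played and adds it to a player's fielding runs. The function names are hypothetical, and splitting "Corner OF" into `LF` and `RF` keys is an assumption made for convenience; the run values are the ones listed above.

```python
# Positional adjustments in runs per 162 games, as listed above.
POSITIONAL_ADJUSTMENT_PER_162 = {
    "C": 12.5, "1B": -12.5, "2B": 2.5, "3B": 2.5, "SS": 7.5,
    "LF": -7.5, "RF": -7.5, "CF": 2.5, "DH": -17.5,
}


def positional_runs(position, games):
    """Positional adjustment prorated to the games played at a position."""
    return POSITIONAL_ADJUSTMENT_PER_162[position] * games / 162.0


def defensive_value(fielding_runs, position, games):
    """Fielding runs relative to positional average plus the positional adjustment."""
    return fielding_runs + positional_runs(position, games)


# Over a full 162 games, a +5 shortstop and a +20 corner outfielder
# come out to the same total.
print(defensive_value(5.0, "SS", 162))   # 12.5
print(defensive_value(20.0, "RF", 162))  # 12.5
```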
Here are the three players we mentioned earlier and how the positional adjustments change their defensive value.
Brett Carroll: 9 runs above average
Hanley Ramirez: 6.5 runs above average
Dan Uggla: 6.5 runs below average
To match Hanley’s 2009 defensive value in the same number of defensive games, you would need to be around a +4 2B/3B/CF, a +13 corner outfielder, a +17 first baseman, or a -4 catcher. Even though Hanley was about dead average at his position, the fact that his position was shortstop in and of itself adds value.
I will run through some theories on catcher defense in another installment of this WAR series, but not in this one. Suffice it to say that catcher defense is difficult enough, and its quantification still far enough in its infancy, that we don’t have the space for it here. Look for it either Friday or next week.
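The equivalents quoted above can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and the games figure is an assumption back-solved from the numbers in this piece (a 0-run shortstop ending up at +6.5 implies roughly 140 games' worth of the +7.5 adjustment), not an official games total.

```python
# Per-162 positional adjustments for the positions compared above.
ADJUSTMENT_PER_162 = {"SS": 7.5, "2B": 2.5, "LF": -7.5, "1B": -12.5, "C": 12.5}


def required_fielding_runs(target_value, position, games):
    """Fielding runs needed at a position to reach a target defensive value."""
    return target_value - ADJUSTMENT_PER_162[position] * games / 162.0


# Assumption: playing time implied by +6.5 total on 0 fielding runs at SS.
implied_games = 6.5 / ADJUSTMENT_PER_162["SS"] * 162.0  # about 140

for position in ("2B", "LF", "1B", "C"):
    needed = required_fielding_runs(6.5, position, implied_games)
    print(position, round(needed, 1))
```

Rounded to whole runs, the results line up with the +4, +13, +17, and -4 figures in the text.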
Defensive Conclusion (sort of)
Here I’ve shown how you can account for the defense of position players in their value. With the estimates of defense available in runs above or below average, we can add these runs to the offensive totals and get an estimate of a player’s value above average. But knowing only a player’s value above average makes it difficult for us to evaluate the monetary value of players, particularly average ones. In one of the next Saber-Terms pieces, we’ll discuss replacement level once again and point out what that adjustment accounts for and why it’s a logical and important level to use.