Saber-Terms: UZR and Defensive Metrics


I’ll start off by saying that there are plenty of solid primers on defensive metrics already available, but I’ll give it a try myself just so that we can have one for this site. I think the readers of Marlin Maniac want an explanation of all this UZR stuff, and I plan on doing my best to give it to them. Here we go.

Start at Zone Rating

I was initially going to go through a history of defensive stats, but I’ll start my explanation with Zone Rating. The only thing I’ll say about the past is that it featured fielding percentage prominently, and, well, everyone knows that fielding percentage sucks. One third baseman could have great range to his left, get to a ball in the hole and boot it (think Ryan Zimmerman, for example), while another could have little range and just let that ball scoot into left field (think Jorge Cantu). That’s why fielding percentage sucks.

So a big part of defense is range, and for the longest time we did not have a real strong way to measure it. When John Dewan first came up with Zone Rating while working at STATS Inc., it seemed like a good approximation for range. Zone Rating broke down the field into multiple zones and assigned “zones of responsibility” to various positions. From there, it’s just a matter of dividing:

Zone Rating = balls in zones of responsibility turned into outs / total balls in zones of responsibility
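
As a rough sketch of that division in Python (the function name and the sample counts are made up for illustration; they are not actual STATS data):

```python
def zone_rating(outs_in_zone, balls_in_zone):
    """Share of balls hit into a fielder's zones of responsibility that became outs."""
    if balls_in_zone <= 0:
        raise ValueError("need at least one ball in the zones of responsibility")
    return outs_in_zone / balls_in_zone


# Hypothetical fielder: 340 outs on 400 balls hit into his zones.
print(f"{zone_rating(340, 400):.3f}")  # prints 0.850
```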

After the move to Baseball Info Solutions (Dewan’s own company) years later, Dewan updated Zone Rating, adding out-of-zone plays (plays made outside a fielder’s zones of responsibility) to the equation. A player’s Zone Rating could then be compared to the average at his position to find the number of plays he made above or below average in his playing time.
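
One simple way to picture that comparison, as a sketch only: count the outs a player actually made and subtract the outs an average fielder at his position would have made on the same number of chances. The function name, the .820 position average, and the counts below are all hypothetical, and this version ignores out-of-zone plays.

```python
def plays_above_average(outs_in_zone, balls_in_zone, position_avg_zr):
    """Actual outs minus the outs an average fielder would make on the same chances."""
    expected_outs = position_avg_zr * balls_in_zone
    return outs_in_zone - expected_outs


# Hypothetical fielder: 340 outs on 400 chances, against an assumed .820 position average.
print(round(plays_above_average(340, 400, 0.820), 1))  # prints 12.0
```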

The data gets granular

Zone Rating is a fine enough measure of range, but as we all know, not all balls hit to an area are made equal. With STATS and BIS recording more and more granular data, the ability to get more information on a ball in play allowed for the advancement of even more intricate metrics for defense. The one I’ll describe here is UZR, because I am most familiar with the methodology, but Dewan’s plus/minus system, Sean Smith’s TotalZone, and other defensive metrics are similarly built. The differences are largely between data sources and the way buckets are designed, but the principles are basically the same.

(If you’re interested, MGL himself explains the UZR methodology at its early stages here and here.)

Zone Rating generally discusses the field in terms of zones and uses that to split up the “buckets” for balls in play. UZR and other systems use more granular data to further separate the field. The field is split into zones, and each of those zones receives buckets based on the type of balls in play (grounders only for infielders, fly balls and line drives for outfielders). For each bucket of data, an average number of hits and plays made by each position is measured. From there, a run value for a hit into that bucket is assigned.

For each individual player, the same measurements are made for every bucket in which a player at that position recorded an out that season. Here’s where it gets interesting. The individual player only gets credit for the “extra” hits and plays made at the position, or the hits and plays made above/below the average. To illustrate, I’ll give a fictitious example based on the one MGL provides in the linked articles.

Let’s say that in zone 56 (the hole between shortstop and third base), 45% of balls in play become hits, and the average hit there is worth 0.48 runs. Let’s say that of the remaining balls in play that turn into outs (55%), 20% are converted by the shortstop on average, while 80% are converted by the third baseman.

Assume that in a given season, Hanley Ramirez recorded 20 outs in zone 56, and 80 hits went through zone 56 while he played shortstop. At the same time, assume Jorge Cantu recorded 65 outs in zone 56, and 80 hits passed through while he was at third.

How well did Ramirez do? Ramirez recorded 20 outs, but he only receives credit for the outs he made above average. The average ball in play is worth 0.55 outs in this zone, so each of his 20 outs is worth 1 - 0.55 = 0.45 outs above average, for a total of 9 outs (or plays) above average. Eighty (80) hits were allowed while Ramirez was on watch, but shortstops like him are only responsible for 20% of the outs, so Ramirez is only responsible for 20% of those hits, or 16 hits. Each ball in play is already worth 0.45 hits, so each of those 16 hits allowed is worth 0.55 hits above average, for a total of 8.8 hits (or balls not caught) above average.

Ramirez is thus responsible for catching 9 extra outs and allowing 8.8 extra hits, for a total of 0.2 plays above average. The run value of a play made in this case is the difference between the run value of a hit and that of an out. An out is worth -0.28 runs, while a hit is worth 0.48 runs, so each play above average is worth 0.48 - (-0.28) = 0.76 runs. Thus, Ramirez’s 0.2 plays above average come to about 0.15 runs above average.

Cantu recorded 65 outs, each worth 0.45 outs above average, for a total of 29.25 outs above average. He also had 80 hits get by him, of which 64 (80%) were his responsibility. Those 64 hits were worth 35.2 hits above average, putting Cantu at 5.95, or about six, plays below average. At 0.76 runs per play, that is worth about -4.5 runs compared to average.
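
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the bucket math in the fictitious example. The class and function names are my own invention for illustration, not anything from MGL’s actual UZR code, and it covers only this one made-up bucket.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    hit_rate: float       # share of balls in play in this bucket that become hits
    out_share: float      # share of the bucket's outs made by this position
    hit_run_value: float  # run value of a hit in this bucket
    out_run_value: float  # run value of an out


def plays_above_average(bucket, outs_made, hits_allowed):
    # Each out is 1 out minus the expected outs per ball (1 - hit_rate),
    # which leaves hit_rate of credit per out.
    credit = outs_made * bucket.hit_rate
    # Only this position's share of the hits is charged, and each charged hit
    # is 1 hit minus the expected hits per ball (hit_rate).
    charged_hits = hits_allowed * bucket.out_share
    debit = charged_hits * (1 - bucket.hit_rate)
    return credit - debit


def runs_above_average(bucket, outs_made, hits_allowed):
    # Turning a hit into an out swings the run value by (hit value - out value).
    swing = bucket.hit_run_value - bucket.out_run_value
    return plays_above_average(bucket, outs_made, hits_allowed) * swing


# Zone 56 from the example: 45% hits, a hit worth 0.48 runs, an out worth -0.28.
shortstop = Bucket(hit_rate=0.45, out_share=0.20, hit_run_value=0.48, out_run_value=-0.28)
third_base = Bucket(hit_rate=0.45, out_share=0.80, hit_run_value=0.48, out_run_value=-0.28)

for name, bucket, outs, hits in [("Ramirez", shortstop, 20, 80), ("Cantu", third_base, 65, 80)]:
    plays = plays_above_average(bucket, outs, hits)
    runs = runs_above_average(bucket, outs, hits)
    print(f"{name}: {plays:+.2f} plays, {runs:+.2f} runs")
# Ramirez: +0.20 plays, +0.15 runs
# Cantu: -5.95 plays, -4.52 runs
```

Cantu’s figure comes out at -4.52 rather than an even -4.5 because the code carries the unrounded 5.95 plays through to the run value.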

The example above shows the methodology used for UZR and most other metrics of similar build. Keep in mind a few notes:

– The method above only shows range. Things such as double plays, outfield arm, and catcher defense are much simpler calculations based on plays converted versus the league average.

– The “outs” mentioned above include errors as outs. Errors are then calculated separately based on a league average error rate.

– There are other things to keep in mind in terms of adjustments, particularly adjustments made for runners being held and some park effects such as left field in Fenway Park.


The methodology shown here does an excellent job of estimating range compared to the average. However, it does have its limitations. In particular, in Zone Rating, the data was so broad-ranging (zones) that it did not accurately capture the types of balls being defended. Not all balls hit to a certain area are created equal. UZR and other systems like these still have similar issues. A fly ball recorded to deep center may be two different things in two different situations; one may be a lazy shot at the bottom edge of the zone in center field, while another may be a hard-hit shot that has to be caught at the wall. This could easily be the case because there are a lot of variables in balls in play that cannot always be accounted for by the data given. There is also still a problem with scoring bias. Not only are there issues with scoring errors, but fly balls and line drives still have odd distinctions that can vary based on individual scorers and even press box heights.

Beyond that, defensive metrics need larger samples to measure performance reliably. A fielder sees far fewer balls in play than a hitter sees plate appearances, so single-season numbers for any defender should be regressed and taken with a grain of salt. In general, it is said that three years of defensive data are needed to approach the same sampling as one season of offensive data.

The Takeaway

Despite these limitations, the methodology of UZR and other defensive metrics is sound. Technology is still limited, but with Hit and Field F/X coming soon to stadiums around baseball, the hope is that defensive measurement will become even more of a definite science as we move forward. Right now, it is still difficult to remove the subjective views (fly balls vs. line drives) and problems with determining exact locations, but the technology set to come around in a few seasons should help make it a reality. One day, I am confident that defense will be just as well quantified as offense. For now, these systems will have to do. But make no mistake, they are worth the work that has been put in and worth your time in considering their results.