Why does PECOTA have the Marlins at 76 wins?

Baseball Prospectus recently released their first shot at Depth Charts for the 2010 season, which come complete with full team projections based on the projected performances of individual players. The release caused a lot of hubbub across the Internets because it projected the New York Yankees to finish 3rd in the AL East, with the Tampa Bay Rays winning it. But of course, that is not what I was interested in. I immediately skimmed down the page to find the NL East rankings, only to be significantly disappointed to see our Florida Marlins ranked 5th in the NL East, behind the New York Mets and the Washington Nationals, who were projected to finish above .500.

Now, I’m not going to be “that guy.” You know, the homer team blogger who complains about projections and gets mad because his team came off so poorly. And to be honest, these are projections that come from the weighted mean average; in other words, BP has not finished tinkering with the playing time yet. BP author and friend of the Maniac Colin Wyers said so himself. And, as Wyers points out in another comment on BP:

Even if we are correctly predicting everything – everybody’s raw hitting and pitching stats, everyone’s playing time, etc. – correctly, there is still some random variation.

How much? I mean, a lot. The standard deviation of win percentage over 162 games simply due to random variation (or “luck,” if you prefer) is a little over six games. Events within one standard deviation occur 68% of the time.
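For the curious, here is a quick sanity check on that "a little over six games" figure. This is only a sketch, and it assumes the simplest possible model: every game is an independent coin flip weighted by the team's true-talent winning percentage (a binomial model). The function name win_sd is mine, not anything from BP.

```python
import math

def win_sd(true_win_pct: float, games: int = 162) -> float:
    """Standard deviation of season wins under a binomial model
    (assumption: each game is an independent weighted coin flip)."""
    return math.sqrt(games * true_win_pct * (1.0 - true_win_pct))

# A true .500 team, and a true 76-86 team like the projected Marlins.
sd_500 = win_sd(0.500)
sd_marlins = win_sd(76 / 162)

print(round(sd_500, 2))      # about 6.36 wins
print(round(sd_marlins, 2))  # about 6.35 wins
```

Under that assumption, roughly two-thirds of the seasons a true 76-win team plays would land somewhere between about 70 and 82 wins.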

There’s something to recall next time your team finishes 83-79 and you call it “disappointing” because you thought they’d win 88 games (ah, 2004 and 2005, how I miss you!). Let’s face it: these are projections, and it is hard to say anything concrete from them at all.

So there’s a lot that goes into why these projections are where they currently are, with the Marlins going 76-86. Let’s see if we can’t come up with why.

1. Playing time is well off.

Here is one of the most likely reasons for the issue. As Peter over at Capitol Avenue Club so succinctly puts it in his sabermetric PSA on projections, these are 50% mean projections, and they are based on playing time estimates built by algorithms that are likely worse than you or me at determining playing time. Both of those problems can lead to inaccurate win totals.

To compare the findings of PECOTA with those of other projections, I looked into the numbers provided by Sean “Rally” Smith’s own system, CHONE. I looked up the projections on FanGraphs for each player expected to be on the starting roster and totaled the expected WAR. For pitchers, I converted FIP into WAR with Pythagenpat. I used FanGraphs’ position adjustments and replacement level. After adjusting a small amount for certain circumstances (mainly that Ronny Paulino and John Baker had too many combined PA), I got a total of 32.9 WAR, unadjusted for additional playing time. That number would be expected to give us about 81 wins if it were to hold up (which again, is absolutely not guaranteed). That is not quite as down on the team as PECOTA, but it is about what I expected.
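If you want to see how a WAR total turns into a win total, the arithmetic is just a replacement-level baseline plus the WAR. A rough sketch follows. The 48-win baseline is my assumption, picked because it is the value that makes 32.9 WAR land at about 81 wins; the exact replacement level used above is not spelled out here.

```python
# Assumption: a team of nothing but replacement-level players wins
# about 48 games over 162. This value is backed out of the numbers
# in the post (81 wins - 32.9 WAR), not taken from FanGraphs directly.
REPLACEMENT_WINS = 48.0

def wins_from_war(total_war: float, replacement_wins: float = REPLACEMENT_WINS) -> float:
    """Projected team wins = replacement-level baseline + total team WAR."""
    return replacement_wins + total_war

unadjusted_wins = wins_from_war(32.9)
print(round(unadjusted_wins))  # 81
```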

What happens when you correct for playing time? The playing time projections added up to about 1052 PA short of the NL average PA from last season and 312 innings pitched short of the league’s IP total. To be fair and allow for appropriate injury risk, I doled out half of the missing playing time to the expected position player starters and half to replacement level players. I did a similar thing with starters and relievers, in order to get them in line with a league average ratio between starter and relief innings. I won’t bore you with the process, but the result was a bump of almost six wins, giving a projection of 87 wins. Using CHONE’s projections for offensive and defensive runs above average and the FIP projections for pitching runs, I came up with a .532 win percentage from Pythagenpat, corresponding to a projected 86 wins.
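For anyone who has not run into Pythagenpat before, here is a minimal sketch of the formula as it is usually written: the Pythagorean exponent floats with the run environment instead of being fixed at 2. The 0.287 constant is the commonly cited value and is an assumption on my part about which one applies here. The 750 and 700 run totals are made up for illustration and are not the Marlins’ projected runs.

```python
def pythagenpat_win_pct(runs_scored: float, runs_allowed: float,
                        games: int = 162, k: float = 0.287) -> float:
    """Pythagenpat expected win percentage.
    The exponent is (runs per game, both teams) ** k, where k = 0.287
    is the commonly cited constant (an assumption, not from the post)."""
    exponent = ((runs_scored + runs_allowed) / games) ** k
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def wins_from_pct(win_pct: float, games: int = 162) -> float:
    """Convert a winning percentage into wins over a full season."""
    return win_pct * games

# Hypothetical team: 750 runs scored, 700 allowed (illustrative only).
example_pct = pythagenpat_win_pct(750, 700)

# The .532 figure from the post, turned into wins.
print(round(wins_from_pct(0.532), 1))  # 86.2
```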

What does this mean? Well, by filling in the playing time (admittedly fudging some of it a bit, but within reason), I got a win total that was around five to six more than the projection with unadjusted playing time. In other words, a more thorough look at the playing time for PECOTA could yield a total of around 82 wins, which is right around where the team was last year in terms of team WAR and where I would expect to see the team this year.
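The playing time fill-in is simple enough to sketch. The idea is to take the missing plate appearances, give half to the starters at their own projected WAR-per-PA rate, and give the other half to replacement level players, who by definition add zero WAR. This is a simplified version of the idea, not the exact spreadsheet work. The 20.0 WAR and 5000 PA inputs are made-up placeholders; only the 1052 PA shortfall comes from the numbers above.

```python
def top_up_war(projected_war: float, projected_pa: float,
               shortfall_pa: float, starter_share: float = 0.5) -> float:
    """Fill a playing-time shortfall.
    starter_share of the missing PA goes to the starters at their own
    WAR/PA rate; the rest goes to replacement players (0 WAR by definition)."""
    war_per_pa = projected_war / projected_pa
    extra_war = shortfall_pa * starter_share * war_per_pa
    return projected_war + extra_war

# Hypothetical starters: 20.0 WAR in 5000 projected PA (placeholders).
# Shortfall of 1052 PA is the figure quoted in the post.
adjusted = top_up_war(projected_war=20.0, projected_pa=5000, shortfall_pa=1052)
print(round(adjusted, 2))  # 22.1
```

The same approach would apply to the 312 missing innings, with the extra step of splitting them between starters and relievers at a league average ratio.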

2. Competition

I do not know exactly how PECOTA arrives at the win totals, though I presume they follow a method of either adding up individual player WARP (their version of Wins Above Replacement, with a tacked-on Player at the end of the name) or coming up with a composite runs scored and allowed and running it through Pythagenpat. If they go by the team WARP method, undoubtedly they will add up their second-order WARP, or WARP2, for the projected win total. WARP2 accounts for one thing of interest to these sorts of projections that would not be accounted for in WAR: competition, or strength of schedule.

Take a look at the projected standings again. The Marlins sit in what is likely the toughest division in the National League, though the NL West may have something to say about that. Assuming those divisions are even in competition, we can look to the Marlins’ out-of-division schedule for any further answers. Outside of the division, the Marlins play six games versus the Rays, an abnormal nine games versus the San Diego Padres, three versus the Baltimore Orioles, three versus the Chicago White Sox, and more or less six versus everyone else in the NL. Based on that, I don’t see anything that would give us an unusually difficult schedule and bump our winning percentage down.

Who cares?

Good question. As I said, these projections are:

1) projections, which means they come with their share of error and inaccuracy (such is the nature of these sorts of things)

2) not endowed with good estimates of playing time.

So there is no good reason to get up in arms about anything. And given the fact that the standard deviation for a “true-talent” 76-win team is six or so wins just from random variation, this sort of early-season number is hardly something worth reading into. Still, it’s Friday, it’s something to talk about, and it’s something to be happy about when we once again surpass those values.

