A couple of weeks ago, I dissected the PECOTA projections for the Nats and mentioned why I thought some of them might be inaccurate. The first one I flagged as probably being a bit off was Bryce Harper’s, about which I said:
This would be an incredible stat line for any 20 year old… human. But since Harper is superhuman, I’m guessing there’s nothing in the code to deal with that, and his age is hurting his predicted performance
I didn’t go much beyond that, but Matthew Kory did. It’s worth reading the whole thing, but here are a few highlights as to why he thinks Harper will do better than the .259/.324/.442 PECOTA says he will. When trying to come up with an actual comparable player to Harper, he notes the biggest issue with a projection system that “projects player performance based on comparison with historical player-seasons”:
This illustrates the problem with projecting a player with Harper’s specific skill set at so young an age. Where projection systems can usually be very precise, with Harper they can’t; the data just doesn’t exist. Therefore projection systems can’t be as certain, and the range of possible outcomes is much greater than it normally would be.
And there is the real issue with Harper. There just aren’t enough comparable players to make a projection with any real confidence. Kory mentioned comparing Harper to ARod at the same age, which has its issues. Even with the entire universe of 19-year-old Major Leaguers with significant playing time, Kory notes:
Harper’s 2012 season was the third best season by a 19-year-old in the last 110 years. That may not be exactly unique, but it’s getting there.
It’s not easy building projections for players, and PECOTA really does a great job of giving us a very realistic view of what 90% or more of the league might do. But there are some cases where it just might not work, and even if Harper hits exactly what his projection says, I’m convinced that would be a coincidence, as he is too unusual at this age to be projected accurately.
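To put the small-sample point in more concrete terms, here is a rough Python sketch of a comparison-based estimate. This is not how PECOTA actually works; it is a toy illustration I made up. The function name `project_from_comps`, the `z` value, and every number in the two lists are hypothetical and are not real player data. It averages the follow-up seasons of a set of comparable players and puts a rough band around that average, assuming the values are roughly normally distributed.

```python
import math
import statistics


def project_from_comps(next_season_values, z=1.28):
    """Toy comp-based estimate: (average, low, high).

    next_season_values: follow-up season results (e.g. OPS) for the
    comparable players. z=1.28 gives a rough 80% band around the
    average under a normal approximation (an assumption, and a crude
    one when there are only a handful of comps).
    """
    n = len(next_season_values)
    if n < 2:
        raise ValueError("need at least two comparable seasons")
    mean = statistics.mean(next_season_values)
    spread = statistics.stdev(next_season_values)  # sample std deviation
    margin = z * spread / math.sqrt(n)  # band shrinks as comps are added
    return mean, mean - margin, mean + margin


# Hypothetical follow-up OPS values, invented purely for illustration.
many_comps = [0.700 + 0.010 * (i % 11) for i in range(44)]  # typical player
few_comps = [0.700, 0.740, 0.780, 0.800]                    # Harper-like case

for label, comps in (("many comps", many_comps), ("few comps", few_comps)):
    mean, low, high = project_from_comps(comps)
    print(f"{label}: n={len(comps)} estimate={mean:.3f} "
          f"band width={high - low:.3f}")
```

With dozens of comps the band around the average is narrow; with only four it is several times wider, which is the situation Kory describes for a 19-year-old with almost no historical peers.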