It's the summer of 2011. You're looking for scoring, and 31-year-old Brad Richards is on the market. He's averaged better than a point per game over the last few years, so you're interested. How much should you be willing to pay on a nine-year deal?
You can't just look to the past to answer that question. You need to guess how his skills will change over the life of the deal. He might easily be worth $7M for the coming season, but what should we expect from him at age 33, or 36, or 39?
Knowing how players age is an important part of making projections. This sounds pretty simple to study -- you'd think you could just look at how the average 19-year-old does, then the average 20-year-old, and so on up through age 35 or 40. But it's trickier than that; not everyone's career lasts 20 years, which complicates things.
Let's outline why that goes wrong and how to improve on it.
A winnowing population
Look at the list of players who were regulars at age 35+ last year and it's hard to deny that there's more talent on that list than on a randomly selected group of NHL regulars. Not all of them are great players, and at their age, most of those players aren't anywhere near as good as they were in their prime, but as a group their prime was much stronger than the average player's.
If we used that simple averaging approach, we'd look at this list of players and conclude that the average player will score ~45-50 points per 82 games in his late 30s. That's clearly not right, though. Since we're looking at an above-average group of talent, we're setting the bar too high for the average player.
I've put together an illustration of how this goes awry. Let's imagine that deep down, people's inherent, natural development progression looks something like this:
Some players have more talent than others, so their curve starts, peaks, and finishes higher, but (in our simplified world) they all have this same shape to their progression. So we might make a new chart that shows the variety of skill levels of our players:
But hold on. The weaker players represented by the brown curve don't have 20-year careers like the top-end guys do. So what we actually observe looks more like this:
Now let's think about what happens if we follow the method described above for estimating an aging curve. I've added a gray line to this plot that represents the average performance of all players who play at any given age, and we can see that it's quite different from their actual individual development curves:
Because the weaker players drop out of our population as they age (and don't make the NHL as early), our calculated curve is much flatter than it should be. We need an approach that won't be quite so affected by the winnowing of the population.
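Here's a rough sketch of that flattening in Python. Everything in it is made up for illustration: the shape of the curve, the five talent levels, and the 35-point cutoff for holding an NHL job are all assumptions, not measurements.

```python
# Toy model of the winnowing population. All numbers are hypothetical.
AGES = range(19, 41)
TALENT_OFFSETS = [-20, -10, 0, 10, 20]  # five talent tiers, in points
NHL_CUTOFF = 35  # assumed minimum production needed to stay in the league


def true_curve(age, offset=0):
    """Assumed development curve: same shape for everyone, peak at age 26."""
    return 60 + offset - 0.25 * (age - 26) ** 2


def naive_average_by_age():
    """Average only the players who are good enough to be in the league."""
    averages = {}
    for age in AGES:
        in_league = [
            true_curve(age, offset)
            for offset in TALENT_OFFSETS
            if true_curve(age, offset) >= NHL_CUTOFF
        ]
        if in_league:
            averages[age] = sum(in_league) / len(in_league)
    return averages


naive = naive_average_by_age()
for age in (26, 31, 36):
    print(age, true_curve(age), naive[age])
```

In this toy world the middle-tier player falls from 60 points at age 26 to 35 points at age 36, but the naive average only falls from 60 to 45, because the two weakest tiers are no longer in the sample by then.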
A big improvement is to look at pairs of years. Instead of asking "how good were these guys at age 35", we'll ask "how much better or worse were they at age 35 than at age 34?"
We do the same thing for 34 vs 33, for 33 vs 32, and so on until we know what the average change is at every age. Since the players in any given year are only being compared to how they performed the year before, we fix a lot of the problems of the changing population and can put together a reasonable aging curve.
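As a minimal sketch, the paired-years calculation might look like this. The three players and their point totals are invented, and the plain unweighted average is my simplifying assumption; a real study would likely weight by games played.

```python
from collections import defaultdict

# Hypothetical data: {player: {age: points per 82 games}}
seasons = {
    "Player A": {33: 70, 34: 66, 35: 60},
    "Player B": {33: 40, 34: 37},
    "Player C": {34: 50, 35: 45},
}


def average_changes(seasons):
    """Mean change from age-1 to age, using only players who played both."""
    deltas = defaultdict(list)
    for by_age in seasons.values():
        for age, points in by_age.items():
            if age - 1 in by_age:
                deltas[age].append(points - by_age[age - 1])
    return {age: sum(d) / len(d) for age, d in sorted(deltas.items())}


def chain(changes, start_age, start_value=0.0):
    """Add up the year-over-year changes into a curve relative to start_age."""
    curve = {start_age: start_value}
    age = start_age + 1
    while age in changes:
        curve[age] = curve[age - 1] + changes[age]
        age += 1
    return curve


changes = average_changes(seasons)
aging_curve = chain(changes, start_age=33)
print(changes)
print(aging_curve)
```

Notice that Player B never shows up in the 34-to-35 comparison, since he has no age-35 season to compare against.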
But as Tom Tango pointed out, our curve still isn't quite right. This approach includes everyone who plays at both ages 34 and 35 but excludes the people whose career ends after age 34. The problem is that luck affects who makes it to age 35 -- a guy who runs hot at age 34 probably gets another year and a guy who runs cold probably doesn't.
Let's do a thought experiment. Imagine we had three 34-year-olds who all had the same talent; we'll call them 17-goal scorers ("talent" doesn't have to mean scoring goals, but that's the metric we'll use for this example). Let's suppose that as omniscient beings setting up this hypothetical, we know that they'll decline to being 15-goal talents at age 35.
But while they each have the same amount of talent, they had different amounts of luck at age 34 -- one ran hot and scored 21 goals, one had 17 goals, and one ran cold and scored 13 goals. They're the blue, green, and red dots respectively on this plot:
That third guy, the one who had bad luck, can't find a job coming off his 13-goal season and decides to retire. The other two -- the ones who had 21 and 17 goals -- do get another year in the NHL. Their luck evens out and they score (on average) exactly what their talent would suggest, 15 goals apiece.
But now the non-omniscient analyst who's stuck using NHL data to estimate an aging curve gets it wrong. He sees that those two went from an average of 19 goals to an average of 15 goals, an apparent four-goal decline. We know their actual talent really only dropped from 17 to 15, but he doesn't get to see that because there's still a survivorship bias in this approach.
Looking back at our plot, his estimate of how players change when they turn 35 comes from averaging together the blue and green lines, and he never gets to factor in the red line. So there's still room for improvement if we can account for this somehow.
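To make the arithmetic concrete, here is the thought experiment in a few lines of Python. The goal totals are the ones from the example above; the labels and variable names are just for illustration.

```python
# Goal totals from the thought experiment above.
TRUE_TALENT_34 = 17
TRUE_TALENT_35 = 15

observed_34 = {"blue": 21, "green": 17, "red": 13}


def mean_change(players):
    """Average age-34-to-35 change, assuming each player scores 15 at age 35."""
    changes = [TRUE_TALENT_35 - observed_34[name] for name in players]
    return sum(changes) / len(changes)


true_change = TRUE_TALENT_35 - TRUE_TALENT_34       # what really happened
survivor_change = mean_change(["blue", "green"])    # what the analyst sees
full_change = mean_change(["blue", "green", "red"])  # if red had played too

print(true_change, survivor_change, full_change)
```

The survivors alone show a four-goal drop. If the red player had come back and scored his expected 15, the three-player average would have matched the true two-goal drop.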
Accounting for luck
The assessment above was wrong because we didn't include the red player in our analysis. Since he didn't play both years, that's unavoidable -- we'll just never have a direct measure of how he would've performed at age 35. So instead, if we're going to get this right, we'll have to adjust the other curves somehow to try to account for variance.
It's impossible to know exactly how much talent and luck went into any specific player's performance, but that doesn't mean we should ignore variance completely. As we've discussed before, the more extreme a player's performance is, the more likely it is that luck played a significant role.
It turns out that our best estimate of a player's talent isn't his observed performance, but something a little bit closer to average. How much closer depends on how much luck goes into whatever performance metric we're looking at -- the larger the year-to-year swings, the more skeptical we'll be of someone whose results in any given year are far from the average and the more we'll reel in our estimates of their talent.
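A minimal sketch of that reeling-in, assuming a simple linear form: pull the observed number toward the league average, keeping only the share of the gap that we attribute to talent. The function name, the `talent_share` parameter, and the 0.5 value are all hypothetical choices for illustration, not estimates from real data.

```python
def regress_to_mean(observed, league_average, talent_share):
    """Estimate talent by keeping only part of the gap from average.

    talent_share is the assumed fraction of the spread in results that
    reflects real talent (0 = all luck, 1 = no luck at all).
    """
    return league_average + talent_share * (observed - league_average)


# Hypothetical: a 17-goal average and results that are half talent, half luck.
hot_estimate = regress_to_mean(21, league_average=17, talent_share=0.5)
cold_estimate = regress_to_mean(13, league_average=17, talent_share=0.5)

# In the thought experiment above, everyone had identical talent, so the
# spread was all luck; a share of 0 puts the hot player right back at 17.
all_luck_estimate = regress_to_mean(21, league_average=17, talent_share=0.0)

print(hot_estimate, cold_estimate, all_luck_estimate)
```

With these made-up inputs, the 21-goal season becomes a 19-goal talent estimate and the 13-goal season becomes 15. The noisier the metric, the smaller the share we keep and the closer every estimate sits to average.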
If we have the right estimate for how much luck goes into our measurement, this approach will account for it and give us a much more accurate estimate of how the average player ages. It can make a big difference!
Of course, that depends on having the right estimate for how much luck goes into our measurement. It turns out that survivor bias complicates that assessment too. We'll explore that further in our next article.