On Tuesday, we looked at how to balance the recent past and more distant past when trying to project a goalie's save percentage. In brief, the result was that a shot faced x years ago should be given a weight of roughly (2/3)^x.
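As a quick illustration of that (2/3)^x rule, here is a minimal Python sketch. The function name `decay_weights` is made up for this example, and treating the most recent season as x = 0 is an assumption; only the relative sizes of the weights matter.

```python
def decay_weights(seasons, ratio=2 / 3):
    """Relative weights for data that is 0, 1, ..., seasons-1 years older
    than the most recent season (assumption: most recent season is x = 0)."""
    return [ratio ** x for x in range(seasons)]


weights = decay_weights(5)
rounded = [round(w, 2) for w in weights]
print(rounded)  # [1.0, 0.67, 0.44, 0.3, 0.2]
```

Read as percentages of the most recent season, that is roughly 100-67-44-30-20.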
Let's do the same thing as last time, but this time looking at forwards' scoring rates.
Projecting point totals
Just like before, I'm going to think back to what we knew in, say, 2010, and see which data would have been most helpful in predicting a player's future performance. I'll do the same for each forward in each off-season from 2005 to 2013 -- a total of 5104 projections.
And like before, I'll run the calculations several times, looking at what happens if I look back three, four, or five years and if I try to project just the next year or the next three. I'll normalize the player's performance each year to the league average, since scoring rates changed quite a bit over this period.
So for example, if I'm looking back at years 1-4 and trying to project the players' performance in years 5-7, the best-fitting weights, relative to a game played in year 4, are:
- Each game played in year 3 counts 30% as much as a game played in year 4
- Each game played in year 2 counts 15% as much as a game played in year 4
- Each game played in year 1 counts 7% as much as a game played in year 4
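To make the list above concrete, here is a minimal Python sketch of a games-weighted scoring rate. It assumes the weights are applied per game to both points and games played, which is one reading of "each game counts"; the exact fitting procedure (and any league-average normalization or regression to the mean) isn't shown here. The function name and the season lines are hypothetical.

```python
def weighted_rate(points, games, weights):
    """Weighted points per game; all lists are ordered oldest season first."""
    if not (len(points) == len(games) == len(weights)):
        raise ValueError("points, games, and weights must have the same length")
    weighted_points = sum(w * p for w, p in zip(weights, points))
    weighted_games = sum(w * g for w, g in zip(weights, games))
    if weighted_games == 0:
        raise ValueError("no games in the sample")
    return weighted_points / weighted_games


# Years 1-4, oldest first: the 7-15-30-100 weights from the list above.
WEIGHTS = [0.07, 0.15, 0.30, 1.00]

# Made-up season lines for a hypothetical forward (not real data).
points = [40, 50, 60, 70]
games = [80, 80, 80, 80]

projected_ppg = weighted_rate(points, games, WEIGHTS)
print(round(projected_ppg, 3))
```

For this improving player, the weighted rate lands between his unweighted four-year average and his most recent season, and much closer to the latter.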
I get something pretty similar if I use three years (100-30-20), or if I just try to project a single upcoming season (100-30-15-6). However, if I look back five years, things get a little weird-looking (100-20-15-10-20).
That's a good reminder that these results are a little bit squishy. It's far more likely that my tests aren't perfectly precise than that data from five years ago really matters more than data from three years ago.
In that first optimization, 100-30-15-7 came out as the best answer, but not by a huge margin. If I use 100-40-25-15 instead, my projections are only slightly worse; it's close enough that a handful of players' results can make the difference.
So take these answers as guidelines rather than strict rules -- but our results won't change much if they're off by a little. And in general, things fell into a narrow enough band to keep me happy. The basic answer seems to be that weights like 100-30-20 are a pretty reasonable approach for estimating points per game.
The pros and cons of using older data
This result is quite different from the answer we got for save percentage in the previous article, which was more like 100-60-50-30-20. Why wouldn't the older data matter as much for points as it does for save percentage?
Any stat is a mixture of skill and luck. (I'm using the statguy meanings of those terms: skill includes everything about a player's ability or usage that stays the same from day to day and luck includes everything that fluctuates unpredictably.) How far back we look depends on that balance.
The advantage of broadening the sample is that the luck portion washes out more. The more games you average together, the less likely it is that your sample covers some random hot streak or injury.
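To put a rough number on that, suppose (as a simplification, not something tested here) that the luck in each game is independent with the same standard deviation $\sigma$. Then the luck left over in an $n$-game average $\bar{X}_n$ shrinks with the square root of the sample size:

```latex
\mathrm{SD}\left(\bar{X}_n\right) = \frac{\sigma}{\sqrt{n}}
```

Under that assumption, going from one season of data to four cuts the random noise in half.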
The advantage of focusing on the most recent past is that the skill portion will be most like what the player will have going for him tomorrow. The most recent season is when he was closest to his current age (and therefore presumably ability), and also when his usage was most likely to resemble what it will be going forward.
The usage factor is particularly important when we're talking about points per game. Save percentage doesn't depend very much on usage, but points per game does.
Usage versus ability
If you wanted to predict how many points James van Riemsdyk would have this year, his scoring rate last year, when he played 19:12 per game on Toronto's top line, would be a lot more relevant than his rate when he was getting 13-14 minutes per game in Philadelphia.
Ice time can change sharply over the course of a couple of years, often much faster than the player's scoring ability. So the player's points per game from a few years ago don't necessarily help us much when we try to guess how he'll do in the coming year.
But we can separate ice time and scoring ability to some degree by looking at points per minute played instead of points per game. We can do the exact same analysis with this alternative measure, and we come up with a very different result.
For points per minute, the best fit in my weighting system is 100-65-50-30, which is very similar to what we previously came up with for save percentage.
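Here is a minimal Python sketch of the same kind of weighting applied to a per-minute rate. It assumes the weights apply to each minute played and reports the result as points per 60 minutes for readability; both choices, the function name, and the season lines are assumptions for illustration only.

```python
def weighted_points_per_60(points, minutes, weights):
    """Weighted points per 60 minutes; all lists are ordered oldest first."""
    if not (len(points) == len(minutes) == len(weights)):
        raise ValueError("points, minutes, and weights must have the same length")
    weighted_points = sum(w * p for w, p in zip(weights, points))
    weighted_minutes = sum(w * m for w, m in zip(weights, minutes))
    if weighted_minutes == 0:
        raise ValueError("no ice time in the sample")
    return 60.0 * weighted_points / weighted_minutes


# Oldest first: the 30-50-65-100 weights from the per-minute fit.
PER_MINUTE_WEIGHTS = [0.30, 0.50, 0.65, 1.00]

# Made-up totals for a hypothetical forward whose ice time grew (not real data).
points = [30, 35, 45, 60]
minutes = [1000, 1100, 1300, 1500]

rate = weighted_points_per_60(points, minutes, PER_MINUTE_WEIGHTS)
print(round(rate, 2))
```

Because the rate is per minute, a jump in ice time changes how much each season contributes to the sample but does not by itself inflate the estimate of scoring ability.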
For both shooters and goalies, the variability of shooting percentage makes it important to look at more than one year when trying to project going forward, but changes in skill make the most recent year the most important one.