A key part of my model is the concept that the selected players, along with the less player-specific team performance, determine the strength of the team as a whole. Here I’m going to discuss some ways to measure a player’s form to, hopefully, accurately predict their performance in an upcoming game.
Imagine for a moment each player’s value is described by a single score; I’ll use Supercoach points for this example. How do you best predict what the player’s score will be in their next game?
Speaking of Supercoach, the site provides projected scores for upcoming matches:
“Projected scores are calculated using the Herald Sun SuperCoach AFL’s proprietary formula, which takes into account several factors including the player’s most recent performances, their direct performances against the opponent and their past performance at the venue.”
This is fairly self-explanatory: it uses recent form (a rolling mean over the last n matches), form against the opposition team and form at the game’s venue. I understand the thinking behind the latter two measures but personally I would be hesitant to use them. Most players play each team only once per year, so the carryover form seems tenuous unless they happen to play on the same opponent player, against the same gameplan. Likewise, the venue form will only be useful for the home team.
The AFL Ratings system uses a long-term forecast to rate a player’s value:
“A player’s rating is determined by aggregating his points tally based on a rolling window of the previous two seasons. … only a player’s most recent 40 matches are used in the calculation of his rating. … A player’s most recent 30 matches are given greater weight in determining his rating. Matches 31 through 40 are progressively reduced in weighting”
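The quote doesn’t say how the weighting for matches 31 through 40 is reduced, so here is only a rough sketch of the idea, assuming a linear taper. The function names, the taper shape and the example numbers are my assumptions, not the official AFL Ratings formula.

```python
def taper_weights(window=40, full_weight=30):
    """Weight for each match in the window, most recent match first."""
    taper_len = window - full_weight
    weights = []
    for i in range(1, window + 1):  # i = 1 is the most recent match
        if i <= full_weight:
            weights.append(1.0)
        else:
            # ASSUMPTION: linear taper, e.g. 10/11, 9/11, ..., 1/11
            # for the default 40-match window with 30 full-weight matches.
            weights.append((window + 1 - i) / (taper_len + 1))
    return weights


def weighted_tally(points, window=40, full_weight=30):
    """Weighted sum of a player's points; `points` is ordered oldest to newest."""
    recent_first = list(reversed(points[-window:]))
    weights = taper_weights(window, full_weight)
    # zip stops at the shorter list, so players with < 40 matches still work.
    return sum(w * p for w, p in zip(weights, recent_first))
```

Under this taper the ten oldest matches in the window contribute the equivalent of five full matches, so a long run of steady output dominates the rating and a recent change in form only moves it slowly.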
This is a different kettle of fish. The trend over a long time is a better measure of career value than of current form. As I write this, Scott Pendlebury is the 5th highest rated player in the league. He’s not in good form though (**see below), so he’s unlikely to rank that highly in his next round.
How can you handle cases like Tom McDonald and James Sicily, who have been moved to the other end of the ground and whose output has changed (improved?) dramatically? I think there needs to be an element of both long-term and short-term form. Let’s look at some case studies using my terrible visualisations.
Here are Scott Pendlebury’s Supercoach scores for the last few years.
The first two plots have rolling means of the last 5 and 20 games respectively (**he’s clearly out of form on this measure). For the final plot I’ve separated his home and away games and plotted a rolling mean of his last 10 home/away games. Maybe his calf injury is hanging around and he’s not travelling well? Let’s check some others out.
Shannon Hurn is a nicer player to model; plenty of variation as you’d expect but a nice clean upward drift on the 20-game average; it’s probably a better predictor of future performance. Lately he’s been showing better form away than at home. Bodes well for the finals.
Finally, James Sicily has gone through that transformation from forward to back and has shown a dramatic improvement in this measure. His 20-game average is taking a while to catch up and it’s likely his 5-game average will be a better predictor.
With just a few examples it’s easy to see why there are benefits with a short-term and a long-term mean. There is possibly some benefit of considering home and away form separately too.
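For anyone wanting to reproduce lines like the ones in the plots, a minimal sketch of the three rolling means might look like this. The function names and the (score, is_home) input format are assumptions for illustration; the scores you feed in would be the player’s Supercoach scores in game order.

```python
from statistics import mean


def rolling_mean(scores, window):
    """Mean of the last `window` scores at each game; None until enough games."""
    out = []
    for i in range(len(scores)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(mean(scores[i + 1 - window : i + 1]))
    return out


def split_home_away(games):
    """Split a list of (score, is_home) pairs into home and away score lists."""
    home = [score for score, is_home in games if is_home]
    away = [score for score, is_home in games if not is_home]
    return home, away


def form_lines(games):
    """Short-term, long-term and home/away rolling means for one player."""
    scores = [score for score, _ in games]
    home, away = split_home_away(games)
    return {
        "short": rolling_mean(scores, 5),
        "long": rolling_mean(scores, 20),
        "home": rolling_mean(home, 10),
        "away": rolling_mean(away, 10),
    }
```

Note that the home and away lines are indexed by home game and away game respectively, not by round, so they need their own x-axis (or a lookup back to the round) when plotted alongside the other two.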
My model uses a weighted combination of these three means to predict the mean output of a player: their current form. Each of the seven (!) parameters I use to measure a player’s performance gets a mean of this form. The choice of 5- and 20-game means passes the eye test (for me, anyway) and gives good outcomes when simulating past seasons. Rookie players with fewer than 5 games just get a mean of all of their performances; players with at least 5 but fewer than 20 games get a 5-game mean and an n-game mean.
Now, what about the game-to-game variance from the mean(s)? The means will be good for giving a mean outcome, but the variance will be useful in determining the probability of outcomes. In the finance industry, estimating volatility is where the money is.
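To make the blending concrete, here is a minimal sketch of how such a form estimate could be put together, including the fallbacks for players without many games. The weights (0.4, 0.4, 0.2) are placeholder values for illustration only, and falling back to the long-term mean when there are no matching home/away games is likewise an assumption.

```python
from statistics import mean


def form_estimate(scores, venue_scores, weights=(0.4, 0.4, 0.2)):
    """Blend short-term, long-term and home/away means into one form value.

    scores: all of the player's games for one measure, oldest to newest.
    venue_scores: only the games matching the upcoming game's home/away status.
    weights: (short, long, venue). PLACEHOLDER values, not tuned ones.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("player has no games to average")
    if n < 5:
        return mean(scores)  # rookies: plain mean of everything so far

    short_term = mean(scores[-5:])
    long_term = mean(scores[-20:])  # an n-game mean when n < 20
    # ASSUMPTION: with no matching home/away games, reuse the long-term mean.
    venue = mean(venue_scores[-10:]) if venue_scores else long_term

    w_short, w_long, w_venue = weights
    total = w_short + w_long + w_venue  # normalise so weights needn't sum to 1
    return (w_short * short_term + w_long * long_term + w_venue * venue) / total
```

The same function would be called once per performance measure, so a player ends up with one form value for each of them.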
I need to do more investigation on the best way of handling this, and I might write a piece when it’s a little more refined. My current method is to take a sample standard deviation of the measure over the last 10 games. I had been using 5 games and this was giving me some strange outcomes. Having done some more simulations, 10 seems to work pretty well compared with larger and smaller samples.
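As a rough sketch, the 10-game sample standard deviation and one way of turning it into a probability could look like the following. Treating scores as normally distributed around the form mean is purely an assumption for illustration, and the function names are made up.

```python
from statistics import NormalDist, stdev


def recent_volatility(scores, window=10):
    """Sample standard deviation (n - 1 denominator) of the last `window` games."""
    recent = scores[-window:]
    if len(recent) < 2:
        return None  # a sample standard deviation needs at least two games
    return stdev(recent)


def prob_at_least(threshold, form_mean, volatility):
    """P(score >= threshold), ASSUMING scores are normal around the form mean."""
    if not volatility:
        # No spread (or not enough games): the outcome is treated as certain.
        return 1.0 if form_mean >= threshold else 0.0
    return 1.0 - NormalDist(mu=form_mean, sigma=volatility).cdf(threshold)
```

With only 5 games the n - 1 denominator is just 4, so a single unusual game swings the estimate a long way, which may be part of why the shorter window gave strange outcomes.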
Cheers for now!