Jonathan Bales is the author of Fantasy Football for Smart People: How to Dominate Your Draft. He also writes for the New York Times and the Dallas Cowboys.
The following is a section from a chapter of my book Fantasy Football for Smart People. In Identifying Value: Regression, Randomness, and Running Backs, I discuss how to turn differences between perceived value and actual value into a competitive advantage.
Identifying Value: Regression, Randomness, and Running Backs
Back in 2008, I had running back Thomas Jones ranked well ahead of most owners. Jones was playing for the Jets and coming off a season in which he ran for 1,119 yards, but averaged just 3.6 yards per rush and scored only two total touchdowns. Those two scores represented just 0.59 percent of Jones’ 338 touches in 2007.
ESPN had Jones ranked 21st among all running backs. I had him 10th. Why would I possibly rank a then-30-year-old running back coming off a season in which he tallied 3.6 yards per carry and two total touchdowns in my top 10? Regression toward the mean.
Regression toward the mean is a phenomenon wherein “extreme” results tend to end up closer to the average on subsequent measurements. That is, a running back who garners 338 touches and scores only twice is far more likely to improve upon that performance than one who scored 25 touchdowns.
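The selection effect behind that claim can be made concrete with a toy simulation. The numbers below are illustrative assumptions on my part, not figures from the book: every back gets the same true 3 percent chance of scoring on any touch, so all season-to-season differences are pure luck — and both the “busts” and the “monsters” of year one drift back toward the same mean in year two.

```python
import random

random.seed(42)

# Illustrative assumptions: 1,000 backs, each with the SAME true 3%
# chance of scoring on any given touch, over 338 touches per season.
TOUCHES = 338
TD_RATE = 0.03
N_BACKS = 1000

def season():
    """Touchdowns scored across TOUCHES carries and catches."""
    return sum(random.random() < TD_RATE for _ in range(TOUCHES))

year1 = [season() for _ in range(N_BACKS)]
year2 = [season() for _ in range(N_BACKS)]

# Follow-up seasons for last year's "busts" and "monsters":
low = [y2 for y1, y2 in zip(year1, year2) if y1 <= 6]
high = [y2 for y1, y2 in zip(year1, year2) if y1 >= 15]
print(round(sum(low) / len(low), 1))    # both groups land near the
print(round(sum(high) / len(high), 1))  # true mean of ~10 touchdowns
```

The low scorers improve and the high scorers decline not because anyone got better or worse, but because extreme seasons were partly luck to begin with.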
Regression toward the mean is the reason NFL coaches who take over the worst teams are in a far better position than those who take over quality squads. If I were an NFL coach, there is no team I would rather have taken over than the Detroit Lions heading into 2009. Coming off an 0-16 season, the Lions were almost assured of improvement simply because everything that could go wrong the previous season did. Detroit was a bad team, but any coach who took over in 2009 was all but guaranteed to oversee improvement in the following years.
This same sort of logic is the reason there are so many first-round “busts” in fantasy football. Players almost always get selected in the first round because they had monster years the prior season. In effect, most first-rounders are the “outliers” of the previous season’s data, and their play is more likely to regress than improve in the current year. It isn’t that these players are poor picks; rather, the combination of quality play, health, and other random factors that led to their prior success is unlikely to break so fortunately again.
Walk into any casino in America and you will see hopeful grandmothers lining up behind slot machines that haven’t paid out recently. Since the machines pay out a specific percentage of money over the course of their lives and those numbers even out over the long run, surely an under-performing slot machine must be due to pay out soon, right?
This is one of the biggest misconceptions regarding statistics and regression, and it is the cause of millions of lost dollars each year. In a set of random data, previous occurrences have absolutely no effect on future events. If you flip a coin right now and it lands on heads, the chance it lands on heads again on your next flip is still 50 percent.
Similarly, if the overall payout rate of a slot machine is 40 percent, the most likely outcome of placing $1,000 into it is walking away with $400. You could walk away a big winner or you could (theoretically) lose every penny, but the most probable single dollar amount you could “win” is $400. So when the previous 100 pulls of the lever are fruitless, the payout “improvement” that is likely to take place over the next 100 pulls isn’t because the machine is “due,” but rather because it is simply working as normal. This is regression toward the mean.
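Both claims above — the $400 expectation and the irrelevance of past results — can be checked in a few lines. This is just a sketch of the arithmetic, using a simulated fair coin to stand in for any independent random process:

```python
import random

random.seed(7)

# Expected return from feeding $1,000 into a 40%-payout machine:
print(0.40 * 1000)  # 400.0

# Independence: a streak changes nothing. Look at the flip that
# follows every "heads" across a million simulated coin flips.
flips = [random.random() < 0.5 for _ in range(1_000_000)]
heads_after_heads = [b for a, b in zip(flips, flips[1:]) if a]
print(round(sum(heads_after_heads) / len(heads_after_heads), 2))  # ~0.5
```

The flip after a heads still comes up heads about half the time — the machine, like the coin, has no memory.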
Football isn’t totally random, but it’s more random than you think. In fact, some statisticians have estimated the “luck factor” to be as high as .924 in the NFL. That means on any given week, the “true” winning percentage of the teams that win is really only around .538. In a league in which only 16 games make up a season, the talent gap between teams is shrinking, and turnovers play a huge role in wins, more luck is involved than in any other major professional sport.
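One plausible reading of how the .538 figure follows from the .924 estimate — my own interpretation, not spelled out in the text — is that if 92.4 percent of a game’s outcome is luck (a coin flip), only the remaining 7.6 percent reflects skill, and the winning team claims that skill edge on top of the flip:

```python
# Assumed interpretation: observed outcome = luck-share coin flip
# plus the remaining skill share credited to the winner.
luck = 0.924
true_win_pct = 0.5 + (1 - luck) / 2
print(round(true_win_pct, 3))  # 0.538
```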
Even disregarding the potential randomness of NFL outcomes, the identification of under-performing players can be of incredible value to fantasy owners. As it relates to Thomas Jones, it doesn’t really matter how much randomness was involved in his two-touchdown season. Jones headed into the 2008 season as the workhorse back on a team with a strong offensive line, and there was no real reason to think he was a fundamentally poor short-yardage runner, so projecting him to score more than a handful of times was easy. I projected him at 10 touchdowns. He scored 15.
So when other owners are jumping all over the players who had “extreme” seasons the prior year, look for talented players who actually underperformed. As long as they get similar opportunities to make plays, their numbers will probably improve. For fantasy owners, this represents value.
Of course this doesn’t mean you should select weaker players simply because they had poor years. In the first few rounds, you are almost certain to draft outliers who played better than normal the season before. Your job is to recognize which players’ value is primarily the result of random factors, and thus likely to regress to the average, and which is based largely on talent, and thus likely to repeat itself.
Of course, not every player has the same “average” season. If we were to simulate 1,000 NFL seasons, Ray Rice’s per-season totals would obviously eclipse those of, say, Beanie Wells. So recognizing how players’ stats will regress involves identifying (or at least intelligently estimating) their “average” season. In a typical season, how many more yards, touchdowns, and receptions will Rice tally as compared to Wells? Until we establish a mean season for each player, we have no baseline from which to determine where their numbers from the previous season will regress. That is, the totals for Rice and Wells aren’t likely to regress to the mean for all backs; rather, they will regress to each player’s specific average.
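One simple way to formalize a player-specific “average season” — my own sketch, not a method from the book — is to shrink a player’s own history toward a league-average baseline, trusting the history more as the sample grows. The league_avg and prior_weight values below are made-up illustrations:

```python
def estimated_mean_tds(player_seasons, league_avg=8.0, prior_weight=2.0):
    """Blend a player's own TD history toward a league-average back.

    Each observed season counts once; the league average counts as
    prior_weight extra "phantom" seasons. Both defaults are
    illustrative assumptions, not values from the book.
    """
    n = len(player_seasons)
    total = sum(player_seasons) + prior_weight * league_avg
    return total / (n + prior_weight)

# A three-year workhorse keeps more of his production than a
# one-year wonder, even with an identical most-recent season:
print(round(estimated_mean_tds([12, 11, 14]), 1))  # 10.6
print(round(estimated_mean_tds([14]), 1))          # 10.0
```

The longer track record earns the higher estimate because more of his scoring is established as his norm rather than a potential one-year fluke.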
Determining this value can be tricky. One of the easiest methods is to determine how many “lucky” plays a player benefited from in a specific year. We have already seen that stats like interceptions are inherently fluky, and thus very likely to regress toward the mean in subsequent seasons. Aaron Rodgers is a heck of a player, but he’s very unlikely to match his 45:6 TD-to-INT ratio from 2011.
Other statistics, such as touchdowns and long-yardage plays, are not necessarily extremely random, but they can still have a major impact on fantasy scores. In Chris Johnson’s 2009 season in which he broke the record for total yards from scrimmage, he totaled seven touchdowns of 50-plus yards. That number is the ninth best of all time . . . for a career!
Despite Johnson’s game-breaking speed, it would have been foolish to believe he would repeat his 2009 campaign. Our job as fantasy owners was to determine what an “average” Johnson season would look like, taking the extent of Johnson’s 2009 “luck” into account.
Still, the task of predicting average seasons on an individual basis is a difficult one. There is no single method to do it, but understanding the inherent instability of interceptions, fumbles, long touchdowns, field goals, etc. is a start.
Check back later this week for Part II of Identifying Value: Regression, Randomness, and Running Backs. You can buy Fantasy Football for Smart People at Amazon.
If this stuff is way too complicated, you can always “cheat” and purchase our Fantasy Football Draft Guide, which automates each player’s value, as well as projected points based on your league settings, for you.
Be sure to check out other great articles at Fantasy Knuckleheads.