Year after year, we see sports franchises aggressively seeking the strongest, fastest, most athletic players who can help them win. In fact, the NFL has built an entire industry on accurate player scouting and analysis as athletes transition from college football to the pros. The measures taken by football clubs range from sending scouts across the nation to hiring economists to run their personnel departments (most recently done by the Cleveland Browns, who hired Paul DePodesta of Moneyball fame).
However, many of the practices in sports scouting, especially in the NFL, are archaic in nature. While sports like baseball have analytical approaches (like sabermetrics) that have grown popular over time, the NFL has never developed a consistent analytical methodology for player evaluation. In an era where an algorithm can successfully fly a plane or pilot a car safely through a crowded street, NFL teams still draft players based on gut feelings or how fast they can run 40 yards in a straight line. It’s actually very common for teams to place wide receivers who can run the 40-yard dash in under 4.5 seconds higher on their draft board with little consideration of other tests. To those of us in the profession of data science, this is a missed opportunity and a devastating waste of resources. Millions of dollars are gambled on contracts that are justified by only a few data points. It is time for an analytics revolution in the NFL, and it all starts with a February showcase often dubbed the “Underwear Olympics,” or more formally, the NFL Combine.
The NFL Combine is an event in which NFL prospects perform athletic tests like the 40-yard dash and broad jump so that teams can better understand the athletic potential of a player. Teams use this data to try to assess how well a player’s athletic traits will translate into NFL production. However, this information is used in a piecemeal fashion, meaning teams sometimes bet on players based on a single performance metric. This would be akin to a mechanic determining the health of your entire vehicle solely by checking the oil.
Numerous research publications indicate that NFL Combine scores can be used successfully for predictions. However, no major publication or team has ever announced the adoption of any advanced machine learning for player evaluation. Because some of us at SparkCognition are avid football fans who are always looking for a leg up with our fantasy sports teams, a few of us decided to see whether NFL Combine data could be used to create an all-encompassing prediction of success at the next level.
The results were fantastic: the model accurately predicted the likelihood of success (barring injury) for the majority of the wide receivers in our test data (see graph below). The plot “NFL Wide Receiver Predictions vs. Actual Production” shows predicted wide receiver success scores against actual wide receiver success scores. The predictions were generated by feeding each player’s combine results into machine learning algorithms. The actual scores were calculated by weighting yards/game, yards/target, and total touchdowns over the player’s first three seasons. It’s easy to see that a simple machine learning approach is capable of closely, if not perfectly, predicting success. Notice the linear trend between the predictions and actual prospect performance.
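To make the scoring step concrete, here is a minimal sketch of how an "actual success" score could be built from those three statistics, and how the linear trend between predicted and actual scores could be checked. The weights, the `ReceiverSeasons` type, and the function names are illustrative assumptions, not the values or code used in the analysis described here.

```python
from dataclasses import dataclass
from math import sqrt

# Hypothetical weights: the actual weighting is not published in this post.
WEIGHTS = {"yards_per_game": 0.5, "yards_per_target": 0.3, "touchdowns": 0.2}


@dataclass
class ReceiverSeasons:
    """Production totals over a receiver's first three seasons (illustrative type)."""
    yards_per_game: float
    yards_per_target: float
    touchdowns: int


def success_score(player, weights=WEIGHTS):
    """Weighted sum of yards/game, yards/target, and total touchdowns."""
    return (
        weights["yards_per_game"] * player.yards_per_game
        + weights["yards_per_target"] * player.yards_per_target
        + weights["touchdowns"] * player.touchdowns
    )


def pearson_r(predicted, actual):
    """Pearson correlation: near 1.0 means a strong positive linear trend."""
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_p * var_a)
```

In a sketch like this, one might first rescale the three statistics to a common range so that no single one dominates the weighted sum; whether the original analysis did so is not stated.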
What is amazing about the potential of this analysis is that it is based solely on eight tests that players take during the NFL Combine. It doesn’t factor in body measurements taken during the event, player interviews, or any historical performance data. If the wealth of tangible data available to NFL teams were added to this analysis, a far more powerful predictive capability could be uncovered. This could then be used to supplement scouting departments, making them capable of analyzing every movement of every draftable prospect across the nation.
In conclusion, NFL personnel departments could run much more efficiently by using machine learning to predict the future success of a prospect. Because machine learning can consider multiple facets of the data and how they correlate, it can provide a more complete perspective on the future capabilities of any prospect. With more data for the prediction models to train on, there is significant potential for machine learning to supplement or replace components of the existing scouting process. If any team needs some help figuring out where to start, call us at SparkCognition. We’re fans, too.
Last modified: November 22, 2017