19th-century mathematician and genius Francis Galton, the man who invented the regression line (for which we at HSAC are forever indebted), once found himself at a country fair with a peculiar contest: who could guess the exact weight of a slaughtered and dressed ox? While no individual correctly guessed the exact weight, 1198 pounds, Galton noticed that the average of the roughly 800 guesses was 1197 pounds, essentially perfect. This observation of the so-called “wisdom of the crowd” has been expanded upon in the last century with the creation of prediction markets. Over large sample sizes, these markets (Intrade, for example) tend to be more accurate than any particular expert could hope to be.
One set of markets of particular interest to sports fans and bettors is the sports betting markets of Las Vegas. Sports betting has become a multibillion-dollar industry, a way for shrewd bettors to make a living and for squares to lose their money. The spreads and odds set by a combination of Vegas oddsmakers and gamblers are, in large samples, an accurate reflection of team strength and ability.
I think that most people would assume that the accuracy of the Vegas market increases as the season goes on. It makes intuitive sense that with more information about teams, the oddsmakers and bettors should set lines that are closer to the actual outcome of the game. But is this perception grounded in reality? Professional bettors may know better. To test this, I used a dataset of over 30,000 closing lines from college basketball games from 1997 to 2011. I’d like to thank Mike James for doing the heavy lifting on data collection.
I numbered each game in every team’s season, using only those seasons for which I had at least 20 game lines. I wanted to compare the actual results of the games to their lines for each game number (i.e., the average deviation from the line for the first game of each team’s season, the second game, and so on). I tried to filter out teams that had played multiple games before their first game with a Vegas line.
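The numbering step can be sketched in a few lines of Python. This is only an illustration, not the code behind the analysis: the record layout `(team, season, date, actual_margin, expected_margin)` is a hypothetical one, and it assumes the closing line has already been converted into an expected margin of victory for the team, so that the "miss" is simply actual minus expected. The filter for teams that played unlined games before their first lined game would need full schedule data and is left out here.

```python
from collections import defaultdict


def number_games(records, min_games=20):
    """Number each team's lined games within a season and compute the line miss.

    `records` is an iterable of hypothetical tuples:
        (team, season, date, actual_margin, expected_margin)
    where `date` sorts chronologically (e.g. an ISO "YYYY-MM-DD" string) and
    `expected_margin` is the closing line expressed as the team's expected
    margin of victory (an assumption of this sketch).
    """
    # Collect every lined game for each (team, season) pair.
    seasons = defaultdict(list)
    for team, season, date, actual, expected in records:
        seasons[(team, season)].append((date, actual, expected))

    numbered = []
    for (team, season), games in seasons.items():
        # Keep only team-seasons with enough lined games.
        if len(games) < min_games:
            continue
        games.sort(key=lambda g: g[0])  # chronological order
        for game_no, (date, actual, expected) in enumerate(games, start=1):
            numbered.append({
                "team": team,
                "season": season,
                "game_no": game_no,
                "miss": actual - expected,  # result relative to the line
            })
    return numbered
```

Each returned row carries a `game_no` and a `miss`, which is all the later steps need.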
You may find the results surprising. We often use the standard deviation of game results around their lines to assess the accuracy of the lines. If the betting lines were getting more accurate with more information, one would expect this standard deviation to get smaller as the season went along, reflecting fewer games that varied widely from the betting expectation. The SD over the course of the season is charted below:
While there is some trend lower towards the middle of the season, the series stays remarkably constant. The graph also makes the trend look more significant than it might be. Using a Dickey-Fuller test, a concept borrowed from time-series forecasting, I tested whether the progression of SD over the course of the season exhibited stationarity. Stationarity means that the process is mean-reverting: the series fluctuates around a fixed long-run level, so unusually large observations tend to be followed by smaller ones, and vice versa. In this case, it would mean that regardless of when the game is in the season, we would expect the accuracy to be around some mean standard deviation.
The Dickey-Fuller test for this series was significant at the 5% level (t-stat of -2.997), rejecting the null hypothesis of a unit root and supporting the alternative hypothesis of stationarity. This means that at any given point in the season, the accuracy of the closing Vegas line, measured by the standard deviation of the results from the lines, is relatively constant. While there appears to be some increased variation at the beginning of the season, it is not different enough from the rest of the season to show that the sports betting markets are learning more about the teams.
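For readers who want to try this themselves, here is a minimal sketch of the two steps: building the SD-by-game-number series and computing a Dickey-Fuller statistic on it. It makes several assumptions. The input is a list of hypothetical `(game_no, miss)` pairs; the test is the basic, non-augmented form with a constant (regress the change in the series on its lagged level), which may not match the exact specification used above; and the critical value of about -2.86 is the large-sample 5% figure for that form, while short series call for a somewhat more negative cutoff from a proper table or a statistics library.

```python
import math
import statistics
from collections import defaultdict

# Approximate large-sample 5% critical value for the constant-only
# Dickey-Fuller regression; small samples need a more negative cutoff.
CRIT_5PCT_ASYMPTOTIC = -2.86


def sd_series(misses):
    """Return the SD of line misses for game 1, game 2, ... in order.

    `misses` is an iterable of hypothetical (game_no, miss) pairs.
    """
    by_game = defaultdict(list)
    for game_no, miss in misses:
        by_game[game_no].append(miss)
    return [statistics.pstdev(by_game[g]) for g in sorted(by_game)]


def dickey_fuller_t(series):
    """t-statistic on gamma in: y[t] - y[t-1] = alpha + gamma * y[t-1] + e[t]."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    if n < 3:
        raise ValueError("need at least four observations")

    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))

    # Ordinary least squares with an intercept and one regressor.
    gamma = sxy / sxx
    alpha = mean_d - gamma * mean_x
    ssr = sum((d - alpha - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err
```

A statistic more negative than the critical value rejects the unit root, which is the sense in which the -2.997 above points to a stationary series.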
But perhaps by looking at all teams, I’m missing a subset that has increased variability. Perhaps the market isn’t as good at predicting the relative strength of teams with fewer returning players, and thus more unknowns. To look at this aspect, I collected returning minutes data from 2008 to 2011 and looked for correlations between returning minutes and the line miss for early season games. In each of the first 10 games of the season, the correlation between returning minutes and line miss was never stronger than -0.14, which is close to zero. It seems that the prediction markets are equally good at assessing teams with lots of returners and teams with few.
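The correlation check is easy to reproduce. The sketch below computes a Pearson correlation separately for each of the first 10 game numbers. The row layout `(game_no, returning_minutes, miss)` is hypothetical, and whether "miss" should be signed or absolute is a choice this sketch leaves to the caller.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def early_season_correlations(rows, max_game=10):
    """Correlate returning minutes with line miss for each early game number.

    `rows` is an iterable of hypothetical tuples:
        (game_no, returning_minutes, miss)
    Returns {game_no: correlation} for game_no <= max_game.
    """
    by_game = defaultdict(list)
    for game_no, returning, miss in rows:
        if game_no <= max_game:
            by_game[game_no].append((returning, miss))
    return {
        game_no: pearson_r([r for r, _ in pairs], [m for _, m in pairs])
        for game_no, pairs in sorted(by_game.items())
    }
```

Values near zero for every game number, as reported above, would mean returning minutes tell us little about how far results land from the line.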
What is there to conclude from this exercise? While there is more work to be done, there is good evidence that even at the beginning of the college basketball season, Las Vegas sports betting markets are close to as accurate as they are all season. Of course, I am talking about large sample averages: there are obviously individual teams whose early season lines are out of whack and about whom gamblers learn more as the season goes on. But it seems to me that the accuracy of the lines, like the stock market, is a mean-reverting process.
Vegas does not appear to learn much over the course of the season, but I think that is to the market’s credit. There is an inherent amount of randomness in the game of basketball. The betting market seems to be accurate to within that level of randomness even at the beginning of the season, when one might expect it to be less accurate. This is yet another example of the power of prediction markets and the wisdom of crowds.