For the past two years, I have worked on a model that predicts NCAA Tournament upsets using the gospel of tempo-free basketball stats, the Four Factors. Over that time, the model has gone a perfect six for six in predicting 11 through 14 seeds that became Cinderellas.
While I am fairly sure that this perfect record will come to an end sooner rather than later, I still think the model has value. I ran an improved model on this year’s bracket to try to help you gain an edge on your co-workers in your office pool. Without further ado, the predicted probabilities:
As you can see, the model is predicting only two upsets this year, both 11 seeds over 6 seeds. The model likes Minnesota over UCLA, and whoever emerges from the Middle Tennessee State-Saint Mary’s First Four clash over Memphis.
Minnesota may be slightly overrated because of their extremely tough schedule, but the Golden Gophers are a very good rebounding team playing a very weak six seed in UCLA. MTSU has a great turnover margin, Saint Mary’s cleans up on the glass, and Memphis has played a very weak schedule for a six seed.
In building this model, I’ve used a dataset of every 3-14, 4-13, 5-12, and 6-11 matchup from the last ten NCAA Tournaments. Last year’s post has details of the specific model inputs. The only addition this year is a measure of teams’ consistency.
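For readers new to the Four Factors, here is a minimal sketch of how they are commonly computed from box score totals. These are the standard tempo-free definitions, not necessarily the exact inputs my model uses (last year's post covers those). The function name, the argument names, and the sample numbers are made up for illustration, and the 0.44 free throw weight is one common convention; some college-focused sources use a slightly different value.

```python
def four_factors(fgm, fg3m, fga, fta, tov, orb, opp_drb, ft_weight=0.44):
    """Return the offensive Four Factors from box score totals.

    Standard tempo-free definitions; ft_weight is an assumed convention.
    """
    # Effective field goal percentage: a made three counts as 1.5 makes.
    efg = (fgm + 0.5 * fg3m) / fga
    # Turnover rate: turnovers per estimated possession-ending play.
    tov_rate = tov / (fga + ft_weight * fta + tov)
    # Offensive rebounding rate: share of available offensive boards grabbed.
    orb_rate = orb / (orb + opp_drb)
    # Free throw rate: attempts per field goal attempt (some use makes instead).
    ft_rate = fta / fga
    return {"efg": efg, "tov_rate": tov_rate, "orb_rate": orb_rate, "ft_rate": ft_rate}


# Hypothetical single-game totals, for illustration only.
sample = four_factors(fgm=25, fg3m=8, fga=60, fta=20, tov=12, orb=10, opp_drb=25)
for name, value in sample.items():
    print(f"{name}: {value:.3f}")
```

Because every factor is a rate rather than a raw count, a slow-paced team and a fast-paced team can be compared directly.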
Note that I am not predicting that there will be only two upsets by Friday night. Rather, the Upset Model is intended to be conservative: over the last 10 years, using out-of-sample testing, the model has predicted 25 double-digit seeds to pull upsets. Of those, 22 have been successful, yielding a false positive rate of under four percent.
Over that same time period, there have been a total of 40 double-digit seed upsets. Clearly, some of the teams that my model does not make outright favorites to pull upsets will, in fact, win. That is part of the beauty of March.
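To make the arithmetic behind those numbers concrete, here is a rough sketch. It assumes the test set is every 3-14, 4-13, 5-12, and 6-11 game over ten tournaments, which works out to 4 matchups in each of 4 regions per year, or 160 games. That total is my assumption for this illustration, and the function and variable names are invented for it.

```python
def upset_metrics(total_games, actual_upsets, predicted_upsets, correct_predictions):
    """Summarize upset picks as false positive rate, precision, and recall."""
    false_positives = predicted_upsets - correct_predictions
    non_upsets = total_games - actual_upsets  # games the favorite won
    return {
        # Share of favorite wins that the model wrongly called as upsets.
        "false_positive_rate": false_positives / non_upsets,
        # Share of predicted upsets that actually happened.
        "precision": correct_predictions / predicted_upsets,
        # Share of actual upsets that the model called in advance.
        "recall": correct_predictions / actual_upsets,
    }


# Assumed: 4 matchup types x 4 regions x 10 tournaments = 160 games.
TOTAL_GAMES = 4 * 4 * 10
metrics = upset_metrics(
    total_games=TOTAL_GAMES,
    actual_upsets=40,
    predicted_upsets=25,
    correct_predictions=22,
)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under that assumption, three wrong calls out of 120 favorite wins gives a false positive rate of 2.5%, while the model catches 22 of the 40 actual upsets, or 55%. That gap is what "conservative" means here: the model rarely cries wolf, but it misses plenty of Cinderellas.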
A note for the Harvard fans reading this: one of the weaknesses of the model is that it underrates moderately low-probability outcomes. The Crimson certainly are not the favorite in their matchup with New Mexico, but they likely have a far greater chance than 4.6% of pulling the upset.