Predicting the Madness: 2013 Upset Edition

For the past two years, I have worked on a model that predicts NCAA Tournament upsets using the gospel of tempo-free basketball stats, the Four Factors. Over that time, the model has gone a perfect six for six in predicting 11 through 14 seeds that became Cinderellas.

While I am fairly sure that this perfect record will come to an end sooner rather than later, I still think the model has value. I ran an improved model on this year’s bracket to try to help you gain an edge on your co-workers in your office pool. Without further ado, the predicted probabilities:

[Figure: table of the model’s predicted upset probabilities for the 2013 bracket]

As you can see, the model is predicting only two upsets this year, both 11 seeds over 6 seeds. The model likes Minnesota over UCLA, and whoever emerges from the Middle Tennessee State-Saint Mary’s First Four clash over Memphis.

Minnesota may be slightly overrated because of its extremely tough schedule, but the Golden Gophers are a very good rebounding team playing a very weak six seed in UCLA. MTSU has a great turnover margin, Saint Mary’s cleans up on the glass, and Memphis has played a very weak schedule for a six seed.

In building this model, I’ve used a dataset of every 3-14, 4-13, 5-12, and 6-11 matchup from the last ten NCAA Tournaments. Last year’s post has details of the specific model inputs. The only addition this year is a measure of teams’ consistency.
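For readers curious about the mechanics, here is a minimal sketch of how a probit model turns a handful of inputs into an upset probability. It assumes the probit form mentioned in the comments for last year's version. The feature names (underdog-minus-favorite differences in the Four Factors) and every coefficient are made up for illustration; they are not the model's actual inputs or fitted values.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, which maps a probit score to a probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# HYPOTHETICAL coefficients, chosen only to make the example run.
# Each feature is (underdog value - favorite value) for one of the Four Factors.
COEFS = {
    "intercept": -0.8,
    "efg_diff": 4.0,   # effective field goal percentage
    "tov_diff": -3.0,  # turnover rate (lower is better, hence the negative sign)
    "orb_diff": 2.0,   # offensive rebounding percentage
    "ftr_diff": 1.0,   # free throw rate
}

def upset_probability(features):
    """Probability that the double-digit seed wins, given feature differences."""
    z = COEFS["intercept"] + sum(COEFS[name] * value for name, value in features.items())
    return normal_cdf(z)

def predicts_upset(features, threshold=0.5):
    """The model 'calls' an upset only when the underdog is the outright favorite."""
    return upset_probability(features) > threshold

# An invented matchup: the underdog shoots and rebounds better than the favorite.
example = {"efg_diff": 0.03, "tov_diff": -0.02, "orb_diff": 0.05, "ftr_diff": 0.01}
print(f"Upset probability: {upset_probability(example):.1%}")
print("Predicted upset" if predicts_upset(example) else "No predicted upset")
```

The 0.5 threshold is what makes a model like this conservative: a team can have a 40 percent chance of winning and still not be "predicted" to pull the upset.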

Note that I am not predicting that there will be only two upsets come Friday night. Rather, the Upset Model is intended to be conservative: over the last ten years, using out-of-sample testing, the model has predicted 25 double-digit seeds to pull upsets. Of those, 22 have been successful, yielding a false positive rate of under four percent.

Over that same time period, there have been a total of 40 double-digit seed upsets. Clearly, some of the teams that my model does not make outright favorites to pull upsets will, in fact, win. That is part of the beauty of March.
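The arithmetic behind these figures can be tallied in a few lines. This is a rough sketch, not the exact calculation used for the post: it assumes the dataset holds 160 games (four seed pairings, four regions, ten tournaments) and that all 40 upsets come from those games.

```python
# Figures quoted in the post.
predicted_upsets = 25  # double-digit seeds the model picked to win
correct_picks = 22     # picks that actually won
actual_upsets = 40     # all double-digit seed upsets over the period

# ASSUMPTION: 4 seed pairings x 4 regions x 10 tournaments.
total_games = 4 * 4 * 10

false_positives = predicted_upsets - correct_picks  # picked an upset, favorite won
non_upsets = total_games - actual_upsets            # games the favorite won

precision = correct_picks / predicted_upsets        # how often a pick was right
recall = correct_picks / actual_upsets              # share of upsets the model caught
false_positive_rate = false_positives / non_upsets  # wrong picks per favorite win

print(f"Precision:           {precision:.1%}")
print(f"Recall:              {recall:.1%}")
print(f"False positive rate: {false_positive_rate:.1%}")
```

Under those assumptions the false positive rate works out to 3 of 120, or 2.5 percent, which squares with "under four percent." The flip side is that the model catches only 22 of the 40 upsets, so it misses nearly half of them.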

A note for the Harvard fans reading this: one of the weaknesses of the model is that it underrates moderately low-probability outcomes. The Crimson certainly are not the favorite in their matchup with New Mexico, but they likely have a far greater chance than 4.6% of pulling the upset.

This entry was posted in NCAA Basketball.

17 Responses to Predicting the Madness: 2013 Upset Edition

  1. Zachary says:

    I’ve been waiting for this post with anticipation for a day now. Awesome work. Are you going to make a post with the survival method?
    And if it weren’t for Butler’s coaching I’d pick Bucknell; Muscala cleans the glass and is a real presence inside. Also, are Davidson’s odds particularly high historically for a 14 seed? Because I saw they are +3.5, which seems like quite a low spread for a 14 seed.

  2. PLM says:

    Wondering if these picks (last two years) were true upsets against the point spread (or a power rating system such as Pomeroy)? E.g., Minnesota is the distinct point spread favorite this year despite being the worse seed. The only teams in your table that differ significantly from the betting odds are a few that your model says are being overvalued by the market:

    Oregon
    Belmont
    Davidson
    New Mexico St.
    Iona

  3. J-Doug says:

    Thanks for this. You shared your probit coefficients last year. Will you be sharing them again this year to include the new team consistency variable?

  4. wubr2000 says:

    This is great. How exactly can I use this chart to place bets??

  5. Just plain Doug says:

    I’m with wubr2000. Making a couple bucks in Vegas could be fun. I wonder if they take parlay bets? One could parlay the two teams that are predicted to upset their opponents, or do various combinations among several teams with the best chance of pulling an upset. My only caveat would be I live in Minnesota and can attest to how bad the Gophers are right now. The guards are not confident, team leadership is lacking, and they are a poorly coached team.

    • PLM says:

      Doug, the probabilities John lists pretty closely match those offered in Vegas (except for the extreme underdogs, which he says are not reliable). So there would be very few bets to make even if you had the utmost confidence in the model. I have nothing against looking at this kind of analysis, but I think in order to evaluate whether it has any true predictive merit, you’d have to grade it against spread or money line. I.e., you’d want to know if the model adds any predictive ability that is not already reflected in the betting public’s perceptions.

  6. Tony says:

    In previous years the 15 seeds haven’t been listed. Is this (relatively speaking) a stronger batch of 15 seeds compared to the last two years?

  7. Justin says:

    Will you be posting your final team rankings going into the tournament using your survival method as you did last year? That was a great tool to use to assist in picks and I would love to see it again this year!

  8. John Ezekowitz says:

    PLM:

    Look at the model picks from the past two years and the money lines on those games. There were very big differences in the past.

    I am not sure why this year hews closer to the Vegas lines. Maybe the market now realizes the value of NCAA Tournament-specific prediction models?

  9. PSD Financier says:

    John, will you be coming out with a full bracket like you did last year?

  10. Cal Kaniff says:

    Will there be full rankings released?

  11. Pingback: Chomping At Bits: NCAA rescinds two college football recruiting rules - Sportsits.com

  12. James says:

    John, could you please send me step by step instructions so I could try to replicate your results? Thank you for your time.

  13. Pingback: March Madness links | God plays dice

  14. Pingback: Bracketology | Above the Market

  15. Matt says:

    Compared to implied win probabilities of Vegas moneylines, Minn, St Marys, Miss, and Pacific offer value.

    Team        Harvard   Vegas
    Minn        70%       60%
    St Marys    53%       36%
    Cal         40%       43%
    Bucknell    37%       39%
    Miss        31%       30%
    Oregon      29%       47%
    Belmont     23%       36%
    Davidson    20%       41%
    Pacific     14%       13%
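For anyone who wants to repeat the comparison in the last comment, here is a small sketch. The percentages are copied from that comment's table. The moneyline conversion is the standard formula for American odds; the -150 line in the usage example is made up, since the actual moneylines are not given in the thread.

```python
def implied_probability(moneyline):
    """Convert an American moneyline to an implied win probability.

    The result still includes the bookmaker's margin, so the two sides
    of one game will sum to slightly more than 100 percent.
    """
    if moneyline < 0:
        return -moneyline / (-moneyline + 100)
    return 100 / (moneyline + 100)

# (model %, Vegas implied %) pairs, copied from the comment's table.
TABLE = {
    "Minn": (70, 60),
    "St Marys": (53, 36),
    "Cal": (40, 43),
    "Bucknell": (37, 39),
    "Miss": (31, 30),
    "Oregon": (29, 47),
    "Belmont": (23, 36),
    "Davidson": (20, 41),
    "Pacific": (14, 13),
}

# A team "offers value" when the model rates it above the market.
value_teams = [team for team, (model, vegas) in TABLE.items() if model > vegas]
print("Value per the model:", ", ".join(value_teams))

# Hypothetical line for illustration: a -150 favorite implies a 60% chance.
print(f"-150 implies {implied_probability(-150):.0%}")
```

Note that a one-point gap, as with Miss and Pacific, is well within the bookmaker's margin, so only the larger gaps are likely to be meaningful.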
