Links We Like: Week of April 1

The best of the week in sports (or sort-of-sports) analysis:

Chris Boyle gives a primer on Fenwick, an advanced hockey stat, and reveals how central it is to playoff and regular-season success.

Heading into the Final Four, Nate Silver gives Louisville the best chance of raising the trophy in Atlanta.

Dave Cameron at FanGraphs examines whether Cuban pitchers give the Miami Marlins an attendance boost.

At ESPN Insider, Kevin Pelton looks at how Brittney Griner compares physically to NBA and WNBA players.

Deadspin breaks down the epidemiology of the Now That’s What I Call Music! series.

Posted in Uncategorized

Links We Like: 3/29/13

It’s been a while since our last set of links, but we’re back in force to bring you the best reads from around the web for your weekend.

- Jeff Sullivan analyzes Justin Verlander’s $180 million contract at FanGraphs.
- What does March Madness fandom look like in a map? Michael Bailey’s got the answers.
- Chase Stuart expands on his draft analysis and examines the time value of draft picks.
- Harvard’s own Kirk Goldsberry shows LeBron’s evolution over the past few seasons.
- To get a view of what professional NBA analytics looks like, check out Zach Lowe’s great piece.
- Muthu Alagappan re-examines the 10 positions in the NBA.

Posted in March Madness, MLB Baseball, NBA Basketball, NCAA Basketball, NFL Football, Weekly Links | 1 Comment

So Far, 2013 Among History’s Maddest Marches

By Andrew Mooney

An already unpredictable season of college basketball got a little bit wackier last weekend. After defeating Georgetown, 15-seed Florida Gulf Coast has received the majority of the Cinderella-centric media coverage, and rightly so, but let’s not forget about the other two double-digit seeds in the Sweet Sixteen: 12-seed Oregon and 13-seed La Salle.

The NCAA tournament hasn’t lacked for madness in recent years; this is the fourth consecutive year at least three double-digit seeds have survived into the second weekend. 2012 saw the 13-seed Ohio Bobcats advance to the round of 16. The year before featured a matchup between a No. 8 (Butler) and a No. 11 (VCU) in the Final Four, and Cornell nearly took out top-seeded Kentucky in 2010.

But with the first-ever 15-seed in the Sweet Sixteen this year, is it safe to say 2013 has been the craziest of the bunch? I attempted to quantify just how wild the first weekend of each tournament has been since the field expanded to 64 teams in 1985 to see how this one compares to tournaments past.

To start, I summed the seeds of all the teams that made up each year’s Sweet Sixteen, then normalized those sums into an index from 0-100, with 0 being the chalkiest possible sixteen teams (1-4 seeds in all four regions) and 100 being the “maddest” Sweet Sixteen we’ve seen so far: 1986, when the average remaining team’s seed was 5.56. The Madness ratings for each year are graphed below.

[Figure: Madness rating by year, 1985-2013]
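The index described above can be sketched in a few lines of Python. This is only an illustration of the normalization, not the original calculation: the function name is made up, the ceiling of 89 is inferred from 1986's average seed of 5.56 across sixteen teams, and the example field is hypothetical rather than a real bracket.

```python
# Seed sum of the chalkiest possible Sweet Sixteen: seeds 1-4 in all four regions.
CHALK_SUM = 4 * (1 + 2 + 3 + 4)  # 40

# Assumption: 1986's average seed of 5.56 over 16 teams implies a seed sum of 89.
MAX_SUM = 89


def madness_index(sweet_sixteen_seeds):
    """Scale a Sweet Sixteen's seed sum so that chalk = 0 and the 1986 field = 100."""
    if len(sweet_sixteen_seeds) != 16:
        raise ValueError("expected exactly 16 seeds")
    seed_sum = sum(sweet_sixteen_seeds)
    return 100 * (seed_sum - CHALK_SUM) / (MAX_SUM - CHALK_SUM)


# Hypothetical field, for illustration only (not a real bracket).
example_field = [1, 2, 3, 4, 1, 2, 3, 12, 1, 2, 6, 4, 1, 5, 3, 10]
print(round(madness_index(example_field), 1))
```

Because the scale is anchored to the maddest year seen so far, a future Sweet Sixteen with a seed sum above 89 would score over 100 unless the index were re-normalized.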

By this measure, 2013 has indeed been a particularly mad year—only three other tournaments (1986, 1990, and 2000) rank ahead of it. In 1990, only one two-seed made the Sweet Sixteen, and in 2000, two one-seeds (Arizona and Stanford) bowed out early to a pair of eight-seeds (Wisconsin and North Carolina). In addition to the aforementioned double-digit seeds still alive in this year’s tournament, No. 9 Wichita State eliminated No. 1 Gonzaga and now squares off with La Salle, ensuring the presence of at least one big underdog in the Elite Eight.

In examining the graph, there doesn’t seem to be much of a chronological pattern to the Madness. The recent stretch of craziness was preceded by the most boring year in history, when, in 2009, only one team seeded worse than fifth reached the round of sixteen. And who could forget the snoozefest that was 1989, when every No. 1 and No. 2 seed survived its first two contests?

This March, however, the cause of the little guy is being abundantly supported. For another week, we can revel in the improbable and urge Dunk City or La Salle deeper into the tournament. If just for images like this, let’s hope these folks keep dancin’.

Posted in Uncategorized | 7 Comments

Blackhawks vs. Heat: Which Streak is More Unlikely?

By William Marks

As of this writing, the Miami Heat have won 22 consecutive games. Meanwhile, over in the NHL, the Chicago Blackhawks earned at least one point in 24 consecutive games from January 19th through March 6th. While the Heat’s streak is about the same length, the Blackhawks’ run appears more impressive, given that at the time they had earned a point in every game they had played this season. (A team earns 2 points for a win and 1 point for an overtime loss.)

To determine which streak was actually more difficult to pull off, I looked at the money lines of each team’s games over the length of their respective streaks and converted the money line from each game into an implied probability of winning. Under the assumption that each game is an independent event, I calculated the probability of the Heat winning all 22 games (according to Vegas odds) to be 0.2344174%. Then, treating each game in which the Blackhawks earned a point as a win, I found the probability of their streak to be 0.0000597915%.

By the odds, the Blackhawks’ streak looks far more unlikely in hindsight: its probability is roughly 4,000 times smaller than that of the Heat’s run. Given the relative randomness of hockey, the Heat would have to extend their current streak by a significant margin to even approach a streak as unlikely as the Blackhawks’.
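For readers who want to try the same calculation, here is a minimal Python sketch of the two steps: converting an American money line into an implied win probability, then multiplying across games under the independence assumption. The function names and the sample money lines are hypothetical, and the conversion is the standard one, which leaves the bookmaker's margin in the probabilities; the post does not say whether that margin was removed.

```python
from math import prod


def implied_probability(money_line):
    """Convert an American money line into the implied probability of winning."""
    if money_line < 0:
        # Favorite: a line of -300 means risking 300 to win 100.
        return -money_line / (-money_line + 100)
    # Underdog: a line of +150 means risking 100 to win 150.
    return 100 / (money_line + 100)


def streak_probability(money_lines):
    """Probability of winning every game, treating the games as independent."""
    return prod(implied_probability(line) for line in money_lines)


# Hypothetical money lines for a four-game streak (not the actual 2013 lines).
hypothetical_lines = [-300, -150, +120, -450]
print(f"{streak_probability(hypothetical_lines):.4%}")
```

Since every factor is below 1, the product shrinks quickly as the streak grows, and a streak made up of near coin-flip games shrinks much faster than one made up of heavy favorites.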


Posted in NBA Basketball, NHL Hockey | 3 Comments

Survival of the Fittest: Predicting the 2013 NCAA Tournament

The goal of every team in the NCAA tournament is to survive and advance. And, if you want to win your March Madness pool, your goal should be to predict which teams will do just that.

Most prediction systems view the NCAA tournament as an extension of the regular season. While that may be the best way to pick the most games in the tournament correctly, I do not believe it is the way to predict the most important games correctly. Correctly selecting a team to make the Championship Game can more than make up for a relatively poor first round.

That is why, building off of Ken Pomeroy’s great work, for the past two years I have been publishing a model of the NCAA tournament based on Survival Analysis. Academic researchers use Survival Analysis to determine whether new pharmaceutical drugs or treatments are effective. I co-opted the framework to try to discover something truly important: the path to bragging rights over your friends.

Posted in NCAA Basketball | 39 Comments

Predicting the Madness: 2013 Upset Edition

For the past two years, I have worked on a model that predicts NCAA Tournament upsets using the gospel of tempo-free basketball stats, the Four Factors. Over that time, the model has gone a perfect six for six in predicting 11 through 14 seeds that became Cinderellas.

While I am fairly sure that this perfect record will come to an end sooner rather than later, I still think the model has value. I ran an improved model on this year’s bracket to try to help you gain an edge on your co-workers in your office pool. Without further ado, the predicted probabilities:

[Figure: table of predicted upset probabilities]

As you can see, the model is predicting only two upsets this year, both 11 seeds over 6 seeds. The model likes Minnesota over UCLA, and whoever emerges from the Middle Tennessee State-Saint Mary’s First Four clash over Memphis.

Minnesota may be slightly overrated because of their extremely tough schedule, but the Golden Gophers are a very good rebounding team playing a very weak six seed in UCLA. MTSU has a great turnover margin, Saint Mary’s cleans up on the glass, and Memphis has played a very weak schedule for a six seed.

In building this model, I’ve used a dataset of every 3-14, 4-13, 5-12, and 6-11 matchup from the last ten NCAA Tournaments. Last year’s post has details of the specific model inputs. The only addition this year is a measure of teams’ consistency.

Note that I am not predicting that there will be only two upsets come Friday night. Rather, the Upset Model is intended to be conservative: over the last 10 years, using out-of-sample testing, the model has predicted 25 double-digit seeds to pull upsets. Twenty-two of them have been successful, yielding a false positive rate of under four percent.

Over that same time period, there have been a total of 40 double-digit seed upsets. Clearly, some of the teams that my model does not make outright favorites to pull upsets will, in fact, win. That is part of the beauty of March.
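The counts quoted above can be turned into a few summary rates. The Python sketch below is a rough reading rather than the original calculation: the variable names are made up, and the false positive rate rests on two assumptions the post does not state, namely that the dataset holds 16 qualifying matchups per tournament (four seed pairings in four regions) and that all 40 upsets come from those matchups.

```python
# Counts taken from the text.
predicted_upsets = 25     # upsets the model called over ten years
correct_predictions = 22  # called upsets that actually happened
actual_upsets = 40        # all double-digit seed upsets over the same span

false_positives = predicted_upsets - correct_predictions  # 3

# Share of the model's upset picks that came true.
precision = correct_predictions / predicted_upsets  # 0.88

# Share of all upsets that the model picked.
recall = correct_predictions / actual_upsets  # 0.55

# Assumption: 3-14, 4-13, 5-12 and 6-11 games in four regions, ten tournaments.
games_per_tournament = 4 * 4
tournaments = 10
total_games = games_per_tournament * tournaments  # 160

# Assumption: every one of the 40 upsets came from those 160 games.
non_upsets = total_games - actual_upsets  # 120
false_positive_rate = false_positives / non_upsets  # 0.025

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"false positive rate={false_positive_rate:.3f}")
```

Under those assumptions the three misses work out to 2.5 percent of the games that did not end in an upset, which is in line with the "under four percent" figure. The low recall is the price of a conservative model.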

A note for the Harvard fans reading this: one of the weaknesses of the model is that it underrates moderately low probability outcomes. The Crimson certainly are not the favorite in their matchup with New Mexico, but they likely have a far greater chance than 4.6% of pulling the upset.

Posted in NCAA Basketball | 17 Comments

National TV Rondo Actually Exists

By Ryan Fortin

Throughout Rajon Rondo’s career, he has always shone under the spotlight. He is, of course, Boston’s best point guard and arguably the best player on the team. But as Grantland’s Bill Simmons has noted on occasion, he seems to save his best games for when he appears on national television, including many of his triple-doubles. Is it true that Rondo actually tries harder or performs better when he is in front of the entire country?

I decided to test this theory by compiling data over the past two years and running a t-test, using his non-nationally televised stats and his nationally televised stats to see if the two groups were significantly different. The results seem to back up the critics:

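| Stat | Nationally Televised Average | Non-Nationally Televised Average | T-Test P-Value |
| --- | --- | --- | --- |
| PPG | 16.1 | 13.0 | 0.1029 |
| RPG | 6.8 | 5.3 | 0.1375 |
| APG | 10.2 | 11.6 | 0.8349 |
| STL | 2.3 | 1.8 | 0.1457 |
| BPG | 0.5 | 0.2 | 0.0678 |
| FGA | 13.7 | 11.6 | 0.1183 |
| TO | 3.6 | 4.1 | 0.2699 |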

Most of his averages, including points, rebounds, steals, blocks, and field goal attempts, are higher in nationally televised games, and his turnovers are lower. None of the p-values is below 0.05, so none of the differences is “statistically significant,” but the sample of games is small enough that we should still take notice; I think they can still be considered practically significant results. Interestingly, his assist average is lower, but the high p-value suggests that there’s not much to read into there.
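To make the comparison concrete, here is a small Python sketch of a two-sample t-test on per-game numbers. It assumes Welch's unequal-variance version, since the post does not say which variant was run, and the game logs are invented for illustration. The sketch stops at the t statistic and degrees of freedom because the standard library has no t-distribution; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the p-value as well.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical points-per-game logs (not Rondo's actual box scores).
national_tv_points = [18, 12, 22, 15, 14, 19]
other_game_points = [11, 15, 9, 16, 13, 12, 14, 10]

t_stat, dof = welch_t(national_tv_points, other_game_points)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A positive t statistic here means the first sample (the nationally televised games) has the higher average. How far it sits from zero, relative to the degrees of freedom, is what determines the p-value.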

It would appear Simmons is right — Rondo does play better when the most people are watching him. Whether it’s a positive thing that he can elevate his game in big contests or negative that he doesn’t play this way all the time is difficult to say, but it shows why Celtics fans often become frustrated with their talented point guard.

Posted in NBA Basketball | 11 Comments