Every March, millions of Americans fill out an NCAA tournament bracket. Whether they know what they’re doing or not, most fans enjoy competing in pools and against their coworkers. But what if I told you there’s an even better way to enjoy college basketball’s most thrilling few weeks?
Montgomery Blair High School in Silver Spring, Md., has found the key to staying involved in March, even when all of your teams lose.
“When you pick brackets, when your team loses, you’re out,” explained David Stein, who teaches a sports statistics class at the school. “This way, we’re always modeling the next game.”
In the school’s sports statistics class, students use math to predict the results of every possible NCAA tournament game — and document it all on their class blog, FiveThirtyBlair.
“It makes teaching the math a lot easier and much more exciting,” said Stein.
The class is divided into groups of students. Based on its research, each group identifies variables that are important for determining success in the tournament, such as RPI (Rating Percentage Index), offensive and defensive efficiency, win percentage and the number of times a coach has appeared in the tournament. The students then use logistic regression to produce the probability of a team winning any given matchup.
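For readers curious what that looks like in practice, here is a minimal sketch of logistic regression on matchup data. It is not the students' model: the training rows, the two features (a rating gap and a gap in coaching appearances) and every function name are made up for illustration, and the fit uses plain gradient descent rather than a statistics package.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the mean log loss.

    No intercept is used, so swapping the two teams in a matchup
    flips the predicted probability from p to 1 - p.
    """
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xi in enumerate(x):
                grad[j] += (p - y) * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def win_probability(weights, team_a, team_b):
    """Estimated probability that team_a beats team_b."""
    diff = [a - b for a, b in zip(team_a, team_b)]
    return sigmoid(sum(w * d for w, d in zip(weights, diff)))

# Hypothetical past games. Each row is (rating gap, coaching-appearance gap / 10),
# computed as team A minus team B. The label is 1 if team A won, 0 otherwise.
past_games = [
    (0.10, 0.8), (0.05, -0.2), (-0.08, -0.5), (-0.02, 0.3),
    (0.03, 0.1), (-0.06, 0.4), (0.07, -0.6), (-0.04, -0.1),
]
results = [1, 1, 0, 0, 1, 1, 0, 0]

weights = fit_logistic(past_games, results)

# Two hypothetical teams described by (rating, coaching appearances / 10).
team_a = (0.62, 1.2)
team_b = (0.58, 0.3)
p_a_wins = win_probability(weights, team_a, team_b)
p_b_wins = win_probability(weights, team_b, team_a)
print(p_a_wins, p_b_wins)
```

Because the features are differences between the two teams and there is no intercept, the two printed probabilities always add up to 1, which is what a head-to-head prediction needs.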
Rather than making a bracket, these students calculate the probabilities for all 1,263 possible games before the tournament starts. The more confident a prediction, the more heavily a group is penalized in the scoring if the prediction is wrong. If a group gave Duke a 60 percent chance to win, for example, it wouldn’t earn much credit if Duke won, but it also wouldn’t be hurt as much if Duke lost.
The probabilities are most confident in the first round and become less certain as the rounds continue. This statistics-based method of predicting the tournament keeps the competition fierce until the very end.
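The article does not name the scoring rule, but logarithmic loss is one common rule that behaves this way, so the sketch below assumes it. The function name is hypothetical; lower scores are better.

```python
import math

def log_loss(predicted_prob, team_won):
    """Penalty for a single game under log loss (an assumed scoring rule).

    predicted_prob is the probability the group gave the team to win.
    """
    prob_of_actual_outcome = predicted_prob if team_won else 1.0 - predicted_prob
    return -math.log(prob_of_actual_outcome)

# Compare a cautious 60 percent pick with a bold 95 percent pick.
for p in (0.60, 0.95):
    print(p, round(log_loss(p, True), 2), round(log_loss(p, False), 2))
```

Under this rule the 60 percent pick costs about 0.51 when it is right and about 0.92 when it is wrong. The 95 percent pick costs only about 0.05 when right but about 3.00 when wrong, which is why overconfidence is risky.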
“Even though Villanova has the highest chance of winning [the entire tournament], it’s still only 22.8 percent,” said one student group in its first blog post before the tournament. “So there’s a 77.2 percent chance that a team other than Villanova wins. That’s the beauty of college basketball, the parity.”
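A rough way to see why even a favorite's title odds stay low is to multiply its chances across rounds. The sketch below is a simplification, not the class's method: it assumes a fixed, made-up 80 percent chance in each of six games and treats the games as independent, whereas a full model would account for every possible opponent.

```python
def title_probability(per_game_win_probs):
    """Chance of winning every game in a list, assuming the games are independent."""
    prob = 1.0
    for p in per_game_win_probs:
        prob *= p
    return prob

# Hypothetical favorite: an 80 percent chance in each of six straight games.
six_rounds = [0.80] * 6
print(round(title_probability(six_rounds), 3))  # 0.262
```

Even a team favored 4-to-1 in every game would win the title only about a quarter of the time under these assumptions.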
Nearly every group in this year’s competition used RPI as one of the variables. Check Bill Self, the team in the lead, based its model on two factors: experienced coaches and RPI, factors that consistently predicted winners year after year, the group wrote.
“The team that did very well last year relied on offensive and defensive efficiency,” Stein said. “RPI is also a good predictor.”
What isn’t so good for March Madness predictions? Win percentage, Stein said.
Most of the students picked Duke to take it all, which clearly won’t happen. But even in the models that favored the Blue Devils, they had only a 29 percent chance of winning. So while the numbers don’t lie, they also don’t tell the whole truth.
Heading into the Final Four, team Check Bill Self is still leading the class and is ranked No. 20 on Kaggle, a data science competition website where the class competes against hundreds of other entrants. The second-place team in the class is fairly far behind on Kaggle, coming in at No. 115.
While Check Bill Self incorrectly predicted (like most of us) that Baylor and Arizona would make the Elite Eight, with a 66.5 percent chance and a 73 percent chance respectively, it predicted the other six teams correctly. But even with correct predictions, a group’s score won’t improve much if it gave the winning teams only a modest chance of winning.
Another team, the Ali-Oops, correctly predicted from the start that Oregon would make the Final Four, which the students admitted surprised them. They also surprisingly calculated that South Carolina had a higher chance of making it to the Final Four than Florida.
The top group gave UNC slightly better odds than the Fighting Ducks to be champions and calculated that Gonzaga had by far the highest chance of making it to the championship game, at about 30 percent. Sorry, Gamecocks.