Tracking the “Giant Killers”
Most of us think that if only we research a bit more, think a little harder, and watch a few extra games, we’ll be able to brilliantly pick those upsets in our NCAA tournament pool. Furman math professors John Harris ’91, Kevin Hutson, and Liz Bouzarth agree, only their idea of “research” and “thinking” involves linear algebra and matrix equations.
Throwing information at computers in hopes they can somehow make sense of the randomness that seems to control athletic competition is nothing new, but their approach is. So new, in fact, that ESPN featured their work prominently during this year’s March Madness.
The three collaborated with writers Peter Keating and Jordan Brenner on their “Giant Killers” blog, which is an “annual metrics-based forecast of NCAA tournament upsets,” defined as games involving a team that beats an opponent seeded at least five spots higher. Keating and Brenner have been refining their approach since 2006, but Drs. Harris, Hutson, and Bouzarth took the prognostication to a new level when they figured out an approach based on cluster analysis.
“Regression analysis looks at what affects things on average. Cluster analysis tells you within this group what are defining characteristics of subgroups,” Keating said. “So we’ve been able to say this year that Dayton is one kind of giant killer, while Eastern Kentucky is a very different kind of giant killer based on the statistics that came out of cluster analysis that John and Kevin and Liz did. That’s something that’s new and it’s original and it’s something that nobody else is doing. We’re really grateful and we’re excited to build on it.”
Harris and Hutson met Keating at a sports analytics conference at the Massachusetts Institute of Technology in Boston. They hit it off, and he asked them to do some work coming up with a college football ratings system before they expanded into this year’s Giant Killers blog, where they had the epiphany to classify “giants” and “killers” into four categories (or “families” as ESPN calls them).
“We try to group the giants into styles and we try to group the killer into styles, and we look historically at which kinds of giants are able to neutralize which types of killers and we look at which types of killers tend to play well against which types of giants. We factor that into the analysis as well, so the matchup is important,” Hutson said. “By doing the clustering analysis, we were able to identify teams and put them in a box.”
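For readers curious about what sorting teams into style “families” can look like in practice, here is a minimal sketch using k-means, one common clustering method. The article does not say which algorithm the professors used, so k-means is only a stand-in. The team profiles, the two features, and the choice of two clusters (ESPN uses four families) are all invented for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means. Returns (centroids, labels).

    Seeds the centroids with the first k points, which keeps this
    toy example deterministic (real analyses usually seed more carefully).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each team joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical profiles: (share of shots taken from three, tempo on a 0-1 scale).
# These numbers are made up and do not describe any real team.
profiles = [
    (0.45, 0.30), (0.20, 0.80), (0.48, 0.25),
    (0.22, 0.75), (0.42, 0.35), (0.18, 0.85),
]

centroids, labels = kmeans(profiles, k=2)
for profile, label in zip(profiles, labels):
    print(profile, "-> family", label)
```

With these invented numbers, the three slow-paced, three-point-heavy profiles land in one family and the three up-tempo profiles in the other. The matchup step Hutson describes would then compare how each family of killers has fared historically against each family of giants.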
Interesting, to be sure, but let’s get to the real question: Has Furman brought the world any closer to Warren Buffett’s billion? The proof is in the predicting, though it’s important to understand what that means before drawing conclusions.
The analysis wasn’t meant to name winners and losers with certainty. That’s mathematically impossible, as Bouzarth points out. “It’s all about probability, and you’re never going to get a 1.0 in probability,” she says.
So in that sense, there is no way it could fail or succeed. What it could do, however, is conclude which teams had the best odds of pulling an upset, followed by a comparison with the teams that actually DID pull upsets. Tennessee, Harvard, North Dakota State and Stephen F. Austin were deemed most likely to kill a giant, and guess what?
They all did.
The latter three were 12 seeds given approximately a one-in-three chance of victory in the round of 64. Tennessee, meanwhile, was given more than a 70 percent shot at knocking off Massachusetts as an 11 seed, and the Vols throttled the Minutemen on their way to the Sweet 16.
“We were super confident about that,” Hutson said.
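A quick back-of-the-envelope check shows why a one-in-three forecast is a statement about odds rather than a pick. The sketch below takes the article’s “approximately one in three” as exactly 1/3 and assumes the three games are independent of one another; both simplifications are ours, not the model’s.

```python
from fractions import Fraction

p_upset = Fraction(1, 3)  # "approximately a one-in-three chance" per 12 seed
n_teams = 3               # Harvard, North Dakota State, Stephen F. Austin

# Average number of upsets you would expect from three such games.
expected_upsets = n_teams * p_upset

# Chance that all three underdogs win, ASSUMING the games are independent.
p_all_three = p_upset ** n_teams

# Chance that none of them wins, under the same assumption.
p_none = (1 - p_upset) ** n_teams

print("expected upsets:", expected_upsets)
print("all three win:", p_all_three)
print("none win:", p_none)
```

Under those assumptions, the forecast implies one upset on average among the three, and a clean sweep comes up only about once in 27 tries. Three for three is a good night for the model, not proof that it should have been more confident.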
It gets better. The system gave Kentucky the best chance among the 8 and 9 seeds of defeating a No. 1, which is precisely what the Wildcats did in upsetting Wichita State in their march to the Final Four.
On the flip side, the trio’s list of “shaky giants” didn’t end up being very shaky outside of Creighton’s round-of-32 blowout loss at the hands of Baylor. Wisconsin is in the Final Four for the first time, while Michigan, Arizona, and Iowa State all made it to at least the Sweet 16.
And then there’s favorite Louisville, ousted well short of the national title by resurgent Kentucky.
“A lot of this is based on trying to predict on who’s going to outperform their basic skill. How do you know when somebody’s going to have a very good night and somebody’s going to have a very bad night? You have to admit that a giant factor in all of this is luck, so at a certain point we’re going to get diminishing returns,” Keating said. “It’s never going to be totally predictable because you’re trying to predict a surprise.”
Clustering analysis was able to pin down matchups that would favor certain underdogs whether they took a lot of 3s or tried to slow the game down, but there is one factor that seems to consistently give teams a better chance of winning regardless of style. “Rebounding,” Harris says. “That’s probably the biggest thing.”
“If your coin is biased so that you get a head 48 percent of the time and I get a head 37 percent of the time, you’re going to win this game as long as you can flip it as many times as me,” Hutson added. “So I’ve got to do things to keep you from doing that. That’s the key.”
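Hutson’s coin analogy can be worked out directly. The sketch below treats each flip as an independent trial and asks how often the 37 percent coin beats the 48 percent coin, first with equal flips and then with extra flips for the underdog. The 60 flips per side and the 18 extra flips are hypothetical numbers chosen for illustration, and reading a “flip” as a scoring chance (the kind an offensive rebound creates) is our interpretation of the quote.

```python
from math import comb

def binom_pmf(n, p, k):
    """Probability of exactly k heads in n flips of a coin with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_more_heads(n_a, p_a, n_b, p_b):
    """Probability that player A ends with strictly more heads than player B."""
    return sum(
        binom_pmf(n_a, p_a, a) * binom_pmf(n_b, p_b, b)
        for a in range(n_a + 1)
        for b in range(a)  # every B total strictly below A's total
    )

FAVORITE_P, UNDERDOG_P = 0.48, 0.37  # the percentages from Hutson's quote
FLIPS = 60                           # hypothetical flips per side
EXTRA = 18                           # hypothetical extra flips for the underdog

# How many flips the underdog needs per favorite flip to match expected heads.
break_even_ratio = FAVORITE_P / UNDERDOG_P

even_flips = prob_more_heads(FLIPS, UNDERDOG_P, FLIPS, FAVORITE_P)
extra_flips = prob_more_heads(FLIPS + EXTRA, UNDERDOG_P, FLIPS, FAVORITE_P)

print(f"break-even ratio: {break_even_ratio:.2f}")
print(f"underdog wins with equal flips: {even_flips:.3f}")
print(f"underdog wins with {EXTRA} extra flips: {extra_flips:.3f}")
```

The break-even ratio is 0.48 / 0.37, or roughly 1.3, meaning the underdog needs about 30 percent more flips just to expect the same number of heads. That is the arithmetic behind keeping the favorite from flipping as many times as you do.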
All three professors thoroughly enjoyed watching the fruits of their labor play out on television and hope to continue the project next year, this time with student involvement.
“There were a lot of really tight games, so that was fun for us with our models in trying to see these upsets. . . Even the ones that we didn’t win, we felt like we won because the gap was way smaller than some people thought,” Bouzarth said.
As confident as she is in the process, however, not even Bouzarth, who earned her Ph.D. in mathematics from North Carolina, is quite ready to trade humanity for cold, hard numbers when it comes time to fill out her bracket. Other factors are still more important. “I picked things like Mercer over Duke,” she said. “The model didn’t, but I had some insider information as a UNC fan that Duke might lose.”