College football resists math...
If you haven't already listened to it, the recent episode of the Solid Verbal with SB Nation's Bill Connelly is well worth your time. Dan Rubinstein spends close to an hour talking to Connelly about Connelly's views on where stats are in college football and where they are going.
One prevailing theme that came out of the interview (for me, at least) is that the use of advanced math in analyzing college football has a long way to go. Connelly is doing yeoman's work, but he is only at the stage where he and some dogged volunteers can chart roughly an eighth of the total games played in FBS last year. Most of us love the unruly, anarchic element of college football, the fact that there are so many teams and conferences that the sport rejects standardization and encourages creativity by its nature. However, that quality also means it is hard to get a grasp on the sport in a statistical sense because there is just so much to love.
Aside from the fact that it is hard to obtain granular information on a large scale, Connelly also had only a few examples of major programs using advanced stats to analyze themselves. In fact, his two examples both related to Texas, a program that would never be analogized to the plucky Oakland A's, who make more out of less. Maybe college football programs are behind the curve in terms of their use of advanced analytics. Maybe coaches are doing a good job of hiding their methodology from public view.
Or maybe college football lends itself to fancy numbers less than other sports do. As Marc Tracy writes in the New Republic, football is a tough nut to crack when it comes to reducing the results on the field to numbers:
Much work has been done in football, both at outside sites like Football Outsiders and, much more quietly, within teams. But some of football's top analysts said that the incredible intricacy of football - the different types of scoring, the uncountable potential game situations, the byzantine mess of eleven men working in tandem, intangibly but undeniably dependent upon each other - means that only so much can ultimately be accomplished. "The sheer complexity of football makes it to my mind the hardest," said Brian Burke of Advanced NFL Stats.
Two people I spoke to, one from football and one from baseball, argued that football teams might be particularly stubborn about adopting analytics because of the number of people that must sign off on virtually anything.
"There's so many people involved in decision-making," said Grantland football writer Bill Barnwell, who got his start at Football Outsiders. "You can get one person in the front office to buy in, one coach to buy in. But it's hard to get seven coaches, and the G.M., and the owner." Farhan Zaidi, Oakland Athletics director of baseball operations, said, "In soccer and football, the on-field manager has more influence"-and, therefore, stats have less room to thrive. The exception, he noted, is when the on-field coach and front-office manager are the same person, the most prominent example of which is the New England Patriots' Bill Belichick. Not coincidentally, the Pats would make anyone's list of the top five teams most interested in analytics.
And then beyond the general difficulties of analyzing performance on a gridiron, you have specific problems in college football. A college program generally does not have anything like a general manager and a front office to pore over numbers. There are coaches who are pulled in a million different directions during the season and there are athletic department personnel who have multiple responsibilities, but there is not as much of an administrative structure that would lend itself to data analysis.
More importantly, college football's talent acquisition process mostly defies rigorous analysis. An NFL team needs to quantify what a player is worth in terms of contract value, and it needs to do so in a manner that allows it to compare that player with others on the market. Should the Falcons pay Steven Jackson $5 million per year or is Jacquizz Rodgers' production sufficient, especially if combined with a bigger back taken in the fourth round? In short, the NFL requires asset valuation skills in order to assemble the best roster possible under the salary cap. A room full of quants can be very useful for an NFL team.
In contrast, college football's talent acquisition process is about marketing instead of accounting. A major program brings in as much top talent as it can. It doesn't have to worry about allocating cap dollars (insert Ole Miss joke here); it just worries about building relationships with recruits and making the right pitch. There is analysis regarding allocation of time and scholarships, but the roster still gets built by the non-quantifiable recruiting process.
We are starting with a game that doesn't lend itself to precise, number-heavy conclusions, and then we remove most of the market elements that exist in the NFL.
Additionally, there is greater potential for using advanced data analysis in scouting NFL Draft prospects than there is in scouting recruits because college stats are more meaningful than high school numbers. For instance, Football Outsiders has contributed to the pre-draft discussions by creating the Lewin Career Forecast. Using regression analysis, Lewin was able to identify certain college numbers as being predictive of NFL success for quarterback prospects. Fans who were familiar with the forecast were not surprised when Russell Wilson burst onto the scene in the NFL last year, as Wilson was off the charts in the two categories that Lewin identified as being the most important: games started and completion percentage.
Doing something like the Lewin Career Forecast for high school recruits would be an exercise in futility. High school stats vary wildly because of the different classifications and therefore, the differing levels of competition. College coaches can scout high school quarterbacks just like NFL coaches can scout college quarterbacks, but the latter can supplement their scouting with data analysis, whereas the former have a hard time doing the same.*
Thus, in two major respects - assigning player value and analyzing lower level stats - college football resists the influence of sabermetrics.
* - Or maybe this is just wishful thinking by a Michigan grad who is familiar with Shane Morris's high school stats.
But college football fans should embrace math.
That said, there is one obvious area in which college football's use of numbers should improve: ranking teams.
Our sport gives itself an impossible challenge in that it has a small playoff in which the participants are selected based on subjective polling, and the data upon which the pollsters base their decisions consist of disparate schedules with few common opponents, along with a healthy dollop of the subjective mental images that come to mind when we think of teams.
And despite the futile task that the BCS gives itself by its own structure, it makes matters worse by: (1) relying on human polls composed of overworked coaches and the bizarre roster that makes up the Harris Poll; and (2) castrating the computer rankings by preventing them from using margin-of-victory. It's hard enough comparing an 11-1 Alabama team with an 11-1 Oklahoma State when there are precious few direct connections between their opponents; it's far worse when the people making the decision are addled, data-averse voters and computers that are forbidden from relying upon a source of information that anyone with skin in the game (read: oddsmakers and Vegas sharps) would consider.
This past bowl season drove this point home. Why was Alabama preferred over other candidates for a spot in the BCS National Championship? I would like to think that it was not their reputation and that of the SEC, but rather because the Tide dominated their opponents in almost all of their wins. Why were people like me regretting the fact that the Tide were playing Notre Dame in the title game instead of Oregon? Because the Ducks, like the Tide, put up impressive scores in their wins. And what did Alabama and Oregon do in their bowl games? They vindicated math by playing up to the faith that the more sophisticated computer ratings placed in them.
Conversely, you have Florida, a team that, unlike Alabama and Oregon, had a more impressive collection of scalps but a series of narrow, unimpressive scores attached to those scalps. What happened to the Gators in the Sugar Bowl? (A smart ass would point out that Florida lost to another team whose record was inflated by a number of close wins.)
The coming of the NCAA tournament and Stewart Mandel's comparison of Boise State in football and Gonzaga in hoops drive this home. Gonzaga gets the opportunity to play a number of games against major programs because of college basketball's longer regular season, and then they get an equal shot at a national title because of the Tournament. (Personally, I'm a fan of cutting the Tournament from 68 to 16, but I'm a curmudgeon like that.) Boise State plays only a smattering of games against top competition, and then everything has to go perfectly for them to make a title game. Boise State's 2010 and 2011 teams would have been aided by computer rankings that could account for margin of victory. What made those teams legitimate contenders was the fact that they beat up on weaker opponents by scores that one would expect from Alabama or LSU playing the same schedule. However, the system was stacked against them in more ways than one, as the BCS made a political decision to exclude margin-of-victory.
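To make the margin-of-victory point concrete, here is a minimal sketch in Python. The two teams and every score are invented for illustration (they are not real results), and the helper names `record` and `average_margin` are my own. It ignores strength of schedule, which any real rating system would have to handle; it only shows what a computer loses when it is allowed to see wins and losses but not scores.

```python
# Hypothetical results, invented for illustration only.
# Each game is (points_for, points_against).
results = {
    "Team Blowout": [(45, 10), (38, 7), (52, 14), (20, 23)],
    "Team Squeaker": [(24, 21), (17, 16), (27, 24), (13, 20)],
}


def record(games):
    """Return (wins, losses): all a margin-blind computer is allowed to see."""
    wins = sum(1 for points_for, points_against in games if points_for > points_against)
    return wins, len(games) - wins


def average_margin(games):
    """Return the average scoring margin per game (positive means outscoring opponents)."""
    return sum(points_for - points_against for points_for, points_against in games) / len(games)


for team, games in results.items():
    wins, losses = record(games)
    print(f"{team}: {wins}-{losses}, average margin {average_margin(games):+.2f}")
```

Both invented teams finish 3-1, so a margin-blind ranking cannot separate them on these games alone. Once the scores are allowed in, one team is winning by more than three touchdowns a game and the other is breaking even.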
Thankfully, we are coming to the end of the BCS era. The decisions that were previously made by suspect pollsters and denuded computer programs will now be made by an as-yet unnamed roster of committee members. College football's innumeracy will no longer be written into the rules. Instead, we will just have to hope that the individuals tasked with selecting four teams will do a better job of utilizing mathematically sound reasoning. It would be something new for college football to go down that path. (Bill Hancock's comments on the composition of the committee are not confidence-inspiring.)
Baseball has made major strides and yet even now, we are far more likely to see batting average on the screen when a batter comes to the plate as opposed to on-base percentage. How long are we going to have to wait for college football to make similar progress and where will that progress be reflected? In baseball, the use of outdated numbers does not affect the competition. Because of the subjectivity inherent in picking teams to play for the national title, it potentially matters when ESPN uses bad numbers to analyze teams and then Hancock's "retired ADs and coaches" form their beliefs based on those numbers.
Who here is confident that Pat Dye would watch Mark May criticize Oregon for being 44th in total defense and then intone from his La-Z-Boy, "yeah, but they are only 26th in yards per play allowed; the total defense number does not account for pace"?
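For anyone who wants the pace point spelled out, here is a rough sketch in Python. The two defenses and all of their numbers are made up for illustration (they are not Oregon's or anyone else's actual statistics), and `yards_per_play` is simply a helper name I chose. Total defense is yards allowed per game; dividing by plays faced removes the effect of pace.

```python
# Hypothetical per-game numbers, invented for illustration only.
defenses = {
    "Uptempo U": {"plays_faced_per_game": 80, "yards_allowed_per_game": 400},
    "Slowdown State": {"plays_faced_per_game": 60, "yards_allowed_per_game": 330},
}


def yards_per_play(defense):
    """Yards allowed per play faced, which strips out the effect of pace."""
    return defense["yards_allowed_per_game"] / defense["plays_faced_per_game"]


# "Total defense" ranking: fewest yards allowed per game comes first.
by_total_defense = sorted(defenses, key=lambda team: defenses[team]["yards_allowed_per_game"])

# Pace-adjusted ranking: fewest yards allowed per play comes first.
by_yards_per_play = sorted(defenses, key=lambda team: yards_per_play(defenses[team]))

print("Total defense order: ", by_total_defense)
print("Yards per play order:", by_yards_per_play)
```

With these invented numbers, the slower team looks better in total defense (330 yards a game against 400), but the up-tempo team is stingier on a per-play basis (5.0 yards against 5.5) because its defense is simply on the field for more snaps.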