After my last post about maps, I have geography on the brain. Specifically, I've been wondering how big home court advantage is in tennis. Home court advantage shows up in virtually every sport, but there's no firm consensus on exactly how and why it exists. The leading explanation is some combination of the effects of travel and the influence of the home crowd on officiating. Both of those definitely exist in tennis, but they operate very differently than in other sports.
Unlike domestic leagues, where travel is typically restricted to a single country or continent, tennis is a global sport, so travel distances are much longer. And while crowds will occasionally show strong favoritism towards a specific player (Andy Murray at Wimbledon, for example), most venues and matches are not designed to be explicitly partisan towards any particular player.
So how do these differences play out? Does travel have a greater effect on players in tennis than other sports? And can it be effectively separated from the effects of partisan crowds?
To take a first pass at answering these questions, I went back through the past nine years of Advanced Baseline-generated win probabilities for each match and classified each player in the matches into three categories: from the same country as the host tournament, from the same continent as the host tournament, or neither. (Continent was included as a variable to distinguish between travel distances: players from the United States playing in Europe, for example, might be more subject to travel effects than when they play in Canada or Mexico.)
This allows each match to be put into one of four bins: one player has home country and continent advantage (Richard Gasquet versus John Isner at the French Open), one player has home country advantage only (Gasquet versus Stanislas Wawrinka at the French Open), one player has home continent advantage only (Wawrinka versus Isner at the French Open), or the match is geography neutral (Isner versus Juan Martin del Potro at the French Open). I then compared the actual win rates of the favorites in each match against their expected win rates to see if players with home country and/or continent advantage outperform their expectations (this methodology is explained more fully here).
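To make the binning concrete, here is a minimal Python sketch of how it could work. Everything in it is hypothetical: the `Match` record, its field names, and the tiny country-to-continent lookup are illustrations, not the actual data format. It also makes two assumptions the description above doesn't spell out: matches where both players have the same geographic status count as neutral, and over-performance is measured from the viewpoint of the player with the geographic edge (or the favorite in neutral matches).

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical lookup; a real one would cover every country on tour.
CONTINENT = {"FRA": "Europe", "SUI": "Europe",
             "USA": "North America", "ARG": "South America"}

@dataclass
class Match:
    host: str        # country hosting the tournament
    country_a: str   # player A's country
    country_b: str   # player B's country
    prob_a: float    # pre-match win probability for player A
    a_won: bool      # True if player A won

def geo_level(country, host):
    """2 = same country as host, 1 = same continent only, 0 = neither."""
    if country == host:
        return 2
    if CONTINENT[country] == CONTINENT[host]:
        return 1
    return 0

BIN_NAMES = {(2, 0): "country+continent",
             (2, 1): "country only",
             (1, 0): "continent only"}

def classify(match):
    """Return (bin, expected win prob, won) for the advantaged player,
    or for the favorite when neither player has an edge (assumption)."""
    la = geo_level(match.country_a, match.host)
    lb = geo_level(match.country_b, match.host)
    if la == lb:
        name = "neutral"
        a_is_focus = match.prob_a >= 0.5
    else:
        name = BIN_NAMES[(max(la, lb), min(la, lb))]
        a_is_focus = la > lb
    prob = match.prob_a if a_is_focus else 1.0 - match.prob_a
    won = match.a_won if a_is_focus else not match.a_won
    return name, prob, won

def over_performance(matches):
    """Actual win rate minus expected win rate, per bin."""
    totals = defaultdict(lambda: [0, 0.0, 0])  # matches, expected wins, actual wins
    for m in matches:
        name, prob, won = classify(m)
        totals[name][0] += 1
        totals[name][1] += prob
        totals[name][2] += int(won)
    return {name: (wins - exp) / n for name, (n, exp, wins) in totals.items()}
```

A positive value for a bin means the focus players in that bin won more often than their pre-match probabilities predicted.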
Below is a chart showing the results for each group:
Ideally, the matches where neither player has a geography advantage should be right in line with expectations and not show a bias in either direction. Interestingly, the men's and women's data points show a bias in opposite directions, but I'm inclined to think that's random noise rather than anything meaningful, especially since they still average to zero. In addition, both men and women with home country advantage outperform their expectations by about one to two percent, which gives a crude estimate of how much home court advantage might be worth. It's certainly less than the typical four to six percent of other sports, but as mentioned above, tennis isn't designed to give partisan advantage to any one player in particular.
The most interesting part of that graph, though, is that both men and women do better with just home country advantage than with home country and continent advantage. Before I ran the numbers, I thought playing on a different continent would induce a much bigger travel penalty. Not only does that penalty not show up, but apparently there's an even bigger penalty when playing within your own continent outside of your home country.
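For a rough sense of whether a gap of this size could be noise, one option is to compare it with the spread you'd expect if the win probabilities were exactly right. The sketch below is a minimal version of that check, assuming each match is an independent event with its stated win probability. The function name and inputs are hypothetical, and this is not the calculation behind the chart.

```python
import math

def z_score(probs, outcomes):
    """Standardized gap between actual and expected wins.

    probs:    pre-match win probabilities for the player of interest
    outcomes: matching results (True/1 = that player won, False/0 = lost)
    Assumes matches are independent, so the variance of total wins
    is the sum of p * (1 - p) over all matches.
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    expected = sum(probs)
    variance = sum(p * (1.0 - p) for p in probs)
    if variance == 0:
        raise ValueError("variance is zero; every probability is 0 or 1")
    actual = sum(outcomes)
    return (actual - expected) / math.sqrt(variance)
```

Under these assumptions, a score within roughly two of zero is about what chance alone would produce, so a one to two percent edge needs a large number of matches before it stands out.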
Assuming this isn't also just noise (and it could very well be), what's the narrative that fits this data?
The best I can come up with is a two-parter: travel penalties for short and long distances are roughly the same (supported by home continent advantage not providing much of an edge), and crowds get more partisan for intracontinental matchups (entirely speculative). It's plausible, but this graph alone isn't nearly enough to support that conclusion. First I would want to slice the data a lot more and get a clearer picture of when and where home court is strongest.
That said, I think this is a decent start for quantifying home court at one to two percent. In no particular order, here are some deeper dives I'd love to make going forward:
- Seeing how home court compares across different levels of tournaments (Tour level vs. Challengers and Futures). It's tough to imagine crowds are super-partisan on the side courts in Peoria, Illinois, so there "should" be less of a home court advantage at the lower levels.
- Comparing home court on Tour-level events that have HawkEye and those that don't. If there's a significant difference between the two, it could serve as a rough estimate for how much home court is due to influencing the umpires.
- Seeing if home court is the same at different stages of the tournament. Umpire bias has been shown to be greatest in baseball during high-leverage situations. Is there a similar situation in tennis, where the potential for bias increases as the stakes get higher?
- Testing whether or not home court is significantly different on hard versus clay courts. It's plausible that hard courts are more prone to errors from line judges (and therefore more susceptible to home court), since clay courts leave obvious marks that can be checked. Sometimes they're big enough to see from space.
These are just the ones off the top of my head. If I'm missing any other tests that are worth running, let me know.