Mallory Burdette and scoring margin in tennis


Inside Advanced Baseline's most unconventional ranking, and how scoring margin can identify top players.

The April 2nd Advanced Baseline update featured the model debut of the player whose ranking is most at odds with conventional wisdom and the ATP/WTA rankings: Mallory Burdette at No. 39. While she technically had a computer ranking before, April 2nd marked the first time her sample of matches was large enough for her ranking to be considered "legitimate" and eligible for display on the main page.

Of all the results and ranks Advanced Baseline has produced to date, no player gives me more heartburn about whether or not AB got it right than Mallory Burdette -- not even Ferrer being ranked ahead of Murray. This is about as bold a statement as a ranking system can make: not only is Burdette in the top 40 immediately after logging the minimum number of data points, she also got there without registering a single win over a top 50 opponent. How is this even possible? Shouldn't you have to beat at least one opponent in the top 50 to be there yourself? Not necessarily. To explain why, it's worth taking a detour into another sport where ranking controversies happen every year.

College basketball has made great strides in the last couple of years in how it evaluates teams. There is a plethora of ranking systems (Pomeroy, LRMC, BPI, etc.) that do a good job of figuring out who the good teams are even when they're separated across conferences with drastically different strengths. And every year, they come up with a mid-major in the top 25 that's a complete head-scratcher to plenty, eliciting a chorus of "BUT WHO HAVE THEY PLAYED?!?!?!" as though that's something a team can control.

Middle Tennessee State was this year's example; 2012 Belmont was an even more extreme version, ranked as high as No. 8 in some systems despite not beating a team in the top 50. The reason all of the advanced systems like these teams so much is almost always the same: scoring margin. The MTSUs and Belmonts routinely blow out their conference opponents by double digits. In college basketball, it's pretty widely accepted by now that accounting for scoring margin in a team's body of work is more predictive than a simple win/loss tally adjusted for quality of opponents. So when you have a team that racks up gaudy numbers against poor competition, it is defensible to infer that it is probably a top 25 team. If you swapped jerseys between the solid mid-major and a blue-blood No. 1 seed, how would you be able to tell them apart just from looking at the box scores?

This brings us back to tennis. Scoring margin in tennis isn't really a big discussion point, aside from a couple of vague observations when someone gets bageled or wins in straight sets. I think this is a mistake. There's valuable information to be inferred from scoring margin in tennis, just as there is in college basketball.

AB looks at match scores using a metric I call Game Differential Ratio (GDR), which is simply the games a player wins minus the games that player loses, all divided by the total number of games played: GDR = (games won - games lost) / (games won + games lost). The idea behind GDR is that it captures scoring margin through game differential, but also accounts for whether a match goes two sets or three, since the extra games in a three-set match dilute the margin. In a best-of-three match, winning GDRs can range from about -0.13 (a 7-6, 0-6, 7-6 win, which is 14 games won against 18 lost) to 1.0 (a 6-0, 6-0 win). For reference, here is the distribution of winning GDRs across all men's matches from 2002-2012 (the distribution for women's matches is roughly the same):

[Figure: distribution of winning GDRs, men's matches, 2002-2012]

The majority of matches fall in the 0.1-0.3 range, which covers your typical 6-4, 6-3 wins. Anything over 0.4 is a solid margin of victory, roughly equivalent to a basketball team winning by double digits.
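For anyone who wants to check the arithmetic, here is a minimal Python sketch of GDR as described above. The function name `gdr` and the list-of-tuples input format are illustrative choices, not AB's actual code, and the sketch assumes a tiebreak set is simply counted as 7-6 in games.

```python
def gdr(sets):
    """Game Differential Ratio for one player in one match.

    `sets` is a list of (games_won, games_lost) tuples, one per set,
    from the point of view of the player being rated.
    Assumption: a tiebreak set is entered as (7, 6) or (6, 7).
    """
    won = sum(w for w, _ in sets)
    lost = sum(l for _, l in sets)
    total = won + lost
    if total == 0:
        raise ValueError("no games played")
    return (won - lost) / total


# A routine straight-sets win, a three-set win, and a double bagel.
straight_sets = gdr([(6, 4), (6, 3)])
three_sets = gdr([(6, 4), (3, 6), (6, 4)])
double_bagel = gdr([(6, 0), (6, 0)])

print(round(straight_sets, 3), round(three_sets, 3), double_bagel)
```

The 6-4, 6-3 win comes out at 5/19, inside the 0.1-0.3 band, while the three-set win drops to 1/29 because the lost set both shrinks the differential and adds games to the denominator.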

What can scoring margin tell us about future success in tennis? Consider the following graph, constructed from all first-time rematches between two players on the same court. The graph seeks to answer this: Given the winner and the scoring margin of the first match, how often does the original winner win the rematch when the two players meet again?

The graph shows a clear relationship between scoring margin and the probability of winning the rematch: the higher the margin of victory in the first match, the more likely the first-time winner is to win the second. If there were no relationship between scoring margin and future success, the graph wouldn't be so neatly ascending.
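A rough sketch of how a rematch tally like this could be put together is below. Everything specific in it is an assumption for illustration: the bin edges, the function name `rematch_rates`, and especially the handful of sample rows, which are made-up placeholders and not AB's data or results.

```python
from bisect import bisect_right

# Hypothetical bin edges for the first-match winning GDR; each bin is [low, high).
EDGES = [0.0, 0.2, 0.4, 0.6]
LABELS = ["< 0.0", "0.0-0.2", "0.2-0.4", "0.4-0.6", ">= 0.6"]


def rematch_rates(history):
    """Share of rematches won by the original winner, per first-match GDR bin.

    `history` is an iterable of (first_match_winning_gdr, won_rematch) pairs,
    one pair per first-time rematch. Bins with no matches are left out.
    """
    tallies = {label: [0, 0] for label in LABELS}  # label -> [rematch wins, rematches]
    for first_gdr, won_rematch in history:
        label = LABELS[bisect_right(EDGES, first_gdr)]
        tallies[label][1] += 1
        if won_rematch:
            tallies[label][0] += 1
    return {label: wins / n for label, (wins, n) in tallies.items() if n}


# Placeholder rows only, to show the input shape. These are NOT real matches.
toy_history = [(0.05, False), (0.12, True), (0.26, True), (0.31, False), (0.33, True)]

for label, rate in rematch_rates(toy_history).items():
    print(f"{label}: {rate:.0%}")
```

With real match data in place of the toy rows, plotting the rate for each bin in order would give a graph of the kind described above.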

This doesn't sound like such an earth-shattering conclusion when stated plainly. Obviously, racking up more games in a match probably means you're better than your opponent. But from the way results are commonly discussed, you'd never know. Tennis is subject to a lot of the bad clichés that also plague college basketball, where close wins are a sign of heart, determination, and "finding a way to win," and blowout wins are met with an uninterested shrug. While this might make for a more compelling narrative, I think it's also leaving out valuable information, especially if you're interested in finding the next great player before everyone else does. So far, Mallory Burdette fits this bill quite nicely.

Mallory Burdette's Record

Here is Burdette's entire WTA career to date, summarized as a win/loss record along with her average GDR in those matches:

Opponent AB Rank    W    L   Avg Winning GDR   Avg Losing GDR
1 to 50             1    9        -0.03             0.35
51 to 100           8    4         0.27             0.37
101 to 150          5    1         0.39             0.18
151 to 200          6    1         0.28             0.07
200+               17    0         0.50             n/a

That 1-9 record against the top 50 gives plenty of room for doubt about Burdette's top-40 rank. However, look at her record against sub-100 competition: not only is it an eye-popping 28-2, but she's putting up video game numbers against those opponents, averaging 0.44 GDR per win. You might want to write off the results as coming against poor competition, but even after accounting for that, it's really hard to sustain that scoring margin at any level (look at the scoring distribution graph to see exactly how hard). It's a feat that only good players can realistically accomplish. So how come her blowouts against sub-100 competition haven't translated into top-50 wins? First of all, let's determine what a reasonable level of success against the top 50 should be by comparing the records of three similarly ranked players: Madison Keys (35), Varvara Lepchenko (41), and Yanina Wickmayer (39).

Madison Keys' Record

Opponent AB Rank    W    L   Avg Winning GDR   Avg Losing GDR
1 to 50             5    8         0.41             0.27
51 to 100          12    5         0.30             0.22
101 to 150          9   10         0.27             0.21
151 to 200          5    3         0.22             0.17
200+               24    5         0.44             0.27

Varvara Lepchenko's Record

Opponent AB Rank    W    L   Avg Winning GDR   Avg Losing GDR
1 to 50            14   25         0.16             0.26
51 to 100          15    9         0.27             0.29
101 to 150         11    4         0.38             0.28
151 to 200          7    5         0.33             0.36
200+               18    3         0.37             0.38

Yanina Wickmayer's Record

Opponent AB Rank    W    L   Avg Winning GDR   Avg Losing GDR
1 to 50            18   25         0.24             0.26
51 to 100          15   10         0.28             0.25
101 to 150         10    5         0.24             0.14
151 to 200          8    2         0.28             0.47
200+                2    0         0.40             n/a

All three comparison players have roughly the same macro-level splits: a 40% win rate against the top 50 and a 66% win rate against everyone else. Burdette is more polarized at 10%/86%, but the big asterisk is that she's only been on tour for nine months, whereas the other three have a full two-year record. Still, even in a limited sample, a 30-percentage-point gap in win rate against the top 50 is tough to explain.
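The splits can be recomputed directly from the win/loss columns of the four tables above. This is only a sketch: the `RECORDS` layout and the helper names are illustrative, and the numbers are copied by hand from the tables rather than pulled from AB.

```python
# (wins, losses) per opponent AB rank band, copied from the tables above.
# Band order: 1 to 50, 51 to 100, 101 to 150, 151 to 200, 200+.
RECORDS = {
    "Burdette":  [(1, 9), (8, 4), (5, 1), (6, 1), (17, 0)],
    "Keys":      [(5, 8), (12, 5), (9, 10), (5, 3), (24, 5)],
    "Lepchenko": [(14, 25), (15, 9), (11, 4), (7, 5), (18, 3)],
    "Wickmayer": [(18, 25), (15, 10), (10, 5), (8, 2), (2, 0)],
}


def win_rate(bands):
    """Win rate pooled over a list of (wins, losses) bands."""
    wins = sum(w for w, _ in bands)
    losses = sum(l for _, l in bands)
    return wins / (wins + losses)


def splits(bands):
    """Return (win rate vs. the top 50, win rate vs. everyone else)."""
    return win_rate(bands[:1]), win_rate(bands[1:])


for name, bands in RECORDS.items():
    top50, rest = splits(bands)
    print(f"{name:<10} top 50: {top50:.0%}   everyone else: {rest:.0%}")
```

Pooling matches across bands, rather than averaging the per-band percentages, keeps a thin band like Wickmayer's 2-0 against the 200+ group from carrying as much weight as a band with 40 matches in it.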

This isn't a problem unique to analyzing tennis. Every ranking system that accounts for scoring margin, whether it's in basketball (Belmont), football (Boise State), or any other sport, will inevitably run into a team that beats up on poor competition, and will rank that team really high. The difference is that in those sports, there aren't many opportunities to validate those high rankings: if you're beating up on poor competition, it's because you're in a weak conference, and you won't get many shots at other good teams.

Tennis doesn't have that problem. If you win at the lower levels, you get to play against better competition. That's why it's a great sport to analyze with respect to scoring margin; you will have a chance to validate your theories one way or another instead of arguing hypotheticals and being limited to small sample sizes.

Should Burdette be rated lower? Honestly, I don't know, but time will tell one way or another. Even if she doesn't improve her record against the top 50 (and I still think she will, based on the results to date), it wouldn't be enough to change my belief that scoring margin matters. The rematch graph is too orderly, and shows too clear a relationship, to dismiss outright. At the very least, I think the fact that she does better against poor competition than any of the three other listed players is a sign of enormous upside. Unfortunately, the win-based point structure of the WTA means that even if Burdette is a top-40 player, she'll have to toil in qualifiers and lower-level tournaments while the ranking system catches up to her level of performance.

The existing ranking formulas have a number of flaws, and this is just another one. They're incapable of giving players proper credit without a large enough sample size. Insofar as tournament directors are able to partially mitigate this with wild-card entries, they would be wise to do things like incorporate scoring margin to make sure the most deserving players get their entries.
