World Cup Formats: taking a look at the data

It's been a curious feature of this ongoing World Cup in New Zealand and Australia that everyone seems more interested in talking about the next one. That the 2019 World Cup in England and Wales is slated to be a ten-team affair has been known since 2013, and debate has been simmering since, but since the current tournament got underway, the decision to effectively exclude Associate nations from the next edition has attracted increasing attention and commensurate derision.

An online petition against the planned "streamlining" of the event has attracted more than 20,000 signatures so far, and the game's great and good have lined up to voice their opposition. The list of cricketing luminaries to have spoken out against the decision includes the likes of Gideon Haigh, Mark Nicholas, Jonathan Agnew, Andy Bull, Ed Smith, Martin Crowe, Sachin Tendulkar, and most recently Dutch skipper Peter Borren who, with characteristic frankness, dismissed the ten-team format as "bullshit".

Facing down this tempest of protest is ICC CEO Dave Richardson, to whom falls the unenviable task of defending the change. "Every match should be very competitive and having ten teams at the 2019 World Cup will make sure that will be the case" says Richardson, arguing for a return to the "best format" - the round-robin template of 1992. In this Richardson is joined by others whose nostalgia for a time when nobody minded a global sporting event being sponsored by a brand of cigarettes lets them see the Benson and Hedges World Cup in tones of smoke-stained sepia. The current format is overlong, they say, and characterised by pointless mismatches and dead rubbers. The World Cup should be reserved for the best of the best - an intense, rigorous test of the game's finest exponents - "the pinnacle of the game" in the words of Richardson, not a "cricket jamboree".

Critics counter that the length of the tournament comes down to format, not participation, pointing out that the proposed ten-team version shaves only one game off the current 49, and is scheduled to take a day longer. Moreover dead rubbers are a result of big groups, not mismatches, and smaller groups rather than fewer teams are the best recipe for meaningful must-win matches.

But smaller groups were the defining feature of the 2007 edition, comes the reply, and the risk of a repeat of India's group-stage exit from that tournament - and its attendant calamitous financial consequences - is not commercially acceptable. Moreover it is tentpole matches between the top Full Members that draw eyeballs, not one-sided blowouts or Associate sideshows.

In trying to distil something quantifiable from all this noise, we're left with several seemingly contradictory criteria which an ideal format is expected to satisfy. A successful tournament should be inclusive, but competitive. It should be short, but feature enough bankable games to be lucrative. Dead rubbers are to be avoided, but top teams can't be eliminated off the back of one or two bad performances. Television audiences must be maximised, but we should all be subjected to Russel Arnold's commentary for some reason.

What follows is an attempt to put numbers to these criteria, in the hope of presenting something that approaches an objective comparison of competing formats. A number of alternative solutions have been proposed, but for the purposes of this analysis we'll be limiting ourselves to five:

The current format, featuring fourteen sides split into two groups, with the top four from each group progressing to the quarter-finals.
The proposed ten-team format, with a single league phase where each side plays all the others once feeding directly into the semi-finals.
A fifteen team format of my own devising, featuring three groups of five, with the top two in each group progressing to the quarter-finals and the three third-placed teams competing in an intermediary three-way playoff for the remaining two spots.
A sixteen team format modelled on the 2007 WC but with the near-universally scorned super-eight phase replaced with straight knockouts.
The twenty team format proposed by Russel Degnan and championed by many in the "jamboree" camp, featuring four groups of five with winners progressing to the quarters and second- and third-placed teams playing cross-over playoffs for the remaining four spots.

Each format has its advantages and disadvantages, but we'll start by comparing them on the most easily quantifiable criterion - tournament length. Tournament length is a fairly straightforward function of format and participation, and the simplest metric is total number of matches, as shown on the first bar chart below. Even by this simple measure it is immediately obvious that those claiming the inclusion of Associates is the principal cause of overlong tournaments are talking nonsense. There's no real correlation between number of participants and number of matches - two of the four proposed broader formats produce fewer matches than the ten-team format, and the other two are only a couple of matches longer.
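For anyone wanting to check the arithmetic, the match counts can be rebuilt from the format descriptions alone. What follows is a minimal sketch rather than the spreadsheet behind the chart. It assumes the three-way playoff in the fifteen-team format is a three-match round-robin, and that the sixteen-team format sends the top two from each group of four straight to the quarter-finals. The names `round_robin` and `MATCHES` are purely illustrative.

```python
from math import comb


def round_robin(n):
    """Number of matches when n teams each play every other team once."""
    return comb(n, 2)


QF_ONWARDS = 4 + 2 + 1   # quarter-finals, semi-finals, final
SF_ONWARDS = 2 + 1       # semi-finals, final

MATCHES = {
    # two groups of seven, then quarter-finals onwards
    "14 (current)": 2 * round_robin(7) + QF_ONWARDS,
    # single league of ten, then semi-finals onwards
    "10 (proposed)": round_robin(10) + SF_ONWARDS,
    # three groups of five, three-way playoff (ASSUMED round-robin), quarters onwards
    "15": 3 * round_robin(5) + round_robin(3) + QF_ONWARDS,
    # four groups of four (ASSUMED top two progress), quarters onwards
    "16": 4 * round_robin(4) + QF_ONWARDS,
    # four groups of five, four crossover playoffs, quarters onwards
    "20": 4 * round_robin(5) + 4 + QF_ONWARDS,
}

for name, total in MATCHES.items():
    print(f"{name:>14}: {total} matches")
```

On these assumptions the fifteen- and sixteen-team formats come in under the ten-team league's total, while the current and twenty-team formats sit one and three matches above it.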

But then there is of course a far more important reason that the Cricket World Cup takes longer than pretty much any global tournament that doesn't literally involve circumnavigating the globe - namely the refusal to schedule more than one match per day. The reasoning behind this policy is principally commercial, with TV rights-holders looking to maximise viewing figures for every match. But even accounting for this cold commercial logic, it is nonetheless possible to abbreviate the broader tournaments considerably without any significant likely loss in revenue, simply by scheduling so-called "sideshow" matches in parallel.

The adjusted chart shows the comparative length of the formats achieved by occasionally scheduling two games a day, with the rather stringent proviso that no two games featuring a top-eight team be played concurrently. Considerably greater time-savings could be made by relaxing this provision, and from a commercial standpoint broader tournaments are of course more suited to simultaneous matches, whereas a round-robin format with a restricted field is more resistant to abbreviation. A cursory examination of the potential length of the different tournament formats thus actually leads to the counter-intuitive conclusion that, in terms of minimising duration, the more inclusive formats are in fact preferable.
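The reason a restricted round-robin resists compression can be shown with a crude lower bound on match days. This is a sketch under loud assumptions, not the method behind the adjusted chart: at most two games a day, never two top-eight games on the same day, every knockout or playoff game counted as featuring a top-eight side, and rest days, travel and back-to-back fixtures ignored. The function `min_days` and the twenty-team draw are hypothetical.

```python
from itertools import combinations
from math import ceil


def min_days(groups, later_matches, top=8, per_day=2):
    """Lower bound on the number of match days.

    groups        -- list of groups, each a list of pre-tournament rankings
    later_matches -- playoff and knockout games (ASSUMED to feature a top side)
    """
    group_games = [pair for g in groups for pair in combinations(g, 2)]
    # Games with at least one top-ranked side each need a day to themselves.
    marquee = sum(1 for a, b in group_games if min(a, b) <= top) + later_matches
    total = len(group_games) + later_matches
    return max(marquee, ceil(total / per_day))


ten_team = [list(range(1, 11))]
# Hypothetical seeded draw placing two top-eight sides in each group of five.
twenty_team = [[1, 8, 9, 16, 17], [2, 7, 10, 15, 18],
               [3, 6, 11, 14, 19], [4, 5, 12, 13, 20]]

print(min_days(ten_team, later_matches=3))       # league, semis, final
print(min_days(twenty_team, later_matches=11))   # 4 playoffs, quarters onwards
```

In the ten-team league only one fixture (ninth against tenth) can ever share a day, so the bound barely moves from the match count. The twenty-team draw, despite having more games in total, has a dozen Associate-only fixtures that can be doubled up.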

Next up is that most contentious of criteria - "competitiveness". Here again both sides of the debate have been vocal. A series of decent showings from Associate teams at the front-end of the tournament, coupled with some, err, less-than-competitive performances from certain Full Members *cough* ENGLAND *cough* made the contention that Associates were out of place on the World stage look faintly ridiculous. The tail-end of the group phase has restored a modicum of respectability to that position however, with a string of victories for top Full Members over Associates by margins ranging from the convincing to the cataclysmic.

So before we can assign formats a score based on generated mismatches, we need to take a look at what actually constitutes a mismatch. Time for another graph or two.

Plotting margins of victory by runs or overs-remaining against the difference in pre-tournament ODI ranking between victor and vanquished lends a degree of clarity to the question. First off it's immediately clear that blowouts are not the preserve of matches featuring Associates, but ranking difference is still a fairly decent predictor of win margin once it passes a certain value. Picking out matches featuring Associate nations nonetheless shows that there is no real discontinuity in this correlation between numbers 10 and 11, and in fact the Associates have performed slightly better than would be expected from a simple linear correlation between victory and ranking margins.

Overs remaining is of course an imperfect measure of victory, as a quick but closely-fought chase like New Zealand vs Australia ends up looking a lot more one-sided than it was, given that New Zealand scraped home with just one wicket in hand. In fact adjusting wins by chasing sides for wickets in hand draws almost all the data points closer to what would be expected, with the glaring exception of England games - actually having the entertaining effect of showing England lose to New Zealand by more than 60 adjusted overs in a 50-over match and utterly banjaxing the scale of the graph.

But leaving aside outliers generated by England's statistically improbable ineptitude, the results show that games between sides ranked within 5-6 places of each other are generally competitive or unpredictable regardless of whether one or other is an Associate member, but once the ranking difference exceeds 6 places, increasingly convincing wins become the norm. An ODI ranking difference of greater than six thus seems sufficient to label a tie an egregious mismatch.
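Applying that cut-off to a format is then just a matter of counting group fixtures where the rankings differ by more than six. The sketch below assumes groups are drawn by "serpentine" seeding from the rankings; this is my assumption for illustration, since actual draws vary. The helpers `serpentine` and `count_mismatches` are hypothetical names.

```python
from itertools import combinations


def serpentine(n_teams, n_groups):
    """Deal rankings 1..n_teams into groups, snaking back and forth (ASSUMED draw)."""
    groups = [[] for _ in range(n_groups)]
    for i in range(n_teams):
        row, col = divmod(i, n_groups)
        idx = col if row % 2 == 0 else n_groups - 1 - col
        groups[idx].append(i + 1)
    return groups


def count_mismatches(groups, cutoff=6):
    """Group-stage fixtures where the ranking difference exceeds the cut-off."""
    return sum(
        1
        for g in groups
        for a, b in combinations(g, 2)
        if abs(a - b) > cutoff
    )


print(count_mismatches(serpentine(10, 1)))   # ten-team league -> 6
print(count_mismatches(serpentine(20, 4)))   # four groups of five -> 28
```

Under this assumed draw the twenty-team group stage throws up 28 such fixtures, which happens to agree with the number quoted for that format further on, though the draw actually used there is not stated.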

Conversely, "tentpole" games are extremely simple to define given that, as with most full-member ODIs, the actual outcome is of no particular consequence. For the purposes of this comparison, we'll be going with the somewhat arbitrary definition of expected meaningful group matches between top-six sides plus knock-out games.

That does however leave us with the rather trickier question of "meaningful matches". What constitutes a dead rubber is of course fairly obvious - a group stage game where both sides are either already out of contention or assured of progression. Equally straightforward: a semi-meaningful game is one where only one side has anything but pride to gain or lose. Predicting the number of such matches turns out to be surprisingly complex however, especially as group sizes increase. An inexact estimate can nonetheless be arrived at by running repeated tournament simulations and taking the mean number of dead games. For the purposes of this analysis a somewhat over-simplified model, assigning each team a linear-scaling probability of victory in each game based on rank difference, was used to generate results. The possibility of ties and questions of net run rate were ignored for the sake of simplicity.
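To give a flavour of the approach, here is a rough sketch of such a simulation. It is not the simulator used for the figures in this piece, and several things in it are my own assumptions: the slope and clamping in `win_prob`, a randomly shuffled fixture order, and a brute-force check over every possible set of remaining results. That last choice is only practical for groups of four or five. A side level on wins at the cut-off is treated as undecided, since net run rate is ignored.

```python
import random
from itertools import combinations, product


def win_prob(rank_a, rank_b, slope=0.04):
    """Chance that rank_a beats rank_b, linear in rank difference (ASSUMED slope)."""
    return min(max(0.5 + slope * (rank_b - rank_a), 0.05), 0.95)


def decided(team, wins, remaining, q):
    """True if team is sure to finish in the top q, or sure not to, whatever happens."""
    always_safe = always_out = True
    for outcome in product((0, 1), repeat=len(remaining)):
        final = dict(wins)
        for fixture, w in zip(remaining, outcome):
            final[fixture[w]] += 1
        mine = final[team]
        ahead = sum(1 for t, v in final.items() if t != team and v > mine)
        level = sum(1 for t, v in final.items() if t != team and v >= mine)
        always_safe &= level < q    # qualifies without needing a tie-break
        always_out &= ahead >= q    # at least q sides finish strictly ahead
        if not (always_safe or always_out):
            return False
    return True


def simulate_group(ranks, q, rng):
    """Play one round-robin group and count the dead rubbers."""
    fixtures = list(combinations(ranks, 2))
    rng.shuffle(fixtures)           # ASSUMED: fixture order is random
    wins = {r: 0 for r in ranks}
    dead = 0
    for i, (a, b) in enumerate(fixtures):
        remaining = fixtures[i:]    # includes the match about to be played
        if decided(a, wins, remaining, q) and decided(b, wins, remaining, q):
            dead += 1
        winner = a if rng.random() < win_prob(a, b) else b
        wins[winner] += 1
    return dead


def mean_dead_rubbers(ranks, q, runs=200, seed=1):
    """Average dead-rubber count over repeated simulations of one group."""
    rng = random.Random(seed)
    return sum(simulate_group(ranks, q, rng) for _ in range(runs)) / runs


# One hypothetical group of five with three sides progressing.
print(mean_dead_rubbers([1, 8, 9, 16, 17], q=3))
```

The brute-force check doubles in cost with every outstanding fixture, so a seven-team group or a ten-team league would need a smarter test of what is still mathematically possible.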

The risk of early elimination for bankable sides is a similarly complex question. Here we've gone with a straightforward but again over-simplistic metric - taking the percentage of teams exiting before the knockouts, multiplied by the mean rank of opposition faced by the top four teams at the group stage, divided by the number of group games per team. This figure could doubtless be refined and improved in a number of ways, but on the slightly hand-wavy premise that one figure derived from questionable assumptions is as good as the next, we'll go with one that's fairly easy to calculate.

So having run the numbers we can now draw up the rather fetching table above, and see how the formats line up. The first surprise is just how badly the current format comes out of this. The 14-team version scores poorly on length and compressibility whilst also featuring a decent, but somewhat higher-than-expected early elimination risk, and only faring modestly on preventing mismatches and staging big games. Most damningly, it scores the worst of all formats on meaningful matches, as borne out by the half-dozen dead rubbers we're plowing through at time of writing.

The 16-team format, as well as the expected but nonetheless glaring risk of an early top-four elimination, also throws up a jarringly high score for dead rubbers. The introduction of crossover 2nd-3rd playoffs before the quarter finals offers a potential quick-fix on both counts of course, but eliminating only 25% of teams at the group stage would make the first phase considerably less cutthroat, and the poor tentpole score likely scotches this format's chances regardless. In any event the results certainly put paid to the suggestion that smaller groups guarantee more meaningful matches.

The 15-team format emerges as something of a compromise option, scoring decently but not spectacularly across all criteria. The three-team intermediary phase remains something of an ungainly contrivance however - resembling nothing so much as a miniature super-eight phase minus the eight and the super. Nonetheless the format is the best of the broader formats in both minimising mismatches and staging tentpole games, and beats out the current format on seven of eight counts.

The 20-team format scores very highly on several criteria, notably minimising dead rubbers and maximising knock-out games whilst, surprisingly, carrying a lower risk of an early top-four elimination than the proposed ten-team format. Though the match-count is marginally higher than the current and proposed formats, the 20-team template lends itself far more readily to scheduling concurrent matches, theoretically allowing the tournament to be completed faster. But the most eye-popping stat that the format throws up is of course the bright-red 28 egregious mismatches.

This number, however, is sufficiently problematic to warrant an asterisk. Applying the >6 cut-off for mismatches to the 20-team format contains the somewhat dubious assumption that the disparities in quality below the top-flight mirror those between the teams at the current World Cup, which would imply that the gap in ability between say Hong Kong and Kenya or PNG and Nepal is comparable to the gulf between, for example, South Africa and Zimbabwe. If one relaxes this assumption to account for the generally recognised competitiveness of the WCLC, the figure falls to somewhere between 15 and 20.

Back then to the other extreme, the ICC's currently-favoured 10-team format, which fares more-or-less as expected. The bloated league phase is resistant to compression, leaving the format right at the top of the interminability table. The lack of knock-out games could theoretically be fixed by introducing playoffs or quarters at the expense of ever greater incessance, but then 45 games seems like an inordinate length to go to just to eliminate Zimbabwe and, realistically, England.

Likewise fond remembrances of 1992 are made to look rather unsound by the dead rubber numbers. As Richardson would have it, "you had nine teams, then the semi-finals. There was something up for grabs in every match." This is objectively untrue. There were three group games in 1992 where both sides were either through or out, several more where one of the sides had nothing to lose or gain, and the data implies that even then the organisers got lucky. The ten-team format is in fact among the worst for generating meaningful games.

What the ten-team format does do well of course is what it was designed to do - namely minimise mismatches and maximise games for top-six countries against "marketable opposition". But this does little to answer the most important criticism of the format, namely that it effectively excludes 90% of cricket-playing nations from the "World" Cup. Even leaving aside the questionable long-term logic in contracting the sport's most high-profile event, one wonders whether the law of diminishing returns might not come to bear on a tournament featuring so many supposed "must watch" matches.

After all, if you just stick a bunch of tentpoles in a row and don't bother with the canvas, all you've really done is build a fence.

Acknowledgments: credit and lasting gratitude to /u/angelatheist of Reddit's indispensable /r/cheatatmathhomework for supplying the tournament simulator programme.

Click here for full graph gallery.