May 14, 2018, by Timothy Hill

Heads I Win, Tails Don’t Count: Negative Results, Open Data, and the Curious World of Research Publication

In Tom Stoppard’s famous absurdist drama Rosencrantz and Guildenstern Are Dead (well, famous, at any rate, to people who like absurdist drama) the first act opens with the title characters betting on a coin toss. Rosencrantz is quite happy with how this has been going: he is betting on heads, and the toss has indeed come up heads, seventy-six times in a row.

Guildenstern, on the other hand, is starting to suspect something is amiss.

The joke, of course, is that something is amiss: the play is a metafictional fantasy, and the two characters are trapped in a theatrical alternative universe they occasionally start to suspect is entirely artificial and contrived.

On stage, this is funny. But according to Scientific American writer John Horgan, the universe the scientific community inhabits has come in some ways to resemble Stoppard’s curious heads-only world. And here the results are rather more worrying than the predicament Rosencrantz and Guildenstern find themselves in. As I discussed last week on this blog, Horgan opens his two-part series on the slowdown of scientific discovery with an alarming graph showing research inputs – time, money, people – continually zigging upwards while research outputs zag steadily down. And one of the main reasons for this, he says, might well be that scientists and researchers seem to live in a world where the heads-side of the coin comes up much, much more often than not.

The problem, at root, is a kind of survivorship bias, whereby only positive results are published in scientific journals, while negative results (those that fail to reject the null hypothesis) are left to languish in a file drawer somewhere, never to see the light of day. In other words: heads, I win. Tails don’t count.

The results, Horgan says, are two-fold. First, as John Ioannidis pointed out in his now-(in)famous paper ‘Why Most Published Research Findings Are False’, fluke results end up dominating research publication. And the result of this, insidiously, is that researchers and their sponsors get deluded into thinking particular approaches or areas are ripe for further exploration when in fact they’re exhausted, sterile, or just disproportionately difficult for the results they obtain. Universities and multinationals hare off after brilliant successes witnessed in apparently-promising fields, oblivious to the ninety-nine failures encountered on the way to that single positive result.
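
To see how fluke results can come to dominate when only the heads get published, here is a rough back-of-the-envelope sketch in Python. It is not taken from Horgan or Ioannidis. The function name and all the numbers (how many tested hypotheses are really true, the statistical power, the 5% significance threshold) are assumptions chosen purely for illustration.

```python
def published_true_share(prior_true: float, power: float, alpha: float = 0.05) -> float:
    """Share of published positive findings that reflect a real effect,
    assuming only positive results are ever published.

    prior_true: assumed fraction of tested hypotheses that are actually true
    power:      assumed chance a study detects an effect that is really there
    alpha:      assumed chance a study reports an effect that is not there
    """
    true_positives = prior_true * power            # real effects, correctly detected
    false_positives = (1.0 - prior_true) * alpha   # flukes that cleared the threshold
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Illustrative priors only: from a promising field to a nearly exhausted one.
    for prior in (0.5, 0.1, 0.01):
        share = published_true_share(prior_true=prior, power=0.8)
        print(f"prior={prior:.2f}  true share of published positives={share:.2f}")
```

Under these made-up assumptions, a field where only one hypothesis in a hundred is true ends up with a published literature in which most of the positive findings are flukes, even though every individual study played by the rules.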

What if, Horgan – echoing others – asks, the billions spent on gene therapy and molecular medicine have been pretty much squandered, owing to the tendency for the scientific community to keep flipping the research coin until it gets the results it’s expecting?

The implications of that question – for researchers, for pharma companies, for doctors, and for anyone (that is to say, everyone) who might ever be a patient – are immense. And right now, it’s a question we can’t answer. We simply don’t have the data we need.

At present, there are at least three obstacles to getting it. The first is, quite simply, publication bias: the tendency for journals and their peer-reviewers to favour dramatic, positive results over expected, negative ones. And then there’s the flip-side of this, submission bias, the reluctance researchers feel towards publishing negative results. In principle, negative results are extremely valuable to the scientific enterprise. But writing up results is a long and arduous task, and time spent preparing failed experiments for publication will always feel better spent trying out new ones with some chance of success.

Fortunately, awareness of these biases has been steadily increasing with time, while the bar to the publication of negative results has been lowering. On an intellectual and cultural level, the Open Science and Open Data movements have been working to shift the dynamics of research publication for two and a half decades now. And one practical upshot of this for Nottingham researchers is that making their results – any results – public is easier than it’s ever been in the past. The University of Nottingham’s ePrints service was among the first institutional Open Access repositories deployed in the UK; and since February 2016 the Library’s Mediated Deposit service has made Green Open Access as simple as sending an email.

Of course, this doesn’t mean all the problems with negative-results publication are solved. Questions of journal impact factors and other aspects of academic culture mean there’s still considerable work to do in reshaping perceptions of research value and minimising inbuilt biases. And, as the Open Data movement has long recognised, there’s a big difference between publishing data online, and providing it in a form that’s actually usable by researchers. How useful are tables published in a PDF? Or in a tarballed PostgreSQL database? There’s going to be a lot of work to do to ensure that relevant published research is actually discoverable by, and legible for, researchers.

And then, finally, there’s a third problem: simple volume. Negative findings can be much, much more common than positives. As biologist Peter Dudek observed, in response to the suggestion that negative results should be published: ‘if I chronicled all my negative results during my studies the thesis would have been 20,000 pages instead of 200.’

Considerations like this seem to imply that research publication has to be rethought on a fundamental level: not just as an end result of research, but as something that’s embedded into the process of research itself. We need tools for accurately capturing, describing, cleaning, and validating data – and we need them to be not just usable by researchers, but to serve as frictionless elements in the research cycle.

It’s a potentially radical step. If Horgan is right, though, it’s one researchers need to take if we’re going to escape the inexorable downward slide of research outputs. In Rosencrantz and Guildenstern Are Dead, the two comic buffoons end as they began: they never do quite manage to understand the nature of the world they find themselves in.

But for a scientist, of course, that’s not an option.

Posted in Digital research topics