
Breaking up. Teenage years. Replicating scientific research. These things are hard.

Five years ago, a pair of cancer researchers published an article in Nature lamenting what they called the “remarkably low” rate of success in turning early-stage oncology trials into marketable drugs. The looming issue, according to the authors, was this: Cancer studies are shockingly hard to reproduce. How hard? Results in just six of 53 “landmark” trials could be confirmed by further analysis.


That paper went on to inspire an ambitious initiative to verify the findings of dozens of high-impact cancer studies. The long-awaited early returns are in, and the results aren’t particularly encouraging.

Although the details aren’t exactly tidy, the bottom-line results are these: Two studies generated findings similar to the originals; two were inconclusive; and one failed to replicate.

That last study, from 2010, found that dosing tumors with protein fragments made cancer drugs work better, a result that didn’t hold up in the subsequent analysis. Erkki Ruoslahti, a coauthor of the unreplicated study, told Nature that he fears the latest news could hurt the ability of his company, DrugCendR, to raise money for the therapy. “I’m sure it will. I just don’t know how badly,” he said.


“What we’re seeing here is that there are a lot of opportunities to increase reproducibility,” Tim Errington, the “metascience manager” at the Center for Open Science, which is leading the cancer initiative, told STAT.

“Reproducibility is hard, and once you fail to reproduce something, it isn’t always obvious why,” added Brian Nosek, a psychologist at the University of Virginia, in Charlottesville, who runs the Center for Open Science. (Full disclosure: Retraction Watch, which we cofounded, is partnering with the center on another project.) The new articles “are additional evidence that we don’t quite understand this as well as we thought we did.”

Nosek has been leading an effort to push for reproducibility in psychology, including a 2015 study in Science that found that only about 40 percent of results in 100 papers he and his colleagues analyzed were replicable.

The new analyses, which will be published Thursday in eLife, are the first of many papers to come from the group, which also includes scientist Elizabeth Iorns and Science Exchange. They have already finished three more papers, which they are now submitting for publication, and have roughly 21 more studies to conduct. In addition to replication attempts, the team intends to publish meta-analyses of their studies to get a bird’s-eye view.

The effort isn’t cheap: Each replication took an average of nearly seven months to complete, at an average cost of about $27,000, according to data from the investigators. (Budget concerns had earlier forced the group to cut the number of trials it could re-run from 50 to 37.) On the other hand, that’s barely visible against the backdrop of the National Institutes of Health’s annual budget of more than $30 billion.

In the kind of self-reflection you’d hope for in this sort of venture, an editorial accompanying the papers is quick to point out the potential shortcomings of the project’s approach. For example, the contract researchers the project relied on to conduct the replications, while unbiased, might not have had the same level of expertise as the original investigators.

Similarly, the rules of the project stipulated that the labs not change experiments mid-stream in response to vague results. But, the authors of the unsigned editorial write, “an academic laboratory confronted with this situation while making a serious effort to determine whether a result is reproducible would perform the experiments in different ways, with different conditions, to generate clear results and to test whether there is some condition under which the original observation holds.”

Not lost on Errington is the irony that he and his colleagues are publishing just five replication efforts, a sample too small to support sweeping conclusions. Still, he said, “the fact that we already have things that are quite different from each other in the first five is a peek at what we’re going to see” in future studies.

Although many researchers the group contacted were gracious and eager to help — providing additional information about their methods — others were less forthcoming. One group took 111 days to respond; another ignored the request.

In the present culture of how science is done, intransigence isn’t surprising. “One of the broader goals I have,” Nosek told STAT, “is to shift the view of replication from a threat to a compliment.”

Succeeding in that would do at least as much to boost science as replicating a few studies.