
In the mid-1990s, a practice-changing clinical trial refined how people with mild asthma use inhaled albuterol. Instead of taking it on a regular schedule, the trial suggested it was better to use albuterol only when asthma symptoms appeared. That trial also had a hidden bonus: It yielded detailed data on inhaler use and asthma symptoms for hundreds of patients with asthma. Those data were used to help design additional trials in similar patients, studies done more quickly and at lower cost.

About five years later, as electronic medical records (EMRs) began to enter the scene, clinicians were asked to document their care of patients in more detail than ever before. The goals for this effort were to improve communication among caregivers and to inform payment reimbursement. But as with the asthma trial above, these data are having another life.


Today, EMR data are being used to predict domestic abuse, to characterize genetic risks for rheumatoid arthritis and non-responsiveness to antidepressants, and to identify distinct subgroups of children with autism. Unknown signals can often lurk in a mountain of electronic data.

Clinical trials offer a source of rich medical data. These high-quality data have been meticulously gathered and extensively curated. Couple that with the ability to store them in formats that can be readily manipulated and you have a truly amazing resource.

With the growth of data analytics as a science, clinical trial data are more than just a simple resource. Increased computing power and many more people trained to interpret such data make possible even more creative and informative uses of clinical trial data. Given these advances, we think it’s a safe bet that independent analysts will find even more compelling, unanticipated applications within data gathered for one reason.


Yet while data sharing seems, on its surface, to be a simple ask, the technicalities that underlie the endeavor aren’t so simple. For instance, when a researcher collects data for a study, she does so with no expectation that they will need to be used by others.

What we have seen, and what researchers we’ve spoken to have said, is that understanding clinical trial data sets is not easy. A great deal of time can be spent getting the data straight. No matter how carefully data are curated, there are always questions. If someone wants to use a data set, they need to understand it and use it wisely and responsibly.

From our perspectives as a trialist and a data analyst, we support clinical trial data sharing. We also have heard, and published in the pages of the New England Journal of Medicine, that not all members of the scientific community agree on how to share clinical trial data. We know that reusing data will not be easy, but it has the potential to teach us things we did not know.

This aspirational approach to data use maximizes the contribution of the patients who put themselves at risk to give us their data. But the community needs to come together to find the right path forward. We can only gain from having many more eyes on these precious studies.

Over the past year, the call for widespread access to clinical trial data has grown louder. We will not reach the desired endpoint unless all constituencies in this discussion — trialists, data analysts, and patients — come together to develop a framework for data sharing. If data are collected using standards that are compatible with each other, they will be more easily shared and provide the greatest return.

We are cochairs of the SPRINT Data Analysis Challenge. On Nov. 1, the challenge opens up the data set from the SPRINT trial on intensive blood pressure management. It asks the community to look at these data and tell us something we didn’t know before. The top entries will be presented at a data-sharing summit in early April 2017.

Someday, researchers doing a cardiovascular trial could easily share their data with others conducting trials on cancer, kidney disease, or even schizophrenia and learn something that hadn’t been known before. These insights may help us design clinically informative trials that are not misled by spurious relationships among the data. The problem is that we are in the early stage of the data-sharing process and there are not yet many examples of insights that have emerged from it.

For the promise of data sharing to become a reality, we need to work together to create examples of success that show how we can use what we know to learn even more. We hope that the SPRINT Data Analysis Challenge is a step toward that goal.

Jeffrey Drazen, MD, is editor-in-chief of the New England Journal of Medicine. Isaac Kohane, MD, is chair of Harvard Medical School’s Department of Biomedical Informatics.
