
Patients who consent to participate in clinical trials are informed that their data will be used for both primary and secondary research purposes. When reassured that data will be anonymized, the overwhelming majority want the value of their data to be maximized through reuse.

Secondary analysis of patient-level data is valuable. Meta-analysis and other methods that combine datasets increase statistical power, enabling inferences about rare adverse events, treatment effects in patient subsets, and reproducibility of results that single datasets cannot support. Leveraging these data supports accelerated R&D and expanded perspectives on broader populations of patients.
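To make the point about statistical power concrete, here is a minimal sketch of one common pooling approach, inverse-variance (fixed-effect) meta-analysis. The choice of method is an assumption made for illustration, since the text above does not name one, and the function name and all numbers are hypothetical rather than drawn from any real trial.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Pool per-study effect estimates with inverse-variance weights.

    Hypothetical helper for illustration: assumes every study estimates
    the same underlying effect (the fixed-effect assumption).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    # More precise studies (smaller standard error) get larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    # The pooled standard error shrinks as studies are added.
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Illustrative, made-up log hazard ratios and standard errors for four trials.
estimates = [-0.20, -0.35, -0.10, -0.25]
std_errors = [0.20, 0.25, 0.30, 0.20]

pooled, pooled_se = pool_fixed_effect(estimates, std_errors)
print(f"pooled estimate: {pooled:.3f}, pooled SE: {pooled_se:.3f}")
print(f"smallest single-trial SE: {min(std_errors):.3f}")
```

Under these assumptions the pooled standard error is always smaller than that of the most precise single trial, which is why combined datasets can detect effects that individual trials cannot. Real secondary analyses would also need to account for heterogeneity between trials, for example with a random-effects model.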

Project Data Sphere™ operates an open access data platform where de-identified patient-level oncology data is shared. Reuse of the data has yielded more than 100 peer-reviewed publications that have influenced the direction of research and patient care in oncology. ClinicalTrials.gov publishes an inventory of clinical trials with information on the protocol, sample sizes, and summary results. Other platforms, including CSDR, Vivli, YODA, and SOAR, collectively provide inventories, metadata, and controlled access to the majority of clinical trials sponsored by the pharmaceutical industry. Patient-level data from NIH-funded trials is shared openly on the NCTN data archive.

Myth #1: Clinical trial data is shared widely and reused

With so many groups participating in some aspect of data sharing, it is surprising that fewer than 1% of the datasets discoverable on controlled access platforms are being reused. Has the research community delivered on its promises of transparency and sharing?

Controlled access models are configured to limit and discourage data sharing. Have we created incentives for data reuse? Are we systematically assessing the value of reuse?

Myth #2: Sharing data harms patients

This myth reasons that researchers who don’t understand the full context of the datasets they are reanalyzing will generate spurious inferences that harm patients by undermining safe and effective medicines.

There is no evidence that this is true. NIH requires the sharing of patient-level data for government-funded studies. Project Data Sphere provides open sharing of patient-level data for 200 oncology trials and more than 100,000 patients. No examples of spurious inferences are known.

Myth #3: The sponsor owns the data

Sponsors invest money in trials. Scientists contribute considerable time. Patients invest their lives. Research works best when incentives for innovation (IP, publications) are in place. That does not explain the reticence to share data from studies that were completed years ago. Too many organizations have taken the default position not to share data; otherwise, we would see more reuse. The focus should be squarely on patients and their desire for reuse of data to accelerate and expand discoveries.

The way forward

The research community needs to recalibrate incentives and adopt a lifecycle approach to sharing data, with the a priori intent to share all data at the appropriate milestone. Further, the community must passionately promote the greater good, open science, and respect for the patients who volunteer their participation.

To address these issues and discuss solutions, Project Data Sphere, in partnership with the FDA, is hosting the 10th FDA-PDS Symposium virtually on September 23, 2021. The symposium will bring together patient advocates and clinical research leaders from industry, government, and academia to define strategies to accelerate the delivery of value from data sharing.

Learn more about how data is being reused to fuel research.