Our secure, open data hub brings together health systems and research communities from across the world, enabling researchers to push the boundaries of medical science.
Description¶
Nightingale Open Science is a platform that connects researchers with world-class medical data. We work closely with health systems around the world to create and curate datasets of medical images linked to ground-truth labels. We carefully deidentify the data and make it available for non-profit research on our cloud infrastructure.
We focus on datasets that will help researchers make breakthroughs on unsolved medical problems. Consider sudden cardiac death, which kills 300,000 Americans every year. Many papers have been written on factors that put people at higher risk, but even after looking back at the vast majority of deaths, we still cannot find an identifiable cause. Or cancer: improved screening since the 1990s has helped us identify more small tumors, but we still haven’t been able to translate this into lower rates of late-stage diagnoses or death.
What is current clinical research missing?¶
We believe the key to solving these mysteries lies in the massive volumes of complex imaging data health systems produce every day: electrocardiogram waveforms, x-rays and CT scans, tissue biopsy images, and more. Today, these data are interpreted by humans, but our research suggests that machine learning can open up new ways of ‘seeing’ signals and patterns in the data that humans cannot detect.
Unfortunately, existing medical data with the potential to shed light on these patterns have historically been siloed. By making these data accessible to broad groups of interdisciplinary researchers, we can begin to unlock discoveries that save lives, surfacing previously unknown patterns of disease.
This is the vision underlying Nightingale Open Science: an open platform housing cutting-edge, deidentified medical datasets that are available to a diverse, global community of researchers.¶
Our goal is to foster researcher collaborations across disciplines, bringing together computer science researchers, clinicians, and economists around critical questions that will push the boundaries of medical research and spur the field of computational medicine.
Datasets¶
ECG Waveforms¶
[ed-bwh-ecg](https://docs.ngsci.org/datasets/ed-bwh-ecg/): Assessing Heart Attack Risk (104,000 ECG waveforms)
[silent-cchs-ecg](https://docs.ngsci.org/datasets/silent-cchs-ecg/): Diagnosing ‘Silent’ Heart Attack (48,000 ECG waveforms)
[arrest-ntuh-ecg](https://docs.ngsci.org/datasets/arrest-ntuh-ecg/): Subtyping Cardiac Arrest (24,106 ECG waveforms)
Microscopy Images¶
[brca-psj-path](https://docs.ngsci.org/datasets/brca-psj-path/v2.1/): Identifying High-Risk Breast Cancer (175,000 biopsy slides)
[tb-wellgen-smear](https://docs.ngsci.org/datasets/tb-wellgen-smear/): Detecting Active Tuberculosis (75,000 TB smear images)
X-ray Images¶
[fracture-aimi-xray](https://docs.ngsci.org/datasets/fracture-aimi-xray/): Predicting Fractures (64,000 chest x-rays)
[covid-psj-xray](https://docs.ngsci.org/datasets/covid-psj-xray/): Emergency Triage of Covid-19 Patients (7,500 chest x-rays)
Multiple Diagnostics¶
[tamil-jpal-multi](https://docs.ngsci.org/datasets/tamil-jpal-multi/data-dictionary.html): Tamil Nadu J-PAL Data Dictionary (4,500 participants)
Data Types¶
Demographics
Clinical
ECG
Microscopy
X-ray
Register¶
Request access to the data here.