Data sharing across jurisdictions is a challenge and, typically, the permissions and agreements obtained are site- and project-specific. Years of project time can be consumed with the development and approval of data sharing agreements, and inevitably some jurisdictions never participate because of regulations and restrictions that cannot be negotiated. In this issue of Pediatrics, Glinianaia et al present survival estimates up to age 10 years for children born with a major birth defect,1 leveraging a small proportion of the data from the EUROlinkCAT project,2 which supported 22 EUROCAT (https://eu-rd-platform.jrc.ec.europa.eu/eurocat) registries in 14 European countries to link their data on live born infants with birth defects to mortality, hospital discharge, prescription, and educational databases. After each registry completed its within-country linkages, a common data model (CDM) consisting of standardized variables required for analyses was developed, and each registry transformed its data into the CDM format using registry-specific analytic syntax; validation scripts were run to confirm that data were transformed properly. A protocol and syntax scripts were then developed centrally to perform analyses on registry-specific data for the ultimate submission of aggregated data and analytic results, rather than sharing individual-level data. Glinianaia et al used only the data linkages with mortality and vital statistics; yet additional analyses of data from the EUROlinkCAT project are under development.
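The transform-and-validate step described above can be illustrated with a minimal sketch. It assumes a hypothetical four-field CDM record and hypothetical local column names (`dob`, `dx`, `dod`); none of these names come from the EUROlinkCAT protocol, and the real CDM and validation scripts are considerably more extensive.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


# Hypothetical CDM record. The field names are illustrative assumptions,
# not the actual EUROlinkCAT variable list.
@dataclass(frozen=True)
class CdmRecord:
    registry_id: str
    birth_date: date
    defect_code: str             # e.g., a diagnosis code for the birth defect
    death_date: Optional[date]   # None when no linked death record exists


def to_cdm(raw: dict, registry_id: str) -> CdmRecord:
    """Registry-specific transform: map one local row to the CDM format.

    The local keys ("dob", "dx", "dod") are assumed for this sketch; each
    registry would write its own mapping for its own column names.
    """
    death = raw.get("dod")
    return CdmRecord(
        registry_id=registry_id,
        birth_date=date.fromisoformat(raw["dob"]),
        defect_code=raw["dx"].strip().upper(),
        death_date=date.fromisoformat(death) if death else None,
    )


def validate(record: CdmRecord) -> list[str]:
    """Return a list of validation problems; an empty list means none found."""
    errors = []
    if not record.defect_code:
        errors.append("missing defect_code")
    if record.death_date is not None and record.death_date < record.birth_date:
        errors.append("death_date precedes birth_date")
    return errors
```

In this sketch, each registry would supply its own `to_cdm` mapping, while the `validate` checks would be written once and run unchanged at every site.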
Certainly, the establishment of EUROlinkCAT was not quick and easy. Each registry in EUROCAT had to have the appropriate local ethics permissions and procedures in place to participate, and then was responsible for obtaining any additional permissions needed to link the birth defects registry data to other data systems. EUROlinkCAT took several years to develop, and in some countries new legal foundations had to be established to implement the data linkages. However, now that it is established, the CDM provides an infrastructure that allows for future multinational studies to be conducted much more efficiently.3–5
This CDM approach is used in several spaces in the United States. It is the model on which highly sophisticated clinical research data networks are based, such as the National Patient Centered Outcomes Research Network (PCORnet; https://pcornet.org/data/) and the Food and Drug Administration’s Sentinel Initiative (https://www.sentinelinitiative.org/about).
Including data from more than 66 million individuals, PCORnet is a partnership of several smaller clinical research networks that came together either regionally or around the populations they serve (eg, pediatric hospitals or safety-net settings) and a coordinating center. In the PCORnet model, data are never actually shared or pooled across network partners; each network conforms to the CDM, and queries of the data are distributed across PCORnet such that researchers get answers rather than data. Representing over 68 million individuals, the Sentinel Initiative involves health care organizations working with the Sentinel Operations Center (SOC) to use billing and electronic health record data to assess medical product safety. The SOC creates programs which are then run by the health care organizations on their data. Deidentified results are sent back to the SOC and combined into a single result for the Food and Drug Administration.
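The "answers rather than data" idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the PCORnet or Sentinel implementation: the record fields (`defect_code`, `died`), the function names, and the crude survival proportion are all hypothetical. A real analysis would use time-to-event methods rather than a simple proportion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate result returned by one site; contains no individual rows."""
    site: str
    n_children: int
    n_deaths: int


def run_local_query(site: str, records: list[dict], defect_code: str) -> SiteSummary:
    """Centrally written query, executed inside the site on its own data.

    Only counts leave the site. The record keys are assumed CDM fields
    invented for this sketch.
    """
    cohort = [r for r in records if r["defect_code"] == defect_code]
    deaths = sum(1 for r in cohort if r["died"])
    return SiteSummary(site, len(cohort), deaths)


def combine(summaries: list[SiteSummary]) -> dict:
    """Coordinating-center step: pool the site-level aggregates."""
    total = sum(s.n_children for s in summaries)
    deaths = sum(s.n_deaths for s in summaries)
    # Crude proportion surviving; None when no site contributed any children.
    survival = (total - deaths) / total if total else None
    return {"n_children": total, "n_deaths": deaths, "crude_survival": survival}
```

The coordinating center calls `combine` on the summaries it receives and never sees the records themselves, which is the property that distinguishes this model from a pooled repository.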
Currently, the birth defects surveillance community in the United States has not adopted a CDM approach. In the United States, most 5- to 10-year survival analyses of children born with birth defects have relied on data from a single, high-quality, active surveillance program such as those in North Carolina, Texas, and metropolitan Atlanta.6–8 A handful of successful multistate survival projects9–11 have been coordinated through the National Birth Defects Prevention Network (www.nbdpn.org), by using a repository of individual-level data shared only with the Centers for Disease Control and Prevention for the purpose of specific analytic projects. The pooling of individual-level data at the Centers for Disease Control and Prevention allowed for centralized data cleaning and analysis; one challenge faced was that only a limited number of individuals could access the data, restricting the number of projects that could be conducted and the timeliness of their completion.
As an alternative to a data repository model in which jurisdictions need to share individual-level data with a centralized source, a CDM collaboration similar to the EUROlinkCAT project could be considered in the context of birth defects surveillance in the United States. The development of a birth defects CDM would be challenging and not feasible without substantial investments of time and resources. It is an aspirational vision at this time but something worthy of discussion within the birth defects community. If appropriate within-jurisdiction approvals were acquired and maintained, linkages between birth defects surveillance data and other jurisdiction-specific data sources could be conducted. To be part of the CDM collaborative, jurisdictions would need to agree to store standardized data elements locally and conform to centrally developed analytic methods and techniques. Analyses of data using a CDM also presume a willingness of investigators to trust the aggregated results of distributed data queries. There would be no centralized data repository; all data would reside securely within the jurisdictions but be accessible to queries by members of the collaborative. Of course, long-term survival is not the only outcome of importance in understanding the experience of living with birth defects across the lifespan. Other measures of quality of life, including educational outcomes, comorbidities, and disability status, are critical to ascertain; some of these key metrics could be captured via jurisdiction-specific data linkages and, similar to EUROlinkCAT, become elements of a US birth defects CDM.
Although birth defects affect approximately 3% of all births in the United States, individual birth defects are rare; multistate collaborations are essential to have large enough sample sizes to assess the long-term outcomes of specific birth defects while also accounting for racial, ethnic, and socioeconomic diversity. The EUROlinkCAT project using CDM illustrates the opportunities that this level of collaboration can provide.
FUNDING: No external funding.
COMPANION PAPER: A companion to this article can be found online at www.pediatrics.org/cgi/doi/10.1542/peds.2021-053793.
The findings and conclusions in this report are those of the authors and do not represent the official position of the Centers for Disease Control and Prevention.
Dr Gilboa conceptualized this commentary and drafted the initial manuscript; Drs Tepper and Reefhuis reviewed and revised the final manuscript; and all authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.
Competing Interests
CONFLICT OF INTEREST DISCLOSURES: The authors have indicated they have no conflicts of interest relevant to this article to disclose.