In a recently released article in Pediatrics, Dr. Guowei Li and colleagues from across the globe tackle the issue of missing data in neonatal and perinatal clinical research (10.1542/peds.2023-063101), and an accompanying commentary by Drs. Ryan Kilpatrick and Rachel Greenberg provides helpful perspective on their findings and recommendations (10.1542/peds.2023-064938). This is a “methods paper” that is engaging to read and very accessible to non-statisticians. It addresses the following questions: What is the effect of “data missingness” on study results and does it matter? How can study teams most efficiently and rigorously mitigate the effect of missing data on study results, and what can they do to prevent missing data in the first place?
The authors searched Medline for randomized controlled trials (RCTs) that included newborns or their birth parents, reported childhood outcomes, and were published over a recent 3-year period (1/1/20–12/31/22) in select high-impact general medical and pediatric specialty journals, including Pediatrics. The point of the search was not to conduct a systematic review, but rather to obtain a high-profile snapshot of data missingness and associated data analysis practices.
Of 87 eligible RCTs, almost all (77, 89%) were missing main outcome data, and the mean percentage of randomized participants who had missing primary outcome data was 11%. Among those 77 trials, study teams took differing approaches to the problem:
- Nine (12%) did not discuss the issue at all,
- Most (61, 79%) limited the analysis of data to participants with complete information, and
- Fewer than half (38, 49%) used a statistical technique called sensitivity analysis to understand the effect of the missing data, followed by one or several techniques called imputation to mitigate it. (In statistics, “imputation” simply refers to the procedure of using alternative values in place of missing data.)
The authors guide us through these differing techniques and their strengths and limitations. Just one of the 38 trials yielded different results following sensitivity analysis and imputation work. Still, because we count on RCTs to guide practice, this may be the tip of an iceberg of results that are not as incontrovertible as we believe.
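To make the idea of a sensitivity analysis concrete, here is a minimal sketch of one simple variant, an extreme-case analysis for a binary adverse outcome: the missing outcomes are filled in first in the way most favorable to the treatment arm and then in the least favorable way, and both results are compared with the complete-case estimate. This is an illustration only, not the method used by Dr. Li and colleagues or by any of the reviewed trials. The `Arm` class, the function name, and all counts are hypothetical; the counts were chosen so that 11% of each arm has a missing outcome, echoing the mean reported above.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    events: int      # observed participants who had the (adverse) outcome
    non_events: int  # observed participants who did not
    missing: int     # randomized participants with no outcome recorded

    @property
    def observed(self) -> int:
        return self.events + self.non_events

    @property
    def randomized(self) -> int:
        return self.observed + self.missing


def risk_difference_scenarios(treatment: Arm, control: Arm) -> dict:
    """Risk difference (treatment minus control) under three assumptions."""

    def imputed_risk(arm: Arm, missing_as_event: bool) -> float:
        # Single-value imputation: every missing outcome gets the same value.
        extra = arm.missing if missing_as_event else 0
        return (arm.events + extra) / arm.randomized

    return {
        # Analyze only participants with complete information.
        "complete_case": treatment.events / treatment.observed
        - control.events / control.observed,
        # Most favorable to treatment: missing treated = no event,
        # missing control = event.
        "best_case": imputed_risk(treatment, False) - imputed_risk(control, True),
        # Least favorable to treatment: the reverse.
        "worst_case": imputed_risk(treatment, True) - imputed_risk(control, False),
    }


# Hypothetical trial: 100 randomized per arm, 11 missing outcomes per arm.
treatment = Arm(events=18, non_events=71, missing=11)
control = Arm(events=27, non_events=62, missing=11)

for name, rd in risk_difference_scenarios(treatment, control).items():
    print(f"{name}: risk difference = {rd:+.3f}")
```

With these made-up counts, the complete-case risk difference is about -0.10 in favor of treatment, the best case is -0.20, and the worst case is +0.02. The change in sign under the worst case shows how an apparently clear result can depend on what is assumed about the missing participants. Extreme-case bounds are deliberately pessimistic; methods such as multiple imputation make less drastic assumptions.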
What is so fascinating about missing data is that not all missingness is alike, and the authors share definitions and examples:
- Data can be missing completely at random (eg, one blood tube lost in an otherwise well-run lab; not related to site or participant)
- Data can be missing at random (eg, ventilation data missing on an infant; related to site but not related to participant)
- Data can be missing not at random (eg, 24 hours of information is missing due to a shift change; related to site and possibly to site practices affecting the study)
The type of missingness can help guide how best to handle the issue statistically, as nicely summarized in Table 2 of the article.
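A small simulation can show why the type of missingness matters for the analysis. The sketch below uses the standard statistical definitions, which is an assumption on my part rather than a reproduction of the article's examples: under MCAR every value has the same chance of being lost, under MAR the chance depends only on something observed (here, the study site), and under MNAR it depends on the unobserved value itself. The two sites, the outcome distribution, and all probabilities are hypothetical.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Hypothetical trial: two sites, continuous outcome higher at site B."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        site = rng.choice("AB")
        outcome = rng.gauss(10.0 if site == "A" else 14.0, 2.0)
        rows.append((site, outcome))
    return rows


def observe(rows, keep_probability, seed=1):
    """Keep each row with the probability given by the missingness mechanism."""
    rng = random.Random(seed)
    return [(s, y) for s, y in rows if rng.random() < keep_probability(s, y)]


def complete_case_mean(observed):
    return mean(y for _, y in observed)


def site_adjusted_mean(observed, rows):
    """Weight each site's observed mean by that site's share of all rows."""
    estimate = 0.0
    for site in "AB":
        share = sum(1 for s, _ in rows if s == site) / len(rows)
        estimate += share * mean(y for s, y in observed if s == site)
    return estimate


# Probability that an outcome is recorded, by mechanism (all hypothetical).
MECHANISMS = {
    "MCAR": lambda site, y: 0.80,                          # same for everyone
    "MAR": lambda site, y: 0.95 if site == "A" else 0.50,  # depends on site only
    "MNAR": lambda site, y: 0.95 if y < 12.0 else 0.50,    # depends on the value
}

rows = simulate()
true_mean = mean(y for _, y in rows)
print(f"true mean: {true_mean:.2f}")

results = {}
for name, keep in MECHANISMS.items():
    observed = observe(rows, keep)
    results[name] = (complete_case_mean(observed), site_adjusted_mean(observed, rows))
    print(f"{name}: complete-case = {results[name][0]:.2f}, "
          f"site-adjusted = {results[name][1]:.2f}")
```

In this setup the complete-case mean stays close to the truth under MCAR, falls below it under MAR until the analysis accounts for site, and remains too low under MNAR even after adjusting for site. That last case is why data missing not at random typically call for sensitivity analyses rather than a single corrected estimate.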
Strategies to prevent missing data include proactively addressing barriers such as distance and transportation, establishing ongoing relationships with participants via frequent interactions, and using flexible virtual visits where possible. Perhaps the strongest preventive strategy utilizes participant engagement, following the “Nothing about us without us” mantra first raised by the South African disability rights movement. Participant engagement can range from Community-Based Participatory Research, in which community members are coinvestigators who collaborate equally on study design and research methods, to inclusion of a Community Advisory Board, composed of community members who provide reactive or proactive guidance about the trial.1 At each point in the research, there is a choice about the degree of participant engagement, and use of tools like the Collaborative Research Design, in which partners state their level of interest in each step, can help clarify and codify the partnership.1,2
Both the article by Dr. Li and colleagues and the accompanying commentary help practicing pediatricians and researchers think clearly about the large but often ignored issue of data missingness in research.
References
1. Vaughn, L. M., & Jacquez, F. (2020). Participatory Research Methods – Choice Points in the Research Process. Journal of Participatory Research Methods, 1(1). https://doi.org/10.35844/001c.13244
2. Makleff, S. (2015, June 4). Reflections on establishing and sustaining partnership. Rethinking Research Collaborative. https://rethinkingresearchcollaborative.com/2015/06/04/reflections-on-establishing-and-sustaining-partnership/ Accessed January 22, 2024.
Acknowledgment
Thank you to Sarah Ronis, MD, PhD, for bringing these articles to my attention.