Strategies to improve neonatal outcomes rely on accurate collection and analysis of quality indicators. Most low- and middle-income countries (LMICs) fail to monitor facility-level indicators, partly because recommended and consistently defined indicators for essential newborn care (ENC) do not exist. This gap prompted our development of an annotated directory of quality indicators.
We used a mixed-methods study design. In phase 1, we selected potential indicators by reviewing existing literature. An overall rating was assigned based on subscores for scientific evidence, importance, and usability. We used a modified Delphi technique for consensus-based approval from American Academy of Pediatrics Helping Babies Survive Planning Group members (phase 2) and secondarily surveyed international partners with expertise in ENC, LMIC clinical environments, and indicator development (phase 3). We generated the final directory with guidelines for site-specific indicator selection (phase 4).
We identified 51 indicators during phase 1. Following Delphi sessions and secondary review, we added 5 indicators and rejected 7. We categorized the 49 indicators meeting inclusion criteria into 3 domains: 17 outcome, 21 process, and 11 educational. Among those, we recommend 30 for use, meaning these indicators should be selected preferentially when appropriate; we recommend 9 for selective use, primarily because of data collection challenges, and 10 for use with reservation because of limitations in scientific evidence or usability.
We developed this open-access indicator directory with input from ENC experts to enable appraisal of care provision, track progress toward improvement goals, and provide a standard for benchmarking care delivery among LMICs.
Quality indicators are standardized, evidence-based measures of health care quality that enable tracking of progress toward improvement goals. To date, facility-level indicators for essential newborn care have been variably defined and recommended for use in low- and middle-income countries.
This open-access directory defines, rates, and recommends 49 quality indicators selected by international experts in essential newborn care. Supplemental materials provide comprehensive site-specific guidance in indicator selection for identifying gaps in care quality and for educational training program evaluation.
Neonatal mortality represents a disproportionately large percentage of deaths in children younger than age 5 years in low- and middle-income countries (LMICs).1,2 As countries aim to accelerate progress toward achieving the 2030 Sustainable Development Goal (SDG) of <12 neonatal deaths per 1000 live births, evaluating quality of care gaps remains at the forefront of development efforts.3–6 Quality improvement (QI) methodology has been identified by leading international health organizations as a means to improve essential newborn care (ENC) and measure progress in reducing preventable neonatal deaths.7,8
ENC describes routine practices in the care of the newborn around the time of birth, including basic resuscitation, thermal care, hygiene practices, and nutrition support. Adherence to ENC substantially reduces mortality risk and improves postnatal growth and development.9,10 A number of educational programs, such as the World Health Organization (WHO) Essential Newborn Care Course and American Academy of Pediatrics (AAP) Helping Babies Survive curriculum, have been developed to teach birth attendants ENC in LMICs.11,12 Assessment of how these programs translate into high-quality care delivery requires facility-level quality indicators.13–15 To date, these indicators have been variably defined and recommended. Consideration of local context also has an impact on the feasibility, importance, and utility of specific indicators for each health facility.
In 2014, the WHO proposed a list of 15 indicators of quality of care for maternal and newborn health in LMICs for global data comparison.16 A subsequent study of the feasibility of applying the proposed WHO quality indicators in 10 countries in Africa and Asia reported that several of the proposed newborn-specific indicators required revision to be operationalized in local settings or required additional data sources and data collection methods (ie, 33% of indicators lacked definition clarity and 67% lacked existing facility register data).8 The study suggested that further guidance is needed to support health facilities in data measurement to characterize best practice gaps accurately.
As the global health community strives to improve care delivery at birth and accelerate progress toward the 2030 SDG,17 the AAP Helping Babies Survive Planning Group (HBSPG) developed an annotated directory of quality indicators to aid newborn providers in low-resource settings in facility-level data tracking to identify gaps in ENC.
Methods
The HBSPG assembled a QI Task Force to lead this initiative using a 4-phase process to identify and evaluate quality indicators related to ENC. The QI Task Force included 3 current HBSPG members and a senior neonatal-perinatal medicine fellow (4 study authors) with QI expertise. Indicators that became part of a final directory were categorized as recommended, recommended for selective use, or recommended with reservation. The QI Task Force developed an accompanying implementation guide for facility use. We outline the 4-phase process next.
Phase 1: Development of Draft Indicator Directory and Framework for Expert Review
Search Strategy
The QI Task Force used a systematic appraisal of evidence-based newborn care standards to compile the initial indicator directory (Table 1). Clinical studies, consensus statements, and expert committee recommendations were identified by keyword search in the PubMed, DynaMed, and Google Scholar databases. Boolean operators were used to connect the search terms: “quality indicator” AND “essential newborn care” AND “low- and middle-income countries” AND “(neonate OR infant OR newborn)” AND “(helping babies breathe OR survive)” AND “essential care for every baby” AND “essential care for small babies.” All references were published in English. The year of publication was not used to limit the search criteria. Titles and abstracts were screened for articles that define, describe, analyze, or use quality indicators in the newborn period (<28 days after birth). Articles that addressed maternal care or did not describe quality indicators were excluded. Eligible study designs included primary sources (eg, cohort studies, survey research, case studies) and secondary sources (eg, reviews, systematic reviews, meta-analyses, practice guidelines, and international health care governing standards). Eligible study methods included qualitative, quantitative, observational, descriptive, or mixed methods. Selected references for each quality indicator can be found in Supplemental Fig 2.
Table 1 legend: The left column displays the initial indicator directory draft generated by the American Academy of Pediatrics Helping Babies Survive Quality Improvement Indicators Task Force; the right column displays the finalized directory following modified Delphi sessions and international expert panel review. Asterisks mark indicators that received a final rating discrepant with the majority consensus from the REDCap study survey.
Grouping and Defining Quality Indicators
Indicators were then grouped by the QI Task Force into quality domains (eg, outcome, process) and defined by a numerator and a denominator, or an alternative format when appropriate. Proportion- and rate-based indicators quantify how often a desired outcome or process occurs relative to the eligible population, expressed as a proportion or rate.18 Mean-based indicators summarize normally distributed continuous variables as mean ± SD (eg, time to bag-mask ventilation). Median-based indicators summarize nonnormally distributed continuous variables as a median with interquartile range (eg, early initiation of breastfeeding). A sentinel indicator identifies individual events that are intrinsically undesirable and trigger further analysis for risk management (eg, preventable perinatal death).18
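For facilities computing these measures from register data, the following is a minimal Python sketch of the 4 indicator formats; the function names and example measures are ours for illustration and are not defined by the directory.

```python
from statistics import mean, median, quantiles, stdev

def proportion_indicator(numerator: int, denominator: int) -> float:
    # Proportion-based: eg, newborns receiving delayed cord clamping
    # divided by all live births in the reporting period.
    return numerator / denominator

def mean_indicator(values):
    # Mean +/- SD for normally distributed continuous measures,
    # eg, time to bag-mask ventilation in seconds.
    return mean(values), stdev(values)

def median_indicator(values):
    # Median with interquartile range for nonnormally distributed
    # measures, eg, time to initiation of breastfeeding in minutes.
    q1, _, q3 = quantiles(values, n=4)
    return median(values), (q1, q3)

def sentinel_indicator(event_count: int) -> bool:
    # Sentinel: any occurrence (eg, a preventable perinatal death)
    # should trigger further case review.
    return event_count > 0
```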
Rating Quality Indicators
We systematically evaluated each indicator based on the quality of scientific evidence (appraisal of published evidence), importance (to key stakeholders, including providers, health care systems, patients, and parents), and usability (based on feasibility of measurement and utility for providers and health care systems in monitoring quality of care). Evaluation in each of these domains was assigned by each member of the QI Task Force; a final evaluation was made by consensus among the 4 members. Figure 1 outlines the scientific evidence, importance, and usability rating scales. The QI Task Force then assigned an overall assessment rating score (Fig 1) for each indicator based on the summative subscores in the 3 categories. A rating of recommended was designated for indicators that should be selected preferentially from the group when appropriate. A rating of recommended with reservation was assigned for indicators with limitations in scientific evidence or usability. A rating of rejected was assigned for indicators with low-quality scientific evidence, importance, and usability (Fig 1).
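As a concrete illustration of this rollup, the hypothetical sketch below maps the 3 subscores to an overall rating; the numeric scale and cutoffs are our assumptions for illustration only, because the actual scales appear in Fig 1 and are not reproduced here.

```python
def overall_rating(evidence: int, importance: int, usability: int) -> str:
    # Subscores assumed to run 1 (low) to 3 (high); thresholds are
    # illustrative assumptions, not the published Fig 1 scale.
    if evidence == 1 and importance == 1 and usability == 1:
        return "rejected"  # low quality across all 3 categories
    if evidence == 1 or usability == 1:
        return "recommended with reservation"  # evidence/usability limits
    return "recommended"

print(overall_rating(evidence=3, importance=3, usability=2))  # recommended
```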
Phase 2: Modified Delphi Technique
The HBSPG reviewed and refined the draft indicator directory through a modified Delphi technique, which combines a structured questionnaire with a meeting of experts to synthesize knowledge and generate a group consensus.19,20 Consensus was determined by first tabulating the ratings assigned by survey respondents. When a clear consensus was not achieved, the assignment was determined during discussions with the HBSPG, as outlined next.
We developed a survey tool in REDCap (Research Electronic Data Capture) listing the quality domains and indicators with the assigned overall assessment rating score from phase 1.21,22 The survey requested the assignment of an overall rating score for each indicator. Open textboxes allowed for comments on the proposed indicators and suggestions for additional indicators. Because of the coronavirus pandemic, a secure virtual platform replaced an in-person meeting. The HBSPG was selected as the primary expert panel to evaluate study indicators as this group is composed of 14 leading experts in the field of resuscitation science and ENC.
The survey was distributed to HBSPG members on July 7, 2021. The QI Task Force compiled the survey rating results and free-text responses for each indicator ahead of two 90-minute modified Delphi sessions on July 19, 2021, and August 2, 2021. A review of the survey results and expert commentary occurred during the sessions to achieve consensus on revisions to the draft indicator directory ahead of the international partner review. The QI Task Force facilitated the discussions.
Phase 3: International Partner Expert Review
The survey was revised based on commentary during the HBSPG Delphi sessions. The revised survey was distributed to experts outside the HBSPG on October 21, 2021, to integrate input from key stakeholders, international organizations, and international experts in ENC, LMIC clinical environments, and indicator development. Participants included representatives from the WHO, Every Newborn-Birth Indicators Research Tracking in Hospitals Study Group, African Neonatal Association, United Nations International Children’s Emergency Fund, and Save the Children US. Respondents from these organizations were identified through contacts on the QI Task Force or by members of the HBSPG. The QI Task Force reviewed survey responses; however, there was no virtual meeting with these individuals.
Phase 4: Development of the Final Indicator Directory
The QI Task Force completed the modified Delphi process by incorporating the international expert review into the previous survey results and commentary from the HBSPG. The QI Task Force adjudicated overall assessment ratings for the draft indicator directory when disparities existed between review groups or among reviewers and developed a final indicator directory (Table 2). Indicators meeting the criteria for inclusion in the final directory were those rated recommended, recommended for selective use, or recommended with reservation; indicators rated rejected were excluded from the final directory. Majority consensus was tabulated from ratings submitted after the subcategory recommended for selective use was added during the first Delphi session. The QI Task Force generated implementation guidelines for site-specific indicator selection and use to accompany the final indicator directory.
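The tabulation step is simple enough to express in a few lines; the following is an illustrative Python sketch of the majority-versus-plurality handling described here and in the Results, with made-up ballot data.

```python
from collections import Counter

def tabulate_consensus(votes: list[str]) -> tuple[str, str]:
    # Top-rated category and its basis: a majority (>50% of votes)
    # stands; a plurality or tie goes to Delphi-session adjudication.
    counts = Counter(votes)
    top, n = counts.most_common(1)[0]
    basis = "majority" if n > len(votes) / 2 else "plurality (adjudicate)"
    return top, basis

# Illustrative ballot for a single indicator from 10 respondents
votes = ["recommended"] * 4 + ["recommended for selective use"] * 3 \
        + ["recommended with reservation"] * 3
print(tabulate_consensus(votes))  # ('recommended', 'plurality (adjudicate)')
```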
Results
In phase 1, the QI Task Force identified 51 initial indicators in 4 quality domains (Table 1): 16 outcome, 18 process, 4 facility preparedness, and 13 educational indicators. The outcome and process domains encompassed indicators that measured health outcomes and processes of care around the time of birth, also known as impact and clinical performance indicators, respectively. The facility preparedness domain included indicators evaluating preparedness for care delivery at birth. The educational domain included indicators that assessed the teaching and learning environment essential to an effective education program.
In phase 2, 10 of 14 HBSPG members completed the survey tool ahead of Delphi session 1. All HBSPG members were present for at least 1 of the 2 sessions, with 10 of 14 attending session 1 and 11 of 14 attending session 2.
During Delphi session 1, HBSPG members suggested eliminating facility preparedness indicators as a discrete domain. Additionally, HBSPG members proposed the inclusion of a subcategory within the overall assessment rating scale (Fig 1) to account for indicators with high-quality scientific evidence and importance but significant data collection limitations impacting usability. The QI Task Force elected to label this subcategory as recommended for selective use to denote the need to use these indicators in settings where data collection specific to the indicator is feasible. The secondary survey of the broader panel of international partners included this new subcategory rating.
In phase 3, a convenience sample of 16 international partners from 4 leading intergovernmental public health organizations and 1 nongovernmental organization was secondarily surveyed. Thirteen of 16 partners (81%) submitted survey ratings, and the remaining 3 of 16 (19%) provided free-text responses only.
In phase 4, the QI Task Force incorporated survey results, modified Delphi session commentary, and international partner review to generate the final study directory with accompanying definitions and ratings (Table 2). Forty-four of the initial indicators met inclusion criteria, and 7 (3 process, 2 facility preparedness, 2 educational) were rejected. Five indicators (1 outcome and 4 process) were added to the initial directory in phase 2, resulting in 49 quality indicators in 3 domains: 17 outcome, 21 process, and 11 educational (Table 2). Among the 49 included indicators, 30 were recommended for use, 9 were recommended for selective use, and 10 were recommended with reservation (Table 2). A majority consensus did not result from survey responses for 2 of 51 indicators (antepartum stillbirth rate and time to bag-mask ventilation), and the QI Task Force assigned their overall assessment ratings based on modified Delphi session commentary. We assigned a final overall assessment rating for 14 of 51 indicators based on a plurality, but not a majority, response on the survey; the QI Task Force made these assignments based on discussions during the virtual Delphi sessions and the international partner review. These indicators appear with an asterisk in Table 1. Supplemental Fig 2 outlines the Delphi discussions that prompted alteration of the majority survey rating for these indicators and includes commentary for each indicator to aid facilities in indicator selection.
Discussion
Disparities in neonatal morbidity and mortality persist despite global health efforts to improve the quality of ENC.1,23 Concerted efforts have focused on evaluating care coverage and the quality of care provided.6,23–25 Hospital systems face the challenge of determining whether care provided at birth aligns with quality standards set by global governing agencies.
The responsibility of the QI Task Force was to develop a comprehensive directory of quality indicators to aid health care facilities in critically evaluating the quality of care delivered in the perinatal period. The directory is intended for use during QI efforts to improve newborn care and to evaluate educational curriculum quality (eg, the Essential Newborn Care Course). To that end, the QI Task Force selected indicators based on evidence from rigorously conducted empirical studies and the opinions and experience of experts in the field.8,26 Systematic methods for indicator selection enhance decision making in areas where data availability is limited.27–30 In addition to published scientific evidence, assessments of importance and usability aided in developing the final indicator recommendations.
The QI Task Force prioritized quality domains that denote outcomes of individual infants rather than existing facility resources. Facility preparedness indicators appraise the physical environment and governance structure related to system resources and often reflect single time points (eg, WHO/United Nations International Children’s Emergency Fund Physical Space Infrastructure & Input: dedicated maternal and newborn wards and Kangaroo Mother Care and/or ICUs).31 Although these inputs are critical to high-quality care, they are less reflective of ongoing monitoring of care delivery; alternatively, facility preparedness can be captured by tracking deficiencies in care processes and outcomes. Therefore, with the exception of the educational domain, the domains were limited to processes and outcomes, and the indicators were limited to those measured for individual newborns. The educational indicators characterize best practices for training programs.
The final list of indicators includes 17 key outcomes (eg, stillbirth, mortality, morbidities). The directory also includes 21 evidence-based processes of care that depend on correct and consistent performance, including indicators that track preparation for birth (eg, birth attendance, equipment), routine newborn care actions (eg, immediate skin-to-skin contact, delayed cord clamping, vitamin K, cord and eye care), resuscitation actions, and infection prevention (eg, equipment disinfection, hand-washing). These processes of care depend on the knowledge and skills of providers. Therefore, the directory contains 11 educational indicators to evaluate the quality of the training environment, maintenance of learning, and knowledge and skills retention.
Similar to the 2014 WHO quality indicator list,16 our directory included neonatal mortality outcomes (including death audits), availability of equipment (functional bag-masks, infection prevention materials), and processes of ENC (immediate skin-to-skin contact, delayed cord clamping, and initiation of breastfeeding in the first hour). The WHO additionally incorporated facility-level indicators such as Kangaroo Mother Care, baby-friendly certification, essential lifesaving medicines, and oxygen supply that are absent from our list. To address the limitations of the WHO indicators highlighted in the follow-up feasibility assessment,8 the QI Task Force accompanied each indicator with a clear definition to delineate the population of interest and guidance to aid in selecting and applying appropriate indicators (Supplemental Fig 2). Indicators should be selected based on facility-specific goals and resources because gaps in care are highly individualized to patient populations, hospital facilities, and geographic locations. We developed this list with the understanding that all indicators will not be appropriate for use within every facility or health system.
Seven indicators were excluded: time to cord clamping, measure temperature, clear airway, relative volume of deliveries per birth attendant, functional heat source, documentation of maintenance of learning, and facility-based coach/champion. Indicators were rated rejected because of low-quality scientific evidence, a lack of demonstrated clinical importance, or poor usability. For example, although clearing the airway is supported by scientific evidence when secretions block the nares or oropharynx, excessive suctioning can harm the infant, and the indicator poses significant measurement feasibility challenges because data collection relies on subjective assessment of whether suctioning was necessary or appropriate. We rejected other indicators because the data supporting them could be expressed in more meaningful terms. For example, the continuous variable time to cord clamping was excluded, but this critical aspect of care is captured by the proportion-based variable delayed cord clamping, which reflects compliance with recommended timing. We excluded measure temperature because temperature measurement is included by necessity in the outcome variable hypothermia.
Facilities most accurately define the quality of care in their centers by measuring processes and outcomes among inborn infants. Adaptations or additions are required for use in other cohorts; for example, investigating processes and outcomes among outborn infants would require amending definitions to use all outborn infants as the denominator. Other cohorts of particular interest might include those defined by birth weight (eg, low birth weight, <2500 g) or gestational age (eg, premature, <37 weeks’ gestation). Indicators can also describe the quality of care in population-based cohorts if processes and outcomes are known for newborns within a geographic area and across stratified levels of care within a defined geography (eg, district, general referral, and tertiary hospitals).
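A minimal sketch of this denominator adaptation follows, assuming per-newborn records with illustrative fields ("inborn", "birth_weight_g", "delayed_cord_clamping") that are not defined by the directory.

```python
def indicator_proportion(records, in_cohort, meets_indicator):
    # Proportion of cohort newborns meeting the indicator; the cohort
    # predicate defines the denominator (eg, inborn vs outborn infants).
    cohort = [r for r in records if in_cohort(r)]
    if not cohort:
        return None  # no eligible denominator this reporting period
    return sum(meets_indicator(r) for r in cohort) / len(cohort)

records = [  # illustrative per-newborn register entries
    {"inborn": True, "birth_weight_g": 2300, "delayed_cord_clamping": True},
    {"inborn": True, "birth_weight_g": 3100, "delayed_cord_clamping": False},
    {"inborn": False, "birth_weight_g": 2400, "delayed_cord_clamping": True},
]

# Low-birth-weight inborn infants as the denominator:
lbw_inborn = lambda r: r["inborn"] and r["birth_weight_g"] < 2500
print(indicator_proportion(records, lbw_inborn,
                           lambda r: r["delayed_cord_clamping"]))  # 1.0
```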
Our mixed methodology, which included consensus-based selection of indicators, has some limitations. Although the modified Delphi technique has been used in health care to develop quality indicators, there is no validated requirement for its components,19,26,27,32 and considerable variation exists in the definition of group consensus, expert selection, number of rounds, and reporting of methods and results. The response rate to the study survey was relatively low among international partners: the QI Task Force disseminated the survey to 5 international organizations and received only 16 responses despite e-mail reminders, which may introduce participation bias. Coordinating a virtual meeting was not feasible given the diversity of locations and availability; to mitigate this potential shortcoming, respondents could add open text to clarify their responses. Finally, adding the subcategorization recommended for selective use after Delphi session 1 potentially skewed the secondary review majority result.
This directory will require adaptation with new data and recommendations from leading organizations. The AAP HBSPG plans to review the directory regularly to evaluate new evidence and ensure the incorporation of data that would necessitate a change. In particular, we recognize the ongoing need to evaluate research and implementation experience with the educational indicators. As indicators are used and their utility reported, adaptations may be advisable based on these experiences. The directory will be made available in an open-access electronic and printable PDF version to ensure equitable access to all facilities interested in using this QI tool. We anticipate a future online format of the directory with interactive data collection and analytic tools to aid facilities with data management and indicator tracking. Facilities can additionally consider field testing selected indicators to evaluate usability when deemed appropriate by hospital sites and allow for quality data comparison among facilities, regions, and countries.
Conclusions
As health care governing bodies aim to reduce global disparities in neonatal mortality, guidance for data tracking is needed to evaluate facility-level clinical practice and ascertain where gaps in quality occur. The QI Task Force developed this directory of quality indicators of ENC to enable appraisal of care provision, track progress toward improvement goals, and provide a standard for benchmarking care delivery and implementation of education among LMIC sites.
The QI Task Force encourages facilities to review the commentary in each category for guidance on indicator selection based on the importance to families or health authorities, expected improvement, and the overall impact the estimated improvement will have in their patient populations. As the 2030 deadline for the SDGs nears, it is important to empower facility-based newborn providers and leaders to use and learn from local data in the context of standardized indicators and benchmarks. This directory is a tool to transform data into improvement stories, fueling enthusiasm and advocacy toward our global goals measured locally.
Drs Ehret, Bose, and Ashish conceptualized the approach to indicator development, supervised survey creation, reviewed and revised the indicator panel, and contributed significantly to manuscript revision; Dr Diego performed the initial draft of quality indicators, conducted an evidence-based literature review, generated the REDCap survey database, compiled data for modified Delphi technique review, and drafted the initial manuscript; all authors attended and guided the modified Delphi session discussions and contributed to the final indicator ratings; and all authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.
FUNDING: No external funding.
CONFLICT OF INTEREST DISCLOSURES: Drs Ehret, Ashish, and Bose are members of the American Academy of Pediatrics Helping Babies Survive Planning Group. Dr Bose served as a member of the editorial committee for Improving Care for Mothers and Babies and for Essential Care for Every Baby. The other authors have indicated they have no potential conflicts of interest to disclose.