We assessed the effect of feeding preterm or low birth weight infants with infant formula compared with mother’s own milk on mortality, morbidity, growth, neurodevelopment, and disability.
We searched Medline (Ovid), Embase (Ovid), and the Cochrane Central Register of Controlled Trials to October 1, 2021.
Forty-two studies enrolling 89 638 infants fulfilled the inclusion criteria. We did not find evidence of an effect on mortality (odds ratio [OR] 1.26, 95% confidence interval [CI] 0.91–1.76), infection (OR 1.52, 95% CI 0.98–2.37), cognitive neurodevelopment (standardized mean difference −1.30, 95% CI −3.53 to 0.93), or on growth parameters. Formula milk feeding increased the risk of necrotizing enterocolitis (OR 2.99, 95% CI 1.75–5.11). The Grading of Recommendations Assessment, Development, and Evaluation certainty of evidence was low for mortality and necrotizing enterocolitis, and very low for neurodevelopment and growth outcomes.
In preterm and low birth weight infants, low to very low-certainty evidence indicates that feeding with infant formula compared with mother’s own milk has little effect on all-cause mortality, infection, growth, or neurodevelopment, but is associated with a higher risk of necrotizing enterocolitis.
Preterm (<37 weeks’ gestation) and low birth weight (LBW) (<2.5 kg) infants have limited nutrient reserves at birth and are subject to many physiologic and metabolic stresses that increase their nutrient needs.1 Formula milks (eg, artificial infant formulas) can be manipulated to contain higher amounts of nutrients (such as protein) than mother’s own milk.2,3 However, formula milks do not contain the immunomodulators and nutrients present in human milk that stimulate the immune system, protect the immature gut, and promote neurodevelopment.2,4 There are many new infant formulas; however, the last systematic reviews were conducted in 2011 and 2019 and there have been no recent reviews of the effectiveness of infant formula and other formula milks compared with mother’s own milk on outcomes in preterm and LBW infants.5,6
Our primary objective was to evaluate the effect of formula milks compared with mother’s own milk on primary outcomes (mortality, morbidity, growth, neurodevelopment, and disability) in preterm and LBW infants. Our secondary objectives were to determine the effect of gestational age (<32 weeks), birth weight (<1.5 kg), and exclusivity of mother’s own milk (ie, if the mother’s own milk was the sole diet [ie, 100% mother’s own milk]), on health outcomes of preterm or LBW infants.
Methods
This review was registered in PROSPERO (#CRD42021283008). Preferred Reporting Items for Systematic Reviews and Meta-Analyses-Protocol guidance was followed.7
Type and Setting of Included Studies
We included randomized controlled trials (RCTs), cohort, cross-sectional, and case-control studies. Case reports and studies published in abstract form only were excluded. All settings were included, such as home and health facility, within any country.
Participants
Only preterm or LBW infants were included. Normal weight or term infants were excluded.
Intervention and Control Groups
The intervention was any formula milk which included artificial infant formula (including cow’s milk protein, soy protein, other protein, or hydrolyzed formula) or other animal milk. Infants could receive mother’s own milk or donor human milk, as long as most (>50%) of the milk was formula. Infants had to receive the intervention in the neonatal period (0–27 days).
The control was mother’s own milk. Infants could also receive other milks (ie, donor human milk or formula milk) as long as most (>50%) of the milk was mother’s own.
In both the intervention and control groups, the infants could receive any water-based fluids.
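The >50% rule above can be written out as a small classifier. This is only an illustrative sketch: the function name `assign_group` and the representation of intake as fractions of total milk volume are assumptions, not part of the review protocol.

```python
from typing import Optional


def assign_group(formula_share: float, mothers_milk_share: float) -> Optional[str]:
    """Assign an infant to a study arm by majority milk type.

    Both arguments are hypothetical inputs: fractions (0 to 1) of total milk
    intake. Any remainder is assumed to be donor human milk or other milk.
    """
    if not (0.0 <= formula_share <= 1.0 and 0.0 <= mothers_milk_share <= 1.0):
        raise ValueError("shares must be between 0 and 1")
    if formula_share + mothers_milk_share > 1.0 + 1e-9:
        raise ValueError("shares cannot sum to more than 1")
    if formula_share > 0.5:
        return "intervention"  # most (>50%) of the milk was formula
    if mothers_milk_share > 0.5:
        return "control"  # most (>50%) of the milk was mother's own
    return None  # neither milk is the majority, so neither definition is met


print(assign_group(0.7, 0.3))
print(assign_group(0.2, 0.6))
print(assign_group(0.4, 0.4))
```

An infant with no majority milk type (for example, 40% formula, 40% mother’s own milk, and 20% donor milk) meets neither definition under this reading.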
Outcomes
The primary outcome was infant all-cause mortality.
The secondary outcomes were:
necrotizing enterocolitis as defined by the study authors;
severe infections (eg, sepsis, pneumonia, meningitis) as defined by the study authors;
neurodevelopment defined as: neurodevelopmental scores in the main domains (cognitive, motor, language) measured using validated, standardized assessment tools such as the Bayley Scales of Infant and Toddler Development, Third Edition, or the Wechsler Intelligence Scale for Children;
disability defined as: nonambulant cerebral palsy; developmental quotient >2 SDs below the population mean; blindness (visual acuity <6/60) or deafness (any hearing impairment requiring or unimproved by amplification); and
growth (weight, length, head circumference, mid-upper arm circumference, skinfold thickness) absolute change measured as grams or centimeters, standardized change measured as z-score, or percentile compared with a population reference.
The timing of the outcome assessment was at hospital discharge and at latest follow-up time recorded.
Search Methods
This is an update of a 2011 systematic review.5 Electronic databases were searched from January 1, 2011, to October 1, 2021. Databases included Medline (Ovid), Embase (Ovid), and the Cochrane Central Register of Controlled Trials. In addition, we completed manual reference checks of existing reviews and of papers that were included in the review. All studies from the 2011 systematic review were included.5 Appendix 1 provides the search strategy used and Appendix 2 shows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses flowchart.
Selection of Studies and Data Extraction
Selection of studies and data extraction was conducted by 2 authors and followed standard methods.8 Data extracted included: country, study design, study setting (facility [infant born and followed-up in the facility until discharge] or “whole population” [ie, infant born and followed-up at home or the facility]), gestational age, birth weight, and milks given in intervention and control groups.
Assessment of Risk of Bias
Two review authors judged the risk of bias using standard methods, including the ROBINS-I tool (risk of bias in nonrandomized studies of interventions).9 Where possible, funnel plots and Egger’s test were used to assess publication bias.
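As an illustration of the small-study check mentioned above, Egger’s regression can be sketched as an ordinary least-squares fit of the standardized effect (effect divided by its SE) on precision (1 divided by SE); an intercept far from zero suggests funnel plot asymmetry. The function name and inputs below are hypothetical, and the sketch returns only the intercept and its SE. A P value would additionally require a t distribution with k − 2 degrees of freedom, which is not computed here.

```python
import math
from typing import Sequence, Tuple


def egger_intercept(effects: Sequence[float], std_errors: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, SE of intercept) from Egger's regression.

    effects: study effect sizes on an additive scale (eg, log odds ratios).
    std_errors: their standard errors (all positive, not all identical).
    """
    k = len(effects)
    if k < 3 or k != len(std_errors):
        raise ValueError("need at least 3 studies with matching inputs")
    x = [1.0 / s for s in std_errors]                   # precision
    y = [e / s for e, s in zip(effects, std_errors)]    # standardized effect
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with k - 2 degrees of freedom
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = rss / (k - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / k + mean_x ** 2 / sxx))
    return intercept, se_intercept
```

When every study estimates the same effect regardless of its size, the standardized effect is exactly proportional to precision and the intercept is zero.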
Measurement of Treatment Effect
For dichotomous data, we summarized results using risk ratios and, where this was not possible, odds ratios (ORs) with 95% confidence intervals (CIs). For continuous data, we summarized results using the mean difference (MD) with 95% CIs or standardized mean difference (SMD) when different methods or scales were used between studies.10
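For readers unfamiliar with the dichotomous effect measure, the following minimal sketch computes an odds ratio with a Wald-type 95% CI from a 2 × 2 table. The function name, the 0.5 continuity correction for zero cells, and the example counts are all assumptions made for illustration; they are not taken from the included studies or from the review’s analysis code.

```python
import math
from statistics import NormalDist
from typing import Tuple


def odds_ratio_ci(events_a: int, total_a: int, events_b: int, total_b: int,
                  level: float = 0.95) -> Tuple[float, float, float]:
    """Odds ratio of group A versus group B with a Wald CI on the log scale."""
    a, b = float(events_a), float(total_a - events_a)
    c, d = float(events_b), float(total_b - events_b)
    if min(a, b, c, d) == 0:
        # Assumed handling of zero cells: add 0.5 to every cell
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Invented counts: 30 of 200 events with formula, 12 of 200 with mother's own milk
estimate, lower, upper = odds_ratio_ci(30, 200, 12, 200)
print(f"OR {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the interval is built on the log scale, it is symmetric around the log odds ratio rather than around the odds ratio itself.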
We considered all studies to be highly heterogeneous, so we used random-effects models to calculate pooled estimates for all outcomes. Where available, we used study level adjusted effect sizes to calculate pooled estimates and, when not available, we used raw data. We imputed missing data on the basis of Cochrane methods.8 Restricted maximum likelihood estimates and Knapp–Hartung SEs were used.11 We also assessed forest plots visually for heterogeneity, and considered I-squared values >50% to represent substantial heterogeneity. All analyses were done using Stata 16.1.
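The pooling approach described above can be illustrated with a short sketch. Note one deliberate substitution: the review used restricted maximum likelihood in Stata, whereas this sketch uses the simpler closed-form DerSimonian–Laird estimator of the between-study variance, so its numbers would not exactly reproduce the published estimates. The function name and return keys are hypothetical.

```python
import math
from typing import Dict, Sequence


def pool_random_effects(effects: Sequence[float], variances: Sequence[float]) -> Dict[str, float]:
    """Random-effects pooling with a Knapp-Hartung adjusted standard error.

    effects: study estimates on an additive scale (eg, log OR or MD).
    variances: their within-study variances (all positive).
    tau2 uses DerSimonian-Laird here, not REML as in the review.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least 2 studies with matching inputs")
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q and the DerSimonian-Laird between-study variance
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # I-squared: share of total variability attributed to heterogeneity (%)
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    # Knapp-Hartung: variance from the weighted residual mean square
    hk_var = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w_re, effects)) / ((k - 1) * sum_w_re)
    return {"pooled": pooled, "tau2": tau2, "i2": i2, "se_hk": math.sqrt(hk_var)}
```

With the Knapp–Hartung adjustment, the 95% CI is the pooled estimate plus or minus the 0.975 quantile of a t distribution with k − 1 degrees of freedom multiplied by `se_hk`. That quantile is not available in the Python standard library, so it is left to the reader’s statistics package.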
Subgroup and Sensitivity Analysis
Our a priori subgroup analyses were:
gestational age and weight at birth (studies enrolling only infants <32 weeks’ gestation or <1.5 kg at birth compared with studies that did not restrict enrollment on the basis of gestational age or birth weight); and
exclusivity of mother’s own milk in the comparison arms (ie, studies providing mother’s own milk as the sole diet [ie, 100% mother’s own milk]) compared with studies providing other milks (eg, artificial infant formula, other animal milk, donor human milk) or foods (eg, porridge) mixed with mother’s own milk.
We had also planned to stratify analyses by high-, middle-, and low-income settings; however, there were no low-income studies and only 3 middle-income studies. We completed a sensitivity analysis to determine the robustness of the growth outcome by excluding 1 study that contributed a large sample size.12 We also planned a sensitivity analysis excluding studies at serious or critical risk of bias; however, only 1 study did not have serious or critical risk of bias.
Summary of Findings and GRADE Table
We prepared a summary of findings table for each outcome using Grading of Recommendations Assessment, Development and Evaluation (GRADE) and GRADEPro guideline development tool software to assess the quality of the body of evidence, consistency of effect, imprecision, indirectness, and publication bias for each outcome.13–15
Results
Study Characteristics
The search resulted in 5170 records. After screening titles and abstracts, 89 records were retrieved. Sixty-three reports were excluded with reasons (Appendix 2). We identified 26 new articles reporting on new studies. Combined with the 19 studies from the previous review, 42 studies were included in the narrative review and 36 studies provided data for meta-analysis.12,14–48 Of the 42 studies there were no RCTs, but there were 34 cohort,12,16,18,20,27–29,31–60 5 cross-sectional,19,22,23,26,30 and 3 case-control studies17,21,25 (Appendix 3). Studies were from 21 countries: Australia, Belgium, Chile, China, Germany, Ghana, Greece, Hong Kong, India, Israel, Italy, Japan, Nepal, Netherlands, New Zealand, Poland, Romania, Spain, Sweden, the United Kingdom, and the United States. Thirty-six studies were implemented in NICUs and special care nurseries,12,16,18,19,21,23,25–31,33–35,37–39,41–60 and 6 were whole population studies.17,20,22,32,36,40
In total, there were 89 638 preterm and/or LBW infants included in the review, of whom 74 656 were in the formula group and 14 982 were in the mother’s own milk group. Of the infants included in the review, 77 892 were infants born at <32 weeks’ gestational age and 76 796 were infants weighing <1.5 kg at birth. Twenty studies enrolled infants <32 weeks’ gestation and/or <1.5 kg at birth,12,16,18,22,23,26,30–33,41–46,49–53,57,59 and 22 studies did not restrict enrollment on the basis of gestational age or birth weight.17,19–21,25,27–29,34–40,47,48,54–56,58,60 In the intervention arm, 24 studies provided formula milk only (ie, formula was the sole diet)12,16–18,21–23,26–29,32–35,37,38,41–44,49,51,52,55,59,60 (Appendices 3 and 4). In the control arm, 9 studies provided mother’s own milk only (ie, mother’s own milk was the sole diet).20,26,31–33,36,49,51,52 The proportion of formula and mother’s own milk provided in each study is given in Appendix 4. All formula milks, where information was provided, were artificial cow’s milk protein-based. No studies used formulas with another protein base (eg, soy or goat milk protein) or hydrolyzed formula. Seventeen studies provided donor milk in either the intervention or control groups.12,17–19,21,28,34,35,37,39,44–48,54,55 A total of 36 studies of 88 741 infants provided data for meta-analysis.
One study contributed 81% (n = 72 997) of the overall sample in the review.12 This study only reported on growth (ie, did not report on mortality, morbidity, or neurodevelopment). After excluding the infants in this study, there were 10 636 infants in the formula group and 6005 infants in the mother’s own milk group. A total of 5258 were infants born at <32 weeks’ gestational age and 4128 were infants weighing <1.5 kg at birth.
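The sample counts reported above can be cross-checked with simple arithmetic. The short sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Counts as reported in the text
total_infants = 89_638
formula_total, mothers_milk_total = 74_656, 14_982
large_study_n = 72_997
formula_remaining, mothers_milk_remaining = 10_636, 6_005

# Share of the overall sample contributed by the single large study
large_study_share = large_study_n / total_infants
remaining_infants = total_infants - large_study_n

print(f"Large study share: {large_study_share:.1%}")
print(f"Infants remaining after exclusion: {remaining_infants}")
print(f"Sum of remaining arms: {formula_remaining + mothers_milk_remaining}")
```

The two arms sum to the overall total, and the infants remaining after excluding the large study match the sum of the two reported remaining arm sizes.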
Risk of Bias
A risk of bias assessment was completed for the 36 studies included in the meta-analysis (Appendix 5). No studies had low risk of bias. Nineteen had critical,16,21,26,27,29,30,33,34,36,39,41–43,47–54,56 16 had serious,12,17–20,22,23,25,28,35,37,38,44–46,55 and 1 had moderate31 risk of bias. Most biases were because of confounding (Appendix 5). Three studies in the sepsis outcome had small study effects (ie, events ranging from 0 to 2) in the intervention and/or control groups (Egger test P = .0459; funnel plot 5.4) (Appendix 5).31,47,51 No other outcomes had obvious publication bias or small study effects in any analyses (funnel plots shown in Appendix 5).
Primary Outcomes
At discharge, there was little effect of the intervention (formula milk) on the primary outcome (all-cause mortality) (OR 1.26; 95% CI 0.91–1.76; I² = 0%; low certainty evidence; 5 studies, 9625 participants) (Appendix 6) or the severe infection outcome (OR 1.52; 95% CI 0.98–2.37; very low certainty evidence; 15 studies; 2572 participants) (Appendix 6). However, at discharge, the intervention was associated with a threefold increase in the odds of necrotizing enterocolitis (OR 2.99; 95% CI 1.75–5.11; low certainty evidence; 15 studies; 3013 participants) (Appendix 6).
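One quick plausibility check on pooled ORs like those above is that a CI constructed on the log scale is symmetric around the log OR, so the point estimate should sit close to the geometric mean of the two limits (up to rounding). The sketch below applies that check to the three reported estimates; the helper name `geometric_midpoint` is hypothetical.

```python
import math

# (outcome, reported OR, lower 95% limit, upper 95% limit) as printed in the text
reported = [
    ("all-cause mortality", 1.26, 0.91, 1.76),
    ("severe infection", 1.52, 0.98, 2.37),
    ("necrotizing enterocolitis", 2.99, 1.75, 5.11),
]


def geometric_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the CI limits, ie, the midpoint on the log scale."""
    return math.sqrt(lower * upper)


for outcome, odds_ratio, lower, upper in reported:
    midpoint = geometric_midpoint(lower, upper)
    print(f"{outcome}: reported OR {odds_ratio:.2f}, log-scale midpoint {midpoint:.3f}")
```

Small discrepancies are expected because the published limits are rounded to 2 decimal places.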
At latest follow-up (between 91 and 416 weeks), there was little or no effect on cognitive neurodevelopment (SMD −1.30, 95% CI −3.53 to 0.93; very low certainty evidence; 8 studies, 1560 participants) (Appendix 6). Similarly, there was little to no effect on language neurodevelopment among groups (SMD 0.02, 95% CI −0.39 to 0.43; very low certainty evidence; 3 studies, 587 participants) (Appendix 6).
There was little or no effect of the intervention on change from birth to discharge weight z-score (MD 0.03; 95% CI −0.15 to 0.21; very low certainty evidence; 6 studies, 74 130 participants) (Appendix 6). At latest follow-up (range 39–416 weeks), there was little to no effect on weight z-score (MD 0.14; 95% CI −0.76 to 1.05; very low certainty evidence; 3 studies; 271 participants) (Appendix 6). There was also little to no evidence of increase in length (MD 0.33; 95% CI −0.40 to 1.05; very low certainty evidence; 9 studies; 1048 participants) (Appendix 6) or head circumference (MD 0.26; 95% CI −0.35 to 0.87; very low certainty evidence; 9 studies; 1550 participants) (Appendix 6). At latest follow-up (range 39–416 weeks), there was little to no evidence of a difference in length z-scores among groups (MD 0.06; 95% CI −0.81 to 0.92; very low certainty evidence; 3 studies; 271 participants) (Appendix 6). No studies reported other growth outcomes. Results of the summary of findings are presented in Table 1.
Subgroup and Sensitivity Analysis
There were no differences in the effect of the intervention in studies that enrolled infants <32 weeks’ gestation and/or <1.5 kg at birth compared with studies which did not restrict enrollment on the basis of gestational age or birth weight (Appendix 7).
There were no differences in the effect of the intervention on primary outcomes in infants who received mother’s own milk as a sole diet in the control group compared with infants who received a mixture of mother’s own milk and other milks (Appendix 8).
A sensitivity analysis was completed for change in weight z-score from birth to discharge by excluding the study with the large sample size (n = 72 997).12 After removing this study, there was little change to results (MD 0.01; 95% CI −0.28 to 0.30; 5 studies; 1133 participants) (Appendix 9).
Discussion
In our systematic review of 42 observational studies enrolling 89 638 preterm and LBW infants, we found that formula milks had little or no effect compared with mother’s own milk on mortality, severe infection, neurodevelopment, weight, length, or head circumference at discharge or latest follow-up, but found a threefold increase in necrotizing enterocolitis. We found no differential effect in the 17 studies enrolling only infants <32 weeks’ gestation or <1.5 kg at birth compared with the 19 studies that did not restrict enrollment on the basis of gestational age or birth weight. We also found no differential effect in the 8 studies providing mother’s own milk as the sole diet compared with the 28 studies providing other milks (eg, artificial infant formula, other animal milk, donor human milk) or foods (eg, porridge) mixed with mother’s own milk.
The previous systematic review of 19 observational studies enrolling 13 027 infants reported low certainty evidence that formula milk was associated with an increase in mortality and the combined outcome of severe infection and necrotizing enterocolitis compared with mother’s own milk in preterm and LBW infants.5 Formula milk was also associated with increased length and decreased neurodevelopmental outcomes, but no change in weight outcomes. Our search found 19 new observational studies and increased the number of participants contributing data to 88 741. The addition of new studies to the mortality (Svenningsen 1982,47 Ruys 201741), length (Costa-Orvay 2011,49 Madore 2017,31 Mol 2019,52 Pieltan 2001,54 Ruys 201741), and neurodevelopment (O’Connor 200339) outcomes did not substantially change the effects reported in 2011, though strength of effect for all outcomes was reduced and there was no improvement in the certainty or quality of the evidence.
Three Cochrane reviews assessed the effects of human milk in preterm and LBW infants.6,61,62 In 2019, a Cochrane review of RCTs of the effects of formula compared with mother’s own milk located no trials.6 Also in 2019, a Cochrane review of RCTs reported moderate certainty evidence that formula milk increased weight, length, and head growth, and had a higher risk of necrotizing enterocolitis compared with donor human milk.62 The trial data did not show an effect on all-cause mortality, or on long-term growth or neurodevelopment. A Cochrane review of RCTs in 2020 also reported moderate certainty evidence that provision of multicomponent “fortifier” (powdered or liquid supplement with protein, carbohydrate, vitamins, and minerals added to human milk) increased short-term in-hospital weight, length, and head circumference compared with “unfortified” human milk, but evidence was insufficient to assess long-term effect on growth, neurodevelopment, mortality, or morbidity outcomes.61
In addition to the above reviews, 3 non-Cochrane reviews were identified.63–65 Each of these reviews investigated only 1 outcome, which included a range of in-hospital growth outcomes, bronchopulmonary dysplasia, and necrotizing enterocolitis. One review found that, for preterm and LBW infants, there was inconclusive evidence on the effect of formula compared with exclusive human milk on growth parameters including change in weight z-scores and head circumference.65 For preterm infants receiving exclusive human milk, there was an improvement in bronchopulmonary dysplasia compared with exclusive formula-fed infants; however, the review combined both RCTs and observational studies in their analysis.64 Similar to our observational study review, preterm infants who received human milk had a lower risk of developing necrotizing enterocolitis than those who received formula.63 However, unlike our review, these reviews did not distinguish whether infants received mainly mother’s own milk, only whether they received exclusive (ie, mother’s own milk and/or donor milk) or any human milk.
Our review had some limitations. All evidence was low to very low certainty because of problems with confounding bias, unexplained heterogeneity, small sample sizes, and imprecision in many studies. Thirty-six studies recruited infants from health facilities only; there were only 6 studies that recruited infants from the whole population. Twenty-two studies recruited infants >32 weeks’ gestation. We had also planned to stratify analyses by high-, middle-, and low-income settings; however, there were no low-income studies and only 3 middle-income studies. We also planned a sensitivity analysis excluding studies at serious or critical risk of bias; however, only 1 study did not have serious or critical risk of bias. One study contributed a large sample size (n = 72 997) to the growth outcome and had the potential to bias outcomes. However, when we removed this study from the analysis, there was little change to the results. There is also much potential for misclassification between intervention and control groups (ie, switching between formula and mother’s own milk groups). This is because of the common practice of providing formula if mother’s own milk is not available and active promotion of breastfeeding in formula-fed infants. However, in our study, we defined the intervention and control groups as those receiving >50% (ie, “most or majority”) formula (intervention) or mother’s own milk (control) over the entire study period. We also assessed sole or exclusive diet as a subgroup. Some studies did not describe the amount of formula or mother’s own milk provided but we were able to estimate most or majority >50% for all studies. Other strengths of our study were the comprehensive search strategy and the inclusion of all study designs.
Overall, in preterm and LBW infants, our review shows low- to very low-certainty evidence that feeding with formula milks compared with mother’s own milk, either as a sole diet or mixed with other milks, has little effect on all-cause mortality, severe infection, growth, or neurodevelopment, but is associated with a higher risk of developing necrotizing enterocolitis. However, the quality of the observational evidence base must be improved. It is especially important to control studies for confounding and to conduct more studies in low-income countries and in infants born and cared for outside of health facilities.
Dr Strobel conceptualized and designed the study, designed the protocol and data collection instruments, collected data, conducted the initial analysis, and drafted the initial manuscript; Dr Adams reviewed the protocol and data collection instruments, and collected data; Dr McAullay reviewed the protocol; Dr Edmond conceptualized and designed the study, designed the protocol and the data collection instruments, collected data, and drafted the initial manuscript; and all authors reviewed and revised the manuscript, approved the final manuscript as submitted, and agree to be accountable for all aspects of the work.
This study is registered at PROSPERO, # CRD42021283008, https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42021283008. Data are available on request.
FUNDING: Edith Cowan University received funding from the World Health Organization (WHO) to complete this work. WHO commissioned the review for the guideline development group meeting for development of WHO recommendations on care of the preterm or low birth weight infant. WHO assisted in formulating the research questions and provided input on the synthesis of the results and manuscript.
CONFLICT OF INTEREST DISCLAIMER: Dr Edmond is an employee of the sponsor, the World Health Organization. The other authors have indicated they have no conflicts of interest relevant to this article to disclose.