Video Abstract

BACKGROUND

Several prediction models have been reported to identify patients with radiographic pneumonia, but none have been validated or broadly implemented into practice. We evaluated 5 prediction models for radiographic pneumonia in children.

METHODS

We evaluated 5 previously published prediction models for radiographic pneumonia (Neuman, Oostenbrink, Lynch, Mahabee-Gittens, and Lipsett) using data from a single-center prospective study of patients 3 months to 18 years of age with signs of lower respiratory tract infection. Our outcome was radiographic pneumonia. We compared each model’s area under the receiver operating characteristic curve (AUROC) and evaluated their diagnostic accuracy at statistically derived cutpoints.

RESULTS

Radiographic pneumonia was identified in 253 (22.2%) of 1142 patients. When using model coefficients derived from the study dataset, AUROC ranged from 0.58 (95% confidence interval, 0.52–0.64) to 0.79 (95% confidence interval, 0.75–0.82). When using coefficients from the original published models, 2 models demonstrated an AUROC >0.70 (Neuman and Lipsett); this increased to 3 after deriving regression coefficients from the study cohort (Neuman, Lipsett, and Oostenbrink). Two models required historical and clinical data (Neuman and Lipsett), and the third additionally required C-reactive protein (Oostenbrink). At a statistically derived cutpoint of predicted risk from each model, sensitivity ranged from 51.2% to 70.4%, specificity from 49.9% to 87.5%, positive predictive value from 16.1% to 54.4%, and negative predictive value from 83.9% to 90.7%.

CONCLUSIONS

Prediction models for radiographic pneumonia had varying performance. The 3 models with higher performance may facilitate clinical management by predicting the risk of radiographic pneumonia among children with lower respiratory tract infection.

What’s Known on This Subject:

Several prediction models (using clinical and laboratory data) have been described to identify patients with radiographic pneumonia among children with suspected lower respiratory tract infections. None have been externally validated, and thus are infrequently applied in practice.

What This Study Adds:

We externally validated 5 prediction models for radiographic pneumonia in children using data from a prospective cohort study. Of these, 3 models (2 using clinical and physical examination characteristics, and another which includes C-reactive protein) demonstrated satisfactory performance.

Pneumonia is among the most common conditions encountered among children presenting to United States emergency departments (EDs).1  Although routine use of chest radiographs (CXR) for outpatient pneumonia is not recommended by the Pediatric Infectious Diseases Society and Infectious Diseases Society of America pediatric pneumonia guideline,2  they are performed for approximately 80% of patients with pneumonia in pediatric hospitals.3  Extensive use of CXR leads to increased radiation exposure, unclear or conflicting findings,4,5  patient or caregiver inconvenience,6  and increased health care costs.7  Furthermore, use of antibiotics for suspected pneumonia remains high in children, despite recommendations from professional societies.8,9  However, because the negative predictive value of CXR is high,10  antibiotics can be avoided in the case of normal imaging, which occurs substantially more often than an abnormal radiograph.

Accurate prediction models for children presenting to the ED may reduce unnecessary CXR use and promote antimicrobial stewardship. Previously published models incorporated historical, physical examination, and laboratory components to generate a predicted probability of disease. For example, if a model predicts that a patient has a high probability of pneumonia, CXR may be avoided unless a provider suspects an alternative pathology or disease complications, and empirical antibiotics may be used based on suspicion of bacterial disease. Conversely, if radiographic pneumonia is unlikely, then the provider may also choose to avoid obtaining a radiograph and consider alternative diagnoses.

Several clinical prediction models have been developed in children to assist in the prediction of radiographic pneumonia, either in isolation11–16  or together with other bacterial infections,17–20  using a combination of historical, physical examination, and laboratory characteristics. Models typically perform worse in new populations than in the development cohort.21  Therefore, prediction models require external validation using distinct, high-quality data sources external to the original derivation cohort before implementation to establish their accuracy. We sought to validate previously published models for pediatric radiographic pneumonia using a prospective cohort of children presenting to the ED with suspected pneumonia.

We performed a secondary analysis of a prospective cohort study, Catalyzing Ambulatory Research in Pneumonia Etiology and Diagnostic Innovations in Emergency Medicine (CARPE DIEM), conducted in the Cincinnati Children’s Hospital Medical Center (CCHMC) ED between July 2013 and December 2017. The CCHMC ED is part of a tertiary care specialty pediatric hospital that evaluated an average of 61 990 pediatric encounters per year between 2013 and 2017, of which 823 per year (1.3%) had a diagnosis of pneumonia. The CCHMC and Ann and Robert H Lurie Children’s Hospital Institutional Review Boards approved this study. A radiographic pneumonia prediction model was previously derived and published using data from CARPE DIEM.15 

We included patients 3 months to 18 years of age with signs and symptoms of lower respiratory tract infection (defined, based on previous work, as new or different cough or sputum production, chest pain, dyspnea, tachypnea, or abnormal auscultatory findings)22  who had a CXR performed for clinical suspicion of community-acquired pneumonia (CAP). We excluded patients with a recent (≤14 days) hospitalization, history of aspiration, or medically complex conditions (eg, immunodeficiency, chronic corticosteroid use, chronic lung disease, malignancy, sickle cell disease, congenital heart disease, tracheostomy use, and neuromuscular disorders impacting respiration). Potential patients were prospectively identified by research coordinators using a computerized ED tracking board, who then collaborated with the treating physician to confirm eligibility criteria before enrollment. Research coordinators obtained informed consent from caregivers and assent from children ≥11 years of age. Medical history was collected from patients and guardians by the research coordinators, and physical examination data were collected by the clinical care team.

CXRs were independently interpreted by 2 radiologists masked to clinical information. Radiologists classified CXRs into 1 of 4 categories: (1) normal lungs, (2) definite atelectasis, (3) atelectasis versus pneumonia, and (4) definite pneumonia. We defined our primary outcome, radiographic pneumonia, as an interpretation of atelectasis versus pneumonia or definite pneumonia. We included equivocal radiographs in our outcome measure based on previous literature, which suggests that most clinicians prescribe antibiotics in these cases.23 

We evaluated 5 previously published models for pneumonia: Mahabee-Gittens et al,13  Lynch et al,11  Neuman et al,14  Oostenbrink et al,12  and Lipsett et al.16  These models were chosen because they used an outcome of radiographic pneumonia, in contrast to other models that included pneumonia as part of a composite variable defining serious bacterial infections.17–20  All 5 models were derived from prospective studies and had similar inclusion criteria, although age ranges varied (Table 1). When validating a model with a narrower age range, the validation dataset was limited to the appropriate age range for that model.

TABLE 1

Decision Rules Applied With Included Variables

Decision Rule, Year | Age Included | Pneumonia Prevalence in Derivation Study, % (n/N) | Historical Variables | Physical Examination or Laboratory Variables
Lynch et al,11 2004 | 1–16 y | 35.7 (204/571) | None | Fever (≥38°C), decreased breath sounds, auscultatory crackles, tachypnea^a
Mahabee-Gittens et al,13 2005 | 2 mo–5 y | 8.6 (44/510) | Age >12 mo | Respiratory rate ≥50, oxygen saturation ≤96%, nasal flaring
Neuman et al,14 2011 | 0–18 y | 16.4 (422/2574) | Difficulty breathing, chest pain, duration of fever (none, ≤72 h, >72 h), duration of cough (none, ≤72 h, >72 h) | Wheezing, respiratory distress, tachypnea at triage^a, retractions, grunting, focal or decreased breath sounds, rales (diffuse or focal), focal rales, focal wheeze, fever at triage (≥38°C), oxygen saturation at triage (97%–100%, 93%–96%, ≤92%)
Oostenbrink et al,12 2013 | 1 mo–16 y | Three study populations: 15.4 (78/504), 13.8 (58/420), 7.3 (27/366) | None | Ill appearance, tachypnea^a, oxygen saturation <94%, CRP
Lipsett et al,16 2021 | 3 mo–18 y | 17.4 (206/1181) | Age, fever at home | Triage oxygen saturation, fever in the ED (≥38°C), rales, wheeze

CRP, C-reactive protein; ED, emergency department; WHO, World Health Organization.

^a For Neuman, tachypnea was defined as respiratory rate of >60 breaths per min for age of <2 y, >50 breaths per min for age of 2 to 4.9 y, >30 breaths per min for age of 5 to 9.9 y, and >24 breaths per min for age of 10 to 21.9 y. For Oostenbrink, WHO24  cutoffs were used, and for Lynch, Pediatric Risk of Admission25  criteria were used.

We matched variables within the validation dataset to those in the derivation studies. We categorized continuous variables using the same criteria as each derivation study. For the classification of tachypnea, we used the definition provided by each model: Neuman et al used age-specific thresholds, Oostenbrink used World Health Organization criteria,24  Lynch et al used Pediatric Risk of Admission criteria,25  and Mahabee-Gittens used a criterion of >50 respirations per minute. For the variable of “ill appearance” described by Neuman and Oostenbrink, we used a variable for general appearance based on examination by an ED clinician (classified as well, mildly ill or distressed, moderately ill or distressed, or severely ill or distressed) and designated any degree of ill or distressed appearance as “ill appearing.”12,14 
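As a concrete illustration of this variable mapping, the Neuman age-specific tachypnea thresholds (given in the footnote to Table 1) can be encoded as a simple classifier. This is a sketch; the function name is ours, not part of any published model:

```python
def neuman_tachypnea(age_years: float, resp_rate: float) -> bool:
    """Classify tachypnea per the Neuman et al age-specific thresholds
    (breaths per min): >60 for <2 y, >50 for 2-4.9 y, >30 for 5-9.9 y,
    and >24 for 10-21.9 y."""
    if age_years < 2:
        threshold = 60
    elif age_years < 5:
        threshold = 50
    elif age_years < 10:
        threshold = 30
    else:
        threshold = 24
    return resp_rate > threshold
```

For example, a respiratory rate of 65 in a 1-year-old is tachypneic, whereas 28 in a 7-year-old is not.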

We imputed missing data via multiple imputation by chained equations, creating 5 imputed datasets over which all subsequent results were averaged.26  We generated predicted probabilities for each of the previously published models under 2 methods: (1) using the values of the regression coefficients as originally published (“coefficients as published”), and (2) using the CARPE DIEM dataset to estimate new regression coefficients for the variables included in each model (“coefficients derived from data”). For 3 models,11,13,14  model intercepts were not published. We estimated intercepts for these models in the “coefficients as published” analysis by fixing the regression coefficients of the included variables at their published values and estimating the intercepts on the CARPE DIEM dataset.
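The "coefficients as published" approach amounts to evaluating each model's logistic regression equation directly. A minimal sketch, with entirely hypothetical variable names and coefficients for illustration (none of the published model coefficients are reproduced here):

```python
import math

def predicted_probability(intercept, coefs, x):
    """Predicted probability from a logistic model:
    1 / (1 + exp(-(b0 + sum(b_i * x_i)))).
    `coefs` and `x` are dicts keyed by variable name."""
    lp = intercept + sum(coefs[k] * x[k] for k in coefs)
    return 1.0 / (1.0 + math.exp(-lp))

# Hypothetical 2-variable model (coefficients are illustrative only)
p = predicted_probability(
    intercept=-2.0,
    coefs={"fever": 0.8, "focal_rales": 1.2},
    x={"fever": 1, "focal_rales": 1},
)
```

Under method 2, the same equation is used but the intercept and coefficients are re-estimated by fitting the model's variables to the CARPE DIEM data.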

We conducted receiver operating characteristic analyses of the predicted probabilities generated by each model and calculated area under the receiver operating characteristic curve (AUROC) with 95% confidence intervals (95% CI). We constructed calibration graphs of the predicted probabilities against the observed prevalence of pneumonia in both continuous and decile-categorized formats.27,28  We identified optimal cutoffs for the predicted probabilities from each model using the Euclidean distance method29  and compared diagnostic accuracy statistics (sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios).
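The Euclidean distance method selects the cutoff whose (sensitivity, specificity) pair lies closest to the ideal corner of the ROC plane, where both equal 1. A brute-force sketch over candidate thresholds (function name ours):

```python
import math

def optimal_cutoff(probs, labels):
    """Return the probability cutoff minimizing the Euclidean distance
    sqrt((1 - sensitivity)^2 + (1 - specificity)^2) to the ideal ROC
    corner (sensitivity = specificity = 1). Labels: 1 = pneumonia."""
    best_cut, best_dist = None, float("inf")
    for cut in sorted(set(probs)):
        tp = sum(p >= cut and y == 1 for p, y in zip(probs, labels))
        fn = sum(p < cut and y == 1 for p, y in zip(probs, labels))
        tn = sum(p < cut and y == 0 for p, y in zip(probs, labels))
        fp = sum(p >= cut and y == 0 for p, y in zip(probs, labels))
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        dist = math.hypot(1 - sens, 1 - spec)
        if dist < best_dist:
            best_cut, best_dist = cut, dist
    return best_cut
```

The diagnostic accuracy statistics reported below are then computed at this single cutoff for each model.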

Given the large proportion of CARPE DIEM patients with missing C-reactive protein (CRP) data, we repeated our validation analyses in only those patients with an observed (ie, not imputed) CRP to validate the Oostenbrink model. Additionally, given potential differences in CAP etiology by age, we evaluated the performance of all models for the subset of children <5 years of age. Analysis was performed with R, version 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria).

Of 1142 patients enrolled, the median patient age was 3.3 years (interquartile range, 1.4–7.1 years) and 54% were male. Radiographic pneumonia was found in 253 (22%) patients (203 with definite pneumonia and 50 with pneumonia versus atelectasis). Characteristics of the cohort, with rates of missing data before imputation, are provided in Table 2.

TABLE 2

Patient Characteristics of CARPE DIEM Dataset

Characteristic | Summary (N = 1142), Number (%) or Median [IQR]
Demographic  
 Age 3.3 [1.4–7.1] 
 Male sex 622 (54) 
Historical  
 Fever 996 (87) 
 Days of fever 2 [1–4] 
 Cough 1099 (96) 
 Difficulty breathing 930 (81) 
 Fully immunized 1062 (93) 
 Days of illness 4 [2–7] 
 Vomiting 585 (51) 
 Wheezing 737 (65) 
 Rapid breathing 848 (74) 
 Rhinorrhea 949 (83) 
 Chest pain 350 (31) 
 Abdominal pain 362 (32) 
 Decreased oral intake 714 (63) 
 Decreased urine output 117 (10) 
 Smoke exposure 482 (42) 
 Pneumonia history 251 (22) 
 Past pneumonia hospitalization 101 (40) 
 Asthma 365 (32) 
Physical examination  
 Temperature (degrees Celsius) 37.6 [37–38.3] 
 RR 36 [28–48] 
 HR 142 [123–160] 
 SBP 114 [105–123] 
 Oxygen saturation 96 [94–98] 
 Retractions 488 (43) 
 Grunting 78 (7) 
 Nasal flaring 127 (12) 
 Head nodding 34 (3) 
 Abdominal pain 104 (10) 
Crackles or rales  
 None 761 (69) 
 Focal 240 (22) 
 Diffuse 107 (10) 
Rhonchi  
 None 715 (64) 
 Focal 83 (7) 
 Diffuse 311 (28) 
Wheezing  
 None 776 (70) 
 Focal 38 (3) 
 Diffuse 296 (27) 
Decreased breath sounds  
 None 729 (66) 
 Focal 257 (23) 
 Diffuse 123 (11) 
 CRP (mg/L) 5.2 [1.1–6.7] 

Missing data were present for the following variables (n): immunization status (5), heart rate (1), systolic blood pressure (85), oxygen saturation (38), retractions (31), grunting (34), nasal flaring (38), head nodding (33), abdominal pain (82), crackles (34), rhonchi (33), wheezing (32), decreased breath sounds (33), and CRP (685).

The models by Neuman and Lipsett included the whole CARPE DIEM cohort. The remaining 3 were applied only to subsets of the cohort because of the different age ranges used in the original models. The percentage of patients with radiographic pneumonia was similar in the validation cohorts for the models of Neuman, Lipsett, Oostenbrink, and Lynch (22% to 24%) but was lower in the validation cohort of the model of Mahabee-Gittens (12%). Supplemental Table 4 describes differences in the variables included in each model between those with and without radiographic pneumonia for their respective age-based validation cohorts.

Using the coefficients as published, the highest AUROC was demonstrated by the models of Neuman and Lipsett (0.72, 95% CI 0.68–0.75 for both models, Fig 1). The AUROCs for the remaining 3 models were substantially lower than the Neuman and Lipsett models. When utilizing coefficients estimated from the CARPE DIEM dataset, the model of Neuman exhibited the highest AUROC (0.79, 95% CI 0.75–0.82) followed by Lipsett (0.76, 95% CI 0.73–0.80). The AUROC for the model of Oostenbrink increased substantially when coefficients were estimated from CARPE DIEM (from 0.53 [95% CI 0.49–0.57] to 0.72 [95% CI 0.68–0.77]), whereas the models of Lynch and Mahabee-Gittens exhibited modest increases.

FIGURE 1

Receiver operating characteristic curves for the studied models when using (A) coefficients as published and (B) coefficients derived from data. The legend includes the area under the receiver operating characteristic curve with 95% confidence intervals in parentheses.


In the coefficients-as-published analysis, the model of Neuman demonstrated a sensitivity of 70.0% and specificity of 65.4%, and the models of Lynch and Lipsett demonstrated high sensitivity (83.0% and 81.7%, respectively) with lower specificity (30.0% and 52.6%, respectively). The Oostenbrink model exhibited lower sensitivity (63.4%) and specificity (49.8%). The model of Mahabee-Gittens classified nearly the entire cohort (670 of 725 cases, or 92%) as positive for pneumonia, resulting in high sensitivity (95.3%) but low specificity (8.0%). With coefficients derived from the CARPE DIEM cohort, 1 or more performance characteristics improved for each model (Table 3).
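The likelihood ratios in Table 3 follow directly from sensitivity and specificity. As a check, the Neuman published-coefficient values can be reproduced as:

```python
def likelihood_ratios(sens, spec):
    """LR+ = sens / (1 - spec); LR- = (1 - sens) / spec,
    with sensitivity and specificity given as proportions."""
    return sens / (1 - spec), (1 - sens) / spec

# Neuman model, coefficients as published (Table 3)
lr_pos, lr_neg = likelihood_ratios(0.700, 0.654)  # ~2.02 and ~0.46
```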

TABLE 3

Diagnostic Performance of Models Using a Statistically Derived Optimal Cutpoint

Model | Coefficients as Published (Sensitivity / Specificity / PPV / NPV / LR+ / LR−) | Coefficients Derived From CARPE DIEM Dataset (Sensitivity / Specificity / PPV / NPV / LR+ / LR−)
Neuman et al14 | 70.0 / 65.4 / 36.5 / 88.4 / 2.02 / 0.46 | 69.6 / 77.1 / 46.3 / 89.9 / 3.03 / 0.39
Oostenbrink et al12 | 63.4 / 49.8 / 26.3 / 82.9 / 1.26 / 0.73 | 52.8 / 87.5 / 54.4 / 86.8 / 4.23 / 0.54
Lynch et al11 | 83.0 / 30.0 / 27.7 / 84.5 / 1.19 / 0.57 | 70.4 / 49.9 / 31.3 / 83.9 / 1.41 / 0.59
Mahabee-Gittens et al13 | 95.3 / 8.0 / 12.2 / 92.7 / 1.04 / 0.58 | 51.2 / 64.0 / 16.1 / 90.7 / 1.42 / 0.76
Lipsett et al16 | 81.7 / 52.6 / 33.0 / 90.9 / 1.72 / 0.35 | 60.1 / 79.9 / 45.9 / 87.5 / 2.98 / 0.50

LR+, positive likelihood ratio; LR−, negative likelihood ratio; NPV, negative predictive value; PPV, positive predictive value.

None of the models calibrated well using the coefficients as published: each predicted a higher risk of pneumonia than was observed, and observed risk did not increase with predicted risk (Supplemental Fig 3). Calibration of all models improved substantially when the coefficients were derived from CARPE DIEM data. Model performance at intervals of predicted risk is provided in Fig 2. The model of Neuman calibrated best among the 5 models, followed by the model of Lipsett. A notable feature of the Lynch, Mahabee-Gittens, and, to a lesser extent, Oostenbrink models was the limited range of predicted probabilities each exhibited when applied to the validation dataset.

FIGURE 2

Model performance by risk thresholds for the 5 studied models when using: (A) coefficients as published; and (B) coefficients derived from the CARPE DIEM dataset. Intervals were calculated by dividing the cohort into groups of approximately equal size. Within each bar, the darker and lighter shades indicate the proportions of patients in each band with and without radiographic disease, respectively.


In an additional analysis externally validating the Oostenbrink model among the 432 CARPE DIEM patients with CRP data available, the AUROC using the originally published coefficients was 0.55 (95% CI 0.49–0.60), which improved to 0.75 (95% CI 0.70–0.80) when using coefficients derived from the CARPE DIEM dataset. Model discriminatory performance declined slightly when analyses were limited to children <5 years of age (Supplemental Fig 4), as did performance at an optimally selected cutpoint (Supplemental Table 5).

We externally evaluated 5 previously published prediction models for radiographic pneumonia using a prospective cohort of children with suspected CAP. Of these, the models reported by Neuman, Lipsett, and Oostenbrink demonstrated the highest performance, with AUROCs >0.7 after fitting model variables to the validation dataset. No model provided clear discrimination between patients with and without radiographic pneumonia at a single cutoff.

The Neuman, Lipsett, and Oostenbrink models demonstrated the highest performance in this external validation, though the variables used in these models differ. The Neuman model requires extensive data, including historical and physical examination variables.14  Some of these variables, such as chest pain, may be difficult to ascertain in younger patients. Others, such as those relating to accessory muscle use and auscultatory findings, may be challenging to assess (eg, in patients with a high body mass index) and may be subject to limitations of interrater reliability.30  The model reported by Lipsett retained excellent performance with fewer clinical variables (age, presence of fever, wheeze, rales, and oxygen desaturation). Fewer clinical data are required by the Oostenbrink model, though it requires a laboratory test (CRP), and 1 of its measures (ill appearance) is inherently subjective.12  Reliance on a blood biomarker raises challenges with respect to institutional availability of point-of-care testing, time and cost, and the discomfort of venipuncture. Furthermore, applying the Oostenbrink model in a less ill-appearing population may be challenging, as blood testing is infrequently performed in these patients.

Predictive performance was weaker for the other 2 models we studied. The model by Mahabee-Gittens demonstrated limited performance even after its coefficients were refit on our dataset. This is likely because the Mahabee-Gittens cohort was limited to younger patients (2 months–5 years), in whom radiographic pneumonia is less common.13  Additionally, the prediction of radiographic pneumonia among younger children is challenging because of the high incidence of bronchiolitis and viral pneumonia, which have similar clinical manifestations but variable findings on CXR.31  This combination of factors underscores the challenges in developing a prediction model for pediatric pneumonia limited to young children.

By externally validating decision models for radiographic pneumonia using a population of patients distinct in time and place but with inclusion criteria similar to the derivation studies, this study demonstrates the limited generalizability of some of the studied models.32  A decline in performance is generally expected when externally validating a model in a population distinct from the derivation cohort.21  The Mahabee-Gittens model, for example, declined in AUROC from the derivation study (0.81; 95% CI 0.75–0.87) to our validation study (0.58; 95% CI 0.52–0.64). In contrast, the model by Lipsett demonstrated an increase in AUROC from the original derivation study (0.71; 95% CI 0.67–0.75) to the present external validation (0.76; 95% CI 0.73–0.80), suggesting that this model may have better transportability. As such, this model, which has the benefit of requiring only 6 variables and no laboratory biomarkers, may be most beneficial in clinical practice. If the predicted probability of disease is low, then other disease states may be considered and radiography may be avoided. Alternatively, if the predicted probability is high, the patient may be treated for pneumonia without confirmatory chest radiography.

Many clinical prediction models in emergency medicine, such as those used to evaluate young febrile infants for serious bacterial infection33  or children with head trauma for clinically important traumatic brain injury,34  are used in a “1-way” fashion to identify patients at low risk of an outcome but generally cannot be used to determine high risk. None of the radiographic pneumonia models we examined provided satisfactory discrimination between patients with and without radiographic pneumonia at a single binary cutoff. This challenge, additionally noted in the CARPE DIEM rule,15  means that no clear, singular cutoff can be used to characterize patients as low or high risk in a dichotomous fashion to make decisions about CXR or empirical antimicrobial use. This likely relates to the heterogeneity of pneumonia presentation among children, differences in disease etiology, and variability in radiograph interpretation. At the statistically derived cutoffs, the Oostenbrink model demonstrated the greatest ability to “rule in” pneumonia, with the highest specificity and positive likelihood ratio. The Neuman model demonstrated moderate ability to rule in but the highest performance to rule out, as reflected by its sensitivity and negative likelihood ratio. The internally derived CARPE DIEM radiographic pneumonia prediction model contains 3 variables: duration of fever, focal decreased breath sounds, and age.15  Evaluated on this subset of patients, this model demonstrated an AUROC of 0.81 (95% CI 0.66–0.84). At an optimally selected cutpoint, this model demonstrated a sensitivity of 73.1% and a specificity of 75.8%, though it has not yet been externally validated. None of the rules demonstrated a positive likelihood ratio greater than 5 or a negative likelihood ratio less than 0.2, thresholds generally suggestive of a substantial change from pretest to posttest probability.
Although performance declined slightly when the models were examined in children younger than 5 years of age, most were not derived specifically in this age group, and some decline is therefore expected.
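The pretest-to-posttest conversion referenced above works through odds. For example, applying the Oostenbrink derived-coefficient LR+ of 4.23 (Table 3) to the overall cohort prevalence of 22.2% yields a posttest probability of roughly 55%, in line with the tabulated positive predictive value:

```python
def posttest_probability(pretest_prob, lr):
    """Convert a pretest probability to a posttest probability through
    odds: odds_post = odds_pre * LR; probability = odds / (1 + odds)."""
    odds = pretest_prob / (1 - pretest_prob) * lr
    return odds / (1 + odds)

# Cohort prevalence 22.2%; Oostenbrink derived-coefficient LR+ = 4.23
p_post = posttest_probability(0.222, 4.23)  # ~0.55
```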

Prediction rules for pneumonia are not widely used in clinical practice. Reasons may include the lack of external validation, the inability to sufficiently differentiate patients with and without radiographic pneumonia at a single threshold, and challenges related to implementing best practices for defining diagnosis and etiology. Successful implementation would potentially decrease overuse of CXR3  and antibiotics35  and reduce unnecessary practice variation.36  We have taken the initial step of externally validating several of these previously derived models. A prediction rule may be implemented in multiple ways. Online calculators, such as 1 recently described for urinary tract infection risk stratification in young febrile children,37  may allow for dynamic risk stratification wherein the logistic regression formula is applied and a predicted probability is returned to the user. Alternatively, a model may be embedded directly into the electronic medical record, similar to efforts described for pediatric sepsis.38  These predicted probabilities may in turn guide management decisions (eg, a decision to perform CXR in children at intermediate risk, or a decision regarding use of antibiotics without CXR in children at very low or high risk). Finally, a future step may be to evaluate the validity of these models in other settings, including for children seen in primary care offices or for admitted patients.

Our study has limitations. Certain variables, such as the ill appearance measure used in the Oostenbrink model, were not objectively defined. In addition, most of the studied models did not report the complete data required for external validation (ie, model coefficients and intercepts),11,12,14  requiring additional steps to estimate model performance on the study data. As such, we used 2 analytical approaches to external validation. We performed multiple imputation for missing data in the CARPE DIEM dataset. This was particularly important in validating the Oostenbrink model, in which a large proportion of patients had missing CRP values. However, the performance of the Oostenbrink model when limited to those with a CRP obtained was similar to the primary analysis performed with imputation. Additionally, all of the models studied were developed for ED use and were validated on data from another pediatric ED. Therefore, the applicability of these models to other settings remains unknown.

Implementation of well-validated models can facilitate decisions about CXR and antibiotic use, particularly when CXR is not easily available. By providing a probability of radiographic pneumonia based on clinical factors, these models allow clinicians to integrate an evidence-based risk estimate into their clinical decision-making. In this study, 3 models demonstrated superior performance during validation.

Dr Ramgopal designed the study, interpreted the data, and drafted the initial manuscript; Drs Navanandan, Cotter, Ambroggio, Shah, and Ruddy conceptualized the study, designed the data collection instruments and participated in data collection, interpreted the results, and reviewed and revised the manuscript; Dr Lorenz conducted the statistical analyses and reviewed and revised the manuscript; Dr Florin conceptualized the study, designed the data collection instruments and participated in data collection, interpreted the results, reviewed and revised the manuscript, and supervised the study; and all authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.

This study externally validates 5 prediction models for pediatric radiographic pneumonia.

FUNDING: This study was supported by the National Institutes of Health and National Institute of Allergy and Infectious Diseases (K23AI121325 and R03AI147112 to T.A.F. and K01AI125413 to L.A.), the Gerber Foundation (to T.A.F.), and the National Institutes of Health and National Center for Research Resources and Cincinnati Center for Clinical and Translational Science and Training (5KL2TR000078 to T.A.F.). The funders did not have any role in study design, data collection, statistical analysis, or manuscript preparation. Funded by the National Institutes of Health (NIH).

CONFLICT OF INTEREST DISCLOSURES: The authors have indicated they have no financial relationships relevant to this article.

ABBREVIATIONS:

CARPE DIEM: Catalyzing Ambulatory Research in Pneumonia Etiology and Diagnostic Innovations in Emergency Medicine

AUROC: area under the receiver operating characteristic curve

CAP: community-acquired pneumonia

CCHMC: Cincinnati Children’s Hospital Medical Center

CXR: chest radiograph

ED: emergency department

1. McDermott K, Stocks C, Freeman W. Overview of pediatric emergency department visits.

2. Bradley JS, Byington CL, Shah SS, et al; Pediatric Infectious Diseases Society and the Infectious Diseases Society of America. The management of community-acquired pneumonia in infants and children older than 3 months of age: clinical practice guidelines by the Pediatric Infectious Diseases Society and the Infectious Diseases Society of America. Clin Infect Dis. 2011;53(7):e25–e76

3. Geanacopoulos AT, Porter JJ, Monuteaux MC, Lipsett SC, Neuman MI. Trends in chest radiographs for pneumonia in emergency departments. Pediatrics. 2020;145(3):e20192816

4. Elemraid MA, Muller M, Spencer DA, et al; North East of England Paediatric Respiratory Infection Study Group. Accuracy of the interpretation of chest radiographs for the diagnosis of paediatric pneumonia. PLoS One. 2014;9(8):e106051

5. Neuman MI, Lee EY, Bixby S, et al. Variability in the interpretation of chest radiographs for the diagnosis of pneumonia in children. J Hosp Med. 2012;7(4):294–298

6. Kramer MS, Etezadi-Amoli J, Ciampi A, et al. Parents’ versus physicians’ values for clinical outcomes in young febrile children. Pediatrics. 1994;93(5):697–702

7. Cohen E, Rodean J, Diong C, et al. Low-value diagnostic imaging use in the pediatric emergency department in the United States and Canada. JAMA Pediatr. 2019;173(8):e191439

8. Gotta V, Baumann P, Ritz N, et al; ProPAED study group. Drivers of antibiotic prescribing in children and adolescents with febrile lower respiratory tract infections. PLoS One. 2017;12(9):e0185197

9. Kronman MP, Hersh AL, Feng R, Huang Y-S, Lee GE, Shah SS. Ambulatory visit rates and antibiotic prescribing for children with pneumonia, 1994-2007. Pediatrics. 2011;127(3):411–418

10. Lipsett SC, Monuteaux MC, Bachur RG, Finn N, Neuman MI. Negative chest radiography and risk of pneumonia. Pediatrics. 2018;142(3):e20180236

11. Lynch T, Platt R, Gouin S, Larson C, Patenaude Y. Can we predict which children with clinically suspected pneumonia will have the presence of focal infiltrates on chest radiographs? Pediatrics. 2004;113(3 Pt 1):e186–e189

12. Oostenbrink R, Thompson M, Lakhanpaul M, Steyerberg EW, Coad N, Moll HA. Children with fever and cough at emergency care: diagnostic accuracy of a clinical model to identify children at low risk of pneumonia. Eur J Emerg Med. 2013;20(4):273–280

13. Mahabee-Gittens EM, Grupp-Phelan J, Brody AS, et al. Identifying children with pneumonia in the emergency department. Clin Pediatr (Phila). 2005;44(5):427–435

14. Neuman MI, Monuteaux MC, Scully KJ, Bachur RG. Prediction of pneumonia in a pediatric emergency department. Pediatrics. 2011;128(2):246–253

15. Ramgopal S, Ambroggio L, Lorenz D, Shah S, Ruddy R, Florin T. A prediction model for pediatric radiographic pneumonia. Pediatrics. 2022;149(1):e2021051405

16. Lipsett SC, Hirsch AW, Monuteaux MC, Bachur RG, Neuman MI. Development of the novel pneumonia risk score to predict radiographic pneumonia in children. Pediatr Infect Dis J. 2022;41(1):24–30

17. Van den Bruel A, Aertgeerts B, Bruyninckx R, Aerts M, Buntinx F. Signs and symptoms for diagnosis of serious infections in children: a prospective study in primary care. Br J Gen Pract. 2007;57(540):538–546

18. Craig JC, Williams GJ, Jones M, et al. The accuracy of clinical symptoms and signs for the diagnosis of serious bacterial infection in young febrile children: prospective cohort study of 15 781 febrile illnesses. BMJ. 2010;340:c1594

19. Nijman RG, Vergouwe Y, Thompson M, et al. Clinical prediction model to aid emergency doctors managing febrile children at risk of serious bacterial infections: diagnostic study. BMJ. 2013;346:f1706

20. Irwin AD, Grant A, Williams R, et al. Predicting risk of serious bacterial infections in febrile children in the emergency department. Pediatrics. 2017;140(2):e20162853

21. Moons KGM, Altman DG, Reitsma JB, et al. Transparent Reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162(1):W1–W73

22. Jain S, Williams DJ, Arnold SR, et al; CDC EPIC Study Team. Community-acquired pneumonia requiring hospitalization among US children. N Engl J Med. 2015;372(9):835–845

23. Nelson KA, Morrow C, Wingerter SL, Bachur RG, Neuman MI. Impact of chest radiography on antibiotic treatment for children with suspected pneumonia. Pediatr Emerg Care. 2016;32(8):514–519

24. World Health Organization. The Management of Acute Respiratory Infections in Children: Practical Guidelines for Outpatient Care. Geneva: World Health Organization; 1997

25. Chamberlain JM, Patel KM, Ruttimann UE, Pollack MM. Pediatric risk of admission (PRISA): a measure of severity of illness for assessing the risk of hospitalization from the emergency department. Ann Emerg Med. 1998;32(2):161–169

26. van Buuren S, Groothuis-Oudshoorn K. mice: multivariate imputation by chained equations in R. J Stat Softw. 2011;45(3):1–67

27. Van Calster B, Nieboer D, Vergouwe Y, De Cock B, Pencina MJ, Steyerberg EW. A calibration hierarchy for risk models was defined: from utopia to empirical data. J Clin Epidemiol. 2016;74:167–176

28. Pepe MS, Feng Z, Huang Y, et al. Integrating the predictiveness of a marker with its performance as a classifier. Am J Epidemiol. 2008;167(3):362–368

29. Perkins NJ, Schisterman EF. The inconsistency of “optimal” cutpoints obtained using two criteria based on the receiver operating characteristic curve. Am J Epidemiol. 2006;163(7):670–675

30. Florin TA, Ambroggio L, Brokamp C, et al. Reliability of examination findings in suspected community-acquired pneumonia. Pediatrics. 2017;140(3):e20170310

31. Shay DK, Holman RC, Newman RD, Liu LL, Stout JW, Anderson LJ. Bronchiolitis-associated hospitalizations among US children, 1980-1996. JAMA. 1999;282(15):1440–1446

32. Steyerberg EW, Harrell FE Jr. Prediction models need appropriate internal, internal-external, and external validation. J Clin Epidemiol. 2016;69:245–247

33. Kuppermann N, Dayan PS, Levine DA, et al. A clinical prediction rule for stratifying febrile infants 60 days and younger at risk for serious bacterial infections. JAMA Pediatr. 2019;173(4):342–351

34. Kuppermann N, Holmes JF, Dayan PS, et al; Pediatric Emergency Care Applied Research Network (PECARN). Identification of children at very low risk of clinically-important brain injuries after head trauma: a prospective cohort study. Lancet. 2009;374(9696):1160–1170

35. Handy LK, Bryan M, Gerber JS, Zaoutis T, Feemster KA. Variability in antibiotic prescribing for community-acquired pneumonia. Pediatrics. 2017;139(4):e20162331

36. McLaren SH, Mistry RD, Neuman MI, Florin TA, Dayan PS. Guideline adherence in diagnostic testing and treatment of community-acquired pneumonia in children. Pediatr Emerg Care. 2021;37(10):485–493

37. Shaikh N, Hoberman A, Hum SW, et al. Development and validation of a calculator for estimating the probability of urinary tract infection in young febrile children. JAMA Pediatr. 2018;172(6):550–556

38. Eisenberg M, Freiman E, Capraro A, et al. Comparison of manual and automated sepsis screening tools in a pediatric emergency department. Pediatrics. 2021;147(2):e2020022590
