BACKGROUND:

Polysomnography is central to the diagnosis and management of childhood obstructive sleep apnea (OSA). However, it is not known whether the treatment-related outcomes of OSA are causally associated with its resolution or changes in severity as determined by polysomnography.

METHODS:

Polysomnographic, cognitive, behavioral, quality-of-life, and health outcomes at baseline and at 7 months were obtained from the Childhood Adenotonsillectomy Trial, a randomized trial comparing the outcomes of early adenotonsillectomy to watchful waiting in children with OSA. We used causal mediation analysis to measure the changes in 18 outcomes independently attributable to polysomnographic resolution or changes in severity after adjusting for confounding variables.

RESULTS:

A total of 398 children aged 5 to 9 years were included. A total of 244 (61%) experienced resolution of OSA at follow-up. Polysomnographic resolution of the condition accounted for small but significant proportions of changes in symptoms (proportion mediated [95% confidence interval] 0.13 [0.07 to 0.21]; P < .001) and disease-specific quality of life (0.11 [0.04 to 0.20]; P = .004). Changes in polysomnographic severity similarly mediated symptom score (proportion mediated 0.18 [0.11 to 0.26]; P < .001) and disease-specific quality-of-life outcomes (0.20 [0.10 to 0.31]; P = .004). Importantly, significant mediation effects were not identified for any of the other 16 outcomes. No significant interactions were observed between the trial arms.

CONCLUSIONS:

The majority of the treatment-related changes in outcomes of OSA in school-aged children are not causally attributable to polysomnographic resolution or changes in its severity. These results underscore the limited utility of polysomnographic thresholds in the management of childhood OSA.

What’s Known on This Subject:

Polysomnography is central to the diagnosis and stratification of obstructive sleep apnea in children. Severity thresholds derived from polysomnography are used widely to initiate and monitor treatment of chronic upper airway obstruction.

What This Study Adds:

The outcomes of treatment in school-aged children with obstructive sleep apnea are not independently associated with the changes in severity or resolution of the condition as determined by polysomnography.

Polysomnography is used to quantify the severity of obstructive sleep apnea (OSA) in children and is central to the treatment guidelines from the American Academy of Pediatrics,1  the American Academy of Sleep Medicine,2  and the American Academy of Otolaryngology–Head and Neck Surgery.3  These guidelines generally delineate empirical classes of polysomnographic severity of OSA on the basis of the apnea-hypopnea index (AHI) to initiate and/or monitor treatment. In children undergoing adenotonsillectomy for the treatment of OSA, postoperative polysomnography is recommended when the preoperative AHI exceeds 5.2  Recent guidelines from the European Respiratory Society (ERS) support treatment of an AHI exceeding 5, with the principal aim of complete polysomnographic resolution of OSA.4  Children with an AHI >30 in the Childhood Adenotonsillectomy Trial (CHAT) underwent urgent adenotonsillectomy.5  Given that the majority of the 500 000 adenotonsillectomies performed annually in the United States are for childhood OSA,6  there is an urgent need to rationalize the indications for pediatric polysomnography given the associated untenable costs.7  Specifically, the causal association between the treatment outcomes of childhood OSA and polysomnographic changes remains unproven.

The CHAT remains the only randomized trial to date used to compare the benefit of early adenotonsillectomy over watchful waiting in children aged 5 to 9 years and diagnosed with OSA.8  The trial design facilitates the assessment of the isolated impact of polysomnographic improvement or resolution of OSA on the causal pathways leading to changes in outcomes after treatment. Although the CHAT study did not conclusively establish the benefits of early surgery over watchful waiting in the neurocognitive domain, significantly greater improvements were demonstrated for caregiver-reported outcomes of behavior, quality of life, and symptoms in children in the surgical arm compared to watchful waiting. Importantly, the investigators hypothesized that the changes in OSA severity mediate the changes in outcomes after treatment.5  The biological premise of this hypothesis is that the resolution or reduction of the severity of upper airway obstruction as measured by polysomnography may causally account for improved outcomes after treatment. Furthermore, on the basis of the observation that OSA resolved spontaneously in half of the children who underwent watchful waiting, both the trial investigators8  and others9  emphasized the potential opportunity to avoid surgery in this subset. Together, these approaches suggest that the change in polysomnographic severity of OSA represents a possible primary surrogate for the effectiveness of its treatment, with the goal being complete resolution. We therefore analyzed the trial data to examine the causal impact of polysomnographic resolution of OSA or changes in its severity on treatment-related outcomes of the condition.

The rationale, design, trial procedures, and results related to the CHAT study have been described by Marcus et al8  and Redline et al.5  The salient aspects of the trial methodology are provided in the following sections.

The CHAT study was designed to test the hypothesis that adenotonsillectomy is superior to watchful waiting in improving outcomes related to childhood OSA. The study investigators enrolled children on the basis of age (5–9 years) and polysomnographic definitions (AHI >2 or obstructive apnea index [OAI] >1 per hour) based on suitability for adenotonsillectomy. The rationale for the use of these definitions, based on normative data representing the burden of upper airway obstruction, is provided in the trial design.5  Children were excluded if they had polysomnographic evidence of very severe OSA (AHI >30 or OAI >20)10  or prolonged hypoxemia. The trial data were obtained from the National Sleep Research Resource (https://sleepdata.org) through a data use agreement.11  The primary and secondary outcomes were measured at baseline and follow-up at 7 months. Table 1 summarizes these outcomes, along with their respective domains and the interpretation of the results.

TABLE 1

Differences Between Children Grouped by Whether They Had Resolution of OSA in the CHAT As Determined by Follow-up Polysomnography

| Characteristic | With Resolution (n = 244) | Without Resolution (n = 154) | P [a] |
| --- | --- | --- | --- |
| Age, y, mean (95% CI) | 6.5 (6.3 to 6.6) | 6.7 (6.5 to 6.9) | .08 |
| Male sex, n (%) | 114 (47) | 80 (52) | .36 |
| Race, n (%) | | | .05 |
| White | 101 (41) | 43 (28) | |
| African American | 115 (47) | 97 (63) | |
| Other | 28 (12) | 14 (9) | |
| Hispanic ethnicity, n (%) | 19 (8) | 12 (8) | .99 |
| BMI percentile score, mean (95% CI) | 66.3 (62.5 to 70.3) | 76.2 (71.5 to 80.9) | <.001 |
| Wt class, n (%) | | | |
| Obese [b] | 62 (25) | 72 (47) | <.001 |
| Failure to thrive | 10 (4) | 4 (3) | — |
| Maternal education less than high school, n (%) | 70 (29) | 55 (36) | .35 |
| Annual household income <$30 000, n (%) | 91 (37) | 64 (42) | .17 |
| AHI score at baseline, events per h, mean (95% CI) | 5.8 (5.1 to 6.4) | 8.3 (7.4 to 9.2) | <.001 |
| AHI score at follow-up, events per h, mean (95% CI) | 0.8 (0.7 to 0.8) | 8.7 (6.9 to 10.5) | <.001 |
| Adenotonsillectomy, n (%) | 154 (63) | 40 (26) | <.001 |
| Follow-up AHI ≥2, n (%) | 0 (0) | 148 (96) | — |
| Follow-up AHI ≥5, n (%) | 0 (0) | 77 (50) | — |
| Follow-up AHI ≥10, n (%) | 0 (0) | 34 (22) | — |

All categorical variables are shown by n (%) and continuous variables by mean (95% CIs). Resolution of OSA was defined by follow-up polysomnography revealing an AHI <2 and an OAI <1. —, not applicable.

[a] P < .05 indicates statistical significance.

[b] Children with a BMI ≥95th percentile were categorized as obese. Children with a BMI in the fifth percentile or lower were classified as failure to thrive. The percentile designation was obtained from the Centers for Disease Control growth charts for children aged 2 to 20 y.

The primary outcome was the attention and executive function score on the Developmental Neuropsychological Assessment (NEPSY).12  Behavioral outcomes were measured by the Conners’ Rating Scales–Revised: Long Version Global Index,13  the Behavior Rating Inventory of Executive Function (BRIEF),14  the Conners’ Comprehensive Behavior Rating Scale (attention-deficit/hyperactivity disorder),13  and the Child Behavior Checklist (CBCL).15  Both global and disease-specific quality-of-life measures were obtained by using the Pediatric Quality of Life Inventory (PedsQL)16  and the Obstructive Sleep Apnea-18 (OSA-18) scale.17  The symptoms of OSA were assessed by the Pediatric Sleep Questionnaire Sleep-Related Breathing Disorder (PSQ-SRBD)18  scale and the modified Epworth Sleepiness Scale (ESS).19  Generalized intellectual functioning was determined by using the Differential Ability Scales-II.20 

Additional assessments included changes in BMI percentile,21  systolic and diastolic blood pressure measurements (measured in millimeters of mercury), and serum levels of C-reactive protein, a nonspecific marker for inflammation (measured in micrograms per milliliter, with a detection threshold of 0.15 μg/mL).22  Changes in the Homeostasis Model Assessment for Insulin Resistance (HOMA-IR), a surrogate marker of insulin resistance, were also calculated by using the equation HOMA-IR = fasting insulin (measured in microunits per milliliter) × fasting glucose (measured in milligrams per deciliter)/405.23  The rationale for the choice of these variables is based on the disease domains potentially affected by untreated OSA.5  In the current study, polysomnographic resolution of OSA was defined by an AHI <2 and OAI <1 at follow-up, identical to the CHAT study.24–26  The change in polysomnographic severity of OSA was the difference between the follow-up and baseline AHI scores. These mediators were chosen on the basis of (1) the finding that half of the children in the nonsurgical arm of the trial experienced resolution of the condition, supporting the conclusion that surgery could potentially be avoided in these children,8  and (2) the fact that change in AHI indicates change in OSA severity, which was prespecified in the trial design as a mediator for other outcomes.5  Additionally, a more robust definition of OSA resolution, represented by the resolution of obstruction (follow-up AHI <1.5 and OAI <1)27  along with the absence of hypoxemia (no portion of sleep with oximetry reading <92%), was also evaluated as a mediator for other outcomes.
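The definitions in the preceding paragraph can be written down compactly. The following is a minimal Python sketch, not code from the study: the function and parameter names are hypothetical, and representing "no portion of sleep with oximetry reading <92%" by a minimum recorded saturation is an assumption made for illustration.

```python
def homa_ir(fasting_insulin_uU_ml: float, fasting_glucose_mg_dl: float) -> float:
    """HOMA-IR = fasting insulin (uU/mL) x fasting glucose (mg/dL) / 405."""
    return fasting_insulin_uU_ml * fasting_glucose_mg_dl / 405.0


def osa_resolved(ahi_followup: float, oai_followup: float) -> bool:
    """Resolution as defined in the text: follow-up AHI < 2 and OAI < 1."""
    return ahi_followup < 2.0 and oai_followup < 1.0


def osa_resolved_strict(ahi_followup: float, oai_followup: float,
                        min_spo2_percent: float) -> bool:
    """Stricter definition: AHI < 1.5, OAI < 1, and no oximetry reading < 92%.

    Assumption: hypoxemia is summarized by the lowest saturation recorded.
    """
    return (ahi_followup < 1.5 and oai_followup < 1.0
            and min_spo2_percent >= 92.0)


def ahi_change(ahi_baseline: float, ahi_followup: float) -> float:
    """Change in severity: follow-up AHI minus baseline AHI."""
    return ahi_followup - ahi_baseline


# Illustrative (made-up) values for a single child.
print(homa_ir(9.0, 90.0))            # insulin 9 uU/mL, glucose 90 mg/dL
print(osa_resolved(0.8, 0.0))        # meets the AHI < 2 and OAI < 1 definition
print(osa_resolved_strict(1.8, 0.0, 95.0))
print(ahi_change(5.8, 0.8))
```

A negative value from `ahi_change` indicates a reduction in severity between baseline and follow-up.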

The characteristics of the study population including age, sex, race, ethnicity, BMI percentile score, socioeconomic characteristics, and OSA severity defined by the AHI score were compared between 2 groups of children defined by whether they experienced resolution of OSA at follow-up. We then compared the 18 outcomes between the trial arms by performing an analysis of covariance adjusted for the stratification factors of age, race, and weight status.8 

We performed a mediation analysis by modeling the effect of treatment on each of the 18 outcomes in Table 1 by including each of the mediating variables separately. The mediator represents the pathway through which the intervention influences the change in outcome. Causal mediation analysis was described by Imai et al28  on the basis of the well-known conceptual framework of Baron and Kenny.29  Methodologic aspects of the analysis are described in the Supplemental Information. The analysis was undertaken by using the mediation package for R (https://cran.r-project.org, version 3.5.1).30  The first mediation model described in the current study decomposes the total effect of treatment into mediated effect via polysomnographic resolution of OSA and direct effect unrelated to the mediator, as shown in Fig 1. The second model similarly uses the change in AHI as a mediator.
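To make the decomposition concrete, the sketch below estimates mediated and direct effects on simulated data with ordinary least squares and the product-of-coefficients approach. This is a deliberately simpler technique than the simulation-based estimation in the R mediation package used in the study; the data, variable names, and effect sizes are all hypothetical, and the mediator is treated as continuous (as in the second model, change in AHI).

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def linear_mediation(treat, mediator, outcome, covars):
    """Split the total treatment effect into mediated and direct parts.

    Assumes linear models and no treatment-by-mediator interaction.
    """
    ones = np.ones(len(treat))
    # Mediator model: M ~ 1 + T + X  (path from treatment to mediator)
    a = ols(np.column_stack([ones, treat, covars]), mediator)[1]
    # Outcome model: Y ~ 1 + T + M + X  (direct effect and mediator path)
    coef_y = ols(np.column_stack([ones, treat, mediator, covars]), outcome)
    direct, b = coef_y[1], coef_y[2]
    # Total-effect model: Y ~ 1 + T + X
    total = ols(np.column_stack([ones, treat, covars]), outcome)[1]
    mediated = a * b
    return {"acme": mediated, "ade": direct, "total": total,
            "prop_mediated": mediated / total}


# Simulated, purely illustrative data (not trial data).
rng = np.random.default_rng(0)
n = 400
treat = rng.integers(0, 2, n).astype(float)          # 1 = surgery, 0 = waiting
age = rng.uniform(5, 9, n)                           # baseline covariate
delta_ahi = -4.0 * treat + 0.2 * age + rng.normal(0, 3, n)
outcome = -0.20 * treat + 0.01 * delta_ahi + rng.normal(0, 0.1, n)

result = linear_mediation(treat, delta_ahi, outcome, age)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

With least-squares fits on the same sample and covariates, the mediated and direct estimates add up to the total effect, which mirrors the decomposition described above. A binary mediator such as OSA resolution would call for a different mediator model (for example, logistic or probit regression), which this sketch does not cover.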

FIGURE 1

General framework of causal mediation analysis. A, Conventional estimation of treatment effect c identifies the change in outcome measure (ΔPSQ-SRBD) as a function of the intervention such as eAT or WWSC. B, Causal mediation analysis is used to identify 3 separate pathways that include (1) the effect c′ of the treatment (eAT or WWSC) on the outcome (PSQ-SRBD), (2) the effect a′ of the treatment on mediator polysomnographic resolution of OSA, and (3) the effect b′ of the mediator on the outcome. The treatment effect that passes through the mediator is called the causal mediation effect, the unmediated effect is called the direct effect, and the sum of all effects is termed the total effect. C, The estimated ACME, the ADE, and the total effect for the outcome PSQ-SRBD. All effects are averaged over the entire study population with the assumption that there are no interactions between the 2 arms of the trial for ACME. D, The changes in ACME resulting from changes in the sensitivity parameter ρ. The actual ACME is shown by the point on the vertical axis intersecting the dotted horizontal line. The mediator is the resolution of OSA defined by an AHI <2 and OAI <1. ρ would need to assume a value of −0.4 before resulting in a change in direction of ACME. Error bars and gray shading represent 95% CIs. ACME, average causal mediation effect; ADE, average direct effect; eAT, early adenotonsillectomy; WWSC, watchful waiting with supportive care. a Indicates the lower limit of the correlation between residuals of the mediator and the outcome regression.


A key assumption supporting the framework of causal mediation analysis is sequential ignorability. In the first part of the assumption, the mediator is randomly assigned given the treatment as part of a randomized trial. The second part requires that there are no unmeasured baseline variables confounding the relationships between the treatment, the mediator, and the outcome. This is plausibly satisfied by controlling for the baseline covariates that may confound these relationships. The magnitude of departure from the second assumption is estimated by a sensitivity analysis assessing the contribution of unmeasured confounding variables (Fig 1).28 
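For reference, sequential ignorability is commonly stated in potential-outcomes notation as follows. The symbols are introduced here for illustration and do not appear in the original text: $T_i$ is the treatment, $M_i(t)$ the mediator under treatment $t$, $Y_i(t', m)$ the outcome under treatment $t'$ and mediator value $m$, and $X_i$ the baseline covariates.

```latex
\begin{align}
\{\,Y_i(t', m),\; M_i(t)\,\} &\;\perp\!\!\!\perp\; T_i \mid X_i = x, \\
Y_i(t', m) &\;\perp\!\!\!\perp\; M_i(t) \mid T_i = t,\; X_i = x .
\end{align}
```

The sensitivity parameter $\rho$ shown in Fig 1 is the correlation between the residuals of the mediator and outcome regressions, $\rho = \operatorname{Corr}(\varepsilon_{M}, \varepsilon_{Y})$; the second condition implies $\rho = 0$, and the sensitivity analysis asks how far $\rho$ would have to depart from 0 for the estimated mediation effect to change sign.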

The causal mediation analysis yields average causally mediated, direct, and total effects as well as the proportion of the mediated effects (of the total effect) as point estimates with 95% confidence intervals (CIs) and the associated P values. For the estimation of the mediation effects, a lack of significant interaction between the trial arms is assumed (ie, the mediation effects are independent of the trial arm and averaged over the entire study population). We tested the effect of relaxing this assumption by identifying the outcomes for which the average causal effects were significantly different between the trial arms.
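In the same potential-outcomes notation (symbols introduced here for illustration), the quantities reported by the analysis can be written as:

```latex
\begin{align}
\delta(t) &= \mathbb{E}\bigl[\,Y_i(t, M_i(1)) - Y_i(t, M_i(0))\,\bigr]
  && \text{(ACME)} \\
\zeta(t)  &= \mathbb{E}\bigl[\,Y_i(1, M_i(t)) - Y_i(0, M_i(t))\,\bigr]
  && \text{(ADE)} \\
\tau      &= \mathbb{E}\bigl[\,Y_i(1, M_i(1)) - Y_i(0, M_i(0))\,\bigr]
           = \delta(t) + \zeta(1 - t)
  && \text{(total effect)}
\end{align}
```

Under the no-interaction assumption, $\delta(0) = \delta(1) = \delta$ and $\zeta(0) = \zeta(1) = \zeta$, so that $\tau = \delta + \zeta$ and the proportion mediated is $\delta / \tau$. Relaxing the assumption amounts to testing whether $\delta(1) - \delta(0)$ differs from 0.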

With 400 children enrolled, the CHAT study was powered to detect differences in the primary outcome between children grouped by the type of treatment.8  For tests of relationships between each arm of the mediation analysis, an individual effect size of 0.30 (medium) as measured by partial correlation is assumed (treatment to mediator, mediator to outcome, and treatment to outcome; Fig 1).31  A sample size of 400 and type 1 error rate of 5% yield a power of nearly 1 for the mediation analysis. Additionally, 2 types of sensitivity analysis were performed; the first was performed to account for the effects of missing values and the second to determine the possible existence of unobserved pretreatment covariates when significant causal mediation effects were identified. The methodologic details of the 2 sensitivity analyses are described in the Supplemental Information. P < .05 was considered significant throughout.
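As a rough check on the stated power, the sketch below approximates the power to detect a single correlation of 0.30 with 400 participants using the Fisher z transformation. It is a simplification under stated assumptions, not the authors' calculation: it treats one path at a time, ignores the degrees of freedom used by adjustment covariates, and the function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def power_for_correlation(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power to detect correlation r with n subjects.

    Uses the Fisher z approximation: atanh(r_hat) ~ Normal(atanh(r), 1/(n - 3)).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = atanh(r) * sqrt(n - 3)
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power = power_for_correlation(0.30, 400)
print(f"Approximate power per path: {power:.5f}")
```

Under these assumptions the approximate power for each path exceeds 0.99, which is in line with the statement that the power is nearly 1.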

The primary results of the CHAT study have been published.8  Of the 464 children who were enrolled in the study and underwent random assignment, 398 were included in the current study on the basis of the availability of complete polysomnographic data. Supplemental Fig 2 shows patient flow into the current study from the CHAT. The baseline characteristics of children in the study are shown in Table 2. By using the criteria defined in the CHAT, a significantly greater number of children (244 [61%]) had polysomnographic resolution of OSA compared with those who did not experience resolution (154 [39%]; P < .001). Of those without polysomnographic resolution, 96% had an AHI ≥2 and 50% had an AHI ≥5 on follow-up polysomnography. This group included significantly greater proportions of African American children (63% vs 47%; P = .001) and obese children (47% vs 25%; P < .001) than the group with resolution. The mean baseline AHI in the group with complete resolution (5.8 [95% CI, 5.1 to 6.4]) was significantly lower than that in the group without resolution (8.3 [95% CI, 7.4 to 9.2]; P < .001).

TABLE 2

Outcomes Measured in the CHAT

| Test | Domain | Outcome Variable | Test Recipient | Normative Mean or Range | Interpretation of Higher Value |
| --- | --- | --- | --- | --- | --- |
| NEPSY | Neurocognitive | Attention and executive function score | Child | 100 ± 15 | Better functioning |
| Conners’ Rating Scales–Revised: Long Version Global Index | Behavior | Restless-impulsive and emotional lability factor sets | Caregiver | 50 ± 10 | Worse behavior |
| Conners’ Rating Scales–Revised: Long Version Global Index | Behavior | Restless-impulsive and emotional lability factor sets | Teacher | 50 ± 10 | Worse behavior |
| BRIEF | Behavior | Global Executive Composite T score | Caregiver | 50 ± 10 | Worse behavior |
| BRIEF | Behavior | Global Executive Composite T score | Teacher | 50 ± 10 | Worse behavior |
| Pediatric Sleep Questionnaire | Sleep-related symptoms | Sleep-Related Breathing Disorder scale total score | Caregiver | 0.2 ± 0.1 | Worse symptoms |
| PedsQL | Overall quality of life | Total score | Caregiver | 78 ± 16 | Better quality of life |
| Differential Ability Scales-II | Neurocognitive | General conceptual ability composite score | Child | 100 ± 15 | Better functioning |
| Conners’ Behavior Rating Scales (attention-deficit/hyperactivity disorder) | Behavior | T score | Caregiver | 50 ± 10 | Worse behavior |
| Conners’ Behavior Rating Scales (attention-deficit/hyperactivity disorder) | Behavior | T score | Teacher | 50 ± 10 | Worse behavior |
| CBCL | Behavior | Total score | Caregiver | 50 ± 10 | Worse behavior |
| OSA-18 | Disease-specific quality of life | Total score | Caregiver | NA | Worse quality of life |
| Modified ESS | Sleepiness | Summary score | Caregiver | <10 | Worse sleepiness |
| BMI | Growth | Percentile score | Child | 5–95 | ≥95: obese; <5: failure to thrive |
| C-reactive protein | Inflammation | Serum level (μg/mL) | Child | <1 | Worse systemic inflammation |
| HOMA-IR | Physiology | Fasting insulin (μU/mL) × fasting glucose (mg/dL)/405 | Child | 2.22–2.67 | Worse insulin resistance |
| Blood pressure | Physiology | Systolic blood pressure | Child | NA [a] | NA [a] |
| Blood pressure | Physiology | Diastolic blood pressure | Child | NA [a] | NA [a] |

The outcome domains as well as normal values are shown in subsequent columns. The fourth column indicates the recipient of the test. ± represents values of SD. The interpretation of the higher value of each outcome is shown in the last column. T scores provide information about a child’s score relative to the reference sample. NA, not applicable.

[a] Varies by age and sex.

Early adenotonsillectomy resulted in significantly greater changes in 12 out of 18 outcomes compared to watchful waiting (Table 3). Of these, the largest effect sizes were observed for the changes in symptoms measured by the PSQ-SRBD (Cohen’s d, 1.35; P < .001) and disease-specific quality of life assessed by OSA-18 (Cohen’s d, 0.94; P < .001).

TABLE 3

Outcome Measures from the CHAT After Random Assignment

| Outcome [a] | Normative Mean | Watchful Waiting: Baseline, Mean (95% CI) | Watchful Waiting: Change From Baseline to 7 mo, Mean (95% CI) | Early Adenotonsillectomy: Baseline, Mean (95% CI) | Early Adenotonsillectomy: Change From Baseline to 7 mo, Mean (95% CI) | P | Effect Size |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NEPSY attention and executive function score (n = 390) | 100 ± 15 | 101.0 (99.0 to 103.0) | 5.1 (3.3 to 7.0) | 102.0 (99.7 to 104.2) | 6.9 (−4.1 to −1.2) | .07 | 0.13 |
| Conners’ Rating Scale score: caregiver rating (n = 385) | 50 ± 10 | 52.5 (50.9 to 54.1) | −0.4 (−1.7 to 0.93) | 52.2 (50.6 to 53.9) | −2.7 (−4.1 to −1.2) | .001 | 0.23 |
| Conners’ Rating Scale score: teacher rating (n = 209) | NA | 54.9 (52.8 to 57.0) | −1.8 (−3.5 to −0.1) | 55.3 (53.1 to 57.5) | −5.2 (−7.1 to −3.3) | .04 | 0.27 |
| BRIEF score: caregiver rating (n = 385) | 50 ± 10 | 50.0 (48.4 to 51.6) | 0.3 (−0.9 to 1.5) | 50.1 (48.5 to 51.6) | −3.3 (−4.5 to −2.1) | <.001 | 0.41 |
| BRIEF score: teacher rating (n = 204) | NA | 56.3 (54.3 to 58.3) | −0.9 (−2.6 to 0.8) | 56.9 (54.8 to 59.0) | −2.4 (−4.1 to −0.6) | .18 | 0.12 |
| PSQ-SRBD score (n = 389) | 0.2 ± 0.1 | 0.5 (0.5 to 0.5) | 0.0 (−0.1 to −0.0) | 0.5 (0.5 to 0.5) | −0.3 (−0.3 to −0.3) | <.001 | 1.35 |
| PedsQL score (n = 392) | 78 ± 16 | 78.1 (76.0 to 80.2) | 0.8 (−1.0 to 2.6) | 78.3 (76.1 to 80.5) | 6.1 (4.0 to 8.1) | <.001 | 0.38 |
| DAS-2 GCA score (n = 390) | 100 ± 15 | 95.7 (94.1 to 97.3) | 1.7 (−0.6 to 4.0) | 98.6 (97.0 to 100.3) | −2.0 (−4.4 to −0.3) | .37 | 0.06 |
| Conners’ ADHD Index T score: caregiver rating (n = 385) | 50 ± 10 | 53.2 (51.7 to 54.8) | −0.6 (−2.6 to 1.6) | 52.9 (51.3 to 54.5) | −2.7 (−5.0 to −0.6) | .003 | 0.22 |
| Conners’ ADHD Index T score: teacher rating (n = 209) | NA | 54.2 (52.2 to 56.2) | 0.5 (−2.3 to 3.2) | 55.6 (53.5 to 57.6) | −3.4 (−6.2 to −0.5) | .01 | 0.35 |
| CBCL total problems T score (n = 374) | 50 ± 10 | 53.1 (51.6 to 54.6) | −0.9 (−3.1 to −1.2) | 52.4 (50.8 to 54.0) | −3.7 (−6.1 to −1.3) | <.001 | 0.26 |
| OSA-18 score (n = 393) | NA | 54.1 (51.5 to 56.8) | −4.6 (−8.5 to −0.8) | 53.3 (50.7 to 55.8) | −21.5 (−24.8 to −18.1) | <.001 | 0.94 |
| ESS (n = 393) | <10 | 7.8 (7.0 to 8.6) | −0.3 (−1.3 to 0.8) | 7.5 (6.7 to 8.2) | −2.1 (−3.1 to −1.1) | <.001 | 0.38 |
| BMI percentile (n = 391) | 5–85 | 70.0 (65.9 to 74.2) | 4.0 (−1.8 to 9.8) | 70.3 (65.8 to 74.8) | 6.5 (0.6 to 12.5) | .02 | 0.21 |
| C-reactive protein (n = 262), μg/mL | <1 | 2.5 (1.1 to 3.8) | −0.9 (−2.3 to 0.5) | 1.9 (1.4 to 2.4) | 0.2 (−0.8 to 1.3) | .50 | 0.01 |
| HOMA-IR (n = 261) | 2.22–2.67 | 2.0 (1.6 to 2.4) | 0.0 (−0.5 to 0.5) | 1.8 (1.5 to 2.1) | 0.5 (−0.1 to 1.1) | .04 | 0.20 |
| Systolic blood pressure (n = 391), mm Hg | NA | 97.8 (96.7 to 99.0) | 0.8 (−0.8 to 2.5) | 96.6 (95.3 to 98.0) | 1.2 (−0.6 to 3.1) | .56 | 0.06 |
| Diastolic blood pressure (n = 391), mm Hg | NA | 62.4 (61.3 to 63.4) | 0.7 (−0.9 to 2.2) | 62.1 (60.9 to 63.2) | 1.1 (−0.5 to 2.7) | .81 | 0.05 |

ADHD, attention-deficit/hyperactivity disorder; DAS-2 GCA, Differential Ability Scales–Second Edition (General Conceptual Ability); NA, not available.

[a] Outcome variables included scores on the NEPSY, the Conners’ Behavior Rating Scales, and the BRIEF, with summary measures obtained from the primary caregiver and the teacher, the PSQ-SRBD, and the PedsQL. These were reported in the original study. Additional outcome variables included scores on the DAS-2 GCA, the Conners’ ADHD Index T scores obtained from the primary caregiver and the teacher, the CBCL, the OSA-18 inventory, and the modified ESS. T scores provide information about a child’s score relative to the reference sample. Physiologic outcomes were represented by changes in BMI percentile score, C-reactive protein values, HOMA-IR, and the systolic and diastolic blood pressures. Population means along with SDs are provided where normative data are available. All other values are reported as means and 95% CIs. P values are adjusted for effects of age, race, and wt. Effect size is expressed as Cohen’s d and interpreted by the following conventions: 0.20–0.49 (small), 0.50–0.79 (medium), and ≥0.80 (large). Numbers in parentheses listed in the outcome column represent sample sizes of complete data sets. P < .05 was considered significant. A complete description of each outcome is provided in Table 1.
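The effect-size conventions in the footnote can be expressed as a small helper. This is an illustrative Python sketch with a hypothetical function name; the footnote does not name values below 0.20, so the label used for that range is an assumption.

```python
def cohens_d_label(d: float) -> str:
    """Map a Cohen's d value to the conventions given in the table footnote."""
    magnitude = abs(d)
    if magnitude >= 0.80:
        return "large"
    if magnitude >= 0.50:
        return "medium"
    if magnitude >= 0.20:
        return "small"
    # The footnote does not label this range; the wording here is an assumption.
    return "below small"


# Effect sizes taken from the table above.
for outcome, d in [("PSQ-SRBD", 1.35), ("OSA-18", 0.94),
                   ("BRIEF caregiver", 0.41), ("NEPSY", 0.13)]:
    print(f"{outcome}: d = {d:.2f} ({cohens_d_label(d)})")
```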

Table 4 shows the results of mediation analysis averaged over both arms of the trial when polysomnographic resolution of OSA was considered the mediator. The total effects of treatment, defined as the sum of mediation and direct effects, were significant for 12 of the 18 outcomes. Small but statistically significant average mediation effects were identified for the PSQ-SRBD (proportion mediated: 0.13; 95% CI, 0.07 to 0.21; P < .001) and OSA-18 (proportion mediated: 0.11 [95% CI, 0.04 to 0.20]; P = .004). Similarly, the change in AHI between baseline and follow-up revealed significant mediation effects (Table 5) for the PSQ-SRBD (proportion mediated: 0.18 [95% CI, 0.11 to 0.26]; P < .001) and OSA-18 (proportion mediated: 0.20 [95% CI, 0.10 to 0.31]; P < .001). However, 16 out of 18 outcomes were not associated with significant contribution from either mediator. When OSA resolution was defined in a more robust fashion as an AHI <1.5, OAI <1, and absence of hypoxemia as measured by oximetry (Supplemental Table 6), the mediation effects were limited to the PSQ-SRBD (proportion mediated: 0.06 [95% CI, 0.02 to 0.12]; P = .005). Investigation of interaction between the trial arms revealed that polysomnographic resolution in children undergoing surgery mediated weight gain as opposed to weight loss in children undergoing watchful waiting (difference in mediation effect: 2.02 [95% CI, 0.40 to 3.85]; P = .01; Supplemental Table 7 for OSA resolution as mediator and Supplemental Tables 8 and 9 for other mediators). None of the other 17 outcomes revealed statistically significant interaction effects.
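The relationship between the reported quantities can be checked with simple arithmetic on the published point estimates for the two outcomes with significant mediation. The sketch below is illustrative only; the names are hypothetical, and because the mediation package estimates the proportion mediated by simulation, a ratio of rounded point estimates need not match the reported proportion exactly.

```python
# Point estimates for resolution of OSA as mediator, as reported in Table 4.
TABLE4_POINT_ESTIMATES = {
    "PSQ-SRBD": {"acme": -0.03, "ade": -0.22, "total": -0.25},
    "OSA-18": {"acme": -1.98, "ade": -15.31, "total": -17.29},
}


def summarize(est, tol=0.011):
    """Check ACME + ADE against the total effect and compute ACME / total.

    The tolerance allows for rounding of each estimate to two decimals.
    """
    additive = abs(est["acme"] + est["ade"] - est["total"]) < tol
    proportion = est["acme"] / est["total"]
    return additive, round(proportion, 2)


for name, est in TABLE4_POINT_ESTIMATES.items():
    additive, proportion = summarize(est)
    print(f"{name}: ACME + ADE matches total: {additive}; "
          f"ACME / total = {proportion:.2f}")
```

For OSA-18 the ratio agrees with the reported proportion mediated of 0.11; for the PSQ-SRBD the ratio of the rounded estimates is slightly below the reported 0.13, plausibly because of rounding and the simulation-based estimation.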

TABLE 4

Results of Mediation Analysis for 18 Outcomes Assessed in the CHAT

| Outcome^a | ACME, Mean Estimate (95% CI) | ACME, P | ADE, Mean Estimate (95% CI) | ADE, P | Total Effect^b, Mean Estimate (95% CI) | Total Effect^b, P | Proportion Mediated^c, Mean Estimate (95% CI) | Proportion Mediated^c, P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NEPSY attention and executive function score (n = 390) | −0.20 (−1.42 to 1.02) | .74 | 2.53 (−0.41 to 5.67) | .10 | 2.32 (−0.41 to 5.30) | .10 | −0.09 (−3.69 to 0.83) | .76 |
| Conners’ Rating Scale score, caregiver rating (n = 385) | −0.07 (−0.78 to 0.60) | .88 | −2.94 (−4.73 to −1.11) | .002 | −3.02 (−4.62 to −1.38) | <.001 | 0.02 (−0.19 to 0.39) | .88 |
| Conners’ Rating Scale score, teacher rating (n = 209) | −0.68 (−2.41 to 0.97) | .42 | −2.40 (−6.02 to 0.75) | .15 | −3.08 (−5.84 to −0.48) | .03 | 0.22 (−2.10 to 3.07) | .45 |
| BRIEF score, caregiver rating (n = 385) | 0.06 (−0.66 to 0.72) | .86 | −3.68 (−5.43 to −1.95) | <.001 | −3.62 (−5.26 to −2.08) | <.001 | −0.02 (−0.22 to 0.19) | .86 |
| BRIEF score, teacher rating (n = 204) | −0.13 (−1.65 to 1.42) | .89 | −2.05 (−5.49 to 1.25) | .21 | −2.19 (−5.22 to 0.64) | .14 | 0.06 (−618.93 to −0.26) | .90 |
| PSQ-SRBD (n = 389) | −0.03 (−0.05 to −0.02) | <.001 | −0.22 (−0.25 to −0.18) | <.001 | −0.25 (−0.28 to −0.21) | <.001 | 0.13 (0.07 to 0.21) | <.001 |
| PedsQL score (n = 392) | 0.78 (−0.41 to 2.05) | .23 | 4.55 (1.21 to 7.61) | .007 | 5.33 (2.30 to 8.10) | <.001 | 0.15 (−0.05 to 0.63) | .23 |
| DAS-2 GCA score (n = 390) | −0.56 (−1.14 to 0.00) | .05 | 1.27 (−0.16 to 2.63) | .08 | 0.71 (−0.64 to 1.98) | .31 | −0.79 (−8.96 to 7.66) | .34 |
| Conners’ ADHD Index T score, caregiver rating (n = 385) | −0.45 (−1.16 to 0.25) | .21 | −2.15 (−3.98 to −0.33) | .02 | −2.60 (−4.23 to −1.01) | .002 | 0.17 (−0.10 to 0.70) | .21 |
| Conners’ ADHD Index T score, teacher rating (n = 209) | −0.33 (−1.83 to 1.12) | .67 | −3.11 (−5.98 to −0.30) | .03 | −3.44 (−5.90 to −1.03) | .004 | 0.10 (−0.39 to 0.81) | .67 |
| CBCL total problems T score (n = 374) | −0.02 (−0.66 to 0.61) | .94 | −2.68 (−4.33 to −1.13) | <.001 | −2.70 (−4.31 to −1.13) | <.001 | 0.01 (−0.32 to 0.27) | .94 |
| OSA-18 summary score (n = 388) | −1.98 (−3.40 to −0.64) | .004 | −15.31 (−18.70 to −12.07) | <.001 | −17.29 (−20.40 to −14.24) | <.001 | 0.11 (0.04 to 0.20) | .004 |
| ESS (n = 393) | −0.13 (−0.47 to 0.20) | .44 | −1.64 (−2.55 to −0.77) | <.001 | −1.77 (−2.60 to −0.98) | <.001 | 0.08 (−0.11 to 0.31) | .44 |
| BMI percentile (n = 391) | −0.35 (−1.26 to 0.50) | .40 | 2.76 (0.56 to 5.00) | .01 | 2.42 (0.41 to 4.49) | .02 | −0.14 (−0.99 to 0.33) | .41 |
| C-reactive protein (n = 262), μg/mL | −0.15 (−0.52 to 0.17) | .40 | 0.38 (−0.53 to 1.30) | .47 | 0.23 (−0.48 to 0.92) | .60 | −0.64 (−6.39 to 4.92) | .54 |
| HOMA-IR (n = 261) | 0.02 (−0.19 to 0.22) | .85 | 0.55 (0.03 to 1.11) | .03 | 0.57 (0.09 to 1.07) | .02 | 0.04 (−0.51 to 0.69) | .85 |
| Systolic blood pressure (n = 391), mm Hg | −0.12 (−0.78 to 0.54) | .73 | −0.13 (−1.82 to 1.60) | .88 | −0.25 (−1.75 to 1.32) | .78 | 0.47 (−6.16 to 5.50) | .95 |
| Diastolic blood pressure (n = 391), mm Hg | 0.29 (−0.34 to 0.91) | .35 | −0.17 (−1.72 to 1.37) | .85 | 0.12 (−1.30 to 1.53) | .86 | 2.44 (−7.01 to 6.52) | .93 |

The causal mediator was polysomnographic resolution, defined by an AHI <2 and OAI <1 at follow-up. ACME, average causal mediation effect; ADE, average direct effect; ADHD, attention-deficit/hyperactivity disorder; DAS-2 GCA, Differential Ability Scales–Second Edition (General Conceptual Ability). T scores provide information about a child’s score relative to the reference sample.

^a Outcome variables included scores on the NEPSY, the Conners’ Behavior Rating Scales, and the BRIEF, with summary measures obtained from the primary caregiver and the teacher, the PSQ-SRBD, and the PedsQL. These were reported in the CHAT study. Additional outcome variables included scores on the DAS-2 GCA, the Conners’ ADHD Index T scores obtained from the primary caregiver and the teacher, the CBCL, the OSA-18 inventory, and the modified ESS. Physiologic outcomes were represented by changes in BMI percentile score, C-reactive protein values, HOMA-IR, and the systolic and diastolic blood pressures. Changes in outcomes in response to early adenotonsillectomy compared to watchful waiting are shown in Table 3. Normative values are described in Table 1. The relationship between the treatment and the outcome is assessed before and after including polysomnographic resolution of OSA as the mediating variable.

^b The total effect is decomposed into the ACME and the ADE.

^c The contribution of the mediated effect as a proportion of the total effect is also shown. Point estimates are represented by means and 95% CIs, along with P values for comparisons. The number of children is indicated by “n =” in parentheses. P < .05 was considered significant.
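The decomposition described in the footnotes (total effect = ACME + ADE; proportion mediated = ACME / total effect) can be checked against the rounded point estimates in Table 4. The sketch below is illustrative only: the function name is hypothetical, and because the published estimates are means from the fitted mediation models and are rounded, the recomputed values need not match the table exactly.

```python
def decompose(acme, ade):
    """Return (total effect, proportion mediated) from point estimates."""
    total = acme + ade
    if total == 0:
        raise ValueError("proportion mediated is undefined when the total effect is 0")
    return total, acme / total


# Rounded ACME and ADE point estimates copied from Table 4.
table4_rows = {
    "PSQ-SRBD": (-0.03, -0.22),
    "OSA-18 summary score": (-1.98, -15.31),
    "Diastolic blood pressure": (0.29, -0.17),
}

for outcome, (acme, ade) in table4_rows.items():
    total, proportion = decompose(acme, ade)
    print(f"{outcome}: total = {total:.2f}, proportion mediated = {proportion:.2f}")
```

For the PSQ-SRBD the rounded inputs give about 0.12 (reported: 0.13), and for the OSA-18 about 0.11 (reported: 0.11). For diastolic blood pressure the total effect is close to zero, so the ratio becomes large (about 2.4) and unstable, which is consistent with the very wide confidence intervals reported for the proportion mediated in outcomes with small total effects.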

TABLE 5

Results of Mediation Analysis for Outcomes of the CHAT

| Outcome^a | ACME, Mean Estimate (95% CI) | ACME, P | ADE, Mean Estimate (95% CI) | ADE, P | Total Effect^b, Mean Estimate (95% CI) | Total Effect^b, P | Proportion Mediated^c, Mean Estimate (95% CI) | Proportion Mediated^c, P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NEPSY attention and executive function score (n = 390) | 0.81 (−0.71 to 2.28) | .27 | 1.51 (−1.64 to 4.78) | .27 | 2.32 (−0.48 to 5.19) | .10 | 0.35 (−1.78 to 3.27) | .34 |
| Conners’ Rating Scale score, caregiver rating (n = 385) | −0.43 (−1.27 to 0.35) | .30 | −2.59 (−4.43 to −0.74) | .004 | −3.02 (−4.64 to −1.42) | <.001 | 0.14 (−0.12 to 0.56) | .30 |
| Conners’ Rating Scale score, teacher rating (n = 209) | −0.59 (−2.39 to 1.29) | .51 | −2.49 (−5.95 to 0.98) | .16 | −3.08 (−5.79 to −0.36) | .03 | 0.19 (−0.62 to 1.88) | .53 |
| BRIEF score, caregiver rating (n = 385) | −0.23 (−1.12 to 0.60) | .56 | −3.39 (−5.13 to −1.64) | <.001 | −3.63 (−5.26 to −2.02) | <.001 | 0.06 (−0.19 to 0.34) | .56 |
| BRIEF score, teacher rating (n = 204) | −0.51 (−2.53 to 1.78) | .67 | −1.68 (−5.59 to 2.07) | .36 | −2.19 (−5.19 to 0.74) | .14 | 0.23 (−3.22 to 4.10) | .72 |
| PSQ-SRBD (n = 389) | −0.05 (−0.06 to −0.03) | <.001 | −0.20 (−0.24 to −0.17) | <.001 | −0.25 (−0.28 to −0.22) | <.001 | 0.18 (0.11 to 0.26) | <.001 |
| PedsQL score (n = 392) | 0.51 (−0.95 to 1.96) | .49 | 4.82 (1.41 to 8.19) | .006 | 5.33 (2.42 to 8.21) | <.001 | 0.10 (−0.18 to 0.51) | .50 |
| DAS-2 GCA score (n = 390) | −0.90 (−1.65 to −0.17) | .02 | 1.61 (0.13 to 3.07) | .03 | 0.71 (−0.61 to 2.04) | .30 | −1.28 (−13.38 to 10.28) | .31 |
| Conners’ ADHD Index T score, caregiver rating (n = 385) | −0.60 (−1.41 to 0.23) | .15 | −2.00 (−3.95 to −0.20) | .03 | −2.60 (−4.24 to −1.06) | .001 | 0.23 (−0.09 to 0.84) | .15 |
| Conners’ ADHD Index T score, teacher rating (n = 209) | −0.51 (−2.09 to 1.28) | .55 | −2.93 (−6.15 to 0.00) | .05 | −3.44 (−5.85 to −1.13) | .004 | 0.15 (−0.39 to 0.99) | .55 |
| CBCL total problems T score (n = 374) | −0.36 (−1.18 to 0.47) | .39 | −2.24 (−3.99 to −0.68) | .007 | −2.70 (−4.29 to −1.17) | .001 | 0.14 (−0.20 to 0.54) | .39 |
| OSA-18 summary score (n = 388) | −3.39 (−5.16 to −1.82) | <.001 | −13.90 (−17.24 to −10.45) | <.001 | −17.28 (−20.35 to −14.19) | <.001 | 0.20 (0.10 to 0.31) | <.001 |
| ESS (n = 393) | −0.49 (−0.93 to −0.07) | .02 | −1.29 (−2.20 to −0.41) | .004 | −1.77 (−2.59 to −0.98) | <.001 | 0.27 (0.04 to 0.64) | .02 |
| BMI percentile (n = 391) | 0.78 (−0.35 to 1.94) | .18 | 1.64 (−0.76 to 4.04) | .18 | 2.42 (0.43 to 4.44) | .02 | 0.32 (−0.20 to 1.82) | .20 |
| C-reactive protein (n = 262), μg/mL | −0.42 (−1.08 to 0.12) | .19 | 0.65 (−0.38 to 1.75) | .29 | 0.23 (−0.44 to 0.88) | .58 | −1.81 (−15.12 to 12.51) | .58 |
| HOMA-IR (n = 261) | 0.01 (−0.22 to 0.22) | .91 | 0.56 (0.03 to 1.11) | .04 | 0.57 (0.11 to 1.06) | .01 | 0.02 (−0.48 to 0.71) | .91 |
| Systolic blood pressure (n = 391), mm Hg | −0.12 (−0.98 to 0.78) | .80 | −0.13 (−1.87 to 1.62) | .88 | −0.25 (−1.78 to 1.28) | .72 | 0.49 (−6.56 to 8.25) | .93 |
| Diastolic blood pressure (n = 391), mm Hg | 0.47 (−0.33 to 1.25) | .24 | −0.35 (−2.00 to 1.31) | .69 | 0.12 (−1.32 to 1.54) | .85 | 4.00 (−11.39 to 10.54) | .88 |

The causal mediator was polysomnographic change between baseline and follow-up, measured by the log-transformed AHI. ACME, average causal mediation effect; ADE, average direct effect; ADHD, attention-deficit/hyperactivity disorder; DAS-2 GCA, Differential Ability Scales–Second Edition (General Conceptual Ability). T scores provide information about a child’s score relative to the reference sample.

^a Outcome variables reported in the CHAT study included scores on the NEPSY, the Conners’ Behavior Rating Scales, and the BRIEF, with summary measures obtained from the primary caregiver and the teacher, the PSQ-SRBD, and the PedsQL. Additional outcome variables included scores on the DAS-2 GCA, the Conners’ ADHD Index T scores obtained from the primary caregiver and the teacher, the CBCL, the OSA-18 inventory, and the modified ESS. Physiologic outcomes were represented by changes in BMI percentile score, C-reactive protein values, HOMA-IR, and the systolic and diastolic blood pressures. The relationship between the treatment and the outcome is assessed before and after including log-transformed AHI as the mediating variable. Log transformation was performed because of the nonnormal distribution of data.

^b The total effect is decomposed into the ACME and the ADE.

^c The contribution of the mediated effect as a proportion of the total effect is also shown. Point estimates are represented by means and 95% CIs, along with P values for comparisons. The number of children is indicated by “n =” in parentheses. P < .05 was considered significant.
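One way a change in log-transformed AHI could be computed for use as a continuous mediator is sketched below. The exact transformation is not described in this section, so everything here is an assumption made for illustration: the function name, the use of a difference of logs, and in particular the `offset` added so that an AHI of 0 can be log transformed.

```python
import math


def log_ahi_change(ahi_baseline, ahi_followup, offset=1.0):
    """Change in log-transformed AHI from baseline to follow-up.

    offset is a hypothetical constant that keeps log() defined when AHI is 0;
    it is an assumption of this sketch, not a value taken from the article.
    """
    if ahi_baseline < 0 or ahi_followup < 0:
        raise ValueError("AHI cannot be negative")
    return math.log(ahi_followup + offset) - math.log(ahi_baseline + offset)


# Hypothetical values, not trial data: AHI falling from 9 to 0 events per hour.
print(round(log_ahi_change(9.0, 0.0), 3))
```

Negative values indicate a fall in AHI. Working on the log scale compresses the long right tail of AHI, which matches the stated reason for the transformation (nonnormal distribution of the data).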

The first sensitivity analysis revealed that the missing observations from the data did not significantly alter causal inferences (data not shown). Supplemental Table 10 and Supplemental Figs 3–5 represent sensitivity analyses for the significant mediation effects identified for 3 separate causal pathways. In summary, sequential ignorability was less likely to be violated for the mediation effects in the PSQ-SRBD (sensitivity parameter of −0.40 for the watchful waiting group and −0.30 for the adenotonsillectomy group) compared to the OSA-18 (sensitivity parameter of −0.30 and −0.10, respectively, for the 2 arms of the trial).

In this study, polysomnographic resolution and changes in severity of OSA accounted for small but significant proportions of changes in both symptoms and disease-specific quality of life in children treated for OSA. Importantly, polysomnographic resolution of OSA or changes in its severity did not causally impact 16 out of the 18 treatment outcomes. These results highlight the limited utility associated with the use of polysomnographic thresholds in the management of OSA.

The principal outcome reported in the treatment of pediatric OSA is the resolution of the condition by polysomnography. This approach is supported by several nonrandomized studies with both prospective32 and retrospective33 designs as well as multiple meta-analyses.34–36 As shown by the CHAT study, children who were African American, were obese, or had greater baseline severity of OSA were more likely to have persistent OSA at follow-up regardless of the treatment arm.8

Additionally, the comparison of treatment effects between the 2 arms of the trial also revealed statistically significant increases in weight gain and HOMA-IR in the surgical arm. Although weight gain after adenotonsillectomy has been described previously,37  the effect sizes of these differences were small in the current study.

Because of the risk of persistent OSA in 10% to 50% of children undergoing adenotonsillectomy, postoperative polysomnography is recommended2 for early detection and further treatment.1 On the basis of this recommendation, in the current study, 50% of the children (77 out of 154) without resolution of OSA may have been potential candidates for further treatment, ranging from additional surgery to continuous positive airway pressure therapy, solely on the basis of an AHI exceeding 5 at follow-up.

The CHAT study remains, to date, the only randomized trial investigating the benefits of surgery over watchful waiting for outcomes related to childhood OSA. The trial additionally provided the opportunity to examine the isolated effects of polysomnographic resolution of OSA or changes in its severity on treatment outcomes. Importantly, the results of the causal mediation analysis suggest that a majority of the treatment-related changes in outcomes of OSA in children are not attributable to polysomnographic changes in its severity or resolution. Given that 12 out of the 18 outcomes in the trial revealed significant average changes with treatment, alterations in other polysomnographic parameters such as sleep architecture may play a mediating role and merit further investigation.
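To make the logic of a mediation decomposition concrete, the following Python sketch estimates a mediated and a direct effect on simulated data. It is not the method used in this study: it swaps in the simpler product-of-coefficients approach with two linear least-squares models, no covariate adjustment, no treatment-by-mediator interaction, and an approximate bootstrap percentile interval. All names and simulated effect sizes are hypothetical.

```python
import random


def _cov(x, y):
    """Sample covariance of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def fit_mediation(treat, mediator, outcome):
    """Return (mediated effect, direct effect, total effect) from linear models."""
    s_tt, s_mm = _cov(treat, treat), _cov(mediator, mediator)
    s_tm = _cov(treat, mediator)
    s_ty, s_my = _cov(treat, outcome), _cov(mediator, outcome)
    a = s_tm / s_tt                                # mediator model: M ~ T
    det = s_tt * s_mm - s_tm ** 2
    c_prime = (s_mm * s_ty - s_tm * s_my) / det    # outcome model: Y ~ T + M
    b = (s_tt * s_my - s_tm * s_ty) / det
    mediated = a * b
    return mediated, c_prime, mediated + c_prime


def bootstrap_ci(treat, mediator, outcome, n_boot=300, seed=0):
    """Approximate 95% percentile interval for the mediated effect."""
    rng = random.Random(seed)
    n = len(treat)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(fit_mediation([treat[i] for i in idx],
                                   [mediator[i] for i in idx],
                                   [outcome[i] for i in idx])[0])
    draws.sort()
    return draws[int(0.025 * n_boot)], draws[int(0.975 * n_boot) - 1]


# Simulated data only: true mediated effect -0.5, true direct effect -2.0.
rng = random.Random(42)
n = 400
treat = [float(i % 2) for i in range(n)]  # alternating stand-in for randomization
mediator = [-1.0 * t + rng.gauss(0, 1) for t in treat]
outcome = [-2.0 * t + 0.5 * m + rng.gauss(0, 1) for t, m in zip(treat, mediator)]

acme, ade, total = fit_mediation(treat, mediator, outcome)
low, high = bootstrap_ci(treat, mediator, outcome)
print(f"mediated {acme:.2f} (95% CI {low:.2f} to {high:.2f}), "
      f"direct {ade:.2f}, total {total:.2f}")
```

In this simulation most of the treatment effect bypasses the mediator, which mirrors the pattern reported here: a treatment can change an outcome substantially while only a small share of that change passes through the measured mediator.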

The use of polysomnography is central to the guidelines from the American Academy of Pediatrics,1  American Academy of Sleep Medicine,2  American Academy of Otolaryngology–Head and Neck Surgery,3  and the ERS4  for the diagnosis and management of OSA, including the 500 000 children undergoing adenotonsillectomy each year. The ERS suggests treating a child with an AHI >5 even in the absence of comorbidities.4  Such an approach increases the risk of avoidable surgical morbidity in children considered candidates for further treatment solely on the basis of empirical polysomnographic thresholds. Moreover, a stepwise approach that seeks complete resolution of OSA in children has been proposed with a recommendation to measure outcomes on the basis of polysomnography at each stage.4  Liberal polysomnography as proposed in these guidelines may not be justifiable given the costs and resources needed.7,38 

The principal strengths of this study are related to the robust criteria used to define inclusion and exclusion of subjects in the only randomized trial to assess the benefit of adenotonsillectomy over watchful waiting for the treatment of childhood OSA. The causal mediation analysis described in the current study has been promulgated to provide mechanistic explanations for outcomes related to interventions.39  Furthermore, the causal models described here are generally agnostic to the nature of the underlying data distributions.28  Children were enrolled from multiple sites across the United States, underscoring the generalizability of the results and bias mitigation efforts. The 18 outcomes examined as part of the trial were derived from all possible domains potentially impacted by OSA in children. The weaknesses of the study are similar to those listed by the original trial, which include the narrow range of age of children, the exclusion of children with an AHI exceeding 30 and a relatively short period of follow-up. Additionally, identification of other causal mediators related to the treatment of OSA is a subject of future investigation.

The mediation analysis of the results obtained from the CHAT study suggests a limited role for polysomnographic resolution or changes in severity of OSA in causally influencing the outcomes related to its treatment. These results caution against the use of empirical AHI-based thresholds to assess outcomes related to treatment. Further studies are necessary to identify possible mediators of outcomes of pediatric OSA to better define the severity of the condition and reduce the cost of diagnosis.

We thank the University of Maryland Biostatistics Core of the Institute for Clinical and Translational Research for statistical consultations. We also acknowledge Drs Abraham Kanate, Anita Shet, and Arun Shet for their useful feedback on earlier versions of the article.

Dr Isaiah conceptualized and designed the study, conducted the initial analysis, and drafted the initial manuscript; Drs Das and Pereira conceptualized and designed the study; and all authors reviewed and revised the manuscript, approved the final manuscript as submitted, and agree to be accountable for all aspects of the work.

This trial has been registered at www.clinicaltrials.gov (identifier NCT00560859).

FUNDING: No external funding.

AHI: apnea hypopnea index
BRIEF: Behavior Rating Inventory of Executive Function
CBCL: Child Behavior Checklist
CHAT: Childhood Adenotonsillectomy Trial
CI: confidence interval
CRP: C-reactive protein
ERS: European Respiratory Society
ESS: Epworth Sleepiness Scale
HOMA-IR: Homeostasis Model Assessment for Insulin Resistance
NEPSY: Developmental Neuropsychological Assessment
OAI: obstructive apnea index
OSA: obstructive sleep apnea
OSA-18: Obstructive Sleep Apnea-18
PedsQL: Pediatric Quality of Life Inventory
PSQ-SRBD: Pediatric Sleep Questionnaire Sleep-Related Breathing Disorder

1. Marcus CL, Brooks LJ, Draper KA, et al; American Academy of Pediatrics. Diagnosis and management of childhood obstructive sleep apnea syndrome. Pediatrics. 2012;130(3).
2. Aurora RN, Zak RS, Karippot A, et al; American Academy of Sleep Medicine. Practice parameters for the respiratory indications for polysomnography in children. Sleep. 2011;34(3):379–388
3. Roland PS, Rosenfeld RM, Brooks LJ, et al; American Academy of Otolaryngology; Head and Neck Surgery Foundation. Clinical practice guideline: polysomnography for sleep-disordered breathing prior to tonsillectomy in children. Otolaryngol Head Neck Surg. 2011;145(suppl 1):S1–S15
4. Kaditis AG, Alonso Alvarez ML, Boudewyns A, et al. Obstructive sleep disordered breathing in 2- to 18-year-old children: diagnosis and management. Eur Respir J. 2016;47(1):69–94
5. Redline S, Amin R, Beebe D, et al. The Childhood Adenotonsillectomy Trial (CHAT): rationale, design, and challenges of a randomized controlled trial evaluating a standard surgical procedure in a pediatric population. Sleep. 2011;34(11):1509–1517
6. Erickson BK, Larson DR, St Sauver JL, Meverden RA, Orvidas LJ. Changes in incidence and indications of tonsillectomy and adenotonsillectomy, 1970-2005. Otolaryngol Head Neck Surg. 2009;140(6):894–901
7. Wise MS, Nichols CD, Grigg-Damberger MM, et al. Executive summary of respiratory indications for polysomnography in children: an evidence-based review. Sleep. 2011;34(3):389–398AW
8. Marcus CL, Moore RH, Rosen CL, et al; Childhood Adenotonsillectomy Trial (CHAT). A randomized trial of adenotonsillectomy for childhood sleep apnea. N Engl J Med. 2013;368(25):2366–2376
9. Chervin RD, Ellenberg SS, Hou X, et al; Childhood Adenotonsillectomy Trial. Prognosis for spontaneous resolution of OSA in children. Chest. 2015;148(5):1204–1213
10. Isaiah A, Hamdan H, Johnson RF, Naqvi K, Mitchell RB. Very severe obstructive sleep apnea in children: outcomes of adenotonsillectomy and risk factors for persistence. Otolaryngol Head Neck Surg. 2017;157(1):128–134
11. Dean DA II, Goldberger AL, Mueller R, et al. Scaling up scientific discovery in sleep medicine: the national sleep research resource. Sleep. 2016;39(5):1151–1164
12. Korkman M, Kirk U, Kemp S. NEPSY: A Developmental Neuropsychological Assessment Manual. New York, NY: Psychological Corporation; 1998
13. Conners CK. Conners’ Rating Scales–Revised Technical Manual. 5th ed. Tonawanda, NY: Multi-Health Systems; 2001
14. Gioia GA, Isquith PK, Guy PK, Kenworthy L. Behavior Rating Inventory of Executive Function (BRIEF). Odessa, FL: Psychological Assessment Resources; 2000
15. Achenbach TM, Ruffle TM. The Child Behavior Checklist and related forms for assessing behavioral/emotional problems and competencies. Pediatr Rev. 2000;21(8):265–271
16. Varni JW, Seid M, Kurtin PS. PedsQL 4.0: reliability and validity of the Pediatric Quality of Life Inventory version 4.0 generic core scales in healthy and patient populations. Med Care. 2001;39(8):800–812
17. Franco RA Jr, Rosenfeld RM, Rao M. First place–resident clinical science award 1999. Quality of life for children with obstructive sleep apnea. Otolaryngol Head Neck Surg. 2000;123(1, pt 1):9–16
18. Chervin RD, Hedger K, Dillon JE, Pituch KJ. Pediatric sleep questionnaire (PSQ): validity and reliability of scales for sleep-disordered breathing, snoring, sleepiness, and behavioral problems. Sleep Med. 2000;1(1):21–32
19. Melendres MC, Lutz JM, Rubin ED, Marcus CL. Daytime sleepiness and hyperactivity in children with suspected sleep-disordered breathing. Pediatrics. 2004;114(3):768–775
20. Elliott CD. Differential Abilities Scale: Introductory and Technical Handbook. San Antonio, TX: Harcourt Brace Jovanovich; 1990
21. Kuczmarski RJ, Ogden CL, Grummer-Strawn LM, et al. CDC growth charts: United States. Adv Data. 2000;(314):1–27
22. Jaye DL, Waites KB. Clinical applications of C-reactive protein in pediatrics. Pediatr Infect Dis J. 1997;16(8):735–746; quiz 746–747
23. Matthews DR, Hosker JP, Rudenski AS, Naylor BA, Treacher DF, Turner RC. Homeostasis model assessment: insulin resistance and beta-cell function from fasting plasma glucose and insulin concentrations in man. Diabetologia. 1985;28(7):412–419
24. Uliel S, Tauman R, Greenfeld M, Sivan Y. Normal polysomnographic respiratory values in children and adolescents. Chest. 2004;125(3):872–878
25. Marcus CL, Omlin KJ, Basinki DJ, et al. Normal polysomnographic values for children and adolescents. Am Rev Respir Dis. 1992;146(5 pt 1):1235–1239
26. Montgomery-Downs HE, O’Brien LM, Gulliver TE, Gozal D. Polysomnographic characteristics in normal preschool and early school-aged children. Pediatrics. 2006;117(3):741–753
27. Marcus CL. Childhood obstructive sleep apnoea: to treat or not to treat, that is the question. Thorax. 2010;65(1):4–5
28. Imai K, Keele L, Tingley D. A general approach to causal mediation analysis. Psychol Methods. 2010;15(4):309–334
29. Baron RM, Kenny DA. The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol. 1986;51(6):1173–1182
30. Tingley D, Yamamoto T, Hirose K, Keele L, Imai K. mediation: R package for causal mediation analysis. J Stat Softw. 2014;59(1):1–38
31. Schoemann AM, Boulton AJ, Short SD. Determining power and sample size for simple and complex mediation models. Soc Psychol Personal Sci. 2017;8(4):379–386
32. Mitchell RB. Adenotonsillectomy for obstructive sleep apnea in children: outcome evaluated by pre- and postoperative polysomnography. Laryngoscope. 2007;117(10):1844–1854
33. Bhattacharjee R, Kheirandish-Gozal L, Spruyt K, et al. Adenotonsillectomy outcomes in treatment of obstructive sleep apnea in children: a multicenter retrospective study. Am J Respir Crit Care Med. 2010;182(5):676–683
34. Costa DJ, Mitchell R. Adenotonsillectomy for obstructive sleep apnea in obese children: a meta-analysis. Otolaryngol Head Neck Surg. 2009;140(4):455–460
35. Friedman M, Wilson M, Lin HC, Chang HW. Updated systematic review of tonsillectomy and adenoidectomy for treatment of pediatric obstructive sleep apnea/hypopnea syndrome. Otolaryngol Head Neck Surg. 2009;140(6):800–808
36. Brietzke SE, Gallagher D. The effectiveness of tonsillectomy and adenoidectomy in the treatment of pediatric obstructive sleep apnea/hypopnea syndrome: a meta-analysis. Otolaryngol Head Neck Surg. 2006;134(6):979–984
37. Van M, Khan I, Hussain SS. Short-term weight gain after adenotonsillectomy in children with obstructive sleep apnoea: systematic review. J Laryngol Otol. 2016;130(3):214–218
38. Glaze DG. Evidence based sleep medicine--is pediatric sleep medicine ready? J Clin Sleep Med. 2005;1(3):255–256
39. Kraemer HC, Wilson GT, Fairburn CG, Agras WS. Mediators and moderators of treatment effects in randomized clinical trials. Arch Gen Psychiatry. 2002;59(10):877–883

Competing Interests

POTENTIAL CONFLICT OF INTEREST: Dr Isaiah is an inventor of 3 technologies related to the diagnosis and treatment of sleep apnea in adults; the other authors have indicated they have no potential conflicts of interest to disclose.

FINANCIAL DISCLOSURE: Dr Isaiah receives patent-related royalties from the University of Maryland, Baltimore, for inventions related to sleep apnea; the other authors have indicated they have no financial relationships relevant to this article to disclose.
