Polysomnography is central to the diagnosis and management of childhood obstructive sleep apnea (OSA). However, it is not known whether the treatment-related outcomes of OSA are causally associated with its resolution or changes in severity as determined by polysomnography.
Polysomnographic, cognitive, behavioral, quality-of-life, and health outcomes at baseline and at 7 months were obtained from the Childhood Adenotonsillectomy Trial, a randomized trial comparing the outcomes of early adenotonsillectomy to watchful waiting in children with OSA. We used causal mediation analysis to measure the changes in 18 outcomes independently attributable to polysomnographic resolution or changes in severity after adjusting for confounding variables.
A total of 398 children aged 5 to 9 years were included. A total of 244 (61%) experienced resolution of OSA at follow-up. Polysomnographic resolution of the condition accounted for small but significant proportions of changes in symptoms (proportion mediated [95% confidence interval] 0.13 [0.07 to 0.21]; P < .001) and disease-specific quality of life (0.11 [0.04 to 0.20]; P = .004). Changes in polysomnographic severity similarly mediated symptom score (proportion mediated 0.18 [0.11 to 0.26]; P < .001) and disease-specific quality-of-life outcomes (0.20 [0.10 to 0.31]; P = .004). Importantly, significant mediation effects were not identified for any of the other 16 outcomes. No significant interactions were observed between the trial arms.
The majority of the treatment-related changes in outcomes of OSA in school-aged children are not causally attributable to polysomnographic resolution or changes in its severity. These results underscore the limited utility of polysomnographic thresholds in the management of childhood OSA.
Polysomnography is central to the diagnosis and stratification of obstructive sleep apnea in children. Severity thresholds derived from polysomnography are used widely to initiate and monitor treatment of chronic upper airway obstruction.
The outcomes of treatment in school-aged children with obstructive sleep apnea are not independently associated with the changes in severity or resolution of the condition as determined by polysomnography.
Polysomnography is used to quantify the severity of obstructive sleep apnea (OSA) in children and is central to the treatment guidelines from the American Academy of Pediatrics,¹ the American Academy of Sleep Medicine,² and the American Academy of Otolaryngology–Head and Neck Surgery.³ These guidelines generally delineate empirical classes of polysomnographic severity of OSA on the basis of the apnea hypopnea index (AHI) to initiate and/or monitor treatment. In children undergoing adenotonsillectomy for the treatment of OSA, postoperative polysomnography is recommended when the preoperative AHI exceeds 5.² Recent guidelines from the European Respiratory Society (ERS) support treatment of an AHI exceeding 5, with the principal aim of complete polysomnographic resolution of OSA.⁴ Children with an AHI >30 in the Childhood Adenotonsillectomy Trial (CHAT) underwent urgent adenotonsillectomy.⁵ Given that the majority of the 500 000 adenotonsillectomies performed annually in the United States are for childhood OSA,⁶ there is an urgent need to rationalize the indications for pediatric polysomnography, whose costs are untenable.⁷ Specifically, the causal association between the treatment outcomes of childhood OSA and polysomnographic changes remains unproven.
The CHAT remains the only randomized trial to date used to compare the benefit of early adenotonsillectomy over watchful waiting in children aged 5 to 9 years and diagnosed with OSA.8 The trial design facilitates the assessment of the isolated impact of polysomnographic improvement or resolution of OSA on the causal pathways leading to changes in outcomes after treatment. Although the CHAT study did not conclusively establish the benefits of early surgery over watchful waiting in the neurocognitive domain, significantly greater improvements were demonstrated for caregiver-reported outcomes of behavior, quality of life, and symptoms in children in the surgical arm compared to watchful waiting. Importantly, the investigators hypothesized that the changes in OSA severity mediate the changes in outcomes after treatment.5 The biological premise of this hypothesis is that the resolution or reduction of the severity of upper airway obstruction as measured by polysomnography may causally account for improved outcomes after treatment. Furthermore, on the basis of the observation that OSA resolved spontaneously in half of the children who underwent watchful waiting, both the trial investigators8 as well as others9 emphasized the potential opportunity to avoid surgery in this subset. Together, these approaches suggest that the change in polysomnographic severity of OSA represents a possible primary surrogate for the effectiveness of its treatment, with the goal being complete resolution. We therefore analyzed the trial data to examine the causal impact of polysomnographic resolution of OSA or changes in its severity on treatment-related outcomes of the condition.
Methods
Study Design
The CHAT study was designed to test the hypothesis that adenotonsillectomy is superior to watchful waiting in improving outcomes related to childhood OSA. The study investigators enrolled children on the basis of age (5–9 years) and polysomnographic definitions (AHI >2 or obstructive apnea index [OAI] >1 per hour) based on suitability for adenotonsillectomy. The rationale for the use of these definitions, based on normative data representing the burden of upper airway obstruction, is provided in the trial design.⁵ Children were excluded if they had polysomnographic evidence of very severe OSA (AHI >30¹⁰ or OAI >20) or prolonged hypoxemia. The trial data were obtained from the National Sleep Research Resource (https://sleepdata.org) through a data use agreement.¹¹ The primary and secondary outcomes were measured at baseline and follow-up at 7 months. Table 1 summarizes these outcomes, along with their respective domains and the interpretation of the results.
Table 2. Differences Between Children Grouped by Whether They Had Resolution of OSA in the CHAT As Determined by Follow-up Polysomnography
Characteristic | With Resolution (n = 244) | Without Resolution (n = 154) | Pᵃ |
---|---|---|---|
Age, y, mean (95% CI) | 6.5 (6.3 to 6.6) | 6.7 (6.5 to 6.9) | .08 |
Male sex, n (%) | 114 (47) | 80 (52) | .36 |
Race, n (%) | | | .05 |
White | 101 (41) | 43 (28) | |
African American | 115 (47) | 97 (63) | |
Other | 28 (12) | 14 (9) | |
Hispanic ethnicity, n (%) | 19 (8) | 12 (8) | .99 |
BMI percentile score, mean (95% CI) | 66.3 (62.5 to 70.3) | 76.2 (71.5 to 80.9) | <.001 |
Weight class, n (%) | | | |
Obeseᵇ | 62 (25) | 72 (47) | <.001 |
Failure to thrive | 10 (4) | 4 (3) | NA |
Maternal education less than high school, n (%) | 70 (29) | 55 (36) | .35 |
Annual household income <$30 000, n (%) | 91 (37) | 64 (42) | .17 |
AHI score at baseline, events per h, mean (95% CI) | 5.8 (5.1 to 6.4) | 8.3 (7.4 to 9.2) | <.001 |
AHI score at follow-up, events per h, mean (95% CI) | 0.8 (0.7 to 0.8) | 8.7 (6.9 to 10.5) | <.001 |
Adenotonsillectomy, n (%) | 154 (63) | 40 (26) | <.001 |
Follow-up AHI ≥2, n (%) | 0 (0) | 148 (96) | NA |
Follow-up AHI ≥5, n (%) | 0 (0) | 77 (50) | NA |
Follow-up AHI ≥10, n (%) | 0 (0) | 34 (22) | NA |
All categorical variables are shown by n (%) and continuous variables by mean (95% CIs). Resolution of OSA was defined by follow-up polysomnography revealing an AHI <2 and an OAI <1. NA, not applicable.
ᵃ P < .05 indicates statistical significance.
ᵇ Children with a BMI ≥95th percentile were categorized as obese. Children with a BMI in the fifth percentile or lower were classified as failure to thrive. The percentile designation was obtained from the Centers for Disease Control growth charts for children aged 2 to 20 y.
The primary outcome was the attention and executive function score on the Developmental Neuropsychological Assessment (NEPSY).12 Behavioral outcomes were measured by the Conners’ Rating Scales–Revised: Long Version Global Index,13 the Behavior Rating Inventory of Executive Function (BRIEF),14 the Conners’ Comprehensive Behavior Rating Scale (attention-deficit/hyperactivity disorder),13 and the Child Behavior Checklist (CBCL).15 Both global and disease-specific quality-of-life measures were obtained by using the Pediatric Quality of Life Inventory (PedsQL)16 and the Obstructive Sleep Apnea-18 (OSA-18) scale.17 The symptoms of OSA were assessed by the Pediatric Sleep Questionnaire Sleep-Related Breathing Disorder (PSQ-SRBD)18 scale and the modified Epworth Sleepiness Scale (ESS).19 Generalized intellectual functioning was determined by using the Differential Ability Scales-II.20
Additional assessments included changes in BMI percentile,²¹ systolic and diastolic blood pressure measurements (measured in millimeters of mercury), and serum levels of C-reactive protein, a nonspecific marker for inflammation (measured in micrograms per milliliter, with a detection threshold of 0.15 μg/mL).²² Changes in the Homeostasis Model Assessment for Insulin Resistance (HOMA-IR), a surrogate marker of insulin resistance, were also assessed; HOMA-IR was calculated by using the equation HOMA-IR = fasting insulin (measured in microunits per milliliter) × fasting glucose (measured in milligrams per deciliter)/405.²³ The rationale for the choice of these variables is based on the disease domains potentially affected by untreated OSA.⁵ In the current study, polysomnographic resolution of OSA was defined by an AHI <2 and OAI <1 at follow-up, identical to the CHAT study.²⁴–²⁶ The change in polysomnographic severity of OSA was the difference between the follow-up and baseline AHI scores. These mediators were chosen on the basis of (1) the finding that half of the children in the nonsurgical arm of the trial experienced resolution of the condition, supporting the conclusion that surgery could potentially be avoided in these children,⁸ and (2) the fact that change in AHI indicates change in OSA severity, which was prespecified in the trial design as a mediator for other outcomes.⁵ Additionally, a more robust definition of OSA resolution, represented by the resolution of obstruction (follow-up AHI <1.5²⁷ and OAI <1) along with the absence of hypoxemia (no portion of sleep with oximetry reading <92%), was also evaluated as a mediator for other outcomes.
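The derived variables described above can be illustrated with a minimal Python sketch. It is not the study's code: the class and function names are hypothetical, and representing "no portion of sleep with oximetry reading <92%" by a single minimum oxygen saturation value is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Polysomnogram:
    """Hypothetical container for the two indices used in the definitions."""
    ahi: float  # apnea hypopnea index, events per hour
    oai: float  # obstructive apnea index, events per hour


def homa_ir(fasting_insulin_uU_ml: float, fasting_glucose_mg_dl: float) -> float:
    """HOMA-IR = fasting insulin (uU/mL) x fasting glucose (mg/dL) / 405."""
    return fasting_insulin_uU_ml * fasting_glucose_mg_dl / 405.0


def osa_resolved(follow_up: Polysomnogram) -> bool:
    """Primary mediator: AHI <2 and OAI <1 on follow-up polysomnography."""
    return follow_up.ahi < 2 and follow_up.oai < 1


def osa_resolved_strict(follow_up: Polysomnogram, min_spo2_pct: float) -> bool:
    """Stricter mediator: AHI <1.5, OAI <1, and no oximetry reading <92%.

    Assumption: the absence of hypoxemia is approximated by requiring the
    lowest recorded saturation (min_spo2_pct) to be at least 92%.
    """
    return follow_up.ahi < 1.5 and follow_up.oai < 1 and min_spo2_pct >= 92


def change_in_ahi(baseline: Polysomnogram, follow_up: Polysomnogram) -> float:
    """Second mediator: follow-up AHI minus baseline AHI (negative = improved)."""
    return follow_up.ahi - baseline.ahi
```

Under these definitions, a child whose follow-up AHI is 1.8 with an OAI of 0.5 would count as resolved by the primary definition but not by the stricter one.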
Statistical Analysis
The characteristics of the study population including age, sex, race, ethnicity, BMI percentile score, socioeconomic characteristics, and OSA severity defined by the AHI score were compared between 2 groups of children defined by whether they experienced resolution of OSA at follow-up. We then compared the 18 outcomes between the trial arms by performing an analysis of covariance adjusted for the stratification factors of age, race, and weight status.8
We performed a mediation analysis by modeling the effect of treatment on each of the 18 outcomes in Table 1 by including each of the mediating variables separately. The mediator represents the pathway through which the intervention influences the change in outcome. Causal mediation analysis was described by Imai et al28 on the basis of the well-known conceptual framework of Baron and Kenny.29 Methodologic aspects of the analysis are described in the Supplemental Information. The analysis was undertaken by using the mediation package for R (https://cran.r-project.org, version 3.5.1).30 The first mediation model described in the current study decomposes the total effect of treatment into mediated effect via polysomnographic resolution of OSA and direct effect unrelated to the mediator, as shown in Fig 1. The second model similarly uses the change in AHI as a mediator.
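The decomposition of a total effect into mediated and direct components can be sketched in a few lines of Python. This is a simplified illustration only, not the analysis performed in the study: it uses the classical linear product-of-coefficients approach with ordinary least squares and no covariates, whereas the study used the simulation-based estimator in the R mediation package with adjustment for confounders. The function names are hypothetical.

```python
def _centered_cross_product(x, y):
    """Sum of (x - mean(x)) * (y - mean(y)) over paired observations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))


def linear_mediation(treatment, mediator, outcome):
    """Decompose the treatment effect under two linear models.

    Mediator model: mediator = a0 + a * treatment
    Outcome model:  outcome  = c0 + direct * treatment + b * mediator
    Mediated effect (ACME analogue) = a * b; direct effect (ADE analogue)
    = direct; total effect = their sum.
    Assumption: linear models with no covariates and no interaction between
    treatment and mediator; a binary mediator would normally need a
    different model.
    """
    s_tt = _centered_cross_product(treatment, treatment)
    s_mm = _centered_cross_product(mediator, mediator)
    s_tm = _centered_cross_product(treatment, mediator)
    s_ty = _centered_cross_product(treatment, outcome)
    s_my = _centered_cross_product(mediator, outcome)

    a = s_tm / s_tt                        # effect of treatment on mediator
    det = s_tt * s_mm - s_tm ** 2          # zero if mediator is collinear with treatment
    direct = (s_mm * s_ty - s_tm * s_my) / det
    b = (s_tt * s_my - s_tm * s_ty) / det  # effect of mediator on outcome

    mediated = a * b
    total = mediated + direct
    return {
        "mediated": mediated,
        "direct": direct,
        "total": total,
        "proportion_mediated": mediated / total,
    }
```

In this linear setting the total effect equals the coefficient from regressing the outcome on treatment alone, so the mediated and direct parts add up exactly; with nonlinear mediator or outcome models that identity generally requires the simulation-based approach.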
General framework of causal mediation analysis. A, Conventional estimation of treatment effect c identifies the change in outcome measure (ΔPSQ-SRBD) as a function of the intervention such as eAT or WWSC. B, Causal mediation analysis is used to identify 3 separate pathways that include (1) the effect c′ of the treatment (eAT or WWSC) on the outcome (PSQ-SRBD), (2) the effect a′ of the treatment on mediator polysomnographic resolution of OSA, and (3) the effect b′ of the mediator on the outcome. The treatment effect that passes through the mediator is called the causal mediation effect, the unmediated effect is called the direct effect, and the sum of all effects is termed the total effect. C, The estimated ACME, the ADE, and the total effect for the outcome PSQ-SRBD. All effects are averaged over the entire study population with the assumption that there are no interactions between the 2 arms of the trial for ACME. D, The changes in ACME resulting from changes in the sensitivity parameter ρ. The actual ACME is shown by the point on the vertical axis intersecting the dotted horizontal line. The mediator is the resolution of OSA defined by an AHI <2 and OAI <1. ρ would need to assume a value of −0.4 before resulting in a change in direction of ACME. Error bars and gray shading represent 95% CIs. ACME, average causal mediation effect; ADE, average direct effect; eAT, early adenotonsillectomy; WWSC, watchful waiting with supportive care. a Indicates the lower limit of the correlation between residuals of the mediator and the outcome regression.
A key assumption supporting the framework of causal mediation analysis is sequential ignorability. The first part of the assumption is that the treatment is assigned independently of the potential mediators and outcomes given the baseline covariates, which is satisfied by random assignment in a trial. The second part is that, given the treatment and the baseline covariates, there are no unmeasured variables confounding the relationship between the mediator and the outcome. This is potentially satisfied by controlling for all baseline covariates potentially confounding these relationships. The magnitude of departure from the second assumption is estimated by a sensitivity analysis assessing the contribution of unmeasured confounding variables (Fig 1).²⁸
The causal mediation analysis yields average causally mediated, direct, and total effects as well as the proportion of the mediated effects (of the total effect) as point estimates with 95% confidence intervals (CIs) and the associated P values. For the estimation of the mediation effects, a lack of significant interaction between the trial arms is assumed (ie, the mediation effects are independent of the trial arm and averaged over the entire study population). We tested the effect of relaxing this assumption by identifying the outcomes for which the average causal effects were significantly different between the trial arms.
With 400 children enrolled, the CHAT study was powered to detect differences in the primary outcome between children grouped by the type of treatment.8 For tests of relationships between each arm of the mediation analysis, an individual effect size of 0.30 (medium) as measured by partial correlation is assumed (treatment to mediator, mediator to outcome, and treatment to outcome; Fig 1).31 A sample size of 400 and type 1 error rate of 5% yield a power of nearly 1 for the mediation analysis. Additionally, 2 types of sensitivity analysis were performed; the first was performed to account for the effects of missing values and the second to determine the possible existence of unobserved pretreatment covariates when significant causal mediation effects were identified. The methodologic details of the 2 sensitivity analyses are described in the Supplemental Information. P < .05 was considered significant throughout.
Results
The primary results of the CHAT study have been published.⁸ Of the 464 children enrolled in the study and who underwent random assignment, 398 were included in the current study on the basis of the availability of complete polysomnographic data. Supplemental Fig 2 shows patient flow into the current study from the CHAT. The baseline characteristics of children in the study are shown in Table 2. By using the criteria defined in the CHAT, a significantly greater number of children (244 [61%]) had polysomnographic resolution of OSA compared with those who did not experience resolution (154 [39%]; P < .001). Of those without polysomnographic resolution, 96% had an AHI ≥2 and 50% had an AHI ≥5 on follow-up polysomnography. In this group, there was a significantly greater proportion of African American children (63% vs 47%; P = .001) and obese children (47% vs 25%; P < .001) compared with the group with resolution. The mean baseline AHI in the group with complete resolution (5.8 [95% CI, 5.1 to 6.4]) was significantly lower than that in the group without resolution (8.3 [95% CI, 7.4 to 9.2]; P < .001).
Table 1. Outcomes Measured in the CHAT
Test | Domain | Outcome Variable | Test Recipient | Normative Mean or Range | Interpretation of Higher Value |
---|---|---|---|---|---|
NEPSY | Neurocognitive | Attention and executive function score | Child | 100 ± 15 | Better functioning |
Conners’ Rating Scales–Revised: Long Version Global Index | Behavior | Restless-impulsive and emotional lability factor sets | Caregiver | 50 ± 10 | Worse behavior |
Conners’ Rating Scales–Revised: Long Version Global Index | Behavior | Restless-impulsive and emotional lability factor sets | Teacher | 50 ± 10 | Worse behavior |
BRIEF | Behavior | Global Executive Composite T score | Caregiver | 50 ± 10 | Worse behavior |
BRIEF | Behavior | Global Executive Composite T score | Teacher | 50 ± 10 | Worse behavior |
Pediatric Sleep Questionnaire | Sleep-related symptoms | Sleep-Related Breathing Disorder scale total score | Caregiver | 0.2 ± 0.1 | Worse symptoms |
PedsQL | Overall quality of life | Total score | Caregiver | 78 ± 16 | Better quality of life |
Differential Ability Scales-II | Neurocognitive | General conceptual ability composite score | Child | 100 ± 15 | Better functioning |
Conners’ Behavior Rating Scales (attention-deficit/hyperactivity disorder) | Behavior | T score | Caregiver | 50 ± 10 | Worse behavior |
Conners’ Behavior Rating Scales (attention-deficit/hyperactivity disorder) | Behavior | T score | Teacher | 50 ± 10 | Worse behavior |
CBCL | Behavior | Total score | Caregiver | 50 ± 10 | Worse behavior |
OSA-18 | Disease-specific quality of life | Total score | Caregiver | NA | Worse quality of life |
Modified ESS | Sleepiness | Summary score | Caregiver | <10 | Worse sleepiness |
BMI | Growth | Percentile score | Child | 5–95 | ≥95: obese; <5: failure to thrive |
C-reactive protein | Inflammation | Serum level (μg/mL) | Child | <1 | Worse systemic inflammation |
HOMA-IR | Physiology | Fasting insulin (μU/mL) × fasting glucose (mg/dL)/405 | Child | 2.22–2.67 | Worse insulin resistance |
Blood pressure | Physiology | Systolic blood pressure | Child | NAᵃ | NAᵃ |
Blood pressure | Physiology | Diastolic blood pressure | Child | NAᵃ | NAᵃ |
The outcome domains as well as normal values are shown in subsequent columns. The fourth column indicates the recipient of the test. ± represents values of SD. The interpretation of the higher value of each outcome is shown in the last column. T scores provide information about a child’s score relative to the reference sample. NA, not applicable.
ᵃ Varies by age and sex.
Early adenotonsillectomy resulted in significantly greater changes in 12 out of 18 outcomes compared to watchful waiting (Table 3). Of these, the largest effect sizes were observed for the changes in symptoms measured by the PSQ-SRBD (Cohen’s d, 1.35; P < .001) and disease-specific quality of life assessed by OSA-18 (Cohen’s d, 0.94; P < .001).
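The effect-size labels used with Table 3 follow the conventions stated in its footnote (0.20–0.49 small, 0.50–0.79 medium, ≥0.80 large). A short Python sketch of that classification is given below for illustration; the function name is hypothetical, and the label returned for values under 0.20 is an assumption because the footnote does not name that range.

```python
def classify_cohens_d(d: float) -> str:
    """Label a Cohen's d value using the thresholds in the Table 3 footnote.

    The sign is ignored because the thresholds refer to magnitude.
    Assumption: values below 0.20 are labeled "below small"; the source
    does not assign a name to that range.
    """
    magnitude = abs(d)
    if magnitude >= 0.80:
        return "large"
    if magnitude >= 0.50:
        return "medium"
    if magnitude >= 0.20:
        return "small"
    return "below small"
```

Applied to the values reported above, both the PSQ-SRBD (1.35) and the OSA-18 (0.94) effects fall in the large range.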
Table 3. Outcome Measures from the CHAT After Random Assignment
Outcomeᵃ | Normative Mean | Watchful Waiting: Baseline, Mean (95% CI) | Watchful Waiting: Change From Baseline to 7 mo, Mean (95% CI) | Early Adenotonsillectomy: Baseline, Mean (95% CI) | Early Adenotonsillectomy: Change From Baseline to 7 mo, Mean (95% CI) | P | Effect Size |
---|---|---|---|---|---|---|---|
NEPSY attention and executive function score (n = 390) | 100 ± 15 | 101.0 (99.0 to 103.0) | 5.1 (3.3 to 7.0) | 102.0 (99.7 to 104.2) | 6.9 (−4.1 to −1.2) | .07 | 0.13 |
Conners’ Rating Scale score | | | | | | | |
Caregiver rating (n = 385) | 50 ± 10 | 52.5 (50.9 to 54.1) | −0.4 (−1.7 to 0.93) | 52.2 (50.6 to 53.9) | −2.7 (−4.1 to −1.2) | .001 | 0.23 |
Teacher rating (n = 209) | NA | 54.9 (52.8 to 57.0) | −1.8 (−3.5 to −0.1) | 55.3 (53.1 to 57.5) | −5.2 (−7.1 to −3.3) | .04 | 0.27 |
BRIEF score | | | | | | | |
Caregiver rating (n = 385) | 50 ± 10 | 50.0 (48.4 to 51.6) | 0.3 (−0.9 to 1.5) | 50.1 (48.5 to 51.6) | −3.3 (−4.5 to −2.1) | <.001 | 0.41 |
Teacher rating (n = 204) | NA | 56.3 (54.3 to 58.3) | −0.9 (−2.6 to 0.8) | 56.9 (54.8 to 59.0) | −2.4 (−4.1 to −0.6) | .18 | 0.12 |
PSQ-SRBD score (n = 389) | 0.2 ± 0.1 | 0.5 (0.5 to 0.5) | 0.0 (−0.1 to −0.0) | 0.5 (0.5 to 0.5) | −0.3 (−0.3 to −0.3) | <.001 | 1.35 |
PedsQL score (n = 392) | 78 ± 16 | 78.1 (76.0 to 80.2) | 0.8 (−1.0 to 2.6) | 78.3 (76.1 to 80.5) | 6.1 (4.0 to 8.1) | <.001 | 0.38 |
DAS-2 GCA score (n = 390) | 100 ± 15 | 95.7 (94.1 to 97.3) | 1.7 (−0.6 to 4.0) | 98.6 (97.0 to 100.3) | −2.0 (−4.4 to −0.3) | .37 | 0.06 |
Conners’ ADHD Index T score | | | | | | | |
Caregiver rating (n = 385) | 50 ± 10 | 53.2 (51.7 to 54.8) | −0.6 (−2.6 to 1.6) | 52.9 (51.3 to 54.5) | −2.7 (−5.0 to −0.6) | .003 | 0.22 |
Teacher rating (n = 209) | NA | 54.2 (52.2 to 56.2) | 0.5 (−2.3 to 3.2) | 55.6 (53.5 to 57.6) | −3.4 (−6.2 to −0.5) | .01 | 0.35 |
CBCL total problems T score (n = 374) | 50 ± 10 | 53.1 (51.6 to 54.6) | −0.9 (−3.1 to −1.2) | 52.4 (50.8 to 54.0) | −3.7 (−6.1 to −1.3) | <.001 | 0.26 |
OSA-18 score (n = 393) | NA | 54.1 (51.5 to 56.8) | −4.6 (−8.5 to −0.8) | 53.3 (50.7 to 55.8) | −21.5 (−24.8 to −18.1) | <.001 | 0.94 |
ESS (n = 393) | <10 | 7.8 (7.0 to 8.6) | −0.3 (−1.3 to 0.8) | 7.5 (6.7 to 8.2) | −2.1 (−3.1 to −1.1) | <.001 | 0.38 |
BMI percentile (n = 391) | 5–85 | 70.0 (65.9 to 74.2) | 4.0 (−1.8 to 9.8) | 70.3 (65.8 to 74.8) | 6.5 (0.6 to 12.5) | .02 | 0.21 |
C-reactive protein (n = 262), μg/mL | <1 | 2.5 (1.1 to 3.8) | −0.9 (−2.3 to 0.5) | 1.9 (1.4 to 2.4) | 0.2 (−0.8 to 1.3) | .50 | 0.01 |
HOMA-IR (n = 261) | 2.22–2.67 | 2.0 (1.6 to 2.4) | 0.0 (−0.5 to 0.5) | 1.8 (1.5 to 2.1) | 0.5 (−0.1 to 1.1) | .04 | 0.20 |
Systolic blood pressure (n = 391), mm Hg | NA | 97.8 (96.7 to 99.0) | 0.8 (−0.8 to 2.5) | 96.6 (95.3 to 98.0) | 1.2 (−0.6 to 3.1) | .56 | 0.06 |
Diastolic blood pressure (n = 391), mm Hg | NA | 62.4 (61.3 to 63.4) | 0.7 (−0.9 to 2.2) | 62.1 (60.9 to 63.2) | 1.1 (−0.5 to 2.7) | .81 | 0.05 |
ADHD, attention-deficit/hyperactivity disorder; DAS-2 GCA, Differential Ability Scales–Second Edition (General Conceptual Ability); NA, not available.
ᵃ Outcome variables included scores on the NEPSY, the Conners’ Behavior Rating Scales, and the BRIEF, with summary measures obtained from the primary caregiver and the teacher, the PSQ-SRBD, and the PedsQL. These were reported in the original study. Additional outcome variables included scores on the DAS-2 GCA, the Conners’ ADHD Index T scores obtained from the primary caregiver and the teacher, the CBCL, the OSA-18 inventory, and the modified ESS. T scores provide information about a child’s score relative to the reference sample. Physiologic outcomes were represented by changes in BMI percentile score, C-reactive protein values, HOMA-IR, and the systolic and diastolic blood pressures. Population means along with SDs are provided where normative data are available. All other values are reported as means and 95% CIs. P values are adjusted for effects of age, race, and weight. Effect size is calculated as Cohen’s d and interpreted by the following conventions: 0.20–0.49 (small), 0.50–0.79 (medium), and ≥0.80 (large). Numbers in parentheses listed in the outcome column represent sample sizes of complete data sets. P < .05 was considered significant. A complete description of each outcome is provided in Table 1.
Table 4 shows the results of mediation analysis averaged over both arms of the trial when polysomnographic resolution of OSA was considered the mediator. The total effects of treatment, defined as the sum of mediation and direct effects, were significant for 12 of the 18 outcomes. Small but statistically significant average mediation effects were identified for the PSQ-SRBD (proportion mediated: 0.13; 95% CI, 0.07 to 0.21; P < .001) and OSA-18 (proportion mediated: 0.11 [95% CI, 0.04 to 0.20]; P = .004). Similarly, the change in AHI between baseline and follow-up revealed significant mediation effects (Table 5) for the PSQ-SRBD (proportion mediated: 0.18 [95% CI, 0.11 to 0.26]; P < .001) and OSA-18 (proportion mediated: 0.20 [95% CI, 0.10 to 0.31]; P < .001). However, 16 out of 18 outcomes were not associated with significant contribution from either mediator. When OSA resolution was defined in a more robust fashion as an AHI <1.5, OAI <1, and absence of hypoxemia as measured by oximetry (Supplemental Table 6), the mediation effects were limited to the PSQ-SRBD (proportion mediated: 0.06 [95% CI, 0.02 to 0.12]; P = .005). Investigation of interaction between the trial arms revealed that polysomnographic resolution in children undergoing surgery mediated weight gain as opposed to weight loss in children undergoing watchful waiting (difference in mediation effect: 2.02 [95% CI, 0.40 to 3.85]; P = .01; Supplemental Table 7 for OSA resolution as mediator and Supplemental Tables 8 and 9 for other mediators). None of the other 17 outcomes revealed statistically significant interaction effects.
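The proportion mediated reported in these results is the share of the total effect that passes through the mediator. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the study's published proportions come from the estimation procedure of the R mediation package, so a ratio of rounded table entries is only an approximation of them.

```python
def proportion_mediated(mediated_effect: float, total_effect: float) -> float:
    """Return the fraction of the total effect attributable to the mediator.

    Here total_effect is the sum of the mediated (ACME) and direct (ADE)
    effects. The ratio is undefined when the total effect is zero and is
    unstable when it is close to zero.
    """
    if total_effect == 0:
        raise ValueError("proportion mediated is undefined for a zero total effect")
    return mediated_effect / total_effect


# Rounded PSQ-SRBD estimates as listed in Table 4: ACME -0.03, ADE -0.22.
acme, ade = -0.03, -0.22
total = acme + ade
share = proportion_mediated(acme, total)
```

With these rounded inputs the ratio is about 0.12, close to the reported 0.13; the small gap is plausibly due to rounding of the tabulated estimates. The instability near a zero total effect is one reason the confidence intervals for the proportion mediated can be very wide for outcomes without a significant total effect.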
TABLE 4 Results of Mediation Analysis for 18 Outcomes Assessed in the CHAT
Outcome^a | ACME: Mean Estimate (95% CI) | ACME: P | ADE: Mean Estimate (95% CI) | ADE: P | Total Effect^b: Mean Estimate (95% CI) | Total Effect^b: P | Proportion Mediated^c: Mean Estimate (95% CI) | Proportion Mediated^c: P |
---|---|---|---|---|---|---|---|---|
NEPSY attention and executive function score (n = 390) | −0.20 (−1.42 to 1.02) | .74 | 2.53 (−0.41 to 5.67) | .10 | 2.32 (−0.41 to 5.30) | .10 | −0.09 (−3.69 to 0.83) | .76 |
Conners’ Rating Scale score | ||||||||
Caregiver rating (n = 385) | −0.07 (−0.78 to 0.60) | .88 | −2.94 (−4.73 to −1.11) | .002 | −3.02 (−4.62 to −1.38) | <.001 | 0.02 (−0.19 to 0.39) | .88 |
Teacher rating (n = 209) | −0.68 (−2.41 to 0.97) | .42 | −2.40 (−6.02 to 0.75) | .15 | −3.08 (−5.84 to −0.48) | .03 | 0.22 (−2.10 to 3.07) | .45 |
BRIEF score | ||||||||
Caregiver rating (n = 385) | 0.06 (−0.66 to 0.72) | .86 | −3.68 (−5.43 to −1.95) | <.001 | −3.62 (−5.26 to −2.08) | <.001 | −0.02 (−0.22 to 0.19) | .86 |
Teacher rating (n = 204) | −0.13 (−1.65 to 1.42) | .89 | −2.05 (−5.49 to 1.25) | .21 | −2.19 (−5.22 to 0.64) | .14 | 0.06 (−618.93 to −0.26) | .90 |
PSQ-SRBD (n = 389) | −0.03 (−0.05 to −0.02) | <.001 | −0.22 (−0.25 to −0.18) | <.001 | −0.25 (−0.28 to −0.21) | <.001 | 0.13 (0.07 to 0.21) | <.001 |
PedsQL score (n = 392) | 0.78 (−0.41 to 2.05) | .23 | 4.55 (1.21 to 7.61) | .007 | 5.33 (2.30 to 8.10) | <.001 | 0.15 (−0.05 to 0.63) | .23 |
DAS-2 GCA score (n = 390) | −0.56 (−1.14 to 0.00) | .05 | 1.27 (−0.16 to 2.63) | .08 | 0.71 (−0.64 to 1.98) | .31 | −0.79 (−8.96 to 7.66) | .34 |
Conners’ ADHD Index T score | ||||||||
Caregiver rating (n = 385) | −0.45 (−1.16 to 0.25) | .21 | −2.15 (−3.98 to −0.33) | .02 | −2.60 (−4.23 to −1.01) | .002 | 0.17 (−0.10 to 0.70) | .21 |
Teacher rating (n = 209) | −0.33 (−1.83 to 1.12) | .67 | −3.11 (−5.98 to −0.30) | .03 | −3.44 (−5.90 to −1.03) | .004 | 0.10 (−0.39 to 0.81) | .67 |
CBCL total problems T score (n = 374) | −0.02 (−0.66 to 0.61) | .94 | −2.68 (−4.33 to −1.13) | <.001 | −2.70 (−4.31 to −1.13) | <.001 | 0.01 (−0.32 to 0.27) | .94 |
OSA-18 summary score (n = 388) | −1.98 (−3.40 to −0.64) | .004 | −15.31 (−18.70 to −12.07) | <.001 | −17.29 (−20.40 to −14.24) | <.001 | 0.11 (0.04 to 0.20) | .004 |
ESS (n = 393) | −0.13 (−0.47 to 0.20) | .44 | −1.64 (−2.55 to −0.77) | <.001 | −1.77 (−2.60 to −0.98) | <.001 | 0.08 (−0.11 to 0.31) | .44 |
BMI percentile (n = 391) | −0.35 (−1.26 to 0.50) | .40 | 2.76 (0.56 to 5.00) | .01 | 2.42 (0.41 to 4.49) | .02 | −0.14 (−0.99 to 0.33) | .41 |
C-reactive protein (n = 262), μg/mL | −0.15 (−0.52 to 0.17) | .40 | 0.38 (−0.53 to 1.30) | .47 | 0.23 (−0.48 to 0.92) | .60 | −0.64 (−6.39 to 4.92) | .54 |
HOMA-IR (n = 261) | 0.02 (−0.19 to 0.22) | .85 | 0.55 (0.03 to 1.11) | .03 | 0.57 (0.09 to 1.07) | .02 | 0.04 (−0.51 to 0.69) | .85 |
Systolic blood pressure (n = 391), mm Hg | −0.12 (−0.78 to 0.54) | .73 | −0.13 (−1.82 to 1.60) | .88 | −0.25 (−1.75 to 1.32) | .78 | 0.47 (−6.16 to 5.50) | .95 |
Diastolic blood pressure (n = 391), mm Hg | 0.29 (−0.34 to 0.91) | .35 | −0.17 (−1.72 to 1.37) | .85 | 0.12 (−1.30 to 1.53) | .86 | 2.44 (−7.01 to 6.52) | .93 |
The causal mediator was polysomnographic resolution, defined by an AHI <2 and OAI <1 at follow-up. ACME, average causal mediation effect; ADE, average direct effect; ADHD, attention-deficit/hyperactivity disorder; DAS-2 GCA, Differential Ability Scales–Second Edition (General Conceptual Ability). T scores provide information about a child’s score relative to the reference sample.
^a Outcome variables included scores on the NEPSY, the Conners’ Behavior Rating Scales, and the BRIEF, with summary measures obtained from the primary caregiver and the teacher, the PSQ-SRBD, and the PedsQL. These were reported in the CHAT study. Additional outcome variables included scores on the DAS-2 GCA, the Conners’ ADHD Index T scores obtained from the primary caregiver and the teacher, the CBCL, the OSA-18 inventory, and the modified ESS. Physiologic outcomes were represented by changes in BMI percentile score, C-reactive protein values, HOMA-IR, and the systolic and diastolic blood pressures. Changes in outcomes in response to early adenotonsillectomy compared to watchful waiting are shown in Table 3. Normative values are described in Table 1. The relationship between the treatment and the outcome is assessed before and after inclusion of polysomnographic resolution of OSA as the mediating variable.
^b The total effect is decomposed into the ACME and the ADE.
^c The contribution of the mediated effect as a proportion of the total effect is also shown. Point estimates are represented by means and 95% CIs, along with P values for comparisons. The number of children is indicated by “n =” in parentheses. P < .05 was considered significant.
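As a worked illustration of the decomposition described in these footnotes, the OSA-18 row of the table above can be reproduced by hand. The reported proportions are labelled as mean estimates, so the ratio of the rounded point estimates is only an approximate check and not necessarily how the authors computed them.

```latex
\begin{align*}
\text{Total effect} &= \text{ACME} + \text{ADE} = -1.98 + (-15.31) = -17.29 \\
\text{Proportion mediated} &\approx \frac{\text{ACME}}{\text{Total effect}} = \frac{-1.98}{-17.29} \approx 0.11
\end{align*}
```

When the total effect is close to zero, as for the blood pressure outcomes, this ratio becomes unstable, which is consistent with the very wide confidence intervals reported for those rows.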
TABLE 5 Results of Mediation Analysis for Outcomes of the CHAT
Outcome^a | ACME: Mean Estimate (95% CI) | ACME: P | ADE: Mean Estimate (95% CI) | ADE: P | Total Effect^b: Mean Estimate (95% CI) | Total Effect^b: P | Proportion Mediated^c: Mean Estimate (95% CI) | Proportion Mediated^c: P |
---|---|---|---|---|---|---|---|---|
NEPSY attention and executive function score (n = 390) | 0.81 (−0.71 to 2.28) | .27 | 1.51 (−1.64 to 4.78) | .27 | 2.32 (−0.48 to 5.19) | .10 | 0.35 (−1.78 to 3.27) | .34 |
Conners’ Rating Scale score | ||||||||
Caregiver rating (n = 385) | −0.43 (−1.27 to 0.35) | .30 | −2.59 (−4.43 to −0.74) | .004 | −3.02 (−4.64 to −1.42) | <.001 | 0.14 (−0.12 to 0.56) | .30 |
Teacher rating (n = 209) | −0.59 (−2.39 to 1.29) | .51 | −2.49 (−5.95 to 0.98) | .16 | −3.08 (−5.79 to −0.36) | .03 | 0.19 (−0.62 to 1.88) | .53 |
BRIEF score | ||||||||
Caregiver rating (n = 385) | −0.23 (−1.12 to 0.60) | .56 | −3.39 (−5.13 to −1.64) | <.001 | −3.63 (−5.26 to −2.02) | <.001 | 0.06 (−0.19 to 0.34) | .56 |
Teacher rating (n = 204) | −0.51 (−2.53 to 1.78) | .67 | −1.68 (−5.59 to 2.07) | .36 | −2.19 (−5.19 to 0.74) | .14 | 0.23 (−3.22 to 4.10) | .72 |
PSQ-SRBD (n = 389) | −0.05 (−0.06 to −0.03) | <.001 | −0.20 (−0.24 to −0.17) | <.001 | −0.25 (−0.28 to −0.22) | <.001 | 0.18 (0.11 to 0.26) | <.001 |
PedsQL score (n = 392) | 0.51 (−0.95 to 1.96) | .49 | 4.82 (1.41 to 8.19) | .006 | 5.33 (2.42 to 8.21) | <.001 | 0.10 (−0.18 to 0.51) | .50 |
DAS-2 GCA score (n = 390) | −0.90 (−1.65 to −0.17) | .02 | 1.61 (0.13 to 3.07) | .03 | 0.71 (−0.61 to 2.04) | .30 | −1.28 (−13.38 to 10.28) | .31 |
Conners’ ADHD Index T score | ||||||||
Caregiver rating (n = 385) | −0.60 (−1.41 to 0.23) | .15 | −2.00 (−3.95 to −0.20) | .03 | −2.60 (−4.24 to −1.06) | .001 | 0.23 (−0.09 to 0.84) | .15 |
Teacher rating (n = 209) | −0.51 (−2.09 to 1.28) | .55 | −2.93 (−6.15 to 0.00) | .05 | −3.44 (−5.85 to −1.13) | .004 | 0.15 (−0.39 to 0.99) | .55 |
CBCL total problems T score (n = 374) | −0.36 (−1.18 to 0.47) | .39 | −2.24 (−3.99 to −0.68) | .007 | −2.70 (−4.29 to −1.17) | .001 | 0.14 (−0.20 to 0.54) | .39 |
OSA-18 summary score (n = 388) | −3.39 (−5.16 to −1.82) | <.001 | −13.90 (−17.24 to −10.45) | <.001 | −17.28 (−20.35 to −14.19) | <.001 | 0.20 (0.10 to 0.31) | <.001 |
ESS (n = 393) | −0.49 (−0.93 to −0.07) | .02 | −1.29 (−2.20 to −0.41) | .004 | −1.77 (−2.59 to −0.98) | <.001 | 0.27 (0.04 to 0.64) | .02 |
BMI percentile (n = 391) | 0.78 (−0.35 to 1.94) | .18 | 1.64 (−0.76 to 4.04) | .18 | 2.42 (0.43 to 4.44) | .02 | 0.32 (−0.20 to 1.82) | .20 |
C-reactive protein (n = 262), μg/mL | −0.42 (−1.08 to 0.12) | .19 | 0.65 (−0.38 to 1.75) | .29 | 0.23 (−0.44 to 0.88) | .58 | −1.81 (−15.12 to 12.51) | .58 |
HOMA-IR (n = 261) | 0.01 (−0.22 to 0.22) | .91 | 0.56 (0.03 to 1.11) | .04 | 0.57 (0.11 to 1.06) | .01 | 0.02 (−0.48 to 0.71) | .91 |
Systolic blood pressure (n = 391), mm Hg | −0.12 (−0.98 to 0.78) | .80 | −0.13 (−1.87 to 1.62) | .88 | −0.25 (−1.78 to 1.28) | .72 | 0.49 (−6.56 to 8.25) | .93 |
Diastolic blood pressure (n = 391), mm Hg | 0.47 (−0.33 to 1.25) | .24 | −0.35 (−2.00 to 1.31) | .69 | 0.12 (−1.32 to 1.54) | .85 | 4.00 (−11.39 to 10.54) | .88 |
The causal mediator was polysomnographic change between baseline and follow-up, measured by the log-transformed AHI. ACME, average causal mediation effect; ADE, average direct effect; ADHD, attention-deficit/hyperactivity disorder; DAS-2 GCA, Differential Ability Scales–Second Edition (General Conceptual Ability). T scores provide information about a child’s score relative to the reference sample.
^a Outcome variables reported in the CHAT study included scores on the NEPSY, the Conners’ Behavior Rating Scales, and the BRIEF, with summary measures obtained from the primary caregiver and the teacher, the PSQ-SRBD, and the PedsQL. Additional outcome variables included scores on the DAS-2 GCA, the Conners’ ADHD Index T scores obtained from the primary caregiver and the teacher, the CBCL, the OSA-18 inventory, and the modified ESS. Physiologic outcomes were represented by changes in BMI percentile score, C-reactive protein values, HOMA-IR, and the systolic and diastolic blood pressures. The relationship between the treatment and the outcome is assessed before and after inclusion of the log-transformed AHI as the mediating variable. Log transformation was performed because of the nonnormal distribution of the data.
^b The total effect is decomposed into the ACME and the ADE.
^c The contribution of the mediated effect as a proportion of the total effect is also shown. Point estimates are represented by means and 95% CIs, along with P values for comparisons. The number of children is indicated by “n =” in parentheses. P < .05 was considered significant.
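A reader can check the internal consistency of the rows above with a short script. The sketch below copies three rows from the table with the change in AHI as mediator and confirms that ACME plus ADE matches the total effect within rounding. The helper names and the rounding tolerance are hypothetical. The ratio of point estimates is printed only for comparison with the reported mean proportion and is not expected to match it exactly.

```python
# (outcome, ACME, ADE, total effect, reported proportion mediated),
# copied from the table above.
ROWS = [
    ("PSQ-SRBD", -0.05, -0.20, -0.25, 0.18),
    ("OSA-18", -3.39, -13.90, -17.28, 0.20),
    ("ESS", -0.49, -1.29, -1.77, 0.27),
]


def sum_matches_total(acme, ade, total, tol=0.02):
    """True when ACME + ADE equals the total effect within a rounding tolerance."""
    return abs((acme + ade) - total) <= tol


def ratio_of_estimates(acme, total):
    """Ratio of point estimates; an approximation to the reported proportion."""
    return acme / total


if __name__ == "__main__":
    for name, acme, ade, total, reported in ROWS:
        print(
            f"{name}: sum consistent={sum_matches_total(acme, ade, total)}, "
            f"ACME/total={ratio_of_estimates(acme, total):.2f}, "
            f"reported={reported:.2f}"
        )
```

A row that fails the sum check is a candidate transcription error in the table rather than evidence about the underlying analysis.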
The first sensitivity analysis revealed that the missing observations from the data did not significantly alter causal inferences (data not shown). Supplemental Table 10 and Supplemental Figs 3–5 represent sensitivity analyses for the significant mediation effects identified for 3 separate causal pathways. In summary, sequential ignorability was less likely to be violated for the mediation effects in the PSQ-SRBD (sensitivity parameter of −0.40 for the watchful waiting group and −0.30 for the adenotonsillectomy group) compared to the OSA-18 (sensitivity parameter of −0.30 and −0.10, respectively, for the 2 arms of the trial).
Discussion
In this study, polysomnographic resolution and changes in severity of OSA accounted for small but significant proportions of changes in both symptoms and disease-specific quality of life in children treated for OSA. Importantly, polysomnographic resolution of OSA or changes in its severity did not causally impact 16 out of the 18 treatment outcomes. These results highlight the limited utility associated with the use of polysomnographic thresholds in the management of OSA.
The principal outcome reported in the treatment of pediatric OSA is the resolution of the condition by polysomnography. This approach is supported by several nonrandomized studies with both prospective32 and retrospective33 designs as well as multiple meta-analyses.34–36 As shown by the CHAT study, children who were African American, obese, or with greater baseline severity of OSA were more likely to have persistent OSA at follow-up regardless of the treatment arm.8
Additionally, the comparison of treatment effects between the 2 arms of the trial also revealed statistically significant increases in weight gain and HOMA-IR in the surgical arm. Although weight gain after adenotonsillectomy has been described previously,37 the effect sizes of these differences were small in the current study.
Because of the risk of persistent OSA in 10% to 50% of children undergoing adenotonsillectomy, postoperative polysomnography is recommended2 for early detection and further treatment.1 On the basis of this recommendation, in the current study, 50% of the children (77 out of 154) without resolution of OSA may have been potential candidates for further treatment ranging from additional surgery to continuous positive airway pressure therapy solely on the basis of an AHI exceeding 5 at follow-up.
The CHAT study remains the only randomized trial to date to investigate the benefits of surgery over watchful waiting for outcomes related to childhood OSA. The trial additionally provided the opportunity to examine the isolated effects of polysomnographic resolution of OSA, or changes in its severity, on treatment outcomes. Importantly, the causal mediation analysis suggests that a majority of the treatment-related changes in outcomes of OSA in children are not attributable to polysomnographic resolution or changes in severity. Given that 12 out of the 18 outcomes in the trial revealed significant average changes with treatment, alterations in other polysomnographic parameters such as sleep architecture may play a mediating role and merit further investigation.
The use of polysomnography is central to the guidelines from the American Academy of Pediatrics,1 American Academy of Sleep Medicine,2 American Academy of Otolaryngology–Head and Neck Surgery,3 and the ERS4 for the diagnosis and management of OSA, including the 500 000 children undergoing adenotonsillectomy each year. The ERS suggests treating a child with an AHI >5 even in the absence of comorbidities.4 Such an approach increases the risk of avoidable surgical morbidity in children considered candidates for further treatment solely on the basis of empirical polysomnographic thresholds. Moreover, a stepwise approach that seeks complete resolution of OSA in children has been proposed with a recommendation to measure outcomes on the basis of polysomnography at each stage.4 Liberal polysomnography as proposed in these guidelines may not be justifiable given the costs and resources needed.7,38
The principal strengths of this study are related to the robust criteria used to define inclusion and exclusion of subjects in the only randomized trial to assess the benefit of adenotonsillectomy over watchful waiting for the treatment of childhood OSA. The causal mediation analysis described in the current study has been promulgated to provide mechanistic explanations for outcomes related to interventions.39 Furthermore, the causal models described here are generally agnostic to the nature of the underlying data distributions.28 Children were enrolled from multiple sites across the United States, underscoring the generalizability of the results and bias mitigation efforts. The 18 outcomes examined as part of the trial were derived from all possible domains potentially impacted by OSA in children. The weaknesses of the study are similar to those listed by the original trial, which include the narrow age range of the children, the exclusion of children with an AHI exceeding 30, and a relatively short period of follow-up. Additionally, identification of other causal mediators related to the treatment of OSA is a subject of future investigation.
The mediation analysis of the results obtained from the CHAT study suggests a limited role for polysomnographic resolution or changes in severity of OSA in causally influencing the outcomes related to its treatment. These results caution against the use of empirical AHI-based thresholds to assess outcomes related to treatment. Further studies are necessary to identify possible mediators of outcomes of pediatric OSA to better define the severity of the condition and reduce the cost of diagnosis.
Acknowledgments
We thank the University of Maryland Biostatistics Core of the Institute for Clinical and Translational Research for statistical consultations. We also acknowledge Drs Abraham Kanate, Anita Shet, and Arun Shet for their useful feedback on earlier versions of the article.
Dr Isaiah conceptualized and designed the study, conducted the initial analysis, and drafted the initial manuscript; Drs Das and Pereira conceptualized and designed the study; and all authors reviewed and revised the manuscript, approved the final manuscript as submitted, and agree to be accountable for all aspects of the work.
This trial has been registered at www.clinicaltrials.gov (identifier NCT00560859).
FUNDING: No external funding.
- AHI: apnea hypopnea index
- BRIEF: Behavior Rating Inventory of Executive Function
- CBCL: Child Behavior Checklist
- CHAT: Childhood Adenotonsillectomy Trial
- CI: confidence interval
- CRP: C-reactive protein
- ERS: European Respiratory Society
- ESS: Epworth Sleepiness Scale
- HOMA-IR: Homeostasis Model Assessment for Insulin Resistance
- NEPSY: Developmental Neuropsychological Assessment
- OAI: obstructive apnea index
- OSA: obstructive sleep apnea
- OSA-18: Obstructive Sleep Apnea-18
- PedsQL: Pediatric Quality of Life Inventory
- PSQ-SRBD: Pediatric Sleep Questionnaire Sleep-Related Breathing Disorder
Competing Interests
POTENTIAL CONFLICT OF INTEREST: Dr Isaiah is an inventor of 3 technologies related to the diagnosis and treatment of sleep apnea in adults; the other authors have indicated they have no potential conflicts of interest to disclose.
FINANCIAL DISCLOSURE: Dr Isaiah receives patent-related royalties from the University of Maryland, Baltimore, for inventions related to sleep apnea; the other authors have indicated they have no financial relationships relevant to this article to disclose.
Comments
Childhood Sleep Apnea: Don't Quit on Polysomnography Just Yet…
To the Editor
We have read with great interest the recent study by Isaiah et al. (1), in which they analyze data from the CHAT (2) multicenter RCT comparing surgical versus watchful waiting approaches in children with OSA. Isaiah and colleagues aimed to infer whether polysomnographic resolution (measured by change in apnea-hypopnea index, AHI) underlies the changes in measurable outcomes. Their conclusion was that since the change in AHI does not statistically mediate the change in most of the measured outcomes, "liberal polysomnography […] may not be justifiable". We not only strongly disagree with the authors’ conclusions but further consider them misguided. As such, the following arguments will attempt to explain the fallacies in Isaiah and colleagues’ reasoning:
The main conclusion of the original CHAT study (2) was that the primary outcome of interest, the neurocognitive domain, was comparable between the two treatment arms, and that parentally reported quality of life was favorably affected by adenotonsillectomy (T&A). Since its original publication in 2013, the CHAT study has been criticized primarily for methodological limitations: the study included a narrow age range of children with mild-moderate OSA, and excluded those with severe OSA. Since OSA increases the risk of cognitive deficits in a severity-dependent fashion (3), not all children included in the study presented evidence of such deficits. Additionally, the follow-up period of 7 months may not have allowed for improvement in other outcome measures among those who were cognitively affected. Notwithstanding, when a post-hoc analysis of a small group of children presenting deficits was conducted, significant improvements emerged, but the number of subjects precluded establishing any relationship between post-surgical sleep measures and degree of cognitive improvements (4). Bearing in mind these limitations, Isaiah et al. found that AHI improvements were associated with improvements in disease-specific quality of life (QoL) and with OSA symptom resolution. This should not come as a surprise when considering the burden of OSA on QoL, which can be metaphorically paralleled to someone carrying a very heavy backpack. Removing a meaningful fraction of that backpack weight will obviously result in improvements in QoL, even if residual disease remains. Such relief may be misleading, and drive the conclusion that no PSG is needed after the T&A, at a time when many pathophysiological findings that drive end-organ morbidity continue to be operational. Furthermore, the authors fail to communicate that AHI is only one of multiple measures in the PSG that associate with morbidity.
Moreover, the risk factors for residual OSA in CHAT included African-American race, obesity, and more severe OSA. Not surprisingly, these are the children that pose the greatest challenges, since it is hard to predict who will benefit from T&A and to what extent, even if symptoms have virtually disappeared. Residual OSA, which would go undetected based on the proposition of Isaiah and colleagues, would continue to impose its risk for end-organ morbidity. In summary, we urge that the results of the current mediation analysis be interpreted with caution, bearing in mind that despite being laborious and costly, PSG is still the only comprehensive and objective measure of sleep pathology in children.
1. Isaiah A, Pereira KD, Das G. Polysomnography and Treatment-Related Outcomes of Childhood Sleep Apnea. Pediatrics. 2019.
2. Marcus CL, Moore RH, Rosen CL, Giordani B, Garetz SL, Taylor HG, et al. A randomized trial of adenotonsillectomy for childhood sleep apnea. The New England Journal of Medicine. 2013;368(25):2366-76.
3. Hunter SJ, Gozal D, Smith DL, Philby MF, Kaylegian J, Kheirandish-Gozal L. Effect of Sleep-disordered Breathing Severity on Cognitive Performance Measures in a Large Community Cohort of Young School-aged Children. American Journal of Respiratory and Critical Care Medicine. 2016;194(6):739-47.
4. Taylor HG, Bowen SR, Beebe DW, Hodges E, Amin R, Arens R, et al. Cognitive Effects of Adenotonsillectomy for Obstructive Sleep Apnea. Pediatrics. 2016;138(2).