To evaluate the diagnostic accuracy of the Early Autism Evaluation (EAE) Hub system, a statewide network that provides specialized training and collaborative support to community primary care providers in the diagnosis of young children at risk for autism spectrum disorder (ASD).
EAE Hub clinicians referred children, aged 14 to 48 months, to this prospective diagnostic study for blinded follow-up expert evaluation including assessment of developmental level, adaptive behavior, and ASD symptom severity. The primary outcome was agreement on categorical ASD diagnosis between EAE Hub clinician (index diagnosis) and ASD expert (reference standard).
Among 126 children (mean age: 2.6 years; 77% male; 14% Latinx; 66% non-Latinx white), 82% (n = 103) had consistent ASD outcomes between the index and reference evaluation. Sensitivity was 81.5%, specificity was 82.4%, positive predictive value was 92.6%, and negative predictive value was 62.2%. There was no difference in accuracy by EAE Hub clinician or site. Across measures of development, there were significant differences between true positive and false negative (FN) cases (all Ps < .001; Cohen’s d = 1.1–1.4), with true positive cases evidencing greater impairment.
Community-based primary care clinicians who receive specialty training can make accurate ASD diagnoses in most cases. Diagnostic disagreements were predominately FN cases in which EAE Hub clinicians had difficulty differentiating ASD and global developmental delay. FN cases were associated with a differential diagnostic and phenotypic profile. This research has significant implications for the development of future population health solutions that address ASD diagnostic delays.
Finding effective and scalable solutions to address autism diagnostic delays and disparities is a public health imperative. Tiered community-based approaches that enhance the capacity of primary care clinicians to provide diagnostic evaluations hold promise for addressing this problem.
Primary care clinicians trained as part of a statewide system can make accurate ASD diagnoses in a majority of cases, extending evidence that tiered, community-based models may be a valid approach for reducing ASD diagnostic delays.
Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder defined by impairments in social communication and the presence of restricted and repetitive behaviors,1 with estimated prevalence of 1 in 36 8-year-old children.2 Although reliable ASD diagnosis is often possible in the second year of life,3 the median age of diagnosis in the United States is 49 months.2 For many children, this delay4–6 is because of a bottleneck in access to diagnostic evaluations.7 Shortages of specialists8 trained to provide diagnostic evaluations and clustering of available specialists in metropolitan areas9–11 result in excessive family travel requirements, lost wages, and need to find alternative caregivers for other children or dependents.12–15 Further, labor- and cost-intensive evaluation models and assessment tools limit efficiency and contribute to organizational16–18 and family19 burden. These factors, together with systemic influences on socioeconomic status, cultural stigma, and reduced access to information, education, and community resources, contribute to substantial diagnostic disparities for historically minoritized children and families.20–22 ASD diagnostic delays impede enrollment in targeted interventions, with cascading individual23–25 and societal26–28 consequences. As such, finding feasible, equitable, and scalable solutions that address ASD diagnostic delays and disparities is a public health imperative.
In recent years, tension between the notion that ASD diagnostic evaluation must be expert-driven to maintain quality standards29 and the very significant demands for increasing capacity of diagnostic service systems18 has grown. Yet, there seems to be increasing recognition that tiered, community-based approaches that enhance the capacity of primary care providers (PCPs) to conduct diagnostic evaluations of young children at risk for ASD hold promise for reducing delays and disparities.30,31 Both the American Academy of Pediatrics32 and the Canadian Pediatric Society33 now recognize that general pediatricians with training in application of Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, ASD criteria can make a clinical diagnosis of ASD. Although nonspecialist providers such as PCPs will not have the expertise to make a definitive diagnosis for all at-risk children (ie, given the substantial heterogeneity of the disorder), there is mounting evidence to suggest that PCPs can make an initial clinical diagnosis to facilitate initiation of services for many young children with clear ASD symptoms.18,30,31,34
Data on the implementation of novel diagnostic technologies35,36 and streamlined training30 and evaluation models37–40 are emerging rapidly. Findings suggest that these innovative diagnostic approaches may shift clinician knowledge and perceived competency, improve access, and result in a moderate to high degree of diagnostic accuracy. However, studies on PCP training in ASD diagnosis have been limited by small sample sizes, variable methodological quality, and heterogeneous design and selection of outcome variables.30 Guan et al30 recently called for more rigorous studies of PCP evaluation models that include demographic characteristics of clinicians and patients, comprehensive assessments of outcome, and data on diagnostic accuracy.
Community-based, PCP-delivered ASD diagnostic evaluation models have high potential to reduce diagnostic delays and disparities. As such, our goal in the present prospective diagnostic study was to evaluate the diagnostic accuracy of a statewide model of early ASD evaluation in the primary care setting. Specifically, we present indices of diagnostic accuracy between PCPs trained to deliver ASD evaluations and comprehensive expert ASD evaluation, as well as differences in diagnostic, demographic, and phenotypic characteristics across diagnostic accuracy groups.
Methods
Study Setting
This study took place within the Early Autism Evaluation (EAE) Hub system, a statewide network of community PCPs trained to provide streamlined diagnostic evaluations for young children, aged 14 to 48 months, at risk for ASD.39 As outlined in McNally Keehn et al,39 EAE Hub clinician training involved didactics in ASD evaluation of young children, as well as a clinical practicum with practice-based coaching until mastery of all components of the standard clinical diagnostic evaluation was achieved. After initial training, all PCPs participated in a virtual longitudinal learning collaborative (which meets every month) with content on challenging case presentations, updated ASD diagnosis and care management practice standards, billing and coding guidance, and information on statewide ASD resources. EAE Hubs receive referrals from regional PCPs for evaluation of children determined to be at risk for ASD on the basis of surveillance and/or developmental screening, and then follow a standard clinical evaluation protocol including administration/review of standard developmental and autism screening tools (ie, Ages and Stages Questionnaire-3 and Modified Checklist for Autism in Toddlers–Revised With Follow-Up), a developmental history and Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, focused ASD clinical interview, physical examination, and administration of an observational assessment of ASD (Screening Tool for Autism in Toddlers41). The EAE Hub clinician then issues a best-estimate ASD diagnosis and report with clinical recommendations, including information on community and statewide interventions and resources for children with ASD and developmental disabilities.
Study Design and Participants
Standards for Reporting Diagnostic Accuracy guidelines were followed in the design and conduct of this prospective diagnostic study. Seven EAE Hubs set within primary care practices (ie, including 6 large health system group practices and 1 private practice) referred children, aged 14 to 48 months evaluated for ASD in the community primary care setting, to the study between June 2019 and August 2022. To be included, children were aged 14 to 48 months at time of EAE Hub evaluation and had an English-speaking primary caregiver/guardian. This study was approved by the Indiana University School of Medicine institutional review board, and written informed consent was obtained from legal guardians of all participants.
Study Procedure
EAE Hubs were recruited into the study in a nonrandom, staggered manner during the study period; site startup was impacted by coronavirus disease 2019 institutional research and patient care regulations. Each site referred a prospective, consecutive sample of children who received an EAE Hub evaluation after site startup until ∼20 children from each site were enrolled (note: Site 1 recruited a greater number of participants because they served as a pilot and study site; Site 4 recruited fewer participants because of relocation of the EAE Hub clinician during the study period). This recruitment procedure allowed the study team to maintain diagnostic blindness and assess children with both ASD and non-ASD outcomes without referral bias. During EAE Hub evaluations, clinicians (or a member of the EAE Hub clinical team; eg, nurse or medical assistant) gave caregivers of evaluated children a study brochure and a brief verbal description of the study, and obtained signed consent to share contact information with the study team for recruitment and enrollment. Once enrolled, an electronic caregiver-report survey (ie, caregiver-reported demographic data on child race/ethnicity and caregiver/family income and education level) and EAE Hub evaluation data (ie, index ASD diagnosis and clinician diagnostic certainty) were collected by a member of the study team (Fig 1). The study team, consisting of a licensed clinical psychologist (R.M.K., B.E., or T.R.) with expertise in evaluation of ASD in toddlers and young children and clinical research technician (advanced graduate student or postdoctoral fellow: G.K., L.H., or A.M.M.), traveled to the EAE Hub to conduct a follow-up gold-standard ASD diagnostic assessment within 16 weeks of the initial EAE Hub evaluation. Figure 1 and Supplemental Information detail the outcome measures, including child, caregiver, and clinician measures, administered to obtain a best-estimate ASD diagnosis (reference standard diagnosis).
Participants were compensated with a gift card in the amount of $25 per hour of completed research evaluation (up to a maximum of $75).
Analysis
Data analyses were performed using SPSS (IBM SPSS Statistics, Version 28, Armonk, NY: IBM Corp) and JMP (JMP, Version 13.0, Cary, NC: SAS Institute Inc.). Continuous variables are reported as means and SDs, and categorical variables are reported as absolute frequencies and percentages. Diagnostic accuracy, the primary outcome of interest, was calculated by comparing percentage agreement between the EAE Hub diagnosis and ASD specialist on categorical ASD diagnosis (ASD; non-ASD). Chance-corrected agreement (κ) and accuracy indices of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the 95% confidence interval (CI) are reported. There were no missing data or indeterminate results regarding index or reference diagnoses. Variability in diagnostic accuracy by EAE Hub site and clinician was examined via a series of χ2 and Fisher’s exact tests. Exploratory post-hoc analyses were conducted to examine diagnostic, demographic, and phenotypic differences between true positive (TP) and false negative (FN) outcome groups to understand differences between children who were correctly diagnosed with ASD and those who were missed (or unable to be definitively diagnosed) by nonspecialist clinicians. χ2 and 2-sided t tests were used to compare categorical and continuous variables, respectively. Finally, an additional subset analysis of accuracy indices for cases with EAE Hub clinician diagnostic certainty ratings of Highly or Completely Certain was conducted to examine whether agreement improved on the basis of clinician perceptions of diagnostic confidence. A sample size of N = 126 provided an upper/lower limit 95% CI width of 0.067 for overall diagnostic agreement, assuming 82% agreement (ie, as observed in this study).
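The precision statement above can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval for a single proportion; the source does not state which formula was used, and the function name is illustrative only.

```python
import math


def wald_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation 95% CI for a proportion.

    Assumption: the Wald formula z * sqrt(p * (1 - p) / n). The paper
    does not name the method behind its reported 0.067.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


# Values taken from the text: 82% agreement, N = 126.
half_width = wald_half_width(0.82, 126)
print(f"95% CI half-width: {half_width:.3f}")
```

Under this assumption, the interval for overall agreement would extend roughly 0.067 on either side of 0.82.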
Results
Participant flow through the study is detailed in Fig 2. Of 182 referred children, 131 enrolled, and index and reference standard diagnosis evaluations were included in the final analysis for 126 children across 6 EAE Hubs. Ten clinicians conducted a mean of 12.6 (SD = 7.4) evaluations (Supplemental Table 4 for EAE Hub clinician demographics and learning collaborative participation). Mean age of children was 2.6 years; 77% (n = 97) were male, 14% (n = 18) were Hispanic/Latinx, and 66% (n = 84) were non-Latinx white (Table 1). Across all children evaluated, scores on measures of developmental and adaptive skills fell well below the average range and ASD symptom severity was in the moderate range.42 Seventy-five percent (n = 94) of children had a reference diagnosis of ASD; 10% had global developmental delay (GDD) (n = 13), 10% had language delay (n = 13), and 5% (n = 6) had another emotional, behavioral, or medical concern.
Characteristic | All Included, N (%) |
---|---|
Number of participants | 126 |
Age, m (SD), y | 2.6 (0.6) |
Sex | |
Male | 97 (77) |
Female | 29 (23) |
Race/ethnicity | |
>1 race | 7 (6) |
Asian American | 1 (1) |
Black | 10 (8) |
Hispanic/Latinx, any race | 18 (14) |
White | 84 (66) |
Unknown/not reported | 6 (5) |
Yearly household income | |
<$25 000 | 21 (17) |
$25 000–$49 999 | 40 (32) |
$50 000–$74 999 | 29 (23) |
$75 000–$99 000 | 12 (10) |
$100 000–$149 999 | 13 (10) |
>$150 000 | 2 (2) |
Primary caregiver education level | |
Less than high school diploma | 8 (6) |
High school diploma/GED | 36 (29) |
Some college, no degree | 32 (25) |
Associate degree/postsecondary certificate | 14 (11) |
Bachelor’s degree | 24 (19) |
Graduate degree | 12 (10) |
MSEL ELC, m (SD) | 61.8 (14.7) |
MSEL nonverbal DQ, m (SD) | 70.4 (17.3) |
MSEL verbal DQ, m (SD) | 52.1 (24.2) |
Vineland-3 ABC, m (SD) | 67.9 (12.0) |
ADOS-2 CSS, m (SD) | 7.0 (3.0) |
Reference diagnosis of ASD | 94 (75) |
Reference diagnosis of GDD | 13 (10) |
Reference diagnosis of LD | 13 (10) |
Reference diagnosis of other^a | 5 (4) |
Reference diagnosis of known genetic syndrome | 1 (1) |
ADOS-2 CSS, Autism Diagnostic Observation Schedule, Second Edition, Calibrated Severity Score; DQ, developmental quotient; LD, language delay; m, mean; MSEL ELC, Mullen Scales of Early Learning Early Learning Composite; Vineland-3 ABC, Vineland Adaptive Behavior Scales, Third Edition, Adaptive Behavior Composite.
^a Other diagnosis includes emotional behavioral concerns such as separation anxiety, sensory processing impairment, and parent–child relational problem.
Agreement Between Index and Reference Diagnosis
Of 126 children evaluated, ASD diagnosis was consistent between the EAE Hub evaluation (index diagnosis) and expert research evaluation (reference diagnosis) for 82% (n = 103) of cases (Table 2). Chance-corrected diagnostic agreement was moderate, κ = 0.580 (95% CI, 0.429–0.731). Sensitivity, or correct classification of ASD diagnosis, was 81.5% (95% CI, 72.4–88.1), whereas specificity, or correct classification of non-ASD diagnosis, was 82.4% (95% CI, 66.5–91.7). PPV was 92.6% (95% CI, 84.8–96.6) and NPV was 62.2% (95% CI, 47.6–74.9). Overall, 60% (n = 75) of cases were TP (index = ASD; reference = ASD), 5% (n = 6) were false positive (FP: index = ASD; reference = non-ASD), 22% (n = 28) were true negative (TN: index = non-ASD; reference = non-ASD), and 14% (n = 17) were FN (index = non-ASD; reference = ASD) (Table 2).
Index Diagnosis | Reference Standard Diagnosis: ASD | Reference Standard Diagnosis: Non-ASD |
---|---|---|
ASD | 75 (60) | 6 (5) |
Non-ASD | 17 (14) | 28 (22) |
Data represented as number (%). Index diagnosis is based on EAE Hub evaluation. Reference standard diagnosis based on blinded expert research diagnosis.
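The accuracy indices reported in this section follow directly from the four cell counts in Table 2. The sketch below recomputes them in plain Python. The function names are illustrative, and the Wilson score interval is an assumption: the source does not name its CI method, although Wilson intervals appear to reproduce the reported limits for sensitivity.

```python
import math


def accuracy_indices(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 diagnostic accuracy indices as proportions."""
    total = tp + fp + fn + tn
    return {
        "agreement": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Chance-corrected agreement for two raters and two categories."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Expected agreement from the row (index) and column (reference) margins.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    return (observed - expected) / (1.0 - expected)


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion (assumed CI method)."""
    p = successes / total
    denom = 1.0 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Cell counts from Table 2: TP = 75, FP = 6, FN = 17, TN = 28.
indices = accuracy_indices(tp=75, fp=6, fn=17, tn=28)
kappa = cohen_kappa(tp=75, fp=6, fn=17, tn=28)
sens_lo, sens_hi = wilson_interval(75, 75 + 17)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
print(f"kappa: {kappa:.3f}")
print(f"sensitivity 95% CI: {sens_lo:.3f} to {sens_hi:.3f}")
```

Keeping the indices as proportions and formatting only at print time avoids accumulating rounding error when values feed into later calculations such as κ.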
Diagnostic Agreement by EAE Hub Site and Clinician
There was no difference between EAE Hub sites in overall accuracy (ie, accurate versus not; TP + TN versus FP + FN; P = .89) or proportion of FN (compared with TP) cases (P = .67) (Supplemental Table 5). Similarly, there was no difference in overall accuracy (P = .24) or proportion of FN cases (P = .09) by EAE Hub clinician for those submitting data for ≥5 children (n = 8).
Diagnostic, Demographic, and Phenotypic Differences by Diagnostic Agreement Group
Descriptive, clinical, and phenotypic data by diagnostic group can be found in Supplemental Table 6. To address the question of what diagnostic, demographic, and/or phenotypic factors may be associated with FN diagnoses made by trained PCPs, we conducted an exploratory analysis of differences between TP and FN cases (Table 3). There was a significant difference in dichotomized (ie, Highly or Completely Certain versus all other ratings) index clinician diagnostic certainty ratings between TP and FN groups, P = .002, with a higher proportion of Highly–Completely Certain ratings for the TP (95%; 71 of 75), as compared with the FN (65%; 11 of 17) group. Similarly, index clinicians flagged a significantly higher proportion of FN cases (69% of FN; 17% of TP) for specialty follow-up evaluation, P < .001. There were no demographic differences by age or sex (Ps ≥ .20). Across measures of developmental and adaptive skills (ie, Mullen Scales of Early Learning Verbal Developmental Quotient, Nonverbal Developmental Quotient, and Early Learning Composite; Vineland-3 Adaptive Behavior Composite), there were significant differences between TP and FN cases (all Ps < .001), with the TP group evidencing significantly greater impairment as compared with the FN group. There was no significant difference in Autism Diagnostic Observation Schedule, Second Edition, Calibrated Severity Scores between the TP and FN groups (P = .28), suggesting no meaningful differences in ASD symptom severity between groups. To address factors that may be associated with FP diagnoses, we examined differences between TN and FP cases (Supplemental Table 7). Although these results must be interpreted with substantial caution given the small sample size, it appears that there may be a trend toward older age in the FP group.
Characteristic | TP (n = 75) | FN (n = 17) | P | Effect Size |
---|---|---|---|---|
Clinical characteristics | | | | |
Index diagnostic certainty, n (%) | | | .002 | 9.7 [2.4–39.9]^a |
Completely/highly certain | 71 (95) | 11 (65) | | |
All other ratings | 4 (5) | 6 (35) | | |
Specialty evaluation, n (%) | | | <.001 | 10.6 [3.1–36.2]^a |
Referral not recommended | 58 (83) | 5 (31) | | |
Referral recommended | 12 (17) | 11 (69) | | |
Demographic characteristics | | | | |
Age, m (SD) y | 2.7 (0.6) | 2.7 (0.6) | .89 | 0.04^b |
Sex, n (%) | | | .20 | 2.1 [0.7–7.4]^a |
Male | 63 (84) | 12 (71) | | |
Female | 12 (16) | 5 (29) | | |
Phenotypic characteristics | | | | |
MSEL, m (SD) | | | | |
Verbal DQ | 39.8 (17.9) | 60.1 (18.7) | <.001 | 1.4^b |
Nonverbal DQ | 63.0 (15.4) | 78.7 (10.8) | <.001 | 1.1^b |
MSEL ELC | 54.7 (7.6) | 64.5 (12.0) | <.001 | 1.1^b |
Vineland-3 ABC, m (SD) | 62.3 (9.6) | 72.1 (4.8) | <.001 | 1.1^b |
ADOS-2 CSS, m (SD) | 8.6 (1.6) | 8.1 (2.0) | .28 | 0.3^b |
Index diagnosis is based on EAE Hub evaluation. Categorical variables presented as number (%); continuous variables presented as mean (SD). P values represent 2-sided significance of t test for continuous variables and χ2 or Fisher’s exact test for categorical variables. ADOS-2 CSS, Autism Diagnostic Observation Schedule, Second Edition, Calibrated Severity Score; DQ, developmental quotient; m, mean; MSEL ELC, Mullen Scales of Early Learning Early Learning Composite; Vineland-3 ABC, Vineland Adaptive Behavior Scales, Third Edition, Adaptive Behavior Composite.
^a Effect size reported as odds ratio [95% CI].
^b Effect size reported as Cohen’s d.
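The two effect-size types in Table 3 can be illustrated from the summary counts and moments in the table. The sketch below is a minimal illustration under stated assumptions: the odds ratio interval uses the log (Woolf) method and Cohen's d uses a pooled SD. The source specifies neither, and the function names are hypothetical. The example rows are index diagnostic certainty (odds ratio) and MSEL nonverbal DQ (Cohen's d); other rows may differ slightly because of rounding or per-measure sample sizes.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    """Odds ratio for a 2x2 table with a log-method (Woolf) 95% CI.

    a, b: group 1 counts with / without the characteristic.
    c, d: group 2 counts with / without the characteristic.
    Assumption: the Woolf interval; the paper does not name its method.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def cohens_d(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> float:
    """Cohen's d from summary statistics using a pooled SD (assumed)."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m2 - m1) / math.sqrt(pooled_var)


# Index diagnostic certainty: TP 71 certain / 4 other; FN 11 certain / 6 other.
or_value, or_lo, or_hi = odds_ratio_ci(71, 4, 11, 6)
print(f"OR = {or_value:.1f} [{or_lo:.1f} to {or_hi:.1f}]")

# MSEL nonverbal DQ: TP 63.0 (15.4), n = 75; FN 78.7 (10.8), n = 17.
d_nonverbal = cohens_d(63.0, 15.4, 75, 78.7, 10.8, 17)
print(f"Cohen's d (nonverbal DQ) = {d_nonverbal:.2f}")
```

A positive d here means the FN group scored higher than the TP group, which matches the direction described in the text (greater impairment among TP cases).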
Subset Analysis by Index Diagnostic Certainty Ratings
Among the subset of cases in which an EAE Hub clinician rated diagnostic certainty to be Highly or Completely Certain (N = 105), sensitivity was 84.5% (95% CI, 75.3–90.7) and specificity was 90.5% (95% CI, 71.1–97.3). PPV was 97.3% (95% CI, 90.5–99.2) and NPV was 59.4% (95% CI, 42.3–74.5).
Discussion
In this prospective diagnostic study, we found 82% agreement between trained primary care clinicians and blinded expert research evaluation on categorical ASD diagnosis of children aged 14 to 48 months. Accuracy indices of sensitivity, specificity, and PPV were high (ie, 81.5%, 82.4%, and 92.6%, respectively), whereas NPV was substantially lower (ie, 62.2%). There were no statistically significant differences in accuracy by EAE Hub site or clinician. Diagnostic disagreements were predominately FN cases in which EAE Hub clinicians had difficulty differentiating ASD and GDD. Clinicians flagged most of these cases for follow-up specialty evaluation. FN (as compared with TP) cases were associated with a differential diagnostic and phenotypic, but not demographic, profile. When analysis was restricted to cases in which EAE Hub clinicians rated their diagnostic certainty high, measures of sensitivity, specificity, and PPV improved. To our knowledge, this is the largest study to date that evaluates diagnostic accuracy of a coordinated system of diagnosis in the primary care setting. Notable strengths of this study include the diversity of included primary care index clinicians (ie, from large health system group practices, federally qualified health centers, and private practices) and children evaluated (ie, from diverse socioeconomic and family education backgrounds), large sample size, and rigorous methodology (ie, including blinded reference standard evaluations).
Although existing reports of primary care-based models of ASD diagnosis show promising evidence for improved service access and acceptable accuracy, studies have been limited by small sample size and reduced methodological rigor, or have not used a standard approach for training and diagnostic evaluation.30 Nonetheless, our 82% diagnostic agreement between index and reference diagnosis is comparable to that of others who have reported rates of agreement between PCP and expert evaluation between 71% and 85%.34,43,44 Importantly, across EAE Hub sites and clinicians, including PCPs and nurse practitioners (NPs), there was no difference in overall accuracy or rate of FN cases. Given the small number of NPs in our study, future examination of accuracy with a larger sample of diverse clinicians is needed.
Our study is the first to report on ASD accuracy metrics between nonspecialist clinicians and expert diagnosis when a standard training and clinical pathway is followed. Findings suggest that PCPs who receive specialty training are highly reliable when they confirm an ASD diagnosis, as evidenced by our very low rate of FP cases (5%) and high PPV (92.6%). Clinicians were unable to make a definitive diagnosis or missed ASD in 14% of cases, resulting in low NPV (62.2%). Similar to the findings of Penner et al,34 FN cases evidenced higher verbal and nonverbal developmental level and adaptive skills, though most in our study still met criteria for GDD. Notably, there was no difference in ASD symptom severity between TP and FN cases, suggesting that index clinicians may place more emphasis on developmental impairment than ASD-specific symptoms when making diagnostic decisions. FN cases were associated with lower index clinician diagnostic certainty and higher rates of referral (69% of cases) for specialty evaluation, suggesting that clinicians recognized that these children demonstrated a more complex profile, making differential diagnosis between GDD and ASD challenging. When analysis of diagnostic agreement was restricted to only cases for which index diagnostic certainty was high, sensitivity, specificity, and PPV increased, suggesting that primary care clinicians perceive their ability to render a correct ASD diagnosis with high accuracy. Future research should evaluate whether triaging cases for specialty evaluation on the basis of the child’s overall developmental level (ie, those with higher developmental skills) and/or low clinician diagnostic certainty may mitigate the rate of FN diagnosis in the primary care setting.
Limitations
A primary limitation of the current study is the high proportion (75%) of reference ASD diagnosis in the sample, resulting in low sample size for comparisons by accuracy subgroup. Because of small sample size, we did not use modeling approaches to adjust for potential correlated outcomes of patients clustered within EAE Hub sites or clinicians; however, we did examine site and clinician differences on overall accuracy and proportion of FN cases, and results were not significant. We also did not collect data on those children evaluated in the EAE Hub system who did not consent to participate, and thus we cannot rule out unmeasured bias in our findings or confirm the generalizability of our findings to all young children who require ASD evaluation. Inclusion of only children with English-speaking caregivers limits the generalizability of our findings, and further solutions to ensure equitable access to diagnostic evaluations are necessary. Although we asked clinicians to flag children that required specialty follow-up evaluation, we designed the study to force index clinicians to make a binary (ASD present/absent) choice about ASD outcome, perhaps artificially deflating accuracy indices because of caution against overdiagnosis. Finally, although an initial ASD diagnosis is needed to access specific services, longitudinal developmental evaluation is important for individualized intervention and prognosis planning.45 As such, tiered diagnostic approaches represent an important and promising solution to one component of the larger ASD service bottleneck problem.
Conclusions
Tiered diagnostic approaches, including primary care-based models such as the EAE Hub system, are now being tested as a solution to address the need for increased ASD diagnostic evaluation capacity.18,31 The EAE Hub system was developed with the primary goal of lowering the age of ASD diagnosis through providing streamlined access to localized diagnostic evaluation within the primary care setting. We have previously shown that the EAE Hub model, which involves intensive training for PCPs in ASD diagnosis and ongoing participation in a longitudinal learning collaborative, is feasible,39 with >4000 children evaluated since 2012. In the present prospective diagnostic study, we extend the empirical support for this model by demonstrating a high level of diagnostic accuracy (ie, 82%) in a sample of diverse community PCPs and at-risk children. Additional research is needed to understand implementation promotors and barriers to broad scale-up of community-based ASD models, as well as replication and comparative effectiveness studies that allow for determination of the key components of training and model implementation necessary for success. Testing strategies aimed to mitigate FN cases will be an essential next step in ensuring the accuracy and quality of streamlined, community-based ASD evaluations. Further, it will be critical to develop and test adaptations of tiered diagnostic approaches for non-English speaking children, as well as to determine whether these types of models lead to earlier intervention enrollment, both efforts currently underway in the EAE Hub system. Collectively, the study of innovative diagnostic models has important implications for how future population health solutions that address the ASD diagnosis crisis are designed and implemented. 
ASD experts, self-advocates and families, and other health service stakeholders (ie, insurers, service providers) must work together to construct and put into action flexible, evolving, evidence-driven health policies that account for scientific innovation and advancements in ASD diagnosis.17,46
Acknowledgments
We thank Mary Delaney for her work as a practice liaison for the Indiana University School of Medicine EAE Hub leadership team, as well as Yvonne Purcell for her support to the team during the data collection phase of the study. We also thank the EAE Hub clinicians for their commitment to the children and families of Indiana, our learning collaborative network, and their participation in this research. Finally, and most importantly, we thank the children and caregivers who participated in this study.
Drs McNally Keehn and Keehn conceptualized and designed the study, designed the data collection instruments, led data collection, analysis, and interpretation, and drafted the initial manuscript; Drs Swigonski, Monahan, and Ciccarelli contributed to data analysis and interpretation efforts; Drs Enneking, Ryan, Martin, Hamrick, and Kadlaskar, and Ms Paxton contributed to acquisition of data; and all authors reviewed and revised the manuscript for important intellectual content, approved the final manuscript as submitted, and agree to be accountable for all aspects of the work.
Deidentified individual participant data has been shared as part of the National Institute of Mental Health Data Archive.
COMPANION PAPER: A companion to this article can be found online at www.pediatrics.org/cgi/doi/10.1542/peds.2023-062279.
FUNDING: Supported by the National Institute of Mental Health grant #R21MH121747 (Drs McNally Keehn, Keehn, Swigonski), pilot funding from the Indiana Clinical and Translational Sciences Institute grant #UL1TR002529 (Dr McNally Keehn), and Purdue Big Idea Challenge 2.0.
CONFLICT OF INTEREST DISCLOSURES: The authors have indicated they have no conflicts of interest relevant to this article to disclose.