Quantity of talk and interaction in the home during early childhood is correlated with socioeconomic status (SES) and can be used to predict early language and cognitive outcomes. We tested the effectiveness of automated early language environment estimates for children 2 to 36 months old to predict cognitive and language skills 10 years later and examined effects for specific developmental age periods.
Daylong audio recordings for 146 infants and toddlers were completed monthly for 6 months, and the total numbers of daily adult words and adult-child conversational turns were automatically estimated with Language Environment Analysis software. Follow-up evaluations at 9 to 14 years of age included language and cognitive testing. Language exposure for 3 age groups was assessed: 2 to 17 months, 18 to 24 months, and ≥25 months. Pearson correlations and multiple linear regression analyses were conducted.
Conversational turn counts at 18 to 24 months of age accounted for 14% to 27% of the variance in IQ, verbal comprehension, and receptive and/or expressive vocabulary scores 10 years later after controlling for SES. Adult word counts between 18 and 24 months were correlated with language outcomes but were considerably weakened after controlling for SES.
These data support the hypothesis that early talk and interaction, particularly during the relatively narrow developmental window of 18 to 24 months of age, can be used to predict school-age language and cognitive outcomes. With these findings, we underscore the need for effective early intervention programs that support parents in creating an optimal early language learning environment in the home.
Previous studies in which researchers use transcriptions of short recordings have revealed that the quantity of talk and interaction experienced by infants and toddlers is correlated with early language and cognitive abilities.
Automated estimates of turn-taking interactions with children 18 to 24 months old predicted IQ and language skills 10 years later, suggesting that a child’s language experience during this relatively narrow early age window may predict later language and cognitive development.
In their landmark study, Hart and Risley1,2 quantified the language environments of typically developing infants and toddlers, finding that adult word exposure between 10 and 36 months of age predicted child IQ at age 3 years. Their work and subsequent research provide strong evidence that early language exposure predicts developmental outcomes.3,–6 In response, pediatric interventions have been developed to help parents and caregivers boost talk and interaction with young children; several of these incorporate Language Environment Analysis (LENA) software to characterize early language experience via automated analysis of daylong audio recordings.7,–10 Although these interventions reportedly have increased early talk and improved child language skills, research is needed on the long-term relationship between automated measures of early language experience and later developmental outcomes. In the current study, we tested whether cognitive and language skills in children 9 to 14 years of age were correlated with automated estimates of their early language experience and examined whether long-term outcomes were predicted differentially during 3 periods of early development. Note that we use predictiveness throughout in a statistical rather than explanatory or causative sense.
Decades of research have provided empirical evidence linking early language exposure and developmental outcomes.1,–3,5,6 In 1 study, 14- to 26-month-old children exposed to more adult words had higher rates of vocabulary development than those exposed to fewer words.4 In another study, high–socioeconomic status (SES) mothers of 18- to 29-month-old children spoke more often and with more varied vocabulary than mid-SES mothers, and their children demonstrated more advanced lexical development.11 Research connecting adult word exposure with higher rates of vocabulary development generally has been focused on children with spoken lexicons in the second and third years of life, when many children undergo linguistic changes that may influence interlocutors and patterns of interaction in their environments.12,13 In fact, it has long been argued that parents adjust their speech to infants and children on the basis of recognition of their level of development and awareness that child speech complexity changes with age.12,13 For example, children <18 months of age rarely engage with others using word combinations.14 Then around 18 months of age, children commonly produce their first combinatorial speech (2-word utterances), and their spoken vocabulary increases rapidly, a phenomenon sometimes called the “word spurt.”15,–17 The existence and nature of a literal word spurt is debatable; some researchers explain accelerated vocabulary learning and the appearance of word combinations by invoking a “naming insight,” and others suggest accelerated acquisition may be a byproduct of parallel word learning and variation in the time to learn new words.18,–20 Ganger and Brent21 argue against a word spurt but concede that “it is uncontroversial that a child’s rate of word learning increases during the second year of life.” Regardless of origin, a landscape change in language use is observed around this age. 
Subsequently, children >24 months old start to produce longer utterances, including grammatical morphemes and multiclausal sentences.14
Although follow-up with Hart and Risley's1,2 original sample revealed that child word complexity and length of utterances between 10 and 36 months of age predicted academic outcomes in third grade (eg, expressive and/or receptive language, spelling, and reading), analyses were presented in aggregate, and the authors did not examine effects for specific age groups. Little is known regarding whether language experience during different developmental periods may uniquely impact long-term outcomes. We address the question longitudinally using LENA language experience metrics extracted from automatically analyzed, full-day audio recordings collected from infants and toddlers to predict their language and cognitive skills in middle school. These relationships are examined across the full span of recording ages as well as within subgroups of children ages 2 to 17 months, 18 to 24 months, and ≥25 months. Analyses within these age groups are used to address the possibility that the long-term impact of a child’s parent-generated language environment may depend in part on developments in child utterance complexity.
The initial 2006 phase of this research has been reported in detail.22 Briefly, 329 children predominantly between 2 and 36 months of age (9 were between 38 and 47 months old) were recruited from the Denver metropolitan area, matching the US census on an SES proxy (mother’s attained education). Families completed daylong (12-hour) audio recordings monthly for 6 months.
LENA software automatically processed audio recordings to quantify adult word exposure, child vocalization (CV), and turn-taking interactions throughout the day on the basis of algorithmic analysis.7 The adult word count (AWC) algorithm does not recognize words directly but analyzes acoustic information (eg, related to syllable counts and consonant distributions) to estimate counts. The recording device registers all speech near the child; thus, AWC includes both overheard and child-directed speech. CVs are used to quantify speech-related vocalizing by the child. The conversational turn count (CTC) is used to quantify adult-child alternations (vocal initiations with responses that occur within 5 seconds). Both intentional vocal responses and accidental vocal contiguity can be included in the CTC measure.23
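As an illustration of the turn definition above (an adult-child alternation in which the response begins within 5 seconds), the following is a minimal sketch only. The segment format, the function name `count_turns`, and the choice to count every qualifying alternation are assumptions made for illustration; LENA's actual segmentation and pairing rules are not described here and may differ.

```python
from typing import List, Tuple

# Hypothetical segment format: (speaker, start_sec, end_sec),
# where speaker is "adult" or "child".
Segment = Tuple[str, float, float]


def count_turns(segments: List[Segment], max_gap: float = 5.0) -> int:
    """Count speaker alternations whose response starts within max_gap seconds.

    Like the measure described in the text, this cannot distinguish an
    intentional response from accidental vocal contiguity.
    """
    ordered = sorted(segments, key=lambda s: s[1])  # order by start time
    turns = 0
    for prev, curr in zip(ordered, ordered[1:]):
        different_speaker = prev[0] != curr[0]
        gap = curr[1] - prev[2]  # silence between end of prev and start of curr
        if different_speaker and gap <= max_gap:
            turns += 1
    return turns


example = [
    ("adult", 0.0, 2.0),
    ("child", 3.0, 4.0),    # responds after 1 s: counted
    ("adult", 12.0, 13.0),  # 8 s after the child: not counted
    ("adult", 14.0, 15.0),  # same speaker: not counted
    ("child", 16.0, 17.0),  # responds after 1 s: counted
]
example_turns = count_turns(example)
```

In this toy example, two alternations fall within the 5-second window.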
The reliability of LENA’s automated speaker segmentation has been extensively reported, with accuracy of identification of adult and child segments being between 68% and 82%.24,–27 Reliability for LENA measures was based on 5000 minutes of transcribed recording data from 94 children ages 2 to 48 months (30 to 70 minutes each [mean = 53.2; SD = 12.7 minutes]). AWC, CV, and CTC were highly correlated with human transcription counts (r = 0.95, 0.82, and 0.83, respectively; all P < .001). Differences between transcribed counts and LENA estimates were uncorrelated with age for AWC (r = −0.12; P = .27), CV (r = 0.16; P = .11), or CTC (r = 0.06; P = .57). The concurrent validity of LENA measures in the phase I sample was revealed by significant correlations with language assessments administered by a certified speech-language pathologist.22
Participant children were evaluated by a certified speech-language pathologist on a battery of assessments. A composite child language skills score (mean = 100; SD = 15) was generated by averaging total (expressive and receptive) language standard scores from the Preschool Language Scale, Fourth Edition and the Receptive-Expressive Emergent Language Test, Third Edition.28,29 Parents completed an age-appropriate version of the MacArthur-Bates Communicative Development Inventory, from which the child’s vocabulary size was included.30 In Fig 1, we illustrate growth in vocabulary across age for 90 participants for whom it was available concurrently with recordings and show that sample children demonstrated accelerating vocabularies around 18 months of age.
In phase II, phase I families with now early school-aged children (9–13 years old) were invited via letters to complete follow-up language and cognitive assessments with a clinical psychologist, during which time the parents completed a demographic questionnaire. On completion, children were given a $50 gift card. Participants were not provided with assessment results, and the evaluator was blind to phase I data and results.
Participant addresses were updated via phone calls and e-mail correspondences between phases. Figure 2 includes the derivation of the study sample in a simplified flow diagram. Ultimately, 146 Denver-area families provided informed consent approved by Heartland Institutional Review Board and participated in phase II (Table 1). More than 95% of phase II children were 36 months of age or younger at phase I onset. No differences were found between phase II participant and nonparticipant families (n = 183) on child sex or age at recording, but more phase II mothers had attended college (64.4% vs 44.8% [χ²(1) = 12.51; P < .001]).
Participant children were administered the Wechsler Intelligence Scale for Children, Fifth Edition (WISC-V); Peabody Picture Vocabulary Test (PPVT); and Expressive Vocabulary Test (EVT).31,–33 The WISC-V, for children 6 to 16 years of age, comprises 5 Primary Index Scales (Verbal Comprehension, Visual-Spatial, Fluid Reasoning, Working Memory, and Processing Speed) that are used to produce the Full-Scale IQ; of these, Full-Scale IQ and the Verbal Comprehension Index (VCI) were included in this study. The PPVT is a widely used measure of receptive vocabulary for ages 2 to ≥90 years in which respondents indicate which of 4 pictures matches a given word. The EVT, an expressive counterpart to the PPVT for the same ages, includes pictures that participants are asked to name. WISC-V administration generally takes 60 minutes, and the PPVT and EVT each take ∼15 minutes. All assessment scores were standardized (mean = 100; SD = 15) against the general population.
All valid recordings contributed by families over the 6-month phase I study were included for the full sample. LENA metrics were age standardized (mean = 100; SD = 15) against a LENA Research Foundation corpus of 3384 recordings from 378 families of typically developing children (including current participants) collected during phase I and subsequent studies. These values were then averaged within each family across recordings to produce 1 representative value and to minimize random variation in monthly scores. For this study, the early childhood language experience is characterized by AWC and CTC; CV is included only as a measure of child volubility. Pearson correlations were calculated for AWC and CTC with outcome measures and then recalculated, adjusting for SES, and repeated by age subgroups. For analyses within age groups, only age-appropriate recordings were used, and each family was represented in only 1 age group, the 1 for which they contributed the maximum number of recordings (with preference given to the 18–24-months age group to improve sample balance). That is, 1 set of LENA values that covered the full 6-month period and 1 set restricted to an age subgroup (eg, 18–24 months old) were analyzed. Mean full-sample and age-restricted values were highly correlated for AWC, CV, and CTC (r = 0.98, 0.96, and 0.97, respectively; all P < .001). In additional analyses, we examined the impact of a single recording per family and corrected for inequality of variance in age subgroups. Finally, to examine the possibility that CTC might not only reflect meaningful caregiver-child interaction but could act as a proxy for other child language characteristics (eg, volubility), we conducted a multiple linear regression analysis and controlled for the children’s phase I CV, language skills, and vocabulary size. See Table 2 for more details.
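The averaging and SES adjustment described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' analysis code: the text does not specify how the SES adjustment was computed, so a first-order partial correlation is used here as one common approach, and the function names and data layout are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def family_means(recordings: Sequence[Tuple[str, float]]) -> Dict[str, float]:
    """Average age-standardized scores within each family (one value per family)."""
    by_family: Dict[str, List[float]] = defaultdict(list)
    for family_id, score in recordings:
        by_family[family_id].append(score)
    return {fid: sum(vals) / len(vals) for fid, vals in by_family.items()}


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation of x and y after removing the linear effect of z (eg, SES)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here `x` would hold one family-level LENA value per child (eg, mean CTC), `y` an outcome score (eg, VCI), and `z` the SES proxy (mother's attained education, coded numerically).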
In Table 3, we summarize phase II assessment scores for the full sample and age subgroups. Pearson correlations with LENA measures are shown in Table 4. CTC was associated with VCI, PPVT, and EVT scores but did not significantly predict IQ. AWC predicted only VCI score. CTC and AWC had a strong concurrent relationship with each other (r = 0.74 [95% confidence interval (CI): 0.66 to 0.81]; R² = 0.55; P < .001).
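As a worked check on the CTC-AWC statistics above, the sketch below computes a confidence interval for a Pearson correlation with the Fisher z transformation. Two assumptions are made for illustration: the text does not state how the interval was derived, and n = 146 (the full phase II sample) is assumed to be the number of children behind this correlation.

```python
import math
from typing import Tuple


def fisher_ci(r: float, n: int, z_crit: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI for a Pearson r via the Fisher z transformation."""
    z = math.atanh(r)               # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)     # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


r = 0.74
lo, hi = fisher_ci(r, n=146)  # n = 146 is an assumption (full sample)
r_squared = r ** 2            # proportion of shared variance
```

Under these assumptions, the interval rounds to 0.66 to 0.81 and the shared variance to 0.55, in line with the reported values.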
Age Groups and SES
In subsequent analyses, we examined possible systematic variation related to child age. To generate a smoothed representation of the relationship between early language experience and later outcomes, a moving average age window was defined such that recording values were averaged within each family for each target age month ±3 months. For example, language values for age 18 months were computed as the average of available values from 15 to 21 months. In Fig 3, we display the relationships between CTC and/or AWC and the primary outcome measures for the resulting 7-month age window. The strongest relationships (solid lines indicate statistical significance) occurred in a middle period starting at ∼18 months old.
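The moving age window can be illustrated with a short sketch. The data layout (a list of age-in-months and value pairs for one family) and the function name are hypothetical; only the ±3-month rule comes from the description above.

```python
from typing import List, Optional, Tuple


def windowed_mean(
    recordings: List[Tuple[int, float]],
    target_month: int,
    half_width: int = 3,
) -> Optional[float]:
    """Mean of one family's values recorded within target_month ± half_width.

    Returns None when the family has no recordings in the window.
    """
    values = [v for age, v in recordings if abs(age - target_month) <= half_width]
    return sum(values) / len(values) if values else None


# Hypothetical family: (child age in months, standardized CTC)
family = [(14, 90.0), (15, 100.0), (18, 110.0), (21, 120.0), (22, 130.0)]
ctc_at_18 = windowed_mean(family, 18)  # uses recordings from 15 to 21 months
```

For the 18-month target, only the recordings at 15, 18, and 21 months contribute, which matches the 7-month window in the example given in the text.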
The sample was split into 3 exploratory age groupings (2–17 months, 18–24 months, and ≥25 months). Pearson correlations between language experience predictors and outcomes within each age group are provided in Table 4. Essentially, no significant relationships were observed for the 2-to-17–months and ≥25-months age groups. However, both CTC and AWC strongly predicted outcomes in the 18-to-24–months age group. Repeating these analyses, controlling for maternal attained education as a marker for SES (Table 5), revealed that correlations in this group remained significant for CTC with IQ, VCI scores, PPVT scores, and EVT scores, but the predictive power of AWC was weakened considerably.
Assessment of Sampling Issues
LENA metrics were generally derived from multiple recordings per family (Table 2). To test whether similar results held when using a single recording, we randomly selected 1 recording per family within each age group. Overall, correlation patterns were similar; magnitudes were reduced somewhat but remained significant. For example, the multiple recordings 18-to-24–months CTC-VCI correlation was r = 0.57 (P < .001); correlations computed from 2 random draws of 1 recording for this group were r = 0.43 and r = 0.47 (both P < .01).
CTC distributions were further examined to investigate whether increased predictiveness for the 18-to-24–months age group could have resulted from greater variance in the language environment compared with younger or older children. The Levene test statistic W was computed to compare the homogeneity of variance among age groups. CTC variance in the 18-to-24–months age group (SD = 15.5) was significantly larger than that in the younger group (SD = 10.1; W [1, 93] = 11.09; P = .001) and marginally so compared with the older group (SD = 12.6; W [1, 93] = 3.68; P = .06). Nine cases with the highest squared errors (compared with the group mean) were excluded from this group to achieve homogeneity of variance with the other age groups (W = 1.16; P = .32). Recomputed correlations (controlling for SES) between CTC and language outcomes (Table 5) were higher than with the cases included, supporting the interpretation that the predictive strength for this group did not derive solely from increased CTC variance.
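For readers unfamiliar with the Levene statistic, the following sketch computes W for any number of groups. It assumes the mean-centered form of the test; the text does not state whether deviations were taken from the group mean or the median, and the function name is hypothetical. With two groups, the degrees of freedom are 1 and N − 2, so the reported [1, 93] corresponds to 95 children across the two groups being compared.

```python
from typing import List, Sequence


def levene_w(*groups: Sequence[float]) -> float:
    """Levene's W using absolute deviations from each group's mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of each observation from its own group mean
    z_groups: List[List[float]] = []
    for g in groups:
        mean = sum(g) / len(g)
        z_groups.append([abs(v - mean) for v in g])

    z_means = [sum(z) / len(z) for z in z_groups]
    z_grand = sum(sum(z) for z in z_groups) / n_total

    # One-way ANOVA F ratio computed on the absolute deviations
    between = sum(len(z) * (zm - z_grand) ** 2 for z, zm in zip(z_groups, z_means))
    within = sum((v - zm) ** 2 for z, zm in zip(z_groups, z_means) for v in z)
    return (n_total - k) / (k - 1) * between / within
```

A larger W indicates less homogeneous variances; the P value would come from an F distribution with (k − 1, N − k) degrees of freedom, which is not computed here.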
The CTC-VCI relationship in the 18-to-24–months age group was the strongest observed. To evaluate whether this correlation could be accounted for by other child characteristics, we added 3 contemporaneously collected, potentially related measures of child language development.22 CTC was correlated significantly with CV, language skills, and vocabulary size (r = 0.84, 0.57, and 0.63, respectively; all P < .001). However, results from multiple linear regression analyses predicting VCI scores from CTC when controlling for these factors revealed that their addition to the model did not significantly alter the predictive power of CTC. In Table 6, we display regression metrics for 5 models: model 1, CTC only; model 2, CV only; model 3, CTC plus CV; model 4, CTC plus CV and language skills; and model 5, CTC plus CV, language skills, and vocabulary size (for a reduced sample of 27 participants for whom vocabulary size was available).
These results support the hypothesized relationship between early language experience and school-aged developmental and language outcomes by using an objective, automated method to estimate conversational turns and adult words in the language environment. However, the strength of the association based on automated counts was age dependent; in subgroup analyses, the mean CTC for young children, specifically between 18 and 24 months of age, predicted IQ, verbal comprehension, and expressive and receptive language skills at 9 to 13 years old. Importantly, these correlations remained significant after adjustments for SES or child language development, suggesting that the impact of increased early interaction on long-term developmental outcomes extends beyond the influence of socioeconomic factors and child skills.
It is possible that the automated procedure is not yet sensitive enough to capture significant relationships outside the 18-to-24–months age range and that improvement in automatic detection will make it possible to observe significant correlations across all early ages. It is also possible that, although not indicated in our transcriptional analysis, the automated procedure is particularly well suited to deliver accurate counts of conversational turns during the 18-to-24–months age range. In short, although these findings reveal a strong predictiveness of outcome measures by conversational turns in the 18-to-24–months age range, they cannot be used to rule out possible relationships at earlier or later ages.
However, assuming that a most sensitive period for prediction of language outcomes by early language experience really does begin at ∼18 months (with individual variation expected), these findings might be explained by a newly emergent cognitive process, such as a naming insight, or by a more mechanistic proposal not necessitating cognitive insights. Whatever the cause of the increased vocabulary growth and onset of combinatorial speech observed during this period, the data presented here support the possibility that developmental changes associated with CV complexity are concurrent with a particularly sensitive period for adult-child interaction.14 If specialized cognitive processes contribute to the onset of more frequent word usage based on symbolic reference to external entities, then perhaps during this period, children increasingly engage in especially impactful, referentially meaningful exchanges. These turn-taking exchanges may prepare the child’s cognitive and linguistic capacities for enhanced growth, as indicated by the correlational patterns depicted in Fig 3.19 An empirical evaluation of this possibility requires further investigation. Although little is known about the neuronal mechanisms underlying accelerated vocabulary acquisition after 18 months of age, the existence of sensitive periods for language acquisition suggests that brain architecture may be differentially receptive to environmental input at different periods in early childhood.34,–36 Researchers in numerous studies have explored such sensitive periods and the long-term effects that can result when normal patterns of experience are disturbed during development.37 The present work is consistent with research revealing that exposure to tailored experience during specific periods of early development may have important effects on later development.34,–36
In the current study, we expand on previous research by addressing certain limitations in Hart and Risley’s1,2 work. First, they reported a correlation between early word exposure and IQ at age 3 years but did not report results when controlled for SES. Second, although their AWCs correlated with 36-month IQ scores, turn-taking quantity did not, which is surprising given what is now known about the relationship between parent-child interactions and cognitive development.1,38 However, Hart and Risley’s1,2 “parent behavioral turns” differed from LENA’s conversational turns metric and included both verbal and nonverbal responses. Third, although additional research on a subset of the original Hart and Risley1,2 sample revealed measures of child language production (eg, mean length of utterance and child vocabulary use) to be correlated with developmental outcomes in elementary school, analyses on early AWCs and turn counts were not reported.39 With this study, we fill these gaps, demonstrating significant correlations between turn-taking interactions early in life and cognitive and speech-language skills at 9 to 13 years of age.
There are several possible reasons why CTC is better correlated with long-term developmental outcomes than AWC. AWC does not measure alternation and so is more likely to contain overheard speech, whereas CTC requires alternation between vocalizations of the child and adult speakers and so is more likely to include child-directed speech. Furthermore, unlike AWC, CTC incorporates child speech, which may be predictive of later developmental outcomes.40 This finding is supported by other reports revealing that adult-child turn-taking is more important to early development than is simple word exposure.25,41 Romeo et al23 specifically identified a possible neural mechanism, reporting that LENA conversational turns predicted functional MRI activation in language areas of the brain for 4- to 6-year-old children, whereas AWC did not. Their study represents the first empirical research linking a direct measure of neural functioning to early language environment and supports the current finding that turns are more strongly related to long-term outcomes than is simple exposure to adult words. Consequently, we suggest that the long-term predictiveness of turn-taking reported here, coupled with empirical evidence for its relationship to neural functioning, provides strong support for the pivotal role of the early language environment in healthy cognitive development.
Hart and Risley1,2 showed that the early language environment is important in predicting developmental outcomes. However, their laborious transcription methods severely limited most clinical applications. Automated analyses from daylong recordings are unquestionably easier to obtain than labor-intensive transcriptions, and the ability to predict long-term IQ and language skills even from a single recording has implications for developmental intervention and prevention programs. Potential issues in a child’s language experience may be identified early; if 1 or 2 recordings are completed before the 18-month well-infant visit and impoverished language environments are identified, families could be supported through appropriate intervention.
One limitation of these results is their correlational nature; although we refer to statistical predictiveness, we cannot infer causality. For example, other developmental changes occurring during the 18-to-24–months period may primarily account for later cognitive and language skills. Another limitation is that although the sample spanned a relatively large range of mother’s education levels, only 10 children were from the lowest-SES group. In addition, the sample is not ethnically diverse and includes only monolingual English speakers, so the generalizability of results to those of other languages and cultures is unknown.
Our findings support the concept that a child’s early language experiences may predict developmental outcomes years later. With this study, we expand on previous research by using an automated system to estimate language experience. Conversational turn-taking between the ages of 18 and 24 months was highly correlated with later language and cognitive skills. The use of automated recordings in the home language environment provides an objective and relatively noninvasive method for assessing the strengths and weaknesses of a child’s language environment and an opportunity to design individualized family feedback and offer education and support to enhance child development, potentially altering developmental trajectories, especially of children living in impoverished language environments.
AWC: adult word count
CTC: conversational turn count
EVT: Expressive Vocabulary Test
LENA: Language Environment Analysis
PPVT: Peabody Picture Vocabulary Test
VCI: Verbal Comprehension Index
WISC-V: Wechsler Intelligence Scale for Children, Fifth Edition
Dr Gilkerson conceptualized and designed the study, coordinated and supervised data collection, participated in data interpretation, drafted the initial manuscript, and reviewed and revised the manuscript; Mr Richards conceptualized and designed the study, conducted statistical analyses, participated in data interpretation, and reviewed and revised the manuscript; Drs Warren, Oller, and Vohr conceptualized the study, participated in data interpretation, and reviewed and revised the manuscript; Ms Russo conducted statistical analyses and reviewed and revised the manuscript; and all authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.
FUNDING: Funded by the LENA Research Foundation. Dr Oller’s contribution was funded by grant R01 DC011027 and the Plough Foundation.
COMPANION PAPER: A companion to this article can be found online at www.pediatrics.org/cgi/doi/10.1542/peds.2018-2234.
We gratefully acknowledge the Paul family for their wisdom and philanthropy, without which none of this work would have been possible. We are grateful to the members of the LENA Scientific Advisory board, the LENA parents and children, and the LENA employees past and present who contributed to this study – with specific gratitude to Rebecca Mills for diligently managing both study phases and to Joanna Lester for her tireless data collection and communication efforts during the second study phase.
POTENTIAL CONFLICT OF INTEREST: Dr Gilkerson, Mr Richards, and Ms Russo are full-time employees of the LENA Foundation, a 501(c)(3) public charity through which researchers developed and distribute the automated approach used to analyze the data described here. The salaries of LENA Foundation scientists are in no way associated with data analyses or research results; Drs Warren, Oller, and Vohr have indicated they have no potential conflicts of interest to disclose.
FINANCIAL DISCLOSURE: Dr Gilkerson, Mr Richards, and Ms Russo are full-time employees of the LENA Foundation. The LENA Foundation is a nonprofit 501(c)(3) public charity that has been committed to conducting sound scientific research for the past decade. Drs Warren, Oller, and Vohr are unpaid members of the LENA Foundation Scientific Advisory Board. Advisory board members were paid consultants for Infoture, Inc from 2006 to 2009. In February 2009, Infoture, Inc was dissolved and gifted to the LENA Foundation. Since then, all advisory board members have been unpaid volunteers.