Definitive guidelines regarding correction for prematurity13 are lacking. The American Academy of Pediatrics recommends correction to age 3 years,4,5 but this is not implemented consistently. Correction becomes more critical as the age of viability decreases.

Differences between corrected and uncorrected scores are greatest in infants born <28 weeks.6,7 Rates of disabilities also increase at these gestational ages, raising the question of whether correction simply masks developmental deficits,8 causing them to be misinterpreted as temporary delays. Correction is applicable to all preterm infants; children with lower gestational ages at birth require extended correction.

The rapid developmental growth curve in infancy requires narrow normative age bands, whereas childhood tests have age bands spanning several months. The assumption is that development is more rapid early on and that the slope flattens at later ages. On the Bayley-4,9 age bands are 10 days wide up to 5 months, 15 days; 30 days wide up to 36 months, 15 days; and 3 months wide up to 42 months, 30 days. For example, a child born at 26 weeks and tested at 12 months receives a corrected standard score of 100, but the uncorrected score is 85 (a 1 SD difference).
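The arithmetic behind age correction can be sketched in a few lines of Python. This is a minimal illustration, assuming a 40-week term reference; the function name, the constant, and the example dates are hypothetical and do not come from any test manual.

```python
from datetime import date

TERM_WEEKS = 40  # assumed full-term reference (an assumption of this sketch)


def corrected_age_days(birth: date, test: date, gestational_weeks: int) -> int:
    """Chronological age in days minus the number of days born before term."""
    chronological_days = (test - birth).days
    # Infants born at or after term receive no correction.
    days_early = max(0, TERM_WEEKS - gestational_weeks) * 7
    return chronological_days - days_early


# Hypothetical child born at 26 weeks and tested exactly 1 year later.
birth_date = date(2023, 1, 10)
test_date = date(2024, 1, 10)
print((test_date - birth_date).days)                     # 365 (chronological)
print(corrected_age_days(birth_date, test_date, 26))     # 267 (corrected)
```

In this example the corrected age is 98 days (14 weeks) younger than the chronological age, so corrected and uncorrected scores are drawn from normative bands more than 3 months apart.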

With early childhood testing, changes in the width of normative age bands are often overlooked. The Stanford-Binet-V for Early Childhood10 has 1-month age bands from 2 to 5 years and 3-month bands from 5 years onward. Therefore, if outcome is assessed at 5 years, some children would fall into a 1-month band, whereas others would be compared against a 3-month band. The Differential Abilities Scales-II11 has 3-month age bands up to 8 years, 11 months. Despite consistent band widths, the difference between chronological and corrected age in children born <28 weeks could span >1 age band. The Wechsler Preschool and Primary Scales of Intelligence IV12 has age bands ranging from 3 to 6 months, again adding variability.

With uncorrected scores, very or extremely preterm infants are at a greater disadvantage if their age falls at the beginning of a 3-month age band because the child could be 2 to 3 months younger than one whose age falls at the upper end of the age band.13 Scores may also change when age correction is used at younger ages but discontinued at the later chronological ages of 3 to 4 years.

Norming groups are typically established on the basis of samples within discrete age bands that can vary widely. Discontinuities, evident in gaps or abrupt jumps in norms tables, may create a situation where the raw score yields the same scaled score for children within the age band but then jumps markedly for children in the next age band. Moreover, the test score distribution is only constant within the age band when it is very narrow and the growth curve is not too steep. Selecting the size of the age band for norms creation is equal parts psychometric science and clinical art.

The jumps between norms at successive age levels are exacerbated when correcting for prematurity. Raw test scores can have different interpretations with a difference of only 1 or 2 days in age. For example, on the Bayley-4, if the child’s age was 12 months, 15 days, a raw Cognitive score of 57 yields a standard score of 85 (16th percentile); at 12 months, 16 days, the same raw score yields a standard score of 80 (ninth percentile), a 0.33 SD difference.14 At some ages, the effect of a 1-day difference can be more substantial and could mean the difference between a moderate versus severe neurodevelopmental impairment classification.
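The percentile and SD figures in this example can be checked with a short Python sketch. It assumes standard scores follow a normal distribution with a mean of 100 and an SD of 15 (consistent with the 1 SD difference between 100 and 85 noted earlier); the function names are illustrative only.

```python
from statistics import NormalDist

# Assumed composite-score metric: mean 100, SD 15.
NORMS = NormalDist(mu=100, sigma=15)


def percentile(standard_score: float) -> float:
    """Percentile rank of a standard score under the assumed normal metric."""
    return 100 * NORMS.cdf(standard_score)


def sd_difference(score_a: float, score_b: float) -> float:
    """Absolute difference between two standard scores in SD units."""
    return abs(score_a - score_b) / NORMS.stdev


print(round(percentile(85)))             # 16
print(round(percentile(80)))             # 9
print(round(sd_difference(85, 80), 2))   # 0.33
```

Under these assumptions, a 5-point drop from a 1-day change in age moves the child from the 16th to the ninth percentile.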

One possible solution is to apply linear regression to predict average growth curves within and across age bands. Referred to as “continuous norming,” this approach may reduce the effect of wide window norms on correction.15,16  This technique is particularly useful if the growth curve is steep, such as in the earliest months and years of development. The regression curve is plotted across the means by age; similarly, a curve is also fitted to the SDs across age. The fitted mean and fitted SD at any given day, week, or month of age can be used to produce norms for that age. The regression models also take into account the shape of the data distribution. The fitted mean and SD could be averaged to produce norms for different intervals (eg, weekly). Continuous norming does not assume development is static within an age band but conceptualizes age as continuous and models raw scores as a smooth, nonlinear function of age.
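The general idea can be illustrated with a deliberately simplified Python sketch. It fits one least-squares line to band means and another to band SDs as functions of log age, then scores a raw value against the fitted mean and SD for the child's exact age in days. The band values below are invented for illustration and are not Bayley-4 data; the log-age model and the assumption of normally distributed scores are simplifications, and published continuous norming methods are more elaborate.

```python
import math

# Hypothetical band-level norms (NOT real test data): band midpoint age in
# days, with the mean and SD of raw scores observed in each band.
BAND_AGE_DAYS = [100, 200, 400, 800]
BAND_MEAN_RAW = [20.0, 30.0, 40.0, 50.0]
BAND_SD_RAW = [4.0, 5.0, 6.0, 7.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Regressing on log(age) gives a curve that is steep early and flattens later.
LOG_AGES = [math.log(age) for age in BAND_AGE_DAYS]
MEAN_FIT = fit_line(LOG_AGES, BAND_MEAN_RAW)
SD_FIT = fit_line(LOG_AGES, BAND_SD_RAW)


def fitted(fit, age_days):
    """Evaluate a fitted line at a given age (valid within the sampled range)."""
    intercept, slope = fit
    return intercept + slope * math.log(age_days)


def standard_score(raw, age_days, mean=100.0, sd=15.0):
    """Standard score from the fitted mean and SD at the child's exact age."""
    z = (raw - fitted(MEAN_FIT, age_days)) / fitted(SD_FIT, age_days)
    return mean + sd * z


# The same raw score on consecutive days changes only slightly.
print(round(standard_score(40, 400), 1))
print(round(standard_score(40, 401), 1))
```

Because age enters the model as a continuous variable, a 1-day change in age shifts the standard score by a fraction of a point rather than producing the abrupt jump seen at an age band boundary.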

Advantages of regression-based continuous norming are that norms are more realistic and smaller sample sizes are needed.16 However, the number of tables necessary for the age interval selected requires computerized scoring, which is a drawback. As a result, weekly or monthly norms should be considered for young and older children, respectively.

A major consideration is whether assessment is for clinical or research purposes. Corrected scores should be used into adolescence in research studies, but uncorrected scores are appropriate for clinical purposes in children aged >5 years. The child’s level of function is important: the likelihood of a major difference between corrected and uncorrected scores is minimal at ≥2 SDs below average. The effect of adjustment will be greater in children who are borderline to low average or above. In many venues, it is helpful to provide both corrected and uncorrected scores.1,6

The appropriateness of correction may depend on whether cognitive, motor, or language domains are involved.6 Age adjustment for prematurity was reported to overrate motor performance, but not cognitive performance, on the Bayley-III17 12-month assessment. Other studies reported conflicting results, with correction being needed to bring motor milestones more in line with normative data.18 Language performance data are inconclusive.8 Using chronologic versus corrected age will affect categorical diagnoses and eligibility for intervention services.

In a theoretical model, Wilson-Ching et al2 used the Bayley-III normative data to identify cognitive raw scores that produced baseline standard scores of 70, 85, and 100, and compared these scores for 1, 2, 3, and 4 months of prematurity (that is, determining the standard score the same raw score would yield if the child were 1 to 4 months older). These comparisons were made at 6, 12, 24, and 36 months. Differences between scores at 1 and 2 months of prematurity washed out after 2 years; however, at 3 and 4 months of hypothetical prematurity, a 5-point difference in the cognitive composite score was still found at 36 months.

Aylward1 employed a similar approach with Bayley-4 data, evaluating cognitive, motor, and language function at 6, 12, 24, and 36 months. Using a cutoff of a 3-point difference in scores (0.20 SD), correction was needed at all ages over the first 3 years for language and motor function. With respect to the cognitive domain, correction was needed up through 24 months, but cognitive age adjustment was necessary at 36 months for those born 4 months early.

These data were obtained from two nationally representative, contemporary data sets of typically developing children without medical risk; however, the samples did not include preterm infants. Nonetheless, these data suggest a greater likelihood of correction addressing delay versus deficit.

  • Larger differences between corrected and noncorrected scores occur particularly in infants born at younger gestational ages (≥3 months premature).

  • The choice of later tests and their psychometric properties affect outcome classifications. The width of age bands at the time of evaluation and small differences in age can cause major discrepancies in scores, placing younger children at a distinct disadvantage if they fall at the beginning of a several-month age band. This may or may not be evened out in larger samples. A mean 5-point (0.33 SD) difference can be found between 2 normative cutoffs that differ by as little as 1 day.

  • In preterm infants, it is likely that both delay and deficit influence the accuracy of correction. It is impossible to determine the contribution of each to a particular score, although correction seems to primarily address delay. This reaffirms the need for serial evaluations.

  • Application of the continuous norming approach should be considered. The main restriction is the need for computer-generated tables, the number of which depends on the age interval selected.

  • It is recommended that both corrected and uncorrected scores be recorded at this time. This is an area for further research.

  • The context of the assessment determines the appropriateness of correction. Clinically, it appears necessary over the first 3 years for language and motor function. For the cognitive domain, correction to 2 years seems appropriate for children born up to 3 months early; for those born 4 months premature, correction is appropriate to age 3. In research studies, correction through adolescence is endorsed. This recommendation depends on the difference between corrected and uncorrected scores that the examiner deems significant. Attempts to develop complex algorithms or partial correction are unsupported.19
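The clinical and research recommendations in the final point above can be written out as a simple rule, sketched here in Python. This is only an illustration of the logic as stated, not a clinical tool. The function name and arguments are hypothetical, children born more than 3 months early are treated like those born 4 months early (an assumption), and the research branch has no upper age limit because none is specified.

```python
def correction_supported(domain: str, age_months: float,
                         months_premature: float,
                         purpose: str = "clinical") -> bool:
    """Return True if the summary recommendations support age correction."""
    if purpose == "research":
        # Correction through adolescence is endorsed; no upper age is given,
        # so this sketch does not impose one.
        return True
    if domain in ("language", "motor"):
        # Clinically, correction appears necessary over the first 3 years.
        return age_months <= 36
    if domain == "cognitive":
        # To 2 years if born up to 3 months early; to 3 years otherwise
        # (assumption: >3 months early is treated as 4 months early).
        limit_months = 24 if months_premature <= 3 else 36
        return age_months <= limit_months
    raise ValueError(f"unknown domain: {domain!r}")


print(correction_supported("cognitive", 30, 3))   # False
print(correction_supported("cognitive", 30, 4))   # True
print(correction_supported("motor", 30, 2))       # True
```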

Investigators should develop a consensus regarding correction to allow comparability across different cohorts. Gestational age at birth, age at time of assessment, medical/biologic issues, and characteristics of a given test’s normative data must be considered. Comparing preterm infants’ developmental slopes to those of full-term peers to determine the ages at which correction is needed is not as straightforward as it might seem.

The author thanks Drs Peter J. Anderson and Larry Weiss for their helpful review of the manuscript.

Correction for prematurity is controversial. There are multiple issues that can affect assessment and outcome data. Possible solutions are outlined.

FUNDING: No external funding.

CONFLICT OF INTEREST DISCLAIMER: Dr Aylward is the author of the Bayley-4 and receives royalties from the test publisher, Pearson.

1. Aylward GP. Is it correct to correct for prematurity? Theoretic analysis of the Bayley-4 normative data. J Dev Behav Pediatr. 2020;41(2):128–133.
2. Wilson-Ching M, Pascoe L, Doyle LW, Anderson PJ. Effects of correcting for prematurity on cognitive test scores in childhood. J Paediatr Child Health. 2014;50(3):182–188.
3. Wilson SL, Cradock MM. Review: accounting for prematurity in developmental assessment and the use of age-adjusted scores. J Pediatr Psychol. 2004;29(8):641–649.
4. Medical Home Initiatives for Children With Special Needs Project Advisory Committee. American Academy of Pediatrics. The medical home. Pediatrics. 2002;110(1 Pt 1):184–186.
5. Bernbaum JC, Campbell DE, Imaizumi SO. Follow-up care of the graduate from the neonatal intensive care unit. In: McInerny T, ed. American Academy of Pediatrics Textbook of Pediatric Care. Elk Grove Village, IL: American Academy of Pediatrics; 2009:867–882.
6. Morsan V, Fantoni C, Talladini MA. Age correction in cognitive, linguistic, and motor domains for infants born preterm: an analysis of the Bayley Scales of Infant and Toddler Development Third Edition developmental patterns. Dev Med Child Neurol. 2018;60(8):820–825.
7. Harel-Gadassi A, Friedlander E, Yaari M, et al. Developmental assessment of preterm infants: Chronological or corrected age? Res Dev Disabil. 2018;80:35–43.
8. Parekh SA, Boyle EM, Guy A, et al. Correcting for prematurity affects developmental test scores in infants born late and moderately preterm. Early Hum Dev. 2016;94:1–6.
9. Bayley N, Aylward GP. Bayley Scales of Infant and Toddler Development–Fourth Edition Administration Manual. Bloomington, MN: NCS Pearson; 2019.
10. Roid GH. The Stanford Binet 5 for Early Childhood. Itasca, IL: Riverside; 2005.
11. Elliott CD. The Differential Abilities Scales-II. Bloomington, MN: NCS Pearson; 2007.
12. Wechsler D. Wechsler Preschool and Primary Scales of Intelligence, 4th ed. Bloomington, MN: NCS Pearson; 2012.
13. van Veen S, Aarnoudse-Moens CS, van Kaam AH, Oosterlaan J, van Wassenaer-Leemhuis AG. Consequences of correcting intelligence quotient for prematurity at age 5 years. J Pediatr. 2016;173:90–95.
14. Aylward GP, Zhu J. The Bayley Scales: clarification for clinicians and researchers. [White paper] Bloomington, MN: NCS Pearson; 2019.
15. Lenhard A, Lenhard W, Suggate S, Segerer R. A continuous solution to the norming problem. Assessment. 2018;25(1):112–125.
16. Lenhard A, Lenhard W, Gary S. Continuous norming of psychometric tests: a simulation study of parametric and semi-parametric approaches. PLoS One. 2019;14(9):e0222279.
17. Barrera ME, Rosenbaum PL, Cunningham CE. Corrected and uncorrected Bayley scores: longitudinal developmental patterns in low and high birth weight preterm infants. Infant Behav Dev. 1987;10:337–346.
18. Romeo DM. Correcting for prematurity with the Bayley Scales of Infant Development. Dev Med Child Neurol. 2018;60(8):736–737.
19. Lems W, Hopkins B, Samson JF. Mental and motor development in preterm infants: the issue of corrected age. Early Hum Dev. 1993;34(1-2):113–123.