The objective of this study was to evaluate the usefulness of covariates in identifying birth records with implausible values of gestational age. Birthweight distributions for births with early reported gestational ages are markedly bimodal, suggesting a mixture of two distributions. Most births form a normal-shaped left-hand (primary) distribution, and a smaller number form the right-hand (secondary) distribution. The births in the secondary distribution are thought to have misreported gestational ages. Prior work has found that births in the secondary distribution are at higher risk of poor outcomes than those in the primary distribution. Using 2002 US Natality data for gestational ages 26–35 weeks, we fit normal mixture models to birthweight, by reported gestational age, with and without covariates (maternal race, education, parity, age, region of the country, prenatal care initiation). Additional models were stratified by infant sex. This approach allowed the relationship between the covariates and birthweight to differ between the components.

Mixture models fit reasonably well for reported gestational ages <33 weeks, but not for later weeks. Contrary to the hypothesis, results were similar for models with and without covariates, stratification, or both, although stratified models without covariates predicted slightly more girls and slightly fewer boys in the secondary distribution than did the corresponding unstratified models. For reported gestational ages <33 weeks, predictions from the four sets of models were highly correlated, and predictions were similar for subgroups defined by the clinical estimates of gestational age and other covariates. For births with reported gestational ages of 29 or more weeks, the proportion in the secondary distribution exceeded 30%, although this varied by maternal characteristics.
The use of covariates and stratification complicated model fitting without materially improving identification of implausible gestational age values, supporting inferences from prior studies using data ‘cleaned’ without consideration of maternal or infant characteristics.
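
The mixture-model idea in the abstract can be illustrated with a minimal sketch. The Python code below fits a two-component normal mixture, without covariates, to simulated birthweights using expectation-maximization (EM), then flags records that are more likely to belong to the right-hand (secondary) component. It is not the authors' implementation: the abstract does not state the fitting procedure, and the function name `fit_two_normal_mixture`, the simulated means, standard deviations, 20% mixing proportion, and the 0.5 posterior cutoff are all assumptions chosen only for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normal_mixture(x, n_iter=200):
    """EM for a two-component univariate normal mixture (no covariates).

    Returns (weights, means, sds, post), where post[i] is the posterior
    probability that record i belongs to component 1. The starting values
    place component 1 to the right of component 0.
    """
    xs = sorted(x)
    half = len(xs) // 2
    # Crude starting values: means of the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    grand = sum(xs) / len(xs)
    sd0 = math.sqrt(sum((v - grand) ** 2 for v in xs) / len(xs))
    sds = [sd0, sd0]
    weights = [0.5, 0.5]
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of component 1 for each record.
        post = []
        for v in x:
            d0 = weights[0] * normal_pdf(v, means[0], sds[0])
            d1 = weights[1] * normal_pdf(v, means[1], sds[1])
            post.append(d1 / (d0 + d1))
        # M-step: update means, SDs, and mixing weights from the posteriors.
        n1 = sum(post)
        n0 = len(x) - n1
        means = [
            sum((1 - p) * v for p, v in zip(post, x)) / n0,
            sum(p * v for p, v in zip(post, x)) / n1,
        ]
        sds = [
            math.sqrt(sum((1 - p) * (v - means[0]) ** 2 for p, v in zip(post, x)) / n0),
            math.sqrt(sum(p * (v - means[1]) ** 2 for p, v in zip(post, x)) / n1),
        ]
        weights = [n0 / len(x), n1 / len(x)]
    return weights, means, sds, post


# Simulated data only (hypothetical values, not taken from the study):
# 80% "primary" births near 1200 g and 20% "secondary" births near 3200 g.
rng = random.Random(0)
truth = [rng.random() < 0.2 for _ in range(2000)]
birthweight = [rng.gauss(3200, 450) if s else rng.gauss(1200, 250) for s in truth]

weights, means, sds, post = fit_two_normal_mixture(birthweight)

# Flag records whose posterior probability of the secondary component
# exceeds 0.5 (an assumed cutoff).
flagged = [p > 0.5 for p in post]
```

In this sketch `weights[1]` estimates the proportion of records in the secondary distribution and `flagged` marks the records that would be treated as having implausible gestational ages. Extending it to covariates would mean letting each component's mean depend on a regression on maternal characteristics, which is beyond this illustration.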
The use of covariates to identify records with implausible gestational ages using the birthweight distribution
Parker, J. D., Liao, D., Schenker, N., & Branum, A. (2010). The use of covariates to identify records with implausible gestational ages using the birthweight distribution. Paediatric and Perinatal Epidemiology, 24(5), 424-432. https://doi.org/10.1111/j.1365-3016.2010.01138.x