Research Article | Open Access (CC BY-NC-ND)

Toward a more holistic approach to the study of exposures and child outcomes

    Barry M Lester

    *Author for correspondence: Tel.: +1 401 453 7640;

    E-mail Address: barry_lester@brown.edu

    Department of Pediatrics, Brown Alpert Medical School & Women & Infants Hospital, Providence, RI 02905, USA

    Brown Center for the Study of Children at Risk, Brown Alpert Medical School & Women & Infants Hospital, Providence, RI 02905, USA

    Department of Psychiatry & Human Behavior, Brown Alpert Medical School, Providence, RI 02905, USA

    ,
    Marie Camerota

    Brown Center for the Study of Children at Risk, Brown Alpert Medical School & Women & Infants Hospital, Providence, RI 02905, USA

    Department of Psychiatry & Human Behavior, Brown Alpert Medical School, Providence, RI 02905, USA

    ,
    Todd M Everson

    Department of Environmental Health, Emory University Rollins School of Public Health, Atlanta, GA 30322, USA

    ,
    Coral L Shuster

    Department of Pediatrics, Brown Alpert Medical School & Women & Infants Hospital, Providence, RI 02905, USA

    Brown Center for the Study of Children at Risk, Brown Alpert Medical School & Women & Infants Hospital, Providence, RI 02905, USA

    &
    Carmen J Marsit

    Department of Environmental Health, Emory University Rollins School of Public Health, Atlanta, GA 30322, USA

    Published Online: https://doi.org/10.2217/epi-2023-0424

    Abstract

    Aim: The current work was designed to demonstrate the application of the exposome framework in examining associations between exposures and children's long-term neurodevelopmental and behavioral outcomes. Methods: Longitudinal data were collected from birth through age 6 from 402 preterm infants. Three statistical methods were utilized to demonstrate the exposome framework: exposome-wide association study, cumulative exposure and machine learning models, with and without epigenetic data. Results: Each statistical approach answered a distinct research question regarding the impact of exposures on longitudinal child outcomes. Findings highlight associations between exposures, epigenetics and executive function. Conclusion: Findings demonstrate how an exposome-based approach can be utilized to understand relationships between internal (e.g., DNA methylation) and external (e.g., prenatal risk) exposures and long-term developmental outcomes in preterm children.

    It is well known that events that occur in early development can result in long-term adverse health outcomes [1]. Epidemiologic research and genome-wide association studies indicate that environmental factors may account for, in some cases, 85% or more of chronic disease risk [2–5]. While much of this is attributed to cigarette smoking; alcohol; excess caloric, sugar or fat consumption; and psychosocial exposures, a considerable amount of disease burden is currently of unknown origin. Despite significant advances in pediatric research, our understanding of how prenatal and early life environments influence children's health and well-being is inadequate, and generally focuses on limited specific exposures or exposure classes, inhibiting our ability to develop interventions to mitigate adverse outcomes. The irony is that many of the conditions that affect children's health and well-being are related to modifiable factors in the environment including air pollution, chemicals, stress, poor nutrition, poverty and other social factors. The American Academy of Pediatrics and the American College of Obstetricians and Gynecologists have recognized the influence of exposure to environmental factors early in life, especially during periods of developmental plasticity, on the health and well-being of children [6,7].

    The concept of the exposome originated in the fields of cancer epidemiology and toxicology, initially focusing on chemical exposures to provide a complement to the human genome [8], as an agnostic approach to investigating the environmental causes of chronic diseases. It has evolved to represent the systematic and comprehensive analysis of the nongenetic factors influencing health over the life course [9]. Exposome-related concepts consider that the burden of chronic disease originates from an accumulation of exposures from the environment, diet, behavior and related endogenous processes over various life stages [10].

    Over time, the operationalization of the exposome has expanded beyond toxicants to include a broad definition of environmental exposures such as psychosocial and physical factors, societal exposures such as racism and socioeconomic status, and biological responses to exposures such as epigenetics and metabolomics. Taken together, such exposures define an individual's exposome, the totality of environmental exposures experienced across a lifespan, the response of biological systems to those exposures, broader societal factors that impact the environment experienced by an individual and how an individual may respond or interact with that environment. Important progress in uncovering the various aspects of the exposome is currently being made by research projects worldwide, including LifeCycle, HELIX, EXPOsOMICS, HEALS, CHEAR/HHEAR and the Emory HERCULES Exposome Research Center [11,12], which have expanded the concept of the exposome to include societal drivers and the impacts of structural factors and racism [13].

    Studying the exposome poses challenges given the myriad ways that it can be defined, for individuals, communities, populations and their specific contexts, as well as the complexity of assessing cumulative histories of exposures. Environmental exposures to substances such as pollutants, metals and chemicals interact with other exposome components such as diet, gut and other microbiomes, nonchemical stressors (e.g., socioeconomic conditions, social support, stress, sleep quantity and quality), medications and exercise, all in the context of a unique genome, so assessment of the true contribution of environmental exposures to disease is complex. The incorporation of biological responses as part of the exposome definition is important, as it provides an opportunity to consider how an individual can have a unique response to the experienced environment based on those complex interactions [12,14,15]. The inclusion of epigenetics also allows for consideration of the potential independent or mutual effects of environment and biology on long-term outcomes. Additionally, structural and societal factors define the environment that is experienced by an individual and contribute to the nonchemical stressors that interact with and build upon those exposures and may indicate an additional opportunity for intervention. The exposome therefore provides a unifying but expansive model in environmental health, linking exposures across space, time and behavior to health outcomes. While the full exposome is not directly measurable, the concept provides a framework for linking multiple types of data to describe the past, current and projected exposures of an individual and how these exposures may link to health and disease.

    Applying the exposome concept to better understand children's health and development has tremendous potential to move the field forward, particularly in the area of developmental origins of health and disease, which hypothesizes that early-life exposures can program metabolic, immunologic, epigenetic or physiologic responses that have persisting effects on health [16,17]. The ability to assess the totality of exposures starting in the prenatal period and children's biological responses to these exposures during a period of heightened sensitivity and plasticity could lead to novel insights and open the door for new opportunities to promote optimal health and well-being beginning early in life. However, exposome research to date has been conducted mostly with adult populations and has focused on physical health and disease outcomes.

    The purpose of this work is twofold: first, to show how the exposome framework can be applied to studies of children's health where ‘health' is broadly defined, moving from the typical disease model to also include behavioral and neurodevelopmental outcomes while focusing on both positive and negative outcomes; second, to propose a roadmap for conducting these types of studies by demonstrating various statistical approaches that can be applied to assess the relationship between the exposome and child neurobehavioral outcomes. This framework (Figure 1) elaborates the exposome through the addition of factors that provide a more complete understanding of children's health. This includes the addition of protective factors and resilience, to bring to light opportunities to improve health. It also highlights the importance of diversity, including neurodiversity, as these individual attributes can affect how the environment is experienced or how the environment impacts health, both positively and negatively [12].

    Figure 1. Exposome model.

    The exposome framework is also specifically applied to an exemplary study of very preterm infants, a feature that allows the study of modifiable risk factors for outcomes that are more frequent in this population, including neurodevelopmental impairment [18], and provides the opportunity to study the relative contribution of pre- and post-neonatal intensive care unit (NICU) factors in the development of adverse child health outcomes. The study of very preterm infants via the exposome model also fits well in the field of developmental origins of health and disease, as we can investigate both internal and external exposome features and relate them to prospective outcomes. Finally, little is understood about why many preterm infants, despite having severe medical problems, have positive health and neurodevelopmental outcomes. Applying the exposome model to preterm infants may help us better understand the combination of environmental and biological factors that lead to both positive and negative health outcomes in this, and perhaps other, at-risk populations. From a public health perspective, the study of protective factors in the environment and resiliency in the child may provide insights into interventions to mitigate adverse outcomes.

    Methods

    Study population

    Participants came from the Neonatal Neurobehavior and Outcomes in Very Preterm Infants (NOVI) study. Infants and caregivers were enrolled at nine university-affiliated NICUs in Providence, RI (n = 116), Grand Rapids, MI (n = 129), Kansas City, MO (n = 86), Honolulu, HI (n = 112), Winston-Salem, NC (n = 147) and Torrance and Long Beach, CA (n = 114) from April 2014 through June 2016. These NICUs also participated in the Vermont Oxford Network. Eligibility was determined based on the following inclusion criteria: birth at <30 weeks gestational age (GA); parental ability to read and speak English or Spanish; and residence within 3 h of the NICU and follow-up clinic. Exclusion criteria included maternal age <18 years, maternal cognitive impairment, maternal or infant death and major congenital anomalies, including CNS, cardiovascular, gastrointestinal, genitourinary, chromosomal and other serious anomalies. Parents of eligible infants were invited to participate in the study when survival to discharge was determined to be likely by the attending neonatologist. Mothers provided written informed consent. All study methods were approved by the Institutional Review Board at each site and all research was performed following guidelines regarding ethics in human subjects research. Characteristics of all mothers and infants enrolled in the NOVI study are shown in Table 1. The current analysis includes 402 infants with data on exposures and at least one neurobehavioral outcome (Table 1).

    Table 1. Maternal and infant characteristics of participants in the Neonatal Neurobehavior and Outcomes in Very Preterm Infants study.
    Maternal characteristics | Full sample (N = 617) | Included (N = 346) | Excluded (N = 271) | p-value
     | M (SD) or % (n) | M (SD) or % (n) | M (SD) or % (n) |
    Maternal race | | | |
      American–Indian/Alaska Native | 0.16% (1/617) | 0% (0/346) | 0.37% (1/271) | 0.90
      Asian | 3.7% (23/617) | 4.6% (16/346) | 2.6% (7/271) | 0.27
      Native Hawaiian/other Pacific Islander | 1.3% (8/617) | 0.58% (2/346) | 2.2% (6/271) | 0.15
      Black or African–American | 20% (126/617) | 18% (61/346) | 24% (65/271) | 0.07
      White | 42% (261/617) | 47% (161/346) | 37% (100/271) | 0.02
      More than one race | 22% (136/617) | 21% (72/346) | 24% (64/271) | 0.46
      Unknown/not reported | 10% (62/617) | 9.8% (34/346) | 10% (28/271) | 0.94
    Hispanic/Latino ethnicity | 23% (142/617) | 25% (85/346) | 21% (57/271) | 0.35
    Low SES: Hollingshead level 5 | 9.6% (58/605) | 9.1% (31/342) | 10% (27/263) | 0.72
    Maternal education <HS/GED | 13% (78/604) | 12% (41/341) | 14% (37/263) | 0.54
    No partner | 25% (152/605) | 23% (78/342) | 28% (74/263) | 0.16

    Neonatal characteristics | Full sample (N = 704) | Included (N = 402) | Excluded (N = 302) | p-value
     | M (SD) or % (n) | M (SD) or % (n) | M (SD) or % (n) |
    Multiple gestation | 26% (184/697) | 29% (116/401) | 23% (68/296) | 0.09
    Vaginal delivery | 29% (201/696) | 26% (106/400) | 32% (95/296) | 0.13
    Severe retinopathy of prematurity | 5.9% (41/697) | 5.7% (23/401) | 6.1% (18/296) | 0.98
    Necrotizing enterocolitis/sepsis | 18% (128/697) | 17% (70/401) | 20% (58/296) | 0.53
    Bronchopulmonary dysplasia | 51% (357/697) | 50% (199/401) | 53% (158/296) | 0.37
    Serious brain injury | 13% (92/694) | 11% (45/400) | 16% (47/294) | 0.09
    Sex = male | 55% (390/704) | 54% (217/402) | 57% (173/302) | 0.43
    PMA at birth (weeks) | 27.01 (1.91) | 26.99 (1.88) | 27.04 (1.96) | 0.73
    Head circumference (cm) | 24.46 (2.43) | 24.39 (2.33) | 24.56 (2.56) | 0.38
    PMA at NICU discharge (weeks) | 40.53 (5.43) | 40.33 (4.6) | 40.81 (6.39) | 0.28
    Length of NICU stay (days) | 94.16 (44.1) | 93.24 (38) | 95.42 (51.3) | 0.54
    Birth weight (grams) | 948.3 (281) | 939.2 (273) | 960.5 (290) | 0.33
    Weight at discharge (grams) | 3014 (905) | 3020 (876) | 3005 (944) | 0.83

    Serious brain injury included parenchymal echodensity, periventricular leukomalacia or ventricular dilation diagnosed via cranial ultrasound.

    GED: General equivalency diploma; HS: High school; M: Mean; NICU: Neonatal intensive care unit; PMA: Postmenstrual age; SD: Standard deviation; SES: Socioeconomic status.

    †Bold text indicates a statistical difference between participants included and excluded from analyses.

    Measures

    Exposures

    Exposures included a set of 24 pre- and peri-natal risk variables self-reported by mothers or obtained via medical record abstraction (Supplementary Table 1). These included maternal demographic (e.g., education, age), medical (e.g., pre-eclampsia, hypertension) and substance use (e.g., tobacco, alcohol) variables. Additional information about these variables and their prior use in NOVI is described elsewhere [19].

    Outcomes

    Several neurobehavioral outcomes assessed when children were between 4 and 6 years of age were examined. These included both behavioral and neurodevelopmental positive and negative outcomes to demonstrate the wide range of domains that can be studied with an exposome framework, especially as applied to children.

    Social Responsiveness Scale, 2nd Edition

    The Social Responsiveness Scale, 2nd Edition (SRS-2; n = 369), is a 65-item caregiver-reported measure of social impairment in children ages 2 to 6.5 years old [20]. Impairments identified by this measure are often associated with autism spectrum disorder (e.g., not being able to communicate feelings, avoiding eye contact, avoiding social interactions with peers or adults), with results from this measure aiming to differentiate autism from other developmental disorders. Higher scores on the SRS-2 are indicative of greater social impairment. The outcome measure in the current work was the total impairment t-score. Prior research shows that this measure demonstrates strong validity, reliability and sensitivity [21,22].

    Conners Kiddie Continuous Performance Test

    The Conners Kiddie Continuous Performance Test (K-CPT; n = 183) is a performance-based assessment of attention deficits in children ages 4 to 7 years old [23]. Children view a continuous series of images and are asked to press a computer key every time they see a target object, except when the object is a soccer ball (nontarget). T-scores are provided for three error types: omissions (failing to respond to a target), commissions (responding to a nontarget) and perseverations (indicative of anticipatory responding). All three error types were examined as measures of inattention. Psychometric studies report acceptable reliability and validity of the K-CPT [24].

    Childhood Behavior Questionnaire

    The Childhood Behavior Questionnaire, Very Short Form (CBQ; n = 387), is a parent-report measure of child temperament for children ages 3 to 7 years old [25]. Parents rate their children's reactions to 36 scenarios to measure surgency, negative affect and effortful control. Summary scores for all three domains were assessed. Psychometric studies report acceptable reliability and validity of the CBQ [26].

    NIH Toolbox Early Cognition Battery

    The NIH Toolbox Early Cognition Battery (NIH-TB) assesses children's executive function using two tasks: Dimensional Change Card Sort Test (DCCS; n = 200) and Flanker Inhibitory Control and Attention Test (Flanker; n = 206) [27]. The DCCS measures cognitive flexibility and the Flanker task measures inhibitory control. Language is assessed using the Picture Vocabulary Test (n = 252), which primarily measures receptive vocabulary [28]. The age-corrected standard scores from each task were used as measures of executive function and language. All three measures demonstrate excellent developmental sensitivity, test–retest reliability and construct validity [27,28].

    Epigenetics

    Genomic DNA was obtained from neonatal cheek cells at NICU discharge using the Isohelix Buccal Swab system (Boca Scientific, MA, USA). DNA was quantified with the Qubit Fluorometer (Thermo Fisher, MA, USA) and DNA was plated randomly across 96-well plates. The Emory University Integrated Genomics Core performed bisulfite modification using the EZ DNA Methylation Kit (Zymo Research, CA, USA), and genome-wide DNA methylation (DNAm) profiling using the Illumina MethylationEPIC Beadarray (Illumina, CA, USA). Detailed quality control and preprocessing are described elsewhere [29]. Briefly, samples with poor detection p-values or that were sex-mismatched were excluded and Noob normalization was used [30,31]. Probes with high median detection p-values, those on the X or Y chromosome, those with SNPs within the binding region and those that could cross-hybridize were excluded [32]. Beta-mixture quantile normalization was used to standardize across Type-I and Type-II probe designs [33,34]. Data-reduction steps were also incorporated to decrease the multiple testing burden and increase the power to detect meaningful associations. First, the CoMeBack pipeline was implemented to identify comethylated regions (CMRs) [35], which are clusters of proximal CpGs that have highly correlated methylation levels. Principal components analysis was performed on each CMR, and the first component was assigned to each cluster as a summary of overall CMR methylation levels. The CoMeBack pipeline identified 73,746 CMRs representing the DNAm of 206,195 CpG sites; 500,128 CpG sites were not included in the CMRs and were retained as individual CpG sites. CpGs or CMRs that had mean standard deviations <0.02 were then excluded since these sites with low variability are more prone to measurement error and less likely to yield reproducible findings [36]. After exclusions and data-reduction steps, 452,453 loci (60,917 CMRs and 391,536 CpGs) were available from 542 samples for this study (83% of 651 with buccal swab data; 77% of the NOVI cohort). For simplicity, each locus is referred to as a CpG, noting whether significant results are located in a CMR. These data are accessible through the National Center for Biotechnology Information Gene Expression Omnibus via accession series GSE128821.
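
    The low-variability exclusion can be illustrated with a short Python sketch. It is not the study's pipeline (preprocessing was done with the tools cited above); it assumes the 0.02 cutoff is applied to each locus's standard deviation across samples, and the locus names and values are hypothetical toy data.

```python
from statistics import stdev

def filter_low_variability(values_by_locus, min_sd=0.02):
    """Keep loci whose values vary enough across samples.

    values_by_locus maps a locus name to one methylation value per sample.
    """
    kept = {}
    for locus, values in values_by_locus.items():
        # Sample standard deviation of this locus across all samples.
        if stdev(values) >= min_sd:
            kept[locus] = values
    return kept

# Hypothetical toy data: two loci measured in four samples.
toy_loci = {
    "locus_flat": [0.50, 0.51, 0.50, 0.49],      # nearly constant
    "locus_variable": [0.20, 0.45, 0.70, 0.35],  # clearly variable
}
kept_loci = filter_low_variability(toy_loci)
```

    In this toy case only the variable locus is retained; the nearly constant one falls below the cutoff and is dropped before any association testing.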

    Statistical analysis

    In demonstrating the utility of the exposome model, the overall hypothesis was that exposures and their biological impacts (in this case, epigenetics) would jointly predict neurobehavioral outcomes in the NOVI cohort of children born <30 weeks GA. Associations between societal and environmental exposures and outcomes were examined first. To showcase the wide array of possible techniques available for testing the exposome framework, analyses were conducted using three statistical approaches: individual variable or exposome-wide association study (ExWAS), cumulative exposure and machine learning models. For the first two approaches, associations between all exposures and all outcome variables were examined. Results from the first two approaches were used to select a single child outcome (referred to as 'focal outcome' moving forward) that was robustly associated with both individual and cumulative exposures that would be carried forward using the third approach and beyond. Due to the computationally intensive nature of the machine learning models, paired with a modest sample size, it was necessary to select a single outcome rather than test all possible outcomes in this step.

    These three analytic approaches were next expanded to consider exposures and the biological impacts of exposures (epigenetics) as joint predictors of child neurobehavioral outcomes. To do this, an epigenome-wide association study (EWAS) was first planned to examine the relationship between DNAm at individual sites across the epigenome and the focal outcome. This would result in a set of significantly associated epigenetic variables that could be added to the ExWAS, cumulative exposure models and machine learning predictive models described previously. Adding these epigenetic variables to the 24 pre- and peri-natal exposure variables facilitated the investigation of how the inclusion of DNAm data changed the findings from exposure-only models. For example, if the effect of an exposure on an outcome changed substantially when DNAm data were included, this would suggest that DNAm may partially mediate or confound the exposure–outcome relationship. Although these epigenetic models are only demonstrated using the focal outcome, this is meant to be an exemplar of the types of analyses that could be expanded upon in future substantive investigations.

    In sum, the proposed exposome model was tested using three statistical approaches, ExWAS, cumulative exposure and machine learning models, with and without the inclusion of epigenetic data, for six total statistical demonstrations. Additional details for each of these approaches are described in subsequent sections. Unless otherwise noted, all models were adjusted for nesting of children within families (i.e., multiple births) and a standard set of covariates (i.e., study site, primary language, infant GA at birth, infant sex, neonatal medical morbidities, maternal postnatal psychological distress). In all analyses, listwise deletion was utilized for missing data. Sample sizes for all models are included in their respective tables. All outcomes were retained as continuous variables. Statistical analyses were conducted in SPSS 27.0 and R 4.2.1.

    ExWAS models

    Generalized estimating equation (GEE) models were used to explore associations between individual exposures and outcomes (a total of 240 models). These models tested single exposure–outcome associations independently from all others. This approach was useful in helping to identify clusters of associations between exposures and child outcomes. Models were interpreted in terms of the significance, magnitude and direction of associations between exposures and outcomes. These GEE models were conducted with the gee package [37] in R 4.2.1.
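
    As a minimal sketch of the ExWAS loop described above, the following Python fragment enumerates every exposure–outcome pair and fits each pair independently. It is not the authors' code: the study used covariate-adjusted GEE models in R, whereas this sketch swaps in an unadjusted Pearson correlation (the standardized slope of a one-predictor regression) so that it runs with the standard library alone. The variable names and toy records are hypothetical.

```python
from itertools import product
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def exwas(records, exposures, outcomes):
    """Estimate one association per exposure-outcome pair, each on its own."""
    results = {}
    for exposure, outcome in product(exposures, outcomes):
        # Listwise deletion: keep only records with both values observed.
        pairs = [(r[exposure], r[outcome]) for r in records
                 if r.get(exposure) is not None and r.get(outcome) is not None]
        x, y = zip(*pairs)
        results[(exposure, outcome)] = pearson_r(x, y)
    return results

# Hypothetical toy records: two binary exposures, two continuous outcomes.
records = [
    {"tobacco": 0, "anxiety": 1, "dccs": 98.0, "srs": 52.0},
    {"tobacco": 1, "anxiety": 1, "dccs": 85.0, "srs": 61.0},
    {"tobacco": 0, "anxiety": 0, "dccs": 101.0, "srs": None},
    {"tobacco": 1, "anxiety": 0, "dccs": 90.0, "srs": 58.0},
]
results = exwas(records, ["tobacco", "anxiety"], ["dccs", "srs"])
```

    With 24 exposures and 10 outcomes the same loop would produce the 240 single-pair models reported here; a real analysis would replace `pearson_r` with an adjusted, cluster-aware model.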

    Cumulative exposure models

    To test the associations between exposures and outcomes more holistically, two different data reduction techniques were applied to the set of exposure variables. First, latent class analysis was applied to the 24 prenatal exposures based on previous work [19]. Three distinct and mutually exclusive exposure phenotypes were derived: a low-risk phenotype with low endorsement of all risk factors, a physical risk phenotype with elevated maternal medical problems and a psychological risk phenotype with elevated maternal substance use and psychopathology [19]. A cumulative exposure score that assessed the total proportion of risk factors (out of 24) experienced by mothers was also created. This type of approach is in line with an allostatic load perspective that evaluates the cumulative biological burden of adverse exposures [38]. Two sets of GEE models were used to test whether prenatal exposure phenotypes (ten models) and cumulative exposure scores (ten models) were associated with individual child outcomes. These GEE analyses were conducted via SPSS 27.0.
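
    The cumulative exposure score is a simple proportion, which can be sketched as follows. The function name and the example flags are hypothetical, and the sketch assumes each of the 24 risk factors has been coded as present or absent with no missing values.

```python
def cumulative_exposure_score(risk_flags, n_factors=24):
    """Return the proportion of risk factors endorsed (0.0 to 1.0)."""
    if len(risk_flags) != n_factors:
        raise ValueError(f"expected {n_factors} risk flags, got {len(risk_flags)}")
    # Count the factors that are present and divide by the total number.
    return sum(1 for flag in risk_flags if flag) / n_factors

# Hypothetical mother with 6 of the 24 risk factors present.
flags = [True] * 6 + [False] * 18
score = cumulative_exposure_score(flags)
```

    A mother with 6 of 24 factors present would receive a score of 0.25 under this definition.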

    Machine learning models

    As previously described, before proceeding with machine learning models, results from ExWAS and cumulative exposure models were examined to select a single focal outcome that was robustly related to both individual and cumulative exposures. Cross-validated random forest models were used to evaluate the joint contribution of prenatal exposure variables to the focal outcome. Random forest models leverage the joint prediction of numerous weakly correlated regression trees and thus can detect both linear and nonlinear associations. K-fold cross-validation was used such that the test-train split was repeated ten times (on all possible 90–10% data splits that can be generated from splitting the data into ten equal subsets), and model performance estimates were averaged across models. Model estimates were used to compute diagnostic odds ratios (DORs; and 95% CIs) and p-values (alpha = .05 statistical significance threshold) that together indicate the predictive performance of the model. Variable importance estimates were also examined to understand which variables made the strongest contributions to model performance. Note that variable importance estimates show how model performance would be impacted if the variable were removed and are thus a relative metric rather than one with universal standards. Machine learning models were created using the randomForestSRC [39] and ranger [40] packages in R 4.2.1.
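
    The ten-fold scheme can be sketched in plain Python as below. This shows only the splitting and averaging logic, not the random forest itself, which the study fit with R packages. The `fit_and_score` callback, the seed and the sample size of 187 are illustrative assumptions.

```python
import random

def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) once for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(n_samples, fit_and_score, k=10):
    """Average a performance estimate over the k train/test splits."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_splits(n_samples, k)]
    return sum(scores) / len(scores)

# Illustrative run: 187 samples, scoring each fold by its share of the data.
splits = list(k_fold_splits(187))
mean_test_share = cross_validate(187, lambda train, test: len(test) / 187)
```

    Each sample lands in exactly one test fold, so every model is evaluated on data it did not see during training, and the averaged estimate uses all samples once.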

    Adding epigenetics

    Before repeating the three analyses (ExWAS, cumulative exposure and machine learning models) incorporating child epigenetics at birth as an additional component of the exposome model, an EWAS was conducted to identify the CpGs (sites of DNAm) that were most strongly associated with the focal outcome. The EWAS consisted of running a series of ∼450K GEE models (one for each CpG site that remained in the dataset after the exclusion of those with little variability), with substantive (child GA, sex, age at buccal swab, study site, neonatal medical morbidities) and technical (batch, cell-type composition) covariates. Cell-type composition was estimated using previously developed reference methylomes [41]. These models were estimated using the gee package [37] in R 4.2.1. Adjustment for multiple testing was done by applying a Bonferroni correction (α = .05/452453 = 1.1E-7). Only CpG sites meeting this p-value threshold were carried forward as epigenetic predictor variables. Once the list of CpG sites was determined, the models (ExWAS, cumulative exposure and machine learning) were repeated with the 24 pre- and peri-natal exposure variables in addition to all significantly associated CpGs to investigate the joint contributions of exposures and epigenetics to the focal outcome.
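
    The Bonferroni threshold quoted above can be reproduced with a few lines of Python. The locus names and p-values in the example are hypothetical, and treating "meeting the threshold" as a strict inequality is an assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def significant_loci(p_values, threshold):
    """Return the loci whose p-value falls below the threshold."""
    return [locus for locus, p in p_values.items() if p < threshold]

# 0.05 divided by the 452,453 loci tested gives roughly 1.1e-7.
threshold = bonferroni_threshold(0.05, 452_453)

# Hypothetical p-values for three loci.
toy_p_values = {"locus_a": 3e-9, "locus_b": 2e-7, "locus_c": 0.04}
carried_forward = significant_loci(toy_p_values, threshold)
```

    Only loci passing this filter would be added to the exposure models as epigenetic predictors.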

    Results

    ExWAS

    An ExWAS approach was used to examine associations between individual exposures and outcomes. The results of the ExWAS allowed a comparison of the strength of the associations between exposures and outcomes, highlighting significant clusters of associations. The heatmap in Figure 2 illustrates the direction (positive or negative denoted by red or blue shading), magnitude (large or small denoted by the darkness of shading) and significance (presence or absence of coefficients) of these associations. Several interesting observations were apparent, including multiple strong, inverse associations of prenatal mood disturbance, lack of prenatal care, medical conditions and alcohol use with child executive function as measured by both the NIH Toolbox DCCS and Flanker tasks. There were also some inverse associations of demographic factors, prenatal mood and substance use with child behavior as measured by the CBQ.

    Figure 2. Heatmap of associations between 24 prenatal exposures (columns) and 10 neurodevelopmental outcomes (rows) after covariate adjustment (nesting of children in families, study site, primary language, infant gestational age at birth, infant sex, neonatal medical morbidities and maternal postnatal psychological distress).

    Positive associations are shown in red; negative associations are shown in blue. Standardized coefficients for significant associations are shown.

    CBQ: Child Behavior Questionnaire; DCCS: Dimensional Change Card Sort Test; HS: High school; KCPT: Kiddie Continuous Performance Test; NIH: National Institutes of Health; Preg: Pregnancy; SES: Socioeconomic status; SRS: Social Responsiveness Scale; wt: Weight.

    Cumulative exposures

    GEE results showed that children born to mothers with the physical risk phenotype had poorer executive function, as indexed by the NIH Toolbox DCCS (β = -.68; p < .001), compared with children born to mothers with the low-risk phenotype (Table 2). Children born to mothers with the psychological risk phenotype had increased social impairment as measured by the SRS (β = .54; p = .014) compared with the low-risk phenotype. Higher cumulative exposure scores were also associated with poorer executive function (both DCCS and Flanker) and language skills (β = -.14 to -.25; all p ≤ .05) and increased social impairment (β = .21; p < .001). Interestingly, physical and psychological risk phenotypes and higher cumulative exposure scores were associated with lower child negative affect (β = -.19 to -.38; all p < .05) and, in the case of cumulative exposure, lower child surgency (β = -.16; p = .010).

    Table 2. Associations between exposures and outcomes in cumulative exposure models.
    Child outcomes | Low risk | Physical risk | Psychological risk | Physical vs low risk | Psychological vs low risk | Cumulative risk
     | M ± SD | M ± SD | M ± SD | β (SE) | β (SE) | β (SE)
    Social Responsiveness Scale (n = 340) | 53.5 ± 9.3 | 53.6 ± 10.0 | 60.9 ± 12.4 | .12 (.12) | .54 (.22) | .21 (.10)§
    Conners K-CPT (n = 168) | | | | | |
    Omissions | 67.3 ± 17.4 | 68.0 ± 16.9 | 70.3 ± 16.4 | .22 (.18) | .18 (.30) | .09 (.08)
    Commissions | 55.7 ± 12.5 | 54.3 ± 10.9 | 58.8 ± 12.3 | -.11 (.18) | .24 (.28) | .09 (.08)
    Perseverations | 63.8 ± 16.0 | 64.7 ± 16.9 | 64.0 ± 16.5 | .27 (.17) | -.02 (.29) | .11 (.08)
    NIH Toolbox | | | | | |
    DCCS (n = 187) | 95.7 ± 13.4 | 87.7 ± 14.7 | 94.0 ± 13.9 | -.68 (.20)§ | -.10 (.25) | -.25 (.08)
    Flanker (n = 191) | 97.5 ± 16.2 | 93.6 ± 13.0 | 92.2 ± 14.1 | -.31 (.20) | -.31 (.21) | -.14 (.07)
    Picture vocabulary (n = 236) | 97.5 ± 17.8 | 93.8 ± 18.1 | 91.3 ± 19.7 | -.19 (.16) | -.35 (.21) | -.14 (.07)
    Childhood Behavior Questionnaire | | | | | |
    Effortful control (n = 357) | 3.0 ± 1.0 | 2.8 ± 0.8 | 3.0 ± 0.9 | -.11 (.11) | -.04 (.19) | .05 (.06)
    Surgency (n = 357) | 3.7 ± 0.8 | 3.6 ± 0.6 | 3.5 ± 0.7 | -.17 (.12) | -.32 (.18) | -.16 (.06)
    Negative affect (n = 356) | 4.1 ± 0.8 | 3.8 ± 0.8 | 3.6 ± 0.7 | -.38 (.12) | -.37 (.17) | -.19 (.05)§

    †p ≤ .05.

    ‡p ≤ .01.

    §p ≤ .001.

    Adjusted models accounted for nesting of multiple births within families and a priori covariates (study site, primary language, infant gestational age at birth, infant sex, neonatal medical morbidities and maternal postnatal psychological distress).

    DCCS: Dimensional Change Card Sort Task; Flanker: Flanker Inhibitory Control and Attention Test; K-CPT: Kiddie Continuous Performance Test; M: Mean; SD: Standard deviation; SE: Standard error.

    Both the ExWAS and cumulative exposure results pointed to robust associations between exposures and child executive function outcomes, with particularly consistent findings and large effect sizes for the DCCS compared with other outcomes. For this reason, child executive function was retained as measured by the DCCS as the focal outcome in subsequent analyses.

    Machine learning

    Model estimates from the random forest model showed a statistically significant DOR (2.55; 95% CI = 1.28–5.08; p = .008), suggesting a significant association between the set of input exposure variables (Supplementary Table 1) and child executive function. Examining variable importance estimates revealed that maternal pre-eclampsia, anxiety, obesity, hypertension and tobacco use were the top five predictors contributing to the quality of model predictions (Figure 3A).
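
    To make the DOR concrete, the following is a minimal sketch that assumes DOR denotes the diagnostic odds ratio computed from a 2 × 2 table of predicted versus observed outcome classes. The function name and the cell counts are hypothetical and are not taken from the study. The interval uses the standard normal approximation on the log scale, which may differ from the procedure used to produce the reported estimates.

```python
import math


def diagnostic_odds_ratio(tp, fn, fp, tn, z=1.96):
    """Return (DOR, lower, upper) from the four cells of a 2x2 table.

    tp/fn/fp/tn are counts of true positives, false negatives,
    false positives and true negatives. The confidence interval is
    a normal approximation on the log odds ratio scale.
    """
    if min(tp, fn, fp, tn) <= 0:
        raise ValueError("all cell counts must be positive")
    dor = (tp * tn) / (fp * fn)
    se_log = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    lower = math.exp(math.log(dor) - z * se_log)
    upper = math.exp(math.log(dor) + z * se_log)
    return dor, lower, upper


# Hypothetical counts for illustration only (NOT the study's data)
dor, lower, upper = diagnostic_odds_ratio(tp=30, fn=20, fp=20, tn=30)
print(f"DOR = {dor:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

    A DOR of 1 indicates no discrimination between outcome classes, so an interval that excludes 1 corresponds to a statistically significant association.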

    Figure 3. Variable importance from random forest machine learning models predicting executive function (Dimensional Change Card Sort scores) from (A) exposures only and (B) exposures and DNA methylation.

    Adding epigenetics via EWAS

    Having established links between exposures and outcomes using a variety of data analytic methods, analyses were conducted that added biological responses to exposures (i.e., epigenetics) to these models. To do this, an EWAS was first conducted to assess relationships between DNAm at individual loci and child executive function measured by the DCCS. After Bonferroni correction for multiple testing (p < 1.1E-7), four loci were significantly associated with child executive function (Supplementary Table 2). All four were individual CpG sites rather than CMRs. Two of the four CpGs (cg25934728 [b = 5.19; p = 1.96E-08] and cg09604167 [b = 6.31; p = 6.32E-08]) were positively associated and two were inversely associated (cg12608445 [b = -5.32; p = 2.91E-08] and cg09604180 [b = -7.02; p = 3.19E-08]). The four CpG sites were weakly correlated with one another (|r| = 0.013–0.294; p < .001 to p = .868). DNAm at these four CpG sites became the epigenetic variables carried forward into further models to examine the joint contribution of exposures and epigenetics to child executive function.
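
    The Bonferroni threshold can be checked with simple arithmetic. The sketch below assumes a family-wise alpha of .05, which is not stated in this section, and back-calculates the approximate number of tests implied by the reported threshold of 1.1E-7 (roughly 450,000). The function names are illustrative; the p-values are those reported above.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Number of tests that would yield the given Bonferroni threshold."""
    return alpha / threshold


ALPHA = 0.05                 # assumed; not stated in this section
REPORTED_THRESHOLD = 1.1e-7  # threshold reported in the text

n_implied = implied_number_of_tests(ALPHA, REPORTED_THRESHOLD)

# p-values reported for the four CpG sites
reported_p = {
    "cg25934728": 1.96e-08,
    "cg09604167": 6.32e-08,
    "cg12608445": 2.91e-08,
    "cg09604180": 3.19e-08,
}
significant = sorted(cpg for cpg, p in reported_p.items() if p < REPORTED_THRESHOLD)

print(f"implied number of tests: about {n_implied:,.0f}")
print("below threshold:", significant)
```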

    Integrating ExWAS & DNAm

    To examine the associations between individual exposures, epigenetics and outcomes, the previously reported relationships between individual exposures and outcomes were reassessed once DNAm at the four identified CpGs was included. For simplicity, two models were estimated to focus on the two exposures with the largest associations with executive function (prenatal alcohol use and pre-eclampsia; Figure 2). Whereas the previously reported ExWAS models tested the association of each exposure with each outcome, controlling for covariates, these two new models included the single exposure variable alongside the four epigenetic variables as simultaneous predictors of child executive function, controlling for the same standard set of covariates (Table 3). In both models, adding DNAm somewhat reduced the strength of the association between the exposure and the outcome. For example, in the exposure-only model (left panel in Table 3), prenatal alcohol exposure was associated with a 1.03 standard deviation decrease in child executive function, on average, whereas in the exposure and epigenetic model, this effect size decreased to 0.73 standard deviations. In addition to significant effects of exposure variables in both models, two of the four CpG sites identified in the EWAS (cg25934728, cg09604167) were also independently associated with executive function in these models (β = .23–.32; all p < .001). Thus, exposures and epigenetics jointly predicted child executive function.
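
    The size of this attenuation can be expressed as a percentage of the exposure-only coefficient. The sketch below is illustrative only: the helper name is hypothetical and the coefficients are the exposure rows reported in Table 3.

```python
def percent_attenuation(beta_exposure_only, beta_adjusted):
    """Percent reduction in coefficient magnitude after adding predictors."""
    base = abs(beta_exposure_only)
    return 100.0 * (base - abs(beta_adjusted)) / base


# Standardized exposure coefficients reported in Table 3
alcohol = percent_attenuation(-1.03, -0.73)
pre_eclampsia = percent_attenuation(-0.73, -0.58)

print(f"alcohol: {alcohol:.1f}% smaller")
print(f"pre-eclampsia: {pre_eclampsia:.1f}% smaller")
```

    Because the exposure-only models (n = 187) and the exposure and epigenetic models (n = 143) were fit to different numbers of children, part of this difference may reflect the change in sample rather than the added DNAm predictors.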

    Table 3. Individual exposures and DNA methylation jointly predict child executive function.
    Outcome = DCCS
    Predictors | Exposure = alcohol: Exposure only, β (SE) (n = 187) | Exposure = alcohol: Exposure + Epi, β (SE) (n = 143) | Exposure = pre-eclampsia: Exposure only, β (SE) (n = 187) | Exposure = pre-eclampsia: Exposure + Epi, β (SE) (n = 143)
    Exposure | -1.03 (.34) | -.73 (.35) | -.73 (.21)§ | -.58 (.20)
    DNA methylation
      cg25934728 |  | .32 (.07)§ |  | .29 (.07)§
      cg12608445 |  | -.16 (.09) |  | -.13 (.08)
      cg09604180 |  | -.13 (.08) |  | -.13 (.08)
      cg09604167 |  | .24 (.06)§ |  | .23 (.06)§

    †p ≤ .05.

    ‡p ≤ .01.

    §p ≤ .001.

    All models accounted for nesting of multiple births within families and a priori covariates (study site, primary language, infant gestational age at birth, infant sex, neonatal medical morbidities and maternal postnatal psychological distress).

    DCCS: Dimensional Change Card Sort Task; Epi: Epigenetics; SE: Standard error.

    Cumulative exposures & epigenetics

    A similar approach was taken by adding DNAm at the four identified CpG sites as additional predictors in the cumulative exposure models. As summarized in Table 4, these results mirrored those of the ExWAS models, in that both exposures (physical risk phenotype and cumulative exposure score) and DNAm at multiple CpG sites (cg25934728, cg09604167, cg12608445) were significant, independent predictors of child executive function.

    Table 4. Cumulative exposures and DNA methylation jointly predict child executive function.
    Outcome = DCCS

    Exposure phenotype
    Predictors | Exposure only, β (SE) (n = 187) | Exposure + Epi, β (SE) (n = 143)
    Physical risk vs low risk | -.68 (.20)§ | -.53 (.19)
    Psychological risk vs low risk | -.10 (.25) | .12 (.24)
    DNA methylation
      cg25934728 |  | .29 (.07)§
      cg12608445 |  | -.13 (.08)
      cg09604180 |  | -.13 (.08)
      cg09604167 |  | .24 (.06)§

    Cumulative exposure score
    Predictors | Exposure only, β (SE) (n = 187) | Exposure + Epi, β (SE) (n = 143)
    Cumulative exposure | -.25 (.08) | -.18 (.06)
    DNA methylation
      cg25934728 |  | .31 (.07)§
      cg12608445 |  | -.18 (.08)
      cg09604180 |  | -.11 (.08)
      cg09604167 |  | .22 (.06)§

    †p ≤ .05.

    ‡p ≤ .01.

    §p ≤ .001.

    All models accounted for nesting of multiple births within families and a priori covariates (study site, primary language, infant gestational age at birth, infant sex, neonatal medical morbidities and maternal postnatal psychological distress).

    DCCS: Dimensional Change Card Sort Task; Epi: Epigenetics; SE: Standard error.

    A further exploratory step was taken to examine whether DNAm mediated or moderated the association between prenatal exposure phenotypes and/or cumulative exposures and child executive function. While no evidence for mediation was found (data not shown), there was a suggestive interaction between the psychological risk phenotype and DNAm of cg09604167 predicting executive function (β = .35; p = .07). As shown in Figure 4, this interaction suggested that DNAm was a stronger predictor of executive function for children exposed to the psychological risk phenotype (b = .60; p < .001) compared with children in the physical or low-risk phenotypes (b = .25, p = .001). These results suggest the importance of considering both exposures and biological impacts of exposures as joint predictors of neurodevelopmental outcomes.
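
    The relationship between the interaction term and the two group-specific slopes can be illustrated with a minimal sketch. It assumes a model in which the DNAm slope for the reference group (physical or low-risk phenotype) is shifted by the interaction coefficient for the psychological risk group, and it treats the reported interaction (β) and slopes (b) as being on the same scale, which is an assumption. The function name is hypothetical.

```python
def simple_slope(slope_reference, interaction, group_indicator):
    """DNAm slope for a group in a model with a DNAm x group interaction.

    group_indicator is 0 for the reference group and 1 for the
    comparison group (here, the psychological risk phenotype).
    """
    return slope_reference + interaction * group_indicator


B_REFERENCE = 0.25    # reported slope, physical or low-risk phenotypes
B_INTERACTION = 0.35  # reported interaction coefficient

slope_low_or_physical = simple_slope(B_REFERENCE, B_INTERACTION, 0)
slope_psychological = simple_slope(B_REFERENCE, B_INTERACTION, 1)

print(f"reference group slope: {slope_low_or_physical:.2f}")
print(f"psychological risk slope: {slope_psychological:.2f}")
```

    Under these assumptions, the reference slope plus the interaction coefficient reproduces the slope reported for the psychological risk group.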

    Figure 4. Associations between DNA methylation and child executive function depend on prenatal exposure phenotype.

    DCCS: Dimensional Change Card Sort Task.

    Machine learning & epigenetics

    Once again, cross-validated (k-fold) random forest models were used to examine the joint contribution of 24 prenatal exposure variables and DNAm at the 4 identified CpG sites to the prediction of child executive function. The addition of the 4 epigenetic variables improved model prediction such that the magnitude of the observed DOR was increased (2.84; 95% CI = 1.30–6.20; p = .009). Consistent with this interpretation, two CpG sites (cg25934728, cg09604180) were featured among the top five variables exhibiting the highest variable importance estimates (Figure 3B). The three exposure variables featured in the top five (tobacco, pre-eclampsia and hypertension) were also in the top five important variables from the exposure-only predictive model (Figure 3A).
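
    The k-fold procedure mentioned above can be sketched in a few lines. This is a generic illustration using only the Python standard library, not the study's pipeline: the function name, the choice of k = 5, the seed and the sample size of 143 (borrowed from the exposure and epigenetic regression models) are all assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Every sample index appears in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Assumed values for illustration: n = 143 and k = 5
splits = list(k_fold_indices(n_samples=143, k=5))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train = {len(train_idx)}, test = {len(test_idx)}")
```

    In a cohort that includes multiple births, a grouped split that keeps siblings in the same fold would avoid leakage between training and test sets; whether this was done is not stated here.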

    Results summary

    This work demonstrated three different analytic approaches to studying the impact of exposures on childhood outcomes and extended these models to jointly consider the impacts of external exposures and internal biological responses to exposures (i.e., epigenetics). The individual variable (i.e., ExWAS) models were useful in identifying a cluster of exposures that were significantly associated with child executive function, with large effect sizes. Through cumulative exposure and machine learning models, combinations of perinatal exposures (e.g., physical risk phenotype, cumulative risk) were shown to be potential risk factors for poorer executive function at age 5.5 years. Next, DNAm at individual CpG sites was shown to be additionally predictive of childhood executive function, possibly explaining some of the relationships observed between exposures and outcome (as evidenced by a reduction in the strength of the exposure-outcome associations). Yet, exposures and epigenetics remained significant independent predictors when modeled together. While DNAm did not mediate the relationships between perinatal exposures and childhood executive function, there was suggestive evidence of an interaction between DNAm at one CpG (cg09604167) and maternal exposure phenotype, suggesting that strengths of relationships between child epigenetics and executive function may differ based on the type of perinatal exposures children experience. Finally, a machine learning predictive model that included both exposures and epigenetics was built, and a modest improvement was found in the prediction accuracy of the model compared with an exposure-only model. Interestingly, DNAm at one CpG (cg25934728) had, by far, the strongest variable importance, and the maternal prenatal factors that remained in the top five of variable importance were tobacco use, hypertension and pre-eclampsia. 
    Taken together, these results showcase the variety of methods available for applying the exposome model to understanding child neurobehavioral outcomes and paint a clear picture of the importance of both external exposures and internal biological responses to exposures (i.e., epigenetics) in explaining these outcomes.

    Discussion

    The goal of this work was to develop a novel exposome model and demonstrate how to apply this model to understanding child outcomes. This unique exposome model integrates contemporary exposures, including positive and negative exposures, alongside factors within the child that may modify their reaction to environmental exposures. Critically, our model broadens the exposome approach to emphasize positive and negative outcomes. Ours is not a strictly ‘disease’ or pathological model. Moreover, while historically we describe exposures as modifiable factors in the environment that can be targeted to prevent ‘disease’, the exposome framework we present also includes internal factors in the child (i.e., epigenetics) that can contribute to either positive or negative outcomes and are themselves dynamic and responsive to environments and health states.

    To demonstrate how we might advance the field, we presented three approaches to examining the impact of exposures on child outcomes, each of which provides unique but complementary information regarding the importance of prenatal exposures for child neurodevelopment. We also demonstrated how epigenetic data could be integrated into these models to test both exposures and the biological impact of these exposures as joint contributors to outcomes. Across all methods, our hypotheses were supported, suggesting that exposures and epigenetics both contribute to child neurodevelopment, in this case, executive function.

    We initially utilized an ExWAS approach, examining all possible correlations between individual exposures and available outcomes. From this approach, we found that childhood executive function assessed using the NIH Toolbox DCCS and Flanker tests appeared most sensitive to early environmental exposures, such as maternal physical and psychological health, with particularly strong relationships between maternal pre-eclampsia and alcohol use. Then, to more precisely characterize the relationships between exposures and outcomes identified using the fully agnostic approach, we developed both a cumulative risk index, based on the concept of the allostatic load, summing the different exposures experienced by an individual, and a latent modeling approach to define groups of individuals experiencing specific sets of exposures, which revealed three distinct groups (one with elevated physical risk factors, one with elevated psychological risk factors and a comparator group with low risk). Using these approaches, we identified lower executive function scores in children born to mothers with the physical risk profile compared with the low-risk profile. The cumulative risk score also was associated with lower executive function and receptive vocabulary. This could suggest a greater sensitivity of the cumulative risk score to a variety of neurodevelopmental measures, with the latent class approach providing more specificity in the relationships. Both the psychological and physical risk profiles demonstrated inverse associations with negative affect and, consistently, the cumulative risk index was also inversely associated with negative affect and surgency. These findings suggest that prenatal exposures could be associated with blunted emotional expression [42], and the consistency between methods highlights the robustness of these approaches.

    Utilizing a machine learning approach, in this case, random forest, which allows for the detection of nonlinear and conditional relationships, we found that the combined effect of the exposure variables was associated with lower executive function. Our model demonstrated a significant DOR of 2.55 (95% CI: 1.28–5.08) suggesting the combined impacts of exposures are quite effective in differentiating children with delays or deficits in executive function. Thus, there is utility in considering these environmental measures in a holistic framework to better understand developmental outcomes.

    Using an epigenome-wide approach, we found four CpG sites with independent associations with child executive function. These CpGs were annotated to genes previously found to be associated with cognitive functioning and intelligence (TMBIM6), educational attainment (ADARB1), and health outcomes such as BMI (ADARB1, POU2F1), diabetes (POU2F1) and asthma (POU2F1). We then performed multiple regression to predict executive function using measures of DNAm at these sites along with four perinatal exposure variables (alcohol, pre-eclampsia, prenatal exposure phenotype, and cumulative exposure score). We found that relationships between the exposures and child executive function were attenuated when these CpGs were in the model. However, all four exposures and two CpG sites (cg25934728 and cg09604167) remained significant predictors of childhood executive function in all models. Thus, both exposures and epigenetics independently predicted child executive function using ExWAS and cumulative exposure approaches. The attenuation in the associations between exposures and executive function suggests that some of the effects of these exposures on executive function may be due to the biological impact of the exposure (i.e., changes in child epigenetic profiles).

    Further analyses suggested that DNAm at these sites was not a mediator but did reveal an interaction between the psychological risk profile and DNAm at cg09604167, such that there was a stronger association between DNAm and executive function among children born to mothers in the psychological risk group. The specific CpG highlighted in these analyses is located in the transcriptional start site of the TMBIM6 gene, a gene that has previously been related to cognitive function phenotypes [43]. Evidence of interactive effects between the exposome and epigenetics could imply that epigenetic susceptibility and resiliency are important to consider alongside environmental exposures for better prediction of child outcomes. Although we took a post-hoc approach to testing interactive effects, other methods exist that can integrate tests of interaction into their main screening process, such as elastic net and ensemble tree methods. Researchers may wish to integrate hypothesis-driven a priori tests of such interactions between exposures and internal factors such as epigenetics.

    Similarly, incorporating DNAm into our random forest models led to an increase in the DOR of the model and showed that two of the CpG sites (the same sites highlighted in ExWAS and cumulative exposure models) moved into the top five variables with the greatest variable importance scores. These findings again highlight why it is so important to add internal measures, like the epigenome, to an exposome approach. Unlike exposure variables, measures of DNAm supply integrated sources of information (i.e., about cell state, genetics and/or response to exposures, potentially even historical exposures [44]), which meaningfully contributed to the prediction of child outcome. These results demonstrate how the integration of internal factors such as epigenetics could make the exposome approach much more powerful, as opposed to approaches that rely only on external assessments.

    Limitations of reported analyses

    The substantive results reported here are not meant to be definitive. They are presented as a blueprint to demonstrate potential strategies for the analysis of an exposome approach updated to include a wide range of environmental exposures, biological responses to and/or modifiers of exposures and positive and negative child outcomes. We also recognize that our findings are limited to the external and internal exposures that we focused on herein. Integration of more comprehensive exposome features, such as environmental contaminants, structural factors, metabolomics and others, will yield additional insights into the broad array of influences on children's neurodevelopment. We also did not account for factors in the postnatal environment, such as caregiving and parenting, though these could be of increasing importance as they may impact or interact with earlier exposures and/or internal biological responses to early exposures. These results should be interpreted cautiously given the relatively modest sample size, particularly for some of the analytic methods used (e.g., machine learning) and the fact that we studied a very preterm population. Prematurity is already a potent exposure that may influence the relationship between other exposures and child outcomes. We also cannot say how these findings might generalize to other populations with different perinatal characteristics. Finally, exposome research is often data-driven (at least in light of the current state of the literature) as opposed to a priori hypothesis-driven, though combinations of the methods presented here could be used for both types of approaches. As the body of literature testing exposome models grows, new information and methods may drive the field toward more confirmatory rather than exploratory approaches.

    Conclusion

    Taken together, these results show the utility of examining exposures more holistically, particularly for developmental outcomes that have complex etiologies. Incorporating epigenetic markers into such studies could be particularly promising, as DNAm has been proposed to be a marker of some environmental exposures, socioeconomic position, health states and developmental states, but as we show here, also acts independently and possibly as a modifier of exposure effects. Epigenetic features may both capture intrinsic individual variability that may alter risk stratification and serve as a marker of biological response along the path between exposures and outcomes. This study focused on DNAm as a possible molecular response or integrator of early life exposures, which was measured at a single time point and in a single tissue type. We see enormous value in incorporating repeated measures to better understand patterns of epigenetic change over time, potentially in response to changing exposures, and in integrating additional measures of biological responses, such as metabolomics, proteomics or other potential intermediaries, and additional measures of individual variability such as genetic variation. In addition, trajectories of exposures, biological features and outcomes would be more in line with the overall exposome framework, which considers a life course of exposures to health, and statistical methodologies will need to consider how to incorporate these types of data into analyses.

    A goal of this paper was to guide applied researchers interested in applying an exposome approach to their substantive area. We demonstrated several approaches (individual variable, cumulative risk and/or phenotypes and machine learning), each capable of answering distinct research questions and each with its own strengths and weaknesses. An individual variable approach can be beneficial for identifying specific exposome components that are independently associated with a given outcome. It can also be useful for determining whether different types of exposures are related to different groups of outcomes when multiple different types of outcomes are measured (e.g., child cognitive vs behavioral development). A downside is that an individual variable approach cannot inform understanding of the relationship between the totality of exposures an individual experiences and their outcomes, nor does it consider potential relationships among components of the exposome (e.g., correlated exposures, interactive effects). Depending on the number of exposome constructs and outcomes tested, the multiple testing burden can also increase and the need to adjust p-values for multiple comparisons can detract from statistical power to detect small but meaningful associations.

    Cumulative exposure models are an attractive, parsimonious solution that allows one to test associations among many exposome elements and outcomes. As demonstrated in the current investigation, and as we have found previously [19], cumulative risk scores are often a powerful predictor of outcome compared with alternative methods. However, beyond their usefulness in indexing an overall relationship between exposures and outcomes, cumulative risk variables are nonprecise (i.e., summed scores assume equal weighting of all exposures) and nonspecific (i.e., results do not show which exposures contribute the most to outcomes). Alternative specifications of cumulative risk (e.g., factor scores) could overcome some of these limitations by allowing for uneven weighting of items, but these approaches are likely to yield sample-specific measures that may not replicate across studies. Other cumulative exposure models such as latent profile analysis (LPA) are similarly parsimonious, account for co-occurring or correlated exposures and can add a degree of specificity in characterizing different groups of individuals with different types of exposures. In our work with the NOVI study [19] and other studies of term children [45], distinct cumulative risk phenotypes have been identified that differentiate between psychological and physical risk factors and are differentially associated with fetal and child outcomes. However, the extent to which LPA or similar methods can be applied to the exposome framework will rely on the number of exposome features studied, their overall distribution in the sample and sample size. Like cumulative risk indices or factor scores, there is also the difficulty of replicability across studies.
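
    The equal-weighting and nonspecificity of a summed score can be made concrete with a minimal sketch. The exposure names echo variables discussed in this paper, but the two children and their exposure patterns are hypothetical, and the function is an illustration rather than the scoring procedure used in the study.

```python
def cumulative_exposure_score(exposures):
    """Count how many binary exposures are present (equal weights)."""
    return sum(1 for present in exposures.values() if present)


# Hypothetical children with different exposure patterns
child_physical = {
    "pre_eclampsia": True,
    "hypertension": True,
    "obesity": False,
    "anxiety": False,
    "tobacco": False,
}
child_psychosocial = {
    "pre_eclampsia": False,
    "hypertension": False,
    "obesity": False,
    "anxiety": True,
    "tobacco": True,
}

score_physical = cumulative_exposure_score(child_physical)
score_psychosocial = cumulative_exposure_score(child_psychosocial)
print(score_physical, score_psychosocial)
```

    Both children receive the same score despite sharing no exposures, which is the sense in which summed scores are nonspecific and why phenotype-based approaches such as LPA can add information.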

    Finally, machine learning models are an agnostic, data-driven method for predicting outcomes that can handle large numbers of potentially correlated exposome elements and can accommodate both linear and nonlinear relationships between exposures and outcomes. Unlike cumulative exposure models, machine learning models can distinguish the most important predictors among a group of exposome elements, although these are likely to differ based on sample characteristics and outcomes investigated. Machine learning models also require large sample sizes and, ideally, both internal and external validation. External validation sets are not always readily available, especially when the sample population is unique (like children born very preterm, as studied here). There is also a wide landscape of machine learning methods (e.g., random forest, penalized regression, neural networks) with unique strengths and weaknesses. Selecting the optimal method will depend on a given sample and research question and may require collaboration with a methodological expert.

    Given the strengths and weaknesses inherent in each of these techniques, it seems advisable for researchers to select one or more methods to test their specific exposome model and to interpret their findings in light of the idiosyncrasies of their selected methods. When multiple methods are used, findings that are broadly replicated across methods, such as our finding of unique exposure and epigenetic effects on child executive functioning, should be emphasized. Utilizing additional methods such as data imputation to ensure larger or more complete data will support the use of more complex models of the exposome. As is becoming a more mainstream expectation in research, replication in independent samples should be pursued. As other methods are developed for studying the complexity of the exposome, the specific options of models may change, but this advice will likely continue to define best practices.

    Continued development of frameworks and methods for studying the exposome will enhance our understanding of the complex associations between exposures and child development outcomes in the years to come. Shifting the study of child development toward a holistic approach that integrates external exposures and internal factors will allow for a richer understanding of developmental pathways and yield better predictions of positive and negative child outcomes. While these initial studies will undoubtedly seek to describe associations, we hope that, in the long term, the information garnered by the exposome approach can be used to inform interventions during periods of developmental plasticity to promote positive outcomes for children.

    Summary points
    • Environmental and psychosocial exposures have been recognized as important predictors of chronic disease risk and neurodevelopmental outcomes.

    • The exposome is an established conceptual model that systematically examines the combined effects of biological, psychological, environmental and social exposures on health, well-being and development.

    • This work was designed to apply the exposome framework to children's behavioral and neurodevelopmental outcomes and provide a roadmap for statistical approaches to assessing exposure-outcome relationships.

    • Data were collected from a cohort of very preterm infants and included pre- and peri-natal exposure variables and behavioral and neurodevelopmental outcomes. Genomic DNA was collected from neonates for genome-wide DNA methylation analysis.

    • Application of the exposome framework was demonstrated via three statistical approaches: exposome-wide association study, cumulative exposure models and random forest machine learning models. These techniques were then expanded to consider the joint role of exposures and epigenetics as predictors of child neurobehavioral outcomes.

    • The results demonstrated the ability of an exposome-wide association study model to identify a cluster of exposure variables associated with child executive function. These findings were corroborated by findings from the cumulative exposure and machine learning models, suggesting that perinatal exposures predicted child executive function.

    • Joint models showed independent and, potentially, interactive effects of exposures and epigenetic patterns on outcomes.

    • These findings demonstrate the utility of the exposome framework for more holistically studying associations between exposures and child health and neurobehavioral outcomes.

    Supplementary data

    To view the supplementary data that accompany this paper please visit the journal website at: www.futuremedicine.com/doi/suppl/10.2217/epi-2023-0424

    Author contributions

    All authors initiated and designed this investigation, contributed to interpretation of the results, revisions to the manuscript and approval of the final version. BM Lester, CJ Marsit, TM Everson and M Camerota acquired funding for this study and M Camerota, TM Everson and CL Shuster conducted the statistical analysis.

    Financial disclosure

    This work was funded by the National Institutes of Health/Eunice Kennedy Shriver National Institute of Child Health and Human Development grants R01HD072267 (BM Lester and O'Shea), UH3OD023347 (BM Lester and CJ Marsit) and R01HD084515 (BM Lester and TM Everson) and National Institute of Mental Health grant K01MH129510 (M Camerota). The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

    Competing interests disclosure

    The authors have no competing interests or relevant affiliations with any organization or entity with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

    Writing disclosure

    No writing assistance was utilized in the production of this manuscript.

    Ethical conduct of research

    The authors state that institutional review board approval was received from Children's Mercy Hospital, Los Angeles Biomedical Research Institute, Memorial Care Health System, Western Institutional Review Board, Spectrum Health, University of North Carolina–Chapel Hill, Wake Forest University Health Services and Women and Infants Hospital of Rhode Island for the research described. In addition, verbal and written informed consent was obtained from all participants for the inclusion of their data within this work.

    Data sharing statement

    The microarray data generated and/or analyzed in the current study are available in the National Center for Biotechnology Information Gene Expression Omnibus (accession series GSE128821). The R code used for the analyses presented in this paper is available upon request from the corresponding author.

    Open access

    This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/

    Papers of special note have been highlighted as: • of interest; •• of considerable interest

    References

    • 1. Gillman MW, Blaisdell CJ. Environmental Influences on Child Health Outcomes, a Research Program of the National Institutes of Health. Curr. Opin. Pediatr. 30(2), 260–262 (2018).
    • 2. Rappaport SM. Implications of the exposome for exposure science. J. Expo. Sci. Environ. Epidemiol. 21(1), 5–9 (2011).
    • 3. Rappaport SM. Genetic factors are not the major causes of chronic diseases. PLOS ONE 11(4), e0154387 (2016).
    • 4. Rappaport SM, Smith MT. Environment and disease risks. Science 330(6003), 460–461 (2010).
    • 5. Willett WC. Balancing life-style and genomics research for disease prevention. Science 296(5568), 695–698 (2002).
    • 6. Etzel RA, Balk SJ (Eds). Pediatric Environmental Health (4th edition). American Academy of Pediatrics, Elk Grove Village, IL, USA (2019).
    • 7. The American College of Obstetricians and Gynecologists. Reducing prenatal exposure to toxic environmental agents. (2013). www.acog.org/en/clinical/clinical-guidance/committee-opinion/articles/2021/07/reducing-prenatal-exposure-to-toxic-environmental-agents
    • 8. Wild CP. Complementing the genome with an “exposome”: the outstanding challenge of environmental exposure measurement in molecular epidemiology. Cancer Epidemiol. Biomarkers Prev. 14(8), 1847–1850 (2005). •• Provides the original concept of the exposome.
    • 9. Vermeulen R, Schymanski EL, Barabási A-L, Miller GW. The exposome and health: where chemistry meets biology. Science 367(6476), 392–396 (2020).
    • 10. Miller GW, Jones DP. The nature of nurture: refining the definition of the exposome. Toxicol. Sci. 137(1), 1–2 (2014).
    • 11. David A, Chaker J, Price EJ et al. Towards a comprehensive characterisation of the human internal chemical exposome: challenges and perspectives. Environ. Int. 156, 106630 (2021).
    • 12. Price EJ, Vitale CM, Miller GW et al. Merging the exposome into an integrated framework for “omics” sciences. iScience 25(3), 103976 (2022).
    • 13. Zota AR, Shamasunder B. Environmental health equity: moving toward a solution-oriented research agenda. J. Expo. Sci. Environ. Epidemiol. 31(3), 399–400 (2021). • Argues for the inclusion of the structural factors that drive the environment in the exposome concept and the importance of considering these factors to develop meaningful solutions.
    • 14. Dennis KK, Auerbach SS, Balshaw DM et al. The importance of the biological impact of exposure to the concept of the exposome. Environ. Health Perspect. 124(10), 1504–1510 (2016). • Outlines why biological responses to exposures should also be considered in the exposome concept, as they allow for an accounting of the variability of response to the environment by an individual.
    • 15. Miller GW. Integrating the exposome into a multi-omic research framework. Exposome 1(1), osab002 (2021).
    • 16. Nobile S, Di Sipio Morgia C, Vento G. Perinatal origins of adult disease and opportunities for health promotion: a narrative review. J. Pers. Med. 12(2), 157 (2022).
    • 17. Itoh H, Ueda M, Suzuki M, Kohmura-Kobayashi Y. Developmental origins of metaflammation; a bridge to the future between the DOHaD theory and evolutionary biology. Front. Endocrinol. 13, 839436 (2022).
    • 18. Chung EH, Chou J, Brown KA. Neurodevelopmental outcomes of preterm infants: a recent literature review. Transl. Pediatr. 9(S1), S3–S8 (2020).
    • 19. Camerota M, Graw S, Everson TM et al. Prenatal risk factors and neonatal DNA methylation in very preterm infants. Clin. Epigenetics 13(1), 171 (2021).
    • 20. Constantino JN, Gruber C. Social Responsiveness Scale, Second Edition (SRS-2), 2nd Ed. Western Psychological Services, Torrance, CA, USA (2012).
    • 21. Constantino JN, Davis SA, Todd RD et al. Validation of a brief quantitative measure of autistic traits: comparison of the Social Responsiveness Scale with the Autism Diagnostic Interview-Revised. J. Autism Dev. Disord. 33(4), 427–433 (2003).
    • 22. Bruni TP. Test review: Social Responsiveness Scale–Second Edition (SRS-2). J. Psychoeduc. Assess. 32(4), 365–369 (2014).
    • 23. Conners C. Conners Kiddie Continuous Performance Test 2nd Edition (K-CPT 2). Multi-Health Systems, North Tonawanda, NY, USA (2006).
    • 24. Mahone EM. Measurement of attention and related functions in the preschool child. Ment. Retard. Dev. Disabil. Res. Rev. 11(3), 216–225 (2005).
    • 25. Putnam SP, Rothbart MK. Development of short and very short forms of the Children's Behavior Questionnaire. J. Pers. Assess. 87(1), 102–112 (2006).
    • 26. Rothbart MK, Ahadi SA, Hershey KL, Fisher P. Investigations of temperament at three to seven years: the Children's Behavior Questionnaire. Child Dev. 72(5), 1394–1408 (2001).
    • 27. Zelazo PD, Anderson JE, Richler J, Wallner-Allen K, Beaumont JL, Weintraub S. II. NIH Toolbox Cognition Battery (CB): measuring executive function and attention. Monogr. Soc. Res. Child Dev. 78(4), 16–33 (2013).
    • 28. Gershon RC, Cook KF, Mungas D et al. Language measures of the NIH Toolbox Cognition Battery. J. Int. Neuropsychol. Soc. 20(6), 642–651 (2014).
    • 29. Everson TM, O'Shea TM, Burt A et al. Serious neonatal morbidities are associated with differences in DNA methylation among very preterm infants. Clin. Epigenetics 12(1), 1–15 (2020).
    • 30. Liu J, Siegmund KD. An evaluation of processing methods for HumanMethylation450 BeadChip data. BMC Genomics 17(1), 469 (2016).
    • 31. Aryee MJ, Jaffe AE, Corrada-Bravo H et al. Minfi: a flexible and comprehensive bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30(10), 1363–1369 (2014).
    • 32. Pidsley R, Zotenko E, Peters TJ et al. Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole-genome DNA methylation profiling. Genome Biol. 17(1), 208 (2016).
    • 33. Pidsley R, Wong CCY, Volta M, Lunnon K, Mill J, Schalkwyk LC. A data-driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics 14(1), 293 (2013).
    • 34. Teschendorff AE, Marabita F, Lechner M et al. A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics 29(2), 189–196 (2013).
    • 35. Gatev E, Gladish N, Mostafavi S, Kobor MS. CoMeBack: DNA methylation array data analysis for co-methylated regions. Bioinformatics 36(9), 2675–2683 (2020).
    • 36. Logue MW, Smith AK, Wolf EJ et al. The correlation of methylation levels measured using Illumina 450K and EPIC BeadChips in blood samples. Epigenomics 9(11), 1363–1371 (2017).
    • 37. Carey VJ. gee: Generalized Estimating Equation Solver (2023). https://CRAN.R-project.org/package=gee
    • 38. McEwen BS, Seeman T. Protective and damaging effects of mediators of stress: elaborating and testing the concepts of allostasis and allostatic load. Ann. NY Acad. Sci. 896(1), 30–47 (1999).
    • 39. Ishwaran H, Kogalur UB. Fast unified Random Forest for Survival, Regression, and Classification (RF-SRC) (2023). www.randomforestsrc.org/
    • 40. Wright MN, Ziegler A. ranger: a fast implementation of random forests for high dimensional data in C++ and R. J. Stat. Softw. 77(1), 1–17 (2017). http://arxiv.org/abs/1508.04409
    • 41. Zheng SC, Webster AP, Dong D et al. A novel cell-type deconvolution algorithm reveals substantial contamination by immune cells in saliva, buccal and cervix. Epigenomics 10(7), 925–940 (2018).
    • 42. Lester BM, LaGasse LL, Shankaran S et al. Prenatal cocaine exposure related to cortisol stress reactivity in 11-year-old children. J. Pediatr. 157(2), 288–295 (2010).
    • 43. Dumitrescu L, Mahoney ER, Mukherjee S et al. Genetic variants and functional pathways associated with resilience to Alzheimer's disease. Brain J. Neurol. 143(8), 2561–2575 (2020).
    • 44. Schrott R, Song A, Ladd-Acosta C. Epigenetics as a biomarker for early-life environmental exposure. Curr. Environ. Health Rep. 9(4), 604–624 (2022). •• A comprehensive review detailing the role of epigenetics as biomarkers that can represent a history of exposure, particularly during development and early life.
    • 45. Walsh K, McCormack CA, Webster R et al. Maternal prenatal stress phenotypes associate with fetal neurodevelopment and birth outcomes. Proc. Natl Acad. Sci. 116(48), 23996–24005 (2019).