The aims of this study were (1) to determine how early postpartum depression (PPD) can be predicted during pregnancy and (2) to identify the most predictive prenatal risk factors for developing PPD (measured at 8-10 weeks postpartum). To achieve this, we evaluated the performance of nine machine learning models. The models were trained on progressively more data: the first model included only first-trimester data, and the final model used variables collected throughout pregnancy (i.e., data collected in all three trimesters). This process mimics clinical practice, where healthcare professionals accumulate data from pregnant women over the course of pregnancy. First, our results suggest that women at low risk of PPD can already be identified based on variables collected during early pregnancy. Second, we identified the following variables as the most important prenatal predictors of PPD: depressive symptoms during pregnancy, negative affectivity, pregnancy distress, and a history of treatment for mental disorders.
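The cumulative design described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the variable names and their grouping by trimester are hypothetical, and the sketch only shows how each successive model would receive all variables collected up to that point.

```python
from itertools import accumulate

# Hypothetical variable names, grouped by the trimester in which they were collected.
TRIMESTER_VARIABLES = {
    "T1": ["eds_t1", "history_mental_treatment", "bmi"],
    "T2": ["eds_t2", "pregnancy_distress_t2"],
    "T3": ["eds_t3", "pregnancy_distress_t3"],
}


def cumulative_feature_sets(blocks):
    """Map each stage label to every variable collected up to that stage."""
    labels = list(blocks)
    # Concatenate the per-trimester lists one stage at a time.
    running = accumulate((blocks[label] for label in labels), lambda seen, new: seen + new)
    return {"+".join(labels[: i + 1]): list(features) for i, features in enumerate(running)}


feature_sets = cumulative_feature_sets(TRIMESTER_VARIABLES)
for stage, features in feature_sets.items():
    print(stage, len(features))
```

Each entry of `feature_sets` would then define the predictors for one training run, so that later models can only add information relative to earlier ones.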
Our findings show that, using variables collected during the first trimester, PPD can be predicted as early as 12 weeks of pregnancy. Model performance improved slightly with the addition of data from the second and third trimesters, but the mean squared error (MSE) remained relatively stable. This suggests that additional information from the later stages of pregnancy does not substantially improve the predictive model and that first-trimester predictors are sufficient for early prediction of PPD. Importantly, the high specificity of our model shows that we can accurately identify women who are unlikely to develop PPD, allowing healthcare providers to forgo intensive interventions for this low-risk group and focus their resources on those who need them most. However, it is difficult to compare our MSE values with those of other studies, as most researchers treat PPD prediction as a classification task and dichotomize EDS scores at clinical cutoffs rather than treating it as a regression problem [18]. Nevertheless, most classification performance metrics found in this study are comparable to those reported in similar studies.
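To make the relationship between the regression objective and the classification metrics concrete, the sketch below computes the MSE on continuous scores and then dichotomizes observed and predicted scores at a cutoff to obtain sensitivity and specificity. The toy scores and the cutoff value of 10 are assumptions chosen purely for illustration; they are not data or settings from this study.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted scores."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def sensitivity_specificity(observed, predicted, cutoff):
    """Dichotomize both score lists at `cutoff` and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for o, p in zip(observed, predicted):
        is_case, is_flagged = o >= cutoff, p >= cutoff
        if is_case and is_flagged:
            tp += 1
        elif is_case:
            fn += 1
        elif is_flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (illustrative only): observed and model-predicted symptom scores.
observed = [2, 4, 5, 7, 8, 12, 15]
predicted = [3.0, 4.5, 5.0, 6.0, 9.0, 8.5, 11.0]
CUTOFF = 10  # assumed cutoff for this example

mse = mean_squared_error(observed, predicted)
sens, spec = sensitivity_specificity(observed, predicted, CUTOFF)
print(mse, sens, spec)
```

In this toy example the model misses one of the two high scorers while correctly clearing every low scorer, which is the same qualitative pattern of high specificity and lower sensitivity discussed here.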
However, it should be noted that the sensitivity found in this study (0.19) was lower than that reported in previous similar studies. For example, Andersson et al. [16] predicted PPD at 6 weeks postpartum and reported a sensitivity of 0.71. Although we cannot directly compare predictive performance across studies due to differences in sample size, variables, and ML algorithms, we consider several possible explanations for this difference. First, many existing studies use postpartum data collected at the same time as the depressive symptoms being predicted, which amounts to "nowcasting" rather than actual prediction. Such models essentially describe how PPD presents itself (i.e., its indicators and symptoms) rather than predicting it. In contrast, our study aimed to predict PPD using data collected only during pregnancy. This strategy is clinically preferable, as early intervention during pregnancy may prevent PPD before its onset. Second, our model was optimized to minimize MSE rather than classification-specific metrics such as sensitivity. Furthermore, the distribution of EDS scores was highly right-skewed (i.e., a floor effect), with the majority of mothers having total scores below 10. This skewness limits the ability of the model to learn patterns from the data of mothers at high risk of developing PPD, as relatively little information is available for this group. Finally, prediction of PPD can be challenging because research shows that it is not a single homogeneous disorder [27]. Several studies, including Osborne et al. [28], suggest that PPD consists of multiple subtypes, each associated with different risk factors. Specifically, Osborne et al. [28] propose that women who develop PPD without prenatal depressive symptoms differ hormonally from women who experience depression during pregnancy and go on to develop PPD, highlighting the heterogeneity of the disorder.
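The interaction between an MSE objective and a right-skewed outcome can be illustrated with a deliberately simple toy example. Among all constant predictions, the sample mean minimizes the MSE; when most scores are low, that mean falls below the cutoff, so no high scorer is flagged. The scores and the cutoff of 10 below are assumptions made for illustration and are not data from this study, and a real model with predictors would behave less extremely than this constant baseline.

```python
from statistics import mean

# Toy right-skewed scores (illustrative only): most values are low, a few are high.
scores = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 8, 13, 15]
CUTOFF = 10  # assumed cutoff for this example


def mse_of_constant(values, constant):
    """MSE obtained when every individual receives the same predicted score."""
    return sum((v - constant) ** 2 for v in values) / len(values)


# The sample mean is the constant prediction with the lowest MSE.
best_constant = mean(scores)

# Dichotomize: who is truly above the cutoff, and who would be flagged?
high_scorers = [s for s in scores if s >= CUTOFF]
flagged_high_scorers = [s for s in high_scorers if best_constant >= CUTOFF]
sensitivity = len(flagged_high_scorers) / len(high_scorers)

print(best_constant, mse_of_constant(scores, best_constant), sensitivity)
```

Predicting the cutoff value for everyone would flag every high scorer but at a clearly higher MSE, which shows why a model tuned for MSE tends to pull predictions toward the bulk of low scores.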
Our findings further show that self-reported depressive symptoms during pregnancy were the most predictive risk factor across all trimesters. This finding, which is consistent with other recent studies, is encouraging because the EDS is a simple 10-item questionnaire that is easy to administer and score, making it feasible for routine screening during pregnancy [29]. This aligns closely with the priorities of healthcare providers, who have expressed a need for screening tools that are quick and easy to administer [30].
In addition to self-reported depressive symptoms, several other psychological risk factors were identified as important. These included self-report measures of pregnancy distress (TPDS) [25], neuroticism (BFI-2-S) [26], and negative affectivity (DS14) [24], as well as a history of treatment for mental disorders. The personality traits neuroticism and negative affectivity (the latter being a dimension of Type D personality) have both been linked to PPD in previous studies [31,32]. People with higher levels of neuroticism tend to experience more negative emotions due to increased stress reactivity [33]. The postpartum period, which is particularly stressful, can amplify these negative emotions, increasing the risk of PPD. Similarly, individuals high in negative affectivity have a general tendency to experience negative emotions, which is associated with an increased risk of PPD [34].
In addition to psychological variables, biological risk factors such as low BMI and rheumatism also predicted PPD. Previous studies have shown that low BMI is a risk factor for PPD independent of a history of depression, as mood regulation is closely related to nutritional status [35]. It should be noted that elevated BMI has also been associated with an increased risk of PPD, possibly via pro-inflammatory pathways [36]. However, the underrepresentation of women with high BMI in our cohort reduced the statistical power to replicate this finding. Different types of rheumatic diseases have previously been linked to PPD [37]. In particular, rheumatoid arthritis was found to be associated with an increased risk of PPD in women with no history of mental disorders [38]. Although our study did not distinguish between different types of rheumatic conditions, similar physiological and psychological mechanisms, such as chronic inflammation, hormonal fluctuations, and dysregulation of the immune system, are likely to underlie the association between rheumatism and PPD.
Our study has several strengths and limitations. One strength is that it is closely aligned with clinical practice, as risk factors were measured in parallel with routine check-ups by Dutch midwives and gynecologists. This means that the identified risk factors can be readily incorporated into routine screening protocols. Another strength is the inclusion of a wide range of potential risk factors, which allowed us to assess the relative predictive value of these variables within a single model. Furthermore, the large sample size improves the robustness and generalizability of the machine learning models.
A limitation of this study, however, is its non-representative sample, which consisted mainly of highly educated white mothers who were married or living with a partner. This may limit the generalizability of the findings to more diverse populations. Furthermore, the right-skewed distribution of EDS scores and the choice to optimize mean squared error rather than sensitivity potentially constrained the model's ability to detect true positives among the women at highest risk. To address these limitations, future studies could recruit more diverse demographic and clinical subgroups (including various socioeconomic and ethnic groups) and incorporate additional predictors, such as perinatal biomarkers (e.g., DNA methylation and inflammation markers) and longitudinal data collected at more frequent intervals, to improve sensitivity in detecting women who develop PPD.
Nevertheless, our findings are highly relevant to clinical practice. Midwives can tailor screening at each stage of pregnancy using risk factors that are specific to each trimester. Our study suggests that while some risk factors are consistent throughout pregnancy, others may differ between trimesters. By identifying the most predictive risk factors early, providers can implement targeted interventions sooner. Furthermore, our results show that we can accurately identify mothers who are unlikely to develop PPD, which may allow for more efficient allocation of healthcare resources to those at higher risk and may improve prevention and intervention efforts. Although the inclusion of second- and third-trimester predictors only slightly improved predictive performance, continuous monitoring of risk factors throughout pregnancy and the postpartum period remains essential, as certain symptoms may appear later.
In summary, this study illustrates the potential of machine learning models, particularly regularized linear regression techniques, to predict PPD using only prenatal data. The consistent role of depressive symptoms and of psychological traits such as neuroticism and negative affectivity highlights the need for early mental health screening during pregnancy. Identifying those at risk early can help health professionals intervene sooner and improve outcomes for both mother and child. These findings suggest that integrating routine psychological assessments into prenatal care can enhance early detection of PPD risk, allow for timely referrals to mental health services, and ultimately reduce the prevalence and impact of PPD.
