Validation of sleep-based actigraphy machine learning models for predicting preterm birth



Research characteristics

This study was conducted as part of the March of Dimes Prematurity Research Center at Washington University in St. Louis/BJC Healthcare12 and was approved by the Washington University IRB (Reference #201612070) in accordance with FDA Good Clinical Practice and the Declaration of Helsinki. Written informed consent was obtained from participants for the collection of clinical data, biological samples, imaging, and questionnaire data. Patients were recruited at the Washington University Medical Campus if they had a singleton pregnancy with a presumed gestational age (GA) of less than 20 weeks, were scheduled to deliver at Barnes-Jewish Hospital, and were over 18 years old.

Trained obstetric research staff used a series of case report forms to collect baseline maternal demographics, medical history, anthropometric data, and obstetric outcomes, as previously described12. Patient data were collected at scheduled study visits and at delivery during each pregnancy; biological samples, imaging, actigraphy, and responses to standardized surveys were obtained from each patient.

The survey data included questions from 11 different validated surveys and standalone questions covering stress, schedule, sleep quality, physical activity, postnatal depression, diet, demographics, and overall lifestyle. PTB labels were derived from the reported estimated date of confinement (EDC): births occurring more than 3 weeks before the listed EDC were labeled as PTB. The EDC was derived from the patient's last menstrual period or initial ultrasound27.
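Under this definition, the PTB label reduces to a simple date comparison. A minimal sketch, assuming an illustrative `label_ptb` helper (not from the paper) and a strict greater-than-21-days boundary convention:

```python
from datetime import date

# Births occurring more than 3 weeks (21 days) before the EDC are labeled PTB.
# The exact boundary convention (> vs >=) is an assumption here.
PTB_CUTOFF_DAYS = 21

def label_ptb(birth_date: date, edc: date) -> int:
    """Return 1 if the birth is preterm relative to the EDC, else 0."""
    return int((edc - birth_date).days > PTB_CUTOFF_DAYS)
```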

Actigraphy feature design

Actigraphy measurements were collected over two weeks of each gestational period (first trimester: 0 to 13 weeks and 6 days, second trimester: 14 to 27 weeks and 6 days, third trimester: 28 weeks or more). Patients were reminded via phone, email, and text to return the actigraphy watch after the capture period, either at the following study visit or by courier service12. This analysis excluded patients who had no actigraphy data in either early or late pregnancy.
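The trimester windows above map cleanly onto gestational age in days. A small sketch (the function name and day arithmetic are illustrative, not from the paper):

```python
def trimester(ga_days: int) -> int:
    """Map gestational age in days to the study's trimester windows:
    1: day 0 through 13 weeks 6 days (day 97)
    2: 14 weeks 0 days through 27 weeks 6 days (days 98-195)
    3: 28 weeks 0 days (day 196) onward."""
    if ga_days <= 13 * 7 + 6:
        return 1
    if ga_days <= 27 * 7 + 6:
        return 2
    return 3
```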

The raw actigraphy signals are very high resolution, so the designed features aggregate these raw time series in day-level windows to keep the data tractable for training shallow ML models. In addition to the day-level measurements, we also compute absolute changes between consecutive days in which data are present.

To generate these features, all actigraphy data are split at midnight into calendar days, and the sleep cycles occurring on each given day are estimated from them. An overview of the calculated features used in the ML model dataset is provided in Section 2 of the Supplementary Material.
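A minimal pandas sketch of this day-level windowing, assuming the raw signal is a `pandas.Series` of activity counts with a datetime index (the function and column names are illustrative, not the paper's implementation):

```python
import pandas as pd

def day_level_features(activity: pd.Series) -> pd.DataFrame:
    """Split raw actigraphy at midnight into day-level windows, aggregate each
    day, and add the absolute change between consecutive days with data."""
    daily = activity.resample("1D").agg(["mean", "std"])
    daily = daily.dropna(subset=["mean"])          # keep only days with data
    daily["abs_change_mean"] = daily["mean"].diff().abs()
    return daily
```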

Model design

For each study participant, day-level actigraphy features are aggregated to their mean and standard deviation over the entire gestational period. When windows are evaluated for gestational-age cutoffs earlier than the full pregnancy, all actigraphy data beyond the set cutoff are dropped (for example, a cutoff of 140 days removes all data collected after 140 days of gestation, and the remaining data are aggregated).
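Under this reading, a cutoff simply truncates the day-level table before aggregation. A sketch, assuming a `ga_days` column holds each row's gestational age in days (names are illustrative):

```python
import pandas as pd

def windowed_aggregate(day_features: pd.DataFrame, cutoff_days: int) -> pd.Series:
    """Drop all day-level rows at or beyond the gestational-age cutoff, then
    aggregate every remaining feature to its mean and standard deviation."""
    kept = day_features[day_features["ga_days"] < cutoff_days]
    feats = kept.drop(columns=["ga_days"])
    return pd.concat([feats.mean().add_suffix("_mean"),
                      feats.std().add_suffix("_std")])
```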

For survey data, we select features using both domain knowledge and automated techniques. First, we select a set of predefined features based on prior clinical knowledge, together with aggregate values of individual birth-related questions. After these features, we select 10 additional features with a minimum-redundancy maximum-relevance algorithm using semantic textual similarity scores generated by PubMedBERT28, fine-tuned on several clinical and general datasets29. Features that are not numerically expressed are removed. A complete list of features used can be found in Section 2 of the Supplementary Material.
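The selection step can be sketched as a greedy score that trades relevance against redundancy. This toy version (not the paper's implementation) takes precomputed relevance scores and a pairwise semantic-similarity matrix in place of the PubMedBERT embeddings:

```python
import numpy as np

def greedy_select(relevance: np.ndarray, similarity: np.ndarray, k: int) -> list:
    """Greedily pick k features, each time maximizing
    relevance minus mean similarity to the features already chosen."""
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        scores = {
            i: relevance[i] - np.mean([similarity[i, j] for j in selected])
            for i in range(len(relevance)) if i not in selected
        }
        selected.append(max(scores, key=scores.get))
    return selected
```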

After this, we concatenate both data sources, scale all numeric features to a standard normal distribution, and encode all categorical features as ordinal values. Missing values are imputed with either the mean, median, most frequent value, or the mean of the five nearest neighbors, learned during cross-validation (CV). Data are randomly split into 80%/20% train/test splits. For the entire cohort, 532 and 133 patients appeared in each split, with 66 and 28 PTB patients, respectively. For the nulliparous cohort, there are 238 patients in the train set and 59 in the test set, with 24 and 9 PTB patients in the respective splits.
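A scikit-learn sketch of this preprocessing, simplified in that the imputer and scaler are fit on the full matrix rather than inside CV folds as the text describes (the function name is illustrative):

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def preprocess_and_split(X: np.ndarray, y: np.ndarray, seed: int = 0):
    """Impute missing values with the mean of the 5 nearest neighbors,
    z-score the features, and make a stratified 80%/20% train/test split."""
    X = KNNImputer(n_neighbors=5).fit_transform(X)
    X = StandardScaler().fit_transform(X)
    return train_test_split(X, y, test_size=0.2, stratify=y, random_state=seed)
```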

We train several standard ML models, including logistic regression, linear support vector machines (SVMs), kernelized/nonlinear SVMs30, XGBoost31, and Gaussian naive Bayes (NB)32. Logistic regression predicts the output class by applying a sigmoid to a linear combination of weighted inputs. Linear SVMs predict classes using a linearly separating hyperplane, and kernelized SVMs use kernel functions to learn nonlinear separations between classes30. XGBoost is a gradient-boosting method that builds an ensemble of decision trees to optimize predictive performance31, and Gaussian NB models the output classes under the assumption that each feature follows a normal distribution32. We evaluate results over 10 random initializations for each model in the “Results” section and report the mean AUROC and AUPRC with pooled33 95% confidence intervals across all initializations. SHAP values are averaged over all random initializations. A graphical overview of this training pipeline can be found in Section 2 of the Supplementary Material.
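The repeated-initialization evaluation can be sketched with one of the listed models. This illustrative version uses logistic regression and a simple normal-approximation CI rather than the pooling procedure cited in the text:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

def metrics_over_seeds(X, y, n_seeds=10):
    """Train over several random splits/initializations and return the mean
    AUROC and AUPRC plus a 95% CI half-width (normal approximation)."""
    aurocs, auprcs = [], []
    for seed in range(n_seeds):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, stratify=y, random_state=seed)
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        p = model.predict_proba(X_te)[:, 1]
        aurocs.append(roc_auc_score(y_te, p))
        auprcs.append(average_precision_score(y_te, p))
    half_width = 1.96 * np.std(aurocs) / np.sqrt(n_seeds)
    return float(np.mean(aurocs)), float(np.mean(auprcs)), float(half_width)
```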

We use 5-fold stratified CV on the training set, maintaining class proportions across each fold, to find the best hyperparameters for each model. For XGBoost, the hyperparameter space spans 1–3 estimators, a maximum depth of 1–3, a learning rate of 0.1, and objectives appropriate for AUROC. For linear SVMs, we test regularization parameters over a logarithmic range from 0.001 to 10 with a maximum of 1000 training iterations. For nonlinear SVMs, we evaluate polynomial and radial-basis-function kernels over the same parameters as the linear SVM. For logistic regression, we sweep the regularization parameter from 0.001 to 10 with an l2 penalty and a maximum of 1000 training iterations. For Gaussian NB, we use 10-9 as a fixed smoothing parameter.
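For logistic regression, for example, this search can be sketched with scikit-learn's `GridSearchCV` (the exact grid points within the 0.001–10 range are assumptions):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

def tune_logistic_regression(X, y):
    """5-fold stratified CV over an l2-penalized logistic regression,
    sweeping C logarithmically from 0.001 to 10 and scoring by AUROC."""
    grid = {"C": [0.001, 0.01, 0.1, 1.0, 10.0]}
    search = GridSearchCV(
        LogisticRegression(penalty="l2", max_iter=1000),
        grid,
        scoring="roc_auc",
        cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    )
    return search.fit(X, y)
```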


