Study population
This prospective observational study was conducted in the neonatal intensive care unit (NICU) of a tertiary hospital in China from January 2024 to October 2025. The study population consisted of preterm infants who underwent peripherally inserted central catheter (PICC) placement during this period. A total of 490 eligible infants were screened and enrolled. Inclusion criteria were as follows: (1) gestational age less than 37 weeks, birth weight less than 2000 g; (2) admission to the NICU within 12 hours of birth; (3) PICC successfully placed and maintained at this center; (4) at least one blood culture obtained during catheterization because of clinical signs of infection (e.g., unexplained thrombocytopenia, apnea, feeding intolerance, abnormal body temperature or heart rate); (5) no history of bloodstream infection recorded before catheter placement; (6) informed consent obtained from a legal guardian. Exclusion criteria were as follows: (1) catheter removal within 48 hours after PICC placement due to non-infectious causes (e.g., catheter malfunction, completion of treatment); (2) incomplete clinical records or missing documentation of key variables. The study protocol was approved by the Institutional Review Board of Choji Maternal and Child Health Hospital (ethics approval number: CZSFYLL2024.026).
Sample size calculation
According to previous literature, the incidence of PICC-associated bloodstream infection in neonates is approximately 22%.14 Initially, 34 candidate variables were considered, but the number of variables included in the regression model was ultimately limited to fewer than eight, so eight was used as a conservative upper bound in the calculation. An events-per-variable (EPV) value of 34 was adopted in this study. The required sample size was calculated using the following formula: Required sample size = (number of variables included in the model × EPV)/(1 − event rate).15 The estimated minimum sample size was therefore 8 × 34/(1 − 0.22) ≈ 349. Allowing for an expected 10% data loss rate, a total of 384 subjects were required. Ultimately, the study enrolled 490 preterm infants, which exceeds this requirement and is sufficient for the subsequent statistical and modeling analyses.
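The arithmetic above can be reproduced with a short sketch. The function name and its parameters are illustrative and do not come from the study. The 10% allowance is assumed to be applied by multiplying by 1.10, because that reproduces the reported figure of 384.

```python
import math


def required_sample_size(n_variables, epv, event_rate, loss_rate=0.10):
    """Return (minimum n, n inflated for expected data loss), both rounded up.

    Implements the formula quoted in the text:
        n = (n_variables * EPV) / (1 - event_rate)
    The loss adjustment (multiplying by 1 + loss_rate) is an assumption
    chosen because it matches the reported total of 384.
    """
    minimum = math.ceil(n_variables * epv / (1 - event_rate))
    inflated = math.ceil(minimum * (1 + loss_rate))
    return minimum, inflated


minimum_n, target_n = required_sample_size(n_variables=8, epv=34, event_rate=0.22)
print(minimum_n, target_n)  # 349 384
```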
Evaluation of variables
We predefined a set of candidate variables based on literature review and expert discussion. The first category comprised baseline characteristics: gestational age, birth weight, biological sex, 5-minute Apgar score, method of delivery, number of fetuses, neonatal respiratory distress syndrome, and mechanical ventilation requirement. The second category comprised catheter-related factors: days at catheter indwelling, catheter insertion site, number of puncture attempts, catheter repositioning, PICC indwelling duration, frequency of dressing changes, catheter occlusion, catheter connector wrapping, and catheter displacement. The third category comprised vital sign parameters: axillary temperature, heart rate, respiratory rate, mean blood pressure, and oxygen saturation. The fourth category comprised laboratory parameters: red blood cell count (RBC), white blood cell count (WBC), platelet count (PLT), neutrophil percentage (NEUT), lymphocyte percentage (LYMPH), and C-reactive protein (CRP).
To comprehensively evaluate the association between laboratory parameters and catheter-related bloodstream infection (CRBSI), two feature sets were defined according to the timing of blood sampling. Sample 1 represents laboratory data obtained within 72 hours before blood culture collection, and Sample 2 represents laboratory data obtained on the same day as blood culture collection. In this study, blood culture results served as the gold standard for diagnosing bloodstream infection and were used to determine the CRBSI outcome. In contrast, complete blood count (CBC) and CRP results served only as laboratory input variables for model training and prediction, supporting risk assessment rather than diagnostic decisions.
Diagnostic criteria
The diagnostic criteria for CRBSI followed the 2009 guidelines published by the Infectious Diseases Society of America.16 CRBSI was defined as bacteremia or fungemia occurring within 48 hours after intravascular catheter insertion or after intravascular catheter removal, accompanied by signs of infection such as fever (>38 °C), chills, hypotension, and oliguria, with no obvious source of infection other than the intravascular catheter. In addition, laboratory microbiological testing had to show the same pathogen (such as Gram-positive cocci, coagulase-negative staphylococci, and Staphylococcus epidermidis) cultured from both the peripheral venous sample and the intravascular catheter.
Data collection method
After standardized training, two researchers independently reviewed and collected data using standardized data collection forms. Disease-related information and laboratory data were extracted from the electronic medical record system; laboratory tests were performed routinely every 3 days. Catheter-related variables were obtained from the electronic nursing record. Vital sign data were obtained from continuous bedside monitoring records, for which monitors recorded raw data hourly, whereas temperature measurements were extracted from nursing records. To assess infection-related physiological status, vital sign values within the 24 hours before blood culture collection were extracted, averaged, and used as predictor variables. Data accuracy was independently cross-checked and confirmed by the two researchers. All procedures were performed according to standardized protocols.
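The 24-hour averaging step can be sketched as follows. The record layout, the field names, and the choice to include the start of the window but exclude the collection time itself are all assumptions made for illustration; the study does not describe these details.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_vitals_before_culture(readings, culture_time, window_hours=24):
    """Average each vital sign over the window ending at blood culture collection.

    readings: list of (timestamp, {vital_name: value}) tuples (hypothetical layout).
    Only readings with culture_time - window <= timestamp < culture_time are used.
    Missing values (None) are skipped rather than treated as zero.
    """
    start = culture_time - timedelta(hours=window_hours)
    collected = {}
    for timestamp, values in readings:
        if start <= timestamp < culture_time:
            for name, value in values.items():
                if value is not None:
                    collected.setdefault(name, []).append(value)
    return {name: mean(vals) for name, vals in collected.items()}


# Hypothetical hourly monitor readings for one infant.
culture = datetime(2024, 3, 2, 8, 0)
readings = [
    (datetime(2024, 3, 1, 7, 0), {"heart_rate": 200}),               # 25 h before: excluded
    (datetime(2024, 3, 1, 9, 0), {"heart_rate": 150, "spo2": 95}),   # inside the window
    (datetime(2024, 3, 2, 7, 0), {"heart_rate": 160, "spo2": 93}),   # inside the window
]
vitals = mean_vitals_before_culture(readings, culture)
print(vitals)
```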
Data preprocessing
First, missing data across all variables were assessed. Variables with >20% missing values, such as cord blood pH and whether the patient was repositioned during catheter insertion, were excluded from further analysis. For variables with <20% missing data, values were imputed using the mean of the corresponding gestational age group to minimize potential systematic bias. For this purpose, gestational age was grouped as <28 weeks, 28 to <34 weeks, and 34 to <37 weeks. To assess the applicability of this imputation strategy, data completeness was examined across the gestational age groups. This analysis revealed a similar pattern of missingness in each group, with no gestational age group having a missing rate greater than 20%. The stratified mean imputation method was therefore considered feasible and robust. After data cleaning and preprocessing, 490 preterm infants were retained for subsequent model development and analysis. Variable-level missingness for variables with missing data, including gestational age, is summarized in Supplementary Table S1.
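A minimal sketch of mean imputation stratified by gestational age group is given below. The record structure and the field names `ga_weeks` and `crp` are hypothetical, and the example values are invented purely to show the mechanics.

```python
from statistics import mean


def ga_group(weeks):
    """Map gestational age in weeks to the three strata named in the text."""
    if weeks < 28:
        return "<28"
    if weeks < 34:
        return "28 to <34"
    return "34 to <37"


def impute_by_ga_group(records, variable):
    """Return copies of the records with missing `variable` set to its group mean.

    A value of None marks a missing observation. A group with no observed
    values raises KeyError, since no mean can be computed for it.
    """
    observed = {}
    for rec in records:
        if rec[variable] is not None:
            observed.setdefault(ga_group(rec["ga_weeks"]), []).append(rec[variable])
    group_means = {group: mean(vals) for group, vals in observed.items()}

    imputed = []
    for rec in records:
        new_rec = dict(rec)  # leave the input untouched
        if new_rec[variable] is None:
            new_rec[variable] = group_means[ga_group(new_rec["ga_weeks"])]
        imputed.append(new_rec)
    return imputed


# Invented example values, for illustration only.
records = [
    {"ga_weeks": 26, "crp": 10.0},
    {"ga_weeks": 27, "crp": None},
    {"ga_weeks": 30, "crp": 2.0},
    {"ga_weeks": 32, "crp": 6.0},
    {"ga_weeks": 31, "crp": None},
]
filled = impute_by_ga_group(records, "crp")
print([rec["crp"] for rec in filled])
```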
Statistical methods
Quantitative data that followed a normal distribution were expressed as mean ± standard deviation, with between-group comparisons performed using the independent samples t test. Quantitative data that did not follow a normal distribution were expressed as median (P25, P75), with between-group comparisons performed using the Mann-Whitney U test. Categorical variables were expressed as frequencies and percentages, and comparisons were performed using the chi-square test or Fisher’s exact test, as appropriate. Because the univariate analysis involved a large number of candidate variables, the P values from all univariate tests were subjected to false discovery rate (FDR) correction using the Benjamini-Hochberg method to control the risk of false positives from multiple comparisons, yielding adjusted Q values. After FDR correction, a Q value <0.05 was defined as statistically significant. All statistical analyses were performed using SPSS version 24.0 software.
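For readers unfamiliar with the Benjamini-Hochberg adjustment, a minimal sketch is given below. It is written in Python for illustration only and is not the SPSS procedure used in the study. The function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted Q values in the original order.

    Each P value is scaled by m / rank (rank 1 = smallest P value), then a
    running minimum taken from the largest P value downward keeps the
    adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical P values from four univariate tests.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
print(q_values, significant)
```

In this example only the first variable stays below the 0.05 threshold after adjustment, although three of the four raw P values are below 0.05.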
Machine learning algorithms
Model development was performed using R software (version 4.4.2). All candidate variables, including demographic characteristics, vital signs, catheter-related factors, and both laboratory feature sets (Sample 1 and Sample 2), were entered into feature selection. Three feature selection methods were applied: least absolute shrinkage and selection operator (LASSO) regression, the Boruta algorithm, and recursive feature elimination (RFE). Features identified by all three methods were retained as a robust subset of key predictors. Based on these selected features, four machine learning models were built: logistic regression (LR), decision tree (DT), random forest (RF), and light gradient boosting machine (LightGBM). The dataset was randomly split into a training set (70%) and a validation set (30%). On the training set, hyperparameter optimization and model selection were performed using grid search combined with 10-fold cross-validation. The performance of the final model was evaluated on the held-out validation set. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis (DCA). Accuracy, specificity, precision, recall, and F1 score were calculated for each machine learning model. Additionally, 95% confidence intervals were reported for AUC, precision, recall, and specificity to increase the statistical clarity of model evaluation. Confidence intervals for precision, recall, and specificity were calculated using the exact Clopper-Pearson method. The optimal model was selected based on a comprehensive evaluation of AUC and accuracy. Statistical comparisons of AUC between models were performed using the DeLong test. Variable importance rankings were visualized using an importance matrix, and the SHapley Additive exPlanations (SHAP) method was applied to interpret the contribution of each variable to model predictions.
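As an illustration of the exact Clopper-Pearson interval mentioned above, the following sketch finds the two bounds by bisection on the binomial distribution function. It is a Python illustration under stated assumptions, not the R code used in the study. The function names and the example counts are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iterations=100):
    """Bisection on [0, 1] for a function f that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    if successes == 0:
        lower = 0.0
    else:
        # Lower bound p solves P(X >= successes | p) = alpha / 2.
        lower = _solve_decreasing(
            lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2
        )
    if successes == n:
        upper = 1.0
    else:
        # Upper bound p solves P(X <= successes | p) = alpha / 2.
        upper = _solve_decreasing(lambda p: binom_cdf(successes, n, p), alpha / 2)
    return lower, upper


# Hypothetical counts: 18 true positives among 20 infected infants (recall = 0.90).
recall_ci = clopper_pearson(successes=18, n=20)
print(recall_ci)
```

The same function applies to precision and specificity by passing the corresponding numerator and denominator from the confusion matrix.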
