Study design and setting
We conducted a retrospective study using electronic medical records to assess predictors of mortality in pediatric patients with fungemia in the pediatric intensive care unit (PICU) of King Abdulaziz University Hospital, Jeddah, Saudi Arabia. The analysis included data from 2016 to 2020. The PICU manages approximately 500 admissions per year and provides a robust dataset to assess the impact of Candida scores and other clinical risk factors on outcomes. As a tertiary referral center, the hospital manages a large volume of medically complex pediatric patients from across the region.
Ethical considerations
This study complied with the ethical standards outlined in the Declaration of Helsinki (1964) and its later amendments. Ethical approval was obtained from the Institutional Review Board (IRB) of King Abdulaziz University Hospital, Jeddah, Saudi Arabia (reference number 566-23). Due to the retrospective nature of the analysis, the Institutional Review Board waived the requirement for informed consent. Data anonymization and security measures were introduced to strictly protect patient confidentiality and anonymity. Patient identifiers were removed from the dataset before analysis and replaced with a unique anonymization code. Electronic data were encrypted and stored on a secure, password-protected server, and could only be accessed by authorized study personnel. Physical copies of the data were stored in locked cabinets within a secure facility, adhering to the highest standards of privacy and ethical integrity, and minimizing the risk of data breach while maintaining the usefulness of the data for research purposes.
Study population
The study cohort consisted of pediatric patients (1 month to 14 years) admitted to the PICU with blood cultures positive for Candida species. Patients with incomplete medical records were excluded; a record was considered incomplete if any of the following was missing: mortality status at PICU discharge; Candida score components (total parenteral nutrition, recent surgery, multifocal colonization, severe sepsis); duration of mechanical ventilation; PRISM score; PICU length of stay; or the dates required to derive durations (antibiotic administration and hospitalization before fungemia). Neonates (<1 month) were not included because our institution has a separate neonatal intensive care unit (NICU). Additional exclusion criteria were: development of fungemia after transfer out of the PICU; receipt of antifungal treatment before PICU admission; PICU stay of less than 24 hours; and comorbid conditions unrelated to fungemia in which death was considered imminent (e.g., terminal malignancy under a palliative plan, end-stage organ failure not suitable for transplantation).
Main variables
The main variables comprised demographic variables (age, gender, weight); a clinical severity score (PRISM); duration of mechanical ventilation (DOMV); PICU length of stay (LOPICU); duration of antibiotics before fungal infection (DOAbBFI); length of hospitalization before fungal infection (LHSBFI); Candida score (CScore); and presence of Candida albicans (Calbicans).
Diagnosis of fungemia
Blood cultures were processed using the BacT/ALERT automated system (bioMérieux, France). For each sample, 5 mL of blood was inoculated into pediatric culture bottles and incubated with continuous monitoring; bottles flagged as positive were subcultured on Sabouraud dextrose agar for identification. Yeast species were identified by VITEK MS, and antifungal susceptibility testing was performed with VITEK® 2 (bioMérieux, France). Cultures with no growth after 5 days were reported as negative.
Candida score
A 4-item Candida score was calculated (total parenteral nutrition = 1, recent surgery = 1, multifocal Candida colonization = 1, severe sepsis = 2); a score ≥ 3 indicates high risk. In this study, the Candida score was evaluated not as a diagnostic tool but as a predictor of mortality after the diagnosis of fungemia.
Outcome measure
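The scoring rule above can be written out as a short function. This is a minimal sketch for illustration only: the function and argument names are hypothetical and are not taken from the study code.

```python
def candida_score(tpn: bool, recent_surgery: bool,
                  multifocal_colonization: bool, severe_sepsis: bool) -> int:
    """Sum the four weighted items (1 + 1 + 1 + 2, so the maximum is 5)."""
    return (int(tpn) + int(recent_surgery)
            + int(multifocal_colonization) + 2 * int(severe_sepsis))


def is_high_risk(score: int, cutoff: int = 3) -> bool:
    """A score at or above the cutoff (3 in the text) is treated as high risk."""
    return score >= cutoff


# Example: a patient on total parenteral nutrition with severe sepsis.
score = candida_score(tpn=True, recent_surgery=False,
                      multifocal_colonization=False, severe_sepsis=True)
print(score, is_high_risk(score))
```

Because severe sepsis carries two points, a patient reaches the high-risk cutoff with severe sepsis plus any one other item, or with all three one-point items.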
The primary outcome was mortality at PICU discharge.
Statistical analysis
Statistical software and data processing
Analyses were performed in Python 3.8. Logistic regression was fitted using statsmodels 0.13.5. Random forest and gradient boosting machine models were implemented in scikit-learn 1.0.2. pandas 1.5.3 and numpy 1.23.5 were used for data wrangling and calculations.
Modeling and validation overview
Our primary analysis used multivariable logistic regression (LR) with prespecified covariates. Random forest [22] and gradient boosting machine [23] were secondary, exploratory models that served as benchmarks for discrimination. For performance reporting, we used (a) an 80/20 train/test split for held-out test set metrics, (b) 10-fold cross-validation on the full dataset for internal stability, and (c) bootstrapping (1,000 resamples) and permutation testing as additional internal validation and negative control steps.
Descriptive statistics
Descriptive statistics were used to summarize the demographic and clinical characteristics of pediatric patients. Continuous variables were expressed as median with interquartile range or mean with standard deviation based on distribution. Categorical variables were described using percentages and frequencies to reveal their distribution within the study population.
Data processing for model development
Data were complete, so no imputation was required. Outliers were capped at 1.5 times the interquartile range (IQR). Continuous features were normalized to the range [0, 1] by min-max scaling. The categorical variable gender was encoded as a binary indicator (one-hot encoding) for inclusion in the regression. (Dataset partitioning is described under Modeling and validation overview and is not repeated here.)
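The preprocessing steps can be sketched in pandas as follows. The text does not state where the caps sit, so this sketch assumes the common Tukey fences (Q1 - 1.5 × IQR and Q3 + 1.5 × IQR); the toy data, column names, and helper names are hypothetical.

```python
import pandas as pd


def cap_outliers_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] (assumed Tukey fences)."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


def min_max_scale(s: pd.Series) -> pd.Series:
    """Rescale to [0, 1]; a constant column maps to 0.0 to avoid dividing by zero."""
    span = s.max() - s.min()
    if span == 0:
        return pd.Series(0.0, index=s.index)
    return (s - s.min()) / span


# Toy data for illustration only (not study data).
df = pd.DataFrame({
    "age_months": [3, 12, 24, 36, 48, 60, 400],
    "gender": ["F", "M", "M", "F", "M", "F", "M"],
})

capped = cap_outliers_iqr(df["age_months"])        # 400 is pulled in to the upper fence
df["age_scaled"] = min_max_scale(capped)           # continuous feature in [0, 1]
df["gender_male"] = (df["gender"] == "M").astype(int)  # binary indicator
print(df)
```

In this toy column the quartiles are 18 and 54, so the upper fence is 108 and the value 400 is capped there before scaling.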
Multivariable logistic regression analysis
Multivariable logistic regression was used to estimate associations with mortality. For each predictor variable, we report the odds ratio (OR) and 95% confidence interval (CI). A p-value < 0.05 was considered statistically significant. Prespecified covariates were CScore, DOMV, LOPICU, age, weight, PRISM, DOAbBFI, LHSBFI, gender, and Calbicans. All covariates were entered simultaneously to assess independent associations while avoiding the bias of automated stepwise selection.
Machine learning model development
In addition to logistic regression, we developed two ensemble learning models: random forest (RF) and gradient boosting machine (GBM). For the RF model, we used 100 decision trees (n_estimators = 100), with split quality measured by the Gini impurity criterion. For the GBM model, we used a learning rate of 0.1 and 100 boosting stages. Both models were implemented in Python using the scikit-learn library (version 1.0.2). Default values were used for all other hyperparameters, because extensive tuning is often neither feasible nor reliable for small datasets. The objective was to compare the performance of these standard, well-established models with that of logistic regression.
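The configuration described above corresponds to the following scikit-learn sketch. The synthetic data, the random seeds, and the stratified split are assumptions added for a reproducible example; the text specifies only the 80/20 proportion and the listed hyperparameters.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 10 prespecified covariates (not study data).
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           random_state=0)

# 80/20 split; stratification by outcome is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Hyperparameters named in the text; everything else stays at library defaults.
rf = RandomForestClassifier(n_estimators=100, criterion="gini", random_state=0)
gbm = GradientBoostingClassifier(learning_rate=0.1, n_estimators=100,
                                 random_state=0)

for name, model in [("RF", rf), ("GBM", gbm)]:
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]  # predicted probability of class 1
    print(name, "held-out ROC AUC:", round(roc_auc_score(y_test, proba), 3))
```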
Model validation
Unless otherwise noted, we assessed internal validity using three complementary procedures applied to the complete analysis dataset. First, 10-fold cross-validation divided the data into 10 folds; the model was trained on nine folds and evaluated on the held-out fold, this was repeated across all folds, and performance was summarized as the mean and standard deviation of fold-level metrics. Second, bootstrap resampling (1,000 resamples) drew samples of size n with replacement; the model was fitted within each resample, and performance was summarized as the mean (SD) across resamples. Third, for permutation testing (1,000 permutations), we randomly permuted the outcome labels and retrained the model for each permutation to generate a null distribution of performance and provide a negative control benchmark.
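The three procedures can be sketched with scikit-learn as shown below. Several choices here are assumptions not stated in the text: a scikit-learn logistic regression stands in for the study models, bootstrap models are scored on their out-of-bag rows, permutation scores are cross-validated, and only 50 resamples are drawn to keep the example fast (the study used 1,000).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (StratifiedKFold, cross_val_score,
                                     permutation_test_score)

N_RESAMPLES = 50  # illustration only; the text reports 1,000

X, y = make_classification(n_samples=200, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)
model = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# (1) 10-fold cross-validation: one ROC AUC per held-out fold.
cv_auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")

# (2) Bootstrap: fit on a resample of size n, score on the rows left out.
rng = np.random.default_rng(0)
n = len(y)
boot_auc = []
for _ in range(N_RESAMPLES):
    idx = rng.integers(0, n, size=n)            # sample with replacement
    oob = np.setdiff1d(np.arange(n), idx)       # out-of-bag rows
    if len(np.unique(y[idx])) < 2 or len(np.unique(y[oob])) < 2:
        continue                                # AUC needs both classes
    fitted = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    boot_auc.append(roc_auc_score(y[oob], fitted.predict_proba(X[oob])[:, 1]))

# (3) Permutation test: shuffle labels, refit, and build a null distribution.
observed, null_scores, p_value = permutation_test_score(
    model, X, y, cv=cv, scoring="roc_auc",
    n_permutations=N_RESAMPLES, random_state=0)

print(f"CV AUC: {cv_auc.mean():.3f} ({cv_auc.std():.3f})")
print(f"Bootstrap AUC: {np.mean(boot_auc):.3f} ({np.std(boot_auc):.3f})")
print(f"Observed AUC {observed:.3f} vs null mean {null_scores.mean():.3f}, "
      f"p = {p_value:.3f}")
```

A model that has learned real signal should score well above the permuted-label distribution, whose AUC is expected to sit near 0.5.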
Model performance metrics
For binary mortality prediction, we report accuracy, precision, recall (sensitivity), F1 score, and area under the receiver operating characteristic curve (ROC AUC). Unless otherwise stated, a probability threshold of 0.50 was used to derive class labels and confusion matrices. For resampling procedures (cross-validation, bootstrap, permutation), performance is summarized as the mean (SD) over folds, resamples, or permutations.
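A small worked example shows how these metrics follow from predicted probabilities. The labels and probabilities are made up for illustration, and treating a probability exactly equal to 0.50 as positive is an assumption of this sketch.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

# Made-up outcomes (1 = died) and predicted probabilities.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_prob = np.array([0.10, 0.40, 0.35, 0.80, 0.45, 0.60, 0.70, 0.90])

# Class labels at the 0.50 threshold (>= 0.50 counts as positive here).
y_pred = (y_prob >= 0.50).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
metrics = {
    "accuracy": accuracy_score(y_true, y_pred),    # (TP + TN) / N
    "precision": precision_score(y_true, y_pred),  # TP / (TP + FP)
    "recall": recall_score(y_true, y_pred),        # TP / (TP + FN)
    "f1": f1_score(y_true, y_pred),                # harmonic mean of the two above
    "roc_auc": roc_auc_score(y_true, y_prob),      # uses probabilities, not labels
}
print(dict(tn=tn, fp=fp, fn=fn, tp=tp))
print({k: round(v, 4) for k, v in metrics.items()})
```

Here TP = 3, FP = 1, FN = 1, and TN = 3, so accuracy, precision, recall, and F1 all equal 0.75, while the ROC AUC is 13/16 = 0.8125 because 13 of the 16 positive-negative pairs are ranked correctly.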
Comparative model analysis
Comparisons between multivariable logistic regression (primary) and the random forest and gradient boosting machine models (exploratory) were performed using the same set of metrics defined above. Hyperparameters followed standard defaults: RF with 100 trees (Gini criterion) and GBM with a learning rate of 0.1 and 100 boosting stages.
Feature importance analysis
Feature importance was analyzed for each model to identify the most influential predictors of mortality. For the logistic regression model, feature importance was determined by the absolute value of the standardized regression coefficient. For the random forest and gradient boosting machine models, we used the standard mean decrease in impurity (MDI) method. MDI quantifies the importance of a feature as the total reduction in node impurity (the Gini criterion for the random forest) attributable to splits on that feature, averaged over all trees in the ensemble. Features that produce the greatest reduction in impurity are considered the most important. Tree-based importances were obtained using built-in functions of the scikit-learn library in Python.
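The two importance measures can be illustrated as follows. This is a sketch under stated assumptions: the data are synthetic, the feature names are borrowed from the covariate list only as labels, predictors are z-scored to obtain standardized coefficients, and a scikit-learn logistic regression with very weak regularization (C = 1e6) stands in for the statsmodels fit.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

names = ["CScore", "DOMV", "LOPICU", "PRISM"]  # labels only; data are synthetic
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)

# Logistic regression: absolute coefficients on z-scored predictors.
X_std = StandardScaler().fit_transform(X)
lr = LogisticRegression(C=1e6, max_iter=1000).fit(X_std, y)
lr_importance = np.abs(lr.coef_[0])

# Tree ensembles: impurity-based importances exposed as feature_importances_.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 random_state=0).fit(X, y)

importance = pd.DataFrame({
    "LR_abs_std_coef": lr_importance,
    "RF_MDI": rf.feature_importances_,    # normalized to sum to 1
    "GBM_MDI": gbm.feature_importances_,  # normalized to sum to 1
}, index=names)
print(importance.sort_values("RF_MDI", ascending=False).round(3))
```

The two scales are not directly comparable: MDI values are shares that sum to 1 within a model, whereas standardized coefficients are on the log-odds scale, so rankings are the more meaningful basis for comparison across models.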
