Patient population
A total of 55 patients were enrolled in this study, including 37 (67.3%) with non-progressive disease (non-PD) and 18 (32.7%) with PD. PD patients had shorter overall survival (OS) than non-PD patients (9.35 months vs. not reached, P < 0.001, Fig. 1a). The median age in the overall cohort was 53.0 years (range, 43.0–55.0), and most patients were male (90.9%). Hepatitis B was the HCC etiology in 94.5% of patients, and 67.3% of patients had α-fetoprotein levels ≥200 ng/mL. The mean albumin-bilirubin (ALBI) score was -2.60 and was higher in PD patients than in non-PD patients (-2.29 vs. -2.76, P < 0.001). Median platelet (PLT), aspartate transaminase (AST), alanine transaminase (ALT), and alkaline phosphatase (ALP) levels were 194.00 × 10⁹/L, 60.30 U/L, 44.50 U/L, and 134.10 U/L, respectively. At baseline, 61.8% of patients had multiple tumors and 38.2% had a single nodule. Tumor diameter was ≥5 cm in 81.8% of patients and <5 cm in 18.2%. Gross vascular invasion and extrahepatic spread were present at baseline in 80% and 34.5% of patients, respectively. Apart from ALBI, none of these characteristics differed significantly between PD and non-PD patients (P > 0.05; Table 1).

Radiomics feature selection process for predicting progressive disease (PD). (a) Overall survival (OS) of HCC patients treated with TKI-PD-1, stratified by PD and non-PD. (b) The 10 most meaningful features for PD identified using the LASSO method. (c) Chord diagram and heatmap showing the correlation matrix among the 10 features. In the chord diagram on the left, the wider the band, the stronger the correlation. In the heatmap on the right (generated with R software version 3.6.0), correlated feature pairs are displayed on a colored background: red indicates positive correlation and green indicates negative correlation. (d) Principal component analysis of features correlated with PD.
Selection and Analysis of Radiomics Features
A total of 2,458 features were obtained for all patients. Of these, 568 features with an intraclass correlation coefficient (ICC) < 0.90 were excluded, leaving 1,890 features for further analysis. Ten wavelet-related features that discriminated between non-PD and PD patients, as assessed by RECIST 1.1, were then identified through least absolute shrinkage and selection operator (LASSO) feature selection (Fig. 1b, Supplementary Table S1).
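The selection effect of LASSO can be illustrated with a minimal sketch. It uses a squared-error loss and plain coordinate descent for brevity; the source does not state which loss or software was used for the binary PD outcome, so the function names, the toy data, and the penalty value below are all illustrative assumptions rather than the study's procedure.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if it falls inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, target, penalty, n_iter=100):
    """Minimise (1/2n)*||y - Xb||^2 + penalty*||b||_1 by cyclic coordinate descent.

    columns: list of feature columns (each a list of n centred values).
    target:  list of n centred outcome values.
    """
    n = len(target)
    coefs = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Partial residual: outcome minus the contribution of all other features.
            partial = [
                target[i]
                - sum(coefs[k] * columns[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(col[i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in col) / n
            coefs[j] = soft_threshold(rho, penalty) / scale
    return coefs


# Hypothetical toy data: feature_a drives the outcome, feature_b barely matters.
feature_a = [1.0, -1.0, 1.0, -1.0]
feature_b = [1.0, 1.0, -1.0, -1.0]
outcome = [2.0 * a + 0.1 * b for a, b in zip(feature_a, feature_b)]

coefficients = lasso_coordinate_descent([feature_a, feature_b], outcome, penalty=0.5)
print(coefficients)  # the weak feature's coefficient is shrunk to exactly zero
```

Features whose coefficient is driven to exactly zero are dropped, which is how a penalty of this kind can reduce 1,890 candidates to a short list.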
After obtaining these 10 features, we further explored their relationships using Spearman correlation analysis. The chord diagram shows the links between features (Fig. 1c). The broad red bands indicate that features 1 and 2, features 3 and 5, and features 5 and 6 were significantly positively correlated with each other in the network. In contrast, features 4 and 6 were negatively correlated, as shown by the broad green band connecting them. The detailed pairwise relationships are shown in the heatmap in Fig. 1c; features 1 and 2 had the highest Spearman correlation coefficient (0.99). Principal component analysis (PCA) was then performed to compress the features and reduce dimensionality. As visualized in Fig. 1d, PD patients (lower left cluster) could be separated from non-PD patients (upper right cluster).
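For readers unfamiliar with the Spearman coefficient, it is the Pearson correlation computed on ranks. The following is a minimal standard-library sketch; the function names and the example values are illustrative assumptions, not the study's data or code.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical feature values for four patients.
rho_monotone = spearman([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 40.0, 80.0])
rho_reversed = spearman([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])
print(rho_monotone, rho_reversed)
```

Because only ranks matter, any monotone relationship yields a coefficient near 1 or -1 even when it is far from linear.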
The distributions of all radiomic feature values in PD and non-PD patients are shown in Supplementary Fig. S1. Features 1, 2, 4, 7, and 10 had higher expression in non-PD patients, and the difference in feature 1 was significant (P < 0.05). In contrast, PD patients showed relatively higher expression of features 3, 5, 6, 8, and 9, although these differences were not significant (P > 0.05). As shown in Supplementary Fig. S2, the area under the receiver operating characteristic curve (AUC) for the prediction of PD ranged from 0.572 to 0.664 across the 10 features. Among them, the AUC was 0.661 (95% CI: 0.500 to 0.822) for feature 4 and 0.664 (95% CI: 0.509 to 0.818) for feature 6, both reaching significance in the ROC analysis (P < 0.05).
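The AUC of a single feature can be read as the probability that a randomly chosen PD patient has a higher feature value than a randomly chosen non-PD patient, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the scores are illustrative assumptions, not values from the study.

```python
def pairwise_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation.
    """
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical feature values: two PD patients and two non-PD patients.
auc_example = pairwise_auc([0.6, 0.4], [0.5, 0.3])
print(auc_example)  # 0.75: three of the four pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking, which is why a confidence interval with a lower bound near 0.500 sits at the margin of significance.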
Radiomics-based machine learning algorithms predict responses
Ten algorithms distinguished PD from non-PD with varying efficacy. Overall, prediction results were better in the training cohort than in the test cohort (Figure 2). Support Vector Machines (SVM), Naive Bayes (NB), Recursive Partitioning and Regression Trees (Rpart), Conditional Inference Trees (Ctree), Random Forests (RF), k-Nearest Neighbors (KNN), Neural Nets, Boosting, Bagging, and Logistic Regression yielded mean F1 scores of 0.88, 0.80, 0.85, 0.80, 1.00, 0.89, 0.96, 1.00, 0.91, and 0.82 in the training cohort and 0.80, 0.69, 0.70, 0.80, 0.79, 0.76, 0.75, 0.77, 0.74, and 0.74 in the test cohort, respectively. Overall, the classification model with the SVM algorithm achieved the highest F1 score. In addition, the SVM algorithm predicted non-PD and PD with a mean accuracy, sensitivity, specificity, and precision of 81.8%, 100.0%, 43.7%, and 78.8% in the training cohort and 69.1%, 95.0%, 20.0%, and 70.6%, respectively, in the test cohort (Supplementary Table S2).
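The F1 score is the harmonic mean of precision and sensitivity, so the SVM figures can be cross-checked against each other. The sketch below assumes that, of the four SVM percentages reported per cohort, the second is sensitivity (100.0% and 95.0%) and the fourth is precision (78.8% and 70.6%); that reading is an assumption about the metric order, not something the text states unambiguously.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Assumed mapping of the reported SVM percentages to precision and sensitivity.
train_f1 = f1_score(precision=0.788, sensitivity=1.000)
test_f1 = f1_score(precision=0.706, sensitivity=0.950)

print(round(train_f1, 3))  # 0.881
print(round(test_f1, 3))   # 0.81
```

Under that assumption the results are close to the reported mean F1 scores of 0.88 and 0.80. A small gap is expected, because an average of per-run F1 scores need not equal the F1 score computed from averaged precision and sensitivity.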

Boxplots showing the F1 score, accuracy, sensitivity, specificity, and precision of the 10 machine learning models in the training and test cohorts.
AUC values for each model were generally higher in the training cohort than in the test cohort (Figure 3). Performance varied across models; the SVM model had the highest AUC, at 0.933 in the training cohort and 0.792 in the test cohort. The SVM, naive Bayes, and bagging algorithms all achieved AUC > 0.7 in the test cohort.

The overall accuracy of the 10 machine learning algorithms was assessed by the area under the receiver operating characteristic curves of the training and test cohorts.
Survival analysis
Overall, 19 patients died during follow-up. Median OS was not reached in the overall cohort. OS for subgroups based on the drugs used is shown in Supplementary Figure S3; there was no significant difference between these subgroups (P = 0.990). A log-rank test was used to compare survival between patients with high and low values of each radiomic feature. Four of the 10 features were significantly associated with survival: feature 4 (P = 0.039), feature 5 (P = 0.049), feature 8 (P = 0.018), and feature 9 (P = 0.041), as shown in Supplementary Fig. S4. A Cox regression analysis incorporating radiomic features and clinical variables was performed to identify survival-related factors (Fig. 4a). Seven factors identified in the univariate Cox regression model (features 4, 6, and 8, ALBI, tumor size, ALT, and portal vein invasion) were included in the multivariate Cox regression model. In the multivariate model, feature 4 (P = 0.002) and feature 6 (P = 0.001) were favorable survival factors, whereas feature 8 (P = 0.033), ALBI (P = 0.032), and portal vein invasion (P = 0.002) were risk factors; the model had a c-index of 0.81 (Fig. 4b). Figure 4c shows a greater net benefit from the model that combines clinical features and radiomics than from the clinical model alone.
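The c-index reported for the multivariate model measures how often the model ranks patients correctly by risk. A minimal sketch of Harrell's concordance index is given below; the function name and the follow-up data are hypothetical, and the source does not say which software computed the reported value.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index for right-censored survival data.

    A pair (i, j) is comparable when patient i has an observed event
    strictly before patient j's follow-up time. It is concordant when
    patient i also has the higher predicted risk; tied risks count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical follow-up (months), event flags (1 = death), and risk scores.
follow_up = [2, 4, 6, 8]
died = [1, 1, 0, 1]

perfect = concordance_index(follow_up, died, [0.9, 0.7, 0.5, 0.1])
partial = concordance_index(follow_up, died, [0.9, 0.1, 0.5, 0.7])
print(perfect, partial)  # 1.0 0.6
```

A value of 0.5 indicates ranking no better than chance and 1.0 indicates perfect ranking, so 0.81 means roughly four in five comparable pairs are ordered correctly.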

Landscape of radiomic and clinical features. (a) Patients with progressive disease (PD) had shorter survival times and higher albumin-bilirubin (ALBI) scores, and their radiomic features showed different expression levels from those of non-PD patients. (b) Radiomic features 4, 6, and 8, ALBI, and portal vein invasion were survival-related factors in univariate and multivariate Cox regression survival analyses. (c) Additional net benefit of radiomics in decision curve analysis. ***P < 0.001; **P < 0.01; *P < 0.05.
