Mechanism of public behavioral intention to use generative AI for folk story image co-creation


Socio-demographic characteristics

A total of 682 valid questionnaires were collected for this study. As shown in Fig. 3, the gender distribution was relatively balanced, with 377 males (55.3%) and 305 females (44.7%). The age structure was dominated by young and middle-aged adults between 18 and 40 (79.0%), of whom 25.9% were aged 18–25, 29.0% were aged 26–30, and 24.1% were aged 31–40. Regarding educational attainment, 85.0% of respondents held a bachelor’s degree or higher, indicating a generally high level of education. Regarding professional background, 57.3% had relevant expertise in art and design, fine arts, or computer science. The occupational structure was diverse: enterprise employees accounted for 24.9%, teachers 19.4%, designers 18.2%, freelancers 14.8%, students 13.2%, and other occupations 9.5%. Concerning Generative AI usage, 96.6% reported using it either frequently or occasionally, indicating that most respondents already had some usage experience. According to the 2025 Global AI Usage Report by KPMG73, 92% of the Chinese public sample accepted or recognized the use of artificial intelligence, and 93% reported using AI tools, indicating the widespread penetration of AI technology in Chinese society. The 96.6% rate of AI tool usage observed in our sample is therefore consistent with this national context, which supports the feasibility and real-world representativeness of our analysis. In addition, 96.8% of respondents reported being “familiar” or “very familiar” with traditional Chinese folk stories, reflecting a generally high level of cultural knowledge about folk stories within the sample.

Fig. 3: Socio-demographic characteristics.

The figure presents the socio-demographic profile of the respondents, including gender, age, education level, occupation, disciplinary background, experience with AI drawing tools, and familiarity with traditional Chinese folk stories.

Results of measurement model

The measurement model was evaluated primarily through reliability and validity testing65. For reliability assessment, Cronbach’s alpha and composite reliability (CR) were adopted as performance metrics, with recommended thresholds of ≥0.70. As shown in Table 3, the Cronbach’s α and CR values for all latent variables exceeded the recommended threshold, indicating good internal consistency of the measurement scales. For validity assessment, both convergent validity and discriminant validity were examined. Convergent validity was evaluated using the average variance extracted (AVE) and factor loadings, where the suggested thresholds were AVE ≥ 0.50 and factor loadings ≥0.70. According to the results in Table 3, all constructs met these standards, indicating good convergent validity of the measurement model.

Table 3 Results of measurement model
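As a minimal sketch of how the three statistics reported in Table 3 are conventionally computed, the following Python functions derive Cronbach’s α from item scores, and CR and AVE from standardized outer loadings. The item scores and loadings are illustrative placeholders rather than values from this study, and the function names are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per indicator, aligned by respondent."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]       # summed scale score per respondent
    item_var = sum(variance(col) for col in items)   # sum of the item variances
    return k / (k - 1) * (1 - item_var / variance(totals))

def composite_reliability(loadings):
    """CR from standardized loadings, taking error variance as 1 - loading^2."""
    squared_sum = sum(loadings) ** 2
    error = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error)

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Illustrative data only (not taken from Table 3).
loadings = [0.82, 0.79, 0.85]
items = [[4, 5, 3, 4, 2], [4, 4, 3, 5, 2], [5, 5, 3, 4, 1]]
print(round(cronbach_alpha(items), 3),
      round(composite_reliability(loadings), 3),
      round(average_variance_extracted(loadings), 3))
```

A construct would pass the screening described above when α and CR are at least 0.70 and AVE is at least 0.50.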

Regarding discriminant validity, two approaches were employed. First, according to the Fornell–Larcker criterion74, the square root of each construct’s AVE exceeded its correlations with other constructs (Table 4), indicating satisfactory discriminant validity. Second, based on the heterotrait–monotrait ratio (HTMT) criterion65, where the recommended threshold is <0.85, the results in Table 5 show that all constructs in this study had HTMT values below the threshold, providing further support for discriminant validity.

Table 4 Discriminant validity and the correlations of variables (Fornell–Larcker criterion)
Table 5 Results of heterotrait–monotrait ratio (HTMT) discriminant validity
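The two discriminant validity checks can be sketched as follows, assuming the usual definitions: the Fornell–Larcker comparison of √AVE against latent correlations, and HTMT as the mean between-construct item correlation divided by the geometric mean of the mean within-construct item correlations. All numbers and item names (PU1, BI1, and so on) are hypothetical, and each construct is assumed to have at least two indicators.

```python
from itertools import combinations
from math import sqrt

def fornell_larcker_ok(ave, latent_corr):
    """True if sqrt(AVE) of both constructs exceeds each pairwise correlation."""
    return all(
        sqrt(ave[a]) > abs(r) and sqrt(ave[b]) > abs(r)
        for (a, b), r in latent_corr.items()
    )

def htmt(item_corr, items_a, items_b):
    """item_corr maps frozenset({item_i, item_j}) to the correlation of the two items."""
    def mean(values):
        values = list(values)
        return sum(values) / len(values)
    hetero = mean(item_corr[frozenset((i, j))] for i in items_a for j in items_b)
    mono_a = mean(item_corr[frozenset(p)] for p in combinations(items_a, 2))
    mono_b = mean(item_corr[frozenset(p)] for p in combinations(items_b, 2))
    return hetero / sqrt(mono_a * mono_b)

# Hypothetical inputs for two constructs with two indicators each.
ave = {"PU": 0.64, "BI": 0.49}
latent_corr = {("PU", "BI"): 0.50}
item_corr = {
    frozenset(("PU1", "PU2")): 0.80,
    frozenset(("BI1", "BI2")): 0.80,
    frozenset(("PU1", "BI1")): 0.40,
    frozenset(("PU1", "BI2")): 0.40,
    frozenset(("PU2", "BI1")): 0.40,
    frozenset(("PU2", "BI2")): 0.40,
}
ratio = htmt(item_corr, ["PU1", "PU2"], ["BI1", "BI2"])
print(fornell_larcker_ok(ave, latent_corr), round(ratio, 3), ratio < 0.85)
```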

Structural model evaluation

The structural model was analyzed using the bootstrapping method with 5000 subsamples. As shown in Table 6, all hypotheses were supported.

Table 6 Results of structural model and path coefficients

PU had significant positive effects on BA (β = 0.226, t = 5.978, p < 0.001) and BI (β = 0.107, t = 2.959, p < 0.01), supporting H1 and H2.

PEOU had significant positive effects on BA (β = 0.188, t = 4.801, p < 0.001), BI (β = 0.145, t = 3.984, p < 0.001), PU (β = 0.314, t = 9.509, p < 0.001), and PBC (β = 0.242, t = 6.142, p < 0.001), supporting H3, H4, H5, and H6.

BA (β = 0.133, t = 3.592, p < 0.001), SN (β = 0.113, t = 2.980, p < 0.01), and PBC (β = 0.143, t = 3.942, p < 0.001) all had significant positive effects on BI, supporting H7, H8, and H9.

SE had significant positive effects on PEOU (β = 0.397, t = 11.919, p < 0.001), PBC (β = 0.302, t = 8.210, p < 0.001), and BI (β = 0.100, t = 2.541, p < 0.05), supporting H10, H11, and H12.

EAW had significant positive effects on BA (β = 0.286, t = 7.575, p < 0.001), BI (β = 0.112, t = 2.916, p < 0.01), PU (β = 0.284, t = 8.203, p < 0.001), and SN (β = 0.345, t = 9.731, p < 0.001), supporting H13, H14, H15, and H16.

AIIB had significant negative effects on EAW (β = −0.393, t = 12.990, p < 0.001), SN (β = −0.249, t = 6.816, p < 0.001), and BI (β = −0.200, t = 6.013, p < 0.001), supporting H17, H18, and H19.
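The reported significance levels can be related to the bootstrap t-values with a short calculation. The sketch below assumes a two-tailed test and a large-sample normal approximation, which may differ slightly from the reference distribution used by the PLS-SEM software. The function names are hypothetical, and the three t-values are taken from the PU→BA, PU→BI, and SE→BI paths above.

```python
from math import erfc, sqrt

def two_tailed_p(t):
    """Two-tailed p-value under a standard normal approximation."""
    return erfc(abs(t) / sqrt(2))

def significance_label(p):
    """Map a p-value to the conventional reporting thresholds."""
    if p < 0.001:
        return "p < 0.001"
    if p < 0.01:
        return "p < 0.01"
    if p < 0.05:
        return "p < 0.05"
    return "n.s."

for path, t in [("PU -> BA", 5.978), ("PU -> BI", 2.959), ("SE -> BI", 2.541)]:
    print(path, significance_label(two_tailed_p(t)))
```

Under this approximation, the conventional cut-offs are |t| > 1.96 for p < 0.05, |t| > 2.576 for p < 0.01, and |t| > 3.291 for p < 0.001.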

The R² value is used to evaluate the model’s predictive accuracy for the dependent variable. An R² > 0.67 indicates high predictive power, an R² between 0.33 and 0.67 indicates moderate predictive power, and an R² between 0.19 and 0.33 indicates weak predictive power75. As shown in Fig. 4, the R² value for BI is 0.421, indicating that the research model’s explanatory power for the public’s behavioral intention to use Generative AI for folk story image co-creation is moderate.

Fig. 4: Results of the structural model.

The figure presents the results of the structural model analysis, including the standardized path coefficients among the latent variables, the factor loadings of the observed indicators, and the R² values of the endogenous constructs.
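The R² thresholds cited above can be written as a small lookup. This is only a sketch: the treatment of values that fall exactly on a boundary is an assumption, and the function name is hypothetical.

```python
def r2_level(r2):
    """Classify explanatory power using the 0.19 / 0.33 / 0.67 thresholds."""
    if r2 > 0.67:
        return "high"
    if r2 >= 0.33:   # boundary handling is an assumption
        return "moderate"
    if r2 >= 0.19:
        return "weak"
    return "below the weak threshold"

print(r2_level(0.421))  # R² reported for BI
```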

ML modeling results

After hypothesis testing with PLS-SEM, the latent variable scores were extracted, and both the latent variable dataset and the original observed dataset were preprocessed. For socio-demographic characteristics, feature encoding was performed according to variable type: ordinal categorical variables were encoded using label encoding, and nominal categorical variables were encoded using one-hot encoding. The BI variable was binarized based on the average score of its original observed values: samples with an average score between 1 and 3 were labeled as 0, indicating weak behavioral intention, while those with an average score between 4 and 5 were labeled as 1, indicating strong behavioral intention. Subsequently, all numerical variables were standardized using StandardScaler to eliminate dimensional differences and improve model stability76. To further evaluate the influence of demographic characteristics on predicting BI, this study applied the Pearson correlation method to the latent variable dataset for macro-level feature selection77. As shown in Fig. 5, demographic variables ranked considerably lower in importance than the core constructs of the theoretical model, indicating their limited impact on BI. This indirectly supports the strategic direction of promoting cultural sustainability through participatory public engagement. Therefore, the two datasets, with demographic characteristics excluded, were used as the final input features, thereby improving the simplicity and generalization capability of the models.

Fig. 5: Feature importance ranking.

The figure presents the relative importance of the input features in the predictive model.
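The binarization and standardization steps described in this section can be sketched in plain Python. The text specifies labels only for average scores of 1 to 3 and 4 to 5, so the sketch returns None for averages strictly between 3 and 4 rather than guessing how they were handled. The standardization is the z-score with the population standard deviation, which is what scikit-learn’s StandardScaler computes. The function names and sample values are hypothetical.

```python
from statistics import mean, pstdev

def binarize_bi(item_scores):
    """Label behavioral intention from the mean of its observed items."""
    avg = mean(item_scores)
    if avg <= 3:
        return 0      # weak behavioral intention
    if avg >= 4:
        return 1      # strong behavioral intention
    return None       # between 3 and 4: handling not specified in the text

def standardize(column):
    """Z-score a numeric column using the population standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]

# Illustrative values only.
print(binarize_bi([2, 3, 3]), binarize_bi([4, 5, 5]), binarize_bi([3, 4, 4]))
print([round(z, 3) for z in standardize([1.0, 2.0, 3.0, 4.0])])
```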

ML classification model training

To verify the predictive performance of the proposed model, the two processed datasets were split into a training set (75%) and a testing set (25%), and classification modeling of BI was conducted using five mainstream algorithms: LR, RF, XGB, SVM, and ANN. As shown in Table 7, regardless of whether the training was based on the original observed or latent variable data, all models achieved stable performance above 0.70 in key performance metrics, including accuracy, precision, recall, and F1-score, indicating good classification capability and robustness.

Table 7 ML classification model training results based on latent variables and observed variables
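For reference, the 75/25 split and the four threshold-based metrics in Table 7 can be sketched as below. The random seed, the absence of stratification, and the function names are assumptions, and the labels are toy data rather than study results.

```python
import random

def train_test_split_indices(n, test_ratio=0.25, seed=42):
    """Shuffle sample indices and hold out a share for testing."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_ratio)
    return idx[n_test:], idx[:n_test]

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = high BI)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

train_idx, test_idx = train_test_split_indices(682)
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
print(len(train_idx), len(test_idx), metrics)
```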

Moreover, as illustrated by the ROC curves in Fig. 6, except for ANN, the AUC values of the other models exceeded 0.80, further demonstrating strong overall performance and generalization capability78. These results indicate that the theoretical pathways—from specific measurement indicators to higher-order psychological constructs—are well supported at the data level. Notably, models trained on latent variable scores generally exhibited higher accuracy and stronger generalization, suggesting that the constructs in the research model effectively explain BI in the context of public engagement in folk story image co-creation using Generative AI, thereby validating the cross-level explanatory power of the theoretical model in the present research context.

Fig. 6: ROC curves and AUC comparison of the classification models.

a ROC curves and AUC comparison of the models based on observed variables. b ROC curves and AUC comparison of the models based on latent variables.
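The AUC summarized in these ROC plots equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. A minimal sketch of that pairwise definition, using toy scores and a hypothetical function name:

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

This pairwise form is quadratic in the sample size, which is acceptable for a test set of this scale; library implementations typically sort the scores instead.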

Hyperparameter optimization using the Optuna framework

Although models trained on latent variables outperformed those trained on observed variables across all performance metrics, their further improvement largely depends on appropriate hyperparameter configurations. To efficiently search for the optimal hyperparameter combination, this study employed the Optuna framework for hyperparameter tuning of models trained with latent variables. Optuna is a new-generation hyperparameter optimization framework whose core advantages lie in its dynamic construction of the parameter search space and its efficient pruning mechanism, both of which significantly improve search efficiency and convergence performance70. The optimization objective function, parameter types, and value ranges were first defined in the tuning process. Then, in each iteration, Optuna evaluated the performance of candidate parameter combinations and terminated trials with potentially poor performance early, allowing the search to focus on more promising regions of the parameter space. This process continued until the maximum number of iterations was reached, ultimately identifying the optimal hyperparameter combination. This study adopted fivefold cross-validation and used the mean AUC as the performance metric during tuning, with 500 trials. To ensure both search efficiency and adequate coverage of the parameter space, preliminary trials and performance observations were conducted to adjust and refine the value ranges of each hyperparameter iteratively, leading to the determination of a reasonable search space. The detailed procedure and results are provided in the Supplementary Information. Compared with other models, the Random Forest model optimized through Optuna hyperparameter tuning (Optuna-RF) achieved the best performance across all evaluation metrics. The specific parameter settings and performance results are presented in Tables 8 and 9.

Table 8 Definition, search space, and optimal values of RF hyperparameters
Table 9 Comparison of classification performance metrics between the baseline and Optuna-RF models
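The tuning loop described in this section (define an objective and a search space, score each candidate fold by fold, stop unpromising trials early, and keep the best result) can be illustrated without the Optuna library. The sketch below swaps in plain random sampling with a median-style stopping rule, and a toy scoring function in place of the cross-validated AUC of a Random Forest. The sampler and pruner actually used in the study are not stated here, and the search ranges shown are hypothetical rather than those of Table 8.

```python
import random
from statistics import mean, median

# Hypothetical search space; the study's actual ranges are given in its Table 8.
SEARCH_SPACE = {"n_estimators": (100, 500), "max_depth": (3, 15)}

def fold_score(params, rng):
    """Toy stand-in for the AUC of one cross-validation fold."""
    penalty = abs(params["max_depth"] - 8) / 20 + abs(params["n_estimators"] - 300) / 2000
    return 0.85 - penalty + rng.uniform(-0.01, 0.01)

def run_search(n_trials=50, n_folds=5, warmup=5, seed=0):
    rng = random.Random(seed)
    completed = []  # running mean score after each fold, for every finished trial
    best_params, best_value, pruned = None, float("-inf"), 0
    for _ in range(n_trials):
        params = {k: rng.randint(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}
        scores, running, stop = [], [], False
        for fold in range(n_folds):
            scores.append(fold_score(params, rng))
            running.append(mean(scores))
            # Stop early if this trial lags the median of finished trials at the same fold.
            earlier = [r[fold] for r in completed]
            if len(earlier) >= warmup and running[-1] < median(earlier):
                stop = True
                break
        if stop:
            pruned += 1
            continue
        completed.append(running)
        if running[-1] > best_value:
            best_params, best_value = params, running[-1]
    return best_params, best_value, pruned

best_params, best_value, pruned = run_search()
print(best_params, round(best_value, 3), pruned)
```

The study ran 500 trials with fivefold cross-validation; the smaller default here only keeps the illustration fast.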

Figure 7 illustrates the confusion matrix and ROC curve of the final model on the test set. The AUC of Optuna-RF reached 0.852, an improvement of 0.007 over the non-optimized model. The overall prediction accuracy was 80.7%, a 2.34% increase compared with the baseline. These results demonstrate that the Optuna-RF model can effectively distinguish between individuals with high versus low BI toward using Generative AI for folk story image co-creation, and they also confirm that the influence paths of the constructs in the theoretical model on BI are identifiable and interpretable at the data level. The stability and generalization capability exhibited by the model provide strong support for the subsequent interpretability analysis using SHAP.

Fig. 7: Classification performance of the Optuna-RF model.

a Confusion matrix of the Optuna-RF model. b ROC curve and AUC value of the Optuna-RF model.

SHAP interpretability analysis results

This study employed the SHAP method to enhance the interpretability of the model’s predictions. SHAP is grounded in the Shapley value theory from game theory and can precisely decompose and quantify the marginal contribution of each input feature to the model output without introducing approximation errors72. Compared with traditional feature importance evaluation methods, SHAP offers a more objective allocation of feature contributions and enables a quantitative assessment of each feature’s specific impact on classification decisions. Therefore, analyzing the importance of features based on SHAP values is reliable and theoretically well-founded. In this study, SHAP was applied to quantitatively evaluate the extent to which each latent variable influences the model’s classification of individuals with high BI, with the results visualized in Fig. 8. The figure integrates a feature importance ranking with a beeswarm plot, showing the overall contribution and the direction (positive or negative) of each feature’s effect on the model output. Each row corresponds to an input variable, and each point represents the SHAP value of a sample from the test set for that variable. The x-axis denotes the SHAP value, indicating the direction and strength of the feature’s impact on the output. Points shaded in red indicate higher feature values, whereas those in blue indicate lower values. With SHAP value = 0 as the reference, points on the right side indicate a positive contribution to predicting high BI, while those on the left indicate a negative contribution.

Fig. 8: SHAP feature importance and beeswarm distribution.

The figure presents the SHAP importance of each feature and the distribution of SHAP values, with color indicating the feature value from low to high.
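The Shapley value underlying SHAP averages a feature’s marginal contribution over all coalitions of the other features. The sketch below enumerates those coalitions exactly for a three-feature toy model. The coefficients are hypothetical, the feature labels are borrowed from this study only for readability, and practical SHAP implementations for tree ensembles use specialized algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value_fn(coalition | {f}) - value_fn(coalition))
        phi[f] = total
    return phi

def toy_model(present):
    """Hypothetical model output when only the features in `present` are known."""
    score = 0.0
    if "EAW" in present:
        score += 0.3
    if "AIIB" in present:
        score -= 0.2
    if "EAW" in present and "BA" in present:
        score += 0.1   # interaction term, shared equally between EAW and BA
    return score

phi = shapley_values(toy_model, ["EAW", "AIIB", "BA"])
print({name: round(value, 3) for name, value in phi.items()})
```

The values sum to the difference between the output with all features and the output with none, which is the additivity property that makes SHAP attributions comparable across features.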

The results show that AIIB (SHAP value: 0.059, contribution ratio: 18.2%), EAW (SHAP value: 0.055, contribution ratio: 16.8%), and BA (SHAP value: 0.047, contribution ratio: 14.5%) were the three latent variables with the strongest influence on the model output. These were followed by PBC (13.4%), SE (12.4%), and PEOU (11.1%). PU and SN had relatively lower impacts, with contribution ratios of 8.0% and 5.7%, respectively. Moreover, the plot’s red–blue gradient distribution revealed the relationship between the original feature values and the model predictions: the further the feature values deviate toward the extremes, the larger the shift in SHAP values. AIIB contributed negatively to the prediction of high BI, whereas all other features contributed positively.
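One plausible reading of the contribution ratio is each feature’s mean absolute SHAP value divided by the sum of those means across all features. The three reported pairs imply nearly the same total (0.059/0.182, 0.055/0.168, and 0.047/0.145 are all roughly 0.32 to 0.33), which is consistent with this reading, although the exact formula is an assumption. A sketch with toy SHAP values and a hypothetical function name:

```python
def contribution_ratios(shap_rows, names):
    """Share of each feature in the total mean absolute SHAP value."""
    n = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[i]) for row in shap_rows) / n
        for i, name in enumerate(names)
    }
    total = sum(mean_abs.values())
    return {name: mean_abs[name] / total for name in names}

# Toy SHAP matrix: rows are samples, columns are features A and B.
ratios = contribution_ratios([[0.2, -0.1], [-0.4, 0.1]], ["A", "B"])
print({name: round(r, 3) for name, r in ratios.items()})
```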

Building on this, to further uncover the marginal contribution trends and potential interaction effects of each feature—and to enhance the explanatory power of the structural model from the perspective of predictive mechanisms—this study selected the features involved in significant paths of the structural equation model and conducted SHAP dependence analysis. An interaction plot between EAW and AIIB was also generated. As shown in Fig. 9, each point represents a test sample, the x-axis indicates the standardized value of the primary feature, the y-axis represents the corresponding SHAP value, and the color denotes the value level of the interacting feature. A higher SHAP value indicates a more substantial positive contribution of that feature’s value to the model output for the given sample; a redder color indicates a higher value of the interacting feature at the current level of the primary feature, suggesting a potential conditional amplification effect. All variables were standardized, so their values can be interpreted as standard deviations from the mean. Accordingly, the feature levels were divided into five intervals corresponding to the observed variables: −3.0 to −2.0 (very low), −2.0 to −1.0 (low), −1.0 to 0.0 (medium), 0.0 to 1.0 (high), and 1.0 to 2.0 (very high), to facilitate subsequent interpretation and analysis.

Fig. 9: SHAP interaction analysis of pairwise relationships between key variables.

The figure presents the pairwise interaction effects between key variables based on SHAP values, showing how combinations of variables jointly influence the model output.
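The five interpretation intervals for standardized feature values can be expressed as a simple lookup. Which interval receives a value lying exactly on a boundary is an assumption (upper bounds are treated as inclusive here), and values outside −3.0 to 2.0 return None because the text defines no level for them.

```python
LEVELS = [(-2.0, "very low"), (-1.0, "low"), (0.0, "medium"),
          (1.0, "high"), (2.0, "very high")]

def feature_level(z):
    """Map a standardized value to one of the five levels used for interpretation."""
    if z < -3.0 or z > 2.0:
        return None
    for upper, label in LEVELS:
        if z <= upper:   # upper bound inclusive (assumption)
            return label

print([feature_level(z) for z in (-2.5, -1.5, -0.5, 0.5, 1.5)])
```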

EAW and interaction effect with AIIB (subfigure 1)

Consistent with H14, EAW exerts a nonlinear positive effect on BI: when EAW is at low or below-average levels, its impact remains modest; after reaching moderate levels, the positive effect increases substantially, and at near-maximum levels, EAW produces a pronounced boost to BI. AIIB at moderate or higher levels is concentrated in the negative SHAP value range and overlaps with the lower-to-moderate range of EAW, thereby suppressing the positive marginal contribution of EAW. In conjunction with H17, this further confirms that AIIB not only directly exerts a negative impact on EAW but also attenuates its positive influence on BI through its moderating effect.

AIIB and interaction effect with EAW (subfigure 2)

Consistent with H19, AIIB exhibits a linear adverse effect on BI, with the negative impact intensifying as AIIB increases; at high or above-average levels, AIIB exerts a substantial suppressive effect on BI. Interaction analysis reveals that EAW at moderate or higher levels is concentrated in the positive SHAP value range and the low-to-moderate AIIB range, thereby mitigating the negative marginal contribution of AIIB. Although this study did not hypothesize a positive effect of EAW on AIIB, this finding indirectly supports the possibility of such a causal relationship.

SN and interaction effects with AIIB and EAW (subfigures 3 and 4)

Consistent with H8, SN exerts a nonlinear positive effect on BI. At low-to-moderate levels of SN, the effect remains flat. In contrast, at higher levels, the positive impact increases substantially and significantly promotes BI before plateauing at extremely high levels. Interaction analysis indicates that moderate or higher levels of AIIB are concentrated in the negative SHAP value range and within the low-to-moderate SN range, thereby suppressing the positive marginal contribution of SN. In line with H18, this confirms that AIIB not only directly and negatively influences SN but also attenuates its positive effect on BI through a moderating mechanism. Conversely, moderate or higher levels of EAW are concentrated in the positive SHAP value range and within the moderate-to-high SN range, enhancing the positive marginal contribution of SN. Consistent with H16, this further demonstrates that EAW directly and positively influences SN and strengthens its positive effect on BI through a moderating role.

BA and interaction effects with PEOU, PU, and EAW (subfigures 5, 6, and 7)

Consistent with H7, BA exerts a nonlinear positive effect on BI. At low-to-moderate levels of BA, the effect remains relatively flat; at higher levels, the positive impact increases sharply, exerting a substantial promoting effect on BI, before plateauing at extremely high levels. Interaction analysis shows that high or very high levels of PEOU, PU, and EAW are concentrated in the positive SHAP value range and within the moderate-to-high BA range, thereby enhancing the positive marginal contribution of BA. In line with H3, H1, and H13, this further confirms that PEOU, PU, and EAW directly and positively influence BA and strengthen its positive effect on BI through moderating mechanisms.

PEOU and interaction effect with SE (subfigure 8)

Consistent with H4, PEOU exerts a nonlinear positive effect on BI. At low or very low levels of PEOU, the effect remains flat; at moderate and higher levels, the positive effect increases markedly, and at near-maximum levels, it exerts a substantial promoting effect on BI. Interaction analysis shows that high or very high levels of SE are concentrated in the positive SHAP value range and within the moderate-to-high PEOU range, thereby enhancing the positive marginal contribution of PEOU. In line with H10, this further confirms that SE directly and positively influences PEOU and strengthens its positive effect on BI through a moderating mechanism.

PU and interaction effects with EAW and PEOU (subfigures 9 and 10)

Consistent with H2, PU exhibits a nonlinear positive effect on BI. At low-to-moderate levels, the effect steadily increases; after reaching moderate levels, it plateaus; at high levels, the positive effect increases again and promotes BI, albeit to a relatively limited extent, before stabilizing at extremely high levels. Interaction analysis shows that high or very high levels of EAW and PEOU are concentrated in the positive SHAP value range and within the moderate-to-high PU range, enhancing the positive marginal contribution of PU. In line with H15 and H5, this further confirms that EAW and PEOU directly and positively influence PU and strengthen its positive effect on BI through moderating mechanisms.

PBC and interaction effect with SE (subfigure 11)

Consistent with H9, PBC exerts a nonlinear positive effect on BI. At low-to-moderate levels, the effect remains flat; at higher levels, the positive effect increases markedly and substantially promotes BI, before plateauing at extremely high levels. Interaction analysis shows that high or very high levels of SE are concentrated in the positive SHAP value range and within the higher PBC range, thereby enhancing the positive marginal contribution of PBC. In line with H11, this further confirms that SE directly and positively influences PBC and strengthens its positive effect on BI through a moderating mechanism.

SE and interaction effect with PEOU (subfigure 12)

Consistent with H12, SE exhibits a nonlinear positive effect on BI. At low-to-moderate levels, the effect is limited; at higher levels, the positive effect increases sharply and substantially promotes BI. Interaction analysis shows that moderate or higher levels of PEOU are relatively evenly distributed across both positive and negative SHAP value ranges and across different SE levels, indicating that PEOU alone cannot determine the direction of SE’s marginal contribution to BI. This suggests that the interaction between the two is heterogeneous and complex.


