Prediction of coating degradation based on “Environmental Factors–Physical Property–Corrosion Failure” two-stage machine learning


Development of the “Environmental Factors-Physical Properties” model

The service behavior of a polyurethane (PU) varnish coating was evaluated over a 1-year period at nine atmospheric corrosion exposure sites with diverse environmental conditions. As shown in Table 1, the nine sites are located in different geographic regions along the Belt and Road Initiative (BRI): Singapore, Guangxi, Nepal, Chennai, Kalimantan, Jeddah, Cilacap, Cairo, and Islamabad, each exhibiting distinct environmental characteristics. Singapore, located at the southern tip of the Malay Peninsula, maintains consistently warm and humid conditions. Guangxi, located in southern China, features a subtropical climate with numerous rivers. Nepal, nestled in the Himalayas, features a range of climates from tropical to alpine. Chennai, a coastal city in southern India, experiences a tropical wet climate. Kalimantan, the Indonesian part of the island of Borneo, represents a tropical rainforest climate rich in biodiversity. Jeddah, a key city along Saudi Arabia’s Red Sea coast, is characterized by a hot desert climate. Cairo, situated on the Nile River in Egypt, has a dry desert climate with minimal rainfall. Islamabad, in Pakistan, experiences a temperate climate.

Table 2 presents the physical property data of the coatings from the nine locations, including adhesion, glossiness, water contact angle (WCA), and yellowness. Among the tested sites, the coating exposed in Jeddah exhibited the highest adhesion strength of 6.97 MPa, while the one in Cilacap recorded the lowest adhesion value of 4.03 MPa. This discrepancy may be attributed to the contrasting humidity levels between the two regions. In areas with higher humidity, such as Cilacap, moisture infiltration can readily cause the coating to absorb water and expand, generating increased internal stress. This internal stress weakens the bond between the coating and the substrate, ultimately reducing adhesion strength [25,26].

In terms of glossiness, the coatings in Nepal exhibited the highest value of 22.14 GU, whereas those in Cilacap displayed the lowest glossiness of 8.00 GU. The climate in the Cilacap region is hot, and the coating is exposed to intense direct sunlight for long periods. High-energy ultraviolet rays continuously impact the coating surface [27,28], accelerating the loss of gloss, whereas Nepal’s relatively mild climate helps preserve gloss. Regarding the WCA, the coating in Cilacap recorded the lowest value at 73.20°, indicating that the coating surface is more susceptible to wetting. The exposure environment in Cilacap can damage the coating’s micro-roughness, thus reducing its hydrophobicity. The yellowness analysis revealed that the coatings in Singapore had the highest yellowness value of 0.2, while those in Jeddah had the lowest at −1.2. This suggests that the hot and humid climate of Singapore may accelerate the aging and yellowing of coatings during service, in contrast to the dry, low-precipitation conditions in Jeddah. In summary, environmental factors, including humidity, temperature, UV irradiation, and precipitation, exert distinct and significant effects on the physical properties of coatings.

Table 1 Distribution of atmospheric stations
Table 2 Physical property data of coatings at nine sites

As illustrated in the environmental data (Fig. 1), substantial variations in climatic conditions occur across different regions. For example, desert areas such as Jeddah exhibit substantial daily temperature variations and low precipitation, whereas tropical regions like Singapore and Chennai experience high temperatures and humidity. Certain regions may face extreme environmental conditions, including elevated temperatures, low atmospheric pressure, or severe dryness (e.g., Cairo and Jeddah). These extreme conditions present considerable challenges to the durability and reliability of coating materials.

Fig. 1: Distribution of various environmental factors across nine representative sites.

a Average Temperature; b Maximum Temperature; c Minimum Temperature; d Atmospheric Pressure; e Dew Point Temperature; f Wind Speed; g Cloud Cover; h Mean Daily Precipitation; i Annual Mean Sunshine Duration; j Average Humidity; k Yearly Average Surface Horizontal Irradiation; l Yearly Total Surface Horizontal Irradiation.

To identify the key environmental factors, the correlations among the variables must first be considered, because highly correlated variables carry redundant information that can diminish modeling accuracy and efficiency. Pearson correlation coefficients were calculated among the 12 environmental factors: average temperature (Ave. Temp.), humidity, daily precipitation (Daily Precip.), minimum temperature (Min Temp.), dew point, atmospheric pressure (Atm. Press.), maximum temperature (Max Temp.), sunshine duration (Sun Hours), wind speed, cloud cover, and two solar irradiance indices, the yearly total and yearly average surface horizontal irradiation (YTSHI and YASHI), as depicted in Fig. 2. To filter the key environmental factors, variables that were highly correlated with another variable but of low significance were removed. For instance, the correlation coefficient between YTSHI and YASHI is 1, indicating that the two parameters follow identical data trends and carry the same information on solar irradiation. Consequently, YTSHI was treated as a redundant parameter and excluded from the selection of key environmental factors.
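The redundancy filter described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function name `drop_redundant`, the 0.95 threshold, the importance scores, and all data values are assumptions made for the example.

```python
import numpy as np


def drop_redundant(X, names, importance, threshold=0.95):
    """Drop one feature from every pair whose |Pearson r| exceeds threshold.

    Within a correlated pair, the feature with the lower importance score
    is removed. The threshold is an assumed value, not one from the study.
    """
    corr = np.corrcoef(X, rowvar=False)  # shape (n_features, n_features)
    keep = list(range(len(names)))
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if i in keep and j in keep and abs(corr[i, j]) > threshold:
                keep.remove(i if importance[i] < importance[j] else j)
    return [names[k] for k in keep], corr


# Synthetic stand-in data for nine sites (all values hypothetical).
rng = np.random.default_rng(0)
yashi = rng.uniform(150.0, 260.0, 9)
ytshi = yashi * 8.76              # exactly proportional, so r = 1
humidity = rng.uniform(30.0, 90.0, 9)

X = np.column_stack([yashi, ytshi, humidity])
names = ["YASHI", "YTSHI", "Humidity"]
importance = [0.5, 0.4, 0.3]      # placeholder importance scores

kept, corr = drop_redundant(X, names, importance)
print(kept)
```

Because the two irradiation columns are exactly proportional here, their coefficient is 1 and the lower-ranked one (YTSHI) is dropped, mirroring the example in the text.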

Fig. 2: Heat map of Pearson correlation coefficients for 12 environmental factors.

Next, the environmental factors were ranked by their significance, and the most influential factors were selected for subsequent modeling. Following the method outlined in section “Identification of key environmental factors”, the importance of the 12 environmental variables to each coating physical property was quantified. As illustrated in Fig. 3, the four most significant factors for each property were identified as the initial candidate key environmental factors. For adhesion, the key environmental factors were average temperature, relative humidity, daily precipitation, and total solar irradiation intensity, with average temperature showing the highest importance. This highlights the pivotal role of temperature in coating corrosion failure. Higher temperatures are more likely to trigger oxidation of the coating [29,30], and temperature changes can also cause the coating to shrink and expand. For glossiness, the primary environmental factors were relative humidity, dew point temperature, maximum temperature, and average hourly surface horizontal irradiation, with relative humidity exhibiting the greatest influence. High-humidity environments promote moisture uptake by the coating, disrupting surface smoothness and reducing gloss. Moisture absorption may also cause slight swelling or blistering on the coating surface, altering the uniformity of its light reflection. Furthermore, elevated humidity can facilitate the hydrolysis or degradation of coating components, thereby intensifying gloss loss. The main environmental factors affecting WCA were average wind speed, maximum temperature, cloud cover, and daily precipitation. Wind accelerates the evaporation of droplets on the surface, which reduces the degradation of the coating. For yellowness, the primary influencing factors were atmospheric pressure, relative humidity, average hourly surface horizontal irradiation, and daily precipitation.
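As a rough illustration of the ranking step, the sketch below fits a random forest and keeps the four features with the highest impurity-based importance. It assumes scikit-learn's `RandomForestRegressor`; the helper name `top_k_features`, the feature names, and the synthetic data (60 rows rather than the nine field sites, for a stable demonstration) are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def top_k_features(X, y, names, k=4, seed=0):
    """Rank features by random-forest importance and return the top k.

    Also returns the cumulative importance of the selected features
    (scikit-learn normalizes importances so that they sum to 1).
    """
    rf = RandomForestRegressor(n_estimators=200, random_state=seed)
    rf.fit(X, y)
    importance = rf.feature_importances_
    top = np.argsort(importance)[::-1][:k]
    return [names[i] for i in top], float(importance[top].sum())


# Synthetic data: 12 hypothetical environmental factors, one target property.
rng = np.random.default_rng(0)
names = [f"factor_{i}" for i in range(12)]
X = rng.normal(size=(60, 12))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + 0.1 * rng.normal(size=60)

selected, cumulative = top_k_features(X, y, names)
print(selected, round(cumulative, 2))
```

The cumulative importance returned here is the quantity one would compare against a cut-off such as the 80% contribution mentioned for the top four factors.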

Fig. 3: Importance analysis of environmental factors for different physical properties.

a adhesion; b glossiness; c contact angle; d yellowness.

After identifying the key environmental factors, predictive models were developed for the four physical properties of the coatings: adhesion, glossiness, contact angle, and yellowness. The data from the nine sites involved in the coating corrosion failure experiments were used as labeled samples, while data from 100 additional sites served as unlabeled samples. Analysis of the contributions of environmental factors to coating corrosion failure revealed that the cumulative contribution of the top four key factors exceeded 80% [31]. Consequently, these four factors were selected as model inputs. Semi-supervised learning combines supervised and unsupervised learning by training a model on both labeled and unlabeled samples, thereby reducing the dependence on labeled data [16]. Using a co-training regression algorithm combined with the leave-one-out method, nine iterations were performed, one for each held-out labeled site, to construct the final model. As described in section “Cooperative training regression algorithm”, four algorithms were compared: 12-RF (a random forest model trained on the original 12-dimensional features), 4-RF (a random forest model trained on the reduced 4-dimensional features), 12-CORF (a COREG co-training model based on the 12 features), and 4-CORF (a COREG co-training model based on the reduced 4-dimensional features). As shown in Fig. 4, the 12-RF model exhibited a higher root mean square error (RMSE) than the 12-CORF model, and the RMSE of the 4-RF model was likewise higher than that of the 4-CORF model. The 4-CORF model achieved the smallest prediction error among all tested models. Figure 5 further compares the predictions of these models for the four physical properties under different environmental conditions. Gray bars represent true values, while colored dots and red star markers indicate predicted values. The predictions of the 4-CORF model were significantly closer to the true values across all four physical properties, particularly for glossiness and WCA.
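The general idea of co-training regression with leave-one-out evaluation can be sketched as follows. This is a simplified variant written for illustration and is not the COREG procedure used in the study: confidence is approximated here by the spread of per-tree predictions, whereas COREG selects samples by their effect on the error of neighboring labeled points. The function names, hyperparameters, and synthetic data are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def cotrain_rf(X_lab, y_lab, X_unlab, rounds=5, seed=0):
    """Simplified co-training regression with two random forests.

    Each round, every forest pseudo-labels the unlabeled sample it is most
    confident about (smallest spread across its trees) and hands that
    sample to the other forest's training set.
    """
    train = [[X_lab.copy(), y_lab.copy()], [X_lab.copy(), y_lab.copy()]]
    # The two learners differ in max_features and random seed.
    models = [
        RandomForestRegressor(n_estimators=30, max_features=mf, random_state=seed + k)
        for k, mf in enumerate((1.0, 0.5))
    ]
    pool = X_unlab.copy()
    for _ in range(rounds):
        for model, (Xs, ys) in zip(models, train):
            model.fit(Xs, ys)
        for k, model in enumerate(models):
            if len(pool) == 0:
                break
            per_tree = np.stack([t.predict(pool) for t in model.estimators_])
            best = int(np.argmin(per_tree.std(axis=0)))
            other = 1 - k
            train[other][0] = np.vstack([train[other][0], pool[best:best + 1]])
            train[other][1] = np.append(train[other][1], per_tree[:, best].mean())
            pool = np.delete(pool, best, axis=0)
    for model, (Xs, ys) in zip(models, train):
        model.fit(Xs, ys)  # final refit on labeled plus pseudo-labeled data
    return lambda X: (models[0].predict(X) + models[1].predict(X)) / 2.0


def loo_rmse(X_lab, y_lab, X_unlab):
    """Leave-one-out RMSE over the labeled samples."""
    errors = []
    for i in range(len(y_lab)):
        mask = np.arange(len(y_lab)) != i
        predict = cotrain_rf(X_lab[mask], y_lab[mask], X_unlab)
        errors.append(predict(X_lab[i:i + 1])[0] - y_lab[i])
    return float(np.sqrt(np.mean(np.square(errors))))


# Synthetic stand-in: 9 labeled sites, 100 unlabeled sites, 4 key factors.
rng = np.random.default_rng(0)
w = np.array([1.5, -1.0, 0.5, 0.2])
X_lab = rng.normal(size=(9, 4))
y_lab = X_lab @ w + 0.1 * rng.normal(size=9)
X_unlab = rng.normal(size=(100, 4))

rmse = loo_rmse(X_lab, y_lab, X_unlab)
print(f"leave-one-out RMSE: {rmse:.3f}")
```

Running the same `loo_rmse` loop with a plain random forest in place of `cotrain_rf` would give the kind of RF versus CORF comparison reported in Fig. 4.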

Fig. 4: RMSE of the four models for predicting physical properties.

Fig. 5: Predicted and actual values obtained from exposure experiments at nine locations for different physical properties.

a Adhesion; b Glossiness; c Contact angle; d Yellowness.

The results demonstrate that the co-training regression algorithm effectively enlarges the model’s sample space by incorporating unlabeled data, thereby reducing prediction errors. The higher error of the 12-CORF model compared with the 4-CORF model shows that combining RF importance analysis with Pearson correlation analysis is effective in extracting essential input information while eliminating redundancy. This improvement reflects the superior adaptability and generalization capability of the 4-CORF model in predicting the physical properties of coatings. The reduced accuracy of models with excessive parameters is attributed to interactions among similar input variables, which may amplify the effects of certain factors on physical properties. For instance, combinations of high temperature and low humidity under specific conditions may result in larger prediction errors. The 4-CORF model, by supplementing unlabeled data and mitigating the influence of outlier features, significantly improved the prediction accuracy. In conclusion, the screened co-training regression model demonstrated excellent accuracy and generalization in predicting coating physical properties under various environmental conditions.

Development of the “Physical Property-Corrosion Failure” model

In the second stage, the dataset was expanded using indoor accelerated experiments consisting of repeated cycles of ultraviolet (UV) aging and salt spray tests. The UV aging test was conducted at 50 °C with an irradiation intensity of 60 W/m² for 1.5 days, and the salt spray test was performed using a 5.0 wt.% NaCl solution at 40 °C for 0.5 days. Each cycle of UV aging and salt spray testing lasted 2 days, and a total of three such cycles were completed [32,33,34,35]. The physical property data of the coatings were used as inputs to develop two distinct models: a prediction model to explore the nonlinear relationship between coating properties and barrier performance, and a binary classification model to predict coating damage states, thereby improving the accuracy of service state predictions. The classification model complements the prediction model by providing the specific damage state of the coating.

The primary purpose of the accelerated indoor experiments is to expand the dataset by simulating the aging and damage processes of coatings under various environmental conditions. This provides a more diverse and enriched set of input data for the model, which helps improve its accuracy and generalization capability, particularly its predictive precision in practical applications. A total of 37 valid datasets were obtained from multiple cycles of UV aging and salt spray tests. Figure 6 illustrates the distributions of, and pairwise correlations between, the physical property parameters, including glossiness, yellowness, WCA, adhesion, and the low-frequency impedance modulus (|Z|0.01Hz). The diagonal histograms (or density plots) represent the distribution of each parameter, which is generally close to normal. The off-diagonal panels show correlations of varying sign and linearity. The scatter plots highlight a positive correlation between glossiness, adhesion, and |Z|0.01Hz. This indicates that samples with higher glossiness and adhesion are associated with higher impedance modulus values, suggesting that these properties contribute to the improved barrier property of the coatings.

Fig. 6: Data distribution of indoor accelerated experiment results.

Based on the coating physical property data and electrochemical impedance modulus data collected from the indoor accelerated experiments, glossiness, yellowness, WCA, and adhesion were used as input variables, while the |Z|0.01Hz values served as the output. Both the regression model and the binary classification model introduced above were established using Support Vector Regression (SVR), an artificial neural network (ANN), and Adaptive Boosting (AdaBoost). Considering the data distribution, the dataset was divided into training and test sets at a ratio of 4:1. Optimal parameter combinations for the regression models were identified through grid search: the code iterates through each model and creates a GridSearchCV object for it, with the parameter search space containing the candidate values for each model listed in Table 3. Each parameter combination is evaluated using 10-fold cross-validation (KFold). Model performance was evaluated using the coefficient of determination (R²) and precision.
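A minimal sketch of such a search loop with scikit-learn is given below. The parameter grids are placeholders, since the contents of Table 3 are not reproduced here, and the data are synthetic. Selecting by mean squared error inside the cross-validation and scaling the inputs for SVR and the ANN are assumptions of this example, not details taken from the study.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for 37 samples: 4 physical properties -> log10|Z|.
rng = np.random.default_rng(0)
X = rng.normal(size=(37, 4))
y = 8.0 + X @ np.array([0.8, 0.4, 0.5, 0.1]) + 0.1 * rng.normal(size=37)

# 4:1 split between training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical search spaces; the real candidates are those of Table 3.
candidates = {
    "SVR": (
        make_pipeline(StandardScaler(), SVR()),
        {"svr__C": [1.0, 10.0], "svr__kernel": ["rbf", "linear"]},
    ),
    "ANN": (
        make_pipeline(StandardScaler(), MLPRegressor(max_iter=1000, random_state=0)),
        {"mlpregressor__hidden_layer_sizes": [(8,), (16,)]},
    ),
    "AdaBoost": (
        AdaBoostRegressor(random_state=0),
        {"n_estimators": [50, 100], "learning_rate": [0.1, 1.0]},
    ),
}

cv = KFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, (model, grid) in candidates.items():
    # One GridSearchCV object per model, each scored by 10-fold CV.
    search = GridSearchCV(model, grid, cv=cv, scoring="neg_mean_squared_error")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "test_r2": r2_score(y_test, search.predict(X_test)),
    }

for name, res in results.items():
    print(name, res["best_params"], round(res["test_r2"], 2))
```

With only a few dozen samples, each of the ten folds holds two or three points, so fold-level R² would be very noisy; that is why this sketch selects by mean squared error and reports R² only on the held-out test set.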

Table 3 Hyperparameter search space for three models

As shown in Table 4, AdaBoost demonstrated significant advantages in both regression and classification tasks, achieving an R² value of 0.83 and a precision value of 1. In the binary classification task, SVR also showed high classification accuracy but had lower prediction accuracy in regression tasks, with R² of only 0.41 and 0.44, respectively. Overall, AdaBoost exhibited exceptional accuracy, making it the optimal model for coating performance prediction and classification.

Table 4 R² and precision values of the three models

Figure 7 presents a SHapley Additive exPlanations (SHAP) bar plot illustrating the contributions of four features (adhesion, glossiness, contact angle, and yellowness) to the model output, i.e., the barrier property. The horizontal axis represents the SHAP values, which reflect the impact of each feature on the prediction, with the color gradient (pink to green) indicating feature values from low to high. The results show that adhesion has the greatest influence on the barrier property of the coating, followed by WCA and glossiness, with yellowness having the least impact. According to the literature, higher adhesion improves bonding strength, reduces voids and defects at the interface, and significantly enhances the barrier property by preventing moisture and corrosive substances from penetrating. Water contact angle and glossiness, as measures of hydrophobicity and surface smoothness, respectively, enhance the barrier property by minimizing moisture retention and contaminant adhesion. In contrast, yellowness, which relates to coating aging and photostability, has a limited direct effect on the barrier property.
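To make the quantity behind such a plot concrete, the sketch below computes exact Shapley values by enumerating all feature coalitions, which is feasible with only four inputs. It does not use the SHAP library, and the linear toy model with hypothetical weights merely stands in for the trained model. Features outside a coalition are replaced by background rows and averaged, which is one common choice of value function.

```python
import itertools
import math

import numpy as np


def exact_shapley(predict, x, background):
    """Exact Shapley values for one sample x by enumerating coalitions.

    Features outside a coalition take the values of the background rows,
    and the model output is averaged over those rows.
    """
    n = len(x)

    def value(coalition):
        Z = background.copy()
        if coalition:
            cols = list(coalition)
            Z[:, cols] = x[cols]
        return predict(Z).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            weight = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for S in itertools.combinations(others, r):
                phi[i] += weight * (value(S + (i,)) - value(S))
    return phi


# Toy stand-in for the barrier-property model (weights are hypothetical).
names = ["adhesion", "WCA", "glossiness", "yellowness"]
weights = np.array([2.0, 1.0, 0.8, 0.1])


def predict(Z):
    return Z @ weights


rng = np.random.default_rng(0)
X = rng.normal(size=(37, 4))

phi = np.array([exact_shapley(predict, x, X) for x in X])
mean_abs = np.abs(phi).mean(axis=0)  # bar heights of a SHAP-style bar plot
ranking = [names[i] for i in np.argsort(mean_abs)[::-1]]
print(ranking)
```

For a linear model the result reduces to the weight times the feature's deviation from its background mean, which gives a convenient check on the implementation.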

Fig. 7: SHAP value interpretability analysis.

Evaluation of the “Environmental Factors-Physical Property-Corrosion Failure” model

In this study, a two-stage modeling framework, “Environmental Factors-Physical Property-Corrosion Failure”, was proposed to comprehensively evaluate the anti-corrosion performance of coatings. The framework progresses systematically from environmental factors to changes in physical properties, electrochemical behavior, and ultimately the corrosion failure state of coatings, thereby establishing a complete evaluation system.

The framework employs the 4-CORF model to predict the physical properties of coatings using environmental factors as inputs. Leveraging a co-training semi-supervised learning strategy, the 4-CORF model effectively integrates labeled and unlabeled samples, enhancing generalization capability while quantifying the effects of variables such as temperature and humidity on coating physical property. These physical property changes directly influence the electrochemical results of coatings. Subsequently, the optimized AdaBoost model is applied to reveal nonlinear relationships between physical property metrics and the electrochemical impedance modulus, quantifying the specific contributions of each metric to the barrier performance. Based on these predictions, binary classification methods are then employed to assess the coating’s damage state, distinguishing between intact coatings and damaged ones (Table 5), where 1 represents intact coating and 0 represents damaged coating. As shown in Fig. 8, the two-stage model exhibits relatively small prediction errors for the coating performance in both damaged and undamaged states, especially in the damaged state. This demonstrates that the proposed two-stage “Environmental Factors-Physical Property-Corrosion Failure” model significantly outperforms the direct one-stage “Environmental Factors-Corrosion Failure” model (constructed using semi-supervised algorithms), achieving lower relative errors and superior prediction accuracy. This highlights the advantages and robustness of the two-stage framework in capturing complex relationships and improving predictive performance.
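The chaining of the two stages can be sketched as below. Everything specific in this example is an assumption: a plain random forest stands in for the 4-CORF stage, the data are synthetic, the threshold separating intact from damaged coatings is hypothetical, and relative error is taken as |predicted − true| / |true|. Only the label convention (1 for intact, 0 for damaged) follows the text.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins (all values hypothetical).
env_field = rng.normal(size=(9, 4))            # key factors at 9 field sites
mixing = rng.normal(size=(4, 4))
props_field = env_field @ mixing + 0.05 * rng.normal(size=(9, 4))
props_lab = rng.normal(size=(37, 4))           # accelerated-test properties
logz_lab = 8.0 + props_lab @ np.array([0.8, 0.4, 0.5, 0.1])  # log10|Z|

# Stage 1: environmental factors -> four physical properties.
stage1 = RandomForestRegressor(n_estimators=100, random_state=0)
stage1.fit(env_field, props_field)

# Stage 2: physical properties -> low-frequency impedance modulus.
stage2 = AdaBoostRegressor(n_estimators=100, random_state=0)
stage2.fit(props_lab, logz_lab)


def two_stage_predict(env, threshold=8.0):
    """Chain both stages, then classify: 1 = intact, 0 = damaged.

    The threshold on log10|Z| is a hypothetical value for illustration.
    """
    props = stage1.predict(env)
    logz = stage2.predict(props)
    intact = (logz >= threshold).astype(int)
    return logz, intact


def relative_error(pred, true):
    """Relative error, assumed here to be |pred - true| / |true|."""
    return np.abs(pred - true) / np.abs(true)


logz_pred, state_pred = two_stage_predict(env_field)
print(np.round(logz_pred, 2), state_pred)
```

A one-stage baseline would instead fit a single regressor directly from `env_field` to measured impedance values, skipping the intermediate property predictions.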

Fig. 8: Relative error of one-stage model and two-stage model.

Table 5 Binary classification results of one-stage model and two-stage model

The single-stage model directly fits the input environmental variables to the output barrier property results, without considering the influence of complex physical features. It therefore struggles to capture the nonlinear effects involved in evaluating the coating’s barrier performance, often leading to significant prediction errors. In contrast, the two-stage “Environmental Factors-Physical Property-Corrosion Failure” modeling strategy effectively captures the influence of physical properties on corrosion processes, enabling a more detailed analysis of changes in the barrier property. It yields lower relative errors and higher prediction accuracy than the one-stage model and performs exceptionally well in distinguishing damaged coatings from intact ones. This physical property-based approach provides a multidimensional and dynamic evaluation of the anti-corrosion performance of coatings, overcoming the limitations of traditional methods that rely solely on environmental variables. This study offers a robust and practical methodology for advancing coating performance research.

In summary, this study constructed a comprehensive predictive framework, “Environmental Factors-Physical Property-Corrosion Failure”, by integrating the environmental factors, physical properties, and corrosion failure performance of coatings in different environments. Compared with the one-stage “Environmental Factors-Corrosion Failure” model, the proposed two-stage model demonstrated significant advantages. First, it achieved higher predictive accuracy by reducing the nonlinear complexity between environmental factors and corrosion failure performance during modeling, with a relative error significantly lower than that of the one-stage model. Second, it provided better scientific interpretability by incorporating intermediate physical property parameters. This addition clarified the mechanisms by which environmental factors influence corrosion failure, thereby improving the model’s physical relevance and practical applicability.


