Intelligent lifetime prediction approach using machine learning models for power transformers



Cellulose, a natural glucose polymer, gradually degrades over time through polymer chain scission under the operating conditions of a transformer. The molecular weight of insulating paper decreases as thermal aging accelerates, especially in the presence of moisture and oxygen. This aging process produces degradation byproducts such as furans, carbon oxides (CO2 and CO), water, and acids, which dissolve in the transformer oil, as shown in Figure 4. Various diagnostic methods use these byproducts to assess the remaining life of aging transformers; some of the most widely used techniques are degree of polymerization (DP) measurement, furan analysis, and dissolved gas analysis (DGA). DGA is primarily used to identify incipient failures in power transformers from gases such as hydrogen (H2), methane (CH4), acetylene (C2H2), ethylene (C2H4), and ethane (C2H6), which are formed by decomposition of the oil during overheating or arcing rather than by deterioration of the solid insulation. Carbon oxides (CO and CO2) are useful for assessing the condition of the paper, but they are also generated by oil oxidation, which confounds identification of the degradation source and reduces the accuracy of DGA for assessing the aging of solid insulation. Furans, especially 2-furfuraldehyde (2-FAL), are important cellulose degradation byproducts that allow non-invasive DP estimation for predicting transformer health.

Figure 4

Decomposition by-products of cellulose from pyrolysis, hydrolysis, and oxidation processes.

This section outlines the machine learning models selected to predict the degree of polymerization (DP) of transformer insulation paper from 2-furfuraldehyde (2-FAL) concentration, the preparation of the training dataset, the training parameters used for model optimization, and the performance metrics used to evaluate the models. The health or remaining useful life of the transformer is then determined by identifying the DP interval into which the predicted value falls.

Machine learning model

Since the DP value of cellulose paper degrades linearly with the concentration of 2-FAL (in ppm) in the transformer oil (per the IEEE C57.104 standard), we predicted DP as a continuous variable (regression) and divided the insulation condition into four categories: new (DP 700–1200), mildly aged (DP 450–700), moderately aged (DP 250–450), and severely aged (DP < 250). Table 1 lists these conditions of the insulating paper.

Table 1 Different states of insulating paper according to DP.
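As a quick illustration of the four DP intervals above, the mapping from a predicted DP value to a condition label can be sketched as a small helper (the function name and the handling of boundary values are our assumptions, not from the paper):

```python
# Hypothetical helper mapping a predicted DP value to the four
# insulation condition categories from Table 1.
def categorize_dp(dp: float) -> str:
    """Return the insulation condition label for a DP value."""
    if dp >= 700:
        return "new"              # DP 700-1200
    elif dp >= 450:
        return "mildly aged"      # DP 450-700
    elif dp >= 250:
        return "moderately aged"  # DP 250-450
    else:
        return "severely aged"    # DP < 250
```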

The following supervised machine learning models were used:

  • Linear regression: Baseline model assuming a linear relationship between 2-FAL concentration and DP.

  • Polynomial regression (order 2): Captures nonlinear relationships by fitting a second-order polynomial to the data.

  • Random Forest Regression: An ensemble model that uses multiple decision trees to model complex nonlinear relationships for regression.

  • Logistic regression: A baseline classifier for predicting insulation state categories.

  • Support Vector Machine (SVM): Uses a radial basis function (RBF) kernel to classify insulation states based on 2-FAL.

  • Random Forest Classifier: An ensemble classifier that leverages decision trees for robust category prediction.
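The six models above map directly onto scikit-learn estimators. A minimal sketch of how they might be instantiated follows; the dictionary names are illustrative, and the settings shown reflect only what the text states (second-order polynomial, RBF kernel, 100 trees):

```python
# Sketch: the three regressors and three classifiers described above,
# instantiated with scikit-learn (settings beyond those stated in the
# text are library defaults).
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.svm import SVC

regressors = {
    "linear": LinearRegression(),
    "poly2": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    "rf_reg": RandomForestRegressor(n_estimators=100, random_state=42),
}
classifiers = {
    "logreg": LogisticRegression(solver="lbfgs", max_iter=1000),
    "svm_rbf": SVC(kernel="rbf"),
    "rf_clf": RandomForestClassifier(n_estimators=100, random_state=42),
}
```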

These models were selected for their proven effectiveness on tabular, nonlinear data with few features, which suffices to achieve our goals with adequate accuracy. We intentionally chose not to incorporate deep learning (DL) techniques such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), for several practical reasons. First, DL models typically perform best on large and diverse datasets, whereas our study relies on a synthetic dataset of 1000 samples centered on a single feature, 2-FAL, which may not provide the diversity needed to avoid overfitting. Second, DL techniques are better suited to complex high-dimensional inputs such as images or sequential data, while our work involves simpler input-output relationships that traditional machine learning approaches handle effectively. Finally, deploying DL-based solutions in real-time transformer monitoring systems requires substantial computational resources, which can limit field deployment scenarios.

Preparing the training dataset

The training dataset was generated synthetically based on the IEEE C57.104-2019 standard, which provides guidelines for interpreting transformer insulation degradation from chemical byproducts such as 2-FAL [30]. The dataset consists of 1000 samples with 2-FAL concentration (ppm) as the input feature and DP as the regression output, along with category labels for classification. The data are structured according to the ranges shown in Table 1.

Temperature and humidity affect the insulation deterioration of the transformer and therefore the DP. Experimental evidence shows that thermal stress reduces the dielectric and mechanical strength of the insulation, lowering DP, while moisture accelerates paper degradation [15]. Importantly, increasing temperature and humidity increases the formation of 2-furfuraldehyde (2-FAL) in the oil, which serves as the main input for DP estimation. Our 1000-sample dataset, based on IEEE C57.104-2019, contains a balanced range of temperature and humidity levels from low to high, ensuring that the proposed machine learning models capture these effects. This balanced training data supports high accuracy and reliability when predicting DP.

For each range, the 2-FAL and DP values are sampled uniformly to create input-output pairs, ensuring balanced representation across all categories. The dataset was split into an 80% (800 samples) training set and a 20% (200 samples) testing set to evaluate model performance on unseen data.
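The dataset preparation described above can be sketched as follows. The specific 2-FAL ranges per category are illustrative assumptions (the paper's actual ranges are in Table 1, which is not reproduced here); only the DP intervals, the 1000-sample size, the uniform per-range sampling, and the 80/20 split come from the text:

```python
# Sketch of the balanced synthetic dataset: uniform sampling of 2-FAL
# and DP per category, then an 80/20 train/test split.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# (2-FAL low, 2-FAL high, DP low, DP high, label) -- 2-FAL bounds assumed
ranges = [
    (0.0, 0.1, 700, 1200, "new"),
    (0.1, 1.0, 450, 700, "mildly aged"),
    (1.0, 10.0, 250, 450, "moderately aged"),
    (10.0, 20.0, 100, 250, "severely aged"),
]
samples_per_class = 250  # 4 x 250 = 1000 balanced samples
X, y_dp, y_cls = [], [], []
for f_lo, f_hi, dp_lo, dp_hi, label in ranges:
    X.extend(rng.uniform(f_lo, f_hi, samples_per_class))
    y_dp.extend(rng.uniform(dp_lo, dp_hi, samples_per_class))
    y_cls.extend([label] * samples_per_class)
X = np.array(X).reshape(-1, 1)

# 80% training / 20% testing split
X_train, X_test, dp_train, dp_test, cls_train, cls_test = train_test_split(
    X, np.array(y_dp), np.array(y_cls), test_size=0.2, random_state=42)
```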

Training parameters and optimization

The machine learning models were implemented using the scikit-learn library in Python. Training parameters were carefully optimized to ensure accurate DP prediction and insulation state classification from 2-furfuraldehyde (2-FAL) concentration, and we adopted a systematic approach to hyperparameter tuning and cross-validation to obtain robust, generalizable models.

For the regression models, linear regression served as a baseline with no adjustable hyperparameters, relying on ordinary least squares to minimize the mean squared error between predicted and actual DP values. Polynomial regression was fixed at order 2 to capture nonlinear relationships without excessive complexity; a pipeline combining polynomial feature transformation and linear regression was used, with no additional hyperparameters tuned, to maintain simplicity. The random forest regressor, designed to handle complex nonlinear patterns, consisted of 100 decision trees; its maximum tree depth was tuned over values of 5, 10, and none (unlimited depth), selecting the value that maximized the R² score during 5-fold cross-validation. The minimum samples per leaf was set to 1 to allow fine-grained splits, and the number of features considered at each split was set to the square root of the total number of features.

For the classification models, logistic regression was trained with a multinomial loss function covering the four insulation condition categories (new, mildly aged, moderately aged, and severely aged), using the 'lbfgs' solver with up to 1000 iterations to ensure convergence. Its regularization strength parameter (C) was tuned over values of 0.1, 1, and 10 by grid search to balance model complexity and prevent overfitting.
The support vector machine (SVM) classifier employed a radial basis function (RBF) kernel to capture nonlinear decision boundaries; its regularization parameter (C) was tuned over 1, 10, and 100, and its kernel coefficient (gamma) over 0.01, 0.1, and "scale" (computed as 1 / (number of features × variance of X)), with the combination yielding the highest accuracy selected by grid search. The random forest classifier, also set to 100 trees, was optimized for accuracy during cross-validation by setting the minimum samples per leaf to 1, setting the number of features per split to the square root of the total features, and tuning the maximum depth over 5, 10, and none.
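The grid-search procedure described above can be sketched with scikit-learn's `GridSearchCV`, shown here for the SVM classifier. The mock data below is purely illustrative (the real feature is 2-FAL concentration); the C and gamma grids and the 5-fold, accuracy-scored search come from the text:

```python
# Sketch: grid search over the SVM's C and gamma values with 5-fold
# cross-validation, selecting the highest-accuracy combination.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(42)
X = rng.uniform(0, 20, 200).reshape(-1, 1)    # mock 1-D feature
y = np.digitize(X.ravel(), [5, 10, 15])       # mock 4-class labels

param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, "scale"]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_)  # best C/gamma combination found
```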

Table 2 Tuned hyperparameters for the machine learning model.

All models underwent 5-fold cross-validation: the training data were split into five subsets, the model was trained on four folds and validated on the fifth, and this process was repeated five times to compute average performance metrics (R² for regression, accuracy for classification), ensuring robustness against overfitting and data variability. A random seed of 42 was set for all models to ensure reproducibility, and training was performed on the 80% split of the dataset, with 20% reserved for testing on unseen data. Table 2 summarizes the hyperparameters tuned for each model, and Table 3 details the cross-validation setup.
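The cross-validation protocol above can be sketched with `cross_val_score`, shown here for the random forest regressor. The mock training data is an assumption for illustration; the 5 folds, R² scoring, 100 trees, and seed of 42 come from the text:

```python
# Sketch: 5-fold cross-validation of the random forest regressor,
# averaging the R^2 score across folds.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X = rng.uniform(0, 20, 800).reshape(-1, 1)            # mock training feature
y = 1200 - 50 * X.ravel() + rng.normal(0, 20, 800)    # mock DP target

model = RandomForestRegressor(n_estimators=100, random_state=42)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(scores.mean())  # average R^2 across the five folds
```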

Table 3 Cross-validation setup.


