Machine learning framework for oxytetracycline removal using nanostructured cupric oxide supported on magnetic chitosan alginate biocomposite



FESEM images of nanostructured cupric oxide on magnetic chitosan/alginate biocomposite

Examination by FESEM (Fig. S1 in the supporting information file) reveals an open, irregular surface structure that increases the adsorbent's contact area with pollutants. The images show both isolated clusters and well-distributed CuO nanoparticles on the biocomposite surface, confirming successful synthesis. Some nanoparticle agglomeration is scattered across the image, which may locally reduce adsorption performance. The synthesized CuO nanoparticles range from 10 to 100 nm in size, demonstrating successful nanostructure formation, and appear in rod-, sheet-, and sphere-like morphologies, each with its own adsorption behavior. Particles of varying morphology and brightness correspond to magnetic Fe3O4 phases, which enable separation of the adsorbent from solution. The absence of cracks and the preservation of a fully porous structure support both effective pollutant removal and high mechanical stability.

The biocomposite surface displays CuO nanoparticles with sheet-like, rod-shaped, and spherical morphologies (also corroborated by the ultraviolet–visible absorption spectrum, in which the synthesized copper oxide nanoparticles show a characteristic absorption peak at 570 nm). The FESEM images, with scale bars of 200 and 500 nm, show particle sizes between 10 and 100 nm, consistent with the manuscript. The adsorbent's magnetic properties originate from spherical Fe3O4 nanoparticles, which appear as brighter structures with moderately smooth surfaces; compositional differences between particles produce the brightness contrast in FESEM images, so the brighter regions identify the Fe3O4 component. These magnetic particles measure 20 to 50 nm and enable simple retrieval of the adsorbent by magnetic means. The alginate/chitosan matrix, which displays a porous mesh structure in the images, provides a suitable platform for nanoparticle distribution; as noted in the manuscript, this porous structure delivers effective pollutant adsorption by providing extensive surface area.

FTIR spectrum of nanostructured CuO on magnetic chitosan/alginate biocomposite

The FTIR spectrum of CuO (Fig. S2a in the supporting information file) displays broad peaks at 3587.39 cm−1 and 3392.17 cm−1, attributed to O–H stretching vibrations, indicating hydroxyl groups or adsorbed water. Peaks at 2920.20 cm−1 and 2853.35 cm−1 correspond to C–H stretching, possibly from organic residues. Peaks at 1629.71 cm−1 and 1575.01 cm−1 relate to bending vibrations of water or carbonyl groups, while the 1427.12 cm−1 band suggests carbonate species. Additional peaks between 1126.80 cm−1 and 734.51 cm−1 correspond to C–O stretching and metal–oxygen interactions. Cu–O stretching vibrations appear at 624.33 cm−1 and 604.97 cm−1, confirming the presence of CuO, with further metal–oxygen peaks at 511.42 cm−1, 485.70 cm−1, and 424.28 cm−1. The FTIR spectrum of chitosan (Fig. S2b in the supporting information file) shows a broad peak at 3444 cm−1 for O–H and N–H stretching. Peaks at 2922 cm−1 and 2874 cm−1 correspond to aliphatic C–H stretching, while the 1598 cm−1 peak confirms primary amine groups. Other significant peaks include those at 1421 cm−1 (CH2 bending), 1380 cm−1 (CH3 bending), and 1321–1262 cm−1 (amide III vibrations). C–O stretching in chitosan is confirmed by peaks at 1187 cm−1, 1076 cm−1, and 1034 cm−1, with saccharide ring vibrations appearing at 995 cm−1 and 896 cm−1. Peaks at 667 cm−1 and 604 cm−1 correspond to skeletal polysaccharide vibrations. The FTIR spectrum of alginate (Fig. S2c in the supporting information file) features a broad O–H stretching peak at 3408.21 cm−1 and a C–H stretching peak at 2924.29 cm−1. The asymmetric stretching vibration of carboxylate (–COO) groups appears at 1414.10 cm−1, with symmetric stretching at 1302.59 cm−1, indicating metal-binding potential. C–O stretching vibrations at 1094.88 cm−1 and 1032.97 cm−1 confirm glycosidic linkages, while peaks at 953.07 cm−1 and 819.70 cm−1 relate to saccharide ring vibrations. Skeletal vibrations occur at 726.17 cm−1, with metal–alginate bending at 620.93 cm−1.
The FTIR spectrum of CuO-M-CAB (Fig. S2d in the supporting information file) shows a broad peak in the 3200–3600 cm−1 region, representing O–H and N–H stretching, indicative of polysaccharide structures. Peaks between 1600 and 1750 cm−1 correspond to C=O stretching in carboxylate (–COO) and amide groups. The 1000–1300 cm−1 region contains C–N and C–O–C stretching peaks, confirming polysaccharide biopolymers. Finally, Cu–O bond vibrations between 500 and 700 cm−1 confirm the presence of CuO nanoparticles in the composite.

The FTIR spectra in Fig. S2 demonstrate the successful integration of CuO nanoparticles into the alginate/chitosan biopolymer matrix and the interactions between them; Fig. S2d shows the unified spectrum of the composite. CuO-specific bands between 500 and 700 cm−1 appear in both the composite spectrum (Fig. S2d) and the pure CuO spectrum (Fig. S2a), although their positions differ slightly, indicating chemical bonding rather than simple physical contact.

Shifts and merging of functional group peaks: The composite spectrum (S2d) shows a broad band in the 3200–3600 cm−1 region corresponding to overlapping O–H and N–H stretching vibrations. The broadening and shift compared to the original polymers (Figs. S2b,c) suggest hydrogen bonding and/or interaction with CuO surface hydroxyl groups.

New or altered absorption bands: The band region from 1600 to 1750 cm−1 shows changes in both the intensity and position of the C=O and amide-related peaks, attributable to coordination interactions between CuO and the carboxylate/amide groups of the biopolymers. Such interactions demonstrate both the inclusion of CuO in the matrix and a potential chelation effect.

Retention of polysaccharide features with CuO bonding: Despite the presence of CuO, characteristic peaks of the polysaccharides (C–O–C, C–N, and saccharide ring vibrations in the 1000–1300 cm−1 region) remain visible, confirming that the polymeric network is preserved while participating in bonding with CuO. These spectral features, particularly the shifts and merging of CuO and polymer functional group vibrations, are a strong indication of chemical interactions between the CuO nanoparticles and the biopolymer matrix, going beyond simple physical dispersion. Therefore, the FTIR analysis provides convincing evidence for CuO binding to the biopolymer structure.

XRD of CuO-M-CAB

The XRD pattern of CuO-M-CAB (Fig. S3 in the supporting information file) reveals both the crystalline and amorphous phases present in the synthesized material. Substantial crystalline content manifests as a prominent peak at 2θ ≈ 18–20°, with additional distinct peaks between 20° and 50°. Reflections of the monoclinic CuO structure produce peaks at 2θ values of 32.5°, 35.5°, 38.7°, and 48.7°, confirming that CuO nanoparticles formed within the biocomposite. The material's magnetic behavior is most plausibly attributed to iron oxide phases (Fe3O4 or γ-Fe2O3), whose main XRD peaks are detected near 2θ values of 30.1°, 35.4°, 43.1°, and 57.0°. The broad intensity hump between 10° and 25° is commonly associated with the amorphous nature of biopolymers such as chitosan and alginate, consistent with the organic matrix used in this composite. The structure thus combines crystalline CuO with an amorphous biopolymer phase, providing active surface sites within a flexible, dispersible framework. The XRD pattern confirms that the nanostructured CuO is magnetically functionalized and retained within the biopolymeric structure, supporting its effectiveness for OTC antibiotic removal from aqueous solution.

Tikhonov regularization model

Evaluation of model performance in predicting removal percentages

Tikhonov Regularization achieves its prediction accuracy by controlling model complexity and preventing overfitting through a penalty applied to the model coefficients. With tuned hyperparameters, the predictive variance decreases while prediction errors remain minimal, so the model performs well on both the training data and unseen data.

Figure 1 shows the relationship between the actual removal percentages and the values predicted by the model. The red line represents the regression line, i.e., the best linear fit to the data. The data points in the left plot (training data) lie generally close to the regression line, indicating a good fit and showing that the model has effectively learned the patterns in the training dataset. In the Tikhonov Regularization model, R2 was 0.973 for the training dataset and 0.958 for the testing dataset; the model therefore explains approximately 97.3% of the variance in the training data and 95.8% in the testing data, demonstrating strong accuracy together with good generalization28,42.

Fig. 1

Actual vs Predicted Removal Percentages and Residual Analysis.

With an R2 of 0.973448, the model explains roughly 97.34% of the variance in the training data. The data points in the right plot (testing data) are also close to the regression line, although with slightly more scatter, indicating that the model generalizes reasonably well to new (testing) data but may produce larger errors in certain cases. An R2 of 0.958178 shows that the model accounts for almost 95.82% of the variance in the testing data. The bottom plots show the difference between actual and predicted values (residuals) against the actual values; the horizontal line at zero represents zero error. Residuals in the left plot (training data) are generally scattered around zero with some variation, indicating larger errors for certain training points. An RMSE of 4.621522 and an MAE of 3.648229 show that, on average, the model's error on the training data is around 4.62 units, with an absolute error of about 3.65 units. Residuals in the right plot (testing data) are likewise mostly scattered around zero, although the variance pattern differs somewhat, suggesting that the model might exhibit some systematic error on the testing data. An RMSE of 5.039186 and an MAE of 4.208315 show that the model's average error on the testing data is slightly higher than on the training data. The model therefore performs marginally better on the training data, a difference that suggests possible slight overfitting: the model may rely too heavily on particular features of the training data, which affects its ability to generalize to new data.
Although the initial assessment included a qualitative reference to possible overfitting, the statistical results indicate that the performance difference between the training and testing datasets remains within an acceptable range. The R2 value reaches 0.973 for the training data and 0.958 for the testing data, indicating that the model maintains effective forecasting ability across both sets. For the training data, RMSE is 4.62 and MAE is 3.65, while the testing data showed an RMSE of 5.04 and an MAE of 4.21. This slight increase in prediction error for the test set is expected and does not indicate significant overfitting. In both datasets the residuals are randomly distributed around zero, without major systematic patterns or extreme fluctuations, so the model performs reliably on new data and shows no indication of overfitting33,55,56.
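The error statistics quoted above (R2, RMSE, MAE) can be reproduced from paired vectors of actual and predicted removal percentages. The sketch below uses toy numbers, not the study's data:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)                       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))
    return r2, rmse, mae

# Toy example (hypothetical values, not the study's measurements)
r2, rmse, mae = regression_metrics([4, 60, 100, 74], [6, 58, 99, 70])
```

The same three metrics, computed on the training and testing splits separately, yield the comparison reported in the text.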

In the Tikhonov Regularization model, only slight signs of systematic error and overfitting were observed. Although the residual plots for both training and testing datasets showed prediction errors generally scattered around the zero line with no clear systematic pattern, there were a few instances, particularly in the testing data, where larger residuals appeared, suggesting that the model may produce occasional systematic errors in specific cases. Regarding overfitting, the model performed marginally better on the training data than on the testing data: the R2 score decreased slightly from 0.973 in training to 0.958 in testing, whereas RMSE and MAE rose from 4.62 and 3.65 to 5.04 and 4.21, respectively. This minor overfitting means the model draws somewhat heavily on training-data characteristics, but the variation stays within valid ranges; the potential overfitting is minimal and does not compromise the model's generalization capability.

Overall, the model's performance is good, making it suitable for forecasting removal rates. Based on the plots and performance metrics, it can be concluded that the model has strong predictive performance: it has learned the patterns in the data and provides accurate predictions of removal percentages. However, as observed in the residual plots, the model may exhibit some systematic errors in certain cases52,57,58. The main drawback of Tikhonov Regularization in this study is slight overfitting, revealed by the performance gap between the training and testing sets. Prediction accuracy was high, with R2 values of 0.973 for training and 0.958 for testing, but the test set showed slightly larger prediction errors (RMSE of 5.04 and MAE of 4.21). There are indications that the model relies somewhat heavily on particular training features, which might affect its generalization to fresh unseen data, but the performance differences do not point to substantial overfitting. The model shows no major systematic errors, since residuals stay close to zero, although occasional large errors occur sporadically.

The experimental results and parameters of Tikhonov regularization (Ridge Regression) based on the provided code and outcomes are as follows: The regularization parameter (α) was optimized using Grid Search with fivefold cross-validation, yielding the best value of α = 0.1. The regularized Tikhonov model achieved an R2 of 0.973 for the training data and 0.958 for the testing data, indicating high accuracy and strong generalization capability. The RMSE and MAE for the testing data were 5.04 and 4.21, respectively, reflecting minimal and acceptable prediction errors. The random distribution of residuals around zero in both training and testing datasets confirms the absence of systematic error patterns and the model’s reliable performance. Tikhonov regularization with α = 0.1 effectively reduced overfitting and enhanced generalization, while the slight performance difference between training and testing datasets remains within acceptable limits, indicating no significant overfitting. Thus, the regularized Tikhonov model is validated as an effective tool for predicting removal percentages with high accuracy and good generalization. By incorporating these details, the theoretical explanations are supported by empirical evidence and specific parameter values28,49,50,59.
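The workflow described here (Ridge regression with the Tikhonov penalty α tuned by fivefold grid search) can be sketched as follows. The data are synthetic stand-ins for the four operating variables X1–X4 and the removal percentage, and the α grid is an assumption; the text reports only that α = 0.1 was selected:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: four predictors and a noisy linear response
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 4))
y = 20 * X[:, 1] + 35 * X[:, 2] - 5 * X[:, 0] + rng.normal(0, 1, size=60)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Fivefold grid search over the Tikhonov penalty alpha, as described in the text
search = GridSearchCV(Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0]},
                      cv=5, scoring="neg_root_mean_squared_error")
search.fit(X_tr, y_tr)

best_alpha = search.best_params_["alpha"]
test_r2 = search.best_estimator_.score(X_te, y_te)  # R^2 on held-out data
```

The selected α then defines the final regularized model whose train/test R2, RMSE, and MAE are reported above.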

Performance analysis of Tikhonov regularization model using various evaluation metrics

The radar chart (Fig. 2) is a powerful visual tool for comparing several quantities simultaneously. In this chart, each axis represents a performance metric of the Tikhonov Regularization model, and the distance from the center along each axis indicates the value of that metric. By comparing these distances for the training and testing data, we can gain important insights into the model's performance. Several key points stand out in the chart:

Fig. 2

Comparison of Tikhonov Regularization Model Performance Metrics for Training and Test Data.

The largest difference between the training and testing data is seen in the MSE metric: the Mean Squared Error of the test data is notably higher than that of the training data, indicating that the model performs considerably worse on data it has not encountered before. While not as pronounced as for MSE, the difference in MAE is also striking, showing that, on average, the model's prediction error is higher for the testing data than for the training data. RMSE likewise shows a substantially higher error on the testing data, mirroring MSE. Although not directly visible in the chart, it can be inferred that the R-squared value for the testing data is lower than for the training data, meaning the model is less effective at explaining the variance in the testing data. Given the notable differences between the training and testing data, it can be concluded that the Ridge regression model has overfitted: it has become overly dependent on the particular features of the training data and is unable to generalize well to new, unseen data (additional materials are provided in the supplementary file)57,60,61,62.
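A radar chart of this kind can be drawn with matplotlib's polar axes. The metric values below reuse the RMSE/MAE/R2 figures reported earlier (MSE taken as RMSE squared); the layout details are illustrative, not the published figure:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, suitable for scripts
import matplotlib.pyplot as plt

labels = ["MSE", "RMSE", "MAE", "R2"]
train = [21.35, 4.62, 3.65, 0.973]   # MSE approximated as RMSE**2
test = [25.40, 5.04, 4.21, 0.958]

# One angle per metric; repeat the first angle to close the polygon
angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={"polar": True})
for vals, name in [(train, "Train"), (test, "Test")]:
    closed = vals + vals[:1]
    ax.plot(angles, closed, label=name)
    ax.fill(angles, closed, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(labels)
ax.legend(loc="upper right")
fig.savefig("radar_metrics.png")
```

Because the error metrics (MSE, RMSE, MAE) and R2 live on very different scales, normalizing each axis before plotting is a common refinement.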

Evaluation of regression model performance using statistical analysis and visualization

The presented charts (Fig. 3) give a detailed comparison between the actual values and those predicted by the regression model for both the training and testing datasets. Each column is devoted to one of these datasets, while the rows contain three types of plots: violin, box, and swarm. Combined with descriptive statistics, these plots serve as powerful tools for analyzing the model's performance. The violin plot shows the data distribution by density, with the widest portion indicating the highest data concentration; the box plot reports the minimum, maximum, first and third quartiles, and median; and the swarm plot displays each data point individually, giving a clear view of the exact data dispersion. For the training dataset, the means of the actual and predicted values (both 54.62) are essentially identical, demonstrating the model's ability to estimate the central tendency of the training data. The standard deviation of the actual values (28.36) is slightly higher than that of the predicted values (27.60), suggesting that the model slightly underestimates the spread of the data. The minimum predicted value (−3.76) is lower than the actual minimum (4.0), while the maximum predicted value (99.22) is slightly below the actual maximum (100.0); these discrepancies indicate that the model struggles to predict extreme values accurately. The predicted median (62.29) is slightly higher than the actual median (60.0), indicating an overestimation of the central value, and the mode of the predicted data (−3.76) differs substantially from the actual mode (74.0), revealing the model's weakness in reproducing the mode of the training data.
In the testing dataset, the mean of the predicted values (51.23) is somewhat higher than the actual mean (49.92), indicating a tendency of the model to overestimate the central tendency of the testing data. The standard deviation of the predicted values (23.62) is lower than the actual value (24.64), suggesting that the model underestimates the spread of the data. The minimum predicted value (5.19) is slightly below the actual minimum (6.3), and the maximum predicted value (86.55) is below the actual maximum (92.0); these discrepancies highlight the model's difficulty in predicting the extreme values of the testing data. The predicted median (52.63) is higher than the actual median (47.0), indicating an overestimation of the central value, while the predicted mode (5.19) differs from the actual mode (6.3), revealing the model's weakness in reproducing the most frequent value of the testing data. Overall, the regression model performs reasonably well in estimating the mean and spread of the data in both the training and testing sets but shows weaknesses in predicting extreme values and reproducing the mode. These issues may indicate overfitting, where the model relies too heavily on the training data and loses its ability to generalize to new data; adjustments to the model or additional data may be necessary for improvement57,63,64,65,66.
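The three panel types can be reproduced with matplotlib alone (a jittered strip plot stands in for the seaborn-style swarm plot). The data here are synthetic stand-ins with roughly the reported mean and spread, not the study's measurements:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

# Synthetic "actual" removal percentages and imperfect "predictions"
rng = np.random.default_rng(1)
actual = rng.normal(54.6, 28.4, 80).clip(4, 100)
predicted = actual + rng.normal(0, 4.6, 80)
data = [actual, predicted]

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
axes[0].violinplot(data, showmedians=True)       # density-shaped distribution
axes[0].set_title("Violin")
axes[1].boxplot(data)                            # quartiles, median, extremes
axes[1].set_title("Box")
for i, d in enumerate(data, start=1):            # jittered points per group
    jitter = rng.uniform(-0.08, 0.08, d.size)
    axes[2].scatter(np.full_like(d, i) + jitter, d, s=8)
axes[2].set_title("Strip (swarm stand-in)")
fig.savefig("distribution_panels.png")
```

A true swarm plot (non-overlapping points) would use seaborn's `swarmplot`; the jittered scatter conveys the same per-point dispersion with no extra dependency.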

Fig. 3

Comparison of Violin, Box, and Swarm Plots for Training and Testing Data in Tikhonov Regularization Model.

Assessment of residual distribution in regression models using QQ plots

QQ (quantile–quantile) plots (Fig. 4) are powerful tools for comparing the distributions of two datasets. In regression modeling, they are used to assess the normality of the residuals; residual normality is one of the key assumptions of many regression models and plays a significant role in model validation. The figure contains four QQ plots: two for the training data and two for the testing data. Each plot has two axes: the horizontal axis represents the quantiles of the theoretical distribution (usually the standard normal distribution), and the vertical axis represents the sample quantiles. If the points on a QQ plot align approximately with the 45-degree line, the sample distribution closely matches the theoretical distribution. In these figures, the QQ plots are drawn using two different methods: the first (stats.probplot) transforms the data into quantiles and plots them against the quantiles of the standard normal distribution, while the second (sm.qqplot) plots the data quantiles directly against the standard normal quantiles. The first QQ plot for the training residuals shows that the points mostly align with the 45-degree line, with slight deviations at the tails of the distribution; the training residuals therefore largely follow a normal distribution, although some departures from normality exist at the tails. The second plot for the training data also indicates approximate normality of the residuals, with a few deviations from the 45-degree line that may be due to outliers or the influence of particular data features.
The first QQ plot for the testing residuals shows that the points lie generally farther from the 45-degree line, indicating that the residuals deviate considerably from the normal distribution; the testing residuals are therefore non-normally distributed. The second plot for the testing data likewise shows clear non-normality in the residuals. Based on the QQ plots, the training residuals are broadly close to a normal distribution, with some deviations in specific regions, suggesting that the regression model performs relatively well on the training data and that the normality assumption for the residuals is largely met. In contrast, the testing residuals differ substantially from a normal distribution, indicating poorer performance of the regression model on the testing data, where the normality assumption for the residuals is violated30,65,67.
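The first of the two plotting methods named above, scipy's `stats.probplot`, can be sketched as follows on synthetic residuals (the second method, statsmodels' `sm.qqplot`, produces an equivalent view and is not shown). The residual arrays are illustrative stand-ins:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from scipy import stats

# Synthetic residuals with roughly the reported spread (illustrative only)
rng = np.random.default_rng(2)
train_resid = rng.normal(0, 4.6, 80)
test_resid = rng.normal(0, 5.0, 20)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, resid, title in [(axes[0], train_resid, "Train residuals"),
                         (axes[1], test_resid, "Test residuals")]:
    # probplot returns ordered quantile pairs and the least-squares fit line;
    # r near 1 indicates agreement with the theoretical normal distribution
    (osm, osr), (slope, intercept, r) = stats.probplot(resid, dist="norm",
                                                       plot=ax)
    ax.set_title(f"{title} (r = {r:.3f})")
fig.savefig("qq_plots.png")
```

The correlation coefficient `r` returned alongside the fit line gives a quick numerical check of normality to accompany the visual inspection.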

Fig. 4

Comparison of Residual Distributions Using QQ Plots (stats.probplot and sm.qqplot).

Visual analysis of the relationship between independent variables and removal efficiency

The presented charts (Fig. 5) are contour plots that visually illustrate the relationship between two independent variables and a dependent variable (removal efficiency). The horizontal and vertical axes represent the values of the independent variables, and the colors indicate the value of the dependent variable, removal, typically shown as a gradient from low to high. Contour lines connect points where the dependent variable is constant. Upward-sloping contour lines indicate a direct relationship between the independent and dependent variables, whereas a downward slope suggests an inverse relationship; closely spaced contour lines imply a strong dependence on the variables. Straight contour lines indicate a linear relationship, while curved lines suggest a non-linear one. The color gradient conveys the range of the dependent variable: if the plots show that as X1 and X2 increase the colors shift from blue to yellow (with blue representing low values and yellow high values of the dependent variable), it can be concluded that increasing both independent variables increases the dependent variable. Moreover, approximately parallel contour lines suggest that both independent variables have a similar effect on the dependent variable52,68,69.
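A contour plot of this kind can be generated with matplotlib's `contourf`/`contour`. The response surface below is a hypothetical linear function in which removal rises with both X1 and X2 (matching the blue-to-yellow reading described above), not the fitted model:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

# Grid over two (normalized) independent variables
x1 = np.linspace(0, 1, 100)
x2 = np.linspace(0, 1, 100)
X1, X2 = np.meshgrid(x1, x2)

# Hypothetical response surface: removal increases with both X1 and X2
removal = 100 * (0.4 * X1 + 0.6 * X2)

fig, ax = plt.subplots()
cf = ax.contourf(X1, X2, removal, levels=12, cmap="viridis")  # blue -> yellow
cs = ax.contour(X1, X2, removal, levels=12, colors="k", linewidths=0.5)
ax.clabel(cs, fmt="%.0f")                 # label the constant-removal lines
fig.colorbar(cf, label="Removal (%)")
ax.set_xlabel("X1")
ax.set_ylabel("X2")
fig.savefig("contour_removal.png")
```

With a linear surface the contour lines are straight and parallel; a fitted non-linear model would produce the curved, unevenly spaced contours discussed in the text.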

Fig. 5

Contour Plots Showing the Interaction Between Independent Variables and Removal Efficiency.

Visual representation of feature importance in the Tikhonov regularization model

Radar and bar charts reveal different aspects of the importance of the input features relative to the dependent variable. The radar chart compares the input features (X1, X2, X3, X4) as polygons plotted on a circular layout; the areas of the colored inner regions reflect feature importance, with larger areas indicating more significant features. The chart uses these colored regions to display the magnitude of each feature's effect on the dependent variable, based on its coefficient value: the farther the colored area extends toward a specific feature, the greater that feature's importance. The two charts complement each other, combining visual comparison with an exact numerical representation of the values on the feature scale. The bar chart displays coefficient magnitude through bar height, showing how strongly the dependent variable responds to each feature; positive bars indicate direct effects, and negative bars indicate inverse relationships70,71,72.

Figure 6 presents two types of charts designed to illustrate feature importance in the Tikhonov Regularization model: a radar chart and a bar chart. The radar chart compares the input features (independent variables X1, X2, X3, X4) as a polygon plotted over a circular layout. Each radial axis corresponds to one of the input features; the lines trace the numerical values of each feature, and the colored inner region represents the relative significance of the features. Features with higher coefficients occupy a larger share of the inner area, so the farther the colored region extends toward a specific feature, the greater that feature's importance. The bar chart presents the relative importance of each feature in numerical form: the horizontal axis (X-axis) carries the feature names (X1, X2, X3, X4), while the vertical axis (Y-axis) carries the corresponding coefficient values, which can be positive or negative. The height of each bar indicates the magnitude of the feature's coefficient; features with taller bars have a greater influence on the dependent variable. Positive bars indicate a direct (increasing) effect of the feature on the dependent variable, and negative bars indicate an inverse (decreasing) effect. The radar chart is suitable for a visual, comprehensive comparison of multiple features, whereas the bar chart provides exact coefficient values and the direction of effect (positive or negative) for each feature. Based on the radar and bar charts, features X3 and X2 have the strongest effect on the dependent variable, while features X1 and X4 are the least important.
Features X3 and X2 have a direct (positive) effect on the dependent variable, whereas features X1 and X4 have an inverse (negative) effect70,71,72,73,74.
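The paired bar and radar views can be sketched from the model coefficients. The coefficient values below are illustrative stand-ins chosen only to match the reported ranking (X3, X2 positive and dominant; X1, X4 negative and weak), not the fitted coefficients:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

features = ["X1", "X2", "X3", "X4"]
coefs = np.array([-0.8, 2.6, 3.1, -0.5])  # hypothetical Ridge coefficients

fig = plt.figure(figsize=(10, 4))

# Bar chart: signed coefficients, colored by direction of effect
ax_bar = fig.add_subplot(1, 2, 1)
ax_bar.bar(features, coefs, color=np.where(coefs >= 0, "tab:green", "tab:red"))
ax_bar.axhline(0, color="k", linewidth=0.8)
ax_bar.set_ylabel("Ridge coefficient")

# Radar chart: absolute magnitudes, polygon closed by repeating the first point
ax_radar = fig.add_subplot(1, 2, 2, polar=True)
angles = np.linspace(0, 2 * np.pi, len(features), endpoint=False)
mag = np.abs(coefs)
ax_radar.plot(np.append(angles, angles[0]), np.append(mag, mag[0]))
ax_radar.fill(np.append(angles, angles[0]), np.append(mag, mag[0]), alpha=0.3)
ax_radar.set_xticks(angles)
ax_radar.set_xticklabels(features)
fig.savefig("feature_importance.png")
```

Note that raw Ridge coefficients are only comparable as importances when the features are on a common scale, so standardizing X1–X4 before fitting is the usual prerequisite for this reading.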

Fig. 6

Feature Importance Analysis: Radar and Bar Chart Representations.

Yandex boosting

Evaluating the performance of the Yandex boosting regression model for predicting removal efficiency (%) based on key variables

Based on the provided numbers and graphs (Fig. 7), the analysis gives a comprehensive picture of the Yandex Boosting regression model's performance. The model, trained to predict the target variable "Removal (%)" from the independent variables X1 to X4, has shown good results on the training data. The mean residual is close to zero (−0.25), and the low standard deviation (1.28) indicates that the predictions are, on average, very close to the actual values. The mean of 54.62 and the standard deviation of 28.80 for the training data reflect significant data dispersion, which the model has captured well. On the test data, the model's performance is somewhat weaker: the negative mean residual (−1.16) suggests a slight underestimation, while the higher standard deviation of the residuals (4.66) indicates more variable prediction errors. This variability could hint at overfitting. The mean of 49.92 and the standard deviation of 26.14 for the test data also reveal significant dispersion, though slightly less than in the training data. The scatter plot of actual vs. predicted values shows data points generally distributed around the ideal line (y = x), confirming relative agreement between predictions and actual values. However, the test data points exhibit more scatter, indicating slightly reduced predictive accuracy on new data. Residual plots show residuals around the zero line, but with greater variability in the test data, highlighting larger and more inconsistent prediction errors compared to the training data (additional materials are provided in the supplementary file)75,76,77,78,79,80,81,82.
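Yandex's open-source boosting library is CatBoost; as a hedged stand-in (CatBoost itself is not shown here), the sketch below fits scikit-learn's `GradientBoostingRegressor` on synthetic data and computes the train/test residual statistics discussed above. All data and hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for X1..X4 and Removal (%)
rng = np.random.default_rng(3)
X = rng.uniform(size=(120, 4))
y = 20 * X[:, 1] + 35 * X[:, 2] - 5 * X[:, 0] + rng.normal(0, 1, size=120)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                  max_depth=3, random_state=0)
model.fit(X_tr, y_tr)

# Residual diagnostics: mean near zero, spread larger on the test split
train_resid = y_tr - model.predict(X_tr)
test_resid = y_te - model.predict(X_te)
```

Comparing `train_resid.std()` with `test_resid.std()` reproduces the train/test gap used in the text to flag possible overfitting; swapping in `catboost.CatBoostRegressor` with the same fit/predict calls would make this a CatBoost example.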

Fig. 7

Comparison of Predicted versus Actual Values and Residual Analysis for Yandex Boosting Regression Model.

Evaluating model performance through learning curves

Learning curves (Fig. S4 in the supporting information) are a powerful tool for assessing the performance of machine learning models. These curves trace the relationship between the size of the training dataset and the model's performance. By analyzing them, one can gain basic insights into the model, such as its complexity, signs of overfitting or underfitting, and the need for additional data collection. Interpretation of the provided learning curves:

  1.

    RMSE. A decrease in RMSE for training data as the dataset size increases indicates that the model becomes better at learning patterns in the data as the training set grows. A reduction in RMSE for test data up to a certain point, followed by stabilization or an increase, suggests the presence of overfitting: the model has become so complex that it starts learning irregularities in the training data, making it difficult to generalize to unseen (test) data.

  2.

    R-squared. An increase in R-squared for training data as the dataset size grows reflects that the model progressively improves its ability to explain the variance in the training data. An improvement in R-squared for test data up to a specific point, followed by stabilization or decline, also indicates overfitting, where the model loses its ability to generalize effectively to unseen data.

  3.

    MAE. A decrease in MAE for both training and test data with larger dataset sizes shows that the model becomes generally more accurate in predicting target values. However, as with RMSE and R-squared, a slight increase in MAE for test data after a specific point may occur, signaling overfitting57.
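The three metrics in the list above follow fixed formulas and can be computed directly. A minimal sketch with hypothetical Removal (%) values (not the paper’s data):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    ss_res = np.sum((y_true - np.asarray(y_pred)) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical actual vs. predicted Removal (%) values.
y_actual = [10.0, 40.0, 55.0, 80.0, 95.0]
y_model  = [12.0, 38.0, 57.0, 78.0, 96.0]

r = rmse(y_actual, y_model)
m = mae(y_actual, y_model)
r2 = r_squared(y_actual, y_model)
```

Tracking these three quantities as the training-set size grows is what produces the learning curves interpreted above.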

In summary, the learning curves and scatter plots provide various insights into Yandex Boosting’s predictions.

  • A lower RMSE for the training data as the size of the training dataset increases may indicate that the model is improving at learning patterns from the data. However, for test data, RMSE decreases up to some extent and then either plateaus or increases, suggesting overfitting: the model becomes so specialized to the training data that it struggles to generalize to unseen (test) data. Additionally, the R-squared value tends to increase for training data, reflecting improved model performance in explaining variance; however, for test data, it often levels off or declines beyond a certain point, which is another sign of overfitting. MAE decreases for both training and test data as the dataset grows; however, after a specific threshold, it may slightly increase, indicating overfitting.

  • Scatter plots show that the data points for the training data are generally distributed around the ideal line (y = x), indicating good agreement between predictions and actual values. However, for test data, there is more scatter, reflecting reduced predictive accuracy on new data. The elevated dispersion in the test data leads to larger, less predictable prediction errors compared to the training data, possibly because of overfitting.

Yandex Boosting performs well on the training data; however, overfitting on the test data reduces its ability to generalize to new data patterns26,71,72,83.
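As an illustration of how such learning curves are generated, the sketch below uses scikit-learn’s `learning_curve` with `GradientBoostingRegressor` standing in for Yandex Boosting (the paper’s model and data are not reproduced here), on synthetic data shaped like the (X1–X4, Removal %) problem:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import learning_curve

# Synthetic stand-in for the (X1..X4, Removal %) dataset.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(300, 4))
y = 100 * (0.4 * X[:, 1] + 0.4 * X[:, 2] - 0.2 * X[:, 3]) + rng.normal(0, 3, 300)

sizes, train_scores, test_scores = learning_curve(
    GradientBoostingRegressor(random_state=0),
    X, y,
    train_sizes=np.linspace(0.2, 1.0, 5),
    cv=5,
    scoring="neg_root_mean_squared_error",
)
train_rmse = -train_scores.mean(axis=1)   # RMSE on the training folds
test_rmse = -test_scores.mean(axis=1)     # RMSE on the validation folds
# A persistent gap between test_rmse and train_rmse is the
# overfitting signature described above.
```

Plotting `train_rmse` and `test_rmse` against `sizes` yields curves of the kind shown in Fig. S4.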

Evaluation of model performance with cross-validation in Yandex boosting regression

The provided chart (Fig. 8) is a cross-validation chart for the Yandex Boosting regression model. It shows how the model’s error (measured by the RMSE metric) changes with the increasing number of algorithm iterations. The horizontal axis represents the number of iterations of the Yandex Boosting algorithm; each iteration corresponds to one training step of the model. The vertical axis represents the root mean squared error, a common metric for evaluating the performance of regression models; the lower the RMSE, the better the model’s performance. The blue line shows the average RMSE across the validation datasets for each iteration, and the shaded blue area represents the standard deviation of RMSE, indicating the range of possible RMSE values at each iteration. Initial decrease in RMSE: in the early stages of training, as the number of iterations increases, the model’s errors decrease, showing that the model quickly learns patterns from the data. Plateau or slight increase in RMSE after a certain number of iterations: after a certain point, the model’s error may level off or even slightly increase. This phenomenon is called overfitting; at this point, the model fits the training data too closely and may perform poorly on new, unseen data. Standard deviation of RMSE: the shaded area shows that as the number of iterations increases, the variation in errors across the validation datasets decreases, suggesting that the model reaches more stable results with each iteration26,84.
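A per-iteration cross-validated RMSE trajectory of this kind can be emulated as follows. This is a sketch on synthetic data, using `GradientBoostingRegressor.staged_predict` as a stand-in for the Yandex Boosting iterations:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

# Synthetic stand-in data; not the paper's dataset.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 4))
y = 100 * (0.5 * X[:, 1] + 0.5 * X[:, 2]) + rng.normal(0, 3, 200)

n_iterations = 100
fold_rmse = []  # per-fold RMSE trajectory over boosting iterations
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = GradientBoostingRegressor(n_estimators=n_iterations, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    # staged_predict yields predictions after each boosting iteration
    rmse_path = [
        np.sqrt(np.mean((y[val_idx] - pred) ** 2))
        for pred in model.staged_predict(X[val_idx])
    ]
    fold_rmse.append(rmse_path)

fold_rmse = np.array(fold_rmse)       # shape: (5 folds, 100 iterations)
mean_rmse = fold_rmse.mean(axis=0)    # the averaged line in such a chart
std_rmse = fold_rmse.std(axis=0)      # the shaded band around it
# Early iterations show a rapid drop in mean_rmse, as described above.
```

Plotting `mean_rmse` with a `± std_rmse` band against the iteration index reproduces the structure of the cross-validation chart.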

Fig. 8
figure 8

Cross-Validation RMSE Progression During Yandex Boosting Model Training.

Analyzing model performance through learning curves

The learning curve (Fig. S5 in the supporting information file) is an effective tool for evaluating machine learning model performance, illustrating how the model’s error, often measured by RMSE, changes with increasing training iterations (Additional materials are provided in the supplementary file)26,84.

Feature importance and influence on model predictions

The SHAP analysis provides reliable insight into which features drive the model’s outcomes, with X2 and X3 emerging as the most influential variables for process optimization. The direction of impact shows that removal efficiency increases as the values of these features rise, since all bars point in the positive direction. This transparency lets users trace input–output relationships and adjust conditions for better removal efficiency, especially by focusing on the most significant features. In short, the SHAP analysis reveals both the magnitude and the direction of each feature’s effect on removal efficiency, helping researchers optimize their systems85.

The SHAP summary plot (Fig. 9) is an effective instrument for interpreting machine learning models. It shows how each input feature influences the model’s predictions; in other words, it helps us understand which features are more important and how they affect the model’s output. In the given plot, each bar represents a feature, and the length of the bar indicates the importance of that feature in the model. The direction of the bar shows whether the feature has a positive or negative influence on the model’s output. Features X2 and X3 are the most important, as their corresponding bars are longer, meaning that changes in these features have a greater influence on the model’s predictions. Since all the bars point in the positive direction, it can be concluded that increasing the value of these features generally increases the model’s output. By comparing bar lengths, we can rank the relative importance of each feature: X2 has the greatest effect on the model’s output, followed by X3, and finally X1 and X473.

Fig. 9
figure 9

SHAP Summary Plot for Model Interpretation.

X2 (pH) plays a pivotal role in the adsorption and degradation efficiency because it influences both the surface charge of the adsorbent and the ionization state of the target contaminant (in this case, the antibiotic). The removal efficiency peaks at pH 5.37: at lower pH, excess H⁺ ions compete directly with the antibiotic molecules for binding sites, while at higher pH, electrostatic repulsion between the adsorbent surface and the ionized antibiotic reduces uptake. X3 (reaction time) governs the contact duration between the antibiotic molecules and the active sites of the adsorbent. The SHAP results show that reaction times between 0 and 120 min produce the most efficient removal, because this duration allows complete diffusion and the establishment of adsorption equilibrium without exceeding a cost-efficient operating period. This mechanistic reasoning supports the SHAP interpretation: X2 (pH) affects ionization, electrostatic interactions, and surface reactivity, while X3 (time) dictates molecular interaction rates and reaction speeds in the system85.
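Exact SHAP values require the `shap` library; as a library-free approximation of the same feature-ranking idea, the sketch below uses scikit-learn’s permutation importance on synthetic data constructed so that X2 and X3 dominate, mirroring the reported ordering:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

# Synthetic data in which X2 (index 1) and X3 (index 2) dominate,
# chosen to mirror the ranking reported for the SHAP summary plot.
rng = np.random.default_rng(7)
X = rng.uniform(0, 1, size=(400, 4))
y = 100 * (0.03 * X[:, 0] + 0.55 * X[:, 1] + 0.32 * X[:, 2] + 0.10 * X[:, 3])

model = GradientBoostingRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]  # most important first
# For this synthetic target the expected ordering is X2 (index 1)
# first, then X3 (index 2), then X4/X1.
```

Permutation importance measures how much the model’s score degrades when one feature is shuffled, which gives the same kind of magnitude ranking as the bar lengths in a SHAP summary plot, though without SHAP’s per-sample attribution.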

Particle swarm optimization

Optimization algorithm performance analysis, comparison of best fitness and mean fitness distributions in the optimization algorithm

The given chart (Fig. 10a) illustrates the variation of two important parameters during the execution of the optimization algorithm. Best fitness refers to the highest value achieved by the objective function in each iteration, representing the best solution found by the algorithm up to that point. Mean fitness refers to the average value of the objective function across the whole population in each iteration, indicating the overall performance of the population. The blue line shows the changes in best fitness over the iterations; in general, this line is expected to move upward as the algorithm searches for the optimal solution. The red line represents the changes in mean fitness over the iterations; it is also expected to move upward as the population advances toward better solutions. In this chart, mean fitness increases but fluctuates more than best fitness. If the algorithm converges toward an optimal solution, both the best-fitness and mean-fitness lines should stabilize at a constant value and their fluctuations should diminish. This chart shows that the algorithm is progressing toward convergence but has not yet reached a stable solution. Fluctuations in both lines indicate ongoing variation within the population, which may result from the stochastic nature of the algorithm or the presence of multiple local optima.
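A minimal particle swarm optimizer that records the best- and mean-fitness traces described above can be sketched as follows. The objective function here is a hypothetical smooth surrogate, not the trained removal model:

```python
import numpy as np

def pso_maximize(objective, bounds, n_particles=30, n_iters=50, seed=0):
    """Minimal PSO (maximization) tracking best and mean fitness per iteration."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = len(lo)
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                  # personal bests
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmax()].copy()            # global best
    best_hist, mean_hist = [], []
    w, c1, c2 = 0.7, 1.5, 1.5      # inertia and acceleration weights
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmax()].copy()
        best_hist.append(pbest_val.max())   # best-fitness trace (blue line)
        mean_hist.append(vals.mean())       # mean-fitness trace (red line)
    return gbest, np.array(best_hist), np.array(mean_hist)

# Hypothetical smooth objective standing in for the removal-efficiency model;
# its maximum (100) sits at x = (0.5, 0.5, 0.5, 0.5).
obj = lambda x: 100 - np.sum((x - 0.5) ** 2) * 50
lo = np.zeros(4)
hi = np.ones(4)
gbest, best_hist, mean_hist = pso_maximize(obj, (lo, hi))
```

By construction the best-fitness trace is non-decreasing and always sits on or above the mean-fitness trace, matching the relationship seen between the two lines in the chart.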

Fig. 10
figure 10

Variations in Best and Mean Fitness During Algorithm Iterations (a); Violin Plot of Best Fitness and Mean Fitness Distributions (b).

A violin plot (Fig. 10b) is a visual method for showing the distribution of a numerical variable across different groups. In this plot, each group is represented by a violin, whose width illustrates the density of the data in that region. The center line of the violin marks the median of the data, and the dashed lines inside it mark the range between the first and third quartiles. The given plot contains two groups: best fitness, the best fitness value in each iteration of the optimization algorithm, and mean fitness, the average fitness in each iteration. The violin for best fitness is wider, indicating a greater spread of data in this group, meaning the best fitness in each iteration can vary considerably. The violin for mean fitness is narrower, suggesting less variability, as the mean fitness is typically more stable. The median of the best-fitness violin is higher than that of the mean-fitness violin, showing that, on average, the best fitness exceeds the mean fitness; this is expected, since the best fitness is always greater than or equal to the mean fitness. The interquartile range is also larger for best fitness, indicating a greater spread around the median. It can therefore be concluded that the best fitness is more widely spread than the mean fitness and is, on average, higher. The shape of the violins suggests that the data distributions in both groups may not be symmetrical30,58,80,86,87.

Analysis of variable distributions in the dataset

This boxplot (Fig. S6 in the supporting information file) charts the distributions of variables X1, X2, X3, and X4 in the dataset, with values capped at 100. For each variable, the box represents the interquartile range (IQR), which contains the middle 50% of the data; the length of the box shows the distance between the first quartile (Q1) and the third quartile (Q3). The taller the box, the greater the spread of the data for that variable. The center line inside the box marks the median, with half the data above it and half below. The lower edge of the box represents Q1, below which 25% of the values fall, while the upper edge represents Q3, below which 75% of the values fall. Points plotted beyond the whiskers, which extend up to 1.5 times the IQR, are considered outliers; outliers may indicate unusual values or errors in the data. The distribution of X1 appears relatively symmetrical with minimal spread, and the median sits near the center of the box. The distribution of X2 is somewhat symmetrical but has a greater spread than X1, with an outlier visible at the far right of the plot. The distribution of X3 is slightly left-skewed, meaning a larger portion of the data lies to the right of the median, and the spread of this variable is greater than that of the previous two. The distribution of X4 is strongly left-skewed, with most data concentrated on the left side of the box, and several outliers are observed on the right side of the plot57,63,64.
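The boxplot quantities described above (quartiles, IQR, 1.5×IQR whiskers, outliers) follow a fixed recipe; a small sketch with hypothetical values:

```python
import numpy as np

def boxplot_stats(values):
    """Quartiles, IQR, 1.5*IQR whisker limits, and outliers, as drawn in a boxplot."""
    v = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(v, [25, 50, 75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr     # whisker limits
    outliers = v[(v < lower) | (v > upper)]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "outliers": outliers.tolist()}

# Hypothetical X2-like values with one high outlier, echoing the pattern
# described for the plot (not the paper's data).
stats = boxplot_stats([3.0, 4.2, 5.0, 5.4, 5.8, 6.1, 6.5, 7.0, 13.0])
```

Here the whiskers reach to Q1 − 1.5·IQR and Q3 + 1.5·IQR, so the value 13.0 falls outside the upper whisker and is flagged as an outlier, like the point at the far right of the X2 box.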

Correlation analysis of variables and their relationships

A correlation matrix (Fig. 11) is a square table that shows the strength and direction of the linear relationship between pairs of variables. Each cell contains the correlation coefficient between two variables, a number between −1 and 1: 1 indicates a perfect positive correlation, meaning that as one variable increases, the other also increases; −1 indicates a perfect negative correlation, meaning that as one variable increases, the other decreases; and 0 indicates no linear relationship between the two variables. In the given matrix, each row and column corresponds to a variable (X1, X2, X3, X4, and Removal (%)), and the numbers inside the cells are the correlation coefficients between the corresponding variables. Warmer colors (red) indicate a strong positive correlation, cooler colors (blue) indicate a strong negative correlation, and colors closer to white suggest a weak or absent correlation. The diagonal of the matrix shows values of 1, since each variable is perfectly correlated with itself, and the matrix is symmetric: the value in cell (i, j) equals the value in cell (j, i). Based on the chart, features X2 and X3 have the highest positive correlation with Removal, while X4 has the strongest negative correlation with Removal. In addition, the strongest negative relationships among the independent variables are between X2 and X4 and between X3 and X452,88,89.
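A correlation matrix of this kind is computed from Pearson coefficients; the sketch below uses synthetic data constructed to mimic the reported sign pattern (it is not the paper’s dataset):

```python
import numpy as np

# Synthetic data mimicking the reported pattern: X2 and X3 correlate
# positively with Removal (%), X4 negatively, and X4 correlates
# negatively with X2 and X3.
rng = np.random.default_rng(3)
x1 = rng.normal(size=300)
x2 = rng.normal(size=300)
x3 = rng.normal(size=300)
x4 = -0.6 * x2 - 0.6 * x3 + rng.normal(size=300)
removal = 0.8 * x2 + 0.8 * x3 - 0.5 * x4 + rng.normal(size=300)

# np.corrcoef treats each row as a variable, so stack variables as rows.
corr = np.corrcoef(np.vstack([x1, x2, x3, x4, removal]))
# corr is symmetric with ones on the diagonal; corr[i, j] is the
# Pearson coefficient between variables i and j.
```

Rendering `corr` as a heatmap with a red-white-blue colormap reproduces the kind of figure shown in Fig. 11.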

Fig. 11
figure 11

Correlation Matrix of X1, X2, X3, X4, and Removal.

Visualization of optimal parameter values

The histogram in Fig. 12 for the optimal configuration is a straightforward tool for visualizing the optimal parameter values. It presents the values of the parameters alongside a comparison with the ideal results, using intervals to identify the best configuration for the model. Comparing the distribution of central values with the target gives insight into the variability of the data. The chart covers four parameters, labeled X1 through X4, and shows their optimal values; it also helps clarify why certain parameter settings are more suitable by laying out the value range of each variable. The horizontal axis represents the computed values, shown as the vector mean of the dataset, while the vertical axis shows the magnitude of the numerical value of each parameter. For the given dataset, the optimal parameter values are X1 = 1.0, X2 = 5.37, X3 = 120, and X4 = 0.002. This visualization improves the interpretability of the model and helps refine parameter selection90.

Fig. 12
figure 12

Histogram of Optimal Parameter Configuration for the Model.

Long-term opportunities of integrating machine learning with water treatment systems for real-time monitoring and automated performance optimization

This work presents significant long-term advantages for real-time monitoring and adaptive system control in water purification technologies through its proposed integration of machine learning models with experimental water treatment systems in “A Machine Learning-Driven Framework for OTC Antibiotic Removal Using CuO-M-CAB” (Additional materials are provided in the supplementary file).

Ethical and practical challenges and opportunities in introducing nanomaterial-based water treatment technologies in impoverished regions

Nanomaterial-based water treatment technologies face difficulties in impoverished, resource-limited areas and require thorough analysis of ethical and practical considerations to ensure sustainable and equitable solutions (Additional materials are provided in the supplementary file).


