Machine learning analysis of solubility parameters of pharmaceutical co-crystals for enhancing drug properties in advanced drug manufacturing

Machine Learning


The regression task was performed using three machine learning models: KRR, MLR, and OMP. Each model was evaluated on three outputs: \(\delta_{d}\), \(\delta_{p}\), and \(\delta_{h}\). The performance metrics considered were the R² score, the Monte Carlo cross-validation (CV) mean score, the Monte Carlo CV standard deviation, RMSE, and MAE. The data were split into training and test sets in an 80:20 ratio. Cross-validation analysis showed that the effect of this random split on the final outcome is negligible. The average R² score of the KRR model across the three outputs improved by 6% following hyperparameter optimization, while the other two models improved by about 4%. This highlights the important role of optimization in improving model performance.
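The evaluation protocol described above (an 80:20 split repeated at random, scored with R², RMSE, and MAE) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the model is a one-feature least-squares line rather than any of the three models in the study, and the number of repetitions (50) is arbitrary.

```python
import math
import random
import statistics


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(statistics.fmean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return statistics.fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def fit_line(x, y):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def monte_carlo_cv(x, y, n_splits=50, test_fraction=0.2, seed=0):
    """Repeat a random train/test split and return (mean, std) of the test R²."""
    rng = random.Random(seed)
    n = len(x)
    n_test = max(1, round(n * test_fraction))
    scores = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random partition on every repetition
        test, train = idx[:n_test], idx[n_test:]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        pred = [slope * x[i] + intercept for i in test]
        scores.append(r2_score([y[i] for i in test], pred))
    return statistics.fmean(scores), statistics.stdev(scores)


# Synthetic data (hypothetical): a noisy straight line.
data_rng = random.Random(42)
x = [i / 10 for i in range(100)]
y = [2.0 * xi + 1.0 + data_rng.gauss(0.0, 0.5) for xi in x]

mean_r2, std_r2 = monte_carlo_cv(x, y)
print(f"Monte Carlo CV R2: mean={mean_r2:.3f}, std={std_r2:.3f}")
```

A small standard deviation across repetitions is what supports a statement that the particular random split has little effect on the outcome.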

For all outputs, recursive feature elimination (RFE) was employed for feature selection. Various numbers of features were evaluated using this approach, and the mean R² score associated with each feature count was documented (see Figure 1). The best feature count for each output was determined from Figure 1 as the one giving the best mean R² score. Finally, the selected feature set for each output is shown in Table 1. Detailed statistical parameters for all models and outputs are shown in Tables 2, 3, and 4, which compare the prediction accuracy of all models in calculating the three Hansen solubility parameters. For all outputs, the R² values achieved in this study are greater than those reported by Mahdi and Obaidra [10]. This illustrates the advantage of the model developed in this study for correlating Hansen solubility parameters.
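The idea behind RFE can be sketched as follows: fit a model, drop the feature with the weakest weight, and repeat until the desired number of features remains, then compare scores across feature counts. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, the ranking model is ordinary least squares on standardized features, and the score is a single hold-out R² rather than a mean over repeated splits. In practice a library implementation such as scikit-learn's `RFE` class would normally be used.

```python
import numpy as np


def rfe_linear(X, y, n_keep):
    """Recursive feature elimination ranked by |coefficient| of a least-squares fit."""
    # Standardize so that coefficient magnitudes are comparable across features.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        coef, *_ = np.linalg.lstsq(Xs[:, remaining], yc, rcond=None)
        weakest = int(np.argmin(np.abs(coef)))  # position of the least important feature
        remaining.pop(weakest)
    return remaining


def holdout_r2(X, y, cols, n_train):
    """Fit on the first n_train rows using only `cols`; return R² on the remaining rows."""
    A_train = np.column_stack([X[:n_train, cols], np.ones(n_train)])
    coef, *_ = np.linalg.lstsq(A_train, y[:n_train], rcond=None)
    A_test = np.column_stack([X[n_train:, cols], np.ones(len(y) - n_train)])
    y_test = y[n_train:]
    resid = y_test - A_test @ coef
    return 1.0 - float(resid @ resid) / float(np.sum((y_test - y_test.mean()) ** 2))


# Synthetic data (hypothetical): only features 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=200)

n_train = 160  # 80:20 split
selected = rfe_linear(X[:n_train], y[:n_train], n_keep=2)

# Score every candidate feature count, as one would before picking the best one.
scores = {
    k: holdout_r2(X, y, rfe_linear(X[:n_train], y[:n_train], k), n_train)
    for k in range(1, X.shape[1] + 1)
}
print("selected features:", selected)
for k, s in scores.items():
    print(f"{k} feature(s): R2 = {s:.3f}")
```

Plotting `scores` against the feature count gives a curve of the kind summarized in Figure 1; the selection is done on the training rows only so that the hold-out score is not biased by the feature ranking.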

Figure 1

Results of RFE feature selection.

Table 1: Final selected features for each output.
Table 2: Performance metrics for \(\delta_{d}\ (\mathrm{MPa}^{1/2})\).
Table 3: Performance metrics for \(\delta_{p}\ (\mathrm{MPa}^{1/2})\).
Table 4: Performance metrics for \(\delta_{h}\ (\mathrm{MPa}^{1/2})\).

For \(\delta_{d}\), the MLR model showed the highest R² score as well as the lowest RMSE and MAE, making it the most accurate model for estimating this Hansen parameter (see Table 2). The higher Monte Carlo CV mean score and standard deviation of MLR suggest relatively stable performance with some degree of variability across different splits of the data.

It should be pointed out that Hansen solubility parameters can be used to predict the possible formation of cocrystals: the solubility parameter (\(\delta\)) is used to assess the affinity between the API and the coformer in terms of molecular interactions [10,13]. Hansen developed three-dimensional solubility parameters based on specific energies, including hydrogen bonding (\(\delta_{h}\)), molecular polarity (\(\delta_{p}\)), and atomic dispersion (\(\delta_{d}\)) [26]. These parameters can be used to predict interactions between APIs and coformers, and to screen a wide range of chemical coformers for successful formation of pharmaceutical cocrystals.
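As a worked illustration of how the three parameters express affinity, the sketch below uses the standard Hansen relations: the total parameter \(\delta_{t} = \sqrt{\delta_{d}^{2} + \delta_{p}^{2} + \delta_{h}^{2}}\) and the Hansen distance \(R_{a} = \sqrt{4(\delta_{d,1}-\delta_{d,2})^{2} + (\delta_{p,1}-\delta_{p,2})^{2} + (\delta_{h,1}-\delta_{h,2})^{2}}\), where a smaller \(R_{a}\) indicates closer affinity. These formulas come from general Hansen theory, not from this article, and the numerical values and coformer names below are hypothetical placeholders, not data from the study.

```python
import math
from typing import NamedTuple


class HSP(NamedTuple):
    """Hansen solubility parameters in MPa^(1/2)."""
    d: float  # dispersion
    p: float  # polarity
    h: float  # hydrogen bonding


def total_parameter(s: HSP) -> float:
    """Total solubility parameter: sqrt(d^2 + p^2 + h^2)."""
    return math.sqrt(s.d ** 2 + s.p ** 2 + s.h ** 2)


def hansen_distance(a: HSP, b: HSP) -> float:
    """Hansen distance Ra; the dispersion term carries the conventional factor of 4."""
    return math.sqrt(4.0 * (a.d - b.d) ** 2 + (a.p - b.p) ** 2 + (a.h - b.h) ** 2)


# Hypothetical values, for illustration only.
api = HSP(d=18.0, p=10.0, h=8.0)
coformers = {
    "coformer_A": HSP(d=17.0, p=12.0, h=11.0),
    "coformer_B": HSP(d=15.0, p=4.0, h=20.0),
}

# Rank candidate coformers from closest to farthest in Hansen space.
ranking = sorted(coformers, key=lambda name: hansen_distance(api, coformers[name]))
for name in ranking:
    print(f"{name}: Ra = {hansen_distance(api, coformers[name]):.2f} MPa^0.5")
```

Predicted values of \(\delta_{d}\), \(\delta_{p}\), and \(\delta_{h}\) from a regression model could be fed into the same functions to rank coformers before any experimental screening.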

Figures 2, 3, 4, and 5 present the results for \(\delta_{d}\) obtained with MLR. Figure 2 compares the actual and predicted values of \(\delta_{d}\). Figure 3 highlights the top 10 features with the greatest impact on \(\delta_{d}\), indicating which inputs have the most influence on the prediction and which functional groups and parameters affect this Hansen solubility parameter most. Area (the surface area of the molecule, in Å²) is one of the important parameters affecting atomic dispersion [10]. Furthermore, among the intermolecular forces, van der Waals (eVDW) interactions play an important role in modifying the Hansen parameters of cocrystals. Figure 4 shows the partial dependence plots of \(\delta_{d}\) on four features selected from the feature importance analysis. Finally, Figure 5 shows a 3D representation of the estimated \(\delta_{d}\) as a function of the feature pair Area and N, providing a visual representation of the interaction between the co-crystal and the coformer via the atomic dispersive energy [27].

Figure 2

Actual vs. predicted values of \(\delta_{d}\ (\mathrm{MPa}^{1/2})\).

Figure 3

Importance of the top 10 features for \(\delta_{d}\ (\mathrm{MPa}^{1/2})\).

Figure 4

Partial dependence of \(\delta_{d}\) on the four most important features.

Figure 5

3D plot of \(\delta_{d}\ (\mathrm{MPa}^{1/2})\) as a function of the two most important features. Drawn using Python 3.8, open-source software obtained from: https://www.python.org.

For \(\delta_{p}\), KRR is significantly better than both MLR and OMP in terms of R², RMSE, and MAE. The low standard deviation of the Monte Carlo CV indicates consistent performance across different data splits. This consistency makes KRR the most reliable model for predicting \(\delta_{p}\).
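To illustrate why a kernel method can outperform a linear one on a nonlinear target, the sketch below implements closed-form kernel ridge regression and compares it with a straight-line fit. It assumes that KRR denotes kernel ridge regression with an RBF kernel; the article does not state the kernel or hyperparameters, so `gamma` and `alpha` are arbitrary illustrative values and the data are synthetic.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix: exp(-gamma * ||a - b||^2)."""
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negatives from round-off


def krr_fit(X, y, alpha=1e-2, gamma=1.0):
    """Solve (K + alpha * I) c = y for the dual coefficients c."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)


def krr_predict(X_train, dual_coef, X_new, gamma=1.0):
    """Prediction is a kernel-weighted sum over the training points."""
    return rbf_kernel(X_new, X_train, gamma) @ dual_coef


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic data (hypothetical): a parabolic response with small noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=100)
y = x ** 2 + rng.normal(scale=0.05, size=100)

order = rng.permutation(100)
train, test = order[:80], order[80:]  # 80:20 split
X_train = x[train].reshape(-1, 1)
X_test = x[test].reshape(-1, 1)

dual_coef = krr_fit(X_train, y[train])
rmse_krr = rmse(y[test], krr_predict(X_train, dual_coef, X_test))

# Baseline: an ordinary straight-line fit on the same split.
slope, intercept = np.polyfit(x[train], y[train], 1)
rmse_lin = rmse(y[test], slope * x[test] + intercept)

print(f"RMSE  KRR: {rmse_krr:.3f}   linear: {rmse_lin:.3f}")
```

On a curved target like this one, the linear baseline cannot follow the response, while the kernel model can; the regularization strength `alpha` and kernel width `gamma` are the kind of hyperparameters one would tune.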

Figure 6 shows a comparison of the actual and predicted values of \(\delta_{p}\), demonstrating the prediction accuracy of the model. Figure 7 highlights the top 10 features with a large impact on \(\delta_{p}\), revealing the most influential inputs in the predictions. Figure 8 shows the influence of four inputs on the variation of \(\delta_{p}\) and reveals a parabolic variation of this output. Compared with \(\delta_{d}\), the variation of \(\delta_{p}\) with the selected features is nonlinear, because molecular polarity (\(\delta_{p}\)) depends nonlinearly on these features. Finally, Figure 9 shows the 3D analysis of \(\delta_{p}\), providing a visual representation of the interaction between the most important inputs and their collective influence on the \(\delta_{p}\) prediction.
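A partial dependence curve of the kind discussed here is obtained by fixing one feature at each value of a grid, leaving all other features as observed, and averaging the model's predictions. The sketch below shows that computation for a toy prediction function, which is a hypothetical stand-in chosen so that one feature acts quadratically and the other linearly; it is not the fitted model from the study.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid):
    """Average prediction with column `feature` forced to each grid value in turn."""
    values = []
    for g in grid:
        X_mod = X.copy()
        X_mod[:, feature] = g  # override one feature, keep the others as observed
        values.append(predict(X_mod).mean())
    return np.array(values)


def toy_model(X):
    """Hypothetical fitted model: quadratic in feature 0, linear in feature 1."""
    return X[:, 0] ** 2 + 0.5 * X[:, 1]


rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
grid = np.linspace(-1.0, 1.0, 5)

pd_quadratic = partial_dependence(toy_model, X, feature=0, grid=grid)
pd_linear = partial_dependence(toy_model, X, feature=1, grid=grid)

# Second differences expose curvature: nonzero for the parabola, zero for the line.
print("curvature, feature 0:", np.round(np.diff(pd_quadratic, 2), 3))
print("curvature, feature 1:", np.round(np.diff(pd_linear, 2), 3))
```

A parabolic curve for a feature, as opposed to a straight line, is the signature of a nonlinear dependence of the output on that feature.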

Figure 6

Actual vs. predicted values of \(\delta_{p}\ (\mathrm{MPa}^{1/2})\).

Figure 7

Importance of the top 10 features for \(\delta_{p}\ (\mathrm{MPa}^{1/2})\).

Figure 8

Partial dependence of \(\delta_{p}\) on the four most important features.

Figure 9

3D plot of \(\delta_{p}\ (\mathrm{MPa}^{1/2})\) as a function of the two most important features. Drawn using Python 3.8, open-source software obtained from: https://www.python.org.

Similarly, for \(\delta_{h}\), KRR gave the best R² score and the lowest RMSE and MAE. Despite its slightly higher Monte Carlo CV standard deviation compared with the other models, the overall performance of KRR makes it the best option for predicting \(\delta_{h}\). Figure 10 shows a comparison of the actual and estimated values of \(\delta_{h}\), demonstrating the prediction accuracy of the model. Figure 11 highlights the top 10 features with a large impact on \(\delta_{h}\), revealing the most influential inputs in the predictions. Figure 12 shows the partial dependence plots of \(\delta_{h}\) on the four most important features. Finally, Figure 13 shows the 3D surface of \(\delta_{h}\) as a function of the most important features, providing a visual representation of the interaction between these inputs and its impact on the \(\delta_{h}\) prediction. The Hansen parameter \(\delta_{h}\), which represents the hydrogen-bonding interaction between the API molecule and the coformer, is heavily influenced by hydroxyl groups and other hydrogen-bond energies, as shown in Figure 11, and this is correctly captured by the ML model. Such variations in the Hansen parameters have been reported previously using machine learning analysis [10], in agreement with the present results.

The data and results obtained in this study with the optimized ML models can be used to estimate Hansen solubility parameters and thereby understand which coformers provide a stronger interaction with the API for the formation of drug cocrystals. Indeed, analyzing the molecular interactions represented by the three Hansen parameters is extremely important for ensuring the stability of the formed cocrystals.

Figure 10

Actual vs. predicted values of \(\delta_{h}\ (\mathrm{MPa}^{1/2})\).

Figure 11

Importance of the top 10 features for \(\delta_{h}\ (\mathrm{MPa}^{1/2})\).

Figure 12

Partial dependence of \(\delta_{h}\) on the four most important features.

Figure 13

3D plot of \(\delta_{h}\ (\mathrm{MPa}^{1/2})\) as a function of the two most important features. Drawn using Python 3.8, open-source software obtained from: https://www.python.org.


