Protein structure
In this study, NADH-ubiquinone (UQ) oxidoreductase, which is responsible for the Na+-pumping, was examined specifically. The protein is a hexameric complex consisting of six unique chains that are co-crystallized with korormicin. Korormicin was observed to bind to chain B of the protein structure and also interact with a portion of chain A, as depicted in Fig. 1(a). Biovia Discovery Studio79 revealed that korormicin formed several interactions with residues in chain B. Additionally, it formed a bond with a single residue (Trp337) in chain A, as depicted in Fig. 1(b). Therefore, both chain A and chain B were isolated for molecular docking along with korormicin for subsequent analysis. Furthermore, a prior study demonstrated that the inhibitors disrupt the UQ process by obstructing the interface between chain A and B of the protein16. Additionally, the residues that interact with korormicin within a distance of approximately 8 Å were used to construct the grid box for the docking process, as depicted in Fig. 1(b). As shown in Fig. 1(c), korormicin forms carbon-hydrogen bonds with residues Ile138 and Gly157, as well as conventional hydrogen bonds with Asn156 and Phe160 of chain B, highlighting key interactions within the binding pocket. The essential residues are likely engaging with the established inhibitor korormicin. Aurachin D was identified as a protein inhibitor. The binding location of the known inhibitors was determined utilising these two structures and subsequently utilised for molecular docking.

(a) Korormicin bound to the Chain A and B of Na+-pumping NADH-ubiquinone oxidoreductase, (b) residues around Korormicin (8 Å) with the protein, (c) interacting residues with Korormicin.
Compound library
Phytochemicals from each of the plant species, Berberis vulgaris and Hydrastis canadensis, were repossessed from the Natural Product Activity and Species Source Database (NPASS). Here, 20 phytochemicals were found from Berberis vulgaris and 23 phytochemicals from Hydrastis canadensis. A total of 43 phytochemicals were used further for investigation, as shown in the supplementary Table S1.
The 43 phytochemicals from Berberis vulgaris and Hydrastis canadensis were retrieved from the NPASS database and subjected to clustering using a logistic regression model implemented via the sklearn module in Python. Clustering was performed based on physicochemical features derived from Lipinski’s Rule of Five descriptors (molecular weight, hydrogen bond donors/acceptors, and logP). This process produced four distinct clusters, from which the largest, comprising 28 compounds, was selected for downstream virtual screening, owing to its high classification accuracy (0.94).
Although certain structural similarities are present within this cluster, this is expected due to the common biosynthetic origins of the compounds and the characteristics of the physicochemical attributes employed for grouping. The clustering process prioritized drug-likeness based on these features rather than purely structural diversity, which may have resulted in the inclusion of compounds with similar scaffolds. Thus, the cluster with the highest population, with 28 compounds, was used for virtual screening along with the two controls.
Virtual screening
Post clustering, 28 phytochemicals were virtually screened against the target protein. The functionality of chains A and B of the protein has been used for the molecular docking. Two controls, Korormicin and Aurachin D, were also docked in the same binding pocket with the same protocol. The compounds that exhibited favourable hydrogen bonding and docking scores were selected for additional analysis via MD simulation. As shown in Table 1, the top 15 compounds that showed better performance than the controls were further used for interaction analysis. The top 15 compounds, along with the two controls, had high binding affinity with the binding score in the range − 8 kcal/mol to −9.6 kcal/mol. The 2D structure of the top 15 compounds are shown in the supplementary Figure S1.
The Tanimoto similarity heatmap, as shown in supplementary Figure S2, quantitatively illustrates the diversity of the selected compounds. While certain compounds exhibit higher similarity (values closer to 1), there is still a wide range of Tanimoto similarity values, from as low as 0.06 to 1.00. This variation demonstrates that although some compounds share core structural features, significant differences exist across the selection. Compounds with low similarity scores (< 0.2) remain in the final compound set, ensuring diversity in their chemical scaffolds. This clustering method, which uses Lipinski’s Rule of Five descriptors along with molecular fingerprints, maintains a balance between structural variety and computational efficiency for virtual screening. In addition, while clustering prioritized drug-likeness based on physicochemical properties, the heatmap provides further evidence of the structural diversity, especially when considering the similarity between the controls and selected compounds. This approach ensures that despite some compounds being structurally similar, the overall diversity and representativeness of the library are maintained.
It was found that among the two controls, with the residues Lys54 and Phe160 of the chain B, korormicin was observed to be have two hydrogen bonds, as can be seen in Fig. 2(b). However, Aurachin D was not found to have hydrogen bonds; thus, it was not used further for investigation. Out of the top 15 compounds, exhibited hydrogen bonds with the protein residues; however, compounds 197835, 371942, 122623, and 638024 showed the maximum number of hydrogen bonds, as illustrated in Fig. 2. Here, 197835 and 371942 showed identical interaction and the same binding score of −9.6 kcal/mol; however, 197835 showed a lower average binding score than 371942.
The structures of the top 15 docked compounds were analyzed and are presented in the revised Table 1. Among them, compounds 197835 and 371942 share identical atom connectivity but differ in stereochemistry, as confirmed by InChI analysis shown in supplementary Table S2. Specifically, 197835 has both stereocenters defined with configurations at C18 (−) and C19 (+), while 371942 has one stereocenter (C18) unspecified and the other (C19) with the opposite configuration (−). Despite these stereochemical differences, both compounds exhibit identical docking scores (− 9.6 kcal/mol) and interactions due to their shared molecular framework.
The structural comparison in Figure S1 illustrates the stereochemical differences between compounds 197835 and 371942. Despite these differences, both compounds adopt similar binding orientations within the target’s active site, contributing to their identical docking scores. The images highlight how the stereochemistry at the chiral centers (C18 and C19) of 197835 and 371942 influences their hydrogen bonding patterns, yet the overall molecular framework remains similar.
This explains the reason they show identical binding energy, while their docking poses differ, the overall interactions with the protein target are similar. In the docking results, 197835 forms five hydrogen bonds (Lys54, Gly141, Asn156, Gly158, Phe160), whereas 371942 forms three (Lys54, Glu157, Phe160). The difference in hydrogen bonds arises from the subtle stereochemical variations, particularly the configurations of the chiral centers, which influence the precise orientation of the ligands in the active site. Based on its more favorable interaction profile, 197835 was selected for further studies. In contrast, compound 470416 was excluded due to its lower number of hydrogen bonds (two), indicating weaker interaction with the target site. The 3D representation of the interactions between the protein and these ligands is shown in Fig. 2(a), where all three compounds occupy the same active site as the control. Thus, 197835, 122623, and 638024 were selected along with Korormicin as a control for further studying the stability of the complexes.

(a) 3D representation of the interaction between protein and ligand, and 2D representation of interaction analysis for the compounds (b) Control (Korormicin), (c) 197835, (d) 122623, (e) 638024.
Ligand efficiency and estimated Inhibition constants
The calculated ligand efficiency (LE) and estimated inhibition constants (Ki) for the control compound and the top docked ligands are summarized in Table 2. Among the analyzed ligands, compound 197835 exhibited the most favorable ligand efficiency (− 0.343 kcal/mol/atom) and the lowest estimated Ki (90.97 nM), suggesting stronger binding affinity relative to the control and other candidates. The control compound 9845752 (Korormicin) displayed the weakest binding with an estimated Ki of 1356.45 nM and the lowest ligand efficiency (− 0.258 kcal/mol/atom). The remaining compounds, 638024 and 122623, demonstrated intermediate binding affinities with estimated Ki values of 817.28 nM and 492.42 nM, respectively. These calculations provide a comparative assessment of the binding efficiencies and potential inhibitory strengths of the identified ligands.
Electrostatic surface potential analysis
Electrostatic surface potential analysis was conducted for the control complex (Korormicin) and the top docked ligand-bound complexes (197835, 122623, and 638024) using APBS in PyMOL. The molecular surfaces were colored according to the electrostatic potential, ranging from − 5.0 kT/e (red) to + 5.0 kT/e (blue), enabling visualization of charge distribution at the binding interface as shown in Fig. 3. All complexes demonstrated overall positive electrostatic potential at the binding site, reflecting the presence of positively charged residues within the pocket. Compared to the Korormicin-bound complex, complexes 197835 and 122623 exhibited an expansion and intensification of positive potential (increased blue regions), suggesting that these ligands may stabilize additional polar or charged interactions within the binding pocket. Complex 638024, while still maintaining positive potential, displayed a more moderate distribution, closely resembling the electrostatic environment observed in the control complex. These ligand-specific variations in electrostatic surface potential may influence binding interactions, orientation, and stability of the complexes. Overall, these observations suggest that ligand binding induces localized electrostatic rearrangements, which may contribute to binding affinity and specificity.

Electrostatic surface potential analysis of the control and ligand-bound complexes generated using APBS in PyMOL. Complex with (a) Control (Korormicin), (b) 197835, (c) 122623, (d) 638024.
ADMET properties
Three compounds (638024, 197835, and 122623), along with a control drug, Korormicin, had their physicochemical and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) properties computed using SwissADME43 and the ProTox-3.0 server49. Table 3 lists the properties in detail for the four compounds. Here, 638024, 197835, and 122623 fall within the range of molecular weights (285.34–330.38.34.38 Da) that are favourable for drug-like compounds (500 Da). The upper drug-like threshold is being approached by korormicin, which has the highest MW at 433.58 Da. The maximum number of rotatable bonds (7) is found in compound 122623, suggesting that it has greater molecular flexibility. With sixteen rotatable bonds, korormicin is even more malleable. With the exception of korormicin, which has two, all of the compounds have few H-bond donors. Among the many compounds that can serve as H-bond acceptors, compound 197835 has the most, with seven. Korormicin has the highest Molecular Refractivity (MR), which is in line with its higher MW, with a range of 85.47 to 123.21. Drug permeability is indicated by the values of Topological Polar Surface Area (TPSA). The oral bioavailability threshold remains below 140 Ų for all drugs, including Korormicin. The higher TPSA of compounds 122623 and Korormicin indicates that they are more hydrophilic. It was observed that 122623 has an iLOGP (lipophilicity) value of 2.75, while Korormicin has an iLOGP (lipophilicity) value of 4.43. All the compounds showed good solubility. High gastrointestinal absorption indicates good oral availability for all drugs, including the control. A score of 0.55 indicates that all compounds have moderate bioavailability. It showed that these compounds do not include any substructures that are known to cause false positives in drug screening, as no PAINS warnings have been detected. Most synthetically accessible compounds are 638024 (2.92) and 122623 (3.33), whereas the least accessible is korormicin (5.67). Compounds 638024, 197835, and 122623 are classified as Toxicity Class 4 (acceptable toxicity) according to the Hazard Communication Standard (HCS) of the Globally Harmonised System of Classification and Labelling of Chemicals (GHS). Class 6 toxicity indicates very minimal or no toxicity for korormicin. All of the compounds exhibit drug-like characteristics, including excellent absorption and bioavailability. The bioavailability score predicted by SwissADME was 0.55 for all compounds, as they fulfilled standard drug-likeness criteria. Additional human intestinal absorption (%) values were retrieved from pkCSM to further characterize oral bioavailability. The exceptional solubility, synthetic accessibility, and tolerable toxicity of compounds 638024 and 122623 differentiate them. The compounds outperform the control in several respects, such as solubility and ease of production, as shown by these features.
ADMET validation using PkCSM
Pharmacokinetic predictions were evaluated through ADMET profiling using pkCSM,. All compounds demonstrated high intestinal absorption (> 90%), with compound 197835 exhibiting the highest absorption rate among the evaluated ligands. Favorable Caco-2 permeability values were observed for all compounds, with 638024 showing the highest permeability, suggesting efficient intestinal transport. The volume of distribution (VDss) was acceptable across the ligands, with the control exhibiting comparatively lower distribution than the top-ranked compounds. Predictions for blood-brain barrier (BBB) permeability indicated limited CNS penetration for most ligands; however, 638024 and 122623 displayed moderate BBB penetration, whereas the control and 197835 demonstrated lower permeability. Toxicity assessments consistently predicted no AMES mutagenicity for any compound, while hepatotoxicity was observed only for the control, 638024, and 122623. The high level of concordance across multiple predictive platforms underscores the favorable pharmacokinetic and toxicity profiles of the selected ligands, particularly highlighting compound 197835 as a promising candidate. A detailed summary of the pkCSM-based ADMET predictions is presented in Table 4.
Density functional theory (DFT) analysis
In order to compare the electrical characteristics of three compounds (197835, 638024, and 122623) against the control, density functional theory (DFT) calculations were conducted. The four compounds’ DFT analyses are listed in Table 5. The stability of the compounds is reflected in their total energy. With an energy of −1389.14 Hartree, Korormicin is the most stable of the compound. At −1302.05 Hartree, 197835 is the most stable of the compounds, with 122623 and 638024 following at −1095.42 and − 927.762 Hartree, respectively. Its HOMO energy indicates a molecule’s electron-donating capabilities. Because of its extremely low HOMO value of −0.13367 eV, korormicin is the most unlikely of all compounds to transfer electrons. Because of its high HOMO energy, compound 638024 is more likely to donate electrons than any other (−0.11323 eV). Acquiring electrons is indicated by the LUMO energy, which stands for the Lowest Unoccupied Molecular Orbital. The lowest electron-accepting capability relative to the others is demonstrated by 122623, which has the highest LUMO energy at 0.074703 eV. Since 638024 (0.038739 eV) and Korormicin (0.057417 eV) have lower LUMO values, they are more effective electron acceptors. A molecule’s polarity can be measured by its dipole moment. The most polar compound in the set is compound 122623, which has the most significant dipole moment (4.469887 Debye). With a dipole moment of only 1.03579 Debye, korormicin is clearly the least polar of the bunch. Compounds 197835 (3.13322 Debye) and 638024 (3.027657 Debye) exhibit a moderate degree of polarity. Here, 122623 may be more active in polar biological settings due to its extreme polarity, while 197835 achieves equilibrium by its moderate polarity, balanced HOMO-LUMO energies, and satisfactory stability.
The HOMO–LUMO gap, calculated as the energy difference between LUMO and HOMO levels, was also determined to assess the electronic stability and chemical reactivity of the compounds. A lower HOMO–LUMO gap generally indicates higher chemical reactivity and lower kinetic stability, while a larger gap suggests greater stability and lower reactivity. Among the tested compounds, 638024 exhibited the lowest HOMO–LUMO gap (0.152 eV), suggesting relatively higher reactivity compared to the others. In contrast, 122623 and Korormicin displayed slightly higher gaps (0.193 eV and 0.191 eV, respectively), indicating greater electronic stability. The compound 197835 showed an intermediate gap of 0.178 eV, balancing stability and reactivity. These differences may contribute to variations in interaction strength, binding adaptability, and biological activity among the studied compounds. The dipole moment, which correlates with molecular polarity and membrane permeability, was highest for 122623 (4.4699 Debye), indicating higher polarity and potentially reduced membrane permeability. In contrast, Korormicin showed the lowest dipole moment (1.0358 Debye), suggesting better lipophilicity and membrane diffusion, while 638024 and 197835 exhibited moderate dipole moments (~ 3.0 Debye), indicating balanced permeability characteristics.
ML-based interaction properties
Further, the compounds were used for interaction properties analysis using PSICHIC (PhySIcoCHemICal graph neural network), a machine learning (ML)-based tool specifically designed for protein-ligand interaction prediction48. PSICHIC directly decodes interaction signatures from sequence data, utilising physicochemical constraints. It efficiently evaluated the binding affinity, antagonistic/agonistic potential, and nonbinding probabilities of the selected compounds through the implementation of ML-driven predictive modelling. PSICHIC also demonstrated a high level of accuracy in predicting functional effects with a score of 0.96. On large-scale benchmarking against the PDBBind v2016 and v2020 test sets, PSICHIC achieved binding affinity prediction performance with root mean squared error (RMSE) between 1.31 and 1.34 log units and Pearson correlation coefficients ranging from 0.71 to 0.79, demonstrating robust regression performance for binding free energy estimation48. Notably, error analyses showed no significant dependence of prediction error on protein structure resolution (P = 0.711), further supporting the model’s stability across diverse input qualities. Additionally, experimental validation of model capabilities has been conducted to identify potential pharmaceuticals. Table 6 lists the predicted interaction properties of the three compounds (638024, 197835, and 122623) and the control compound Korormicin.
The intensity of the interaction with the target protein is represented by the predicted binding affinity (pKD/pKI). Here, it was found that 197835 has the maximum predicted binding affinity (4.81), which suggests that it has a stronger binding potential than Korormicin (4.79) and other compounds. The binding affinity of 122623 (4.60) and 638024 (4.48) is marginally lower. Predicted antagonist denotes the likelihood that the compound will function as an inhibitor of receptor activity. The antagonist probability of Korormicin is 0.08, which is quite similar to that of 197835 (0.09). The antagonist probabilities of 122623 (0.02) and 638024 (0.01) are low, indicating a diminished potential for receptor inhibition. 197835 demonstrated a robust binding potential. The strongest interaction with target proteins is indicated by the maximum binding affinity (4.81). It exhibited moderate antagonist activity (0.09), rendering it a viable candidate for further investigation. The lowest binding affinity (4.48) of 638024 indicates that it has limited efficacy in protein-ligand interactions.
Here, 122623 exhibited a marginally higher binding affinity (4.60) than 638024, but it was lower than Korormicin and 197835. It exhibited a decreased antagonist potential (0.02), which in turn reduced the potential inhibitory effects. Korormicin (Control) exhibited a robust interaction with target proteins, demonstrating a binding affinity that was comparable to 197835 (4.79). It exhibited moderate antagonist potential (0.08), which suggests the potential for receptor inhibition. 197835 is the most promising compound as a result of its balanced interaction profile and high binding affinity. An intermediate option is offered by 122623, which has moderate binding. Korormicin, which functions as the control, exhibits a well-balanced profile and is a dependable reference point. Further, these compounds 638024, 197835, and 122623 were used for additional research.
In this study, both docking scores and ML-based PSICHIC predictions showed broadly similar binding affinity trends for the selected compounds. Minor differences between methods may arise due to their distinct algorithms; docking focuses on static binding poses and energetics, while PSICHIC leverages protein sequence embeddings, physicochemical constraints, and learned interaction features to evaluate ligand binding. This study emphasises the effectiveness of ML-based methodologies in accelerating the prediction of drug-target interactions, improving binding affinity calculations, and refining the drug identification process with increased precision and efficiency.
Molecular dynamics analysis
RMSD
In addition to the control, the three top-ranked hit compounds were selected for molecular dynamics (MD) simulations. Figure 4(a) presents the backbone RMSD (Cα atoms) of the Na⁺-NQR protein in both apo form and ligand-bound complexes over the 300 ns simulation. The apo protein exhibited higher RMSD fluctuations, stabilizing around 0.35–0.45 nm, indicating greater flexibility in the absence of ligand. Upon ligand binding, the protein RMSD values decreased, demonstrating ligand-induced stabilization of the protein conformation.
Among the ligand-bound systems, the control (Korormicin) and 638024 complexes showed the lowest RMSD fluctuations, stabilizing between 0.2 and 0.3 nm throughout the simulation. The 122623-bound complex exhibited slightly higher RMSD values (approximately 0.3–0.4 nm) but remained stable after the initial equilibration period. The 197835 complex showed the highest backbone RMSD among all ligand-bound systems, fluctuating around 0.4–0.45 nm, suggesting slightly increased flexibility compared to the control.
The ligand RMSD profiles, shown in Fig. 4(b), revealed distinct stability trends for the ligands during binding. Korormicin and 122623 exhibited the most stable binding modes, maintaining RMSD values below 0.5 nm, indicating strong and consistent interactions with the protein binding pocket. Compound 638024 displayed moderate fluctuations ranging between 0.7 and 1.0 nm, while compound 197835 exhibited the highest ligand RMSD values, reaching up to 1.5 nm. The larger RMSD fluctuations observed for 197835 suggest greater internal flexibility or possible minor rearrangements within the binding cavity during the simulation.
Overall, the results demonstrate that Korormicin and 122623 form stable complexes with Na⁺-NQR, while 638024 and 197835 exhibit higher flexibility. The stable RMSD values of 122623, comparable to the control, further support its potential as a promising hit compound for targeting Na⁺-NQR. Comparing the two plots, it’s evident that the control provides the most stable interaction with the protein both as a ligand and when the protein is bound to it. Molecules 197835 and 638024 are less stable in both the protein-bound state and as free ligands, as indicated by their higher RMSD values. Overall, the control molecule had the most stable interaction with the protein, which is likely the reason for selecting it as the control. Similar to the control, 122623 could be considered more promising drug candidates due to lower and more consistent RMSD values, indicating a more stable interaction with the target protein. All raw data files, including GROMACS.xvg files, simulation inputs, and processed outputs, were deposited in a publicly accessible repository (Zenodo, https://doi.org/10.5281/zenodo.16835099) for transparency and reproducibility.

(a) RMSD of protein backbone atoms when bound to the ligands and apo form (b) RMSD of the ligands, Control (Korormicin), 197835, 122623 and 638024 when bound to the protein for 300 ns.
Structural dynamics analysis
The conformational changes of the Na⁺-NQR protein during the 300 ns molecular dynamics simulations were evaluated for both the apo state and ligand-bound complexes (Fig. 5). The apo protein (green) demonstrated noticeable flexibility and broader structural fluctuations throughout the simulation. In contrast, all ligand-bound complexes (cyan) exhibited relatively stabilized conformations by the end of 300 ns. Among the complexes, 197835 and 122623 displayed more compact and stabilized binding pockets, suggesting stronger protein-ligand interactions that potentially restrict excessive conformational fluctuations. Notably, the complex with 638024 maintained structural integrity but showed slightly greater surface flexibility compared to 197835 and 122623. The ligand binding contributed to stabilizing the active site conformation, as reflected in the reduced global fluctuations compared to the apo structure.
In Fig. 5, the 3D visualizations show clear differences in the protein’s structural dynamics between the initial and final poses. The apo state (green) demonstrates significant structural rearrangements and flexibility, indicative of an unbound, dynamic protein. Meanwhile, the ligand-bound complexes (cyan) exhibit more rigid structures, with 197835 and 122623 inducing tighter and more stabilized binding pockets. 638024, while stabilizing the structure, retains a degree of flexibility, particularly in the surface regions, suggesting a less rigid but still functionally relevant binding interaction.
These findings suggest that ligand binding plays a significant role in stabilizing the protein’s conformation, particularly in the binding site, which could have functional implications for the protein’s activity and interactions with other biomolecules.

3D conformation of the protein at the 0 ns (initial pose) and 300 ns (final pose) when in apo state and bound to the ligands Control (Korormicin), 197835, 122623 and 638024.
