Deep learning based attention enhanced phylogenetic radial basis function networks (AE-PRBFN) for genomic codon usage classification across species

Hershberg, R. & Petrov, D. A. Selection on codon bias. Annu. Rev. Genet. 42(1), 287–299 (2008).

Plotkin, J. B. & Kudla, G. Synonymous but not the same: The causes and consequences of codon bias. Nat. Rev. Genet. 12(1), 32–42 (2011).

Murray, E. E., Lotzer, J. & Eberle, M. Codon usage in plant genes. Nucleic Acids Res. 17(2), 477–498 (1989).

The International Brachypodium Initiative. Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463(7282), 763–768 (2010).

Libbrecht, M. W. & Noble, W. S. Machine learning applications in genetics and genomics. Nat. Rev. Genet. 16(6), 321–332 (2015).

Sheng, J., She, X., Liu, X., Wang, J. & Hu, Z. Comparative analysis of codon usage patterns in chloroplast genomes of five Miscanthus species and related species. PeerJ 9, e12173. https://doi.org/10.7717/peerj.12173 (2021).

Zou, J. et al. A primer on deep learning in genomics. Nat. Genet. 51(1), 12–18. https://doi.org/10.1038/s41588-018-0295-5 (2019).

Chen, Y., Li, Y., Narayan, R., Subramanian, A. & Xie, X. Gene expression inference with deep learning. Bioinformatics 32(12), 1832–1839. https://doi.org/10.1093/bioinformatics/btw074 (2016).

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł. & Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems 5998–6008 (2017).

Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2 (2021).

Zhang, R., Liu, X., Wang, Y., Chen, Z. & Li, H. MCA framework: A novel multi-dimensional cooperative analysis framework for Alzheimer’s disease diagnosis. J. King Saud Univ. – Comput. Inf. Sci. 37(10), 353. https://doi.org/10.1007/s44443-025-00344-4 (2025).

Wang, H., Liu, Y., Zhang, X., Zhao, J. & Li, Q. LDSL framework: A lightweight dual-stream learning framework for wheat disease detection. Plant Methods 21(1), 1–19. https://doi.org/10.1186/s13007-025-01455-9 (2025).

Gardner, J. D., Baker, J., Venditti, C. & Organ, C. L. Phylogenetically informed predictions outperform predictive equations in real and simulated data. Nat. Commun. 16(1), 6130. https://doi.org/10.1038/s41467-025-61036-1 (2025).

Mauro, V. & Chappell, S. A. A critical analysis of codon optimization in human therapeutics. Trends Mol. Med. 20(11), 604–613 (2014).

Schnable, P. S. et al. The B73 maize genome: Complexity, diversity, and dynamics. Science 326(5956), 1112–1115 (2009).

Buhmann, M. D. Radial basis functions. Acta Numer. 9, 1–38 (2000).

Orr, M. J. Regularization in the selection of radial basis function centers. Neural Comput. 7(3), 606–623 (1995).

Ghosh, J. & Nag, A. An overview of radial basis function networks. In Radial Basis Function Networks 2: New Advances in Design 1–36 (2001).

Angermueller, C., Pärnamaa, T., Parts, L. & Stegle, O. Deep learning for computational biology. Mol. Syst. Biol. 12(7), 878 (2016).

Ma, Y., Zhang, L., Chen, H., Wang, Z. & Liu, Q. ECMRN: Efficient cross-modal reparameterization network for RGB-D tasks via prompt tuning. Knowl.-Based Syst. 298, 114321. https://doi.org/10.1016/j.knosys.2025.114321 (2025).

Lloyd, S. Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982).

Hotopp, J. C. D. Horizontal gene transfer between bacteria and animals. Trends Genet. 27(4), 157–163 (2011).

Powers, D. M. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv preprint arXiv:2010.16061 (2020).

Chicco, D. & Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics 21, 1–13 (2020).

Boughorbel, S., Jarray, F. & El-Anbari, M. Optimal classifier for imbalanced data using Matthews correlation coefficient metric. PLoS One 12(6), e0177678 (2017).

Saito, T. & Rehmsmeier, M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One 10(3), e0118432 (2015).

Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

Varoquaux, G. Cross-validation failure: Small sample sizes lead to large error bars. Neuroimage 180, 68–77 (2018).

Eraslan, G. et al. Single-nucleus cross-tissue molecular reference maps toward understanding disease gene function. Science 376(6594), eabl4290 (2022).

Hallee, L. & Khomtchouk, B. B. Machine learning classifiers predict key genomic and evolutionary traits across the kingdoms of life. Sci. Rep. 13(1), 2088 (2023).

Ma, X.-K. et al. CIRCexplorer3: A clear pipeline for direct comparison of circular and linear RNA expression. Genomics Proteomics Bioinformatics 17(5), 511–521 (2019).

Kellogg, E. A. Flowering Plants. Monocots Vol. 13 (Springer, 2016).

Lowe, D. & Broomhead, D. Multivariable functional interpolation and adaptive networks. Complex Syst. 2(3), 321–355 (1988).

Sokolova, M. & Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 45(4), 427–437. https://doi.org/10.1016/j.ipm.2009.03.002 (2009).

Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986).

Field, A. Discovering Statistics Using IBM SPSS Statistics (Sage Publications, 2024).

Pagel, M. Inferring the historical patterns of biological evolution. Nature 401(6756), 877–884 (1999).

International Wheat Genome Sequencing Consortium (IWGSC) et al. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science 361(6403), eaar7191 (2018).

Purugganan, M. D. Evolutionary insights into the nature of plant domestication. Curr. Biol. 29(14), R705–R714 (2019).

Student. The probable error of a mean. Biometrika 1–25 (1908).

Moody, J. & Darken, C. J. Fast learning in networks of locally-tuned processing units. Neural Comput. 1(2), 281–294 (1989).

Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).

Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017).

Felsenstein, J. Phylogenies and the comparative method. Am. Nat. 125(1), 1–15 (1985).

Efron, B. The jackknife, the bootstrap and other resampling plans (SIAM, 1982).

Bishop, C. M. Pattern Recognition and Machine Learning Vol. 4 (Springer, 2006).

Scheben, A. & Hojsgaard, D. Can we use gene-editing to induce apomixis in sexual plants? Genes 11(7), 781 (2020).

Sahu, S. K., Waseem, M. & Aslam, M. M. Editorial: Bioinformatics, big data … https://doi.org/10.3389/fpls.2023.1271305 (2023).

Arthur, D. & Vassilvitskii, S. K-means++: The advantages of careful seeding (Stanford, 2006).

Quince, C., Walker, A. W., Simpson, J. T., Loman, N. J. & Segata, N. Shotgun metagenomics, from sampling to analysis. Nat. Biotechnol. 35(9), 833–844 (2017).

McInnes, L., Healy, J. & Melville, J. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 (2018).

Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning Vol. 1 (MIT Press, 2016).

Prechelt, L. Early stopping-but when? In Neural networks: Tricks of the trade 55–69 (Springer, 2002).

Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In IJCAI 1137–1145 (Montreal, Canada, 1995).

Hasin, Y., Seldin, M. & Lusis, A. Multi-omics approaches to disease. Genome Biol. 18, 1–15 (2017).

Meyer, R. S. & Purugganan, M. D. Evolution of crop species: Genetics of domestication and diversification. Nat. Rev. Genet. 14(12), 840–852 (2013).

Vicario, S., Moriyama, E. N. & Powell, J. R. Codon usage in twelve species of Drosophila. BMC Evol. Biol. 7, 1–17 (2007).

Kamilaris, A., Kartakoullis, A. & Prenafeta-Boldú, F. X. A review on the practice of big data analysis in agriculture. Comput. Electron. Agric. 143, 23–37 (2017).

Sharp, P. M. & Li, W.-H. The codon adaptation index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15(3), 1281–1295 (1987).

Şen, A., Kargar, K., Akgün, E. & Pınar, M. Ç. Codon optimization: A mathematical programing approach. Bioinformatics 36(13), 4012–4020 (2020).

Gustafsson, C., Govindarajan, S. & Minshull, J. Codon bias and heterologous protein expression. Trends Biotechnol. 22(7), 346–353 (2004).

Rocha, E. Codon usage bias from tRNA’s point of view: Redundancy, specialization, and efficient decoding for translation optimization. Genome Res. 14(11), 2279–2286 (2004).

Hallee, L. & Khomtchouk, B. B. Machine learning classifiers predict key genomic and evolutionary traits across the kingdoms of life. Sci. Rep. 13(1), 2088. https://doi.org/10.1038/s41598-023-28965-7 (2023).

LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521(7553), 436–444 (2015).

Harris, C. R. et al. Array programming with NumPy. Nature 585(7825), 357–362. https://doi.org/10.1038/s41586-020-2649-2 (2020).

Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

Cock, P. J. A. et al. Biopython: Freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25(11), 1422–1423 (2009).

Mount, D. W. Bioinformatics: Sequence and Genome Analysis Vol. 564 (Cold Spring Harbor Laboratory Press, 2001).

Alberts, B. Molecular Biology of the Cell 4th edn (2002).

Lesk, A. M. Introduction to Bioinformatics (Oxford University Press, 2019).

Goulet, D. R. et al. Codon optimization using a recurrent neural network. J. Comput. Biol. 30(1), 70–81 (2023).

Yates, A. D. et al. Ensembl genomes 2022: An expanding genome resource for non-vertebrates. Nucleic Acids Res. 50(D1), D996–D1003 (2022).

Sueoka, N. Directional mutation pressure and neutral molecular evolution. Proc. Natl. Acad. Sci. U. S. A. 85(8), 2653–2657. https://doi.org/10.1073/pnas.85.8.2653 (1988).

Quax, T. E., Claassens, N. J., Söll, D. & van der Oost, J. Codon bias as a means to fine-tune gene expression. Mol. Cell 59(2), 149–161 (2015).

Jolliffe, I. T. & Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. Lond. A Math. Phys. Eng. Sci. 374(2065), 20150202 (2016).

Ringnér, M. What is principal component analysis?. Nat. Biotechnol. 26(3), 303–304 (2008).

Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 37(1), 38–44 (2019).

Leinonen, R., Sugawara, H., Shumway, M. & International Nucleotide Sequence Database Collaboration. The sequence read archive. Nucleic Acids Res. 39(suppl_1), D19–D21 (2010).

Compeau, P. E. C., Pevzner, P. A. & Tesler, G. How to apply de Bruijn graphs to genome assembly. Nat. Biotechnol. 29(11), 987–991 (2011).

Haykin, S. Neural Networks and Learning Machines 3rd edn (Pearson Education India, 2009).

Ng, A. et al. Sparse autoencoder. CS294A Lecture Notes 72, 1–19 (2011).

Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006).

Nakamura, Y., Gojobori, T. & Ikemura, T. Codon usage tabulated from international DNA sequence databases: Status for the year 2000. Nucleic Acids Res. 28(1), 292. https://doi.org/10.1093/nar/28.1.292 (2000).

Yang, Z. & Nielsen, R. Mutation-selection models of codon substitution and their use to estimate selective strengths on codon usage. Mol. Biol. Evol. 25(3), 568–579 (2008).

Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning (Citeseer, 2009).

Shapiro, S. S. & Wilk, M. B. An analysis of variance test for normality (complete samples). Biometrika 52(3–4), 591–611 (1965).

Zaidi, S.-A. et al. New plant breeding technologies for food security. Science 363(6434), 1390–1391 (2019).

Smith, S. A. & Brown, J. W. Constructing a broadly inclusive seed plant phylogeny. Am. J. Bot. 105(3), 302–314 (2018).

Salzberg, S. L. Next-generation genome annotation: We still struggle to get it right. Genome Biol. 20(1), 92 (2019).
