Busch, E. L. et al. Multi-view manifold learning of human brain-state trajectories. Nat. Comput. Sci. 3, 240–253 (2023).
Google Scholar
Floryan, D. & Graham, M. D. Data-driven discovery of intrinsic dynamics. Nat. Mach. Intell. 4, 1113–1120 (2022).
Google Scholar
Islam, M. T. et al. Revealing hidden patterns in deep neural network feature space continuum via manifold learning. Nat. Commun. 14, 8506 (2023).
Google Scholar
Vogelstein, J. T. et al. Supervised dimensionality reduction for big data. Nat. Commun. 12, 2872 (2021).
Google Scholar
Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 25, 417–441 (1933).
Google Scholar
Peng, D., Gui, Z., Gui, J., & Wu, H. A robust and efficient boundary point detection method by measuring local direction dispersion. IEEE Trans. Circuits Syst. Video Technol. https://doi.org/10.1109/TCSVT.2025.3546401 (2025).
Kruskal, J. B. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29, 1–27 (1964).
Google Scholar
Fisher, R. A. The use of multiple measurements in taxonomic problems. Ann. Eugenics 7, 179–188 (1936).
Google Scholar
Yan, S. et al. Nonlinear discriminant analysis on embedded manifold. IEEE Trans. Circuits Syst. Video Technol. 17, 468–477 (2007).
Google Scholar
Zhang, L., Zhou, W. & Chang, P.-C. Generalized nonlinear discriminant analysis and its small sample size problems. Neurocomputing 74, 568–574 (2011).
Google Scholar
Tenenbaum, J. B., Silva, V. D. & Langford, J. C. A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323 (2000).
Google Scholar
Roweis, S. T. & Saul, L. K. Nonlinear dimensionality reduction by locally linear embedding. Science 290, 2323–2326 (2000).
Google Scholar
Cover, T. & Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13, 21–27 (1967).
Google Scholar
Belkin, M., & Niyogi, P. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Proc. Advanced Neural Information Processing Systems (eds Dietterich, T. G. et al.) 585–591 (2001).
Donoho, D. L. & Grimes, C. Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proc. Natl Acad. Sci. USA 100, 5591–5596 (2003).
Google Scholar
Coifman, R. R. et al. Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proc. Natl Acad. Sci. USA 102, 7426–7431 (2005).
Google Scholar
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Google Scholar
McInnes, L., Healy, J., Saul, N. & Groçberger, L. UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3, 861 (2018).
Google Scholar
Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 37, 38–44 (2019).
Google Scholar
Amid, E. & Warmuth, M. K. TriMap: large-scale dimensionality reduction using triplets. Preprint at https://arxiv.org/abs/1910.00204 (2022).
Fu, C., Zhang, Y., Cai, D. & Ren, X. AtSNE: efficient and robust visualization on GPU through hierarchical optimization. In Proc. 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) 176–186 (Association for Computing Machinery, 2019).
Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nat. Biotechnol. 37, 1482–1492 (2019).
Google Scholar
Kobak, D. & Berens, P. The art of using t-SNE for single-cell transcriptomics. Nat. Commun. 10, 5416 (2019).
Google Scholar
Shen, C. & Wu, H.-T. Scalability and robustness of spectral embedding: landmark diffusion is all you need. Inf. Inference 11, 1527–1595 (2022).
Google Scholar
Long, A. W. & Ferguson, A. L. Landmark diffusion maps (L-dMaps): accelerated manifold learning out-of-sample extension. Appl. Comput. Harmon. Anal. 47, 190–211 (2019).
Google Scholar
Persad, S. et al. SEACells infers transcriptional and epigenomic cellular states from single-cell genomics data. Nat. Biotechnol. 41, 1746–1757 (2023).
Google Scholar
Stanoi, I., Riedewald, M., Agrawal, D. & El Abbadi, A. Discovery of influence sets in frequently updated databases. In Proc. 27th International Conference on Very Large Databases (VLDB) (eds. Apers P. M. G. et al.) 99–108 (Morgan Kaufmann Publishers Inc., 2001).
Tao, Y., Papadias, D. & Lian, X. Reverse kNN search in arbitrary dimensionality. In Proc. 30th International Conference on Very Large Databases (VLDB) (eds Nascimento, M. A. et al.) 744–755 (Morgan Kaufmann Publishers Inc., 2004).
Radovanovic, M., Nanopoulos, A. & Ivanovic, M. Hubs in space: popular nearest neighbors in high-dimensional data. J. Mach. Learn. Res. 11, 2487–2531 (2010).
Google Scholar
Loshchilov, I. & Hutter, F. SGDR: stochastic gradient descent with warm restarts. In Proc. 5th International Conference on Learning Representations (ICLR) 1769–1784 (Curran Associates, 2017).
Peng, D., Gui, Z. & Wu, H. MeanCut: a greedy-optimized graph clustering via path-based similarity and degree descent criterion. Preprint at https://arxiv.org/abs/2312.04067 (2023).
Karypis, G., Han, E.-H. & Kumar, V. Chameleon: hierarchical clustering using dynamic modeling. Computer 32, 68–75 (1999).
Google Scholar
France, S. & Carroll, D. in Machine Learning and Data Mining in Pattern Recognition (ed. Perner, P.) vol. 4,571, 499–517 (2007).
Moor, M., Horn, M., Rieck, B. & Borgwardt, K. Topological autoencoders. In Proc. 37th International Conference on Machine Learning (ICML) (eds Daumé, H. & Singh, A.) 7045–7054 (PMLR, 2020).
Sainburg, T., McInnes, L. & Gentner, T. Q. Parametric UMAP embeddings for representation and semisupervised learning. Neural Comput. 33, 2881–2907 (2021).
Google Scholar
Dua, D. & Graff, C. UCI machine learning repository. University of California Irvine https://archive.ics.uci.edu (2019).
Krizhevsky, A. & Hinton, G. Learning Multiple Layers of Features from Tiny Images Technical Report (Univ. Toronto, 2009).
Lecun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).
Google Scholar
Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at https://arxiv.org/abs/1708.07747 (2017).
Corso, G. M. D., Gulli, A. & Romani, F. Ranking a stream of news. In Proc. 14th International World Wide Web Conference (WWW) 97–106 (Association for Computing Machinery, 2005).
Poličar, P. G., Stražar, M. & Zupan, B. openTSNE: a modular Python library for t-SNE dimensionality reduction and embedding. J. Stat. Softw. 109, 1–30 (2024).
Google Scholar
Huang, H., Wang, Y., Rudin, C. & Browne, E. P. Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization. Commun. Biol. 5, 719 (2022).
Google Scholar
Amir, E. A. et al. viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia. Nat. Biotechnol. 31, 545–552 (2013).
Google Scholar
Khan, S. A. et al. Reusability report: learning the transcriptional grammar in single-cell RNA-sequencing data using transformers. Nat. Mach. Intell. 5, 1437–1446 (2023).
Google Scholar
Heng, J. S. et al. Hypoxia tolerance in the Norrin-deficient retina and the chronically hypoxic brain studied at single-cell resolution. Proc. Natl Acad. Sci. USA 116, 9103–9114 (2019).
Google Scholar
Satija, R. et al. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).
Google Scholar
Kiselev, V. Y., Andrews, T. S. & Hemberg, M. Challenges in unsupervised clustering of single-cell RNA-seq data. Nat. Rev. Genet. 20, 273–282 (2019).
Google Scholar
Peng, D. et al. Clustering by measuring local direction centrality for data with heterogeneous density and weak connectivity. Nat. Commun. 13, 5455 (2022).
Google Scholar
Wang, S. K. et al. Single-cell multiome and enhancer connectome of human retinal pigment epithelium and choroid nominate pathogenic variants in age-related macular degeneration. Preprint at bioRxiv https://doi.org/10.1101/2025.03.21.644670 (2025).
Ziegler, C. G. et al. Impaired local intrinsic immunity to SARS-CoV-2 infection in severe COVID-19. Cell 184, 4713–4733 (2021).
Google Scholar
Levine, J. H. et al. Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162, 184–197 (2015).
Google Scholar
Weber, L. M. & Robinson, M. D. Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data. Cytom. A 89, 1084–1096 (2016).
Google Scholar
Ferrari, G., Thrasher, A. J. & Aiuti, A. Gene therapy using haematopoietic stem and progenitor cells. Nat. Rev. Genet. 22, 216–234 (2021).
Google Scholar
Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477 (2018).
Google Scholar
Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014).
Google Scholar
Young, W. J. et al. Genetic analyses of the electrocardiographic QT interval and its components identify additional loci and pathways. Nat. Commun. 13, 5144 (2022).
Google Scholar
Niroshana, S. M. I., Kuroda, S., Tanaka, K. & Chen, W. Beat-wise segmentation of electrocardiogram using adaptive windowing and deep neural network. Sci. Rep. 13, 11039 (2023).
Google Scholar
Huo, R. et al. ECG segmentation algorithm based on bidirectional hidden semi-Markov model. Comput. Biol. Med. 150, 106081 (2022).
Google Scholar
Oberlin, T., Meignen, S. & Perrier, V. The Fourier-based synchrosqueezing transform. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 315–319 (IEEE, 2014).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
Google Scholar
Laguna, P., Mark, R. G., Goldberg, A. & Moody, G. B. A database for evaluation of algorithms for measurement of QT and other waveform intervals in the ECG. Comput. Cardiol. 24, 673–676 (1997).
Goldberger, A. L. et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101, e215–e220 (2000).
Google Scholar
Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828 (2013).
Google Scholar
Peng, D., Gui, Z. & Wu, H. Interpreting the curse of dimensionality from distance concentration and manifold effect. Preprint at https://arxiv.org/abs/2401.00422 (2023).
Healy, J. & McInnes, L. Uniform manifold approximation and projection. Nat. Rev. Methods Primers 4, 82 (2024).
Google Scholar
Malkov, Y. A. & Yashunin, D. A. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Trans. Pattern Anal. Mach. Intell. 42, 824–836 (2018).
Google Scholar
Dong, W., Moses, C. & Li, K. Efficient k-nearest neighbor graph construction for generic similarity measures. In Proc. 20th International Conference on World Wide Web (WWW) 577–586 (Association for Computing Machinery, 2011).
Saul, L. K. & Roweis, S. T. Think globally, fit locally: unsupervised learning of low dimensional manifolds. J. Mach. Learn. Res. 4, 119–155 (2003).
Google Scholar
Wang, J. Real local-linearity preserving embedding. Neurocomputing 136, 7–13 (2014).
Google Scholar
Wang, Z. et al. Clustering by local gravitation. IEEE Trans. Cybern. 48, 1383–1396 (2018).
Google Scholar
von Luxburg, U. A tutorial on spectral clustering. Stat. Comp. 17, 395–416 (2007).
Google Scholar
Ding, J., Shah, S. & Condon, A. densityCut: an efficient and versatile topological approach for automatic clustering of biological data. Bioinformatics 32, 2567–2576 (2016).
Google Scholar
Kuhn, H. W. The Hungarian method for the assignment problem. Nav. Res. Logist. Q. 2, 83–97 (1955).
Google Scholar
Peng, D. Source code of SUDE. Zenodo https://doi.org/10.5281/zenodo.16792257 (2025).
Peng, D. Open code for in-review “Sampling-enabled scalable manifold learning unveils discriminative cluster structure of high-dimensional data”. Code Ocean https://doi.org/10.24433/CO.9866909.v3 (2025).
