Progress, challenges and future of linguistic neural decoding with deep learning

Machine Learning


  • Abnar, S., Beinborn, L., Choenni, R. & Zuidema, W. Blackbox meets blackbox: Representational similarity & stability analysis of neural language models and brains. In: Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pp 191–203 (2019).

  • Frisby, S. L., Halai, A. D., Cox, C. R., Ralph, M. A. L. & Rogers, T. T. Decoding semantic representations in mind and brain. Trends Cogn. Sci. 27, 258–281 (2023).

  • Silva, A. B., Littlejohn, K. T., Liu, J. R., Moses, D. A. & Chang, E. F. The speech neuroprosthesis. Nat. Rev. Neurosci. 25, 473–492 (2024).

  • Tuckute, G., Kanwisher, N. & Fedorenko, E. Language in brains, minds, and machines. Annu. Rev. Neurosci. 47, 271–301 (2024).

  • Ball, T., Kern, M., Mutschler, I., Aertsen, A. & Schulze-Bonhage, A. Signal quality of simultaneously recorded invasive and non-invasive EEG. NeuroImage 46, 708–716 (2009).

  • Ahissar, E. et al. Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc. Natl Acad. Sci. USA 98, 13367–13372 (2001).

  • Brodbeck, C., Hong, L. E. & Simon, J. Z. Rapid transformation from auditory to linguistic representations of continuous speech. Curr. Biol. 28, 3976–3983 (2018).

  • Koskinen, M., Kurimo, M., Gross, J., Hyvärinen, A. & Hari, R. Brain activity reflects the predictability of word sequences in listened continuous speech. NeuroImage 219, 116936 (2020).

  • Donhauser, P. W. & Baillet, S. Two distinct neural timescales for predictive speech processing. Neuron 105, 385–393 (2020).

  • Gillis, M., Vanthornhout, J., Simon, J. Z., Francart, T. & Brodbeck, C. Neural markers of speech comprehension: measuring EEG tracking of linguistic speech representations, controlling the speech acoustics. J. Neurosci. 41, 10316–10329 (2021).

  • Leonard, M. K. et al. Large-scale single-neuron speech sound encoding across the depth of human cortex. Nature 626, 593–602 (2024).

  • Khanna, A. R. et al. Single-neuronal elements of speech production in humans. Nature 626, 603–610 (2024).

  • Clark, A. Whatever next? predictive brains, situated agents, and the future of cognitive science. Behav. Brain Sci. 36, 181–204 (2013).

  • Schrimpf, M. et al. The neural architecture of language: Integrative modeling converges on predictive processing. Proc. Natl Acad. Sci. USA 118, e2105646118 (2021).

  • Heilbron, M., Armeni, K., Schoffelen, J.-M., Hagoort, P. & De Lange, F. P. A hierarchy of linguistic predictions during natural language comprehension. Proc. Natl Acad. Sci. USA 119, e2201968119 (2022).

  • Caucheteux, C., Gramfort, A. & King, J.-R. Evidence of a predictive coding hierarchy in the human brain listening to speech. Nat. Hum. Behav. 7, 430–441 (2023).

  • Broderick, M. P., Anderson, A. J., Di Liberto, G. M., Crosse, M. J. & Lalor, E. C. Electrophysiological correlates of semantic dissimilarity reflect the comprehension of natural, narrative speech. Curr. Biol. 28, 803–809 (2018).

  • Caucheteux, C., Gramfort, A. & King, J.-R. Disentangling syntax and semantics in the brain with deep networks. In: International conference on machine learning, pp 1336–1348 (PMLR, 2021).

  • Toneva, M., Mitchell, T. M. & Wehbe, L. Combining computational controls with natural text reveals aspects of meaning composition. Nat. Comput. Sci. 2, 745–757 (2022).

  • Antonello, R. & Huth, A. Predictive coding or just feature discovery? An alternative account of why language models fit brain data. Neurobiol. Lang. 5, 64–79 (2024).

  • Alkhamissi, B., Tuckute, G., Bosselut, A. & Schrimpf, M. The LLM language network: A neuroscientific approach for identifying causally task-relevant units. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 10887–10911 (2025).

  • Whittington, J. C., Warren, J. & Behrens, T. E. Relating transformers to models and neural representations of the hippocampal formation. In: International Conference on Learning Representations (2021).

  • Liu, X. et al. Coupling artificial neurons in BERT and biological neurons in the human brain. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp 8888–8896 (2023).

  • Antonello, R., Vaidya, A. & Huth, A. Scaling laws for language encoding models in fMRI. Adv. Neural Inf. Process. Syst. 36, 21895–21907 (2024).

  • Lin, H. et al. Selecting large language model to fine-tune via rectified scaling law. In Proceedings of the 41st International Conference on Machine Learning, pp. 30080–30107 (2024).

  • Ren, Y., Jin, R., Zhang, T. & Xiong, D. Do large language models mirror cognitive language processing? In: Proceedings of the 31st International Conference on Computational Linguistics, pp. 2988–3001 (2025).

  • Papineni, K., Roukos, S., Ward, T. & Zhu, W.-J. Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pp 311–318 (2002).

  • Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q. & Artzi, Y. BERTScore: evaluating text generation with BERT. In: International Conference on Learning Representations (2019).

  • Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 4171–4186 (2019).

  • Taal, C. H., Hendriks, R. C., Heusdens, R. & Jensen, J. A short-time objective intelligibility measure for time-frequency weighted noisy speech. In: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp 4214–4217 (IEEE, 2010).

  • Kubichek, R. Mel-cepstral distance measure for objective speech quality assessment. In: Proceedings of IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, vol. 1, pp 125–128 (IEEE, 1993).

  • Just, M. A., Cherkassky, V. L., Aryal, S. & Mitchell, T. M. A neurosemantic theory of concrete noun representation based on the underlying brain codes. PLoS ONE 5, e8622 (2010).

  • Sudre, G. et al. Tracking neural coding of perceptual and semantic features of concrete nouns. NeuroImage 62, 451–463 (2012).

  • Anderson, A. J., Kiela, D., Clark, S. & Poesio, M. Visually grounded and textual semantic models differentially decode brain activity associated with concrete and abstract nouns. Trans. Assoc. Comput. Linguist. 5, 17–30 (2017).

  • Abnar, S., Ahmed, R., Mijnheer, M. & Zuidema, W. Experiential, distributional and dependency-based word embeddings have complementary roles in decoding brain activity. In: Proceedings of the 8th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2018), pp 57–66 (2018).

  • Pereira, F. et al. Toward a universal decoder of linguistic meaning from brain activation. Nat. Commun. 9, 963 (2018).

  • Affolter, N., Egressy, B., Pascual, D. & Wattenhofer, R. Brain2word: decoding brain activity for language generation. Preprint at arXiv https://doi.org/10.48550/arXiv.2009.04765 (2020).

  • Zou, S., Wang, S., Zhang, J. & Zong, C. Towards brain-to-text generation: neural decoding with pre-trained encoder-decoder models. In: NeurIPS 2021 AI for Science Workshop (2021).

  • Anderson, A. J. et al. Predicting neural activity patterns associated with sentences using a neurobiologically motivated model of semantic representation. Cereb. Cortex 27, 4379–4395 (2017).

  • Wang, J., Cherkassky, V. L. & Just, M. A. Predicting the brain activation pattern associated with the propositional content of a sentence: modeling neural representations of events and states. Hum. Brain Mapp. 38, 4865–4881 (2017).

  • Anderson, A. J. et al. An integrated neural decoder of linguistic and experiential meaning. J. Neurosci. 39, 8969–8987 (2019).

  • Sun, J., Wang, S., Zhang, J. & Zong, C. Towards sentence-level brain decoding with distributed representations. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp 7047–7054 (2019).

  • Gauthier, J. & Levy, R. Linking artificial and human neural representations of language. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp 529–539 (2019).

  • Sun, J., Wang, S., Zhang, J. & Zong, C. Neural encoding and decoding with distributed sentence representations. IEEE Trans. Neural Netw. Learn. Syst. 32, 589–603 (2020).

  • Wehbe, L. et al. Simultaneously uncovering the patterns of brain regions involved in different story reading subprocesses. PLoS ONE 9, e112575 (2014).

  • Jat, S., Tang, H., Talukdar, P. & Mitchell, T. Relating simple sentence representations in deep neural networks and the brain. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp 5137–5154 (2019).

  • Li, Y. et al. Dissecting neural computations in the human auditory pathway using deep neural networks for speech. Nat. Neurosci. 26, 2213–2225 (2023).

  • Liu, Y. & Ayaz, H. Speech recognition via fNIRS-based brain signals. Front. Neurosci. 12, 395799 (2018).

  • Moses, D. A., Leonard, M. K., Makin, J. G. & Chang, E. F. Real-time decoding of question-and-answer speech dialogue using human cortical activity. Nat. Commun. 10, 3096 (2019).

  • Radford, A. et al. Learning transferable visual models from natural language supervision. In: International conference on machine learning, pp 8748–8763 (PMLR, 2021).

  • Défossez, A., Caucheteux, C., Rapin, J., Kabeli, O. & King, J.-R. Decoding speech perception from non-invasive brain recordings. Nat. Mach. Intell. 5, 1097–1107 (2023).

  • Baevski, A., Zhou, Y., Mohamed, A. & Auli, M. wav2vec 2.0: A framework for self-supervised learning of speech representations. Adv. Neural Inf. Process. Syst. 33, 12449–12460 (2020).

  • Bollens, L., Francart, T. & Van Hamme, H. Learning subject-invariant representations from speech-evoked EEG using variational autoencoders. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 1256–1260 (IEEE, 2022).

  • Aiken, S. J. & Picton, T. W. Human cortical responses to the speech envelope. Ear Hear. 29, 139–157 (2008).

  • Ding, N. & Simon, J. Z. Cortical entrainment to continuous speech: functional roles and interpretations. Front. Hum. Neurosci. 8, 311 (2014).

  • Vanthornhout, J., Decruy, L., Wouters, J., Simon, J. Z. & Francart, T. Speech intelligibility predicted from neural entrainment of the speech envelope. J. Assoc. Res. Otolaryngol. 19, 181–191 (2018).

  • Crosse, M. J., Di Liberto, G. M., Bednar, A. & Lalor, E. C. The multivariate temporal response function (mTRF) toolbox: a MATLAB toolbox for relating neural signals to continuous stimuli. Front. Hum. Neurosci. 10, 604 (2016).

  • Accou, B. et al. Modeling the relationship between acoustic stimulus and EEG with a dilated convolutional neural network. In: 2020 28th European Signal Processing Conference (EUSIPCO), pp 1175–1179 (IEEE, 2021).

  • De Clercq, P., Vanthornhout, J., Vandermosten, M. & Francart, T. Beyond linear neural envelope tracking: a mutual information approach. J. Neural Eng. 20, 026007 (2023).

  • Thornton, M., Mandic, D. & Reichenbach, T. Robust decoding of the speech envelope from EEG recordings through deep neural networks. J. Neural Eng. 19, 046007 (2022).

  • Accou, B., Vanthornhout, J., Hamme, H. V. & Francart, T. Decoding of the speech envelope from EEG using the VLAAI deep neural network. Sci. Rep. 13, 812 (2023).

  • de Taillez, T., Kollmeier, B. & Meyer, B. T. Machine learning for decoding listeners’ attention from electroencephalography evoked by continuous speech. Eur. J. Neurosci. 51, 1234–1241 (2020).

  • Xu, Z. et al. Decoding selective auditory attention with EEG using a transformer model. Methods 204, 410–417 (2022).

  • Krishna, G., Han, Y., Tran, C., Carnahan, M. & Tewfik, A. H. State-of-the-art speech recognition using EEG and towards decoding of speech spectrum from EEG. Preprint at arXiv https://doi.org/10.48550/arXiv.1908.05743 (2019).

  • Petrosyan, A., Voskoboynikov, A. & Ossadtchi, A. Compact and interpretable architecture for speech decoding from stereotactic EEG. In: 2021 Third International Conference Neurotechnologies and Neurointerfaces (CNN), pp 79–82 (IEEE, 2021).

  • Krishna, G., Tran, C., Carnahan, M. & Tewfik, A. H. Advancing speech synthesis using EEG. In: 2021 10th International IEEE/EMBS Conference on Neural Engineering (NER), pp 199–204 (IEEE, 2021).

  • Pasley, B. N. et al. Reconstructing speech from human auditory cortex. PLoS Biol. 10, e1001251 (2012).

  • Akbari, H., Khalighinejad, B., Herrero, J. L., Mehta, A. D. & Mesgarani, N. Towards reconstructing intelligible speech from the human auditory cortex. Sci. Rep. 9, 874 (2019).

  • Arjovsky, M., Chintala, S. & Bottou, L. Wasserstein generative adversarial networks. In: International Conference on Machine Learning, pp 214–223 (PMLR, 2017).

  • Wang, R. et al. Stimulus speech decoding from human cortex with generative adversarial network transfer learning. In: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), pp 390–394 (IEEE, 2020).

  • Yi, Z., Zhang, H., Tan, P. & Gong, M. Dualgan: Unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2849–2857 (2017).

  • Guo, Y., Liu, T., Zhang, X., Wang, A. & Wang, W. End-to-end translation of human neural activity to speech with a dual–dual generative adversarial network. Knowl. Based Syst. 277, 110837 (2023).

  • Senda, J. et al. Auditory stimulus reconstruction from ECoG with DNN and self-attention modules. Biomed. Signal Process. Control 89, 105761 (2024).

  • Makin, J. G., Moses, D. A. & Chang, E. F. Machine translation of cortical activity to text with an encoder–decoder framework. Nat. Neurosci. 23, 575–582 (2020).

  • Wang, Z. & Ji, H. Open vocabulary electroencephalography-to-text decoding and zero-shot sentiment classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp 5350–5358 (2022).

  • Lewis, M. et al. Bart: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 7871–7880 (2020).

  • Van Den Oord, A. et al. Neural discrete representation learning. Adv. Neural Inf. Process. Syst. 30, 6309–6318 (2017).

  • Duan, Y., Chau, C., Wang, Z., Wang, Y.-K. & Lin, C.-t. Dewave: discrete encoding of EEG waves for EEG to text translation. Adv. Neural Inf. Process. Syst. 36, 9907–9918 (2024).

  • Jo, H. et al. Are EEG-to-text models working? Preprint at arXiv https://doi.org/10.48550/arXiv.2405.06459 (2024).

  • Yang, Y., Duan, Y., Zhang, Q., Xu, R. & Xiong, H. Neuspeech: Decode neural signal as speech. Preprint at arXiv https://doi.org/10.48550/arXiv.2403.01748 (2024).

  • Yang, Y. et al. Mad: Multi-alignment meg-to-text decoding. Preprint at arXiv https://doi.org/10.48550/arXiv.2406.01512 (2024).

  • Chen, X., Du, C., Liu, C., Wang, Y. & He, H. Open-vocabulary auditory neural decoding using fMRI-prompted LLM. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.07840 (2024).

  • Ye, Z. et al. Generative language reconstruction from brain recordings (2024).

  • Yin, C., Ye, Z. & Li, P. Language reconstruction with brain predictive coding from fMRI data. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.11597 (2024).

  • Feng, X., Feng, X., Qin, B. & Liu, T. Aligning semantic in brain and language: a curriculum contrastive method for electroencephalography-to-text generation. IEEE Trans. Neural Syst. Rehabil. Eng. 31, 3874–3883 (2023).

  • Xi, N. et al. Unicorn: unified cognitive signal reconstruction bridging cognitive signals and human language. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp 13277–13291 (2023).

  • Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient estimation of word representations in vector space. Preprint at arXiv https://doi.org/10.48550/arXiv.1301.3781 (2013).

  • Tang, J., LeBel, A., Jain, S. & Huth, A. G. Semantic reconstruction of continuous language from non-invasive brain recordings. Nat. Neurosci. 26, 858–866 (2023).

  • Zhao, X. et al. Mapguide: a simple yet effective method to reconstruct continuous language from brain activities. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 3822–3832 (2024).

  • Chen, X. et al. High-speed spelling with a noninvasive brain–computer interface. Proc. Natl Acad. Sci. USA 112, E6058–E6067 (2015).

  • Metzger, S. L. et al. Generalizable spelling using a speech neuroprosthesis in an individual with severe limb and vocal paralysis. Nat. Commun. 13, 6510 (2022).

  • Leuthardt, E. C. et al. Using the electrocorticographic speech network to control a brain–computer interface in humans. J. Neural Eng. 8, 036004 (2011).

  • Willett, F. R., Avansino, D. T., Hochberg, L. R., Henderson, J. M. & Shenoy, K. V. High-performance brain-to-text communication via handwriting. Nature 593, 249–254 (2021).

  • Brigham, K. & Kumar, B. V. Imagined speech classification with EEG signals for silent communication: a preliminary investigation into synthetic telepathy. In: 2010 4th International Conference on Bioinformatics and Biomedical Engineering, pp 1–4 (IEEE, 2010).

  • Duraivel, S. et al. High-resolution neural recordings improve the accuracy of speech decoding. Nat. Commun. 14, 6938 (2023).

  • Wandelt, S. K. et al. Representation of internal speech by single neurons in human supramarginal gyrus. Nat. Hum. Behav. https://api.semanticscholar.org/CorpusID:269759448 (2024).

  • Chang, E. F. et al. Categorical speech representation in human superior temporal gyrus. Nat. Neurosci. 13, 1428–1432 (2010).

  • Mesgarani, N., Cheung, C., Johnson, K. & Chang, E. F. Phonetic feature encoding in human superior temporal gyrus. Science 343, 1006–1010 (2014).

  • Suppes, P., Lu, Z.-L. & Han, B. Brain wave recognition of words. Proc. Natl Acad. Sci. USA 94, 14965–14969 (1997).

  • Suppes, P., Han, B. & Lu, Z.-L. Brain-wave recognition of sentences. Proc. Natl Acad. Sci. USA 95, 15861–15866 (1998).

  • D’Zmura, M., Deng, S., Lappas, T., Thorpe, S. & Srinivasan, R. Toward EEG sensing of imagined speech. In Human-Computer Interaction. New Trends: 13th International Conference, HCI International 2009, San Diego, CA, USA, July 19-24, 2009, Proceedings, Part I 13, pp 40–48 (Springer, 2009).

  • Tankus, A., Fried, I. & Shoham, S. Structured neuronal encoding and decoding of human speech features. Nat. Commun. 3, 1015 (2012).

  • DaSalla, C. S., Kambara, H., Sato, M. & Koike, Y. Single-trial classification of vowel speech imagery using common spatial patterns. Neural Netw. 22, 1334–1339 (2009).

  • Wang, L., Zhang, X., Zhong, X. & Zhang, Y. Analysis and classification of speech imagery EEG for BCI. Biomed. Signal Process. Control 8, 901–908 (2013).

  • Stavisky, S. D. et al. Decoding speech from intracortical multielectrode arrays in dorsal “arm/hand areas” of human motor cortex. In: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp 93–97 (IEEE, 2018).

  • Pei, X., Barbour, D. L., Leuthardt, E. C. & Schalk, G. Decoding vowels and consonants in spoken and imagined words using electrocorticographic signals in humans. J. Neural Eng. 8, 046028 (2011).

  • Deng, S., Srinivasan, R., Lappas, T. & D’Zmura, M. EEG classification of imagined syllable rhythm using Hilbert spectrum methods. J. Neural Eng. 7, 046006 (2010).

  • Kim, J., Lee, S.-K. & Lee, B. EEG classification in a single-trial basis for vowel speech perception using multivariate empirical mode decomposition. J. Neural Eng. 11, 036010 (2014).

  • Moses, D. A., Mesgarani, N., Leonard, M. K. & Chang, E. F. Neural speech recognition: continuous phoneme decoding using spatiotemporal representations of human cortical activity. J. Neural Eng. 13, 056004 (2016).

  • Brumberg, J. S., Wright, E. J., Guenther, F. H. & Kennedy, P. R. Classification of intended phoneme production from chronic intracortical microelectrode recordings in speech motor cortex. Front. Neurosci. 5, 7880 (2011).

  • Kellis, S. et al. Decoding spoken words using local field potentials recorded from the cortical surface. J. Neural Eng. 7, 056007 (2010).

  • Mugler, E. M. et al. Direct classification of all American English phonemes using signals from functional speech motor cortex. J. Neural Eng. 11, 035015 (2014).

  • Clayton, J., Wellington, S., Valentini-Botinhao, C. & Watts, O. Decoding imagined, heard, and spoken speech: Classification and regression of EEG using a 14-channel dry-contact mobile headset. In: INTERSPEECH, pp 4886–4890 (2020).

  • Mohanchandra, K. & Saha, S. A communication paradigm using subvocalized speech: translating brain signals into speech. Augmented Hum. Res. 1, 3 (2016).

  • Martin, S. et al. Word pair classification during imagined speech using direct brain recordings. Sci. Rep. 6, 25803 (2016).

  • Nguyen, C. H., Karavas, G. K. & Artemiadis, P. Inferring imagined speech using EEG signals: a new approach using Riemannian manifold features. J. Neural Eng. 15, 016002 (2017).

  • González-Castañeda, E. F., Torres-García, A. A., Reyes-García, C. A. & Villaseñor-Pineda, L. Sonification and textification: Proposing methods for classifying unspoken words from EEG signals. Biomed. Signal Process. Control 37, 82–91 (2017).

  • Salama, M., ElSherif, L., Lashin, H. & Gamal, T. Recognition of unspoken words using electrode electroencephalographic signals. In: The Sixth International Conference on Advanced Cognitive Technologies and Applications, pp 51–5 (2014).

  • Zhao, S. & Rudzicz, F. Classifying phonological categories in imagined and articulated speech. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 992–996 (IEEE, 2015).

  • Saha, P. & Fels, S. Hierarchical deep feature learning for decoding imagined speech from EEG. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp 10019–10020 (2019).

  • Dash, D., Ferrari, P. & Wang, J. Decoding imagined and spoken phrases from non-invasive neural (MEG) signals. Front. Neurosci. 14, 490970 (2020).

  • Herff, C. et al. Brain-to-text: decoding spoken phrases from phone representations in the brain. Front. Neurosci. 8, 141498 (2015).

  • Moses, D. A. et al. Neuroprosthesis for decoding speech in a paralyzed person with anarthria. N. Engl. J. Med. 385, 217–227 (2021).

  • Willett, F. R. et al. A high-performance speech neuroprosthesis. Nature 620, 1031–1036 (2023).

  • Metzger, S. L. et al. A high-performance neuroprosthesis for speech decoding and avatar control. Nature 620, 1037–1046 (2023).

  • Munteanu, C., Penn, G., Baecker, R., Toms, E. & James, D. Measuring the acceptable word error rate of machine-generated webcast transcripts. In: Ninth International Conference on Spoken Language Processing (Citeseer, 2006).

  • Sun, P., Anumanchipalli, G. K. & Chang, E. F. Brain2char: a deep architecture for decoding text from brain recordings. J. Neural Eng. 17, 066015 (2020).

  • Feng, C. et al. A high-performance brain-to-sentence decoder for logosyllabic language (2023).

  • Silva, A. B. et al. A bilingual speech neuroprosthesis driven by cortical articulatory representations shared between languages. Nat. Biomed. Eng. 8, 977–991 (2024).

  • Yuan, B. A. & Makin, J. G. Improving speech decoding from ECoG with self-supervised pretraining. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.18639 (2024).

  • Schneider, S., Baevski, A., Collobert, R. & Auli, M. wav2vec: Unsupervised pre-training for speech recognition. In Proc. Interspeech 2019, pp. 3465–3469 (2019).

  • Radford, A. et al. Language models are unsupervised multitask learners. OpenAI blog 1, 9 (2019).

  • Zhang, S. et al. Opt: Open pre-trained transformer language models. Preprint at arXiv https://doi.org/10.48550/arXiv.2205.01068 (2022).

  • Touvron, H. et al. Llama 2: open foundation and fine-tuned chat models. Preprint at arXiv https://doi.org/10.48550/arXiv.2307.09288 (2023).

  • Feng, S., Liu, H., Wang, Y. & Wang, Y. Towards an end-to-end framework for invasive brain signal decoding with large language models. In: Interspeech 2024, pp 1495–1499 (2024).

  • Guenther, F. H. et al. A wireless brain-machine interface for real-time speech synthesis. PLoS ONE 4, e8218 (2009).

  • Chen, X. et al. A neural speech decoding framework leveraging deep learning and speech synthesis. Nat. Mach. Intell. 6, 467–480 (2024).

  • Anumanchipalli, G. K., Chartier, J. & Chang, E. F. Speech synthesis from neural decoding of spoken sentences. Nature 568, 493–498 (2019).

  • Krishna, G., Tran, C., Han, Y., Carnahan, M. & Tewfik, A. H. Speech synthesis using EEG. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 1235–1238 (IEEE, 2020).

  • Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708 (2017).

  • Angrick, M. et al. Speech synthesis from ECoG using densely connected 3D convolutional neural networks. J. Neural Eng. 16, 036019 (2019).

  • Bouchard, K. E., Mesgarani, N., Johnson, K. & Chang, E. F. Functional organization of human sensorimotor cortex for speech articulation. Nature 495, 327–332 (2013).

  • Bocquelet, F., Hueber, T., Girin, L., Badin, P. & Yvert, B. Robust articulatory speech synthesis using deep neural networks for BCI applications. In: Interspeech 2014-15th Annual Conference of the International Speech Communication Association (2014).

  • Bocquelet, F., Hueber, T., Girin, L., Savariaux, C. & Yvert, B. Real-time control of an articulatory-based speech synthesizer for brain computer interfaces. PLoS Comput. Biol. 12, e1005119 (2016).

  • Cheung, C., Hamilton, L. S., Johnson, K. & Chang, E. F. The auditory representation of speech sounds in human motor cortex. eLife 5, e12577 (2016).

  • Dichter, B. K., Breshears, J. D., Leonard, M. K. & Chang, E. F. The control of vocal pitch in human laryngeal motor cortex. Cell 174, 21–31 (2018).

  • Chartier, J., Anumanchipalli, G. K., Johnson, K. & Chang, E. F. Encoding of articulatory kinematic trajectories in human speech sensorimotor cortex. Neuron 98, 1042–1054 (2018).

  • Mugler, E. M. et al. Differential representation of articulatory gestures and phonemes in precentral and inferior frontal gyri. J. Neurosci. 38, 9803–9813 (2018).

  • Angrick, M. et al. Real-time synthesis of imagined speech processes from minimally invasive recordings of neural activity. Commun. Biol. 4, 1055 (2021).

  • Herff, C. et al. Generating natural, intelligible speech from brain activity in motor, premotor, and inferior frontal cortices. Front. Neurosci. 13, 469935 (2019).

  • Liu, Y. et al. Decoding and synthesizing tonal language speech from brain activity. Sci. Adv. 9, eadh0478 (2023).

  • Angrick, M. et al. Online speech synthesis using a chronically implanted brain–computer interface in an individual with ALS. Sci. Rep. 14, 9617 (2024).

  • Ling, J., Wang, Y., Xue, H., Xie, R. & Song, L. PoseTalk: text-and-audio-based pose control and motion refinement for one-shot talking head generation. Preprint at arXiv https://doi.org/10.48550/arXiv.2409.02657 (2024).

  • Song, H. et al. Continuous neural control of a bionic limb restores biomimetic gait after amputation. Nat. Med. 30, 2010–2019 (2024).

  • Wang, J. et al. Neural correlate and movement decoding of simultaneous-and-sequential bimanual movements using EEG signals. IEEE Trans. Neural Syst. Rehabil. Eng. 32, 2087–2095 (2024).

  • Luo, S. et al. Stable decoding from a speech BCI enables control for an individual with ALS without recalibration for 3 months. Adv. Sci. https://api.semanticscholar.org/CorpusID:264448311 (2023).

  • Wang, S., Liu, S., Tan, Z. & Wang, X. MindBridge: A cross-subject brain decoding framework. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 11333–11342 (2024).

  • Quan, R., Wang, W., Tian, Z., Ma, F. & Yang, Y. Psychometry: An omnifit model for image reconstruction from human brain activity. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 233–243 (2024).

  • Liu, X. et al. EEG2Video: Towards decoding dynamic visual perception from EEG signals. In: The Thirty-Eighth Annual Conference on Neural Information Processing Systems (2024).

  • van Stuijvenberg, O. C., Broekman, M. L., Wolff, S. E., Bredenoord, A. L. & Jongsma, K. R. Developer perspectives on the ethics of AI-driven neural implants: a qualitative study. Sci. Rep. 14, 7880 (2024).

  • Yuste, R. Advocating for neurodata privacy and neurotechnology regulation. Nat. Protoc. 18, 2869–2875 (2023).

  • Wang, C. et al. BrainBERT: Self-supervised representation learning for intracranial recordings. In: The Eleventh International Conference on Learning Representations (2023).

  • Zheng, H. et al. Du-IN: Discrete units-guided mask modeling for decoding speech from intracranial neural signals. In: The Thirty-Eighth Annual Conference on Neural Information Processing Systems https://openreview.net/forum?id=uyLtEFnpQP (2024).

  • Dong, Y. et al. An approach for EEG denoising based on Wasserstein generative adversarial network. IEEE Trans. Neural Syst. Rehabil. Eng. 31, 3524–3534 (2023).

  • Huang, X., Li, C., Liu, A., Qian, R. & Chen, X. EEGDfus: a conditional diffusion model for fine-grained EEG denoising. IEEE J. Biomed. Health Inform. 29, 2557–2569 (2024).

  • Antonello, R., Sarma, N., Tang, J., Song, J. & Huth, A. How many bytes can you take out of brain-to-text decoding? Preprint at arXiv https://doi.org/10.48550/arXiv.2405.14055 (2024).

  • Han, J. et al. OneLLM: One framework to align all modalities with language. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 26584–26595 (2024).

  • Tankus, A., Rosenberg, N., Ben-Hamo, O., Stern, E. & Strauss, I. Machine learning decoding of single neurons in the thalamus for speech brain-machine interfaces. J. Neural Eng. 21, 036009 (2024).

  • Feinberg, D. A. et al. Next-generation MRI scanner designed for ultra-high-resolution human brain imaging at 7 Tesla. Nat. Methods 20, 2048–2057 (2023).

  • Tatulian, S. A. Challenges and hopes for Alzheimer’s disease. Drug Discov. Today https://api.semanticscholar.org/CorpusID:246553676 (2022).

  • Bucur, M. & Papagno, C. Deep brain stimulation in Parkinson disease: a meta-analysis of the long-term neuropsychological outcomes. Neuropsychol. Rev. https://api.semanticscholar.org/CorpusID:247615265 (2022).


