Progress, challenges and future of linguistic neural decoding with deep learning

Abnar, S., Beinborn, L., Choenni, R. & Zuidema, W. Blackbox meets blackbox: Representational similarity & stability analysis of neural language models and brains. In: Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pp 191–203 (2019).

Frisby, S. L., Halai, A. D., Cox, C. R., Ralph, M. A. L. & Rogers, T. T. Decoding semantic representations in mind and brain. Trends Cogn. Sci. 27, 258–281 (2023).

Silva, A. B., Littlejohn, K. T., Liu, J. R., Moses, D. A. & Chang, E. F. The speech neuroprosthesis. Nat. Rev. Neurosci. 25, 473–492 (2024).

Tuckute, G., Kanwisher, N. & Fedorenko, E. Language in brains, minds, and machines. Annu. Rev. Neurosci. 47, 271–301 (2024).

Ball, T., Kern, M., Mutschler, I., Aertsen, A. & Schulze-Bonhage, A. Signal quality of simultaneously recorded invasive and non-invasive EEG. NeuroImage 46, 708–716 (2009).

Ahissar, E. et al. Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc. Natl Acad. Sci. USA 98, 13367–13372 (2001).

Brodbeck, C., Hong, L. E. & Simon, J. Z. Rapid transformation from auditory to linguistic representations of continuous speech. Curr. Biol. 28, 3976–3983 (2018).

Koskinen, M., Kurimo, M., Gross, J., Hyvärinen, A. & Hari, R. Brain activity reflects the predictability of word sequences in listened continuous speech. NeuroImage 219, 116936 (2020).

Donhauser, P. W. & Baillet, S. Two distinct neural timescales for predictive speech processing. Neuron 105, 385–393 (2020).

Gillis, M., Vanthornhout, J., Simon, J. Z., Francart, T. & Brodbeck, C. Neural markers of speech comprehension: measuring EEG tracking of linguistic speech representations, controlling the speech acoustics. J. Neurosci. 41, 10316–10329 (2021).

Leonard, M. K. et al. Large-scale single-neuron speech sound encoding across the depth of human cortex. Nature 626, 593–602 (2024).

Khanna, A. R. et al. Single-neuronal elements of speech production in humans. Nature 626, 603–610 (2024).

Clark, A. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav. Brain Sci. 36, 181–204 (2013).

Schrimpf, M. et al. The neural architecture of language: Integrative modeling converges on predictive processing. Proc. Natl Acad. Sci. USA 118, e2105646118 (2021).

Heilbron, M., Armeni, K., Schoffelen, J.-M., Hagoort, P. & De Lange, F. P. A hierarchy of linguistic predictions during natural language comprehension. Proc. Natl Acad. Sci. USA 119, e2201968119 (2022).

Caucheteux, C., Gramfort, A. & King, J.-R. Evidence of a predictive coding hierarchy in the human brain listening to speech. Nat. Hum. Behav. 7, 430–441 (2023).

Broderick, M. P., Anderson, A. J., Di Liberto, G. M., Crosse, M. J. & Lalor, E. C. Electrophysiological correlates of semantic dissimilarity reflect the comprehension of natural, narrative speech. Curr. Biol. 28, 803–809 (2018).

Caucheteux, C., Gramfort, A. & King, J.-R. Disentangling syntax and semantics in the brain with deep networks. In: International conference on machine learning, pp 1336–1348 (PMLR, 2021).

Toneva, M., Mitchell, T. M. & Wehbe, L. Combining computational controls with natural text reveals aspects of meaning composition. Nat. Comput. Sci. 2, 745–757 (2022).

Antonello, R. & Huth, A. Predictive coding or just feature discovery? An alternative account of why language models fit brain data. Neurobiol. Lang. 5, 64–79 (2024).

Alkhamissi, B., Tuckute, G., Bosselut, A. & Schrimpf, M. The LLM language network: A neuroscientific approach for identifying causally task-relevant units. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 10887–10911 (2025).

Whittington, J. C., Warren, J. & Behrens, T. E. Relating transformers to models and neural representations of the hippocampal formation. In: International Conference on Learning Representations (2021).

Liu, X. et al. Coupling artificial neurons in BERT and biological neurons in the human brain. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp 8888–8896 (2023).

Antonello, R., Vaidya, A. & Huth, A. Scaling laws for language encoding models in fMRI. Adv. Neural Inf. Process. Syst. 36, 21895–21907 (2024).

Lin, H. et al. Selecting large language model to fine-tune via rectified scaling law. In Proceedings of the 41st International Conference on Machine Learning, pp. 30080–30107 (2024).

Ren, Y., Jin, R., Zhang, T. & Xiong, D. Do large language models mirror cognitive language processing? In: Proceedings of the 31st International Conference on Computational Linguistics, pp. 2988–3001 (2025).

Papineni, K., Roukos, S., Ward, T. & Zhu, W.-J. Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pp 311–318 (2002).

Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q. & Artzi, Y. BERTScore: evaluating text generation with BERT. In: International Conference on Learning Representations (2019).

Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 4171–4186 (2019).

Taal, C. H., Hendriks, R. C., Heusdens, R. & Jensen, J. A short-time objective intelligibility measure for time-frequency weighted noisy speech. In: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp 4214–4217 (IEEE, 2010).

Kubichek, R. Mel-cepstral distance measure for objective speech quality assessment. In: Proceedings of IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, vol. 1, 125–128 (IEEE, 1993).

Just, M. A., Cherkassky, V. L., Aryal, S. & Mitchell, T. M. A neurosemantic theory of concrete noun representation based on the underlying brain codes. PloS ONE 5, e8622 (2010).

Sudre, G. et al. Tracking neural coding of perceptual and semantic features of concrete nouns. NeuroImage 62, 451–463 (2012).

Anderson, A. J., Kiela, D., Clark, S. & Poesio, M. Visually grounded and textual semantic models differentially decode brain activity associated with concrete and abstract nouns. Trans. Assoc. Comput. Linguist. 5, 17–30 (2017).

Abnar, S., Ahmed, R., Mijnheer, M. & Zuidema, W. Experiential, distributional and dependency-based word embeddings have complementary roles in decoding brain activity. In: Proceedings of the 8th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2018), pp 57–66 (2018).

Pereira, F. et al. Toward a universal decoder of linguistic meaning from brain activation. Nat. Commun. 9, 963 (2018).

Affolter, N., Egressy, B., Pascual, D. & Wattenhofer, R. Brain2word: decoding brain activity for language generation. Preprint at arXiv https://doi.org/10.48550/arXiv.2009.04765 (2020).

Zou, S., Wang, S., Zhang, J. & Zong, C. Towards brain-to-text generation: neural decoding with pre-trained encoder-decoder models. In: NeurIPS 2021 AI for Science Workshop (2021).

Anderson, A. J. et al. Predicting neural activity patterns associated with sentences using a neurobiologically motivated model of semantic representation. Cereb. Cortex 27, 4379–4395 (2017).

Wang, J., Cherkassky, V. L. & Just, M. A. Predicting the brain activation pattern associated with the propositional content of a sentence: modeling neural representations of events and states. Hum. Brain Mapp. 38, 4865–4881 (2017).

Anderson, A. J. et al. An integrated neural decoder of linguistic and experiential meaning. J. Neurosci. 39, 8969–8987 (2019).

Sun, J., Wang, S., Zhang, J. & Zong, C. Towards sentence-level brain decoding with distributed representations. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp 7047–7054 (2019).

Gauthier, J. & Levy, R. Linking artificial and human neural representations of language. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp 529–539 (2019).

Sun, J., Wang, S., Zhang, J. & Zong, C. Neural encoding and decoding with distributed sentence representations. IEEE Trans. Neural Netw. Learn. Syst. 32, 589–603 (2020).

Wehbe, L. et al. Simultaneously uncovering the patterns of brain regions involved in different story reading subprocesses. PloS ONE 9, e112575 (2014).

Jat, S., Tang, H., Talukdar, P. & Mitchell, T. Relating simple sentence representations in deep neural networks and the brain. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5137–5154 (2019).

Li, Y. et al. Dissecting neural computations in the human auditory pathway using deep neural networks for speech. Nat. Neurosci. 26, 2213–2225 (2023).

Liu, Y. & Ayaz, H. Speech recognition via fNIRS-based brain signals. Front. Neurosci. 12, 395799 (2018).

Moses, D. A., Leonard, M. K., Makin, J. G. & Chang, E. F. Real-time decoding of question-and-answer speech dialogue using human cortical activity. Nat. Commun. 10, 3096 (2019).

Radford, A. et al. Learning transferable visual models from natural language supervision. In: International conference on machine learning, pp 8748–8763 (PMLR, 2021).

Défossez, A., Caucheteux, C., Rapin, J., Kabeli, O. & King, J.-R. Decoding speech perception from non-invasive brain recordings. Nat. Mach. Intell. 5, 1097–1107 (2023).

Baevski, A., Zhou, Y., Mohamed, A. & Auli, M. wav2vec 2.0: A framework for self-supervised learning of speech representations. Adv. Neural Inf. Process. Syst. 33, 12449–12460 (2020).

Bollens, L., Francart, T. & Van Hamme, H. Learning subject-invariant representations from speech-evoked EEG using variational autoencoders. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 1256–1260 (IEEE, 2022).

Aiken, S. J. & Picton, T. W. Human cortical responses to the speech envelope. Ear Hear. 29, 139–157 (2008).

Ding, N. & Simon, J. Z. Cortical entrainment to continuous speech: functional roles and interpretations. Front. Hum. Neurosci. 8, 311 (2014).

Vanthornhout, J., Decruy, L., Wouters, J., Simon, J. Z. & Francart, T. Speech intelligibility predicted from neural entrainment of the speech envelope. J. Assoc. Res. Otolaryngol. 19, 181–191 (2018).

Crosse, M. J., Di Liberto, G. M., Bednar, A. & Lalor, E. C. The multivariate temporal response function (mTRF) toolbox: a MATLAB toolbox for relating neural signals to continuous stimuli. Front. Hum. Neurosci. 10, 604 (2016).

Accou, B. et al. Modeling the relationship between acoustic stimulus and EEG with a dilated convolutional neural network. In: 2020 28th European Signal Processing Conference (EUSIPCO), pp 1175–1179 (IEEE, 2021).

De Clercq, P., Vanthornhout, J., Vandermosten, M. & Francart, T. Beyond linear neural envelope tracking: a mutual information approach. J. Neural Eng. 20, 026007 (2023).

Thornton, M., Mandic, D. & Reichenbach, T. Robust decoding of the speech envelope from EEG recordings through deep neural networks. J. Neural Eng. 19, 046007 (2022).

Accou, B., Vanthornhout, J., Hamme, H. V. & Francart, T. Decoding of the speech envelope from EEG using the VLAAI deep neural network. Sci. Rep. 13, 812 (2023).

de Taillez, T., Kollmeier, B. & Meyer, B. T. Machine learning for decoding listeners’ attention from electroencephalography evoked by continuous speech. Eur. J. Neurosci. 51, 1234–1241 (2020).

Xu, Z. et al. Decoding selective auditory attention with EEG using a transformer model. Methods 204, 410–417 (2022).

Krishna, G., Han, Y., Tran, C., Carnahan, M. & Tewfik, A. H. State-of-the-art speech recognition using EEG and towards decoding of speech spectrum from EEG. Preprint at arXiv https://doi.org/10.48550/arXiv.1908.05743 (2019).

Petrosyan, A., Voskoboynikov, A. & Ossadtchi, A. Compact and interpretable architecture for speech decoding from stereotactic EEG. In: 2021 Third International Conference Neurotechnologies and Neurointerfaces (CNN), pp 79–82 (IEEE, 2021).

Krishna, G., Tran, C., Carnahan, M. & Tewfik, A. H. Advancing speech synthesis using EEG. In: 2021 10th International IEEE/EMBS Conference on Neural Engineering (NER), pp 199–204 (IEEE, 2021).

Pasley, B. N. et al. Reconstructing speech from human auditory cortex. PLoS Biol. 10, e1001251 (2012).

Akbari, H., Khalighinejad, B., Herrero, J. L., Mehta, A. D. & Mesgarani, N. Towards reconstructing intelligible speech from the human auditory cortex. Sci. Rep. 9, 874 (2019).

Arjovsky, M., Chintala, S. & Bottou, L. Wasserstein generative adversarial networks. In: International Conference on Machine Learning, pp 214–223 (PMLR, 2017).

Wang, R. et al. Stimulus speech decoding from human cortex with generative adversarial network transfer learning. In: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), pp 390–394 (IEEE, 2020).

Yi, Z., Zhang, H., Tan, P. & Gong, M. Dualgan: Unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2849–2857 (2017).

Guo, Y., Liu, T., Zhang, X., Wang, A. & Wang, W. End-to-end translation of human neural activity to speech with a dual–dual generative adversarial network. Knowl. Based Syst. 277, 110837 (2023).

Senda, J. et al. Auditory stimulus reconstruction from ECoG with DNN and self-attention modules. Biomed. Signal Process. Control 89, 105761 (2024).

Makin, J. G., Moses, D. A. & Chang, E. F. Machine translation of cortical activity to text with an encoder–decoder framework. Nat. Neurosci. 23, 575–582 (2020).

Wang, Z. & Ji, H. Open vocabulary electroencephalography-to-text decoding and zero-shot sentiment classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp 5350–5358 (2022).

Lewis, M. et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 7871–7880 (2020).

Van Den Oord, A. et al. Neural discrete representation learning. Adv. Neural Inf. Process. Syst. 30, 6309–6318 (2017).

Duan, Y., Chau, C., Wang, Z., Wang, Y.-K. & Lin, C.-t. Dewave: discrete encoding of EEG waves for EEG to text translation. Adv. Neural Inf. Process. Syst. 36, 9907–9918 (2024).

Jo, H. et al. Are EEG-to-text models working? Preprint at arXiv https://doi.org/10.48550/arXiv.2405.06459 (2024).

Yang, Y., Duan, Y., Zhang, Q., Xu, R. & Xiong, H. Neuspeech: Decode neural signal as speech. Preprint at arXiv https://doi.org/10.48550/arXiv.2403.01748 (2024).

Yang, Y. et al. MAD: Multi-alignment MEG-to-text decoding. Preprint at arXiv https://doi.org/10.48550/arXiv.2406.01512 (2024).

Chen, X., Du, C., Liu, C., Wang, Y. & He, H. Open-vocabulary auditory neural decoding using fMRI-prompted LLM. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.07840 (2024).

Ye, Z. et al. Generative language reconstruction from brain recordings (2024).

Yin, C., Ye, Z. & Li, P. Language reconstruction with brain predictive coding from fMRI data. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.11597 (2024).

Feng, X., Feng, X., Qin, B. & Liu, T. Aligning semantic in brain and language: a curriculum contrastive method for electroencephalography-to-text generation. IEEE Trans. Neural Syst. Rehabil. Eng. 31, 3874–3883 (2023).

Xi, N. et al. Unicorn: unified cognitive signal reconstruction bridging cognitive signals and human language. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp 13277–13291 (2023).

Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient estimation of word representations in vector space. Preprint at arXiv https://doi.org/10.48550/arXiv.1301.3781 (2013).

Tang, J., LeBel, A., Jain, S. & Huth, A. G. Semantic reconstruction of continuous language from non-invasive brain recordings. Nat. Neurosci. 26, 858–866 (2023).

Zhao, X. et al. Mapguide: a simple yet effective method to reconstruct continuous language from brain activities. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 3822–3832 (2024).

Chen, X. et al. High-speed spelling with a noninvasive brain–computer interface. Proc. Natl Acad. Sci. USA 112, E6058–E6067 (2015).

Metzger, S. L. et al. Generalizable spelling using a speech neuroprosthesis in an individual with severe limb and vocal paralysis. Nat. Commun. 13, 6510 (2022).

Leuthardt, E. C. et al. Using the electrocorticographic speech network to control a brain–computer interface in humans. J. Neural Eng. 8, 036004 (2011).

Willett, F. R., Avansino, D. T., Hochberg, L. R., Henderson, J. M. & Shenoy, K. V. High-performance brain-to-text communication via handwriting. Nature 593, 249–254 (2021).

Brigham, K. & Kumar, B. V. Imagined speech classification with EEG signals for silent communication: a preliminary investigation into synthetic telepathy. In: 2010 4th International Conference on Bioinformatics and Biomedical Engineering, pp 1–4 (IEEE, 2010).

Duraivel, S. et al. High-resolution neural recordings improve the accuracy of speech decoding. Nat. Commun. 14, 6938 (2023).

Wandelt, S. K. et al. Representation of internal speech by single neurons in human supramarginal gyrus. Nat. Hum. Behav. https://api.semanticscholar.org/CorpusID:269759448 (2024).

Chang, E. F. et al. Categorical speech representation in human superior temporal gyrus. Nat. Neurosci. 13, 1428–1432 (2010).

Mesgarani, N., Cheung, C., Johnson, K. & Chang, E. F. Phonetic feature encoding in human superior temporal gyrus. Science 343, 1006–1010 (2014).

Suppes, P., Lu, Z.-L. & Han, B. Brain wave recognition of words. Proc. Natl Acad. Sci. USA 94, 14965–14969 (1997).

Suppes, P., Han, B. & Lu, Z.-L. Brain-wave recognition of sentences. Proc. Natl Acad. Sci. USA 95, 15861–15866 (1998).

D’Zmura, M., Deng, S., Lappas, T., Thorpe, S. & Srinivasan, R. Toward EEG sensing of imagined speech. In Human-Computer Interaction. New Trends: 13th International Conference, HCI International 2009, San Diego, CA, USA, July 19-24, 2009, Proceedings, Part I 13, pp 40–48 (Springer, 2009).

Tankus, A., Fried, I. & Shoham, S. Structured neuronal encoding and decoding of human speech features. Nat. Commun. 3, 1015 (2012).

DaSalla, C. S., Kambara, H., Sato, M. & Koike, Y. Single-trial classification of vowel speech imagery using common spatial patterns. Neural Netw. 22, 1334–1339 (2009).

Wang, L., Zhang, X., Zhong, X. & Zhang, Y. Analysis and classification of speech imagery EEG for BCI. Biomed. Signal Process. Control 8, 901–908 (2013).

Stavisky, S. D. et al. Decoding speech from intracortical multielectrode arrays in dorsal “arm/hand areas” of human motor cortex. In: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp 93–97 (IEEE, 2018).

Pei, X., Barbour, D. L., Leuthardt, E. C. & Schalk, G. Decoding vowels and consonants in spoken and imagined words using electrocorticographic signals in humans. J. Neural Eng. 8, 046028 (2011).

Deng, S., Srinivasan, R., Lappas, T. & D’Zmura, M. EEG classification of imagined syllable rhythm using Hilbert spectrum methods. J. Neural Eng. 7, 046006 (2010).

Kim, J., Lee, S.-K. & Lee, B. EEG classification in a single-trial basis for vowel speech perception using multivariate empirical mode decomposition. J. Neural Eng. 11, 036010 (2014).

Moses, D. A., Mesgarani, N., Leonard, M. K. & Chang, E. F. Neural speech recognition: continuous phoneme decoding using spatiotemporal representations of human cortical activity. J. Neural Eng. 13, 056004 (2016).

Brumberg, J. S., Wright, E. J., Guenther, F. H. & Kennedy, P. R. Classification of intended phoneme production from chronic intracortical microelectrode recordings in speech motor cortex. Front. Neurosci. 5, 7880 (2011).

Kellis, S. et al. Decoding spoken words using local field potentials recorded from the cortical surface. J. Neural Eng. 7, 056007 (2010).

Mugler, E. M. et al. Direct classification of all American English phonemes using signals from functional speech motor cortex. J. Neural Eng. 11, 035015 (2014).

Clayton, J., Wellington, S., Valentini-Botinhao, C. & Watts, O. Decoding imagined, heard, and spoken speech: Classification and regression of EEG using a 14-channel dry-contact mobile headset. In: INTERSPEECH, pp 4886–4890 (2020).

Mohanchandra, K. & Saha, S. A communication paradigm using subvocalized speech: translating brain signals into speech. Augmented Hum. Res. 1, 3 (2016).

Martin, S. et al. Word pair classification during imagined speech using direct brain recordings. Sci. Rep. 6, 25803 (2016).

Nguyen, C. H., Karavas, G. K. & Artemiadis, P. Inferring imagined speech using EEG signals: a new approach using Riemannian manifold features. J. Neural Eng. 15, 016002 (2017).

González-Castañeda, E. F., Torres-García, A. A., Reyes-García, C. A. & Villaseñor-Pineda, L. Sonification and textification: Proposing methods for classifying unspoken words from EEG signals. Biomed. Signal Process. Control 37, 82–91 (2017).

Salama, M., ElSherif, L., Lashin, H. & Gamal, T. Recognition of unspoken words using electrode electroencephalographic signals. In: The Sixth International Conference on Advanced Cognitive Technologies and Applications, pp 51–5 (2014).

Zhao, S. & Rudzicz, F. Classifying phonological categories in imagined and articulated speech. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 992–996 (IEEE, 2015).

Saha, P. & Fels, S. Hierarchical deep feature learning for decoding imagined speech from EEG. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp 10019–10020 (2019).

Dash, D., Ferrari, P. & Wang, J. Decoding imagined and spoken phrases from non-invasive neural (MEG) signals. Front. Neurosci. 14, 490970 (2020).

Herff, C. et al. Brain-to-text: decoding spoken phrases from phone representations in the brain. Front. Neurosci. 8, 141498 (2015).

Moses, D. A. et al. Neuroprosthesis for decoding speech in a paralyzed person with anarthria. N. Engl. J. Med. 385, 217–227 (2021).

Willett, F. R. et al. A high-performance speech neuroprosthesis. Nature 620, 1031–1036 (2023).

Metzger, S. L. et al. A high-performance neuroprosthesis for speech decoding and avatar control. Nature 620, 1037–1046 (2023).

Munteanu, C., Penn, G., Baecker, R., Toms, E. & James, D. Measuring the acceptable word error rate of machine-generated webcast transcripts. In: Ninth International Conference on Spoken Language Processing (Citeseer, 2006).

Sun, P., Anumanchipalli, G. K. & Chang, E. F. Brain2char: a deep architecture for decoding text from brain recordings. J. Neural Eng. 17, 066015 (2020).

Feng, C. et al. A high-performance brain-to-sentence decoder for logosyllabic language (2023).

Silva, A. B. et al. A bilingual speech neuroprosthesis driven by cortical articulatory representations shared between languages. Nat. Biomed. Eng. 8, 977–991 (2024).

Yuan, B. A. & Makin, J. G. Improving speech decoding from ECoG with self-supervised pretraining. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.18639 (2024).

Schneider, S., Baevski, A., Collobert, R. & Auli, M. wav2vec: Unsupervised pre-training for speech recognition. In Proc. Interspeech 2019, pp. 3465–3469 (2019).

Radford, A. et al. Language models are unsupervised multitask learners. OpenAI blog 1, 9 (2019).

Zhang, S. et al. Opt: Open pre-trained transformer language models. Preprint at arXiv https://doi.org/10.48550/arXiv.2205.01068 (2022).

Touvron, H. et al. Llama 2: open foundation and fine-tuned chat models. Preprint at arXiv https://doi.org/10.48550/arXiv.2307.09288 (2023).

Feng, S., Liu, H., Wang, Y. & Wang, Y. Towards an end-to-end framework for invasive brain signal decoding with large language models. In: Interspeech 2024, pp 1495–1499 (2024).

Guenther, F. H. et al. A wireless brain-machine interface for real-time speech synthesis. PloS ONE 4, e8218 (2009).

Chen, X. et al. A neural speech decoding framework leveraging deep learning and speech synthesis. Nat. Mach. Intell. 6, 467–480 (2024).

Anumanchipalli, G. K., Chartier, J. & Chang, E. F. Speech synthesis from neural decoding of spoken sentences. Nature 568, 493–498 (2019).

Krishna, G., Tran, C., Han, Y., Carnahan, M. & Tewfik, A. H. Speech synthesis using EEG. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 1235–1238 (IEEE, 2020).

Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708 (2017).

Angrick, M. et al. Speech synthesis from ECoG using densely connected 3D convolutional neural networks. J. Neural Eng. 16, 036019 (2019).

Bouchard, K. E., Mesgarani, N., Johnson, K. & Chang, E. F. Functional organization of human sensorimotor cortex for speech articulation. Nature 495, 327–332 (2013).

Bocquelet, F., Hueber, T., Girin, L., Badin, P. & Yvert, B. Robust articulatory speech synthesis using deep neural networks for BCI applications. In: Interspeech 2014-15th Annual Conference of the International Speech Communication Association (2014).

Bocquelet, F., Hueber, T., Girin, L., Savariaux, C. & Yvert, B. Real-time control of an articulatory-based speech synthesizer for brain computer interfaces. PLoS Comput. Biol. 12, e1005119 (2016).

Cheung, C., Hamilton, L. S., Johnson, K. & Chang, E. F. The auditory representation of speech sounds in human motor cortex. elife 5, e12577 (2016).

Dichter, B. K., Breshears, J. D., Leonard, M. K. & Chang, E. F. The control of vocal pitch in human laryngeal motor cortex. Cell 174, 21–31 (2018).

Chartier, J., Anumanchipalli, G. K., Johnson, K. & Chang, E. F. Encoding of articulatory kinematic trajectories in human speech sensorimotor cortex. Neuron 98, 1042–1054 (2018).

Mugler, E. M. et al. Differential representation of articulatory gestures and phonemes in precentral and inferior frontal gyri. J. Neurosci. 38, 9803–9813 (2018).

Angrick, M. et al. Real-time synthesis of imagined speech processes from minimally invasive recordings of neural activity. Commun. Biol. 4, 1055 (2021).

Herff, C. et al. Generating natural, intelligible speech from brain activity in motor, premotor, and inferior frontal cortices. Front. Neurosci. 13, 469935 (2019).

Liu, Y. et al. Decoding and synthesizing tonal language speech from brain activity. Sci. Adv. 9, eadh0478 (2023).

Angrick, M. et al. Online speech synthesis using a chronically implanted brain–computer interface in an individual with ALS. Sci. Rep. 14, 9617 (2024).

Ling, J., Wang, Y., Xue, H., Xie, R. & Song, L. Posetalk: text-and-audio-based pose control and motion refinement for one-shot talking head generation. Preprint at arXiv https://doi.org/10.48550/arXiv.2409.02657 (2024).

Song, H. et al. Continuous neural control of a bionic limb restores biomimetic gait after amputation. Nat. Med. 30, 2010–2019 (2024).

Wang, J. et al. Neural correlate and movement decoding of simultaneous-and-sequential bimanual movements using EEG signals. IEEE Trans. Neural Syst. Rehabil. Eng. 32, 2087–2095 (2024).

Luo, S. et al. Stable decoding from a speech BCI enables control for an individual with ALS without recalibration for 3 months. Adv. Sci. https://api.semanticscholar.org/CorpusID:264448311 (2023).

Wang, S., Liu, S., Tan, Z. & Wang, X. Mindbridge: A cross-subject brain decoding framework. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 11333–11342 (2024).

Quan, R., Wang, W., Tian, Z., Ma, F. & Yang, Y. Psychometry: An omnifit model for image reconstruction from human brain activity. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 233–243 (2024).

Liu, X. et al. EEG2Video: Towards decoding dynamic visual perception from EEG signals. In: The Thirty-Eighth Annual Conference on Neural Information Processing Systems (2024).

van Stuijvenberg, O. C., Broekman, M. L., Wolff, S. E., Bredenoord, A. L. & Jongsma, K. R. Developer perspectives on the ethics of AI-driven neural implants: a qualitative study. Sci. Rep. 14, 7880 (2024).

Yuste, R. Advocating for neurodata privacy and neurotechnology regulation. Nat. Protoc. 18, 2869–2875 (2023).

Wang, C. et al. Brainbert: Self-supervised representation learning for intracranial recordings. In: The Eleventh International Conference on Learning Representations.

Zheng, H. et al. Du-IN: Discrete units-guided mask modeling for decoding speech from intracranial neural signals. In: The Thirty-Eighth Annual Conference on Neural Information Processing Systems https://openreview.net/forum?id=uyLtEFnpQP (2024).

Dong, Y. et al. An approach for EEG denoising based on Wasserstein generative adversarial network. IEEE Trans. Neural Syst. Rehabil. Eng. 31, 3524–3534 (2023).

Huang, X., Li, C., Liu, A., Qian, R. & Chen, X. Eegdfus: a conditional diffusion model for fine-grained EEG denoising. IEEE J. Biomed. Health Inform. 29, 2557–2569 (2024).

Antonello, R., Sarma, N., Tang, J., Song, J. & Huth, A. How many bytes can you take out of brain-to-text decoding? Preprint at arXiv https://doi.org/10.48550/arXiv.2405.14055 (2024).

Han, J. et al. Onellm: One framework to align all modalities with language. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 26584–26595 (2024).

Tankus, A., Rosenberg, N., Ben-Hamo, O., Stern, E. & Strauss, I. Machine learning decoding of single neurons in the thalamus for speech brain-machine interfaces. J. Neural Eng. 21, 036009 (2024).

Feinberg, D. A. et al. Next-generation MRI scanner designed for ultra-high-resolution human brain imaging at 7 Tesla. Nat. Methods 20, 2048–2057 (2023).

Tatulian, S. A. Challenges and hopes for Alzheimer’s disease. Drug Discov. Today https://api.semanticscholar.org/CorpusID:246553676 (2022).

Bucur, M. & Papagno, C. Deep brain stimulation in Parkinson disease: a meta-analysis of the long-term neuropsychological outcomes. Neuropsychol. Rev. https://api.semanticscholar.org/CorpusID:247615265 (2022).
