The NIH BRAIN NeuroAI 2024 Workshop (held on November 12 and 13) offered a partial view of the AI↔BT relationship. Around half the talks (Table 1) are discussed here, with the order changed to improve narrative flow. Brief notes on what a speaker said are indented. No one meeting can cover all aspects of AI↔BT, but the lacunae noted below provide useful entry points for the AI↔BT conversation.
3.1 Neuromorphic computing
An important theme was the development of neuromorphic circuitry to expand the capabilities and reduce the power demands of neural computations (recall E1).
Brad Aimone presented an implementation that supports over a million CMOS neurons and billions of synapses, roughly the numbers found in a parrot or small primate brain. Neurons are simulated using a leaky integrate-and-fire model with no neuromodulation. In the future, “post-Moore devices” (electrochemical random-access memory, memristors, circuits that utilize magnetic tunnel junctions, and various optical and organic devices, etc.) may scale to human brain sizes.
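The leaky integrate-and-fire model is simple enough to sketch in a few lines (an illustrative single-neuron simulation of my own, not Aimone’s CMOS implementation; all parameter values are arbitrary):

```python
import numpy as np

def simulate_lif(input_current, dt=1e-3, tau=0.02,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0, r_m=1.0):
    """One leaky integrate-and-fire neuron:
    tau * dV/dt = -(V - v_rest) + r_m * I(t); spike and reset at threshold.
    No refractory period and (as in the talk) no neuromodulation."""
    v = v_rest
    spike_times = []
    for t, i_t in enumerate(input_current):
        v += dt / tau * (-(v - v_rest) + r_m * i_t)
        if v >= v_thresh:
            spike_times.append(t)
            v = v_reset
    return spike_times

# A constant suprathreshold current yields regular spiking.
spikes = simulate_lif(np.full(1000, 1.5))
```

The point of such a reduced model is that each neuron costs only one state variable per time step, which is what makes million-neuron implementations feasible.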
Aimone also discussed modelling circuitry in Drosophila (see § 3.2).
Kwabena Boahen addressed the challenge of scaling processing from 2D chips to 3D brains and outlined an approach to neuromorphic computation based on considering the brain’s fundamental unit of computation to be dendrocentric learning with sequence detectors (Boahen 2022).
Indeed, much complexity of computation in biological neurons is mediated by the arrangement of inputs on different parts of the dendritic tree (see Poirazi in § 3.2), an important aspect of subneural neuromorphic computation.
Ralph Etienne-Cummings reported on the 2024 Workshop on Neuromorphic Principles in Biomedicine and Healthcare, addressing challenges of creating a new generation of biomedical and neurotechnologies that operate with extreme energy and data efficiency, adaptability, and performance advantages. They also noted the need for new electronics materials that better interface with tissue and possess stretchability and conformability, biocompatibility, self-healing capabilities, and low immune response. In some sense, the task involves reverse engineering the brain (Cauwenberghs 2013), but speakers cautioned against too strict an adherence to biological mimicry, suggesting a broader “physiomorphic” framework. [My italics]
Different properties are relevant for implantable or wearable neuromorphic circuitry as a tool for healthcare than for free-standing AI agents. A basic question, reiterated in § 4.2, is “at what level should the ‘neural circuit equivalent’ be designed or adapt?”
3.2 Subneural computation and learning
Diverse areas of the human brain employ different forms of synaptic plasticity (recall E2), and the brain comprises more than neurons. BT relates diverse forms of neuromodulation and synaptic plasticity to the role of different brain regions in supporting different “psychological-level” styles of learning and memory (Caligiore et al. 2019; Doya 2000) including episodic memory (based in part on the hippocampus), procedural memory (integrating cerebral, cerebellar, midbrain and even spinal mechanisms) and reinforcement learning (where the role of dopamine in learning within the basal ganglia is the classic example). Lifelong learning provides a related challenge:
Dhireesha Kudithipudi stressed the importance of machines that exhibit lifelong learning without catastrophic forgetting (van de Ven et al. 2024). In synaptic consolidation, newly formed synaptic connections become stable and integrated into the network over time, making the synapses more enduring and less prone to disruption. She argued that neuromorphic hardware that integrates probabilistic switching and the inherent variability of non-volatile memory can represent plasticity mechanisms through fine-grained reconfigurability units within the memory.
Kudithipudi’s talk suggests the importance of deeper analysis of diverse memory mechanisms, even at the cognitive level, as a target for AI-BT cooperation in shaping neural computing. Episodic memory encodes new episodes in which the agent is involved. One hypothesis is that some of these memories, first formed in the hippocampus, become consolidated in neocortical circuits (McClelland et al. 1995; Squire and Alvarez 1995). Another is that an increasing stock of episodic memories is supported in part by neurogenesis in the hippocampus (Aimone et al. 2009), suggesting that episodic memory might demand the use of growing circuitry. Could neuromorphic computing address this other than by deploying larger and larger portions of a prestructured network? For semantic memory, the registration of contradictions may initiate overwriting of one “fact” by “another.” In procedural learning, not only can new skills be mastered, but old skills can be honed by continuing practice, a long-standing application area for ANNs in adaptive motor control (Albus 1975).
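The stabilization of important synapses that Kudithipudi invokes has a familiar software analogue in continual learning: a consolidation penalty that makes important weights resistant to disruption by new tasks. The sketch below follows the general logic of elastic-weight-consolidation-style regularization; the code and its parameters are my own illustration, not a mechanism from the talk:

```python
import numpy as np

def consolidation_penalty(weights, anchor, importance, strength=1.0):
    """Quadratic cost for moving consolidated synapses away from their
    anchored values, scaled by their estimated importance to old tasks."""
    return strength * np.sum(importance * (weights - anchor) ** 2)

def consolidated_update(weights, task_grad, anchor, importance,
                        lr=0.01, strength=1.0):
    """One gradient step on a new task plus the penalty's gradient:
    important synapses become 'more enduring and less prone to disruption.'"""
    penalty_grad = 2.0 * strength * importance * (weights - anchor)
    return weights - lr * (task_grad + penalty_grad)

# A new task's gradient pulls both weights downward; the consolidated
# (important) weight resists while the unimportant one drifts freely.
w = np.array([1.0, 1.0])
anchor = w.copy()
importance = np.array([10.0, 0.0])
task_grad = np.array([1.0, 1.0])
for _ in range(50):
    w = consolidated_update(w, task_grad, anchor, importance)
```

In neuromorphic hardware, the analogue of the importance vector would be the fine-grained reconfigurability of the memory cells themselves.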
Two speakers moved beyond E2’s concern with artificial synaptic networks.
Wolfgang Losert introduced biocomputing with astrocytes as carriers of analog information and as enablers of slow integrative processing of information in neural networks.
This talk prompts us to ask whether one should consider “neuron + astrocyte” as a unit for biological NNs.
Panayiota Poirazi reviewed specific dendritic structures that empower brain function. They not only segregate the inputs to a neuron, and thus support differential plasticity of existing synapses, but also influence where new synapses may form, e.g., in clusters. She then discussed the challenges of extending both the neuroscience and the technology to advance dendrite-inspired computing (Pagkalos et al. 2024).
This suggests that dendritic compartments might become units for neuromorphic computing. Recall the original insights of Rall (1964) and the rich array of biological neural modeling on “conventional” computers using, e.g., NEURON, and note Boahen’s concern with neuromorphic circuitry that supports dendrocentric learning. The second property, synaptogenesis (and apoptosis), seems (to me) less tractable for neuromorphic computing.
William Nourse offered examples of “insect intelligence” (e.g., cooperative behavior of ants forming a formic bridge to cross a gap) but no analysis of how neural networks might support such behavior. Rather, he focused on the work of Hulse et al. (2021) on “A connectome of the Drosophila central complex reveals network motifs suitable for flexible navigation and context-dependent action selection,” but his example of a “fruit fly robot” was based on fly leg biomechanics (cf. Hartmann, below), not details of the fly nervous system (see footnote 2).
Nourse added that there is no complete whole-nervous-system connectome — elements are missing, such as electrical/gap junctions and the fibers between sensory systems and the ventral nerve cord. If we had them, would it be easier to model the collective behaviors of insects? Ant bridge building or termite mound building (Turner 2011) can be understood without invoking neural circuitry, but by invoking relatively simple rules, as studied in “swarm intelligence.”
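To make the contrast with neural-circuit modeling concrete, such “relatively simple rules” can be sketched as a toy pheromone model of the classic two-bridge ant experiment (an illustration of my own; all parameters are arbitrary):

```python
import random

def double_bridge(n_ants=3000, lengths=(1.0, 3.0), evaporation=0.01, seed=0):
    """Toy ant-colony rule set: each ant picks a branch with probability
    proportional to its pheromone level, then deposits pheromone in inverse
    proportion to branch length (shorter round trips reinforce faster).
    Collective 'choice' of the short branch emerges with no neural model."""
    rng = random.Random(seed)
    pheromone = [1.0, 1.0]
    for _ in range(n_ants):
        p_short = pheromone[0] / (pheromone[0] + pheromone[1])
        branch = 0 if rng.random() < p_short else 1
        pheromone[branch] += 1.0 / lengths[branch]
        pheromone[0] *= 1.0 - evaporation
        pheromone[1] *= 1.0 - evaporation
    return pheromone

final = double_bridge()  # pheromone accumulates on the shorter branch
```

Each agent here follows two lines of local rules, yet the colony “decides” — exactly the level of description at which no connectome is needed.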
Aimone also mentioned that Shiu et al. (2024) exploit the recent central brain connectome of adult Drosophila melanogaster, containing more than 125,000 neurons and 50 million synaptic connections, to model the entire Drosophila brain on the basis of neural connectivity and neurotransmitter identity. The model accurately reproduces the activation of gustatory neurons and of a small set of neurons comprising the antennal grooming circuit. Aimone claims that the brain likely only computes what the brain is efficient at computing, and that neuromorphics should have similar restrictions.
Will more expressive neuromorphic circuitry involve incorporating more and more subneural mechanisms (compare the synaptic networks of E2 and the dendritic computing just discussed)? Note that certain neurons (whether in Drosophila or human) are so complex that the circuitry required for the simulation of one such neuron may prove as complex as the circuitry now used to simulate many current artificial neurons. Alternatively, will it gain economy by discovering places where, instead of representing a single neuron, a subcircuit can represent a pool of neurons or larger structural elements like columns (Hubel et al. 1978; Mountcastle 1997; Powell and Mountcastle 1959) or functional units like schemas (as explained in T2 of § 4)?
Moreover, if neuromorphics should only compute what the brain is efficient at computing, what are we to learn from the neural architecture of Drosophila versus the very different architectures of different regions within a mammalian brain? Conversely, consider the way the human brain supports diverse “virtual machines”: the combination of cultural evolution and development equips humans to do many things the brain did not originally evolve for. Modeling all connectome-specified neural connections is very different from starting with a “standard” layered (possibly recurrent) ANN.
Between them, these talks raise the issue of when and whether “we” need all the details of biological neurons, whether in an AI application or in modeling a particular set of phenomena in BT. When we do, what biological details should be included?
3.3 Mammalian cortex and digital twins
With this we turn to simulations of systems akin to regions of mammalian cortex.
Patrick Mineault proposed developing digital twins and foundation models that enable in silico experimentation and hypothesis generation to better understand perception, cognition, and behavior.
Andreas Tolias notes the new neurotechnologies that now allow neurophysiologists to collect brain data at unprecedented scale and precision, while AI provides tools that can ingest vast, complex data to make predictions. With this, digital twins of the brain can support unlimited in silico experiments and the application of AI interpretability tools, enhancing our understanding of neural computations.
The term “digital twins” here seems to be simply a relabelling of the well-established notion of a computational model of some aspect of brain function. Disturbingly, previous work on modelling the diversity of brain systems was substantially ignored in this meeting. Moreover, approximating the task-dependent behaviour of a region using an ANN whose structure is unconstrained seems misleading when the goal is to use BT to advance our understanding of brain structure and function — though more is being learned when the outputs being considered include data on the firing patterns of neurons during certain tasks.
Anton Arkhipov argued that the next decade will combine dense reconstructions of the circuitry and neural activity across whole brains (in the mouse) or large portions of the brain (in non-human primate and human brain tissue) with bio-realistic modeling of these reconstructed circuits. Experimentally, this will leverage electron and light microscopy, expansion microscopy, spatial transcriptomics, large-scale optical- and electrophysiology, and associated AI tools.
The wealth of new methods Arkhipov mentions will certainly open up new findings in neuroscience. But the map is not the territory (Borges 1975). If we develop “overly isomorphic” digital twins, no matter how well trained by AI/ML techniques, have we gained understanding of “how the brain works”? See § 4.2.
Doris Tsao studied activity of macaque neurons in inferotemporal cortex during face recognition and discovered that cells in patches represent facial identity along two “axes” coding shape (an “axis” with 25 coordinates) and appearance (the other “axis,” 25 coordinates).
Tsao’s work on inferotemporal cortex, though limited to a very specific task of the visual system, offered the meeting’s most impressive demonstration of linking neurotechnology and machine learning to make progress in neuroscience. Moreover, she was the only speaker to stress the importance of insights from cognitive science, briefly citing concepts for vision science of surface representation, object files, and the recognition of equivalence under, for example, rotation.
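The axis code itself is linear enough to caricature in a few lines (a toy sketch of my own, with a hypothetical population size; not Tsao’s analysis pipeline): each cell’s firing is modeled as the projection of a face’s 50 shape-and-appearance coordinates onto that cell’s preferred axis, from which the face code can be recovered linearly:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 50   # 25 "shape" + 25 "appearance" coordinates
n_cells = 200     # hypothetical population size

# Each cell's preferred axis in the 50-d face space.
axes = rng.standard_normal((n_cells, n_features))

def cell_responses(face_code):
    """Firing of each cell = projection of the face code onto its axis."""
    return axes @ face_code

def decode_face(responses):
    """Linear least-squares reconstruction of the 50-d face code."""
    return np.linalg.lstsq(axes, responses, rcond=None)[0]

face = rng.standard_normal(n_features)
recovered = decode_face(cell_responses(face))
```

With more cells than axes, such a linear code is invertible, which is what makes reconstruction of face identity from population responses possible in principle.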
3.4 Mathematical modeling
A different path to understanding the brain is offered by mathematical results that provide qualitative analyses of neurobehavioral phenomena, while laying the foundation for computational models that can address numerical details.
Carina Curto noted that without the notions of eigenvectors and eigenvalues for linear systems, many insights into system performance would be unobtainable. We cannot expect a similar handle for nonlinear systems in general, but threshold-linear networks are a promising subclass in which graph structure can be connected to network dynamics, as in the example of encoding multiple gaits in a recurrent network.
Threshold-linear networks were developed, for example, by Shun-Ichi Amari and applied to winner-take-all networks and stereopsis in an early paper on dynamic fields (Amari and Arbib 1977).
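The class of networks Curto considers can be written as dx/dt = -x + [Wx + b]_+, with [·]_+ denoting rectification. A toy winner-take-all instance (my own illustration, echoing the Amari-Arbib example; parameters are arbitrary) shows the qualitative behavior such analyses aim to predict:

```python
import numpy as np

def simulate_tln(W, b, x0, dt=0.01, steps=5000):
    """Euler integration of dx/dt = -x + [W x + b]_+ (threshold-linear)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * (-x + np.maximum(W @ x + b, 0.0))
    return x

# Two mutually inhibitory units form a winner-take-all circuit:
# the unit with the larger external input suppresses the other.
W = np.array([[0.0, -2.0],
              [-2.0, 0.0]])
b = np.array([1.0, 0.8])
x = simulate_tln(W, b, [0.1, 0.1])
```

The appeal of the class is that which units end up above threshold, and hence which linear regime governs the fixed point, can be read off from the graph of inhibitory connections.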
SueYeon Chung argued that deciphering the geometry of neural manifolds and its computational role is crucial to understanding the emergence of intelligence, by supplying key metrics for interpreting the structure of internal representations in brains and AI systems. Studies briefly cited included the geometry of manifolds in deep neural networks for vision, for hierarchy in language, and for invariant speech recognition, as well as manifolds related to mouse hippocampus (navigation), monkey motor cortex (reaching), and even the geometry of social learning.
Chung’s application of the notion of neural manifolds to model various specific systems is important, but we need to better understand how these new models relate to earlier models and whether the methodology does extend to support the “emergence of intelligence.” Crucially, though, mathematical results may suggest constraints on the structure of digital twins, specifying parameters for which machine learning may offer new insights.
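One example of the kind of metric at issue (the participation ratio, a standard measure of effective dimensionality; choosing it here is my illustration, not necessarily Chung’s) can be computed directly from population activity:

```python
import numpy as np

def participation_ratio(responses):
    """Effective dimensionality of population activity (trials x neurons):
    PR = (sum of covariance eigenvalues)^2 / (sum of squared eigenvalues),
    ranging from 1 (activity on a line) to n_neurons (isotropic)."""
    eig = np.linalg.eigvalsh(np.cov(responses, rowvar=False))
    return eig.sum() ** 2 / np.sum(eig ** 2)

rng = np.random.default_rng(0)
# 100 'neurons' whose activity is confined to a 3-d latent subspace.
latent = rng.standard_normal((500, 3))
mixing = rng.standard_normal((3, 100))
pr = participation_ratio(latent @ mixing)  # near the latent dimension 3, far below 100
```

Low effective dimensionality despite many recorded neurons is precisely what makes the manifold picture attractive as a bridge between brains and ANNs.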
3.5 Biomechanics and motor control
Action and motor control are highly relevant to BT since brains evolved to enable animals to survive by perceiving the world not as an end in itself but to enable choices as to what to do next, while building up memories that might guide future action. Some attention was paid to the visual systems of flies and primates, but motor control was almost completely ignored in the Workshop. One of the few exceptions was Hartmann’s talk:
Mitra Hartmann stressed that understanding neural function in organisms will ultimately require integrating accurate biomechanical models of sensors and muscles with neurophysiological data. She exemplified this in her study of rodent whisking, a system that operates in service of perceiving the environment. She paid particular attention to the details of neural innervation of each whisker, and the biomechanics of the whisker and its associated musculature. Her study of the biomechanical sensor was complemented by a model of sensing shape by whisking.
Rodents using whisking to perceive the shape of the environment have long been a target for studies of both animal behavior (ethology) and robot design (Mitchinson and Prescott 2013; Prescott and Wilson 2023). Such studies raise the issue of which mechanoreceptor details really matter for NeuroAI or robots, as distinct from neurobiology. Again, we see the tension between “understanding brains” and “building machines that can emulate (more or less) a ‘useful’ brain function,” whether or not AI is involved.
3.6 Intelligence and language
Surprisingly for a meeting with a concern for AI, none of the talks analyzed the notion of intelligence and how it might best be understood for non-human animals and machines. Nourse spoke of “insect intelligence” but did not define it. The closest the Workshop came to the notion of human intelligence was the discussion of LLMs and language, but no attempt was made to assess how the human brain differs from other brains in ways that make human intelligence distinctive.
Evelina Fedorenko sees animal studies as useful in relation to some aspects of human cognition but not for language. She defines the language system as a network of left frontal and temporal areas in the human brain that supports language by retrieving words from memory and building syntactic structures in the service of semantic composition (Shain et al. 2024). She dissociates it from systems for reasoning and does not include Wernicke’s and Broca’s classical areas for articulation and speech perception because “they are insensitive to what language-like input is given.” The link to AI is that “a new candidate model organism has emerged, albeit not a biological one, for the study of language —large language models (LLMs).” These models exhibit human-level performance on diverse language tasks, and she stated that their internal representations are similar to the representations in the human brain when processing the same linguistic inputs (Schrimpf et al. 2021).
There is no doubt that AI has been greatly influenced by the remarkable performance exhibited by LLMs, but I have reservations about Fedorenko’s program. First, she dismisses the relevance of animal studies to language, whereas much is to be learned by exploring evolutionary pathways from the posited manual abilities of our last common ancestor with the great apes and the comparative neuroethology of humans and extant primates (Arbib 2005b, 2020). A different concern is that the complete input sequence (the prompt) to an LLM is maintained in a buffer as the output is produced through iterations that each generate one word, with the output-so-far stored in its own buffer. No internal working memory of an overall plan for what is to be said is maintained between iterations, though it might be claimed that such information is recomputed at each cycle. Such issues may matter for more economical AI computation, while also casting doubt on LLMs as a basis for understanding neural correlates such as the generation of the classic ERP data on language processing.
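The decoding loop at issue can be sketched schematically (a generic autoregressive loop of my own, not any particular LLM’s implementation): the only state that persists across iterations is the growing token buffer, so any “plan” must be re-derived from it at each step:

```python
def generate(model, prompt_tokens, max_new_tokens, end_token):
    """Autoregressive decoding: each iteration recomputes everything from
    the full buffer (prompt + output-so-far) and appends one token; no
    separate working memory of an overall plan persists between steps."""
    buffer = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = model(buffer)  # all 'state' re-derived from the buffer
        buffer.append(next_token)
        if next_token == end_token:
            break
    return buffer[len(prompt_tokens):]

# A trivial stand-in 'model' that just counts upward until the end token.
toy_model = lambda buf: buf[-1] + 1
out = generate(toy_model, [0], max_new_tokens=10, end_token=3)
```

Contrast this with the brain, where a sentence plan presumably persists in working memory across the production of successive words rather than being recomputed from the utterance-so-far.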
3.7 Biomedical applications
Ironically for an NIH meeting, there was almost no discussion of the medical relevance of NeuroAI. One exception concerned neurosurgery:
Kai Miller assessed how AI may assist functional neurosurgeons in identifying structure in biological measurements, documentation and chart synthesis, and clinical prediction, and especially in the development of closed-loop devices. For the latter, the aim will be to match the measurement scale of implanted devices to the physical scale of the neurophysiological feature (embodied measurement hardware), and then to implement neurologically-inspired algorithms to match the natural statistics and dynamic variation of brain circuitry (neuromorphic computing). He stresses the difference between treating the symptoms of a disease and addressing the circuit dysfunctions underlying the disease (and here gene therapy and drugs may be complemented by AI-improved brain-machine interfaces) and notes that a clinical need may not be the same thing as what is scientifically compelling.
Miller charts possible roles of machine learning in the cycle of patient care. He suggests the need to describe patients for neurosurgery at a level that relates the current patient to earlier patients with known care and outcome. This points to the possible relevance of LLM-type expert systems to aid diagnosis. Understanding data on brain imaging and ERP may need modeling to probe how such measures relate to the underlying neural circuitry (Arbib et al. 1995; Barrès et al. 2013; Horwitz et al. 2005).
Giacomo Indiveri addressed the limitation of current AI technologies when it comes to brain-machine interfaces, requiring real-time interaction with the nervous system. Since both wearable and implantable neural interfaces need to operate continuously for tasks such as real-time anomaly detection, they require extremely low power consumption. He advocates analog neuromorphic electronic circuits and mixed-signal neuromorphic processing systems to minimize power consumption while engaging in continuous dialog with signals produced by neurons in a living (human) brain.
Note that the machine in a brain-machine interface is not a digital twin. Rather it must build an internal model of some aspect of neural function adequate to “hold a conversation” with some portions of the brain to provide signals that will maintain the function within desired limits. To this end, Indiveri uses populations of neurons, averages over space and time, and employs negative feedback, adaptation, and learning mechanisms. His message is that thinking at the level of networks may be more economical (and, I add, more “task-relevant”) than importing all the neuron details.
