NeuRehab cuts power and latency by over 60% in robotic rehabilitation with reinforcement learning

Robotic rehabilitation holds great promise in helping patients recover from stroke, but current systems often lack the adaptability needed to respond to an individual's progress and to operate effectively outside of the clinical setting. To address these limitations, Phani Pavan Kambhampati, Chainesh Gautam, Jagan Palaniswamy, and Madhav Rao from the International Institute of Information Technology, Bangalore introduce NeuRehab, a novel framework that intelligently automates rehabilitation exercises. The system combines reinforcement learning with ultra-low-power spiking neural networks and distributes computation between mobile and fixed hardware to optimize both performance and energy efficiency. NeuRehab achieves significant improvements over existing approaches, reducing both power consumption and control latency by more than 60% while achieving comparable exercise performance, paving the way for more effective and accessible rehabilitation therapies.

Spiking network enables efficient robot control

This paper presents research focused on applying spiking neural networks (SNNs) to reinforcement learning for robot control and rehabilitation applications. The overarching goal is to develop energy-efficient, low-latency control systems that have the potential to be deployed on neuromorphic hardware. This study investigates multiple strategies for training SNNs, converting deep reinforcement learning (DRL) policies into spiking models, and optimizing SNN architectures to achieve reliable performance in real-time robotic scenarios, especially in medical and rehabilitation settings.

The main application highlighted in this document is robot control for rehabilitation, especially for assisting stroke patients. The proposed system aims to support motor training of both upper and lower limbs using a robotic arm. SNNs are being investigated as a means of processing electromyographic (EMG) signals, allowing the patient's intended movements to be decoded and translated into robot movements. The focus is on adaptability and individualization, allowing the system to be tailored to individual patient needs and rehabilitation progression.

Training SNNs using reinforcement learning is recognized as a major challenge, because the non-differentiable nature of spike events complicates traditional gradient-based optimization. To address this, the research considers several approaches, including surrogate gradient learning, which uses differentiable approximations of the spike function, and direct training methods that avoid conversion from artificial neural networks (ANNs). Evolutionary algorithms have also been considered as alternative training strategies, and temporal coding schemes have been investigated to encode information in spike timing rather than firing rate.
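To make the surrogate gradient idea concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the fast-sigmoid surrogate, the function names, and the threshold and slope values are assumptions chosen for illustration. The forward pass keeps the hard spike, while the backward pass substitutes a smooth stand-in for the step function's derivative.

```python
def spike_forward(v, threshold=1.0):
    """Forward pass: hard step, emit a spike (1.0) once the potential reaches threshold."""
    return 1.0 if v >= threshold else 0.0


def spike_surrogate_grad(v, threshold=1.0, slope=10.0):
    """Backward pass: fast-sigmoid surrogate used instead of the step's true derivative.

    The true derivative is zero everywhere except at the threshold, so it carries
    no learning signal. The surrogate peaks at the threshold and decays smoothly.
    `slope` is an illustrative hyperparameter, not a value from the paper.
    """
    return 1.0 / (1.0 + slope * abs(v - threshold)) ** 2


if __name__ == "__main__":
    for v in (0.0, 0.9, 1.0, 1.5):
        print(v, spike_forward(v), spike_surrogate_grad(v))
```

In a deep learning framework the same pair would typically be wrapped in a custom autograd function, so that the optimizer sees the surrogate while the network still emits binary spikes.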

Another important focus of the research is the conversion of existing DRL policies into SNNs. This approach leverages the maturity and robustness of established DRL algorithms by first training policies using traditional ANNs and then converting them to spiking equivalents. To ensure stable and effective SNN behavior after conversion, various conversion techniques are considered, such as direct weight mapping, rate coding, temporal coding, and normalization and scaling techniques.
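The link between rate coding and ANN activations can be sketched with a single integrate-and-fire neuron. The code below is an illustration under assumptions (the soft reset, constant input, and function names are not from the paper): driven by a constant input, the neuron's firing rate approximates a ReLU activation clipped at 1, which is why conversion methods normalize weights so that activations stay below the threshold.

```python
def relu(x):
    """Standard ANN activation that the spiking neuron is meant to approximate."""
    return max(0.0, x)


def if_neuron_rate(current, timesteps=100, threshold=1.0):
    """Firing rate of an integrate-and-fire neuron driven by a constant input."""
    v = 0.0
    spikes = 0
    for _ in range(timesteps):
        v += current              # integrate the input each time step
        if v >= threshold:
            spikes += 1
            v -= threshold        # soft reset keeps the residual charge
    return spikes / timesteps


if __name__ == "__main__":
    for x in (-0.5, 0.25, 0.3, 2.0):
        # Inputs above the threshold saturate at a rate of 1.0, which is
        # what normalization and scaling are meant to prevent.
        print(x, relu(x), if_neuron_rate(x))
```

Fewer time steps make the rate estimate coarser, which is the accuracy-versus-latency trade-off that conversion techniques have to manage.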

This paper also describes SNN optimization techniques and architectural innovations aimed at improving efficiency and responsiveness. These include methods such as SLAYER to improve error assignment during training, temporal pruning to reduce unnecessary spikes and latency, and tuning of neuron leakage and firing thresholds. Early-termination network designs that allow predictions to be made before all layers have been processed have also been explored, and precharging of the membrane potential has been proposed as a way to further reduce response time.
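The effect of leak, threshold, and membrane precharging on response time can be illustrated with a toy leaky integrate-and-fire neuron. This is a sketch under assumed parameter values, not the paper's model: it counts how many time steps pass before the first spike.

```python
def steps_to_first_spike(current, leak=0.9, threshold=1.0, v0=0.0, max_steps=1000):
    """Number of time steps until a leaky integrate-and-fire neuron first spikes.

    `v0` is the initial (precharged) membrane potential. Returns None if the
    neuron never reaches the threshold within `max_steps`.
    """
    v = v0
    for t in range(1, max_steps + 1):
        v = leak * v + current    # leak, then integrate the input
        if v >= threshold:
            return t
    return None


if __name__ == "__main__":
    print(steps_to_first_spike(0.2))                 # 7: starting from rest
    print(steps_to_first_spike(0.2, v0=0.5))         # 4: precharged membrane
    print(steps_to_first_spike(0.2, threshold=0.8))  # 5: lower firing threshold
    print(steps_to_first_spike(0.05))                # None: leak outweighs input
```

With these illustrative values, precharging or lowering the threshold shortens the time to first spike, while an input that is too weak relative to the leak never produces a spike at all.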

Energy efficiency and low latency are the central motivations throughout the research. SNNs have attracted attention for their potential to significantly reduce power consumption, especially when implemented on neuromorphic hardware, while meeting the real-time requirements of robot control systems. The paper references tools and frameworks such as Gymnasium, CleanRL, PyTorch, SpyTorch, and SEENN to support experimentation and implementation.

Overall, this study shows that SNNs can be effectively applied to robot control and rehabilitation tasks, and that converting DRL policies into spiking models is a non-trivial but viable approach. While promising results are reported in terms of efficiency and latency, this paper also acknowledges limitations, such as the difficulty in training SNNs, the need for more scalable and robust training methods, and the need for further evaluation on real-world robotic systems. Future work will be directed towards improving the SNN architecture, exploring alternative coding schemes, and validating its performance in real deployment scenarios.

NeuRehab framework for wheelchair robotic rehabilitation

The research team designed NeuRehab, a novel framework for robotic rehabilitation that separates compute-intensive training from low-power, on-device inference, specifically for wheelchair-based exoskeleton systems. The system is built on the existing XoRehab platform and introduces a purpose-built hardware and software architecture that enables efficient learning and deployment of control policies. The key innovation lies in the division of labor: the docking station incorporates a powerful GPU to fine-tune the policy model using data collected from the wheelchair's edge devices, while the wheelchair itself uses a neuromorphic chip to execute the learned policies. To facilitate safe development and thorough analysis, the researchers designed two custom simulation environments, the Kinematic Environment (KENV) and the Dynamic Environment (DENV), both compliant with the Gym API. KENV models the stepper motor's discrete motion profile and its associated delays, while DENV simulates the pendulum-like physics of a torque drive that incorporates patient interaction.
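As a rough picture of what a Gym-style kinematic environment looks like, the sketch below mimics the Gymnasium `reset` and `step` signatures without importing the library. It is not KENV itself: the class name, step size, target angle, and the distance-based reward are all hypothetical, and motor delays are left out for brevity.

```python
class StepperJointEnv:
    """Toy kinematic joint: every action moves the joint by whole motor steps."""

    def __init__(self, step_deg=0.5, max_steps_per_action=4, target_deg=30.0, horizon=200):
        self.step_deg = step_deg                      # joint angle per motor step (assumed)
        self.max_steps_per_action = max_steps_per_action
        self.target = target_deg                      # hypothetical exercise target
        self.horizon = horizon
        self.angle = 0.0
        self.t = 0

    def reset(self, seed=None, options=None):
        """Return (observation, info), following the Gymnasium convention."""
        self.angle = 0.0
        self.t = 0
        return (self.angle, self.target), {}

    def step(self, action):
        """Return (observation, reward, terminated, truncated, info)."""
        a = max(-1.0, min(1.0, float(action)))        # clip action to [-1, 1]
        n = round(a * self.max_steps_per_action)      # quantize to whole motor steps
        self.angle += n * self.step_deg
        self.t += 1
        error = abs(self.target - self.angle)
        reward = -error                               # placeholder reward: distance to target
        terminated = error < self.step_deg / 2
        truncated = self.t >= self.horizon
        return (self.angle, self.target), reward, terminated, truncated, {"motor_steps": n}


if __name__ == "__main__":
    env = StepperJointEnv()
    obs, info = env.reset()
    done = False
    while not done:
        obs, reward, terminated, truncated, info = env.step(1.0)
        done = terminated or truncated
    print(obs, env.t)
```

The quantization in `step` is the point of the example: small actions round to zero motor steps, so a policy trained on such an environment has to cope with discrete rather than continuous motion.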

Both environments provide clinically relevant observations such as joint angles, velocities, patient torques, and strains, and employ reward conditions that penalize excessive force, sudden movements, and misalignment between patient and device. The team implemented Soft Actor-Critic (SAC) as the baseline reinforcement learning method and introduced a heterogeneous training scheme called Hybrid SAC (HSAC). In HSAC, the policy, or actor, is implemented as a spiking neural network (SNN) so that it remains compatible with neuromorphic hardware, while the critic remains an artificial neural network (ANN) to maintain high-precision value estimation for learning stability. Furthermore, the researchers developed two inference-time optimizations for the spiking control policy. One is post-spike training time quantization (SPTTQ), which treats the number of spike time steps as a post-training quantization parameter; the other is sequential leaky (SLeaky) neurons, which preserve the membrane potential across reinforcement learning steps and reduce charge-up overhead. Together, these innovations achieve over 60% savings in both power and latency during inference compared to standard implementations while maintaining comparable performance.
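The two inference-time ideas can be illustrated with a toy neuron, under stated assumptions. The sketch below is not the paper's SLeaky or SPTTQ implementation; the class, parameter values, and constant input are hypothetical. It compares a neuron that is reset at the start of every control step with one that keeps its membrane potential, while the number of spike time steps per control step is exposed as a tunable parameter.

```python
class LeakyNeuron:
    """Toy leaky integrate-and-fire neuron with an optional persistent membrane."""

    def __init__(self, leak=0.9, threshold=1.0):
        self.leak = leak
        self.threshold = threshold
        self.v = 0.0

    def run(self, current, timesteps, keep_state):
        """Simulate `timesteps` spike steps for one control step; return the spike count."""
        if not keep_state:
            self.v = 0.0                  # conventional behaviour: start from rest
        spikes = 0
        for _ in range(timesteps):
            self.v = self.leak * self.v + current
            if self.v >= self.threshold:
                spikes += 1
                self.v -= self.threshold  # soft reset after a spike
        return spikes


def spike_counts(keep_state, timesteps=3, control_steps=3, current=0.3):
    """Spike count per control step for a constant input."""
    neuron = LeakyNeuron()
    return [neuron.run(current, timesteps, keep_state) for _ in range(control_steps)]


if __name__ == "__main__":
    print(spike_counts(keep_state=False))  # [0, 0, 0]: recharging from rest never reaches threshold
    print(spike_counts(keep_state=True))   # [0, 1, 1]: carried-over charge lets the neuron fire
```

In this toy setting, three time steps per control step are too few for a neuron that recharges from rest, yet sufficient once the membrane potential carries over. That is the intuition behind lowering the time-step count after training without retraining the policy.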

NeuRehab Framework Facilitates Robotic Stroke Recovery

Recent research introduces NeuRehab, a novel robotic rehabilitation framework designed to accelerate patient recovery after stroke. The study addresses the limitations of existing modular exercise systems by introducing an end-to-end framework that integrates artificial intelligence and a co-designed control system. The system optimizes both performance and power consumption and enables mobile operation without loss of functionality. NeuRehab consists of two main partitions, a wheelchair-based edge device and a fixed docking station, enabling efficient resource allocation and learning. Compared to standard implementations, the framework achieves over 60% savings in both power and latency during inference while maintaining comparable performance.

This is achieved through a split machine learning process that divides the computational tasks between the wheelchair and the docking station. The fixed dock, with its powerful hardware, handles computationally intensive tasks such as fine-tuning the model, while the wheelchair uses an ultra-low-power spiking network for real-time control. Task-specific temporal optimization further reduces the latency of edge inference, ensuring responsive and efficient operation. In their experiments, the researchers validate the framework on shoulder exercises using the XoRehab platform, an IoT-enabled rehabilitation system. The shoulder joint is driven by a stepper motor with a 1:15 gearbox, which provides position holding and requires no electrical power to maintain a particular state.
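As a back-of-the-envelope illustration of what the 1:15 gearbox means for positioning, the sketch below converts a joint angle into motor steps. Two things are assumptions rather than facts from the article: the motor is taken to be a common 200-steps-per-revolution (1.8 degree) type, and the 1:15 ratio is read as a 15-fold reduction from motor to joint.

```python
MOTOR_STEPS_PER_REV = 200  # assumption: 1.8 degree full-step motor, not stated in the article
GEAR_REDUCTION = 15        # the article's 1:15 gearbox, read here as a 15x reduction


def motor_steps_for(joint_angle_deg):
    """Whole motor steps needed to rotate the joint by the given angle."""
    steps_per_joint_rev = MOTOR_STEPS_PER_REV * GEAR_REDUCTION
    return round(joint_angle_deg * steps_per_joint_rev / 360)


if __name__ == "__main__":
    print(motor_steps_for(90))   # 750 motor steps for a quarter turn of the joint
    print(motor_steps_for(360))  # 3000 motor steps per full joint revolution
```

Under these assumptions each motor step moves the joint by 0.12 degrees, which gives a sense of the discrete motion profile that a kinematic simulation of the joint has to capture.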

The system incorporates force sensors to provide feedback on the patient's motor performance, replacing reliance on potentially noisy biosignals such as electromyography. The study demonstrates a reinforcement learning architecture that can adapt to patient interactions, adjust speed, avoid excessive force, and ultimately assist in flexion and extension movements. This approach allows the system to tailor its response to individual patient needs, a promising advance in robotic rehabilitation technology.

Spiking network enables resource-aware rehabilitation

NeuRehab represents a significant advance in robotic rehabilitation, providing an end-to-end framework that integrates reinforcement learning and spiking neural networks for autonomous, resource-aware exoskeletons. The team developed a system based on the XoRehab platform, alongside new simulation environments designed to accurately model shoulder joint movement and clinically relevant constraints such as minimizing strain and providing assistance only when needed. The key algorithmic innovation is Hybrid SAC, which combines a spiking neural network actor with an artificial neural network critic to balance accuracy, learning efficiency, and compatibility with neuromorphic hardware. Evaluations across standard benchmarks and the custom simulation environments show that Hybrid SAC reliably matches the performance of traditional artificial neural network approaches while outperforming fully spiking systems. Two further techniques, post-spike training time quantization and sequential leaky neurons, reduce the number of spikes by up to 63% and improve latency by more than 60% during control tasks, while maintaining reward levels comparable to those achieved with traditional systems. The researchers acknowledge that transfer learning is currently necessary when adapting the system to new patient profiles, and suggest that adaptability could be further improved by incorporating a baseline control profile. Future work will focus on extending the framework to additional joints within the XoRehab system and expanding the range of human torque profiles used for training. The adaptability of the underlying software also suggests potential applications in other time- and power-constrained environments such as industrial machinery.

👉 More information
🗞 NeuRehab: Reinforcement learning and spiking neural network-based rehabilitation automation framework
🧠ArXiv: https://arxiv.org/abs/2512.17841
