Quantum Recurrent Unit achieves 98.05% accuracy with just 132 parameters

Machine Learning


Researchers are turning to new quantum approaches to address the growing demands for parameter efficiency and computational power in modern machine learning. Tzong-Daw Wu, Hsi-Sheng Goan, and colleagues from the Department of Physics and Center for Theoretical Physics at National Taiwan University present the quantum recurrent unit (QRU), a neural network architecture designed for noisy intermediate-scale quantum (NISQ) devices. The design uses quantum-controlled SWAP gates to selectively process information and mirrors the classical gated recurrent unit, with the key advantage that its parameter count and circuit depth remain constant regardless of input length. Demonstrating its capabilities, the QRU achieves performance comparable to or better than traditional neural networks on tasks such as vibration-behavior prediction, breast cancer diagnosis, and MNIST digit recognition, all with dramatically fewer parameters, paving the way for more scalable and efficient quantum machine learning systems.

An innovative recurrent architecture featuring measurement feedforward uses only 35 parameters to achieve accuracy within 0.13% of a 167-parameter artificial neural network (ANN) on WDBC classification. Moreover, on MNIST handwritten digit classification its accuracy reaches 98.05%, outperforming a convolutional neural network (CNN) with about 27,265 parameters. To fully understand the QRU's design philosophy, it is important to review the relevant theoretical foundations. As Wu et al. note, quantum neural networks have great expressive power and can implement complex functional mappings with relatively few tunable parameters; prior work demonstrated that even parameterized circuits with a single qubit and data re-uploading have universal approximation properties. Schuld et al. further investigated how data encoding strategies affect the expressive power of quantum machine learning models, and in particular how appropriate encoding can enhance expressivity in multi-qubit systems. Recent research on quantum recurrent neural networks has mainly focused on migrating classical architectures to quantum frameworks.
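For intuition on the parameter gaps the article cites, the following sketch (an illustration with hypothetical sizes, not the paper's exact models) counts trainable parameters in a standard gated recurrent unit using the common two-bias-vector convention, and contrasts this with a hardware-efficient variational circuit whose count is fixed by qubits and layers rather than by sequence length or hidden-state width.

```python
# Parameter counting: classical GRU vs. a fixed-size variational circuit.
# A GRU has three gates, each with input weights, recurrent weights,
# and (in the common PyTorch convention) two bias vectors.

def gru_param_count(input_size: int, hidden_size: int) -> int:
    # weight_ih: 3h x input, weight_hh: 3h x h, bias_ih: 3h, bias_hh: 3h
    return 3 * (hidden_size * input_size + hidden_size * hidden_size
                + 2 * hidden_size)

def vqc_param_count(n_qubits: int, n_layers: int) -> int:
    # One rotation angle per qubit per layer (a common hardware-efficient
    # ansatz); independent of input length and hidden-state width.
    return n_qubits * n_layers

# Classical parameter count grows quadratically with hidden size...
print(gru_param_count(1, 4))   # 84
print(gru_param_count(1, 8))   # 264: doubling hidden size ~triples params
# ...while the circuit's count depends only on qubits x layers,
# no matter how long the input sequence is.
print(vqc_param_count(4, 3))   # 12
```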

For example, the quantum long short-term memory (LSTM) proposed by Chen et al. and the quantum GRU architecture developed by Ceschini et al. both replace classical neural network layers with variational quantum circuits (VQCs). Although these approaches are intuitive, they retain the classic LSTM/GRU gate logic for temporal processing: the state-update rules remain classical operations, which limits the full exploitation of quantum advantages and adaptability to future fault-tolerant architectures. Beyond hybrid architectures, fully quantum recurrent approaches have also been considered. Bausch proposed a quantum RNN (QRNN) using parameterized quantum neurons with a polynomial activation function and amplitude amplification, achieving 90.8–98.6% accuracy in MNIST binary classification with approximately 1,200 parameters.

Although the feasibility of the QRNN has been demonstrated, the repeated circuits and amplitude amplification on which this architecture relies may limit its scalability for longer sequences. Alternative VQC-based approaches [27, 28] maintain a quantum hidden state across timesteps and admit compact implementations (e.g., 55 parameters in Takaki's work). However, these approaches implement basic recursion without a GRU- or LSTM-like gating mechanism, which can limit their ability to capture long-term dependencies. Additionally, maintaining quantum coherence across a sequence requires very high-fidelity gates to prevent error accumulation over sequential operations, as well as extended coherence times to preserve quantum information; both pose practical limitations for NISQ devices.

These limitations have prompted a reconsideration of quantum recurrent architecture design, in particular how to build architectures that maximize quantum advantages through task-specific mechanisms inspired by the classical GRU while remaining NISQ-compatible. The QRU architecture consists of four main components: input and hidden-state encoding, update and reset mechanisms, variational layers, and quantum-measurement output generation. Data qubits provide the computational output, while hidden qubits propagate temporal information.
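To illustrate the quantum-controlled SWAP primitive the architecture builds on, here is a minimal statevector sketch in plain NumPy (not the authors' implementation): a controlled-SWAP (Fredkin) gate exchanges two target qubits only when the control qubit is |1⟩, and when the control is in superposition the swap is applied coherently, giving the kind of conditional information routing that a quantum update/reset mechanism can exploit.

```python
import numpy as np

# 3-qubit controlled-SWAP (Fredkin) gate: control = qubit 0,
# targets = qubits 1 and 2. In the |c t1 t2> basis it permutes
# |101> (index 5) and |110> (index 6) and leaves everything else alone.
CSWAP = np.eye(8)
CSWAP[[5, 6]] = CSWAP[[6, 5]]

def basis(i: int) -> np.ndarray:
    """Computational basis state |i> on 3 qubits."""
    v = np.zeros(8)
    v[i] = 1.0
    return v

# Control |0>: the targets pass through untouched.
print(np.allclose(CSWAP @ basis(0b001), basis(0b001)))   # True

# Control |1>: the target qubits are exchanged, |1,0,1> -> |1,1,0>.
print(np.allclose(CSWAP @ basis(0b101), basis(0b110)))   # True

# Control in superposition: the swap happens coherently, producing an
# entangled combination of the "swapped" and "not swapped" branches.
plus_control = (basis(0b001) + basis(0b101)) / np.sqrt(2)
out = CSWAP @ plus_control
print(np.allclose(out, (basis(0b001) + basis(0b110)) / np.sqrt(2)))  # True
```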

QRU delivers constant depth and parameter scaling

The researchers systematically verified the QRU's performance through three progressive experiments, starting with the prediction of vibration behavior. In this task, a 72-parameter QRU matched the performance of a 197-parameter classical GRU, demonstrating comparable predictive capability with far fewer parameters. These measurements confirm that model complexity can be reduced significantly without sacrificing performance. Test results also show that the QRU excels at more complex tasks, achieving 98.05% accuracy on MNIST handwritten digit recognition with only 132 parameters.
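The constant-parameter claim can be made concrete with a toy single-qubit recurrent cell (a simplified sketch, not the paper's circuit): each timestep re-uploads the next input through the same three trainable angles, so the parameter count stays fixed, here at 3, whether the sequence has 5 elements or 500, whereas a classical recurrent model typically needs wider layers, and thus more parameters, to add capacity.

```python
import math

def ry(theta):
    """Single-qubit RY rotation as a real 2x2 matrix."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def qru_like_run(sequence, params):
    """Process a sequence of ANY length with a fixed set of 3 parameters:
    a data-encoding scale w, an offset b, and one variational angle theta."""
    w, b, theta = params
    state = [1.0, 0.0]                       # start in |0>
    for x in sequence:                       # same gates reused each timestep
        state = apply(ry(w * x + b), state)  # re-upload (encode) the input
        state = apply(ry(theta), state)      # trainable variational layer
    return state[0] ** 2 - state[1] ** 2     # <Z> expectation, in [-1, 1]

params = (0.7, 0.1, 0.3)                     # 3 parameters, always
short = qru_like_run([0.5] * 5, params)
long_ = qru_like_run([0.5] * 500, params)
print(len(params))                           # 3 for both runs
print(-1.0 <= short <= 1.0 and -1.0 <= long_ <= 1.0)   # True
```

Per-timestep circuit depth is also constant here (two rotations), mirroring the article's claim that depth does not grow with input length.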

This result significantly outperforms a convolutional neural network with approximately 27,265 parameters, dramatically reducing parameter requirements while increasing accuracy. The measurements confirm that the architecture offers a promising path toward more efficient and scalable quantum machine learning systems that are compatible with near-term quantum hardware and could revolutionize the way complex data is processed and analyzed. The team's research establishes the QRU as a viable alternative to classical recurrent networks and paves the way for future advances in quantum machine learning.

QRU excels at sequence learning tasks

The authors acknowledge the limits of the current validation: the QRU has yet to be tested on larger, more complex problems and on physical quantum computers. Future research will focus on investigating the scalability of QRUs within architectures such as Transformers, preserving quantum states through direct propagation, and evaluating performance on increasingly sophisticated quantum hardware. By designing circuits that leverage quantum-native operations for efficient information processing, the researchers have built an architecture with remarkable parameter efficiency and versatility across applications ranging from time-series prediction to medical diagnostics to visual recognition. Ultimately, this research supports the idea that quantum machine learning is moving from theoretical concept to practical reality, with the potential to transform artificial intelligence and machine learning.


👉 More information
🗞 Quantum Recurrent Unit: Parameter-efficient quantum neural network architecture for NISQ devices
🧠ArXiv: https://arxiv.org/abs/2601.18164
