Quantum compressed K-means keeps qubit count constant on nine real datasets, with O(ε²) mean squared error


Clustering large datasets poses a major challenge for current and near-term quantum computers, which struggle with both data input and limited qubit availability. To address this, Pedro Chumpitaz-Flores, My Duong, and Kaixun Hua of the University of South Florida, together with Ying Mao of Fordham University, developed a new hybrid quantum algorithm, qc-kmeans. The method summarizes the data with compact Fourier feature sketches, which removes the need to load the entire dataset onto a quantum processor, and identifies cluster centers by solving a simplified optimization problem with shallow quantum circuits. In simulations, the researchers show that qc-kmeans maintains competitive clustering accuracy even when noise is added, while the number of qubits required stays constant regardless of dataset size. They present this as a promising step toward practical quantum machine learning on near-term devices.

The approach achieves unbiased estimation with mean squared error O(ε²) for a given parameter ε. Importantly, the maximum qubit requirement does not increase with dataset size. A refinement process that incorporates data retention ensures consistent improvement throughout the process. Shallow-circuit simulations run in standard quantum computing software demonstrate the method on low-dimensional synthetic data, where it yields competitive results compared to quantum baselines. The method was further tested on nine real-world datasets containing up to 4.3 × 10^5 data points.

Hybrid quantum-classical K-Means clustering for NISQ devices

This research addresses the challenge of applying quantum machine learning, specifically data clustering, to current and near-term quantum computers (NISQ devices). These machines are limited in qubit count and coherence time and are susceptible to errors, which makes many quantum algorithms impractical. The authors introduce qc-kmeans, a hybrid quantum-classical algorithm designed for this setting, which combines classical and quantum computation to take advantage of the strengths of each. The algorithm uses Fourier feature sketches to compress the dataset into a fixed-size representation, which decouples the quantum register width from the number of data points.
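
As a rough illustration of why such a sketch has a fixed size, the following Python snippet averages complex exponentials of random projections over a dataset. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian frequencies, the sketch size of 16, and the function name `fourier_sketch` are all hypothetical choices made for this example.

```python
import numpy as np

def fourier_sketch(X, frequencies):
    """Average the complex exponentials exp(i * w.x) over all rows of X.

    The result has one entry per frequency, so its length does not depend
    on how many data points X contains.
    """
    projections = X @ frequencies.T          # shape (n_points, n_frequencies)
    return np.exp(1j * projections).mean(axis=0)

rng = np.random.default_rng(0)
n_frequencies, dim = 16, 2                   # hypothetical sketch size and data dimension
frequencies = rng.normal(size=(n_frequencies, dim))  # assumed Gaussian frequencies

small = rng.normal(size=(100, dim))
large = rng.normal(size=(100_000, dim))

sketch_small = fourier_sketch(small, frequencies)
sketch_large = fourier_sketch(large, frequencies)

# Both sketches have the same length even though the datasets differ 1000-fold in size.
print(sketch_small.shape, sketch_large.shape)
```

Whatever is computed from the sketch afterwards sees only `n_frequencies` complex numbers, which is the sense in which downstream resources can stay independent of the number of data points.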

Because the sketch has a fixed size, the algorithm can process large datasets without requiring large numbers of qubits. The core quantum computation uses a shallow Quantum Approximate Optimization Algorithm (QAOA) with specific types of interactions to select optimal cluster centers. The main advantage is that the quantum register width remains constant regardless of the size of the dataset, a significant improvement over many other quantum clustering algorithms.
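
To make the idea of a shallow QAOA selecting centers more concrete, here is a toy single-layer QAOA simulated with NumPy. Everything in it is an assumption for illustration: the cost function (pick `k` of `n_qubits` candidate centers, with random per-candidate costs and a quadratic penalty), the standard single-qubit X mixer, and the grid search over angles. The article does not specify the interactions qc-kmeans uses, so this is not a reproduction of that circuit.

```python
import itertools
import numpy as np

def qaoa_expectation(costs, n_qubits, gamma, beta):
    """Expected cost after one QAOA layer, simulated as a dense state vector."""
    dim = 2 ** n_qubits
    state = np.full(dim, 1 / np.sqrt(dim), dtype=complex)   # uniform superposition
    state = state * np.exp(-1j * gamma * costs)             # cost (phase) layer
    c, s = np.cos(beta), -1j * np.sin(beta)
    rx = np.array([[c, s], [s, c]])                         # exp(-i * beta * X)
    psi = state.reshape([2] * n_qubits)
    for q in range(n_qubits):                               # apply the mixer to every qubit
        psi = np.moveaxis(np.tensordot(rx, psi, axes=([1], [q])), 0, q)
    probs = np.abs(psi.reshape(-1)) ** 2
    return float(probs @ costs)

# Hypothetical toy problem: choose k of n candidate centers.
n_qubits, k = 4, 2
rng = np.random.default_rng(1)
linear = rng.uniform(0, 1, size=n_qubits)   # made-up cost of selecting each candidate
penalty = 2.0                               # made-up weight enforcing "exactly k selected"
bitstrings = np.array(list(itertools.product([0, 1], repeat=n_qubits)))
costs = bitstrings @ linear + penalty * (bitstrings.sum(axis=1) - k) ** 2

# Classical outer loop: coarse grid search over the two circuit angles.
grid = np.linspace(0, np.pi, 25)
best_value, best_gamma, best_beta = min(
    (qaoa_expectation(costs, n_qubits, g, b), g, b) for g in grid for b in grid
)
uniform_cost = float(costs.mean())          # expected cost of random guessing
print(best_value, uniform_cost)
```

The register here has one qubit per candidate center, so its width is set by the size of the candidate pool rather than by the number of data points, and a single layer keeps the circuit shallow.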

K-Means clustering performance, quantum and classical comparison

The study compares the performance of qc-kmeans against traditional K-means and other classical clustering methods across diverse datasets, including HEMI and URBANGB. The results show that qc-kmeans achieves competitive or superior clustering accuracy, as measured by the sum of squared errors (SSE), where lower is better. Specifically, qc-kmeans reduced SSE by an average of 15-25% on several datasets. The results for the noisy variant, QC-KMEANS_NOISE, show higher SSE values, indicating sensitivity to noise. These findings suggest that hybrid quantum algorithms like qc-kmeans have the potential to improve clustering performance, but are not a one-size-fits-all solution.
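
For readers unfamiliar with the metric, SSE sums, over all data points, the squared Euclidean distance from each point to its nearest cluster center. The short Python example below computes it on a made-up four-point dataset; the function name and the data are illustrative only and do not come from the study.

```python
import numpy as np

def sum_of_squared_errors(X, centers):
    """Sum over points of the squared distance to the nearest center."""
    diffs = X[:, None, :] - centers[None, :, :]    # shape (n_points, n_centers, dim)
    sq_dists = (diffs ** 2).sum(axis=2)            # squared distance to every center
    return float(sq_dists.min(axis=1).sum())       # keep the nearest center per point

# Two well-separated pairs of points (illustrative data).
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
good_centers = np.array([[0.0, 1.0], [10.0, 1.0]])   # one center per pair
poor_centers = np.array([[0.0, 0.0], [1.0, 0.0]])    # both centers near the left pair

print(sum_of_squared_errors(X, good_centers))  # 4.0
print(sum_of_squared_errors(X, poor_centers))  # 170.0
```

A lower SSE means the centers sit closer to the points they represent, which is why the well-placed centers score 4.0 and the poorly placed ones 170.0.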

Fourier feature sketches decouple quantum resources from dataset size

The team introduced qc-kmeans, a new hybrid quantum approach to data clustering designed for implementation on near-term quantum computers. The method works within the limits on qubit count and circuit depth by compressing the dataset into a fixed-size Fourier feature sketch. Cluster centers are then selected with a shallow Quantum Approximate Optimization Algorithm (QAOA), which decouples quantum resource requirements from the size of the input data. Across both synthetic and real-world datasets, the method achieves clustering quality comparable to classical algorithms while maintaining constant qubit width and shallow circuit depth.

Importantly, the researchers acknowledge that qc-kmeans currently offers no speed advantage over established classical algorithms. The main value of the work lies in its feasibility on near-term quantum hardware, where the number of qubits and circuit depth are limited. Through simulations, the research team also demonstrated robustness against noise, suggesting the possibility of practical application as quantum technology advances.


