Indexing fast-computation architecture
Based on the characteristic data pattern of the discrete optimization problems formulated as the Ising model, the indexing fast-computing architecture performs data compression and high-speed computation.
The Ising problem is to find a spin configuration s that minimizes the Hamiltonian (a cost function) HIsing38 defined by
$${H}_{{\rm{Ising}}}=-\frac{1}{2}\mathop{\sum }\limits_{i=1}^{N}\mathop{\sum }\limits_{j=1}^{N}{J}_{i,j}{s}_{i}{s}_{j}+\mathop{\sum }\limits_{i=1}^{N}{h}_{i}{s}_{i},$$
(1)
where N is the number of spins, si ( ∈ { − 1, 1}) is the ith Ising spin, Ji,j is the coupling coefficient between the ith and jth spins, and hi is the bias coefficient (local field) for the ith spin. Many combinatorial optimization problems can be formulated as Ising problems39. The MIS problem, which we focus on in this work, is to find the maximum independent set of a graph G(V, E) with a node set V and an edge set E. It can be formulated as follows:
$${s}_{i}=\left\{\begin{array}{ll}1,& ({\rm{Node}}\,i\,{\rm{is}}\,{\rm{an}}\,{\rm{element}}\,{\rm{of}}\,{\rm{the}}\,{\rm{independent}}\,{\rm{set}}),\\ -1,& ({\rm{Node}}\,i\,{\rm{is}}\,{\rm{not}}\,{\rm{an}}\,{\rm{element}}\,{\rm{of}}\,{\rm{the}}\,{\rm{independent}}\,{\rm{set}}).\end{array}\right.$$
(2)
$${J}_{i,j}=-\frac{A}{2}{G}_{i,j},$$
(3)
$${h}_{i}=\frac{1}{2}\left(A\mathop{\sum }\limits_{j=1}^{N}{G}_{i,j}-B\right).$$
(4)
$${G}_{i,j}=\left\{\begin{array}{ll}1,& ({\rm{Edge}}\,{\rm{between}}\,i\,{\rm{and}}\,j{\rm{is}}\,{\rm{an}}\,{\rm{element}}\,{\rm{of}}\,{\rm{the}}\,{\rm{edge}}\,{\rm{set}}\,E),\\ 0,& ({\rm{Edge}}\,{\rm{between}}\,i\,{\rm{and}}\,j\,{\rm{is}}\,{\rm{not}}\,{\rm{an}}\,{\rm{element}}\,{\rm{of}}\,{\rm{the}}\,{\rm{edge}}\,{\rm{set}}\,E).\end{array}\right.$$
(5)
Here, the coefficients Aand B are positive real constants (see the “Methods” section for details). Unless there are special constraints, each element of J and h must be represented as real numbers. When representing real numbers on a computer, single precision (32-bit) or double precision (64-bit) floating-point expressions are often used. In those cases, the data size of the J-matrix, which has N2 elements, can become enormous, and floating-point arithmetic requires abundant computational resources.
The literature39 has provided Ising formulations for many NP-complete and NP-hard problems. In many of those cases, it is observed that the values of the elements in J-matrix are represented by floating-point numbers but the number of possible values (Nv) that each element takes is limited (limited variation). Figure 2a shows the distribution (histogram) of the values of the J-matrix corresponding to several representative combinatorial optimization problems, i.e., MIS, WMMBG (Weighted Maximum Matching in Bipartite Graphs) 7, and G-set48, where the elements of the J-matrixes take only two or three different values. The reason for the limited variation is that in the discrete optimization problems formulated as the Ising model, non-diagonal elements Ji≠j often represent binary (or digitalized) exclusive relationship between two decision variables (si and sj). Note that the TSP (Traveling Salesman Problem) shown in Fig. 2a is an exception for the limited variation, where Ji≠j can take continuous values corresponding to distances between two cities. See the Methods section for MIS, WMMBG, G-set, TSP, and other combinatorial optimization problems. Since the Ising model allows arbitrary continuous relationships between two variables to be expressed via off-diagonal terms, the TSP-type formulation can be considered a more generalized case. However, in many combinatorial optimization problems, even when the objective function is linear, the constraints are often encoded as quadratic penalty terms—i.e., off-diagonal elements—resulting in limited variation in the J-matrix. Such formulations for limited variation are frequently found in practical and NP-hard problems, including the aforementioned MIS. Therefore, it is meaningful to explore acceleration techniques specifically tailored to this class of problems, where the quadratic terms primarily represent discrete constraint structures.

a Distributions of J-matrix element values in Ising models, representing various combinatorial optimization problems. b The method of indexing. An indexed J-Table is generated by extracting distinct values from the distribution of the J-matrix and then assign an index for each value. By replacing the real values in the original J-matrix with the corresponding indices, a smaller-sized indexed J-matrix is produced. c A system utilizing the encoding method. The indexed J accelerates data transfer speed between the CPU and the Ising machine, and also reduces the memory requirements for the Ising machine. d Indexed J-specialized MAC for simulated bifurcation (SB). By factorizing a MAC (multiply and accumulate) operation to isolate the additive part of the position variables x, most computations can be performed without decoding the indexed J from indices to real values, leading to faster computation.
In this architecture, the Ji,j is losslessly encoded using an indexed J-Table generated based on the distribution of J-matrix elements. (Fig. 2b). The indexed J-Table comprises the pairs of unique real values in the J-matrix and their corresponding integer indices t, and the number of entries is equal to the number of data patterns Nv. The number of bits required to represent an index value t is \(\lceil {\log }_{2}{N}_{v}\rceil\). The lossless encoding is performed by replacing the original values Ji,j of the J-matrix with the corresponding indexed (encoded) value \({J}_{i,j}^{c}\) in the J-Table. If the original real values are represented in float32 (32-bit floating-point numbers), the size of the J-matrix becomes \(\lceil {\log }_{2}{N}_{v}\rceil /32\). When Nv is small, the data size can be compressed. For example, in the case of MIS (Nv = 2), the J size is reduced to 1/32. By table lookup in the J-Table, the original values can be restored from \({J}_{i,j}^{c}\) without any loss of information. We note that this indexing scheme is conceptually related to dictionary-based encoding; however, an important design choice in our approach is that the indexing is selectively applied only to the coupling matrix J. In contrast, the bias vector h is kept in its original real-valued form, reflecting the structural asymmetry of Ising formulations in which J often exhibits limited variation while h does not necessarily share this property. This selective treatment allows the representation to remain lossless while preserving the flexibility of the Ising model formulation.
This encoding method further provides significant benefits when combined with hardware acceleration systems (Fig. 2c). A hardware acceleration system refers to a technology that accelerates specific processes (in this case, the Ising machine) using dedicated hardware (such as FPGA) instead of a general-purpose processor (CPU). In systems where the Ising machine is implemented on a device, the Ising model is transmitted from the host, where the CPU operates, to the device prior to the execution of the Ising machine. During this process, the J-matrix with a large data size increases the transmission time. Considering limited resources available for the dedicated hardware, the memory size required to store the J-matrix can becomes relatively large (can be a dominant limiting factor). By encoding the J-matrix, both the data transmission time and the memory size can be reduced.
Furthermore, the proposed encoding method can enhance the parallelism of computation using the limited resources. We propose a method to efficiently implement multiply accumulate (MAC) operations with less resources using encoded states. MAC operations, which combine multiplication and addition, are frequently used in digital signal processing, machine learning, and matrix operations. The SB, adopted as the Ising machine in this study (see the Methods section for details), also execute numerous MAC operations. The MAC operation using the encoded states in this method can be described as follows:
$$\mathop{\sum }\limits_{j=1}^{N}{x}_{j}{J}_{i,j}\,\Longleftrightarrow \,\mathop{\sum }\limits_{t=1}^{{N}_{v}}({\rm{TBL}}[t]\mathop{\sum }\limits_{j=1}^{N}{x}_{j}{\chi }_{t}({J}_{i,j}^{c})).$$
(6)
The left-hand side represents the conventional MAC operation to calculate the many-body interactions to ith non-linear oscillator in SB. The right-hand side represents its transformed version, where TBL[t] represents the real value corresponding to index t in the indexed J-Table, and χt(v) is an indicator function that returns 1 when v = t and 0 otherwise. xj is the position variable for jth non-linear oscillator in SB, and it is crucial that the cumulative operation of xj is factored out. The circuit architecture (indexed J-specialized MAC) corresponding to the right-hand side in Eq. (6) is shown in (Fig. 2d). After the selective cumulative-addition of xj by an accumulator (ACCt) for each index t, the original real value (TBL[t]) obtained from the indexed J-Table is multiplied. In other words, the selective cumulative-addition is performed while Ji,j remains encoded. This allows the dominant accumulation process to be implemented using integer or fixed-point arithmetic units, without fully decoding J into floating-point values. Since xj is a physical quantity with a predetermined dynamic range (in the case of bSB used here, the position variable is ( − 1.0≤x≤1.0)), a fixed-point data type (e.g., int16) can be used for ACCt. Generally, fixed-point arithmetic units can be implemented with lower latency and fewer resources compared to floating-point units. Thus, the MAC operations can be implemented with low resources when Nv is sufficiently small, leading to the overall acceleration (more parallelism) of the Ising machine.
While the above explanation is based on the physical characteristics of bSB, it is important to note that the proposed method is not limited to bSB. Although it requires that the dynamic range of the variable x be well-defined and that reduced precision still yields acceptable results, the technique is broadly applicable to MAC operations in general.
ML-assisted scalable and diversified architecture
To handle the various sizes and characteristics of dynamically changing problems, we propose a scalable and diversified architecture for Ising machine-based solvers, which consists of three key methods. In this work, we choose the MIS problem as a representative problem.
The first method is how to construct a machine learning-based parameter estimation model. Typically, an Ising machine has its own control parameters, which have optimal values that depend on the problems to be solved. We build and train a parameter estimation model using a large amount of training data, which enables obtaining optimal parameters with low latency at runtime. As the training dataset (Fig. 3a), we prepare 2840 graphs with various sizes and characteristics, which are generated from the representative networks of Erdös-Rényi (ER)49, Barabási-Albert (Scale-free network, SF)50, and Watts-Strogatz (Small-world network, SW)51 with randomly-selected structural parameters. The explanatory variables of the estimation model are the number of nodes, edge density, average degree, and Gini coefficient, which quantifies the heterogeneity of the degree distribution52. The objective variables of the estimation model are the three control parameters of the SB, i.e., c, dt, step (see the “Methods” section). A grid search for those control parameter is executed with the SB-based Ising machine for each graph (each problem) to find the parameter set that solves the problem faster and more accurately. The best parameter set found by the grid search (the objective variables) is then recorded along with the graph features (the corresponding explanatory variables). To evaluate the trade-off between speed and accuracy, we introduce a metric called time-to-sectional-target (TTST). While other metrics called time-to-solution (TTS) or time-to-target (TTT)53 are often used for evaluating the performance of Ising machines14, these metrics require the exact solution of the problem to be known. Since the problems (graphs) are randomly generated, the exact solutions are difficult to be obtained. Here, the value that is 99% of the maximum value (the maximum size of independent set) obtained within the executed grid search is defined as the sectional target (ST). TTST is formulated as follows:
$${\rm{TTST}}=\left\{\begin{array}{ll}{T}_{{\rm{COM}}}\frac{\log (1-0.99)}{\log (1-{P}_{s})},& ({\rm{if}}\,{P}_{s} < 0.99),\\ {T}_{{\rm{COM}}},\hfill & ({\rm{if}}\,{P}_{s}\ge 0.99),\end{array}\right.$$
(7)
where TCOM is the computation time per shot (per execution of the Ising machine) and Ps is the success probability to find solutions better than or equal to the ST. The parameter set that minimizes TTST is regarded as the best one by the grid search.

The MIS (maximum independent set) problem is chosen as a motif. a Preparation of the training dataset. Generate 2840 random graphs based on three different basic structures. Using an Ising machine-based MIS solver, perform a grid search to find the Ising machine parameters that can produce the best independent set for each graph more quickly. Record the best parameters with graph features in the dataset. b ML-based parameter estimator trained based on a XGBoost regression model. The 3D graph visualizes the response of steps to N and density, with the values of average degree and Gini fixed. c System architecture, which comprises multiple Ising machines (FPGA/CPU-implemented), a machine selector, a parameter estimator, and a batch mapper. By using the machine selector and the parameter estimator, one of Ising machines and the control parameters for the Ising machine are selected for each graph (problem) depending on the graph features. A batch processing mechanism to efficiently solves small problems with the fixed-size FPGA-based Ising machine is also equipped.
As the parameter estimator in this work, a regression model is constructed using XGBoost (eXtreme Gradient Boosting)54 (Fig. 3b). The XGBoost is a machine learning library based on the gradient boosting algorithm and includes regularization features to prevent overfitting, which thus can realize high predictive accuracy and computational efficiency and can be applied to various tasks such as regression and classification. There are cases where different sets of explanatory variables (c, dt, step) yield the same TTST. This is due to the dependencies among the explanatory variables, making it impossible to learn c, dt, and step independently. Therefore, we first learn c independently as the reference variable, and then learn the other variables as ratios related to it (c/dt, dt × step). Once c is determined, dt and step are uniquely determined through simple arithmetic operations (see the Methods section for details). Overall, the parameter estimator takes as input four graph features (number of nodes (N), edge density, average degree, Gini coefficient) and then estimates the three Ising machine parameters (c, dt, step).
The second method is how to keep high computational efficiency with a fixed-size Ising machine (a hardware accelerator) when solving various sizes of problems. As shown in Fig. 3c, the proposed architecture has a batch processing (multi-shot execution) mechanism that efficiently solves small problems with a relatively large Ising machine. A hardware Ising machine, such as the FPGA-based SB machine used in this work, is usually designed to treat the Ising model with a predetermined maximum number of spins \({N}_{{\rm{MAX}}}\) (\({N}_{{\rm{MAX}}}=2048\) in this work). The computational efficiency becomes the highest when solving the maximum size of Ising problem (the problem size N is equal to \({N}_{{\rm{MAX}}}\)), while it lowers when the problem size is small (\(N < {N}_{{\rm{MAX}}}\)) due to increased idle time of hardware resources (e.g., the arithmetic units). In the batch processing mechanism, when the problem size N is less than or equal to the half of \({N}_{{\rm{MAX}}}\), the J data of the problem is duplicated (i.e. the problem itself is duplicated) and assigned to the diagonal part of the J memory of the Ising machine \(\lfloor {N}_{{\rm{MAX}}}/N\rfloor\) times with zero padding to remaining parts (the Batch Mapper in Fig. 3c). This enables obtaining multiple (\(\lfloor {N}_{{\rm{MAX}}}/N\rfloor\)) different approximate solutions by one execution of the Ising machine (there is no interaction between duplicated Ising models owing to the zero padding), effectively making the execution speed \(\lfloor {N}_{{\rm{MAX}}}/N\rfloor\) times faster. Generally, an Ising machine is a heuristic solver that does not guarantee the exact solution to be obtained and is capable of producing different-quality solutions by changing the initial values or other control parameters. In the case of SB, different solutions can be obtained by changing the initial values of position x or momentum y variables. After obtaining different solutions, we can choose the best solution, which improves the quality of the solution.
The third method is another approach also to efficiently solve various-size problems. We prepare various-size Ising machines with different implementations (CPU-based and FPGA-based Ising machines in Fig. 3c) and a selection mechanism based on machine learning methodology (the Machine Selector in Fig. 3c). Even with the batch mapping mechanism, as the problem size decreases, the execution by the FPGA-based implementation becomes inefficient. Compared to the computation time depending on the N (the problem size) and step (the number of iterations in SB algorithm), the data transfer time (the time for offloading from CPU to FPGA) becomes non-negligible. We design a software (CPU)-implemented Ising machine that operates relatively faster only for smaller problems and also prepare a machine selector to select a faster machine depending on the situation (the features). The machine selector is a second machine learning model, distinct from the parameter estimator mentioned above. In the machine selector, the training dataset consists of recorded system-wide latency (computation time) for each Ising machine on the host computer for multiple sets of N and step. The goal is to perform binary classification to determine which Ising machine operates faster. We use a polynomial logistic regression model as the machine learning method. Using N determined by the problem itself and step determined by the parameter estimator described above, the model selects the fastest implementation based on these features. Thus, scalability (agility) toward very small problems are ensured.
Evaluation on the MIS problem
This section presents a systematic evaluation of a MIS solver constructed by integrating the methods introduced in Sections “Indexing fast-computation architecture” and “ML-assisted scalable and diversified architecture”, namely the indexing fast-computation architecture and the ML-assisted scalable and diversified architecture. In this subsection, we consider three SB-based MIS solver configurations: a Baseline configuration using a conventional SB implementation, and two implementations of the proposed framework, denoted as Proposed (Generic) and Proposed (Optimized). Both Proposed configurations incorporate the indexing fast-computation architecture and the ML-assisted scalable and diversified architecture, while they differ in how the indexing-related processing is implemented. In the following, unless otherwise noted, the term Proposed configuration refers to Proposed (Optimized).
We first analyze the system-wide latency of the MIS solver to clarify the impact of these architectural components. Figure 4a shows the breakdown of the system-wide latency for different solver configurations, and Table 1 provides the corresponding numerical details under various input conditions. To isolate the effect of the implementation style of the proposed framework, we compare Proposed (Optimized) with Proposed (Generic) and the Baseline configuration. The detailed system configurations corresponding to Proposed (Generic) and Proposed (Optimized) are described in “Methods” section.

a System-wide latency breakdown for three MIS solver configurations under the condition ∣V∣ = 2048, step = 2000: Baseline, Proposed (Generic), and Proposed (Optimized). Baseline denotes an MIS solver using a conventional SB implementation, whereas Proposed (Generic) and Proposed (Optimized) represent the proposed SB-based framework with generic and optimized configurations, respectively. b System-wide latency of CPU- and FPGA-implemented MIS solvers (∣V∣ = 256) versus steps. c System-wide latency of CPU- and FPGA-implemented MIS solvers (step = 1000) versus ∣V∣. d Performance comparison of five MIS solvers: the proposed SB-based solver, SB (baseline), NetworkX, OR-Tools, and ReduMIS, showing the number of independent sets (#IS, larger is better) versus the computation time (lower is better) for various sizes and characteristics of graphs with edge densities p (0.1, 0.4, 0.8) and node numbers ∣V∣ (2000, 500, 100, 20). Each data point represents the mean value across five instances; error bars indicate the standard deviation (SD), and individual results for each instance are shown as faint dots. The legend in the top-left graph applies to all twelve graphs.
The following analysis is based on three representative settings of (∣V∣, step), namely (1024, 1000), (2048, 2000), and (2048, 20,000). Across all evaluated settings, the Optimized (Proposed) configuration achieves substantially lower system-wide latency than the Baseline, demonstrating the effectiveness of the integrated architectural design. A key contributor is the reduced Ising machine execution time (H). In both the Generic and Optimized configurations, the execution time is approximately three times shorter than that of the Baseline. This speedup is enabled by the indexed J-specialized MAC units, which allow a higher-density implementation with four times more parallel arithmetic units. A further reduction is observed in the offloading overhead (G), which is decreased by roughly a factor of 30 due to compression of the J-matrix. In the Generic configuration, however, the overhead of index encoding (D) partially offsets this gain, although the total latency remains lower than that of the Baseline. In contrast, the Optimized configuration significantly reduces this overhead by unifying Ising model conversion and index encoding (E), highlighting the benefit of problem-specific architectural optimization. The overhead of the ML-assisted scalable and diversified architecture (Extract features (A) and Estimation (B)) depends on ∣V∣ and remains small, at approximately 2 ms for ∣V∣ = 1024 and 2.8 ms for ∣V∣ = 2048, indicating that it is not a performance bottleneck.
Figure 4b plots the system-wide latency of the MIS solver as a function of the step parameter, comparing CPU-based and FPGA-based Ising machine implementations. In both cases, the computation time increases proportionally with step, but the rate of increase is smaller for the FPGA implementation. Figure 4c shows a similar trend with respect to the problem size ∣V∣. While the CPU implementation exhibits a steep increase in computation time as ∣V∣ grows, the FPGA implementation shows a much slower growth rate. However, as indicated by the insets in Fig. 4b, c, the CPU implementation is faster when step or ∣V∣ is small. This crossover behavior supports the effectiveness of the Ising machine switching mechanism shown in Fig. 3c.
Following the latency analysis above, we next evaluate the proposed MIS solver in terms of both computation time and solution quality, measured by the size of the independent set found (#IS), and compare it with three software-based reference solvers, NetworkX55, OR-Tools56, and ReduMIS57, as well as an SB-based hardware baseline, denoted as SB (baseline). NetworkX provides a heuristic MIS solver as part of a general-purpose graph library, whereas OR-Tools constructs an exact MIS solver through integer programming. We further include ReduMIS, a strong classical MIS heuristic that combines reduction (kernelization) techniques with evolutionary search and has been widely used as a high-quality solver for large sparse graphs57. As an additional algorithmic and hardware baseline, we include a standard SB implementation on FPGA with fixed parameters, denoted as SB (baseline). For all problem instances, this baseline uses the same parameter set, c = 0.2, dt = 0.2, and step = 1000, following ref. 58. In the evaluation, the graph sizes ∣V∣ were set to 2000, 500, 100, and 20, and the edge densities p were set to 0.1, 0.4, and 0.8. For each combination of ∣V∣ and p, five different random graph instances were generated and processed by all solvers. The computation time reported here corresponds to the system-wide wall-clock latency of the MIS solver, including all processing stages shown in Fig. 1.
Figure 4d shows the comparison results. In terms of computation time, for medium-to-large problems (∣V∣≥100), both SB-based solvers are substantially faster than the software-based baselines. While the computation time of NetworkX, OR-Tools, and ReduMIS increases with problem size, both SB (proposed) and SB (baseline) maintain latencies below 100 ms over most of the evaluated conditions. For the largest sparse case, ∣V∣ = 2000 and p = 0.1, the proposed solver is 5500 times faster than NetworkX and 4300 times faster than ReduMIS. OR-Tools failed to return a solution within the timeout period of 3600 s for some instances with ∣V∣ = 500 and for all instances with ∣V∣ = 2000. For very small problems (∣V∣ = 20), SB (baseline) is in some cases slower than the software-based baselines because FPGA offloading becomes inefficient in this regime; by contrast, the proposed solver preserves low latency by switching to the CPU-based implementation, consistent with the crossover behavior shown in Fig. 4b, c. We also note that, for ∣V∣ = 2000, SB (baseline) is sometimes faster than the proposed solver because the baseline always uses step = 1000, whereas the proposed solver may select a much larger step count (approximately 20000) through parameter estimation for more difficult instances. However, this fixed parameter setting degrades robustness, and in the dense case (p = 0.8) SB (baseline) even fails to obtain valid solutions in some instances, yielding #IS = 0.
In terms of solution quality, OR-Tools and ReduMIS attain the best results on smaller instances (∣V∣≤100), and ReduMIS consistently achieves the highest or near-highest #IS across all evaluated settings. Although the proposed solver is occasionally slightly inferior to OR-Tools and ReduMIS on small problems, it consistently outperforms NetworkX. Moreover, except for the case of ∣V∣ = 2000 and p = 0.4, the proposed SB solver achieves better solution quality than SB (baseline), demonstrating the benefit of problem-dependent parameter estimation over a fixed-parameter setting. These results indicate that ReduMIS is a strong choice when larger latency is acceptable and maximizing solution quality is the primary objective, whereas the proposed solver provides a markedly better trade-off for dynamically changing problems, where low system-wide latency is essential and a slight loss in solution quality can be tolerated.
Demonstration
Combinatorial optimization problems in wireless network systems have been tackled with various classical and quantum approaches1,59,60,61,62. A TDMA scheduling problem to avoid interference in wireless multi-hop networks proposed by Ishizaki1 is chosen as a motif of dynamically changing problems to demonstrate the performance of the proposed MIS solver. We also developed an interactive application with graphical user interface for the demonstration. For a more visual understanding, see Supplementary Movie 1 and Supplementary Movie 2; representative snapshots from these movies are provided as Supplementary Figs. 1 and 2, respectively. Descriptions of these movies are provided in the Description of Additional Supplementary Files.
Figure 5a shows an example of static tree-topology network (a directed graph) which determines the transmission paths of the sensor network system; the destination and source nodes (sensors) of a directed edge (an arrow) are in the relationship of parent and child nodes. A parent node has to receive data from all the child nodes in a time slot or over several time slots (receiving state) and then the node (in turn as a child node) transmits the aggregated data to its parent node in a following time slot (transmitting state). The transmission process is repeated slot-by-slot until all data is finally collected at the base station (BS). A finite range that the transmission signal reaches is defined as the communication radius r (common for all the sensor nodes). After the transmission graph is determined by a graph-theory method, the TDMA scheduler determines which node transmits in which time slot. Multiple nodes can transmit simultaneously unless wireless interference occurs. It is prohibited that multiple transmission signals reach a receiving node simultaneously (in a time slot) in order to avoid signal interference. For an example, in the situation shown in the Fig. 5b, the node 3 is not allowed to transmit when the node 1 is transmitting because the node 3 is in the radius r centered at the node 2 that is the destination (the parent node) of the node 1. The relationships between nodes that cannot transmit simultaneously are summarized as an interference graph (Fig. 5c), where the nodes connected by an edge are not allowed to transmit simultaneously. The TDMA scheduling is the problem to determine the assignment of the nodes to time slots so as to minimize the number of slots to be needed based on the predetermined transmission graph and interference graph.

a An example of transmission paths with a tree topology. Each sensor sends its data to its parent node, and eventually, the data from all sensors is collected at the base station (BS). b An example of interference as a constraint. Consider a situation where data is being transmitted from node 1 to node 2 and simultaneously from node 3 to another node. If node 2 in the receiving state is in the communication radius r centered at node 1 and node 3, the interference of the multiple signals happens. c TDMA scheduling procedure. Nodes are iteratively assigned to time slots (the k-th TDMA slot) based on interference constraints. The tree graph shrinks as leaf nodes are removed at each iteration. The maximum independent set (MIS) of the subgraph G, formed by leaf nodes, is found for each time slot, and these nodes are scheduled and removed from the tree graph. This process repeats until all nodes are scheduled. d Experimental setup. Ns sensor nodes placed randomly in a 1.0 × 1.0 field, with low interference fields (L.I.F) and high interference fields (H.I.F). The communication radius r varies to create different interference levels. e The total number of TDMA slots (#Slots, lower is better) versus the computation time (lower is better) for simulated bifurcation (SB)-based and NetworkX (NX)-based MIS solvers when changing the number of sensor nodes (Ns) and the communication radius (r). Error bars represent standard deviation (SD) across 5 instances; individual data points for each instance are shown as faint dots. f Transition of graph features (edge density, average degree, Gini coefficient) and selected Ising machine parameters (c, dt, step), as well as the decision on which Ising machine to use, during iterations for Ns = 2000 in H.I.F. The light-colored area indicates the FPGA-based Ising machine is selected, while the dark-colored area indicates the CPU-based Ising machine is chosen.
Figure 5c illustrates the iteration procedure of the TDMA scheduling, where the kth iteration determines the nodes assigned to the kth time slot. The procedure uses two temporary graphs: a tree graph and a subgraph (G). The tree graph starts from the given transmission graph and gradually shrinks as a set of leaf nodes are removed at each iteration. Here, the removed leaf nodes are assigned to the time slot corresponding to the iteration. The iteration is repeated until all nodes are removed from the tree graph. At each iteration, the leaf nodes are the candidate nodes to transmit. The subgraph G at each iteration comprises nodes corresponding to the leaf nodes of the tree graph, where the presence or absence of edges between the nodes is determined according to the interference graph. The maximum independent set (MIS) of G corresponds to the maximum number of nodes that can transmit simultaneously in the kth slot. We use the proposed MIS solver to find the MIS of G. The size and characteristics of G dynamically change iteration-by-iteration as illustrated in Fig.1. The figures of merit for a method for TDMA scheduling are the number of slots achieved (#Slots, lower is better) and the total computation time (lower is better) needed to obtain the schedule. In this study, we adopted a simple greedy algorithm as described above, for the purpose of evaluating the scalable MIS solver. Intuitively, it is expected that higher MIS accuracy (the ability to find larger independent sets) would result in a smaller #Slots. However, this is not always the case due to the structure of the tree network. Therefore, it is important to note that in the subsequent evaluations, #Slots should be considered as a supplemental metric.
To compare the performance of the proposed MIS solver (SB) and NetworkX (NX), we systematically prepare experimental conditions (fields) as shown in Fig. 5d. The numbers of sensor nodes (Ns) in the fields are 200, 1000, or 2000, where they are randomly placed. There are two types of fields: low interference fields (L.I.F) and high interference fields (H.I.F), where the average number of the neighboring nodes within the communication radius r centered at a node are 8 for L.I.F and 50 for H.I.F. Note that in this work, the transmission graph is constructed based on the shortest paths from the BS to each node to create a balanced tree structure instead of the minimum spanning tree (MST) used in ref. 1 because the MST can sometimes produce very long sequential paths.
Figure 5e shows the total number of slots(#Slots) versus the computation time for the proposed MIS solver (SB) and NetworkX (NX). In all the conditions, the proposed MIS solver demonstrates a computational speed advantage over NX, with no significant difference in #Slots. Figure 5f shows, for the case of Ns = 2000 and H.I.F., how the graph features of the MIS problems (the subgraph G) change as the iteration proceeds, what Ising machine parameters (c, dt, step) are selected by the parameter estimator for each iteration, and which Ising machine (FPGA-based or CPU-based) is selected by the machine selector. The graph features change in a wide dynamic range (e.g. ∣V∣ decreases from 1174 to 1, while edge density increases from 0.042406 to 1.0). Responding to the variety of the problems, a wide variety of combinations of parameters and machines are appropriately selected by the ML-based technique, leading to the overall performance advantages of the proposed methodology shown in Fig. 5e.
The scalability, adaptability, and low system-wide latency demonstrated in the TDMA scheduling case suggest that the proposed framework can be applied to a broader class of dynamically changing problems, provided that they can be formulated as QUBO/Ising models and require repeated low-latency solving under changing problem instances.
-
TDMA Scheduling1: This is the application referenced in this chapter. Industrial monitoring and process automation represent important application contexts in which combinatorial optimization problems for TDMA scheduling arise under TDMA/TSCH (Time-Slotted Channel Hopping)-based communication frameworks. In process automation, end-to-end communication and schedule-update latencies on the order of hundreds of milliseconds are generally considered practical, whereas more stringent sub-10-ms requirements are typically associated with factory automation63. Accordingly, the computation time required to generate or update TDMA schedules represents a non-negligible component of the application-specific latency budget, underscoring the importance of system-wide latency evaluation. In1, the D-Wave Ising machine was used to solve problems with up to 88 nodes (r=0.4) within 20–30 ms. However, the reported computation time includes only the annealing time, excluding job queuing and offloading overhead, and therefore does not capture system-wide latency under deployment conditions. Although such deployment-related overheads are not intrinsic to quantum annealing itself and may be mitigated as hardware and execution environments mature, system-wide latency remains the practically relevant metric for this application. In contrast, our method demonstrates the ability to solve problems with up to 2000 nodes in real time, within 200 ms to 3 s. The unique scalability, adaptability, and speed of our approach make this application practically viable.
-
Wireless Interference Network Optimization60: Similar to1, this application aims to maximize throughput under interference constraints in wireless networks. The link selection problem is formulated as a Weighted Maximum Independent Set (WMIS) problem, and the solution is used to determine sets of links that can be activated simultaneously. In60, problems with QUBO sizes of 31 and 57 were solved using D-Wave in 20 ms. Again, only annealing time was considered. Our proposed MIS solver supports weighted MIS problems (with weights reflected in the h-vector rather than the J-matrix), enabling fast and practical solutions for this application.
-
Resource Allocation for Small Star Networks61: This application focuses on efficient resource allocation while suppressing interference between randomly deployed networks. It proposes a method that repeatedly searches for MIS on an interference graph and assigns the same resources to groups with minimal interference. Although61 does not utilize an Ising machine, the requirement to solve multiple types of MIS problems rapidly within a single application suggests that our proposed method would be highly effective.
-
In-Vehicle Multi-Object Tracking (MOT)7: This application formulates the association between trackers and detections in MOT as a QUBO problem using a flexible assignment method, and solves it with an Ising machine. In7, an embedded Ising machine was used to achieve over 20 FPS throughput from object detection to MOT (i.e., approximately 50 ms latency per solution). The use case was constrained to a fixed set of Ising parameters. By applying our proposed method, it becomes possible to robustly determine Ising parameters in response to variations in the number and overlap of objects, making the application more general and practical. Furthermore, the underlying combinatorial optimization problem (WMMBG) can be effectively accelerated using the indexing fast-computation architecture.
-
Other Wireless Communication Problems3: Various combinatorial optimization problems in wireless communication systems-such as channel assignment, transmission power control, scheduling, D2D resource allocation, and sensor network routing and load balancing-have been formulated as Ising models using Coherent Ising Machines (CIM) to enable fast and accurate optimization under dynamically changing environments. While the indexing fast-computation architecture may not be effective for all such formulations, the ML-assisted scalable and diversified architecture is expected to perform well in adapting to dynamically changing problems.
In these applications, the proposed framework does not alter the underlying problem formulation, but rather improves the system-level efficiency by enabling low-latency execution and adaptive parameter selection for repeatedly solved Ising-formulated subproblems.
