Adaptive Test Gaining Ground For HPC And AI Chips

Adaptive test is starting to gain traction for high-performance computing and AI chips as test programs that rely on static limits and fixed test sequences reach their practical limits.

The growing complexity of multi-die assemblies and power delivery, along with increased stresses, is forcing a shift toward real-time, data-driven optimization at the test cell.

“It’s the same old problem,” said Brent Bullock, test technology director at Advantest. “If you don’t have the right data, all the intelligence in the world doesn’t help you. You’re just guessing.”

Adaptive test offers a way to adjust test conditions on the fly, predict failures before they occur, and focus expensive test resources where they matter most. Yet adopting these techniques in production is more complicated than just bolting machine learning onto a tester. Every component of the test environment must remain stable enough for models to function. The data pipeline must be accurate, timely, and complete, and the models themselves must be carefully validated to avoid costly escapes or unnecessary overkill.

Scaling test insight
The growing reliance on real-time insight places new expectations on both the test infrastructure and the data that flows through it. Adaptive techniques can only be as effective as the measurements and models they depend on. The value is clear, but so are the challenges. Static testing may no longer be adequate, but dynamic testing introduces complexities that must be understood before they can be trusted.

The first problem is recognizing that much of what engineers want from adaptive test depends on data that is not readily accessible or simply does not exist. HPC and AI architectures are increasingly defined by their interactions across multiple voltage islands, chiplets, and localized thermal traps. Due to variation at those scales, it is extremely difficult to get consistent data. Even the same chip may produce different results depending on microenvironmental influences. Variation no longer follows clean statistical patterns across a wafer or lot. Instead, it often appears as localized shifts in timing or power behavior that emerge only under load. That makes test programs harder to optimize with traditional guardbands and predetermined sequences.
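When variation shifts locally rather than following clean lot-level statistics, one widely used alternative to fixed guardbands is dynamic part-average testing (DPAT), which recomputes limits from each wafer's own population instead of applying one static limit to every lot. The sketch below is a minimal illustration of that idea, not a production recipe; the parameter values and the 6-sigma width are assumptions.

```python
# Sketch of dynamic part-average testing (DPAT), a common adaptive-test
# building block: limits are recomputed per wafer from robust statistics
# instead of using one static guardband for every lot.
# The readings and the k=6 width are illustrative assumptions.
from statistics import median

def dpat_limits(measurements, k=6.0):
    """Derive adaptive limits from a robust center/spread estimate."""
    center = median(measurements)
    # Robust sigma from the median absolute deviation (MAD).
    mad = median(abs(x - center) for x in measurements)
    sigma = 1.4826 * mad  # MAD-to-sigma scale factor for normal data
    return center - k * sigma, center + k * sigma

def screen(measurements, k=6.0):
    lo, hi = dpat_limits(measurements, k)
    return [lo <= x <= hi for x in measurements]

# Example: a wafer population sitting near 1.0 V with one outlier die.
readings = [0.98, 1.01, 1.00, 0.99, 1.02, 1.00, 1.35, 0.97]
flags = screen(readings)
```

Because the limits come from the wafer under test, the outlier at 1.35 V is flagged even though it might pass a loose static limit set at program release.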

“You can’t fully mimic real workloads upfront,” said Alex Burlak, vice president of test and analytics at proteanTecs. “You need observability in the silicon so you can see what the device is doing while it’s operating.”

Modern devices also generate far more data than earlier-generation test systems were designed to handle. An HPC processor or AI accelerator can produce millions of data points across wafer sort, package test, and system-level evaluation. These measurements are often needed at multiple test insertions to enable predictive or adaptive decisions. Routing, synchronizing, and filtering that information fast enough to influence real-time test behavior has become a new engineering problem in its own right.
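One common answer to that data-volume problem is to reduce streams at the edge: rather than buffering millions of raw measurements, the test cell keeps running summary statistics per parameter and forwards only those. A minimal sketch using Welford's online algorithm (one illustrative reduction technique, not a description of any specific tester's pipeline):

```python
# Streaming reduction at the test cell: maintain running mean and variance
# per parameter with Welford's online algorithm, so raw measurements never
# need to be buffered before a decision. Readings are illustrative.
class RunningStats:
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0          # running sum of squared deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance of everything seen so far."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

stats = RunningStats()
for reading in (1.0, 1.2, 0.8, 1.1, 0.9):
    stats.update(reading)
```

Each update is constant time and constant memory, which is what makes per-parameter summaries feasible across thousands of test items per insertion.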

“There is a lot of value in big-data analytics, but if you want to make decisions for screening or optimization, you must deploy that on the tester,” said Burlak. “If it cannot operate in real time, you cannot use it in production.”

At the same time, engineers must confront the fact that the test cell itself introduces variability. Sockets, probe cards, load boards, and power delivery systems all add their own fingerprints to the measurement environment. When adaptive techniques tighten limits or flag marginal behavior, even small mechanical or electrical inconsistencies can influence the outcome. Stability, calibration, and continuous monitoring across the entire test stack are essential.

“The test socket or probe interconnect is often the forgotten part of the measurement path; it’s often assumed good and not a major variable,” said Jack Lewis, CTO of Modus Test. “Test sockets and probe interconnects introduce mechanical variability into the measurement, which creates electrical variability, and that variability shows up in your results.”

The emphasis in test is no longer solely on measurement accuracy, but on the timeliness, completeness, and contextual clarity of the data itself. To support adaptive decisions, test platforms must treat data as an operational asset rather than a reporting artifact. That means integrating feature extraction, metadata alignment, and genealogy tracking throughout the manufacturing flow, often in ways that challenge legacy systems.

“Validation isn’t only about the final result. It starts with the data,” said Jin Yu, head of machine learning at Teradyne. “We spend a lot of time verifying that the data quality is right before building models, because if you just throw everything together and hope for a good outcome, it won’t happen.”

The growing need for data orchestration is one of the main reasons adaptive test has not yet become standard practice. The techniques are promising, but the engineering foundation required to support them is still under construction in many organizations. For test teams that traditionally have relied on deterministic, procedure-based workflows, adopting a more fluid, data-driven model requires new investment in software and modeling, along with collaborative processes that span design, test, and manufacturing.

“It’s not just about looking at test results. You have to bring together the test data, the equipment data, the OEE data, and the prior process data so you can connect the dots,” said Aftkhar Aslam, CEO at YieldWerx. “That’s what enables true feed-forward and feedback decisions, where AI and machine learning can recommend which actions to take and which ones to avoid.”

This also raises deeper questions about how engineers validate models that influence production decisions. Predictive screening and adaptive limits appeal to cost-sensitive organizations, but only when the analysis is transparent enough to gain trust. Engineers need to know why a model labels a device as marginal, and what specific features contribute to that classification. Without that clarity, skepticism remains high.

“It’s like Clarke’s third law,” said Marc Jacobs, senior director of solutions architecture at PDF Solutions. “People want something indistinguishable from magic to solve problems that regular engineering can’t. For example, they might want a machine-learning model for field failures, but the problem with machine learning is that you need significant numbers of field failures to train the model accurately, and nobody wants to be the company shipping that many failures.”

Furthermore, errors in adaptive classification can be expensive. Overly aggressive limits can eliminate good silicon, adding direct cost and supply risk. Insufficiently conservative limits can allow latent defects to escape, where they may appear unexpectedly in the field. Both outcomes undermine the value of adaptive techniques and highlight the need for careful validation.
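The asymmetry between those two failure modes is what makes threshold selection an economic decision rather than a purely statistical one. The sketch below sweeps a hypothetical model's pass/fail threshold against asymmetric escape and overkill costs; every score and cost figure is an illustrative assumption.

```python
# Sketch of the overkill-vs-escape tradeoff: sweep a model's pass/fail
# threshold and score each setting with asymmetric costs.
# All scores, labels, and dollar figures are illustrative assumptions.

# (model_score, truly_defective) pairs; higher score = more suspicious.
parts = [(0.05, False), (0.10, False), (0.20, False), (0.30, False),
         (0.40, False), (0.55, True),  (0.60, False), (0.80, True),
         (0.90, True)]

COST_ESCAPE = 500.0   # latent defect shipped (field failure, RMA)
COST_OVERKILL = 20.0  # good die scrapped (lost yield)

def total_cost(threshold):
    cost = 0.0
    for score, defective in parts:
        flagged = score >= threshold
        if defective and not flagged:
            cost += COST_ESCAPE      # escape: defect reaches the field
        elif flagged and not defective:
            cost += COST_OVERKILL    # overkill: good silicon rejected
    return cost

# Pick the cheapest threshold over a coarse sweep.
best = min((total_cost(t / 100), t / 100) for t in range(0, 101, 5))
```

Because an escape here costs 25 times an overkill, the cheapest threshold tolerates some scrapped good dies to avoid a single field failure; with different cost ratios the sweep lands elsewhere, which is exactly why validation has to be revisited whenever the economics change.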

“When people start deploying AI at scale on manufacturing data, there are often a lot of garbage-in, garbage-out situations,” said John Kibarian, president and CEO of PDF Solutions, in a recent presentation. “You only really start to understand what your true data quality is once you have context for that data.”

Adaptive adoption challenges
Some companies are finding early success with predictive screening and dynamic limits, but others caution that mechanical variation, data quality, and model drift still limit broad adoption. Nearly everyone agrees that demand for adaptive test is rising, but most are skeptical that it will be a straightforward transition.

“Two approaches we see are hold-back or skip programs to validate changes, so you have a constant check that you’re not creating a product problem,” said Jacobs. “If the input data shifts drastically, then the model may not behave the way you expect.”

Early results are mixed because adaptive methods touch every part of the test stack. The wins tend to come where teams already have clean telemetry, disciplined limits management, and a way to execute decisions at the edge. The misses show up when model inputs drift faster than governance can respond, or when mechanical variation and noisy fixtures swamp small-signal effects. In practice, most organizations are still building the infrastructure: feature pipelines between insertions, synchronized IDs across suppliers, and guardrails that keep models inside known-safe operating envelopes.
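A minimal form of that input-drift protection is to compare a recent window of each model feature against its baseline distribution before trusting the model's output. The sketch below gates on mean shift in baseline-sigma units; the 3-sigma gate and all readings are illustrative assumptions.

```python
# Minimal input-drift gate: hold model decisions when a feature's recent
# window has moved too far from its baseline distribution.
# The 3-sigma gate and the readings are illustrative assumptions.
from statistics import mean, stdev

def drift_score(baseline, recent):
    """Mean shift of the recent window, in baseline standard deviations."""
    return abs(mean(recent) - mean(baseline)) / stdev(baseline)

baseline = [1.00, 1.02, 0.99, 1.01, 1.00, 0.98, 1.01, 0.99]
recent_ok = [1.00, 1.01, 0.99, 1.02]
recent_shifted = [1.08, 1.09, 1.07, 1.10]   # e.g. after a fixture change

GATE = 3.0  # hold model decisions if inputs moved more than 3 sigma
hold_ok = drift_score(baseline, recent_ok) > GATE
hold_shifted = drift_score(baseline, recent_shifted) > GATE
```

The point of the gate is triage, not diagnosis: a tripped gate routes the cell to debug (fixture, probe card, program change) before anyone considers retraining.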

Building trustworthy signals
The near-term playbook focuses on pragmatics. Start with use cases that are insensitive to small metrology noise, such as redundancy pruning and outlier screening on stable parameters. Instrument the test cell to separate device behavior from socket and power-path effects, and make edge execution a first-class requirement rather than a future enhancement. Treat models like production equipment that need calibration schedules, change control, and rollback plans. The goal is fewer brittle, monolithic models and more narrowly scoped ones that are easier to validate and retire when conditions change.
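One way to make "known-safe operating envelope" concrete: a model may propose tightened limits, but deployment clamps every proposal into a certified range and rolls back to the release limits if the proposal is unusable. The sketch below assumes hypothetical limit values; the clamping logic is the point, not the numbers.

```python
# Guardrail sketch: clamp model-proposed limits into a certified envelope,
# with rollback to release limits for degenerate proposals.
# All limit values are illustrative assumptions.

RELEASE_LIMITS = (0.80, 1.20)   # static limits from program release (loosest)
SAFE_ENVELOPE = (0.85, 1.15)    # tightest range certified by product eng.

def apply_guardrail(proposed_lo, proposed_hi):
    env_lo, env_hi = SAFE_ENVELOPE
    rel_lo, rel_hi = RELEASE_LIMITS
    # Inverted or degenerate proposal: roll back to the release limits.
    if proposed_lo >= proposed_hi:
        return RELEASE_LIMITS
    # Never tighter than the certified envelope, never looser than release.
    lo = min(max(proposed_lo, rel_lo), env_lo)
    hi = max(min(proposed_hi, rel_hi), env_hi)
    return lo, hi
```

An over-aggressive proposal of (0.90, 1.05) comes back as the envelope edge (0.85, 1.15), so a misbehaving model can cost some screening efficiency but cannot scrap good material outside the range product engineering signed off on.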

“Right now, the main thing people are doing is monitoring for drift and shift,” said Jacobs. “If someone makes a test-program change or fixturing moves, the model will drift. The first step is to debug the issue. If the probe card has some flux residue causing leakage, you don’t retrain your model.”

Adaptive test also hinges on explainability. Test teams need to know which features drove a decision and how those features map to physical failure modes, not just a test score.

“For a long time, when we handled the test data, we assumed the test instrument was perfect, but it’s not,” said Teradyne’s Yu. “The instrument data, calibration data, and service patterns all need to be considered. If the model only looks at the device side, it doesn’t see when the instrument itself needs attention.”

That transparency is what lets a product engineer decide whether to tighten a limit, route a lot to stress testing, or trigger a tool-side investigation. Without it, the safest choice is to do nothing, which defeats the purpose of adaptive control.

Even when organizations have the infrastructure to support adaptive methods, the fundamental challenge often comes down to signal quality. The parameters most useful for prediction are rarely the ones that are easiest to measure. Subtle timing margins, droop signatures, thermal interactions, and coupling effects can drift as the package settles or as the device experiences real workloads. Extracting meaningful signals requires instrumentation that can survive variation in the test cell, electrical loading, and environmental conditions.

“You need enough visibility into the device to know what is happening electrically,” said proteanTecs’ Burlak. “If you only see the final outcome, you cannot tell whether the part is changing or your environment is changing.”

Still, even with built-in observability, small shifts in external conditions can distort the measurements feeding an adaptive model. Thermal settling, probe-to-pad contact variation, and power-path impedance all influence how a device behaves under stress. These factors become more pronounced in HPC and AI devices that operate near thermal or electrical limits. If the environment shifts faster than the model can track it, previously stable features become unreliable.

“Latent defects aren’t discovered at time zero. It usually requires some kind of stress put on the device — high voltage or temperature over time — and burn-in serves that purpose,” said Davette Berry, senior director of customer programs and business development at Advantest. “Most commercial products have gotten away from doing burn-in, but most of these high-performance compute devices are having to put it back into the product test flow, because having it fail six hours after it’s been installed in the data center is much worse than adding a burn-in test insertion.”

Once a clean signal is identified, the next challenge is aligning measurements across insertions. Many adaptive strategies rely on correlating wafer-sort behavior with package or inspection data, which requires stable identifiers and comparable conditions. Small mismatches between insertions can produce apparent trends that have nothing to do with silicon health. Engineers often discover inconsistencies not in the device, but in the metadata that accompanies it.
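A minimal version of that alignment step is a join on a stable die identity that surfaces traceability gaps before any trend is computed on the merged data. The record fields, IDs, and readings below are all hypothetical; the point is that an unmatched record is treated as a metadata problem, not as silicon drift.

```python
# Cross-insertion alignment sketch: join wafer-sort and package-test
# records on a stable die identity (lot, wafer, x, y) and surface
# mismatches before computing deltas. All fields and values are
# hypothetical.

sort_records = {
    ("LOT1", 3, 10, 4): {"idd": 1.02, "bin": 1},
    ("LOT1", 3, 11, 4): {"idd": 1.08, "bin": 1},
    ("LOT1", 3, 12, 4): {"idd": 0.99, "bin": 1},
}
package_records = {
    ("LOT1", 3, 10, 4): {"idd": 1.05},
    ("LOT1", 3, 11, 4): {"idd": 1.31},   # apparent shift at package test
    ("LOT1", 3, 13, 4): {"idd": 1.01},   # no matching sort record
}

matched, orphans = {}, []
for die_id, pkg in package_records.items():
    sort = sort_records.get(die_id)
    if sort is None:
        orphans.append(die_id)           # traceability gap, not silicon drift
    else:
        # Insertion-to-insertion delta, only defined for matched dies.
        matched[die_id] = pkg["idd"] - sort["idd"]
```

Only the matched population feeds any adaptive decision; the orphan list goes back to whoever owns ID continuity across the supply chain.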

“In the ideal situation, you would measure every location, every die, every wafer, but that’s not possible,” said Joe Kwan, director of product management at Siemens EDA. “In reality, the data is very sparse. What we do is take that sparse metrology, combine it with the design data and the process information, and put it into a digital twin. Then we can predict what the metrology would look like in all the other places we didn’t physically measure.”

By extending sparse measurements with modeled context, engineers gain a fuller view of silicon behavior while avoiding the cycle-time hit of expanded inspection. That broader context becomes essential as packaged devices exhibit new thermal and electrical sensitivities that do not always appear at wafer sort.
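The digital-twin prediction Kwan describes combines design and process data; as a much simpler stand-in for the same idea, the sketch below fills unmeasured wafer sites from sparse metrology using inverse-distance weighting. The coordinates and readings are illustrative, and this is not a description of any vendor's actual model.

```python
# Sparse-metrology interpolation sketch: estimate unmeasured wafer sites
# from a handful of measured ones via inverse-distance weighting, a crude
# stand-in for the digital-twin prediction described above.
# Coordinates and thickness values are illustrative assumptions.
import math

# Sparse metrology: (x, y) die coordinates -> measured film thickness (nm)
measured = {(0, 0): 100.0, (10, 0): 104.0, (0, 10): 98.0, (10, 10): 102.0}

def predict(x, y, power=2.0):
    """Inverse-distance-weighted estimate at an unmeasured site."""
    num = den = 0.0
    for (mx, my), value in measured.items():
        d = math.hypot(x - mx, y - my)
        if d == 0.0:
            return value                 # site was actually measured
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

center = predict(5, 5)   # equidistant from all four measured sites
```

A real digital twin replaces the distance weighting with physics and design context, but the usage pattern is the same: the sparse measurements anchor the model, and the model fills in everywhere else.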

The importance of physical context extends to the test conditions. Adaptive decisions are only as good as the stability of the conditions they rely on. Temperature gradients, socket thermal impedance, and power-delivery transients can create differences between otherwise identical devices. For HPC parts that draw large, rapidly changing currents, the test cell must be instrumented to capture the true mechanical and electrical state at the moment of measurement.

The result is a growing recognition that adaptive test is not simply a data-science problem. It is an infrastructure problem, a measurement science problem, and a correlation problem all rolled into one. For organizations pursuing adaptive methods, the early wins tend to come from reducing noise, instrumenting the environment, and aligning data structures across the test flow. The more stable the signal, the more reliable the model, and the easier it becomes to introduce controlled adaptation without risking escapes or unnecessary yield loss.

Adaptive test as new discipline
If adaptive test is going to scale beyond isolated use cases, the supporting infrastructure has to evolve from a deterministic, stepwise flow into something that can react in real time. Traditional test assumes stability — stable limits, fixturing, data paths, and relationships between insertions. Adaptive methods challenge those assumptions by introducing feedback loops into places that historically only consumed data. That shift exposes weaknesses in handoffs, metadata, and change control that were less visible in fixed-sequence programs.

A bottleneck arises when test engineers attempt to combine device telemetry, environmental measurements, and upstream context into a single actionable model. Each signal reflects a slightly different view of the device, and unless those views are synchronized, the model will treat inconsistency as variation. Organizations accustomed to static guardbands may find that their first adaptive experiments fail because the infrastructure around the tester was never designed to maintain continuity across so many moving parts, not because the underlying science is flawed.

Adaptive test also alters who owns the decisions. Limits engineering, product engineering, data science, and manufacturing operations all have to treat test as a dynamic system rather than a static script. That requires new governance, calibrations, and explicit roles for validating models and monitoring drift. Companies that make progress with adaptive test tend to be the ones willing to unify disciplines that historically operated in parallel rather than together.

This is not primarily a modeling problem. It is a coordination problem. Production environments are full of micro-decisions that depend on accurate IDs, consistent metadata, synchronized timestamps, and stable supply-chain handoffs. Even a strong predictive model will fail if it receives misaligned features or conditions that drift faster than its update cycle.

“Collaboration isn’t something that happens only at the tail end,” said YieldWerx’s Aslam. “It runs through the entire product lifecycle from design and DFT, to building and evaluating test chips, to feeding those results back into simulation and verification. As you move into manufacturing, you’re working with product, package, test, and quality engineers, and even field teams once early samples reach applications. It’s a closed loop, and every stage influences the next.”

The move toward real-time control also forces teams to revisit how engineering work is divided. Historically, test engineers wrote patterns, debugged failures, and tuned limits, while manufacturing owned execution. Adaptive programs blur those boundaries. If a model tightens or relaxes limits at the edge, someone must define the allowed range of motion, certify that the signals feeding the model have not shifted, and decide when the model should be retrained, rolled back, or retired. None of those responsibilities map neatly to legacy roles.

What emerges is a picture of adaptive testing as both a technical and cultural transition. The technical work involves cleaning signals, synchronizing data, instrumenting the test cell, and validating models under drift. The cultural work involves redefining ownership and closing communication gaps between design, test, and fabrication. Treating adaptive test as an engineering discipline, rather than as a bolt-on feature, is what turns isolated wins into a sustainable capability.

Conclusion
Adaptive test is moving from theory to practice, but the transition is neither linear nor guaranteed. The promise is certainly compelling. Engineers want test systems that respond to real device behavior, allocate time where it matters most, and prevent downstream surprises in high-value HPC and AI products. Yet each step toward that future exposes another dependency. Clean telemetry, stable measurement conditions, synchronized identifiers, interpretable models, and edge execution all need to align before adaptation can safely influence production decisions.

The encouraging trend is that these capabilities are emerging together. Built-in observability is improving. Test cell instrumentation is becoming more precise. Analytics pipelines are maturing, and the organizations deploying them are learning what it actually takes to maintain model health over time. At the same time, engineers are developing a more practical understanding of when adaptive methods add value and when they introduce unnecessary risk. Most of the early successes come from removing noise, tightening correlations, and establishing the governance needed to trust even small adjustments, not from aggressive control.

The real shift is cultural as much as technical. Adaptive test requires teams to think of models the way they think of equipment. It requires organizations to treat data lineage as part of the manufacturing flow and to align design, test, and operations around shared signals rather than isolated metrics. That shift will take time, but the direction is clear. As complexity continues to rise, deterministic programs will struggle under the weight of their own assumptions, and dynamic strategies will become less an innovation than a necessity. The companies that succeed will be the ones that build stable signals first, apply adaptation cautiously, and treat explainability as a requirement rather than an aspiration.


