From the black box to the watchtower: governing AI in times of conflict

When a military conflict can escalate within hours, the tools used to predict it cannot afford to be opaque. Artificial intelligence (AI) is already shaping the way governments assess threats, monitor flashpoints, and weigh the costs of intervention. However, the systems that perform this task often operate as black boxes, producing unexplained probabilities and unaccountable conclusions. The gap between predictive power and institutional trust is not a technical footnote. It’s a governance crisis.

The opacity problem

For decades, high-performance models such as support vector machines and deep neural networks have delivered superior accuracy in prediction benchmarks. But when war and peace are at stake, accuracy alone is not enough. A predicted score that cannot be interrogated cannot be challenged, audited, or owned. When policymakers act on results they do not understand, they are not exercising judgment; they are outsourcing it.

The result might be called the algorithm myth: a quiet deference to machine output dressed up as evidence-based decision-making. In diplomacy, this is not just intellectually unsatisfying. It is dangerous. Historically, misread signals, unchallenged assumptions, and opaque escalation paths have contributed to catastrophic miscalculations. Adding an unaccountable AI layer does not reduce that risk. It compounds it.

Rough sets and legible uncertainty

One underappreciated alternative is rough set theory, a framework that treats ambiguity as a signal worth preserving rather than a flaw to be engineered away. Rather than forcing geopolitical complexity into a clean probabilistic output, rough sets organize knowledge into zones of certainty, possibility, and uncertainty. The boundary region, where it is genuinely unclear whether conflict will occur or can be avoided, is not a weakness of the model; it is its most important artifact. Complementing this, fuzzy logic provides a way to express degrees of truth, capturing the reality that geopolitical situations are rarely binary but sit along a continuum such as “high tension,” “moderate instability,” or “low risk.” Rough sets depict the structure of uncertainty, whereas fuzzy systems quantify its extent, assigning interpretable membership values that reflect partial membership rather than strict classification.
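To make the distinction concrete, here is a minimal sketch of the rough-set idea in Python. The country cases, indicator names, and values are illustrative assumptions, not a real early-warning dataset; the point is how the lower approximation, upper approximation, and boundary region fall out of indiscernible cases.

```python
# Minimal rough-set sketch with hypothetical cases and indicators.
from collections import defaultdict

# Each case is described by coarse, observable indicators (illustrative values).
cases = {
    "A": {"arms_inflow": "high", "displacement": "rising", "conflict": True},
    "B": {"arms_inflow": "high", "displacement": "rising", "conflict": False},
    "C": {"arms_inflow": "low",  "displacement": "stable", "conflict": False},
    "D": {"arms_inflow": "high", "displacement": "stable", "conflict": True},
}

def partition(cases, attrs):
    """Group cases that are indiscernible on the chosen attributes."""
    blocks = defaultdict(set)
    for name, c in cases.items():
        blocks[tuple(c[a] for a in attrs)].add(name)
    return list(blocks.values())

def approximations(cases, attrs, target):
    """Lower and upper approximations of the set of cases where target is True."""
    lower, upper = set(), set()
    for block in partition(cases, attrs):
        in_target = {n for n in block if cases[n][target]}
        if in_target == block:   # every indiscernible case is in conflict
            lower |= block
        if in_target:            # at least one indiscernible case is
            upper |= block
    return lower, upper

lower, upper = approximations(cases, ["arms_inflow", "displacement"], "conflict")
boundary = upper - lower          # the zone of genuine ambiguity
print("certainly conflict-prone:", lower)     # {'D'}
print("possibly conflict-prone: ", upper)     # {'A', 'B', 'D'}
print("boundary region:         ", boundary)  # {'A', 'B'}: the cases needing scrutiny
```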


For diplomats and security analysts, this combination is strategically powerful. Rough sets pinpoint where uncertainty is concentrated, directing attention and resources to the most unstable cases, while fuzzy logic refines those insights by quantifying how strongly particular conditions contribute to risk. Together, they allow earlier and more targeted intervention without sacrificing interpretability. Crucially, both approaches produce linguistic, rule-based output: traceable statements connecting observable conditions to expected outcomes, rather than the opaque numerical weights of conventional neural networks. This creates an audit trail. Policymakers can examine what the system relies on, challenge assumptions that look flawed, and take real responsibility for the decisions that follow.
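As an illustration of what such traceable, rule-based output can look like, the sketch below uses hypothetical membership functions and rules; the indicator names, breakpoints, and rule set are assumptions for demonstration only. What matters is that every conclusion carries the rules and firing strengths that produced it.

```python
# Fuzzy, rule-based sketch: the trace is the point, not the score alone.

def tension_high(level):        # level in [0, 1]
    """Degree to which political tension counts as 'high' (assumed breakpoints)."""
    return max(0.0, min(1.0, (level - 0.4) / 0.4))

def displacement_rising(rate):  # rate in [0, 1]
    """Degree to which displacement counts as 'rising' (assumed breakpoints)."""
    return max(0.0, min(1.0, (rate - 0.2) / 0.5))

def assess(tension, displacement):
    """Fire each rule and keep the trace alongside the aggregated risk."""
    rules = [
        ("IF tension is high AND displacement is rising THEN risk is high",
         min(tension_high(tension), displacement_rising(displacement))),
        ("IF tension is high THEN risk is elevated",
         tension_high(tension)),
    ]
    risk = max(strength for _, strength in rules)
    return risk, rules

risk, trace = assess(tension=0.75, displacement=0.5)
print(f"risk membership: {risk:.2f}")
for rule, strength in trace:
    print(f"  {strength:.2f}  {rule}")   # every conclusion points back to a rule
```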

From bounded rationality to controlled rationality

The political scientist Herbert Simon showed that human decision-making is bounded, constrained by limited information, cognitive capacity, and time. AI systems extend those limits, acting as rationality multipliers: they process more data, identify more patterns, and model more scenarios than human analysts can on their own.

But extended rationality without governance is not a solution. It is a new category of risk. Biases embedded in training data are amplified at scale. Models trained on past patterns can misread novel configurations. Illusory confidence can masquerade as analytical rigor. The challenge, therefore, is not only to build more powerful models that improve our ability to predict conflict. It is to govern the expansion of rationality itself and to ensure that AI augments, rather than replaces, human judgment in high-stakes decisions.

This means building transparency, traceability, and human oversight into AI systems by design, rather than as an afterthought. This means evaluating systems not only for predictive accuracy, but also for explainability, auditability, and alignment with legal and ethical standards. And that means structurally centering human judgment in high-stakes decisions, not as a formality, but as a real safeguard.

Multilateral obligations

No single state can, or should, set the standards for responsible AI use in the security domain. Conflict prevention is inherently transnational. Key signals such as refugee flows, arms transfers, economic shocks, and political violence cross borders, and so must the frameworks that govern the systems reading them.

Multilateral institutions have an important but largely unfulfilled role to play. They can broker data-sharing agreements between governments that would otherwise hoard information for competitive advantage. They can facilitate interoperability between national early warning systems. And they can reduce the technological asymmetries that risk turning AI-powered conflict prediction into an instrument of great-power dominance rather than collective security.


This requires creative governance architecture. That means a trusted data space with defined access rules, an independent international auditing body with real powers to assess high-risk AI systems, and shared transparency standards that don’t require states to disclose sensitive capabilities. The goal is a minimum viable layer of accountability sufficient to prevent misunderstandings, reduce strategic miscalculations, and build the cross-border trust necessary for effective conflict prevention.

Sovereignty and collective responsibility will inevitably be pulled in different directions. This tension cannot be resolved, but it can be managed through institutions designed for that purpose.

Hybrid systems, hybrid trust

The most effective early warning systems will be neither purely interpretable nor purely high-performance. They will be hybrids that combine the predictive power of complex models with the readability of transparent approaches such as rough sets and neuro-fuzzy logic. High-performance models identify risk at scale; interpretable models describe it in terms decision-makers can act on. Neither alone is enough. Together they constitute more than a predictive tool: a system that can build trust between machines and the humans who ultimately have to answer for what those machines recommend.
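A rough sketch of that hybrid pattern follows, with a placeholder scoring function standing in for the complex model and a hand-written rule layer standing in for the interpretable one; both are assumptions for illustration, not any deployed system.

```python
# Hybrid sketch: an opaque scorer flags cases at scale, an interpretable
# rule layer explains the flagged cases in terms analysts can challenge.

def black_box_score(features):
    """Stand-in for a complex model (e.g. a neural network) returning a risk score."""
    weights = {"arms_inflow": 0.5, "displacement": 0.3, "economic_shock": 0.2}
    return sum(weights[k] * v for k, v in features.items())

def explain(features):
    """Interpretable layer: linguistic rules firing on the same indicators."""
    reasons = []
    if features["arms_inflow"] > 0.6:
        reasons.append("arms inflow is high")
    if features["displacement"] > 0.4:
        reasons.append("displacement is rising")
    if features["economic_shock"] > 0.5:
        reasons.append("economic shock is severe")
    return reasons

case = {"arms_inflow": 0.8, "displacement": 0.55, "economic_shock": 0.2}
score = black_box_score(case)
if score > 0.5:   # flagged at scale by the opaque model...
    # ...but handed to decision-makers with reasons they can contest.
    print(f"risk score {score:.2f}, because: " + "; ".join(explain(case)))
```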

That trust is not a luxury. It is a prerequisite for everything else in conflict prevention.

The language of peace

The goal of AI in conflict management should not be to eliminate uncertainty, which is impossible, but to make uncertainty legible, contestable, and governable. A system that tells diplomats there is a 70% chance of conflict has accomplished something. A system that also explains why, shows where the evidence is weakest, and lays out the assumptions driving its conclusion is far more useful.

The language of peace has always demanded precision, nuance, and the courage to act on incomplete information. AI can strengthen those capacities, but only if we insist that it speak in a language we can understand, challenge, and ultimately hold accountable. Prediction without accountability is simply a more sophisticated way of not knowing. Governed intelligence is something else entirely.
