The “ground truth” gap: Why pixels need people



The future of surveying, mapping, and global sustainability requires a combination of satellite technology and human insight.

Written by Priscillia Moulin

Let me paint a picture that will be familiar to anyone who works with spatial data.

A geospatial analyst is monitoring a compliance dashboard when a cluster of pixels in a remote location changes from deep green to red.

A machine learning algorithm generates an automatic alert flagging a high-probability deforestation event. Within days, a smallholder farmer thousands of kilometers away is flagged by corporate buyers and locked out of global supply chains.

The catch? The farmer had not cleared a single hectare of primary forest. Instead, the alert may have been triggered by technical and data limitations, environmental and biological factors, human activity, land use misclassification, or analysis errors. Orbital observation in 2026 is undoubtedly impressive, but it cannot reliably determine intent or causality, nor can it account for complex local dynamics.

This critical flaw will matter even more once the landmark European Union Deforestation Regulation (EUDR) takes effect at the end of 2026.

(Editor’s note: EUDR applies to Australian companies exporting certain goods such as beef and timber, as well as derivative products such as leather, to the EU. They must provide geolocation information about the source of the product at the time of import.)

Big commodity buyers are desperate to comply, and many are leaning on the golden age of AI and satellite imagery to help them understand deforestation. However, as highlighted above, relying solely on remote sensing to monitor agricultural supply chains creates a damaging “ground truth gap.”

For the surveyors, geospatial professionals, and data scientists building these systems, it’s time to critically examine the limitations of pixels. We also need to recognize that automated models built without human geography have fundamental flaws that need to be addressed.

Algorithm limitations

In the rush to expand supply chain mapping, compliance platforms rely heavily on automated land cover classification. Modern satellite constellations provide incredible temporal frequency and spatial resolution, but the machine learning models that interpret their images are not perfect.

Although algorithms are good at detecting land cover change, they are not designed with context, intent, or legality in mind. For those of us working at the intersection of spatial data and tropical agriculture, there are several possible causes of, and blind spots behind, a deforestation alert to be aware of.

  • Intentional clearing: Deliberate and illegal deforestation to clear new land for cash crops.
  • Natural phenomenon: Accidental or natural landscape disturbances, such as damage from localized fires, flooding, or severe storms.
  • Permitted agriculture: Legal and planned rotational harvesting, fully consistent with recognized community land ownership rights.
  • Civil development: Unrelated rural infrastructure projects, such as new residential roads or power lines, that are completely disconnected from the agricultural supply chain.
  • Invisibility of land ownership: Much of the world’s agricultural production takes place on completely unregistered land with no formal legal title. Satellites cannot detect cadastral boundaries or the legal status of land, making it impossible to comply with regulations requiring verifiable land ownership without ground-level administrative data.
  • Protected areas overlap: Algorithms can detect trees, but without integrated spatial mapping, agricultural expansion cannot be automatically cross-referenced to legally designated protected areas or permanent forest reserves.
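As a minimal sketch of what "integrated spatial mapping" could look like in practice, the snippet below checks an alert location against reference layers that only ground-level administrative data can supply. The layer names (`protected_area`, `registered_plot`), the toy coordinates, and the helper functions are all hypothetical illustrations, not part of any real compliance platform, and a production system would use a proper GIS library rather than this hand-rolled point-in-polygon test.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]   # (x, y), for example lon/lat
Polygon = List[Point]         # simple ring without holes


def point_in_polygon(pt: Point, ring: Polygon) -> bool:
    """Ray-casting test: count crossings of a ray running right from pt."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through pt?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def context_for_alert(alert_pt: Point,
                      layers: Dict[str, List[Polygon]]) -> List[str]:
    """Return the name of every reference layer containing the alert point."""
    return [name for name, polys in layers.items()
            if any(point_in_polygon(alert_pt, p) for p in polys)]


# Hypothetical reference layers with toy coordinates (not real data).
layers = {
    "protected_area": [[(0, 0), (4, 0), (4, 4), (0, 4)]],
    "registered_plot": [[(3, 3), (6, 3), (6, 6), (3, 6)]],
}

print(context_for_alert((3.5, 3.5), layers))  # inside both layers
print(context_for_alert((5.0, 5.0), layers))  # registered plot only
print(context_for_alert((9.0, 9.0), layers))  # no ground context available
```

The third case is the one the list above warns about: the pixel change is real, but with no cadastral or protected-area record behind it, the alert alone says nothing about legality.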

Alerts on water

The image below is a good example. It shows a series of red polygons (deforestation alerts) appearing across the surface of a lake and along its edges. Forest cover loss cannot occur on open water, because canopy vegetation does not grow there, for both geomorphological and ecological reasons. Alerts in these locations are pure false positives.

Several technical factors typically cause errors like these:

  1. Seasonal water level fluctuations that expose edge vegetation and substrate, so that changes in spectral reflectance between acquisition periods are read as forest loss;
  2. Residual thin cloud cover, cloud shadows, and atmospheric haze that were not fully corrected during preprocessing;
  3. Sun glint on calm water, which produces NIR/SWIR signatures similar to vegetation loss; and
  4. A water mask that has not been updated to reflect lake morphology dynamics (edge expansion or contraction).
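One way to catch this class of error is to compare each alert against a buffered water mask before it reaches a dashboard. The sketch below is illustrative only: it assumes alerts and the water mask are available as sets of raster cells on the same grid, and the one-pixel shoreline buffer and 0.5 threshold are arbitrary placeholder values, not recommendations.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]  # (row, col) index of a raster pixel


def dilate(mask: Set[Cell], pixels: int = 1) -> Set[Cell]:
    """Grow a mask by `pixels` cells in every direction (8-neighbourhood)."""
    grown = set(mask)
    for _ in range(pixels):
        grown |= {(r + dr, c + dc)
                  for r, c in grown
                  for dr in (-1, 0, 1)
                  for dc in (-1, 0, 1)}
    return grown


def water_fraction(alert: Set[Cell], water: Set[Cell],
                   shoreline_buffer: int = 1) -> float:
    """Share of alert pixels on water or within the shoreline buffer."""
    if not alert:
        return 0.0
    buffered = dilate(water, shoreline_buffer)
    return len(alert & buffered) / len(alert)


def likely_water_artifact(alert: Set[Cell], water: Set[Cell],
                          threshold: float = 0.5) -> bool:
    """Flag the alert for review; the threshold is an assumed placeholder."""
    return water_fraction(alert, water) >= threshold


# Toy lake occupying a 3 x 3 block of pixels (hypothetical data).
lake = {(r, c) for r in range(3) for c in range(3)}

alert_on_lake = {(1, 1), (1, 2), (2, 3), (3, 3)}
alert_on_land = {(10, 10), (10, 11)}
alert_mixed = {(3, 3), (4, 4), (5, 5), (6, 6)}

print(water_fraction(alert_on_lake, lake))  # every pixel on or beside water
print(water_fraction(alert_on_land, lake))  # no overlap with the lake
print(water_fraction(alert_mixed, lake))    # partial overlap at the shoreline
```

The buffer is there because the mask itself may be stale, as noted in the fourth factor. A flag from this check is a reason to route the alert to a person for review, not a reason to discard it automatically.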

Risk of leakage

What happens when we use spatial data to drive automated enforcement without human context? We oversimplify complex geographies and create paths of least resistance.

Faced with red polygons on compliance dashboards and harsh EUDR penalties, or the threat of violating their “No Deforestation, No Peat, No Exploitation” (NDPE) commitments, corporate buyers almost always opt for self-preservation. They drop the flagged suppliers and source their goods from elsewhere.

This creates a dangerous illusion of compliance. The company’s dashboard looks clean and the spreadsheet shows zero deforestation. However, this compromises both data integrity and sustainable development.

When vulnerable smallholder farmers are suddenly locked out of premium regulated markets due to algorithmic false positives, they don’t simply pack up their tools and quit farming. Out of financial necessity, they have to find another buyer. This dynamic pushes farmers into “leakage markets”: regions or buyers with lower environmental standards, fewer NDPE policies, zero EUDR oversight, and lower prices.

These leakage markets allow deforestation to continue unchecked. One unintended consequence of laws like the EUDR is that they use advanced spatial techniques to clean up Europe’s supply chains while quietly pushing the actual root causes of deforestation into the shadows. Accurate environmental monitoring cannot be achieved at the expense of ground reality.

Technology needs people

To build monitoring systems that actually work, the geospatial industry must instead adopt more sophisticated methodologies. Satellites and AI are tools, but validating their output requires accurate baselines and verified ground data.

Integrating human validation into spatial workflows may feel like a step backwards for those embedded in the world of technology. However, when it comes to extremely important issues like deforestation, we need to make full use of orbital technology and human geography to ensure that NDPE policies and laws like the EUDR have a truly positive impact. Effective due diligence must go beyond simple risk screening. An end-to-end workflow that combines landscape analysis and field implementation is required.

First, the monitoring system must be anchored to a validated baseline. Deforestation alerts cannot be generated from generic, outdated global datasets. Doing this properly requires a rigorous methodology that combines high-resolution spatial data of plantations with digital cadastral records and official protected area maps.

By linking suppliers to specific plot polygons with stable IDs, validating those polygons, and attaching physical ownership documents, companies can build legally defensible master datasets. Without this detailed traceability, spatial alerts cannot be translated into meaningful accountability.
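To make the idea of a master dataset concrete, here is a small sketch of a plot registry keyed by stable IDs, with an audit that lists what each record still lacks. The `PlotRecord` fields, the ID formats, and the three completeness checks are assumptions chosen for illustration; the actual evidence a regulator or auditor would accept is not specified here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PlotRecord:
    plot_id: str                      # stable ID, never reused or reassigned
    supplier_id: str
    ring: List[Tuple[float, float]]   # plot boundary vertices (lon, lat)
    field_validated: bool = False     # boundary confirmed on the ground?
    ownership_docs: List[str] = field(default_factory=list)

    def gaps(self) -> List[str]:
        """List what is missing before this record could be relied on."""
        problems = []
        if len(self.ring) < 3:
            problems.append("polygon has fewer than 3 vertices")
        if not self.field_validated:
            problems.append("polygon not validated on the ground")
        if not self.ownership_docs:
            problems.append("no ownership document attached")
        return problems


def audit(registry: Dict[str, PlotRecord]) -> Dict[str, List[str]]:
    """Return only the plots with at least one gap, keyed by plot ID."""
    report = {}
    for plot_id, record in registry.items():
        problems = record.gaps()
        if problems:
            report[plot_id] = problems
    return report


# Hypothetical registry entries (IDs and file names are made up).
registry = {
    "PLOT-0001": PlotRecord(
        plot_id="PLOT-0001",
        supplier_id="SUP-17",
        ring=[(101.0, 1.0), (101.1, 1.0), (101.1, 1.1), (101.0, 1.1)],
        field_validated=True,
        ownership_docs=["land_certificate.pdf"],
    ),
    "PLOT-0002": PlotRecord(
        plot_id="PLOT-0002",
        supplier_id="SUP-17",
        ring=[(101.2, 1.0), (101.3, 1.0)],
    ),
}

for plot_id, problems in audit(registry).items():
    print(plot_id, problems)
```

Because alerts attach to a plot ID rather than to a supplier name, one supplier with several plots is not condemned for a change detected on a single parcel.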

Second, technology must start a conversation, not trigger an automatic shutdown. High-risk tree cover loss alerts should be paired with response protocols that require on-site verification. A local surveyor or agronomist must then establish the true context of the clearing and determine whether it is intentional, legally permitted, or related to a complex land ownership dispute. This includes assigning a risk classification and defining specific mitigation actions for flagged plots.
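A response protocol of this kind can be sketched as a simple rule: an alert with no field finding can only ever produce a site visit, never a sourcing decision. The finding categories below follow the three outcomes named above plus a false-positive case; the risk levels and mitigation actions in `RESPONSE` are hypothetical placeholders, not a recommended policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Finding(Enum):
    INTENTIONAL_CLEARING = "intentional clearing"
    LEGALLY_PERMITTED = "legally permitted"
    TENURE_DISPUTE = "land ownership dispute"
    FALSE_POSITIVE = "false positive"


# Hypothetical mapping from field finding to (risk class, mitigation action).
RESPONSE = {
    Finding.INTENTIONAL_CLEARING: ("high", "suspend sourcing and agree a remediation plan"),
    Finding.LEGALLY_PERMITTED: ("low", "attach the permit to the plot record"),
    Finding.TENURE_DISPUTE: ("medium", "escalate to a land tenure specialist"),
    Finding.FALSE_POSITIVE: ("none", "close the alert and log the cause"),
}


@dataclass
class Alert:
    alert_id: str
    plot_id: str
    finding: Optional[Finding] = None  # set only after on-site verification


def decide(alert: Alert) -> Tuple[str, str]:
    """Return (risk class, next action) for an alert."""
    if alert.finding is None:
        # No ground truth yet: the only permitted outcome is a site visit.
        return ("unverified", "dispatch a local surveyor or agronomist")
    return RESPONSE[alert.finding]


alert = Alert(alert_id="A-42", plot_id="PLOT-0001")
print(decide(alert))                      # before the field visit

alert.finding = Finding.LEGALLY_PERMITTED
print(decide(alert))                      # after the field visit
```

The design choice worth noting is that supplier suspension is reachable only through a recorded finding, so the dashboard cannot short-circuit the conversation.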

The shift to Verification-as-a-Service

The broader market is already waking up to the liability of relying solely on algorithms. We are witnessing a clear and rapid shift in AgTech investment trends. Money is flowing away from standalone Software-as-a-Service (SaaS) monitoring platforms and into Agriculture Technology-as-a-Service (ATaaS) and Verification-as-a-Service (VaaS) models.

[Photo: author Priscillia Moulin speaking to an audience]

Investors and big brands now realize that satellite imagery is just raw material, and the real value lies in verified insights. The VaaS model combines scalable remote sensing with a robust network of local agronomists, surveyors, and field agents.

This hybrid approach provides the high-fidelity, legally defensible data required by rigorous frameworks such as EUDR, while significantly reducing reputational and supply chain risks associated with automated false positives.

Conclusion

The EUDR is a landmark piece of legislation that has provided the geospatial industry with an essential tool for unprecedented visibility into global supply chains. But we must remember that laws and satellites are just that: tools.

Laws and satellites don’t save forests; humans save them. Extracting compliance data from space is a great first step, but it is incomplete without an accurate baseline and verifiable ground activity.

The future of surveying, mapping, and global sustainability requires combining the best of satellite technology with human insight, recognizing the messy complexities of the truth on the ground, and working toward a more just and accurate future for everyone.

Priscillia Moulin is co-founder and strategy director at MosaiX and senior advisor to Earthqualizer Foundation and Inovasi Digital.




