Databricks introduces Genie ZeroOps for data and AI operations

Databricks announced Genie ZeroOps, an autonomous background agent designed to monitor operational data and AI workloads, investigate issues, and suggest fixes that teams can validate before applying.

The new agent is built into the Databricks platform and is intended to help data teams spend less time maintaining production pipelines, jobs, tables, machine learning models, and related assets.

Databricks said data and AI teams spend significant time responding to operational issues such as broken pipelines, upstream schema changes, late-arriving data, silent data quality issues, and machine learning model drift. The company said the rise of large language models and agent development tools has let teams ship pipelines and models faster, increasing the need for automated operational support.

Genie ZeroOps works by continuously monitoring data and AI assets, detecting failures and data quality issues, assessing root causes using Unity Catalog lineage, generating suggested fixes, and validating them in a secure sandbox before any operational changes are made.

Because the agent runs within Databricks, it has access to platform observability data, including metrics, events, logs, execution history, and lineage information, while operating under Unity Catalog governance. According to Databricks, this allows Genie ZeroOps to trace issues back to their source, such as code bugs, upstream schema changes, or bad data introduced by another pipeline.

The platform uses a sandbox environment with zero-copy shallow clones of production data, scoped permissions, and network isolation. This allows proposed fixes to be tested against real data without touching production systems.
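For readers unfamiliar with shallow clones, the short Python sketch below builds the Databricks SQL statement that creates one. It illustrates the general Delta Lake feature only; the announcement does not describe how Genie ZeroOps provisions its sandbox, and the helper name and table names are hypothetical.

```python
def shallow_clone_sql(source_table: str, sandbox_table: str) -> str:
    """Build a Databricks SQL statement that shallow-clones a Delta table.

    A shallow clone copies only table metadata and points at the source
    table's existing data files, so no production data is duplicated.
    """
    return f"CREATE OR REPLACE TABLE {sandbox_table} SHALLOW CLONE {source_table}"


# Hypothetical table names, used for illustration only.
statement = shallow_clone_sql("prod.sales.orders", "sandbox.sales.orders_fix_test")
print(statement)
```

Writes to the clone land in new files under the clone's own location, which is why a fix can be exercised against real data while the source table stays unchanged.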

According to Databricks, Genie ZeroOps is designed for data and AI operations, not general coding assistance. The company noted that while coding agents are useful for creating software, they typically lack access to the telemetry, lineage, managed production data, and secure validation environments needed to discover, diagnose, and validate modifications to data and AI workloads.

For machine learning workloads, Genie ZeroOps can diagnose problems when a model continues to run but its predictions have degraded. The agent can build modified candidates, evaluate them against the same evaluation suite used for the production model, and surface a replacement only if its performance is measurably better.
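The evaluation gate described above can be sketched in a few lines of Python. This is a minimal illustration that assumes a candidate qualifies when its mean score on a shared evaluation suite beats the production model's by a fixed margin; the function name, the `min_gain` threshold, and the scores are hypothetical and do not come from Databricks.

```python
from typing import Sequence


def should_surface_replacement(
    production_scores: Sequence[float],
    candidate_scores: Sequence[float],
    min_gain: float = 0.01,
) -> bool:
    """Return True only if the candidate beats production by at least min_gain.

    Both score lists are assumed to come from the same evaluation suite,
    with higher scores meaning better performance.
    """
    production_mean = sum(production_scores) / len(production_scores)
    candidate_mean = sum(candidate_scores) / len(candidate_scores)
    return candidate_mean - production_mean >= min_gain


# Hypothetical per-test scores for the production model and two candidates.
production = [0.80, 0.82, 0.78]
clearly_better = [0.85, 0.86, 0.84]
marginal = [0.80, 0.82, 0.79]

surface_better = should_surface_replacement(production, clearly_better)
surface_marginal = should_surface_replacement(production, marginal)
```

A real gate would more likely use a statistical test or per-metric thresholds than a single mean difference; the margin here only stands in for that idea.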

Users can configure which assets Genie ZeroOps monitors and what actions it is allowed to take. Issues are displayed in an inbox-style interface prioritized by severity, along with root cause analysis and suggested fixes. Databricks says nothing will be applied to production without user approval.
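As a rough illustration of that workflow, the Python sketch below orders issues by severity and releases only fixes a user has approved. The `Issue` fields, severity levels, and sample data are all hypothetical; Databricks has not published the inbox's data model.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical severity levels; a lower rank is shown first in the inbox.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Issue:
    asset: str
    severity: str
    root_cause: str
    suggested_fix: str
    approved: bool = False  # nothing is applied until a user approves it


def prioritize(issues: List[Issue]) -> List[Issue]:
    """Return issues ordered from most to least severe."""
    return sorted(issues, key=lambda issue: SEVERITY_RANK[issue.severity])


def fixes_to_apply(issues: List[Issue]) -> List[str]:
    """Return suggested fixes for approved issues only, most severe first."""
    return [issue.suggested_fix for issue in prioritize(issues) if issue.approved]


inbox = [
    Issue("orders_table", "medium", "late-arriving data", "widen watermark"),
    Issue("churn_model", "high", "model drift", "retrain on recent data", approved=True),
    Issue("ingest_job", "critical", "upstream schema change", "update column mapping"),
]

ordered_assets = [issue.asset for issue in prioritize(inbox)]
approved_fixes = fixes_to_apply(inbox)
```

In this sample the critical issue is listed first but has not been approved, so only the approved retraining fix is returned.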

Genie ZeroOps will go into private preview in the coming weeks, starting with support for jobs, pipelines, tables, and machine learning workloads. Support for apps and Lakebase databases is on the roadmap.
