Ben Kunkle talks about building Zed’s Zeta2 predictive model



Ben Kunkle of Zed Industries detailed the process of building Zeta2, an AI model designed to predict a user's next edit as they type. In his presentation, Kunkle walked through the technical pipeline and data considerations involved in training such a model, and highlighted the challenges encountered in production along with the solutions the team found.

Ben Kunkle talks about building Zed's Zeta2 predictive model - AI Engineer

Visual TL;DR. Predicting code edits is the task the Zeta2 model is built for. Ultra-low latency is a hard requirement on the Zeta2 model. The training pipeline trains the Zeta2 model. Data considerations inform the training pipeline. The training pipeline uses the Teacher Frontier model. Offline evaluation leads to production monitoring. The Zeta2 model enables faster coding.

  1. Predict code edits: AI models predict the next code edit as you type.
  2. Zeta2 model: A small, specialized AI model for fast per-keystroke prediction.
  3. Ultra-low latency: Must respond in less than 300 milliseconds per keystroke for real-time use.
  4. Training pipeline: Ingests production and synthetic data for model training.
  5. Data considerations: Focus on “settled data” vs. raw production data and synthetic sources.
  6. Teacher Frontier model: Generates training data for the Zeta2 predictive model.
  7. Offline evaluation: Evaluates model performance before production deployment.
  8. Production monitoring: Continuously tracks model performance in a live environment.
  9. Faster coding: Enables users to write code faster and more efficiently.

Understanding edit prediction

Kunkle began by defining edit prediction as the task of giving a model the context around the user’s cursor and their recent edits, along with type and variable definitions, diagnostics, and errors, so that it can predict the edits that follow. This has to be very fast: the latency budget is less than 300ms for every keystroke, which calls for a small, specialized model.
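To make the inputs described above concrete, here is a minimal Python sketch of what such a request might look like and how a per-keystroke budget could be enforced. Only the 300ms figure comes from the talk. The names `EditPredictionRequest`, `build_prompt`, `predict_within_budget`, and the `<|cursor|>` marker are hypothetical and are not taken from Zed's code.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional

LATENCY_BUDGET_MS = 300   # per-keystroke budget quoted in the talk
CURSOR_MARKER = "<|cursor|>"  # hypothetical marker, not Zed's actual token


@dataclass
class EditPredictionRequest:
    """Hypothetical container for the context sent to the model."""
    cursor_excerpt: str            # code surrounding the cursor
    cursor_offset: int             # cursor position inside the excerpt
    recent_edits: list[str] = field(default_factory=list)
    definitions: list[str] = field(default_factory=list)   # type/variable definitions
    diagnostics: list[str] = field(default_factory=list)   # diagnostics and errors


def build_prompt(req: EditPredictionRequest) -> str:
    """Flatten the request into one prompt string (illustrative layout only)."""
    excerpt = (
        req.cursor_excerpt[: req.cursor_offset]
        + CURSOR_MARKER
        + req.cursor_excerpt[req.cursor_offset:]
    )
    sections = [
        ("recent_edits", req.recent_edits),
        ("definitions", req.definitions),
        ("diagnostics", req.diagnostics),
    ]
    parts = [f"## {name}\n" + "\n".join(items) for name, items in sections if items]
    parts.append("## excerpt\n" + excerpt)
    return "\n\n".join(parts)


def predict_within_budget(
    model: Callable[[str], str],
    req: EditPredictionRequest,
    budget_ms: float = LATENCY_BUDGET_MS,
) -> Optional[str]:
    """Return the prediction, or None if the model overran the budget."""
    start = time.perf_counter()
    prediction = model(build_prompt(req))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return prediction if elapsed_ms <= budget_ms else None
```

A prediction that arrives after the budget is dropped here rather than shown late. That is one possible policy, chosen for the sketch, and not something the talk specifies.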

Training pipeline

At the core of the training process is a pipeline that ingests both “production data” (snapshots of user activity) and “synthetic data” (git commits). This data is fed to the “Teacher Frontier” model, which generates predictions. The predictions are then evaluated, and those that fail are sent to a “repair” stage, where the teacher model attempts to correct them. The repaired data is fed back into the distillation process that trains the student model. Kunkle emphasized that every stage of the pipeline takes JSONL as input, enriches each “sample,” and writes JSONL as output. This uniform format is important for managing large datasets efficiently across experiments.
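The JSONL-in, JSONL-out design described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Zed's pipeline: the field names (`input`, `prediction`, `passed`, `repaired`) and all function names are hypothetical, and the teacher, evaluator, and repair steps are passed in as plain callables so that stubs can stand in for real models.

```python
import json
from typing import Callable, Iterable, Iterator

Sample = dict  # one JSON object per line; field names below are assumptions


def read_jsonl(lines: Iterable[str]) -> Iterator[Sample]:
    """Parse non-empty lines as JSON objects."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def write_jsonl(samples: Iterable[Sample]) -> list[str]:
    """Serialize samples back to one JSON string per line."""
    return [json.dumps(sample) for sample in samples]


def run_stage(lines: Iterable[str], enrich: Callable[[Sample], Sample]) -> list[str]:
    """A stage reads JSONL, enriches each sample, and emits JSONL."""
    return write_jsonl(enrich(sample) for sample in read_jsonl(lines))


def teacher_stage(teacher: Callable[[str], str]) -> Callable[[Sample], Sample]:
    def enrich(sample: Sample) -> Sample:
        return {**sample, "prediction": teacher(sample["input"])}
    return enrich


def evaluate_stage(is_acceptable: Callable[[Sample], bool]) -> Callable[[Sample], Sample]:
    def enrich(sample: Sample) -> Sample:
        return {**sample, "passed": is_acceptable(sample)}
    return enrich


def repair_stage(repair: Callable[[Sample], str]) -> Callable[[Sample], Sample]:
    def enrich(sample: Sample) -> Sample:
        if sample["passed"]:
            return sample  # passing predictions go through unchanged
        return {**sample, "prediction": repair(sample), "repaired": True}
    return enrich


def run_pipeline(lines: Iterable[str], stages: list[Callable[[Sample], Sample]]) -> list[str]:
    """Chain stages; every boundary between stages is plain JSONL."""
    for stage in stages:
        lines = run_stage(lines, stage)
    return list(lines)
```

Because each stage only adds fields to a sample, the output of any stage can be saved and reused as the input to a later experiment, which is one plausible reading of why the uniform format helps with large datasets.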

Data considerations and “settled data”

A major challenge in training edit prediction models is the inherent noise in the data. Kunkle explained that the team addresses this with a concept called “settled data”: waiting for the region being predicted to stabilize, then taking the final state of the code as the “answer.” Comparing the model’s predictions against this settled state makes it possible to filter out noisy examples and identify high-quality training data. The model can then be trained on ideal examples, where the match between the prediction and the final code is clear and unambiguous.
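One way to picture the settled-data filter is the Python sketch below. It assumes the editor records timestamped snapshots of the predicted region and treats the region as settled once its text has gone unchanged for a quiet period. The `Snapshot` type, the 30-second default, and the whitespace-insensitive comparison are all assumptions made for illustration; the article does not say how Zed decides that a region has stabilized.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Snapshot:
    """Hypothetical record of the predicted region at a point in time."""
    timestamp_s: float
    region_text: str


def settled_text(
    snapshots: Sequence[Snapshot],
    now_s: float,
    quiet_period_s: float = 30.0,  # assumed threshold, not from the talk
) -> Optional[str]:
    """Return the region's final text if it has been stable long enough."""
    if not snapshots:
        return None
    ordered = sorted(snapshots, key=lambda snap: snap.timestamp_s)
    final = ordered[-1].region_text
    # Walk backwards to find when the region first reached its final text.
    last_change_s = ordered[-1].timestamp_s
    for snap in reversed(ordered):
        if snap.region_text != final:
            break
        last_change_s = snap.timestamp_s
    return final if now_s - last_change_s >= quiet_period_s else None


def is_clean_example(
    prediction: str,
    snapshots: Sequence[Snapshot],
    now_s: float,
    quiet_period_s: float = 30.0,
) -> bool:
    """Keep an example only if the prediction matches the settled state."""
    target = settled_text(snapshots, now_s, quiet_period_s)
    return target is not None and prediction.strip() == target.strip()
```

Examples whose region never settles, or whose prediction differs from the settled text, are simply excluded in this sketch. A real filter might use a softer similarity threshold instead of an exact match.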

Offline evaluation and production monitoring

For offline evaluation, Kunkle mentioned metrics such as “deltaChrF” (a character-level F-score), exact line match, reversal rate, and keep rate, which are used to measure the model’s performance on a held-out test set. He also touched on the importance of tracking model performance in production after deployment. This includes structured logging of latency, retention, and token counts, as well as dashboards that monitor acceptance rates and A/B test results across model versions. The goal is to continuously monitor and improve the model’s effectiveness in real-world use.
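Two of the offline metrics named above can be approximated in a short Python sketch. It implements a simplified character n-gram F-score in the spirit of chrF and a strict exact-line-match check. The article does not define how the “delta” variant, reversal rate, or keep rate are computed, so none of those is attempted here. The function names, the n-gram range of 1 to 6, and `beta = 2.0` are assumptions chosen for illustration.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count all character n-grams of length n in text."""
    return Counter(text[i : i + n] for i in range(len(text) - n + 1))


def simple_chrf(
    prediction: str,
    reference: str,
    max_n: int = 6,     # assumed n-gram range
    beta: float = 2.0,  # assumed recall weight
) -> float:
    """Simplified character n-gram F-score between two strings (0.0 to 1.0)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        pred = char_ngrams(prediction, n)
        ref = char_ngrams(reference, n)
        if not pred or not ref:
            continue  # string too short for this n-gram order
        overlap = sum((pred & ref).values())
        precisions.append(overlap / sum(pred.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    if precision + recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


def exact_line_match(prediction: str, reference: str) -> bool:
    """True when both texts consist of exactly the same lines."""
    return prediction.splitlines() == reference.splitlines()


def exact_line_match_rate(pairs: list[tuple[str, str]]) -> float:
    """Fraction of (prediction, reference) pairs that match line for line."""
    if not pairs:
        return 0.0
    return sum(exact_line_match(p, r) for p, r in pairs) / len(pairs)
```

A character-level score gives partial credit for near misses, while exact line match is all-or-nothing, which is presumably why the two are reported side by side.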

© 2026 StartupHub.ai. Unauthorized reproduction is prohibited. Please do not type, scrape, copy, reproduce or republish this article in whole or in part. Use for AI training, fine-tuning, search enhancement generation, or as input to any machine learning system is prohibited without a written license. Substantially similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer abuse laws. See our Clause.


