Braintrust’s Phil Hetzel recently discussed the evolving role of data scientists in the context of Generative AI agents and asked an important question: does GenAI “belong” to data scientists? Hetzel’s presentation at the AI Engineer Europe event explored the nuances of agent development and ownership, highlighting how different organizational structures and team compositions shape the process.

Who is Phil Hetzel?
Braintrust’s Head of Solutions Engineering, Phil Hetzel, brings more than 12 years of consulting and implementation experience to his role. Previously, he led Slalom’s global Databricks business unit. Mr. Hetzel’s personal interests include playing chess (he’s bad at it) and spending time with his wife and dachshund, Pistol Pete, who made a cameo appearance in the presentation.
Understanding agent development: traditional and AI-native approaches
Hetzel observed that building agents, which has been made possible by advances in data science and large language models (LLMs), can be approached differently depending on the nature of the organization. In traditional companies, the push to build agents often comes from leaders such as CEOs and CIOs who read about AI in industry publications. These leaders delegate agent building to existing ML platform teams, who are already familiar with building models and deployment pipelines. This approach often reuses existing practices for evaluation and deployment.
In contrast, in AI-native companies, agent building is often driven by founders with a specific problem to solve. These companies tend to have small groups of engineers who build their solutions, in part through agents. In these small companies, everyone is close to the problem, so it is easier to identify what needs to be done. A key difference Hetzel pointed out is that in these AI-native environments, models are often already built and functionality can be added using natural language, which is very different from traditional ML development.
Data science workflows and their application to agents
Hetzel outlined a simplified data science workflow: data, label, train, test, deploy, implement, and observe. He contrasted this with the process of generative AI agents, where the initial data processing, training, and deployment phases are often completed upfront by the LLM provider. For AI teams, the focus shifts to testing and implementation. Hetzel emphasized that while the underlying LLM may have been pre-built, the process of testing and evaluating agent performance is still important. This includes ensuring that the response is appropriate and that the agent’s behavior matches the intended use case.
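To make the "test" step concrete, here is a minimal sketch of an evaluation loop for an agent built on a pre-built model. It is an illustration only, not something shown in the talk: `toy_agent`, `EvalCase`, and `run_evals` are hypothetical names, and the agent is a hard-coded stand-in for a real LLM call.

```python
from typing import Callable, List, NamedTuple


class EvalCase(NamedTuple):
    """One test case: an input prompt and a check on the agent's response."""
    prompt: str
    check: Callable[[str], bool]  # returns True if the response is acceptable


def toy_agent(prompt: str) -> str:
    """Hypothetical stand-in for a real agent; replace with an actual LLM call."""
    if "refund" in prompt.lower():
        return "You can request a refund within 30 days."
    return "I'm not sure, let me connect you with support."


def run_evals(agent: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Run every case through the agent and return the fraction that pass."""
    results = [case.check(agent(case.prompt)) for case in cases]
    return sum(results) / len(results) if results else 0.0


cases = [
    EvalCase("How do I get a refund?", lambda r: "refund" in r.lower()),
    EvalCase("What is your refund window?", lambda r: "30 days" in r),
    EvalCase("Can you reset my password?", lambda r: "password" in r.lower()),
]

pass_rate = run_evals(toy_agent, cases)
print(f"pass rate: {pass_rate:.2f}")
```

In this sketch the third case fails because the toy agent has no password behavior, which is the kind of gap between intended use case and actual behavior that an evaluation suite is meant to surface.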
The case for data scientists in agent development
Hetzel presented three core arguments for why data scientists are essential to agent development.
- Agents use models, and models are managed by data scientists. Data scientists have the expertise to understand and manage the underlying models that power these agents.
- Existing model deployment workflows can be reused. Organizations already have established processes for model deployment and management that can be adapted to agent deployment.
- Testing requires rigorous thinking. Data scientists take a rigorous approach to testing, which is essential to ensuring the reliability, safety, and effectiveness of AI agents.
He elaborated that data scientists can contribute through education, helping product engineers and product managers understand the technology. They can also stay up to date on new research and keep the broader team informed. Additionally, their expertise in evaluating discrete outputs and applying traditional machine learning metrics such as precision, recall, and F1 is essential for developing reliable agents.
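As a rough illustration of applying those metrics to discrete agent outputs, the sketch below scores a hypothetical routing decision ("escalate" or "resolve") against human labels. The labels, data, and function name are assumptions made for the example and do not come from the presentation.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical data: what a human reviewer expected vs. what the agent decided.
expected = ["escalate", "resolve", "escalate", "resolve", "escalate"]
actual = ["escalate", "escalate", "resolve", "resolve", "escalate"]

precision, recall, f1 = precision_recall_f1(expected, actual, positive="escalate")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here the agent makes one false escalation and misses one real one, so precision and recall both come out at two thirds. The same pattern works for any agent step that produces a categorical output, such as tool selection or intent classification.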
Counterargument: Agents are more than data scientists
Hetzel also presented a counterargument suggesting that agents can and should extend beyond data scientists.
- An LLM is just an API that product engineers can use. Product engineers are already adept at using APIs within the applications they build, making the LLM API a natural extension of their skill set.
- Agents can be complex distributed systems. Relying solely on data scientists may not be optimal for managing this complexity.
- Product managers and SMEs understand what success and failure look like. Product managers and subject matter experts (SMEs) have a deep understanding of the problem space and can better identify what makes an agent successful or unsuccessful.
From this perspective, an LLM is essentially a powerful API that can be easily leveraged by product engineers who are already skilled at integrating APIs into their applications. Additionally, the inherent complexity of agents as distributed systems means that a multidisciplinary approach is needed, involving not only data scientists but also engineers and subject matter experts who understand the specific use case.
An ideal match: collaboration is key
Ultimately, Hetzel concluded that the most effective approach to developing agents requires a collaborative, interdisciplinary team. He proposed an ideal combination where data scientists focus on teaching, performing ML-style scoring, and fine-tuning models as needed. Product/Application/System Engineers handle implementing requirements, building agent-centric systems for operational readiness, and implementing evaluation and observability pipelines. Non-technical experts such as product managers and SMEs contribute by gathering requirements, providing domain expertise, annotating data, and experimenting with prompts using natural language.
Hetzel emphasized that the subject matter expertise of non-technical team members is invaluable in shaping automated LLM-as-a-judge scoring and understanding agent performance in real-world scenarios. The rigor data scientists bring to evaluating and validating these agents is also important to building confidence in their deployment. The core message is that while data scientists play a critical role, developing successful AI agents requires a blend of skills and perspectives across the organization.
