How to get multiple agents to play nice at scale

The Stack Overflow Podcast · 27 min · April 22, 2026

AI-Generated Summary

In this episode of The Stack Overflow Podcast, host Ryan Donovan explores the challenges and strategies behind orchestrating multiple AI agents at scale with Stephen Kalesha and Chase Ruzin from Intuit. The conversation covers how Intuit has evolved its AI infrastructure over the years, building on foundational platforms like GenOS to create a composable, enterprise-grade agentic system. Rather than relying on isolated, domain-specific agents, Intuit has moved to a skills-and-tools-based architecture with a central planner, which enables cross-domain problem solving and reduces the need for users to navigate multiple interfaces. The team emphasizes evaluation rigor, using offline, online, and human evaluations to ensure accuracy, especially in sensitive financial contexts. They also discuss managing token costs, latency, system reliability, and the importance of observability in a fast-moving AI landscape. The episode concludes with a vision of AI that performs 'done-for-you' work, freeing users to focus on higher-level decisions.

Key takeaways include: 1) A central planner with access to a unified skill and tool library enables better cross-domain coordination than isolated agents; 2) Rigorous evaluation, especially human-in-the-loop testing, is essential for trust in financial AI; 3) Observability and cost monitoring are critical in AI systems because token usage is variable; 4) Foundational platform investments (like GenOS) provide the agility needed to innovate rapidly; 5) The future of AI at scale lies in reducing user effort to near zero through intelligent automation.

The tone is optimistic and forward-looking, celebrating engineering progress while acknowledging real-world constraints.

Key Takeaways
1. Adopt a skills-and-tools architecture with a central planner to enable cross-domain problem solving across multiple AI agents.

2. Use a multi-layered evaluation strategy (offline, online, human) to ensure accuracy, especially in high-stakes domains like finance.

3. Leverage foundational platform investments (e.g., GenOS) to accelerate innovation and maintain system reliability at scale.

4. Prioritize observability and cost monitoring to manage token usage and system performance in AI-native applications.

5. Design for 'done-for-you' experiences where AI handles complex workflows autonomously, reducing user effort to near zero.
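
The first takeaway, a central planner drawing on one shared skill library, can be illustrated with a small Python sketch. Everything in it is hypothetical: the `Skill` type, the skill names, and the keyword matching are assumptions made for illustration, not Intuit's or GenOS's actual design. In a real system the planning step would presumably be done by a model rather than by keyword lookup.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Skill:
    """A hypothetical entry in a shared skill library."""
    name: str
    keywords: frozenset[str]
    run: Callable[[str], str]


# Hypothetical skills; a real library would wrap actual tools and services.
SKILLS = [
    Skill("invoicing", frozenset({"invoice", "invoices"}),
          lambda q: "invoicing: listed overdue invoices"),
    Skill("payroll", frozenset({"payroll", "paycheck"}),
          lambda q: "payroll: summarized next payroll run"),
    Skill("taxes", frozenset({"tax", "taxes"}),
          lambda q: "taxes: estimated quarterly tax"),
]


def plan(query: str, skills: list[Skill] = SKILLS) -> list[Skill]:
    """Pick every skill whose keywords appear in the query.

    Keyword matching is a stand-in (an assumption) for the planning step.
    """
    words = {w.strip(".,?!").lower() for w in query.split()}
    return [s for s in skills if s.keywords & words]


def answer(query: str) -> list[str]:
    """Run each planned skill in order and collect the results."""
    steps = plan(query)
    if not steps:
        return ["no matching skill; ask the user to clarify"]
    return [s.run(query) for s in steps]


if __name__ == "__main__":
    # One cross-domain question reaches two skills through a single planner.
    for line in answer("Which invoices are overdue, and how will that affect my taxes?"):
        print(line)
```

The point of the sketch is only that one entry point sees the whole library, so a question spanning invoicing and taxes does not have to be routed to a single domain agent.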

Chapters
0:00 (4 min): Introducing the Multi-Agent Challenge

Host Ryan Donovan welcomes Stephen Kalesha and Chase Ruzin from Intuit to discuss the complexities of orchestrating multiple AI agents at enterprise scale. The episode sets the stage by highlighting the shift from isolated agents to coordinated, cross-functional systems.

4:20 (6 min): From Isolated Agents to a Central Planner

Highlight: "Customers don't just ask like a question that should go to one agent or this agent or that agent, right? Very commonly they're getting cross-domain questions."

10:00 (8 min): The Role of Evaluation in AI Trust

Highlight: "We want to infuse that with all of the amazing experts that work with us at Intuit. That is what's kind of giving us this upper hand when we're looking across the ecosystem."

17:30 (7 min): Managing Determinism and Cost at Scale

Highlight: "The more input tokens, the more output tokens. That is going to impact the cost."

24:10 (4 min): The Future of AI: Done-for-You Work

Highlight: "The ideal utopia is you come in and it's like, hey, the work's done for you. I think the technology is starting to move in that direction."

High-Impact Quotes

"The ideal utopia is you come in and it's like, hey, the work's done for you. I think the technology is starting to move in that direction."
Chase Ruzin, 25:40 (Viral: 90.0)

"Customers don't just ask like a question that should go to one agent or this agent or that agent, right? Very commonly they're getting cross-domain questions."
Chase Ruzin, 17:05 (Viral: 85.0)

"We want to infuse that with all of the amazing experts that work with us at Intuit. That is what's kind of giving us this upper hand when we're looking across the ecosystem."
Stephen Kalesha, 11:22 (Viral: 80.0)

Speakers

Host

Ryan Donovan

Guests

Stephen Kalesha
Chase Ruzin

Topics Discussed

multi-agent orchestration: 95%
enterprise AI architecture: 90%
AI evaluation and trust: 88%
skills and tools-based AI: 87%
deterministic tools in AI: 85%
financial AI safety: 82%
AI cost and token management: 80%
AI observability: 75%

People & Brands

Intuit (organization): 32x, Positive
Stephen Kalesha (person): 18x, Positive
Chase Ruzin (person): 17x, Positive
Ryan Donovan (person): 12x, Positive
GenOS (product): 6x, Positive
LLM judges (other): 4x, Neutral
QuickBooks (product): 3x, Neutral
LinkedIn (product): 3x, Neutral
Stack Overflow Podcast (media): 2x, Positive
MCP (other): 2x, Neutral
