SE Radio 715: Sahaj Garg on Designing for Ambiguity in Human Input
In this episode of Software Engineering Radio, host Amey Ambade interviews Sahaj Garg, co-founder and CTO of Wispr, a voice-to-text AI company, about designing systems that handle ambiguity in human input. Garg argues that human communication is inherently ambiguous because context, tone, accent, and intent are often missing, and that machine learning models struggle with this because they typically process each input in isolation and retain nothing afterward. Wispr's mission is to build a voice-first interface that adapts to each user's preferences, style, and history by drawing on context, user behavior, and personalization. The conversation explores how additional context, such as voice characteristics, prior corrections, and conversational history, reduces ambiguity, and how models can be trained with synthetic data, instruction tuning, and user feedback. Garg emphasizes that resolving ambiguity means understanding what users actually want, not just what they say, and that systems should learn from 'revealed preferences', such as repeated corrections, rather than relying on explicit instructions. He also highlights the need to balance personalization with consistency, especially when users themselves are inconsistent, and warns that AI output regresses toward generic, unoriginal content unless users supply clear intent and narrative structure. The episode closes with two lessons that apply across AI applications: more context leads to better decisions, and the most successful systems mirror natural human interaction.
Ambiguity in human input stems from lack of context, tone, and intent—key challenges for AI systems that process inputs in isolation.
The most effective way to resolve ambiguity is by providing more context, including user history, voice patterns, and conversational flow.
Systems should learn from 'revealed preferences', such as repeated corrections, rather than relying only on explicit user instructions.
Instruction tuning and context engineering are critical for training models to produce desired outputs based on user intent.
Personalization must be balanced with consistency; even inconsistent users deserve a coherent experience.
Introduction to the Episode and Guest
Host Amey Ambade introduces the episode and welcomes Sahaj Garg, co-founder and CTO of Wispr, a voice-to-text AI company, to discuss designing systems for ambiguity in human input.
Defining Ambiguity in Human Communication
“Ambiguity is kind of intrinsic as a property. If you're communicating something and you haven't given all the context or given all of the information, that's inherently ambiguous.”
Why Machines Struggle with Ambiguity
“Most machine learning models, they're just given a little bit of information at a point in time, they do a task and then they forget about what's happened.”
Ambiguity in Voice Input: Real-World Challenges
“Sometimes they're actually hard, and those are the kinds of cases of ambiguity that we're focused on solving here at Wispr.”
Types of Ambiguity and Contextual Resolution
The discussion covers different forms of ambiguity (style, tone, accent, and jargon) and how context, such as audience, topic, or prior behavior, helps resolve them. Garg gives the example of a speaker who mixes two languages, which he describes as effectively a 'third language' that requires specialized data.
“Ambiguity gets resolved with more context. The more context you can give a system, the better it does.”
“If you speak into your computer and you fix a mistake twice, ideally you shouldn't have to fix it again. We should be able to learn that from you and personalize it to your desired output.”
“The only way is for me to convey what I want, right? And if I do that, then it will do what I want.”
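The idea in the quote about fixing a mistake twice can be sketched in a few lines. This is only an illustration of learning from repeated corrections, not a description of how Wispr's system works: the class name CorrectionMemory, its methods, and the threshold of two corrections are all assumptions made for this example.

```python
import re
from collections import Counter


class CorrectionMemory:
    """Hypothetical helper: learns word-level fixes from repeated user corrections."""

    def __init__(self, threshold=2):
        # Number of identical corrections required before a fix is applied
        # automatically (assumed value, taken from the "fix it twice" quote).
        self.threshold = threshold
        self.counts = Counter()

    def record(self, heard, corrected):
        """Store one observed correction, e.g. the user changed 'sahej' to 'Sahaj'."""
        self.counts[(heard.lower(), corrected)] += 1

    def learned(self):
        """Return the fixes seen at least `threshold` times, most frequent first."""
        rules = {}
        for (heard, corrected), n in self.counts.most_common():
            if n >= self.threshold and heard not in rules:
                rules[heard] = corrected
        return rules

    def apply(self, transcript):
        """Rewrite a transcript using only the fixes learned so far."""
        rules = self.learned()

        def swap(match):
            word = match.group(0)
            return rules.get(word.lower(), word)

        return re.sub(r"[A-Za-z']+", swap, transcript)


memory = CorrectionMemory()
memory.record("sahej", "Sahaj")
print(memory.apply("email sahej today"))  # one correction: unchanged
memory.record("sahej", "Sahaj")
print(memory.apply("email Sahej today"))  # two corrections: "email Sahaj today"
```

The rule is inferred from what the user did (a revealed preference) rather than from a setting the user had to configure, and the threshold keeps a single one-off edit from becoming a permanent rewrite.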
