India is in the middle of a Voice AI gold rush. We've seen quite a few Voice AI startups cropping up in recent months - Bolna, Ringg, Arrowhead, Vaani, Navana, to name a few.
The premise makes sense: replace the telecaller, reduce costs, and scale the outreach. Many of them have also shown early success with their design partners, with the BFSI sector leading the charge.
On paper, the value prop is obvious. You deal with different languages and dialects at scale without being constrained by a telecaller vendor who needs constant training and monitoring. You now have Voice AI front-ending conversations reasonably well - it doesn't tire out, it adapts to dialects, and it is infinitely patient.
On the surface, this looks like a classic India success story. Just as we made shampoo, mobile data, and payments cheap and accessible, we are now sachetising AI — low-cost, bite-sized, ubiquitous. Vendors are selling minutes for pennies.
But there is a fatal, overlooked distinction.
Sachetisation as a product (access) is revolutionary.
Sachetisation as a service (outreach) is noise.
The technology is the same. The architectural choice is not.
1. The Spam Factory
Most Voice AI deployments in India are built for Resource Substitution. Effectively, you're replacing the human telecaller with Voice AI.
- Old Model: Hire humans to read rigid scripts.
- New Model: Spin up AI agents to read the same rigid scripts at 1,000x scale.
It helps to take a step back and look at the customer experience. Did anyone actually enjoy the credit card and loan calls? Why did Truecaller take off in India? Because we grew increasingly wary of unsolicited calls, which had become a nuisance.
We have seen this play out before with SMS. Fifteen years ago, the SMS inbox was a place for conversation. Today, it is a graveyard of spam, and its utility is largely reduced to OTPs.
This happened because we optimized the channel for Cost, not Relevance.
When it became cheap enough to blast a million messages, businesses stopped caring about relevance. The signal-to-noise ratio collapsed, and users migrated to WhatsApp (which had friction/consent barriers) for actual conversation.
Voice AI today seems to be on the same trajectory. When a lender blasts calls to 1 million leads from a generic database, they aren't engaging; they are fishing with zero context and zero personalization. The outreach scales by an order of magnitude, but the lack of personalization only frustrates users more. The ennui sets in sooner, and if we continue down this path, the Phone Call will go the way of the SMS: a channel that is functional but socially bankrupt.
2. The Uncanny Valley
The second failure is structural. Most startups are retrofitting Generative AI into call-center logic designed in the 1990s.
Consider the typical experience:
- Voice AI: Great conversation, understands intent, but hits a wall.
- Human Agent 1: Zero context passed. Customer repeats everything.
- Sales Agent: Zero context passed. Customer repeats everything again.
- Outcome: The AI worked, but the System failed.
The voice sounded human, but there was no continuity whatsoever. The customer doesn't care that "the AI is good." They care about the experience of engaging with the Brand, whether through AI or humans, in the context of their intent.
3. The Real Opportunity
The promise of Voice AI in India is not cost reduction but Structural Leapfrogging.
We have seen glimpses of this with the Soundbox. Why did Soundboxes win in India? Because merchants didn't want to look at screens or log into dashboards. The audio confirmation of payments addressed the baseline need for trust.
Voice conversations are non-linear. Customers move between frustration, curiosity, hesitation, and intent in seconds. When you bolt AI onto a rigid, script-based structure — a classic case of the Copilot Fallacy — you create a system that can speak but cannot understand the user. The unspoken words, the hesitation, the context shift mid-call: these are what the script cannot read.
We need to apply the Ditto Insurance philosophy to Voice AI. Ditto led with understanding the user's hesitation, not shoving a policy in their face. They set up conversations, understood the user's needs, educated them, and then proposed a solution in a way that didn't feel like selling.
We need to lead with trust. Voice AI can help build that trust, unless we repeat the mistake of optimizing only for cost and sachetising the tech.
Millions of Indian businesses (Kiranas, Traders, Service Providers) skipped the PC era and struggled with the SaaS era. ERPs are too complex for businesses that run on WhatsApp and intuition.
Voice AI offers these businesses a way to leapfrog. Imagine a business with no website, no app, and no login - just a Google Maps listing front-ended by an Agentic Voice AI. In this world, Voice is not a "Dialer" or an "IVR." Voice is the Operating System. The business owner doesn't manage software; they simply speak to it and to their customers.
4. Build The Personal Secretary
To unlock this, we must shift the metaphor from Telecaller to Personal Secretary (or Personal Agent). Engagement with a telecaller is transactional; with a secretary, it is contextual.
A good secretary (or a good salesperson) knows:
- The Business: They deeply understand the product.
- The User: They know where the prospect is in the lifecycle.
- The Context: They maintain a persistent Context Graph of the relationship (remembering the last conversation).
This requires Omni-channel, Multi-session Continuity. It is not a linear, transactional sell but a relationship.
- The AI front-ends the initiative.
- It knows exactly when to hand off to a human to create the outcome.
- The user doesn't feel like they are engaging with a siloed bot, but with the Brand (sometimes with AI, other times with humans), and always with continuity and personalization.
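To make the idea concrete, here is a minimal sketch of what multi-session continuity could look like in code. It is an illustration, not how any of the vendors named above actually build it. Every name in it (`Interaction`, `CustomerContext`, `ContextStore`, `should_hand_off`, the channel and intent labels) is hypothetical, and the in-memory dictionary stands in for whatever persistent store a real deployment would use. The point is simply that every channel writes to the same record, and the human receives a brief instead of a blank slate.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Interaction:
    channel: str   # hypothetical labels, e.g. "voice_ai", "whatsapp", "human_agent"
    summary: str   # short note on what was discussed
    intent: str    # hypothetical label, e.g. "loan_enquiry"


@dataclass
class CustomerContext:
    customer_id: str
    lifecycle_stage: str = "new"
    history: List[Interaction] = field(default_factory=list)

    def record(self, interaction: Interaction) -> None:
        """Append an interaction, whichever channel it came from."""
        self.history.append(interaction)

    def handoff_brief(self) -> str:
        """Summarise the relationship so the next agent need not ask again."""
        lines = [f"Customer {self.customer_id} (stage: {self.lifecycle_stage})"]
        lines += [f"- [{i.channel}] {i.intent}: {i.summary}" for i in self.history]
        return "\n".join(lines)


class ContextStore:
    """In-memory stand-in for a persistent store shared by every channel."""

    def __init__(self) -> None:
        self._contexts: Dict[str, CustomerContext] = {}

    def get(self, customer_id: str) -> CustomerContext:
        # Return the existing record, or create one on first contact.
        return self._contexts.setdefault(customer_id, CustomerContext(customer_id))


def should_hand_off(
    ctx: CustomerContext,
    ready_intents: FrozenSet[str] = frozenset({"ready_to_apply"}),
) -> bool:
    """Illustrative rule: hand off once the latest intent signals readiness."""
    return bool(ctx.history) and ctx.history[-1].intent in ready_intents


# Two sessions on two channels, one shared record.
store = ContextStore()
store.get("cust-001").record(
    Interaction("voice_ai", "Asked about interest rates", "loan_enquiry")
)
store.get("cust-001").record(
    Interaction("whatsapp", "Shared income documents", "ready_to_apply")
)

if should_hand_off(store.get("cust-001")):
    print(store.get("cust-001").handoff_brief())
```

The handoff rule here is deliberately naive. In practice, deciding when to bring in a human is the hard part, and it is the brand's call, not the vendor's.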
The Bottom Line
If you treat Voice AI as an outsourced service ("Get me leads at ₹X/min"), you will get the Spam Factory. If you treat Voice AI as an unlock to build relationships at scale, you get the new OS that expands the market.
Voice AI startups can help you with the tech and capabilities. But operationalizing it is the brand's job, and it works only if you do the change management. Otherwise, it looks good in pilots but ends up as just another cost-reduction pitch.
The channel will be sachetised regardless. The only question is whether it becomes a spam pipe or an operating system. That is not a product decision. It is a structural one — and it has to be made before the channel is bankrupt.
Published on January 30, 2026