The conversation data from an AI voice campaign is the gold. The calls themselves are the activity, but the conversations — the words voters actually use, the issues they raise, the sentiment they convey — are what make AI fundamentally different from old-style robocalls.
Most campaigns under-invest in the analytics layer. They run lakhs of calls, collect transcripts, and never extract the insight that would have changed the strategy. This guide explains how to build (or buy) a sentiment analysis pipeline that moves decisions, not just dashboards.
What sentiment analysis really delivers
The hand-wavy promise: "we'll analyse what voters say". The actual deliverables are sharper than that. A working sentiment pipeline produces:
1. Per-call structured record.
For every conversation:
- Sentiment score (-1 to +1, or a 5-class label: Strongly Negative, Negative, Neutral, Positive, Strongly Positive)
- Intent classification (Supportive, Undecided, Negative, Neutral, Refused-to-Engage)
- Top 3 issues mentioned (from a predefined or open vocabulary)
- Mention of specific entities (the candidate, opponents, schemes, local landmarks)
- Confidence scores on each of the above
- Hand-off flag (does this voter need human follow-up?)
2. Booth-level aggregates.
For every booth (typically 800-1500 voters):
- Sentiment distribution
- Top 10 issues by mention frequency
- Demographic breakdown (age, gender) by intent
- Trajectory over time (sentiment last week vs this week)
- Comparison to neighbouring booths
3. Issue clusters.
Open-vocabulary clustering of what voters are talking about, surfaced without a predefined issue list:
- "Voters in Block 3 are talking about pension delays — first time this week"
- "Hospital staffing is rising on the issue list in AC-103"
- "School fees emerged as an issue specific to one block in AC-201"
4. Anomaly detection.
Booths whose sentiment trajectory diverges from the expected trend, which is usually a leading indicator that something specific happened (good or bad):
- Sudden negative spike: probably a controversy or local incident
- Sudden positive spike: probably a successful event or scheme delivery
- Slow drift in either direction: a deeper structural change in voter mood
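The spike and drift cases above can be sketched with a simple per-booth z-score. This is a minimal illustration, not a description of any particular production detector: the function names, the 2.0 threshold, the 7-day window and the 0.02 floor on the standard deviation are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def sentiment_anomaly(history, today, z_threshold=2.0, min_days=7):
    """Compare today's mean booth sentiment with the booth's own history.

    history: daily mean sentiment scores (-1..+1), oldest first.
    Returns (label, z_score).
    """
    if len(history) < min_days:
        return "insufficient_history", 0.0
    mu = mean(history)
    # Assumed floor so a very flat history does not produce huge z-scores.
    sigma = max(stdev(history), 0.02)
    z = (today - mu) / sigma
    if z <= -z_threshold:
        return "negative_spike", z
    if z >= z_threshold:
        return "positive_spike", z
    return "normal", z

def slow_drift(history, window=7, min_shift=0.1):
    """Return the shift between the last two windows, or 0.0 if it is small."""
    if len(history) < 2 * window:
        return 0.0
    recent = mean(history[-window:])
    earlier = mean(history[-2 * window:-window])
    shift = recent - earlier
    return shift if abs(shift) >= min_shift else 0.0
```

Comparing each booth with its own history, rather than with a global average, keeps a habitually negative booth from being flagged every day.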
The analytics pipeline architecture
A production pipeline has five stages. Each stage has specific tooling choices.
Stage 1: Transcript ingestion
As each call completes, the full transcript (timestamped, speaker-tagged) and metadata flow into a message queue. Volume scales with call rate: 5 lakh calls/day produces ~30 million transcribed words per day, or about 60 words per call.
- Tooling: Kafka or NATS for the queue, Postgres for the transcript store.
- Latency target: transcript available in analytics within 60 seconds of call end.
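For queue sizing, the volume figures above translate into per-second rates. The sketch below only does that arithmetic. The 10-hour dialling window is an assumption made for illustration; the source states only the daily totals.

```python
CALLS_PER_DAY = 500_000        # 5 lakh calls/day (from the text)
WORDS_PER_DAY = 30_000_000     # ~30 million transcribed words/day (from the text)

# Implied average transcript length.
words_per_call = WORDS_PER_DAY / CALLS_PER_DAY

# Average completion rate if calls were spread evenly over 24 hours.
avg_calls_per_second = CALLS_PER_DAY / 86_400

# Assumption: calls are actually concentrated in a 10-hour dialling window,
# so the queue must absorb a higher sustained rate than the 24-hour average.
DIALLING_HOURS = 10
window_calls_per_second = CALLS_PER_DAY / (DIALLING_HOURS * 3600)

print(f"{words_per_call:.0f} words/call")
print(f"{avg_calls_per_second:.2f} calls/s averaged over 24 h")
print(f"{window_calls_per_second:.2f} calls/s within the dialling window")
```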
Stage 2: NLP processing
Each transcript passes through several models:
- Sentiment classifier: typically a fine-tuned BERT or LLM (Gemini Flash, Claude Haiku). Output: sentiment score + class label.
- Intent classifier: same model with a different prompt, or a separate fine-tuned model. Output: Supportive / Undecided / Negative / Neutral / Refused.
- Issue extractor: LLM with a structured prompt: "list the 3 main issues this voter raised, in Hindi, one sentence each."
- Entity recognition: identifies references to specific people, places, schemes.
This stage costs roughly ₹0.05–₹0.20 per transcript, depending on the model and depth. At 5 lakh calls/day, that is about ₹25,000 to ₹1 lakh per day.
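Whatever models are used, their outputs need validating before they reach the warehouse. The sketch below assumes the model has been prompted to return one JSON object per transcript. The field names, the `CallRecord` type, the single overall confidence value and the 0.6 hand-off threshold are all hypothetical, and the model call itself is deliberately left out.

```python
import json
from dataclasses import dataclass

# Intent labels as listed in the per-call record description.
INTENTS = {"Supportive", "Undecided", "Negative", "Neutral", "Refused-to-Engage"}

@dataclass
class CallRecord:
    call_id: str
    sentiment_score: float   # -1.0 to +1.0
    intent: str
    issues: list             # at most 3 issue descriptions
    entities: list
    confidence: float        # simplification: one overall confidence value
    needs_handoff: bool

def parse_model_output(call_id, raw_json, min_confidence=0.6):
    """Validate a model's JSON reply and turn it into a CallRecord."""
    data = json.loads(raw_json)
    score = float(data["sentiment_score"])
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"sentiment_score out of range: {score}")
    intent = data["intent"]
    if intent not in INTENTS:
        raise ValueError(f"unknown intent: {intent!r}")
    confidence = float(data.get("confidence", 0.0))
    return CallRecord(
        call_id=call_id,
        sentiment_score=score,
        intent=intent,
        issues=list(data.get("issues", []))[:3],
        entities=list(data.get("entities", [])),
        confidence=confidence,
        # Assumed rule: hand off on an explicit grievance or low confidence.
        needs_handoff=bool(data.get("grievance", False)) or confidence < min_confidence,
    )
```

Rejecting malformed replies at this point keeps bad labels out of the booth-level aggregates downstream.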
Stage 3: Aggregation
The structured outputs flow into a data warehouse — typically BigQuery, Snowflake, or a managed Postgres equivalent — with materialised views per booth, AC, PC.
Updates happen continuously. Materialised views refresh every 1-5 minutes. War-room dashboards subscribe to these views.
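In a warehouse this aggregation would normally be a SQL view. The in-memory sketch below shows the same roll-up logic in Python so the shape of a booth-level row is concrete. The record keys (`booth_id`, `sentiment_score`, `intent`, `issues`) are assumed names, not a documented schema.

```python
from collections import Counter, defaultdict

def aggregate_by_booth(records, top_n=10):
    """Roll per-call records up into one summary row per booth."""
    calls = Counter()
    score_sum = defaultdict(float)
    intents = defaultdict(Counter)
    issues = defaultdict(Counter)
    for r in records:
        booth = r["booth_id"]
        calls[booth] += 1
        score_sum[booth] += r["sentiment_score"]
        intents[booth][r["intent"]] += 1
        issues[booth].update(r["issues"])   # count each mentioned issue once per call
    return {
        booth: {
            "calls": n,
            "mean_sentiment": score_sum[booth] / n,
            "intent_distribution": dict(intents[booth]),
            "top_issues": issues[booth].most_common(top_n),
        }
        for booth, n in calls.items()
    }
```

The same function applied with an AC or PC key instead of `booth_id` would give the higher-level views.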
Stage 4: Dashboard surfacing
The dashboards the campaign team actually sees:
- War-room screen: real-time today metrics, anomaly alerts, top emerging issues
- Booth-level deep dive: drill into a specific booth — sentiment history, top issues, recent conversation samples
- Campaign manager weekly: summary email with key changes, recommended actions
- Field-team mobile app: each karyakarta sees their assigned booths' sentiment + top complaints
Stage 5: Action triggers
The most under-built layer. Sentiment data should automatically trigger campaign actions:
- Negative spike in a booth → page the karyakarta + ground team supervisor
- New issue cluster trending → notify the manifesto team
- Specific voter raises a grievance → create a ticket in the CRM, assign to local karyakarta
Without action triggers, the analytics layer is just art. With them, it's the campaign's real-time control plane.
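The three triggers listed above map naturally onto a small rule table. This is a sketch under assumed names: the event kinds, recipient labels and tuple format are hypothetical, and the actual delivery (paging, CRM API calls) is out of scope.

```python
def on_negative_spike(event):
    # Page both the booth's karyakarta and the ground team supervisor.
    return [("page", "karyakarta", event["booth_id"]),
            ("page", "ground_team_supervisor", event["booth_id"])]

def on_new_issue_cluster(event):
    return [("notify", "manifesto_team", event["issue"])]

def on_voter_grievance(event):
    # The ticket is keyed by the hashed voter identifier, not a phone number.
    return [("create_ticket", "local_karyakarta", event["voter_hash"])]

# Assumed event kinds; one handler per trigger described in the text.
RULES = {
    "negative_spike": on_negative_spike,
    "new_issue_cluster": on_new_issue_cluster,
    "voter_grievance": on_voter_grievance,
}

def actions_for(event):
    """Return the list of (action, recipient, subject) tuples for an event."""
    handler = RULES.get(event["kind"])
    return handler(event) if handler else []
```

Keeping the rules in a table rather than scattered through dashboard code makes it easy to review which signals actually cause someone to be paged.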
Sentiment classification: getting it right
The trick to good sentiment in Indian languages is what you measure, not just how you measure.
What works
- Multilingual transformer models (Gemini, Claude, GPT-4, fine-tuned XLM-RoBERTa) on the transcript. These understand Hindi+English code-switching natively.
- Asking the model to explain its classification: "Classify the sentiment of this transcript and give 2 evidence quotes from the transcript". The evidence quotes are debuggable; pure score outputs are not.
- Calibration with a held-out human-rated sample: 500-1000 transcripts hand-rated by native speakers, used to validate the model's accuracy.
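The calibration step can be as simple as comparing model labels with the human ratings on the held-out sample. The sketch below computes overall accuracy, per-class recall and a confusion count. The function name and output keys are illustrative assumptions.

```python
from collections import Counter

def calibration_report(human_labels, model_labels):
    """Compare model labels against human ratings on a held-out sample."""
    if len(human_labels) != len(model_labels) or not human_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    # confusion[(human, model)] = number of transcripts with that pair.
    confusion = Counter(zip(human_labels, model_labels))
    agree = sum(n for (h, m), n in confusion.items() if h == m)
    human_totals = Counter(human_labels)
    per_class_recall = {
        cls: confusion[(cls, cls)] / total
        for cls, total in human_totals.items()
    }
    return {
        "accuracy": agree / len(human_labels),
        "per_class_recall": per_class_recall,
        "confusion": confusion,
    }
```

Per-class recall matters more than the headline accuracy here: a classifier can score well overall while missing most of the Strongly Negative transcripts, which are the ones a campaign most needs to catch.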
What doesn't work
- Lexicon-based sentiment (looking for positive/negative words). Misses sarcasm, context, dialect. Accuracy on Indian Hindi maxes out around 60%.
- English-only models translated to Hindi. Loses nuance.
- Single-model classification without confidence scores. The campaign needs to know when to trust the classifier and when to escalate to a human.
Common errors to expect
- Sarcasm (especially negative sarcasm). "वाह, क्या सरकार है" ("Wow, what a government") can be deeply negative or genuinely positive, depending on tone.
- Politeness drowning out negative content. A voter who politely says "मुझे थोड़ी समस्या है" ("I have a small problem") might actually be very angry. The model sometimes classifies the politeness, not the substance.
- Mixed sentiment. A voter who is positive about the candidate but negative about a specific policy decision. Single sentiment score loses this; multi-dimensional scoring captures it.
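One way to represent the mixed-sentiment case is to score each target separately and derive the overall figure from those. The sketch below is illustrative only: `TargetSentiment`, the target names and the definition of "mixed" are assumptions.

```python
from dataclasses import dataclass

@dataclass
class TargetSentiment:
    target: str    # e.g. "candidate" or a specific policy
    score: float   # -1.0 to +1.0

def summarise(aspects):
    """Collapse per-target scores without hiding disagreement between them."""
    if not aspects:
        raise ValueError("at least one target score is required")
    scores = [a.score for a in aspects]
    return {
        "overall": sum(scores) / len(scores),
        # Assumed definition: mixed means at least one negative and one positive target.
        "mixed": min(scores) < 0 < max(scores),
        "most_negative": min(aspects, key=lambda a: a.score).target,
    }
```

A voter at +0.7 on the candidate and -0.6 on a policy averages out to roughly neutral, so the `mixed` flag and `most_negative` target carry the information the single score loses.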
Issue extraction: the harder problem
Sentiment is the comparatively straightforward part. The harder problem is what the voter is talking about.
Two approaches:
1. Predefined taxonomy. Maintain a list of 100-200 known issues (water, roads, jobs, schools, hospitals, pension, ration, electricity, security, scheme delivery). The model classifies each transcript against this list.
- Pro: comparable across time, clean dashboards
- Con: misses emerging issues that aren't in the list
2. Open-vocabulary clustering. The model freely describes what the voter raised. A clustering step groups similar descriptions across thousands of transcripts to surface emergent themes.
- Pro: discovers new issues automatically
- Con: harder to track over time, dashboards more chaotic
Hybrid approach (what production systems do): predefined taxonomy for the top 80% of issues + open vocabulary for the long tail + a weekly review where new emergent themes get promoted into the taxonomy.
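The hybrid flow can be sketched as: route each issue description to the taxonomy when it matches, cluster whatever is left, and surface large clusters as promotion candidates for the weekly review. In this sketch, keyword matching and token-overlap (Jaccard) clustering stand in for the model-based classification and clustering a real system would use. The tiny taxonomy and both thresholds are assumptions.

```python
from collections import Counter

# Assumed miniature taxonomy: issue name -> trigger keywords.
TAXONOMY = {
    "water": {"water", "tap", "pipeline"},
    "roads": {"road", "roads", "pothole"},
    "pension": {"pension"},
}

def tokens(text):
    return set(text.lower().split())

def route_issue(description):
    """Return the taxonomy issue a description matches, or None."""
    words = tokens(description)
    for issue, keywords in TAXONOMY.items():
        if words & keywords:
            return issue
    return None

def cluster_long_tail(descriptions, min_overlap=0.5):
    """Greedy clustering: join the first cluster whose seed overlaps enough."""
    clusters = []  # list of (seed_tokens, members)
    for d in descriptions:
        t = tokens(d)
        for seed, members in clusters:
            union = seed | t
            if union and len(seed & t) / len(union) >= min_overlap:
                members.append(d)
                break
        else:
            clusters.append((t, [d]))
    return [members for _, members in clusters]

def weekly_review(descriptions, promote_at=3):
    """Count known issues and list long-tail clusters big enough to promote."""
    known, tail = Counter(), []
    for d in descriptions:
        issue = route_issue(d)
        if issue:
            known[issue] += 1
        else:
            tail.append(d)
    candidates = [c for c in cluster_long_tail(tail) if len(c) >= promote_at]
    return known, candidates
```

Promotion stays a human decision in this design: the code only proposes candidates, and the weekly review decides which ones enter the taxonomy.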
Privacy and DPDP considerations
Sentiment analytics processes voter conversations. DPDP rules apply.
- Hash voter identifiers before sentiment processing. The pipeline should not need raw phone numbers.
- Aggregate at booth level for dashboard surfacing. Individual-voter sentiment should not be surfaced casually.
- Right-to-erasure must remove sentiment records too, not just call transcripts.
- Retention policy: same 24-month default applies to sentiment-derived data.
In particular, do not export sentiment data outside the DPDP-residency boundary. The processing should run in India-hosted infrastructure even if the underlying models are accessed via API.
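The hashing and erasure points above can be sketched as follows. A keyed hash (HMAC-SHA256) is used rather than a plain hash because phone numbers are short enough to brute-force. That choice, the 10-digit normalisation and the dictionary-based stores are assumptions for illustration, not a statement of what DPDP requires.

```python
import hashlib
import hmac

def pseudonymise(phone_number, secret_key):
    """Derive a stable pseudonymous voter ID from a phone number.

    secret_key must be bytes and should be held outside the analytics pipeline.
    """
    # Assumed normalisation: keep digits only, then the last 10 (drops +91 / 0 prefixes).
    digits = "".join(ch for ch in phone_number if ch.isdigit())[-10:]
    return hmac.new(secret_key, digits.encode("utf-8"), hashlib.sha256).hexdigest()

def erase_voter(voter_hash, transcripts, sentiment_records):
    """Remove a voter from BOTH stores; return True if anything was deleted."""
    removed_transcript = transcripts.pop(voter_hash, None) is not None
    removed_sentiment = sentiment_records.pop(voter_hash, None) is not None
    return removed_transcript or removed_sentiment
```

Because the same key always yields the same ID, an erasure request arriving with a raw phone number can be mapped to the right records without the pipeline ever storing that number.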
What the dashboards should actually show
Most analytics dashboards fail at the design step — they show too much data and not enough decision-relevant insight. A working campaign dashboard has:
Front page (war-room view):
- Today's call volume + completion rate
- Sentiment distribution today vs yesterday
- Top 3 anomalous booths (with one-click drill-down)
- Top 5 emerging issues
- Operational alerts (if any)
Booth deep-dive page:
- Sentiment history (last 30 days, daily granularity)
- Top 10 issues with trend arrows
- Demographic breakdown
- 5 sample voter quotes (anonymised) per sentiment class
- Karyakarta assigned to this booth + last visit date
Field-team mobile view:
- Their booths only
- Today's top complaints
- Voters flagged for follow-up
- One-tap to add an update
The wrong way to design these: dump every metric on every page. The right way: each page answers one specific decision the user is about to make.
When sentiment data lies to you
Sentiment from AI calls is not the only source of truth. It systematically over-represents:
- Voters who answer the phone. These skew younger and more engaged.
- Voters who actually engage. Self-selection toward those willing to talk.
- Voters whose language matches the agent's dialect. Bad dialect coverage produces refusal-skewed samples.
It systematically under-represents:
- Senior voters who don't answer unfamiliar numbers
- Apolitical voters who don't want to engage on political topics
- Voters who have strong views but communicate through different channels (e.g., local karyakarta visits)
The campaign that treats AI sentiment as the only truth misses important segments. The right approach is to triangulate with door-to-door survey data, traditional polling and field-team intelligence.
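Before triangulating, it helps to measure how skewed the AI-call sample actually is. The sketch below compares each group's share of respondents with its share of the electorate. The group labels and the idea of using electoral-roll counts as the baseline are assumptions, and any numbers passed in are the caller's own.

```python
def coverage_ratios(respondents_by_group, electorate_by_group):
    """Ratio of each group's share among respondents to its share of the electorate.

    A ratio above 1 means the group is over-represented in the call sample,
    below 1 means under-represented.
    """
    total_respondents = sum(respondents_by_group.values())
    total_electorate = sum(electorate_by_group.values())
    if total_respondents == 0 or total_electorate == 0:
        raise ValueError("both inputs need at least one voter")
    ratios = {}
    for group, voters in electorate_by_group.items():
        if voters == 0:
            continue  # no baseline for this group, so no ratio
        electorate_share = voters / total_electorate
        respondent_share = respondents_by_group.get(group, 0) / total_respondents
        ratios[group] = respondent_share / electorate_share
    return ratios
```

Groups with a low ratio are the ones where door-to-door surveys and field-team reports should carry the most weight.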
Where AiSewak fits
AiSewak's analytics layer ships with:
- Real-time sentiment classification (Hindi + 10 regional languages)
- Predefined taxonomy of 200 Indian political/civic issues + open vocabulary discovery
- Booth-level aggregation at 60-second refresh
- Three default dashboards (war-room, booth deep dive, field-team mobile)
- Configurable action triggers (alerts, ticket creation, escalation)
- 24-month historical retention with DPDP-compliant erasure
Where to go next
- The Two-Way Voter Engagement Engine Architecture — the system underneath
- GOTV with AI: Polling Day Playbook — how sentiment data drives the GOTV wave
- AI for Booth Workers — getting sentiment to the karyakarta who can act
- Conversational AI Use Cases — the use cases that produce the conversations
Sentiment analysis is what turns AI calls from outreach activity into political intelligence. The campaigns that figure this out by mid-2026 will be operating at a level of decision-precision that their competitors won't match without it.