nVoice
A 3-second pause feels like an eternity to a caller. They start talking again. The agent gets confused. The conversation falls apart.
nVoice is the voice infrastructure that doesn't pause. Sub-1.5s end-to-end latency from caller speech to agent response. STT, LLM, and TTS in one pipeline. Multilingual, including Hinglish. Built on Pipecat and LiveKit. Open source.
SIGNAL_FLOW
Architecture
Inputs → Engine → Outputs
SYSTEM_CAPABILITIES
Full-stack voice infrastructure.
01
Sub-1.5s Latency
End-to-end from caller speech to agent response in under 1.5 seconds.
02
Multilingual
Supports English, Hindi, and Hinglish, and is extensible to other languages.
03
Pulse-Modulated Output
TTS output adapts pace, tone, and emotion based on real-time nPulse signals.
04
STT Integration
Pluggable speech-to-text. Deepgram, Whisper, or custom models.
05
TTS Integration
Pluggable text-to-speech. ElevenLabs, Resemble AI, or custom voices.
06
Transport Layer
Built on Pipecat and LiveKit for reliable, low-latency WebRTC streaming.
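To make the pluggable STT, LLM, and TTS stages and the 1.5-second budget concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not nVoice's actual code and not the Pipecat or LiveKit API: every name in it (`VoicePipeline`, `TurnResult`, `EchoSTT`, `UppercaseLLM`, `BytesTTS`, `budget_s`) is hypothetical, and the three stand-in stages only mimic real providers.

```python
import time
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, text: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class TurnResult:
    audio: bytes          # synthesized agent response
    latency_s: float      # wall-clock time for the whole turn
    within_budget: bool   # True if the turn beat the latency budget


class VoicePipeline:
    """Hypothetical STT -> LLM -> TTS chain with swappable stages."""

    def __init__(self, stt: SpeechToText, llm: LanguageModel,
                 tts: TextToSpeech, budget_s: float = 1.5) -> None:
        self.stt = stt
        self.llm = llm
        self.tts = tts
        self.budget_s = budget_s  # assumed budget, taken from the 1.5s claim

    def handle_turn(self, audio: bytes) -> TurnResult:
        start = time.perf_counter()
        text = self.stt.transcribe(audio)     # caller speech -> text
        answer = self.llm.reply(text)         # text -> agent reply
        speech = self.tts.synthesize(answer)  # agent reply -> audio
        elapsed = time.perf_counter() - start
        return TurnResult(speech, elapsed, elapsed < self.budget_s)


# Stand-in stages for illustration only; a real deployment would wrap
# a provider such as Deepgram or ElevenLabs behind the same interface.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UppercaseLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = VoicePipeline(EchoSTT(), UppercaseLLM(), BytesTTS())
result = pipeline.handle_turn(b"namaste, how are you")
print(result.within_budget, f"{result.latency_s:.6f}s")
```

Because each stage is only an interface, swapping a provider means passing a different object to the constructor; the turn-handling and latency check stay unchanged.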
Open Source
nVoice is fully open source. Inspect every line of the voice pipeline. Contribute, fork, or self-host.