VOICE PIPELINE DEMO

Voice Support Agent

One model, LFM2-Audio-1.5B, handles both speech-to-text and text-to-speech, and a specialist 350M-parameter model, LFM2-350M, classifies intent in 15 ms. No separate ASR or TTS services are needed.

🎙️ Speech-to-Text: LFM2-Audio-1.5B
🧠 Intent Classification: LFM2-350M
🔊 Text-to-Speech: LFM2-Audio-1.5B
Unified Audio Model: LFM2-Audio-1.5B handles both STT and TTS, replacing separate Whisper and TTS services with a single model on one GPU.
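The three stages above can be sketched as a small pipeline in which one audio model object serves both the speech-to-text and text-to-speech steps. This is a minimal illustration, not the demo's actual implementation: `VoicePipeline`, `StubAudioModel`, `stub_classify`, and `stub_respond` are hypothetical names, and the stubs only pass bytes and text through so the example runs without any model weights.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains STT -> intent classification -> reply -> TTS."""
    transcribe: Callable[[bytes], str]      # speech-to-text stage
    classify: Callable[[str], str]          # intent classification stage
    respond: Callable[[str, str], str]      # hypothetical reply builder
    synthesize: Callable[[str], bytes]      # text-to-speech stage

    def run(self, audio_in: bytes) -> dict:
        transcript = self.transcribe(audio_in)
        intent = self.classify(transcript)
        reply_text = self.respond(transcript, intent)
        audio_out = self.synthesize(reply_text)
        return {
            "transcript": transcript,
            "intent": intent,
            "reply_text": reply_text,
            "audio_out": audio_out,
        }


class StubAudioModel:
    """Placeholder for a unified audio model (assumption: not a real API).

    Audio is faked as UTF-8 bytes so both directions are trivial.
    """

    def speech_to_text(self, audio: bytes) -> str:
        return audio.decode("utf-8")

    def text_to_speech(self, text: str) -> bytes:
        return text.encode("utf-8")


def stub_classify(text: str) -> str:
    # Stand-in for the small intent model; the labels are made up.
    return "billing" if "charged" in text.lower() else "general"


def stub_respond(transcript: str, intent: str) -> str:
    return f"[{intent}] I can help with that."


# The same model instance backs both the first and the last stage.
audio_model = StubAudioModel()
pipeline = VoicePipeline(
    transcribe=audio_model.speech_to_text,
    classify=stub_classify,
    respond=stub_respond,
    synthesize=audio_model.text_to_speech,
)
result = pipeline.run(b"I was charged twice on my credit card")
```

Passing the stages in as callables keeps the pipeline independent of any particular model library, so real STT, intent, and TTS calls could be swapped in for the stubs without changing `run`.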
Press and hold the microphone to speak.
Try saying:
"I was charged twice on my credit card"
"My internet keeps dropping"
"Third time calling about this issue"
"Update my email address"
"What are your business hours?"
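To illustrate how the sample phrases above might map to different intents, here is a small keyword-rule stand-in for the intent classifier. It is only a sketch: the intent labels and keyword lists are assumptions made for this example, and the demo's actual classifier is a model, not a rule table.

```python
# Hypothetical intent labels and keywords; not taken from the demo itself.
RULES = [
    ("escalation", ("third time", "manager")),
    ("billing", ("charged", "refund", "invoice")),
    ("technical_support", ("internet", "dropping", "outage")),
    ("account_update", ("update my", "change my")),
]


def classify_intent(utterance: str) -> str:
    """Return the first rule label whose keyword appears in the utterance."""
    text = utterance.lower()
    for label, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return label
    return "general_inquiry"  # fallback when no rule matches


samples = [
    "I was charged twice on my credit card",
    "My internet keeps dropping",
    "Third time calling about this issue",
    "Update my email address",
    "What are your business hours?",
]
intents = {s: classify_intent(s) for s in samples}
```

Each sample phrase lands in a different bucket under these rules, which is presumably why the demo suggests them: together they exercise several distinct intent classes.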