Complete guide to AI & voice agents for ViciDial and Asterisk. 10 in-depth tutorials covering everything from basics to advanced production setups.
FastAPI + NISQA Neural Model + Silero VAD + Claude AI
AudioSocket + Deepgram STT + Groq LLM + Cartesia TTS
Cloud-Hosted Conversational AI with Custom Webhook Tools, SIP Trunk Routing, and Dynamic Call Context Injection
Batch transcription of ViciDial call recordings using Faster-Whisper (OpenAI Whisper optimized with CTranslate2) for speech-to-text at scale — on CPU, without cloud APIs.
Designing Personalities, Conversation Flows, and Tool-Calling Workflows from Real Call Transcription Analysis
ElevenLabs Cloud vs Deepgram+Groq+Cartesia Local -- Architecture, Latency, Cost, and Migration
Build an automated quality assurance pipeline that transcribes every inbound call using Faster-Whisper and scores agent performance with AI — running entirely on your existing ViciDial server with zero impact on live calls.
Build a Production Voice Agent Using OpenAI's Native Speech-to-Speech API with Asterisk AudioSocket
Build a Live Monitoring System That Transcribes Active Calls and Displays Sentiment Analysis in Real-Time
Build a self-hosted answering machine detection (AMD) system that replaces Asterisk's built-in `AMD()` application with a pipeline combining Whisper-based speech recognition and a machine learning classifier. Traditional AMD relies on energy detection and cadence analysis and achieves only 60-70% accuracy in real-world conditions. It misclassifies live humans as machines (killing revenue-generating calls) and lets voicemail greetings through to agents (wasting expensive seat time). This tutorial's AI approach transcribes the first 3-5 seconds of answered audio using OpenAI's Whisper model, then feeds the transcript and audio features into a trained ML classifier that distinguishes human pickups from answering machines with 95%+ accuracy. The entire system runs on your own hardware with no per-call API costs, reaches a decision in under 2 seconds, and continuously improves as you feed it new labeled data from your call center's actual traffic.
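The classification step described above (transcript plus audio features in, human/machine decision out) can be sketched roughly as follows. This is a minimal illustration, not the tutorial's actual classifier: the phrase list, the feature set, the weights, and the names `extract_features`, `machine_probability`, and `classify` are all hypothetical, and the hand-set logistic scorer stands in for a model trained on your own labeled calls. The transcript is assumed to come from Whisper run on the first 3-5 seconds of answered audio.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical phrase list; a trained model would learn such cues from labeled calls.
MACHINE_PHRASES = (
    "leave a message",
    "after the tone",
    "after the beep",
    "not available",
    "voicemail",
    "mailbox",
)


@dataclass
class AnswerFeatures:
    word_count: int            # words in the transcript of the opening audio
    machine_phrase_hits: int   # how many distinct greeting phrases matched
    speech_seconds: float      # audio feature: duration of detected speech


def extract_features(transcript: str, speech_seconds: float) -> AnswerFeatures:
    """Turn a transcript and one audio measurement into classifier features."""
    text = transcript.lower()
    words = re.findall(r"[a-z']+", text)
    hits = sum(1 for phrase in MACHINE_PHRASES if phrase in text)
    return AnswerFeatures(len(words), hits, speech_seconds)


# Illustrative, hand-picked weights (an assumption, not fitted values).
BIAS = -2.0
W_WORD_COUNT = 0.15
W_PHRASE_HITS = 2.5
W_SPEECH_SECONDS = 0.4


def machine_probability(f: AnswerFeatures) -> float:
    """Logistic score: long greetings with voicemail phrasing push toward 1."""
    z = (
        BIAS
        + W_WORD_COUNT * f.word_count
        + W_PHRASE_HITS * f.machine_phrase_hits
        + W_SPEECH_SECONDS * f.speech_seconds
    )
    return 1.0 / (1.0 + math.exp(-z))


def classify(transcript: str, speech_seconds: float, threshold: float = 0.5) -> str:
    """Return "MACHINE" or "HUMAN" for the opening seconds of an answered call."""
    p = machine_probability(extract_features(transcript, speech_seconds))
    return "MACHINE" if p >= threshold else "HUMAN"


if __name__ == "__main__":
    print(classify("Hello?", 0.6))
    print(classify(
        "Hi, you've reached John. I'm not available right now, "
        "please leave a message after the tone.",
        4.5,
    ))
```

In a real deployment the hand-set weights would be replaced by a classifier fitted to labeled recordings, and the threshold would be tuned to trade off dropped live calls against voicemail greetings reaching agents.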