AI & Voice Agents — Complete ViciDial & VoIP Guide

10 Tutorials

Building an AI-Powered VoIP Call Quality Analysis Service Advanced · 40 min

FastAPI + NISQA Neural Model + Silero VAD + Claude AI

Building a Real-Time AI Voice Agent for Asterisk Advanced · 44 min

AudioSocket + Deepgram STT + Groq LLM + Cartesia TTS

ElevenLabs Cloud Voice Agent with Asterisk SIP Integration Advanced · 46 min

Cloud-Hosted Conversational AI with Custom Webhook Tools, SIP Trunk Routing, and Dynamic Call Context Injection

Call Recording Transcription with Faster-Whisper Intermediate · 27 min

Batch transcription of ViciDial call recordings using Faster-Whisper (OpenAI Whisper optimized with CTranslate2) for speech-to-text at scale — on CPU, without cloud APIs.

AI Voice Agent Prompt Engineering & Conversation Design Intermediate · 58 min

Designing Personalities, Conversation Flows, and Tool-Calling Workflows from Real Call Transcription Analysis

Voice Agent Tech Stack Comparison: Local vs Cloud with Shared Booking Backend Advanced · 42 min

ElevenLabs Cloud vs Deepgram+Groq+Cartesia Local: Architecture, Latency, Cost, and Migration

QA Pipeline — Call Transcription + AI Quality Scoring Advanced · 71 min

Build an automated quality assurance pipeline that transcribes every inbound call using Faster-Whisper and scores agent performance with AI — running entirely on your existing ViciDial server with zero impact on live calls.

AI Voice Agent — OpenAI Realtime API + Asterisk Advanced · 76 min

Build a Production Voice Agent Using OpenAI's Native Speech-to-Speech API with Asterisk AudioSocket

Real-Time Call Transcription & Sentiment Dashboard Advanced · 89 min

Build a Live Monitoring System That Transcribes Active Calls and Displays Sentiment Analysis in Real-Time

AI-Powered Answering Machine Detection — Whisper + ML Classifier Advanced · 82 min

Build a self-hosted answering machine detection (AMD) system that replaces Asterisk's built-in `AMD()` application with a Whisper-based speech recognition + machine learning classifier pipeline. Traditional AMD relies on energy detection and cadence analysis, achieving only 60-70% accuracy in real-world conditions — misclassifying live humans as machines (killing revenue-generating calls) and letting voicemail greetings through to agents (wasting expensive seat time). This tutorial's AI approach transcribes the first 3-5 seconds of answered audio using OpenAI's Whisper model, then feeds the transcript and audio features into a trained ML classifier that distinguishes human pickups from answering machines with 95%+ accuracy. The entire system runs on your own hardware with no per-call API costs, processes decisions in under 2 seconds, and continuously improves as you feed it new labeled data from your call center's actual traffic.
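As a rough illustration of the "transcript plus audio features into a classifier" idea described above, here is a minimal Python sketch. It is not the tutorial's implementation: the phrase list, the `AmdFeatures` fields, the thresholds, and the hand-written scoring rule are all hypothetical stand-ins for a classifier trained on labeled calls, and the transcript is assumed to have already been produced by Whisper from the first few seconds of audio.

```python
from dataclasses import dataclass

# Hypothetical phrases that tend to appear in voicemail greetings (assumption,
# not taken from the tutorial).
MACHINE_PHRASES = (
    "leave a message",
    "after the tone",
    "after the beep",
    "not available",
    "voicemail",
    "mailbox",
)


@dataclass
class AmdFeatures:
    word_count: int      # words in the transcript of the opening audio
    phrase_hits: int     # number of voicemail-style phrases found
    speech_ratio: float  # fraction of the analysis window containing speech


def extract_features(transcript: str, speech_seconds: float,
                     window_seconds: float) -> AmdFeatures:
    """Turn a transcript and simple timing data into classifier features."""
    text = transcript.lower()
    hits = sum(1 for phrase in MACHINE_PHRASES if phrase in text)
    ratio = speech_seconds / window_seconds if window_seconds > 0 else 0.0
    return AmdFeatures(len(text.split()), hits, min(ratio, 1.0))


def classify(features: AmdFeatures) -> str:
    """Toy rule-based stand-in for a trained model: 'machine' or 'human'."""
    score = 2.0 * features.phrase_hits
    if features.word_count >= 8:      # long uninterrupted greeting
        score += 1.0
    if features.speech_ratio >= 0.7:  # nearly continuous speech
        score += 1.0
    return "machine" if score >= 2.0 else "human"


if __name__ == "__main__":
    live = extract_features("Hello?", speech_seconds=0.6, window_seconds=4.0)
    greeting = extract_features(
        "Hi, you've reached John. Please leave a message after the tone.",
        speech_seconds=3.6,
        window_seconds=4.0,
    )
    print(classify(live), classify(greeting))
```

In a real deployment the scoring rule would be replaced by a model fitted to your own labeled traffic; the sketch only shows how short live pickups and long, continuous greetings separate on features of this kind.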

