VoIP Troubleshooting Runbook
A Systematic 7-Step Diagnostic Procedure for Asterisk & ViciDial Issues
Difficulty: Intermediate to Advanced | Use case: Day-to-day operations | Asterisk version: 16+ (tested on 18.x, compatible with 11.x/13.x)
| Difficulty | Intermediate to Advanced |
| Time to Use | 5-30 minutes per incident |
| Prerequisites | Root SSH access to Asterisk/ViciDial server, basic SQL knowledge, familiarity with SIP concepts |
| Tested On | Asterisk 18.26, ViciDial 2.14+, MariaDB 10.5+, Homer 7.x, Smokeping 2.8 |
Table of Contents
- Introduction: Why You Need a Runbook
- Diagnostic Philosophy
- The 7-Step Diagnostic Procedure
- Common Issues and Solutions
- Decision Trees
- SIP Response Code Reference
- Asterisk Hangup Cause Reference
- ViciDial Status and Term Reason Reference
- Tool Commands Quick Reference
- Building Your Own Diagnostic CLI Tool
- Appendix: Configuration File Locations
1. Introduction: Why You Need a Runbook
At 2 AM, a trunk goes down. At 9 AM on Monday, three agents report choppy audio. At noon, a client calls to say customers are hearing silence after they answer. In each of these situations, you need to diagnose the problem quickly and accurately -- not fumble through random CLI commands hoping to stumble on the cause.
This runbook is the exact diagnostic procedure used to troubleshoot a production ViciDial call center fleet handling thousands of calls per day across multiple servers and SIP providers. Every command, every SQL query, every decision tree comes from real incidents that cost real money when they were not caught quickly.
What this runbook gives you:
- A repeatable 7-step procedure that works for any VoIP issue -- from one-way audio to trunk failures to LAGGED agents
- Copy-paste commands for every diagnostic step, with explanations of what to look for in the output
- Decision trees for the three most common complaint categories
- A complete SIP response code reference (1xx through 6xx) with plain-English explanations
- A complete Asterisk hangup cause reference (PRI causes 1 through 127) with what each one actually means
- A ViciDial status code reference so you can decode DISMX, DCMX, NANQUE, and every other cryptic disposition
Who this is for:
- VoIP engineers who maintain Asterisk or ViciDial servers
- NOC operators who need to diagnose call quality issues during shifts
- System administrators who inherited a VoIP platform and need a structured approach
- Anyone who has ever SSH'd into a server and typed sip show peers without knowing what to do next
2. Diagnostic Philosophy
Before diving into the steps, internalize these principles:
Always work from the database outward
The database is the source of truth. Every call that passes through ViciDial gets logged with timestamps, statuses, hangup reasons, and unique IDs. Start there, get the facts, then go hunting in logs and live systems.
Correlate across layers
A single symptom can have causes at different layers:
[Application Layer] ViciDial disposition codes, agent states, call routing
|
[Signaling Layer] SIP INVITE/BYE/CANCEL, 4xx/5xx responses, SDP negotiation
|
[Media Layer] RTP streams, codecs, jitter buffers, packet loss
|
[Network Layer] Latency, packet loss, MTU, firewall rules, NAT traversal
|
[Infrastructure] Server load, disk space, DNS resolution, time sync
A call that "drops after 30 seconds" could be a missing SIP ACK (signaling), asymmetric NAT (network), a full disk stopping recording (infrastructure), or a ViciDial timeout (application). The 7-step procedure checks all layers systematically.
Document as you go
When you find the root cause at 3 AM, you will not remember the details at 9 AM. Capture:
- The phone number and call time
- The uniqueid from the database
- The relevant log lines
- What you changed to fix it
The 80/20 of VoIP problems
In practice, roughly 80% of VoIP issues fall into five categories. The typical shares below are relative to that group:
| Category | Typical share | Root cause |
|---|---|---|
| NAT / firewall | 30% | Agent behind restrictive NAT, RTP ports blocked, SIP ALG interference |
| Trunk / carrier | 25% | Provider outage, IP changed, SIP credentials expired, route congestion |
| Network quality | 15% | High latency (>150ms), packet loss (>1%), jitter (>30ms) |
| Agent software | 15% | Old softphone, wrong codec, microphone issues, VPN interference |
| Server-side | 15% | Disk full, Asterisk overloaded, misconfigured dialplan, time drift |
3. The 7-Step Diagnostic Procedure
Step 1: Identify the Call in the Database
Goal: Get the uniqueid, timestamps, status codes, and call metadata. This anchors every subsequent step.
For inbound calls (customer called in):
-- Search by phone number in the inbound call log
SELECT call_date, phone_number, length_in_sec, status, term_reason,
uniqueid, user, campaign_id, queue_seconds, closecallid
FROM vicidial_closer_log
WHERE phone_number LIKE '%PHONE_NUMBER%'
ORDER BY call_date DESC
LIMIT 20;
For outbound calls (dialer called out):
-- Search by phone number in the outbound call log
SELECT call_date, phone_number, length_in_sec, status, term_reason,
uniqueid, user, campaign_id, list_id
FROM vicidial_log
WHERE phone_number LIKE '%PHONE_NUMBER%'
ORDER BY call_date DESC
LIMIT 20;
What to look for:
| Field | What it tells you |
|---|---|
| status | Call disposition -- A (answered), DISMX (abnormal disconnect inbound), DCMX (abnormal disconnect outbound), DROP (abandoned in queue), NANQUE (no agent in queue), AFTHRS (after hours) |
| term_reason | Who ended the call -- CALLER, AGENT, NONE (system), ABANDON (hung up in queue), NOAGENT, AFTERHOURS |
| length_in_sec | Duration -- calls under 15 seconds with CALLER termination often indicate one-way audio (customer hears nothing, hangs up) |
| queue_seconds | Time spent waiting -- high values indicate no available agents |
| uniqueid | The Asterisk channel unique ID -- you need this for every subsequent step |
| user | The agent who handled the call (if any) |
Quick pattern recognition:
-- Find all abnormal disconnects in the last 24 hours
SELECT call_date, phone_number, status, term_reason, uniqueid, user
FROM vicidial_closer_log
WHERE status IN ('DISMX', 'DCMX')
AND call_date > NOW() - INTERVAL 24 HOUR
ORDER BY call_date DESC;
-- Find calls that dropped within 15 seconds (possible one-way audio)
SELECT call_date, phone_number, length_in_sec, status, term_reason, user
FROM vicidial_closer_log
WHERE length_in_sec < 15
AND length_in_sec > 0
AND term_reason = 'CALLER'
AND call_date > NOW() - INTERVAL 24 HOUR
ORDER BY call_date DESC;
Tip: If the user reports "calls dropping" but you see term_reason = CALLER, the customer is hanging up. This often points to audio quality issues rather than system failures -- the customer cannot hear the agent, so they hang up.
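To see whether short caller hang-ups cluster on particular agents, the result of the second query above can be summarised per user. This is a minimal sketch, assuming the rows were exported as tab-separated text (for example with mysql -N -B) in the column order call_date, phone_number, length_in_sec, status, term_reason, user. The function name is illustrative, not part of ViciDial.

```shell
# Count short caller hang-ups per agent.
# Input (stdin): tab-separated rows in the order
#   call_date, phone_number, length_in_sec, status, term_reason, user
# Output: "<count> <user>", highest count first.
short_hangups_by_agent() {
  awk -F'\t' '
    $3 > 0 && $3 < 15 && $5 == "CALLER" { n[$6]++ }
    END { for (u in n) print n[u], u }
  ' | sort -k1,1nr -k2,2
}
```

An agent who stands out at the top of this list is a good candidate for the agent-side checks in Step 7.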
Step 2: Trace the Call in Homer SIP Capture
Goal: See the complete SIP signaling flow -- INVITE, 100 Trying, 180 Ringing, 200 OK, ACK, BYE -- and identify where the conversation broke down.
If you have Homer deployed (see Tutorial 01), search by the phone number or Call-ID:
- Open Homer web UI at http://YOUR_MONITORING_SERVER:9080
- Set the time range to cover the call
- Search by:
  - Calling number or Called number (the phone number from Step 1)
  - Call-ID (if you have it from Asterisk logs)
- Click on the call to see the SIP ladder diagram
What to look for in the ladder diagram:
| Pattern | Meaning |
|---|---|
| INVITE -> 100 -> 180 -> 200 -> ACK -> (talk) -> BYE | Normal call flow |
| INVITE -> 100 -> 480 | Temporarily unavailable -- agent not registered |
| INVITE -> 100 -> 486 | Busy -- agent on another call |
| INVITE -> 100 -> 408 | Request timeout -- network/registration issue |
| INVITE -> 100 -> 503 | Service unavailable -- trunk overloaded or down |
| INVITE -> 200 -> ACK -> (short pause) -> BYE | Call connected but dropped quickly -- check media |
| INVITE -> 200 -> ACK -> (no BYE for a long time) | Zombie channel -- call ended but no BYE sent |
| No INVITE at all | Call never reached the trunk -- check dialplan |
Check SDP (Session Description Protocol) for media issues:
In the 200 OK response, examine the SDP body:
v=0
o=- 12345 12345 IN IP4 203.0.113.50
c=IN IP4 203.0.113.50 <-- Media IP (should be public)
m=audio 18000 RTP/AVP 0 8 18 <-- RTP port and codec offers
a=rtpmap:0 PCMU/8000 <-- G.711 ulaw
a=rtpmap:8 PCMA/8000 <-- G.711 alaw
a=rtpmap:18 G729/8000 <-- G.729
Red flags in SDP:
- Media IP (c=) is a private address (10.x, 172.16-31.x, 192.168.x) -- carrier has a NAT issue
- No common codec between INVITE and 200 -- codec mismatch will cause no audio
- RTP port is 0 -- media is being held or rejected
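Two of these red flags (a private media IP and an RTP port of 0) can be checked mechanically when the SDP body is saved to a file or piped from a capture. The following is a rough sketch, assuming one SDP body on stdin; sdp_red_flags is a hypothetical helper name, and it does not attempt the codec comparison, which needs both the INVITE and the 200 OK.

```shell
# Print a warning for each SDP red flag found on stdin.
sdp_red_flags() {
  awk '
    # SIP bodies use CRLF line endings; drop the trailing CR if present
    { sub(/\r$/, "") }
    # c=IN IP4 <addr> : connection (media) address
    $1 == "c=IN" && $2 == "IP4" {
      split($3, o, ".")
      if (o[1] == 10 || (o[1] == 172 && o[2] >= 16 && o[2] <= 31) ||
          (o[1] == 192 && o[2] == 168))
        print "private media IP: " $3
    }
    # m=audio <port> ... : port 0 means the stream is rejected or on hold
    $1 == "m=audio" && $2 == 0 { print "audio port is 0" }
  '
}
```

Usage: paste the SDP from Homer into a file and run sdp_red_flags < sdp.txt. No output means neither flag was found.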
If you do not have Homer:
You can still capture SIP packets on the server:
# Live capture of SIP traffic on port 5060 (run during a test call)
tcpdump -i eth0 -n -s 0 port 5060 -w /tmp/sip-capture.pcap
# Then analyze with sngrep (if installed) or download the pcap
sngrep -I /tmp/sip-capture.pcap
# Or use ngrep for real-time SIP message viewing
ngrep -W byline -d eth0 port 5060
Step 3: Check Asterisk Logs by Call ID
Goal: Follow the call through Asterisk's internal processing -- channel creation, dialplan execution, bridge setup, hangup cause.
Find the call in Asterisk logs:
# Using the uniqueid from Step 1, search the Asterisk message log
grep 'UNIQUEID_FROM_STEP_1' /var/log/asterisk/messages
# If you have a Call-ID (C-XXXXXXXX format), search by that
grep 'C-XXXXXXXX' /var/log/asterisk/messages
# Search by phone number to find the Call-ID first
grep 'PHONE_NUMBER' /var/log/asterisk/messages | tail -50
Critical log patterns to look for:
Normal call flow:
-- Executing [s@default:1] Answer()
-- Executing [s@default:2] Dial(SIP/agent_100,,tT)
-- SIP/agent_100-00001234 answered SIP/trunk_provider-00001235
-- Channel SIP/trunk_provider-00001235 joined bridge
-- Channel SIP/agent_100-00001234 joined bridge
Jitter buffer resyncing (IAX2 audio issues):
chan_iax2.c: Resyncing the jb -- Loss: 0.0024 Delay: 15234 Jit: 0.03
If the delay values are in the thousands or tens of thousands, the IAX2 jitter buffer is struggling. This causes choppy, delayed, or garbled audio on conference bridges.
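# Count jitter buffer resyncs today
# (the date pattern must match dateformat in logger.conf; this one assumes ISO dates such as 2024-01-05)
grep "$(date +%Y-%m-%d)" /var/log/asterisk/messages | grep -c 'Resyncing the jb'
# More than 100/day = problem; more than 1000/day = critical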
Strict RTP source switching (NAT issue):
res_rtp_asterisk.c: Strict RTP switching source address to 198.51.100.25:12345
This means Asterisk received RTP from a different IP than expected. Common when agents are behind NAT. Usually harmless (Asterisk adapts), but if it happens repeatedly on the same call, it indicates unstable NAT mapping.
Dial failures:
app_dial.c: Called SIP/trunk_provider/441234567890
app_dial.c: SIP/trunk_provider-0001 is circuit-busy
The trunk is at capacity or not responding.
Hangup cause in the h extension:
-- Executing [h@default:1] NoOp("SIP/trunk-0001", "Hangup cause: 16")
Cause 16 is normal clearing. Anything else warrants investigation (see the full reference in Section 7).
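To get a quick overview of which abnormal causes dominate, the h-extension lines can be tallied. This sketch assumes your dialplan logs the cause with the exact text "Hangup cause: N", as in the example above; if your NoOp uses different wording, adjust the pattern. The function name is illustrative.

```shell
# Print every hangup cause other than 16 found in log lines on stdin,
# as "<count> <cause>", most frequent first.
abnormal_hangup_causes() {
  grep -o 'Hangup cause: [0-9][0-9]*' |
    awk '$3 != 16 { n[$3]++ } END { for (c in n) print n[c], c }' |
    sort -k1,1nr -k2,2n
}
```

Usage: abnormal_hangup_causes < /var/log/asterisk/messages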
Channel errors:
func_hangupcause.c: Unable to find information for channel
bridge_channel.c: Channel SIP/agent_100-0001 left bridge
These indicate abnormal channel teardown -- the call was not cleanly terminated.
Step 4: Inspect Carrier-Level Details
Goal: Determine if the problem is on the carrier/trunk side -- SIP response codes, dial timing, and hangup causes at the network edge.
-- Check carrier log for the specific call (use uniqueid from Step 1)
SELECT call_date, dialstatus, hangup_cause, sip_hangup_cause,
sip_hangup_reason, dial_time, answered_time, channel
FROM vicidial_carrier_log
WHERE uniqueid = 'UNIQUEID_FROM_STEP_1';
-- Or search by trunk name for recent failures
SELECT call_date, dialstatus, hangup_cause, sip_hangup_cause,
sip_hangup_reason, channel
FROM vicidial_carrier_log
WHERE channel LIKE '%TRUNK_NAME%'
ORDER BY call_date DESC
LIMIT 20;
-- Aggregate trunk health: failure rate in the last hour
SELECT
SUBSTRING_INDEX(channel, '-', 1) AS trunk,
COUNT(*) AS total_calls,
SUM(CASE WHEN dialstatus = 'ANSWER' THEN 1 ELSE 0 END) AS answered,
SUM(CASE WHEN dialstatus = 'BUSY' THEN 1 ELSE 0 END) AS busy,
SUM(CASE WHEN dialstatus = 'NOANSWER' THEN 1 ELSE 0 END) AS noanswer,
SUM(CASE WHEN dialstatus = 'CHANUNAVAIL' THEN 1 ELSE 0 END) AS unavail,
SUM(CASE WHEN dialstatus = 'CONGESTION' THEN 1 ELSE 0 END) AS congestion
FROM vicidial_carrier_log
WHERE call_date > NOW() - INTERVAL 1 HOUR
GROUP BY trunk
ORDER BY total_calls DESC;
Interpreting carrier log fields:
| Field | Values | Meaning |
|---|---|---|
| dialstatus | ANSWER | Call was answered normally |
| | BUSY | Destination returned 486 Busy |
| | NOANSWER | Ring timeout -- no answer within dial_time |
| | CHANUNAVAIL | Trunk is down or unregistered |
| | CONGESTION | Network congestion or invalid number (usually 503/404) |
| | CANCEL | Caller hung up before answer |
| hangup_cause | 1-127 | PRI/Q.931 cause code (see Section 7) |
| sip_hangup_cause | 200-699 | SIP response code (see Section 6) |
| sip_hangup_reason | Text | Human-readable reason from the carrier |
| dial_time | Seconds | How long the call rang before answer/hangup |
| answered_time | Seconds | Duration of the answered portion |
Red flags:
- A trunk showing 100% CHANUNAVAIL = trunk is down
- sip_hangup_cause in the 400-499 range = client-side issue (your config)
- sip_hangup_cause in the 500-599 range = server-side issue (carrier problem)
- hangup_cause = 38 (network out of order) or hangup_cause = 34 (no circuit) = carrier capacity issue
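The red-flag rules above are simple enough to encode, which helps when you are scanning many carrier log rows at once. A minimal sketch follows; the function names and the output labels are illustrative, and the mapping is only the rule of thumb from this step, not a full cause-code reference.

```shell
# Map a SIP response code to the side that usually needs attention.
classify_sip_cause() {
  if [ "$1" -ge 400 ] && [ "$1" -le 499 ]; then
    echo "client-side"
  elif [ "$1" -ge 500 ] && [ "$1" -le 599 ]; then
    echo "server-side"
  else
    echo "other"
  fi
}

# Map a Q.931 hangup cause to a first-pass verdict.
classify_hangup_cause() {
  case $1 in
    16)    echo "normal clearing" ;;
    34|38) echo "carrier capacity" ;;
    *)     echo "investigate" ;;
  esac
}
```

For example, classify_sip_cause 503 prints server-side, and classify_hangup_cause 34 prints carrier capacity.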
Step 5: Analyze Audio and Recordings
Goal: Listen to the actual call audio to confirm the reported symptom and identify the audio problem type.
Locate the recording:
# Recordings are stored by date. Find by the call date from Step 1:
find /var/spool/asterisk/monitorDONE/YYYYMMDD/ -name "*UNIQUEID*" -o -name "*PHONE_NUMBER*"
# Or search the ViciDial recordings table:
mysql -u USER -pPASSWORD DATABASE -e "
SELECT recording_id, filename, location, start_time, length_in_sec
FROM recording_log
WHERE lead_id = LEAD_ID
ORDER BY start_time DESC
LIMIT 10;
"
Audio analysis checklist:
| Listen for | Indicates |
|---|---|
| Silence on one side only | One-way audio (NAT / firewall blocking RTP in one direction) |
| Complete silence both sides | RTP not flowing at all (ports blocked, codec mismatch, wrong media IP) |
| Choppy / cutting in and out | Packet loss > 2% or high jitter > 30ms |
| Robotic / metallic voice | Extreme jitter, jitter buffer underruns |
| Echo | Impedance mismatch at PSTN gateway, or acoustic echo from agent's speaker/mic |
| Background static / hiss | Low-quality codec (G.729) + packet loss, or analog line noise |
| Delayed audio (walkie-talkie) | High latency > 300ms (satellite link, distant agent, VPN overhead) |
| Audio fine then sudden drop | Network route change, NAT mapping timeout, SIP session timer mismatch |
Automated audio analysis (if you have the service):
# If you have deployed the AI audio analysis service (Tutorial 02):
curl "http://YOUR_MONITORING_SERVER:8084/analyze?recording_url=http://YOUR_VICIDIAL_SERVER/RECORDINGS/path/to/file.wav"
# This returns NISQA MOS scores, silence detection, and AI-generated analysis
Step 6: Check Network Quality (Smokeping / RTCP)
Goal: Determine if the network path between your server and the carrier/agent has latency, packet loss, or jitter that explains audio problems.
Check Smokeping for trunk provider latency:
If you have Smokeping deployed (see Tutorial 01):
- Open http://YOUR_MONITORING_SERVER:8081
- Navigate to your SIP providers target group
- Look for:
  - Latency spikes that coincide with the reported call time
  - Packet loss (gaps in the graph) at the same time
  - Baseline comparison -- is today's latency higher than the 7-day average?
Check live RTP statistics:
# Show RTP channel statistics for all active calls
asterisk -rx 'sip show channelstats'
Output format:
Peer Call ID Duration Recv: Pack Lost ( %) Jitter Send: Pack Lost ( %) Jitter
203.0.113.50 2cb43240375 00:44:44 0000000134K 0000000008 ( 0.01%) 0.0019 0000000134K 0000000016 ( 0.01%) 0.0004
198.51.100.25 9af21c9f-99 00:01:58 0000005766 0000000000 ( 0.00%) 0.0000 0000005892 0000000001 ( 0.02%) 0.0000
Interpreting the output:
| Metric | Good | Warning | Critical |
|---|---|---|---|
| Packet Loss | < 0.5% | 0.5% - 2% | > 2% |
| Jitter | < 10ms | 10 - 30ms | > 30ms |
| Latency (from sip show peer) | < 80ms | 80 - 150ms | > 150ms |
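The thresholds in this table can be wrapped in a small helper so that scripts and alerts use the same cut-offs as the runbook. This is a sketch with hypothetical function names; it assumes the values you pass in are already in the table's units (percent for loss, milliseconds for jitter and latency), and it treats a value exactly on the upper bound as a warning.

```shell
# classify VALUE WARN CRIT -> good | warning | critical
classify() {
  awk -v v="$1" -v w="$2" -v c="$3" 'BEGIN {
    if (v > c) print "critical"
    else if (v >= w) print "warning"
    else print "good"
  }'
}

classify_loss()    { classify "$1" 0.5 2; }    # percent
classify_jitter()  { classify "$1" 10 30; }    # milliseconds
classify_latency() { classify "$1" 80 150; }   # milliseconds
```

For example, classify_loss 0.01 prints good and classify_jitter 45 prints critical.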
Quick network tests:
# Ping the trunk provider (check latency and loss)
ping -c 20 TRUNK_PROVIDER_IP
# Traceroute to identify where latency is introduced
traceroute -n TRUNK_PROVIDER_IP
# Check for asymmetric routing (can cause one-way audio)
mtr --report --report-cycles 50 TRUNK_PROVIDER_IP
# Verify RTP port range is open in firewall
iptables -L INPUT -n --line-numbers | grep -E "udp.*10000|udp.*20000"
# You should see an ACCEPT rule for UDP ports 10000:20000 (or your configured range)
# Check RTP configuration
asterisk -rx 'rtp show settings'
Check RTP port range matches firewall:
# Read the configured RTP port range
grep -E 'rtpstart|rtpend' /etc/asterisk/rtp.conf
# Example output:
# rtpstart=10000
# rtpend=20000
# Verify firewall allows that UDP range
iptables -S INPUT | grep -E 'udp.*dport.*(10000|20000)'
# Must see: -A INPUT -p udp --dport 10000:20000 -j ACCEPT
If there is a mismatch between the RTP port range and the firewall rule, media will be partially or completely blocked.
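The comparison can be scripted so that it does not depend on reading two outputs by eye. The sketch below is one possible approach, not a definitive check: rtp_range and firewall_covers are hypothetical helper names, the first reads an rtp.conf-style file, and the second expects rules in iptables -S format on stdin and only looks at UDP ACCEPT rules with a --dport match.

```shell
# Print "START END" from an rtp.conf-style file.
rtp_range() {
  awk -F= '
    { gsub(/[ \t\r]/, "", $1); sub(/;.*/, "", $2); gsub(/[ \t\r]/, "", $2) }
    $1 == "rtpstart" { s = $2 }
    $1 == "rtpend"   { e = $2 }
    END { print s, e }
  ' "$1"
}

# Succeed if any UDP ACCEPT rule on stdin has a --dport range
# that covers START..END (given as the two arguments).
firewall_covers() {
  awk -v s="$1" -v e="$2" '
    /-p udp/ && /-j ACCEPT/ {
      for (i = 1; i < NF; i++)
        if ($i == "--dport") {
          n = split($(i + 1), r, ":")
          lo = r[1]; hi = (n > 1 ? r[2] : r[1])
          if (lo + 0 <= s + 0 && hi + 0 >= e + 0) ok = 1
        }
    }
    END { exit !ok }
  '
}
```

Usage: iptables -S INPUT | firewall_covers $(rtp_range /etc/asterisk/rtp.conf) || echo "RTP range not fully allowed". Rules that allow the traffic by other means (for example a blanket ACCEPT for a source IP) are not detected, so a failure here is a prompt to look closer, not proof of a block.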
Step 7: Verify Agent State and SIP Registration
Goal: Check if the problem is agent-specific -- their softphone, network connection, or registration state.
Check agent SIP registration:
# Show detailed SIP peer information for a specific agent extension
asterisk -rx 'sip show peer AGENT_EXTENSION'
Key fields to check:
| Field | What to look for |
|---|---|
| Status | OK (XXms) -- the ms value is latency. Under 100ms is good; over 150ms will cause noticeable delay |
| Addr->IP | Agent's public IP address. If it is a private address (10.x, 192.168.x), registration is coming through a NAT without proper traversal |
| Useragent | Software version. Old softphones (eyeBeam 1.x, X-Lite 3.x) have known SIP bugs |
| Codecs | Must match what the trunk supports. If agent offers only G.729 and trunk requires G.711, audio will fail or require transcoding |
| Qualify | Must be enabled for Asterisk to detect when the agent goes offline |
| Nat | Should show force_rport+comedia for agents behind NAT |
| ACL | If configured, verify agent's IP is within the allowed range |
Check if agent is LAGGED:
# Show all SIP peers and their qualification status
asterisk -rx 'sip show peers' | grep -E 'LAGGED|UNREACHABLE|UNKNOWN'
LAGGED means the qualify response took longer than the configured threshold (default 2000ms). This usually means:
- The agent's internet connection is slow or congested
- The agent is connected through a VPN that adds latency
- There is a routing issue between the server and the agent
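When many peers are LAGGED at once, it helps to list them with their measured qualify time so you can tell a single slow agent from a server-wide problem. A minimal sketch, assuming the status column of sip show peers has the form "LAGGED (N ms)" and that the peer name is the first column; lagged_peers is an illustrative name.

```shell
# Print "<peer> <qualify ms>" for every LAGGED peer in
# `sip show peers` output read from stdin.
lagged_peers() {
  awk '
    match($0, /LAGGED \([0-9]+ ms\)/) {
      # skip the 8 characters of "LAGGED (" and drop the trailing " ms)"
      ms = substr($0, RSTART + 8, RLENGTH - 12)
      print $1, ms
    }
  '
}
```

Usage: asterisk -rx 'sip show peers' | lagged_peers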
-- Check LAGGED agents in ViciDial's live agent table
SELECT user, server_ip, status, callerid, last_update_time,
TIMESTAMPDIFF(SECOND, last_update_time, NOW()) AS seconds_since_update
FROM vicidial_live_agents
WHERE status = 'LAGGED'
OR TIMESTAMPDIFF(SECOND, last_update_time, NOW()) > 30;
An agent whose last_update_time is more than 30 seconds old is effectively frozen -- they will not receive calls even if they appear logged in.
Check agent live status in ViciDial:
-- All agents currently logged in with their state
SELECT la.user, vu.full_name, la.status, la.callerid,
la.campaign_id, la.phone_login, la.server_ip,
TIMESTAMPDIFF(SECOND, la.last_update_time, NOW()) AS idle_seconds
FROM vicidial_live_agents la
JOIN vicidial_users vu ON la.user = vu.user
ORDER BY la.status, idle_seconds DESC;
| Status | Meaning |
|---|---|
| READY | Waiting for a call |
| INCALL | On an active call |
| PAUSED | Manually paused |
| CLOSER | Waiting for inbound calls |
| QUEUE | Call ringing to agent |
| LAGGED | SIP registration delayed |
| DEAD | Connection lost -- agent will not receive calls |
4. Common Issues and Solutions
4.1 One-Way Audio
Symptom: One party can hear the other, but not vice versa. Or both parties hear silence.
Root cause: RTP (audio) packets are flowing in one direction only. Almost always a NAT/firewall issue.
Diagnosis:
# 1. Check the agent's SIP peer for NAT settings
asterisk -rx 'sip show peer AGENT_EXT' | grep -E 'Nat|Addr|Status'
# 2. Check if strict RTP is interfering
grep 'Strict RTP' /var/log/asterisk/messages | tail -10
# 3. Check RTP settings
grep strictrtp /etc/asterisk/rtp.conf
Fix:
; In sip.conf or sip-vicidial.conf, for the agent's peer definition:
[agent_100]
nat=force_rport,comedia ; Force symmetric RTP and rport
qualify=yes ; Keep NAT pinholes open
qualifyfreq=30 ; Qualify every 30 seconds
directmedia=no ; Force media through Asterisk (never direct)
For rtp.conf, if strict RTP is causing issues:
; Try seqno mode instead of full strict RTP
strictrtp=seqno
If the carrier's SDP has a private IP in the media address:
; Add to the trunk peer definition
nat=force_rport,comedia
This tells Asterisk to ignore the SDP media IP and use the IP from which it actually receives RTP packets.
4.2 LAGGED Agents
Symptom: Agent shows as "LAGGED" in sip show peers. They appear logged in but do not receive calls.
Diagnosis:
# Check qualify time for the agent
asterisk -rx 'sip show peer AGENT_EXT' | grep -E 'Status|Qualify|Addr'
# Check all LAGGED peers
asterisk -rx 'sip show peers' | grep LAGGED
Common causes and fixes:
| Cause | Fix |
|---|---|
| Agent's internet is slow | Ask agent to run a speed test, check for downloads/streaming |
| Agent is on VPN | VPN adds latency; try split tunneling or direct connection |
| Qualify threshold too aggressive | Increase qualifyfreq and qualify timeout |
| DNS resolution delay | Use IP addresses instead of hostnames in SIP config |
| Server overloaded | Check top / uptime -- high load delays qualify responses |
; Increase qualify tolerance for remote agents:
[agent_100]
qualify=5000 ; Allow up to 5 seconds (default is 2000ms)
qualifyfreq=60 ; Check every 60 seconds instead of 30
4.3 Call Drops Mid-Conversation
Symptom: Calls are answered and both parties can hear each other, then the call suddenly disconnects during the conversation.
Diagnosis:
-- Find mid-call disconnects (abnormal dispositions)
SELECT call_date, phone_number, length_in_sec, status, term_reason, uniqueid
FROM vicidial_closer_log
WHERE status IN ('DISMX', 'DCMX')
AND call_date > NOW() - INTERVAL 24 HOUR
ORDER BY call_date DESC;
# Check if there is a pattern in the call duration
# SIP session timers expire at fixed intervals (commonly 1800s = 30 min)
# If calls consistently drop at the same duration, it is a timer issue
# Check Asterisk log for the specific call
grep 'UNIQUEID' /var/log/asterisk/messages | grep -i 'hangup\|destroy\|bye'
Common causes:
| Pattern | Likely cause | Fix |
|---|---|---|
| Drops at exactly 30 min (1800s) | SIP session timer mismatch | Set session-timers=refuse on the trunk or session-expires=7200 |
| Drops at random times | NAT mapping timeout (typically 60-300s for UDP) | Enable SIP keepalives: qualify=yes, qualifyfreq=25 |
| Drops correlate with network events | Route change, ISP failover | Check Smokeping for latency spikes at call time |
| Drops only on specific trunk | Carrier issue | Contact carrier with Call-IDs and timestamps |
| Multiple agents affected simultaneously | Server issue | Check uptime, disk space, Asterisk core dumps |
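To test the "same duration every time" pattern, count how many of the abnormal disconnects ended close to a suspected timer value. This sketch assumes you have exported the length_in_sec column from the query above as one value per line; drops_near is a hypothetical helper, and the default tolerance of 5 seconds is an arbitrary choice.

```shell
# drops_near TARGET [TOLERANCE]
# Read call durations in seconds (one per line) and print how many
# ended within TOLERANCE seconds of TARGET.
drops_near() {
  awk -v t="$1" -v tol="${2:-5}" '
    $1 ~ /^[0-9]+$/ { d = $1 - t; if (d < 0) d = -d; if (d <= tol) n++ }
    END { print n + 0 }
  '
}
```

If most of the disconnects land near 1800, suspect the session timer; if they are spread out, move on to the NAT and network rows in the table.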
4.4 Trunk UNREACHABLE
Symptom: sip show peers shows the trunk as UNREACHABLE. All outbound calls through that trunk fail.
Diagnosis:
# 1. Check trunk status
asterisk -rx 'sip show peers' | grep TRUNK_NAME
# 2. Ping the trunk provider
ping -c 5 TRUNK_PROVIDER_IP
# 3. Check if the IP is whitelisted in the firewall
iptables -S INPUT | grep TRUNK_PROVIDER_IP
# 4. Check if SIP port 5060 is reachable (UDP has no handshake, so nc can only detect an ICMP rejection; "open" is not conclusive)
nc -zvu TRUNK_PROVIDER_IP 5060
# 5. Check DNS resolution (if trunk uses hostname)
dig TRUNK_HOSTNAME
# 6. Manually send a SIP OPTIONS to test connectivity
sipvicious_svmap TRUNK_PROVIDER_IP # or use sipsak if installed
sipsak -vv -s sip:TRUNK_PROVIDER_IP:5060
Fix by cause:
| Cause | Fix |
|---|---|
| Provider is down | Contact provider; route traffic to backup trunk |
| IP address changed | Update sip-vicidial.conf with new IP; reload SIP |
| Firewall blocking | Add iptables -I INPUT -s TRUNK_IP -j ACCEPT; save rules |
| DNS resolution failure | Use IP instead of hostname; fix /etc/resolv.conf |
| SIP credentials expired | Update secret= in trunk config; reload SIP |
| Provider requires re-registration | Add register => line to sip.conf; reload |
# After fixing, reload SIP configuration (does not drop active calls):
asterisk -rx 'sip reload'
# Verify the trunk comes back:
asterisk -rx 'sip show peer TRUNK_NAME' | grep Status
4.5 Codec Mismatches
Symptom: Calls connect (SIP 200 OK) but there is no audio. Or audio is distorted with clicking/popping sounds.
Diagnosis:
# Check what codecs the trunk supports vs what it is offered
asterisk -rx 'sip show peer TRUNK_NAME' | grep -A5 Codecs
# Check active call codec negotiation
asterisk -rx 'core show channels verbose' | grep -E 'Codec|Format'
# Look for transcoding (uses CPU, degrades quality)
asterisk -rx 'core show translation'
Fix:
; Ensure trunk and agent peers agree on codecs
; In the trunk peer definition:
[trunk_provider]
disallow=all
allow=alaw ; G.711 A-law (European standard)
allow=ulaw ; G.711 Mu-law (North American standard)
allow=g729 ; G.729 (bandwidth-efficient, requires license)
; In the agent peer definition -- match the trunk:
[agent_100]
disallow=all
allow=alaw
allow=ulaw
Best practices for codec ordering:
- Use the same codec on both trunk and agent sides to avoid transcoding
- G.711 (alaw/ulaw) provides the best quality but uses 87 kbps per direction
- G.729 uses only 31 kbps but requires a license and sounds slightly worse
- Never mix G.722 (wideband) with G.711 (narrowband) without understanding the transcoding cost
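The bandwidth figures above can be reproduced from the payload size. The sketch below assumes 20 ms packetization, 40 bytes of IPv4/UDP/RTP headers and 18 bytes of Ethernet framing per packet; those overhead values are assumptions chosen because they match the 87 kbps and 31 kbps figures quoted here, and codec_kbps is an illustrative name.

```shell
# codec_kbps PAYLOAD_BYTES PTIME_MS
# Per-direction bandwidth in kbps, including 40 bytes of IP/UDP/RTP
# headers and 18 bytes of Ethernet framing per packet.
codec_kbps() {
  awk -v p="$1" -v t="$2" 'BEGIN {
    printf "%.1f\n", (p + 40 + 18) * 8 * (1000 / t) / 1000
  }'
}
```

G.711 carries 160 payload bytes per 20 ms packet, so codec_kbps 160 20 prints 87.2. G.729 carries 20 bytes, so codec_kbps 20 20 prints 31.2. Multiply by the number of concurrent calls and by two directions when sizing a link.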
4.6 NAT and Firewall Issues
Symptom: Agents behind home routers or corporate firewalls experience one-way audio, registration drops, or LAGGED status.
Comprehensive NAT checklist:
# 1. Verify the agent is behind NAT (their registered IP will be public, not private)
asterisk -rx 'sip show peer AGENT_EXT' | grep 'Addr->IP'
# 2. Check if the SIP ALG is interfering (common on consumer routers)
# The agent needs to disable SIP ALG on their router
# Signs of SIP ALG: mangled Contact headers, wrong port in Via
# 3. Verify NAT settings in the peer config
grep -A 20 '\[AGENT_EXT\]' /etc/asterisk/sip-vicidial.conf | grep nat
# 4. Check if RTP ports are open
iptables -L INPUT -n | grep -E "udp.*(10000|20000)"
Required peer settings for NAT'd agents:
[agent_100]
nat=force_rport,comedia
qualify=yes
qualifyfreq=25
directmedia=no
; directmedia=no forces ALL media through Asterisk.
; Without this, Asterisk may try to send RTP directly between
; the trunk and agent -- which fails if the agent is behind NAT.
Required global settings:
; In [general] section of sip.conf:
localnet=10.0.0.0/8 ; Define your local networks
localnet=172.16.0.0/12
localnet=192.168.0.0/16
externaddr=YOUR_PUBLIC_IP ; Your server's public IP
Agent-side fixes:
- Disable SIP ALG on the router (Application Layer Gateway -- it mangles SIP headers)
- If SIP on port 5060 is unreliable, try a non-standard SIP port (some ISPs block or interfere with 5060)
- Enable STUN in the softphone if available
- Use TCP for SIP signaling if UDP is unreliable (some firewalls drop UDP after timeout)
4.7 High Jitter and Choppy Audio
Symptom: Audio cuts in and out, sounds robotic, or has an "underwater" quality.
Diagnosis:
# Check live RTP stats for all active calls
asterisk -rx 'sip show channelstats'
# Check for IAX2 jitter buffer issues (if using IAX trunks or ConfBridge with IAX2)
grep -c 'Resyncing the jb' /var/log/asterisk/messages
# Check today's jitter buffer resyncs
# (assumes the syslog-style log timestamp, e.g. "Jan  5"; %e pads single-digit days with a space)
grep "$(date '+%b %e')" /var/log/asterisk/messages | grep -c 'Resyncing the jb'
Jitter buffer resync thresholds:
| Count (per day) | Severity | Action |
|---|---|---|
| < 10 | Normal | No action needed |
| 10 - 100 | Warning | Monitor; check agent connections |
| 100 - 1000 | Problem | Identify affected channels; check clock sync |
| > 1000 | Critical | Likely clock desync in ConfBridge/MeetMe IAX2 loopback |
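For cron jobs or monitoring checks, the table can be turned into a small function that takes the daily count from the grep above. This is a sketch with an illustrative name; the table leaves the boundaries ambiguous, so it follows the "more than 100" and "more than 1000" wording from Step 3 and treats exactly 100 as a warning and exactly 1000 as a problem.

```shell
# resync_severity COUNT -> normal | warning | problem | critical
resync_severity() {
  if   [ "$1" -gt 1000 ]; then echo "critical"
  elif [ "$1" -gt 100 ];  then echo "problem"
  elif [ "$1" -ge 10 ];   then echo "warning"
  else                         echo "normal"
  fi
}
```

Usage: resync_severity "$(grep -c 'Resyncing the jb' /var/log/asterisk/messages)"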
Fix for IAX2 jitter buffer issues:
; In iax.conf, disable the jitter buffer for the local loopback:
[general]
jitterbuffer=no
forcejitterbuffer=no
Fix for agent-side jitter:
The jitter is happening on the agent's internet connection. You cannot fix their ISP, but you can mitigate:
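; Enable the Asterisk-side jitter buffer.
; In chan_sip these options are global: set them in the [general]
; section of sip.conf, not in an individual peer definition.
[general]
jbenable=yes ; The jitter buffer stays off unless this is set
jbimpl=adaptive
jbmaxsize=200 ; Maximum jitter buffer in milliseconds
jbresyncthreshold=1000
jblog=no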
4.8 DISMX / DCMX Dispositions
Symptom: Calls show disposition DISMX (inbound) or DCMX (outbound) -- indicating an abnormal disconnect by the system rather than by either party.
What these mean:
| Code | Full name | Meaning |
|---|---|---|
| DISMX | Disconnect - Manager eXternal | Inbound call disconnected abnormally (not by caller or agent) |
| DCMX | Disconnect - Campaign Manager eXternal | Outbound call disconnected abnormally |
Common causes:
- Agent's SIP registration dropped during the call
- Asterisk crashed or restarted
- Network interruption between server and agent/trunk
- Disk full -- Asterisk could not write to logs/recordings
- SIP session timer expired
- NAT mapping timed out (UDP keepalive was not frequent enough)
Investigation:
-- Check if DISMX/DCMX correlates with specific agents
SELECT user, COUNT(*) AS abnormal_count
FROM vicidial_closer_log
WHERE status IN ('DISMX', 'DCMX')
AND call_date > NOW() - INTERVAL 7 DAY
GROUP BY user
ORDER BY abnormal_count DESC
LIMIT 10;
-- If one agent has significantly more than others, it is their connection
-- Check if DISMX/DCMX correlates with time of day
SELECT HOUR(call_date) AS hour, COUNT(*) AS count
FROM vicidial_closer_log
WHERE status IN ('DISMX', 'DCMX')
AND call_date > NOW() - INTERVAL 7 DAY
GROUP BY hour
ORDER BY hour;
-- Spikes at specific hours suggest network congestion patterns
4.9 Conference Bridge Issues
Symptom: Agents join a conference bridge but hear echo, garbled audio, or no audio at all. MeetMe/ConfBridge conferences show zombie channels.
Diagnosis:
# Check active conferences
asterisk -rx 'confbridge list' # For ConfBridge
asterisk -rx 'meetme list' # For MeetMe
# Check for zombie conferences (conferences with 0 or 1 participant that persist)
asterisk -rx 'confbridge list' | awk '$2 <= 1 {print}'
# Check DAHDI timing (required for MeetMe)
asterisk -rx 'dahdi show status'
# Should show timer device with accuracy close to 100%
# Check for stuck channels in conferences
asterisk -rx 'core show channels' | grep -i conf
Fix zombie conferences:
# Kick all users from a specific conference
asterisk -rx 'confbridge kick CONF_NUMBER all'
# Or for MeetMe:
asterisk -rx 'meetme kick CONF_NUMBER all'
Fix DAHDI timing (MeetMe):
# Check if DAHDI timer module is loaded
lsmod | grep dahdi
# If not loaded:
modprobe dahdi
dahdi_genconf
dahdi_cfg -vv
# Verify timing accuracy
cat /proc/dahdi/timer
4.10 Recording Failures
Symptom: Calls are not being recorded, or recordings are 0 bytes, or recordings contain only silence.
Diagnosis:
# Check disk space (most common cause)
df -h /var/spool/asterisk/monitorDONE/
# Treat anything above 90% as urgent: once the disk fills up, recordings fail silently
# Check recording directory permissions
ls -la /var/spool/asterisk/monitor/
# Owner should be asterisk:asterisk with write permission
# Check for 0-byte recording files today
find /var/spool/asterisk/monitorDONE/$(date +%Y%m%d)/ -size 0 -name "*.wav" | wc -l
# Check Asterisk recording settings
asterisk -rx 'core show settings' | grep -i record
# Verify MixMonitor/Monitor is running on active calls
asterisk -rx 'core show channels verbose' | grep -i mix
Common fixes:
| Cause | Fix |
|---|---|
| Disk full | Clean old recordings: find /var/spool/asterisk/monitorDONE/ -mtime +90 -name "*.wav" -delete |
| Wrong permissions | chown -R asterisk:asterisk /var/spool/asterisk/monitor/ |
| Missing sox/lame | Install: yum install sox or zypper install sox |
| Recording format wrong | Check mixmon_format in ViciDial system settings |
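The 0-byte check from the diagnosis step is easy to turn into a reusable function for a cron job or a monitoring probe. This is a sketch under assumptions: `count_empty_recordings` is a hypothetical name, and the directory you pass in depends on how your recordings are laid out.

```bash
#!/bin/bash
# count_empty_recordings: print the number of 0-byte .wav files under DIR.
# Prints 0 and returns non-zero if DIR does not exist.
count_empty_recordings() {
    local dir="$1"
    [ -d "$dir" ] || { echo 0; return 1; }
    find "$dir" -type f -name '*.wav' -size 0 | wc -l | awk '{ print $1 }'
}

# Usage:
#   count_empty_recordings /var/spool/asterisk/monitorDONE
```

A count that keeps growing during the day usually points at one of the causes in the table above rather than at individual bad calls.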
5. Decision Trees
5.1 "Caller Reports No Audio" Flowchart
CALLER REPORTS NO AUDIO
│
├── Is it ONE-WAY or BOTH-WAY silence?
│ │
│ ├── ONE-WAY (caller hears agent but agent can't hear caller, or vice versa)
│ │ │
│ │ ├── Check: Is agent behind NAT?
│ │ │ ├── YES → Verify nat=force_rport,comedia and directmedia=no
│ │ │ │ Also check: Is SIP ALG disabled on agent's router?
│ │ │ └── NO → Check firewall: Are RTP ports (10000-20000 UDP) open?
│ │ │
│ │ ├── Check: Does carrier SDP show private IP in c= line?
│ │ │ ├── YES → Add nat=force_rport,comedia to trunk peer config
│ │ │ └── NO → Check codec negotiation in SDP INVITE vs 200 OK
│ │ │
│ │ └── Check: Is strictrtp enabled?
│ │ ├── YES → Try strictrtp=seqno or strictrtp=no temporarily
│ │ └── NO → Escalate: capture RTP with tcpdump, analyze packet flow
│ │
│ └── BOTH-WAY (complete silence for both parties)
│ │
│ ├── Check: Are there ANY RTP packets flowing?
│ │ ├── NO → Firewall is blocking ALL RTP. Check iptables UDP rules.
│ │ └── YES → Codec mismatch. Check 'core show translation' for errors.
│ │
│ ├── Check: Is the media IP reachable?
│ │ ├── NO → Carrier or agent has wrong externaddr/externip
│ │ └── YES → Check if directmedia=yes is causing media bypass issues
│ │
│ └── Check: Did the call actually connect? (200 OK + ACK in Homer)
│ ├── NO → SIP signaling issue, not media issue
│ └── YES → Media path is broken. Capture and compare SDP from both legs.
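The "private IP in c= line" check in the flowchart can be scripted once you have an SDP body saved from Homer or a packet capture. The sketch below assumes IPv4 SDP and treats only the RFC 1918 ranges as private; both function names are hypothetical.

```bash
#!/bin/bash
# sdp_media_ip: print the address from the first 'c=IN IP4 <addr>' line on stdin.
sdp_media_ip() {
    awk '/^c=IN IP4 / { sub(/\r$/, "", $3); print $3; exit }'
}

# is_private_ip: succeed if the IPv4 address is in an RFC 1918 range
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
is_private_ip() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   ip="$(sdp_media_ip < invite_sdp.txt)"
#   is_private_ip "$ip" && echo "SDP advertises private address $ip"
```

A private address in SDP received from a carrier or a remote agent means the far end is behind NAT, which is the case the nat=force_rport,comedia branches above address.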
5.2 "Calls Keep Dropping" Flowchart
CALLS KEEP DROPPING
│
├── Check: Are ALL calls dropping or just some?
│ │
│ ├── ALL CALLS (every call on the system)
│ │ │
│ │ ├── Check: Is Asterisk running?
│ │ │ ├── NO → Restart: systemctl start asterisk
│ │ │ │ Check /var/log/asterisk/ for crash dumps
│ │ │ └── YES → Check server load: uptime, top, df -h
│ │ │
│ │ ├── Check: Is disk full?
│ │ │ ├── YES → Emergency cleanup of old recordings/logs
│ │ │ └── NO → Check: Are ALL trunks UNREACHABLE?
│ │ │
│ │ └── Check: Is there a network outage?
│ │ ├── YES → Contact datacenter/ISP
│ │ └── NO → Check Asterisk error log for repeated errors
│ │
│ ├── CALLS ON ONE TRUNK ONLY
│ │ │
│ │ ├── Check: sip show peers | grep TRUNK_NAME
│ │ │ ├── UNREACHABLE → Trunk is down (see Section 4.4)
│ │ │ └── OK → Trunk is up but failing
│ │ │
│ │ ├── Check: carrier_log for sip_hangup_cause patterns
│ │ │ ├── 503 consistently → Carrier is overloaded
│ │ │ ├── 403 consistently → Authentication failure
│ │ │ └── Various codes → Check each code in Section 6
│ │ │
│ │ └── Contact the carrier with Call-IDs and timestamps
│ │
│ └── CALLS FOR ONE AGENT ONLY
│ │
│ ├── Check: sip show peer AGENT_EXT
│ │ ├── LAGGED → Agent's internet is slow (Section 4.2)
│ │ ├── UNREACHABLE → Agent disconnected
│ │ └── OK (high ms) → Marginal connection, drops under load
│ │
│ ├── Check: Does the agent's call always drop at the same duration?
│ │ ├── YES (e.g., always at 30 min) → SIP session timer issue
│ │ └── NO (random times) → NAT mapping timeout
│ │
│ └── Ask agent: Are they on WiFi, VPN, or shared internet?
│ Any of these can cause intermittent drops.
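To answer the "always at the same duration?" question without eyeballing logs, you can tally call lengths for the affected agent. The sketch below assumes you feed it one length in seconds per line, for example from the `length_in_sec` column; `most_common_duration` is a hypothetical helper.

```bash
#!/bin/bash
# most_common_duration: read one call length (seconds) per line on stdin and
# print "<seconds> <count>" for the most frequent value.
# Non-numeric lines (such as a column header) are ignored.
most_common_duration() {
    grep -E '^[0-9]+$' \
        | sort -n | uniq -c \
        | sort -k1,1nr -k2,2n \
        | awk 'NR == 1 { print $2, $1 }'
}

# Usage (replace USER, DATABASE, and AGENT with your own values):
#   mysql -N -B -u USER -p DATABASE \
#     -e "SELECT length_in_sec FROM vicidial_closer_log
#         WHERE user = 'AGENT' AND call_date >= CURDATE()" \
#     | most_common_duration
```

It matches exact values only. If drops vary by a second or two, round the lengths in the SQL query before piping them in.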
5.3 "Trunk Down" Flowchart
TRUNK IS UNREACHABLE
│
├── Step 1: Can you ping the trunk IP?
│ │
│ ├── NO (100% loss)
│ │ │
│ │ ├── Check: Has the provider's IP changed?
│ │ │ ├── dig PROVIDER_HOSTNAME → Compare with config
│ │ │ └── If changed → Update sip-vicidial.conf, reload SIP
│ │ │
│ │ ├── Check: Is the IP blocked by your firewall?
│ │ │ ├── iptables -S INPUT | grep PROVIDER_IP
│ │ │ └── If missing → Add: iptables -I INPUT -s IP -j ACCEPT
│ │ │
│ │ └── Provider may be down → Check their status page
│ │ └── Route traffic to backup trunk while waiting
│ │
│ └── YES (ping works)
│ │
│ ├── Check: Can you reach SIP port 5060?
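│ │ ├── nc -zvu PROVIDER_IP 5060
│ │ │ └── Note: UDP is connectionless, so nc reports "open" unless an ICMP
│ │ │     port-unreachable comes back. Treat "open" as inconclusive.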
│ │ ├── If CLOSED → Provider firewall is blocking you
│ │ │ └── Verify your server's IP is whitelisted with provider
│ │ └── If OPEN → SIP is reachable but qualify fails
│ │
│ ├── Check: Is the SIP registration failing?
│ │ ├── asterisk -rx 'sip show registry'
│ │ ├── If "Rejected" → Wrong username/password
│ │ ├── If "Timeout" → Network issue or wrong port
│ │ └── If "No Registry" → This trunk does not use registration
│ │
│ └── Check: Is there a TLS/SRTP mismatch?
│ └── Try connecting without encryption to test
│
├── Step 2: Do you have a backup trunk?
│ │
│ ├── YES → Reroute traffic via dialplan or ViciDial carrier settings
│ └── NO → This is a single point of failure. Plan for redundancy.
│
└── Step 3: Document and notify
├── Record the time, trunk name, and symptoms
├── Open a ticket with the carrier
└── Set up monitoring to alert on trunk state changes (see Tutorial 01)
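Step 1 of the flowchart hinges on the packet-loss figure from ping. If you want to act on that number in a script, a small parser is enough. This is a sketch that assumes the usual "N% packet loss" summary line printed by common ping implementations; `ping_loss_pct` is a hypothetical name.

```bash
#!/bin/bash
# ping_loss_pct: read ping output on stdin and print the packet-loss
# percentage from the summary line (for example 0, 20, or 100).
ping_loss_pct() {
    sed -n 's/.*[ ,]\([0-9][0-9.]*\)% packet loss.*/\1/p'
}

# Usage:
#   loss="$(ping -c 10 TRUNK_IP | ping_loss_pct)"
#   [ "$loss" = "100" ] && echo "Trunk IP is not answering ping"
```

Keep in mind that some providers drop ICMP entirely, so 100% loss on its own does not prove the trunk is down. Continue with the SIP port and registration checks either way.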
6. SIP Response Code Reference
1xx -- Provisional (Information)
| Code | Name | What It Means |
|---|---|---|
| 100 | Trying | The request has been received and is being processed. Next-hop server is working on it. |
| 180 | Ringing | The destination phone is ringing. The caller should hear ring-back tone. |
| 181 | Call is Being Forwarded | The call is being redirected to another destination. |
| 182 | Queued | The call has been queued because the destination is temporarily unavailable. |
| 183 | Session Progress | Early media is available (e.g., the carrier is playing an in-band ringback or announcement). |
| 199 | Early Dialog Terminated | A provisional dialog was terminated before the final response. |
2xx -- Success
| Code | Name | What It Means |
|---|---|---|
| 200 | OK | The request was successful. For INVITE, this means the call was answered. For REGISTER, registration accepted. |
| 202 | Accepted | The request has been accepted for processing (used for REFER/MESSAGE). |
| 204 | No Notification | The request was successful but no notification body is included. |
3xx -- Redirection
| Code | Name | What It Means |
|---|---|---|
| 300 | Multiple Choices | The destination can be reached at multiple addresses; the caller should choose. |
| 301 | Moved Permanently | The user is no longer at this address. Update your records. |
| 302 | Moved Temporarily | The user is temporarily at a different address. Try the Contact header. |
| 305 | Use Proxy | The request must be routed through the specified proxy. |
| 380 | Alternative Service | The call failed but an alternative service (e.g., voicemail) is available. |
4xx -- Client Error (Your Side)
| Code | Name | What It Means | Common Cause |
|---|---|---|---|
| 400 | Bad Request | Malformed SIP message. | Broken SIP ALG, buggy softphone, or truncated packet. |
| 401 | Unauthorized | Authentication required (used by registrars). | Missing or wrong secret= in peer config. |
| 403 | Forbidden | The server understood the request but refuses to fulfill it. | IP not whitelisted, credentials revoked, or calling number blocked. |
| 404 | Not Found | The destination user/number does not exist. | Wrong number, DID not configured on trunk, or typo in dialplan. |
| 405 | Method Not Allowed | The SIP method (e.g., INFO, MESSAGE) is not supported. | Trying to use a method the proxy does not implement. |
| 406 | Not Acceptable | The response content is not acceptable (based on Accept header). | Codec or content-type negotiation failure. |
| 407 | Proxy Authentication Required | Authentication required by the proxy. | Similar to 401, but from an intermediate proxy. |
| 408 | Request Timeout | The server could not respond in time. | Network latency, overloaded server, or DNS timeout. |
| 410 | Gone | The user existed but is no longer available at this URI. | Deactivated account or ported number. |
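| 412 | Conditional Request Failed | The SIP-If-Match entity-tag in a PUBLISH request did not match the server's stored state. | Presence or event-state publication refreshed with an expired or unknown entity-tag. |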
| 413 | Request Entity Too Large | The SIP message body is too large. | Oversized SDP with too many codec lines. |
| 415 | Unsupported Media Type | The server does not support the content type. | Wrong SDP format or non-SDP body. |
| 416 | Unsupported URI Scheme | The URI scheme (e.g., tel:) is not supported. | Use sip: instead of tel: for the destination. |
| 420 | Bad Extension | The server does not support a required SIP extension. | Remove unsupported Require: headers. |
| 421 | Extension Required | The server needs a specific extension that the client did not provide. | Add the required extension. |
| 422 | Session Interval Too Small | The Session-Expires value is too small. | Increase session-timers interval in sip.conf. |
| 423 | Interval Too Brief | The registration expiry time is too short. | Increase the Expires value in REGISTER. |
| 424 | Bad Location Information | The location information in the request is malformed. | E911/location service configuration issue. |
| 428 | Use Identity Header | The server requires an Identity header for authentication. | STIR/SHAKEN configuration required. |
| 429 | Provide Referrer Identity | A Referred-By header is needed for REFER. | Add Referred-By when doing attended transfers. |
| 433 | Anonymity Disallowed | The server does not accept anonymous calls. | Remove Privacy header or present valid caller ID. |
| 436 | Bad Identity-Info | The Identity-Info header URI is invalid. | STIR/SHAKEN certificate issue. |
| 437 | Unsupported Certificate | The certificate used for Identity validation is not supported. | Update STIR/SHAKEN certificates. |
| 438 | Invalid Identity Header | The Identity header is present but invalid. | STIR/SHAKEN signing issue. |
| 439 | First Hop Lacks Outbound Support | The first proxy does not support the outbound extension. | Proxy configuration issue. |
| 440 | Max-Breadth Exceeded | The server cannot fork the request to more destinations. | Too many simultaneous ring targets. |
| 469 | Bad Info Package | The Info-Package header references an unknown package. | Remove unsupported INFO packages. |
| 470 | Consent Needed | The server requires consent for this operation. | User consent/privacy configuration. |
| 480 | Temporarily Unavailable | The user is registered but currently not answering. | Agent busy, DND enabled, or phone in power-save mode. |
| 481 | Call/Transaction Does Not Exist | The server received a BYE or CANCEL for a nonexistent call. | Race condition, or the call was already terminated. |
| 482 | Loop Detected | The server detected a routing loop. | Misconfigured proxy or dialplan creating circular route. |
| 483 | Too Many Hops | The Max-Forwards counter reached zero. | Routing loop or excessively deep proxy chain. |
| 484 | Address Incomplete | The destination address is too short or incomplete. | Missing digits in dialed number. |
| 485 | Ambiguous | The destination address matches multiple users. | Ambiguous number routing configuration. |
| 486 | Busy Here | The destination is busy. | Agent is on another call. |
| 487 | Request Terminated | The INVITE was cancelled by a CANCEL request. | Caller hung up before the call was answered. Normal. |
| 488 | Not Acceptable Here | No common codec or media capability could be negotiated. | Codec mismatch between trunk and destination. Fix allow= settings. |
| 489 | Bad Event | The Event header references an unknown event package. | Subscription event type not supported. |
| 491 | Request Pending | The server has a pending request and cannot process another. | Re-INVITE collision. Usually resolves automatically. |
| 493 | Undecipherable | The server cannot decrypt S/MIME body. | Encryption key mismatch. |
| 494 | Security Agreement Required | A security mechanism negotiation is needed. | TLS/IPSEC configuration required. |
5xx -- Server Error (Their Side)
| Code | Name | What It Means | Common Cause |
|---|---|---|---|
| 500 | Server Internal Error | The server encountered an unexpected condition. | Carrier software crash or overload. |
| 501 | Not Implemented | The server does not support the requested functionality. | SIP method not implemented by the carrier. |
| 502 | Bad Gateway | The server received an invalid response from a downstream server. | Carrier's upstream route is broken. |
| 503 | Service Unavailable | The server is temporarily unable to handle the request. | Carrier overloaded, maintenance, or all circuits busy. |
| 504 | Server Time-out | The server did not receive a response from a downstream server. | Carrier's upstream provider is not responding. |
| 505 | Version Not Supported | The SIP version in the request is not supported. | Version mismatch (rare -- almost everything is SIP/2.0). |
| 513 | Message Too Large | The SIP message exceeds the server's maximum size limit. | Oversized request body. Reduce codec offerings. |
| 555 | Push Notification Service Not Supported | The push notification service is not available. | Mobile push notification issue. |
| 580 | Precondition Failure | A required precondition (QoS, security) was not met. | QoS reservation failure. |
6xx -- Global Failure
| Code | Name | What It Means | Common Cause |
|---|---|---|---|
| 600 | Busy Everywhere | The user is busy at all known locations. | All of the user's devices are in use. |
| 603 | Decline | The user explicitly declined the call. | Call rejection or DND. |
| 604 | Does Not Exist Anywhere | The destination number does not exist on any server. | Invalid or decommissioned number. |
| 606 | Not Acceptable | The user's capabilities do not match the request. | No compatible codecs or media types. |
| 607 | Unwanted | The call has been identified as unwanted (spam). | STIR/SHAKEN attestation or spam filter. |
| 608 | Rejected | The call was rejected by a policy or intermediary. | Carrier-level call blocking. |
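When scanning carrier logs, the first digit of the response code is often all you need for initial triage. The sketch below maps a code to the class names used in the tables above; `sip_class` is a hypothetical helper, not part of Asterisk or ViciDial.

```bash
#!/bin/bash
# sip_class: print the response class for a three-digit SIP status code.
# Returns non-zero for anything that is not a valid 1xx-6xx code.
sip_class() {
    case "$1" in
        1[0-9][0-9]) echo "provisional" ;;
        2[0-9][0-9]) echo "success" ;;
        3[0-9][0-9]) echo "redirection" ;;
        4[0-9][0-9]) echo "client error" ;;
        5[0-9][0-9]) echo "server error" ;;
        6[0-9][0-9]) echo "global failure" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Usage:
#   sip_class 503
```

As a rule of thumb, a run of "client error" results points at your own configuration, and a run of "server error" results points at the carrier.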
7. Asterisk Hangup Cause Reference
These are Q.931/PRI cause codes used by Asterisk internally. They appear in the h extension, the HANGUPCAUSE channel variable, and the hangup_cause field in vicidial_carrier_log.
Normal Operation (1-31)
| Cause | Name | What It Means | Action |
|---|---|---|---|
| 1 | Unallocated Number | The dialed number is not assigned to any route. | Check the number is valid; verify trunk routing. |
| 2 | No Route to Network | The carrier cannot find a route to the destination network. | Carrier routing issue. Try another trunk. |
| 3 | No Route to Destination | The carrier cannot route to the specific destination. | Number may be invalid for that carrier or region. |
| 4 | Send Special Information Tone | An operator intercept recording should be played. | Usually means the number is disconnected. |
| 5 | Misdialed Trunk Prefix | The trunk prefix (e.g., international dialing code) is wrong. | Fix the dial pattern in the outbound route. |
| 6 | Channel Unacceptable | The requested channel cannot be used. | Try a different channel or trunk. |
| 7 | Call Awarded and Being Delivered | The call is being connected (used in interworking). | Informational; no action needed. |
| 8 | Preemption | A higher-priority call preempted this one. | Rare; usually only in military/government networks. |
| 9 | Preemption - Circuit Reserved | The circuit was reserved for a higher-priority call. | Same as above. |
| 16 | Normal Clearing | The call was hung up normally by one of the parties. | This is the expected hangup cause for normal calls. |
| 17 | User Busy | The destination phone is busy. | Normal -- the person is on another call. |
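| 18 | No User Responding | The destination did not respond to the call setup at all (no ringing indication before the timer expired). | Endpoint may be offline or unreachable. Check registration and the network path. |
| 19 | No Answer from User | The destination was alerted (it rang) but nobody answered before the ring timeout. | Normal -- ring timeout expired. Check ring timeout. |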
| 20 | Subscriber Absent | The destination user is not registered/reachable. | Mobile is off, or SIP user not registered. |
| 21 | Call Rejected | The destination explicitly rejected the call. | The callee pressed reject, or a call screening rule blocked it. |
| 22 | Number Changed | The number has been changed to a new number. | Update your records with the new number. |
| 23 | Redirection to New Destination | The call is being redirected. | The call is being forwarded. |
| 25 | Exchange Routing Error | A routing error occurred in the exchange. | Carrier-side routing misconfiguration. |
| 26 | Non-Selected User Clearing | The called party was not selected (hunt group scenario). | Normal for hunt groups and ring groups. |
| 27 | Destination Out of Order | The destination is unreachable due to a fault. | Could be a dead phone, severed line, or crashed PBX. |
| 28 | Invalid Number Format | The number format is invalid (too short, wrong prefix). | Fix the dial pattern. Check country code and number length. |
| 29 | Facility Rejected | A requested facility (e.g., call transfer) was rejected. | The network does not support the requested feature. |
| 30 | Response to STATUS ENQUIRY | Informational response to a status check. | Informational; no action needed. |
| 31 | Normal, Unspecified | Normal call clearing with no specific reason. | Usually benign -- similar to cause 16. |
Resource Issues (34-47)
| Cause | Name | What It Means | Action |
|---|---|---|---|
| 34 | No Circuit/Channel Available | All circuits to the destination are busy. | Trunk capacity exhausted. Wait or add more channels. |
| 38 | Network Out of Order | The network is experiencing a failure. | Major carrier issue. Switch to backup trunk. |
| 41 | Temporary Failure | A temporary failure occurred in the network. | Retry the call. If persistent, contact carrier. |
| 42 | Switching Equipment Congestion | The switching equipment is overloaded. | Carrier is overloaded. Reduce call volume or use alt trunk. |
| 43 | Access Information Discarded | Required information was lost during transit. | Carrier interworking issue. |
| 44 | Requested Circuit Not Available | The specific requested circuit is not available. | Similar to 34. Use any available circuit. |
| 46 | Precedence Call Blocked | A higher-precedence call blocked this one. | Government/military networks only. |
| 47 | Resource Unavailable, Unspecified | Resources are not available (no specific detail). | General capacity issue. Retry or use alt trunk. |
Service Issues (49-69)
| Cause | Name | What It Means | Action |
|---|---|---|---|
| 49 | Quality of Service Unavailable | The requested QoS cannot be provided. | Network cannot guarantee quality. Try without QoS. |
| 50 | Requested Facility Not Subscribed | The requested feature is not part of your subscription. | Enable the feature with your carrier. |
| 55 | Incoming Calls Barred within CUG | Incoming calls are blocked for this group. | Call restriction setting on the destination. |
| 57 | Bearer Capability Not Authorized | You are not authorized for this type of call (e.g., data). | Check your service subscription with the carrier. |
| 58 | Bearer Capability Not Available | The requested bearer capability is not available. | The circuit type does not support this call type. |
| 63 | Service/Option Not Available | The service is not available (unspecified reason). | Contact carrier for details. |
| 65 | Bearer Capability Not Implemented | The requested bearer type is not implemented. | Use a different call type or codec. |
| 66 | Channel Type Not Implemented | The requested channel type is not supported. | Use a different channel type. |
| 69 | Requested Facility Not Implemented | The requested network feature is not implemented. | Feature not available on this network. |
Invalid Messages (79-100)
| Cause | Name | What It Means | Action |
|---|---|---|---|
| 79 | Service/Option Not Implemented | The service option is valid but not implemented. | Feature request to carrier. |
| 81 | Invalid Call Reference | The call reference value is not valid. | Protocol error -- usually a software bug. |
| 82 | Identified Channel Does Not Exist | The referenced channel does not exist. | Configuration mismatch or timing issue. |
| 83 | A Suspended Call Exists | There is already a suspended call on this reference. | Resume the existing call first. |
| 84 | Call Identity In Use | The call identity is already in use. | Race condition in call setup. |
| 85 | No Call Suspended | There is no call to resume. | The call was already terminated. |
| 86 | Call Has Been Cleared | The referenced call has already been cleared. | Normal in some race conditions. |
| 87 | User Not Member of CUG | The user is not part of the Closed User Group. | Add the user to the group. |
| 88 | Incompatible Destination | The destination is incompatible with this call type. | Codec or service mismatch. |
| 95 | Invalid Message, Unspecified | An invalid message was received (no specific detail). | Protocol error. Check SIP message format. |
| 96 | Mandatory IE Missing | A required information element is missing from the message. | Broken SIP implementation. Update software. |
| 97 | Message Type Non-Existent | The message type is not recognized. | Protocol version mismatch. |
| 98 | Message Type Incompatible | The message type is incompatible with the call state. | State machine error. Usually a bug. |
| 99 | IE Non-Existent or Not Implemented | An information element is not recognized or implemented. | Usually non-fatal. May cause feature limitation. |
| 100 | Invalid IE Contents | An information element has invalid content. | Corrupted or malformed message. |
Protocol Errors (101-127)
| Cause | Name | What It Means | Action |
|---|---|---|---|
| 101 | Message Not Compatible with Call State | The message was received at the wrong time. | State machine error. Usually resolves on retry. |
| 102 | Recovery on Timer Expiry | A protocol timer expired and recovery was attempted. | Network congestion or slow processing. |
| 103 | Parameter Non-Existent | A parameter does not exist (passed through). | Interworking issue between networks. |
| 111 | Protocol Error, Unspecified | A protocol error occurred (no specific detail). | Check logs for more detail. May need software update. |
| 127 | Interworking, Unspecified | An error occurred at the boundary between networks. | Carrier gateway issue. Contact carrier. |
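The five groups above can be applied mechanically when summarizing `hangup_cause` values. This is a minimal sketch that uses exactly the ranges from this section's headings; `cause_group` is a hypothetical name, and codes that fall between the listed ranges are reported as "unlisted".

```bash
#!/bin/bash
# cause_group: print which group of this reference a hangup cause belongs to.
# Returns non-zero if the argument is not a non-negative integer.
cause_group() {
    local c="$1"
    case "$c" in
        ''|*[!0-9]*) echo "invalid"; return 1 ;;
    esac
    if   [ "$c" -ge 1 ]   && [ "$c" -le 31 ];  then echo "normal"
    elif [ "$c" -ge 34 ]  && [ "$c" -le 47 ];  then echo "resource"
    elif [ "$c" -ge 49 ]  && [ "$c" -le 69 ];  then echo "service"
    elif [ "$c" -ge 79 ]  && [ "$c" -le 100 ]; then echo "invalid message"
    elif [ "$c" -ge 101 ] && [ "$c" -le 127 ]; then echo "protocol error"
    else echo "unlisted"
    fi
}

# Usage:
#   cause_group 34
```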
8. ViciDial Status and Term Reason Reference
Call Disposition Statuses
| Status | Full Name | Description |
|---|---|---|
| A | Answered | Call was answered by an agent and properly dispositioned |
| DROP | Drop | Call was abandoned by the caller while waiting in queue |
| XDROP | Extended Drop | Call was dropped by the system after exceeding the max wait time |
| NANQUE | No Agent in Queue | Call arrived but no agents were logged in or available for the inbound group |
| AFTHRS | After Hours | Call arrived outside the configured business hours |
| DISMX | Disconnect Manager External | Inbound call disconnected abnormally (not by caller or agent) |
| DCMX | Disconnect Campaign Manager External | Outbound call disconnected abnormally |
| INCALL | In Call | Call is currently active (should not persist after call ends) |
| QUEUE | Queue | Call is currently waiting in queue (should not persist after call ends) |
| DISPO | Disposition | Agent is in the disposition screen for this call |
Term Reason Values
| Term Reason | Meaning | Diagnostic Value |
|---|---|---|
| CALLER | The caller hung up | Normal -- customer ended the call |
| AGENT | The agent hung up | Normal -- agent ended the call after completing the interaction |
| NONE | No termination reason recorded | System event -- could indicate a crash, timeout, or unclean disconnect |
| ABANDON | Caller abandoned the queue | Caller hung up while waiting for an agent. Check queue times. |
| NOAGENT | No agent available | No agents were logged in or in READY state for this inbound group |
| AFTERHOURS | After hours routing triggered | Call came in outside business hours. Check after-hours config. |
9. Tool Commands Quick Reference
Asterisk CLI Commands
# === SIP STATUS ===
asterisk -rx 'sip show peers' # All SIP peers with status
asterisk -rx 'sip show peers' | grep LAGGED # Find LAGGED agents
asterisk -rx 'sip show peers' | grep UNREACHABLE # Find dead trunks/agents
asterisk -rx 'sip show peer EXTENSION' # Detailed info for one peer
asterisk -rx 'sip show registry' # SIP trunk registrations
asterisk -rx 'sip show channelstats' # Live RTP stats (loss, jitter)
# === CALLS & CHANNELS ===
asterisk -rx 'core show channels' # Active channels summary
asterisk -rx 'core show channels concise' # Machine-readable channel list
asterisk -rx 'core show channels verbose' # Detailed channel info
asterisk -rx 'core show channel SIP/peer-id' # Single channel deep dive
# === CONFERENCES ===
asterisk -rx 'confbridge list' # ConfBridge conferences
asterisk -rx 'meetme list' # MeetMe conferences
asterisk -rx 'confbridge kick CONF all' # Kick all from a conference
# === CODEC & MEDIA ===
asterisk -rx 'core show translation' # Codec translation paths
asterisk -rx 'rtp show settings' # RTP configuration
asterisk -rx 'core show codecs' # Available codecs
# === DIAGNOSTICS ===
asterisk -rx 'core show uptime' # Asterisk uptime
asterisk -rx 'core show version' # Asterisk version
asterisk -rx 'core show settings' # Global settings
asterisk -rx 'core show sysinfo' # System resource info
# === DEBUGGING ===
asterisk -rx 'sip set debug on' # Enable SIP debug (VERBOSE)
asterisk -rx 'sip set debug off' # Disable SIP debug
asterisk -rx 'rtp set debug on' # Enable RTP debug (VERY VERBOSE)
asterisk -rx 'rtp set debug off' # Disable RTP debug
# === RELOAD (safe, does not drop calls) ===
asterisk -rx 'sip reload' # Reload SIP config
asterisk -rx 'dialplan reload' # Reload dialplan
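Running `sip show peers` once and counting all three states from the same output keeps the numbers consistent with each other. The sketch below assumes the Status column contains the strings `OK (`, `LAGGED`, and `UNREACHABLE` as in the grep commands above; `peer_summary` is a hypothetical helper.

```bash
#!/bin/bash
# peer_summary: read 'sip show peers' output on stdin and print a one-line
# count of peers by qualify state.
peer_summary() {
    awk '
        /OK \(/       { ok++ }
        /LAGGED/      { lagged++ }
        /UNREACHABLE/ { unreachable++ }
        END { printf "OK=%d LAGGED=%d UNREACHABLE=%d\n", ok, lagged, unreachable }'
}

# Usage:
#   asterisk -rx 'sip show peers' | peer_summary
```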
MySQL/MariaDB Diagnostic Queries
-- Active calls right now
SELECT COUNT(*) AS active_calls FROM vicidial_auto_calls;
-- Agents currently logged in
SELECT user, status, campaign_id, phone_login,
TIMESTAMPDIFF(SECOND, last_update_time, NOW()) AS idle_sec
FROM vicidial_live_agents
ORDER BY status, idle_sec DESC;
-- Call volume by hour (today)
SELECT HOUR(call_date) AS hr, COUNT(*) AS calls
FROM vicidial_closer_log
WHERE call_date >= CURDATE()
GROUP BY hr ORDER BY hr;
-- Trunk failure rate (last hour)
SELECT SUBSTRING_INDEX(channel, '-', 1) AS trunk,
COUNT(*) AS total,
SUM(dialstatus != 'ANSWER') AS failed,
ROUND(SUM(dialstatus != 'ANSWER') / COUNT(*) * 100, 1) AS fail_pct
FROM vicidial_carrier_log
WHERE call_date > NOW() - INTERVAL 1 HOUR
GROUP BY trunk ORDER BY fail_pct DESC;
-- Average queue time by inbound group (today)
SELECT campaign_id, COUNT(*) AS calls,
ROUND(AVG(queue_seconds), 0) AS avg_queue_sec,
MAX(queue_seconds) AS max_queue_sec
FROM vicidial_closer_log
WHERE call_date >= CURDATE()
GROUP BY campaign_id ORDER BY avg_queue_sec DESC;
-- Abnormal disconnect trending (by day, last 7 days)
SELECT DATE(call_date) AS day,
SUM(status = 'DISMX') AS dismx,
SUM(status = 'DCMX') AS dcmx,
COUNT(*) AS total_calls,
ROUND((SUM(status IN ('DISMX','DCMX')) / COUNT(*)) * 100, 2) AS abnormal_pct
FROM vicidial_closer_log
WHERE call_date > NOW() - INTERVAL 7 DAY
GROUP BY day ORDER BY day;
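The trunk failure-rate query becomes an alert once you compare its output against a threshold. This sketch assumes tab-separated input with three columns (trunk, total, failed), which is what `mysql -N -B` prints for the first three columns of that query; `flag_failing_trunks` and the 50% default are illustrative choices.

```bash
#!/bin/bash
# flag_failing_trunks: read "trunk<TAB>total<TAB>failed" lines on stdin and
# print "trunk pct" for every trunk whose failure rate is at or above the
# threshold (percent, default 50).
flag_failing_trunks() {
    local threshold="${1:-50}"
    awk -F'\t' -v t="$threshold" '
        $2 > 0 {
            pct = $3 / $2 * 100
            if (pct >= t) printf "%s %.1f\n", $1, pct
        }'
}

# Usage (query abbreviated; select trunk, total, failed as shown above):
#   mysql -N -B -u USER -p DATABASE -e "SELECT ..." | flag_failing_trunks 40
```

Pick the threshold from your own baseline. Outbound dialing campaigns normally have a high share of non-ANSWER results, so a fixed value will not suit every trunk.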
Network and System Commands
# === NETWORK ===
ping -c 10 TRUNK_IP # Basic latency test
traceroute -n TRUNK_IP # Route path
mtr --report -c 50 TRUNK_IP # Continuous route + loss report
nc -zvu TRUNK_IP 5060 # Test SIP port reachability (UDP: "open" is inconclusive)
tcpdump -i eth0 -n port 5060 -c 100 # Capture 100 SIP packets
ngrep -W byline -d eth0 port 5060 # Real-time SIP message dump
# === SYSTEM HEALTH ===
uptime # Load average
df -h # Disk space
free -m # Memory usage
top -bn1 | head -20 # CPU/process overview
iostat -x 1 5 # Disk I/O stats
# === LOG ANALYSIS ===
# Count Asterisk errors today
# (the default logger dateformat pads single-digit days with a space, hence %e, not %d)
grep "$(date +'%b %e')" /var/log/asterisk/messages | grep -ci error
# Find all WARNING and ERROR lines in the last 100 lines
tail -100 /var/log/asterisk/messages | grep -iE 'WARNING|ERROR'
# Count jitter buffer resyncs (IAX2 audio issues)
grep -c 'Resyncing the jb' /var/log/asterisk/messages
# Find strict RTP source switches (NAT issues)
grep -c 'Strict RTP switching' /var/log/asterisk/messages
# === FIREWALL ===
iptables -L INPUT -n --line-numbers # Show all INPUT rules with numbers
iptables -S INPUT | grep TRUNK_IP # Check if trunk IP is whitelisted
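Counting log lines per day is sensitive to the timestamp format. With Asterisk's default logger `dateformat`, single-digit days are padded with a space (`Jan  5`), so the date stamp has to be built with `%e`. The sketch below assumes that default format; if `logger.conf` sets a custom `dateformat`, adjust `today_stamp` to match. Both function names are hypothetical.

```bash
#!/bin/bash
# today_stamp: today's date as written by the default Asterisk logger format.
today_stamp() {
    date +'%b %e'
}

# count_log_lines FILE DAYSTAMP PATTERN
# Print how many lines logged on DAYSTAMP match PATTERN
# (case-insensitive extended regex).
count_log_lines() {
    local file="$1" day="$2" pattern="$3"
    # -F matches the "[Mon DD " prefix literally, including the padding space.
    grep -F "[$day " "$file" 2>/dev/null | grep -ciE "$pattern"
}

# Usage:
#   count_log_lines /var/log/asterisk/messages "$(today_stamp)" 'WARNING|ERROR'
```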
10. Building Your Own Diagnostic CLI Tool
Typing these commands repeatedly is tedious and error-prone. Wrap your most-used diagnostics into a single shell script:
#!/bin/bash
# voip-diag -- VoIP Diagnostic Tool
# Usage: voip-diag <command>
case "${1:-help}" in
# Show SIP peer status overview
sip)
# Capture the peer list once so the listing and the counts come from the same snapshot
peers="$(asterisk -rx 'sip show peers')"
echo "=== SIP Peers ==="
echo "$peers" | head -50
echo ""
echo "LAGGED: $(echo "$peers" | grep -c LAGGED)"
echo "UNREACHABLE: $(echo "$peers" | grep -c UNREACHABLE)"
echo "OK: $(echo "$peers" | grep -c 'OK (')"
;;
# Show active calls
calls)
echo "=== Active Calls ==="
asterisk -rx 'core show channels'
;;
# Show logged-in agents
agents)
mysql -u USER -pPASSWORD DATABASE -e "
SELECT user, status, campaign_id,
TIMESTAMPDIFF(SECOND, last_update_time, NOW()) AS idle_sec
FROM vicidial_live_agents
ORDER BY status, idle_sec DESC;" 2>/dev/null
;;
# Show RTP channel statistics
rtp)
echo "=== RTP Channel Stats ==="
asterisk -rx 'sip show channelstats'
;;
# Check a specific trunk
trunk)
if [ -z "$2" ]; then
echo "Usage: voip-diag trunk TRUNK_NAME"
exit 1
fi
echo "=== Trunk: $2 ==="
asterisk -rx "sip show peer $2" | grep -E 'Status|Addr|Codecs|Qualify'
echo ""
echo "=== Recent Carrier Log ==="
mysql -u USER -pPASSWORD DATABASE -e "
SELECT call_date, dialstatus, hangup_cause, sip_hangup_cause
FROM vicidial_carrier_log
WHERE channel LIKE '%$2%'
ORDER BY call_date DESC LIMIT 10;" 2>/dev/null
;;
# Check a specific agent
agent)
if [ -z "$2" ]; then
echo "Usage: voip-diag agent EXTENSION"
exit 1
fi
echo "=== Agent: $2 ==="
asterisk -rx "sip show peer $2" | grep -E 'Status|Addr|Useragent|Codecs|Nat|Qualify'
echo ""
echo "=== ViciDial Status ==="
mysql -u USER -pPASSWORD DATABASE -e "
SELECT user, status, campaign_id, phone_login,
TIMESTAMPDIFF(SECOND, last_update_time, NOW()) AS idle_sec
FROM vicidial_live_agents
WHERE phone_login = '$2';" 2>/dev/null
;;
# Investigate a specific call by phone number
call)
if [ -z "$2" ]; then
echo "Usage: voip-diag call PHONE_NUMBER"
exit 1
fi
echo "=== Inbound Calls for $2 ==="
mysql -u USER -pPASSWORD DATABASE -e "
SELECT call_date, length_in_sec, status, term_reason, uniqueid, user
FROM vicidial_closer_log
WHERE phone_number LIKE '%$2%'
ORDER BY call_date DESC LIMIT 10;" 2>/dev/null
echo ""
echo "=== Outbound Calls for $2 ==="
mysql -u USER -pPASSWORD DATABASE -e "
SELECT call_date, length_in_sec, status, term_reason, uniqueid, user
FROM vicidial_log
WHERE phone_number LIKE '%$2%'
ORDER BY call_date DESC LIMIT 10;" 2>/dev/null
;;
# Show system health
health)
echo "=== System Health ==="
echo "Uptime: $(uptime)"
echo "Disk: $(df -h / | tail -1)"
echo "Memory: $(free -m | grep Mem | awk '{printf "%dMB used / %dMB total (%.0f%%)\n", $3, $2, $3/$2*100}')"
echo ""
echo "=== Asterisk ==="
asterisk -rx 'core show uptime'
echo "Active channels: $(asterisk -rx 'core show channels concise' | wc -l)"
echo ""
echo "=== Errors Today ==="
today="$(date +'%b %e')" # default Asterisk logger dateformat pads single-digit days with a space
echo "Asterisk errors: $(grep "$today" /var/log/asterisk/messages 2>/dev/null | grep -ci error)"
echo "JB resyncs: $(grep "$today" /var/log/asterisk/messages 2>/dev/null | grep -c 'Resyncing the jb')"
echo "RTP switches: $(grep "$today" /var/log/asterisk/messages 2>/dev/null | grep -c 'Strict RTP')"
;;
help|*)
cat <<'EOF'
VoIP Diagnostic Tool
====================
Usage: voip-diag <command> [args]
Commands:
sip Show SIP peers status summary
calls Show active calls
agents Show logged-in agents
rtp Show live RTP statistics
trunk NAME Check a specific trunk
agent EXT Check a specific agent extension
call NUMBER Look up a phone number in call logs
health System health overview
help Show this help
Examples:
voip-diag sip
voip-diag trunk my_provider
voip-diag agent 100
voip-diag call 441234567890
EOF
;;
esac
Install it:
# Save the script
sudo cp voip-diag /usr/local/bin/voip-diag
sudo chmod +x /usr/local/bin/voip-diag
# Replace the USER, PASSWORD, and DATABASE placeholders in every mysql
# command in the script with your ViciDial MySQL credentials
# (or store the credentials in ~/.my.cnf and drop the -u/-p options)
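The `trunk`, `agent`, and `call` subcommands paste their argument straight into an SQL string, so a stray quote or `%` changes the query. A small allow-list check before the `mysql` call is a cheap safeguard. This is a sketch, and `is_safe_arg` is a hypothetical helper; widen the allowed character set only if your trunk names need it.

```bash
#!/bin/bash
# is_safe_arg: succeed only if the argument is non-empty and consists solely
# of letters, digits, underscore, dot, plus, or hyphen.
is_safe_arg() {
    case "$1" in
        ''|*[!A-Za-z0-9_.+-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Usage inside a subcommand, before any mysql call:
#   is_safe_arg "$2" || { echo "Invalid argument: $2" >&2; exit 1; }
```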
11. Appendix: Configuration File Locations
Asterisk Configuration Files
| File | Purpose | When to check |
|---|---|---|
| /etc/asterisk/sip.conf | Global SIP settings | NAT, codecs, qualify defaults |
| /etc/asterisk/sip-vicidial.conf | SIP peers and trunks (ViciDial-managed) | Agent/trunk registration, NAT settings |
| /etc/asterisk/extensions.conf | Main dialplan | Call routing logic |
| /etc/asterisk/extensions-vicidial.conf | ViciDial dialplan additions | Carrier routing, outbound call flow |
| /etc/asterisk/rtp.conf | RTP port range, strict RTP, DTLS | One-way audio, media issues |
| /etc/asterisk/iax.conf | IAX2 peer config and jitter buffer | IAX trunk issues, jitter buffer resyncs |
| /etc/asterisk/logger.conf | Log file configuration | If logs are missing or too verbose |
| /etc/asterisk/modules.conf | Module loading | If a feature is not available |
ViciDial Key Files
| File | Purpose |
|---|---|
| /etc/astguiclient.conf | ViciDial server settings (server_ip, DB credentials) |
| /srv/www/htdocs/agc/ or /var/www/html/agc/ | Agent interface web files |
| /srv/www/htdocs/vicidial/ or /var/www/html/vicidial/ | Admin interface web files |
| /var/spool/asterisk/monitor/ | Active recording directory |
| /var/spool/asterisk/monitorDONE/ | Completed recordings (organized by date) |
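Because /etc/astguiclient.conf already holds the database credentials, one option is to read them from there instead of hardcoding them in the diagnostic script. The sketch below assumes the file uses `KEY => value` lines and that the keys are named VARDB_user, VARDB_pass, and VARDB_database; check both against your own file before relying on it. The helper name `conf_value` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: read one "KEY => value" setting from an astguiclient.conf-style file.
# The default path and the key names used below are assumptions; verify them.

conf_value() {
    # $1 = key name, $2 = config file (defaults to /etc/astguiclient.conf)
    local key="$1" file="${2:-/etc/astguiclient.conf}"
    # Print the text after "KEY =>", first match only, trailing whitespace removed.
    sed -n "s/^${key}[[:space:]]*=>[[:space:]]*//p" "$file" \
        | head -n 1 \
        | sed 's/[[:space:]]*$//'
}

# Example usage (commented out so sourcing this file has no side effects):
# DB_USER=$(conf_value VARDB_user)
# DB_PASS=$(conf_value VARDB_pass)
# DB_NAME=$(conf_value VARDB_database)
# MYSQL_PWD="$DB_PASS" mysql -u "$DB_USER" "$DB_NAME" -e "SELECT 1;"
```

Passing the password through the MYSQL_PWD environment variable keeps it out of the process argument list, although a client option file with restricted permissions may be preferable on shared servers.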
Log Files
| File | Contains | Rotation |
|---|---|---|
| /var/log/asterisk/messages | Asterisk notices, warnings, errors | Check logrotate config |
| /var/log/asterisk/queue_log | Queue events (join, leave, abandon) | Grows indefinitely if not rotated |
| /var/log/astguiclient/*.log | ViciDial process logs | Daily rotation by ViciDial |
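Since queue_log can grow without bound when it is not rotated, a periodic size check is a cheap safeguard. The sketch below lists files in a log directory that exceed a size threshold. The function name `large_logs` and the 100 MiB default are illustrative, and the `-maxdepth` option and `M` size suffix assume GNU find.

```bash
#!/usr/bin/env bash
# Sketch: list log files larger than a threshold (assumes GNU find).

large_logs() {
    # $1 = directory to scan, $2 = size threshold in MiB (default 100)
    local dir="${1:-/var/log/asterisk}" mb="${2:-100}"
    # -size +NM matches files whose size, rounded up to whole MiB, exceeds N.
    find "$dir" -maxdepth 1 -type f -size +"${mb}"M -print
}

# Example usage:
# large_logs /var/log/asterisk 500
```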
Key Database Tables
| Table | Purpose | Key fields |
|---|---|---|
| vicidial_closer_log | Inbound call records | phone_number, status, term_reason, uniqueid, user |
| vicidial_log | Outbound call records | phone_number, status, term_reason, uniqueid, user |
| vicidial_carrier_log | Carrier-level call details | dialstatus, hangup_cause, sip_hangup_cause, channel |
| vicidial_agent_log | Agent state changes | user, event, event_time, pause_sec, wait_sec, talk_sec |
| vicidial_live_agents | Current agent state (real-time) | user, status, last_update_time, campaign_id |
| vicidial_auto_calls | Currently active calls (real-time) | phone_number, status, campaign_id |
| recording_log | Recording file locations | filename, location, start_time, lead_id |
| vicidial_inbound_dids | DID routing configuration | did_pattern, did_route, group_id |
| vicidial_inbound_groups | Inbound group settings | group_id, group_name, no_agent_action |
| system_settings | Global ViciDial configuration | Various server-wide settings |
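These tables are most useful when correlated. As a sketch, the function below builds a query that pairs each outbound call record with its carrier-level result. It assumes vicidial_carrier_log also carries a uniqueid column that matches vicidial_log.uniqueid, which you should confirm with DESCRIBE on your schema. The function name `carrier_lookup_sql` is hypothetical. The phone number is restricted to digits before being embedded in the SQL text.

```bash
#!/usr/bin/env bash
# Sketch: build a query joining outbound call records to carrier results.
# Assumes vicidial_carrier_log.uniqueid matches vicidial_log.uniqueid.

carrier_lookup_sql() {
    local number="$1"
    # Reject anything that is empty or contains a non-digit character.
    case "$number" in
        ''|*[!0-9]*)
            echo "error: phone number must contain digits only" >&2
            return 1
            ;;
    esac
    cat <<SQL
SELECT l.call_date, l.status, l.term_reason,
       c.dialstatus, c.hangup_cause, c.sip_hangup_cause, c.channel
FROM vicidial_log l
LEFT JOIN vicidial_carrier_log c ON c.uniqueid = l.uniqueid
WHERE l.phone_number LIKE '%${number}%'
ORDER BY l.call_date DESC
LIMIT 10;
SQL
}

# Example usage (USER and DATABASE are placeholders, as in the script above):
# carrier_lookup_sql 441234567890 | mysql -u USER -p DATABASE
```

The LEFT JOIN keeps calls that have no carrier row, so a missing carrier record shows up as NULL columns rather than silently dropping the call.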
Final Notes
When to escalate
Not every problem can be solved from the command line. Escalate when:
- The carrier confirms their side is fine and you have eliminated all server-side causes -- it may be a routing issue in transit
- Multiple unrelated agents report the same symptom simultaneously -- this suggests a server or network issue, not an agent issue
- Asterisk produces core dumps -- file a bug report with the crash details
- The problem is intermittent and defies all diagnostics -- deploy continuous monitoring (Tutorial 01) to catch it when it happens
Building institutional knowledge
Every incident you diagnose is a learning opportunity. Maintain a simple incident log:
Date: 2026-03-13
Symptom: Agent reports one-way audio on all calls
Root cause: SIP ALG enabled on agent's new router
Fix: Disabled SIP ALG, added nat=force_rport,comedia to peer config
Time to resolve: 45 minutes
After 50 entries, you will have a searchable knowledge base of symptoms, root causes, and fixes, so a recurring problem can be matched to a known fix instead of being diagnosed from scratch.
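The incident log format above can be kept in a plain text file. The following is a minimal sketch of two helpers, one that appends an entry in that format and one that searches entries by keyword. The function names and the default file location are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: append and search incident entries in a flat text file.
# The default log path is an assumption; override INCIDENT_LOG as needed.

INCIDENT_LOG="${INCIDENT_LOG:-$HOME/voip-incidents.log}"

log_incident() {
    # Args: symptom, root cause, fix, time to resolve
    if [ "$#" -ne 4 ]; then
        echo "usage: log_incident SYMPTOM ROOT_CAUSE FIX TIME_TO_RESOLVE" >&2
        return 1
    fi
    {
        printf 'Date: %s\n' "$(date +%F)"
        printf 'Symptom: %s\n' "$1"
        printf 'Root cause: %s\n' "$2"
        printf 'Fix: %s\n' "$3"
        printf 'Time to resolve: %s\n' "$4"
        printf '\n'   # blank line separates entries
    } >> "$INCIDENT_LOG"
}

find_incident() {
    # Print every whole entry that matches the pattern, case-insensitively.
    # RS= puts awk in paragraph mode, so each blank-line-separated entry is one record.
    awk -v RS= -v ORS='\n\n' -v pat="$1" 'tolower($0) ~ tolower(pat)' "$INCIDENT_LOG"
}

# Example usage:
# log_incident "One-way audio on all calls" "SIP ALG enabled on router" \
#     "Disabled SIP ALG" "45 minutes"
# find_incident "sip alg"
```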
Recommended monitoring stack
To catch problems before users report them, deploy:
- Prometheus + Grafana for metrics (trunk status, agent counts, call volume)
- Loki for centralized log aggregation (search all server logs from one UI)
- Homer for SIP capture (full ladder diagrams for any call)
- Smokeping for continuous latency monitoring to all SIP providers
- Custom Asterisk exporter for Asterisk-specific metrics
See Tutorial 01 for the complete deployment guide.
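As a rough illustration of what a custom Asterisk exporter can look like, the sketch below converts the output of `core show channels count` into Prometheus text-format metrics. The metric names and the output path are hypothetical, and the approach assumes node_exporter's textfile collector is pointed at that directory. It is not the exporter described in Tutorial 01.

```bash
#!/usr/bin/env bash
# Sketch: turn "core show channels count" output into Prometheus metrics.
# Metric names are illustrative, not an established convention.

channels_to_metrics() {
    # Expects lines on stdin such as:
    #   5 active channels
    #   3 active calls
    #   120 calls processed
    awk '
        /active channel/   { print "asterisk_active_channels " $1 }
        /active call/      { print "asterisk_active_calls " $1 }
        /calls? processed/ { print "asterisk_calls_processed_total " $1 }
    '
}

# Example usage from cron (the directory is an assumption):
# asterisk -rx 'core show channels count' | channels_to_metrics \
#     > /var/lib/node_exporter/textfile/asterisk.prom.tmp \
#   && mv /var/lib/node_exporter/textfile/asterisk.prom.tmp \
#         /var/lib/node_exporter/textfile/asterisk.prom
```

Writing to a temporary file and renaming it avoids the collector reading a half-written metrics file.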
This runbook is based on procedures used to troubleshoot a production ViciDial call center fleet handling thousands of daily calls across multiple servers and seven SIP providers. Every command was tested in production. Every decision tree was refined through real incidents.