Tutorial 43: Kamailio + FreeSWITCH — Load Balancing & High Availability
Build a carrier-grade VoIP platform by combining Kamailio as a SIP proxy/load balancer with a pool of FreeSWITCH media servers. This advanced tutorial covers dispatcher-based load balancing, RTPEngine for media relay, WebRTC gateway integration, database-driven routing, geographic failover, and full high availability with no single point of failure. This is the architecture pattern used by large VoIP providers handling millions of calls — and by the end of this tutorial, you will have a production-ready platform capable of 10,000+ concurrent calls with zero-downtime upgrades and geographic redundancy.
Difficulty: Advanced Reading time: ~80 minutes Prerequisites: Tutorial 41 — FreeSWITCH Fundamentals, Tutorial 42 — Kamailio Fundamentals Technologies: Kamailio, FreeSWITCH, RTPEngine, Keepalived, MariaDB Galera, WebRTC, TLS, DNS SRV, Prometheus, Homer OS: Debian 12 (Bookworm) for all servers
Table of Contents
- Introduction — Why Combine Kamailio + FreeSWITCH
- Architecture Overview
- Prerequisites & Server Planning
- Kamailio SBC Configuration
- Dispatcher — Load Balancing FreeSWITCH
- RTPEngine — Media Relay
- FreeSWITCH Media Server Configuration
- Database-Driven Routing
- WebRTC Gateway
- High Availability — Kamailio
- High Availability — FreeSWITCH
- Geographic Distribution
- Monitoring & Operations
- Troubleshooting
1. Introduction
Why Combine Kamailio and FreeSWITCH?
Kamailio and FreeSWITCH are both excellent SIP platforms, but they excel at fundamentally different things:
| Capability | Kamailio | FreeSWITCH |
|---|---|---|
| SIP proxy/routing | Exceptional — 50,000+ TPS | Basic — not designed for proxying |
| Media handling | None — signaling only | Exceptional — codec, mixing, recording |
| Load balancing | Built-in (dispatcher) | Not applicable |
| NAT traversal | Signaling only (nathelper) | Full (media + signaling) |
| IVR / call queues | Not available | Full-featured |
| Conference bridges | Not available | Full-featured |
| Voicemail | Not available | Full-featured |
| Concurrent calls | 100,000+ (signaling) | 2,000-5,000 (with media) |
| Horizontal scaling | Stateless — easy | Stateful — complex |
| WebRTC | WSS termination | Media bridging |
The combination gives you separation of concerns:
- Kamailio handles everything at the signaling layer: SIP routing, load balancing, authentication, rate limiting, topology hiding, NAT fixing, and DDoS protection.
- FreeSWITCH handles everything at the media layer: IVR menus, call queues, conference bridges, voicemail, recording, codec transcoding, and call control logic.
- RTPEngine sits between them handling media relay: NAT traversal for RTP, SRTP/DTLS bridging for WebRTC, and codec transcoding when needed.
What This Architecture Gets You
- 10,000+ concurrent calls — scale FreeSWITCH horizontally (add more servers)
- Zero-downtime upgrades — drain a FreeSWITCH node, upgrade, re-add to the pool
- Geographic distribution — Kamailio clusters in multiple data centers
- No single point of failure — every component is redundant
- WebRTC support — Kamailio terminates WSS, RTPEngine bridges DTLS↔RTP
- DDoS resilience — Kamailio's pike module and rate limiting protect backend servers
- Topology hiding — external parties never see your internal FreeSWITCH IPs
Who Uses This Architecture?
Many large VoIP carriers and CPaaS providers run some variation of this stack:
- Twilio — reportedly Kamailio for SIP routing, with custom media servers
- Vonage/Nexmo — Kamailio + FreeSWITCH at scale
- Plivo — Kamailio + FreeSWITCH (publicly documented)
- Large call centers — 500+ agents typically need this architecture
- Wholesale VoIP carriers — millions of minutes per month
2. Architecture Overview
Production Architecture Diagram
┌──────────────────────────────────────┐
│ DNS / SRV Records │
│ sip.YOUR_DOMAIN → Kamailio VIP │
│ _sip._udp.YOUR_DOMAIN SRV │
└──────────────────┬───────────────────┘
│
┌────────────────┼────────────────┐
│ │ │
┌───────────▼──┐ ┌────────▼────┐ ┌──────▼───────┐
Layer 2 │ Kamailio-A │ │ Keepalived │ │ Kamailio-B │
SBC Pair │ (Active) │◄──►│ VIP Float │◄─►│ (Standby) │
│ SIP+WSS │ │ │ │ SIP+WSS │
└──────┬───────┘ └─────────────┘ └──────┬───────┘
│ │
│ SIP (signaling) │
┌─────────┴──────────────────────────────────────┴─────────┐
│ │
┌────────▼────────┐ ┌─────────────▼──┐
│ RTPEngine-1 │ Layer 3 │ RTPEngine-2 │
│ (Media Relay) │ Media Relay │ (Media Relay) │
└────────┬────────┘ └──────┬─────────┘
│ RTP (media) │
└─────────┬─────────────────────────┬───────────────┘
│ │
┌─────────▼──────┐ ┌─────────▼──────┐
│ FreeSWITCH-1 │ │ FreeSWITCH-2 │ ...N
Layer 4 │ (Media/App) │ │ (Media/App) │
│ IVR/Queue/Rec │ │ IVR/Queue/Rec │
└────────┬───────┘ └────────┬───────┘
│ │
└──────────┬───────────────┘
│
┌──────────▼──────────┐
Layer 5 │ MariaDB Galera │
Database │ (3-node cluster) │
│ Users/CDR/Config │
└──────────────────────┘
Component Responsibility Matrix
| Component | Layer | Primary Role | Handles |
|---|---|---|---|
| DNS/SRV | 1 | Geographic failover | Route clients to nearest DC |
| Kamailio (x2) | 2 | SIP proxy / SBC | Authentication, routing, load balancing, NAT fix, rate limiting, topology hiding, WebSocket termination |
| RTPEngine (x2) | 3 | Media relay | RTP proxying, NAT traversal for media, SRTP↔RTP bridging, DTLS termination, codec transcoding |
| FreeSWITCH (x2-N) | 4 | Media application | IVR, call queues, conferencing, voicemail, recording, call control, DTMF handling |
| MariaDB Galera (x3) | 5 | Shared state | User credentials, CDRs, routing rules, call state, configuration |
Traffic Flow — Inbound Call
1. External SIP INVITE → DNS resolves to Kamailio VIP
2. Kamailio: authenticate trunk, apply rate limits
3. Kamailio: nathelper fixes Contact/Via headers
4. Kamailio: rtpengine_offer() — RTPEngine rewrites SDP (external→internal)
5. Kamailio: dispatcher selects FreeSWITCH from pool
6. Kamailio: forward INVITE to FreeSWITCH (topology hidden)
7. FreeSWITCH: executes dialplan (IVR, queue, bridge to agent)
8. FreeSWITCH: 200 OK → Kamailio
9. Kamailio: rtpengine_answer() — RTPEngine rewrites SDP (internal→external)
10. Kamailio: 200 OK → external trunk
11. Media flows: External ↔ RTPEngine ↔ FreeSWITCH
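The key property of this flow is that SIP and RTP take different paths: every SIP message transits Kamailio, while media is relayed by RTPEngine. A minimal Python sketch of what steps 4 and 9 do to the SDP connection line (all IPs are illustrative; real RTPEngine also allocates relay ports and handles SRTP/DTLS, which this omits):

```python
# Sketch of the SDP rewriting in steps 4 (offer) and 9 (answer).
# Addresses are made up for illustration.
RTPENGINE_INTERNAL = "10.0.1.20"   # advertised toward FreeSWITCH
RTPENGINE_PUBLIC = "203.0.113.20"  # advertised toward the external trunk

def rewrite_sdp(sdp, toward):
    """Replace the c= line so the recipient sends its RTP to RTPEngine."""
    relay = RTPENGINE_INTERNAL if toward == "internal" else RTPENGINE_PUBLIC
    return "\n".join(
        f"c=IN IP4 {relay}" if line.startswith("c=") else line
        for line in sdp.splitlines()
    )

offer = "v=0\nc=IN IP4 198.51.100.7\nm=audio 40000 RTP/AVP 0"
internal_offer = rewrite_sdp(offer, "internal")
print(internal_offer)  # the c= line now points at the relay, not the trunk
```

Each endpoint therefore believes RTPEngine is its media peer, which is exactly what makes NAT traversal and topology hiding work.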
3. Prerequisites & Server Planning
Minimum Lab Setup (5 Servers)
For learning and development, you need at minimum:
| Role | Hostname | IP (example) | vCPU | RAM | Disk | Bandwidth |
|---|---|---|---|---|---|---|
| Kamailio SBC | kam01 | 10.0.1.10 | 2 | 2 GB | 20 GB | 100 Mbps |
| RTPEngine | rtp01 | 10.0.1.20 | 4 | 4 GB | 20 GB | 1 Gbps |
| FreeSWITCH 1 | fs01 | 10.0.1.30 | 4 | 8 GB | 100 GB | 1 Gbps |
| FreeSWITCH 2 | fs02 | 10.0.1.31 | 4 | 8 GB | 100 GB | 1 Gbps |
| MariaDB | db01 | 10.0.1.40 | 2 | 4 GB | 50 GB | 100 Mbps |
Production Setup (10+ Servers)
For production with high availability:
| Role | Count | vCPU each | RAM each | Disk | Notes |
|---|---|---|---|---|---|
| Kamailio SBC | 2 | 4 | 4 GB | 20 GB | Active/standby with Keepalived VIP |
| RTPEngine | 2 | 8 | 8 GB | 20 GB | Stateless — either can handle any stream |
| FreeSWITCH | 3-4 | 8 | 16 GB | 500 GB | Recordings stored locally or NFS |
| MariaDB Galera | 3 | 4 | 16 GB | 200 GB | 3-node cluster for quorum |
| Homer (SIP capture) | 1 | 4 | 8 GB | 500 GB | Optional but strongly recommended |
Capacity Planning
Approximate capacity per server (your mileage will vary with codec, recording, and transcoding):
| Component | Metric | Capacity |
|---|---|---|
| Kamailio (4 vCPU) | SIP transactions/sec | 5,000-10,000 |
| Kamailio (4 vCPU) | Concurrent registrations | 100,000+ |
| RTPEngine (8 vCPU) | Concurrent RTP streams | 2,000-4,000 |
| FreeSWITCH (8 vCPU, 16 GB) | Concurrent calls (G.711) | 1,500-2,500 |
| FreeSWITCH (8 vCPU, 16 GB) | Concurrent calls (with recording) | 800-1,500 |
| FreeSWITCH (8 vCPU, 16 GB) | Concurrent calls (with transcoding) | 500-1,000 |
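The table translates into a quick sizing rule. A throwaway calculator, using mid-range numbers from the table above — the N+1 spare node is my assumption, not a universal rule:

```python
import math

# Mid-range per-server capacities from the table above (8 vCPU / 16 GB FreeSWITCH)
CAPACITY = {"g711": 2000, "recording": 1100, "transcoding": 750}

def freeswitch_servers_needed(concurrent_calls, workload="g711", n_plus_one=True):
    """FreeSWITCH node count for a target call volume, with an optional N+1 spare."""
    n = math.ceil(concurrent_calls / CAPACITY[workload])
    return n + 1 if n_plus_one else n

print(freeswitch_servers_needed(10_000))                # 6 (5 nodes + 1 spare)
print(freeswitch_servers_needed(10_000, "recording"))   # 11 (10 nodes + 1 spare)
```

Size for your worst-case workload: if even a fraction of calls are recorded or transcoded, use the lower capacity figure for the whole pool.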
Network Requirements
- Internal network: All components on same VLAN or low-latency private network (<1ms RTT)
- External network: Kamailio and RTPEngine need public IPs (or 1:1 NAT)
- Firewall: Only Kamailio and RTPEngine exposed externally; FreeSWITCH and DB are internal only
- Bandwidth: G.711 RTP uses ~87 kbps per direction, and a relay re-sends both directions, so budget ~175 kbps per call per link direction — roughly 5,000-5,700 concurrent relayed calls on a 1 Gbps link
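The ~87 kbps figure is derivable from packet sizes, and it is worth doing the arithmetic once, because a media relay forwards both directions of every call. A quick check using standard header sizes (your framing overhead may differ slightly):

```python
# G.711 @ 20 ms ptime: 50 packets/sec, 160-byte audio payload per packet.
PAYLOAD = 160                 # bytes of audio per packet
OVERHEAD = 12 + 8 + 20 + 18   # RTP + UDP + IPv4 + Ethernet header/FCS
PPS = 50

stream_bps = (PAYLOAD + OVERHEAD) * 8 * PPS
print(stream_bps)  # 87200 — the ~87 kbps per direction quoted above

# A relay (RTPEngine) receives and re-sends each direction, so each call
# consumes roughly two streams per link direction.
link_bps = 1_000_000_000
calls = link_bps // (stream_bps * 2)
print(calls)  # 5733 relayed calls per 1 Gbps link direction
```

Opus or G.729 change the payload size dramatically, so redo this arithmetic per codec before committing to NIC sizing.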
IP Addressing Plan
# Public IPs (exposed to internet)
YOUR_PUBLIC_VIP = Floating VIP for Kamailio HA pair
YOUR_KAM1_PUBLIC = Kamailio-A public IP
YOUR_KAM2_PUBLIC = Kamailio-B public IP
YOUR_RTP1_PUBLIC = RTPEngine-1 public IP
YOUR_RTP2_PUBLIC = RTPEngine-2 public IP
# Private IPs (internal network — 10.0.1.0/24)
YOUR_KAM1_PRIVATE = 10.0.1.10 # Kamailio-A
YOUR_KAM2_PRIVATE = 10.0.1.11 # Kamailio-B
YOUR_RTP1_PRIVATE = 10.0.1.20 # RTPEngine-1
YOUR_RTP2_PRIVATE = 10.0.1.21 # RTPEngine-2
YOUR_FS1_IP = 10.0.1.30 # FreeSWITCH-1
YOUR_FS2_IP = 10.0.1.31 # FreeSWITCH-2
YOUR_FS3_IP = 10.0.1.32 # FreeSWITCH-3
YOUR_DB1_IP = 10.0.1.40 # MariaDB node 1
YOUR_DB2_IP = 10.0.1.41 # MariaDB node 2
YOUR_DB3_IP = 10.0.1.42 # MariaDB node 3
Base OS Setup (All Servers)
Run on every Debian 12 server:
#!/bin/bash
# base-setup.sh — Run on all servers
# Set timezone
timedatectl set-timezone UTC
# Update system
apt-get update && apt-get upgrade -y
# Install common packages
apt-get install -y \
curl wget gnupg2 lsb-release apt-transport-https \
ca-certificates software-properties-common \
net-tools tcpdump ngrep sngrep \
htop iotop sysstat \
vim tmux git \
ufw fail2ban \
chrony
# Enable time sync (critical for SIP — clock skew breaks digest auth and TLS)
# Note: on Debian 12 the "ntp" package is a transitional alias for ntpsec;
# chrony is used here because its service name is predictable
systemctl enable --now chrony
# Set hostname (replace per server)
# hostnamectl set-hostname kam01.YOUR_DOMAIN
# Configure /etc/hosts on all servers
cat >> /etc/hosts << 'EOF'
10.0.1.10 kam01
10.0.1.11 kam02
10.0.1.20 rtp01
10.0.1.21 rtp02
10.0.1.30 fs01
10.0.1.31 fs02
10.0.1.32 fs03
10.0.1.40 db01
10.0.1.41 db02
10.0.1.42 db03
EOF
# Kernel tuning for VoIP
cat > /etc/sysctl.d/90-voip.conf << 'EOF'
# Network buffer sizes
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 1048576
net.core.wmem_default = 1048576
net.core.netdev_max_backlog = 50000
# Connection tracking (high for SIP)
net.netfilter.nf_conntrack_max = 1000000
net.netfilter.nf_conntrack_udp_timeout = 60
net.netfilter.nf_conntrack_udp_timeout_stream = 180
# TCP tuning
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
# File descriptors
fs.file-max = 1000000
fs.nr_open = 1000000
# Disable SIP ALG (critical!)
net.netfilter.nf_conntrack_helper = 0
EOF
sysctl -p /etc/sysctl.d/90-voip.conf
# Increase file descriptor limits
cat > /etc/security/limits.d/voip.conf << 'EOF'
* soft nofile 1000000
* hard nofile 1000000
* soft nproc 65535
* hard nproc 65535
EOF
echo "Base setup complete. Reboot recommended."
4. Kamailio SBC Configuration
Install Kamailio with Required Modules
#!/bin/bash
# install-kamailio-sbc.sh — Run on kam01 and kam02
# Add Kamailio 5.8 repository
curl -fsSL https://deb.kamailio.org/kamailiodebkey.gpg | gpg --dearmor -o /usr/share/keyrings/kamailio.gpg
echo "deb [signed-by=/usr/share/keyrings/kamailio.gpg] http://deb.kamailio.org/kamailio58 bookworm main" \
> /etc/apt/sources.list.d/kamailio.list
apt-get update
# Install Kamailio + all modules we need
apt-get install -y \
kamailio \
kamailio-mysql-modules \
kamailio-tls-modules \
kamailio-websocket-modules \
kamailio-json-modules \
kamailio-extra-modules \
kamailio-utils-modules
# Enable Kamailio service
systemctl enable kamailio
Generate TLS Certificates
# Install certbot for Let's Encrypt
apt-get install -y certbot
# Get certificate (stop any service on port 80 first)
certbot certonly --standalone -d sip.YOUR_DOMAIN --agree-tos -m admin@YOUR_DOMAIN
# Create Kamailio TLS directory
mkdir -p /etc/kamailio/tls
# Link certificates
ln -sf /etc/letsencrypt/live/sip.YOUR_DOMAIN/fullchain.pem /etc/kamailio/tls/server.pem
ln -sf /etc/letsencrypt/live/sip.YOUR_DOMAIN/privkey.pem /etc/kamailio/tls/server.key
# Set permissions
chown -R kamailio:kamailio /etc/kamailio/tls/
chmod 600 /etc/kamailio/tls/server.key
Production kamailio.cfg
This is the complete, production-ready configuration for Kamailio operating as an SBC/load balancer in front of FreeSWITCH:
#!KAMAILIO
##
## Kamailio SBC Configuration
## Role: SIP proxy / load balancer / WebRTC gateway
## Backend: FreeSWITCH media server pool via dispatcher
##
## ---- Global Parameters ----
#!define DBURL "mysql://kamailio:YOUR_DB_PASSWORD@YOUR_DB1_IP/kamailio"
#!define FLT_NATS 5 # flag: caller is behind NAT
#!define FLB_NATB 6 # branch flag: NAT contact binding
#!define FLT_DLG 4 # flag: dialog tracking enabled
#!define MY_PUBLIC_IP "YOUR_KAM1_PUBLIC"
#!define MY_PRIVATE_IP "YOUR_KAM1_PRIVATE"
#!define MY_DOMAIN "sip.YOUR_DOMAIN"
#!define WITH_MYSQL
#!define WITH_NAT
#!define WITH_TLS
#!define WITH_WEBSOCKETS
#!define WITH_RTPENGINE
#!define WITH_DISPATCHER
#!define WITH_ANTIFLOOD
#!define WITH_TOPOH
## ---- Core Parameters ----
debug=2
log_stderror=no
log_facility=LOG_LOCAL0
log_prefix="{$mt $hdr(CSeq) $ci} "
memdbg=5
memlog=5
fork=yes
children=8 # Worker processes — adjust based on CPU cores
tcp_children=4 # TCP/TLS/WSS worker processes
listen=udp:MY_PRIVATE_IP:5060
listen=tcp:MY_PRIVATE_IP:5060
listen=udp:MY_PUBLIC_IP:5060
listen=tcp:MY_PUBLIC_IP:5060
#!ifdef WITH_TLS
listen=tls:MY_PUBLIC_IP:5061
#!endif
#!ifdef WITH_WEBSOCKETS
listen=tcp:MY_PRIVATE_IP:8080 # WS (behind Nginx)
listen=tls:MY_PUBLIC_IP:8443 # WSS (direct or behind Nginx)
#!endif
tcp_connection_lifetime=3605
tcp_accept_no_cl=yes
tcp_rd_buf_size=16384
server_header="Server: VoIP-Platform"
user_agent_header="User-Agent: VoIP-Platform"
## ---- Module Loading ----
loadmodule "jsonrpcs.so"
loadmodule "kex.so"
loadmodule "corex.so"
loadmodule "tm.so"
loadmodule "tmx.so"
loadmodule "sl.so"
loadmodule "rr.so"
loadmodule "pv.so"
loadmodule "maxfwd.so"
loadmodule "textops.so"
loadmodule "siputils.so"
loadmodule "xlog.so"
loadmodule "sanity.so"
loadmodule "ctl.so"
loadmodule "cfg_rpc.so"
loadmodule "counters.so"
loadmodule "sdpops.so"
loadmodule "path.so"
#!ifdef WITH_MYSQL
loadmodule "db_mysql.so"
#!endif
loadmodule "usrloc.so"
loadmodule "registrar.so"
loadmodule "nathelper.so"
loadmodule "rtpengine.so"
loadmodule "dialog.so"
loadmodule "pike.so"
loadmodule "htable.so"
#!ifdef WITH_TLS
loadmodule "tls.so"
#!endif
#!ifdef WITH_WEBSOCKETS
loadmodule "websocket.so"
loadmodule "xhttp.so"
#!endif
#!ifdef WITH_DISPATCHER
loadmodule "dispatcher.so"
#!endif
#!ifdef WITH_TOPOH
loadmodule "topoh.so"
#!endif
## ---- Module Parameters ----
# -- jsonrpcs --
modparam("jsonrpcs", "pretty_format", 1)
modparam("jsonrpcs", "transport", 1)
# -- tm --
modparam("tm", "failure_reply_mode", 3)
modparam("tm", "fr_timer", 30000) # 30s final response timeout
modparam("tm", "fr_inv_timer", 120000) # 120s INVITE response timeout
modparam("tm", "restart_fr_on_each_reply", 1)
modparam("tm", "auto_inv_100_reason", "Trying")
# -- rr (Record-Route) --
modparam("rr", "enable_full_lr", 1)
modparam("rr", "append_fromtag", 1)
modparam("rr", "enable_double_rr", 1) # Required for topology hiding
# -- registrar --
modparam("registrar", "method_filtering", 1)
modparam("registrar", "max_expires", 3600)
modparam("registrar", "default_expires", 300)
modparam("registrar", "gruu_enabled", 0)
# -- usrloc --
#!ifdef WITH_MYSQL
modparam("usrloc", "db_url", DBURL)
modparam("usrloc", "db_mode", 2) # Write-through for HA
#!else
modparam("usrloc", "db_mode", 0) # Memory only
#!endif
modparam("usrloc", "nat_bflag", FLB_NATB)
# -- nathelper --
modparam("nathelper", "natping_interval", 30)
modparam("nathelper", "ping_nated_only", 1)
modparam("nathelper", "sipping_bflag", FLB_NATB)
modparam("nathelper", "sipping_from", "sip:keepalive@MY_DOMAIN")
modparam("nathelper", "sipping_method", "OPTIONS")
# -- rtpengine --
modparam("rtpengine", "rtpengine_sock", "udp:YOUR_RTP1_PRIVATE:2223")
# For multiple RTPEngine instances:
# modparam("rtpengine", "rtpengine_sock", "udp:YOUR_RTP1_PRIVATE:2223=1 udp:YOUR_RTP2_PRIVATE:2223=1")
# -- dialog --
modparam("dialog", "dlg_flag", FLT_DLG)
modparam("dialog", "track_cseq_updates", 1)
#!ifdef WITH_MYSQL
modparam("dialog", "db_url", DBURL)
modparam("dialog", "db_mode", 1) # Realtime for HA
modparam("dialog", "db_update_period", 60)
#!endif
# -- pike (rate limiting) --
#!ifdef WITH_ANTIFLOOD
modparam("pike", "sampling_time_unit", 2)
modparam("pike", "reqs_density_per_unit", 30) # 30 req/2sec per IP
modparam("pike", "remove_latency", 4)
#!endif
# -- htable (hash tables for rate limiting / blacklisting) --
modparam("htable", "htable", "blocked=>size=8;autoexpire=300;")
modparam("htable", "htable", "failcnt=>size=8;autoexpire=60;initval=0;")
# -- dispatcher --
#!ifdef WITH_DISPATCHER
modparam("dispatcher", "db_url", DBURL)
modparam("dispatcher", "ds_ping_method", "OPTIONS")
modparam("dispatcher", "ds_ping_interval", 10) # Ping every 10 seconds
modparam("dispatcher", "ds_ping_reply_codes", "class2;class3;class4")
modparam("dispatcher", "ds_probing_mode", 1) # Probe all destinations
modparam("dispatcher", "ds_probing_threshold", 3) # 3 failures = inactive
modparam("dispatcher", "ds_inactive_threshold", 3) # 3 successes = active again
modparam("dispatcher", "ds_ping_latency_stats", 1)
#!endif
# -- TLS --
#!ifdef WITH_TLS
modparam("tls", "config", "/etc/kamailio/tls.cfg")
modparam("tls", "tls_force_run", 1)
#!endif
# -- WebSocket --
#!ifdef WITH_WEBSOCKETS
modparam("websocket", "keepalive_mechanism", 1) # PING frames
modparam("websocket", "keepalive_timeout", 30)
modparam("websocket", "keepalive_processes", 1)
#!endif
# -- topoh (topology hiding) --
#!ifdef WITH_TOPOH
modparam("topoh", "mask_ip", "255.255.255.255")
modparam("topoh", "mask_callid", 1)
modparam("topoh", "th_callid_prefix", "VoIP-")
modparam("topoh", "th_ip_prefix", "sbc.")
#!endif
## ==== Routing Logic ====
## ---- Main Request Route ----
request_route {
# Per-request logging
xlog("L_INFO", ">>> $rm from $fu ($si:$sp) to $ru\n");
# Max forwards check
if (!mf_process_maxfwd_header("10")) {
sl_send_reply("483", "Too Many Hops");
exit;
}
# Sanity checks
if (!sanity_check("17895", "7")) {
xlog("L_WARN", "Malformed SIP from $si:$sp\n");
exit;
}
# ---- Anti-flood / DDoS Protection ----
#!ifdef WITH_ANTIFLOOD
route(ANTIFLOOD);
#!endif
# ---- Handle WebSocket connections ----
#!ifdef WITH_WEBSOCKETS
if (proto == WS || proto == WSS) {
# WebSocket SIP — force record-route with WS
if (is_method("REGISTER")) {
# Allow WS registrations
}
}
#!endif
# ---- CANCEL processing ----
if (is_method("CANCEL")) {
if (t_check_trans()) {
route(RTPENGINE_DELETE);
t_relay();
}
exit;
}
# ---- Retransmission handling ----
if (!is_method("ACK")) {
if (t_precheck_trans()) {
t_check_trans();
exit;
}
t_check_trans();
}
# ---- Record-Route for dialogs ----
if (is_method("INVITE|SUBSCRIBE")) {
record_route();
}
# ---- Sequential requests (in-dialog) ----
if (has_totag()) {
route(WITHINDLG);
exit;
}
# ---- Initial requests ----
# Handle REGISTER
if (is_method("REGISTER")) {
route(REGISTRAR);
exit;
}
# Handle OPTIONS (keepalive)
if (is_method("OPTIONS") && uri == myself) {
sl_send_reply("200", "OK");
exit;
}
# Handle INVITE — main call processing
if (is_method("INVITE")) {
# Enable dialog tracking
setflag(FLT_DLG);
dlg_manage();
# NAT detection and fixing
route(NATDETECT);
# RTPEngine: offer (external→internal bridging)
route(RTPENGINE_OFFER);
# Dispatch to FreeSWITCH pool
route(DISPATCH);
exit;
}
# Handle other methods
if (is_method("NOTIFY|INFO|UPDATE|PRACK")) {
route(RELAY);
exit;
}
# Reject anything else
sl_send_reply("405", "Method Not Allowed");
exit;
}
## ---- In-Dialog Request Routing ----
route[WITHINDLG] {
if (!loose_route()) {
if (is_method("ACK")) {
if (!t_check_trans()) {
# ACK without matching transaction — absorb
exit;
}
}
sl_send_reply("404", "Not Found");
exit;
}
if (is_method("ACK")) {
route(NATMANAGE);
} else if (is_method("BYE")) {
route(RTPENGINE_DELETE);
} else if (is_method("INVITE")) {
# Re-INVITE — handle RTPEngine for media changes
route(NATDETECT);
route(RTPENGINE_OFFER);
}
route(RELAY);
exit;
}
## ---- Relay Route ----
route[RELAY] {
if (is_method("INVITE|BYE|SUBSCRIBE|UPDATE|REFER")) {
if (!t_is_set("branch_route")) {
t_on_branch("MANAGE_BRANCH");
}
}
if (is_method("INVITE|SUBSCRIBE|UPDATE")) {
if (!t_is_set("onreply_route")) {
t_on_reply("MANAGE_REPLY");
}
}
if (is_method("INVITE")) {
if (!t_is_set("failure_route")) {
t_on_failure("MANAGE_FAILURE");
}
}
if (!t_relay()) {
sl_reply_error();
}
exit;
}
## ---- REGISTER Handling ----
route[REGISTRAR] {
# NAT detection for registrations
route(NATDETECT);
# Option 1: Store locally (Kamailio manages registrations)
if (!save("location")) {
sl_reply_error();
}
exit;
# Option 2: Proxy registrations to FreeSWITCH (uncomment if FS manages registrations)
# route(DISPATCH);
# exit;
}
## ---- Dispatcher — Load Balance to FreeSWITCH ----
route[DISPATCH] {
# Set 1 = FreeSWITCH media servers
# Algorithm 0 = hash over callid (sticky sessions — in-dialog goes to same FS)
# Third argument = keep up to 6 destinations for ds_next_dst() failover
if (!ds_select_dst("1", "0", "6")) {
xlog("L_ERR", "DISPATCH: No FreeSWITCH servers available!\n");
sl_send_reply("503", "Service Unavailable");
exit;
}
xlog("L_INFO", "DISPATCH: Routing $rm to $du (FS pool)\n");
t_on_failure("DISPATCH_FAILURE");
route(RELAY);
exit;
}
## ---- NAT Detection ----
route[NATDETECT] {
force_rport();
if (nat_uac_test("19")) {
# Client is behind NAT
setflag(FLT_NATS);
setbflag(FLB_NATB);
if (is_first_hop()) {
set_contact_alias();
}
}
}
## ---- NAT Management ----
route[NATMANAGE] {
if (is_request()) {
if (has_totag()) {
if (check_route_param("nat=yes")) {
setbflag(FLB_NATB);
}
}
}
if (!isbflagset(FLB_NATB)) return;
if (is_request() && !has_totag() && t_is_branch_route()) {
# Tag the Record-Route so in-dialog requests re-enable NAT handling
add_rr_param(";nat=yes");
}
if (is_reply()) {
set_contact_alias();
}
}
## ---- RTPEngine Routes ----
route[RTPENGINE_OFFER] {
if (!is_method("INVITE|UPDATE")) return;
if (!has_body("application/sdp")) return;
$var(rtpflags) = "replace-origin replace-session-connection";
# Determine direction based on source
if ($si == "YOUR_FS1_IP" || $si == "YOUR_FS2_IP" || $si == "YOUR_FS3_IP") {
# From FreeSWITCH → going external
$var(rtpflags) = $var(rtpflags) + " direction=internal direction=external";
} else {
# From external → going to FreeSWITCH
$var(rtpflags) = $var(rtpflags) + " direction=external direction=internal";
}
# WebRTC client — need ICE and DTLS
if (proto == WS || proto == WSS) {
$var(rtpflags) = $var(rtpflags) + " ICE=force DTLS=passive SDES-off";
}
rtpengine_offer("$var(rtpflags)");
}
route[RTPENGINE_ANSWER] {
if (!has_body("application/sdp")) return;
$var(rtpflags) = "replace-origin replace-session-connection";
# Mirror the direction logic from the offer
if ($si == "YOUR_FS1_IP" || $si == "YOUR_FS2_IP" || $si == "YOUR_FS3_IP") {
$var(rtpflags) = $var(rtpflags) + " direction=internal direction=external";
} else {
$var(rtpflags) = $var(rtpflags) + " direction=external direction=internal";
}
if (proto == WS || proto == WSS) {
$var(rtpflags) = $var(rtpflags) + " ICE=force DTLS=passive SDES-off";
}
rtpengine_answer("$var(rtpflags)");
}
route[RTPENGINE_DELETE] {
rtpengine_delete();
}
## ---- Anti-Flood Protection ----
#!ifdef WITH_ANTIFLOOD
route[ANTIFLOOD] {
# Skip checks for trusted IPs (FreeSWITCH servers, trunks)
if ($si == "YOUR_FS1_IP" || $si == "YOUR_FS2_IP" || $si == "YOUR_FS3_IP") {
return;
}
# Check if IP is in blocked table
if ($sht(blocked=>$si) != $null) {
xlog("L_WARN", "ANTIFLOOD: Blocked request from $si\n");
exit;
}
# Pike rate limiter
if (!pike_check_req()) {
xlog("L_ALERT", "ANTIFLOOD: Pike blocking $si — rate limit exceeded\n");
$sht(blocked=>$si) = 1; # Block for 300 seconds (htable autoexpire)
exit;
}
}
#!endif
## ---- Branch Route ----
branch_route[MANAGE_BRANCH] {
xlog("L_DBG", "BRANCH: new branch [$T_branch_idx] to $ru\n");
route(NATMANAGE);
}
## ---- Reply Route ----
onreply_route[MANAGE_REPLY] {
xlog("L_DBG", "REPLY: $rs $rr from $si\n");
if (status =~ "[12][0-9][0-9]") {
route(NATMANAGE);
}
# RTPEngine answer on 183/200 with SDP
if (status =~ "(183|200)" && has_body("application/sdp")) {
route(RTPENGINE_ANSWER);
}
}
## ---- Failure Route — Dispatcher Failover ----
failure_route[DISPATCH_FAILURE] {
if (t_is_canceled()) exit;
xlog("L_WARN", "DISPATCH_FAILURE: $rs from $du — trying next FS\n");
# On failure (timeout, 5xx), try next server
if (t_check_status("5[0-9][0-9]") || t_check_status("408")) {
# Clean up RTPEngine session for failed branch
route(RTPENGINE_DELETE);
# Try next dispatcher destination
if (ds_next_dst()) {
xlog("L_INFO", "DISPATCH_FAILURE: Failing over to $du\n");
route(RTPENGINE_OFFER);
route(RELAY);
exit;
}
}
# All dispatchers failed
xlog("L_ERR", "DISPATCH_FAILURE: All FreeSWITCH servers failed\n");
send_reply("503", "All Media Servers Unavailable");
}
failure_route[MANAGE_FAILURE] {
if (t_is_canceled()) exit;
xlog("L_WARN", "FAILURE: $rs for $rm to $ru\n");
}
## ---- WebSocket HTTP Handling ----
#!ifdef WITH_WEBSOCKETS
event_route[xhttp:request] {
set_reply_close();
set_reply_no_connect();
if ($hdr(Upgrade) =~ "websocket" &&
$hdr(Connection) =~ "Upgrade" &&
$rm =~ "GET") {
# Validate WebSocket handshake
if ($hdr(Sec-WebSocket-Protocol) =~ "sip") {
# Accept the WebSocket upgrade
if (ws_handle_handshake()) {
exit;
}
}
}
# Not a WebSocket request — return 403
xhttp_reply("403", "Forbidden", "text/html",
"<html><body>Forbidden</body></html>");
}
#!endif
TLS Configuration
Create /etc/kamailio/tls.cfg:
# /etc/kamailio/tls.cfg
[server:default]
method = TLSv1.2+
certificate = /etc/kamailio/tls/server.pem
private_key = /etc/kamailio/tls/server.key
verify_certificate = no
require_certificate = no
cipher_list = HIGH:!aNULL:!MD5:!DSS
[client:default]
method = TLSv1.2+
verify_certificate = no
Initialize Kamailio Database
# First point kamctl at MySQL in /etc/kamailio/kamctlrc:
#   DBENGINE=MYSQL
#   DBHOST=YOUR_DB1_IP
# Then create the Kamailio database and tables
kamdbctl create
# When prompted:
# MySQL password for root: (your MySQL root password)
# Database name: kamailio (default)
# Install extra tables? Yes
# Install presence tables? No (not needed for SBC role)
# Verify tables exist
mysql -u kamailio -pYOUR_DB_PASSWORD kamailio -e "SHOW TABLES;"
Firewall Rules for Kamailio
# UFW firewall rules for Kamailio SBC
ufw default deny incoming
ufw default allow outgoing
# SSH
ufw allow 22/tcp
# SIP (UDP + TCP)
ufw allow 5060/udp
ufw allow 5060/tcp
# SIP TLS
ufw allow 5061/tcp
# WebSocket (WSS)
ufw allow 8443/tcp
# Allow all traffic from internal network
ufw allow from 10.0.1.0/24
# Enable firewall
ufw enable
Start and Verify
# Check configuration syntax
kamailio -c -f /etc/kamailio/kamailio.cfg
# Start Kamailio
systemctl start kamailio
# Verify it is listening
ss -ulnp | grep kamailio
ss -tlnp | grep kamailio
# Check logs
journalctl -u kamailio -f
# Test SIP response (sipsak is not in the base install: apt-get install -y sipsak)
# In a second terminal, watch the exchange with sngrep:
#   sngrep port 5060
sipsak -s sip:test@YOUR_KAM1_PUBLIC:5060
5. Dispatcher — Load Balancing FreeSWITCH
Understanding Dispatcher
The dispatcher module is Kamailio's built-in load balancer. It maintains a list of backend servers organized in destination sets (groups), monitors their health via SIP OPTIONS pings, and distributes traffic using configurable algorithms.
Dispatcher Algorithms
| Algorithm | ID | Description | Best For |
|---|---|---|---|
| Hash over Call-ID | 0 | Same Call-ID always goes to same server | Standard calls — in-dialog requests stay together |
| Hash over From URI | 1 | Same caller always goes to same server | User affinity |
| Hash over To URI | 2 | Same destination always goes to same server | DID-based routing |
| Hash over Request-URI | 3 | Same R-URI goes to same server | Service-based routing |
| Round-robin | 4 | Sequential rotation through servers | Even distribution |
| Hash over auth username | 5 | Authenticated user affinity | Registered users |
| Random | 6 | Random selection | Simple load spreading |
| Hash over PV | 7 | Hash over any pseudo-variable | Custom logic |
| Weight-based | 8 | Proportional distribution by weight | Heterogeneous servers |
| Call load | 9 | Least connections (tracks active calls) | Best for even load |
| Relative weight | 10 | Weight-based with relative proportions | Mixed-capacity servers |
Recommended for VoIP: Algorithm 0 (Call-ID hash) ensures all SIP messages for the same call go to the same FreeSWITCH. Algorithm 9 (call load) gives the most even distribution, but it requires per-call state: enable dialog tracking and give each destination a duid attribute so the dispatcher can count active calls.
Why Call-ID Hash Matters
SIP calls involve multiple transactions:
INVITE → 100 Trying → 180 Ringing → 200 OK → ACK
... call in progress ...
re-INVITE (hold/resume)
BYE → 200 OK
All messages for the same call must reach the same FreeSWITCH. Call-ID hash guarantees this because the Call-ID header is constant for the entire call. Without it, a re-INVITE or BYE could go to a different FreeSWITCH that knows nothing about the call.
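A toy illustration of the stickiness property (the dispatcher uses its own internal hash, so this demonstrates the idea, not the exact mapping):

```python
import hashlib

POOL = ["10.0.1.30", "10.0.1.31", "10.0.1.32"]  # FreeSWITCH nodes

def select_by_callid(call_id, pool=POOL):
    """Stable selection: the same Call-ID always maps to the same node."""
    digest = hashlib.md5(call_id.encode()).digest()
    return pool[int.from_bytes(digest[:4], "big") % len(pool)]

call_id = "a84b4c76e66710@pc33.example.com"
# Every transaction of the call (INVITE, re-INVITE, BYE) hashes identically,
# because the Call-ID header never changes within a call:
targets = {select_by_callid(call_id) for _ in ("INVITE", "reINVITE", "BYE")}
print(len(targets))  # 1 — all messages land on the same FreeSWITCH
```

The trade-off: hashing distributes calls evenly only in aggregate, and removing a node remaps some existing Call-IDs, which is why ds_next_dst() failover still matters.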
Database-Backed Dispatcher Configuration
Populate the dispatcher table:
-- Connect to kamailio database
USE kamailio;
-- Destination set 1: FreeSWITCH media servers
-- Columns: id, setid, destination, flags, priority, attrs, description
INSERT INTO dispatcher (setid, destination, flags, priority, attrs, description)
VALUES
(1, 'sip:YOUR_FS1_IP:5060', 0, 0, 'weight=50;duid=fs01', 'FreeSWITCH-1 Media'),
(1, 'sip:YOUR_FS2_IP:5060', 0, 0, 'weight=50;duid=fs02', 'FreeSWITCH-2 Media'),
(1, 'sip:YOUR_FS3_IP:5060', 0, 0, 'weight=50;duid=fs03', 'FreeSWITCH-3 Media');
-- Destination set 2: Conference-dedicated FreeSWITCH (optional)
-- Useful to route high-resource conference calls to dedicated servers
INSERT INTO dispatcher (setid, destination, flags, priority, attrs, description)
VALUES
(2, 'sip:YOUR_FS3_IP:5060', 0, 0, 'weight=100;duid=fs03-conf', 'FreeSWITCH-3 Conference');
-- Verify
SELECT * FROM dispatcher;
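If the pool changes often, generating the rows from an inventory list avoids copy-paste typos. A small hypothetical helper that matches the column order above:

```python
def dispatcher_insert(setid, nodes, weight=50):
    """Render an INSERT for the dispatcher table.
    nodes: list of (ip, duid, description) tuples — illustrative inventory format."""
    rows = ",\n".join(
        f"({setid}, 'sip:{ip}:5060', 0, 0, 'weight={weight};duid={duid}', '{desc}')"
        for ip, duid, desc in nodes
    )
    return ("INSERT INTO dispatcher (setid, destination, flags, priority, attrs, description)\n"
            f"VALUES\n{rows};")

print(dispatcher_insert(1, [
    ("10.0.1.30", "fs01", "FreeSWITCH-1 Media"),
    ("10.0.1.31", "fs02", "FreeSWITCH-2 Media"),
]))
```

After applying the generated SQL, run `kamcmd dispatcher.reload` so Kamailio picks up the change without a restart.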
Dispatcher Selection Parameters
In ds_select_dst("setid", "algorithm", "limit"), the third argument is optional — and it is a limit, not a flag bitmask: it caps how many destination addresses are remembered for failover.
ds_select_dst("1", "0") — pick one destination; no fallback list is stored
ds_select_dst("1", "0", "6") — pick one destination and remember up to 6 alternatives, so failure_route can call ds_next_dst() to try the next server
Destinations that OPTIONS probing has marked inactive are skipped automatically (see ds_probing_mode above).
Enhanced Dispatch Route
Here is an improved dispatch route with monitoring and logging:
## ---- Advanced Dispatcher Route ----
route[DISPATCH] {
# Determine dispatch set based on call type
$var(dispatch_set) = 1; # Default: general media servers
# Route conference calls to dedicated set (if configured)
if ($rU =~ "^conf[0-9]+$") {
$var(dispatch_set) = 2;
}
# Select destination with:
# Algorithm 0 = Call-ID hash (sticky sessions)
# Limit 6 = remember up to 6 destinations for ds_next_dst() failover
if (!ds_select_dst("$var(dispatch_set)", "0", "6")) {
xlog("L_ERR", "DISPATCH: No destinations available in set $var(dispatch_set)!\n");
# Try fallback set if primary set is empty
if ($var(dispatch_set) != 1) {
xlog("L_WARN", "DISPATCH: Falling back to general set 1\n");
if (!ds_select_dst("1", "0", "6")) {
sl_send_reply("503", "Service Unavailable — No Media Servers");
exit;
}
} else {
sl_send_reply("503", "Service Unavailable — No Media Servers");
exit;
}
}
# Log the selected destination
xlog("L_INFO", "DISPATCH: $rm $fu → $du (set=$var(dispatch_set))\n");
# Set failure route for failover
t_on_failure("DISPATCH_FAILURE");
route(RELAY);
exit;
}
## ---- Dispatcher Failure Route ----
failure_route[DISPATCH_FAILURE] {
if (t_is_canceled()) exit;
# Only failover on server errors or timeouts
if (t_check_status("5[0-9][0-9]") || t_check_status("408")) {
xlog("L_WARN", "DISPATCH_FAILURE: $rs from $du — trying next\n");
# Mark this destination as probing (will be checked by OPTIONS pings)
ds_mark_dst("p");
# Clean up RTPEngine for the failed branch
route(RTPENGINE_DELETE);
# Try next destination in the set
if (ds_next_dst()) {
xlog("L_INFO", "DISPATCH_FAILURE: Failover to $du\n");
route(RTPENGINE_OFFER);
route(RELAY);
exit;
}
xlog("L_ERR", "DISPATCH_FAILURE: All servers exhausted\n");
}
# 4xx responses: pass through to caller (authentication errors, etc.)
if (t_check_status("4[0-9][0-9]")) {
xlog("L_INFO", "DISPATCH_FAILURE: 4xx response $rs — passing through\n");
}
}
Runtime Dispatcher Management
# List all dispatcher destinations and their status
kamcmd dispatcher.list
# Output example:
# DEST: {
# URI: sip:10.0.1.30:5060
# FLAGS: AP (A=Active, P=Probing enabled)
# PRIORITY: 0
# LATENCY: {
# AVG: 2.450ms
# MAX: 8.120ms
# TIMEOUT: 0
# }
# }
# Manually set a destination as inactive (for maintenance)
kamcmd dispatcher.set_state i 1 sip:YOUR_FS1_IP:5060
# State codes: a=active, i=inactive, d=disabled, p=probing
# Re-enable a destination after maintenance
kamcmd dispatcher.set_state a 1 sip:YOUR_FS1_IP:5060
# Reload dispatcher table from database (after adding/removing servers)
kamcmd dispatcher.reload
# Check the number of active destinations per set
kamcmd dispatcher.list | grep -c "FLAGS: AP"
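For dashboards or maintenance scripts, the `dispatcher.list` output can be tallied programmatically instead of grepping. A small Python helper, assuming the output format shown in the example above:

```python
import re

def count_destinations(listing: str) -> dict:
    """Tally dispatcher destinations by state from `kamcmd dispatcher.list`
    output (format assumed to match the example above)."""
    counts = {"active": 0, "inactive": 0}
    for flags in re.findall(r"FLAGS:\s*(\S+)", listing):
        if flags.startswith("A"):   # A = active (AP = active + probing)
            counts["active"] += 1
        else:                        # I/D/T... = not taking traffic
            counts["inactive"] += 1
    return counts

sample = """
DEST: {
    URI: sip:10.0.1.30:5060
    FLAGS: AP
}
DEST: {
    URI: sip:10.0.1.31:5060
    FLAGS: IP
}
"""
print(count_destinations(sample))  # {'active': 1, 'inactive': 1}
```

Feed it the live output via `subprocess.run(["kamcmd", "dispatcher.list"], ...)` and alert when the active count drops below your capacity floor.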
Probing Configuration Details
The probing system sends SIP OPTIONS pings to each backend to detect failures:
Sequence:
1. Kamailio sends OPTIONS to FreeSWITCH every ds_ping_interval seconds
2. FreeSWITCH responds with 200 OK (healthy) or no response (down)
3. After ds_probing_threshold consecutive failures → destination marked INACTIVE
4. Probing continues on inactive destinations
5. After ds_inactive_threshold consecutive successes → destination marked ACTIVE
Timeline example (ds_ping_interval=10, ds_probing_threshold=3):
t=0s OPTIONS → FS1: 200 OK (active, count=0)
t=10s OPTIONS → FS1: timeout (active, fail_count=1)
t=20s OPTIONS → FS1: timeout (active, fail_count=2)
t=30s OPTIONS → FS1: timeout (INACTIVE, fail_count=3) ← traffic stops
t=40s OPTIONS → FS1: 200 OK (inactive, ok_count=1) ← still probing
t=50s OPTIONS → FS1: 200 OK (inactive, ok_count=2)
t=60s OPTIONS → FS1: 200 OK (ACTIVE, ok_count=3) ← traffic resumes
Detection time: 30 seconds (3 failures × the 10-second interval). For faster detection, reduce ds_ping_interval to 5 seconds, at the cost of doubling the background OPTIONS traffic toward every backend.
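Detection and recovery times follow directly from the two parameters, which makes the tradeoff easy to tabulate:

```python
def detection_seconds(ping_interval: int, probing_threshold: int) -> int:
    """Worst-case time before a dead backend stops receiving new calls."""
    return ping_interval * probing_threshold

def recovery_seconds(ping_interval: int, inactive_threshold: int) -> int:
    """Time before a recovered backend starts receiving calls again."""
    return ping_interval * inactive_threshold

# Values from the timeline above
print(detection_seconds(10, 3))  # 30
# Halving the interval halves detection time but doubles OPTIONS traffic
print(detection_seconds(5, 3))   # 15
```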
6. RTPEngine — Media Relay
Why RTPEngine?
In the Kamailio + FreeSWITCH architecture, RTPEngine solves critical media-layer problems:
| Problem | RTPEngine Solution |
|---|---|
| NAT traversal (media) | Relays RTP through a public IP — no direct path needed between endpoints |
| WebRTC ↔ SIP bridging | Converts DTLS-SRTP (WebRTC) ↔ plain RTP (SIP) |
| Topology hiding (media) | External parties see RTPEngine's IP, not FreeSWITCH's internal IP |
| Codec transcoding | Converts between codecs (e.g., G.729 ↔ G.711) without burdening FreeSWITCH |
| Call recording | Can record RTP streams to pcap files |
| SRTP | Terminates and originates SRTP for encrypted calls |
Without RTPEngine, you would need FreeSWITCH on a public IP (security risk) or complex iptables NAT rules (fragile and hard to scale).
Install RTPEngine on Debian 12
#!/bin/bash
# install-rtpengine.sh — Run on rtp01 and rtp02
# Add Sipwise repository for RTPEngine
echo "deb [signed-by=/usr/share/keyrings/sipwise.gpg] https://deb.sipwise.com/spce/mr12.5.1/ bookworm main" \
> /etc/apt/sources.list.d/sipwise.list
curl -fsSL https://deb.sipwise.com/spce/keyring/sipwise-keyring-bootstrap.gpg | \
gpg --dearmor -o /usr/share/keyrings/sipwise.gpg
apt-get update
# Install RTPEngine
apt-get install -y rtpengine
# If the Sipwise repo is not available, build from source:
# apt-get install -y build-essential dpkg-dev debhelper iptables-dev \
# libavcodec-dev libavfilter-dev libavformat-dev libavutil-dev \
# libbencode-perl libcrypt-openssl-rsa-perl libcrypt-rijndael-perl \
# libcurl4-openssl-dev libdigest-hmac-perl libevent-dev \
# libglib2.0-dev libhiredis-dev libio-multiplex-perl \
# libio-socket-inet6-perl libjson-glib-dev libmnl-dev \
# libnet-interface-perl libnftnl-dev libpcap0.8-dev \
# libpcre3-dev libspandsp-dev libssl-dev libsystemd-dev \
# libwebsockets-dev libxmlrpc-core-c3-dev markdown nfs-common \
# pandoc
#
# git clone https://github.com/sipwise/rtpengine.git
# cd rtpengine
# dpkg-buildpackage -b -uc -us
# dpkg -i ../rtpengine_*.deb
RTPEngine Configuration
Create /etc/rtpengine/rtpengine.conf:
# /etc/rtpengine/rtpengine.conf
# RTPEngine configuration for Kamailio + FreeSWITCH platform
[rtpengine]
# Control socket — Kamailio connects here
listen-ng = YOUR_RTP1_PRIVATE:2223
# Network interfaces
# Format: label/IP or label/internal_IP!external_IP
# "internal" = towards FreeSWITCH (private network)
# "external" = towards the internet (public IP)
interface = internal/YOUR_RTP1_PRIVATE
interface = external/YOUR_RTP1_PRIVATE!YOUR_RTP1_PUBLIC
# RTP port range
port-min = 20000
port-max = 40000
# Timeouts
timeout = 60 # Kill sessions that receive no RTP for 60 seconds
silent-timeout = 3600 # Same, but for muted/on-hold streams (allows long holds)
final-timeout = 7200 # Hard maximum call duration
# TOS/DSCP for QoS
tos = 184 # EF (Expedited Forwarding) for voice
# Recording (optional — pcap files)
# recording-dir = /var/spool/rtpengine
# recording-method = pcap
# Codec transcoding support
# Requires compilation with ffmpeg/libavcodec
# allow-transcoding = true
# Logging
log-level = 5 # 5=notice, 6=info, 7=debug
log-facility = daemon
log-facility-cdr = local1
# Process settings
pidfile = /run/rtpengine/rtpengine.pid
foreground = false
# num-threads defaults to one worker per CPU core; set explicitly to override
# num-threads = 8
# Table (iptables/nftables kernel module — for kernel-space forwarding)
# table = 0 # Uncomment for kernel-space RTP relay (better performance)
# no-fallback = false
Systemd Service
Create or edit /etc/systemd/system/rtpengine.service:
[Unit]
Description=RTPEngine Media Proxy
After=network.target
Requires=network.target
[Service]
Type=forking
PIDFile=/run/rtpengine/rtpengine.pid
ExecStartPre=/bin/mkdir -p /run/rtpengine
ExecStartPre=/bin/chown rtpengine:rtpengine /run/rtpengine
ExecStart=/usr/bin/rtpengine --config-file=/etc/rtpengine/rtpengine.conf
ExecStop=/bin/kill -TERM $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Firewall Rules for RTPEngine
# Control interface — only from Kamailio
ufw allow from YOUR_KAM1_PRIVATE to any port 2223 proto udp
ufw allow from YOUR_KAM2_PRIVATE to any port 2223 proto udp
# RTP port range — from anywhere (media comes from external endpoints)
ufw allow 20000:40000/udp
# SSH
ufw allow 22/tcp
ufw enable
Start and Verify RTPEngine
# Start the service
systemctl daemon-reload
systemctl enable --now rtpengine
# Check it is running
systemctl status rtpengine
# Verify listening
ss -ulnp | grep rtpengine
# Test the control interface (from Kamailio server)
echo 'ping_1 d7:command4:pinge' | nc -u -w1 YOUR_RTP1_PRIVATE 2223
# Every ng message is prefixed with a unique cookie ("ping_1" here);
# the reply echoes it back: ping_1 d6:result4:ponge
# Check active sessions (from RTPEngine server)
rtpengine-ctl list sessions
Kamailio ↔ RTPEngine Integration
The rtpengine module in Kamailio communicates with RTPEngine via the ng (next-generation) control protocol over UDP. The flow is:
Inbound INVITE (with SDP):
1. Kamailio receives INVITE from external trunk
2. route(RTPENGINE_OFFER): Kamailio sends "offer" to RTPEngine
- RTPEngine allocates UDP ports for RTP relay
- RTPEngine rewrites SDP: external endpoint ↔ RTPEngine ↔ internal FreeSWITCH
- SDP in INVITE now points to RTPEngine's internal IP (towards FreeSWITCH)
3. Kamailio forwards modified INVITE to FreeSWITCH
4. FreeSWITCH sends 200 OK (with SDP)
5. onreply_route: Kamailio sends "answer" to RTPEngine
- RTPEngine rewrites SDP in 200 OK: FreeSWITCH → RTPEngine's external IP
6. Kamailio sends modified 200 OK to external trunk
7. Media flows: External ↔ RTPEngine (external iface) ↔ RTPEngine (internal iface) ↔ FreeSWITCH
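The control exchange in steps 2 and 5 is easy to exercise by hand. Below is a minimal Python health-check that speaks the ng protocol (host and port are placeholders; per the protocol, every message is a unique cookie, a space, then bencoded data):

```python
import socket
import uuid

def build_ping(cookie: str) -> bytes:
    """ng messages are '<cookie> <bencoded dict>'. Ping is {'command': 'ping'}."""
    return f"{cookie} d7:command4:pinge".encode()

def is_pong(cookie: str, reply: bytes) -> bool:
    """A healthy RTPEngine echoes the cookie with {'result': 'pong'}."""
    return reply == f"{cookie} d6:result4:ponge".encode()

def ng_ping(host: str, port: int = 2223, timeout: float = 2.0) -> bool:
    """Send a ping to the RTPEngine control socket; True if it answered."""
    cookie = uuid.uuid4().hex[:8]
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_ping(cookie), (host, port))
        try:
            reply, _ = s.recvfrom(4096)
        except socket.timeout:
            return False
    return is_pong(cookie, reply)
```

Run `ng_ping("YOUR_RTP1_PRIVATE")` from the Kamailio host; a False return is exactly the condition under which Kamailio would mark the instance unreachable.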
SDP Manipulation Example
Original SDP from external trunk:
c=IN IP4 203.0.113.50 ← trunk's RTP IP
m=audio 30000 RTP/AVP 0 8 ← trunk's RTP port
After rtpengine_offer() — SDP sent to FreeSWITCH:
c=IN IP4 10.0.1.20 ← RTPEngine's INTERNAL interface
m=audio 20100 RTP/AVP 0 8 ← RTPEngine's allocated port (internal side)
FreeSWITCH answers with SDP:
c=IN IP4 10.0.1.30 ← FreeSWITCH's IP
m=audio 19200 RTP/AVP 0 ← FreeSWITCH's RTP port
After rtpengine_answer() — SDP sent to external trunk:
c=IN IP4 YOUR_RTP1_PUBLIC ← RTPEngine's EXTERNAL interface
m=audio 20200 RTP/AVP 0 ← RTPEngine's allocated port (external side)
Result: External trunk sends RTP to RTPEngine's public IP
RTPEngine relays to FreeSWITCH on private network
Neither party knows the other's real IP
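The rewriting above can be illustrated in a few lines of Python. This is only a toy model of what RTPEngine does internally, not how you should manipulate SDP in production:

```python
import re

def rewrite_sdp(sdp: str, relay_ip: str, relay_port: int) -> str:
    """Point the connection (c=) and media (m=) lines at a relay,
    as rtpengine_offer()/rtpengine_answer() do for each direction."""
    sdp = re.sub(r"(?m)^c=IN IP4 \S+", f"c=IN IP4 {relay_ip}", sdp)
    sdp = re.sub(r"(?m)^m=audio \d+", f"m=audio {relay_port}", sdp)
    return sdp

# Original SDP from the external trunk (as in the example above)
trunk_sdp = "c=IN IP4 203.0.113.50\r\nm=audio 30000 RTP/AVP 0 8\r\n"
# What FreeSWITCH would see after the "offer" (internal side of RTPEngine)
print(rewrite_sdp(trunk_sdp, "10.0.1.20", 20100))
```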
RTPEngine Clustering
For high availability, run multiple RTPEngine instances. Kamailio can be configured to use them:
# In kamailio.cfg — multiple RTPEngine backends
# Format: "udp:IP:PORT=weight udp:IP:PORT=weight"
modparam("rtpengine", "rtpengine_sock",
"udp:YOUR_RTP1_PRIVATE:2223=1 udp:YOUR_RTP2_PRIVATE:2223=1")
RTPEngine instances are stateless from a clustering perspective — each instance independently manages its own RTP sessions. Kamailio uses consistent hashing (based on Call-ID) to ensure that offer, answer, and delete for the same call go to the same RTPEngine instance.
If an RTPEngine instance goes down:
- Existing calls on that instance lose media (unavoidable — RTP state is local)
- New calls are automatically routed to the surviving instance
- Kamailio detects the failure via the control socket timeout
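The idea of pinning a call to one instance by hashing its Call-ID can be sketched as follows (the hash inside Kamailio's rtpengine module differs; this only shows the principle of why offer, answer, and delete stay on one node):

```python
import hashlib

def pick_rtpengine(call_id: str, nodes):
    """Deterministically map a Call-ID to one RTPEngine node, so every
    control command for the same call hits the same instance."""
    digest = hashlib.md5(call_id.encode()).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]

nodes = ["udp:10.0.1.20:2223", "udp:10.0.1.21:2223"]
print(pick_rtpengine("a84b4c76e66710@pc33", nodes))
```

Because the mapping depends only on the Call-ID and the node list, any Kamailio instance computes the same answer, which is what makes the RTPEngine pool usable behind multiple proxies.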
7. FreeSWITCH Media Server Configuration
Design Principle: Headless Media Server
When FreeSWITCH runs behind Kamailio, its role changes:
| Traditional FreeSWITCH | Behind Kamailio |
|---|---|
| Handles SIP registration | Kamailio handles registration |
| Manages NAT traversal | RTPEngine handles NAT |
| Authenticates SIP peers | Kamailio authenticates |
| Exposed to internet | Internal network only |
| Processes all SIP methods | Only receives pre-routed calls |
| Single instance | Multiple instances in a pool |
FreeSWITCH becomes a headless media application server — it focuses purely on call logic, media processing, and application features.
Install FreeSWITCH
#!/bin/bash
# install-freeswitch.sh — Run on fs01, fs02, fs03
# Add SignalWire repository
TOKEN="YOUR_SIGNALWIRE_TOKEN" # Get from signalwire.com (free account)
apt-get install -y gnupg2 lsb-release
curl -fsSL https://freeswitch.signalwire.com/repo/deb/debian-release/signalwire-freeswitch-repo.gpg \
> /usr/share/keyrings/signalwire-freeswitch-repo.gpg
echo "machine freeswitch.signalwire.com login signalwire password $TOKEN" > /etc/apt/auth.conf
chmod 600 /etc/apt/auth.conf
echo "deb [signed-by=/usr/share/keyrings/signalwire-freeswitch-repo.gpg] \
https://freeswitch.signalwire.com/repo/deb/debian-release/ bookworm main" \
> /etc/apt/sources.list.d/freeswitch.list
apt-get update
# Install FreeSWITCH with common modules
apt-get install -y \
freeswitch-meta-codecs \
freeswitch-mod-commands \
freeswitch-mod-conference \
freeswitch-mod-console \
freeswitch-mod-db \
freeswitch-mod-dialplan-xml \
freeswitch-mod-dptools \
freeswitch-mod-enum \
freeswitch-mod-event-socket \
freeswitch-mod-fifo \
freeswitch-mod-hash \
freeswitch-mod-httapi \
freeswitch-mod-local-stream \
freeswitch-mod-logfile \
freeswitch-mod-loopback \
freeswitch-mod-native-file \
freeswitch-mod-say-en \
freeswitch-mod-sndfile \
freeswitch-mod-sofia \
freeswitch-mod-tone-stream \
freeswitch-mod-voicemail \
freeswitch-mod-xml-cdr \
freeswitch-mod-xml-curl
systemctl enable freeswitch
SIP Profile — Internal Only
FreeSWITCH should only accept SIP from Kamailio. Create a dedicated SIP profile.
Edit /etc/freeswitch/sip_profiles/kamailio.xml:
<!--
FreeSWITCH SIP Profile: kamailio
Purpose: Accept calls only from Kamailio SBC
This profile listens on the private network and trusts Kamailio
-->
<profile name="kamailio">
<settings>
<!-- Listen only on private network -->
<param name="sip-ip" value="$${local_ip_v4}"/>
<param name="sip-port" value="5060"/>
<param name="rtp-ip" value="$${local_ip_v4}"/>
<!-- Disable RTP timer — RTPEngine handles media relay/timeout -->
<param name="rtp-timeout-sec" value="0"/>
<param name="rtp-hold-timeout-sec" value="0"/>
<!-- Dialplan context for calls from Kamailio -->
<param name="context" value="from-kamailio"/>
<!-- Kamailio has already authenticated the call; the inbound ACL below
admits its traffic without a digest challenge -->
<param name="challenge-realm" value="auto_from"/>
<!-- Accept all calls from trusted IPs (Kamailio) -->
<param name="apply-inbound-acl" value="kamailio-acl"/>
<!-- Disable registration on this profile -->
<param name="accept-blind-reg" value="false"/>
<!-- Codec preferences -->
<param name="inbound-codec-prefs" value="PCMA,PCMU,G722,opus"/>
<param name="outbound-codec-prefs" value="PCMA,PCMU,G722,opus"/>
<param name="inbound-codec-negotiation" value="generous"/>
<!-- Dialog management -->
<param name="manage-presence" value="false"/>
<param name="manage-shared-appearance" value="false"/>
<!-- Pass-through — let Kamailio handle NAT -->
<param name="aggressive-nat-detection" value="false"/>
<param name="local-network-acl" value="localnet.auto"/>
<!-- SIP options -->
<param name="disable-transfer" value="false"/>
<param name="enable-timer" value="false"/>
<param name="enable-100rel" value="false"/>
<!-- Logging -->
<param name="log-auth-failures" value="true"/>
<param name="debug" value="0"/>
</settings>
</profile>
ACL — Only Accept from Kamailio
Edit /etc/freeswitch/autoload_configs/acl.conf.xml:
<configuration name="acl.conf" description="Network ACL">
<network-lists>
<!-- Kamailio SBC servers -->
<list name="kamailio-acl" default="deny">
<node type="allow" cidr="YOUR_KAM1_PRIVATE/32"/>
<node type="allow" cidr="YOUR_KAM2_PRIVATE/32"/>
</list>
<!-- RTPEngine servers (for direct media) -->
<list name="rtpengine-acl" default="deny">
<node type="allow" cidr="YOUR_RTP1_PRIVATE/32"/>
<node type="allow" cidr="YOUR_RTP2_PRIVATE/32"/>
</list>
<!-- Internal network -->
<list name="internal-acl" default="deny">
<node type="allow" cidr="10.0.1.0/24"/>
</list>
</network-lists>
</configuration>
Disable Default External Profile
FreeSWITCH ships with internal and external profiles that listen on default ports. Disable them since we use our custom kamailio profile:
# Disable default profiles (move them out of the way)
mv /etc/freeswitch/sip_profiles/internal.xml /etc/freeswitch/sip_profiles/internal.xml.disabled
mv /etc/freeswitch/sip_profiles/external.xml /etc/freeswitch/sip_profiles/external.xml.disabled
# If you need the internal profile for registered extensions, keep it
# but change its port to avoid conflict:
# <param name="sip-port" value="5080"/>
Dialplan — Calls from Kamailio
Create /etc/freeswitch/dialplan/from-kamailio.xml:
<!--
Dialplan context: from-kamailio
Handles all calls dispatched by the Kamailio SBC
-->
<include>
<context name="from-kamailio">
<!-- ============================================ -->
<!-- IVR: Main Auto-Attendant -->
<!-- ============================================ -->
<extension name="main-ivr">
<condition field="destination_number" expression="^(ivr|2000)$">
<action application="answer"/>
<action application="sleep" data="500"/>
<action application="ivr" data="main_ivr"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- Call Queue: Sales -->
<!-- ============================================ -->
<extension name="queue-sales">
<condition field="destination_number" expression="^(sales|3001)$">
<action application="answer"/>
<action application="set" data="fifo_music=/usr/share/freeswitch/sounds/music/hold.wav"/>
<action application="fifo" data="sales@${domain_name} in"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- Call Queue: Support -->
<!-- ============================================ -->
<extension name="queue-support">
<condition field="destination_number" expression="^(support|3002)$">
<action application="answer"/>
<action application="set" data="fifo_music=/usr/share/freeswitch/sounds/music/hold.wav"/>
<action application="fifo" data="support@${domain_name} in"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- Conference Bridge -->
<!-- ============================================ -->
<extension name="conference">
<condition field="destination_number" expression="^conf(\d+)$">
<action application="answer"/>
<action application="conference" data="room-$1@default"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- Voicemail: Leave Message -->
<!-- ============================================ -->
<extension name="voicemail-leave">
<condition field="destination_number" expression="^vm(\d+)$">
<action application="answer"/>
<action application="sleep" data="500"/>
<action application="voicemail" data="default ${domain_name} $1"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- DID Routing: Route by called number -->
<!-- ============================================ -->
<extension name="did-routing">
<condition field="destination_number" expression="^(\+?\d{10,15})$">
<!-- Look up DID routing from database -->
<action application="set" data="continue_on_fail=true"/>
<action application="set" data="hangup_after_bridge=true"/>
<!-- Example: direct extension mapping -->
<!-- In production, use mod_xml_curl for dynamic DID→destination lookup -->
<action application="bridge" data="user/${destination_number}@${domain_name}"/>
<!-- If bridge fails, send to voicemail -->
<action application="voicemail" data="default ${domain_name} ${destination_number}"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- Internal Extension Dialing (1000-1999) -->
<!-- ============================================ -->
<extension name="local-extensions">
<condition field="destination_number" expression="^(1\d{3})$">
<action application="set" data="call_timeout=30"/>
<action application="set" data="continue_on_fail=true"/>
<action application="set" data="hangup_after_bridge=true"/>
<action application="bridge" data="user/$1@${domain_name}"/>
<!-- No answer → voicemail -->
<action application="voicemail" data="default ${domain_name} $1"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- Outbound Calls (via Kamailio) -->
<!-- ============================================ -->
<extension name="outbound">
<condition field="destination_number" expression="^9(\d+)$">
<!-- Strip the 9 prefix and send back to Kamailio for trunk routing -->
<action application="set" data="effective_caller_id_number=${outbound_caller_id_number}"/>
<action application="bridge" data="sofia/kamailio/$1@YOUR_KAM1_PRIVATE"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- Echo Test -->
<!-- ============================================ -->
<extension name="echo">
<condition field="destination_number" expression="^9196$">
<action application="answer"/>
<action application="echo"/>
</condition>
</extension>
<!-- ============================================ -->
<!-- Catch-all: Unknown Destination -->
<!-- ============================================ -->
<extension name="catch-all">
<condition field="destination_number" expression="^(.*)$">
<action application="log" data="WARNING: Unrouted call to ${destination_number} from ${caller_id_number}"/>
<action application="respond" data="404"/>
</condition>
</extension>
</context>
</include>
Event Socket Layer (ESL) Configuration
ESL allows external applications to control FreeSWITCH. This is essential for integration with custom applications, monitoring, and call control.
Edit /etc/freeswitch/autoload_configs/event_socket.conf.xml:
<configuration name="event_socket.conf" description="Socket Client">
<settings>
<!-- 0.0.0.0 binds all interfaces; access is restricted by the ACL
below and by the firewall (port 8021 from 10.0.1.0/24 only) -->
<param name="listen-ip" value="0.0.0.0"/>
<param name="listen-port" value="8021"/>
<param name="password" value="YOUR_ESL_PASSWORD"/>
<!-- ACL restriction — only allow from management network -->
<param name="apply-inbound-acl" value="internal-acl"/>
</settings>
</configuration>
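ESL is a simple line-based protocol: the server sends a header block terminated by a blank line, the client authenticates with `auth <password>`, then issues `api` commands. A minimal Python client sketch (host and password are placeholders for your deployment):

```python
import socket

def read_block(f) -> dict:
    """Read one ESL header block (terminated by a blank line) into a dict."""
    headers = {}
    for line in iter(f.readline, b"\n"):
        if not line:  # EOF
            break
        key, _, value = line.decode().rstrip("\n").partition(": ")
        headers[key] = value
    return headers

def esl_command(host: str, password: str, command: str) -> str:
    """Authenticate to mod_event_socket and run one 'api' command."""
    with socket.create_connection((host, 8021), timeout=5) as s:
        f = s.makefile("rb")
        read_block(f)                               # Content-Type: auth/request
        s.sendall(f"auth {password}\n\n".encode())
        read_block(f)                               # command/reply (+OK accepted)
        s.sendall(f"api {command}\n\n".encode())
        headers = read_block(f)                     # api/response + Content-Length
        return f.read(int(headers["Content-Length"])).decode()
```

For example, `esl_command("10.0.1.30", "YOUR_ESL_PASSWORD", "status")` returns the same text as `fs_cli -x "status"`, which makes it easy to poll session counts from monitoring scripts.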
XML CDR — Call Detail Records
Configure FreeSWITCH to POST CDRs to a central collector:
Edit /etc/freeswitch/autoload_configs/xml_cdr.conf.xml:
<configuration name="xml_cdr.conf" description="XML CDR">
<settings>
<!-- POST CDRs to central collector -->
<param name="url" value="http://YOUR_DB1_IP:8080/cdr"/>
<param name="retries" value="3"/>
<param name="delay" value="5"/>
<param name="log-http-and-disk" value="true"/>
<param name="log-dir" value="/var/log/freeswitch/cdr-csv"/>
<param name="err-log-dir" value="/var/log/freeswitch/cdr-csv/errors"/>
<param name="encode" value="true"/>
<param name="disable-100-continue" value="true"/>
</settings>
</configuration>
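With encode=true, each CDR arrives at the collector as an HTTP POST whose body is `cdr=<url-encoded XML>` containing a `<variables>` section. A sketch of the collector-side parsing (the variable names shown are common FreeSWITCH channel variables, but verify them against a real CDR from your own install before relying on them):

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote_plus

FIELDS = ["caller_id_number", "destination_number", "duration",
          "billsec", "hangup_cause"]

def parse_cdr(post_body: str) -> dict:
    """Decode an xml_cdr POST body and pull out a few channel variables."""
    xml_text = unquote_plus(post_body.partition("cdr=")[2])
    root = ET.fromstring(xml_text)
    variables = root.find("variables")
    return {f: variables.findtext(f) for f in FIELDS}

# A minimal, hand-built sample body (real CDRs carry many more variables)
sample = ("cdr="
          "%3Ccdr%3E%3Cvariables%3E"
          "%3Ccaller_id_number%3E1001%3C%2Fcaller_id_number%3E"
          "%3Cbillsec%3E42%3C%2Fbillsec%3E"
          "%3C%2Fvariables%3E%3C%2Fcdr%3E")
print(parse_cdr(sample))
```

The parsed dict maps directly onto the columns of the cdr table defined in the next section; missing variables come back as None.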
FreeSWITCH Firewall (Internal Only)
# FreeSWITCH servers are internal only
ufw default deny incoming
ufw default allow outgoing
# SSH
ufw allow 22/tcp
# SIP from Kamailio only
ufw allow from YOUR_KAM1_PRIVATE to any port 5060 proto udp
ufw allow from YOUR_KAM1_PRIVATE to any port 5060 proto tcp
ufw allow from YOUR_KAM2_PRIVATE to any port 5060 proto udp
ufw allow from YOUR_KAM2_PRIVATE to any port 5060 proto tcp
# RTP from RTPEngine only
ufw allow from YOUR_RTP1_PRIVATE to any port 16384:32768 proto udp
ufw allow from YOUR_RTP2_PRIVATE to any port 16384:32768 proto udp
# ESL from management network
ufw allow from 10.0.1.0/24 to any port 8021 proto tcp
# Internal network (database, monitoring)
ufw allow from 10.0.1.0/24
ufw enable
Start and Verify FreeSWITCH
# Start FreeSWITCH
systemctl start freeswitch
# Verify SIP profile is loaded
fs_cli -x "sofia status"
# Should show: kamailio sip:mod_sofia@YOUR_FS1_IP:5060 RUNNING
# Verify profile details
fs_cli -x "sofia status profile kamailio"
# Test: Send a SIP OPTIONS from Kamailio
# On Kamailio server:
kamcmd dispatcher.list
# Should show FreeSWITCH as Active (AP flags)
# Run echo test through the full chain:
# SIP phone → Kamailio → RTPEngine → FreeSWITCH (9196 echo)
Per-Instance Configuration
Each FreeSWITCH instance needs a unique switch.conf.xml with its own identity:
<!-- /etc/freeswitch/autoload_configs/switch.conf.xml -->
<configuration name="switch.conf" description="Core Configuration">
<settings>
<!-- Unique per instance -->
<param name="switchname" value="fs01"/>
<!-- Core settings -->
<param name="max-sessions" value="5000"/>
<param name="sessions-per-second" value="100"/>
<param name="rtp-start-port" value="16384"/>
<param name="rtp-end-port" value="32768"/>
<!-- Logging -->
<param name="loglevel" value="warning"/>
<param name="colorize-console" value="false"/>
<!-- Performance -->
<param name="max-db-handles" value="50"/>
<param name="db-handle-timeout" value="10"/>
</settings>
</configuration>
Change switchname to fs02, fs03, etc. on each instance. This value appears in CDRs and logs, making it easy to identify which FreeSWITCH handled a call.
8. Database-Driven Routing
Shared Database Architecture
All components share a central MariaDB (or Galera cluster) for configuration, state, and CDRs. This enables:
- Dynamic routing — change DID→destination mapping without restarting anything
- Multi-tenant — domain-based isolation of users and routes
- Shared user directory — FreeSWITCH instances share the same user/extension database
- Centralized CDR — all call records in one place regardless of which FreeSWITCH handled the call
- Runtime changes — add/remove servers, DIDs, routes via database without restarts
Install MariaDB (Single Node or Galera Cluster)
For a single-node setup:
#!/bin/bash
# install-mariadb.sh — Run on db01
apt-get install -y mariadb-server mariadb-client
# Secure the installation
mysql_secure_installation
# Set root password, remove anonymous users, disable remote root, remove test DB
# Allow remote connections from private network
sed -i 's/bind-address.*/bind-address = 0.0.0.0/' /etc/mysql/mariadb.conf.d/50-server.cnf
# Performance tuning for VoIP
cat >> /etc/mysql/mariadb.conf.d/50-server.cnf << 'EOF'
# VoIP platform tuning
innodb_buffer_pool_size = 4G
innodb_log_file_size = 512M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
max_connections = 500
query_cache_type = 0
table_open_cache = 4000
tmp_table_size = 64M
max_heap_table_size = 64M
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
EOF
systemctl restart mariadb
For a Galera cluster (3 nodes), add to the config on each node:
# /etc/mysql/mariadb.conf.d/60-galera.cnf
[galera]
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = "voip-cluster"
wsrep_cluster_address = "gcomm://YOUR_DB1_IP,YOUR_DB2_IP,YOUR_DB3_IP"
wsrep_node_address = "YOUR_DB1_IP" # Change per node
wsrep_node_name = "db01" # Change per node
wsrep_sst_method = mariabackup
wsrep_sst_auth = "sst_user:YOUR_SST_PASSWORD"
binlog_format = ROW
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
Database Schema
Create the databases and tables used by each component:
-- ================================================
-- Kamailio database (created by kamdbctl create)
-- Key tables used by our SBC configuration:
-- ================================================
-- subscriber — SIP user credentials
-- (auto-created by kamdbctl, shown here for reference)
CREATE TABLE IF NOT EXISTS subscriber (
id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
username VARCHAR(64) NOT NULL DEFAULT '',
domain VARCHAR(64) NOT NULL DEFAULT '',
password VARCHAR(64) NOT NULL DEFAULT '',
ha1 VARCHAR(128) NOT NULL DEFAULT '',
ha1b VARCHAR(128) NOT NULL DEFAULT '',
PRIMARY KEY (id),
UNIQUE KEY sub_idx (username, domain)
) ENGINE=InnoDB;
-- dispatcher — load balancer backends (auto-created)
-- Already populated in Section 5
-- ================================================
-- Custom routing tables
-- ================================================
-- DID routing: maps incoming DIDs to destinations
CREATE TABLE did_routing (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
did VARCHAR(20) NOT NULL COMMENT 'Incoming DID number (E.164)',
domain VARCHAR(64) NOT NULL DEFAULT 'default' COMMENT 'Tenant domain',
destination VARCHAR(128) NOT NULL COMMENT 'Destination (extension, queue, IVR)',
dest_type ENUM('extension','queue','ivr','conference','voicemail','external') NOT NULL DEFAULT 'extension',
priority INT NOT NULL DEFAULT 0 COMMENT 'Higher = preferred',
active TINYINT(1) NOT NULL DEFAULT 1,
description VARCHAR(255) DEFAULT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id),
UNIQUE KEY did_domain_idx (did, domain),
KEY active_idx (active)
) ENGINE=InnoDB;
-- Example DID routing entries
INSERT INTO did_routing (did, domain, destination, dest_type, description) VALUES
('+442012345678', 'default', '2000', 'ivr', 'UK Main — IVR'),
('+442012345679', 'default', '3001', 'queue', 'UK Sales Direct'),
('+442012345680', 'default', '1001', 'extension', 'UK CEO Direct'),
('+33123456789', 'tenant-fr.example.com', '2000', 'ivr', 'France Main — IVR'),
('+33123456790', 'tenant-fr.example.com', '3002', 'queue', 'France Support');
-- Trunk routing: outbound carrier selection
CREATE TABLE trunk_routing (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
prefix VARCHAR(20) NOT NULL COMMENT 'Dialed prefix (longest match wins)',
domain VARCHAR(64) NOT NULL DEFAULT 'default',
trunk_name VARCHAR(64) NOT NULL COMMENT 'SIP trunk identifier',
trunk_uri VARCHAR(256) NOT NULL COMMENT 'SIP URI for the trunk',
priority INT NOT NULL DEFAULT 0,
weight INT NOT NULL DEFAULT 100 COMMENT 'Weight for load distribution',
active TINYINT(1) NOT NULL DEFAULT 1,
description VARCHAR(255) DEFAULT NULL,
PRIMARY KEY (id),
KEY prefix_idx (prefix, domain, active, priority)
) ENGINE=InnoDB;
-- Example trunk routing
INSERT INTO trunk_routing (prefix, domain, trunk_name, trunk_uri, priority, description) VALUES
('+44', 'default', 'carrier-a-uk', 'sip:[email protected]', 10, 'Carrier A — UK primary'),
('+44', 'default', 'carrier-b-uk', 'sip:[email protected]', 5, 'Carrier B — UK backup'),
('+33', 'default', 'carrier-a-fr', 'sip:[email protected]', 10, 'Carrier A — France'),
('+1', 'default', 'carrier-c-us', 'sip:[email protected]', 10, 'Carrier C — US/Canada');
-- ================================================
-- CDR table (all FreeSWITCH instances write here)
-- ================================================
CREATE TABLE cdr (
id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
switch_name VARCHAR(32) NOT NULL COMMENT 'FreeSWITCH instance (fs01, fs02, ...)',
call_uuid VARCHAR(64) NOT NULL,
sip_call_id VARCHAR(128) DEFAULT NULL,
caller_id_number VARCHAR(32) DEFAULT NULL,
caller_id_name VARCHAR(64) DEFAULT NULL,
destination_number VARCHAR(32) DEFAULT NULL,
context VARCHAR(64) DEFAULT NULL,
start_stamp DATETIME NOT NULL,
answer_stamp DATETIME DEFAULT NULL,
end_stamp DATETIME NOT NULL,
duration INT NOT NULL DEFAULT 0,
billsec INT NOT NULL DEFAULT 0,
hangup_cause VARCHAR(64) DEFAULT NULL,
sip_hangup_disposition VARCHAR(64) DEFAULT NULL,
direction ENUM('inbound','outbound','internal') DEFAULT 'inbound',
accountcode VARCHAR(32) DEFAULT NULL,
domain VARCHAR(64) DEFAULT NULL,
recording_path VARCHAR(512) DEFAULT NULL,
PRIMARY KEY (id),
KEY call_uuid_idx (call_uuid),
KEY sip_call_id_idx (sip_call_id),
KEY start_stamp_idx (start_stamp),
KEY caller_idx (caller_id_number),
KEY dest_idx (destination_number),
KEY domain_idx (domain)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(start_stamp) * 100 + MONTH(start_stamp)) (
PARTITION p202601 VALUES LESS THAN (202602),
PARTITION p202602 VALUES LESS THAN (202603),
PARTITION p202603 VALUES LESS THAN (202604),
PARTITION p202604 VALUES LESS THAN (202605),
PARTITION p202605 VALUES LESS THAN (202606),
PARTITION p202606 VALUES LESS THAN (202607),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
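The trunk_routing table encodes "longest matching prefix wins, then highest priority". That selection logic can be sketched in Python for testing routing data offline (carrier-d-london is a hypothetical extra entry added here to show the longest-match rule; the production lookup runs in SQL/Kamailio, not Python):

```python
def select_trunk(number: str, routes):
    """Pick the active trunk with the longest matching prefix;
    break ties on the higher priority value."""
    matches = [r for r in routes
               if r["active"] and number.startswith(r["prefix"])]
    if not matches:
        return None
    return max(matches, key=lambda r: (len(r["prefix"]), r["priority"]))

routes = [
    {"prefix": "+44",   "trunk_name": "carrier-a-uk",     "priority": 10, "active": 1},
    {"prefix": "+44",   "trunk_name": "carrier-b-uk",     "priority": 5,  "active": 1},
    {"prefix": "+4420", "trunk_name": "carrier-d-london", "priority": 1,  "active": 1},
]
print(select_trunk("+442012345678", routes)["trunk_name"])  # carrier-d-london
```

Note that the more specific +4420 route wins even though its priority is lower; priority only ranks trunks that match at the same prefix length.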
Database Users and Permissions
-- Kamailio user (needs read/write on kamailio DB)
CREATE USER 'kamailio'@'10.0.1.%' IDENTIFIED BY 'YOUR_DB_PASSWORD';
GRANT ALL PRIVILEGES ON kamailio.* TO 'kamailio'@'10.0.1.%';
-- FreeSWITCH user (read routing tables, write CDRs)
CREATE USER 'freeswitch'@'10.0.1.%' IDENTIFIED BY 'YOUR_FS_DB_PASSWORD';
GRANT SELECT ON kamailio.did_routing TO 'freeswitch'@'10.0.1.%';
GRANT SELECT ON kamailio.trunk_routing TO 'freeswitch'@'10.0.1.%';
GRANT SELECT ON kamailio.subscriber TO 'freeswitch'@'10.0.1.%';
GRANT INSERT, SELECT ON kamailio.cdr TO 'freeswitch'@'10.0.1.%';
-- Grafana / monitoring (read-only)
CREATE USER 'grafana'@'10.0.1.%' IDENTIFIED BY 'YOUR_GRAFANA_DB_PASSWORD';
GRANT SELECT ON kamailio.* TO 'grafana'@'10.0.1.%';
FLUSH PRIVILEGES;
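Because the cdr table above is partitioned by month, a new partition must be split out of pmax before each month begins, or new rows pile up in pmax. A small Python helper that emits the monthly ALTER statement (table and partition names follow the YEAR*100+MONTH scheme above; scheduling it via cron and executing it against the database are left to you):

```python
def next_partition_ddl(table: str, year: int, month: int) -> str:
    """Generate the DDL that carves a monthly partition out of pmax,
    matching the YEAR(start_stamp)*100 + MONTH(start_stamp) scheme."""
    key = year * 100 + month
    # Upper bound is the following month (roll over December -> January)
    nxt = key + 1 if month < 12 else (year + 1) * 100 + 1
    return (f"ALTER TABLE {table} REORGANIZE PARTITION pmax INTO ("
            f"PARTITION p{key} VALUES LESS THAN ({nxt}), "
            f"PARTITION pmax VALUES LESS THAN MAXVALUE);")

print(next_partition_ddl("cdr", 2026, 7))
```

REORGANIZE PARTITION only rewrites rows already sitting in pmax, so running this ahead of the month boundary keeps it cheap.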
Kamailio: DID-Based Routing from Database
Add this route to kamailio.cfg to look up DID routing from the database before dispatching:
## ---- DID Routing from Database ----
route[DID_ROUTING] {
# Look up the called number (R-URI user) in did_routing table
if (!sql_query("ca", "SELECT destination, dest_type FROM did_routing \
WHERE did='$rU' AND domain='$fd' AND active=1 \
ORDER BY priority DESC LIMIT 1", "ra")) {
xlog("L_ERR", "DID_ROUTING: Database query failed\n");
return;
}
if ($dbr(ra=>rows) > 0) {
$var(destination) = $dbr(ra=>[0,0]);
$var(dest_type) = $dbr(ra=>[0,1]);
xlog("L_INFO", "DID_ROUTING: $rU → $var(destination) ($var(dest_type))\n");
# Rewrite the R-URI with the looked-up destination
# FreeSWITCH will use this to determine what to do
$rU = $var(destination);
# Optionally set a header so FreeSWITCH knows the destination type
append_hf("X-Dest-Type: $var(dest_type)\r\n");
} else {
xlog("L_WARN", "DID_ROUTING: No route found for DID $rU in domain $fd\n");
# Use default routing or reject
sl_send_reply("404", "DID Not Found");
exit;
}
}
Then call this route before dispatching in the INVITE handler:
# Handle INVITE — main call processing
if (is_method("INVITE")) {
setflag(FLT_DLG);
dlg_manage();
route(NATDETECT);
route(DID_ROUTING); # <-- Look up DID first
route(RTPENGINE_OFFER);
route(DISPATCH);
exit;
}
FreeSWITCH: Dynamic User Directory via mod_xml_curl
Instead of static XML user files on each FreeSWITCH, use mod_xml_curl to fetch user configuration from a central HTTP API backed by the database.
Edit /etc/freeswitch/autoload_configs/xml_curl.conf.xml:
<configuration name="xml_curl.conf" description="cURL XML Gateway">
<bindings>
<binding name="directory">
<param name="gateway-url" value="http://YOUR_DB1_IP:8080/freeswitch/directory"/>
<param name="gateway-credentials" value="freeswitch:YOUR_API_PASSWORD"/>
<param name="auth-scheme" value="basic"/>
<param name="timeout" value="5"/>
<param name="disable-100-continue" value="true"/>
<param name="enable-post-mapping" value="false"/>
</binding>
</bindings>
</configuration>
Example Python API that serves user directory XML (runs on the DB server or a separate API server):
#!/usr/bin/env python3
"""
freeswitch_directory_api.py
Serves FreeSWITCH user directory from MariaDB
Run with: uvicorn freeswitch_directory_api:app --host 0.0.0.0 --port 8080
"""
from fastapi import FastAPI, Form, Response
import mysql.connector
app = FastAPI()
DB_CONFIG = {
"host": "YOUR_DB1_IP",
"user": "freeswitch",
"password": "YOUR_FS_DB_PASSWORD",
"database": "kamailio"
}
@app.post("/freeswitch/directory")
async def directory(
section: str = Form(default="directory"),
key_name: str = Form(default=""),
key_value: str = Form(default=""),
user: str = Form(default=""),
domain: str = Form(default=""),
):
"""Return FreeSWITCH directory XML for a user lookup."""
if section != "directory" or not user or not domain:
        return Response(
            content='<?xml version="1.0"?><document type="freeswitch/xml"><section name="result"><result status="not found"/></section></document>',
            media_type="text/xml"
        )
# Look up user in subscriber table
conn = mysql.connector.connect(**DB_CONFIG)
cursor = conn.cursor(dictionary=True)
cursor.execute(
"SELECT username, password, domain FROM subscriber WHERE username=%s AND domain=%s",
(user, domain)
)
row = cursor.fetchone()
cursor.close()
conn.close()
    if not row:
        # The canonical mod_xml_curl "not found" response, so FreeSWITCH
        # does not treat the reply as a valid (empty) directory
        return Response(
            content='<?xml version="1.0"?><document type="freeswitch/xml"><section name="result"><result status="not found"/></section></document>',
            media_type="text/xml"
        )
xml = f'''<?xml version="1.0" encoding="UTF-8"?>
<document type="freeswitch/xml">
<section name="directory">
<domain name="{domain}">
<user id="{row["username"]}">
<params>
<param name="password" value="{row["password"]}"/>
<param name="vm-password" value="{row["password"]}"/>
</params>
<variables>
<variable name="accountcode" value="{row["username"]}"/>
<variable name="user_context" value="from-kamailio"/>
<variable name="effective_caller_id_name" value="{row["username"]}"/>
<variable name="effective_caller_id_number" value="{row["username"]}"/>
</variables>
</user>
</domain>
</section>
</document>'''
return Response(content=xml, media_type="text/xml")
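One caveat with the f-string above: values pulled from the database are interpolated into the XML verbatim, so a password containing &, < or a quote would produce an invalid document. A small escaping helper keeps the output well-formed (the xml_attr name is our own, not part of any library):

```python
from xml.sax.saxutils import escape

def xml_attr(value):
    """Escape a value for safe interpolation into an XML attribute."""
    return escape(str(value), {'"': "&quot;", "'": "&apos;"})

# A password containing XML metacharacters no longer breaks the document
print(f'<param name="password" value="{xml_attr("p&ss<word>")}"/>')
# → <param name="password" value="p&amp;ss&lt;word&gt;"/>
```

Wrap every `{row[...]}` interpolation in the directory XML with this helper.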
Multi-Tenant Routing
For multi-tenant deployments, use the SIP domain to isolate tenants:
-- Tenant A: company-a.example.com
INSERT INTO did_routing (did, domain, destination, dest_type) VALUES
('+442012345678', 'company-a.example.com', '2000', 'ivr'),
('+442012345679', 'company-a.example.com', '3001', 'queue');
-- Tenant B: company-b.example.com
INSERT INTO did_routing (did, domain, destination, dest_type) VALUES
('+442087654321', 'company-b.example.com', '2000', 'ivr'),
('+442087654322', 'company-b.example.com', '3001', 'queue');
Kamailio routes based on $fd (the From domain) or $rd (the Request-URI domain), and FreeSWITCH uses the domain in its user directory lookups. The same extension number (2000) can map to completely different IVRs for each tenant.
9. WebRTC Gateway
Architecture for WebRTC
Browser (WebRTC) SIP Trunk
│ │
│ WSS (SIP over WebSocket) │ UDP/TCP SIP
│ DTLS-SRTP (encrypted media) │ RTP (unencrypted)
▼ ▼
┌──────────┐ SIP ┌──────────┐ SIP ┌──────────┐
│ Kamailio │◄────────►│ RTPEngine│◄────────►│FreeSWITCH│
│ (WSS) │ │(DTLS↔RTP)│ │ (media) │
└──────────┘ └──────────┘ └──────────┘
Kamailio: Terminates WebSocket, handles SIP-over-WS
RTPEngine: Bridges DTLS-SRTP (WebRTC) ↔ plain RTP (FreeSWITCH/trunks)
FreeSWITCH: Processes calls normally (does not know about WebRTC)
TLS Certificates (Let's Encrypt Wildcard)
# Install certbot with DNS plugin (for wildcard certs)
apt-get install -y certbot python3-certbot-dns-cloudflare
# Create credentials file (example for Cloudflare DNS)
mkdir -p /root/.secrets
cat > /root/.secrets/cloudflare.ini << 'EOF'
dns_cloudflare_api_token = YOUR_CLOUDFLARE_API_TOKEN
EOF
chmod 600 /root/.secrets/cloudflare.ini
# Get wildcard certificate
certbot certonly \
--dns-cloudflare \
--dns-cloudflare-credentials /root/.secrets/cloudflare.ini \
-d "*.YOUR_DOMAIN" \
-d "YOUR_DOMAIN" \
--agree-tos \
-m admin@YOUR_DOMAIN
# Link for Kamailio
ln -sf /etc/letsencrypt/live/YOUR_DOMAIN/fullchain.pem /etc/kamailio/tls/server.pem
ln -sf /etc/letsencrypt/live/YOUR_DOMAIN/privkey.pem /etc/kamailio/tls/server.key
# Link for RTPEngine (DTLS)
mkdir -p /etc/rtpengine/tls
ln -sf /etc/letsencrypt/live/YOUR_DOMAIN/fullchain.pem /etc/rtpengine/tls/cert.pem
ln -sf /etc/letsencrypt/live/YOUR_DOMAIN/privkey.pem /etc/rtpengine/tls/key.pem
# Auto-renewal cron (reload services after renewal)
cat > /etc/letsencrypt/renewal-hooks/deploy/reload-voip.sh << 'SCRIPT'
#!/bin/bash
systemctl reload kamailio 2>/dev/null || true
systemctl restart rtpengine 2>/dev/null || true
SCRIPT
chmod +x /etc/letsencrypt/renewal-hooks/deploy/reload-voip.sh
Kamailio WSS Configuration
The WebSocket handling is already in the main kamailio.cfg from Section 4. Key pieces:
# Listeners (already defined)
listen=tls:MY_PUBLIC_IP:8443 # WSS direct
# WebSocket module (already loaded)
loadmodule "websocket.so"
loadmodule "xhttp.so"
# xhttp event route handles the WebSocket upgrade (already defined)
event_route[xhttp:request] { ... }
Additional WebSocket-specific routing logic to add in the main request_route:
# ---- WebRTC-specific handling ----
if (proto == WS || proto == WSS) {
# Force record-route with WebSocket transport
if (is_method("INVITE|SUBSCRIBE")) {
record_route_preset("MY_PUBLIC_IP:8443;transport=wss");
}
# WebRTC clients use SIP Outbound (RFC 5626)
if (is_method("REGISTER")) {
# Add Path header so replies find the WebSocket connection
add_path_received();
}
}
RTPEngine DTLS Configuration
RTPEngine generates its own self-signed DTLS certificate at startup, and for WebRTC that is sufficient: browsers validate the DTLS handshake against the fingerprint published in the SDP (a=fingerprint), not against a CA, so a Let's Encrypt certificate is not required for media (the certificate symlinks created earlier for RTPEngine are therefore optional). What matters is that Kamailio requests the right profile per call leg via the rtpengine_offer/rtpengine_answer flags: the WebRTC leg needs ICE and DTLS-SRTP (e.g., ICE=force transport-protocol=UDP/TLS/RTP/SAVPF), while the FreeSWITCH leg gets plain RTP (e.g., ICE=remove transport-protocol=RTP/AVP).
Nginx Reverse Proxy for WSS
For production, put Nginx in front of Kamailio for WSS. This provides proper TLS termination, HTTP/2, and the ability to serve the web client from the same domain:
# /etc/nginx/sites-available/webrtc-gateway
upstream kamailio_wss {
server YOUR_KAM1_PRIVATE:8080; # WS (unencrypted) — Nginx handles TLS
server YOUR_KAM2_PRIVATE:8080 backup;
}
server {
listen 443 ssl http2;
server_name webrtc.YOUR_DOMAIN;
ssl_certificate /etc/letsencrypt/live/YOUR_DOMAIN/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/YOUR_DOMAIN/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
# WebSocket proxy to Kamailio
location /ws {
proxy_pass http://kamailio_wss;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_read_timeout 3600s;
proxy_send_timeout 3600s;
}
# Serve the WebRTC web client
location / {
root /var/www/webrtc;
index index.html;
}
}
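Behind this proxy, the WebSocket upgrade itself is a deterministic handshake defined by RFC 6455: the server echoes back a SHA-1/Base64 transform of the client's Sec-WebSocket-Key. A short sketch of that computation, handy when debugging failed upgrades by hand:

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed GUID from RFC 6455

def ws_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept header value for a client key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()

# Test vector straight from RFC 6455 section 1.3
print(ws_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

If the Accept value Kamailio returns does not match this transform of the key the browser sent, something in the proxy chain is rewriting the headers.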
Browser Client — SIP.js Example
Create /var/www/webrtc/index.html:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>WebRTC Phone</title>
<script src="https://cdn.jsdelivr.net/npm/sip.js/lib/platform/web/sip.js"></script> <!-- pin a tested SIP.js version in production -->
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
max-width: 500px;
margin: 50px auto;
padding: 20px;
background: #1a1a2e;
color: #e0e0e0;
}
h1 { color: #00d4ff; text-align: center; }
.status {
text-align: center;
padding: 10px;
margin: 20px 0;
border-radius: 8px;
background: #16213e;
}
.status.connected { border-left: 4px solid #00ff88; }
.status.disconnected { border-left: 4px solid #ff4444; }
.status.calling { border-left: 4px solid #ffaa00; }
input, button {
width: 100%;
padding: 12px;
margin: 5px 0;
border: none;
border-radius: 6px;
font-size: 16px;
box-sizing: border-box;
}
input {
background: #16213e;
color: #e0e0e0;
border: 1px solid #333;
}
button {
cursor: pointer;
font-weight: bold;
}
.btn-call { background: #00ff88; color: #000; }
.btn-hangup { background: #ff4444; color: #fff; }
.btn-answer { background: #00d4ff; color: #000; }
.btn-register { background: #9b59b6; color: #fff; }
button:hover { opacity: 0.9; }
button:disabled { opacity: 0.4; cursor: not-allowed; }
.controls { margin: 20px 0; }
audio { display: none; }
</style>
</head>
<body>
<h1>WebRTC Phone</h1>
<div id="status" class="status disconnected">Disconnected</div>
<div class="controls">
<input type="text" id="server" placeholder="WSS Server" value="wss://webrtc.YOUR_DOMAIN/ws">
<input type="text" id="username" placeholder="SIP Username (e.g., 1001)">
<input type="password" id="password" placeholder="SIP Password">
<input type="text" id="domain" placeholder="SIP Domain" value="YOUR_DOMAIN">
<button class="btn-register" onclick="doRegister()">Register</button>
</div>
<div class="controls">
<input type="text" id="target" placeholder="Number to call">
<button class="btn-call" id="btnCall" onclick="doCall()" disabled>Call</button>
<button class="btn-answer" id="btnAnswer" onclick="doAnswer()" disabled>Answer</button>
<button class="btn-hangup" id="btnHangup" onclick="doHangup()" disabled>Hang Up</button>
</div>
<audio id="remoteAudio" autoplay></audio>
<script>
let userAgent = null;
let registerer = null;
let currentSession = null;
function setStatus(text, className) {
const el = document.getElementById('status');
el.textContent = text;
el.className = 'status ' + className;
}
async function doRegister() {
const server = document.getElementById('server').value;
const username = document.getElementById('username').value;
const password = document.getElementById('password').value;
const domain = document.getElementById('domain').value;
const uri = SIP.UserAgent.makeURI(`sip:${username}@${domain}`);
const transportOptions = {
server: server,
traceSip: true
};
userAgent = new SIP.UserAgent({
uri: uri,
transportOptions: transportOptions,
authorizationUsername: username,
authorizationPassword: password,
displayName: username,
delegate: {
onInvite: (invitation) => {
currentSession = invitation;
setStatus('Incoming call from ' + invitation.remoteIdentity.displayName, 'calling');
document.getElementById('btnAnswer').disabled = false;
document.getElementById('btnHangup').disabled = false;
}
}
});
await userAgent.start();
registerer = new SIP.Registerer(userAgent);
registerer.stateChange.addListener((state) => {
switch (state) {
case SIP.RegistererState.Registered:
setStatus('Registered as ' + username, 'connected');
document.getElementById('btnCall').disabled = false;
break;
case SIP.RegistererState.Unregistered:
setStatus('Unregistered', 'disconnected');
document.getElementById('btnCall').disabled = true;
break;
}
});
await registerer.register();
}
async function doCall() {
const target = document.getElementById('target').value;
const domain = document.getElementById('domain').value;
if (!target || !userAgent) return;
const targetURI = SIP.UserAgent.makeURI(`sip:${target}@${domain}`);
if (!targetURI) {
alert('Invalid target');
return;
}
const inviter = new SIP.Inviter(userAgent, targetURI, {
sessionDescriptionHandlerOptions: {
constraints: { audio: true, video: false }
}
});
currentSession = inviter;
setupSessionListeners(inviter);
setStatus('Calling ' + target + '...', 'calling');
document.getElementById('btnHangup').disabled = false;
document.getElementById('btnCall').disabled = true;
await inviter.invite();
}
async function doAnswer() {
    if (!currentSession) return;
    // Attach state listeners BEFORE accepting, so the transition
    // to Established is not missed
    setupSessionListeners(currentSession);
    await currentSession.accept({
        sessionDescriptionHandlerOptions: {
            constraints: { audio: true, video: false }
        }
    });
    setStatus('In call', 'connected');
    document.getElementById('btnAnswer').disabled = true;
}
function doHangup() {
if (!currentSession) return;
switch (currentSession.state) {
case SIP.SessionState.Initial:
case SIP.SessionState.Establishing:
if (currentSession instanceof SIP.Inviter) {
currentSession.cancel();
} else {
currentSession.reject();
}
break;
case SIP.SessionState.Established:
currentSession.bye();
break;
}
resetCallUI();
}
function setupSessionListeners(session) {
session.stateChange.addListener((state) => {
switch (state) {
case SIP.SessionState.Established:
setStatus('In call', 'connected');
// Attach remote audio
const remoteStream = new MediaStream();
session.sessionDescriptionHandler.peerConnection
.getReceivers()
.forEach((receiver) => {
if (receiver.track) {
remoteStream.addTrack(receiver.track);
}
});
document.getElementById('remoteAudio').srcObject = remoteStream;
break;
case SIP.SessionState.Terminated:
setStatus('Call ended', 'disconnected');
resetCallUI();
break;
}
});
}
function resetCallUI() {
currentSession = null;
document.getElementById('btnCall').disabled = false;
document.getElementById('btnAnswer').disabled = true;
document.getElementById('btnHangup').disabled = true;
document.getElementById('remoteAudio').srcObject = null;
setTimeout(() => setStatus('Registered', 'connected'), 2000);
}
</script>
</body>
</html>
Testing WebRTC
- Open https://webrtc.YOUR_DOMAIN/ in Chrome or Firefox
- Enter your SIP credentials and click Register
- Status should change to "Registered"
- Enter a number (e.g., 9196 for an echo test) and click Call
- Verify audio flows in both directions
Debugging WebRTC issues:
# On Kamailio — watch WebSocket connections
kamcmd ws.dump
# On RTPEngine — check DTLS sessions
rtpengine-ctl list sessions
# On RTPEngine — verify DTLS is working
# Look for "DTLS" in the session details
rtpengine-ctl list totals
# Browser — check WebRTC internals
# Chrome: chrome://webrtc-internals/
# Firefox: about:webrtc
10. High Availability — Kamailio
Keepalived + Virtual IP (VIP)
The Kamailio HA pair uses Keepalived to manage a floating Virtual IP (VIP). The active node holds the VIP and processes all traffic. If it fails, the standby node takes over the VIP within seconds.
Normal operation:
VIP (YOUR_PUBLIC_VIP) → Kamailio-A (active)
Kamailio-B (standby, idle)
After Kamailio-A failure:
VIP (YOUR_PUBLIC_VIP) → Kamailio-B (now active)
Kamailio-A (down)
Failover time: 3-6 seconds (VRRP advertisement interval + detection)
Install Keepalived
# On both kam01 and kam02
apt-get install -y keepalived
Health Check Script
Create /etc/keepalived/check_kamailio.sh:
#!/bin/bash
#
# Kamailio health check for Keepalived
# Returns 0 (healthy) or 1 (unhealthy)
# Tests actual SIP responsiveness, not just process existence
#
# Check 1: Is the process running?
if ! pgrep -x kamailio > /dev/null 2>&1; then
echo "FAIL: Kamailio process not running"
exit 1
fi
# Check 2: Can it respond to SIP OPTIONS?
# Send an OPTIONS request to the local listener and expect a reply
if ! sipsak -s sip:[email protected]:5060 > /dev/null 2>&1; then
    echo "FAIL: Kamailio not responding to SIP OPTIONS"
    exit 1
fi
# Check 3: Check that the control socket is responsive
if ! kamcmd core.uptime > /dev/null 2>&1; then
echo "FAIL: Kamailio RPC not responding"
exit 1
fi
# Check 4: Verify at least one dispatcher destination is active
ACTIVE=$(kamcmd dispatcher.list 2>/dev/null | grep -c "FLAGS: AP")
if [ "$ACTIVE" -eq 0 ]; then
echo "WARN: No active dispatcher destinations (not failing over for this)"
# Don't fail for this — it might be a temporary condition
# and failing over won't help if all FS servers are down
fi
echo "OK: Kamailio healthy (${ACTIVE} active dispatchers)"
exit 0
chmod +x /etc/keepalived/check_kamailio.sh
apt-get install -y sipsak # Needed for the health check
Keepalived Configuration — Active Node (kam01)
Create /etc/keepalived/keepalived.conf on kam01:
# /etc/keepalived/keepalived.conf — Kamailio-A (MASTER)
global_defs {
router_id KAM01
script_user root
enable_script_security
# Notification emails (optional)
# notification_email {
# admin@YOUR_DOMAIN
# }
# notification_email_from keepalived@kam01
# smtp_server localhost
}
# Health check script
vrrp_script check_kamailio {
script "/etc/keepalived/check_kamailio.sh"
interval 3 # Check every 3 seconds
weight -20 # Subtract 20 from priority on failure
fall 2 # 2 consecutive failures = unhealthy
rise 2 # 2 consecutive successes = healthy
}
# VRRP instance for SIP VIP
vrrp_instance VI_SIP {
state MASTER
interface eth0 # Change to your network interface
virtual_router_id 51 # Must be same on both nodes
priority 100 # Higher = preferred (kam01 is preferred)
advert_int 1 # VRRP advertisement every 1 second
authentication {
auth_type PASS
auth_pass YOUR_VRRP_PASSWORD # Same on both nodes
}
virtual_ipaddress {
YOUR_PUBLIC_VIP/32 dev eth0 # The floating VIP
}
track_script {
check_kamailio
}
# Notify scripts (optional — for logging/alerting)
notify_master "/bin/bash -c 'logger -t keepalived MASTER — VIP acquired on kam01'"
notify_backup "/bin/bash -c 'logger -t keepalived BACKUP — VIP released on kam01'"
notify_fault "/bin/bash -c 'logger -t keepalived FAULT — health check failing on kam01'"
}
Keepalived Configuration — Standby Node (kam02)
Create /etc/keepalived/keepalived.conf on kam02 (differences highlighted):
# /etc/keepalived/keepalived.conf — Kamailio-B (BACKUP)
global_defs {
router_id KAM02
script_user root
enable_script_security
}
vrrp_script check_kamailio {
script "/etc/keepalived/check_kamailio.sh"
interval 3
weight -20
fall 2
rise 2
}
vrrp_instance VI_SIP {
state BACKUP # <-- BACKUP (not MASTER)
interface eth0
virtual_router_id 51 # Must match kam01
priority 90 # <-- Lower priority (kam01 preferred)
advert_int 1
authentication {
auth_type PASS
auth_pass YOUR_VRRP_PASSWORD # Must match kam01
}
virtual_ipaddress {
YOUR_PUBLIC_VIP/32 dev eth0
}
track_script {
check_kamailio
}
notify_master "/bin/bash -c 'logger -t keepalived MASTER — VIP acquired on kam02'"
notify_backup "/bin/bash -c 'logger -t keepalived BACKUP — VIP released on kam02'"
notify_fault "/bin/bash -c 'logger -t keepalived FAULT — health check failing on kam02'"
}
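The priority arithmetic behind this pair is worth spelling out: with weight -20, a failing health check drops kam01's effective priority from 100 to 80, below kam02's 90, and that difference is what triggers the VIP move. A quick model of the rule:

```python
def effective_priority(base, check_healthy, weight=-20):
    """Keepalived applies the track_script weight while the check fails."""
    return base if check_healthy else base + weight

KAM01, KAM02 = 100, 90  # configured priorities from the two configs above

# Healthy: kam01 (100) outbids kam02 (90) and holds the VIP
assert effective_priority(KAM01, True) > effective_priority(KAM02, True)

# kam01's check fails twice (fall 2): 100 - 20 = 80 < 90, kam02 takes over
assert effective_priority(KAM01, False) < effective_priority(KAM02, True)
print("failover arithmetic holds:", effective_priority(KAM01, False), "< 90")
```

This is also why the weight must exceed the priority gap (20 > 10 here); a weight of -5 would never let the backup win.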
Start Keepalived
# On both nodes
systemctl enable --now keepalived
# Verify VIP is on kam01 (the master)
ip addr show eth0 | grep YOUR_PUBLIC_VIP
# Check keepalived status
systemctl status keepalived
journalctl -u keepalived -f
# Test failover: stop Kamailio on kam01
systemctl stop kamailio
# Within 3-6 seconds, VIP should move to kam02:
# On kam02: ip addr show eth0 | grep YOUR_PUBLIC_VIP
# Restore kam01
systemctl start kamailio
# VIP moves back to kam01 (higher priority, preemption)
Shared Location Table (usrloc to DB)
For seamless failover of registered users, both Kamailio nodes must share the location table in the database. This is already configured in our kamailio.cfg:
modparam("usrloc", "db_url", DBURL)
modparam("usrloc", "db_mode", 3) # DB-only: the database is the single source of truth
With db_mode=3, every registration is written straight to the database and every lookup reads from it, so when kam01 fails and kam02 takes over, kam02 sees exactly the same location data and can route calls to registered users without re-registration. (Note the usrloc terminology: db_mode=1 is write-through and db_mode=2 is write-back with a flush timer. With either of those, each node serves lookups from its own in-memory cache, so a standby would not see registrations made on the active node after it started.)
Important: db_mode=3 sends a database query for every location lookup, so it carries more DB load than the cached modes. For very high volumes (100K+ registered users), make sure the location table is properly indexed, or keep per-node caching (db_mode=2) and replicate registrations between the nodes with the dmq_usrloc module instead.
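Whichever db_mode you pick, the steady-state registration write load is easy to estimate: each registered user re-REGISTERs once per expiry interval, and each re-REGISTER that is persisted immediately costs one DB write. A rough calculator with illustrative numbers:

```python
def registration_writes_per_sec(registered_users, expires_sec):
    """Steady-state DB writes/sec when every re-REGISTER is persisted."""
    return registered_users / expires_sec

# 10,000 users on a typical 3600s expiry: negligible load
print(registration_writes_per_sec(10_000, 3600))   # ≈ 2.8 writes/sec

# 100,000 users on an aggressive 300s expiry: plan DB capacity for this
print(registration_writes_per_sec(100_000, 300))   # ≈ 333 writes/sec
```

Hundreds of small writes per second are comfortable for a local MariaDB node but add up quickly once Galera WAN replication is involved (see Section 12).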
Dialog Replication with DMQ
For in-progress calls to survive a failover, Kamailio supports Dialog replication between nodes using the DMQ (Distributed Message Queue) module. This replicates dialog state so the standby node can handle in-dialog requests (BYE, re-INVITE) for calls that were set up by the active node.
Add to kamailio.cfg:
# Load DMQ module
loadmodule "dmq.so"
# DMQ parameters
modparam("dmq", "server_address", "sip:MY_PRIVATE_IP:5062")
modparam("dmq", "notification_address", "sip:10.0.1.10:5062") # Use kam01 as notification peer
modparam("dmq", "multi_notify", 1)
modparam("dmq", "num_workers", 4)
modparam("dmq", "ping_interval", 15)
# Add DMQ listener
listen=udp:MY_PRIVATE_IP:5062
# Enable dialog replication via DMQ
modparam("dialog", "enable_dmq", 1)
Add DMQ routing in the main request_route:
# DMQ traffic — handle before anything else
if (is_method("KDMQ") && $Rp == 5062) {
dmq_handle_message();
exit;
}
With DMQ active, both Kamailio nodes maintain synchronized dialog state. During failover, in-progress calls continue working because the new active node has the complete dialog table.
11. High Availability — FreeSWITCH
Why FreeSWITCH HA Is Different
Unlike Kamailio (which is a stateless proxy that can easily share state via database), FreeSWITCH is a stateful media server — it holds active call sessions, media streams, and application state in memory. This makes traditional active/standby HA impractical for FreeSWITCH.
Instead, FreeSWITCH HA relies on a pool architecture:
- Multiple FreeSWITCH instances run simultaneously (not standby — all active)
- Kamailio's dispatcher distributes calls across the pool
- If one FreeSWITCH fails, only its active calls are lost (not the entire platform)
- New calls are automatically routed to surviving instances
- The more instances in the pool, the smaller the blast radius of any single failure
Blast Radius Analysis
| Pool Size | Calls per FS (at 3000 total) | Impact of 1 FS Failure |
|---|---|---|
| 2 instances | 1,500 each | 50% of calls lost |
| 3 instances | 1,000 each | 33% of calls lost |
| 4 instances | 750 each | 25% of calls lost |
| 6 instances | 500 each | 17% of calls lost |
With 4+ instances, a single failure affects a manageable percentage of calls, and the surviving instances have enough headroom to absorb the redistributed load.
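The table's arithmetic, plus the headroom question it raises, can be captured in a few lines:

```python
def blast_radius(pool_size, failed=1):
    """Return (fraction of calls lost, post-failure load multiplier
    on each surviving instance) for a uniformly loaded pool."""
    lost = failed / pool_size
    survivor_factor = pool_size / (pool_size - failed)
    return lost, survivor_factor

for n in (2, 3, 4, 6):
    lost, factor = blast_radius(n)
    print(f"{n} instances: {lost:.0%} lost, survivors at {factor:.2f}x load")
```

The second number is the one that sizes the pool: with 4 instances, each survivor must absorb 1.33x its normal load, so every node should run below ~75% of capacity in normal operation.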
Graceful Draining — Zero-Downtime Maintenance
The key to zero-downtime FreeSWITCH maintenance is draining: stop sending new calls to a node while letting existing calls finish naturally.
#!/bin/bash
# drain-freeswitch.sh — Gracefully drain a FreeSWITCH instance
# Usage: ./drain-freeswitch.sh fs01 YOUR_FS1_IP
FS_NAME=$1
FS_IP=$2
KAM_HOST="YOUR_KAM1_PRIVATE"
echo "=== Draining FreeSWITCH: $FS_NAME ($FS_IP) ==="
# Step 1: Mark as inactive in Kamailio dispatcher (no new calls)
echo "Step 1: Removing from dispatcher..."
ssh $KAM_HOST "kamcmd dispatcher.set_state i 1 sip:${FS_IP}:5060"
echo " Done. No new calls will be sent to $FS_NAME."
# Step 2: Wait for existing calls to finish
echo "Step 2: Waiting for active calls to finish..."
while true; do
CALLS=$(ssh $FS_IP "fs_cli -x 'show calls count' 2>/dev/null" | grep -oP '\d+(?= total)')
CALLS=${CALLS:-0}
echo " Active calls: $CALLS"
if [ "$CALLS" -eq 0 ]; then
break
fi
sleep 10
done
echo " All calls finished."
# Step 3: Now safe to perform maintenance
echo "Step 3: $FS_NAME is fully drained. Safe to stop/upgrade."
echo ""
echo " When done, re-enable with:"
echo " ssh $KAM_HOST 'kamcmd dispatcher.set_state a 1 sip:${FS_IP}:5060'"
Zero-Downtime Upgrade Procedure
#!/bin/bash
# upgrade-freeswitch.sh — Zero-downtime FreeSWITCH upgrade
# Upgrades one instance at a time (rolling upgrade)
INSTANCES=("fs01:YOUR_FS1_IP" "fs02:YOUR_FS2_IP" "fs03:YOUR_FS3_IP")
KAM_HOST="YOUR_KAM1_PRIVATE"
for instance in "${INSTANCES[@]}"; do
IFS=':' read -r name ip <<< "$instance"
echo "============================================"
echo "Upgrading $name ($ip)"
echo "============================================"
# 1. Drain
echo " Draining..."
ssh $KAM_HOST "kamcmd dispatcher.set_state i 1 sip:${ip}:5060"
# Wait for calls to finish (max 30 minutes)
TIMEOUT=1800
ELAPSED=0
while [ $ELAPSED -lt $TIMEOUT ]; do
CALLS=$(ssh $ip "fs_cli -x 'show calls count' 2>/dev/null" | grep -oP '\d+(?= total)')
CALLS=${CALLS:-0}
if [ "$CALLS" -eq 0 ]; then break; fi
echo " $CALLS calls remaining (${ELAPSED}s elapsed)..."
sleep 15
ELAPSED=$((ELAPSED + 15))
done
# 2. Stop FreeSWITCH
echo " Stopping FreeSWITCH..."
ssh $ip "systemctl stop freeswitch"
# 3. Upgrade
echo " Upgrading..."
ssh $ip "apt-get update && apt-get upgrade -y freeswitch*"
# 4. Start FreeSWITCH
echo " Starting FreeSWITCH..."
ssh $ip "systemctl start freeswitch"
sleep 5 # Wait for SIP profile to register
# 5. Verify it responds
echo " Verifying..."
ssh $ip "fs_cli -x 'sofia status'" || { echo "FAILED to start $name!"; exit 1; }
# 6. Re-enable in dispatcher
echo " Re-enabling in dispatcher..."
ssh $KAM_HOST "kamcmd dispatcher.set_state a 1 sip:${ip}:5060"
echo " $name upgraded successfully."
echo ""
# Wait before upgrading next instance (let it stabilize)
sleep 30
done
echo "All instances upgraded. Verifying dispatcher state..."
ssh $KAM_HOST "kamcmd dispatcher.list"
Shared Storage for Recordings
FreeSWITCH call recordings need to be accessible regardless of which instance handled the call. Options:
Option A: NFS (simplest)
# On NFS server (db01 or dedicated storage)
apt-get install -y nfs-kernel-server
mkdir -p /srv/recordings
chown freeswitch:freeswitch /srv/recordings
echo "/srv/recordings 10.0.1.0/24(rw,sync,no_subtree_check,no_root_squash)" >> /etc/exports
exportfs -ra
# On each FreeSWITCH server
apt-get install -y nfs-common
mkdir -p /var/lib/freeswitch/recordings
echo "YOUR_DB1_IP:/srv/recordings /var/lib/freeswitch/recordings nfs defaults,soft,timeo=50 0 0" >> /etc/fstab
mount -a
Option B: S3-compatible storage (scalable)
Create a post-recording script that uploads to S3:
#!/bin/bash
# /usr/local/bin/upload-recording.sh
# Called by FreeSWITCH after each recording completes
FILE=$1
BUCKET="s3://your-recordings-bucket"
if [ -f "$FILE" ]; then
aws s3 cp "$FILE" "$BUCKET/$(date +%Y/%m/%d)/$(basename $FILE)" \
--storage-class STANDARD_IA
# Optionally delete local file after upload
# rm -f "$FILE"
fi
Session Recovery Limitations
It is important to understand what FreeSWITCH HA cannot do:
- Active calls on a failed node are lost. The media streams are in that instance's memory and cannot be transferred.
- Conference bridges on a failed node are terminated. All participants must rejoin.
- Voicemail sessions in progress are lost. The caller must call back.
These limitations are inherent to any media server. The mitigation is to have enough pool instances that the blast radius of any single failure is acceptable. For critical applications (emergency services, etc.), consider having callers automatically redialed by the application layer when a session is lost.
12. Geographic Distribution
Multi-DC Architecture
                       ┌─────────────────────┐
                       │  Global DNS (SRV)   │
                       │   sip.YOUR_DOMAIN   │
                       └──────────┬──────────┘
                                  │
             ┌────────────────────┼────────────────────┐
             │                    │                    │
    ┌────────▼───────┐   ┌────────▼───────┐   ┌────────▼───────┐
    │   DC Europe    │   │  DC US-East    │   │  DC US-West    │
    │   (London)     │   │  (Virginia)    │   │   (Oregon)     │
    │                │   │                │   │                │
    │  Kam+FS+RTP    │   │  Kam+FS+RTP    │   │  Kam+FS+RTP    │
    │  Galera node   │   │  Galera node   │   │  Galera node   │
    └────────┬───────┘   └────────┬───────┘   └────────┬───────┘
             │                    │                    │
             └────────────────────┼────────────────────┘
                                  │
                       ┌──────────▼──────────┐
                       │ Galera WAN Cluster  │
                       │ (sync, segmented)   │
                       └─────────────────────┘
DNS SRV Records
DNS SRV records allow SIP clients to discover your servers and automatically failover between data centers:
; NAPTR records — tell SIP clients which transports are available
YOUR_DOMAIN. IN NAPTR 10 10 "S" "SIP+D2U" "" _sip._udp.YOUR_DOMAIN.
YOUR_DOMAIN. IN NAPTR 20 10 "S" "SIP+D2T" "" _sip._tcp.YOUR_DOMAIN.
YOUR_DOMAIN. IN NAPTR 30 10 "S" "SIPS+D2T" "" _sips._tcp.YOUR_DOMAIN.
; SRV records — specify servers and priorities per transport
; Lower priority number = preferred. Same priority = load balance by weight.
; UDP SIP
_sip._udp.YOUR_DOMAIN. IN SRV 10 60 5060 sip-eu.YOUR_DOMAIN. ; EU primary
_sip._udp.YOUR_DOMAIN. IN SRV 10 40 5060 sip-us.YOUR_DOMAIN. ; US secondary
_sip._udp.YOUR_DOMAIN. IN SRV 20 50 5060 sip-eu2.YOUR_DOMAIN. ; EU backup
_sip._udp.YOUR_DOMAIN. IN SRV 20 50 5060 sip-us2.YOUR_DOMAIN. ; US backup
; TCP SIP
_sip._tcp.YOUR_DOMAIN. IN SRV 10 60 5060 sip-eu.YOUR_DOMAIN.
_sip._tcp.YOUR_DOMAIN. IN SRV 10 40 5060 sip-us.YOUR_DOMAIN.
; TLS SIP
_sips._tcp.YOUR_DOMAIN. IN SRV 10 60 5061 sip-eu.YOUR_DOMAIN.
_sips._tcp.YOUR_DOMAIN. IN SRV 10 40 5061 sip-us.YOUR_DOMAIN.
; A records for each SIP edge
sip-eu.YOUR_DOMAIN. IN A YOUR_EU_VIP
sip-us.YOUR_DOMAIN. IN A YOUR_US_VIP
sip-eu2.YOUR_DOMAIN. IN A YOUR_EU2_VIP
sip-us2.YOUR_DOMAIN. IN A YOUR_US2_VIP
How SIP clients use SRV records:
- The client resolves _sip._udp.YOUR_DOMAIN and gets two records with priority 10
- The client distributes requests by weight: roughly 60% to EU, 40% to US
- If both priority-10 servers fail, the client falls back to the priority-20 servers
- The client then sends its SIP requests (REGISTER, INVITE) directly to the resolved host and port
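The priority-then-weight selection described above is specified in RFC 2782: try the lowest priority group first, and within that group pick targets weighted-randomly. A simplified sketch of how a client chooses:

```python
import random

def select_srv(records):
    """Pick one SRV target: lowest priority group first, then
    weighted-random within the group (simplified RFC 2782 logic)."""
    best_priority = min(prio for prio, _, _ in records)
    group = [(w, target) for prio, w, target in records if prio == best_priority]
    total = sum(w for w, _ in group)
    pick = random.uniform(0, total)
    for weight, target in group:
        pick -= weight
        if pick <= 0:
            return target
    return group[-1][1]

records = [
    (10, 60, "sip-eu.example.com"),   # priority 10, weight 60
    (10, 40, "sip-us.example.com"),   # priority 10, weight 40
    (20, 50, "sip-eu2.example.com"),  # only used if all priority-10 fail
    (20, 50, "sip-us2.example.com"),
]
# Over many selections the split approaches 60/40; priority-20 is never chosen
picks = [select_srv(records) for _ in range(10_000)]
print(picks.count("sip-eu.example.com") / len(picks))  # ≈ 0.6
```

Failover to the priority-20 group happens only when the client has exhausted every priority-10 target, which is why backups belong in their own priority tier rather than with a tiny weight.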
Geographic Routing with GeoIP
Kamailio can use the GeoIP2 module to route calls based on the geographic location of the caller:
# Load GeoIP2 module
loadmodule "geoip2.so"
modparam("geoip2", "path", "/usr/share/GeoIP/GeoLite2-City.mmdb")
# Geographic routing route
route[GEO_ROUTE] {
# Look up caller's country
if (geoip2_match("$si", "src")) {
$var(country) = $gip2(src=>cc);
$var(continent) = $gip2(src=>cont);
xlog("L_INFO", "GEO: Caller from $si — country=$var(country), continent=$var(continent)\n");
# Route to closest DC based on continent
switch ($var(continent)) {
case "EU":
# European callers → EU FreeSWITCH pool (set 10)
if (!ds_select_dst("10", "0", "6")) {
# Fallback to US pool
ds_select_dst("20", "0", "6");
}
break;
case "NA":
# North American callers → US-East pool (set 20)
if (!ds_select_dst("20", "0", "6")) {
ds_select_dst("10", "0", "6");
}
break;
default:
# Everyone else → round-robin across all DCs
ds_select_dst("1", "4", "6");
break;
}
} else {
# GeoIP lookup failed — use default pool
ds_select_dst("1", "0", "6");
}
}
Database Replication Across Data Centers
For multi-DC deployments, use MariaDB Galera with WAN replication:
# On each Galera node, add WAN-specific settings:
[galera]
wsrep_cluster_address = "gcomm://EU_DB_IP,US_EAST_DB_IP,US_WEST_DB_IP"
# WAN optimizations AND the segment setting must go in a SINGLE
# wsrep_provider_options line; a second occurrence of the variable
# replaces the first rather than merging with it.
# Segment-aware replication reduces cross-DC traffic:
#   EU nodes: gmcast.segment=0, US-East: 1, US-West: 2 (change per DC)
wsrep_provider_options = "gmcast.segment=0; evs.send_window=256; evs.user_send_window=128; evs.keepalive_period=PT3S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; gcache.size=1G"
Important latency considerations:
- Galera writes are synchronous — a write in EU must be acknowledged by US nodes before committing
- Cross-Atlantic latency is typically 80-120ms RTT
- This adds ~100ms to every database write (registration, CDR insert)
- For very high write volumes, consider:
- Asynchronous replication (standard MySQL replication) for CDRs
- Local caching in Kamailio htables for frequently-read data
- Read/write splitting: reads from local node, writes to any node
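The cost of synchronous WAN replication is simple to model: a commit cannot complete faster than the round trip to the farthest segment. A back-of-envelope calculator (a simplified model; real Galera adds certification and apply overhead on top):

```python
def galera_commit_ms(local_write_ms, wan_rtt_ms):
    """Approximate commit latency for a synchronous Galera write:
    the transaction waits ~1 RTT for the farthest segment to certify it."""
    return local_write_ms + wan_rtt_ms

print(galera_commit_ms(2, 100))         # → 102 (≈100ms added per commit)
print(1000 / galera_commit_ms(2, 100))  # serial writes/sec ceiling per session
```

That per-session ceiling of roughly 10 commits/sec is why CDR inserts, the highest-volume writes, are the first candidates for asynchronous replication.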
Latency Considerations for Media
Media (RTP) is latency-sensitive. Key rules:
- RTPEngine should be in the same DC as the caller (or as close as possible)
- FreeSWITCH should be in the same DC as the RTPEngine it works with
- Cross-DC media relay adds 80-120ms of latency each way — noticeable in voice calls
- For calls between users in different DCs, the media should anchor at one DC (caller's preferred)
Kamailio can select the right RTPEngine based on the caller's location:
# Select RTPEngine based on caller geography.
# modparam() is a load-time statement and cannot be called inside a route,
# so define one RTPEngine set per region at load time...
modparam("rtpengine", "rtpengine_sock", "1 == udp:EU_RTP_IP:2223")   # EU set
modparam("rtpengine", "rtpengine_sock", "2 == udp:US_RTP_IP:2223")   # US set

# ...and switch between the sets at runtime:
route[SELECT_RTPENGINE] {
    if ($var(continent) == "EU") {
        set_rtpengine_set("1");
    } else {
        set_rtpengine_set("2");
    }
}
13. Monitoring & Operations
Prometheus Metrics — All Components
A unified monitoring stack provides visibility into every layer of the platform.
Kamailio Exporter
# Install kamailio-exporter
# Option 1: Pre-built binary
wget https://github.com/florentchauveau/kamailio_exporter/releases/latest/download/kamailio_exporter_linux_amd64 \
-O /usr/local/bin/kamailio_exporter
chmod +x /usr/local/bin/kamailio_exporter
# Create systemd service
cat > /etc/systemd/system/kamailio-exporter.service << 'EOF'
[Unit]
Description=Kamailio Prometheus Exporter
After=kamailio.service
[Service]
ExecStart=/usr/local/bin/kamailio_exporter \
--kamailio.address=unix:/var/run/kamailio/kamailio_ctl \
--web.listen-address=:9494
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now kamailio-exporter
Key Kamailio metrics:
| Metric | Meaning | Alert Threshold |
|---|---|---|
| `kamailio_dialog_active` | Active calls through Kamailio | >5000 (capacity warning) |
| `kamailio_tmx_code_total{code="5xx"}` | 5xx SIP errors | >10/min |
| `kamailio_tmx_code_total{code="408"}` | Request timeouts | >5/min |
| `kamailio_dispatcher_target_up` | Backend FS health | == 0 (all down) |
| `kamailio_sl_sent_replies_total` | Reply rate | Sudden drop/spike |
| `kamailio_pike_blocked` | Rate-limited IPs | >0 (potential attack) |
| `kamailio_core_shm_free` | Shared memory free | <10% (memory pressure) |
FreeSWITCH Exporter
# Install freeswitch-exporter
pip3 install freeswitch-exporter
# Or use a custom script via ESL
cat > /usr/local/bin/freeswitch_exporter.py << 'PYEOF'
#!/usr/bin/env python3
"""FreeSWITCH Prometheus exporter via fs_cli (which uses ESL under the hood)."""
import subprocess
import time
from prometheus_client import start_http_server, Gauge

# Metrics
calls_active = Gauge('freeswitch_calls_active', 'Active calls')
channels_active = Gauge('freeswitch_channels_active', 'Active channels')
registrations = Gauge('freeswitch_registrations_active', 'Active registrations')
cpu_idle = Gauge('freeswitch_cpu_idle_percent', 'CPU idle percentage')
sessions_peak = Gauge('freeswitch_sessions_peak', 'Peak sessions since start')
sessions_per_sec = Gauge('freeswitch_sessions_per_second', 'Current sessions per second')

def fs_count(command):
    """Run an fs_cli command whose first output token is a count."""
    out = subprocess.check_output(["fs_cli", "-x", command], text=True)
    return int(out.strip().split()[0])

def collect():
    try:
        calls_active.set(fs_count("show calls count"))
        channels_active.set(fs_count("show channels count"))
        registrations.set(fs_count("show registrations count"))

        # Parse the "status" summary lines, e.g.:
        #   "0 session(s) - peak 2, last 5min 0"
        #   "0 session(s) per Sec out of max 30, peak 2, last 5min 0"
        #   "min idle cpu 0.00/99.07"
        out = subprocess.check_output(["fs_cli", "-x", "status"], text=True)
        for line in out.split('\n'):
            parts = line.split()
            if 'session(s) - peak' in line:
                # parts: ['0', 'session(s)', '-', 'peak', '2,', ...] — peak is parts[4]
                sessions_peak.set(int(parts[4].rstrip(',')))
            elif 'session(s) per Sec' in line:
                sessions_per_sec.set(float(parts[0]))
            elif 'idle cpu' in line:
                # last value after '/' is the current idle percentage
                cpu_idle.set(float(line.split('/')[-1]))
    except Exception as e:
        print(f"Collection error: {e}")

if __name__ == '__main__':
    start_http_server(9282)
    while True:
        collect()
        time.sleep(15)
PYEOF
chmod +x /usr/local/bin/freeswitch_exporter.py
# Create systemd service
cat > /etc/systemd/system/freeswitch-exporter.service << 'EOF'
[Unit]
Description=FreeSWITCH Prometheus Exporter
After=freeswitch.service
[Service]
ExecStart=/usr/local/bin/freeswitch_exporter.py
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now freeswitch-exporter
RTPEngine Exporter
# rtpengine-exporter scrapes RTPEngine's statistics interface
cat > /usr/local/bin/rtpengine_exporter.py << 'PYEOF'
#!/usr/bin/env python3
"""RTPEngine Prometheus exporter via the ng control protocol."""
import socket
import time
import bencodepy  # pip3 install bencodepy
from prometheus_client import start_http_server, Gauge

RTPENGINE_HOST = "127.0.0.1"
RTPENGINE_PORT = 2223

# Metrics
sessions = Gauge('rtpengine_sessions_active', 'Active media sessions')
sessions_total = Gauge('rtpengine_sessions_total', 'Total sessions since start')
offer_total = Gauge('rtpengine_offer_total', 'Total offer commands')
answer_total = Gauge('rtpengine_answer_total', 'Total answer commands')
delete_total = Gauge('rtpengine_delete_total', 'Total delete commands')

def query_rtpengine(command):
    """Send an ng protocol command to RTPEngine and decode the reply."""
    cookie = "stats_" + str(int(time.time()))
    msg = bencodepy.encode({b"command": command.encode()})
    # ng framing: "<cookie> <bencoded dict>"
    full_msg = f"{cookie} ".encode() + msg
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(2)
    try:
        sock.sendto(full_msg, (RTPENGINE_HOST, RTPENGINE_PORT))
        data, _ = sock.recvfrom(65535)
    finally:
        sock.close()
    # Strip the echoed cookie prefix
    space_idx = data.index(b' ')
    return bencodepy.decode(data[space_idx + 1:])

def collect():
    # NOTE: command and key names differ across rtpengine versions
    # ("list totals" vs "statistics") — verify against your build.
    try:
        result = query_rtpengine("list totals")
        if result.get(b'result') == b'ok':
            totals = result.get(b'totals', {})
            sessions.set(totals.get(b'current_sessions', 0))
            sessions_total.set(totals.get(b'total_sessions', 0))
            offer_total.set(totals.get(b'offer', 0))
            answer_total.set(totals.get(b'answer', 0))
            delete_total.set(totals.get(b'delete', 0))
    except Exception as e:
        print(f"Collection error: {e}")

if __name__ == '__main__':
    start_http_server(9283)
    while True:
        collect()
        time.sleep(15)
PYEOF
chmod +x /usr/local/bin/rtpengine_exporter.py
Prometheus Scrape Configuration
Add to your prometheus.yml:
scrape_configs:
  # Kamailio
  - job_name: 'kamailio'
    static_configs:
      - targets:
          - 'YOUR_KAM1_PRIVATE:9494'
          - 'YOUR_KAM2_PRIVATE:9494'
        labels:
          component: 'kamailio'
  # FreeSWITCH
  - job_name: 'freeswitch'
    static_configs:
      - targets:
          - 'YOUR_FS1_IP:9282'
          - 'YOUR_FS2_IP:9282'
          - 'YOUR_FS3_IP:9282'
        labels:
          component: 'freeswitch'
  # RTPEngine
  - job_name: 'rtpengine'
    static_configs:
      - targets:
          - 'YOUR_RTP1_PRIVATE:9283'
          - 'YOUR_RTP2_PRIVATE:9283'
        labels:
          component: 'rtpengine'
  # MariaDB (via mysqld_exporter)
  - job_name: 'mariadb'
    static_configs:
      - targets:
          - 'YOUR_DB1_IP:9104'
          - 'YOUR_DB2_IP:9104'
          - 'YOUR_DB3_IP:9104'
        labels:
          component: 'database'
Grafana Dashboard
Import or create a dashboard with these panels:
Row 1: Platform Overview
- Total active calls (sum of all FS instances)
- Active registrations
- Calls per second (rate)
- Platform uptime
Row 2: Kamailio
- Active dialogs (gauge)
- SIP response codes (stacked bar: 2xx, 3xx, 4xx, 5xx)
- Dispatcher backend status (table: name, state, latency)
- Shared memory usage (%)
Row 3: FreeSWITCH
- Active calls per instance (stacked area)
- Channels per instance (line)
- CPU usage per instance (line)
- Sessions per second (rate)
Row 4: RTPEngine
- Active media sessions (gauge)
- Packets relayed per second (rate)
- Media errors (rate)
- Session duration histogram
Row 5: Database
- Queries per second
- Replication lag (Galera)
- Connection count
- Slow queries
Homer — SIP Capture and Analysis
Homer provides deep SIP packet analysis — essential for debugging call flows across multiple components.
# Install heplify agent on each SIP component (Kamailio, FreeSWITCH)
wget https://github.com/sipcapture/heplify/releases/latest/download/heplify -O /usr/local/bin/heplify
chmod +x /usr/local/bin/heplify
# Run heplify on Kamailio servers
cat > /etc/systemd/system/heplify.service << 'EOF'
[Unit]
Description=HEPlify SIP Capture Agent
After=network.target
[Service]
ExecStart=/usr/local/bin/heplify \
-i eth0 \
-hs YOUR_HOMER_IP:9060 \
-m SIP \
-dim REGISTER \
-pr 5060-5061
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now heplify
Alerting Rules
# prometheus/alerts/voip-platform.yml
groups:
  - name: voip_platform
    rules:
      # All FreeSWITCH servers down.
      # Alert on the scrape "up" metric: count() over an empty vector
      # returns no data, so count(freeswitch_calls_active) == 0 would
      # never fire when the exporters disappear.
      - alert: AllMediaServersDown
        expr: sum(up{job="freeswitch"}) == 0 or absent(up{job="freeswitch"})
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "All FreeSWITCH media servers are down"

      # Single FreeSWITCH down
      - alert: MediaServerDown
        expr: up{job="freeswitch"} == 0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "FreeSWITCH {{ $labels.instance }} is down"

      # Kamailio high error rate
      - alert: KamailioHighErrorRate
        expr: rate(kamailio_tmx_code_total{code=~"5.."}[5m]) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Kamailio 5xx error rate > 0.5/sec"

      # Dispatcher all backends down
      - alert: DispatcherAllBackendsDown
        expr: kamailio_dispatcher_target_up == 0
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "All dispatcher backends are down"

      # RTPEngine down
      - alert: RTPEngineDown
        expr: up{job="rtpengine"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "RTPEngine {{ $labels.instance }} is down"

      # High call volume (capacity planning)
      - alert: HighCallVolume
        expr: sum(freeswitch_calls_active) > 2000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Platform handling {{ $value }} concurrent calls (threshold: 2000)"

      # Database replication lag (metric name as exposed by mysqld_exporter)
      - alert: GaleraReplicationLag
        expr: mysql_global_status_wsrep_local_recv_queue > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Galera replication queue building up"
Operational Runbook
Adding a New FreeSWITCH Node
# 1. Set up the new server (base OS + FreeSWITCH install from Section 7)
# 2. Configure SIP profile, ACL, dialplan (copy from existing FS node)
# 3. Test locally: fs_cli -x "sofia status profile kamailio"
# 4. Add to dispatcher database
mysql -u kamailio -pYOUR_DB_PASSWORD kamailio -e \
"INSERT INTO dispatcher (setid, destination, flags, priority, attrs, description) \
VALUES (1, 'sip:NEW_FS_IP:5060', 0, 0, 'weight=50;duid=fs04', 'FreeSWITCH-4 Media');"
# 5. Reload dispatcher on Kamailio
kamcmd dispatcher.reload
# 6. Verify the new node appears
kamcmd dispatcher.list
# 7. Monitor — should start receiving calls within seconds
Removing a FreeSWITCH Node
# 1. Drain the node (Section 11)
./drain-freeswitch.sh fs03 YOUR_FS3_IP
# 2. Stop FreeSWITCH
ssh YOUR_FS3_IP "systemctl stop freeswitch"
# 3. Remove from dispatcher database
mysql -u kamailio -pYOUR_DB_PASSWORD kamailio -e \
"DELETE FROM dispatcher WHERE destination='sip:YOUR_FS3_IP:5060';"
# 4. Reload dispatcher
kamcmd dispatcher.reload
Certificate Rotation
# 1. Renew certificate (certbot handles this automatically)
certbot renew
# 2. Reload services (handled by deploy hook, but manual if needed)
systemctl reload kamailio
systemctl restart rtpengine
# 3. Verify TLS
openssl s_client -connect YOUR_PUBLIC_VIP:5061 -brief
openssl s_client -connect YOUR_PUBLIC_VIP:8443 -brief
14. Troubleshooting
Call Flow Debugging Across Components
When a call fails, you need to trace it through all components. The Call-ID is the common thread:
Step 1: Find the Call-ID
- From the SIP phone/trunk: check the INVITE headers
- From Kamailio logs: grep for the caller/callee number
- From Homer: search by phone number or time range
Step 2: Trace through Kamailio
grep "CALL-ID-HERE" /var/log/kamailio.log
Step 3: Check which FreeSWITCH received it
- Look for "DISPATCH:" log line with the Call-ID
- Note the destination IP
Step 4: Trace on FreeSWITCH
ssh fs01 "grep 'CALL-ID-HERE' /var/log/freeswitch/freeswitch.log"
Step 5: Check RTPEngine
- RTPEngine logs show SDP manipulation per Call-ID
journalctl -u rtpengine | grep "CALL-ID-HERE"
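Steps 2–5 can be automated once the logs are available on (or copied to) one host. A minimal sketch — the log paths are examples and must be adjusted to your deployment:

```python
#!/usr/bin/env python3
"""Collect every log line mentioning a Call-ID across component log files.

The paths below are hypothetical local copies of each component's log —
adjust them (or rsync the remote logs first) for your deployment.
"""
import sys

LOG_FILES = {
    "kamailio": "/var/log/kamailio.log",
    "freeswitch": "/var/log/freeswitch/freeswitch.log",
    "rtpengine": "/var/log/rtpengine.log",
}

def trace_call(call_id, log_files=LOG_FILES):
    """Return {component: [matching lines]} for one Call-ID."""
    hits = {}
    for component, path in log_files.items():
        try:
            with open(path, errors="replace") as fh:
                hits[component] = [l.rstrip("\n") for l in fh if call_id in l]
        except OSError:
            hits[component] = []  # log not present on this host
    return hits

if __name__ == "__main__" and len(sys.argv) > 1:
    for comp, lines in trace_call(sys.argv[1]).items():
        print(f"=== {comp}: {len(lines)} line(s)")
        for line in lines:
            print("   ", line)
```

Usage: `./trace_call.py 'CALL-ID-HERE'`. For multi-host tracing, Homer remains the better tool — this script is only for quick local correlation.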
Live Debugging Commands
# ---- Kamailio ----
# Watch SIP traffic in real-time
sngrep -d eth0 port 5060
# Enable debug logging temporarily
kamcmd cfg.seti core debug 4
# ... reproduce the issue ...
kamcmd cfg.seti core debug 2 # Restore normal level
# Check active dialogs
kamcmd dlg.list
# Check dispatcher status
kamcmd dispatcher.list
# Memory usage
kamcmd core.shmmem
# ---- FreeSWITCH ----
# Show active calls
fs_cli -x "show calls"
# Show active channels with details
fs_cli -x "show channels"
# Trace a specific call (enable sofia debug)
fs_cli -x "sofia loglevel all 9"
# ... reproduce the issue ...
fs_cli -x "sofia loglevel all 0" # Restore
# SIP trace on the kamailio profile
fs_cli -x "sofia profile kamailio siptrace on"
# ... reproduce ...
fs_cli -x "sofia profile kamailio siptrace off"
# Check codec negotiation
fs_cli -x "show channels" | grep -E "codec|read_codec|write_codec"
# ---- RTPEngine ----
# List all active sessions
rtpengine-ctl list sessions
# Show detailed stats
rtpengine-ctl list totals
# Show per-session details (requires Call-ID)
rtpengine-ctl list sessions CALL-ID-HERE
Common Issues and Solutions
Calls Not Reaching FreeSWITCH (Dispatcher Issues)
Symptom: Kamailio returns 503 "Service Unavailable"
Check 1: Are FreeSWITCH servers marked as active?
kamcmd dispatcher.list
Look for "FLAGS: AP" (Active + Probing)
If "FLAGS: IP" or "FLAGS: DX" — server is detected as down
Check 2: Can Kamailio reach FreeSWITCH on port 5060?
# From Kamailio server
nc -u -z YOUR_FS1_IP 5060 && echo OK || echo FAIL
sipsak -s sip:test@YOUR_FS1_IP:5060
Check 3: Is FreeSWITCH actually listening?
ssh YOUR_FS1_IP "ss -ulnp | grep 5060"
ssh YOUR_FS1_IP "fs_cli -x 'sofia status profile kamailio'"
Check 4: ACL blocking?
ssh YOUR_FS1_IP "fs_cli -x 'reloadacl'"
Check /var/log/freeswitch/freeswitch.log for "ACL reject"
Fix: If FS is running but dispatcher shows inactive, manually reset:
kamcmd dispatcher.set_state a 1 sip:YOUR_FS1_IP:5060
One-Way Audio (RTPEngine Issues)
Symptom: Call connects but audio only flows in one direction (or no audio)
Check 1: Is RTPEngine running and reachable?
echo 'ping1 d7:command4:pinge' | nc -u -w1 YOUR_RTP1_PRIVATE 2223
Expected: 'ping1 d6:result4:ponge' — note the ng protocol requires a cookie prefix ("ping1 " here) before the bencoded message, and the reply echoes it back
Check 2: Are the RTPEngine interfaces correct?
rtpengine-ctl list sessions
Verify the session shows correct internal and external IPs
Check 3: SDP analysis — is RTPEngine rewriting SDPs correctly?
sngrep on Kamailio — compare SDP in INVITE before and after rtpengine_offer()
The c= line should change from external IP to internal IP (towards FS)
The c= line in 200 OK should change from FS IP to external IP (towards trunk)
Check 4: Firewall — are RTP ports open?
On RTPEngine server: ufw status | grep 20000
Must allow 20000-40000/udp from anywhere (external endpoints)
Check 5: Are there asymmetric routes?
RTP must flow: External ↔ RTPEngine ↔ FreeSWITCH
If any hop has incorrect routing, media breaks
Common fix: Verify interface= lines in rtpengine.conf
interface = internal/PRIVATE_IP ← Must be reachable from FreeSWITCH
interface = external/PRIVATE_IP!PUBLIC_IP ← PUBLIC_IP must be routable from internet
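The SDP comparison in Check 3 can be scripted: extract the `c=` connection lines from the SDP bodies captured in sngrep and compare the before/after addresses. A small helper (the sample SDPs use documentation addresses, not real ones):

```python
def sdp_connection_addresses(sdp: str) -> list:
    """Return every c= connection address in an SDP body, in order.

    Compare the addresses before and after rtpengine_offer() — the
    relayed SDP should carry the RTPEngine interface IP, not the
    original endpoint's IP.
    """
    addrs = []
    for line in sdp.splitlines():
        if line.startswith("c="):
            # e.g. "c=IN IP4 203.0.113.10" — address is the last token
            addrs.append(line.split()[-1])
    return addrs

# Illustrative SDP fragments (documentation IPs)
original = "v=0\r\no=- 0 0 IN IP4 203.0.113.10\r\nc=IN IP4 203.0.113.10\r\nm=audio 49170 RTP/AVP 0\r\n"
rewritten = "v=0\r\no=- 0 0 IN IP4 203.0.113.10\r\nc=IN IP4 10.0.0.20\r\nm=audio 30002 RTP/AVP 0\r\n"

print(sdp_connection_addresses(original))   # ['203.0.113.10']
print(sdp_connection_addresses(rewritten))  # ['10.0.0.20']
```

If the address in the rewritten SDP is unchanged, rtpengine_offer()/rtpengine_answer() is not being applied on that leg — check the routing logic in kamailio.cfg.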
Registration Loops
Symptom: Registrations fail or loop infinitely
Check: Kamailio is trying to proxy REGISTER to FreeSWITCH,
FreeSWITCH is sending it back to Kamailio
Fix: Ensure Kamailio handles registrations locally (save to location table)
OR ensure FreeSWITCH does not relay registrations back
In kamailio.cfg, the REGISTER handler should either:
save("location") — store locally
OR forward to FS and NOT relay back
In FreeSWITCH, ensure the kamailio profile does NOT have:
<param name="accept-blind-reg" value="true"/>
Keepalived VIP Not Floating
Symptom: VIP stays on failed node or does not move to standby
Check 1: Is Keepalived running on both nodes?
systemctl status keepalived
Check 2: VRRP communication
tcpdump -i eth0 vrrp
Both nodes should be sending VRRP advertisements
Check 3: Virtual router ID conflict?
Ensure virtual_router_id is the same on both nodes
Ensure no other Keepalived instance on the network uses the same ID
Check 4: Check health script
/etc/keepalived/check_kamailio.sh
echo $? # Should be 0 (healthy) or 1 (unhealthy)
Check 5: Non-local bind
sysctl net.ipv4.ip_nonlocal_bind
# Must be 1 for the backup node to send SIP from the VIP
echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.d/90-voip.conf
sysctl -p /etc/sysctl.d/90-voip.conf
Performance Bottleneck Identification
| Symptom | Likely Bottleneck | Check | Solution |
|---|---|---|---|
| High SIP latency | Kamailio CPU or database | `top` on Kamailio; slow query log | Add Kamailio workers; optimize DB queries |
| Choppy audio | RTPEngine CPU or network | `top` on RTPEngine; packet loss check | More RTPEngine CPU; check network path |
| Call setup delays | FreeSWITCH overloaded | `fs_cli` status; check sessions | Add more FS instances to pool |
| Registration failures | Database slow | Check MariaDB slow query log | Index optimization; increase connections |
| WebRTC connection failures | TLS/DTLS issues | Check certificates; browser console | Renew certs; verify DTLS config |
| Failover too slow | Keepalived/dispatcher timing | Check `advert_int` and `ds_ping_interval` | Reduce intervals (trade-off: more traffic) |
Capacity Planning Formulas
Kamailio (signaling only):
Max CPS = CPU_cores × 1000 (approximately)
4-core = ~4,000 calls/sec setup rate
Memory: ~1 KB per active dialog + ~0.5 KB per registration
RTPEngine (media relay):
Max streams = CPU_cores × 500 (G.711, no transcoding)
8-core = ~4,000 RTP streams = ~2,000 concurrent calls
With transcoding: divide capacity by 3-5
Bandwidth: 87 kbps × concurrent_calls × 2 (bidirectional)
FreeSWITCH (media processing):
G.711 (no transcoding): CPU_cores × 300
With recording: CPU_cores × 200
With transcoding: CPU_cores × 100
With conferencing: CPU_cores × 50 (mixing is expensive)
Memory: ~2 MB per active call (+ recording buffer)
Disk I/O: ~100 KB/s per recorded call (G.711)
Database:
1 registration = 1 write + periodic refreshes
1 call = ~5-10 queries (setup + routing + CDR)
10,000 concurrent calls ≈ 500-1,000 queries/sec
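The rules of thumb above translate directly into a small sizing helper. The multipliers are the ones quoted in this section — treat them as starting points for your own load testing, not guarantees:

```python
def fs_capacity(cpu_cores: int, scenario: str) -> int:
    """Rough concurrent-call ceiling for one FreeSWITCH node, using the
    per-core rules of thumb from this section (starting points only —
    validate with load testing)."""
    per_core = {
        "g711": 300,          # no transcoding
        "recording": 200,
        "transcoding": 100,
        "conferencing": 50,   # mixing is expensive
    }
    return cpu_cores * per_core[scenario]

def rtpengine_bandwidth_mbps(concurrent_calls: int) -> float:
    """G.711 relay bandwidth: ~87 kbps per direction, two directions per call."""
    return concurrent_calls * 87 * 2 / 1000.0

print(fs_capacity(8, "g711"))           # 2400 concurrent calls
print(rtpengine_bandwidth_mbps(2000))   # 348.0 Mbps
```

Run the numbers per scenario before ordering hardware: a pool sized for plain G.711 will fall far short once recording or conferencing is enabled.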
Essential Commands Quick Reference
| Component | Command | Purpose |
|---|---|---|
| Kamailio | `kamcmd dispatcher.list` | Show backend status |
| | `kamcmd dlg.list` | List active dialogs |
| | `kamcmd core.shmmem` | Check memory usage |
| | `kamcmd cfg.seti core debug 4` | Enable debug logging |
| | `kamcmd ul.dump` | Dump registration table |
| | `kamcmd stats.get_statistics all` | All statistics |
| | `sngrep -d eth0 port 5060` | Live SIP capture |
| FreeSWITCH | `fs_cli -x "show calls"` | List active calls |
| | `fs_cli -x "show channels"` | List active channels |
| | `fs_cli -x "sofia status"` | SIP profile status |
| | `fs_cli -x "sofia status profile kamailio"` | Kamailio profile details |
| | `fs_cli -x "reloadxml"` | Reload XML config |
| | `fs_cli -x "status"` | Overall status |
| RTPEngine | `rtpengine-ctl list sessions` | Active media sessions |
| | `rtpengine-ctl list totals` | Aggregate statistics |
| | `echo 'ping1 d7:command4:pinge' \| nc -u IP 2223` | Ping test |
| Keepalived | `ip addr show eth0` | Check VIP assignment |
| | `systemctl status keepalived` | Service status |
| | `journalctl -u keepalived -f` | Live logs |
| MariaDB | `SHOW STATUS LIKE 'wsrep%';` | Galera cluster status |
| | `SHOW PROCESSLIST;` | Active queries |
| Homer | Web UI: `http://HOMER_IP:9080` | SIP trace search |
This concludes Tutorial 43. You now have the knowledge to build, operate, and troubleshoot a carrier-grade VoIP platform with Kamailio + FreeSWITCH + RTPEngine. The architecture described here scales from hundreds to tens of thousands of concurrent calls and provides the fault tolerance expected of production telecommunications infrastructure.
Key takeaways:
- Separation of concerns is the fundamental design principle: Kamailio for signaling, RTPEngine for media relay, FreeSWITCH for call logic
- Dispatcher is the heart of the load balancing: understand algorithms, probing, and failover
- RTPEngine solves NAT, WebRTC bridging, and topology hiding for media — it is essential in any production deployment
- HA comes from pool architecture for FreeSWITCH (not active/standby) and VIP failover for Kamailio
- Monitor everything with Prometheus + Grafana + Homer — you cannot fix what you cannot see
- Practice draining and failover before you need it in production