
Kamailio + FreeSWITCH — Load Balancing & High Availability

Infrastructure & DevOps Advanced 75 min read #43


Build a carrier-grade VoIP platform by combining Kamailio as a SIP proxy/load balancer with multiple FreeSWITCH media servers. This advanced tutorial covers dispatcher-based load balancing, RTPEngine for media relay, WebRTC gateway integration, database-driven routing, geographic failover, and full high availability with no single point of failure. Most large VoIP providers run some variation of this architecture — and by the end of this tutorial, you will have a production-ready platform designed for 10,000+ concurrent calls, zero-downtime upgrades, and geographic redundancy.

Difficulty: Advanced Reading time: ~75 minutes Prerequisites: Tutorial 41 — FreeSWITCH Fundamentals, Tutorial 42 — Kamailio Fundamentals Technologies: Kamailio, FreeSWITCH, RTPEngine, Keepalived, MariaDB Galera, WebRTC, TLS, DNS SRV, Prometheus, Homer OS: Debian 12 (Bookworm) for all servers


Table of Contents

  1. Introduction — Why Combine Kamailio + FreeSWITCH
  2. Architecture Overview
  3. Prerequisites & Server Planning
  4. Kamailio SBC Configuration
  5. Dispatcher — Load Balancing FreeSWITCH
  6. RTPEngine — Media Relay
  7. FreeSWITCH Media Server Configuration
  8. Database-Driven Routing
  9. WebRTC Gateway
  10. High Availability — Kamailio
  11. High Availability — FreeSWITCH
  12. Geographic Distribution
  13. Monitoring & Operations
  14. Troubleshooting

1. Introduction

Why Combine Kamailio and FreeSWITCH?

Kamailio and FreeSWITCH are both excellent SIP platforms, but they excel at fundamentally different things:

Capability Kamailio FreeSWITCH
SIP proxy/routing Exceptional — 50,000+ TPS Basic — not designed for proxying
Media handling None — signaling only Exceptional — codec, mixing, recording
Load balancing Built-in (dispatcher) Not applicable
NAT traversal Signaling only (nathelper) Full (media + signaling)
IVR / call queues Not available Full-featured
Conference bridges Not available Full-featured
Voicemail Not available Full-featured
Concurrent calls 100,000+ (signaling) 2,000-5,000 (with media)
Horizontal scaling Stateless — easy Stateful — complex
WebRTC WSS termination Media bridging

The combination gives you separation of concerns: Kamailio absorbs the signaling load, registrations, and attack traffic at the edge, while each FreeSWITCH spends its CPU on media processing and application logic.

Who Uses This Architecture?

Most major VoIP carriers and CPaaS providers run some variation of this stack.


2. Architecture Overview

Production Architecture Diagram

                        ┌──────────────────────────────────────┐
                        │          DNS / SRV Records           │
                        │    sip.YOUR_DOMAIN → Kamailio VIP    │
                        │    _sip._udp.YOUR_DOMAIN SRV         │
                        └──────────────────┬───────────────────┘
                                           │
                          ┌────────────────┼────────────────┐
                          │                │                │
              ┌───────────▼──┐    ┌────────▼────┐   ┌──────▼───────┐
  Layer 2     │  Kamailio-A  │    │  Keepalived │   │  Kamailio-B  │
  SBC Pair    │  (Active)    │◄──►│  VIP Float  │◄─►│  (Standby)   │
              │  SIP+WSS     │    │             │   │  SIP+WSS     │
              └──────┬───────┘    └─────────────┘   └──────┬───────┘
                     │                                      │
                     │          SIP (signaling)             │
           ┌─────────┴──────────────────────────────────────┴─────────┐
           │                                                          │
  ┌────────▼────────┐                                   ┌─────────────▼──┐
  │  RTPEngine-1    │         Layer 3                   │  RTPEngine-2   │
  │  (Media Relay)  │         Media Relay               │  (Media Relay) │
  └────────┬────────┘                                   └──────┬─────────┘
           │                  RTP (media)                      │
           └─────────┬─────────────────────────┬───────────────┘
                     │                         │
           ┌─────────▼──────┐        ┌─────────▼──────┐
           │  FreeSWITCH-1  │        │  FreeSWITCH-2  │      ...N
  Layer 4  │  (Media/App)   │        │  (Media/App)   │
           │  IVR/Queue/Rec │        │  IVR/Queue/Rec │
           └────────┬───────┘        └────────┬───────┘
                    │                          │
                    └──────────┬───────────────┘
                               │
                    ┌──────────▼──────────┐
  Layer 5          │  MariaDB Galera      │
  Database         │  (3-node cluster)    │
                   │  Users/CDR/Config    │
                   └──────────────────────┘

Component Responsibility Matrix

Component Layer Primary Role Handles
DNS/SRV 1 Geographic failover Route clients to nearest DC
Kamailio (x2) 2 SIP proxy / SBC Authentication, routing, load balancing, NAT fix, rate limiting, topology hiding, WebSocket termination
RTPEngine (x2) 3 Media relay RTP proxying, NAT traversal for media, SRTP↔RTP bridging, DTLS termination, codec transcoding
FreeSWITCH (x2-N) 4 Media application IVR, call queues, conferencing, voicemail, recording, call control, DTMF handling
MariaDB Galera (x3) 5 Shared state User credentials, CDRs, routing rules, call state, configuration

Traffic Flow — Inbound Call

1. External SIP INVITE → DNS resolves to Kamailio VIP
2. Kamailio: authenticate trunk, apply rate limits
3. Kamailio: nathelper fixes Contact/Via headers
4. Kamailio: rtpengine_offer() — RTPEngine rewrites SDP (external→internal)
5. Kamailio: dispatcher selects FreeSWITCH from pool
6. Kamailio: forward INVITE to FreeSWITCH (topology hidden)
7. FreeSWITCH: executes dialplan (IVR, queue, bridge to agent)
8. FreeSWITCH: 200 OK → Kamailio
9. Kamailio: rtpengine_answer() — RTPEngine rewrites SDP (internal→external)
10. Kamailio: 200 OK → external trunk
11. Media flows: External ↔ RTPEngine ↔ FreeSWITCH
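Steps 4 and 9 are the media-anchoring steps: RTPEngine never touches the SIP headers, it only rewrites the SDP body so both parties send their RTP to the relay. A toy shell sketch of the core idea (illustrative only — real RTPEngine also rewrites ports and handles SRTP/DTLS; the addresses are examples from this tutorial's plan):

```shell
# Toy illustration of the SDP rewrite that rtpengine_offer() requests:
# the external caller's connection address is replaced with the relay's,
# so the callee (FreeSWITCH) sends its RTP to RTPEngine, not the caller.
caller_ip="203.0.113.7"      # example external caller
relay_ip="10.0.1.20"         # rtp01 from the addressing plan

sdp="v=0
o=- 1 1 IN IP4 ${caller_ip}
c=IN IP4 ${caller_ip}
m=audio 49170 RTP/AVP 0"

rewritten=$(printf '%s\n' "$sdp" | sed "s/IN IP4 ${caller_ip}/IN IP4 ${relay_ip}/g")
printf '%s\n' "$rewritten"
```

On the answer leg (step 9) the same substitution happens in the opposite direction, which is why the config pairs every rtpengine_offer() with an rtpengine_answer().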

3. Prerequisites & Server Planning

Minimum Lab Setup (5 Servers)

For learning and development, you need at minimum:

Role Hostname IP (example) vCPU RAM Disk Bandwidth
Kamailio SBC kam01 10.0.1.10 2 2 GB 20 GB 100 Mbps
RTPEngine rtp01 10.0.1.20 4 4 GB 20 GB 1 Gbps
FreeSWITCH 1 fs01 10.0.1.30 4 8 GB 100 GB 1 Gbps
FreeSWITCH 2 fs02 10.0.1.31 4 8 GB 100 GB 1 Gbps
MariaDB db01 10.0.1.40 2 4 GB 50 GB 100 Mbps

Production Setup (10+ Servers)

For production with high availability:

Role Count vCPU each RAM each Disk Notes
Kamailio SBC 2 4 4 GB 20 GB Active/standby with Keepalived VIP
RTPEngine 2 8 8 GB 20 GB Stateless — either can handle any stream
FreeSWITCH 3-4 8 16 GB 500 GB Recordings stored locally or NFS
MariaDB Galera 3 4 16 GB 200 GB 3-node cluster for quorum
Homer (SIP capture) 1 4 8 GB 500 GB Optional but strongly recommended

Capacity Planning

Approximate capacity per server (your mileage will vary with codec, recording, and transcoding):

Component Metric Capacity
Kamailio (4 vCPU) SIP transactions/sec 5,000-10,000
Kamailio (4 vCPU) Concurrent registrations 100,000+
RTPEngine (8 vCPU) Concurrent RTP streams 2,000-4,000
FreeSWITCH (8 vCPU, 16 GB) Concurrent calls (G.711) 1,500-2,500
FreeSWITCH (8 vCPU, 16 GB) Concurrent calls (with recording) 800-1,500
FreeSWITCH (8 vCPU, 16 GB) Concurrent calls (with transcoding) 500-1,000
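A quick sanity check on the RTPEngine numbers: G.711 at 20 ms ptime is roughly 87 kbps per direction on the wire (64 kbps payload plus RTP/UDP/IP/Ethernet overhead), and the relay both receives and re-sends each direction. A back-of-envelope sketch (the per-stream figure is an approximation, and your stream-per-call accounting may differ if only one leg is relayed):

```shell
# Rough NIC bandwidth on an RTPEngine host relaying G.711 calls.
kbps_per_stream=87      # ~64 kbps payload + RTP/UDP/IP/Ethernet overhead
streams_per_call=4      # relay receives + re-sends both directions
calls=2000

total_kbps=$(( calls * streams_per_call * kbps_per_stream ))
total_mbps=$(( total_kbps / 1000 ))
echo "${calls} relayed G.711 calls = ~${total_mbps} Mbps of NIC traffic"
```

At 2,000 relayed calls that is already most of a 1 Gbps link, which is why the sizing tables give the media-path servers gigabit NICs.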

Network Requirements

IP Addressing Plan

# Public IPs (exposed to internet)
YOUR_PUBLIC_VIP    = Floating VIP for Kamailio HA pair
YOUR_KAM1_PUBLIC   = Kamailio-A public IP
YOUR_KAM2_PUBLIC   = Kamailio-B public IP
YOUR_RTP1_PUBLIC   = RTPEngine-1 public IP
YOUR_RTP2_PUBLIC   = RTPEngine-2 public IP

# Private IPs (internal network — 10.0.1.0/24)
YOUR_KAM1_PRIVATE  = 10.0.1.10   # Kamailio-A
YOUR_KAM2_PRIVATE  = 10.0.1.11   # Kamailio-B
YOUR_RTP1_PRIVATE  = 10.0.1.20   # RTPEngine-1
YOUR_RTP2_PRIVATE  = 10.0.1.21   # RTPEngine-2
YOUR_FS1_IP        = 10.0.1.30   # FreeSWITCH-1
YOUR_FS2_IP        = 10.0.1.31   # FreeSWITCH-2
YOUR_FS3_IP        = 10.0.1.32   # FreeSWITCH-3
YOUR_DB1_IP        = 10.0.1.40   # MariaDB node 1
YOUR_DB2_IP        = 10.0.1.41   # MariaDB node 2
YOUR_DB3_IP        = 10.0.1.42   # MariaDB node 3

Base OS Setup (All Servers)

Run on every Debian 12 server:

#!/bin/bash
# base-setup.sh — Run on all servers

# Set timezone
timedatectl set-timezone UTC

# Update system
apt-get update && apt-get upgrade -y

# Install common packages
apt-get install -y \
    curl wget gnupg2 lsb-release apt-transport-https \
    ca-certificates software-properties-common \
    net-tools tcpdump ngrep sngrep \
    htop iotop sysstat \
    vim tmux git \
    ufw fail2ban \
    ntp

# Enable NTP (critical for SIP — clock skew breaks authentication)
systemctl enable --now ntp

# Set hostname (replace per server)
# hostnamectl set-hostname kam01.YOUR_DOMAIN

# Configure /etc/hosts on all servers
cat >> /etc/hosts << 'EOF'
10.0.1.10   kam01
10.0.1.11   kam02
10.0.1.20   rtp01
10.0.1.21   rtp02
10.0.1.30   fs01
10.0.1.31   fs02
10.0.1.32   fs03
10.0.1.40   db01
10.0.1.41   db02
10.0.1.42   db03
EOF

# Kernel tuning for VoIP
cat > /etc/sysctl.d/90-voip.conf << 'EOF'
# Network buffer sizes
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 1048576
net.core.wmem_default = 1048576
net.core.netdev_max_backlog = 50000

# Connection tracking (high for SIP)
net.netfilter.nf_conntrack_max = 1000000
net.netfilter.nf_conntrack_udp_timeout = 60
net.netfilter.nf_conntrack_udp_timeout_stream = 180

# TCP tuning
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15

# File descriptors
fs.file-max = 1000000
fs.nr_open = 1000000

# Disable automatic conntrack helpers (the kernel-level SIP ALG — critical!)
# Note: this sysctl was removed in kernel 6.0+, so on Debian 12 this line
# may report an error; that is safe to ignore, as off is the default there.
net.netfilter.nf_conntrack_helper = 0
EOF
sysctl -p /etc/sysctl.d/90-voip.conf

# Increase file descriptor limits
cat > /etc/security/limits.d/voip.conf << 'EOF'
*    soft    nofile    1000000
*    hard    nofile    1000000
*    soft    nproc     65535
*    hard    nproc     65535
EOF

echo "Base setup complete. Reboot recommended."

4. Kamailio SBC Configuration

Install Kamailio with Required Modules

#!/bin/bash
# install-kamailio-sbc.sh — Run on kam01 and kam02

# Add Kamailio 5.8 repository
curl -fsSL https://deb.kamailio.org/kamailiodebkey.gpg | gpg --dearmor -o /usr/share/keyrings/kamailio.gpg
echo "deb [signed-by=/usr/share/keyrings/kamailio.gpg] http://deb.kamailio.org/kamailio58 bookworm main" \
    > /etc/apt/sources.list.d/kamailio.list
apt-get update

# Install Kamailio + all modules we need
apt-get install -y \
    kamailio \
    kamailio-mysql-modules \
    kamailio-tls-modules \
    kamailio-websocket-modules \
    kamailio-json-modules \
    kamailio-extra-modules \
    kamailio-utils-modules

# Enable Kamailio service
systemctl enable kamailio

Generate TLS Certificates

# Install certbot for Let's Encrypt
apt-get install -y certbot

# Get certificate (stop any service on port 80 first)
certbot certonly --standalone -d sip.YOUR_DOMAIN --agree-tos -m admin@YOUR_DOMAIN

# Create Kamailio TLS directory
mkdir -p /etc/kamailio/tls

# Link certificates
ln -sf /etc/letsencrypt/live/sip.YOUR_DOMAIN/fullchain.pem /etc/kamailio/tls/server.pem
ln -sf /etc/letsencrypt/live/sip.YOUR_DOMAIN/privkey.pem /etc/kamailio/tls/server.key

# Set permissions
chown -R kamailio:kamailio /etc/kamailio/tls/
chmod 600 /etc/kamailio/tls/server.key
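Let's Encrypt certificates renew every 60-90 days, and the running Kamailio will not re-read them on its own. A deploy-hook sketch to close that gap (assumes the standard certbot renewal-hooks layout and the tls module's tls.reload RPC; HOOK_DIR is a variable of our own only so the path is easy to adjust):

```shell
# Deploy hook: make a renewed certificate visible to the running Kamailio.
# HOOK_DIR defaults to certbot's standard deploy-hook directory.
HOOK_DIR="${HOOK_DIR:-/etc/letsencrypt/renewal-hooks/deploy}"
mkdir -p "$HOOK_DIR"

cat > "$HOOK_DIR/kamailio-tls-reload.sh" << 'EOF'
#!/bin/sh
# Re-read certificates in the running Kamailio; fall back to a unit reload.
kamcmd tls.reload || systemctl reload kamailio
EOF
chmod +x "$HOOK_DIR/kamailio-tls-reload.sh"
echo "Installed $HOOK_DIR/kamailio-tls-reload.sh"
```

certbot runs every script in that directory after each successful renewal, so the TLS listener picks up the new certificate without dropping calls.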

Production kamailio.cfg

This is the complete, production-ready configuration for Kamailio operating as an SBC/load balancer in front of FreeSWITCH:

#!KAMAILIO
##
## Kamailio SBC Configuration
## Role: SIP proxy / load balancer / WebRTC gateway
## Backend: FreeSWITCH media server pool via dispatcher
##

## ---- Global Parameters ----

#!define DBURL "mysql://kamailio:YOUR_DB_PASSWORD@YOUR_DB1_IP/kamailio"
# Flag indexes: FLT_* are message flags, FLB_* are branch flags.
# (Keep #!define lines comment-free — trailing text becomes part of the value.)
#!define FLT_NATS 5
#!define FLB_NATB 6
#!define FLT_DLG 4

#!define MY_PUBLIC_IP "YOUR_KAM1_PUBLIC"
#!define MY_PRIVATE_IP "YOUR_KAM1_PRIVATE"
#!define MY_DOMAIN "sip.YOUR_DOMAIN"

#!define WITH_MYSQL
#!define WITH_NAT
#!define WITH_TLS
#!define WITH_WEBSOCKETS
#!define WITH_RTPENGINE
#!define WITH_DISPATCHER
#!define WITH_ANTIFLOOD
#!define WITH_TOPOH

## ---- Core Parameters ----

debug=2
log_stderror=no
log_facility=LOG_LOCAL0
log_prefix="{$mt $hdr(CSeq) $ci} "

memdbg=5
memlog=5

fork=yes
children=8           # Worker processes — adjust based on CPU cores
tcp_children=4       # TCP/TLS/WSS worker processes

listen=udp:MY_PRIVATE_IP:5060
listen=tcp:MY_PRIVATE_IP:5060
listen=udp:MY_PUBLIC_IP:5060
listen=tcp:MY_PUBLIC_IP:5060
#!ifdef WITH_TLS
listen=tls:MY_PUBLIC_IP:5061
#!endif
#!ifdef WITH_WEBSOCKETS
listen=tcp:MY_PRIVATE_IP:8080  # WS (behind Nginx)
listen=tls:MY_PUBLIC_IP:8443   # WSS (direct or behind Nginx)
#!endif

tcp_connection_lifetime=3605
tcp_accept_no_cl=yes
tcp_rd_buf_size=16384

server_header="Server: VoIP-Platform"
user_agent_header="User-Agent: VoIP-Platform"

## ---- Module Loading ----

loadmodule "jsonrpcs.so"
loadmodule "kex.so"
loadmodule "corex.so"
loadmodule "tm.so"
loadmodule "tmx.so"
loadmodule "sl.so"
loadmodule "rr.so"
loadmodule "pv.so"
loadmodule "maxfwd.so"
loadmodule "textops.so"
loadmodule "siputils.so"
loadmodule "xlog.so"
loadmodule "sanity.so"
loadmodule "ctl.so"
loadmodule "cfg_rpc.so"
loadmodule "counters.so"
loadmodule "sdpops.so"
loadmodule "path.so"

#!ifdef WITH_MYSQL
loadmodule "db_mysql.so"
#!endif

loadmodule "usrloc.so"
loadmodule "registrar.so"

loadmodule "nathelper.so"
loadmodule "rtpengine.so"

loadmodule "dialog.so"

loadmodule "pike.so"
loadmodule "htable.so"

#!ifdef WITH_TLS
loadmodule "tls.so"
#!endif

#!ifdef WITH_WEBSOCKETS
loadmodule "websocket.so"
loadmodule "xhttp.so"
#!endif

#!ifdef WITH_DISPATCHER
loadmodule "dispatcher.so"
#!endif

#!ifdef WITH_TOPOH
loadmodule "topoh.so"
#!endif

## ---- Module Parameters ----

# -- jsonrpcs --
modparam("jsonrpcs", "pretty_format", 1)
modparam("jsonrpcs", "transport", 1)

# -- tm --
modparam("tm", "failure_reply_mode", 3)
modparam("tm", "fr_timer", 30000)         # 30s final response timeout
modparam("tm", "fr_inv_timer", 120000)    # 120s INVITE response timeout
modparam("tm", "restart_fr_on_each_reply", 1)
modparam("tm", "auto_inv_100_reason", "Trying")

# -- rr (Record-Route) --
modparam("rr", "enable_full_lr", 1)
modparam("rr", "append_fromtag", 1)
modparam("rr", "enable_double_rr", 1)    # Required for topology hiding

# -- registrar --
modparam("registrar", "method_filtering", 1)
modparam("registrar", "max_expires", 3600)
modparam("registrar", "default_expires", 300)
modparam("registrar", "gruu_enabled", 0)

# -- usrloc --
#!ifdef WITH_MYSQL
modparam("usrloc", "db_url", DBURL)
modparam("usrloc", "db_mode", 2)         # Write-back — contacts also kept in DB for HA
#!else
modparam("usrloc", "db_mode", 0)         # Memory only
#!endif
modparam("usrloc", "nat_bflag", FLB_NATB)

# -- nathelper --
modparam("nathelper", "natping_interval", 30)
modparam("nathelper", "ping_nated_only", 1)
modparam("nathelper", "sipping_bflag", FLB_NATB)
modparam("nathelper", "sipping_from", "sip:keepalive@MY_DOMAIN")
modparam("nathelper", "sipping_method", "OPTIONS")

# -- rtpengine --
modparam("rtpengine", "rtpengine_sock", "udp:YOUR_RTP1_PRIVATE:2223")
# For multiple RTPEngine instances:
# modparam("rtpengine", "rtpengine_sock", "udp:YOUR_RTP1_PRIVATE:2223=1 udp:YOUR_RTP2_PRIVATE:2223=1")

# -- dialog --
modparam("dialog", "dlg_flag", FLT_DLG)
modparam("dialog", "track_cseq_updates", 1)
#!ifdef WITH_MYSQL
modparam("dialog", "db_url", DBURL)
modparam("dialog", "db_mode", 1)         # Realtime for HA
modparam("dialog", "db_update_period", 60)
#!endif

# -- pike (rate limiting) --
#!ifdef WITH_ANTIFLOOD
modparam("pike", "sampling_time_unit", 2)
modparam("pike", "reqs_density_per_unit", 30)  # 30 req/2sec per IP
modparam("pike", "remove_latency", 4)
#!endif

# -- htable (hash tables for rate limiting / blacklisting) --
modparam("htable", "htable", "blocked=>size=8;autoexpire=300;")
modparam("htable", "htable", "failcnt=>size=8;autoexpire=60;initval=0;")

# -- dispatcher --
#!ifdef WITH_DISPATCHER
modparam("dispatcher", "db_url", DBURL)
modparam("dispatcher", "ds_ping_method", "OPTIONS")
modparam("dispatcher", "ds_ping_interval", 10)      # Ping every 10 seconds
modparam("dispatcher", "ds_ping_reply_codes", "class=2;class=3;class=4")
modparam("dispatcher", "ds_probing_mode", 1)         # Probe all destinations
modparam("dispatcher", "ds_probing_threshold", 3)    # 3 failures = inactive
modparam("dispatcher", "ds_inactive_threshold", 3)   # 3 successes = active again
modparam("dispatcher", "ds_ping_latency_stats", 1)
#!endif

# -- TLS --
#!ifdef WITH_TLS
modparam("tls", "config", "/etc/kamailio/tls.cfg")
modparam("tls", "tls_force_run", 1)
#!endif

# -- WebSocket --
#!ifdef WITH_WEBSOCKETS
modparam("websocket", "keepalive_mechanism", 1)      # PING frames
modparam("websocket", "keepalive_timeout", 30)
modparam("websocket", "keepalive_processes", 1)
#!endif

# -- topoh (topology hiding) --
#!ifdef WITH_TOPOH
modparam("topoh", "mask_key", "YOUR_TOPOH_SECRET_KEY")
modparam("topoh", "mask_ip", "10.1.1.10")   # unused address placed in masked headers
modparam("topoh", "mask_callid", 1)
modparam("topoh", "callid_prefix", "VoIP-")
#!endif

## ==== Routing Logic ====

## ---- Main Request Route ----
request_route {
    # Per-request logging
    xlog("L_INFO", ">>> $rm from $fu ($si:$sp) to $ru\n");

    # Max forwards check
    if (!mf_process_maxfwd_header("10")) {
        sl_send_reply("483", "Too Many Hops");
        exit;
    }

    # Sanity checks
    if (!sanity_check("17895", "7")) {
        xlog("L_WARN", "Malformed SIP from $si:$sp\n");
        exit;
    }

    # ---- Anti-flood / DDoS Protection ----
    #!ifdef WITH_ANTIFLOOD
    route(ANTIFLOOD);
    #!endif

    # ---- Handle WebSocket connections ----
    #!ifdef WITH_WEBSOCKETS
    if (proto == WS || proto == WSS) {
        # WebSocket SIP — force record-route with WS
        if (is_method("REGISTER")) {
            # Allow WS registrations
        }
    }
    #!endif

    # ---- CANCEL processing ----
    if (is_method("CANCEL")) {
        if (t_check_trans()) {
            route(RTPENGINE_DELETE);
            t_relay();
        }
        exit;
    }

    # ---- Retransmission handling ----
    if (!is_method("ACK")) {
        if (t_precheck_trans()) {
            t_check_trans();
            exit;
        }
        t_check_trans();
    }

    # ---- Record-Route for dialogs ----
    if (is_method("INVITE|SUBSCRIBE")) {
        record_route();
    }

    # ---- Sequential requests (in-dialog) ----
    if (has_totag()) {
        route(WITHINDLG);
        exit;
    }

    # ---- Initial requests ----

    # Handle REGISTER
    if (is_method("REGISTER")) {
        route(REGISTRAR);
        exit;
    }

    # Handle OPTIONS (keepalive)
    if (is_method("OPTIONS") && uri == myself) {
        sl_send_reply("200", "OK");
        exit;
    }

    # Handle INVITE — main call processing
    if (is_method("INVITE")) {
        # Enable dialog tracking
        setflag(FLT_DLG);
        dlg_manage();

        # NAT detection and fixing
        route(NATDETECT);

        # RTPEngine: offer (external→internal bridging)
        route(RTPENGINE_OFFER);

        # Dispatch to FreeSWITCH pool
        route(DISPATCH);
        exit;
    }

    # Handle other methods
    if (is_method("NOTIFY|INFO|UPDATE|PRACK")) {
        route(RELAY);
        exit;
    }

    # Reject anything else
    sl_send_reply("405", "Method Not Allowed");
    exit;
}


## ---- In-Dialog Request Routing ----
route[WITHINDLG] {
    if (!loose_route()) {
        if (is_method("ACK")) {
            if (!t_check_trans()) {
                # ACK without matching transaction — absorb
                exit;
            }
        }
        sl_send_reply("404", "Not Found");
        exit;
    }

    if (is_method("ACK")) {
        route(NATMANAGE);
    } else if (is_method("BYE")) {
        route(RTPENGINE_DELETE);
    } else if (is_method("INVITE")) {
        # Re-INVITE — handle RTPEngine for media changes
        route(NATDETECT);
        route(RTPENGINE_OFFER);
    }

    route(RELAY);
    exit;
}


## ---- Relay Route ----
route[RELAY] {
    if (is_method("INVITE|BYE|SUBSCRIBE|UPDATE|REFER")) {
        if (!t_is_set("branch_route")) {
            t_on_branch("MANAGE_BRANCH");
        }
    }
    if (is_method("INVITE|SUBSCRIBE|UPDATE")) {
        if (!t_is_set("onreply_route")) {
            t_on_reply("MANAGE_REPLY");
        }
    }
    if (is_method("INVITE")) {
        if (!t_is_set("failure_route")) {
            t_on_failure("MANAGE_FAILURE");
        }
    }

    if (!t_relay()) {
        sl_reply_error();
    }
    exit;
}


## ---- REGISTER Handling ----
route[REGISTRAR] {
    # NAT detection for registrations
    route(NATDETECT);

    # Option 1: Store locally (Kamailio manages registrations)
    if (!save("location")) {
        sl_reply_error();
    }
    exit;

    # Option 2: Proxy registrations to FreeSWITCH (uncomment if FS manages registrations)
    # route(DISPATCH);
    # exit;
}


## ---- Dispatcher — Load Balance to FreeSWITCH ----
route[DISPATCH] {
    # Set 1 = FreeSWITCH media servers
    # Algorithm 0 = hash over callid (sticky sessions — in-dialog goes to same FS)
    # Flags: 2 = failover support, 4 = use only active destinations
    if (!ds_select_dst("1", "0", "6")) {
        xlog("L_ERR", "DISPATCH: No FreeSWITCH servers available!\n");
        sl_send_reply("503", "Service Unavailable");
        exit;
    }

    xlog("L_INFO", "DISPATCH: Routing $rm to $du (FS pool)\n");

    t_on_failure("DISPATCH_FAILURE");
    route(RELAY);
    exit;
}


## ---- NAT Detection ----
route[NATDETECT] {
    force_rport();
    if (nat_uac_test("19")) {
        # Client is behind NAT
        setflag(FLT_NATS);
        setbflag(FLB_NATB);

        if (is_first_hop()) {
            set_contact_alias();
        }
    }
}


## ---- NAT Management ----
route[NATMANAGE] {
    if (is_request()) {
        if (has_totag()) {
            if (check_route_param("nat=yes")) {
                setbflag(FLB_NATB);
            }
        }
    }
    if (isbflagset(FLB_NATB)) {
        # Same alias handling applies to both requests and replies
        add_contact_alias();
    }
}


## ---- RTPEngine Routes ----
route[RTPENGINE_OFFER] {
    if (!is_method("INVITE|UPDATE")) return;
    if (!has_body("application/sdp")) return;

    $var(rtpflags) = "replace-origin replace-session-connection";

    # Determine direction based on source
    if ($si == "YOUR_FS1_IP" || $si == "YOUR_FS2_IP" || $si == "YOUR_FS3_IP") {
        # From FreeSWITCH → going external
        $var(rtpflags) = $var(rtpflags) + " direction=internal direction=external";
    } else {
        # From external → going to FreeSWITCH
        $var(rtpflags) = $var(rtpflags) + " direction=external direction=internal";
    }

    # WebRTC client — need ICE and DTLS
    if (proto == WS || proto == WSS) {
        $var(rtpflags) = $var(rtpflags) + " ICE=force DTLS=passive SDES-off";
    }

    rtpengine_offer("$var(rtpflags)");
}

route[RTPENGINE_ANSWER] {
    if (!has_body("application/sdp")) return;

    $var(rtpflags) = "replace-origin replace-session-connection";

    # Mirror the direction logic from the offer
    if ($si == "YOUR_FS1_IP" || $si == "YOUR_FS2_IP" || $si == "YOUR_FS3_IP") {
        $var(rtpflags) = $var(rtpflags) + " direction=internal direction=external";
    } else {
        $var(rtpflags) = $var(rtpflags) + " direction=external direction=internal";
    }

    if (proto == WS || proto == WSS) {
        $var(rtpflags) = $var(rtpflags) + " ICE=force DTLS=passive SDES-off";
    }

    rtpengine_answer("$var(rtpflags)");
}

route[RTPENGINE_DELETE] {
    rtpengine_delete();
}


## ---- Anti-Flood Protection ----
#!ifdef WITH_ANTIFLOOD
route[ANTIFLOOD] {
    # Skip checks for trusted IPs (FreeSWITCH servers, trunks)
    if ($si == "YOUR_FS1_IP" || $si == "YOUR_FS2_IP" || $si == "YOUR_FS3_IP") {
        return;
    }

    # Check if IP is in blocked table
    if ($sht(blocked=>$si) != $null) {
        xlog("L_WARN", "ANTIFLOOD: Blocked request from $si\n");
        exit;
    }

    # Pike rate limiter
    if (!pike_check_req()) {
        xlog("L_ALERT", "ANTIFLOOD: Pike blocking $si — rate limit exceeded\n");
        $sht(blocked=>$si) = 1;    # Block for 300 seconds (htable autoexpire)
        exit;
    }
}
#!endif


## ---- Branch Route ----
branch_route[MANAGE_BRANCH] {
    xlog("L_DBG", "BRANCH: new branch [$T_branch_idx] to $ru\n");
    route(NATMANAGE);
}


## ---- Reply Route ----
onreply_route[MANAGE_REPLY] {
    xlog("L_DBG", "REPLY: $rs $rr from $si\n");

    if (status =~ "[12][0-9][0-9]") {
        route(NATMANAGE);
    }

    # RTPEngine answer on 183/200 with SDP
    if (status =~ "(183|200)" && has_body("application/sdp")) {
        route(RTPENGINE_ANSWER);
    }
}


## ---- Failure Route — Dispatcher Failover ----
failure_route[DISPATCH_FAILURE] {
    if (t_is_canceled()) exit;

    xlog("L_WARN", "DISPATCH_FAILURE: $rs from $du — trying next FS\n");

    # On failure (timeout, 5xx), try next server
    if (t_check_status("5[0-9][0-9]") || t_check_status("408")) {
        # Clean up RTPEngine session for failed branch
        route(RTPENGINE_DELETE);

        # Try next dispatcher destination
        if (ds_next_dst()) {
            xlog("L_INFO", "DISPATCH_FAILURE: Failing over to $du\n");
            route(RTPENGINE_OFFER);
            route(RELAY);
            exit;
        }
    }

    # All dispatchers failed
    xlog("L_ERR", "DISPATCH_FAILURE: All FreeSWITCH servers failed\n");
    send_reply("503", "All Media Servers Unavailable");
}

failure_route[MANAGE_FAILURE] {
    if (t_is_canceled()) exit;

    xlog("L_WARN", "FAILURE: $rs for $rm to $ru\n");
}


## ---- WebSocket HTTP Handling ----
#!ifdef WITH_WEBSOCKETS
event_route[xhttp:request] {
    set_reply_close();
    set_reply_no_connect();

    if ($hdr(Upgrade) =~ "websocket" &&
        $hdr(Connection) =~ "Upgrade" &&
        $rm =~ "GET") {

        # Validate WebSocket handshake
        if ($hdr(Sec-WebSocket-Protocol) =~ "sip") {
            # Accept the WebSocket upgrade
            if (ws_handle_handshake()) {
                exit;
            }
        }
    }

    # Not a WebSocket request — return 403
    xhttp_reply("403", "Forbidden", "text/html",
        "<html><body>Forbidden</body></html>");
}
#!endif

TLS Configuration

Create /etc/kamailio/tls.cfg:

# /etc/kamailio/tls.cfg

[server:default]
method = TLSv1.2+
certificate = /etc/kamailio/tls/server.pem
private_key = /etc/kamailio/tls/server.key
verify_certificate = no
require_certificate = no
cipher_list = HIGH:!aNULL:!MD5:!DSS

[client:default]
method = TLSv1.2+
verify_certificate = no

Initialize Kamailio Database

# Configure kamctlrc first (kamdbctl reads it): enable DBENGINE=MYSQL,
# and set DBHOST if the database is remote. Adjust if your kamctlrc differs.
sed -i 's/^# DBENGINE=MYSQL/DBENGINE=MYSQL/' /etc/kamailio/kamctlrc

# Create the Kamailio database and tables
kamdbctl create

# When prompted:
# MySQL password for root: (your MySQL root password)
# Database name: kamailio (default)
# Install extra tables? Yes
# Install presence tables? No (not needed for SBC role)

# Verify tables exist
mysql -u kamailio -pYOUR_DB_PASSWORD kamailio -e "SHOW TABLES;"

Firewall Rules for Kamailio

# UFW firewall rules for Kamailio SBC
ufw default deny incoming
ufw default allow outgoing

# SSH
ufw allow 22/tcp

# SIP (UDP + TCP)
ufw allow 5060/udp
ufw allow 5060/tcp

# SIP TLS
ufw allow 5061/tcp

# WebSocket (WSS)
ufw allow 8443/tcp

# Allow all traffic from internal network
ufw allow from 10.0.1.0/24

# Enable firewall
ufw enable

Start and Verify

# Check configuration syntax
kamailio -c -f /etc/kamailio/kamailio.cfg

# Start Kamailio
systemctl start kamailio

# Verify it is listening
ss -ulnp | grep kamailio
ss -tlnp | grep kamailio

# Check logs
journalctl -u kamailio -f

# Test SIP response (install sipsak first: apt-get install -y sipsak)
sipsak -vv -s sip:test@YOUR_KAM1_PUBLIC:5060

# Watch live SIP traffic in another terminal
sngrep port 5060

5. Dispatcher — Load Balancing FreeSWITCH

Understanding Dispatcher

The dispatcher module is Kamailio's built-in load balancer. It maintains a list of backend servers organized in destination sets (groups), monitors their health via SIP OPTIONS pings, and distributes traffic using configurable algorithms.

Dispatcher Algorithms

Algorithm ID Description Best For
Hash over Call-ID 0 Same Call-ID always goes to same server Standard calls — in-dialog requests stay together
Hash over From URI 1 Same caller always goes to same server User affinity
Hash over To URI 2 Same destination always goes to same server DID-based routing
Hash over Request-URI 3 Same R-URI goes to same server Service-based routing
Round-robin 4 Sequential rotation through servers Even distribution
Hash over auth username 5 Authenticated user affinity Registered users
Random 6 Random selection Simple load spreading
Hash over PV 7 Hash over any pseudo-variable Custom logic
Priority-based 8 First destination by priority order Failover-first setups
Weight-based 9 Proportional distribution by weight Heterogeneous servers
Call load 10 Least active calls (needs ds_load_update) Most even load
Relative weight 11 Weight-based with relative proportions Mixed-capacity servers

Recommended for VoIP: Algorithm 0 (Call-ID hash) ensures all SIP messages for the same call go to the same FreeSWITCH. Algorithm 10 (call load) gives the most even distribution, but requires tracking active calls with ds_load_update() in the routing script.

Why Call-ID Hash Matters

SIP calls involve multiple transactions:

INVITE → 100 Trying → 180 Ringing → 200 OK → ACK
  ... call in progress ...
re-INVITE (hold/resume)
BYE → 200 OK

All messages for the same call must reach the same FreeSWITCH. Call-ID hash guarantees this because the Call-ID header is constant for the entire call. Without it, a re-INVITE or BYE could go to a different FreeSWITCH that knows nothing about the call.
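The stickiness property holds for any deterministic hash: the same Call-ID always maps to the same server index, while distinct Call-IDs spread across the pool. A toy demo using cksum (Kamailio's internal hash function differs, but the property it relies on is the same):

```shell
# Toy demo: deterministically hash a Call-ID onto a 3-server pool.
pick_server() {
    callid="$1"
    pool_size=3
    # cksum is stable across runs, so a given Call-ID always yields
    # the same index — the essence of algorithm 0's sticky sessions.
    idx=$(( $(printf '%s' "$callid" | cksum | cut -d' ' -f1) % pool_size ))
    echo "fs0$(( idx + 1 ))"
}

# The INVITE, re-INVITE, and BYE of one call share a Call-ID,
# so they all land on the same FreeSWITCH:
pick_server "a84b4c76e66710@203.0.113.7"
pick_server "a84b4c76e66710@203.0.113.7"   # same server, every time
pick_server "f81d4fae7dec@198.51.100.2"    # a different call may land elsewhere
```

Round-robin (algorithm 4), by contrast, would give a different answer on each selection, which is exactly why it is unsafe without some other stickiness mechanism.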

Database-Backed Dispatcher Configuration

Populate the dispatcher table:

-- Connect to kamailio database
USE kamailio;

-- Destination set 1: FreeSWITCH media servers
-- Columns: id, setid, destination, flags, priority, attrs, description

INSERT INTO dispatcher (setid, destination, flags, priority, attrs, description)
VALUES
(1, 'sip:YOUR_FS1_IP:5060', 0, 0, 'weight=50;duid=fs01', 'FreeSWITCH-1 Media'),
(1, 'sip:YOUR_FS2_IP:5060', 0, 0, 'weight=50;duid=fs02', 'FreeSWITCH-2 Media'),
(1, 'sip:YOUR_FS3_IP:5060', 0, 0, 'weight=50;duid=fs03', 'FreeSWITCH-3 Media');

-- Destination set 2: Conference-dedicated FreeSWITCH (optional)
-- Useful to route high-resource conference calls to dedicated servers
INSERT INTO dispatcher (setid, destination, flags, priority, attrs, description)
VALUES
(2, 'sip:YOUR_FS3_IP:5060', 0, 0, 'weight=100;duid=fs03-conf', 'FreeSWITCH-3 Conference');

-- Verify
SELECT * FROM dispatcher;

Dispatcher Failover Limit

The third argument of ds_select_dst("setid", "algorithm", "limit") is not a bitmask of flags — it is the maximum number of destination addresses loaded for serial failover:

  ds_select_dst("1", "0", "6")

keeps up to 6 destinations from set 1 in the failover list, so a failure route can step through backup servers one at a time with ds_next_dst(). Destinations marked inactive by OPTIONS probing are skipped automatically.

Enhanced Dispatch Route

Here is an improved dispatch route with monitoring and logging:

## ---- Advanced Dispatcher Route ----
route[DISPATCH] {
    # Determine dispatch set based on call type
    $var(dispatch_set) = 1;    # Default: general media servers

    # Route conference calls to dedicated set (if configured)
    if ($rU =~ "^conf[0-9]+$") {
        $var(dispatch_set) = 2;
    }

    # Select destination with:
    #   Algorithm 0 = Call-ID hash (sticky sessions)
    #   Limit 6 = keep up to 6 destinations for failover (ds_next_dst)
    if (!ds_select_dst("$var(dispatch_set)", "0", "6")) {
        xlog("L_ERR", "DISPATCH: No destinations available in set $var(dispatch_set)!\n");

        # Try fallback set if primary set is empty
        if ($var(dispatch_set) != 1) {
            xlog("L_WARN", "DISPATCH: Falling back to general set 1\n");
            if (!ds_select_dst("1", "0", "6")) {
                sl_send_reply("503", "Service Unavailable — No Media Servers");
                exit;
            }
        } else {
            sl_send_reply("503", "Service Unavailable — No Media Servers");
            exit;
        }
    }

    # Log the selected destination
    xlog("L_INFO", "DISPATCH: $rm $fu → $du (set=$var(dispatch_set))\n");

    # Set failure route for failover
    t_on_failure("DISPATCH_FAILURE");
    route(RELAY);
    exit;
}


## ---- Dispatcher Failure Route ----
failure_route[DISPATCH_FAILURE] {
    if (t_is_canceled()) exit;

    # Only failover on server errors or timeouts
    if (t_check_status("5[0-9][0-9]") || t_check_status("408")) {
        xlog("L_WARN", "DISPATCH_FAILURE: $rs from $du — trying next\n");

        # Mark this destination as probing (will be checked by OPTIONS pings)
        ds_mark_dst("p");

        # Clean up RTPEngine for the failed branch
        route(RTPENGINE_DELETE);

        # Try next destination in the set
        if (ds_next_dst()) {
            xlog("L_INFO", "DISPATCH_FAILURE: Failover to $du\n");
            route(RTPENGINE_OFFER);
            route(RELAY);
            exit;
        }

        xlog("L_ERR", "DISPATCH_FAILURE: All servers exhausted\n");
    }

    # 4xx responses: pass through to caller (authentication errors, etc.)
    if (t_check_status("4[0-9][0-9]")) {
        xlog("L_INFO", "DISPATCH_FAILURE: 4xx response $rs — passing through\n");
    }
}
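The status classes the failure route distinguishes can be sanity-checked outside Kamailio with plain grep. This is illustrative only; it just mirrors the route's failover-vs-pass-through decision:

```shell
# Which response codes trigger failover vs pass-through (mirrors the route's logic)
for code in 408 503 404 200; do
  if echo "$code" | grep -Eq '^5[0-9][0-9]$' || [ "$code" = "408" ]; then
    echo "$code: failover"
  else
    echo "$code: pass through"
  fi
done
# Prints: 408: failover
#         503: failover
#         404: pass through
#         200: pass through
```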

Runtime Dispatcher Management

# List all dispatcher destinations and their status
kamcmd dispatcher.list

# Output example:
# DEST: {
#   URI: sip:10.0.1.30:5060
#   FLAGS: AP    (A=Active, P=Probing enabled)
#   PRIORITY: 0
#   LATENCY: {
#     AVG: 2.450ms
#     MAX: 8.120ms
#     TIMEOUT: 0
#   }
# }

# Manually set a destination as inactive (for maintenance)
kamcmd dispatcher.set_state i 1 sip:YOUR_FS1_IP:5060

# State codes: a=active, i=inactive, d=disabled, p=probing

# Re-enable a destination after maintenance
kamcmd dispatcher.set_state a 1 sip:YOUR_FS1_IP:5060

# Reload dispatcher table from database (after adding/removing servers)
kamcmd dispatcher.reload

# Check the number of active destinations per set
kamcmd dispatcher.list | grep -c "FLAGS: AP"

Probing Configuration Details

The probing system sends SIP OPTIONS pings to each backend to detect failures:

Sequence:
1. Kamailio sends OPTIONS to FreeSWITCH every ds_ping_interval seconds
2. FreeSWITCH responds with 200 OK (healthy) or no response (down)
3. After ds_probing_threshold consecutive failures → destination marked INACTIVE
4. Probing continues on inactive destinations
5. After ds_inactive_threshold consecutive successes → destination marked ACTIVE

Timeline example (ds_ping_interval=10, ds_probing_threshold=3):
  t=0s   OPTIONS → FS1: 200 OK       (active, count=0)
  t=10s  OPTIONS → FS1: timeout       (active, fail_count=1)
  t=20s  OPTIONS → FS1: timeout       (active, fail_count=2)
  t=30s  OPTIONS → FS1: timeout       (INACTIVE, fail_count=3)  ← traffic stops
  t=40s  OPTIONS → FS1: 200 OK        (inactive, ok_count=1)    ← still probing
  t=50s  OPTIONS → FS1: 200 OK        (inactive, ok_count=2)
  t=60s  OPTIONS → FS1: 200 OK        (ACTIVE, ok_count=3)      ← traffic resumes

Detection time: 30 seconds (3 failures x 10 second interval). For faster detection, reduce ds_ping_interval to 5 seconds, but be aware of the additional OPTIONS traffic.
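The worst-case detection time is just threshold times interval, which makes tuning trade-offs easy to check with a throwaway calculation:

```shell
# Worst-case failure detection time for the probing parameters above
ds_ping_interval=10        # seconds between OPTIONS pings
ds_probing_threshold=3     # consecutive failures before marking inactive
echo "worst-case detection: $(( ds_ping_interval * ds_probing_threshold ))s"
# Prints: worst-case detection: 30s
```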


6. RTPEngine — Media Relay

Why RTPEngine?

In the Kamailio + FreeSWITCH architecture, RTPEngine solves critical media-layer problems:

Problem                   RTPEngine Solution
NAT traversal (media)     Relays RTP through a public IP — no direct path needed between endpoints
WebRTC ↔ SIP bridging     Converts DTLS-SRTP (WebRTC) ↔ plain RTP (SIP)
Topology hiding (media)   External parties see RTPEngine's IP, not FreeSWITCH's internal IP
Codec transcoding         Converts between codecs (e.g., G.729 ↔ G.711) without burdening FreeSWITCH
Call recording            Can record RTP streams to pcap files
SRTP                      Terminates and originates SRTP for encrypted calls

Without RTPEngine, you would need FreeSWITCH on a public IP (security risk) or complex iptables NAT rules (fragile and hard to scale).

Install RTPEngine on Debian 12

#!/bin/bash
# install-rtpengine.sh — Run on rtp01 and rtp02

# Add Sipwise repository for RTPEngine
echo "deb [signed-by=/usr/share/keyrings/sipwise.gpg] https://deb.sipwise.com/spce/mr12.5.1/ bookworm main" \
    > /etc/apt/sources.list.d/sipwise.list

curl -fsSL https://deb.sipwise.com/spce/keyring/sipwise-keyring-bootstrap.gpg | \
    gpg --dearmor -o /usr/share/keyrings/sipwise.gpg

apt-get update

# Install RTPEngine
apt-get install -y rtpengine

# If the Sipwise repo is not available, build from source:
# apt-get install -y build-essential dpkg-dev debhelper iptables-dev \
#     libavcodec-dev libavfilter-dev libavformat-dev libavutil-dev \
#     libbencode-perl libcrypt-openssl-rsa-perl libcrypt-rijndael-perl \
#     libcurl4-openssl-dev libdigest-hmac-perl libevent-dev \
#     libglib2.0-dev libhiredis-dev libio-multiplex-perl \
#     libio-socket-inet6-perl libjson-glib-dev libmnl-dev \
#     libnet-interface-perl libnftnl-dev libpcap0.8-dev \
#     libpcre3-dev libspandsp-dev libssl-dev libsystemd-dev \
#     libwebsockets-dev libxmlrpc-core-c3-dev markdown nfs-common \
#     pandoc
#
# git clone https://github.com/sipwise/rtpengine.git
# cd rtpengine
# dpkg-buildpackage -b -uc -us
# dpkg -i ../rtpengine_*.deb

RTPEngine Configuration

Create /etc/rtpengine/rtpengine.conf:

# /etc/rtpengine/rtpengine.conf
# RTPEngine configuration for Kamailio + FreeSWITCH platform

[rtpengine]
# Control socket — Kamailio connects here
listen-ng = YOUR_RTP1_PRIVATE:2223

# Network interfaces
# Format: label/IP or label/internal_IP!external_IP
# "internal" = towards FreeSWITCH (private network)
# "external" = towards the internet (public IP)
interface = internal/YOUR_RTP1_PRIVATE
interface = external/YOUR_RTP1_PRIVATE!YOUR_RTP1_PUBLIC

# RTP port range
port-min = 20000
port-max = 40000

# Timeouts
timeout = 60              # Tear down a session after 60s with no RTP received
silent-timeout = 3600     # Same, but for streams that are deliberately silenced (e.g. hold)
final-timeout = 7200      # Hard maximum call duration

# TOS/DSCP for QoS
tos = 184                 # EF (Expedited Forwarding) for voice

# Recording (optional — pcap files)
# recording-dir = /var/spool/rtpengine
# recording-method = pcap

# Codec transcoding support
# Requires compilation with ffmpeg/libavcodec
# allow-transcoding = true

# Logging
log-level = 5             # 5=notice, 6=info, 7=debug
log-facility = daemon
log-facility-cdr = local1

# Process settings
pidfile = /run/rtpengine/rtpengine.pid
foreground = false
num-threads = 0           # 0 = auto (one per CPU core)

# Table (iptables/nftables kernel module — for kernel-space forwarding)
# table = 0               # Uncomment for kernel-space RTP relay (better performance)
# no-fallback = false

Systemd Service

Create or edit /etc/systemd/system/rtpengine.service:

[Unit]
Description=RTPEngine Media Proxy
After=network-online.target
Wants=network-online.target

[Service]
Type=forking
PIDFile=/run/rtpengine/rtpengine.pid
ExecStartPre=/bin/mkdir -p /run/rtpengine
ExecStartPre=/bin/chown rtpengine:rtpengine /run/rtpengine
ExecStart=/usr/bin/rtpengine --config-file=/etc/rtpengine/rtpengine.conf
ExecStop=/bin/kill -TERM $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Firewall Rules for RTPEngine

# Control interface — only from Kamailio
ufw allow from YOUR_KAM1_PRIVATE to any port 2223 proto udp
ufw allow from YOUR_KAM2_PRIVATE to any port 2223 proto udp

# RTP port range — from anywhere (media comes from external endpoints)
ufw allow 20000:40000/udp

# SSH
ufw allow 22/tcp

ufw enable

Start and Verify RTPEngine

# Start the service
systemctl daemon-reload
systemctl enable --now rtpengine

# Check it is running
systemctl status rtpengine

# Verify listening
ss -ulnp | grep rtpengine

# Test the control interface (from Kamailio server)
# The ng protocol frames every message as "<cookie> <bencoded dict>", so include a cookie
echo 'test1 d7:command4:pinge' | nc -u -w1 YOUR_RTP1_PRIVATE 2223
# Should respond with: test1 d6:result4:ponge

# Check active sessions (from RTPEngine server)
rtpengine-ctl list sessions

Kamailio ↔ RTPEngine Integration

The rtpengine module in Kamailio communicates with RTPEngine via the ng (next-generation) control protocol over UDP. The flow is:

Inbound INVITE (with SDP):
1. Kamailio receives INVITE from external trunk
2. route(RTPENGINE_OFFER): Kamailio sends "offer" to RTPEngine
   - RTPEngine allocates UDP ports for RTP relay
   - RTPEngine rewrites SDP: external endpoint ↔ RTPEngine ↔ internal FreeSWITCH
   - SDP in INVITE now points to RTPEngine's internal IP (towards FreeSWITCH)
3. Kamailio forwards modified INVITE to FreeSWITCH
4. FreeSWITCH sends 200 OK (with SDP)
5. onreply_route: Kamailio sends "answer" to RTPEngine
   - RTPEngine rewrites SDP in 200 OK: FreeSWITCH → RTPEngine's external IP
6. Kamailio sends modified 200 OK to external trunk
7. Media flows: External ↔ RTPEngine (external iface) ↔ RTPEngine (internal iface) ↔ FreeSWITCH
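The ng messages in this flow (offer, answer, delete) are bencoded dictionaries; the Kamailio module builds and frames them for you. As a minimal sketch of the encoding, a single-key command dictionary looks like this:

```shell
# Bencode {"command": <cmd>} as used by the ng control protocol:
# d 7:command <len>:<cmd> e  →  "d7:command4:pinge" for "ping"
ng_command() {
  printf 'd7:command%d:%se' "${#1}" "$1"
}
ng_command ping
# Prints: d7:command4:pinge
```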

SDP Manipulation Example

Original SDP from external trunk:
  c=IN IP4 203.0.113.50      ← trunk's RTP IP
  m=audio 30000 RTP/AVP 0 8  ← trunk's RTP port

After rtpengine_offer() — SDP sent to FreeSWITCH:
  c=IN IP4 10.0.1.20         ← RTPEngine's INTERNAL interface
  m=audio 20100 RTP/AVP 0 8  ← RTPEngine's allocated port (internal side)

FreeSWITCH answers with SDP:
  c=IN IP4 10.0.1.30         ← FreeSWITCH's IP
  m=audio 19200 RTP/AVP 0    ← FreeSWITCH's RTP port

After rtpengine_answer() — SDP sent to external trunk:
  c=IN IP4 YOUR_RTP1_PUBLIC  ← RTPEngine's EXTERNAL interface
  m=audio 20200 RTP/AVP 0    ← RTPEngine's allocated port (external side)

Result: External trunk sends RTP to RTPEngine's public IP
        RTPEngine relays to FreeSWITCH on private network
        Neither party knows the other's real IP

RTPEngine Clustering

For high availability, run multiple RTPEngine instances. Kamailio can be configured to use them:

# In kamailio.cfg — multiple RTPEngine backends
# Format: "udp:IP:PORT=weight udp:IP:PORT=weight"
modparam("rtpengine", "rtpengine_sock",
    "udp:YOUR_RTP1_PRIVATE:2223=1 udp:YOUR_RTP2_PRIVATE:2223=1")

RTPEngine instances are stateless from a clustering perspective — each instance independently manages its own RTP sessions. Kamailio uses consistent hashing (based on Call-ID) to ensure that offer, answer, and delete for the same call go to the same RTPEngine instance.
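The selection can be pictured as hashing the Call-ID and taking it modulo the node count. This sketch is illustrative only; Kamailio's actual hashing differs in detail, but the key property is the same: equal Call-IDs always map to the same node.

```shell
# Illustrative only — pick one of N RTPEngine sockets by hashing the Call-ID
pick_node() {
  nodes=2
  h=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo "node $(( h % nodes ))"
}
pick_node "abc123@host"   # same Call-ID always yields the same node
```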

If an RTPEngine instance goes down:

1. Kamailio's keepalive pings to its ng socket fail, and the node is marked disabled for new calls
2. New offers are hashed across the remaining instances only
3. Calls already relayed by the failed instance lose their media path and are eventually torn down by RTP timeouts


7. FreeSWITCH Media Server Configuration

Design Principle: Headless Media Server

When FreeSWITCH runs behind Kamailio, its role changes:

Traditional FreeSWITCH        Behind Kamailio
Handles SIP registration      Kamailio handles registration
Manages NAT traversal         RTPEngine handles NAT
Authenticates SIP peers       Kamailio authenticates
Exposed to internet           Internal network only
Processes all SIP methods     Only receives pre-routed calls
Single instance               Multiple instances in a pool

FreeSWITCH becomes a headless media application server — it focuses purely on call logic, media processing, and application features.

Install FreeSWITCH

#!/bin/bash
# install-freeswitch.sh — Run on fs01, fs02, fs03

# Add SignalWire repository
TOKEN="YOUR_SIGNALWIRE_TOKEN"  # Get from signalwire.com (free account)

apt-get install -y gnupg2 lsb-release

curl -fsSL https://freeswitch.signalwire.com/repo/deb/debian-release/signalwire-freeswitch-repo.gpg \
    > /usr/share/keyrings/signalwire-freeswitch-repo.gpg

echo "machine freeswitch.signalwire.com login signalwire password $TOKEN" > /etc/apt/auth.conf
chmod 600 /etc/apt/auth.conf

echo "deb [signed-by=/usr/share/keyrings/signalwire-freeswitch-repo.gpg] \
    https://freeswitch.signalwire.com/repo/deb/debian-release/ bookworm main" \
    > /etc/apt/sources.list.d/freeswitch.list

apt-get update

# Install FreeSWITCH with common modules
apt-get install -y \
    freeswitch-meta-codecs \
    freeswitch-mod-commands \
    freeswitch-mod-conference \
    freeswitch-mod-console \
    freeswitch-mod-db \
    freeswitch-mod-dialplan-xml \
    freeswitch-mod-dptools \
    freeswitch-mod-enum \
    freeswitch-mod-event-socket \
    freeswitch-mod-fifo \
    freeswitch-mod-hash \
    freeswitch-mod-httapi \
    freeswitch-mod-local-stream \
    freeswitch-mod-logfile \
    freeswitch-mod-loopback \
    freeswitch-mod-native-file \
    freeswitch-mod-say-en \
    freeswitch-mod-sndfile \
    freeswitch-mod-sofia \
    freeswitch-mod-tone-stream \
    freeswitch-mod-voicemail \
    freeswitch-mod-xml-cdr \
    freeswitch-mod-xml-curl

systemctl enable freeswitch

SIP Profile — Internal Only

FreeSWITCH should only accept SIP from Kamailio. Create a dedicated SIP profile.

Create /etc/freeswitch/sip_profiles/kamailio.xml:

<!--
  FreeSWITCH SIP Profile: kamailio
  Purpose: Accept calls only from Kamailio SBC
  This profile listens on the private network and trusts Kamailio
-->
<profile name="kamailio">
  <settings>
    <!-- Listen only on private network -->
    <param name="sip-ip" value="$${local_ip_v4}"/>
    <param name="sip-port" value="5060"/>
    <param name="rtp-ip" value="$${local_ip_v4}"/>

    <!-- Disable RTP timer — RTPEngine handles media relay/timeout -->
    <param name="rtp-timeout-sec" value="0"/>
    <param name="rtp-hold-timeout-sec" value="0"/>

    <!-- Dialplan context for calls from Kamailio -->
    <param name="context" value="from-kamailio"/>

    <!-- Kamailio has already authenticated the call; the inbound ACL below
         admits its traffic without a digest challenge -->
    <param name="challenge-realm" value="auto_from"/>

    <!-- Accept all calls from trusted IPs (Kamailio) -->
    <param name="apply-inbound-acl" value="kamailio-acl"/>

    <!-- Disable registration on this profile -->
    <param name="accept-blind-reg" value="false"/>

    <!-- Codec preferences -->
    <param name="inbound-codec-prefs" value="PCMA,PCMU,G722,opus"/>
    <param name="outbound-codec-prefs" value="PCMA,PCMU,G722,opus"/>
    <param name="inbound-codec-negotiation" value="generous"/>

    <!-- Dialog management -->
    <param name="manage-presence" value="false"/>
    <param name="manage-shared-appearance" value="false"/>

    <!-- Pass-through — let Kamailio handle NAT -->
    <param name="aggressive-nat-detection" value="false"/>
    <param name="local-network-acl" value="localnet.auto"/>

    <!-- SIP options -->
    <param name="disable-transfer" value="false"/>
    <param name="enable-timer" value="false"/>
    <param name="enable-100rel" value="false"/>

    <!-- Logging -->
    <param name="log-auth-failures" value="true"/>
    <param name="debug" value="0"/>
  </settings>
</profile>

ACL — Only Accept from Kamailio

Edit /etc/freeswitch/autoload_configs/acl.conf.xml:

<configuration name="acl.conf" description="Network ACL">
  <network-lists>
    <!-- Kamailio SBC servers -->
    <list name="kamailio-acl" default="deny">
      <node type="allow" cidr="YOUR_KAM1_PRIVATE/32"/>
      <node type="allow" cidr="YOUR_KAM2_PRIVATE/32"/>
    </list>

    <!-- RTPEngine servers (for direct media) -->
    <list name="rtpengine-acl" default="deny">
      <node type="allow" cidr="YOUR_RTP1_PRIVATE/32"/>
      <node type="allow" cidr="YOUR_RTP2_PRIVATE/32"/>
    </list>

    <!-- Internal network -->
    <list name="internal-acl" default="deny">
      <node type="allow" cidr="10.0.1.0/24"/>
    </list>
  </network-lists>
</configuration>

Disable Default External Profile

FreeSWITCH ships with internal and external profiles that listen on default ports. Disable them since we use our custom kamailio profile:

# Disable default profiles (move them out of the way)
mv /etc/freeswitch/sip_profiles/internal.xml /etc/freeswitch/sip_profiles/internal.xml.disabled
mv /etc/freeswitch/sip_profiles/external.xml /etc/freeswitch/sip_profiles/external.xml.disabled

# If you need the internal profile for registered extensions, keep it
# but change its port to avoid conflict:
# <param name="sip-port" value="5080"/>

Dialplan — Calls from Kamailio

Create /etc/freeswitch/dialplan/from-kamailio.xml:

<!--
  Dialplan context: from-kamailio
  Handles all calls dispatched by the Kamailio SBC
-->
<include>
  <context name="from-kamailio">

    <!-- ============================================ -->
    <!-- IVR: Main Auto-Attendant                    -->
    <!-- ============================================ -->
    <extension name="main-ivr">
      <condition field="destination_number" expression="^(ivr|2000)$">
        <action application="answer"/>
        <action application="sleep" data="500"/>
        <action application="ivr" data="main_ivr"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- Call Queue: Sales                           -->
    <!-- ============================================ -->
    <extension name="queue-sales">
      <condition field="destination_number" expression="^(sales|3001)$">
        <action application="answer"/>
        <action application="set" data="fifo_music=/usr/share/freeswitch/sounds/music/hold.wav"/>
        <action application="fifo" data="sales@${domain_name} in"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- Call Queue: Support                         -->
    <!-- ============================================ -->
    <extension name="queue-support">
      <condition field="destination_number" expression="^(support|3002)$">
        <action application="answer"/>
        <action application="set" data="fifo_music=/usr/share/freeswitch/sounds/music/hold.wav"/>
        <action application="fifo" data="support@${domain_name} in"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- Conference Bridge                           -->
    <!-- ============================================ -->
    <extension name="conference">
      <condition field="destination_number" expression="^conf(\d+)$">
        <action application="answer"/>
        <action application="conference" data="room-$1@default"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- Voicemail: Leave Message                    -->
    <!-- ============================================ -->
    <extension name="voicemail-leave">
      <condition field="destination_number" expression="^vm(\d+)$">
        <action application="answer"/>
        <action application="sleep" data="500"/>
        <action application="voicemail" data="default ${domain_name} $1"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- DID Routing: Route by called number         -->
    <!-- ============================================ -->
    <extension name="did-routing">
      <condition field="destination_number" expression="^(\+?\d{10,15})$">
        <!-- Look up DID routing from database -->
        <action application="set" data="continue_on_fail=true"/>
        <action application="set" data="hangup_after_bridge=true"/>

        <!-- Example: direct extension mapping -->
        <!-- In production, use mod_xml_curl for dynamic DID→destination lookup -->
        <action application="bridge" data="user/${destination_number}@${domain_name}"/>

        <!-- If bridge fails, send to voicemail -->
        <action application="voicemail" data="default ${domain_name} ${destination_number}"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- Internal Extension Dialing (1000-1999)      -->
    <!-- ============================================ -->
    <extension name="local-extensions">
      <condition field="destination_number" expression="^(1\d{3})$">
        <action application="set" data="call_timeout=30"/>
        <action application="set" data="continue_on_fail=true"/>
        <action application="set" data="hangup_after_bridge=true"/>
        <action application="bridge" data="user/$1@${domain_name}"/>
        <!-- No answer → voicemail -->
        <action application="voicemail" data="default ${domain_name} $1"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- Outbound Calls (via Kamailio)               -->
    <!-- ============================================ -->
    <extension name="outbound">
      <condition field="destination_number" expression="^9(\d+)$">
        <!-- Strip the 9 prefix and send back to Kamailio for trunk routing -->
        <!-- In production, target the Kamailio floating VIP rather than a single node -->
        <action application="set" data="effective_caller_id_number=${outbound_caller_id_number}"/>
        <action application="bridge" data="sofia/kamailio/$1@YOUR_KAM1_PRIVATE"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- Echo Test                                   -->
    <!-- ============================================ -->
    <extension name="echo">
      <condition field="destination_number" expression="^9196$">
        <action application="answer"/>
        <action application="echo"/>
      </condition>
    </extension>

    <!-- ============================================ -->
    <!-- Catch-all: Unknown Destination              -->
    <!-- ============================================ -->
    <extension name="catch-all">
      <condition field="destination_number" expression="^(.*)$">
        <action application="log" data="WARNING: Unrouted call to ${destination_number} from ${caller_id_number}"/>
        <action application="respond" data="404"/>
      </condition>
    </extension>

  </context>
</include>
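The destination_number patterns above are PCRE; they can be spot-checked from the shell by rewriting \d as the POSIX class [0-9] for grep -E:

```shell
# Sanity-check the dialplan patterns (\d rewritten as [0-9] for POSIX grep)
echo "conf42"        | grep -Eq '^conf[0-9]+$'      && echo "conference: match"
echo "1234"          | grep -Eq '^1[0-9]{3}$'       && echo "extension: match"
echo "+442012345678" | grep -Eq '^\+?[0-9]{10,15}$' && echo "did: match"
```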

Event Socket Layer (ESL) Configuration

ESL allows external applications to control FreeSWITCH. This is essential for integration with custom applications, monitoring, and call control.

Edit /etc/freeswitch/autoload_configs/event_socket.conf.xml:

<configuration name="event_socket.conf" description="Socket Client">
  <settings>
    <!-- Bind to the private interface if possible; 0.0.0.0 relies entirely
         on the ACL below and the firewall -->
    <param name="listen-ip" value="0.0.0.0"/>
    <param name="listen-port" value="8021"/>
    <param name="password" value="YOUR_ESL_PASSWORD"/>

    <!-- ACL restriction — only allow from management network -->
    <param name="apply-inbound-acl" value="internal-acl"/>
  </settings>
</configuration>

XML CDR — Call Detail Records

Configure FreeSWITCH to POST CDRs to a central collector:

Edit /etc/freeswitch/autoload_configs/xml_cdr.conf.xml:

<configuration name="xml_cdr.conf" description="XML CDR">
  <settings>
    <!-- POST CDRs to central collector -->
    <param name="url" value="http://YOUR_DB1_IP:8080/cdr"/>
    <param name="retries" value="3"/>
    <param name="delay" value="5"/>
    <param name="log-http-and-disk" value="true"/>
    <param name="log-dir" value="/var/log/freeswitch/cdr-csv"/>
    <param name="err-log-dir" value="/var/log/freeswitch/cdr-csv/errors"/>
    <param name="encode" value="true"/>
    <param name="disable-100-continue" value="true"/>
  </settings>
</configuration>

FreeSWITCH Firewall (Internal Only)

# FreeSWITCH servers are internal only
ufw default deny incoming
ufw default allow outgoing

# SSH
ufw allow 22/tcp

# SIP from Kamailio only
ufw allow from YOUR_KAM1_PRIVATE to any port 5060 proto udp
ufw allow from YOUR_KAM1_PRIVATE to any port 5060 proto tcp
ufw allow from YOUR_KAM2_PRIVATE to any port 5060 proto udp
ufw allow from YOUR_KAM2_PRIVATE to any port 5060 proto tcp

# RTP from RTPEngine only
ufw allow from YOUR_RTP1_PRIVATE to any port 16384:32768 proto udp
ufw allow from YOUR_RTP2_PRIVATE to any port 16384:32768 proto udp

# ESL from management network
ufw allow from 10.0.1.0/24 to any port 8021 proto tcp

# Internal network (database, monitoring)
ufw allow from 10.0.1.0/24

ufw enable

Start and Verify FreeSWITCH

# Start FreeSWITCH
systemctl start freeswitch

# Verify SIP profile is loaded
fs_cli -x "sofia status"
# Should show: kamailio   sip:mod_sofia@YOUR_FS1_IP:5060   RUNNING

# Verify profile details
fs_cli -x "sofia status profile kamailio"

# Test: Send a SIP OPTIONS from Kamailio
# On Kamailio server:
kamcmd dispatcher.list
# Should show FreeSWITCH as Active (AP flags)

# Run echo test through the full chain:
# SIP phone → Kamailio → RTPEngine → FreeSWITCH (9196 echo)

Per-Instance Configuration

Each FreeSWITCH instance needs a unique switch.conf.xml with its own identity:

<!-- /etc/freeswitch/autoload_configs/switch.conf.xml -->
<configuration name="switch.conf" description="Core Configuration">
  <settings>
    <!-- Unique per instance -->
    <param name="switchname" value="fs01"/>

    <!-- Core settings -->
    <param name="max-sessions" value="5000"/>
    <param name="sessions-per-second" value="100"/>
    <param name="rtp-start-port" value="16384"/>
    <param name="rtp-end-port" value="32768"/>

    <!-- Logging -->
    <param name="loglevel" value="warning"/>
    <param name="colorize-console" value="false"/>

    <!-- Performance -->
    <param name="max-db-handles" value="50"/>
    <param name="db-handle-timeout" value="10"/>
  </settings>
</configuration>

Change switchname to fs02, fs03, etc. on each instance. This value appears in CDRs and logs, making it easy to identify which FreeSWITCH handled a call.
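When provisioning instances with a config-management tool, the switchname can be derived from the hostname instead of edited by hand. A small sketch, assuming hosts are named fsNN (the host variable stands in for $(hostname -f)):

```shell
# Derive the switchname from the host's name (assumes fsNN hostnames)
host="fs02.voip.internal"   # stand-in for $(hostname -f)
name="${host%%.*}"
echo "<param name=\"switchname\" value=\"$name\"/>"
# Prints: <param name="switchname" value="fs02"/>
```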


8. Database-Driven Routing

Shared Database Architecture

All components share a central MariaDB (or Galera cluster) for configuration, state, and CDRs. This enables:

1. Consistent routing: every Kamailio node reads the same subscriber, dispatcher, and routing tables
2. Runtime changes: edit a table, run kamcmd dispatcher.reload, and every proxy picks it up
3. Central CDRs: all FreeSWITCH instances write to one place for billing and reporting

Install MariaDB (Single Node or Galera Cluster)

For a single-node setup:

#!/bin/bash
# install-mariadb.sh — Run on db01

apt-get install -y mariadb-server mariadb-client

# Secure the installation
mysql_secure_installation
# Set root password, remove anonymous users, disable remote root, remove test DB

# Allow remote connections from private network
sed -i 's/bind-address.*/bind-address = 0.0.0.0/' /etc/mysql/mariadb.conf.d/50-server.cnf

# Performance tuning for VoIP
cat >> /etc/mysql/mariadb.conf.d/50-server.cnf << 'EOF'

# VoIP platform tuning
innodb_buffer_pool_size = 4G
innodb_log_file_size = 512M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
max_connections = 500
query_cache_type = 0
table_open_cache = 4000
tmp_table_size = 64M
max_heap_table_size = 64M
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
EOF

systemctl restart mariadb
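The 4G buffer pool above assumes a host with headroom to spare. A common rule of thumb (an assumption, not a MariaDB requirement) is roughly 70% of RAM on a dedicated database host:

```shell
# Size the buffer pool from available RAM (~70% on a dedicated DB host)
ram_mb=8192
echo "innodb_buffer_pool_size = $(( ram_mb * 70 / 100 ))M"
# Prints: innodb_buffer_pool_size = 5734M
```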

For a Galera cluster (3 nodes), add to the config on each node:

# /etc/mysql/mariadb.conf.d/60-galera.cnf
[galera]
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = "voip-cluster"
wsrep_cluster_address = "gcomm://YOUR_DB1_IP,YOUR_DB2_IP,YOUR_DB3_IP"
wsrep_node_address = "YOUR_DB1_IP"  # Change per node
wsrep_node_name = "db01"            # Change per node
wsrep_sst_method = mariabackup
wsrep_sst_auth = "sst_user:YOUR_SST_PASSWORD"
binlog_format = ROW
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2

Database Schema

Create the databases and tables used by each component:

-- ================================================
-- Kamailio database (created by kamdbctl create)
-- Key tables used by our SBC configuration:
-- ================================================

-- subscriber — SIP user credentials
-- (auto-created by kamdbctl, shown here for reference)
CREATE TABLE IF NOT EXISTS subscriber (
    id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    username VARCHAR(64) NOT NULL DEFAULT '',
    domain VARCHAR(64) NOT NULL DEFAULT '',
    password VARCHAR(64) NOT NULL DEFAULT '',
    ha1 VARCHAR(128) NOT NULL DEFAULT '',
    ha1b VARCHAR(128) NOT NULL DEFAULT '',
    PRIMARY KEY (id),
    UNIQUE KEY sub_idx (username, domain)
) ENGINE=InnoDB;

-- dispatcher — load balancer backends (auto-created)
-- Already populated in Section 5

-- ================================================
-- Custom routing tables
-- ================================================

-- DID routing: maps incoming DIDs to destinations
CREATE TABLE did_routing (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    did VARCHAR(20) NOT NULL COMMENT 'Incoming DID number (E.164)',
    domain VARCHAR(64) NOT NULL DEFAULT 'default' COMMENT 'Tenant domain',
    destination VARCHAR(128) NOT NULL COMMENT 'Destination (extension, queue, IVR)',
    dest_type ENUM('extension','queue','ivr','conference','voicemail','external') NOT NULL DEFAULT 'extension',
    priority INT NOT NULL DEFAULT 0 COMMENT 'Higher = preferred',
    active TINYINT(1) NOT NULL DEFAULT 1,
    description VARCHAR(255) DEFAULT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    UNIQUE KEY did_domain_idx (did, domain),
    KEY active_idx (active)
) ENGINE=InnoDB;

-- Example DID routing entries
INSERT INTO did_routing (did, domain, destination, dest_type, description) VALUES
('+442012345678', 'default', '2000', 'ivr', 'UK Main — IVR'),
('+442012345679', 'default', '3001', 'queue', 'UK Sales Direct'),
('+442012345680', 'default', '1001', 'extension', 'UK CEO Direct'),
('+33123456789', 'tenant-fr.example.com', '2000', 'ivr', 'France Main — IVR'),
('+33123456790', 'tenant-fr.example.com', '3002', 'queue', 'France Support');

-- Trunk routing: outbound carrier selection
CREATE TABLE trunk_routing (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    prefix VARCHAR(20) NOT NULL COMMENT 'Dialed prefix (longest match wins)',
    domain VARCHAR(64) NOT NULL DEFAULT 'default',
    trunk_name VARCHAR(64) NOT NULL COMMENT 'SIP trunk identifier',
    trunk_uri VARCHAR(256) NOT NULL COMMENT 'SIP URI for the trunk',
    priority INT NOT NULL DEFAULT 0,
    weight INT NOT NULL DEFAULT 100 COMMENT 'Weight for load distribution',
    active TINYINT(1) NOT NULL DEFAULT 1,
    description VARCHAR(255) DEFAULT NULL,
    PRIMARY KEY (id),
    KEY prefix_idx (prefix, domain, active, priority)
) ENGINE=InnoDB;

-- Example trunk routing
INSERT INTO trunk_routing (prefix, domain, trunk_name, trunk_uri, priority, description) VALUES
('+44', 'default', 'carrier-a-uk', 'sip:[email protected]', 10, 'Carrier A — UK primary'),
('+44', 'default', 'carrier-b-uk', 'sip:[email protected]', 5, 'Carrier B — UK backup'),
('+33', 'default', 'carrier-a-fr', 'sip:[email protected]', 10, 'Carrier A — France'),
('+1',  'default', 'carrier-c-us', 'sip:[email protected]', 10, 'Carrier C — US/Canada');
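The comment on the prefix column matters: with overlapping prefixes, the lookup must prefer the most specific match. A shell illustration of longest-prefix-wins, using a hypothetical prefix list (+4420 is not in the sample table above):

```shell
# Longest-prefix selection over a candidate list (illustrative)
dialed="+442012345678"
best=""
for p in +4420 +44 +1; do
  case "$dialed" in
    "$p"*) [ "${#p}" -gt "${#best}" ] && best="$p" ;;
  esac
done
echo "selected prefix: $best"
# Prints: selected prefix: +4420
```

In SQL this is typically expressed by ordering matches on prefix length descending and taking the first row.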

-- ================================================
-- CDR table (all FreeSWITCH instances write here)
-- ================================================
CREATE TABLE cdr (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    switch_name VARCHAR(32) NOT NULL COMMENT 'FreeSWITCH instance (fs01, fs02, ...)',
    call_uuid VARCHAR(64) NOT NULL,
    sip_call_id VARCHAR(128) DEFAULT NULL,
    caller_id_number VARCHAR(32) DEFAULT NULL,
    caller_id_name VARCHAR(64) DEFAULT NULL,
    destination_number VARCHAR(32) DEFAULT NULL,
    context VARCHAR(64) DEFAULT NULL,
    start_stamp DATETIME NOT NULL,
    answer_stamp DATETIME DEFAULT NULL,
    end_stamp DATETIME NOT NULL,
    duration INT NOT NULL DEFAULT 0,
    billsec INT NOT NULL DEFAULT 0,
    hangup_cause VARCHAR(64) DEFAULT NULL,
    sip_hangup_disposition VARCHAR(64) DEFAULT NULL,
    direction ENUM('inbound','outbound','internal') DEFAULT 'inbound',
    accountcode VARCHAR(32) DEFAULT NULL,
    domain VARCHAR(64) DEFAULT NULL,
    recording_path VARCHAR(512) DEFAULT NULL,
    PRIMARY KEY (id),
    KEY call_uuid_idx (call_uuid),
    KEY sip_call_id_idx (sip_call_id),
    KEY start_stamp_idx (start_stamp),
    KEY caller_idx (caller_id_number),
    KEY dest_idx (destination_number),
    KEY domain_idx (domain)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(start_stamp) * 100 + MONTH(start_stamp)) (
    PARTITION p202601 VALUES LESS THAN (202602),
    PARTITION p202602 VALUES LESS THAN (202603),
    PARTITION p202603 VALUES LESS THAN (202604),
    PARTITION p202604 VALUES LESS THAN (202605),
    PARTITION p202605 VALUES LESS THAN (202606),
    PARTITION p202606 VALUES LESS THAN (202607),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);
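The RANGE partitions above only run through mid-2026, so new monthly partitions must be carved out of pmax before it starts absorbing rows. A small sketch of the DDL a monthly maintenance job could generate (the helper name is illustrative; partition naming follows the cdr schema above):

```python
from datetime import date

def next_partition_sql(last):
    """Generate DDL that splits pmax to add the month after `last`.

    Partition names/values follow the cdr schema: pYYYYMM holds rows
    where YEAR*100+MONTH is LESS THAN the following month's value.
    """
    # Month being added, and the month after it (the LESS THAN bound)
    if last.month == 12:
        new = date(last.year + 1, 1, 1)
    else:
        new = date(last.year, last.month + 1, 1)
    if new.month == 12:
        bound = (new.year + 1) * 100 + 1
    else:
        bound = new.year * 100 + new.month + 1
    name = f"p{new.year}{new.month:02d}"
    return (
        f"ALTER TABLE cdr REORGANIZE PARTITION pmax INTO (\n"
        f"    PARTITION {name} VALUES LESS THAN ({bound}),\n"
        f"    PARTITION pmax VALUES LESS THAN MAXVALUE\n"
        f");"
    )
```

REORGANIZE PARTITION on pmax only rewrites rows already sitting in pmax, so running this before the new month begins is effectively free.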

Database Users and Permissions

-- Kamailio user (needs read/write on kamailio DB)
CREATE USER 'kamailio'@'10.0.1.%' IDENTIFIED BY 'YOUR_DB_PASSWORD';
GRANT ALL PRIVILEGES ON kamailio.* TO 'kamailio'@'10.0.1.%';

-- FreeSWITCH user (read routing tables, write CDRs)
CREATE USER 'freeswitch'@'10.0.1.%' IDENTIFIED BY 'YOUR_FS_DB_PASSWORD';
GRANT SELECT ON kamailio.did_routing TO 'freeswitch'@'10.0.1.%';
GRANT SELECT ON kamailio.trunk_routing TO 'freeswitch'@'10.0.1.%';
GRANT SELECT ON kamailio.subscriber TO 'freeswitch'@'10.0.1.%';
GRANT INSERT, SELECT ON kamailio.cdr TO 'freeswitch'@'10.0.1.%';

-- Grafana / monitoring (read-only)
CREATE USER 'grafana'@'10.0.1.%' IDENTIFIED BY 'YOUR_GRAFANA_DB_PASSWORD';
GRANT SELECT ON kamailio.* TO 'grafana'@'10.0.1.%';

FLUSH PRIVILEGES;

Kamailio: DID-Based Routing from Database

Add this route to kamailio.cfg to look up DID routing from the database before dispatching:

## ---- DID Routing from Database ----
route[DID_ROUTING] {
    # Look up the called number (R-URI user) in did_routing table
    if (!sql_query("ca", "SELECT destination, dest_type FROM did_routing \
        WHERE did='$rU' AND domain='$fd' AND active=1 \
        ORDER BY priority DESC LIMIT 1", "ra")) {
        xlog("L_ERR", "DID_ROUTING: Database query failed\n");
        return;
    }

    if ($dbr(ra=>rows) > 0) {
        $var(destination) = $dbr(ra=>[0,0]);
        $var(dest_type) = $dbr(ra=>[0,1]);

        xlog("L_INFO", "DID_ROUTING: $rU → $var(destination) ($var(dest_type))\n");

        # Rewrite the R-URI with the looked-up destination
        # FreeSWITCH will use this to determine what to do
        $rU = $var(destination);

        # Optionally set a header so FreeSWITCH knows the destination type
        append_hf("X-Dest-Type: $var(dest_type)\r\n");
        sql_result_free("ra");
    } else {
        xlog("L_WARN", "DID_ROUTING: No route found for DID $rU in domain $fd\n");
        sql_result_free("ra");
        # Use default routing or reject
        sl_send_reply("404", "DID Not Found");
        exit;
    }
}

Then call this route before dispatching in the INVITE handler:

    # Handle INVITE — main call processing
    if (is_method("INVITE")) {
        setflag(FLT_DLG);
        dlg_manage();
        route(NATDETECT);
        route(DID_ROUTING);        # <-- Look up DID first
        route(RTPENGINE_OFFER);
        route(DISPATCH);
        exit;
    }

FreeSWITCH: Dynamic User Directory via mod_xml_curl

Instead of static XML user files on each FreeSWITCH, use mod_xml_curl to fetch user configuration from a central HTTP API backed by the database.

Edit /etc/freeswitch/autoload_configs/xml_curl.conf.xml:

<configuration name="xml_curl.conf" description="cURL XML Gateway">
  <bindings>
    <binding name="directory">
      <param name="gateway-url" value="http://YOUR_DB1_IP:8080/freeswitch/directory"/>
      <param name="gateway-credentials" value="freeswitch:YOUR_API_PASSWORD"/>
      <param name="auth-scheme" value="basic"/>
      <param name="timeout" value="5"/>
      <param name="disable-100-continue" value="true"/>
      <param name="enable-post-mapping" value="false"/>
    </binding>
  </bindings>
</configuration>

Example Python API that serves user directory XML (runs on the DB server or a separate API server):

#!/usr/bin/env python3
"""
freeswitch_directory_api.py
Serves FreeSWITCH user directory from MariaDB
Run with: uvicorn freeswitch_directory_api:app --host 0.0.0.0 --port 8080
"""
from fastapi import FastAPI, Form, Response
import mysql.connector

app = FastAPI()
DB_CONFIG = {
    "host": "YOUR_DB1_IP",
    "user": "freeswitch",
    "password": "YOUR_FS_DB_PASSWORD",
    "database": "kamailio"
}

@app.post("/freeswitch/directory")
async def directory(
    section: str = Form(default="directory"),
    key_name: str = Form(default=""),
    key_value: str = Form(default=""),
    user: str = Form(default=""),
    domain: str = Form(default=""),
):
    """Return FreeSWITCH directory XML for a user lookup."""

    if section != "directory" or not user or not domain:
        return Response(
            content='<?xml version="1.0"?><document type="freeswitch/xml"><section name="directory"></section></document>',
            media_type="text/xml"
        )

    # Look up user in subscriber table
    conn = mysql.connector.connect(**DB_CONFIG)
    cursor = conn.cursor(dictionary=True)
    try:
        cursor.execute(
            "SELECT username, password, domain FROM subscriber WHERE username=%s AND domain=%s",
            (user, domain)
        )
        row = cursor.fetchone()
    finally:
        cursor.close()
        conn.close()

    if not row:
        return Response(
            content='<?xml version="1.0"?><document type="freeswitch/xml"><section name="directory"></section></document>',
            media_type="text/xml"
        )

    # Escape DB values before interpolating them into XML attributes
    from xml.sax.saxutils import escape

    def xa(v):
        return escape(str(v), {'"': "&quot;"})

    xml = f'''<?xml version="1.0" encoding="UTF-8"?>
<document type="freeswitch/xml">
  <section name="directory">
    <domain name="{xa(domain)}">
      <user id="{xa(row["username"])}">
        <params>
          <param name="password" value="{xa(row["password"])}"/>
          <param name="vm-password" value="{xa(row["password"])}"/>
        </params>
        <variables>
          <variable name="accountcode" value="{xa(row["username"])}"/>
          <variable name="user_context" value="from-kamailio"/>
          <variable name="effective_caller_id_name" value="{xa(row["username"])}"/>
          <variable name="effective_caller_id_number" value="{xa(row["username"])}"/>
        </variables>
      </user>
    </domain>
  </section>
</document>'''

    return Response(content=xml, media_type="text/xml")
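FreeSWITCH is strict about the returned document shape, so it is worth sanity-checking the XML offline. A minimal sketch using only the standard library (the sample document mirrors the template generated above, with invented values):

```python
import xml.etree.ElementTree as ET

# A sample response in the shape the API above generates (values invented)
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<document type="freeswitch/xml">
  <section name="directory">
    <domain name="example.com">
      <user id="1001">
        <params>
          <param name="password" value="s3cret"/>
        </params>
      </user>
    </domain>
  </section>
</document>"""

def extract_password(xml_text, user_id):
    """Return the password param for user_id, or None if missing."""
    # Encode first: ET.fromstring rejects str input that carries
    # an XML encoding declaration
    root = ET.fromstring(xml_text.encode())
    for user in root.iter("user"):
        if user.get("id") == user_id:
            for param in user.iter("param"):
                if param.get("name") == "password":
                    return param.get("value")
    return None
```

If the password cannot be extracted this way, FreeSWITCH will not be able to either, and registrations will fail with a directory error.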

Multi-Tenant Routing

For multi-tenant deployments, use the SIP domain to isolate tenants:

-- Tenant A: company-a.example.com
INSERT INTO did_routing (did, domain, destination, dest_type) VALUES
('+442012345678', 'company-a.example.com', '2000', 'ivr'),
('+442012345679', 'company-a.example.com', '3001', 'queue');

-- Tenant B: company-b.example.com
INSERT INTO did_routing (did, domain, destination, dest_type) VALUES
('+442087654321', 'company-b.example.com', '2000', 'ivr'),
('+442087654322', 'company-b.example.com', '3001', 'queue');

Kamailio routes on $fd (From domain) or $rd (Request-URI domain), and FreeSWITCH scopes its user directory lookups by the same domain. The same extension number (2000) can therefore map to a completely different IVR for each tenant.
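The isolation hinges on the lookup key always being the (DID, domain) pair, never the DID alone. A minimal in-memory model of the did_routing rows above (a dict standing in for the table) makes that explicit:

```python
# (did, domain) -> (destination, dest_type); rows mirror the INSERTs above
did_routing = {
    ("+442012345678", "company-a.example.com"): ("2000", "ivr"),
    ("+442012345679", "company-a.example.com"): ("3001", "queue"),
    ("+442087654321", "company-b.example.com"): ("2000", "ivr"),
    ("+442087654322", "company-b.example.com"): ("3001", "queue"),
}

def route_did(did, domain):
    """Tenant-scoped lookup: the same digits in another tenant never match."""
    return did_routing.get((did, domain))
```

A DID that exists for tenant A returns nothing when looked up under tenant B's domain, which is exactly the behavior the composite SQL WHERE clause enforces.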


9. WebRTC Gateway

Architecture for WebRTC

Browser (WebRTC)                             SIP Trunk
   │                                             │
   │ WSS (SIP over WebSocket)                    │ UDP/TCP SIP
   │ DTLS-SRTP (encrypted media)                 │ RTP (unencrypted)
   ▼                                             ▼
┌──────────┐      SIP (signaling)     ┌──────────┐
│ Kamailio │◄────────────────────────►│FreeSWITCH│
│  (WSS)   │                          │ (media)  │
└────┬─────┘                          └────┬─────┘
     │ ng control                          │ RTP
     ▼                                     │
┌──────────┐   media: DTLS-SRTP ↔ RTP      │
│ RTPEngine│◄──────────────────────────────┘
└──────────┘

Kamailio: Terminates WebSocket, handles SIP-over-WS
RTPEngine: Bridges DTLS-SRTP (WebRTC) ↔ plain RTP (FreeSWITCH/trunks)
FreeSWITCH: Processes calls normally (does not know about WebRTC)
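Kamailio controls RTPEngine through the ng protocol: a UDP datagram consisting of a cookie string, a space, and a bencoded dictionary. A hedged sketch of the wire format (the encoder covers only the types a simple command needs):

```python
def bencode(obj):
    """Minimal bencode encoder for int/str/dict, enough for basic ng commands."""
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, str):
        raw = obj.encode()
        return b"%d:%s" % (len(raw), raw)
    if isinstance(obj, dict):
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in obj.items()) + b"e"
    raise TypeError(type(obj))

def ng_message(cookie, command, **params):
    """Build an ng datagram: '<cookie> <bencoded dict>'."""
    payload = {"command": command, **params}
    return cookie.encode() + b" " + bencode(payload)

# 'ping' is the simplest ng command; RTPEngine answers with result=pong
msg = ng_message("abc123", "ping")
```

Sending this datagram to RTPEngine's listen-ng UDP port and checking for a pong result makes a convenient external health probe, independent of Kamailio.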

TLS Certificates (Let's Encrypt Wildcard)

# Install certbot with DNS plugin (for wildcard certs)
apt-get install -y certbot python3-certbot-dns-cloudflare

# Create credentials file (example for Cloudflare DNS)
mkdir -p /root/.secrets
cat > /root/.secrets/cloudflare.ini << 'EOF'
dns_cloudflare_api_token = YOUR_CLOUDFLARE_API_TOKEN
EOF
chmod 600 /root/.secrets/cloudflare.ini

# Get wildcard certificate
certbot certonly \
    --dns-cloudflare \
    --dns-cloudflare-credentials /root/.secrets/cloudflare.ini \
    -d "*.YOUR_DOMAIN" \
    -d "YOUR_DOMAIN" \
    --agree-tos \
    -m admin@YOUR_DOMAIN

# Link for Kamailio
ln -sf /etc/letsencrypt/live/YOUR_DOMAIN/fullchain.pem /etc/kamailio/tls/server.pem
ln -sf /etc/letsencrypt/live/YOUR_DOMAIN/privkey.pem /etc/kamailio/tls/server.key

# Link for RTPEngine (DTLS)
mkdir -p /etc/rtpengine/tls
ln -sf /etc/letsencrypt/live/YOUR_DOMAIN/fullchain.pem /etc/rtpengine/tls/cert.pem
ln -sf /etc/letsencrypt/live/YOUR_DOMAIN/privkey.pem /etc/rtpengine/tls/key.pem

# Auto-renewal cron (reload services after renewal)
cat > /etc/letsencrypt/renewal-hooks/deploy/reload-voip.sh << 'SCRIPT'
#!/bin/bash
systemctl reload kamailio 2>/dev/null || true
systemctl restart rtpengine 2>/dev/null || true
SCRIPT
chmod +x /etc/letsencrypt/renewal-hooks/deploy/reload-voip.sh

Kamailio WSS Configuration

The WebSocket handling is already in the main kamailio.cfg from Section 4. Key pieces:

# Listeners (already defined)
listen=tls:MY_PUBLIC_IP:8443   # WSS direct

# WebSocket module (already loaded)
loadmodule "websocket.so"
loadmodule "xhttp.so"

# xhttp event route handles the WebSocket upgrade (already defined)
event_route[xhttp:request] { ... }

Additional WebSocket-specific routing logic to add in the main request_route:

    # ---- WebRTC-specific handling ----
    if (proto == WS || proto == WSS) {
        # Force record-route with WebSocket transport
        if (is_method("INVITE|SUBSCRIBE")) {
            record_route_preset("MY_PUBLIC_IP:8443;transport=wss");
        }

        # WebRTC clients use SIP Outbound (RFC 5626)
        if (is_method("REGISTER")) {
            # Add Path header so replies find the WebSocket connection
            add_path_received();
        }
    }

RTPEngine DTLS Configuration

No certificate entries are needed in /etc/rtpengine/rtpengine.conf for WebRTC media. DTLS-SRTP does not use CA-signed certificates: each endpoint generates its own certificate and the peer verifies it against the a=fingerprint attribute in the SDP, so RTPEngine simply creates a self-signed DTLS certificate at startup. The Let's Encrypt certificate is only required for the WSS signaling leg.

DTLS and ICE behavior are instead selected per call by the flags Kamailio passes to rtpengine_offer()/rtpengine_answer() in the RTPENGINE_OFFER route from Section 4, for example:

# Toward the browser (WebRTC leg)
rtpengine_offer("ICE=force DTLS=passive UDP/TLS/RTP/SAVPF");

# Toward FreeSWITCH (plain RTP leg)
rtpengine_offer("ICE=remove RTP/AVP");

Nginx Reverse Proxy for WSS

For production, put Nginx in front of Kamailio for WSS. This provides proper TLS termination, HTTP/2, and the ability to serve the web client from the same domain:

# /etc/nginx/sites-available/webrtc-gateway
upstream kamailio_wss {
    server YOUR_KAM1_PRIVATE:8080;   # WS (unencrypted) — Nginx handles TLS
    server YOUR_KAM2_PRIVATE:8080 backup;
}

server {
    listen 443 ssl http2;
    server_name webrtc.YOUR_DOMAIN;

    ssl_certificate /etc/letsencrypt/live/YOUR_DOMAIN/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/YOUR_DOMAIN/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    # WebSocket proxy to Kamailio
    location /ws {
        proxy_pass http://kamailio_wss;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
    }

    # Serve the WebRTC web client
    location / {
        root /var/www/webrtc;
        index index.html;
    }
}

Browser Client — SIP.js Example

Create /var/www/webrtc/index.html:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>WebRTC Phone</title>
    <script src="https://cdn.jsdelivr.net/npm/[email protected]/lib/platform/web/sip.js"></script>
    <style>
        body {
            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
            max-width: 500px;
            margin: 50px auto;
            padding: 20px;
            background: #1a1a2e;
            color: #e0e0e0;
        }
        h1 { color: #00d4ff; text-align: center; }
        .status {
            text-align: center;
            padding: 10px;
            margin: 20px 0;
            border-radius: 8px;
            background: #16213e;
        }
        .status.connected { border-left: 4px solid #00ff88; }
        .status.disconnected { border-left: 4px solid #ff4444; }
        .status.calling { border-left: 4px solid #ffaa00; }
        input, button {
            width: 100%;
            padding: 12px;
            margin: 5px 0;
            border: none;
            border-radius: 6px;
            font-size: 16px;
            box-sizing: border-box;
        }
        input {
            background: #16213e;
            color: #e0e0e0;
            border: 1px solid #333;
        }
        button {
            cursor: pointer;
            font-weight: bold;
        }
        .btn-call { background: #00ff88; color: #000; }
        .btn-hangup { background: #ff4444; color: #fff; }
        .btn-answer { background: #00d4ff; color: #000; }
        .btn-register { background: #9b59b6; color: #fff; }
        button:hover { opacity: 0.9; }
        button:disabled { opacity: 0.4; cursor: not-allowed; }
        .controls { margin: 20px 0; }
        audio { display: none; }
    </style>
</head>
<body>
    <h1>WebRTC Phone</h1>

    <div id="status" class="status disconnected">Disconnected</div>

    <div class="controls">
        <input type="text" id="server" placeholder="WSS Server" value="wss://webrtc.YOUR_DOMAIN/ws">
        <input type="text" id="username" placeholder="SIP Username (e.g., 1001)">
        <input type="password" id="password" placeholder="SIP Password">
        <input type="text" id="domain" placeholder="SIP Domain" value="YOUR_DOMAIN">
        <button class="btn-register" onclick="doRegister()">Register</button>
    </div>

    <div class="controls">
        <input type="text" id="target" placeholder="Number to call">
        <button class="btn-call" id="btnCall" onclick="doCall()" disabled>Call</button>
        <button class="btn-answer" id="btnAnswer" onclick="doAnswer()" disabled>Answer</button>
        <button class="btn-hangup" id="btnHangup" onclick="doHangup()" disabled>Hang Up</button>
    </div>

    <audio id="remoteAudio" autoplay></audio>

    <script>
    let userAgent = null;
    let registerer = null;
    let currentSession = null;

    function setStatus(text, className) {
        const el = document.getElementById('status');
        el.textContent = text;
        el.className = 'status ' + className;
    }

    async function doRegister() {
        const server = document.getElementById('server').value;
        const username = document.getElementById('username').value;
        const password = document.getElementById('password').value;
        const domain = document.getElementById('domain').value;

        const uri = SIP.UserAgent.makeURI(`sip:${username}@${domain}`);
        const transportOptions = {
            server: server,
            traceSip: true
        };

        userAgent = new SIP.UserAgent({
            uri: uri,
            transportOptions: transportOptions,
            authorizationUsername: username,
            authorizationPassword: password,
            displayName: username,
            delegate: {
                onInvite: (invitation) => {
                    currentSession = invitation;
                    setStatus('Incoming call from ' + invitation.remoteIdentity.displayName, 'calling');
                    document.getElementById('btnAnswer').disabled = false;
                    document.getElementById('btnHangup').disabled = false;
                }
            }
        });

        await userAgent.start();

        registerer = new SIP.Registerer(userAgent);
        registerer.stateChange.addListener((state) => {
            switch (state) {
                case SIP.RegistererState.Registered:
                    setStatus('Registered as ' + username, 'connected');
                    document.getElementById('btnCall').disabled = false;
                    break;
                case SIP.RegistererState.Unregistered:
                    setStatus('Unregistered', 'disconnected');
                    document.getElementById('btnCall').disabled = true;
                    break;
            }
        });

        await registerer.register();
    }

    async function doCall() {
        const target = document.getElementById('target').value;
        const domain = document.getElementById('domain').value;

        if (!target || !userAgent) return;

        const targetURI = SIP.UserAgent.makeURI(`sip:${target}@${domain}`);
        if (!targetURI) {
            alert('Invalid target');
            return;
        }

        const inviter = new SIP.Inviter(userAgent, targetURI, {
            sessionDescriptionHandlerOptions: {
                constraints: { audio: true, video: false }
            }
        });

        currentSession = inviter;
        setupSessionListeners(inviter);

        setStatus('Calling ' + target + '...', 'calling');
        document.getElementById('btnHangup').disabled = false;
        document.getElementById('btnCall').disabled = true;

        await inviter.invite();
    }

    async function doAnswer() {
        if (!currentSession) return;

        // Attach listeners before accepting so the Established
        // state change is not missed
        setupSessionListeners(currentSession);

        await currentSession.accept({
            sessionDescriptionHandlerOptions: {
                constraints: { audio: true, video: false }
            }
        });

        setStatus('In call', 'connected');
        document.getElementById('btnAnswer').disabled = true;
    }

    function doHangup() {
        if (!currentSession) return;

        switch (currentSession.state) {
            case SIP.SessionState.Initial:
            case SIP.SessionState.Establishing:
                if (currentSession instanceof SIP.Inviter) {
                    currentSession.cancel();
                } else {
                    currentSession.reject();
                }
                break;
            case SIP.SessionState.Established:
                currentSession.bye();
                break;
        }

        resetCallUI();
    }

    function setupSessionListeners(session) {
        session.stateChange.addListener((state) => {
            switch (state) {
                case SIP.SessionState.Established:
                    setStatus('In call', 'connected');
                    // Attach remote audio
                    const remoteStream = new MediaStream();
                    session.sessionDescriptionHandler.peerConnection
                        .getReceivers()
                        .forEach((receiver) => {
                            if (receiver.track) {
                                remoteStream.addTrack(receiver.track);
                            }
                        });
                    document.getElementById('remoteAudio').srcObject = remoteStream;
                    break;
                case SIP.SessionState.Terminated:
                    setStatus('Call ended', 'disconnected');
                    resetCallUI();
                    break;
            }
        });
    }

    function resetCallUI() {
        currentSession = null;
        document.getElementById('btnCall').disabled = false;
        document.getElementById('btnAnswer').disabled = true;
        document.getElementById('btnHangup').disabled = true;
        document.getElementById('remoteAudio').srcObject = null;
        setTimeout(() => setStatus('Registered', 'connected'), 2000);
    }
    </script>
</body>
</html>

Testing WebRTC

  1. Open https://webrtc.YOUR_DOMAIN/ in Chrome or Firefox
  2. Enter your SIP credentials and click Register
  3. Status should change to "Registered"
  4. Enter a number (e.g., 9196 for echo test) and click Call
  5. Verify audio flows both directions

Debugging WebRTC issues:

# On Kamailio — watch WebSocket connections
kamcmd ws.dump

# On RTPEngine — check DTLS sessions
rtpengine-ctl list sessions

# On RTPEngine — verify DTLS is working
# Look for "DTLS" in the session details
rtpengine-ctl list totals

# Browser — check WebRTC internals
# Chrome: chrome://webrtc-internals/
# Firefox: about:webrtc

10. High Availability — Kamailio

Keepalived + Virtual IP (VIP)

The Kamailio HA pair uses Keepalived to manage a floating Virtual IP (VIP). The active node holds the VIP and processes all traffic. If it fails, the standby node takes over the VIP within seconds.

Normal operation:
  VIP (YOUR_PUBLIC_VIP) → Kamailio-A (active)
                           Kamailio-B (standby, idle)

After Kamailio-A failure:
  VIP (YOUR_PUBLIC_VIP) → Kamailio-B (now active)
                           Kamailio-A (down)

Failover time: 3-6 seconds (VRRP advertisement interval + detection)
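The 3-6 second figure combines two detection paths: the VRRP advertisement timeout (the whole host dies, so advertisements stop) and the health-check script (Kamailio dies but keepalived survives). A back-of-the-envelope sketch using the values configured in this section (advert_int=1, check interval 3, fall 2):

```python
def vrrp_master_down(advert_int, backup_priority):
    """RFC 5798 Master_Down_Interval: 3 advertisements plus skew time."""
    skew = (256 - backup_priority) * advert_int / 256.0
    return 3 * advert_int + skew

def script_detect(check_interval, fall):
    """Health-check path: `fall` consecutive failures before priority drops."""
    return check_interval * fall

# Host failure: the backup (priority 90) stops hearing advertisements
host_path = vrrp_master_down(1, 90)        # about 3.65 s
# Service failure: two failed checks, then one advertisement for preemption
service_path = script_detect(3, 2) + 1     # about 7 s worst case
```

The service-failure path can stretch slightly past the quoted range in the worst case; shrinking the vrrp_script interval tightens it at the cost of more frequent probes.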

Install Keepalived

# On both kam01 and kam02
apt-get install -y keepalived

Health Check Script

Create /etc/keepalived/check_kamailio.sh:

#!/bin/bash
#
# Kamailio health check for Keepalived
# Returns 0 (healthy) or 1 (unhealthy)
# Tests actual SIP responsiveness, not just process existence
#

# Check 1: Is the process running?
if ! pgrep -x kamailio > /dev/null 2>&1; then
    echo "FAIL: Kamailio process not running"
    exit 1
fi

# Check 2: Can it respond to SIP OPTIONS?
# Send an OPTIONS ping to the local listener; sipsak exits non-zero
# on timeout or error response
if ! sipsak -s sip:[email protected]:5060 > /dev/null 2>&1; then
    echo "FAIL: Kamailio not responding to SIP OPTIONS"
    exit 1
fi

# Check 3: Check that the control socket is responsive
if ! kamcmd core.uptime > /dev/null 2>&1; then
    echo "FAIL: Kamailio RPC not responding"
    exit 1
fi

# Check 4: Verify at least one dispatcher destination is active
ACTIVE=$(kamcmd dispatcher.list 2>/dev/null | grep -c "FLAGS: AP")
if [ "$ACTIVE" -eq 0 ]; then
    echo "WARN: No active dispatcher destinations (not failing over for this)"
    # Don't fail for this — it might be a temporary condition
    # and failing over won't help if all FS servers are down
fi

echo "OK: Kamailio healthy (${ACTIVE} active dispatchers)"
exit 0

Make the script executable and install sipsak, which the check depends on:

chmod +x /etc/keepalived/check_kamailio.sh
apt-get install -y sipsak

Keepalived Configuration — Active Node (kam01)

Create /etc/keepalived/keepalived.conf on kam01:

# /etc/keepalived/keepalived.conf — Kamailio-A (MASTER)

global_defs {
    router_id KAM01
    script_user root
    enable_script_security
    # Notification emails (optional)
    # notification_email {
    #     admin@YOUR_DOMAIN
    # }
    # notification_email_from keepalived@kam01
    # smtp_server localhost
}

# Health check script
vrrp_script check_kamailio {
    script "/etc/keepalived/check_kamailio.sh"
    interval 3          # Check every 3 seconds
    weight -20          # Subtract 20 from priority on failure
    fall 2              # 2 consecutive failures = unhealthy
    rise 2              # 2 consecutive successes = healthy
}

# VRRP instance for SIP VIP
vrrp_instance VI_SIP {
    state MASTER
    interface eth0              # Change to your network interface
    virtual_router_id 51       # Must be same on both nodes
    priority 100               # Higher = preferred (kam01 is preferred)
    advert_int 1               # VRRP advertisement every 1 second

    authentication {
        auth_type PASS
        auth_pass YOUR_VRRP_PASSWORD    # Same on both nodes
    }

    virtual_ipaddress {
        YOUR_PUBLIC_VIP/32 dev eth0     # The floating VIP
    }

    track_script {
        check_kamailio
    }

    # Notify scripts (optional — for logging/alerting)
    notify_master "/bin/bash -c 'logger -t keepalived MASTER — VIP acquired on kam01'"
    notify_backup "/bin/bash -c 'logger -t keepalived BACKUP — VIP released on kam01'"
    notify_fault  "/bin/bash -c 'logger -t keepalived FAULT — health check failing on kam01'"
}

Keepalived Configuration — Standby Node (kam02)

Create /etc/keepalived/keepalived.conf on kam02 (differences highlighted):

# /etc/keepalived/keepalived.conf — Kamailio-B (BACKUP)

global_defs {
    router_id KAM02
    script_user root
    enable_script_security
}

vrrp_script check_kamailio {
    script "/etc/keepalived/check_kamailio.sh"
    interval 3
    weight -20
    fall 2
    rise 2
}

vrrp_instance VI_SIP {
    state BACKUP                # <-- BACKUP (not MASTER)
    interface eth0
    virtual_router_id 51       # Must match kam01
    priority 90                # <-- Lower priority (kam01 preferred)
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass YOUR_VRRP_PASSWORD    # Must match kam01
    }

    virtual_ipaddress {
        YOUR_PUBLIC_VIP/32 dev eth0
    }

    track_script {
        check_kamailio
    }

    notify_master "/bin/bash -c 'logger -t keepalived MASTER — VIP acquired on kam02'"
    notify_backup "/bin/bash -c 'logger -t keepalived BACKUP — VIP released on kam02'"
    notify_fault  "/bin/bash -c 'logger -t keepalived FAULT — health check failing on kam02'"
}

Start Keepalived

# On both nodes
systemctl enable --now keepalived

# Verify VIP is on kam01 (the master)
ip addr show eth0 | grep YOUR_PUBLIC_VIP

# Check keepalived status
systemctl status keepalived
journalctl -u keepalived -f

# Test failover: stop Kamailio on kam01
systemctl stop kamailio
# Within 3-6 seconds, VIP should move to kam02:
# On kam02: ip addr show eth0 | grep YOUR_PUBLIC_VIP

# Restore kam01
systemctl start kamailio
# VIP moves back to kam01 (higher priority, preemption)

Shared Location Table (usrloc to DB)

For seamless failover of registered users, both Kamailio nodes must share the location table in the database. This is already configured in our kamailio.cfg:

modparam("usrloc", "db_url", DBURL)
modparam("usrloc", "db_mode", 1)    # Write-through: every registration written to DB immediately

With db_mode=1 (write-through), when a user registers via kam01, the registration is written to the database immediately. If kam01 fails and kam02 takes over, kam02 reads the location table from the database and can route calls to registered users without re-registration.

Important: write-through puts more load on the database than db_mode=2 (write-back, which caches contacts in memory and flushes them on a timer). For very high registration volumes (100K+ registered users), consider db_mode=2 with a short timer_interval (e.g., 30 seconds), accepting that registrations newer than the last flush are lost if the node dies.
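The load difference is easy to quantify: under write-through, every REGISTER refresh costs one database write, so the write rate is roughly the registered-user count divided by the effective refresh interval. A rough sizing sketch (figures are illustrative):

```python
def register_writes_per_sec(users, expires, refresh_fraction=0.5):
    """DB writes/sec under write-through mode.

    Clients typically re-REGISTER at about half the granted expiry,
    so the effective refresh interval is expires * refresh_fraction.
    """
    return users / (expires * refresh_fraction)

# 10,000 users with a 3600 s expiry refreshing at ~1800 s:
small = register_writes_per_sec(10_000, 3600)    # ~5.6 writes/s, trivial
# 100,000 users with a 600 s expiry (common for NAT keepalive):
large = register_writes_per_sec(100_000, 600)    # ~333 writes/s, plan for it
```

Note that shorter registration expiries (often forced by NAT keepalive requirements) raise the write rate far faster than user count does.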

Dialog Replication with DMQ

For in-progress calls to survive a failover, Kamailio supports Dialog replication between nodes using the DMQ (Distributed Message Queue) module. This replicates dialog state so the standby node can handle in-dialog requests (BYE, re-INVITE) for calls that were set up by the active node.

Add to kamailio.cfg:

# Load DMQ module
loadmodule "dmq.so"

# DMQ parameters
modparam("dmq", "server_address", "sip:MY_PRIVATE_IP:5062")
modparam("dmq", "notification_address", "sip:10.0.1.10:5062")  # Bootstrap peer — use kam01's address on kam02 and kam02's on kam01
modparam("dmq", "multi_notify", 1)
modparam("dmq", "num_workers", 4)
modparam("dmq", "ping_interval", 15)

# Add DMQ listener
listen=udp:MY_PRIVATE_IP:5062

# Enable dialog replication via DMQ
modparam("dialog", "enable_dmq", 1)

Add DMQ routing in the main request_route:

    # DMQ traffic — handle before anything else
    if ($rm == "KDMQ" && $rP == "udp" && $sp == 5062) {
        dmq_handle_message();
        exit;
    }

With DMQ active, both Kamailio nodes maintain synchronized dialog state. During failover, in-progress calls continue working because the new active node has the complete dialog table.


11. High Availability — FreeSWITCH

Why FreeSWITCH HA Is Different

Unlike Kamailio (which is a stateless proxy that can easily share state via database), FreeSWITCH is a stateful media server — it holds active call sessions, media streams, and application state in memory. This makes traditional active/standby HA impractical for FreeSWITCH.

Instead, FreeSWITCH HA relies on a pool architecture: several identical instances sit behind the Kamailio dispatcher, each carrying a share of the traffic, so a single failure drops only that share while the surviving instances absorb all new calls.

Blast Radius Analysis

Pool Size      Calls per FS (at 3,000 total)   Impact of 1 FS Failure
2 instances    1,500 each                      50% of calls lost
3 instances    1,000 each                      33% of calls lost
4 instances    750 each                        25% of calls lost
6 instances    500 each                        17% of calls lost

With 4+ instances, a single failure affects a manageable percentage of calls, and the surviving instances have enough headroom to absorb the redistributed load.
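The headroom requirement can be stated precisely: if the pool must survive F simultaneous failures, steady-state per-node utilization has to stay below (N − F)/N of capacity. A quick sketch (helper names are illustrative):

```python
import math

def max_safe_utilization(n_nodes, tolerated_failures=1):
    """Fraction of per-node capacity usable while still absorbing failures."""
    survivors = n_nodes - tolerated_failures
    if survivors <= 0:
        return 0.0
    return survivors / n_nodes

def pool_size_for(total_calls, per_node_capacity, tolerated_failures=1):
    """Smallest pool where the survivors can still carry the whole load."""
    return math.ceil(total_calls / per_node_capacity) + tolerated_failures
```

For example, a 4-node pool can safely run each node at 75% of capacity; carrying 3,000 calls on nodes rated for 1,000 each while tolerating one failure needs 4 nodes.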

Graceful Draining — Zero-Downtime Maintenance

The key to zero-downtime FreeSWITCH maintenance is draining: stop sending new calls to a node while letting existing calls finish naturally.

#!/bin/bash
# drain-freeswitch.sh — Gracefully drain a FreeSWITCH instance
# Usage: ./drain-freeswitch.sh fs01 YOUR_FS1_IP

FS_NAME=$1
FS_IP=$2
KAM_HOST="YOUR_KAM1_PRIVATE"

echo "=== Draining FreeSWITCH: $FS_NAME ($FS_IP) ==="

# Step 1: Mark as inactive in Kamailio dispatcher (no new calls)
echo "Step 1: Removing from dispatcher..."
ssh $KAM_HOST "kamcmd dispatcher.set_state i 1 sip:${FS_IP}:5060"
echo "  Done. No new calls will be sent to $FS_NAME."

# Step 2: Wait for existing calls to finish
echo "Step 2: Waiting for active calls to finish..."
while true; do
    CALLS=$(ssh $FS_IP "fs_cli -x 'show calls count' 2>/dev/null" | grep -oP '\d+(?= total)')
    CALLS=${CALLS:-0}
    echo "  Active calls: $CALLS"
    if [ "$CALLS" -eq 0 ]; then
        break
    fi
    sleep 10
done

echo "  All calls finished."

# Step 3: Now safe to perform maintenance
echo "Step 3: $FS_NAME is fully drained. Safe to stop/upgrade."
echo ""
echo "  When done, re-enable with:"
echo "  ssh $KAM_HOST 'kamcmd dispatcher.set_state a 1 sip:${FS_IP}:5060'"

Zero-Downtime Upgrade Procedure

#!/bin/bash
# upgrade-freeswitch.sh — Zero-downtime FreeSWITCH upgrade
# Upgrades one instance at a time (rolling upgrade)

INSTANCES=("fs01:YOUR_FS1_IP" "fs02:YOUR_FS2_IP" "fs03:YOUR_FS3_IP")
KAM_HOST="YOUR_KAM1_PRIVATE"

for instance in "${INSTANCES[@]}"; do
    IFS=':' read -r name ip <<< "$instance"

    echo "============================================"
    echo "Upgrading $name ($ip)"
    echo "============================================"

    # 1. Drain
    echo "  Draining..."
    ssh $KAM_HOST "kamcmd dispatcher.set_state i 1 sip:${ip}:5060"

    # Wait for calls to finish (max 30 minutes)
    TIMEOUT=1800
    ELAPSED=0
    while [ $ELAPSED -lt $TIMEOUT ]; do
        CALLS=$(ssh $ip "fs_cli -x 'show calls count' 2>/dev/null" | grep -oP '\d+(?= total)')
        CALLS=${CALLS:-0}
        if [ "$CALLS" -eq 0 ]; then break; fi
        echo "    $CALLS calls remaining (${ELAPSED}s elapsed)..."
        sleep 15
        ELAPSED=$((ELAPSED + 15))
    done

    # 2. Stop FreeSWITCH
    echo "  Stopping FreeSWITCH..."
    ssh $ip "systemctl stop freeswitch"

    # 3. Upgrade
    echo "  Upgrading..."
    ssh $ip "apt-get update && apt-get upgrade -y freeswitch*"

    # 4. Start FreeSWITCH
    echo "  Starting FreeSWITCH..."
    ssh $ip "systemctl start freeswitch"
    sleep 5  # Wait for SIP profile to register

    # 5. Verify it responds
    echo "  Verifying..."
    ssh $ip "fs_cli -x 'sofia status'" || { echo "FAILED to start $name!"; exit 1; }

    # 6. Re-enable in dispatcher
    echo "  Re-enabling in dispatcher..."
    ssh $KAM_HOST "kamcmd dispatcher.set_state a 1 sip:${ip}:5060"

    echo "  $name upgraded successfully."
    echo ""

    # Wait before upgrading next instance (let it stabilize)
    sleep 30
done

echo "All instances upgraded. Verifying dispatcher state..."
ssh $KAM_HOST "kamcmd dispatcher.list"

Shared Storage for Recordings

FreeSWITCH call recordings need to be accessible regardless of which instance handled the call. Options:

Option A: NFS (simplest)

# On NFS server (db01 or dedicated storage)
apt-get install -y nfs-kernel-server
mkdir -p /srv/recordings
chown freeswitch:freeswitch /srv/recordings
echo "/srv/recordings 10.0.1.0/24(rw,sync,no_subtree_check,no_root_squash)" >> /etc/exports
exportfs -ra

# On each FreeSWITCH server
apt-get install -y nfs-common
mkdir -p /var/lib/freeswitch/recordings
echo "YOUR_DB1_IP:/srv/recordings /var/lib/freeswitch/recordings nfs defaults,soft,timeo=50 0 0" >> /etc/fstab
mount -a

Option B: S3-compatible storage (scalable)

Create a post-recording script that uploads to S3:

#!/bin/bash
# /usr/local/bin/upload-recording.sh
# Called by FreeSWITCH after each recording completes

FILE=$1
BUCKET="s3://your-recordings-bucket"

if [ -f "$FILE" ]; then
    aws s3 cp "$FILE" "$BUCKET/$(date +%Y/%m/%d)/$(basename $FILE)" \
        --storage-class STANDARD_IA
    # Optionally delete local file after upload
    # rm -f "$FILE"
fi
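The dated YYYY/MM/DD/basename prefix the script builds keeps bucket listings manageable and maps directly onto S3 lifecycle rules. The same layout as a small Python helper (the function name is illustrative):

```python
import os
from datetime import date

def recording_key(local_path, day=None):
    """S3 object key with the same YYYY/MM/DD/<basename> layout as above."""
    day = day or date.today()
    return f"{day:%Y/%m/%d}/{os.path.basename(local_path)}"
```

Keeping the key scheme in one helper means the uploader and any retrieval/billing code agree on where a given day's recordings live.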

Session Recovery Limitations

It is important to understand what FreeSWITCH HA cannot do:

  - Calls in progress on a failed instance are dropped — media state (RTP streams, codec contexts, recording buffers) lives in process memory and is not replicated
  - Conference, IVR, and call-queue state cannot be migrated to another instance mid-session
  - Failover protects new calls only: the dispatcher routes around the dead node, but existing sessions on it are lost

These limitations are inherent to any media server. The mitigation is to have enough pool instances that the blast radius of any single failure is acceptable. For critical applications (emergency services, etc.), consider having callers automatically redialed by the application layer when a session is lost.
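The blast-radius point can be made concrete: with the dispatcher spreading calls evenly across the pool, one failed instance drops roughly 1/N of the active calls. A toy estimate (illustrative only):

```python
# Toy blast-radius estimate: assumes calls are evenly distributed
# across the pool, as with dispatcher round-robin. One failed
# instance then drops ~1/N of the active calls.

def dropped_calls(concurrent_calls, pool_size, failed_nodes=1):
    """Estimate how many calls are lost when `failed_nodes` instances die."""
    if failed_nodes >= pool_size:
        return concurrent_calls
    return concurrent_calls * failed_nodes // pool_size
```

With 3,000 concurrent calls, growing the pool from 3 to 10 instances shrinks the worst single-failure impact from 1,000 calls to 300.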


12. Geographic Distribution

Multi-DC Architecture

                    ┌─────────────────────┐
                    │   Global DNS (SRV)  │
                    │   sip.YOUR_DOMAIN   │
                    └──────────┬──────────┘
                               │
              ┌────────────────┼────────────────┐
              │                │                │
    ┌─────────▼──────┐  ┌──────▼──────┐  ┌─────▼─────────┐
    │  DC Europe     │  │  DC US-East │  │  DC US-West   │
    │  (London)      │  │  (Virginia) │  │  (Oregon)     │
    │                │  │             │  │               │
    │  Kam+FS+RTP    │  │  Kam+FS+RTP │  │  Kam+FS+RTP   │
    │  Galera node   │  │  Galera node│  │  Galera node  │
    └────────────────┘  └─────────────┘  └───────────────┘
              │                │                │
              └────────────────┼────────────────┘
                               │
                    ┌──────────▼──────────┐
                    │  Galera WAN Cluster │
                    │  (sync replication) │
                    └─────────────────────┘

DNS SRV Records

DNS SRV records allow SIP clients to discover your servers and automatically failover between data centers:

; NAPTR records — tell SIP clients which transports are available
YOUR_DOMAIN.  IN NAPTR 10 10 "S" "SIP+D2U"  "" _sip._udp.YOUR_DOMAIN.
YOUR_DOMAIN.  IN NAPTR 20 10 "S" "SIP+D2T"  "" _sip._tcp.YOUR_DOMAIN.
YOUR_DOMAIN.  IN NAPTR 30 10 "S" "SIPS+D2T" "" _sips._tcp.YOUR_DOMAIN.

; SRV records — specify servers and priorities per transport
; Lower priority number = preferred. Same priority = load balance by weight.

; UDP SIP
_sip._udp.YOUR_DOMAIN.  IN SRV 10 60 5060 sip-eu.YOUR_DOMAIN.   ; EU primary
_sip._udp.YOUR_DOMAIN.  IN SRV 10 40 5060 sip-us.YOUR_DOMAIN.   ; US secondary
_sip._udp.YOUR_DOMAIN.  IN SRV 20 50 5060 sip-eu2.YOUR_DOMAIN.  ; EU backup
_sip._udp.YOUR_DOMAIN.  IN SRV 20 50 5060 sip-us2.YOUR_DOMAIN.  ; US backup

; TCP SIP
_sip._tcp.YOUR_DOMAIN.  IN SRV 10 60 5060 sip-eu.YOUR_DOMAIN.
_sip._tcp.YOUR_DOMAIN.  IN SRV 10 40 5060 sip-us.YOUR_DOMAIN.

; TLS SIP
_sips._tcp.YOUR_DOMAIN. IN SRV 10 60 5061 sip-eu.YOUR_DOMAIN.
_sips._tcp.YOUR_DOMAIN. IN SRV 10 40 5061 sip-us.YOUR_DOMAIN.

; A records for each SIP edge
sip-eu.YOUR_DOMAIN.  IN A YOUR_EU_VIP
sip-us.YOUR_DOMAIN.  IN A YOUR_US_VIP
sip-eu2.YOUR_DOMAIN. IN A YOUR_EU2_VIP
sip-us2.YOUR_DOMAIN. IN A YOUR_US2_VIP

How SIP clients use SRV records:

  1. Client resolves _sip._udp.YOUR_DOMAIN and gets 2 records with priority 10
  2. Client distributes requests based on weight: 60% to EU, 40% to US
  3. If the priority-10 servers fail, client falls back to priority-20 servers
  4. The client sends the INVITE directly to the resolved server's address — no Route header or extra client configuration is needed
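The priority/weight ordering in steps 1-3 can be sketched in a few lines. This is a simplified illustration of the RFC 2782 selection rules, not a production resolver — the record set mirrors the zone file above with placeholder hostnames:

```python
import random

# (priority, weight, port, target) — mirrors the _sip._udp SRV records above
records = [
    (10, 60, 5060, "sip-eu.example.com"),
    (10, 40, 5060, "sip-us.example.com"),
    (20, 50, 5060, "sip-eu2.example.com"),
    (20, 50, 5060, "sip-us2.example.com"),
]

def srv_order(records, rng=random):
    """Return records in RFC 2782 selection order:
    ascending priority, weighted-random within each priority group."""
    ordered = []
    for prio in sorted({r[0] for r in records}):
        group = [r for r in records if r[0] == prio]
        while group:
            total = sum(r[1] for r in group)
            pick = rng.uniform(0, total)
            acc = 0.0
            for r in group:
                acc += r[1]
                if pick <= acc:
                    ordered.append(r)
                    group.remove(r)
                    break
    return ordered
```

Over many resolutions, the priority-10 group splits roughly 60/40 between EU and US; the priority-20 backups are only tried when both primaries fail.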

Geographic Routing with GeoIP

Kamailio can use the GeoIP2 module to route calls based on the geographic location of the caller:

# Load GeoIP2 module
loadmodule "geoip2.so"
modparam("geoip2", "path", "/usr/share/GeoIP/GeoLite2-City.mmdb")

# Geographic routing route
route[GEO_ROUTE] {
    # Look up caller's country
    if (geoip2_match("$si", "src")) {
        $var(country) = $gip2(src=>cc);
        $var(continent) = $gip2(src=>cont);

        xlog("L_INFO", "GEO: Caller from $si — country=$var(country), continent=$var(continent)\n");

        # Route to closest DC based on continent
        switch ($var(continent)) {
            case "EU":
                # European callers → EU FreeSWITCH pool (set 10)
                if (!ds_select_dst("10", "0", "6")) {
                    # Fallback to US pool
                    ds_select_dst("20", "0", "6");
                }
                break;
            case "NA":
                # North American callers → US-East pool (set 20)
                if (!ds_select_dst("20", "0", "6")) {
                    ds_select_dst("10", "0", "6");
                }
                break;
            default:
                # Everyone else → round-robin across all DCs
                ds_select_dst("1", "4", "6");
                break;
        }
    } else {
        # GeoIP lookup failed — use default pool
        ds_select_dst("1", "0", "6");
    }
}

Database Replication Across Data Centers

For multi-DC deployments, use MariaDB Galera with WAN replication:

# On each Galera node, add WAN-specific settings:
[galera]
wsrep_cluster_address = "gcomm://EU_DB_IP,US_EAST_DB_IP,US_WEST_DB_IP"

# WAN optimizations + segment-aware replication (reduces cross-DC traffic).
# Note: keep everything on ONE wsrep_provider_options line — a second
# occurrence would override the first, silently dropping the WAN tuning.
# gmcast.segment per DC: EU nodes = 0, US-East nodes = 1, US-West nodes = 2
wsrep_provider_options = "evs.send_window=256; evs.user_send_window=128; evs.keepalive_period=PT3S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; gcache.size=1G; gmcast.segment=0"

Important latency considerations:

  - Every Galera commit waits for certification by all nodes — write latency is bounded by the RTT to the farthest data center
  - Replicate only what must be global (subscribers, dispatcher, routing); keep high-churn state such as dialogs and usrloc in memory or local storage
  - The increased evs.suspect_timeout and evs.inactive_timeout above prevent transient WAN blips from evicting nodes

Latency Considerations for Media

Media (RTP) is latency-sensitive. Key rules:

  - Keep the media path short: relay RTP in the data center closest to the endpoints — never hairpin media across continents
  - One-way mouth-to-ear delay should stay under ~150 ms (ITU-T G.114); a transatlantic media hairpin alone can consume half of that budget
  - Signaling tolerates latency, media does not — a US caller can signal through the EU Kamailio as long as the RTP stays in the US

Kamailio can select the right RTPEngine based on the caller's location:

# Define RTPEngine sets at startup — modparam() cannot be called
# inside a route block
modparam("rtpengine", "rtpengine_sock", "1 == udp:EU_RTP_IP:2223")
modparam("rtpengine", "rtpengine_sock", "2 == udp:US_RTP_IP:2223")

# Select the RTPEngine set based on caller geography
route[SELECT_RTPENGINE] {
    if ($var(continent) == "EU") {
        set_rtpengine_set("1");   # EU RTPEngine
    } else {
        set_rtpengine_set("2");   # US RTPEngine
    }
}

13. Monitoring & Operations

Prometheus Metrics — All Components

A unified monitoring stack provides visibility into every layer of the platform.

Kamailio Exporter

# Install kamailio-exporter
# Option 1: Pre-built binary
wget https://github.com/florentchauveau/kamailio_exporter/releases/latest/download/kamailio_exporter_linux_amd64 \
    -O /usr/local/bin/kamailio_exporter
chmod +x /usr/local/bin/kamailio_exporter

# Create systemd service
cat > /etc/systemd/system/kamailio-exporter.service << 'EOF'
[Unit]
Description=Kamailio Prometheus Exporter
After=kamailio.service

[Service]
ExecStart=/usr/local/bin/kamailio_exporter \
    --kamailio.address=unix:/var/run/kamailio/kamailio_ctl \
    --web.listen-address=:9494
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now kamailio-exporter

Key Kamailio metrics:

| Metric | Meaning | Alert Threshold |
|---|---|---|
| kamailio_dialog_active | Active calls through Kamailio | >5000 (capacity warning) |
| kamailio_tmx_code_total{code="5xx"} | 5xx SIP errors | >10/min |
| kamailio_tmx_code_total{code="408"} | Request timeouts | >5/min |
| kamailio_dispatcher_target_up | Backend FS health | == 0 (all down) |
| kamailio_sl_sent_replies_total | Reply rate | Sudden drop/spike |
| kamailio_pike_blocked | Rate-limited IPs | >0 (potential attack) |
| kamailio_core_shm_free | Shared memory free | <10% (memory pressure) |

FreeSWITCH Exporter

# Install freeswitch-exporter
pip3 install freeswitch-exporter

# Or use a custom script via ESL
cat > /usr/local/bin/freeswitch_exporter.py << 'PYEOF'
#!/usr/bin/env python3
"""FreeSWITCH Prometheus exporter via ESL."""
import subprocess
import time
from prometheus_client import start_http_server, Gauge

# Metrics
calls_active = Gauge('freeswitch_calls_active', 'Active calls')
channels_active = Gauge('freeswitch_channels_active', 'Active channels')
registrations = Gauge('freeswitch_registrations_active', 'Active registrations')
cpu_idle = Gauge('freeswitch_cpu_idle_percent', 'CPU idle percentage')
sessions_peak = Gauge('freeswitch_sessions_peak', 'Peak sessions since start')
sessions_per_sec = Gauge('freeswitch_sessions_per_second', 'Current sessions per second')
uptime = Gauge('freeswitch_uptime_seconds', 'Uptime in seconds')

def collect():
    try:
        # Active calls
        out = subprocess.check_output(["fs_cli", "-x", "show calls count"], text=True)
        calls_active.set(int(out.strip().split()[0]))

        # Channels
        out = subprocess.check_output(["fs_cli", "-x", "show channels count"], text=True)
        channels_active.set(int(out.strip().split()[0]))

        # Registrations
        out = subprocess.check_output(["fs_cli", "-x", "show registrations count"], text=True)
        registrations.set(int(out.strip().split()[0]))

        # Status
        out = subprocess.check_output(["fs_cli", "-x", "status"], text=True)
        for line in out.split('\n'):
            if 'session(s) - peak' in line:
                parts = line.split()
                sessions_peak.set(int(parts[0]))
            if 'session(s) per Sec' in line:
                parts = line.split()
                sessions_per_sec.set(float(parts[0]))
            if 'years' in line or 'days' in line or 'hours' in line:
                # Parse uptime — simplified
                pass
    except Exception as e:
        print(f"Collection error: {e}")

if __name__ == '__main__':
    start_http_server(9282)
    while True:
        collect()
        time.sleep(15)
PYEOF
chmod +x /usr/local/bin/freeswitch_exporter.py

# Create systemd service
cat > /etc/systemd/system/freeswitch-exporter.service << 'EOF'
[Unit]
Description=FreeSWITCH Prometheus Exporter
After=freeswitch.service

[Service]
ExecStart=/usr/local/bin/freeswitch_exporter.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now freeswitch-exporter

RTPEngine Exporter

# rtpengine-exporter scrapes RTPEngine's statistics interface
cat > /usr/local/bin/rtpengine_exporter.py << 'PYEOF'
#!/usr/bin/env python3
"""RTPEngine Prometheus exporter via ng control protocol."""
import socket
import bencodepy
import time
from prometheus_client import start_http_server, Gauge

RTPENGINE_HOST = "127.0.0.1"
RTPENGINE_PORT = 2223

# Metrics
sessions = Gauge('rtpengine_sessions_active', 'Active media sessions')
sessions_total = Gauge('rtpengine_sessions_total', 'Total sessions since start')
errors = Gauge('rtpengine_errors_total', 'Total errors')
offer_total = Gauge('rtpengine_offer_total', 'Total offer commands')
answer_total = Gauge('rtpengine_answer_total', 'Total answer commands')
delete_total = Gauge('rtpengine_delete_total', 'Total delete commands')
packets_relayed = Gauge('rtpengine_packets_relayed', 'Packets relayed')
bytes_relayed = Gauge('rtpengine_bytes_relayed', 'Bytes relayed')

def query_rtpengine(command):
    """Send ng protocol command to RTPEngine."""
    cookie = "stats_" + str(int(time.time()))
    msg = bencodepy.encode({
        b"command": command.encode()
    })
    full_msg = f"{cookie} ".encode() + msg

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(2)
    sock.sendto(full_msg, (RTPENGINE_HOST, RTPENGINE_PORT))
    data, _ = sock.recvfrom(65535)
    sock.close()

    # Strip cookie prefix
    space_idx = data.index(b' ')
    return bencodepy.decode(data[space_idx + 1:])

def collect():
    try:
        result = query_rtpengine("list totals")
        if b'result' in result and result[b'result'] == b'ok':
            totals = result.get(b'totals', {})

            sessions.set(totals.get(b'current_sessions', 0))
            sessions_total.set(totals.get(b'total_sessions', 0))
            offer_total.set(totals.get(b'offer', 0))
            answer_total.set(totals.get(b'answer', 0))
            delete_total.set(totals.get(b'delete', 0))
    except Exception as e:
        print(f"Collection error: {e}")

if __name__ == '__main__':
    start_http_server(9283)
    while True:
        collect()
        time.sleep(15)
PYEOF
chmod +x /usr/local/bin/rtpengine_exporter.py

Prometheus Scrape Configuration

Add to your prometheus.yml:

scrape_configs:
  # Kamailio
  - job_name: 'kamailio'
    static_configs:
      - targets:
          - 'YOUR_KAM1_PRIVATE:9494'
          - 'YOUR_KAM2_PRIVATE:9494'
        labels:
          component: 'kamailio'

  # FreeSWITCH
  - job_name: 'freeswitch'
    static_configs:
      - targets:
          - 'YOUR_FS1_IP:9282'
          - 'YOUR_FS2_IP:9282'
          - 'YOUR_FS3_IP:9282'
        labels:
          component: 'freeswitch'

  # RTPEngine
  - job_name: 'rtpengine'
    static_configs:
      - targets:
          - 'YOUR_RTP1_PRIVATE:9283'
          - 'YOUR_RTP2_PRIVATE:9283'
        labels:
          component: 'rtpengine'

  # MariaDB (via mysqld_exporter)
  - job_name: 'mariadb'
    static_configs:
      - targets:
          - 'YOUR_DB1_IP:9104'
          - 'YOUR_DB2_IP:9104'
          - 'YOUR_DB3_IP:9104'
        labels:
          component: 'database'

Grafana Dashboard

Import or create a dashboard with these panels:

Row 1: Platform Overview
  - Total active calls (sum of all FS instances)
  - Active registrations
  - Calls per second (rate)
  - Platform uptime

Row 2: Kamailio
  - Active dialogs (gauge)
  - SIP response codes (stacked bar: 2xx, 3xx, 4xx, 5xx)
  - Dispatcher backend status (table: name, state, latency)
  - Shared memory usage (%)

Row 3: FreeSWITCH
  - Active calls per instance (stacked area)
  - Channels per instance (line)
  - CPU usage per instance (line)
  - Sessions per second (rate)

Row 4: RTPEngine
  - Active media sessions (gauge)
  - Packets relayed per second (rate)
  - Media errors (rate)
  - Session duration histogram

Row 5: Database
  - Queries per second
  - Replication lag (Galera)
  - Connection count
  - Slow queries

Homer — SIP Capture and Analysis

Homer provides deep SIP packet analysis — essential for debugging call flows across multiple components.

# Install heplify agent on each SIP component (Kamailio, FreeSWITCH)
wget https://github.com/sipcapture/heplify/releases/latest/download/heplify -O /usr/local/bin/heplify
chmod +x /usr/local/bin/heplify

# Run heplify on Kamailio servers
cat > /etc/systemd/system/heplify.service << 'EOF'
[Unit]
Description=HEPlify SIP Capture Agent
After=network.target

[Service]
ExecStart=/usr/local/bin/heplify \
    -i eth0 \
    -hs YOUR_HOMER_IP:9060 \
    -m SIP \
    -dim REGISTER \
    -pr 5060-5061
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now heplify

Alerting Rules

# prometheus/alerts/voip-platform.yml
groups:
  - name: voip_platform
    rules:
      # All FreeSWITCH servers down
      - alert: AllMediaServersDown
        expr: sum(up{job="freeswitch"}) == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "All FreeSWITCH media servers are down"

      # Single FreeSWITCH down
      - alert: MediaServerDown
        expr: up{job="freeswitch"} == 0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "FreeSWITCH {{ $labels.instance }} is down"

      # Kamailio high error rate
      - alert: KamailioHighErrorRate
        expr: rate(kamailio_tmx_code_total{code=~"5.."}[5m]) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Kamailio 5xx error rate > 0.5/sec"

      # Dispatcher all backends down
      - alert: DispatcherAllBackendsDown
        expr: kamailio_dispatcher_target_up == 0
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "All dispatcher backends are down"

      # RTPEngine down
      - alert: RTPEngineDown
        expr: up{job="rtpengine"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "RTPEngine {{ $labels.instance }} is down"

      # High call volume (capacity planning)
      - alert: HighCallVolume
        expr: sum(freeswitch_calls_active) > 2000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Platform handling {{ $value }} concurrent calls (threshold: 2000)"

      # Database replication lag
      - alert: GaleraReplicationLag
        expr: mysql_global_status_wsrep_local_recv_queue_avg > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Galera replication queue building up"

Operational Runbook

Adding a New FreeSWITCH Node

# 1. Set up the new server (base OS + FreeSWITCH install from Section 7)
# 2. Configure SIP profile, ACL, dialplan (copy from existing FS node)
# 3. Test locally: fs_cli -x "sofia status profile kamailio"

# 4. Add to dispatcher database
mysql -u kamailio -pYOUR_DB_PASSWORD kamailio -e \
    "INSERT INTO dispatcher (setid, destination, flags, priority, attrs, description) \
     VALUES (1, 'sip:NEW_FS_IP:5060', 0, 0, 'weight=50;duid=fs04', 'FreeSWITCH-4 Media');"

# 5. Reload dispatcher on Kamailio
kamcmd dispatcher.reload

# 6. Verify the new node appears
kamcmd dispatcher.list

# 7. Monitor — should start receiving calls within seconds

Removing a FreeSWITCH Node

# 1. Drain the node (Section 11)
./drain-freeswitch.sh fs03 YOUR_FS3_IP

# 2. Stop FreeSWITCH
ssh YOUR_FS3_IP "systemctl stop freeswitch"

# 3. Remove from dispatcher database
mysql -u kamailio -pYOUR_DB_PASSWORD kamailio -e \
    "DELETE FROM dispatcher WHERE destination='sip:YOUR_FS3_IP:5060';"

# 4. Reload dispatcher
kamcmd dispatcher.reload

Certificate Rotation

# 1. Renew certificate (certbot handles this automatically)
certbot renew

# 2. Reload services (handled by deploy hook, but manual if needed)
systemctl reload kamailio
systemctl restart rtpengine

# 3. Verify TLS
openssl s_client -connect YOUR_PUBLIC_VIP:5061 -brief
openssl s_client -connect YOUR_PUBLIC_VIP:8443 -brief

14. Troubleshooting

Call Flow Debugging Across Components

When a call fails, you need to trace it through all components. The Call-ID is the common thread:

Step 1: Find the Call-ID
  - From the SIP phone/trunk: check the INVITE headers
  - From Kamailio logs: grep for the caller/callee number
  - From Homer: search by phone number or time range

Step 2: Trace through Kamailio
  grep "CALL-ID-HERE" /var/log/kamailio.log

Step 3: Check which FreeSWITCH received it
  - Look for "DISPATCH:" log line with the Call-ID
  - Note the destination IP

Step 4: Trace on FreeSWITCH
  ssh fs01 "grep 'CALL-ID-HERE' /var/log/freeswitch/freeswitch.log"

Step 5: Check RTPEngine
  - RTPEngine logs show SDP manipulation per Call-ID
  journalctl -u rtpengine | grep "CALL-ID-HERE"
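If component logs are shipped to a central store, the Call-ID grouping described in steps 1-5 can be automated. A minimal sketch — the log formats and regex here are illustrative, so adapt them to your actual Kamailio/FreeSWITCH/RTPEngine layouts:

```python
import re
from collections import defaultdict

# Match "Call-ID: <value>" (and the "Call-ID=" variant) in a log line.
CALLID_RE = re.compile(r'Call-ID[:=]\s*([^\s;,]+)', re.IGNORECASE)

def correlate(lines):
    """Group (component, line) pairs by the Call-ID they mention.

    Returns a dict mapping Call-ID -> list of (component, line),
    preserving input order — one bucket per end-to-end call trace.
    """
    by_call = defaultdict(list)
    for component, line in lines:
        m = CALLID_RE.search(line)
        if m:
            by_call[m.group(1)].append((component, line))
    return dict(by_call)
```

Feed it lines tagged with their source host/component; each resulting bucket reads as one call's journey through Kamailio, FreeSWITCH, and RTPEngine.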

Live Debugging Commands

# ---- Kamailio ----

# Watch SIP traffic in real-time
sngrep -d eth0 port 5060

# Enable debug logging temporarily
kamcmd cfg.seti core debug 4
# ... reproduce the issue ...
kamcmd cfg.seti core debug 2   # Restore normal level

# Check active dialogs
kamcmd dlg.list

# Check dispatcher status
kamcmd dispatcher.list

# Memory usage
kamcmd core.shmmem

# ---- FreeSWITCH ----

# Show active calls
fs_cli -x "show calls"

# Show active channels with details
fs_cli -x "show channels"

# Trace a specific call (enable sofia debug)
fs_cli -x "sofia loglevel all 9"
# ... reproduce the issue ...
fs_cli -x "sofia loglevel all 0"   # Restore

# SIP trace on the kamailio profile
fs_cli -x "sofia profile kamailio siptrace on"
# ... reproduce ...
fs_cli -x "sofia profile kamailio siptrace off"

# Check codec negotiation
fs_cli -x "show channels" | grep -E "codec|read_codec|write_codec"

# ---- RTPEngine ----

# List all active sessions
rtpengine-ctl list sessions

# Show detailed stats
rtpengine-ctl list totals

# Show per-session details (requires Call-ID)
rtpengine-ctl list sessions CALL-ID-HERE

Common Issues and Solutions

Calls Not Reaching FreeSWITCH (Dispatcher Issues)

Symptom: Kamailio returns 503 "Service Unavailable"

Check 1: Are FreeSWITCH servers marked as active?
  kamcmd dispatcher.list
  Look for "FLAGS: AP" (Active + Probing)
  If "FLAGS: IP" or "FLAGS: DX" — server is detected as down

Check 2: Can Kamailio reach FreeSWITCH on port 5060?
  # From Kamailio server
  nc -u -z YOUR_FS1_IP 5060 && echo OK || echo FAIL
  sipsak -s sip:test@YOUR_FS1_IP:5060

Check 3: Is FreeSWITCH actually listening?
  ssh YOUR_FS1_IP "ss -ulnp | grep 5060"
  ssh YOUR_FS1_IP "fs_cli -x 'sofia status profile kamailio'"

Check 4: ACL blocking?
  ssh YOUR_FS1_IP "fs_cli -x 'reloadacl'"
  Check /var/log/freeswitch/freeswitch.log for "ACL reject"

Fix: If FS is running but dispatcher shows inactive, manually reset:
  kamcmd dispatcher.set_state a 1 sip:YOUR_FS1_IP:5060

One-Way Audio (RTPEngine Issues)

Symptom: Call connects but audio only flows in one direction (or no audio)

Check 1: Is RTPEngine running and reachable?
  echo 'ping1 d7:command4:pinge' | nc -u -w1 YOUR_RTP1_PRIVATE 2223
  Expected: 'ping1 d6:result4:ponge' — the ng protocol requires a cookie prefix before the bencoded command

Check 2: Are the RTPEngine interfaces correct?
  rtpengine-ctl list sessions
  Verify the session shows correct internal and external IPs

Check 3: SDP analysis — is RTPEngine rewriting SDPs correctly?
  sngrep on Kamailio — compare the SDP in the INVITE before and after rtpengine_offer()
  Towards FS, the c= line should carry RTPEngine's internal interface IP (replacing the caller's external address)
  Towards the trunk, the c= line in the 200 OK should carry RTPEngine's external IP (replacing the FS address)

Check 4: Firewall — are RTP ports open?
  On RTPEngine server: ufw status | grep 20000
  Must allow 20000-40000/udp from anywhere (external endpoints)

Check 5: Are there asymmetric routes?
  RTP must flow: External ↔ RTPEngine ↔ FreeSWITCH
  If any hop has incorrect routing, media breaks

Common fix: Verify interface= lines in rtpengine.conf
  interface = internal/PRIVATE_IP        ← Must be reachable from FreeSWITCH
  interface = external/PRIVATE_IP!PUBLIC_IP  ← PUBLIC_IP must be routable from internet

Registration Loops

Symptom: Registrations fail or loop infinitely

Check: Kamailio is trying to proxy REGISTER to FreeSWITCH,
       FreeSWITCH is sending it back to Kamailio

Fix: Ensure Kamailio handles registrations locally (save to location table)
     OR ensure FreeSWITCH does not relay registrations back

In kamailio.cfg, the REGISTER handler should either:
  save("location")   — store locally
  OR forward to FS and NOT relay back

In FreeSWITCH, ensure the kamailio profile does NOT have:
  <param name="accept-blind-reg" value="true"/>

Keepalived VIP Not Floating

Symptom: VIP stays on failed node or does not move to standby

Check 1: Is Keepalived running on both nodes?
  systemctl status keepalived

Check 2: VRRP communication
  tcpdump -i eth0 vrrp
  Both nodes should be sending VRRP advertisements

Check 3: Virtual router ID conflict?
  Ensure virtual_router_id is the same on both nodes
  Ensure no other Keepalived instance on the network uses the same ID

Check 4: Check health script
  /etc/keepalived/check_kamailio.sh
  echo $?   # Should be 0 (healthy) or 1 (unhealthy)

Check 5: Nonlocal bind sysctl
  sysctl net.ipv4.ip_nonlocal_bind
  # Must be 1 for the backup node to send SIP from the VIP
  echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.d/90-voip.conf
  sysctl -p /etc/sysctl.d/90-voip.conf

Performance Bottleneck Identification

| Symptom | Likely Bottleneck | Check | Solution |
|---|---|---|---|
| High SIP latency | Kamailio CPU or database | top on Kamailio; slow query log | Add Kamailio workers; optimize DB queries |
| Choppy audio | RTPEngine CPU or network | top on RTPEngine; packet loss check | More RTPEngine CPU; check network path |
| Call setup delays | FreeSWITCH overloaded | fs_cli status; check sessions | Add more FS instances to pool |
| Registration failures | Database slow | Check MariaDB slow query log | Index optimization; increase connections |
| WebRTC connection failures | TLS/DTLS issues | Check certificates; browser console | Renew certs; verify DTLS config |
| Failover too slow | Keepalived/dispatcher timing | Check advert_int and ds_ping_interval | Reduce intervals (trade-off: more traffic) |

Capacity Planning Formulas

Kamailio (signaling only):
  Max CPS = CPU_cores × 1000 (approximately)
  4-core = ~4,000 calls/sec setup rate
  Memory: ~1 KB per active dialog + ~0.5 KB per registration

RTPEngine (media relay):
  Max streams = CPU_cores × 500 (G.711, no transcoding)
  8-core = ~4,000 RTP streams = ~2,000 concurrent calls
  With transcoding: divide by 3-5
  Bandwidth: 87 kbps × concurrent_calls × 2 (bidirectional)

FreeSWITCH (media processing):
  G.711 (no transcoding): CPU_cores × 300
  With recording: CPU_cores × 200
  With transcoding: CPU_cores × 100
  With conferencing: CPU_cores × 50 (mixing is expensive)
  Memory: ~2 MB per active call (+ recording buffer)
  Disk I/O: ~100 KB/s per recorded call (G.711)

Database:
  1 registration = 1 write + periodic refreshes
  1 call = ~5-10 queries (setup + routing + CDR)
  10,000 concurrent calls ≈ 500-1,000 queries/sec
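For quick back-of-envelope sizing, the rules of thumb above can be wrapped in a small helper. The constants are the approximations from this section, not benchmarks — validate against your own load tests:

```python
# Sizing helper implementing the rule-of-thumb formulas above.

def kamailio_max_cps(cores):
    """Approximate call-setup rate: ~1000 CPS per core."""
    return cores * 1000

def rtpengine_max_calls(cores, transcoding=False):
    """~500 G.711 streams per core; 2 streams per call."""
    streams = cores * 500
    if transcoding:
        streams //= 4          # mid-point of the 3-5x transcoding penalty
    return streams // 2

def freeswitch_max_calls(cores, mode="plain"):
    """Per-core call capacity by workload type."""
    per_core = {"plain": 300, "recording": 200,
                "transcoding": 100, "conferencing": 50}[mode]
    return cores * per_core

def rtp_bandwidth_mbps(concurrent_calls, codec_kbps=87):
    """Bidirectional G.711 bandwidth (incl. IP/UDP/RTP overhead)."""
    return concurrent_calls * codec_kbps * 2 / 1000
```

For example, an 8-core RTPEngine tops out around 2,000 plain G.711 calls, which at full load pushes roughly 350 Mbps of RTP.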

Essential Commands Quick Reference

| Component | Command | Purpose |
|---|---|---|
| Kamailio | kamcmd dispatcher.list | Show backend status |
| | kamcmd dlg.list | List active dialogs |
| | kamcmd core.shmmem | Check memory usage |
| | kamcmd cfg.seti core debug 4 | Enable debug logging |
| | kamcmd ul.dump | Dump registration table |
| | kamcmd stats.get_statistics all | All statistics |
| | sngrep -d eth0 port 5060 | Live SIP capture |
| FreeSWITCH | fs_cli -x "show calls" | List active calls |
| | fs_cli -x "show channels" | List active channels |
| | fs_cli -x "sofia status" | SIP profile status |
| | fs_cli -x "sofia status profile kamailio" | Kamailio profile details |
| | fs_cli -x "reloadxml" | Reload XML config |
| | fs_cli -x "status" | Overall status |
| RTPEngine | rtpengine-ctl list sessions | Active media sessions |
| | rtpengine-ctl list totals | Aggregate statistics |
| | echo 'ping1 d7:command4:pinge' \| nc -u IP 2223 | Ping test |
| Keepalived | ip addr show eth0 | Check VIP assignment |
| | systemctl status keepalived | Service status |
| | journalctl -u keepalived -f | Live logs |
| MariaDB | SHOW STATUS LIKE 'wsrep%'; | Galera cluster status |
| | SHOW PROCESSLIST; | Active queries |
| Homer | Web UI: http://HOMER_IP:9080 | SIP trace search |

This concludes Tutorial 43. You now have the knowledge to build, operate, and troubleshoot a carrier-grade VoIP platform with Kamailio + FreeSWITCH + RTPEngine. The architecture described here scales from hundreds to tens of thousands of concurrent calls and provides the fault tolerance expected of production telecommunications infrastructure.

Key takeaways:

  - Kamailio handles signaling and load balancing; FreeSWITCH handles media; RTPEngine bridges media between public endpoints and the private FreeSWITCH pool
  - Dispatcher probing, Keepalived VIPs, and a Galera cluster together remove every single point of failure
  - DNS SRV records plus GeoIP routing extend the same design across data centers
  - Monitor every layer (Prometheus, Grafana, Homer) — cross-component Call-ID tracing is what makes multi-server call flows debuggable

Need expert help with your setup?

VoIP infrastructure consulting, AI voice agent integration, monitoring stacks, scaling — I've done it all in production.

Get a Free Consultation