Thursday, January 30, 2014

Attempt to create a FusionPBX cluster

Configure cluster

Install some basic software: (Do on both servers)

yum install -y gd gd-devel glib-devel nc patch readline-devel libffi-devel ruby rubygems ruby-devel  htop mlocate python-dateutil redhat-rpm-config python-lxml


Add your two hosts to your hosts file: (Do on both servers)

nano /etc/hosts
# Use whatever your ips and hostnames are here
10.10.10.1   pbx1
10.10.10.2   pbx2

And change their hostnames: (Do on both servers, using pbx2 as the HOSTNAME on the second one)
nano /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=pbx1

Allow hosts to bind to nonlocal IP:
Add the following line to /etc/sysctl.conf or change it manually (nano /etc/sysctl.conf)
echo 'net.ipv4.ip_nonlocal_bind=1' >> /etc/sysctl.conf 

Restart networking:
/etc/init.d/network restart

Run:
sysctl -p

You should see:
net.ipv4.ip_nonlocal_bind = 1
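
Running the echo above a second time appends a duplicate line to /etc/sysctl.conf. A minimal sketch of one way to avoid that, assuming a small helper (ensure_line is a made-up name, not part of any tool) that appends only when the exact line is missing:

```shell
#!/bin/sh
# ensure_line LINE FILE
# Hypothetical helper: append LINE to FILE unless an identical line is already there.
ensure_line() {
  line=$1
  file=$2
  # -q quiet, -x match the whole line, -F treat LINE as a fixed string
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Usage (as root): ensure_line 'net.ipv4.ip_nonlocal_bind=1' /etc/sysctl.conf && sysctl -p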

The LSB Script
Copy and paste this script into /etc/init.d/FSSofia

#!/bin/sh
### -*- mode:shell-script; indent-tabs-mode:nil; sh-basic-offset:2 -*-
### BEGIN INIT INFO
# Provides: FSSofia
# Required-Start:
# Required-Stop:
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: FSSofia
# Description: FSSofia Status
### END INIT INFO
#set -x

FS_CLI_PROG='/usr/local/freeswitch/bin/fs_cli'
FS_CLI_HOST='127.0.0.1'
FS_CLI_PORT='8021'
FS_CLI_PASS='ClueCon'
PROFILES='internal external'

usage() {
  echo "Usage: $0 {start|stop|status}"
  exit 1
}

fs_cli() {
  "$FS_CLI_PROG" -H "$FS_CLI_HOST" -P "$FS_CLI_PORT" -p "$FS_CLI_PASS" -x "$1"
}

sofia_profile_started() {
  fs_cli "sofia xmlstatus" | grep "<name>$1</name>" | wc -l
}

if [ $# != 1 ]; then
  usage
fi


#PROFILES=`echo $1 | tr ',' ' '`
CMD=$1
#was $2

case "$CMD" in
  'start')
     fs_cli "sofia recover"
     exit 0
     ;;
  'stop')
     exit 0
     ;;
  'status')
     for p in $PROFILES; do
       if [ `sofia_profile_started "$p"` -eq 0 ]; then
         echo "$p DOWN"
         exit 3
       fi
     done
     echo "OK"
     exit 0
     ;;
  *)
     usage
     ;;
esac

Make the script executable
chmod +x /etc/init.d/FSSofia
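
The status branch of the script is what Pacemaker polls: exit code 0 means every listed profile is up, and 3 is the LSB code for "not running". The sketch below pulls that logic into a function so it can be tried without a live FreeSWITCH. The function name check_profiles and the trimmed sample text are assumptions for illustration; real "sofia xmlstatus" output contains more fields.

```shell
#!/bin/sh
# check_profiles XML PROFILE...
# Returns 0 and prints OK if every PROFILE appears as <name>PROFILE</name> in XML.
# Returns 3 (LSB "program is not running") at the first profile that is missing.
check_profiles() {
  xml=$1
  shift
  for p in "$@"; do
    if ! printf '%s\n' "$xml" | grep -q "<name>$p</name>"; then
      echo "$p DOWN"
      return 3
    fi
  done
  echo "OK"
}

# Hypothetical, trimmed-down sample: only the internal profile is listed.
sample='<profiles>
<profile><name>internal</name></profile>
</profiles>'
```

With this sample, check_profiles "$sample" internal succeeds, while check_profiles "$sample" internal external reports the external profile as down and returns 3.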

Install Pacemaker: (Do on both servers)
yum install -y pacemaker cman ccs

Add them to start up: (Do on both servers)
chkconfig pacemaker on
NOTE: This step takes way too long to finish.

Create Corosync Key: (Only do on the first server)
corosync-keygen

Set permissions on the key: (Do on first server)
chown root:root /etc/corosync/authkey
chmod 400 /etc/corosync/authkey
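
A quick way to confirm the permissions took effect, sketched with GNU stat (the -c format option is assumed to be available, as it is with coreutils on CentOS). The helper name has_mode is made up for this example.

```shell
#!/bin/sh
# has_mode FILE MODE
# Succeeds if FILE's octal permission bits are exactly MODE (for example 400).
has_mode() {
  [ "$(stat -c '%a' "$1")" = "$2" ]
}
```

Usage: has_mode /etc/corosync/authkey 400 && echo "authkey mode OK"
The owner can be checked the same way with stat -c '%U:%G'.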

Rsync the key to your 2nd node:
rsync -avh /etc/corosync/authkey root@10.10.10.2:/etc/corosync/authkey

Get a couple additional files we will need: (One is an admin utility the other is a dependency) (Do on both servers)
cd /usr/local/src
rpm -Uvh http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/pssh-2.3.1-3.2.x86_64.rpm
rpm -Uvh http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/crmsh-1.2.6-6.1.x86_64.rpm
Note: Log out of your PuTTY session and log back in to allow the crm utility to be run.

Install a base corosync config: (Do on first server)
cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf


Edit the corosync.conf file: (Do on first server)
nano /etc/corosync/corosync.conf
compatibility: whitetank
totem {
        version: 2
        secauth: off
        threads: 0
        interface {
                ringnumber: 0
                # bindnetaddr is the network address of your subnet; this example uses 10.10.10.0
                bindnetaddr: 10.10.10.0
                mcastaddr: 226.94.1.1
                # broadcast: yes
                mcastport: 5405
                ttl: 1
        }
}
logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
amf {
        mode: disabled
}
service {
        name: pacemaker
        ver: 0
}
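
The bindnetaddr value in the config above is the network address, which is each octet of a host IP ANDed with the matching octet of the netmask. A small sketch of that arithmetic follows. The function name network_addr is made up, and the 255.255.255.0 netmask in the usage line is an assumption that matches a /24 network.

```shell
#!/bin/sh
# network_addr IP NETMASK
# Prints the network address: a bitwise AND of IP and NETMASK, octet by octet.
network_addr() {
  old_ifs=$IFS
  IFS=.
  # Unquoted on purpose: split both dotted quads on "." into $1..$8
  set -- $1 $2
  IFS=$old_ifs
  printf '%d.%d.%d.%d\n' "$(( $1 & $5 ))" "$(( $2 & $6 ))" "$(( $3 & $7 ))" "$(( $4 & $8 ))"
}
```

Usage: network_addr 10.10.10.1 255.255.255.0 prints 10.10.10.0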

Make the log folder: (Do on both servers)
mkdir /var/log/cluster

Rsync the config file to the 2nd node in cluster:
rsync -avh /etc/corosync/corosync.conf root@10.10.10.2:/etc/corosync/corosync.conf

Start the service: (Do on first server)
/etc/init.d/corosync start

Check the log file to make sure no errors jump out: (Do on first server)
tail -f /var/log/cluster/corosync.log

Create the folder crm configure needs: (Do on both servers)
mkdir -p /var/lib/pacemaker/cores/root

Stop the service as we will use crm to manage it and pacemaker to start it: (Do on first server)
/etc/init.d/corosync stop

Setup the pacemaker cluster: (Do on first server)
# At the command line
ccs -f /etc/cluster/cluster.conf --createcluster fusion
ccs -f /etc/cluster/cluster.conf --addnode pbx1
ccs -f /etc/cluster/cluster.conf --addnode pbx2
ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect pbx1
ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect pbx2
ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk pbx1 pcmk-redirect port=pbx1
ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk pbx2 pcmk-redirect port=pbx2
echo "CMAN_QUORUM_TIMEOUT=0" >> /etc/sysconfig/cman
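
The per-node ccs commands above follow a fixed pattern, so for more than two nodes they could be generated. This is a dry-run sketch only: print_ccs_cmds is a made-up name, it prints the commands rather than running them, and it emits the method and fence instance for each node together rather than in the exact order listed above.

```shell
#!/bin/sh
# print_ccs_cmds CLUSTER NODE...
# Prints (does not execute) the ccs commands for the given cluster name and nodes.
print_ccs_cmds() {
  conf=/etc/cluster/cluster.conf
  cluster=$1
  shift
  echo "ccs -f $conf --createcluster $cluster"
  for n in "$@"; do
    echo "ccs -f $conf --addnode $n"
  done
  echo "ccs -f $conf --addfencedev pcmk agent=fence_pcmk"
  for n in "$@"; do
    echo "ccs -f $conf --addmethod pcmk-redirect $n"
    echo "ccs -f $conf --addfenceinst pcmk $n pcmk-redirect port=$n"
  done
}
```

Usage: print_ccs_cmds fusion pbx1 pbx2 shows the commands for review; piping the output to sh would run them.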

Rsync cluster file to 2nd server:
rsync -avh /etc/cluster/cluster.conf root@10.10.10.2:/etc/cluster/cluster.conf
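
After copying files between nodes, comparing checksums is one way to confirm both servers hold the same content. A minimal sketch, where file_sum is a made-up helper around sha256sum from coreutils, and the ssh call in the usage line assumes the same root access to 10.10.10.2 that the rsync commands use:

```shell
#!/bin/sh
# file_sum FILE
# Prints only the SHA-256 hex digest of FILE.
file_sum() {
  sha256sum "$1" | cut -d' ' -f1
}
```

Usage:
local_sum=$(file_sum /etc/cluster/cluster.conf)
remote_sum=$(ssh root@10.10.10.2 sha256sum /etc/cluster/cluster.conf | cut -d' ' -f1)
[ "$local_sum" = "$remote_sum" ] && echo "cluster.conf in sync"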

Allow Pacemaker through iptables: (On both servers)
iptables -I INPUT 1 --protocol udp --dport 5405 -j ACCEPT
iptables -I INPUT 1 --protocol udp --sport 5404 -j ACCEPT
iptables -I OUTPUT 1 --protocol udp --dport 5405 -j ACCEPT
iptables -I OUTPUT 1 --protocol udp --sport 5404 -j ACCEPT
service iptables save

Start pacemaker: (On the first server)
/etc/init.d/pacemaker start

Setup the cluster options: (On the first server)
# At the command line
crm configure
> edit
Note: This will open the cluster configuration in vi / vim or whatever your default editor is.

Delete everything in the file and paste in the following: (On the first server)

node pbx1 \
        attributes standby="off"
node pbx2 \
        attributes standby="off"
#This section starts sofia recovery script
primitive fs lsb:FSSofia \
        op monitor interval="2s" enabled="true" timeout="8s" on-fail="stop" \
        meta target-role="Started" is-managed="true"
# This section is our internal ip and the gateway we use for all boxes on the network
primitive fs-ip ocf:heartbeat:IPaddr2 \
params ip="10.10.10.10" cidr_netmask="24" nic="eth0:0" \
op monitor interval="10s"
# This section is our public ip (or external that your isp gives you)
##primitive vip2 ocf:heartbeat:IPaddr2 \
## params ip="192.168.0.1" cidr_netmask="24" nic="eth1" \
## op monitor interval="40s" timeout="20s"
# This section is to raise an additional ip used for public web traffic to something like haproxy or load balancers
##primitive vip3 ocf:heartbeat:IPaddr2 \
## params ip="10.2.0.1" cidr_netmask="8" nic="eth2" \
## op monitor interval="40s" timeout="20s"
property $id="cib-bootstrap-options" \
dc-version="1.1.8-7.el6-394e906" \
cluster-infrastructure="cman" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
#vim:set syntax=pcmk

Save and commit your changes:
# Exit and save
:wq
# At the crm command line
commit
exit

Restart Pacemaker and check the logs: (On the first server)
/etc/init.d/pacemaker restart
tail -f /var/log/cluster/corosync.log

Start Pacemaker on the 2nd server:
/etc/init.d/pacemaker start

To test our setup we will launch crm_mon on the passive node, while running a ping in another session to watch for packet loss:
crm_mon
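
For the ping side of the test, the packet-loss figure can be pulled out of ping's summary line. This sketch assumes the summary format printed by the common Linux (iputils) ping, "N% packet loss". The helper name loss_pct is made up.

```shell
#!/bin/sh
# loss_pct
# Reads ping output on stdin and prints the packet-loss percentage from the summary line.
loss_pct() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

Usage: ping -c 30 10.10.10.10 | loss_pct
Run it while putting the active node into standby and compare the result with what crm_mon shows.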

Configure FreeSWITCH

You should have the following parameter set in freeswitch_base_dir/conf/vars.xml (or System - Variables in the FusionPBX GUI) on both nodes:
<X-PRE-PROCESS cmd="set" data="local_ip_v4=10.10.10.10"/>

You should have the following parameter set in both sofia profiles (internal and external) under freeswitch_base_dir/conf/sip_profiles/ (or Advanced - SIP Profiles in the FusionPBX GUI) on both nodes:
<param name="track-calls" value="true"/>

Also you should set the db connection parameters in both profiles:
<param name="odbc-dsn" value="freeswitch:freeswitch:password"/>

You should have the following parameters set in freeswitch_base_dir/conf/autoload_configs/switch.conf.xml on both nodes:
<param name="core-db-dsn" value="freeswitch:freeswitch:password"/>
<param name="core-recovery-db-dsn" value="freeswitch:freeswitch:password"/>
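
Since these settings have to match on both nodes, a simple grep check can catch a missed file. A rough sketch: has_param is a made-up helper, and it matches the exact text form shown above, so a line with different spacing or attribute order would not be found.

```shell
#!/bin/sh
# has_param FILE NAME VALUE
# Succeeds if FILE contains the literal text <param name="NAME" value="VALUE"/>.
has_param() {
  grep -qF "<param name=\"$2\" value=\"$3\"/>" "$1"
}
```

Usage: has_param freeswitch_base_dir/conf/autoload_configs/switch.conf.xml core-db-dsn freeswitch:freeswitch:password || echo "core-db-dsn missing"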


2 comments:

  1. You may want to read http://inside-out.xyz/technology/high-availability-fusionpbx-cluster-overview.html

  2. Did this method work or were there more changes ?
