Setting up HA with HAProxy and Keepalived in AWS ~ Peter Manton

Wednesday, 20 July 2016


By Peter

In: centos, haproxy, keepalived

Setting up HA with HAProxy and Keepalived in AWS

By default, keepalived uses multicast to exchange the VRRP advertisements on which it bases its decisions about host availability. Cloud platforms such as AWS and Google Cloud Platform do not currently support multicast, so we must instruct keepalived to use unicast instead.

For this exercise there will be two HAProxy instances (a master and a slave node) that share an Elastic IP between them, with keepalived performing the switchover where necessary.

These two load balancers will then interact with two backend application servers, each of which in turn interacts with its own backend database server. The database servers have SQL replication set up between them.

On the master we should first ensure the system is up to date and that suitable versions of haproxy and keepalived are installed (for keepalived, anything newer than 1.2.13).

yum update && yum install haproxy keepalived
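To compare an installed version against the 1.2.13 figure mentioned above, something like the following could be used. It is only a sketch: `version_gt` is a helper name made up for illustration, it relies on GNU `sort -V`, and the commented `rpm` query assumes the package came from an RPM.

```shell
# version_gt A B: succeeds when version A is strictly newer than version B.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Example use on the node itself (swap the package name as needed):
#   installed="$(rpm -q --queryformat '%{VERSION}' keepalived)"
#   version_gt "$installed" 1.2.13 && echo "keepalived $installed is new enough"
```

A plain string comparison would get this wrong (1.2.8 sorts after 1.2.13 alphabetically), which is why version sort is used.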

Ensure both of them start up on boot:

systemctl enable keepalived
systemctl enable haproxy

Now, in a normal environment keepalived does a great job of automatically assigning the shared IP to the necessary host. Due to (static IP configuration) limitations within AWS this is not possible, so instead we instruct keepalived to run a script when a failover occurs. The script simply uses the AWS API to re-associate an Elastic IP from the master to the slave (or vice versa).

We should replace the keepalived.conf configuration as follows:

sudo mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.orig
sudo vi /etc/keepalived/keepalived.conf

vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 5
    fall 2 # fail twice before failing test
    rise 2 # succeed twice before passing test
}

vrrp_instance VI_1 {
    debug 2
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    # this node's own address
    unicast_src_ip 10.11.12.201
    # unicast_peer lists the slave node's address (placeholder shown)
    unicast_peer {
        5.6.7.8
    }
    track_script {
        chk_haproxy
    }
    notify_master /usr/libexec/keepalived/failover.sh
}
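To get a feel for what the 'fall' and 'rise' settings do, here is a small simulation. It is my own sketch of the counting logic (consecutive failures before the check is marked down, consecutive successes before it is marked up again), not keepalived's actual code, and `simulate_checks` is a made-up name.

```shell
# simulate_checks FALL RISE RESULT...
# Each RESULT is a check-script exit code (0 = success, anything else = failure).
# Prints the tracked state ("up"/"down") after every check, starting from "up".
simulate_checks() {
    fall=$1; rise=$2; shift 2
    state=up; fails=0; oks=0; out=""
    for rc in "$@"; do
        if [ "$rc" -eq 0 ]; then
            oks=$((oks + 1)); fails=0
            if [ "$state" = down ] && [ "$oks" -ge "$rise" ]; then state=up; fi
        else
            fails=$((fails + 1)); oks=0
            if [ "$state" = up ] && [ "$fails" -ge "$fall" ]; then state=down; fi
        fi
        out="$out${out:+ }$state"
    done
    printf '%s\n' "$out"
}

# fall 2, rise 2: one success, two failures, two successes
simulate_checks 2 2 0 1 1 0 0   # prints: up up down down up
```

With 'interval 5' and 'fall 2' this means haproxy has to be missing for two checks in a row, roughly ten seconds, before a failover is triggered.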

and add the following on the slave node:

sudo cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.orig
sudo vi /etc/keepalived/keepalived.conf

vrrp_script chk_haproxy {
    script "pidof haproxy"
    interval 2
}

vrrp_instance VI_1 {
    debug 2
    interface eth0
    state BACKUP
    virtual_router_id 51
    priority 100
    # this node's own address
    unicast_src_ip 10.11.13.202
    # unicast_peer lists the master node's address (placeholder shown)
    unicast_peer {
        1.2.3.4
    }
    track_script {
        chk_haproxy
    }
    notify_master /usr/libexec/keepalived/failover.sh
    notify_fault /usr/libexec/keepalived/failover_fault.sh
}
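The unicast settings on the two nodes mirror each other: each node's 'unicast_src_ip' is its own address and its 'unicast_peer' is the other node. The sketch below prints both stanzas from a pair of addresses. `unicast_stanza` is a hypothetical helper, and I am assuming the two 'unicast_src_ip' values above are the addresses the nodes can reach each other on.

```shell
# unicast_stanza OWN_IP PEER_IP
# Prints the unicast lines for one node's vrrp_instance block.
unicast_stanza() {
    printf 'unicast_src_ip %s\nunicast_peer {\n    %s\n}\n' "$1" "$2"
}

# Assumed addresses: swap in the real IPs of the two nodes.
MASTER_IP=10.11.12.201
SLAVE_IP=10.11.13.202

echo "# master:"; unicast_stanza "$MASTER_IP" "$SLAVE_IP"
echo "# slave:";  unicast_stanza "$SLAVE_IP" "$MASTER_IP"
```

Bear in mind that VRRP is its own IP protocol (number 112), so the security groups need to allow it between the two nodes.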

Now we will create the script referenced by 'notify_master'. Before we do this, we should use AWS IAM to create and configure the relevant role for our servers so they are able to use the AWS CLI to switch the Elastic IP.

I created a policy along these lines:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:AssignPrivateIpAddresses",
                "ec2:AssociateAddress",
                "ec2:DescribeInstances"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}

* Although I would recommend specifying the resource explicitly to tighten it up.
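If you would rather create the policy from the command line than the console, a sketch along these lines should work. `write_policy`, the file name and the policy name are all placeholders of mine; the policy then still needs attaching to the role used by the instances.

```shell
# write_policy FILE: save the policy document shown above to FILE.
write_policy() {
    cat > "$1" <<'EOF'
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:AssignPrivateIpAddresses",
                "ec2:AssociateAddress",
                "ec2:DescribeInstances"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}
EOF
}

# On a machine with IAM admin credentials (names are placeholders):
#   write_policy keepalived-eip-policy.json
#   aws iam create-policy --policy-name keepalived-eip-failover \
#       --policy-document file://keepalived-eip-policy.json
```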

Now we will create the script (on both nodes):

sudo vi /usr/libexec/keepalived/failover.sh
sudo chmod 700 /usr/libexec/keepalived/failover.sh

#!/bin/bash

# Placeholders: use this node's own instance ID and secondary private IP.
ALLOCATION_ID=eipalloc-123456
INSTANCE_ID=i-123456789
SECONDARY_PRIVATE_IP=172.30.0.101

/usr/bin/aws ec2 associate-address --allocation-id "$ALLOCATION_ID" --instance-id "$INSTANCE_ID" --private-ip-address "$SECONDARY_PRIVATE_IP" --allow-reassociation
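Since the script lives on both nodes, one way to avoid hard-coding a different instance ID in each copy is to ask the EC2 instance metadata service for it. The following is a hedged variant, not what I originally ran: the function names and the `AWS_CLI` override (handy for a dry run with `AWS_CLI=echo`) are my own, and instances that enforce IMDSv2 would need to fetch a session token first, which is not shown here.

```shell
#!/bin/bash
# Sketch of a failover.sh variant that looks up its own instance ID.
ALLOCATION_ID="${ALLOCATION_ID:-eipalloc-123456}"            # placeholder
SECONDARY_PRIVATE_IP="${SECONDARY_PRIVATE_IP:-172.30.0.101}" # placeholder
METADATA_URL="http://169.254.169.254/latest/meta-data/instance-id"

# Prints this instance's ID; an INSTANCE_ID variable, if set, takes precedence.
get_instance_id() {
    if [ -n "${INSTANCE_ID:-}" ]; then
        printf '%s\n' "$INSTANCE_ID"
    else
        curl -s --max-time 2 "$METADATA_URL"
    fi
}

# Re-associates the Elastic IP with this instance's secondary private IP.
failover() {
    instance_id="$(get_instance_id)"
    if [ -z "$instance_id" ]; then
        echo "could not determine instance ID" >&2
        return 1
    fi
    "${AWS_CLI:-/usr/bin/aws}" ec2 associate-address \
        --allocation-id "$ALLOCATION_ID" \
        --instance-id "$instance_id" \
        --private-ip-address "$SECONDARY_PRIVATE_IP" \
        --allow-reassociation
}

# As the last line of the real script, call:
#   failover
```

The secondary private IP still differs per node, so that value has to be set individually on each one.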

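and then (on each node) configure the AWS CLI (if the instance has an IAM role attached, the access key prompts can be left blank and only the default region needs setting):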

aws configure

For the networking side we will have a single interface on each node, although both of them will have a secondary IP (which we will associate with our Elastic IP). The IPs of the two machines will also be in separate subnets, since they are spread across two availability zones.
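Before starting anything it is worth confirming that the secondary IP is actually configured on the interface. A minimal check could look like this: `has_address` is a made-up helper that reads the output of `ip -o -4 addr show`, and the address and interface name are the example values used in this post.

```shell
# has_address IP: reads `ip -o -4 addr show` output on stdin and succeeds
# if IP is configured. The trailing "/" stops 172.30.0.10 matching 172.30.0.101.
has_address() {
    grep -qF "inet $1/"
}

# Example use on a node:
#   ip -o -4 addr show dev eth0 | has_address 172.30.0.101 \
#       || echo "secondary IP missing on eth0" >&2
```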

We should now start keepalived and haproxy on both hosts:

sudo service keepalived start
sudo service haproxy start

** You WILL almost certainly come across problems with SELinux (if it's enabled) - ensure you check your audit.log for any related messages and resolve those problems before continuing! **
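To pick the relevant messages out of the audit log, a filter like the one below can help. It is a rough sketch: `selinux_denials` is a name I made up, the default path is the usual audit log location on CentOS, and the keywords are simply the ones relevant to this setup.

```shell
# selinux_denials [LOGFILE]: print AVC denials that mention keepalived,
# haproxy or the failover script.
selinux_denials() {
    grep -E 'avc: +denied' "${1:-/var/log/audit/audit.log}" |
        grep -E 'keepalived|haproxy|failover'
}

# Example use (reading the audit log needs root):
#   sudo sh -c 'grep -E "avc: +denied" /var/log/audit/audit.log' | grep keepalived
# or, with the audit tools installed:
#   sudo ausearch -m avc -ts recent
```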

We should now see the following on the master node:

tail -f /var/log/messages

Jul 18 15:34:40 localhost Keepalived_vrrp[27585]: VRRP_Script(chk_haproxy) succeeded
Jul 18 15:34:40 localhost Keepalived_vrrp[27585]: Kernel is reporting: interface eth0 UP
Jul 18 15:34:40 localhost Keepalived_vrrp[27585]: VRRP_Instance(VI_1) Transition to MASTER STATE
Jul 18 15:34:41 localhost Keepalived_vrrp[27585]: VRRP_Instance(VI_1) Entering MASTER STATE
Jul 18 15:34:59 localhost Keepalived_vrrp[27585]: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
Jul 18 15:34:59 localhost Keepalived_vrrp[27585]: VRRP_Instance(VI_1) Received lower prio advert, forcing new election

and then on the slave node:

tail -f /var/log/messages

Jul 18 15:34:54 localhost Keepalived_vrrp[27641]: VRRP_Script(chk_haproxy) succeeded
Jul 18 15:34:55 localhost Keepalived_vrrp[27641]: Kernel is reporting: interface eth0 UP
Jul 18 15:34:59 localhost Keepalived_vrrp[27641]: VRRP_Instance(VI_1) Transition to MASTER STATE
Jul 18 15:34:59 localhost Keepalived_vrrp[27641]: VRRP_Instance(VI_1) Received higher prio advert
Jul 18 15:34:59 localhost Keepalived_vrrp[27641]: VRRP_Instance(VI_1) Entering BACKUP STATE
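Rather than reading through the log by eye, the last state each node settled on can be pulled out with a one-liner. This is just a convenience sketch: `last_vrrp_state` is a hypothetical helper and it assumes the 'Entering ... STATE' wording seen in the log lines above.

```shell
# last_vrrp_state [LOGFILE] [INSTANCE]: print the most recent state
# (e.g. MASTER or BACKUP) that keepalived logged for the instance.
last_vrrp_state() {
    sed -n "s/.*VRRP_Instance(${2:-VI_1}) Entering \([A-Z]*\) STATE.*/\1/p" \
        "${1:-/var/log/messages}" | tail -n 1
}

# Example use on a node:
#   sudo sh -c 'cat /var/log/messages' > /tmp/messages.copy
#   last_vrrp_state /tmp/messages.copy
```

Note that the 'Transition to MASTER STATE' lines are deliberately not matched, since the slave logs one of those before backing off.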

We will now configure the HAProxy portion by replacing the existing haproxy config:

mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.orig

vi /etc/haproxy/haproxy.cfg

global
    daemon
    maxconn 4000
    stats socket /var/run/haproxy.sock mode 600 level admin
    stats timeout 2m
    user haproxy
    group haproxy

defaults
    log global
    mode tcp
    option tcplog
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend www
    bind 172.30.0.241:80
    default_backend webserver_pool

backend webserver_pool
    balance roundrobin
    mode http
    option httplog
    option httpchk GET /someService/isAlive
    server serverA 10.11.12.13:8080 check inter 5000 downinter 500 # active node
    server serverB 10.12.13.14:8080 check inter 5000 backup # passive node

listen admin
    bind 172.30.0.241:8777
    mode http # the stats page is only served in HTTP mode
    stats enable
    stats realm Haproxy\ Statistics
    stats auth adminuser:secure_pa$$word!
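The 'stats socket' line in the global section also lets us query server health from the shell. The sketch below parses the CSV that the 'show stat' command returns, where the 18th field is the status column. `server_status` is a made-up helper, and the commented example assumes socat is installed.

```shell
# server_status [BACKEND]: reads `show stat` CSV on stdin and prints
# "server status" for each server line of BACKEND (field 18 is the status).
server_status() {
    awk -F, -v be="${1:-webserver_pool}" \
        '$1 == be && $2 != "BACKEND" && $2 != "FRONTEND" { print $2, $18 }'
}

# Example use on a node:
#   echo "show stat" | sudo socat stdio /var/run/haproxy.sock | server_status
```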

Finally reload both haproxy instances to apply the new configuration.
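It is safer to have haproxy check the file before reloading, so that a typo does not take the load balancer down. A minimal sketch follows; `reload_haproxy` and the `HAPROXY` / `SYSTEMCTL` overrides are names of my own, there only so the logic can be dry-run.

```shell
# reload_haproxy [CONFIG]: reload only if the configuration parses cleanly.
# HAPROXY and SYSTEMCTL are overridable purely so the logic can be dry-run.
reload_haproxy() {
    cfg="${1:-/etc/haproxy/haproxy.cfg}"
    if "${HAPROXY:-haproxy}" -c -f "$cfg"; then
        "${SYSTEMCTL:-systemctl}" reload haproxy
    else
        echo "not reloading: $cfg failed validation" >&2
        return 1
    fi
}

# Example use on each node (run as root):
#   reload_haproxy
```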

Peter Manton :: Tech Notes
