Linux cluster

To service multiple FIX sessions reliably, FIXEdge can be deployed in a Linux HA cluster (Red Hat High-Availability Add-on Overview) with 2 or 3 nodes and shared storage for keeping a session's state.

The shared state can be kept in one of the following ways:

Shared physical device
Shared file system (GlusterFS)
FIXEdge Logs Replicator
DRBD

The cluster solution uses Corosync and Pacemaker – tools that provide HA clustering on Linux for applications that do not support clustering natively.

This article describes the Active-Passive / Hot-Warm Failover Cluster.

Active (Hot) node: a working FIXEdge node that processes runtime messages.
Passive (Warm) node: a FIXEdge node that receives periodic updates from the active node and is ready to start if the active node fails.

Virtual IP

FIXEdge nodes use a virtual IP as the entry point for FIX clients and remote administration tools (for example, FIXICC). The virtual IP is assigned to a node while that node is active.

This approach gives a working, highly-available solution and works well for active-passive clusters with few nodes and static FIX session configuration.
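
As a minimal sketch of how to tell which node currently holds the virtual IP, the helper below scans the output of `ip -o -4 addr show` for a given address. The function names are illustrative and are not part of FIXEdge or Pacemaker.

```bash
#!/usr/bin/env bash
# Sketch only: function names are hypothetical helpers.

# Reads `ip -o -4 addr show` output on stdin and returns 0 if the given
# IPv4 address appears after an "inet" keyword, non-zero otherwise.
address_in_listing() {
    local wanted=$1
    awk -v ip="$wanted" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    # The field after "inet" looks like 10.0.0.5/24.
                    split($(i + 1), parts, "/")
                    if (parts[1] == ip) found = 1
                }
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Returns 0 if this node currently has the address configured.
node_holds_virtual_ip() {
    ip -o -4 addr show | address_in_listing "$1"
}
```

For example, `node_holds_virtual_ip 10.17.135.17 && echo active` prints `active` only on the node that owns that address.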

Health checks

Health checks are used to figure out if a node or an application that runs on the node is operating properly.

The simplest way to do a node health check is to monitor the FIXEdge PID file.

More precise checks:

Check the system status by sending a GET request to the FIXEdge Admin REST API.
Establish a FIX Admin monitoring session.
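
The PID-file check mentioned above can be sketched as follows. The PID file location is an assumption and is passed in as a parameter; the function name is illustrative.

```bash
#!/usr/bin/env bash
# Sketch of a PID-file health check. Pass the path of the PID file that
# your FIXEdge instance actually writes (the location is an assumption).

# Returns 0 if the PID file exists and names a live process.
fixedge_pid_alive() {
    local pid_file=$1
    local pid=""

    # A missing or unreadable PID file is treated as "not running".
    [ -r "$pid_file" ] || return 1

    # Read the first line; tolerate a file without a trailing newline.
    read -r pid < "$pid_file" || [ -n "$pid" ] || return 1

    # Reject anything that is not a plain non-negative integer.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # Signal 0 only checks that the process exists and can be signalled.
    # Run the check as root or as the FIXEdge user, otherwise a live
    # process owned by another user is reported as not running.
    kill -0 "$pid" 2>/dev/null
}
```

A live PID only shows that the process exists, not that it is serving sessions, which is why the REST API and FIX Admin session checks are more precise.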

Shared physical device

Shared storage might be a SAN-attached device, Fibre channel attached device or TCP/IP attached device. The device might be attached to all nodes simultaneously, or the cluster resource manager can attach it to the active node only. The device might, in turn, be a resilient one, presenting the distributed file system with software or hardware replication between filesystem nodes. In the case of a geographically distributed cluster, the shared storage also can be distributed geographically in the same way as cluster nodes, and the cluster resource manager can attach the storage instance to the node in the same geo-location.

Two or three nodes constitute a cluster, and at any moment only one of them runs FIXEdge. When a failure of the active FIXEdge node is detected, the shared file system with the FIX message storage is unmounted on that node and mounted on the second node. Then FIXEdge is started on the second node. All sessions are handled by the single active server and are started on another node in case of failure.

FIXEdge start-up time increases with the number of sessions it serves.

Info

The current approach does not allow load balancing between the cluster nodes.

Load balancing is addressed by a separate solution: FIXEdge NGE

Multiple FIXEdge nodes need to have a consistent view of the FIX sessions' state, which includes the messages and sequence numbers.

There are several ways to achieve this.

The simplest is to use a shared network filesystem (for example, an NFS share). The drawback is significantly higher latency (approximately 20 times slower).

The recommended configuration is SAN storage that can be attached to each of the cluster nodes. The storage itself is a block device that is mounted on the node where FIXEdge is running. This approach provides shared storage for all nodes while keeping I/O latency low.

Environment Requirements

Info
The following instructions are for CentOS/RedHat 7. They don't work for CentOS/RedHat 6.

Network:
1. The channel between FIXEdge and the data provider - maximum available bandwidth
2. Dedicated network for cluster synchronization (heartbeats)
Open ports:
1. high-availability service
2. 8005/tcp
3. 8901/tcp
4. 8905/tcp
5. 1234/udp
Application:
1. 1 Core per 1 Data Provider (recommended for latency)
2. 1G for binaries
3. Collocated nodes in one DC for max performance
4. Health-check interval sufficient for FIXEdge to start
Storage:
1. Mandatory Fibre Channel SAN storage for session logs - 1.5 TB, directly attached to both nodes. Encrypted via standard LVM features.
2. Mandatory STONITH to ensure that FC storage is mounted on a single node
3. Mandatory Archive for session logs - https://kb.b2bits.com/display/B2BITS/FIXEdge+Capacity
4. Shared storage (NFS/SMB) for configuration - can be slow.
Experimental features:
1. Start/stop from FIXICC - concept conflicts with cluster control.
Operating system: RHEL 7 or newer.

Deployment Diagram

[Image: deployment diagram of the FIXEdge failover cluster]

Scheduled Tasks

Start FIXEdge at start-of-day: via Pacemaker, by enabling the FIXEdge resource (fixedge).
Stop FIXEdge at end-of-day or for maintenance hours: via Pacemaker, by disabling the FIXEdge resource (fixedge).
Log archiving: mandatory for FIXEdge operation. FIXEdge needs free space in the /opt/fixedge/FIXEdge1/log directory, so the files must be archived and the directory cleaned up periodically.
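
The periodic archiving task could be sketched as below. This is a hedged example, not the FIXEdge archiving procedure: the function name and the age threshold are assumptions, and which files are safe to move while FIXEdge is running depends on your configuration.

```bash
#!/usr/bin/env bash
# Sketch only: moves regular files that are older than a given number of
# days from the log directory into an archive directory.

archive_old_logs() {
    local log_dir=$1 archive_dir=$2 max_age_days=$3

    mkdir -p "$archive_dir" || return 1

    # -maxdepth 1 keeps find out of subdirectories such as archive/ and backup/.
    # -mtime +N matches files whose age, rounded down to whole days, exceeds N.
    find "$log_dir" -maxdepth 1 -type f -mtime +"$max_age_days" \
        -exec mv -- {} "$archive_dir"/ \;
}
```

A scheduled job could then call, for example, `archive_old_logs /opt/fixedge/FIXEdge1/log /opt/fixedge/FIXEdge1/log/archive 7` on the active node, where the 7-day threshold is only a placeholder.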

Filesystem Layout

Directory	Location	Purpose
`/opt/fixedge`	local filesystem	FIXEdge installation directory
`/opt/fixedge/FIXEdge1/conf`	NFS share	FIXEdge configuration
`/opt/fixedge/FIXEdge1/log`	SAN storage	FIXEdge logs, mounted on a single node at a time by pacemaker

How to Deploy

Install cluster software

Instructions below should be run on each cluster node. Superuser privileges are required for all the steps.

Install packages from a repository:
Code Block
language bash
yum install corosync pcs pacemaker
Set the password for the hacluster user:
Code Block
language bash
passwd hacluster

Open ports on the firewall:

Code Block

language	bash

firewall-cmd --add-service=high-availability 
firewall-cmd --runtime-to-permanent

Enable cluster services to run at system start-up:
Code Block
language bash
systemctl enable pcsd corosync pacemaker
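
A quick way to confirm the last step is to ask systemd for the enablement state of each unit. The helper name below is hypothetical; it relies only on `systemctl is-enabled`.

```bash
#!/usr/bin/env bash
# Sketch only: prints the enablement state of each given unit and returns
# non-zero if at least one of them is not reported as "enabled".

check_units_enabled() {
    local unit state rc=0
    for unit in "$@"; do
        # is-enabled prints the state and exits non-zero for disabled units.
        state=$(systemctl is-enabled "$unit" 2>/dev/null) || true
        printf '%s: %s\n' "$unit" "${state:-unknown}"
        [ "$state" = "enabled" ] || rc=1
    done
    return "$rc"
}
```

For the units above, the call would be `check_units_enabled pcsd corosync pacemaker`.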

Install FIXEdge

To install FIXEdge you first need the following artifacts:

the FIXEdge package (fixedge.tar.gz)
FIXEdge systemd integration configuration (fixedge-systemd.tar.gz)
the license (engine.license).

Superuser privileges are required for all the steps.

Unpack the FIXEdge package:
Code Block
language bash
mkdir --parents /opt/fixedge
tar --extract --file fixedge.tar.gz --directory /opt/fixedge

Unpack the FIXEdge systemd integration configuration:
Code Block
language bash
tar --extract --file fixedge-systemd.tar.gz --directory /

Add a user and a group for FIXEdge:
Code Block
language bash
groupadd --system fixedge
useradd --system --gid fixedge --home-dir /opt/fixedge --shell /sbin/nologin --comment "Account to own and run FIXEdge" fixedge

Change ownership of FIXEdge to the dedicated user:
Code Block
language bash
chown --recursive fixedge:fixedge /opt/fixedge

Copy the license:
Code Block
language bash
cp engine.license /opt/fixedge
Enable the FIXICC Agent to start at the system start-up:
Code Block
systemctl enable fixicc-agent

Firewall Configuration

Open ports for FIXEdge on the firewall:

Code Block

language	bash

firewall-cmd --add-port=8005/tcp --add-port=8901/tcp --add-port=8903/tcp --add-port=8905/tcp --add-port=1234/udp
firewall-cmd --runtime-to-permanent

Prepare Storage for the Session Logs and the Configuration

At this point, FIXEdge is deployed locally. Now, we need to make the configuration and the state shared.

Create an NFS share, copy the files from /opt/fixedge/FIXEdge1/conf to the share, and mount the share at /opt/fixedge/FIXEdge1/conf.

Use a Dual-Port SAN Device

Create an LVM volume group for the shared session logs storage:
Code Block
language bash
vgcreate shared_logs_group <SAN_STORAGE_DEVICE>
Replace <SAN_STORAGE_DEVICE> with the actual device name of the SAN storage.

Create a logical volume for the session logs:

Code Block

language	bash

lvcreate shared_logs_group --extents 100%FREE --name shared_logs

Prepare the logs storage. These commands must be executed on a single node only.

Create a filesystem on the logical volume:
Code Block
language bash
mkfs -t xfs /dev/shared_logs_group/shared_logs

Mount the filesystem:

Code Block

language	bash

mount /dev/shared_logs_group/shared_logs /opt/fixedge/FIXEdge1/log

Create a FIXEdge log directory structure:
Code Block
language bash
mkdir /opt/fixedge/FIXEdge1/log/archive /opt/fixedge/FIXEdge1/log/backup
Unmount and deactivate:
Code Block
language bash
umount /opt/fixedge/FIXEdge1/log
vgchange -a n shared_logs_group

Set up a FIXEdge Cluster

Now we need to set up cluster resources for FIXEdge. Superuser privileges are required for all the steps.

Authorize nodes for the hacluster user:
Code Block
language bash
pcs cluster auth NODE_1_NAME NODE_2_NAME -u hacluster
Where NODE_1_NAME and NODE_2_NAME are the hostnames of the servers that run FIXEdge.
Info
Note that these names must not resolve to 127.0.0.1 locally. If they do, use IP addresses instead of the hostnames.

Create the cluster and add nodes:
Code Block
language bash
pcs cluster setup --force --name fixedge_ha NODE_1_NAME NODE_2_NAME
Where NODE_1_NAME and NODE_2_NAME are the hostnames of the servers that run FIXEdge.
Info
Note that these names must not resolve to 127.0.0.1 locally. If they do, use IP addresses instead of the hostnames.
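
The name-resolution requirement from the note can be checked before running the pcs commands. The sketch below is an illustration with hypothetical function names; it assumes `getent` is available, as it is on RHEL and CentOS.

```bash
#!/usr/bin/env bash
# Sketch only: function names are hypothetical helpers.

# Returns 0 if the address is an IPv4 or IPv6 loopback address.
is_loopback_address() {
    case $1 in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Resolves a node name locally and fails if it is unresolvable or loopback.
check_node_name() {
    local name=$1 addr
    addr=$(getent hosts "$name" | awk 'NR == 1 { print $1 }')
    if [ -z "$addr" ]; then
        echo "$name: does not resolve" >&2
        return 1
    fi
    if is_loopback_address "$addr"; then
        echo "$name: resolves to loopback address $addr" >&2
        return 1
    fi
    echo "$name: $addr"
}
```

Running `check_node_name NODE_1_NAME && check_node_name NODE_2_NAME` on each node shows whether the hostnames can be used as they are.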

Start the cluster:
Code Block
language bash
pcs cluster start --all
The cluster is now starting up. You can check its status with the following commands:
Code Block
language bash
pcs status cluster
pcs status nodes
Disable resource migration on the first failure, since restarting on the same node takes less time than migration to another node:
Code Block
language bash
pcs property set start-failure-is-fatal=false
For a two-node cluster we must disable the quorum, but do not do this for a three-node cluster. Quorum avoids the situation when the cluster cannot decide which node is active.
Code Block
language bash
pcs property set no-quorum-policy=ignore
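
If the same setup script is reused for two-node and three-node clusters, the choice of policy can be made explicit. A minimal sketch with a hypothetical helper name; `stop` is Pacemaker's default value for no-quorum-policy:

```bash
#!/usr/bin/env bash
# Sketch only: maps the number of cluster nodes to a no-quorum-policy value.
# Two nodes cannot form a majority after one fails, so quorum is ignored;
# larger clusters keep Pacemaker's default policy, "stop".

quorum_policy_for() {
    local node_count=$1
    if [ "$node_count" -eq 2 ]; then
        echo ignore
    else
        echo stop
    fi
}
```

The result can be passed to pcs, for example `pcs property set no-quorum-policy="$(quorum_policy_for 2)"`.
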
Add a virtual IP as a resource to the cluster:
Code Block
language bash
pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=<VIRTUAL_IP> cidr_netmask=32 op monitor interval=30s
where <VIRTUAL_IP> is the IP that will be used by the FIX clients to connect to the cluster.

Use a Dual-Port SAN Device

Add an LVM group for the sessions' logs as a resource to the cluster:
Code Block
language bash
pcs resource create logs_vg ocf:heartbeat:LVM volgrpname=shared_logs_group
Add a filesystem for the sessions' logs as a resource to the cluster:
Code Block
language bash
pcs resource create logs_fs ocf:heartbeat:Filesystem device=/dev/shared_logs_group/shared_logs directory=/opt/fixedge/FIXEdge1/log fstype=xfs

Add FIXEdge as a resource to the cluster:
Code Block
language bash
pcs resource create fixedge systemd:fixedge op start timeout=300s op stop timeout=60s op monitor interval=10 timeout=60s meta migration-threshold=3
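
The start timeout has to cover FIXEdge start-up, which grows with the number of sessions. One way to choose a realistic value is to measure how long it takes until FIXEdge accepts TCP connections after a start. The sketch below assumes bash (it uses the /dev/tcp redirection) and assumes you know which port FIXEdge listens on; the function name is illustrative.

```bash
#!/usr/bin/env bash
# Sketch only: waits until a TCP port accepts connections or a timeout expires.
# Returns 0 as soon as a connection succeeds, 1 if the timeout is reached.

wait_for_tcp_port() {
    local host=$1 port=$2 timeout_s=$3
    local deadline=$(( SECONDS + timeout_s ))

    while [ "$SECONDS" -lt "$deadline" ]; do
        # The subshell opens the connection on fd 3 and closes it on exit.
        # A refused connection fails immediately; a filtered port may block
        # for the TCP connect timeout before the next attempt.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        sleep 1
    done
    return 1
}
```

For example, `time wait_for_tcp_port <VIRTUAL_IP> 8901 300` run right after starting the resource reports the observed start-up time; 8901 is taken from the list of open ports, and its role as a listen port is an assumption.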

Make sure that all resources are started on the same node:
Code Block
language bash
pcs constraint colocation set virtual_ip logs_vg logs_fs fixedge sequential=true setoptions score=INFINITY

Make sure that resources are started in the proper order:
Code Block
language bash
pcs constraint order set virtual_ip logs_vg logs_fs fixedge

Starting and stopping FIXEdge cluster resource from FIXICC

To avoid issues related to unexpected behavior by cluster software the user should prepare scripts that use cluster management commands.

Create scripts:

bin/FixEdge1.run.cluster.sh
Code Block
language bash
title bin/FixEdge1.run.cluster.sh
pcs resource enable fixedge
This script is used for starting the FIXEdge service as a cluster resource.

bin/FixEdge1.stop.cluster.sh
Code Block
language bash
title bin/FixEdge1.stop.cluster.sh
pcs resource disable fixedge
This script is used for stopping the FIXEdge service as a cluster resource.

Update the paths to start and stop scripts in the fixicc-agent/conf/agent.properties.

Code Block

title	fixicc-agent/conf/agent.properties

StartFile = bin/FixEdge1.run.sh

StopFile = bin/FixEdge1.stop.sh

The user who runs fixicc-agent must have permission to operate the cluster. In some cases, the pcs commands must be run with sudo.

Info
The scripts log to the console, so for troubleshooting and debugging they should usually be extended to forward standard and error output, with timestamps, to a file.
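
One possible shape for such an extension is a small wrapper that both scripts call. This is a sketch under assumptions: the function name and the log file path are hypothetical, and it requires bash because it uses PIPESTATUS.

```bash
#!/usr/bin/env bash
# Sketch only: runs a command, prefixes every line of its standard and error
# output with a timestamp, appends the result to a log file, and returns the
# exit status of the command itself.

run_logged() {
    local log_file=$1
    shift
    local line status

    "$@" 2>&1 | while IFS= read -r line || [ -n "$line" ]; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done >> "$log_file"

    # PIPESTATUS[0] holds the exit status of the wrapped command, not of the loop.
    status=${PIPESTATUS[0]}
    return "$status"
}
```

In bin/FixEdge1.run.cluster.sh the call could then read `run_logged /var/log/fixedge-cluster-scripts.log pcs resource enable fixedge`, with the log path chosen to suit your environment.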

How to Validate the Installation

...

Do these steps on both servers: NODE_1_NAME and NODE_2_NAME

Download and install:

Code Block

language	bash

$ sudo wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/glusterfs-epel.repo
$ sudo yum install glusterfs
$ sudo yum install glusterfs-fuse
$ sudo yum install glusterfs-server

Check installed version:

Code Block

language	bash

$ glusterfsd --version

glusterfs 3.6.2 built on Jan 22 2015 12:58:10
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.

Start glusterfs services on all servers and enable them to start automatically on startup:
Code Block
language bash
$ sudo /etc/init.d/glusterd start
$ sudo chkconfig glusterfsd on

...

On both nodes, install the needed software:
Code Block
language bash
$ sudo yum install corosync pcs pacemaker
On both nodes, set the password for hacluster user ('epmc-cmcc' was used):
Code Block
language bash
$ sudo passwd hacluster

Configure Firewall on both nodes to allow cluster traffic:

Code Block

language	bash

$ sudo iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
$ sudo iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
$ sudo iptables -I INPUT -p igmp -j ACCEPT
$ sudo iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
$ sudo service iptables save

Start the pcsd service on both nodes:
Code Block
language bash
$ sudo systemctl start pcsd
From now on, all commands need to be executed on one node only. We can control the cluster by using PCS from one of the nodes.
Since we will configure all nodes from one point, we need to authenticate on all nodes before we are allowed to change the configuration. Use the previously configured hacluster user and password to do this:
Code Block
language bash
$ sudo pcs cluster auth NODE_1_NAME NODE_2_NAME
Username: hacluster
Password:
NODE_1_NAME: Authorized
NODE_2_NAME: Authorized

Create the cluster and add nodes. This command creates the cluster node configuration in /etc/corosync.conf.

Code Block

language	bash

$ sudo pcs cluster setup --name fixedge_cluster NODE_1_NAME NODE_2_NAME 
Shutting down pacemaker/corosync services...
Redirecting to /bin/systemctl stop  pacemaker.service
Redirecting to /bin/systemctl stop  corosync.service
Killing any remaining services...
Removing all cluster configuration files...
NODE_1_NAME: Succeeded
NODE_2_NAME: Succeeded

We can start cluster now:

Code Block

language	bash

$ sudo pcs cluster start --all
NODE_1_NAME: Starting Cluster...
NODE_2_NAME: Starting Cluster...

We can check cluster status:

Code Block

language	bash

$ sudo pcs status cluster
Cluster Status:
 Last updated: Tue Jan 27 22:11:15 2015
 Last change: Tue Jan 27 22:10:48 2015 via crmd on NODE_1_NAME
 Stack: corosync
 Current DC: NODE_1_NAME (1) - partition with quorum
 Version: 1.1.10-32.el7_0.1-368c726
 2 Nodes configured
 0 Resources configured


$ sudo pcs status nodes
Pacemaker Nodes:
 Online: NODE_1_NAME NODE_2_NAME
 Standby:
 Offline:
 
$ sudo corosync-cmapctl | grep members
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(10.17.131.127)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(10.17.131.128)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined


$ sudo pcs status corosync
Membership information
----------------------
    Nodeid      Votes Name
         1          1 NODE_1_NAME (local)
         2          1 NODE_2_NAME

Disable the STONITH option as we don't have STONITH devices in our demo virtual environment:
Code Block
language bash
$ sudo pcs property set stonith-enabled=false

For a two-node cluster we must disable the quorum:

Code Block

language	bash

$ sudo pcs property set no-quorum-policy=ignore
$ sudo pcs property
Cluster Properties:
 cluster-infrastructure: corosync
 dc-version: 1.1.10-32.el7_0.1-368c726
 no-quorum-policy: ignore
 stonith-enabled: false

Add Virtual IP as a resource to the cluster:

Code Block

language	bash

$ sudo pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=10.17.135.17 cidr_netmask=32 op monitor interval=30s
$ sudo pcs status resources
 virtual_ip (ocf::heartbeat:IPaddr2): Started

Add FIXEdge as a resource to cluster:

Code Block

language	bash

$ sudo pcs resource create FIXEdge ocf:heartbeat:anything params binfile="/home/user/FixEdge/bin/FIXEdge" cmdline_options="/data/FixEdge1/conf/FIXEdge.properties" user="user" logfile="/home/user/FIXEdge_resource.log" errlogfile="/home/user/FIXEdge_resource_error.log"

Note

Many resource agents, including ocf:heartbeat:anything, are missing from /usr/lib/ocf/resource.d/ in the installed cluster software. The original version of the agent (available at https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/anything) must be modified to make it work. A working version of the agent is attached.

This file should be copied to /usr/lib/ocf/resource.d/heartbeat/ and made executable:

Code Block

language	bash

$ sudo cp anything /usr/lib/ocf/resource.d/heartbeat/
$ sudo chmod a+rwx /usr/lib/ocf/resource.d/heartbeat/anything

Also, for this agent to work, the following lines must be added to the sudoers file:

Code Block

language	bash

$ sudo visudo
Defaults    !requiretty
user    ALL=(user)      NOPASSWD: ALL
root    ALL=(user)      NOPASSWD: ALL

In order to make sure that the Virtual IP and FIXEdge always stay together, we can add a constraint:
Code Block
language bash
$ sudo pcs constraint colocation add FIXEdge virtual_ip INFINITY
To avoid the situation where the FIXEdge would start before the virtual IP is started or owned by a certain node, we need to add another constraint that determines the order of availability of both resources:
Code Block
language bash
$ sudo pcs constraint order virtual_ip then FIXEdge
Adding virtual_ip FIXEdge (kind: Mandatory) (Options: first-action=start then-action=start)

After configuring the cluster with the correct constraints, restart it and check the status:

Code Block

language	bash

$ sudo pcs cluster stop --all && sudo pcs cluster start --all
NODE_1_NAME: Stopping Cluster...
NODE_2_NAME: Stopping Cluster...
NODE_2_NAME: Starting Cluster...
NODE_1_NAME: Starting Cluster...

The cluster configuration is now completed.

...

This article describes the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) for disaster recovery in an active-passive cluster configuration (see FIXEdge Failover Cluster installation).

Recovery Time Objective (RTO)

...

The session recovery procedure happens automatically. The missing messages should be recovered with a resend request procedure automatically.

Info

Session recovery requires a sequence reset if the storage is damaged. Messages from the beginning of the day are lost when the sequence is reset. See Recovery procedure for a session with corrupted storages.

The FIX Standard recommends requesting sequences after logon using the Message recovery procedure, or using Extended features for FIX session and FIX connection initiation.

...
