CLUSTER BASICS
HA… HA… HA… :-) ………… It's not Ha..Ha..Ha :-(
In the world of system administration, the word "HA" is as common as our daily routine.
What is this HA…?
HIGH AVAILABILITY………..
It is great to hear that our applications/services
will never go down. The term “High Availability” is made up of two words, High
and Availability.
First "Availability" came into the picture, and then the word "High" came along to complete what "Availability" alone left incomplete.
Let’s understand the “Availability” first.
The general meaning of available is to be present always, meaning that the thing will be there for us all the time: 100% attendance, without fail.
HOW……??
Dual NICs with link aggregation for availability of the NIC.
Two HDDs with SVM for availability of the disks.
Dual PSUs for power supply availability.
From the above examples it seems that availability is implemented in terms of "HARDWARE".
It's great, but what about the OS…
Hmmmmm…………
What if the OS crashes/hangs/panics… :-(
Here comes the "HIGH" as a savior. Now our "High Availability" is complete.
Availability can be measured relative to "100% operational" or "never failing." Have you ever heard someone refer to "five 9s"? This is a metric that refers to system availability by the number of "9s." One 9 is 90% uptime, two 9s are 99%, three 9s are 99.9%, and so on. Four-9 uptime (99.99%) means your system has less than one hour of downtime per year. Five 9s are currently considered the entry point for high-availability solutions: five 9s (99.999%, which represents about 5.26 minutes of annual downtime) would be considered high availability.
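As a quick sanity check on those numbers, here is a minimal Python sketch (the helper name is my own invention, not from any cluster product) that converts an availability percentage into the allowed downtime per year:

# Rough downtime-per-year calculator for the "nines" of availability.
# Assumes a 365-day year (525,600 minutes); leap days/seconds are ignored.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_percent: float) -> float:
    """Allowed downtime in minutes per year for a given availability %."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

for nines, pct in [(1, 90.0), (2, 99.0), (3, 99.9), (4, 99.99), (5, 99.999)]:
    print(f"{nines} nine(s) = {pct}% -> {downtime_minutes_per_year(pct):.2f} min/year")

Running it gives roughly 52.6 minutes per year for four 9s (the "less than one hour" above) and about 5.26 minutes per year for five 9s.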
So we can define "HA" as a group of systems (called a CLUSTER) which survives any failure, whether it is in "HARDWARE" or in "SOFTWARE".
The Cluster can be either “MANUAL” or “AUTOMATIC”.
A manual cluster needs human intervention when anything goes wrong on one system, whereas an automatic cluster starts the services automatically on the secondary system if there is any failure on the primary.
A cluster is a set of NODEs that communicate with each other and work toward a common goal.
Advantages of clustering
· High performance
· Large capacity
· High availability
· Incremental growth
CLUSTER TERMINOLOGY……
FAULT:
A fault is anything that may or may not hamper the normal behavior of the system; generally faults are not conveyed to or visible to end users. For example, one PSU has failed, or there are some soft/hard errors on an HDD, but the system is still running. These are faults.
FAILURE:
A failure can be caused by either hardware or software. It stops the system's operation and requires immediate attention for rectification. The system becomes non-operational due to a failure, and this is visible to all. For example, both PSUs have failed, or hardware/software errors on a disk have increased and resulted in corruption of the disk.
NODE:
A node is a single server within a cluster; we can have up to 16 nodes within a single cluster. All nodes within a cluster can talk to each other with the help of the INTERCONNECT; when a node joins or leaves the cluster, all other nodes are made aware. It is good practice to build a cluster with nodes of a similar build (same CPU, memory, etc.), but it is not mandatory.
CLUSTER INTERCONNECT:
The interconnect should be a private network that all cluster nodes are connected to; the nodes communicate across this network, sharing information about the cluster. The interconnect should have redundancy built in so that it can survive network outages.
SWITCHOVER & FAILOVER:
These can be thought of as the "manual" and "automatic" ways of moving services over to the secondary. In both cases it is certain that the running services are about to leave their current host server.
SWITCHOVER:
We know what we are doing; we want the services to run from the secondary node. A switchover is planned.
FAILOVER:
The system/HA software knows what it is doing; this happens when the HA software detects a failure. Completely unplanned.
FAILBACK:
The process by which a failed server automatically resumes its former operations once it is back online.
There are two common clustering technologies:
1. High-availability (HA) clustering: Always available, with no single point of failure. If one node fails, services are automatically migrated to another node. Sometimes also referred to as a "failover cluster".
2. Load-balance clustering: A kind of teamwork, with nodes sharing each other's load, which improves the overall response of the application. Load balancing provides better and higher performance of the service.
Cluster configuration models:
If the cluster has only two nodes, there is nothing to decide, but if the number of participating nodes is higher, the cluster can be categorized as one of the following.
Active-Active: All participating nodes in an active-active cluster actively run the same kind of service simultaneously. The main purpose of an active-active cluster is to achieve load balancing if a load-balancing solution is implemented; otherwise it reduces the failover time.
Active-Passive / Asymmetric: As the name suggests, one node is active and one is standby for it; once the active node goes down, the application switches to the standby, which then becomes the active node.
N+1: Two or more nodes are required for this configuration; "N" is the number of active nodes in the cluster and "1" is the standby/hot-standby for all "N" nodes. If any of the "N" nodes fails, the "1" replaces it and becomes active (see the sketch after this list).
N+M: Two or more nodes are required for this configuration; it is an enhancement of "N+1". Here "N" is the number of active nodes and "M" is the number of hot-standby nodes. It is generally used where the cluster manages many services and multiple hot-standbys are needed to meet the failover requirement.
N-to-1: Boss is Boss, meaning: I (the standby) will take your place if you (the primary/active) are down, but I will not claim your place for good; instead I will wait for your recovery, let you get back to hold your position, and then return to my original place, i.e. standby.
N-to-N: A
combination of active/active and N+M clusters, N to N clusters redistribute the
services, instances or connections from the failed node among the remaining
active nodes, thus eliminating (as with active/active) the need for a 'standby'
node, but introducing a need for extra capacity on all active nodes.
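To make the N+1 idea concrete, here is a minimal Python sketch (node and service names are invented purely for illustration) of one hot-standby taking over the service of whichever active node fails:

# Minimal N+1 sketch: N active nodes each own a service; one standby waits.
# All node and service names here are illustrative only.
cluster = {
    "node1": "web",    # active
    "node2": "db",     # active
    "node3": "mail",   # active
    "standby": None,   # the "+1" hot-standby, currently idle
}

def fail_node(cluster: dict, failed: str) -> None:
    """Move the failed node's service to the standby, if the standby is free."""
    service = cluster.get(failed)
    if service is None:
        return  # nothing to take over
    if cluster["standby"] is None:
        cluster["standby"] = service   # standby becomes active for this service
        cluster[failed] = None
        print(f"{failed} failed: standby now runs '{service}'")
    else:
        print(f"{failed} failed, but the standby is already busy!")

fail_node(cluster, "node2")   # standby takes over 'db'
fail_node(cluster, "node3")   # no free standby left: exactly why N+M exists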
So what do we get from all this…??
Service availability without downtime. How can this be achieved…??
By clustering alone…??
NOOOO…
This can be achieved only if all cluster nodes can access the same data.
This leads to the requirement for shared storage devices.
The shared storage is accessible to all nodes of the cluster, but only the active node is the owner of that shared storage. Once the active node is down, ownership is claimed automatically by the passive/standby node. Since the data is the same, there is no difference for the service.
In a cluster environment, almost everything needed to ensure service availability is referred to as a "RESOURCE".
Hence a "resource" is a hardware or software entity* managed by the cluster application. Simply put, a resource is a service made highly available by a cluster.
* hardware or software entity: file systems, network interface cards (NICs), IP addresses and applications.
Failover & failback are based on these resources.
So, do these resources fail over & fail back individually…??
No…!!
Failover & failback are performed on "RESOURCE GROUPS".
Cluster resources are held together within a cluster resource group, or cluster group. Cluster groups are the units of failover within the cluster. When a cluster resource fails and cannot be restarted automatically, the entire cluster group is taken offline and failed over to another available cluster node.
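As a rough illustration (the resource and node names below are hypothetical and not tied to any particular cluster suite), a resource group can be pictured as an ordered set of resources that is always stopped on one node and started on another as a single unit:

# Sketch of a resource group failing over as one unit.
# Resource and node names are illustrative only.
resource_group = ["shared_fs", "virtual_ip", "app_server"]  # start order

def stop_group(node: str) -> None:
    for res in reversed(resource_group):   # stop in reverse order
        print(f"[{node}] stopping {res}")

def start_group(node: str) -> None:
    for res in resource_group:             # start in the defined order
        print(f"[{node}] starting {res}")

def failover_group(from_node: str, to_node: str) -> None:
    """A resource failed and could not be restarted: move the whole group."""
    stop_group(from_node)
    start_group(to_node)

failover_group("nodeA", "nodeB")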
Good… But how do the other nodes in a cluster know about any kind of failure…??
HEARTBEAT:
The heartbeat network is a private network which is shared only by the cluster nodes and is not accessible from outside the cluster. It is used by the cluster nodes to monitor each node's status and to communicate with each other.
A heartbeat provides cluster members with information on the exact status of any cluster member at any given time. This means that any node of the cluster knows the exact number of nodes/participants in the cluster it has joined, and also knows which cluster members are active or online, in maintenance mode, or offline.
Generally the heartbeat is set up on a completely different subnet, so that the system can distinguish between a physical failure and a network failure.
If a network fails, this can cause a false positive. That is why it is recommended to have a minimum of two cluster networks.
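A heartbeat can be as simple as each node periodically sending a small packet on the private network, with its peers raising an alarm when packets stop arriving. Below is a minimal Python sketch of the receiving side (the address, port and timeout values are made up; real cluster stacks use their own membership protocols):

# Toy heartbeat receiver: declare the peer dead if no packet arrives in time.
# 192.168.100.1 and port 5405 are assumed values on a private heartbeat subnet.
import socket
import time

HEARTBEAT_PORT = 5405
TIMEOUT_SECONDS = 3          # how long we tolerate silence before raising an alarm

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("192.168.100.1", HEARTBEAT_PORT))
sock.settimeout(1.0)         # wake up once a second to check the clock

last_seen = time.monotonic()
while True:
    try:
        data, peer = sock.recvfrom(64)   # e.g. b"alive node2"
        last_seen = time.monotonic()
    except socket.timeout:
        pass
    if time.monotonic() - last_seen > TIMEOUT_SECONDS:
        print("peer heartbeat lost: possible node failure (or just a network failure!)")
        break

Note that the receiver cannot tell a dead peer from a dead network link, which is exactly the false-positive problem mentioned above and why a second cluster network is recommended.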
Well, next is
QUORUM:
Before proceeding towards quorum in terms of a cluster, we should first know the simple definition of quorum.
How can we define quorum…?
Search the internet; a few definitions are given here as examples…
“www.dictionary.com”
The number of members of a group or organization required to be
present to transact business legally, usually a majority
“www.vocabulary.com”
A quorum is not necessarily a majority of members of a group, but the
minimum needed in order to conduct business. For example, if two members of a
group are absent, there can still be a quorum, meaning the meeting can go on
without them.
A gathering of the minimal number of members of an organization to
conduct business.
“www.thefreedictionary.com”
A minimum number of members in an assembly, society, board of
directors, etc, required to be present before any valid business can be transacted.
“www.businessdictionary.com”
Fixed minimum number of eligible members
or stockholders (shareholders) who must be present (physically or by proxy) at
a meeting before any official business may be transacted or a decision taken
therein becomes legally binding. Usually the articles of association or bylaws
of a firm specify this number, otherwise the number prescribed in corporate
legislation (such as company law) is followed.
I think these are enough to understand the meaning of Quorum.
The simplest example that comes to my mind is our society meeting. Whenever a meeting is organized about celebrating a festival or throwing a dinner from the society fund, all members of the society are present with their valuable feedback, …………BUT…………………BUT……………………… whenever a meeting is called to raise funds for any construction or maintenance purpose, merely 15 to 20% of members are available, and then the society chairman and secretary start crying about fulfilling the "QUORUM" needed to agree upon and pass the resolution.
So now we can define quorum:
A minimum number of members in an assembly, society, board of
directors, etc, required to be present before any valid business can be
transacted.
A cluster means we always have at least one operational node. How will this goal be achieved…?
Quorum is the minimum number of cluster member votes required to perform a cluster operation. When a node fails in the cluster, a configuration change is required because there is a change in the number of nodes participating in the cluster. The quorum tells the cluster which node is currently active and which node or nodes are in standby.
Meaning what...??
Resource groups are managed by the cluster nodes, and when a node fails these resource groups should be migrated to another node, right...??
So who will decide this...??
Here comes the quorum: a vote is taken among the live nodes about which node will take over the responsibility. This vote must be agreed upon by all live nodes.
Each member carries one vote, and the cluster member votes must reach a majority in order to achieve quorum.
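In its simplest form the rule is: a partition may operate only if it holds strictly more than half of the total votes. A tiny Python sketch of that rule (the vote counts are illustrative):

# Simple majority-quorum rule: operate only with more than half of all votes.
def has_quorum(votes_present: int, total_votes: int) -> bool:
    return votes_present > total_votes // 2   # strict majority

# Three-node cluster, one vote per node:
print(has_quorum(2, 3))   # True  -> two live nodes can keep running
print(has_quorum(1, 3))   # False -> a lone node must stop its services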
Confused...??
OK... Let's make it simpler.
Let's assume a two-node cluster that doesn't know what quorum is. A problem occurred in the network and the two nodes are isolated from each other.
Now what...?
Though both are live, in the eyes of the cluster application there is a major problem.
Both nodes are working fine, so which node will hold the service...??
Node "A" ... Node "B" ... or both...??
Both nodes cannot hold the service simultaneously, and nodes "A" & "B" are unaware that the other is live, so what will they do...??
Each will declare itself the master and take ownership of the service.
What will happen then...??
The cluster will end up in a "SPLIT-BRAIN" state.
*we will learn about it later.
How to avoid it...?
Quorum... right!!!
That is why we need quorum.
The other thing that the quorum does is to intervene when
communications fail between nodes. Normally, each node within a cluster can
communicate with every other node in the cluster over a dedicated network
connection. If this network connection were to fail though, the cluster would
be split into two pieces, each containing one or more functional nodes that
cannot communicate with the nodes that exist on the other side of the
communications failure.
When this type of communications failure occurs, the cluster is said
to have been partitioned. The problem is that both partitions have the same goal: to keep the application running. The application can't be run on multiple
servers simultaneously though, so there must be a way of determining which
partition gets to run the application. This is where the quorum comes in. The
partition that “owns” the quorum is allowed to continue running the
application. The other partition is removed from the cluster.
Let's go back to our two-node cluster example, and this time assume there is quorum. Each node has one vote, two votes in total. Quorum needs more than half of the votes to operate.
What happens if one node goes down…??
There is only one node remaining, with its single vote, and that is definitely not more than half. So what happens…??
There will be no "rise of the fallen". In this case an external vote is required. But who will give that vote…??
A "quorum device". What is this quorum device…?
A quorum device is a shared storage device or quorum server that is
shared by two or more nodes and that contributes votes that are used to establish
a quorum. The cluster can operate only when a quorum of votes is available. The
quorum device is used when a cluster becomes partitioned into separate sets of
nodes to establish which set of nodes constitutes the new cluster.
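Applying the same majority rule from the earlier sketch: with two node votes plus one quorum-device vote there are three votes in total, so a surviving node that also holds the quorum device still has 2 of 3 votes and may continue (again, the vote counts are illustrative):

# Two-node cluster plus a quorum device contributing one extra vote.
def has_quorum(votes_present: int, total_votes: int) -> bool:
    return votes_present > total_votes // 2

node_votes = 2            # one vote per node
quorum_device_votes = 1
total_votes = node_votes + quorum_device_votes   # 3

# One node fails; the survivor plus the quorum device hold 2 of 3 votes:
print(has_quorum(1 + quorum_device_votes, total_votes))  # True  -> cluster keeps running
# Without the quorum device the survivor alone would hold only 1 of 2 votes:
print(has_quorum(1, 2))                                  # False -> cluster would stop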
We now know what the "PARTITIONED" or "PARTITIONED INTO SUB-CLUSTERS" state is.
When a cluster is stuck in the "PARTITIONED" state, "SPLIT-BRAIN" happens.
SPLIT-BRAIN:
Split brain occurs when the cluster interconnect between nodes is lost and the cluster becomes partitioned into sub-clusters, or two sides. There is no communication between them, so each side/partition believes the other is dead and tries to take ownership of the resources.
How to avoid such condition…??
FENCING:
Fencing is the process of isolating a node of a computer cluster, or protecting shared resources, when a node appears to be malfunctioning. As the number of nodes in a cluster increases, so does the possibility that one of them may fail at some point.
Fencing is the component of the cluster that cuts off a node's access to a resource (hard disk, etc.) if that node loses contact with the rest of the nodes in the cluster.
There are two kinds of fencing: Resource level and Node level.
Using resource-level fencing, the cluster can make sure that a node cannot access one or more resources. One typical example is a SAN, where a fencing operation changes the rules on a SAN switch to deny access from a node.
Resource-level fencing may also be achieved using normal resources on which the resource we want to protect depends. Such a resource would simply refuse to start on the fenced node, and therefore resources which depend on it would not be runnable on that node either.
Node-level fencing makes sure that a node does not run any resources at all. This is usually done in a very simple, yet brutal way: the node is simply reset using a power switch. This may ultimately be necessary because the node may not be responsive at all.
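The node-level case boils down to "turn off the suspect node's power before taking over its resources". The snippet below is only an illustration: PowerSwitch and its power_off method are hypothetical stand-ins for a real fencing agent (IPMI, a managed PDU, and so on):

# Hypothetical node-level fencing (STONITH-style) sketch.
# PowerSwitch is an invented class standing in for a real fencing agent.
class PowerSwitch:
    def power_off(self, outlet: str) -> bool:
        print(f"cutting power to outlet {outlet}")
        return True   # pretend the switch confirmed the operation

def fence_node(switch: PowerSwitch, node_outlet: str) -> bool:
    """Only after the fence is confirmed may the survivor claim resources."""
    return switch.power_off(node_outlet)

if fence_node(PowerSwitch(), "outlet-7"):
    print("node fenced; safe to take over its resource groups")
else:
    print("fencing failed; do NOT touch the shared storage")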
FENCING RACE / FENCING WAR:
Consider a two-node cluster: when the connection between the two nodes is broken, both nodes will follow the same reasoning: "Because I'm still alive, the other node must have failed, either partially or completely. I must fence it to make sure it cannot later spontaneously recover and corrupt the disks I'm writing to." Both nodes will attempt to fence each other. If the fencing is done by an external power switch, the switch should accept only one connection at a time, and therefore only one node can succeed in fencing the other. (This is called a "fencing race".)