MySQL High Availability Framework Explained – Part III: Failover Scenarios
Tuesday, April 16, 2019 at 9:33AM
Kristi Anderson

In this three-part blog series, we introduced a High Availability (HA) Framework for MySQL hosting in Part I, and discussed the details of MySQL semisynchronous replication in Part II. Now in Part III, we review how the framework handles some of the important MySQL failure scenarios and recovers to ensure high availability.

MySQL Failover Scenarios

Scenario 1 – Master MySQL Goes Down

Whenever a master MySQL goes down (whether due to a MySQL crash, OS crash, system reboot, etc.), our HA framework detects it and promotes a suitable slave to take over the role of the master. This keeps the system available to the applications.
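To make "promotes a suitable slave" concrete, here is a minimal sketch of one possible selection rule. It assumes that a suitable slave is a reachable one that has applied the most replication events. The `Slave` class, its `applied_position` field, and the `pick_new_master` function are hypothetical names for illustration, not part of the framework or of MySQL.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slave:
    name: str
    reachable: bool
    # Hypothetical counter: how far this slave has applied the master's
    # changes. A real cluster would compare binlog or GTID positions.
    applied_position: int


def pick_new_master(slaves: List[Slave]) -> Optional[Slave]:
    """Return the reachable slave with the highest applied position.

    Assumption: the most up-to-date reachable slave is the safest one
    to promote. Returns None when no slave is reachable.
    """
    candidates = [s for s in slaves if s.reachable]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.applied_position)


if __name__ == "__main__":
    cluster = [
        Slave("S1", reachable=True, applied_position=120),
        Slave("S2", reachable=True, applied_position=150),
    ]
    chosen = pick_new_master(cluster)
    print(chosen.name if chosen else "no candidate")
```

An unreachable slave is excluded even if its last known position is higher, since it cannot serve traffic after promotion.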

Scenario 2 – Slave MySQL Goes Down

Scenario 3 – Network Partition – Network Connectivity Breaks Down Between Master and Slave Nodes

This is a classic problem in any distributed system: each node thinks the other nodes are down when, in reality, only the network communication between the nodes is broken. It is commonly known as the split-brain scenario and, if not handled properly, can lead to more than one node claiming to be the master MySQL, which in turn leads to data inconsistencies and corruption.

Let’s use an example to review how our framework deals with split-brain scenarios in the cluster. We assume that, due to network issues, the cluster has partitioned into two groups, with the master in one group and the two slaves in the other, and we will denote this as [(M), (S1,S2)].
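The partition [(M), (S1,S2)] can be reasoned about with a majority-quorum rule, which is the general idea behind quorum in cluster stacks such as Corosync and Pacemaker. The sketch below is illustrative only: it assumes each node has one vote and that a partition without a majority must stop acting as master. The function names and the action strings are hypothetical, not the framework's actual interface.

```python
from typing import Dict, List, Tuple


def has_quorum(partition_size: int, cluster_size: int) -> bool:
    # A partition is quorate only with a strict majority of all votes.
    return partition_size > cluster_size // 2


def decide_actions(partitions: List[Tuple[str, ...]], master: str) -> Dict[Tuple[str, ...], str]:
    """Map each partition to the action it should take.

    Assumptions (illustrative): one vote per node; a partition without
    quorum stands down; a quorate partition without the master
    promotes one of its slaves.
    """
    cluster_size = sum(len(p) for p in partitions)
    actions = {}
    for p in partitions:
        if not has_quorum(len(p), cluster_size):
            actions[p] = "stand down"
        elif master in p:
            actions[p] = "keep master"
        else:
            actions[p] = "promote a slave"
    return actions


if __name__ == "__main__":
    # The example from the text: master alone, two slaves together.
    for group, action in decide_actions([("M",), ("S1", "S2")], master="M").items():
        print(group, "->", action)
```

Under these assumptions the isolated master holds only one of three votes and stands down, while the two slaves hold a majority and can promote one of themselves, so at most one master exists at any time.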

Thus, we see that the MySQL HA framework handles split-brain scenarios effectively, ensuring both data consistency and availability in the event the network connectivity breaks between master and slave nodes.

This concludes our 3-part blog series on the MySQL High Availability (HA) framework using semisynchronous replication and the Corosync plus Pacemaker stack. At ScaleGrid, we offer highly available hosting for MySQL on AWS and MySQL on Azure that is implemented based on the concepts explained in this blog series. Please visit the ScaleGrid Console for a free trial of our solutions.
