Orchestrator for MySQL: why you can't build a fault-tolerant project without it / Citymobil company blog / Habr
Every major project starts with a couple of servers. First there was a single DB server, then slaves were added to it to scale reads. And then, stop! There is one master but many slaves; if one of the slaves dies, everything will be fine, but if the master dies, it will be bad: downtime, admins scrambling to bring the server back up. What to do? Have a standby for the master. My colleague Pavel has already written an article about this; I won't repeat it. Instead, I'll tell you why you definitely need Orchestrator for MySQL!
Let's start with the main question: “How will we switch the application to a new machine when the master dies?”
• I like the VIP (Virtual IP) scheme the most; we'll talk about it below. It is the simplest and most obvious, although it has an obvious limitation: the master we are backing up must be in the same L2 segment as the standby machine, so you can forget about a second DC. And if you follow the sensible rule that a large L2 domain is evil (L2 within a rack, L3 between racks), this scheme has even more restrictions.
• You can put a DNS name in the code and resolve it via /etc/hosts. In effect, no real resolution happens. The advantage of this scheme: it does not have the first method's restriction, so you can also span DCs. But then the obvious question arises: how quickly can we deliver the change to /etc/hosts via Puppet or Ansible?
• You can tweak the second method a little: put a caching DNS server on every web server, through which the code reaches the master database, and set a TTL of 60 on this DNS record. With a proper implementation, the method seems good.
• A service-discovery scheme, using Consul or etcd.
• An interesting option with ProxySQL: route all MySQL traffic through ProxySQL, which can determine on its own who the current master is. By the way, one of the ways to use this product is described in my article.
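To make the second and third schemes concrete, here is a minimal sketch of the /etc/hosts update that a config-management tool would push to every web server on failover. The hostname `mysql-master.local` and the addresses are hypothetical examples, not from the article:

```python
# Sketch of the /etc/hosts approach: rewrite the entry for the master's
# DNS name so the application resolves it to the new machine. The name
# "mysql-master.local" is an assumed example.

def update_hosts(content: str, name: str, new_ip: str) -> str:
    """Return hosts-file content with `name` pointing at `new_ip`."""
    lines = []
    replaced = False
    for line in content.splitlines():
        parts = line.split()
        # Replace any non-comment line that maps `name` to an address.
        if parts and not line.lstrip().startswith("#") and name in parts[1:]:
            lines.append(f"{new_ip}\t{name}")
            replaced = True
        else:
            lines.append(line)
    if not replaced:  # name was absent: append a fresh mapping
        lines.append(f"{new_ip}\t{name}")
    return "\n".join(lines) + "\n"

hosts = "127.0.0.1\tlocalhost\n10.0.0.10\tmysql-master.local\n"
print(update_hosts(hosts, "mysql-master.local", "10.0.0.20"))
```

The hard part, as noted above, is not this rewrite but delivering it to every web server quickly enough.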
The author of Orchestrator, who works at GitHub, first implemented the first scheme with a VIP, and then reworked it into a scheme with Consul.
Typical infrastructure scheme:
I will immediately describe the obvious situations that need to be taken into account:
• The VIP address must not be configured statically on any of the servers. Imagine the situation: the master reboots, and while it is booting, Orchestrator goes into failover mode and promotes one of the slaves; then the old master comes back up, and now the VIP is on two machines. This is bad.
• For Orchestrator you will need to write a script that handles the old master and the new master: on the old one it must run ifdown, and on the new master ifup for the VIP. It would also be good for this script to simply shut down the old master's switch port during failover, to avoid any split-brain.
• After Orchestrator has called your script to remove the VIP (and/or shut down the switch port) and then called the VIP-raising script on the new master, don't forget to run the arping command to announce that the VIP now lives here.
• All slaves must have read_only = 1, and as soon as you promote a slave to master, it must get read_only = 0.
• Do not forget that any slave we have chosen as a candidate can become the master (Orchestrator has a whole preference mechanism for which slave to consider as a candidate for the new master first, which second, and which slave must never be promoted under any circumstances). If a slave becomes the master, the slave load stays on it and the master load is added on top; this must be taken into account.
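The VIP points above can be sketched as a failover hook. This is an illustrative sketch, not the article's actual script: the interface name, VIP and arping flags are assumptions, and commands are built as lists so they can be inspected before being executed:

```python
# Hypothetical failover hook for the VIP scheme: Orchestrator would call
# the "down" part on the old master and the "up" part on the new one.
# VIP and interface are assumed values.
import subprocess

VIP = "10.0.0.100/32"   # assumed virtual IP
IFACE = "eth0"          # assumed interface

def vip_down_cmds():
    """Commands to run on the OLD master before promotion."""
    return [["ip", "addr", "del", VIP, "dev", IFACE]]

def vip_up_cmds():
    """Commands to run on the NEW master after promotion: raise the VIP,
    then send gratuitous ARP (the arping step it is easy to forget)."""
    ip = VIP.split("/")[0]
    return [
        ["ip", "addr", "add", VIP, "dev", IFACE],
        ["arping", "-c", "3", "-U", "-I", IFACE, ip],
    ]

def run(cmds, dry_run=True):
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))     # show what would be executed
        else:
            subprocess.run(cmd, check=True)

run(vip_down_cmds())
run(vip_up_cmds())
```

In production you would also add the switch-port shutdown mentioned above and make the script idempotent, since Orchestrator may retry hooks.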
Why do you absolutely need Orchestrator if you don't have it yet?
• Orchestrator has a very user-friendly graphical interface that displays the entire topology (see screenshot below).
• Orchestrator tracks which slaves are lagging and where replication is broken entirely (we have hooked scripts to Orchestrator that send SMS notifications).
• Orchestrator tells you which slaves have errant GTID transactions.
What is an errant GTID?
There are two basic requirements for Orchestrator to work:
• Pseudo-GTID must be enabled on all machines in the MySQL cluster; in our case, GTID is enabled.
• The binlog format must be the same everywhere, even plain statement will do. We had a configuration where the master and most slaves used Row, while two slaves had historically remained in Mixed mode. As a result, Orchestrator simply refused to reattach those slaves to the new master.
Remember that the most important property of a production slave is its consistency with the master! If you have GTID (Global Transaction ID) enabled on both the master and the slave, you can use the gtid_subset function to find out whether the same data-modifying transactions have actually been executed on both machines. You can read more about this here.
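To make the idea concrete: a GTID set like `uuid:1-100` is a map from server UUID to executed transaction numbers, and a slave is consistent when its set is a subset of the master's. On a real server you would run `SELECT GTID_SUBSET(slave_set, master_set);`, but the logic can be sketched with a toy parser (real GTID sets are richer; the UUIDs below are placeholders):

```python
# Toy illustration of what GTID_SUBSET checks: extra transactions on the
# slave that are absent on the master are exactly the "errant" ones
# Orchestrator flags.

def parse_gtid_set(s: str) -> dict:
    """Parse a simplified GTID set like 'uuid:1-5:7,uuid2:1-3'."""
    result = {}
    for part in s.split(","):
        uuid, *ranges = part.strip().split(":")
        txns = set()
        for r in ranges:
            lo, _, hi = r.partition("-")
            txns.update(range(int(lo), int(hi or lo) + 1))
        result[uuid] = txns
    return result

def errant_transactions(slave_set: str, master_set: str) -> dict:
    """Transactions executed on the slave but missing on the master."""
    slave, master = parse_gtid_set(slave_set), parse_gtid_set(master_set)
    errant = {u: t - master.get(u, set()) for u, t in slave.items()}
    return {u: sorted(t) for u, t in errant.items() if t}

master = "aaaa:1-100"
slave = "aaaa:1-100,bbbb:1-2"   # the slave executed two extra transactions
print(errant_transactions(slave, master))  # -> {'bbbb': [1, 2]}
```

An empty result means the slave's set is a subset of the master's, i.e. the slave is consistent.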
Thus, through an errant GTID error, Orchestrator shows you that there are transactions on the slave that are not on the master. Why does this happen?
• read_only = 1 is not enabled on the slave, and someone connected and executed a data-modifying query.
• super_read_only = 1 is not enabled on the slave, and an administrator, mixing up the servers, logged in and ran a query there.
• If you have taken care of both previous points, there is one more trick: in MySQL, a FLUSH BINARY LOGS request itself goes into the binlog, so the first time you flush, an errant GTID will appear on the master and on all slaves. How to avoid this? Percona Server introduced the binlog_skip_flush_commands = 1 setting, which forbids writing flush commands to the binlog. There is also a bug filed about this on mysql.com.
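Putting the three causes above together, a replica's my.cnf might look like the following sketch. Section and option names follow standard MySQL/Percona conventions, but treat it as an assumption to verify against your server version (binlog_skip_flush_commands is a Percona Server option, not stock MySQL):

```ini
# my.cnf fragment for a replica -- a sketch, adjust to your setup
[mysqld]
# Block ordinary client writes on the replica
read_only = 1
# Also block writes from SUPER users (admins on the wrong box)
super_read_only = 1
# Percona Server only: keep FLUSH commands out of the binlog so binlog
# rotation does not create errant GTIDs
binlog_skip_flush_commands = 1
```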
To sum up all of the above: if you don't want to use Orchestrator in failover mode yet, run it in observation mode. Then you will always have in front of you a map of how your MySQL machines interact, and clear information about the replication type on each machine, whether the slaves are lagging, and, most importantly, how consistent they are with the master!
The obvious question is: “How should Orchestrator work?” It must pick a new master from the current slaves and then reconnect all the other slaves to it (this is exactly what GTID is for; with the old mechanism of binlog_name and binlog_pos, switching a slave from the current master straight to a new one is simply impossible!). Before we had Orchestrator, I once had to do all of this by hand. The old master had died because of a buggy Adaptec controller, and it had about 10 slaves. I needed to move the VIP from the master to one of the slaves and reconnect all the other slaves to it. So many consoles had to be opened, so many commands entered at once. I had to wait until 3 a.m., remove the load from all the slaves except two, make the first of those two machines the master, immediately attach the second machine to it, then attach all the remaining slaves to the new master and return the load. In short, the horror.
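With GTID, the per-slave part of that manual nightmare collapses into a single statement: no binlog file or position to compute, just auto-positioning (MySQL 5.6/5.7 `CHANGE MASTER TO` syntax; the host and user below are hypothetical). A sketch that builds the statements:

```python
# Repointing a replica under GTID: MASTER_AUTO_POSITION = 1 lets the
# replica negotiate its own start position, so no binlog_name/binlog_pos
# is needed. Host and replication user are assumed examples.

def repoint_sql(new_master_host: str, repl_user: str) -> list:
    """Return the SQL statements to attach a replica to a new master."""
    return [
        "STOP SLAVE",
        (f"CHANGE MASTER TO MASTER_HOST = '{new_master_host}', "
         f"MASTER_USER = '{repl_user}', MASTER_AUTO_POSITION = 1"),
        "START SLAVE",
    ]

for stmt in repoint_sql("db-new-master", "repl"):
    print(stmt + ";")
```

Running this on each of the 10 slaves (which is essentially what Orchestrator automates) replaces the night of juggling consoles and positions.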
How does Orchestrator behave when it enters failover mode? This is easiest to show with the example of a situation where we want to promote a more powerful, more modern machine to master than the current one.
The figure shows the middle of the process. What has already been done by this point? We said we want to make a certain slave the new master; Orchestrator has started reconnecting all the other slaves to it, with the new master acting as a transit machine. With this scheme no errors occur and all the slaves keep working. Orchestrator then removes the VIP from the old master, moves it to the new one, sets read_only = 0 and forgets about the old master. Done! The downtime of our service is just the VIP transfer time, 2-3 seconds.
That's all for today, thank you all. A second article about Orchestrator is coming soon. In the famous Soviet film “Garage”, one character said: “I wouldn't go on reconnaissance with him!” Well, Orchestrator, I would go on reconnaissance with you!