Updating the database and zero-downtime deployment
To update systems on the fly without stopping (zero-downtime deployment), many articles have been written and many aspects of this approach are quite obvious. In my opinion, the most difficult part of the deployment in this case is the update of data warehouses, if their contract (scheme) has changed. It is this aspect that I would like to consider in this article.
Whatever the database - with an explicit data schema as relational or arbitrary, as NoSQL - the data schema is still present, even at the application level. Data read from the database should be understandable to the client, even if the storage itself does not impose any restrictions on their structure.
Suppose a system with a certain data structure and terabytes of data in the database already works in production. In the new version of the system, we need to structure a little to implement new features or improve performance. Consider what changes in the scheme may occur:
Adding a new field to
Changes of the field type
Transferring a field to another data structure (for example, in the case of denormalization)
Adding a new field as well as adding any other database object is a backwards-compatible change and does not require any additional steps in terms of implementing zero-downtime deployment. It is enough simply to apply the changes to the database on the fly, and then apply a new version of the code that uses the new database objects.
Deleting a field or any other database object is not a backwards-compatible change, but the approach to implementing it is very simple - unnecessary database objects must be removed only after the new version of the system is completely docked.
The other three types of changes are more complex in terms of ensuring zero-downtime deployment. In general, they can all be performed by copying data to other fields /entities and deleting the "old" ones after the migration of the data has successfully completed: for renaming, you can copy the data from the old field to the field with the new name, then delete the old field, change the data type can be done together with renaming, etc. Either way, over a period of time, the database must support both old and new contracts. There are at least two ways to perform such changes "on the fly":
If the database supports triggers
Create triggers that copy data from an old location to a new one on any change /addition and install them on production.
Apply the utility for data conversion, which does the same, but for all records in the database. Since the triggers are already installed, the utility can do nothing more complicated than simply "fictitious" updating each record (UPDATE table SET field = field ). A very important point here is that the action to read data from an old location and write to a new one must be atomic and protected from lost changes. Depending on the structure of the database, you can use either a pessimistic lock through SELECT FOR UPDATE or its counterparts, or optimistic if there is a field with the version of the record in the tables.
After the utility finishes its work (depending on the amount of data and complexity of the update, runtime can be calculated in days), you can already install a new version of the system that supports the new data scheme. By this point, all the records in the database that existed at the time the utility was launched will be successfully converted, and all the new ones that appeared during its operation are also converted into triggers.
Delete triggers and all fields (or other database objects) that are no longer needed.
If it is not possible to use triggers (as is the case with many NoSQL solutions)
Create and install a new version of the application (temporary version 1 in the figure), which always reads from the old field, but when you write to this field it updates both the old and the corresponding new place (in the figure, "C" - old, "H" new). Configure this version for all nodes on which the application instances run.
Apply a utility that copies data from the old location to the new one. As in the case of triggers, measures must be taken to prevent lost changes.
Create and after the completion of the utility to install another version of the application (temporary version 2), which reads data from the new field, but writes still in two places. This step is necessary, because during the successive updating of each of the nodes there will still be a gap when instances of the previous version of the application that read the old field work simultaneously with the new one.
Create and after the completion of the full sweep of the previous one, install the final version, which does not interact with the old field.
Delete old fields.
The second approach requires the creation and installation of three different versions of the application, which can be very inconvenient and cumbersome. Instead, you can use the feature toggling - put the logic of all three versions into one, but switch the mode depending on the configuration parameter, which ideally could be switched "on the fly." Thus, instead of installing each subsequent version it will be enough to change the value of the parameter (and restart the service if updating the configuration on the fly is not provided). After successful completion of the final version installation, all the code related to the data migration must be completely removed from the work branch, even if it will "live" on production until the next system update.
It's easy to see that providing zero downtime when upgrading the system is a cumbersome and fragile procedure, so it makes sense to bother with it only if there is an appropriate requirement from the business. But even if the system availability requirements are low enough (for example, 99% per year and the scheduled system update window is a day), the data conversion required to install the new version can still take more. Therefore, you need to be ready in advance for the use of such solutions, if you intend to store large amounts of data.
An alternative approach may be the deliberate rejection of back-inconsistent changes in the database schema, but, unfortunately, in practice it is not always achievable, since often the most effective way to improve data access performance is to restructure the scheme.
It may be interesting
visit this site
visit this site
visit this site