We deal with partitions in PostgreSQL 9
PostgreSQL 10 was released in early October 201? almost a year ago.
One of the most interesting new "features" is unconditional declarative partitioning. But what if you do not rush to upgrade to 10k? Here Amazon, for example, does not hurry, and introduced the support of PostgreSQL 10 only in the last days of February 2018.
Then the old-good partirovanie through inheritance comes to the rescue. I'm the software architect of the finance department in the taxi company, so all the examples will be somehow connected with the trips (problems with money will be left for another time).
Since we began to rewrite our financial system in 201? when I only joined the company, there was no question of declarative party-building. So to this day successfully used the technique described below.
The original reason for writing the article was that most of the partitioning examples in PostgreSQL that I encountered were very basic. Here is the table, here is one column that we are looking at, and maybe even know in advance what values it contains. It would seem that everything is simple. But the real life makes its own adjustments.
see here .
In our case, we want to get the name of the parent of the table on which trigger was triggered. In our case this will be rides. We use a similar approach in several microservices, and this part can be transferred practically unchanged.
PERFORM It's useful if we want to call a function that returns nothing. Usually in the examples all the logic is tried to put into one function, but we try to be careful.
USING NEW indicates that in this query we are using the values from the string that the try was trying to add.
$ 1. * will expand all values of the new line. In fact, this can be translated into
NEW. * . That translates into
The following procedure, which we call using
PERFORM will create a new partition if it does not already exist. This will happen once per period for every tenant.
CREATE OR REPLACE FUNCTION create_new_partition (parent_table_name text,.
partition_name text) RETURNS VOID AS
$ BODY $
RAISE NOTICE 'A new% partition will be created:%', parent_table_name, partition_name;
select format ('CREATE TABLE IF NOT EXISTS% s' created_at AT TIME ZONE '' UTC '' ''% s '' AND
created_at AT TIME ZONE (
tanant_id = ''% s '' AND
'' UTC '' <= ''%s''))
.INHERITS (% I) ', partition_name, env, partition_date,
Partition_date + interval' 1 month ', parent_table_name) into sql;
.-- EXECUTE sqr;
.PERFORM index_partition (partition_name);
$ BODY $
As already described, we use
INHERITSto create a table similar to parent, and
CHECKin order to determine what data should be there.
RAISE NOTICEjust prints a string into the console. If we now run
INSERTfrom psql, we can see if the partition was created.
We have a new problem.
INHERITSdoes not inherit indexes. For that, we have two solutions:
Create indexes using inheritance:
CREATE TABLE LIKE, and then
ALTER TABLE INHERITS
Or create indexes procedurally:
CREATE OR REPLACE FUNCTION index_partition (partition_name text) RETURNS VOID AS
$ BODY $
- Ensure we have all the necessary indices in this partition;
EXECUTE 'CREATE INDEX IF NOT EXISTS' || partition_name || '_tenant_timezone_idx ON' || partition_name || '(tenant_id, timezone (' 'UTC' ':: text, created_at))';
- More indexes here
$ BODY $
It is very important not to forget about child table indexing, because even after partitioning in each of them there will be millions of rows. Indices on the parent are not needed in our case, since the parent will always remain empty.
Finally, we create a trigger that will be called when a new line is created:
CREATE TRIGGER before_insert_row_trigger
BEFORE INSERT ON rides
FOR EACH ROW EXECUTE PROCEDURE insert_row ();
There is another subtlety here, on which attention is rarely emphasized. Partitioning is best done by columns whose data never change. In our case, this works: the trip never changes tenant_id and created_at. The problem that arises, if it is not so - PostreSQL will not return some data to us. We then promised him CHECK'om that all the data are valid.
There are several solutions (except for the obvious - do not mutate the data on which we partition):
UPDATE'And at the application level we always do
DELETE + INSERT
Add one more trigger to
UPDATE, which will transfer the data to the correct partition
Another nuance that is worth considering is how to properly index columns containing dates. If we use AT TIME ZONE in queries, do not forget that it is actually a function call. So our index should also be function based. I forgot. As a result - again the base that has died from the load.
The last aspect that should be considered is how partitions interact with different ORM frameworks, whether ActiveRecord in Ruby or GORM in Go.
The partitions in PostgreSQL rely on the fact that the parent table will always be empty. If you do not use ORM, you can safely return to the first stored procedure, and change RETURN NEW; on RETURN NULL ;. Then the row in the parent table is simply not added, which we actually want.
But the fact is that most ORMs use INSERT'e RETURNING clause. If we return NULL from our trigger, the ORM will panic, believing that the string has not been added. It is added, but not where the ORM looks.
There are several ways to get around this:
- Do not use ORM for INSERTs
- To patch ORM (that sometimes advise in case of ActiveRecord)
- Add another trigger, which will remove the string from the parent.
The last option is undesirable, because for each operation we will perform three. But nevertheless at times inevitable, therefore we will consider it separately:
CREATE OR REPLACE FUNCTION delete_parent_row ()
RETURNS TRIGGER AS
$ BODY $
delete from only rides where id = NEW.ID;
$ BODY $
CREATE TRIGGER after_insert_row_trigger
AFTER INSERT ON rides
FOR EACH ROW EXECUTE PROCEDURE delete_parent_row ();
The last thing that remains for us is to test our decision. For this we generate a certain number of lines:
$ script $
year_start_epoch bigint: = extract (epoch from '20170101' :: timestamptz at time zone 'UTC');
delta bigint: = extract (epoch from '???:59:59' :: timestamptz at time zone 'UTC') - year_start_epoch;
tenants varchar: = array['tenant_a', 'tenant_b', 'tenant_c', 'tenant_d'];
FOREACH tenant IN ARRAY tenants LOOP
FOR i IN 1100000 LOOP
insert into rides (tenant_id, created_at, ride_id)
values (tenant, to_timestamp (random () * delta + year_start_epoch) at time zone 'UTC', i);
$ script $;
And let's see how the database behaves:
explain select *
where tenant_id = 'tenant_a'
and created_at AT TIME ZONE 'UTC'> '20171102'
and created_at AT TIME ZONE 'UTC' <= '20171103';
If everything went as it should, we should see the following result:
Append (cost = ?????? rows = 4 width = 196).
-> Seq Scan on rides (cost = ?????? rows = 3 width = 196)
Filter: (((created_at) :: timestamp without time zone> '2017-11-???:00:00' :: timestamp without time zone) AND ((created_at) :: timestamp without time zone <= '2017-11-03 00:00:00'::timestamp without time zone) AND ((tenant_id)::text = 'tenant_a'::text))
-> Index Scan using rides_tenant_a_2017_11_tenant_timezone_idx on rides_tenant_a_2017_11 (cost = ?????? rows = 1 width = 196)
Index Cond: (((tenant_id) :: text = 'tenant_a' :: text) AND ((created_at) :: timestamp without time zone > '2017-11-???:00:00' :: timestamp without time zone) AND ((created_at) :: timestamp without time zone <= '2017-11-03 00:00:00'::timestamp without time zone))
Despite the fact that each tenant'a hundred thousand lines, we make a selection only from the required data slice. Success!
I hope this article was interesting for those who were not already familiar with what partitioning is and how it is implemented in PostgreSQL. And those for whom this topic is no longer new, yet learned a couple of interesting tricks.
It may be interesting
Situs QQ Online
Situs QQ Online