pglogical2 initial data copy or table resynchronization and its effects on bloat and vacuum. How to minimise excess bloat when doing an initial pglogical subscription of a large busy database.
Important: pglogical2 and pglogical3 have significantly different logic for table synchronization. All the below information applies to pglogical3 as well, at least as of 3.5, but pglogical3 may switch to doing table-by-table initial sync in future.
In the README for pglogical2
, the section on Table Syncronization says:
It's usually better to create replication sets before subscribing so that all tables are synchronized during initial replication setup in a single initial transaction. However, users of bigger databases may instead wish to create them incrementally for better control.
On active databases, synchronization of tables can have unexpected side effects in terms of the operation of autovaccum. Those side effects are amplified when you subscribe multiple tables to a replication set before initial sync because pglogical2 synchronizes the data for all subscribed tables in a single job when a subscription is created with sync_data = 't'
set.
Whether doing a single table, or multiple tables, one of the first steps taken by pglogical is to create a new replication slot (separate to the main slot for data streaming) then attach to the exported snapshot the creation of the slot produces. This snapshot holds down the database's global xmin value to the xmin associated with the oldest transaction that affects data rows that that snapshot may need to read. The slot holds the xmin down until the replication slot is released.
Even though most DML operations can continue on the database, (AUTO)VACUUM cannot remove any rows in any table in a db that is newer than the lowest xmin of any backend or slot for that db.
The slot's xmin has a database-wide effect; even if pglogical is only using the slot to sync one or a few tables, VACUUM is affected on all tables in the database.
If synchronization is being performed within a maintenance window, when nothing but syncing pglogical is being performed, this is not a problem. But on a live database with high churn, the result of an initial sync of all replicated tables at once would be extra bloat on all tables as vacuum is unable to recycle data rows, as the xmin is held down for the duration of the whole initial sync operation.
You can instead subscribe with sync_data = 'f'
then synchronize individual tables with pglogical.alter_subscription_resynchronize_table
. This approach allows the system to advance the global xmin after each table has completed syncing, allowing VACUUM to clean dead rows. Excess bloat can still be created when syncing very busy or very large tables as xmin is still held down while the whole table syncs, but it's better than holding it down for all tables at once.
Alternately you can use pglogical_create_subscriber
to create a physical base backup and convert it to a pglogical subscriber. No excess bloat is generated as the base backup doesn't have to hold down xmin; instead it replays WAL to reach a consistent point. In some cases you may even wish to bypass pglogical_create_subscriber
's automatic pg_basebackup
and instead make your own base backup and bring it up a streaming replica first - then convert it with pglogical_create_subscriber
.
Diagnostic hint: A slot's xmin
is shown in the slot's pg_replication_slots.xmin
. After the initial slot creation it will be null because the slot releases its initial snapshot as soon as any command is run on the slot or Pg is restarted. It's only set for new slots. Other things that can hold down a database's xmin (and thus global xmin) are prepared transactions and normal running backends with a non-null xmin
in pg_stat_activity