Authors: Martín Marquez and Edgar Leitão.
In some cases where there's a very large database involved in an online upgrade plan with pglogical, it may be interesting to seed the subscriber from a pg_basebackup
copy to speed up the initial sync (the subscriber will be the node you will upgrade the database). With this process you will also require less resources for the initialization.
This describes the process of setting up a node seeded from pg_basebackup
to be a subscriber on a newer version of Postgres.
Note 1: This only works with 9.5 or better. Minor changes can be made for it to work on 9.4 as well.
Note 2: The example in this article is for the upgrade of a cluster of 1 database from 14 to 15. Some simple changes can make it extensible to a list of databases.
Note 3: Examples in this article are for Red Hat based installations. Only paths for binaries and data directories would need to be amended for other distributions.
On the source node, the current primary, connect to the database which is meant to be migrated, in the case below the database to be migrated is called baseexample
.
The examples below show how to upgrade from PostgreSQL 14 to 15 using pglogical
2.4.3.
The source node, where the production 14 database runs will be NODE1
, and the destination node, initially a 14 replica which we will upgrade to the final version of 15, will be NODE2
. This is to have reference for the instructions below.
You could upgrade on the same DB server (this requires enough space to host both the old and new cluster), or use this process that does a logical upgrade to migrate to a new server. Our examples will use separate nodes, which bring the possibility of also upgrading the Operating System.
Either way, the host where you're upgrading to
must have both
versions of PostgreSQL (14 and 15 in our examples here) with all the same extensions used.
On the node we'll be upgrading to install PostgreSQL 15, must also have all the PostgreSQL 15 extensions and add-on packages (including pglogical
) used in the current production environment. Match extension versions if possible. If you're provisioning on a new separate host or VM, you must also ensure that you have installed PostgreSQL 14 packages on the destination host, again with all the same extensions as the current production host.
IMPORTANT!!: Before start the proceedure, make sure that Postgres is configured with wal_level = logical
in postgresql.conf
, also have the pglogical
extension for Postgresql 14 installed and included in shared_preloaded_libraries
.
Connect to the database of the currently running primary PG 14 node, in our examples it's NODE1
(192.168.1.111) and create the pglogical
extension there. After that create the pglogical
node just like in the SQL code below:
\c baseexample
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
node_name := 'node1',
dsn := 'host=192.168.1.111 ort=5432 dbname=baseexample'
);
Verify the node information:
SELECT * FROM pglogical.node_interface;
Add all tables from the public schema to the default replication set:
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);
If you have tables on other schemata, you can add them into the ARRAY
above. For example, if there are schemata myschema1
and myshcema2
, you should use this statement call:
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public', 'myschema1', 'myschema2']);
Check the list of added tables:
SELECT * FROM pglogical.tables;
IMPORTANT!!: Before proceed, make sure that Postgresql 14 and pglogical
for Postgresql 14 extension are installed in the NODE2
, also any other extension that you are using in the NODE1
.
Now, on the upgrade-destination, in the examples from this article it's NODE2
(192.168.1.113), take a base backup from the production node NODE1
using pg_basebackup
using PostgreSQL 14 binaries:
su - postgres
/usr/pgsql-14/bin/pg_basebackup -h 192.168.1.111 -D/var/lib/pgsql/14/data/ -X stream -P -v -R
Start the new replica node using the same tools you'd usually use to start a replica (pg_ctl
or systemctl
) and check that it's streaming:
/usr/pgsql-14/bin/pg_ctl -D /var/lib/pgsql/14/data -w start
Verify it caught up with the primary. You can use this query on the primary node NODE1
to verify that the lag in contained:
SELECT pg_wal_lsn_diff(pg_current_wal_insert_lsn(), flush_lsn) AS lag_bytes,
pid,
application_name
FROM pg_stat_replication;
At this point we have a primary node with a replica which will be our candidate for upgrade to 15. We can now use pglogical_create_subscriber
to convert the PostgreSQL 14 physical replica running on NODE2
into a pglogical
subscriber, still running on PostgreSQL 14.
But the replica has to be stopped before running pglogical_create_subscriber
. Make sure it's not running before creating the subscriber.
# Stop postgres replica on NODE2
/usr/pgsql-14/bin/pg_ctl -D /var/lib/pgsql/14/data stop -m fast
# Run pglogical_create_subscriber to make this replica a logical subscriber
/usr/pgsql-14/bin/pglogical_create_subscriber -D /var/lib/pgsql/14/data \
--subscriber-name='node2' \
--subscriber-dsn='host=192.168.1.113 port=5432 dbname=baseexample' \
--provider-dsn='host=192.168.1.111 port=5432 dbname=baseexample' -v
After performing the command above, NODE2
will be started, and you can use the query below to verify if NODE2
is in fact receiving data via logical replication from NODE1
:
# Query to check the replication origin:
SELECT * FROM pg_replication_origin;
# Output of the query:
roident | roname
---------+-----------------------------
1 | pgl_baseexample_node1_node2
On NODE1
, use the below query to check if the logical replication is working to NODE2
:
# QUERY:
SELECT * FROM pg_stat_replication;
# OUTPUT:
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-------+----------+----------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
26976 | 10 | postgres | node2 | 192.168.1.113 | | 39670 | 2024-06-09 13:35:30.203863-04 | | streaming | 0/3000308 | 0/3000308 | 0/3000308 | 0/3000308 | | | | 0 | async | 2024-06-09 13:36:30.251301-04
At this point we are ready to upgrade the 14 instance on NODE2
which is the subscriber for the database to upgrade from the production environment to version 15 with pg_upgrade
.
First install and initialize an empty PG15 cluster, however, don't start the Postgres instance with pg_ctl
or systemctl
:
yum install -y postgresql15-server
/usr/pgsql-15/bin/postgresql-15-setup initdb
systemctl enable postgresql-15
Install pglogical
for PG15, add pglogical
to the configuration file at /var/lib/pgsql/15/data/postgresql.conf
in the shared_preload_libraries
setting, as well as any other extensions you are using or want to use that require preloading, configure the listen_addresses
and wal_level
:
yum install pglogical_15 -y
su - postgres
vim /var/lib/pgsql/15/data/postgresql.conf
listen_addresses = '*'
wal_level = 'logical'
shared_preload_libraries = 'pglogical'
Set any other postgresql.conf
parameters you had configured in PG14, and also copy pg_hba.conf
from the PG14 directory (/var/lib/pgsql/14/data/pg_hba.conf
) in the subscriber NODE2
to /var/lib/pgsql/15/data/pg_hba.conf
(should be identical to the PG14 primary as it was copied by pg_basebackup
).
The upgrade using pg_upgrade will need to be performed with logical replication disabled, and once it's finished, we'll need to restore replication origin information. For that we'll need to store a copy before doing so.
First disable (not drop!) the subscription in /var/lib/pgsql/14/data
cluster on NODE2
by connecting on this node to the database we're going to upgrade (in this case baseexample
) and executing the following statement:
\c baseexample
SELECT * FROM pglogical.alter_subscription_disable('node2' , true);
Note: if you don't know or are in doubt about the subscription name to be used in the above statement, you can verify it with the following query:
SELECT sub_name FROM pglogical.subscription;
Now we can dump the contents of pg_replication_origin_status
on this subscriber node you just disabled using psql
(this won't work in PgAdmin etc):
\copy (select * from pg_replication_origin_status) to '/var/lib/pgsql/pg_replication_origin_status.out';
You should have in the directory /var/lib/pgsql/
a file named pg_replication_origin_status.out
with the replication origin data. It should look something like this:
cat /var/lib/pgsql/pg_replication_origin_status.out
1 pgl_baseexample_node1_node2 0/3000420 0/3168EE0
We'll need the value from the third column later, that is 0/3000420
in our example.
Now stop the PG14 instance on NODE2
with pg_ctl
or systemctl, e.g.:
/usr/pgsql-14/bin/pg_ctl -D /var/lib/pgsql/14/data stop -m fast
Upgrade the database cluster to PostgreSQL 15 using pg_upgrade
located in the PG15 utility directory:
/usr/pgsql-15/bin/pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ \
-d /var/lib/pgsql/14/data/ -D /var/lib/pgsql/15/data/ -v -r
Note that this creates another copy of the database, so allow sufficient disk space to be available on this node, or use the --link
option which creates hard links, saving a lot of space. You may also want to do a pre-check run before. See the pg_upgrade manual.
Now we have to initialize the logical replication again for the upgraded cluster. For this we'll be using the dumped data from pg_replication_origin_status
. In order to do this we will need the PostgreSQL 15 cluster instance on NODE2
running, so start node in /var/lib/pgsql/15/data
:
/usr/pgsql-15/bin/pg_ctl -D /var/lib/pgsql/15/data start
Connect to database you are upgrading, in this example it's baseexample
on the PG15 cluster and run the following SQL commands, we will use here the values of the second and third columns from the file /var/lib/pgsql/pg_replication_origin_status.out
which contains the information of the replication origin previously used by PG14 instance to maintain a logical replication with NODE1
. Use the values from your output, do not run the following verbatim!
\c baseexample
SELECT * FROM pg_replication_origin_create('pgl_baseexample_node1_node2');
SELECT * FROM pg_replication_origin_advance('pgl_baseexample_node1_node2',
'0/3000420'::pg_lsn);
Finally enable once again the subscription on the PG 15 cluster (/var/lib/pgsql/15/data
) by connecting to the baseexample
database on NODE2
and executing the following statement:
psql
\c baseexample
SELECT * FROM pglogical.alter_subscription_enable('node2');
From this moment on, the baseexample
database from the PostgreSQL instance running on NODE2
will be receiving replication changes from the production provider we set up at the beginning of the article NODE1
.
Once you have a scheduled time the switchover for your application from PG14 instance in the NODE1
to the new PG15 instance on NODE2
, all that needs to be done is orchestrate the the change of the conninfo
string on all application servers.