This article describes the procedure for restoring a Barman backup into a Postgres instance and using it as the seed for a new PGD cluster.
The original cluster has 3 nodes:
bdrdb=# SELECT node_name, peer_state_name FROM bdr.node_summary;
 node_name | peer_state_name
-----------+-----------------
 node1     | ACTIVE
 node2     | ACTIVE
 node3     | ACTIVE
(3 rows)
There is a table with some data:
bdrdb=# SELECT * FROM test;
 id | msg
----+-----
  1 | AAA
(1 row)
It's possible to recover a Barman backup into a single node and join it to an existing PGD cluster, as described in the article How to recover a PGD node from a Barman backup and join it in an existing cluster?. However, as explained in that article, this technique is not recommended because it can lead to data inconsistency unless it is done in a very controlled environment.
Our recommendation is to use Barman to recover one of the nodes, and then use that node as the seed to build a new PGD cluster from scratch. This article explains that procedure.
To destroy the existing cluster, do the following on each node:
systemctl stop harp-manager
systemctl stop postgres
rm -rf /opt/postgres/data/*
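Before moving on, it's a good idea to verify on each node that both services are stopped and the data directory is empty (a quick sanity check, using the same unit names and path as above):
systemctl status harp-manager postgres
ls -A /opt/postgres/data/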
Barman can be used to rebuild any of the nodes, since they all originally had the same data.
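Before recovering, you can list the backups that Barman holds for node 1 and pick the one you need (this assumes the Barman server for that node is named node1, as in the recover command below):
barman list-backup node1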
From the Barman node, we recover the latest backup of node 1 into node 2:
barman recover \
--remote-ssh-command 'ssh postgres@node2' \
--get-wal \
node1 latest /opt/postgres/data/
In this case, a PITR (point-in-time recovery) is also possible, simply by specifying a --target-time and a proper backup name (it might not be latest) in the command above.
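For example, a hypothetical PITR invocation could look like the following; the backup ID and target time below are placeholders and must be replaced with values valid for your Barman catalog:
barman recover \
    --remote-ssh-command 'ssh postgres@node2' \
    --get-wal \
    --target-time '2024-05-01 12:00:00' \
    node1 20240501T020000 /opt/postgres/data/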
Then we start Postgres on node 2:
systemctl start postgres
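Once Postgres is up, you can check that WAL replay has finished and the node has been promoted out of recovery; pg_is_in_recovery() should return f once replay is complete (run as the postgres OS user, as assumed by the ssh postgres@node2 command above):
psql bdrdb -c "SELECT pg_is_in_recovery();"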
And we confirm the data is still there on node 2:
bdrdb=# SELECT * FROM test;
 id | msg
----+-----
  1 | AAA
(1 row)
Even though it was recovered from a backup of node 1, node 2 is no longer configured as a PGD node, because it doesn't have the required replication slots and replication origins.
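For example, the BDR replication slots that exist on a live PGD node are not carried over by a base backup, so the following query (run as the postgres user on node 2) should show no BDR slots:
psql bdrdb -c "SELECT slot_name, plugin FROM pg_replication_slots;"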
So the first step is to clean up the BDR metadata on node 2:
-- Disable the consensus worker on this node
SELECT bdr.consensus_disable();

-- Forcibly part all the remote nodes
SELECT bdr.part_node (
    node_name := t.node_name,
    wait_for_completion := false,
    force := true
)
FROM bdr.node_summary t
WHERE t.node_name != (
    SELECT node_name FROM bdr.local_node_summary
);

-- Then part the local node itself
SELECT bdr.part_node (
    node_name := t.node_name,
    wait_for_completion := false,
    force := true
)
FROM bdr.local_node_summary t;

-- Restart Postgres before running the bdr.drop_node calls below

-- Drop the metadata of the remote nodes
SELECT bdr.drop_node (
    node_name := t.node_name,
    cascade := true,
    force := true
)
FROM bdr.node_summary t
WHERE t.node_name != (
    SELECT node_name FROM bdr.local_node_summary
);

-- Finally, drop the metadata of the local node
SELECT bdr.drop_node (
    node_name := t.node_name,
    cascade := true,
    force := true
)
FROM bdr.local_node_summary t;
IMPORTANT: Before the next step of dropping the extension, make sure CRDTs are not being used, by following the Knowledge Base article How to check if CRDTs are being used.
DROP EXTENSION bdr CASCADE;
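You can confirm the extension is gone with a standard catalog query, which should return zero rows:
psql bdrdb -c "SELECT extname FROM pg_extension WHERE extname = 'bdr';"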
By default, TPAexec randomly picks one of the PGD nodes as the source; this is called the "first BDR primary". We need to tell TPAexec that we want the first BDR primary to be node 2, because we just recovered the Barman backup into node 2. We do this by configuring preferred_first_bdr_primary under cluster_vars in the config.yml:
cluster_vars:
  ...
  preferred_first_bdr_primary: node2
Then get into the cluster directory (the same directory where the config.yml resides) and run the following:
tpaexec provision .
tpaexec deploy .
TPAexec will rebuild the BDR metadata on node 2, and then join the other nodes using node 2 as their upstream, effectively copying the data from node 2 to the other nodes.
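While the deploy is running, you can follow the join progress from node 2 using the same query shown earlier; the other nodes will appear with a non-ACTIVE state until their join completes:
psql bdrdb -c "SELECT node_name, peer_state_name FROM bdr.node_summary;"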
IMPORTANT: If the data on node 2 is large, the logical join can take a long time, even several hours, depending on the data size and the network latency between the nodes. If the data set is very large (for example, larger than 1 TB) or the network latency is significant, then we recommend using bdr_init_physical to build the other nodes instead of relying on TPAexec for that. You can find more details about bdr_init_physical in the PGD documentation.
At the end of the tpaexec deploy command, you should see the full cluster again:
bdrdb=# SELECT node_name, peer_state_name FROM bdr.node_summary;
 node_name | peer_state_name
-----------+-----------------
 node1     | ACTIVE
 node2     | ACTIVE
 node3     | ACTIVE
(3 rows)
And the data should be there on all nodes:
bdrdb=# SELECT * FROM test;
 id | msg
----+-----
  1 | AAA
(1 row)