How to perform a Point-in-Time-Recovery restore using Barman.
PostgreSQL allows PITR (point in time recovery), i.e. the possibility to restore the database to its state at any time after the completion of the backup.
Barman leverages PostgreSQL’s PITR, allowing you to perform a point in time recovery with a simple command specifying a recovery target, either as a timestamp, as a restore label, or as a transaction ID.
The recovery target can be specified using one of four mutually exclusive options:
--target-time TARGET_TIME
--target-xid TARGET_XID
--target-name TARGET_NAME
--target-immediate
For the complete details see the barman documentation.
In this demonstration, we'll use the --target-name
option.
The requirements for this demonstration are:
- a primary PostgreSQL server.
- a Barman server with at least a backup of the primary PostgreSQL.
- a second PostgreSQL server (we'll call it the recovery server) where we can restore from the backup. It's required that the two servers run the same PostgreSQL major version. PostgreSQL should be installed but not running on this server.
- Backup and WAL archiving from the primary server have to be properly configured on Barman. Barman will also need to be able to connect via SSH to the
postgres
user on the recovery server.
We will name the servers as follows:
-
pg01
=> primary server -
backup
=> barman server -
pg02
=> recovery server
To demonstrate how the PITR works, we'll create two tables on pg01
and perform a PITR to a point in time between the two CREATE TABLE
commands. We expect to see only the first table after the recovery on the pg02
.
On pg01
server:
- Create a
test
database and atest
table with some data:
postgres=# create database test;
CREATE DATABASE
postgres=# \c test
You are now connected to database "test" as user "postgres".
test=# create table test AS
test-# SELECT * FROM GENERATE_SERIES(1, 10) AS id;
SELECT 10
- Create the restore point:
test=# select pg_create_restore_point('restore_point');
pg_create_restore_point
0/C03FC68
- Create the
test2
table:
test=# create table test2 AS
SELECT * FROM GENERATE_SERIES(1, 10) AS id;
SELECT 10
- Check the existing tables. We should see the two tables we've just created:
test=# \d
List of relations
Schema | Name | Type | Owner
public | test | table | postgres
public | test2 | table | postgres
(2 rows)
On the backup
server:
- Force the archiving of the current WAL file.
Note: this step is not usually necessary on a production server. However, in this case the workload on our server is probably not enough to fill a WAL file. This could cause Barman not to see the WAL file containing the required info: in case of WAL shipping, because PostgreSQL won't execute the archive_command
on an incomplete WAL, in case of WAL streaming, because the WAL would still be a partial one.
barman@backup:~$ barman switch-wal --force --archive pg01
The WAL file 000000300000022F000000FA has been closed on server 'pg01'
Waiting for the WAL file 000000300000022F000000FA from server 'pg01' (max: 30 seconds)
Processing xlog segments from streaming for pg01
0000001000000001000000C0
Let's now perform a recovery up to the restore point we've created. Note: we need to have at least a backup completed before the creation of the restart point.
On the backup
server as the barman
user:
- Check the existence of at least a backup for the
pg01
server
barman@backup:~$ barman list-backup pg01
pg01 20190325T214517 - Mon Mar 25 21:45:18 2019 - Size: 31.5 MiB - WAL Size: 30.2 KiB
pg01 20190325T214512 - Mon Mar 25 21:45:14 2019 - Size: 31.5 MiB - WAL Size: 91.8 KiB
pg01 20190325T214503 - Mon Mar 25 21:45:07 2019 - Size: 31.5 MiB - WAL Size: 61.8 KiB
- Perform the recovery
barman@backup:~$ barman recover --remote-ssh-command="ssh postgres@pg02" pg01 20190325T214517 /opt/postgres/data --target-name restore_point --target-action pause
Starting remote restore for server pg01 using backup 20190325T162018
Destination directory: /opt/postgres/data
Doing PITR. Recovery target name: 'restore_point'
Copying the base backup.
Generating recovery.conf
Recovery completed (start time: 2019-03-25 21:55:14.231354, elapsed time: 3 seconds)
Your PostgreSQL server has been successfully prepared for recovery!
Analyzing the above command:
-
--remote-ssh-command
specifies the command through which Barman connects topg02
-
pg01
specifies the name of the server we want to recover -
20190325T214517
specifies the id of the backup we want to recover -
/opt/postgres/data
specifies the destination directory onpg02
-
--target-name
specifies the restore point previously created onpg01
--target-action
option, accepting the following values:
-
shutdown
: once the recovery target is reached, PostgreSQL is shut down -
pause
: once recovery target is reached, PostgreSQL is started in pause state, allowing users to inspect the instance -
promote
: once the recovery target is reached, PostgreSQL will exit recovery and is promoted as a master
See the PostgreSQL documentation for further details on target action.
NOTE: the recovery command used in this example will copy all WAL files needed to reach to the target. In some cases, when the amount of WAL files is considerable big, it can be better to use the option --get-wal
. You will find additional information about this option in There's more...
section of this article.
On pg02
you can now check that only the test
table exists:
-
start PostgreSQL service: PostgreSQL will replay the WAL files until a consistent point is reached. Watch PostgreSQL logs and wait for that situation to happen before continuing with the next step.
-
connect to
test
database and check the existing tables
test=# \d
List of relations
Schema | Name | Type | Owner
public | test | table | postgres
(1 row)
We can see that we have restored the instance at the exact point in which we had created the restore point on pg01
.
- we have performed the recovery process using
--target-action pause
so PostgreSQL is now in the pause state. To terminate the recovery and promote the server to master execute:
test=# SELECT pg_wal_replay_resume();
Whenever an archive recovery completes, a new timeline is created to identify the series of WAL records generated after that recovery, so a new timeline does not overwrite the WAL data generated by previous timelines.
As we can see from the PostgreSQL log from pg02
a new timeline with ID 2 was created:
Mar 25 22:00:04 pg02 postgres[22577]: [18-1] 2019-03-25 22:00:04 UTC [@//:22577]: [14] LOG: selected new timeline ID: 2
Mar 25 22:00:04 pg02 postgres[22577]: [19-1] 2019-03-25 22:00:04 UTC [@//:22577]: [15] LOG: archive recovery complete
Mar 25 22:00:04 pg02 postgres[22585]: [6-1] 2019-03-25 22:00:04 UTC [@//:22585]: [1] LOG: checkpoint starting: end-of-recovery immediate wait
Mar 25 22:00:04 pg02 postgres[22585]: [7-1] 2019-03-25 22:00:04 UTC [@//:22585]: [2] LOG: checkpoint complete: wrote 41 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.003 s, total=0.013 s; sync files=33, longest=0.002 s, average=0.000 s; distance=33023 kB, estimate=33023 kB
Mar 25 22:00:05 pg02 postgres[22576]: [7-1] 2019-03-25 22:00:05 UTC [@//:22576]: [8] LOG: database system is ready to accept connections
Every time a new timeline is created, PostgreSQL creates a “timeline history”
file that shows which timeline it branched off from and when.
These history files are necessary to allow the system to pick the right WAL files when recovering from an archive that contains multiple timelines.
The notion of timeline goes beyond the scope of this article, you can find more details on this article on our blog.
By default, Barman pushes all WAL files from the start of the backup to the end of log (last available WAL file at recovery time for that server) to the recovery destination, inside the barman_xlog
/barman_wal
directory in PGDATA
.
For large installations this is not practical, and we advise to use on-demand pulling of WAL files, directly performed by the PostgreSQL instance as part of the recovery process. For this purpose, Barman implements the get-wal
feature of Barman. When a remote recovery is performed, it is possible to use the --get-wal
option in the recover
command at run-time:
barman recover --get-wal \
--remote-ssh-command="ssh postgres@pg02" \
--target-name restore_point \
--target-action pause \
pg01 20190325T214517 /opt/postgres/data
This way Barman will automatically set the restore_command
in the standby server in order to retrieve all the WAL files from the barman server, allowing it to be used as a WAL hub.
For more information, please read the "Speed up getting WAL files from Barman with barman-wal-restore" article from our Knowledge Base.