PITR with Barman

Anna Bellandi
Anna Bellandi

How to perform a Point-in-Time-Recovery restore using Barman.

PostgreSQL allows PITR (point in time recovery), i.e. the possibility to restore the database to its state at any time after the completion of the backup.

Barman leverages PostgreSQL’s PITR, allowing you to perform a point in time recovery with a simple command specifying a recovery target, either as a timestamp, as a restore label, or as a transaction ID.

The recovery target can be specified using one of four mutually exclusive options:

  • --target-time TARGET_TIME
  • --target-xid TARGET_XID
  • --target-name TARGET_NAME
  • --target-immediate

For the complete details see the barman documentation.

In this demonstration, we'll use the --target-name option.

Requirements

The requirements for this demonstration are:

  • a primary PostgreSQL server.
  • a Barman server with at least a backup of the primary PostgreSQL.
  • a second PostgreSQL server (we'll call it the recovery server) where we can restore from the backup. It's required that the two servers run the same PostgreSQL major version. PostgreSQL should be installed but not running on this server.
  • Backup and WAL archiving from the primary server have to be properly configured on Barman. Barman will also need to be able to connect via SSH to the postgres user on the recovery server.

Preliminary steps

We will name the servers as follows:

  • pg01 => primary server
  • backup => barman server
  • pg02 => recovery server

To demonstrate how the PITR works, we'll create two tables on pg01 and perform a PITR to a point in time between the two CREATE TABLE commands. We expect to see only the first table after the recovery on the pg02.

On pg01 server:

  1. Create a test database and a test table with some data:
postgres=# create database test;
CREATE DATABASE
postgres=# \c test
You are now connected to database "test" as user "postgres".
test=# create table test AS
test-# SELECT * FROM GENERATE_SERIES(1, 10) AS id;
SELECT 10
  1. Create the restore point:
test=# select pg_create_restore_point('restore_point');
pg_create_restore_point
0/C03FC68
  1. Create the test2 table:
test=# create table test2 AS
SELECT * FROM GENERATE_SERIES(1, 10) AS id;
SELECT 10
  1. Check the existing tables. We should see the two tables we've just created:
test=# \d
List of relations
Schema | Name | Type | Owner
public | test | table | postgres
public | test2 | table | postgres
(2 rows)

On the backup server:

  1. Force the archiving of the current WAL file.

Note: this step is not usually necessary on a production server. However, in this case the workload on our server is probably not enough to fill a WAL file. This could cause Barman not to see the WAL file containing the required info: in case of WAL shipping, because PostgreSQL won't execute the archive_command on an incomplete WAL, in case of WAL streaming, because the WAL would still be a partial one.

barman@backup:~$ barman switch-wal --force --archive pg01
The WAL file 000000300000022F000000FA has been closed on server 'pg01'
Waiting for the WAL file 000000300000022F000000FA from server 'pg01' (max: 30 seconds)
Processing xlog segments from streaming for pg01
0000001000000001000000C0

Point in Time Recovery

Let's now perform a recovery up to the restore point we've created. Note: we need to have at least a backup completed before the creation of the restart point.

On the backup server as the barman user:

  1. Check the existence of at least a backup for the pg01 server
barman@backup:~$ barman list-backup pg01
pg01 20190325T214517 - Mon Mar 25 21:45:18 2019 - Size: 31.5 MiB - WAL Size: 30.2 KiB
pg01 20190325T214512 - Mon Mar 25 21:45:14 2019 - Size: 31.5 MiB - WAL Size: 91.8 KiB
pg01 20190325T214503 - Mon Mar 25 21:45:07 2019 - Size: 31.5 MiB - WAL Size: 61.8 KiB
  1. Perform the recovery
barman@backup:~$ barman recover --remote-ssh-command="ssh postgres@pg02" pg01 20190325T214517 /opt/postgres/data --target-name restore_point --target-action pause
Starting remote restore for server pg01 using backup 20190325T162018
Destination directory: /opt/postgres/data
Doing PITR. Recovery target name: 'restore_point'
Copying the base backup.
Generating recovery.conf

Recovery completed (start time: 2019-03-25 21:55:14.231354, elapsed time: 3 seconds)

Your PostgreSQL server has been successfully prepared for recovery!

Analyzing the above command:

  • --remote-ssh-command specifies the command through which Barman connects to pg02
  • pg01 specifies the name of the server we want to recover
  • 20190325T214517 specifies the id of the backup we want to recover
  • /opt/postgres/data specifies the destination directory on pg02
  • --target-name specifies the restore point previously created on pg01

--target-action option, accepting the following values:

  • shutdown: once the recovery target is reached, PostgreSQL is shut down
  • pause: once recovery target is reached, PostgreSQL is started in pause state, allowing users to inspect the instance
  • promote: once the recovery target is reached, PostgreSQL will exit recovery and is promoted as a master

See the PostgreSQL documentation for further details on target action.

NOTE: the recovery command used in this example will copy all WAL files needed to reach to the target. In some cases, when the amount of WAL files is considerable big, it can be better to use the option --get-wal. You will find additional information about this option in There's more... section of this article.

On pg02 you can now check that only the test table exists:

  1. start PostgreSQL service: PostgreSQL will replay the WAL files until a consistent point is reached. Watch PostgreSQL logs and wait for that situation to happen before continuing with the next step.

  2. connect to test database and check the existing tables

test=# \d
List of relations
Schema | Name | Type | Owner
public | test | table | postgres
(1 row)

We can see that we have restored the instance at the exact point in which we had created the restore point on pg01.

  1. we have performed the recovery process using --target-action pause so PostgreSQL is now in the pause state. To terminate the recovery and promote the server to master execute:
test=# SELECT pg_wal_replay_resume();

About timelines

Whenever an archive recovery completes, a new timeline is created to identify the series of WAL records generated after that recovery, so a new timeline does not overwrite the WAL data generated by previous timelines.

As we can see from the PostgreSQL log from pg02 a new timeline with ID 2 was created:

Mar 25 22:00:04 pg02 postgres[22577]: [18-1] 2019-03-25 22:00:04 UTC [@//:22577]: [14] LOG: selected new timeline ID: 2
Mar 25 22:00:04 pg02 postgres[22577]: [19-1] 2019-03-25 22:00:04 UTC [@//:22577]: [15] LOG: archive recovery complete
Mar 25 22:00:04 pg02 postgres[22585]: [6-1] 2019-03-25 22:00:04 UTC [@//:22585]: [1] LOG: checkpoint starting: end-of-recovery immediate wait
Mar 25 22:00:04 pg02 postgres[22585]: [7-1] 2019-03-25 22:00:04 UTC [@//:22585]: [2] LOG: checkpoint complete: wrote 41 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.003 s, total=0.013 s; sync files=33, longest=0.002 s, average=0.000 s; distance=33023 kB, estimate=33023 kB
Mar 25 22:00:05 pg02 postgres[22576]: [7-1] 2019-03-25 22:00:05 UTC [@//:22576]: [8] LOG: database system is ready to accept connections

Every time a new timeline is created, PostgreSQL creates a “timeline history” file that shows which timeline it branched off from and when. These history files are necessary to allow the system to pick the right WAL files when recovering from an archive that contains multiple timelines.

The notion of timeline goes beyond the scope of this article, you can find more details on this article on our blog.

There's more ...

By default, Barman pushes all WAL files from the start of the backup to the end of log (last available WAL file at recovery time for that server) to the recovery destination, inside the barman_xlog/barman_wal directory in PGDATA.

For large installations this is not practical, and we advise to use on-demand pulling of WAL files, directly performed by the PostgreSQL instance as part of the recovery process. For this purpose, Barman implements the get-wal feature of Barman. When a remote recovery is performed, it is possible to use the --get-wal option in the recover command at run-time:

barman recover --get-wal \
--remote-ssh-command="ssh postgres@pg02" \
--target-name restore_point \
--target-action pause \
pg01 20190325T214517 /opt/postgres/data

This way Barman will automatically set the restore_command in the standby server in order to retrieve all the WAL files from the barman server, allowing it to be used as a WAL hub.

For more information, please read the "Speed up getting WAL files from Barman with barman-wal-restore" article from our Knowledge Base.

Was this article helpful?

0 out of 0 found this helpful