Recipe - Hot physical backups with Barman

Gabriele Bartolini

A short tutorial on using Barman to perform hot physical backups of your PostgreSQL databases, covering from installation to backup execution.

The main reason we came up with the idea of starting a new open source project for disaster recovery of PostgreSQL databases was the lack (back in 2011) of a simple and standard procedure for managing backups and, most importantly, recovery. Disasters and failures in ICT will happen.

As a database administrator, your duty is to plan for backups and recovery of PostgreSQL databases and perform regular tests in order to sweep away stress and fear, which typically follow those unexpected events. Barman, which stands for Backup and Recovery Manager, is definitely a tool that you can use for these purposes.

Before you dive into this recipe and the next one, which will introduce you to Barman, I recommend that you read the following recipes from earlier in this chapter: Understanding and controlling crash recovery, Planning backups, Hot physical backup and continuous archiving, and Recovery to a point in time. Although Barman hides the complexity of the underlying concepts, it is important that you be aware of them, as it will make you more resilient to installation, configuration, and recovery issues of Barman.

Barman is written in Python and is currently available only for Linux systems. It supports PostgreSQL 8.3 and later. Its main features include remote backup, remote recovery, multiple server management, backup catalogs, incremental backups, retention policies, WAL streaming, compression of WAL files, and backup from a standby server (for PostgreSQL 9.2 and later).

For the sake of simplicity, in this recipe we will assume the following architecture:

  • One Linux server named angus, running your PostgreSQL production database server
  • One Linux server named malcolm, running Barman for disaster recovery of your PostgreSQL database server
  • Both the servers are in the same LAN, and for better business continuity objectives, the only resource they share is the network

Later on, we will see how easy it is with Barman to add more Postgres servers (such as bon) to our disaster recovery solution on malcolm.

Getting ready

Although Barman can be installed via sources or through pip - Python's main package manager - the easiest way to install Barman is by using the software package manager of your Linux distribution.

Currently, 2ndQuadrant maintains packages for RHEL, CentOS 5/6/7, Debian, and Ubuntu systems. If you are using a different distribution or another Unix system, you can follow the instructions written in the official documentation of Barman, available at http://docs.pgbarman.org/.

In this recipe, we will cover the installation of Barman 2.3 (currently the latest stable release) on CentOS 7 and Ubuntu 16.04 LTS Linux servers.

If you are using RHEL or CentOS 7 on the malcolm server, you need to install the following repositories:

Then, as root, type this:

yum install barman

If you are using Ubuntu on malcolm, you need to install the APT repository of PostgreSQL, available at http://apt.postgresql.org/. Then, as root, type this:

apt-get install barman

From now on, we will assume the following:

  • A freshly installed PostgreSQL is running on angus as the postgres system user and listening to the default port (5432). Its configuration is such that the barman system user on malcolm can connect as the postgres database user without having to type a password.
  • Barman is installed on malcolm and runs as the barman system user.
  • TCP connections for SSH and PostgreSQL are allowed between the two servers (check your firewall settings).
  • Two-way automated communication via SSH is properly set up between these users.
  • You have created a superuser called barman in your PostgreSQL server on angus and it can connect only from the malcolm server. See Chapter 1, Enabling access for network/remote users and Chapter 6, The PostgreSQL superuser.

Setting up two-way SSH communication requires "exchanging" a public SSH key without a passphrase between the postgres user on angus and the barman user on malcolm. If you are not familiar with this topic, which goes beyond the scope of this book, you are advised to follow Barman's documentation or search the web for more information.

Alternatively, if your system administrator complains about opening SSH access to your PostgreSQL server, you can always take your backups via streaming replication. Indeed, Barman 2.0 introduces transparent integration with pg_basebackup, meaning that base backups can be taken through the 5432 port and permissions can be granted at PostgreSQL level.

However, in this book we will concentrate on the copy method that uses rsync via SSH. If you are interested in setting up backups via streaming replication, look at Barman's documentation, in particular the backup_method and streaming_conninfo options, as well as Setting up streaming replication in Chapter 12.

How to do it

We will start by looking at Barman's main configuration file:

  1. As root on malcolm, open the /etc/barman.conf file for editing. This file contains global options for Barman. Once you are familiar with the main configuration options, I recommend that you set the default compression method by uncommenting the following line:
compression = gzip
  2. Add the configuration file for the angus server. Drop the angus.conf file, containing the following lines, into the /etc/barman.d directory:
[angus]
description = "PostgreSQL Database on angus"
active = off
archiver = on
backup_method = rsync
ssh_command = ssh postgres@angus
conninfo = host=angus user=barman dbname=postgres
  3. You have just added the angus server to the list of Postgres servers managed by Barman. Temporarily, the server is inactive until configuration is completed. You can verify this by typing barman list-server, as follows:
[root@malcolm]# barman list-server
angus - PostgreSQL Database on angus (inactive)
  4. In this book, you will be executing commands as the root user. Be aware, however, that every command will be executed by the barman system user (or, more generally, as specified in the configuration file by the barman_user option). It is now time to set up continuous archiving of WAL files between Postgres and Barman. Execute the barman show-server angus command and write down the directory for incoming WALs (incoming_wals_directory):
[root@malcolm]# barman show-server angus
Server angus (inactive):
active: False
archive_command: None
archive_mode: None
...
incoming_wals_directory: /var/lib/barman/angus/incoming
  5. The next task is to initialize the directory layout for the angus server, through the check command. You are advised to add this command to your monitoring infrastructure as, among other things, it ensures that connection to the Postgres server via SSH and libpq is working properly, as well as continuous archiving. It returns 0 if everything is fine:
[root@malcolm]# barman check angus
Server angus (inactive):
WAL archive: FAILED (please make sure WAL shipping is setup)
PostgreSQL: OK
superuser: OK
wal_level: FAILED (please set it to a higher level than 'minimal')
directories: OK
retention policy settings: OK
backup maximum age: OK (no last_backup_maximum_age provided)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 0 backups, expected at least 0)
ssh: OK (PostgreSQL server)
not in recovery: OK
archive_mode: FAILED (please set it to 'on' or 'always')
archive_command: FAILED (please set it accordingly to documentation)
archiver errors: OK

[root@malcolm]# echo $?
1
  6. As you can see, the returned value is 1, meaning that the angus server is not yet ready for backup. The output shows that wal_level, archive_mode, and archive_command in Postgres are not yet set for continuous archiving. Connect to angus and modify the postgresql.conf file by adding this:
archive_mode = on
archive_command = 'rsync -a %p barman@malcolm:/var/lib/barman/angus/incoming/%f'
wal_level = replica
  7. Restart the PostgreSQL server.

  8. Activate the server in Barman, by removing the line that starts with active.

  9. Run the check command on malcolm (suppressing the output with -q) again, and compare the results with what you got earlier:

[root@malcolm]# barman -q check angus 
[root@malcolm]# echo $?
0

It returned 0. Everything is all good! PostgreSQL on angus should now be regularly shipping WAL files to Barman on malcolm, depending on the write workload of your database.
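
To make the archive_command setting above more concrete, the following Python sketch mimics how PostgreSQL expands the placeholders before running the command: %p becomes the path of the WAL file to archive (relative to the data directory), %f becomes the file name alone, and %% becomes a literal percent sign. The function name and the sample WAL path are assumptions made for illustration; PostgreSQL performs this substitution internally.

```python
import re
from pathlib import PurePosixPath


def expand_archive_command(template: str, wal_path: str) -> str:
    """Illustrative only: mimic PostgreSQL's %p / %f / %% substitution."""
    replacements = {
        "%p": wal_path,                      # path of the WAL file to archive
        "%f": PurePosixPath(wal_path).name,  # file name only
        "%%": "%",                           # literal percent sign
    }
    return re.sub(r"%[pf%]", lambda m: replacements[m.group(0)], template)


# archive_command as configured on angus in this recipe
ARCHIVE_COMMAND = "rsync -a %p barman@malcolm:/var/lib/barman/angus/incoming/%f"

if __name__ == "__main__":
    # Hypothetical WAL path: the directory is pg_xlog up to PostgreSQL 9.6
    # and pg_wal from PostgreSQL 10 onwards.
    print(expand_archive_command(ARCHIVE_COMMAND, "pg_xlog/000000010000000000000002"))
```

Each completed WAL segment therefore lands in Barman's incoming directory under its own file name, which is where the maintenance job later picks it up.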

Do not worry if the check command complains with the following error:

WAL archive: FAILED (please make sure WAL shipping is setup)

It is a precautionary measure we had to take in order to prevent users from going live without a working archiving process. That means that your server (like angus in this case) has a very low workload and no WAL files have yet been produced, shipped and archived. If you want to speed up the installation, you can execute the following commands:

[root@malcolm]# barman switch-xlog --force --archive angus
[root@malcolm]# barman archive-wal angus

We recommend that you check both the PostgreSQL and Barman log files and verify that WALs are correctly shipped. Continuous archiving is indeed the main requirement for physical backups in Postgres.
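
Because barman check returns 0 only when every check passes, it is easy to wrap for monitoring. The Python sketch below is one possible approach, not part of Barman: server_is_ready relies only on the exit status of barman -q check, while failed_checks extracts the names of failing checks from the human-readable output shown earlier. Both function names are hypothetical, and the text format of the output is assumed to match the sample in this recipe, so treat the parser as a convenience rather than a stable interface.

```python
import subprocess
from typing import List


def failed_checks(check_output: str) -> List[str]:
    """Return the names of checks reported as FAILED in `barman check` output."""
    failed = []
    for line in check_output.splitlines():
        # Each check is printed as "<name>: <status> (<hint>)"
        name, sep, status = line.strip().partition(": ")
        if sep and status.startswith("FAILED"):
            failed.append(name)
    return failed


def server_is_ready(server: str) -> bool:
    """Run `barman -q check SERVER`; exit status 0 means every check passed."""
    result = subprocess.run(["barman", "-q", "check", server])
    return result.returncode == 0


# Abridged copy of the output shown earlier in this recipe
SAMPLE_OUTPUT = """\
Server angus (inactive):
WAL archive: FAILED (please make sure WAL shipping is setup)
PostgreSQL: OK
wal_level: FAILED (please set it to a higher level than 'minimal')
archive_mode: FAILED (please set it to 'on' or 'always')
archive_command: FAILED (please set it accordingly to documentation)
archiver errors: OK
"""

if __name__ == "__main__":
    print(failed_checks(SAMPLE_OUTPUT))
```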

  10. Once you have set up continuous archiving, in order to add the disaster recovery capability to your Postgres server, you need to have at least one full base backup. Taking a full base backup in Barman is as easy as typing one single command. It should not be hard for you to guess that the command to execute is barman backup angus.

Barman initiates the physical backup procedure and waits for the checkpoint to happen, before copying the data files from angus to malcolm using rsync:

[root@malcolm]# barman backup angus
Starting backup using rsync-exclusive method for server angus
Backup start at xlog location: 0/3000028 (000000010000000000000003, 00000028)
This is the first backup for server angus
WAL segments preceding the current backup have been found:
000000010000000000000001 from server angus has been removed
Copying files.
Copy done.
This is the first backup for server angus
Asking PostgreSQL server to finalize the backup.
Backup size: 21.1 MiB
Backup end at xlog location: 0/3000130 (000000010000000000000003, 00000130)
Backup completed
Processing xlog segments from file archival for angus
000000010000000000000002
000000010000000000000003
000000010000000000000003.00000028.backup

It is worth noting that, during the backup procedure, your PostgreSQL server is available for both read and write operations. This is because PostgreSQL natively implements hot backup, a feature that other DBMS vendors might make you pay for.

From now on, your angus PostgreSQL server is continuously backed up on malcolm. You can now schedule weekly backups (using the barman user's cron) and manage retention policies so that you can build a catalog of backups covering you for weeks, months, or years of data and allowing you to perform recovery operations at any point in time between the first available backup and the last successfully archived WAL file.

How it works

Barman is a Python application that wraps PostgreSQL core technology for continuous backup and PITR. It also adds some practical functionality focused on helping the database administrator manage disaster recovery of one or more PostgreSQL servers.

When devising Barman, we decided to keep the design simple and not to use any daemon or client/server architecture. Maintenance operations are simply delegated to the barman cron command, which is mainly responsible for archiving WAL files (moving them from the incoming directory to the WAL archive and compressing them) and managing retention policies.

If you have installed Barman through RPM or APT packages, you will notice that maintenance is run every minute through cron:

[root@malcolm ~]# cat /etc/cron.d/barman
# m h dom mon dow user command
* * * * * barman [ -x /usr/bin/barman ] && /usr/bin/barman -q cron
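
As a rough mental model of the archiving half of barman cron, the Python sketch below moves every file from an incoming directory into an archive directory, compressing it with gzip on the way. It is an illustration under simplifying assumptions, not Barman's actual code: the function name, the flat archive layout, and the .gz suffix are all hypothetical.

```python
import gzip
import shutil
import tempfile
from pathlib import Path
from typing import List


def archive_incoming(incoming: Path, archive: Path) -> List[str]:
    """Compress each file found in `incoming` into `archive`, then remove the original."""
    archive.mkdir(parents=True, exist_ok=True)
    archived = []
    for wal in sorted(incoming.iterdir()):
        if not wal.is_file():
            continue
        target = archive / (wal.name + ".gz")
        with wal.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        wal.unlink()  # drop the original only after the compressed copy is written
        archived.append(wal.name)
    return archived


if __name__ == "__main__":
    # Demonstration in a throwaway directory with a fake WAL file
    with tempfile.TemporaryDirectory() as tmp:
        incoming = Path(tmp) / "incoming"
        incoming.mkdir()
        (incoming / "000000010000000000000002").write_bytes(b"\0" * 4096)
        print(archive_incoming(incoming, Path(tmp) / "wals"))
```

Running such a job every minute, as the packaged cron entry does, keeps the incoming directory small and the archive close to up to date.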

Barman follows the "convention over configuration" paradigm and uses an INI format configuration file with options operating at two different levels:

  • Global options: These are options specified in the [barman] section, used by any Barman command and for every server. Several global options can be overridden at the server level.
  • Server options: These are options specified in the [SERVER_ID] section, used by server commands. These options can be customized at the server level (including overriding general settings).

The SERVER_ID placeholder (such as angus) is fundamental, as it identifies the server in the catalogue (therefore, it must be unique).
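
The two-level lookup can be sketched with Python's standard configparser module. This is only an illustration of the override rule, not how Barman reads its configuration: the effective_option helper is hypothetical, the global and server sections are merged into one string for brevity, and the compression override on bon is an invented example.

```python
import configparser

SAMPLE = """
[barman]
compression = gzip

[angus]
description = "PostgreSQL Database on angus"
backup_method = rsync
ssh_command = ssh postgres@angus
conninfo = host=angus user=barman dbname=postgres

[bon]
description = "PostgreSQL Database on bon"
compression = bzip2
"""


def effective_option(cfg, server, option, default=None):
    """Server-level value if present, otherwise the global [barman] value."""
    if cfg.has_option(server, option):
        return cfg.get(server, option)
    return cfg.get("barman", option, fallback=default)


# interpolation=None keeps any % characters in values untouched
cfg = configparser.ConfigParser(interpolation=None)
cfg.read_string(SAMPLE)

if __name__ == "__main__":
    print(effective_option(cfg, "angus", "compression"))  # inherited from [barman]
    print(effective_option(cfg, "bon", "compression"))    # overridden in [bon]
```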

Similarly, commands in Barman are of two types:

  • Global commands: These are general commands, not tied to any server in particular, such as a list of the servers managed by the Barman installation (list-server) and maintenance (cron)
  • Server commands: These are commands executed on a specific server, such as diagnostics (check and status), backup control (backup, list-backup, delete, and show-backup) and recovery control (recover, which is discussed in the next recipe, Recovery with Barman)

The previous sections of this recipe showed you how to add a server (angus) to a Barman installation on the malcolm server. You can easily add a second server (bon) to the Barman server on malcolm. All you have to do is create the bon.conf file in the /etc/barman.d directory and repeat the steps outlined in the How to do it section, as you have done for angus.

There's more

Every time you execute the barman backup command for a given server, you take a full base backup (a more generic term for this is periodical full backup). Once completed, this backup can be used as a base for any recovery operation from the start time of the backup to the last available WAL file for that server (provided there is continuity among all the WAL segments).

As mentioned earlier, by scheduling daily or weekly automated backups, you end up having several periodic backups for a server. In Barman's jargon, this is known as the backup catalogue and it is one of our favorite features of this tool.

At any time, you can get the list of available backups for a given server through the list-backup command:

[root@malcolm ~]# barman list-backup angus
angus 20161003T194717 - Mon Oct 3 19:47:20 2016 - Size: 21.1 MiB - WAL Size: 26.6 KiB

The last informative command you might want to get familiar with is show-backup, which gives you detailed information on a specific backup regarding the server, base backup time, WAL archive, and context within the catalog (for example, the last available backup):

[root@malcolm ~]# barman show-backup angus 20161003T194717

Rather than the full backup ID (20161003T194717), you can use a few synonyms, such as these:

  • last or latest: This refers to the latest available backup (the last in the catalog)
  • first or oldest: This refers to the oldest available backup (the first in the catalog)
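
The way these synonyms map onto the catalog can be sketched in a few lines of Python. This is a hypothetical helper, not Barman's implementation; it only assumes that backup IDs are timestamps of the form shown above, so that sorting them as strings also sorts them chronologically.

```python
from typing import List

# Index into the chronologically sorted catalog for each synonym
SHORTCUTS = {"last": -1, "latest": -1, "first": 0, "oldest": 0}


def resolve_backup_id(backup_id: str, catalog: List[str]) -> str:
    """Translate a synonym into a concrete backup ID from the catalog."""
    ordered = sorted(catalog)  # IDs such as 20161003T194717 sort by time
    if backup_id in SHORTCUTS:
        if not ordered:
            raise LookupError("the catalog contains no backups")
        return ordered[SHORTCUTS[backup_id]]
    if backup_id not in ordered:
        raise LookupError("unknown backup ID: " + backup_id)
    return backup_id


if __name__ == "__main__":
    catalog = ["20160930T130002", "20160923T130001"]
    print(resolve_backup_id("latest", catalog))
    print(resolve_backup_id("oldest", catalog))
```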

For the show-backup command, however, we will use a real and concrete example, taken directly from one of our customers' installations of Barman on a 16.4 TB Postgres 9.4 database:

Backup 20160930T130002:
Server Name : skynyrd
Status : DONE
PostgreSQL Version : 90409
PGDATA directory : /srv/pgdata

Base backup information:
Disk usage : 16.4 TiB (16.4 TiB with WALs)
Incremental size : 5.7 TiB (-65.08%)
Timeline : 1
Begin WAL : 000000010000358800000063
End WAL : 00000001000035A0000000A2
WAL number : 6208
WAL compression ratio: 79.15%
Begin time : 2016-09-30 13:00:04.245110+00:00
End time : 2016-10-01 13:24:47.322288+00:00
Begin Offset : 24272
End Offset : 11100576
Begin XLOG : 3588/63005ED0
End XLOG : 35A0/A2A961A0

WAL information:
No of files : 3240
Disk usage : 11.9 GiB
WAL rate : 104.33/hour
Compression ratio : 76.43%
Last available : 00000001000035AD0000004A

Catalog information:
Retention Policy : not enforced
Previous Backup : 20160923T130001
Next Backup : - (this is the latest base backup)

As you can see, Barman is a production-ready tool that can be used in large, business-critical contexts, as well as in basic Postgres installations. It provides good Recovery Point Objective (RPO) outcomes, allowing you to limit potential data loss to a single WAL file.
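
Some of the figures in the show-backup output above can be cross-checked by hand. The Python sketch below assumes the default 16 MiB WAL segment size and the PostgreSQL 9.3 and later naming scheme, in which a WAL file name is made of three 8-digit hexadecimal fields (timeline, log, segment) with 256 segments per log. The function names are illustrative. Under those assumptions, the begin and end WAL names yield the reported WAL number, and the begin and end XLOG locations yield the reported offsets.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default WAL segment size (16 MiB)
SEGMENTS_PER_LOG = 256           # holds for PostgreSQL 9.3 and later


def segment_number(wal_name: str) -> int:
    """Absolute segment number from a 24-character WAL file name."""
    if len(wal_name) != 24:
        raise ValueError("expected a 24-character WAL file name")
    log = int(wal_name[8:16], 16)
    seg = int(wal_name[16:24], 16)
    return log * SEGMENTS_PER_LOG + seg


def wal_count(begin_wal: str, end_wal: str) -> int:
    """Number of WAL files from begin_wal to end_wal, both included."""
    if begin_wal[:8] != end_wal[:8]:
        raise ValueError("WAL files belong to different timelines")
    return segment_number(end_wal) - segment_number(begin_wal) + 1


def segment_offset(xlog_location: str) -> int:
    """Byte offset inside its WAL segment of a location such as 3588/63005ED0."""
    _, low = xlog_location.split("/")
    return int(low, 16) % SEGMENT_SIZE


if __name__ == "__main__":
    print(wal_count("000000010000358800000063", "00000001000035A0000000A2"))  # 6208
    print(segment_offset("3588/63005ED0"))  # 24272
    print(segment_offset("35A0/A2A961A0"))  # 11100576
```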

Finally, Barman also supports WAL streaming, which dramatically reduces the amount of data you can lose. With synchronous replication and replication slots support, you can achieve “zero data loss” backups. For further information, please refer to Barman's documentation, in particular: streaming_archiver, streaming_archiver_name, streaming_conninfo, and slot_name.

Barman is distributed under GNU GPL 3 terms and is available for download at http://www.pgbarman.org/.

There is also a module for Puppet available at https://github.com/2ndquadrant-it/puppet-barman.

For further and more detailed information, refer to the following:

  • The man barman command, which gives the man page for the Barman application
  • The man 5 barman command, which gives the man page for the configuration file
  • The barman help command, which gives a list of the available commands
  • The official documentation of Barman, publicly available at http://docs.pgbarman.org/
  • The mailing list for community support at http://www.pgbarman.org/support/
