Patroni is a tool written in Python that manages replication and high availability in PostgreSQL clusters, including support for automated failover and switchover operations. In this article we will explain how to perform the following common maintenance operations with Patroni:
- Perform a switchover
- Recover from a failover
- Restart or reload a PostgreSQL cluster
- Change PostgreSQL configuration
- Disable automatic failover
Finally, we'll bring everything together with a real-world use case: changing the IP address of one node.
For the examples in this article we will be working with a Patroni/etcd cluster with the following components:
- Operating system: Ubuntu Focal
- PostgreSQL: 14
- Patroni:
  - Version: 2.1.4
  - Cluster name: patroni-ts-36
  - Configuration file: /etc/patroni.yml
  - DCS: etcd
- Nodes' initial state:
  - ts-36-patroni-node-1: primary
  - ts-36-patroni-node-2: standby
  - ts-36-patroni-node-3: standby
Many of these example operations are performed using the patronictl utility, which can be run in either interactive or non-interactive mode. In this article we will use the non-interactive mode.
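For example, running a command without the --force flag (here a hypothetical restart of one member) makes patronictl display the proposed action and prompt for confirmation before doing anything; the exact prompts vary by command and Patroni version:
patronictl -c /etc/patroni.yml restart patroni-ts-36 ts-36-patroni-node-2
Answering anything other than y at the confirmation prompt aborts the operation.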
A switchover can be performed through Patroni by using the patronictl switchover command. A switchover is where Patroni promotes one of the replica nodes as the new leader of the cluster. The operation is performed gracefully, with minimal downtime, and brings the former primary back as a standby of the promoted node. The syntax is:
patronictl -c CONFIGURATION_FILE switchover --master CURRENT_PRIMARY --candidate TARGET_STANDBY --scheduled TIMESTAMP --force CLUSTER_NAME
And here are details on those options' arguments:
- CONFIGURATION_FILE: the path to your patroni.yml configuration file
- CURRENT_PRIMARY: the Patroni name of your current PostgreSQL primary node
- TARGET_STANDBY: the Patroni name of the PostgreSQL standby node that is intended to take the primary role
- TIMESTAMP: the time at which you wish the switchover to proceed. Valid values are now if you want an immediate switchover, or an ISO 8601 timestamp if you want to schedule it for later (for example: 2022-11-10T16:54)
- CLUSTER_NAME: the name of the Patroni cluster
The --force flag will cause the switchover to proceed without prompting for confirmation.
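For example, a switchover scheduled for a specific time (the timestamp below is purely illustrative) would look like this:
patronictl -c /etc/patroni.yml switchover --master ts-36-patroni-node-1 --candidate ts-36-patroni-node-2 --scheduled 2022-11-10T16:54 --force patroni-ts-36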
To begin this first example, we'll use patronictl's topology command to view the current state of our cluster:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 2 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 2 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |
We can perform a switchover from node ts-36-patroni-node-1 to node ts-36-patroni-node-2 by running the following command:
patronictl -c /etc/patroni.yml switchover --master ts-36-patroni-node-1 --candidate ts-36-patroni-node-2 --scheduled now --force patroni-ts-36
Here is what the output looks like:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml switchover --master ts-36-patroni-node-1 --candidate ts-36-patroni-node-2 --scheduled now --force patroni-ts-36
Current cluster topology
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 2 | |
| ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 2 | 0 |
| ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |
2022-11-29 14:53:32.47012 Successfully switched over to "ts-36-patroni-node-2"
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Replica | stopped | | unknown |
| ts-36-patroni-node-2 | 172.17.0.3 | Leader | running | 2 | |
| ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |
After some time you will see that the topology of the cluster has changed accordingly:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-2 | 172.17.0.3 | Leader | running | 3 | |
| + ts-36-patroni-node-1 | 172.17.0.2 | Replica | running | 3 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 3 | 0 |
Let's say you have faced a failover promotion in your Patroni cluster due to a network disruption, and you wish to recover from that situation. Patroni makes this easy by automatically rebuilding the failed node as a standby of the newly promoted primary once communication with the failed node has been re-established. By default Patroni does this by taking a fresh base backup of the promoted primary, replacing the data on the failed node (with use_pg_rewind enabled, as in our cluster, Patroni may use pg_rewind instead, which is usually much faster).
We'll begin this example with the following cluster topology:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 4 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 4 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 4 | 0 |
At this point, suppose your Patroni cluster decided to fail over from node ts-36-patroni-node-1 to ts-36-patroni-node-3 because both standby nodes lost communication with the primary node. As our cluster was deployed using Docker containers, we can use the following command to simulate network isolation of the leader node:
docker network disconnect bridge ts-36-patroni-node-1
When checking the topology, we can see that the ts-36-patroni-node-1 node is no longer part of the cluster and that one of the other nodes is now the leader:
postgres@ts-36-patroni-node-2:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
2022-11-29 15:05:15,608 - ERROR - Failed to get list of machines from http://172.17.0.2:2379/v2: MaxRetryError("HTTPConnectionPool(host='172.17.0.2', port=2379): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3ddfcd0580>: Failed to establish a new connection: [Errno 113] No route to host'))")
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-3 | 172.17.0.4 | Leader | running | 5 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 5 | 0 |
Now let's re-establish communication between ts-36-patroni-node-1 and the remaining cluster nodes:
docker network connect bridge ts-36-patroni-node-1
The former primary (ts-36-patroni-node-1) will be automatically rebuilt as a standby of the promoted primary (ts-36-patroni-node-3). That can be confirmed by checking the topology of the cluster again:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-3 | 172.17.0.4 | Leader | running | 5 | |
| + ts-36-patroni-node-1 | 172.17.0.2 | Replica | running | 5 | 0 |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 5 | 0 |
If you ever need to restart the PostgreSQL cluster for any reason, you should avoid restarting the Patroni systemd service. Doing so causes the local Patroni agent to quit, which also stops the local PostgreSQL instance, potentially causing a failover in the cluster if the local PostgreSQL instance is the current leader.
If a restart of any of a Patroni cluster's PostgreSQL servers is needed, the patronictl restart command should be used instead of systemctl restart patroni or direct execution of pg_ctl restart. The syntax is:
patronictl -c CONFIGURATION_FILE restart --force CLUSTER_NAME MEMBER_NAMES
And you need to replace the values as described below:
- CONFIGURATION_FILE: the path to your patroni.yml configuration file
- CLUSTER_NAME: the name of the Patroni cluster
- MEMBER_NAMES: the names of the members to be restarted, separated by spaces
The --force flag will cause the node restart(s) to proceed without prompting for confirmation.
Please note that Patroni does not use the PostgreSQL systemd service. It manages PostgreSQL through pg_ctl, not through systemctl.
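Because of that, it is common practice (a suggestion on our part, not a step taken in this article's environment) to disable the distribution's own PostgreSQL unit so it does not start a competing instance at boot; on Ubuntu that would be:
systemctl disable postgresql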
We'll once again begin with this cluster topology:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |
Let's say you need to restart the current primary node (ts-36-patroni-node-1). You can achieve that by running the following command:
patronictl -c /etc/patroni.yml restart --force patroni-ts-36 ts-36-patroni-node-1
Here is what that command's output would look like:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml restart --force patroni-ts-36 ts-36-patroni-node-1
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |
Success: restart on member ts-36-patroni-node-1
Now, if you check the topology again, you will notice there was no failover in the cluster, i.e., the primary node and the timeline are still the same:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |
Similar to how restarts of PostgreSQL nodes in a Patroni cluster should be managed via Patroni, the cluster's PostgreSQL server configuration is also best managed through Patroni.
Keep in mind that the PostgreSQL configuration file <data directory>/postgresql.conf is managed by Patroni. If you ever attempt to change this configuration file directly on an instance, it will be overwritten by Patroni when the next HA loop timer is triggered or a patronictl reload is issued.
Note: when Patroni is bootstrapped on a new cluster it moves the original postgresql.conf to postgresql.base.conf and adds an include directive for it in the new postgresql.conf. The original configuration will still be loaded by PostgreSQL, but any settings provided through Patroni may override those settings.
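To illustrate, the top of the Patroni-managed postgresql.conf on our nodes looks roughly like the following (the exact header comments vary by Patroni version):
head -n 3 <data directory>/postgresql.conf
# Do not edit this file manually!
# It will be overwritten by Patroni!
include 'postgresql.base.conf'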
The correct way of managing the PostgreSQL configuration is through the patronictl edit-config command. The syntax is:
patronictl -c CONFIGURATION_FILE edit-config --pg PG_CONFIG=CONFIG_VALUE CLUSTER_NAME
And you need to replace the values as described below:
- CONFIGURATION_FILE: the path to your patroni.yml configuration file
- PG_CONFIG: the PostgreSQL configuration parameter that you would like to change
- CONFIG_VALUE: the value for the PostgreSQL configuration parameter
- CLUSTER_NAME: the name of the Patroni cluster
The --force flag will cause the configuration change to proceed without prompting for confirmation.
Note: the patronictl edit-config command requires the less pager to be installed on your Linux server.
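As an aside, if you omit the --pg options entirely, patronictl edit-config opens the full cluster configuration in an editor (it honors the EDITOR environment variable), letting you make several changes interactively at once:
patronictl -c /etc/patroni.yml edit-config patroni-ts-36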
Note: any changes requested through the patronictl edit-config command will be applied to all PostgreSQL instances that are part of the cluster.
Note: the patronictl edit-config command is not able to change all PostgreSQL parameters. Refer to the YAML Configuration Settings section of the Patroni docs for more details.
For this example, let's say we would like to change the values of work_mem and random_page_cost in the PostgreSQL configuration, beginning with these values as seen on ts-36-patroni-node-1:
postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW work_mem"
work_mem
16MB
(1 row)
postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW random_page_cost"
random_page_cost
4
(1 row)
And this is the current patronictl show-config output from Patroni:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml show-config patroni-ts-36
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
use_slots: true
retry_timeout: 10
ttl: 30
We can change the values of work_mem and random_page_cost to 32MB and 1.5, respectively, by using the following command:
patronictl -c /etc/patroni.yml edit-config --pg work_mem=32MB --pg random_page_cost=1.5 --force patroni-ts-36
The output will look like:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml edit-config --pg work_mem=32MB --pg random_page_cost=1.5 --force patroni-ts-36
---
+++
@@ -1,6 +1,9 @@
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
+ parameters:
+ random_page_cost: 1.5
+ work_mem: 32MB
use_pg_rewind: true
use_slots: true
retry_timeout: 10
Configuration changed
At this point the new PostgreSQL configuration has already taken effect. You can confirm that by checking the values in PostgreSQL:
postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW work_mem"
work_mem
32MB
(1 row)
postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW random_page_cost"
random_page_cost
1.5
(1 row)
The same result will be observed on all PostgreSQL instances that are part of the cluster.
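For example, querying one of the standbys should return the new value as well:
postgres@ts-36-patroni-node-2:~ $ psql -c "SHOW work_mem"
work_mem
32MB
(1 row)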
It is worth noting that neither the work_mem nor the random_page_cost change requires a PostgreSQL restart to take effect. If you change the value of a setting that does require a PostgreSQL restart, like archive_mode, then a few more steps are needed.
For example, assuming archive_mode is enabled on the PostgreSQL instances:
postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW archive_mode"
archive_mode
on
(1 row)
You would be able to disable that by running the following command:
patronictl -c /etc/patroni.yml edit-config --pg archive_mode=off --force patroni-ts-36
And its output would look like:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml edit-config --pg archive_mode=off --force patroni-ts-36
---
+++
@@ -2,6 +2,7 @@
maximum_lag_on_failover: 1048576
postgresql:
parameters:
+ archive_mode: false
random_page_cost: 1.5
work_mem: 32MB
use_pg_rewind: true
Configuration changed
Although Patroni reports the configuration as changed, the new value will only take effect once you perform a restart of the PostgreSQL nodes.
By checking the patronictl topology output, we can see it has marked all PostgreSQL nodes as "pending restart":
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB | Pending restart |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | | * |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 | * |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 | * |
We can now use the patronictl restart command, as detailed in a previous section of this article, to restart the PostgreSQL instances:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml restart --force patroni-ts-36 ts-36-patroni-node-1 ts-36-patroni-node-2 ts-36-patroni-node-3
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB | Pending restart |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | | * |
| ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 | * |
| ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 | * |
Success: restart on member ts-36-patroni-node-3
Success: restart on member ts-36-patroni-node-1
Success: restart on member ts-36-patroni-node-2
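As an alternative to naming every member explicitly, patronictl restart also accepts a --pending flag, which restarts only the members marked as pending restart; in our case the following command should be equivalent:
patronictl -c /etc/patroni.yml restart --pending --force patroni-ts-36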
And now the "pending restart" flag has been cleared from the nodes in the patronictl topology output:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |
And we can see that the changes are now live on the PostgreSQL servers, for example:
postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW archive_mode"
archive_mode
off
(1 row)
If, for any reason, you need to temporarily disable automatic failover in the cluster, you can use the patronictl pause command. The syntax is:
patronictl -c CONFIGURATION_FILE pause --wait CLUSTER_NAME
And you need to replace the values as described below:
- CONFIGURATION_FILE: the path to your patroni.yml configuration file
- CLUSTER_NAME: the name of the Patroni cluster
When this command completes, Patroni will no longer monitor the health of the cluster, so any disruption to the PostgreSQL cluster will not result in an automatic failover.
Note: the patronictl pause command also changes the behavior of systemctl stop patroni. As we previously covered, stopping the Patroni systemd service on a node will, by default, also shut down the local PostgreSQL instance. Pausing a Patroni cluster with the patronictl pause command is also known as placing it in "Maintenance Mode", and while the cluster is in Maintenance Mode, stopping a node's Patroni systemd service will not shut down that node's local PostgreSQL server.
As an example, you can pause monitoring by using the following command:
patronictl -c /etc/patroni.yml pause --wait patroni-ts-36
The output would look like this:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml pause --wait patroni-ts-36
'pause' request sent, waiting until it is recognized by all nodes
Success: cluster management is paused
If you check the output of the patronictl topology command, you will see that the cluster is in "Maintenance mode":
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |
Maintenance mode: on
This essentially means that Patroni's monitoring and management of the cluster's nodes, including automatic failover, is disabled.
At this point you can perform any operations on the PostgreSQL cluster, and no failover will take place.
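For instance, a simple way to verify the behavior described in the note above (preferably on a standby node) is to stop Patroni while the cluster is paused and confirm that PostgreSQL keeps serving connections:
systemctl stop patroni
pg_isready
systemctl start patroni
With the cluster paused, pg_isready should keep reporting "accepting connections" even while Patroni is stopped.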
When you want to resume monitoring of the Patroni cluster, you can use the patronictl resume command. The syntax is similar to the one used for patronictl pause:
patronictl -c /etc/patroni.yml resume --wait patroni-ts-36
The output would look like this:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml resume --wait patroni-ts-36
'resume' request sent, waiting until it is recognized by all nodes
Success: cluster management is resumed
This section contains a more complete example you may encounter in the real world: suppose you need to move a virtual machine from one host or data center to another, causing its IP address to change.
Care needs to be taken when performing this maintenance, as the underlying Patroni, PostgreSQL, and etcd configurations (and possibly more) all require changes.
In this section you will find detailed steps to perform the aforementioned maintenance. Please note that we will simulate a virtual machine move in our Docker environment by changing the container's IP address. The Docker containers being used are attached to the bridge network, which does not support assigning static IPs, but instead hands out an incremental IP to each connected container. So we will use the following approach to "move" the container between data centers:
- Disconnect the node ts-36-patroni-node-1 from the bridge network. This will make the current IP (172.17.0.2 in the example) available for use
- Start a random Docker container, so it consumes the IP that was being used by ts-36-patroni-node-1
- Reconnect the node ts-36-patroni-node-1 to the bridge network. This will make it use the next available IP (172.17.0.5 in the example)
Let's now go through the actions needed to perform the maintenance work described above:
1- On any node check the current cluster topology:
patronictl -c /etc/patroni.yml topology patroni-ts-36
2- On any node put Patroni in Maintenance Mode with patronictl pause:
patronictl -c /etc/patroni.yml pause --wait patroni-ts-36
3- On each node stop the Patroni systemd service. Recall that since our cluster is now in Maintenance Mode this will not stop the PostgreSQL servers:
systemctl stop patroni
4- On the Docker host disconnect container ts-36-patroni-node-1 from the bridge network:
docker network disconnect bridge ts-36-patroni-node-1
5- On the Docker host start a new random container, so it gets the IP that was released by ts-36-patroni-node-1:
docker run -itd --name random ubuntu:focal
6- On the Docker host connect container ts-36-patroni-node-1 back to the bridge network:
docker network connect bridge ts-36-patroni-node-1
You can check which IP address was assigned by checking the output of the following command:
ip addr show
In our case, it was assigned 172.17.0.5:
root@ts-36-patroni-node-1:~# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
3: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1000
link/tunnel6 :: brd ::
241: eth1@if242: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:ac:11:00:05 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 172.17.0.5/16 brd 172.17.255.255 scope global eth1
valid_lft forever preferred_lft forever
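If you just want the address itself, the brief output mode of ip works well (eth1 is the interface name in our container and is an assumption for other environments):
ip -br -4 addr show eth1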
7- Update the /etc/hosts configuration on each node:
sed 's@172\.17\.0\.2@172\.17\.0\.5@g' /etc/hosts > /root/hosts
cat /root/hosts > /etc/hosts
rm /root/hosts
Note: we cannot use sed -i on /etc/hosts here because Docker bind-mounts that file into the container, and sed -i tries to replace it with a new file, which fails. So we used the above workaround instead.
8- On all nodes update the local Patroni configuration:
sed -i 's@172\.17\.0\.2@172\.17\.0\.5@g' /etc/patroni.yml
9- On all nodes do the same for the local etcd configuration:
sed -i 's@172\.17\.0\.2@172\.17\.0\.5@g' /etc/default/etcd
10- On node ts-36-patroni-node-1 update the etcd configuration, so it will attempt to rejoin the existing cluster instead of spawning a new one:
sed -i 's@ETCD_INITIAL_CLUSTER_STATE="new"@ETCD_INITIAL_CLUSTER_STATE="existing"@g' /etc/default/etcd
11- Stop etcd on node ts-36-patroni-node-1:
systemctl stop etcd
12- On any other node update etcd's member information for ts-36-patroni-node-1 with etcdctl member update:
etcdctl member update $(etcdctl member list | grep ts-36-patroni-node-1 | cut -d',' -f1) --peer-urls="http://172.17.0.5:2380"
Note: the etcdctl member list subcommand in the above command is used to get the ts-36-patroni-node-1 member ID from the etcd cluster.
Output should look similar to:
root@ts-36-patroni-node-2:~# etcdctl member update $(etcdctl member list | grep ts-36-patroni-node-1 | cut -d',' -f1) --peer-urls="http://172.17.0.5:2380"
Member 53e2188e2c657738 updated in cluster c0dc6b1849aa7dd2
13- Start etcd again on node ts-36-patroni-node-1, allowing it to join the etcd cluster with its new node IP:
systemctl start etcd
14- On all nodes check the etcd member list:
etcdctl member list
The output should look like this:
root@ts-36-patroni-node-1:~# etcdctl member list
53e2188e2c657738, started, ts-36-patroni-node-1, http://172.17.0.5:2380, http://172.17.0.5:2379
c64e6395fafdc498, started, ts-36-patroni-node-2, http://172.17.0.3:2380, http://172.17.0.3:2379
d80fc6730c50b3cd, started, ts-36-patroni-node-3, http://172.17.0.4:2380, http://172.17.0.4:2379
Note that node ts-36-patroni-node-1 is now showing its new IP address.
15- On all nodes start the Patroni service:
systemctl start patroni
16- On any node check that the new IP is reported in the cluster topology:
patronictl -c /etc/patroni.yml topology patroni-ts-36
The output should look similar to:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7172574149615387955)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.5 | Leader | running | 2 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 2 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |
Maintenance mode: on
17- On any node run patronictl resume to disable Maintenance Mode and resume automatic failover functionality:
patronictl -c /etc/patroni.yml resume --wait patroni-ts-36
Output should look like:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml resume --wait patroni-ts-36
'resume' request sent, waiting until it is recognized by all nodes
Success: cluster management is resumed
18- On any node check the topology again to confirm the cluster is no longer in Maintenance Mode:
patronictl -c /etc/patroni.yml topology patroni-ts-36
Output should look like:
postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7172574149615387955)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.5 | Leader | running | 2 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 2 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |
At this point the node relocation is complete (simulated in our example using Docker containers). We've now finished the maintenance needed to update Patroni and etcd with ts-36-patroni-node-1's new IP address.