How to perform maintenance with Patroni

Israel Barth

Patroni is a tool written in Python that is used for managing replication and high availability in PostgreSQL clusters, including support for automated failover and switchover operations.

In this article we will explain how to perform the following common maintenance operations with Patroni:

  • Perform a switchover
  • Recover from a failover
  • Restart or reload a PostgreSQL cluster
  • Change PostgreSQL configuration
  • Disable automatic failover

Finally, we'll bring everything together with a real-world use case: changing the IP address of one node.

Introduction

For the examples in this article we will be working with a Patroni/etcd cluster with the following components:

  • Operating System: Ubuntu Focal
  • PostgreSQL: 14
  • Patroni:
    • Version: 2.1.4
    • Cluster name: patroni-ts-36
    • Configuration file: /etc/patroni.yml
  • DCS: etcd
  • Nodes initial state:
    • ts-36-patroni-node-1: primary
    • ts-36-patroni-node-2: standby
    • ts-36-patroni-node-3: standby

Many of these example operations are performed using the patronictl utility, which can be run in either interactive or non-interactive mode. In this article we will use the non-interactive mode.
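
For example, most patronictl commands prompt for confirmation when run without the --force flag (interactive mode), while adding --force answers those prompts automatically (non-interactive mode). A minimal sketch using the restart command that is covered later in this article:

# Interactive: patronictl asks for confirmation before restarting the member
patronictl -c /etc/patroni.yml restart patroni-ts-36 ts-36-patroni-node-2

# Non-interactive: --force skips the confirmation prompts
patronictl -c /etc/patroni.yml restart --force patroni-ts-36 ts-36-patroni-node-2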

Perform a switchover

A switchover can be performed through Patroni by using the patronictl switchover command. In a switchover, Patroni promotes one of the Replica nodes as the new Leader of the cluster. The operation is performed gracefully, with minimal downtime, and brings the former primary back as a standby of the promoted node. The syntax is:

patronictl -c CONFIGURATION_FILE switchover --master CURRENT_PRIMARY --candidate TARGET_STANDBY --scheduled TIMESTAMP --force CLUSTER_NAME

And here are details on those options' arguments:

  • CONFIGURATION_FILE: the path to your patroni.yml configuration file
  • CURRENT_PRIMARY: the Patroni name of your current PostgreSQL primary node
  • TARGET_STANDBY: the Patroni name of the PostgreSQL standby node that is intended to take the primary role
  • TIMESTAMP: the time at which you wish the switchover to proceed. Valid values are now if you want an immediate switchover or an ISO 8601 timestamp if you want to schedule it for later (for example: 2022-11-10T16:54)
  • CLUSTER_NAME: the name of the Patroni cluster

The --force flag will cause the switchover to proceed without prompting for confirmation.
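
For illustration, a switchover scheduled for a later time would look like the following command (the timestamp is only an example):

patronictl -c /etc/patroni.yml switchover --master ts-36-patroni-node-1 --candidate ts-36-patroni-node-2 --scheduled 2022-11-10T16:54 --force patroni-ts-36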

To begin this first example, we'll use patronictl's topology command to view the current state of our cluster:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 2 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 2 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |

We can perform a switchover from node ts-36-patroni-node-1 to node ts-36-patroni-node-2 by running the following command:

patronictl -c /etc/patroni.yml switchover --master ts-36-patroni-node-1 --candidate ts-36-patroni-node-2 --scheduled now --force patroni-ts-36

As an example, this is how the output looks:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml switchover --master ts-36-patroni-node-1 --candidate ts-36-patroni-node-2 --scheduled now --force patroni-ts-36
Current cluster topology
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 2 | |
| ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 2 | 0 |
| ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |
2022-11-29 14:53:32.47012 Successfully switched over to "ts-36-patroni-node-2"
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Replica | stopped | | unknown |
| ts-36-patroni-node-2 | 172.17.0.3 | Leader | running | 2 | |
| ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |

After some time you will see that the topology of the cluster has changed accordingly:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-2 | 172.17.0.3 | Leader | running | 3 | |
| + ts-36-patroni-node-1 | 172.17.0.2 | Replica | running | 3 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 3 | 0 |

Recover from a failover

Let's say you have faced a failover promotion in your Patroni cluster due to a network disruption, and you wish to recover from that situation. Patroni makes this easy by automatically rebuilding the failed node as a standby of the newly promoted primary once communication with the failed node has been re-established. By default Patroni does this by running a fresh base backup of the promoted primary, replacing the data on the failed node.

We'll begin this example with the following cluster topology:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 4 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 4 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 4 | 0 |

At this point suppose your Patroni cluster decided to fail over from node ts-36-patroni-node-1 to ts-36-patroni-node-3 because both standby nodes lost communication with the primary node. As our cluster was deployed using Docker containers, we can use the following command to simulate network isolation of the leader node:

docker network disconnect bridge ts-36-patroni-node-1

When checking the topology we can see that the ts-36-patroni-node-1 node is no longer part of the cluster and that one of the other nodes is now leader:

postgres@ts-36-patroni-node-2:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
2022-11-29 15:05:15,608 - ERROR - Failed to get list of machines from http://172.17.0.2:2379/v2: MaxRetryError("HTTPConnectionPool(host='172.17.0.2', port=2379): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3ddfcd0580>: Failed to establish a new connection: [Errno 113] No route to host'))")
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-3 | 172.17.0.4 | Leader | running | 5 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 5 | 0 |

Now let's re-establish communication between ts-36-patroni-node-1 and the remaining cluster nodes:

docker network connect bridge ts-36-patroni-node-1

The former primary (ts-36-patroni-node-1) will be automatically rebuilt as a standby of the promoted primary (ts-36-patroni-node-3). That can be confirmed by checking the topology of the cluster again:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-3 | 172.17.0.4 | Leader | running | 5 | |
| + ts-36-patroni-node-1 | 172.17.0.2 | Replica | running | 5 | 0 |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 5 | 0 |
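
If Patroni does not manage to bring the failed node back on its own (for example, because its timeline diverged too much to be reconciled), you can force a rebuild of that member from the current leader with the patronictl reinit command. A minimal sketch based on our example cluster:

patronictl -c /etc/patroni.yml reinit --force patroni-ts-36 ts-36-patroni-node-1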

Restart or reload a PostgreSQL cluster

If you ever need to restart the PostgreSQL cluster for any reason you should avoid restarting the Patroni systemd service. Doing so causes the local Patroni agent to quit, which would also stop the local PostgreSQL instance, potentially causing a failover in the cluster if the local PostgreSQL instance is the current leader.

If a restart of any of a Patroni cluster's PostgreSQL servers is needed, the patronictl restart command should be used instead of systemctl restart patroni or direct execution of pg_ctl restart. The syntax is:

patronictl -c CONFIGURATION_FILE restart --force CLUSTER_NAME MEMBER_NAMES

And you need to replace the values as described below:

  • CONFIGURATION_FILE: the path to your patroni.yml configuration file
  • CLUSTER_NAME: the name of the Patroni cluster
  • MEMBER_NAMES: list of node names of the members to be restarted

The --force flag will cause the node restart(s) to proceed without prompting for confirmation.

Please note that Patroni does not use the PostgreSQL systemd service. It manages PostgreSQL through pg_ctl, not through systemctl.
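
Because Patroni is responsible for starting and stopping PostgreSQL, it is usually a good idea to make sure the operating system's own PostgreSQL service does not start the instance automatically at boot. A minimal sketch, assuming the Ubuntu packaging where the unit is called postgresql:

systemctl disable postgresql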

We'll once again begin with this cluster topology:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |

Let's say you need to restart the current primary node (ts-36-patroni-node-1). You can achieve that by running the following command:

patronictl -c /etc/patroni.yml restart --force patroni-ts-36 ts-36-patroni-node-1

Here is what that command's output would look like:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml restart --force patroni-ts-36 ts-36-patroni-node-1
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |
Success: restart on member ts-36-patroni-node-1

Now, if you check the topology again, you will notice there was no failover in the cluster, i.e. the primary node and the timeline are still the same:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |
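
Besides restarts, Patroni also offers a patronictl reload command for cases where a configuration reload is enough: it asks the selected members to re-read patroni.yml and signals PostgreSQL to reload its configuration, using a syntax very similar to patronictl restart. A minimal sketch based on our example cluster:

patronictl -c /etc/patroni.yml reload --force patroni-ts-36 ts-36-patroni-node-1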

Similar to how restarts of PostgreSQL nodes in a Patroni cluster should be managed via Patroni, the cluster's PostgreSQL server configurations are also best managed by Patroni.

Change PostgreSQL configuration

We need to keep in mind that the PostgreSQL configuration file <data directory>/postgresql.conf is managed by Patroni. If you ever change this configuration file directly on an instance, it will be overwritten by Patroni when the next HA loop timer is triggered or when a patronictl reload is issued.

Note: When Patroni bootstraps a new cluster it moves the original postgresql.conf to postgresql.base.conf, and adds an include for it in the new postgresql.conf. The original configuration is still loaded by PostgreSQL, but any settings provided to Patroni may override those settings.
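
For reference, a Patroni-generated postgresql.conf roughly resembles the sketch below; the exact parameters and values will differ in your environment, and the file should never be edited by hand:

# Do not edit this file manually!
# It will be overwritten by Patroni!
include 'postgresql.base.conf'

cluster_name = 'patroni-ts-36'
listen_addresses = '172.17.0.2'
port = '5432'
hot_standby = 'on'
wal_level = 'replica'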

The correct way of managing the PostgreSQL configuration is through the patronictl edit-config command. The syntax is:

patronictl -c CONFIGURATION_FILE edit-config --pg PG_CONFIG=CONFIG_VALUE --force CLUSTER_NAME

And you need to replace the values as described below:

  • CONFIGURATION_FILE: the path to your patroni.yml configuration file
  • PG_CONFIG: the PostgreSQL configuration parameter that you would like to change
  • CONFIG_VALUE: the value for the PostgreSQL configuration parameter
  • CLUSTER_NAME: the name of the Patroni cluster

The --force flag will cause the configuration change to proceed without prompting for confirmation.

Note: the patronictl edit-config command requires that the less command has previously been installed on your Linux server.

Note: any changes requested through the patronictl edit-config command will be applied to all PostgreSQL instances that are part of the cluster.

Note: the patronictl edit-config command is not able to change all PostgreSQL parameters. Refer to the YAML Configuration Settings section of the Patroni documentation for more details.
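
Note that patronictl edit-config can also change Patroni's own dynamic settings (such as loop_wait or ttl) through its -s/--set option, not only PostgreSQL parameters. A minimal sketch, with a value chosen purely for illustration:

patronictl -c /etc/patroni.yml edit-config -s ttl=60 --force patroni-ts-36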

For this example let's say we would like to change the values of work_mem and random_page_cost in the PostgreSQL configuration, beginning with these values as seen on ts-36-patroni-node-1:

postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW work_mem"
work_mem
16MB
(1 row)

postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW random_page_cost"
random_page_cost
4
(1 row)

And assume this is the current patronictl show-config output from Patroni:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml show-config patroni-ts-36
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30

We can change the values of work_mem and random_page_cost to 32MB and 1.5, respectively, by using the following command:

patronictl -c /etc/patroni.yml edit-config --pg work_mem=32MB --pg random_page_cost=1.5 --force patroni-ts-36

The output will look like:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml edit-config --pg work_mem=32MB --pg random_page_cost=1.5 --force patroni-ts-36
---
+++
@@ -1,6 +1,9 @@
 loop_wait: 10
 maximum_lag_on_failover: 1048576
 postgresql:
+  parameters:
+    random_page_cost: 1.5
+    work_mem: 32MB
   use_pg_rewind: true
   use_slots: true
 retry_timeout: 10

Configuration changed

At this point the new PostgreSQL configuration has already taken effect. You can confirm that by checking the values in PostgreSQL:

postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW work_mem"
work_mem
32MB
(1 row)

postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW random_page_cost"
random_page_cost
1.5
(1 row)

The same result will be observed in all PostgreSQL instances that are part of this cluster.

It is worth noting that neither the work_mem nor the random_page_cost change requires a PostgreSQL restart to take effect. If you happen to change the value of a setting that requires a PostgreSQL restart, like archive_mode, then you would need to apply more steps.
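
If you are unsure whether a given parameter needs a restart, you can check its context in the pg_settings view; parameters with the postmaster context only take effect after a restart. For example:

psql -c "SELECT name, context FROM pg_settings WHERE name IN ('work_mem', 'random_page_cost', 'archive_mode')"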

For example, assuming archive_mode is enabled in the PostgreSQL instances:

postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW archive_mode"
archive_mode
on
(1 row)

You would be able to disable that by running the following command:

patronictl -c /etc/patroni.yml edit-config --pg archive_mode=off --force patroni-ts-36

And its output would look like:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml edit-config --pg archive_mode=off --force patroni-ts-36
---
+++
@@ -2,6 +2,7 @@
 maximum_lag_on_failover: 1048576
 postgresql:
   parameters:
+    archive_mode: false
     random_page_cost: 1.5
     work_mem: 32MB
   use_pg_rewind: true

Configuration changed

Although Patroni reports that the configuration has been changed, the new setting will only take effect once you perform a restart of the PostgreSQL nodes.

By checking the patronictl topology output, we can see that all PostgreSQL nodes are now marked as "pending restart":

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB | Pending restart |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | | * |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 | * |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 | * |

We can now use the patronictl restart command, as detailed in a previous section of this article, to restart the PostgreSQL instances:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml restart --force patroni-ts-36 ts-36-patroni-node-1 ts-36-patroni-node-2 ts-36-patroni-node-3
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB | Pending restart |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | | * |
| ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 | * |
| ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 | * |
Success: restart on member ts-36-patroni-node-3
Success: restart on member ts-36-patroni-node-1
Success: restart on member ts-36-patroni-node-2

And now the "pending restart" flag has been cleared from the nodes in patronictl topology output:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |

And we can see that the change is now live on the PostgreSQL servers:

postgres@ts-36-patroni-node-1:~ $ psql -c "SHOW archive_mode"
archive_mode
off
(1 row)

Disable automatic failover

If, for any reason, you need to temporarily disable automatic failover in the cluster you can use the patronictl pause command. The syntax is:

patronictl -c CONFIGURATION_FILE pause --wait CLUSTER_NAME

And you need to replace the values as described below:

  • CONFIGURATION_FILE: the path to your patroni.yml configuration file
  • CLUSTER_NAME: the name of the Patroni cluster

When this command completes, Patroni will no longer monitor the health of the cluster, so any disruption to the PostgreSQL cluster will not result in an automatic failover.

Note: The patronictl pause command also changes the behavior of systemctl stop patroni. As we previously covered, stopping the Patroni systemd service on a node will, by default, also shut down the local PostgreSQL instance. Pausing a Patroni cluster with the patronictl pause command is also known as placing it in "Maintenance Mode", and while the cluster is in Maintenance Mode stopping a node's Patroni systemd service will not shut down the node's local PostgreSQL server.

As an example, you can pause monitoring by using the following command:

patronictl -c /etc/patroni.yml pause --wait patroni-ts-36

The output would look like this:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml pause --wait patroni-ts-36
'pause' request sent, waiting until it is recognized by all nodes
Success: cluster management is paused

If you check the output of the patronictl topology command, you will see that the cluster is in "Maintenance mode":

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7171447746833270061)
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.2 | Leader | running | 6 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 6 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 6 | 0 |
Maintenance mode: on

This essentially means that Patroni's monitoring and management of the cluster's nodes, including automatic failover, is disabled.

At this point you can perform any operations in the PostgreSQL cluster, and no failover will take place.
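
If you want to confirm from a script that a given node knows the cluster is paused, you can also query Patroni's REST API (here we assume it is listening on its default port, 8008); while the cluster is in Maintenance Mode the returned JSON should include a pause flag set to true:

curl -s http://172.17.0.2:8008/patroni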

When you want to resume monitoring of the Patroni cluster, you can use the patronictl resume command. The syntax is similar to the one used for patronictl pause:

patronictl -c /etc/patroni.yml resume --wait patroni-ts-36

The output would look like this:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml resume --wait patroni-ts-36
'resume' request sent, waiting until it is recognized by all nodes
Success: cluster management is resumed

Change the IP of one node

This section contains a more complete example you may encounter in the real world: suppose you need to move a virtual machine from one host or data center to another, causing its IP address to change.

Care needs to be taken when performing this maintenance, as the underlying Patroni, PostgreSQL, and etcd configuration (and possibly more) all require changes.

In this section you will find detailed steps to perform the aforementioned maintenance. Please note that we will simulate moving a virtual machine in our Docker environment by changing the container's IP address. The Docker containers being used are attached to the bridge network, which does not support assigning static IPs and instead assigns the next available IP to each connected container. So we will use the following approach to "move" the container between data centers:

  1. Disconnect the node ts-36-patroni-node-1 from the bridge network. This will make the current IP (172.17.0.2 in the example) available for use
  2. Start a random Docker container, so it consumes the IP that was being used by ts-36-patroni-node-1
  3. Reconnect the node ts-36-patroni-node-1 to the bridge network. This will make it use the next available IP (172.17.0.5 in the example)

Let's now go through the actions needed to perform the maintenance work described above:

1- On any node check the current cluster topology:

patronictl -c /etc/patroni.yml topology patroni-ts-36

2- On any node put Patroni in Maintenance Mode with patronictl pause:

patronictl -c /etc/patroni.yml pause --wait patroni-ts-36

3- On each node we now stop the Patroni systemd service. Recall that since our cluster is now in Maintenance Mode this will not stop the PostgreSQL servers:

systemctl stop patroni

4- On the Docker host disconnect container ts-36-patroni-node-1 from the bridge network:

docker network disconnect bridge ts-36-patroni-node-1

5- On the Docker host start a new random container, so it gets the IP that was released by ts-36-patroni-node-1:

docker run -itd --name random ubuntu:focal

6- On the Docker host connect container ts-36-patroni-node-1 back to the bridge network:

docker network connect bridge ts-36-patroni-node-1

You can find out which IP address was assigned by checking the output of the following command:

ip addr show

In our case, it was assigned 172.17.0.5:

root@ts-36-patroni-node-1:~# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
3: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1000
link/tunnel6 :: brd ::
241: eth1@if242: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:ac:11:00:05 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 172.17.0.5/16 brd 172.17.255.255 scope global eth1
valid_lft forever preferred_lft forever

7- Update the /etc/hosts configuration on each node:

sed 's@172\.17\.0\.2@172\.17\.0\.5@g' /etc/hosts > /root/hosts
cat /root/hosts > /etc/hosts
rm /root/hosts

Note: We cannot use sed -i on /etc/hosts because Docker bind-mounts that file inside the container, so we used the workaround above.

8- On all nodes update the local Patroni configuration:

sed -i 's@172\.17\.0\.2@172\.17\.0\.5@g' /etc/patroni.yml

9- On all nodes do the same for the local etcd configuration:

sed -i 's@172\.17\.0\.2@172\.17\.0\.5@g' /etc/default/etcd

10- On node ts-36-patroni-node-1 update the etcd configuration, so it will attempt to rejoin the existing cluster instead of spawning a new one:

sed -i 's@ETCD_INITIAL_CLUSTER_STATE=\"new\"@ETCD_INITIAL_CLUSTER_STATE=\"existing\"@g' /etc/default/etcd

11- Stop etcd on node ts-36-patroni-node-1:

systemctl stop etcd

12- Update etcd's member information for ts-36-patroni-node-1 on any other node with etcdctl member update:

etcdctl member update $(etcdctl member list | grep ts-36-patroni-node-1 | cut -d',' -f1) --peer-urls="http://172.17.0.5:2380"

Note: The etcdctl member list subcommand in the above command is used to get the ts-36-patroni-node-1 member ID from the etcd cluster.

Output should look similar to:

root@ts-36-patroni-node-2:~# etcdctl member update $(etcdctl member list | grep ts-36-patroni-node-1 | cut -d',' -f1) --peer-urls="http://172.17.0.5:2380"
Member 53e2188e2c657738 updated in cluster c0dc6b1849aa7dd2

13- Start etcd again on node ts-36-patroni-node-1, allowing it to join the etcd cluster with its new node IP:

systemctl start etcd

14- On all nodes check the etcd member list:

etcdctl member list

The output should look like this:

root@ts-36-patroni-node-1:~# etcdctl member list
53e2188e2c657738, started, ts-36-patroni-node-1, http://172.17.0.5:2380, http://172.17.0.5:2379
c64e6395fafdc498, started, ts-36-patroni-node-2, http://172.17.0.3:2380, http://172.17.0.3:2379
d80fc6730c50b3cd, started, ts-36-patroni-node-3, http://172.17.0.4:2380, http://172.17.0.4:2379

Note that node ts-36-patroni-node-1 is now showing its new IP address.

15- On all nodes start the Patroni service:

systemctl start patroni

16- On any node check that it reports the new IP in the topology:

patronictl -c /etc/patroni.yml topology patroni-ts-36

The output should look similar to:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7172574149615387955) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.5 | Leader | running | 2 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 2 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |
Maintenance mode: on

17- On any node run patronictl resume to disable Maintenance Mode and resume automatic failover functionality:

patronictl -c /etc/patroni.yml resume --wait patroni-ts-36

Output should look like:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml resume --wait patroni-ts-36
'resume' request sent, waiting until it is recognized by all nodes
Success: cluster management is resumed

18- On any node check the topology again to confirm the cluster is no longer in maintenance mode:

patronictl -c /etc/patroni.yml topology patroni-ts-36

Output should look like:

postgres@ts-36-patroni-node-1:~ $ patronictl -c /etc/patroni.yml topology patroni-ts-36
+ Cluster: patroni-ts-36 (7172574149615387955) 
| Member | Host | Role | State | TL | Lag in MB |
| ts-36-patroni-node-1 | 172.17.0.5 | Leader | running | 2 | |
| + ts-36-patroni-node-2 | 172.17.0.3 | Replica | running | 2 | 0 |
| + ts-36-patroni-node-3 | 172.17.0.4 | Replica | running | 2 | 0 |

At this point the node relocation is complete; in our example we used Docker containers to simulate it.

We've now completed the maintenance needed to update Patroni and etcd with ts-36-patroni-node-1's new IP address.
