PgPool and EFM integration for application high availability

Edgar Silva Leitão

This article explains how to configure EFM (EDB Failover Manager) to integrate with PgPool for application high availability. In this architecture, PgPool manages the VIP in Watchdog mode, which requires a minimum of three PgPool nodes for an EDB-supported architecture, while EFM manages streaming replication and runs PCP commands to attach or detach the Postgres backend nodes on each PgPool node.

Requirements

Before configuring the integration between EFM and PgPool, ensure that you have set up the PgPool nodes, installed the PCP commands on each EFM node, and configured Watchdog mode across all PgPool nodes.

An EDB-supported architecture requires a minimum of three EFM nodes and three PgPool nodes, with each EFM and PgPool node located on separate hosts.
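
To confirm that the PCP client commands are available on an EFM node, you can check for them on the PATH. A quick sketch (the install location depends on which PgPool package provides the PCP commands):

# Run on each EFM node; prints the path of each command found, fails if any is missing
command -v pcp_node_count pcp_node_info pcp_attach_node pcp_detach_node pcp_watchdog_info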

PgPool settings

The PgPool nodes must be configured before the EFM nodes. If EFM is configured first, the service will fail to start when you restart it to apply the new PgPool integration settings, because EFM cannot yet execute the PCP commands correctly. In that case, journalctl shows entries such as the following.

Example:

efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/edb/efm-4.9/bin/efm_pgpool_functions watchdog /etc/edb/efm-4.9/testcluster.properties

For this architecture to work, you must disable PgPool's health checks and failover actions on the Postgres backends; these conflict with EFM's own health checks and could cause a backend to be mistakenly detached. Set the following in pgpool.conf:

health_check_period = 0
failover_command = ''
failback_command = ''
failover_on_backend_error = off
failover_on_backend_shutdown = off

Even though health checks are disabled, you can still enable streaming replication checks by configuring the following settings correctly: sr_check_period, sr_check_user, sr_check_password, and sr_check_database.
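
For example (the values are illustrative; use your own monitoring role and credentials):

sr_check_period = 10
sr_check_user = 'sr_check_user'
sr_check_password = 'sr_check_password'
sr_check_database = 'postgres'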

It is also recommended to set search_primary_node_timeout to a lower value, such as 10 seconds (the default is 5 minutes), so that PgPool quickly detects the new primary and resets connections after a failover:

search_primary_node_timeout = 10
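
After editing pgpool.conf, restart (or reload, for parameters that support it) the PgPool service on each node so the changes take effect. A sketch, assuming the EDB PgPool-II 4.5 package's default service name:

systemctl restart edb-pgpool-4.5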

EFM settings

EFM must be able to run PCP commands without a password. Ensure that the .pcppass file exists with 0600 permissions (preferably in the OS user's home directory) and is owned by the OS user that runs the database instance (typically postgres for community PostgreSQL or enterprisedb for EPAS).
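
A minimal sketch of creating this file for an EPAS installation (host, port, user, and password are example values; the format is hostname:port:username:password):

echo '192.168.1.249:9898:pcpuser:pcp_password' > /var/lib/edb/.pcppass
chown enterprisedb:enterprisedb /var/lib/edb/.pcppass
chmod 0600 /var/lib/edb/.pcppass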

Then, configure the following settings:

  • pgpool.enable: Enables the use of PCP commands and PgPool integration;
  • pcp.user: The PCP user configured in the pcp.conf file;
  • pcp.host: The VIP configured in the PgPool setting delegate_ip;
  • pcp.port: The port number configured in the PgPool setting pcp_port;
  • pcp.pass.file: Full path on the EFM host where the .pcppass file is located. This file should be accessible by the OS user postgres or enterprisedb;
  • pgpool.bin: Directory where the PCP commands are located, typically /usr/bin/ (for PCP package installation) or /usr/edb/pgpool4.5/bin/ (for EDB PgPool installation if sharing the same host as EFM).

Example:

pgpool.enable=true
pcp.user=pcpuser
pcp.host=192.168.1.249
pcp.port=9898
pcp.pass.file=/var/lib/edb/.pcppass
pgpool.bin=/usr/bin/
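
After updating the properties file, restart the EFM agent on each node so the new settings take effect. For example, assuming the default service name for EFM 4.9:

systemctl restart edb-efm-4.9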

The list below shows all the PCP commands used by EFM:

  • pcp_node_count
  • pcp_node_info
  • pcp_attach_node
  • pcp_detach_node
  • pcp_watchdog_info
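
Before relying on EFM, you can confirm that these commands run without a password by executing one manually as the database OS user; the -w flag makes the command fail rather than prompt if the .pcppass lookup fails. A sketch using the example values above:

sudo -H -u enterprisedb /usr/bin/pcp_watchdog_info -h 192.168.1.249 -p 9898 -U pcpuser -w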

Once a failover or switchover occurs, the primary node should be correctly reflected in the output of both efm cluster-status and SHOW POOL_NODES;.

Switchover example

1- Check the cluster status in EFM and verify that node 192.168.1.100 is currently the primary.

/usr/edb/efm-4.9/bin/efm cluster-status efm
Cluster Status: efm

	Agent Type  Address              DB       VIP
	----------------------------------------------------------------
	Primary     192.168.1.100        UP        
	Standby     192.168.1.101        UP        
	Standby     192.168.1.102        UP        

Allowed node host list:
	192.168.1.101 192.168.1.100 192.168.1.102

Membership coordinator: 192.168.1.101

Standby priority host list:
	192.168.1.101 192.168.1.102

Promote Status:

	DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info
	---------------------------------------------------------------------------
	Primary     192.168.1.100                           0/60379A0          
	Standby     192.168.1.101        0/60379A0          0/60379A0          
	Standby     192.168.1.102        0/60379A0          0/60379A0          

	Standby database(s) in sync with primary. It is safe to promote.

2- Check the PgPool nodes using the VIP as the host, and confirm that 192.168.1.100 is the primary, as reflected in EFM.

psql -p 9999 -h 192.168.1.249 -U enterprisedb -d postgres -c 'SHOW POOL_NODES;'
 node_id |   hostname    | port | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+---------------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | 192.168.1.100 | 5444 | up     | unknown   | 0.333333  | primary | unknown | 0          | false             | 0                 |                   |                        | 2024-11-13 09:39:13
 1       | 192.168.1.101 | 5444 | up     | unknown   | 0.333333  | standby | unknown | 0          | true              | 0                 |                   |                        | 2024-11-13 09:39:13
 2       | 192.168.1.102 | 5444 | up     | unknown   | 0.333333  | standby | unknown | 0          | false             | 0                 |                   |                        | 2024-11-13 09:39:13
(3 rows)

3- Perform an EFM switchover.

/usr/edb/efm-4.9/bin/efm promote efm -switchover
Promote/switchover command accepted by local agent. Proceeding with promotion and will reconfigure original primary. Run the 'cluster-status' command for information about the new cluster state.

4- After the switchover is complete, check the EFM cluster again to confirm that 192.168.1.101 is now the primary node.

/usr/edb/efm-4.9/bin/efm cluster-status efm
Cluster Status: efm

	Agent Type  Address              DB       VIP
	----------------------------------------------------------------
	Standby     192.168.1.100        UP        
	Primary     192.168.1.101        UP        
	Standby     192.168.1.102        UP        

Allowed node host list:
	192.168.1.101 192.168.1.100 192.168.1.102

Membership coordinator: 192.168.1.101

Standby priority host list:
	192.168.1.100 192.168.1.102

Promote Status:

	DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info
	---------------------------------------------------------------------------
	Primary     192.168.1.101                           0/6037B68          
	Standby     192.168.1.100        0/6037B68          0/6037B68          
	Standby     192.168.1.102        0/6037B68          0/6037B68          

	Standby database(s) in sync with primary. It is safe to promote.

5- Check PgPool again to confirm that it now reflects the new primary in the EFM cluster, 192.168.1.101.

psql -p 9999 -h 192.168.1.249 -U enterprisedb -d postgres -c 'SHOW POOL_NODES;'
 node_id |   hostname    | port | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+---------------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | 192.168.1.100 | 5444 | up     | unknown   | 0.333333  | standby | unknown | 0          | false             | 0                 |                   |                        | 2024-11-13 10:30:16
 1       | 192.168.1.101 | 5444 | up     | unknown   | 0.333333  | primary | unknown | 0          | false             | 0                 |                   |                        | 2024-11-13 10:30:16
 2       | 192.168.1.102 | 5444 | up     | unknown   | 0.333333  | standby | unknown | 0          | true              | 0                 |                   |                        | 2024-11-13 09:39:13
(3 rows)

If all the configurations are correct, all the PgPool nodes in the Watchdog cluster will reflect the same primary node as the EFM cluster.
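
To spot-check this, you can run SHOW POOL_NODES; against each PgPool node directly instead of through the VIP. A sketch with hypothetical PgPool host addresses:

# Each node should report the same primary as efm cluster-status
for h in 192.168.1.201 192.168.1.202 192.168.1.203; do
    psql -p 9999 -h "$h" -U enterprisedb -d postgres -c 'SHOW POOL_NODES;'
done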

Documentation: Failover Manager with EDB PgPool-II
