This article aims to discuss PostgreSQL's native logical replication features and how to differentiate between physical and logical replication. Additionally, the article contains steps to setup and maintain logical replication in PostgreSQL.
Physical replication in PostgreSQL will stream every change in primary using WAL and it was introduced in PostgreSQL 9.0. Physical replication have quite a few limitations such as,
-
It can cannot replicate a part of the database or perform selective replication with it.
-
There's no way to write to the standby server.
-
It requires the same major versions for the Primary and Standby.
-
It requires the same platforms for the Primary and Standby, i.e both primary and standby will have same OS.
PostgreSQL 10 implements logical replication, which takes care of the above limitations of physical replication.
Logical Replication:
PostgreSQL supports native logical replication based on publication and subscription model from version 10.
Logical replication is a method of replicating data objects and their changes, based upon their replication identity (usually a primary key).
Logical replication uses a publisher and subscriber model wherein subscribers pull data from the publications they subscribe to.
Logical replication starts by copying a snapshot of initial existing data on the publisher database. When complete, the changes on the publisher (INSERT, UPDATE, and DELETE) are relayed to the subscriber in near-real time.
The subscriber applies data in the same order in which commits were made on the publisher to ensure transactional consistency.
In Logical replication, primary db is referred as publisher database and standby db or target db referred as subscriber database.
Features of logical replication -
- Logical replication replicate a subset such as table.
- Subscriber node can be used for writes.
- Can have different indexes and security definition.
- Logical Replication has cross-version support. Unlike Streaming Replication, Logical Replication can be set between different versions of PostgreSQL (> 10 though).
- Logical Replication can be used for migrations and upgrades.
Limitations of logical replication -
- Tables must have primary key or unique key or need to set replica identity for table
- Does not replicate schema/DDL
- Tables must have the same full qualified name between publication and subscription.
- Subscriptions can have more columns or different order of columns, but the types and column names must match between Publication and Subscription.
First step is to set configuration parameters at publisher side -
- wal_level must be set to logical
wal_level = logical
- max_replication_slots must be set to at least the number of subscriptions expected to connect, plus some reserve for table synchronization
max_replication_slots=10
- max_wal_senders should be set to at least the same as max_replication_slots plus the number of physical replicas that are connected at the same time.
max_wal_senders=10
- Add entry of subscriber in ph_hba.conf of publisher,
host all postgres subscriber_ip/32 md5
-
Restart postgres service on publisher
-
Connect to publisher and create publication using below command,
CREATE PUBLICATION name
[ FOR TABLE [ ONLY ] table_name [ * ] [, ...]
| FOR ALL TABLES ]
[ WITH ( publication_parameter [= value] [, ... ] ) ]
The publication name must be distinct from the name of any existing publication in the current database. A publication is essentially a group of tables whose data changes are intended to be replicated through logical replication Example -
CREATE PUBLICATION mypub FOR TABLE users, departments;
Note - Publication can be created for group of tables or for all the tables in publisher using **FOR ALL TABLES** option.
Version 15 and above have feature to create publication for all tables of particular schema.
-
Connect to the subscriber node,
-
Recreate the tables added in publication on the subscriber node
-
Create subscription using below command,
CREATE SUBSCRIPTION subscription_name
CONNECTION 'conninfo'
PUBLICATION publication_name [, ...]
[ WITH ( subscription_parameter [= value] [, ... ] ) ]
Example -
CREATE SUBSCRIPTION mysub CONNECTION 'dbname=postgres host=publisher_ip user=postgres' PUBLICATION mypub;
Note - Replication slot with Subscription name is get created on publisher after creating subscription on subscriber.
- Query to the check status of subscription on subscriber node,
select * from pg_subscription;
- Queries to the check status of publication on publisher node
select * from pg_publication;
select * from pg_publication_tables;
As the replication slot on publication is get created, need to monitor replication slots by querying pg_replication_slots
table. An inactive replication slot can cause unbounded disk growth due to piled up WAL files in publisher.
When found replication slot active false it means that there is currently not any subscriber connected to the slot.
The subscriber might still need the replication slot,
-
But might be down for maintenance or transient fault.
-
Have temporarily lost its network connection
-
Hit an error when receiving from the slot that causes it to disconnect.
If replication slot is not required by subscriber anymore the we can drop it using pg_drop_replication_slot
select pg_drop_replication_slot(slot_name) from pg_replication_slots where active='f'
Below are some queries for publication and subscription maintenance,
For Publication:
ALTER PUBLICATION PublicationName DROP table TableName;
DROP PUBLICATION PublicationName;
For Subscription:
ALTER SUBSCRIPTION SubscriptionName REFRESH PUBLICATION;
DROP SUBSCRIPTION SubscriptionName;