Partitioning refers to splitting what is logically one large table into smaller physical pieces.
Partitioning in Postgres is present starting from version 8.1. The table inheritance feature in PostgreSQL allows the creation of child tables with the same structure as a parent table. With the right set of triggers on the parent table, practical 'table partitioning' is provided.
Postgres supports the following forms of partitioning:
Range Partitioning :
The table is partitioned into “ranges” defined by a key column or set of columns with no overlap between the ranges of values assigned to different partitions. For example, one might partition by date ranges or by ranges of identifiers for particular business objects.
List Partitioning :
The table is partitioned by explicitly listing which key values appear in each partition.
Hash Partitioning (From Postgres 11):
The table is partitioned by specifying a modulus and a remainder for each partition. Each partition will hold the rows for which the hash value of the partition key divided by the specified modulus will produce the specified remainder.
Why to partition a table:
Query performance can be dramatically improved in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily used parts of the indexes fit in memory.
When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index and random access reads scattered across the whole table.
Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. Executing ALTER TABLE DETACH PARTITION or dropping an individual partition using DROP TABLE is far faster than a bulk operation. These commands also entirely avoid the VACUUM overhead caused by a bulk DELETE.
Seldom-used data can be migrated to cheaper and slower storage media.
As said trigger-based partitioning implemented on Postgres table inheritance is still working. Below is the example of inheritance and trigger-based partitioning
CREATE TABLE temperature (
id BIGSERIAL PRIMARY KEY NOT NULL,
city_id INT NOT NULL,
timestamp TIMESTAMP NOT NULL,
temp DECIMAL(5,2) NOT NULL
);
CREATE TABLE temperature_201901 (CHECK (timestamp >= DATE '2019-01-01' AND timestamp <= DATE '2019-01-31')) INHERITS (temperature);
CREATE TABLE temperature_201902 (CHECK (timestamp >= DATE '2019-02-01' AND timestamp <= DATE '2019-02-28')) INHERITS (temperature);
CREATE TABLE temperature_201903 (CHECK (timestamp >= DATE '2019-03-01' AND timestamp <= DATE '2019-03-31')) INHERITS (temperature);
CREATE OR REPLACE FUNCTION temperature_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
IF ( NEW.timestamp >= DATE '2019-01-01' AND NEW.timestamp <= DATE '2019-01-31' ) THEN INSERT INTO temperature_201901 VALUES (NEW.*);
ELSIF ( NEW.timestamp >= DATE '2019-02-01' AND NEW.timestamp <= DATE '2019-02-28' ) THEN INSERT INTO temperature_201902 VALUES (NEW.*);
ELSIF ( NEW.timestamp >= DATE '2019-03-01' AND NEW.timestamp <= DATE '2019-03-31' ) THEN INSERT INTO temperature_201903 VALUES (NEW.*);
ELSE RAISE EXCEPTION 'Date out of range!';
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER insert_temperature_trigger
BEFORE INSERT ON temperature
FOR EACH ROW EXECUTE PROCEDURE temperature_insert_trigger();
Drawbacks of trigger-based partitioning are manual work and lack of optimization.
New feature of declarative partitioning is introduced in Postgres 10, where you no longer need to write code manually to route tuples to their correct partitions and no longer need to manually declare correct constraints for each partition. The system did those things automatically. PostgreSQL 10 supports the range and list partitioning methods.
Consider above table using declarative partitioning,
CREATE TABLE temperature (
id BIGSERIAL NOT NULL,
city_id INT NOT NULL,
timestamp TIMESTAMP NOT NULL,
temp DECIMAL(5,2) NOT NULL
) PARTITION BY RANGE (timestamp);
CREATE TABLE temperature_201901 PARTITION OF temperature FOR VALUES FROM ('2019-01-01') TO ('2019-02-01');
CREATE TABLE temperature_201902 PARTITION OF temperature FOR VALUES FROM ('2019-02-01') TO ('2019-03-01');
CREATE TABLE temperature_201903 PARTITION OF temperature FOR VALUES FROM ('2019-03-01') TO ('2019-04-01');
There are few limitations with Postgres version 10 partitioning,
-
Lack of optimization. No pruning of partitions result in poor performance.
-
No default partitioning - if a row does not fit in any child table an error is raised.
-
No row movement across partitions when doing an update.
-
No Index, primary key, and unique key constraints.
-
No support for hash partitioning.
Partitioning Features in Postgres version 11-
To overcome the some limitations of partitioning in version 10, new features were added to Postgres version 11,
- Possibility to define a default partition, to which any entry that would not fit a corresponding partition would be added.
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
-
Indexes added to the main table 'replicated' to the underlying partitions, which improved declarative partitioning usability.
-
Support for Foreign Keys.
-
Row movements across partitions update
-
Partitioning pruning - Improves select performance through enhanced partition elimination strategies during query planning and execution. A significantly cheaper plan is delivered. This is possible by parameter 'enable_partition_pruning' being set at session level (eg. set enable_partition_pruning= on/off)
-
Support Index, primary key, and unique key constraints, and triggers on partitioned tables.
-
Sub partitioning - adding table with partitions to previously created partitioned table, consider below example,
-
Hash partition: In simple terms hash partitioning means dividing parent table into equal size of child tables, hash partitioning is useful for large tables containing no logical or natural value ranges to partition. consider below example,
CREATE TABLE clients (
client_id INTEGER, name TEXT
) PARTITION BY HASH (client_id);
CREATE TABLE clients_0 PARTITION OF clients FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE clients_1 PARTITION OF clients FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE clients_2 PARTITION OF clients FOR VALUES WITH (MODULUS 3, REMAINDER 2);
Modulus is actual number of partitions and remainder should be less than modulus.
-
In Postgresql 12 the most noticeable enhancement is a performance improvement. There is a big focus on scaling partitioning to make it not only perform better but perform better with a larger number of partitions.
-
Partitioning performance enhancements can improve query performance, particularly performance with INSERT and COPY statements.
-
In Postgresql 12 users have the ability to alter partitioned tables without blocking queries.
-
Foreign Key references for partitioned tables.
Partitioning Features in Postgres version 13
-
Allow pruning of partitions and partition wise joins to happen in more cases
-
Allow partitioned tables to be logically replicated via publications
Previously, partitions had to be replicated individually. Now a partitioned table can be published explicitly, causing all its partitions to be published automatically. Addition/removal of a partition causes it to be likewise added to or removed from the publication. The CREATE PUBLICATION option publish_via_partition_root controls whether changes to partitions are published as their own changes or their parent's.
- Allow logical replication into partitioned tables on subscribers
Previously, subscribers could only receive rows into non-partitioned tables.
- Allow whole-row variables to be used in partitioning expressions
Partitioning Features in Postgres version 14
- Improve the performance of updates and deletes on partitioned tables with many partitions (Amit Langote, Tom Lane)
This change greatly reduces the planner's overhead for such cases, and also allows updates/deletes on partitioned tables to use execution-time partition pruning.
- Allow partitions to be detached in a non-blocking manner (Álvaro Herrera)
The syntax is ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY, and FINALIZE.
- Ignore COLLATE clauses in partition boundary values (Tom Lane)
Previously any such clause had to match the collation of the partition key; but it's more consistent to consider that it's automatically coerced to the collation of the partition key.
- Allow REINDEX to process all child tables or indexes of a partitioned relation (Justin Pryzby, Michael Paquier)
Partitioning Features in Postgres version 15
- Improve planning time for queries referencing partitioned tables (David Rowley)
This change helps when only a few of many partitions are relevant.
- Allow ordered scans of partitions to avoid sorting in more cases (David Rowley)
Previously, a partitioned table with a DEFAULT partition or a LIST partition containing multiple values could not be used for ordered partition scans. Now they can be used if such partitions are pruned during planning.
- Improve foreign key behavior of updates on partitioned tables that move rows between partitions (Amit Langote)
Previously, such updates ran a delete action on the source partition and an insert action on the target partition. PostgreSQL will now run an update action on the partition root, providing cleaner semantics.
-
Allow CLUSTER on partitioned tables (Justin Pryzby)
-
Fix ALTER TRIGGER RENAME on partitioned tables to properly rename triggers on all partitions (Arne Roland, Álvaro Herrera)
Also prohibit cloned triggers from being renamed.