This article introduces the concepts behind vacuuming in PostgreSQL and the basics of tuning PostgreSQL's autovacuum
parameters to improve its efficiency and limit its I/O impact on the server.
Autovacuum is a background subsystem, enabled by default in PostgreSQL, that performs a variety of important cleanup tasks asynchronously.
PostgreSQL defers cleanup work created by write queries to permit those queries to complete more quickly and to reduce the amount of locking that's required between read and write queries. In exchange it creates a backlog of work that must be performed in the background to keep the system functioning optimally.
One type of leftover work is "table bloat" - space used by tuples that have been deleted or replaced but not yet marked as re-usable.
Dealing with bloat and the other cleanup tasks is autovacuum's job. See ["Why vacuum?"](#why-vacuum) for a high-level overview of those tasks and why they're important.
It is very important to control bloat proactively. It's much easier to keep a table in good condition than it is to de-bloat a massively bloated table without unduly impacting production operations. Once a system is struggling with bloat it has less spare I/O to spend on bloat control because it's wasting more on querying the bloated tables.
A common mistake is to turn autovacuum off, or tune it to run less often, in an attempt to fix query performance on a server that's slowing down. This often has the opposite of the intended effect: if the cause of the slowdown is table bloat growing because autovacuum cannot keep up with the workload, restricting autovacuum will seem to help at first, but will make the problem much worse over time. 2ndQuadrant often sees systems where the administrators have progressively de-tuned autovacuum, disabled it, or regularly killed its processes because they see that "autovacuum is using lots of CPU and I/O we need for queries". Such systems often reach the point where they are nearly out of disk space, running extremely slowly, and sometimes end up performing forced emergency anti-wraparound vacuuming.
Forced emergency anti-wraparound vacuuming is bad for your production server, but it's not as bad as what happens if it isn't allowed to run. Forced vacuuming only becomes necessary when all of PostgreSQL's prior attempts to perform normal vacuuming have been unsuccessful, often due to administrator action. In emergency anti-wraparound vacuuming mode, PostgreSQL forces autovacuum on and runs it at maximum speed. This usually has a serious performance impact on already badly bloated and struggling servers, but if it's not allowed to continue, PostgreSQL may perform an emergency shutdown and refuse to start up again until the necessary vacuuming is done. You can defer forced vacuuming by adjusting autovacuum parameters, but it's vital that you still urgently prioritize aggressive vacuuming.
It's usually preferable to run autovacuum as frequently and aggressively as necessary to keep bloat at a low, steady level - and not more. If in doubt, make it run more often and faster, not less.
Before doing any sort of tuning, you need to be able to collect relevant data - otherwise how could you say you need to do any tuning at all, or evaluate the impact of configuration changes?
In other words, you should have some basic monitoring in place, collecting metrics from the database. PostgreSQL doesn't offer tooling for this directly, but there are many 3rd-party tools available[²](#notes).
For cleanup, you should be looking at least at these values:
- `pg_stat_all_tables.n_dead_tup` - the number of dead rows in each table (both user tables and system catalogs)
- `n_dead_tup / (n_dead_tup + n_live_tup)` - the fraction of dead rows in each table
- `pg_class.relpages * 8192 / pg_class.reltuples` - the average space "per row"
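For example, a simple query against the statistics views collects the first two of these metrics; the 20-row limit is just an arbitrary cutoff for illustration:

```sql
-- Tables with the most dead tuples, and the fraction of rows that are dead
SELECT relname,
       n_dead_tup,
       n_live_tup,
       round(n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0), 3) AS dead_fraction
FROM pg_stat_all_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
```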
Note: there is also the handy `pgstattuple` extension, which allows you to perform analysis on tables and indexes, including computing the amount of free space, dead tuples, etc. Please refer to the related KB article for more information about it.
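As a quick illustration (the table name is a placeholder), `pgstattuple` can report a table's dead tuple counts and free space directly:

```sql
-- Requires the pgstattuple extension (shipped in contrib)
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Tuple-level statistics for one table, including dead_tuple_count,
-- dead_tuple_percent and free_space
SELECT * FROM pgstattuple('my_table');
```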
When tuning `autovacuum` parameters, we want to achieve two general goals:
- timely cleanup of dead tuples: keep the number of dead rows reasonably low, so as not to waste an unreasonable amount of disk space, to prevent index bloat, and to keep queries fast.
- minimize unnecessary load on the database: do not perform cleanup too often, and do not consume too many resources (CPU, I/O and RAM) that might negatively impact the performance of user queries.
These two goals are somewhat contradictory, so `autovacuum` tuning is about finding the right balance for your system and workload, and it may take a few iterations to converge on good values.
Most default values in `postgresql.conf` are quite conservative. For many deployments (particularly smaller ones and/or those handling read-mostly workloads) the default configuration parameters will work just fine. But as the database size and/or the amount of writes increases, problems start to appear. The typical issue is that cleanup does not happen often enough, and when it does happen it significantly disrupts performance, because it has to deal with a lot of accumulated garbage.
In those cases you should follow this simple rule:
If it hurts, you're not doing it often enough.
That is, tune the parameters so that cleanup happens more often and processes a smaller number of dead tuples each time.
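Before changing anything, it's worth knowing what the server is currently running with; all autovacuum-related settings can be listed from the `pg_settings` view:

```sql
-- Current autovacuum-related settings and where each value comes from
SELECT name, setting, source
FROM pg_settings
WHERE name LIKE 'autovacuum%'
ORDER BY name;
```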
In PostgreSQL's configuration (`postgresql.conf` etc.), two system-wide parameters determine when a table is chosen to be `VACUUM`ed by the `autovacuum` daemon ([possibly modified by per-table settings](#per-table-tuning)):
```
autovacuum_vacuum_threshold = 50     # min number of updated/deleted tuples before vacuum (default 50)
autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum (default 20%)
```
The cleanup on a table is triggered whenever the number of dead tuples exceeds the following value:
```
autovacuum_vacuum_threshold + pg_class.reltuples * autovacuum_vacuum_scale_factor
```
This means that dead tuples can reach 20% of a table before they get cleaned up, while the threshold prevents frequent `VACUUM` runs on tiny tables. For example, with the defaults, a table with 1,000,000 rows is only vacuumed once it accumulates 50 + 0.2 × 1,000,000 = 200,050 dead tuples.
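As a rough illustration, the following query (a sketch assuming the default global settings above, ignoring any per-table overrides) shows how close each table is to its trigger point:

```sql
-- Dead tuples vs. the (default) autovacuum trigger point for each table
SELECT s.relname,
       s.n_dead_tup,
       50 + 0.2 * c.reltuples AS autovacuum_trigger
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC;
```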
On busy databases (doing a lot of `UPDATE`s and `DELETE`s), particularly large ones, you should probably decrease the scale factor so that cleanup happens more frequently.
`autovacuum` is a cleanup task, and it should not consume too many resources (CPU and disk I/O) under normal circumstances.
The cleanup process is fairly simple: it reads pages (8KB chunks of data) from the data files and checks whether they need any cleanup. If there are no dead tuples or other pending tasks for the tuples on a page, the page is simply skipped. Otherwise it's cleaned by modifying the page and tuple headers, marked as "dirty", and eventually written out.
Even when autovacuum only changes a few bytes in a page, it must usually write the whole page to WAL (and later to the table) for crash-safety reasons. See `full_page_writes`.
Vacuum has a built-in delay that makes it pause each time it has "spent" a certain amount of resources, reducing its impact on the rest of the server workload. The total estimated cost of `VACUUM` activity (in arbitrary units) is the sum of three basic operations, counted for each page:
```
vacuum_cost_page_hit = 1    # cost for vacuuming a page found in the buffer cache
vacuum_cost_page_miss = 10  # cost for vacuuming a page retrieved from disk
vacuum_cost_page_dirty = 20 # cost for writing back a modified page to disk
```
Once the sum of costs reaches `autovacuum_vacuum_cost_limit` (by default this falls back to `vacuum_cost_limit`, i.e. 200), the autovacuum worker is put to sleep for `autovacuum_vacuum_cost_delay` (default 20 ms).
This means that, by default, a `VACUUM` can consume at most roughly the following disk I/O:

- ~8MB/s for reads
- ~4MB/s for writes

The arithmetic: a 200-unit budget per 20 ms sleep allows 10,000 cost units per second; at 10 units per page read from disk that is ~1,000 pages (8MB) per second, and at 20 units per dirtied page it is ~500 pages (4MB) per second.
Considering the capabilities of current hardware, and that the reads/writes are mostly sequential, those limits are fairly low. In fact, busy databases often generate dead tuples at a much higher rate than a throttled autovacuum can keep up with.
Accordingly, we strongly recommend relaxing the throttling limits on `autovacuum` by raising `autovacuum_vacuum_cost_limit` and lowering `autovacuum_vacuum_cost_delay`. The exact optimal values vary with hardware, database and workload, but you should err in favour of running more, not less.
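For illustration only, a `postgresql.conf` change along these lines raises autovacuum's I/O budget roughly tenfold; treat the exact numbers as arbitrary starting points to iterate on, not recommendations:

```
autovacuum_vacuum_cost_limit = 1000  # default -1 falls back to vacuum_cost_limit (200)
autovacuum_vacuum_cost_delay = 10ms  # default: 20ms
```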
PostgreSQL provides per-table storage parameters to configure individual tables' `autovacuum` settings, overriding the global ones.
This is helpful for tuning `autovacuum` on individual tables when the global settings are not enough, or too aggressive, for them. For example, when just a few tables hold millions of rows, it's advisable to change the scale factor and threshold for those tables accordingly. It's also very useful when a database has a mix of large, slow-changing tables that need infrequent vacuuming and small, extremely busy tables that need constant vacuuming. Work-queue tables are a particularly good candidate for table-specific tuning, as shown in the example below.
To change these parameters use `ALTER TABLE`, for example:

```sql
ALTER TABLE <table_name> SET (autovacuum_vacuum_scale_factor = X);
```
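As a concrete sketch (the table name `task_queue` is a placeholder), a busy work-queue table could be set to vacuum after a fixed number of dead tuples, regardless of its size:

```sql
-- Vacuum the queue table aggressively: disable the proportional
-- component and trigger after roughly 1000 dead tuples
ALTER TABLE task_queue SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold = 1000
);

-- Per-table overrides are stored in pg_class.reloptions
SELECT relname, reloptions FROM pg_class WHERE reloptions IS NOT NULL;
```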
As an MVCC database, PostgreSQL keeps old versions of deleted or replaced rows around in tables for a while to satisfy other queries. PostgreSQL's MVCC implementation keeps such "dead" rows directly in the main table and its indexes, then returns to them later to "vacuum" them, marking them as free space that may be overwritten.
Query performance may be degraded when there are too many dead tuples in a table and/or its indexes, as PostgreSQL has to do more I/O to read the data, check whether it's relevant, then discard the obsolete rows. This leftover obsolete data also degrades disk cache performance, which means even more I/O is needed. In some cases it may also waste considerable amounts of disk space, making backups harder and risking availability issues related to disk space exhaustion.
The excess or wasted space consumed by dead tuples is called bloat. Some table bloat is normal and even necessary. But heavily bloated tables, especially large ones, can be a serious problem.
The autovacuum daemon in PostgreSQL runs in the background to remove dead tuples that are no longer needed, marking their space as free for re-use. When autovacuum is performing optimally, little or no manual `VACUUM`ing is required and bloat is kept close to the level that's necessary for PostgreSQL to operate properly. If autovacuum cannot remove bloat as fast as the rest of the server's workload generates it, the system can start slowing down, often impacting autovacuum's performance further and allowing bloat to grow even faster.
Autovacuum is also responsible for [analyzing tables](https://www.postgresql.org/docs/current/sql-analyze.html) to create the table and column statistics used by the query planner. When these stats are outdated, PostgreSQL may choose inferior query plans that perform far worse than the optimal plan for the query[¹](#notes), often further constricting already stretched I/O resources.
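To check that this is happening in a timely manner, the statistics views record when autovacuum last vacuumed and analyzed each table:

```sql
-- Last autovacuum cleanup and statistics-collection times per table
SELECT relname, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_autoanalyze NULLS FIRST;
```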
Vacuuming also performs cleanup tasks related to the transaction ID counter. The most user-visible of these tasks is tuple freezing. Freezing is necessary to let PostgreSQL advance the global transaction-ID threshold that it uses to decide which transaction IDs in its 32-bit cyclic xid counter are in the future and which are in the past. If that threshold doesn't advance, the system runs out of transaction IDs and stops. The tuple freezing done by (auto)vacuum annotates tuples written by transactions that committed a long time ago with a special marker that disassociates the tuple from the transaction that created it. That permits the transaction ID of the transaction that created the tuple to be safely re-used for new transactions. This task is vital to the proper long-term function of the system, especially on systems with very high transaction rates. There are PostgreSQL instances that cycle through the roughly 4 billion xids in the xid counter every few days.
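A common way to keep an eye on this is to watch the age of each database's oldest unfrozen transaction ID; once it exceeds `autovacuum_freeze_max_age` (200 million by default), anti-wraparound vacuuming is forced:

```sql
-- How far each database is from forced anti-wraparound vacuuming
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```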
¹ The `auto_explain` and `pg_stat_statements` extensions can help you spot problem queries.

² Monitoring tools include Prometheus, Zabbix, Icinga2, Munin, and numerous other open source, commercial or hosted solutions.