An introduction to database pooling and tools PgBouncer and Pgpool-II.
- In Postgresql, each database client acquires a database connection from the database.
- postmaster is responsible for listening and providing database connections, requiring overhead.
- once a client logs off, the connection process is destroyed.
- Connection pooling allows reuse of the existing process since the client logs off:
- lowers connection overhead on postmaster.
- improves database access performance for applications.
- spreads connection cost over repeated clients.
max_connections
(postgresql.conf
)
- It is highly recommended to use a connection pooler when involving large amounts of connections to the database, with a lowered
max_connections
. - A large
max_connections
allows concurrent transactions which can drastically use server resources and reduce performance: - each connection, even idle, uses RAM
- idle connections slightly use CPU resources, but this is only a problem if CPU is already high
- large max connections have the potential of hitting the max open files Linux limit. Value of
cat /proc/sys/fs/file-max
should be >max_files_per_process
*max_connections
(postgresql.conf
) - To find the optimum
max_connections
, performance testing of your database and pooler setup is recommended. This is described in detail in our blog.
We support use of Pgbouncer and Pgpool.
Pgbouncer is our recommended tool due to it's effectiveness in reusing connections, and maintaining idle connections.
Pgpool-II is often used for pooling as part of a bundle, where it's other features are being used in a cluster, e.g replication, load balancing, automatic failover, parallel query.
Only one connection pooler should ever be used.
- A pool is a set of cached database connections that can be re-used.
- However, Pgbouncer and Pgpool-II have different ways of achieving this, and their pools are different in nature.
- A "pool" in Pgbouncer, is the set of cached connection(s) corresponding to a specific database:user pair that can be re-used by new connections.
- Pgbouncer is a single process, with multiple threads. The main thread builds the connection cache:
- all clients can access each pool.
- Pgbouncer can be considered light-weight and fast.
- when there is a very large amount of incoming connections, Pgbouncer might be slow to assign each one of them to a pool because of context switching.
- To see more information on Pgbouncer, please see https://www.pgbouncer.org/.
- A "pool" in Pgpool-II is the set of cached connection(s) that a child process stores that can be re-used for connections.
- Pgpool-II uses multiple child processes to cache and host connections:
- each child process only stores 1 connection per user:database pair, and can store up to
max_pool
connections of unique pairs. - each child process builds the cache up from previous connections that it served.
- if a client connection is hosted by another process, it does not have access to the other process' cache and pool.
- therefore, there is a random based element of whether a connection is hosted by a process with a matching connection cache and allow re-use of a connection.
- the fact that there are multiple processes, each one with their own memory and consuming resources, can indicate that Pgpool-II isn't simple and light-weight as Pgbouncer.
- to see more information on Pgpool-II, please see https://pgpool.net/mediawiki/index.php/Main_Page, and our Pgpool-II articles - Pgpool-II-introduction and Pgpool-II-max-connections.
Depending on your setup and testing, it can be hosted on the database server or the web server.