Pooling Pgbouncer vs Pgpool-II

Matthew Gwillam
Matthew Gwillam
  • Updated

An introduction to database pooling and tools PgBouncer and Pgpool-II.

Background

  • In Postgresql, each database client acquires a database connection from the database.
  • postmaster is responsible for listening and providing database connections, requiring overhead.
  • once a client logs off, the connection process is destroyed.
  • Connection pooling allows reuse of the existing process since the client logs off:
  • lowers connection overhead on postmaster.
  • improves database access performance for applications.
  • spreads connection cost over repeated clients.

max_connections (postgresql.conf)

  • It is highly recommended to use a connection pooler when involving large amounts of connections to the database, with a lowered max_connections.
  • A large max_connections allows concurrent transactions which can drastically use server resources and reduce performance:
  • each connection, even idle, uses RAM
  • idle connections slightly use CPU resources, but this is only a problem if CPU is already high
  • large max connections have the potential of hitting the max open files Linux limit. Value of cat /proc/sys/fs/file-max should be > max_files_per_process * max_connections (postgresql.conf)
  • To find the optimum max_connections, performance testing of your database and pooler setup is recommended. This is described in detail in our blog.

Poolers - Pgbouncer & Pgpool-II

We support use of Pgbouncer and Pgpool.

Pgbouncer is our recommended tool due to it's effectiveness in reusing connections, and maintaining idle connections.

Pgpool-II is often used for pooling as part of a bundle, where it's other features are being used in a cluster, e.g replication, load balancing, automatic failover, parallel query.

Only one connection pooler should ever be used.

What is "pooling?"

  • A pool is a set of cached database connections that can be re-used.
  • However, Pgbouncer and Pgpool-II have different ways of achieving this, and their pools are different in nature.

Pgbouncer

  • A "pool" in Pgbouncer, is the set of cached connection(s) corresponding to a specific database:user pair that can be re-used by new connections.
  • Pgbouncer is a single process, with multiple threads. The main thread builds the connection cache:
  • all clients can access each pool.
  • Pgbouncer can be considered light-weight and fast.
  • when there is a very large amount of incoming connections, Pgbouncer might be slow to assign each one of them to a pool because of context switching.
  • To see more information on Pgbouncer, please see https://www.pgbouncer.org/.

Pgpool-II

  • A "pool" in Pgpool-II is the set of cached connection(s) that a child process stores that can be re-used for connections.
  • Pgpool-II uses multiple child processes to cache and host connections:
  • each child process only stores 1 connection per user:database pair, and can store up to max_pool connections of unique pairs.
  • each child process builds the cache up from previous connections that it served.
  • if a client connection is hosted by another process, it does not have access to the other process' cache and pool.
  • therefore, there is a random based element of whether a connection is hosted by a process with a matching connection cache and allow re-use of a connection.
  • the fact that there are multiple processes, each one with their own memory and consuming resources, can indicate that Pgpool-II isn't simple and light-weight as Pgbouncer.
  • to see more information on Pgpool-II, please see https://pgpool.net/mediawiki/index.php/Main_Page, and our Pgpool-II articles - Pgpool-II-introduction and Pgpool-II-max-connections.

Hosting the pooler

Depending on your setup and testing, it can be hosted on the database server or the web server.

Was this article helpful?

0 out of 0 found this helpful