Tools for performance insight and tracing

Craig Ringer

There are many platform tools that can offer a great deal of insight into PostgreSQL's performance and behaviour, supplementing its own stats and metrics.

This article offers brief summaries and pointers to external guides, articles and other resources; there's far too much material to cover it all comprehensively here.

It is an overview of the tracing and profiling tools that can be useful when investigating performance issues, mainly on Linux, with other platforms mentioned where relevant.

EDB Lasso

Lasso should be your first stop, as it automates collection of a large suite of diagnostic data and is an excellent starting point. Use Scipio to analyse the results and/or view the data dump directly.

Lasso cannot do everything. Crucially:

  • If run against a remote PostgreSQL instance it can only see what PostgreSQL itself can see; it cannot collect system-level data
  • At the time of writing Lasso does not run as root, so it does not support collecting anything that requires root access
  • Lasso doesn't collect much in the way of time-series data, so it only sees a momentary snapshot of the system state

PostgreSQL's own tools

Stats views

The pg_stat_* and pg_statio_* views like

  • pg_stat_activity
  • pg_stat_replication
  • pg_stat_progress_vacuum
  • pg_stat_wal_receiver
  • pg_replication_slots
  • pg_stat_user_tables
  • ...

etc offer a great deal of insight into what PostgreSQL is currently doing and into trends. Make sure you're familiar with them and keep an eye out for enhancements in new versions.

They offer far from complete coverage though. There's almost no insight into memory use, cache utilization, or CPU use, and very little to help associate load (especially I/O) with specific users, workloads or queries. There's little direct insight into bloat. They won't tell you much about walsender or archiver performance, checkpointing, etc either.

They should still be your first stop.

See the PostgreSQL documentation for the full list of statistics views and their columns.
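
As a quick illustration of the kind of question these views answer (a sketch only; adjust the psql connection options for your environment, and note the wait_event columns exist from 9.6 on):

    # What is every non-idle backend doing right now, and what is it waiting on?
    psql -c "SELECT pid, state, wait_event_type, wait_event, query
             FROM pg_stat_activity
             WHERE state <> 'idle';"

    # Which tables get the most sequential scans, and when were they last autovacuumed?
    psql -c "SELECT relname, seq_scan, idx_scan, n_dead_tup, last_autovacuum
             FROM pg_stat_user_tables
             ORDER BY seq_scan DESC
             LIMIT 10;"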

Other system tables and views

A number of other system views offer important insights. In particular pg_locks helps when looking into why a query isn't progressing; there is currently no built-in helper view joining pg_stat_activity and pg_locks to report lock waits directly, but one is easy to improvise, as sketched below.
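
A minimal sketch of such a query, using the pg_blocking_pids() function available since 9.6:

    # Which backends are blocked waiting on a lock, and which pids block them?
    psql -c "SELECT pid, pg_blocking_pids(pid) AS blocked_by,
                    wait_event_type, wait_event, state, query
             FROM pg_stat_activity
             WHERE cardinality(pg_blocking_pids(pid)) > 0;"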

For bloat you want to use queries on pg_class directly; see the linked related KB article.

Wait Events

PostgreSQL tracks wait events, recording at various places in the code when a backend is waiting on locks, I/O, etc. Current waits are visible in pg_stat_activity.wait_event (and wait_event_type).

There isn't currently aggregation over time for wait events. External tooling could create an aggregator for it either by polling pg_stat_activity, using perf, or with an extension like pg_wait_sampling (see the blog post at http://akorotkov.github.io/blog/2016/03/25/wait_monitoring_9_6/). Note pg_wait_sampling is not on the supported software list and needs some checking before we can recommend it to customers.
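
For an ad-hoc approximation, a crude polling sketch (the one-second interval and one-minute duration are arbitrary choices):

    # Sample wait events once a second for a minute, then show the most common ones.
    for i in $(seq 60); do
        psql -At -c "SELECT wait_event_type || ':' || wait_event
                     FROM pg_stat_activity
                     WHERE wait_event IS NOT NULL;"
        sleep 1
    done | sort | uniq -c | sort -rn | head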

Extensions for observing PostgreSQL state and history

  • Core:
      • pg_buffercache
      • pg_stat_statements
      • auto_explain
  • 3rd party:
      • 2ndQuadrant's supporttools repository has a variety; see the readme there and the directory listing. A subset of these are:
          • extensions/checkrel for examining possibly bad pages
          • extensions/indexcheck for b-tree index validation etc.
          • investigation/buffercache.sql - buffer cache stats
          • investigation/fk_check.sql - (re)validate all foreign key relationships
          • check_catalogs - catalog integrity checking
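
As an example of putting the core extensions to work, a sketch for pg_stat_statements (assumes superuser access and that a restart is acceptable; the timing column is total_exec_time on PostgreSQL 13+ and total_time on older releases):

    # Careful: this overwrites any existing shared_preload_libraries setting.
    psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';"
    # ... restart PostgreSQL, then ...
    psql -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"
    # Top statements by total execution time.
    psql -c "SELECT calls, round(total_exec_time::numeric, 1) AS total_ms,
                    left(query, 60) AS query
             FROM pg_stat_statements
             ORDER BY total_exec_time DESC
             LIMIT 10;"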

System tools (Linux, some on other *nix)

Tools for observing the host/VM the database runs on and what the database is doing there: CPU, memory, I/O, network, and so on.

Tracing/probing tools like perf, stap (SystemTap) and bpftrace really transcend all the categories here, providing targeted tool-creation toolkits. They're covered separately.

Observing current system state

Tools for seeing what the system and database engine is doing right now.

"top"-like tools

Useful for initial investigations - "what's going on here then?"

  • top - for cpu and memory
  • ntop - for network activity
  • iotop - for disk I/O
  • perf top and perf top -az - see "Dynamic Tracing Tools"

Other basic system observation tools

  • ps - deserves its own chapter; seriously, invest the time to learn ps properly
      • ps -e -H --forest -o pid,ppid,state,wchan:40,cmd is particularly useful - wchan shows where (if anywhere) the process is waiting in the kernel, which helps hugely with mystery "D" states. Remember that -u postgres instead of -e is handy too.
  • lsof - "list open files" is really a multipurpose toolbox that can do all sorts of things with files, sockets and more
  • netstat and ss
  • vmstat - list samples of aggregate cpu, i/o, memory, network etc activity to give a system-wide broad picture of load and activity
  • iostat - much more detail on I/O activity on the device level
  • ping - round-trip times on the network
  • traceroute, tcptraceroute and mtr - trace network paths and latencies
  • ioping - measure I/O operation round trips through storage
  • strace - can be used to glimpse what a process is currently doing; see also ps's wchan field, gdb, and the dynamic tracing tools. Can capture stacks with -k too.
  • sysdig - tool for drilling down into various system activities. Won't be preinstalled, but it's very useful and packaged in Ubuntu at least. Answers questions like "which files is the system doing the most I/O on right now?".
  • /proc/$pid/stack to observe the kernel call stack for the target process
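
For example, to inspect a stuck backend (a sketch; pid 12345 is a placeholder, and both commands need appropriate privileges):

    # Where in the kernel is the process sleeping?
    cat /proc/12345/stack
    # Which syscalls is it making, with the userspace stack leading to each?
    # strace briefly stops the target at every syscall, so use with care on
    # production systems; Ctrl-C detaches.
    strace -k -p 12345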

Don't forget that if a tool doesn't exist for what you need, you can probably build one: SystemTap and perf can run probe scripts to drill down and inspect kernel or userspace data structures, produce backtraces from arbitrary points of execution, and more.

Time/aggregate/historical

When you need to collect data over longer periods or examine historical data and trends.

  • sar (important, as it should already be collecting historical data on most systems)
      • most systems have sysstat installed, including sar, and have a bunch of coarse-grained stats being collected all the time
  • vmstat (long sampling periods)
  • iostat (long sampling periods)
  • perf record and perf report
  • SystemTap can collect aggregates and statistics
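
A few illustrative invocations (a sketch; sysstat's archive paths vary by distribution, e.g. /var/log/sa on RHEL-family systems and /var/log/sysstat on Debian/Ubuntu):

    sar -u                        # today's CPU utilisation samples
    sar -q                        # run queue length and load average history
    sar -d -f /var/log/sa/sa05    # block device stats for the 5th of the month
    vmstat 5                      # live: one aggregate sample every 5 seconds
    iostat -xm 5                  # live: extended per-device I/O stats, 5s interval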

Tracing and probing

  • perf (linux-perf)
  • SystemTap
  • bpftrace and ebpf-tools
  • strace - system call tracing
      • strace -c for statistics on syscalls
      • strace -k can show the userspace stack leading up to each call
  • ltrace - library function interface call tracing
      • ltrace -w N can show a backtrace of N frames for each call, similar to strace -k
  • blktrace and blkiomon - block layer I/O observation
  • gdb - debugger to inspect current program execution, stack and variables, etc
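
For instance (a sketch; pid 12345 is a placeholder, and note that strace and gdb both pause the target, so be careful on busy production systems):

    # Summarise which syscalls a backend is making; Ctrl-C detaches and prints the table.
    strace -c -p 12345
    # Take a one-off backtrace of a running backend, then exit and detach.
    gdb -p 12345 -batch -ex 'bt'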

Dynamic tracing tools

Linux has a number of related/overlapping tools for general purpose probing, tracing and inspection of system and program state. It's not simple to say what they can do because they're so flexible - with some time and effort, you can do nearly anything with systemtap or with perf-script.

Tools include:

  • perf
  • stap from SystemTap
  • bpftrace and ebpf-tools
  • sysdig (deserves a mention though it's not quite the same thing)

Broad capabilities include:

  • Focusing and filtering by a variety of criteria:
      • system-wide observation and tracing
      • a specific process or process group
      • process user id
      • vm/container/cgroup
      • executable or loaded library
  • Observing system calls made by processes
  • Observing application level function calls at entry points and printing arguments
  • Capturing process stacks at probe point hits
  • Filtering probe point hits by expressions
  • Associating process stacks and functions with system calls made or other trace/probe hits, e.g. "which processes are making lots of fsync() calls"
  • Finding targeted process interactions, e.g. "which process or executable or script is sending SIGTERM to postgres background workers, and what's invoking it?"
  • Keeping and reporting statistics on probe hits
  • Logging streams of probe hits along with arguments captured, stack info etc
  • In the case of systemtap, largely arbitrary reporting and tracking of data within probe scripts
  • lots more

These tools work best when used in combination with pre-written probes or scripts, or with a deep understanding of the system you are investigating. They're vital for when you just can't see what's going on any other way.

Some tools are better at some things.

perf is really good at working with stacks, and can do a great job of reporting on collected data to show various tree-form representations of how different code paths led to a certain function call, system call, probe point hit, etc., along with their relative frequencies and more. However, it's quite bad at anything that requires keeping state, and it doesn't have a simple way to measure and collect the duration between two different probes or traces (such as a syscall's entry and exit).
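
For example, a minimal sketch of the kind of on-CPU stack profiling perf excels at, covering a whole PostgreSQL instance (assumes the processes are named postgres; install debuginfo packages to get readable stacks):

    # Sample call stacks of all postgres processes for 30 seconds...
    perf record -g -p $(pgrep -d, -x postgres) -- sleep 30
    # ...then browse an interactive tree of the hottest code paths.
    perf report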

systemtap is really good at tracking and observing userspace state and activity. It's great when you need the tool to be aware of and understand program state and associate various activities within a process with subcategories, search for particular things, ignore/filter out irrelevant activity etc. systemtap is also great for when there are complex multi-process interactions and with effort it can even trace across hosts. Systemtap has the best userspace probing support of any of the tools available, as well as the best statistics and aggregation. Systemtap also lets you write libraries that make it much, much simpler to apply by encapsulating knowledge of subsystems or processes.
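
A tiny sketch of userspace probing with PostgreSQL's static query__start probe (only available if the server was built with --enable-dtrace; the binary path is an example and will differ per installation):

    # Print each query as a backend starts executing it.
    stap -e 'probe process("/usr/pgsql-14/bin/postgres").mark("query__start") {
        printf("%d: %s\n", pid(), user_string($arg1))
    }'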

bpftrace is newer and has more limited support for userspace probing, but it and the related ebpf-tools collection include some great tools for understanding low-level and kernel activity.
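
For example, a one-liner sketch that counts fsync() calls per postgres pid (Ctrl-C prints the counts; tracepoint names can vary slightly between kernel versions):

    bpftrace -e 'tracepoint:syscalls:sys_enter_fsync /comm == "postgres"/ { @[pid] = count(); }'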

System tools (Windows)

"Full stack" tracing

A newer model is the full-stack tracing suite. Agents are usually embedded into applications and send trace information to an aggregator. The aggregator can query it and display useful, easily understood results showing operators a full picture of the lifecycle of an activity as it passes through the layers in their system.

For example, a system might be set up to observe incoming HTTP requests at the load balancer and tag them with a request identity. Each layer in the system preserves the tag in some manner specific to that layer's communication with the other layers - for example, the request ID might be sent to the app server as an HTTP header, then to the DB connection pooler as an app server context variable, then to the database as a SET command setting a session variable. Each layer reports the entry of a request, reports activities at key points throughout its execution, and reports details of the request's outcome.
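
Purely as an illustration of that last hop into the database (the myapp.request_id variable name is made up for this example), the tagging might look like:

    # The pooler or app server tags its session with the upstream request id,
    # which then shows up in pg_stat_activity and can be added to log_line_prefix
    # via %a ...
    psql -c "SET application_name = 'web req 8f3a2c';"
    # ... or a custom GUC is used and read back where needed.
    psql -c "SET myapp.request_id = '8f3a2c'; SELECT current_setting('myapp.request_id');"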

Operators can look at a request to see where it's up to, where it's stuck, etc. They can sample requests, keep stats, get timing data, lots more.

Systems vary in their capabilities. They're not something you can just throw in place on pre-existing systems when doing diagnostics but they're important to understand. We're moving in the direction of adopting some of them.

Examples include

  • OpenTelemetry
  • Jaeger / OpenTracing
  • OpenCensus
  • Zipkin
  • Dapper (obsolete, predecessor)

Higher level aggregation, alerting and trending

There are lots of agents and tools that collect this data and more, then ingest it to compute trend information, organise it for display, and highlight and alert on unusual states. That's a whole extra topic. See:

  • icinga2
  • Cacti and Munin
  • zabbix
  • Prometheus
