Detecting and fixing corruption


Contributors

Co-authors Matthew Gwillam-Kelly and Luciano Botti.

Technical additions by Jakub Wartak and Arthur Nascimento.

Abstract

This article is part of a series on database corruption.

Introduction

This article consists of the following sections, which show the start-to-finish process when dealing with corruption:

  1. Diagnosing the health of the instance using various tools - how widespread is the corruption? What type of corruption is it?

  2. Evaluating backup/standby options - can we recover from another backup or node? If not:

  3. Creating a database dump in order to recover the healthy tables and restore them into a new, healthy instance.

  4. Recovering the healthy rows from the corrupted table.

  5. Evaluating corrupted rows.

1) Diagnosing the health of the instance

Postgres logs

Postgres logs can be useful in showing relations erroring due to corruption, and the queries executed around the time the corruption occurred. However, they will not show the cause of the corruption.
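
For example, a quick scan of the server logs for such messages (a sketch only; the log directory and file name pattern depend on your logging configuration):

grep -iE 'invalid page|could not read block|could not access status|MultiXactId|missing chunk' "$PGDATA"/log/*.log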

Look out for errors that complain about internal database issues rather than user-facing problems. Error messages like the following suggest some sort of corruption:

ERROR: could not access status of transaction 3881522
DETAIL: could not open file "pg_xact/0003": No such file or directory
ERROR: could not read block 123 in file "base/45678/90123": read only 0 of 8192 bytes
ERROR: failed to re-find parent key in index "xxx_idx" for split pages 123/456
ERROR: cache lookup failed for relation 123456
ERROR: found multixact 123 from before relminmxid 456
ERROR: uncommitted xmin 1234 from before xid cutoff 5678 needs to be frozen
ERROR: MultiXactId 12345 has not been created yet -- apparent wraparound
ERROR: could not fsync file "base/24305/1709567143.1": Input/output error
ERROR: missing chunk number 0 for toast value XXXXX in pg_toast_XXXXX

The following may be related to corruption, or another problem:

  PANIC: could not locate a valid checkpoint record
  LOG: startup process (pid 12345) was terminated by signal 6: Aborted

OS logs

Check dmesg -T for OOM killer activity and any other unexpected errors.
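
For example (a sketch; the exact patterns worth searching for vary by system):

dmesg -T | grep -iE 'out of memory|oom|killed process|i/o error'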

Tools for corruption diagnosis

a. Logical backup/dump

Logical backups (pg_dumpall or pg_dump) are a useful tool for diagnosing whether data corruption exists, as they will often catch corruption and error out if issues exist.

Bear in mind that logical backups can take a long time for large databases.

Do not specify -h localhost, so that the default Unix socket is used; this improves performance by avoiding TCP overhead:

pg_dumpall > /dev/null 2> pg_dump_error_output

cat pg_dump_error_output

Healthy output:

Empty file

Example output for corruption:

pg_dump: error: Dumping the contents of table "test_corruption2" failed: PQgetResult() failed.
pg_dump: detail: Error message from server: ERROR:  invalid page in block 0 of relation base/14319/73740
pg_dump: detail: Command was: COPY public.test_corruption2 (i) TO stdout;
pg_dumpall: error: pg_dump failed on database "edb", exiting
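
pg_dumpall exits at the first database that fails, as shown above. To see which databases are affected, each database can be dumped in turn (a sketch, assuming local socket access as in the examples above):

psql -d edb -At -c "SELECT datname FROM pg_database WHERE datallowconn" |
while read -r db; do
    pg_dump "$db" > /dev/null 2> "pg_dump_errors_${db}" && echo "${db}: OK" || echo "${db}: FAILED"
done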

b. pg_amcheck

pg_amcheck (PG 14+) is recommended for checking the consistency of user and TOAST relations:

psql -d edb
create extension amcheck;
exit

pg_amcheck edb -v > amcheck_output 2>&1

cat amcheck_output

Example healthy output:

pg_amcheck: checking btree index "edb.pg_catalog.pg_event_trigger_oid_index"
pg_amcheck: checking heap table "edb.pg_catalog.pg_enum"

Example output for corruption:

WARNING:  XX001: page verification failed, calculated checksum 13021 but expected 10829
LOCATION:  PageIsVerifiedExtended, bufpage.c:153
heap table "edb.public.test_corruption2":
    ERROR:  XX001: invalid page in block 0 of relation base/14319/73740
    LOCATION:  ReadBuffer_common, bufmgr.c:1148
query was: SELECT v.blkno, v.offnum, v.attnum, v.msg FROM pg_catalog.pg_class c, "public".verify_heapam(
relation := c.oid, on_error_stop := false, check_toast := true, skip := 'none'
) v WHERE c.oid = 73740 AND c.relpersistence != 't'
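
Once a suspect relation is identified, pg_amcheck can be restricted to it, and parallel connections can speed up checking; for example (the table name is taken from the output above, and the job count is an arbitrary choice):

pg_amcheck edb --table=public.test_corruption2 --jobs=4 -v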

c. pg_catcheck

pg_catcheck (PG 8.4+) is recommended for checking for corruption in system catalog tables, e.g. pg_catalog.pg_class.

It is available to download from https://www.enterprisedb.com/repos-downloads. For some additional background on the tool, please see our blog post.

pg_catcheck edb > pg_catcheck_output 2>&1

cat pg_catcheck_output

Example healthy output:

progress: done (0 inconsistencies, 0 warnings, 0 errors)

Example output for corruption:

notice: edb_last_ddl_time row has invalid objid "602": no matching entry in pg_type
row identity: classid="1247" objid="602"
notice: edb_last_ddl_time row has invalid objid "604": no matching entry in pg_type
row identity: classid="1247" objid="604"
progress: done (1060 inconsistencies, 0 warnings, 0 errors)

d. pg_checksums

Data checksums are designed to be enabled ahead of corruption occurring, so that checksum failures are logged to the Postgres logs. Please see the previous article in this series, Minimizing-database-corruption-best-practices.

Check if data checksums are enabled:

psql -d edb -c "show data_checksums;"

 data_checksums
----------------
 on
(1 row)

If data checksums were enabled before the corruption occurred, pg_checksums can be used to check for corruption while the instance is offline (shown for EPAS 16):

sudo systemctl stop edb-as-16 

pg_checksums --pgdata as16/data --check

Example healthy output:

Checksum operation completed
Files scanned:   1777
Blocks scanned:  982829
Bad checksums:  0
Data checksum version: 1

Example output for corruption:

[enterprisedb@epas16 ~]$ /usr/edb/as16/bin/pg_checksums --pgdata as16/data --check
pg_checksums: error: checksum verification failed in file "as16/data/base/14319/73728", block 5: calculated checksum E9DC but block contains 6167
pg_checksums: error: checksum verification failed in file "as16/data/base/14319/73728", block 6: calculated checksum E9DD but block contains 6167
...
...
pg_checksums: error: checksum verification failed in file "as16/data/base/14319/73728", block 123: calculated checksum E97A but block contains 6167
pg_checksums: error: checksum verification failed in file "as16/data/base/14319/73728", block 124: calculated checksum E973 but block contains 6167
pg_checksums: error: could not read block 125 in file "as16/data/base/14319/73728": read 2048 of 8192

Please note that different corruption types may produce different output. In this example, every row of the table was corrupted with incorrect data, including the page header.

Interestingly, if data checksums were not enabled before the corruption, attempting to enable them may behave somewhat like a pg_dumpall > /dev/null, depending on the type of corruption. However, prioritise the pg_dumpall over this:

psql -d edb -c "show data_checksums;"

 data_checksums
----------------
 off
(1 row)

/usr/edb/as16/bin/pg_checksums --pgdata as16/data --enable
pg_checksums: error: could not read block 125 in file "as16/data/base/14319/65536": read 2048 of 8192

2) Evaluate options - backups and other nodes

Once we know the status of the database, we will evaluate various options.

  • If the problem is only with a standby, rebuild the standby.
  • If the problem is with the master, consider restoring from a backup or failing over to an unaffected standby (if present). Data corruption can often pass from a primary to a physical standby, but it does not always.
  • If running multi-master replication, e.g. PGD, are the other masters unaffected?
  • Evaluate whether you can restore from a recent logical backup (pg_dump/pg_dumpall), if present.

If we do not have a backup or a healthy standby, we need to dump the data, and restore the healthy data into a new, healthy instance.

If you just hack on the existing, corrupted database until it seems to run again, there’s a good chance of future trouble. - Robert Haas.

3) Create new dump of healthy data

As discussed in the previous article, corruption may be passed on to a physical backup, but is not passed on to a logical backup.

For extra safety, please create a fresh physical backup before performing any steps.

a) Dumping by excluding corrupted tables (recommended):

  • The quickest way to do a dump is to exclude the corrupted table(s).

  • If you only want the database and not globals, run:

    • pg_dump --exclude-table='<schema>.<corrupted table>'
  • If you want to recover the whole cluster (globals, all tables except corrupted) run both:

    • pg_dump --exclude-table='<schema>.<corrupted table>' on the corrupted db
    • pg_dumpall --exclude-database=<corrupted_database> on the instance

The logical backups will then be restored into a new healthy instance.
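
For example, a minimal sketch assuming the corrupted table is public.test_corruption2 in database edb, a new healthy instance listening on port 5445 (hypothetical), and PG 12+ for --exclude-database:

pg_dump edb --exclude-table='public.test_corruption2' -Fc -f edb_healthy.dump
pg_dumpall --exclude-database=edb -f rest_of_cluster.sql

psql -p 5445 -d postgres -f rest_of_cluster.sql
createdb -p 5445 edb
pg_restore -p 5445 -d edb edb_healthy.dump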

OR

b) Dumping non corrupted tables using a script

  • If there are large numbers of corrupted tables, too many to specify, we can run a script that runs a dump on each table independently, including global objects, views, functions and sequences (a normal dump will fail when it hits the first erroring object); a minimal sketch of the idea follows this list.
  • We then recover these objects in a new, empty and healthy instance using another script.
  • These scripts are provided on article Logical dump and restore scripts for Corruption Recovery.
  • Please bear in mind that customers have stated that this script is very slow on large databases, and excluding the corrupted tables is much faster.
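
A minimal sketch of the per-table dump idea (the full scripts are in the linked article; this sketch assumes simple table names in database edb, and records failed tables for later handling):

psql -d edb -At -c "SELECT schemaname||'.'||tablename FROM pg_tables WHERE schemaname NOT IN ('pg_catalog','information_schema')" |
while read -r tbl; do
    pg_dump edb --table="$tbl" --file="dump_${tbl}.sql" || echo "$tbl" >> failed_tables.txt
done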

4) Corrupted table - diagnosing and retrieving healthy rows

The article Identifying corrupted rows and recovering healthy rows provides crucial functions for:

  • Finding out which rows are corrupted
  • Creating a copy of the table with only the healthy rows

The healthy copy of the table should be dumped using pg_dump -t table_name, and restored in the new, healthy instance using psql.
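
To illustrate the approach (a sketch only, not the functions from the linked article): on PG 14+, ctid range conditions use TID range scans, so each INSERT below reads only the targeted block, and blocks that error are skipped. The table names are hypothetical:

CREATE TABLE public.test_corruption2_healthy
    (LIKE public.test_corruption2 INCLUDING ALL);

DO $$
DECLARE
    n_pages bigint := pg_relation_size('public.test_corruption2')
                      / current_setting('block_size')::int;
BEGIN
    FOR blk IN 0 .. (n_pages - 1)::int LOOP
        BEGIN  -- each iteration is a subtransaction: a corrupted block only loses its own rows
            EXECUTE format(
                $q$INSERT INTO public.test_corruption2_healthy
                       SELECT * FROM public.test_corruption2
                       WHERE ctid >= '(%s,0)'::tid AND ctid < '(%s,0)'::tid$q$,
                blk, blk + 1);
        EXCEPTION WHEN OTHERS THEN
            RAISE NOTICE 'skipping block %: %', blk, SQLERRM;
        END;
    END LOOP;
END $$;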

5) Corrupted rows retrieval

If you want to attempt to recover data from the corrupted rows, please contact EDB. Each situation will be dealt with on a case-by-case basis, depending on the type of corruption.

Whether pg_surgery could be used will be evaluated.
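
For illustration only (do not run this without EDB guidance): the pg_surgery extension (PG 14+) provides heap_force_kill and heap_force_freeze, which operate on individual tuples by ctid. The ctids below are placeholders:

CREATE EXTENSION pg_surgery;
-- forcibly remove tuples that cannot be deleted through normal means
SELECT heap_force_kill('public.test_corruption2'::regclass, ARRAY['(0,1)']::tid[]);
-- forcibly freeze tuples whose transaction status (pg_xact) data is lost
SELECT heap_force_freeze('public.test_corruption2'::regclass, ARRAY['(0,2)']::tid[]);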

Conclusion

  • Please proceed with extreme caution when attempting to recover a corrupted database.

  • This article is based on past experiences in which our expert staff were involved. It is provided in the hope that it may be useful, but without any guarantee.

  • In all cases, please contact EDB Technical Support for further assistance before performing these steps. If you try any of the techniques mentioned in this article without EDB support, you do so at your own risk, and you may lose or further corrupt your data.
