How to force redo to start from an earlier checkpoint

Craig Ringer
Craig Ringer

Some kinds of corruptions can be repaired by forcing PostgreSQL to restart redo at an earlier checkpoint than the restart point in the control file. Here's how to do it.

WARNING: Please check this procedure on a test instance before following it live.

Take a full snapshot of the system if at all possible, before doing anything more. Doing this wrong or doing it when it's not the appropriate fix can make a bad problem worse.

Used correctly, you can force PostgreSQL to retry WAL that was misapplied the first time around due to disk errors, creative operator intervention, postgres or kernel bugs, etc, and repair a broken cluster.

You must have a continuous archive of xlogs from before whatever the corruption was until the current state to attempt this fix. It will only help if the xlogs are correct and intact, and it's just the heap that's corrupt.

  • Examine the xlog filesystem timestamps and/or use pg_xlogdump / pg_waldump to find roughly the LSN range at which forced redo should start.
  • Use pg_xlogdump / pg_waldump to find the exact LSN of a checkpoint at which redo should start. Note down the checkpoint LSN and the file it is in.
  • Set hot_standby = on in postgresql.conf
  • Start the server with a recovery.conf. This updates the control data, setting the cluster status as archive recovery (the status of a running standby). Make sure that the recovery.conf has:
  • standby_mode = on
  • pause_at_recovery_target = on
  • A restore_command if not all WAL is in pg_xlog
  • Stop postgres with pg_ctl stop -m immediate to avoid performing a clean shutdown that would change the status. If it was already shut down, start it then stop it again with this command. Check pg_controldata to make sure it shows the status as archive_recovery.
  • Copy all WAL from that file through to the current state into pg_xlog, or copy the first oldest few segments including the start location segment and set a restore_command in recovery.conf.
  • Create a backup_label file that lists the lsn and xlog segment file of the checkpoint, e.g.
START WAL LOCATION: 44C4/ED7A7EF8 (file 00000001000044C4000000ED)
BACKUP FROM: standby
START TIME: Fri 24 Feb 2017 05:01:07 PM UTC
LABEL: force recovery
  • Start the server
  • Wait for it to replay all WAL and get back to where it was. It will remain in standby mode
  • Shut the server down and remove recovery.conf
  • Start the server; it should now operate normally

This procedure should not increment the server's timeline.

Was this article helpful?

0 out of 0 found this helpful