In some cases you need the stack trace of a postgres process that produces an error shortly after connecting, so quickly that there is no time to attach gdb to it.
If you are facing this situation, you can follow this guide to delay the start of the process, which gives you time to get the PID and set gdb options.
For better debugging information, the postgresql server should have been installed with the *-debuginfo packages (for Red Hat / CentOS / Fedora) or the *-dbgsym / *-dbg packages (for Debian / Ubuntu). If you also need to debug extensions, their debuginfo packages should be installed as well. See related Knowledge Base articles for more information.
You will also need the gdb package installed.
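Before starting, it can help to confirm that the tools used in the steps below are on the PATH. The following is a minimal sketch; require_cmd is a hypothetical helper name introduced here for illustration, not part of postgresql or gdb. It does not check for the debuginfo packages, whose names depend on your distribution and postgresql version.

```shell
# Hypothetical helper: succeed only if the named command is found on PATH.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing required command: $1" >&2
        return 1
    fi
}

# gdb is needed for step 3 of this guide.
if ! require_cmd gdb; then
    echo "install gdb before continuing" >&2
fi
```

The same helper can be called for pgrep, which step 3 also relies on.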
Once gdb is installed, run the following steps, replacing the example command with the one whose stack trace you want to collect. As an example, this guide debugs the following command: reindexdb -d somedb -t sometable
1) Preparing gdb commands
First we need to create a cmds file with some gdb commands that will be used later:
cat > cmds << EOF
set logging file gdb.txt
set logging on
set pagination off
set confirm off
break errfinish
commands 1
print errordata[0]
bt
cont
end
cont
quit
EOF
2) Delaying start of the postgres process
In a shell, run the following command and send it to the background:
PGOPTIONS="-W10" reindexdb -d somedb -t sometable &
Note that the -W10 option delays the start of the postgres process by 10 seconds, so you should have everything your command needs ready before running gdb, and attach within that window.
3) Collecting with gdb
Use the PID of the postgresql backend that is in the 'startup' phase as the process argument to gdb:
gdb -p $(pgrep -f 'postgres:.*startup') -x cmds
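The pgrep pattern above can match zero backends (if the process has not reached the 'startup' phase yet) or several (if other connections are starting at the same time), and gdb -p expects a single PID. The following is a minimal sketch of a guard for that case; single_pid is a hypothetical helper name introduced here, not a standard tool.

```shell
# Hypothetical helper: read PIDs (one per line) on stdin and print the PID
# only if exactly one was given; otherwise report the count and fail.
single_pid() {
    pids=$(cat)
    count=$(printf '%s\n' "$pids" | grep -c '^[0-9][0-9]*$' || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one matching backend, found $count" >&2
        return 1
    fi
    printf '%s\n' "$pids" | grep '^[0-9][0-9]*$'
}

# Usage with the pattern from this step (shown as a comment, not run here):
#   pid=$(pgrep -f 'postgres:.*startup' | single_pid) && gdb -p "$pid" -x cmds
```

If the guard fails, re-run it within the delay window or narrow the pgrep pattern before attaching.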
Note that in this example gdb will break every time the errfinish function is reached. You can change this behaviour according to your needs.
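One way to change the breakpoint is to generate the command file from a small function instead of editing it by hand. The following is a sketch under assumptions: write_gdb_cmds is a hypothetical helper name, and it omits the print errordata[0] line from step 1 because that line is specific to breaking in errfinish.

```shell
# Hypothetical helper: write a gdb command file that breaks on a chosen function.
# $1 = function to break on, $2 = output file (both chosen by the caller).
write_gdb_cmds() {
    cat > "$2" << EOF
set logging file gdb.txt
set logging on
set pagination off
set confirm off
break $1
commands 1
bt
cont
end
cont
quit
EOF
}

# Example (shown as a comment, not run here); PageIsVerified is used only
# because it appears in the sample trace in this article:
#   write_gdb_cmds PageIsVerified cmds
```

The resulting file is passed to gdb with -x in the same way as the cmds file from step 1.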
This last step produces a gdb.txt file containing the collected stack trace of your command. For example:
Breakpoint 1, errfinish (dummy=dummy@entry=16779816) at elog.c:414
$1 = {elevel = 19, output_to_server = 1 '\001', output_to_client = 1 '\001', show_funcname = 0 '\000', hide_stmt = 0 '\000', hide_ctx = 0 '\000', filename = 0x989240 "bufpage.c", lineno = 152, funcname = 0x989514 <__func__.7628> "PageIsVerified", domain = 0x964bc9 "postgres-10", context_domain = 0x964bc9 "postgres-10", sqlerrcode = 64, message = 0x28c9528 "page verification failed, calculated checksum 3487 but expected 17674", detail = 0x0, detail_log = 0x0, hint = 0x0, context = 0x0, message_id = 0x989280 "page verification failed, calculated checksum %u but expected %u", schema_name = 0x0, table_name = 0x0, column_name = 0x0, datatype_name = 0x0, constraint_name = 0x0, cursorpos = 0, internalpos = 0, internalquery = 0x0, saved_errno = 0, assoc_context = 0x28c6c78}
#0 errfinish (dummy=dummy@entry=16779816) at elog.c:414
#1 0x0000000000713b41 in PageIsVerified (page=page@entry=0x7f566b47d000 "ERR", blkno=blkno@entry=0) at bufpage.c:149
#2 0x00000000006f0020 in ReadBuffer_common (smgr=0x29a62e8, relpersistence=<optimized out>, forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=0, mode=RBM_NORMAL, strategy=0x0, hit=hit@entry=0x7ffe42754e17 "") at bufmgr.c:901
#3 0x00000000006f0b40 in ReadBufferExtended (reln=0x7f56c9401068, forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=0, mode=mode@entry=RBM_NORMAL, strategy=<optimized out>) at bufmgr.c:664
#4 0x00000000004b1bf3 in heapgetpage (scan=scan@entry=0x2921c98, page=page@entry=0) at heapam.c:373
#5 0x00000000004b2986 in heapgettup (key=0x0, nkeys=0, dir=ForwardScanDirection, scan=0x2921c98) at heapam.c:525
#6 heap_getnext (scan=scan@entry=0x2921c98, direction=direction@entry=ForwardScanDirection) at heapam.c:1804
#7 0x00000000005125d5 in IndexBuildHeapRangeScan (heapRelation=heapRelation@entry=0x7f56c9401068, indexRelation=indexRelation@entry=0x7f56c9401838, indexInfo=indexInfo@entry=0x29220a8, allow_sync=allow_sync@entry=1 '\001', anyvisible=anyvisible@entry=0 '\000', start_blockno=start_blockno@entry=0, numblocks=numblocks@entry=4294967295, callback=callback@entry=0x4c8b50 <btbuildCallback>, callback_state=callback_state@entry=0x7ffe427553e0) at index.c:2311
#8 0x0000000000512b73 in IndexBuildHeapScan (heapRelation=heapRelation@entry=0x7f56c9401068, indexRelation=indexRelation@entry=0x7f56c9401838, indexInfo=indexInfo@entry=0x29220a8, allow_sync=allow_sync@entry=1 '\001', callback=callback@entry=0x4c8b50 <btbuildCallback>, callback_state=callback_state@entry=0x7ffe427553e0) at index.c:2191
#9 0x00000000004c8a67 in btbuild (heap=0x7f56c9401068, index=0x7f56c9401838, indexInfo=0x29220a8) at nbtree.c:209
[...]
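Because each frame line carries many arguments, a condensed view of the call chain can make a long gdb.txt easier to scan. The following is a minimal sketch, assuming the frame lines have the shape shown in the example above; frame_summary is a hypothetical helper name introduced here.

```shell
# Hypothetical helper: print "#N function" for every backtrace frame
# found in a gdb log read from stdin.
frame_summary() {
    awk '/^#[0-9]+ / {
        # Most frames look like "#1 0x... in func (...)"; frames without
        # an address look like "#0 func (...)".
        fn = ($2 ~ /^0x/ && $3 == "in") ? $4 : $2
        print $1, fn
    }'
}

# Usage (shown as a comment, not run here):
#   frame_summary < gdb.txt
```

The full gdb.txt should still be kept, since the argument values and source locations are often what identifies the problem.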