Write database logs inside the table

Supriya Khosare

The purpose of this article is to show you how to pull PostgreSQL log file content into Postgres tables.

Why is this needed?

Sometimes you need to build a view or report over the information in the Postgres logs, and the first step is to get that data into a table so it can be analyzed further. By default, Postgres has no table that captures the contents of the log file. When something suspicious happens, the fastest way to investigate is to query the log rather than scan through enormous log files by hand.

The possible ways to achieve this are as follows:

  1. Table auditing.
  2. Using the CSV log format and pulling the entire log file into a table at once.
  3. Using the file_fdw module to read the log file into a table at query time.

In the case of table auditing, you create a trigger on the required tables and log each event from the main table to an audit table. This does not capture every line you would see in the database log files, but it can certainly record events such as inserts, updates, and deletes (a minimal sketch follows below). In this article, we will focus on the second and third ways of capturing Postgres logs.
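For completeness, here is a minimal sketch of the trigger-based approach, modeled on the audit-trigger example in the PostgreSQL documentation. It is illustrative only: orders is a hypothetical application table, and the audit table stores each affected row as jsonb.

-- hypothetical audit table; "orders" stands in for any table you audit
CREATE TABLE orders_audit
(
  event_time timestamptz NOT NULL DEFAULT now(),
  operation  text NOT NULL,   -- 'INSERT', 'UPDATE', or 'DELETE'
  row_data   jsonb            -- the affected row, serialized to JSON
);

CREATE OR REPLACE FUNCTION audit_orders() RETURNS trigger AS $$
BEGIN
  IF (TG_OP = 'DELETE') THEN
    INSERT INTO orders_audit (operation, row_data) VALUES (TG_OP, to_jsonb(OLD));
  ELSE
    INSERT INTO orders_audit (operation, row_data) VALUES (TG_OP, to_jsonb(NEW));
  END IF;
  RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION audit_orders();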

Using CSV-Format Log Output

The very first step is to change log_destination to csvlog. Note that CSV-format logging also requires logging_collector to be enabled. Open the postgresql.conf file, make the change shown below, and reload the server for it to take effect.

#log_destination = 'stderr'
log_destination = 'csvlog'
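Changing log_destination does not require a restart; a reload is enough. For example, from a superuser session:

-- apply the postgresql.conf change without restarting the server
SELECT pg_reload_conf();

-- confirm the setting took effect
SHOW log_destination;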

See below how the log file naming changes once this is set:

bash-4.2$ cd /var/lib/edb/as15/data/log
bash-4.2$ ls -ltrh
total 12K
-rw-------. 1 enterprisedb enterprisedb 2.6K Apr 8 05:38 edb-2023-04-08_053334.log
-rw-------. 1 enterprisedb enterprisedb  156 Apr 8 05:38 edb-2023-04-08_053836.log   <-- before
-rw-------. 1 enterprisedb enterprisedb 2.3K Apr 8 05:38 edb-2023-04-08_053836.csv   <-- after

Now let's create a table called postgres_log, with columns matching the csvlog format.

CREATE TABLE postgres_log
(
  log_time timestamp(3) with time zone,
  user_name text,
  database_name text,
  process_id integer,
  connection_from text,
  session_id text,
  session_line_num bigint,
  command_tag text,
  session_start_time timestamp with time zone,
  virtual_transaction_id text,
  transaction_id bigint,
  error_severity text,
  sql_state_code text,
  message text,
  detail text,
  hint text,
  internal_query text,
  internal_query_pos integer,
  context text,
  query text,
  query_pos integer,
  location text,
  application_name text,
  backend_type text,
  leader_pid integer,
  query_id bigint,
  PRIMARY KEY (session_id, session_line_num)
);


enterprisedb=# \dt
             List of relations
 Schema |     Name     | Type  |    Owner
--------+--------------+-------+--------------
 public | postgres_log | table | enterprisedb
(1 row)

enterprisedb=# \d postgres_log
                          Table "public.postgres_log"
         Column         |            Type             | Collation | Nullable | Default
------------------------+-----------------------------+-----------+----------+---------
 log_time               | timestamp(3) with time zone |           |          |
 user_name              | text                        |           |          |
 database_name          | text                        |           |          |
 process_id             | integer                     |           |          |
 connection_from        | text                        |           |          |
 session_id             | text                        |           | not null |
 session_line_num       | bigint                      |           | not null |
 command_tag            | text                        |           |          |
 session_start_time     | timestamp with time zone    |           |          |
 virtual_transaction_id | text                        |           |          |
 transaction_id         | bigint                      |           |          |
 error_severity         | text                        |           |          |
 sql_state_code         | text                        |           |          |
 message                | text                        |           |          |
 detail                 | text                        |           |          |
 hint                   | text                        |           |          |
 internal_query         | text                        |           |          |
 internal_query_pos     | integer                     |           |          |
 context                | text                        |           |          |
 query                  | text                        |           |          |
 query_pos              | integer                     |           |          |
 location               | text                        |           |          |
 application_name       | text                        |           |          |
 backend_type           | text                        |           |          |
 leader_pid             | integer                     |           |          |
 query_id               | bigint                      |           |          |
Indexes:
    "postgres_log_pkey" PRIMARY KEY, btree (session_id, session_line_num)

Now let's import a log file into this table, using psql's \COPY command:

enterprisedb=# \COPY postgres_log FROM '/var/lib/edb/as15/data/log/edb-2023-04-08_053836.csv' WITH csv;
COPY 72
enterprisedb=# \x
Expanded display is on.
enterprisedb=# select * from postgres_log limit 5;

log_time | 08-APR-23 05:38:36.37 -07:00
user_name | 
database_name | 
process_id | 26481
connection_from | 
session_id | 6431604c.6771
session_line_num | 1
command_tag | 
session_start_time | 08-APR-23 05:38:36 -07:00
virtual_transaction_id | 
transaction_id | 0
error_severity | LOG
sql_state_code | 00000
message | ending log output to stderr
detail | 
hint | Future log output will go to log destination "csvlog".
internal_query | 
internal_query_pos | 
context | 
query | 
query_pos | 
location | 
application_name | 
backend_type | postmaster
leader_pid | 
query_id | 0

log_time | 08-APR-23 05:38:36.37 -07:00
user_name | 
database_name | 
process_id | 26481
connection_from | 
session_id | 6431604c.6771
session_line_num | 2
command_tag | 
session_start_time | 08-APR-23 05:38:36 -07:00
virtual_transaction_id | 
transaction_id | 0
error_severity | LOG
sql_state_code | 00000
message | starting PostgreSQL 15.2 (EnterpriseDB Advanced Server 15.2.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
detail | 
hint | 
internal_query | 
internal_query_pos | 
context | 
query | 
query_pos | 
location | 
application_name | 
backend_type | postmaster
leader_pid | 
query_id | 0

log_time | 08-APR-23 05:38:36.37 -07:00
user_name | 
database_name | 
process_id | 26481
connection_from | 
session_id | 6431604c.6771
session_line_num | 3
command_tag | 
session_start_time | 08-APR-23 05:38:36 -07:00
virtual_transaction_id | 
transaction_id | 0
error_severity | LOG
sql_state_code | 00000
message | listening on IPv4 address "0.0.0.0", port 5444
detail | 
hint | 
internal_query | 
internal_query_pos | 
context | 
query | 
query_pos | 
location | 
application_name | 
backend_type | postmaster
leader_pid | 
query_id | 0

log_time | 08-APR-23 05:38:36.37 -07:00
user_name | 
database_name | 
process_id | 26481
connection_from | 
session_id | 6431604c.6771
session_line_num | 4
command_tag | 
session_start_time | 08-APR-23 05:38:36 -07:00
virtual_transaction_id | 
transaction_id | 0
error_severity | LOG
sql_state_code | 00000
message | listening on IPv6 address "::", port 5444
detail | 
hint | 
internal_query | 
internal_query_pos | 
context | 
query | 
query_pos | 
location | 
application_name | 
backend_type | postmaster
leader_pid | 
query_id | 0

Compare this with the corresponding log file:

bash-4.2$ head -10 edb-2023-04-08_053836.csv
2023-04-08 05:38:36.370 PDT,,,26481,,6431604c.6771,1,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,"","postmaster",,0
2023-04-08 05:38:36.370 PDT,,,26481,,6431604c.6771,2,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"starting PostgreSQL 15.2 (EnterpriseDB Advanced Server 15.2.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit",,,,,,,,,"","postmaster",,0
2023-04-08 05:38:36.370 PDT,,,26481,,6431604c.6771,3,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"listening on IPv4 address ""0.0.0.0"", port 5444",,,,,,,,,"","postmaster",,0
2023-04-08 05:38:36.370 PDT,,,26481,,6431604c.6771,4,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"listening on IPv6 address ""::"", port 5444",,,,,,,,,"","postmaster",,0
2023-04-08 05:38:36.374 PDT,,,26481,,6431604c.6771,5,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"listening on Unix socket ""/tmp/.s.PGSQL.5444""",,,,,,,,,"","postmaster",,0
2023-04-08 05:38:36.376 PDT,,,26481,,6431604c.6771,6,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"

bash-4.2$ 
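Once the log rows are in a regular table, you can analyze them with plain SQL. As a quick illustration (this query is not from the original article):

-- count log entries per severity level
SELECT error_severity, count(*) AS entries
FROM postgres_log
GROUP BY error_severity
ORDER BY entries DESC;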

Here is the link to the original documentation: https://www.postgresql.org/docs/current/runtime-config-logging.html

Using file_fdw

It is also possible to access the file as a foreign table, using the supplied file_fdw module. There are a few things you need to do to simplify importing CSV log files:

  1. Set log_filename and log_rotation_age to provide a consistent, predictable naming scheme for your log files. This lets you predict what the file name will be and know when an individual log file is complete and therefore ready to be imported.

  2. Set log_rotation_size to 0 to disable size-based log rotation, as it makes the log file name difficult to predict.

  3. Set log_truncate_on_rotation to on so that old log data isn't mixed with the new in the same file.

  4. The table definition above includes a primary key specification. This is useful to protect against accidentally importing the same information twice.

The COPY command commits all of the data it imports at one time, so any error will cause the entire import to fail. If you import a partial log file and later import the file again when it is complete, the primary key violation will cause the import to fail. Wait until the log is complete and closed before importing. This procedure will also protect against accidentally importing a partial line that hasn't been completely written, which would also cause COPY to fail.
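If you ever need to top up the table from a file that was already partially imported, one possible workaround (a sketch, not from the original documentation) is to stage the rows in a temporary table and let the primary key filter out duplicates:

-- stage the CSV rows in a temporary table with the same layout
CREATE TEMP TABLE postgres_log_stage (LIKE postgres_log);

\COPY postgres_log_stage FROM '/var/lib/edb/as15/data/log/edb-2023-04-08_053836.csv' WITH csv

-- keep only rows whose (session_id, session_line_num) is not already present
INSERT INTO postgres_log
SELECT * FROM postgres_log_stage
ON CONFLICT (session_id, session_line_num) DO NOTHING;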

One of the obvious uses for file_fdw is to make the PostgreSQL activity log available as a table for querying. To do this, you must first be logging to a CSV file; in the commands below, substitute your own log file name.

  1. First, install file_fdw as an extension:
enterprisedb=# CREATE EXTENSION file_fdw;
  2. Then create a foreign server:
enterprisedb=# CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;
  3. Now you are ready to create the foreign table. Using the CREATE FOREIGN TABLE command, you will need to define the columns for the table, the CSV file name, and its format:
enterprisedb=# CREATE FOREIGN TABLE pglog (
  log_time timestamp(3) with time zone,
  user_name text,
  database_name text,
  process_id integer,
  connection_from text,
  session_id text,
  session_line_num bigint,
  command_tag text,
  session_start_time timestamp with time zone,
  virtual_transaction_id text,
  transaction_id bigint,
  error_severity text,
  sql_state_code text,
  message text,
  detail text,
  hint text,
  internal_query text,
  internal_query_pos integer,
  context text,
  query text,
  query_pos integer,
  location text,
  application_name text,
  backend_type text,
  leader_pid integer,
  query_id bigint
) SERVER pglog
OPTIONS ( filename 'log/latestlogfilename.csv', format 'csv' );

Done. Now you can query your log directly. In production, of course, you would need to define some way to deal with log rotation; one possible approach is sketched below.
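One possible way to handle rotation (a sketch, not from the original article) is file_fdw's program option, which reads the output of a shell command instead of a single file. Note that this assumes the role creating the table is a superuser or has the pg_execute_server_program privilege:

-- illustrative variant: read every rotated CSV log in one foreign table
enterprisedb=# CREATE FOREIGN TABLE pglog_all (
  log_time timestamp(3) with time zone,
  user_name text,
  database_name text,
  process_id integer,
  connection_from text,
  session_id text,
  session_line_num bigint,
  command_tag text,
  session_start_time timestamp with time zone,
  virtual_transaction_id text,
  transaction_id bigint,
  error_severity text,
  sql_state_code text,
  message text,
  detail text,
  hint text,
  internal_query text,
  internal_query_pos integer,
  context text,
  query text,
  query_pos integer,
  location text,
  application_name text,
  backend_type text,
  leader_pid integer,
  query_id bigint
) SERVER pglog
OPTIONS ( program 'cat /var/lib/edb/as15/data/log/*.csv', format 'csv' );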

You can list the table using the \dE+ and \detr commands.

enterprisedb=# \dE+
                                 List of relations
 Schema | Name  |     Type      |    Owner     | Persistence |  Size   | Description
--------+-------+---------------+--------------+-------------+---------+-------------
 public | pglog | foreign table | enterprisedb | permanent   | 0 bytes |

enterprisedb=# \detr
    List of foreign tables
 Schema | Table | Server
--------+-------+--------
 public | pglog | pglog
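With the foreign table in place, the live log can be queried directly. As an illustration (not from the original article), here are the ten most recent error-level entries:

-- fetch the most recent errors straight from the live log
SELECT log_time, user_name, message
FROM pglog
WHERE error_severity IN ('ERROR', 'FATAL', 'PANIC')
ORDER BY log_time DESC
LIMIT 10;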

Before PostgreSQL 11, only a superuser could use file_fdw to read files from the operating system. From PostgreSQL 11 onward, the following default roles (called predefined roles in newer releases) enable file_fdw usage by non-superusers; see the example after this list:

  1. pg_read_server_files
  2. pg_write_server_files
  3. pg_execute_server_program
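For example (log_reader is a hypothetical role), to let a non-superuser create and use file_fdw tables that read server-side files:

-- pg_read_server_files allows reading any file the server itself can read,
-- so grant it only to trusted roles
GRANT pg_read_server_files TO log_reader;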

Here is the link to the original documentation: https://www.postgresql.org/docs/current/file-fdw.html

Now let's compare the two methods used above.

  1. A foreign table can be used in queries just like a normal table, but a foreign table has no storage in the PostgreSQL server. Whenever it is used, PostgreSQL asks the foreign data wrapper to fetch data from the external source, or transmit data to the external source in the case of update commands.
  2. With the second method (file_fdw), you see the log data at query time, whereas with the first method, to see the latest logs you must reload the table by running COPY again with the latest log file.

The demo below shows how the pglog table (created using the FDW) picks up log entries written after the table was created.

enterprisedb=# create schema newme;
CREATE SCHEMA

enterprisedb=# create table newme.tt(id int);
CREATE TABLE

enterprisedb=# insert into newme.tt values(101);
INSERT 0 1

enterprisedb=# drop table newme.tt;
DROP TABLE

enterprisedb=# create user u1;
CREATE ROLE

enterprisedb=# select now();
               now
----------------------------------
 30-JUN-23 06:46:35.184503 -07:00
(1 row)

enterprisedb=# select log_time,message from postgres_log order by 1 DESC limit 5;
           log_time           |            message
------------------------------+---------------------------------
 30-JUN-23 06:40:37.82 -07:00 | statement: drop user supriya;
 30-JUN-23 06:40:28.895 -07:00 | role "supriya" already exists
 30-JUN-23 06:40:28.895 -07:00 | statement: create user supriya;
 30-JUN-23 06:40:23.48 -07:00 | statement: drop table t;
 30-JUN-23 06:40:23.48 -07:00 | table "t" does not exist
(5 rows)

enterprisedb=# select log_time,message from pglog order by 1 DESC limit 5;
           log_time           |            message
------------------------------+---------------------------------
30-JUN-23 06:46:20.111 -07:00 | statement: create user u1;
30-JUN-23 06:46:12.947 -07:00 | statement: drop table newme.tt;
30-JUN-23 06:45:39.738 -07:00 | statement: create table newme.tt(id int);
30-JUN-23 06:45:26.68 -07:00 | statement: create schema newme;
30-JUN-23 06:44:39.32 -07:00 | statement: CREATE FOREIGN TABLE pglog ( +
| log_time timestamp(3) with time zone, +
| user_name text, +
| database_name text, +
| process_id integer, +
| connection_from text, +
| session_id text, +
| session_line_num bigint, +
| command_tag text, +
| session_start_time timestamp with time zone, +
| virtual_transaction_id text, +
| transaction_id bigint, +
| error_severity text, +
| sql_state_code text, +
| message text, +
| detail text, +
| hint text, +
| internal_query text, +
| internal_query_pos integer, +
| context text, +
| query text, +
| query_pos integer, +
| location text, +
| application_name text, +
| backend_type text, +
| leader_pid integer, +
| query_id bigint +
| ) SERVER pglog +
| OPTIONS ( filename '/var/lib/edb/as15/data/log/edb-2023-06-30_043358.csv', format 'csv');
(5 rows)

The foreign table was created at 30-JUN-23 06:44:39.32, and it shows all entries logged after that time. This means that with the pglog table you can view the most recent data from the log file, which was not the case with the previous method.
