The purpose of this article is to show you how to pull PostgreSQL log file content into Postgres tables.
Sometimes you need to build a view or report on the information in the Postgres logs, and the first step is to get that data into a table so it can be analyzed further. By default, Postgres has no table that captures the contents of the log file. When something suspicious happens, the fastest way to debug is to query the log rather than scan through enormous raw log files.
The possible ways to achieve this are as follows:
- Table auditing.
- Using the CSV log format and pulling the entire log file into a table at once with COPY.
- Using the file_fdw module to read everything from the log file into a table at query time.
In the case of table auditing, you create a trigger on the required tables so that each event on the main table is logged to an audit table. You cannot capture every line you would see in the database log files this way, but you can certainly log events such as inserts, updates, and deletes. In this article, we will focus on the second and third ways of capturing Postgres logs.
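Before moving on, for completeness, here is a minimal sketch of the auditing approach; the table, column, and function names (accounts, accounts_audit, log_account_change) are hypothetical, not from this article:

-- Hypothetical main table for the example
CREATE TABLE accounts (
    id      integer PRIMARY KEY,
    balance numeric
);

-- Audit table capturing one row per change, plus metadata
CREATE TABLE accounts_audit (
    changed_at timestamptz NOT NULL DEFAULT now(),
    operation  text        NOT NULL,  -- 'INSERT', 'UPDATE', or 'DELETE'
    account_id integer,
    balance    numeric
);

-- Trigger function that copies each change into the audit table
CREATE OR REPLACE FUNCTION log_account_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO accounts_audit (operation, account_id, balance)
        VALUES (TG_OP, OLD.id, OLD.balance);
        RETURN OLD;
    ELSE
        INSERT INTO accounts_audit (operation, account_id, balance)
        VALUES (TG_OP, NEW.id, NEW.balance);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_audit_trg
    AFTER INSERT OR UPDATE OR DELETE ON accounts
    FOR EACH ROW EXECUTE FUNCTION log_account_change();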
The very first step is to change log_destination to csvlog. Open the postgresql.conf file, make the change shown below, and reload the configuration for it to take effect.
> #log_destination = 'stderr'
log_destination = 'csvlog'
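Changing log_destination only requires a configuration reload, not a restart. One way to reload, from a superuser session:

-- Re-read postgresql.conf without restarting the server
SELECT pg_reload_conf();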
See below how the log file format changes once we set the above:
bash-4.2$ cd /var/lib/edb/as15/data/log
bash-4.2$ ls -ltrh
total 12K
-rw-------. 1 enterprisedb enterprisedb 2.6K Apr 8 05:38 edb-2023-04-08_053334.log
-rw-------. 1 enterprisedb enterprisedb 156 Apr 8 05:38 edb-2023-04-08_053836.log   <-- before
-rw-------. 1 enterprisedb enterprisedb 2.3K Apr 8 05:38 edb-2023-04-08_053836.csv  <-- after
Now let's create a table called postgres_log.
CREATE TABLE postgres_log
(
log_time timestamp(3) with time zone,
user_name text,
database_name text,
process_id integer,
connection_from text,
session_id text,
session_line_num bigint,
command_tag text,
session_start_time timestamp with time zone,
virtual_transaction_id text,
transaction_id bigint,
error_severity text,
sql_state_code text,
message text,
detail text,
hint text,
internal_query text,
internal_query_pos integer,
context text,
query text,
query_pos integer,
location text,
application_name text,
backend_type text,
leader_pid integer,
query_id bigint,
PRIMARY KEY (session_id, session_line_num)
);
enterprisedb=# \dt
              List of relations
 Schema |     Name     | Type  |    Owner
--------+--------------+-------+--------------
 public | postgres_log | table | enterprisedb
(1 row)
enterprisedb=# \d postgres_log
Table "public.postgres_log"
Column | Type | Collation | Nullable | Default
log_time | timestamp(3) with time zone | | |
user_name | text | | |
database_name | text | | |
process_id | integer | | |
connection_from | text | | |
session_id | text | | not null |
session_line_num | bigint | | not null |
command_tag | text | | |
session_start_time | timestamp with time zone | | |
virtual_transaction_id | text | | |
transaction_id | bigint | | |
error_severity | text | | |
sql_state_code | text | | |
message | text | | |
detail | text | | |
hint | text | | |
internal_query | text | | |
internal_query_pos | integer | | |
context | text | | |
query | text | | |
query_pos | integer | | |
location | text | | |
application_name | text | | |
backend_type | text | | |
leader_pid | integer | | |
query_id | bigint | | |
Indexes:
"postgres_log_pkey" PRIMARY KEY, btree (session_id, session_line_num)
Now let's import a log file into this table using psql's \COPY command:
enterprisedb=# \COPY postgres_log FROM '/var/lib/edb/as15/data/log/edb-2023-04-08_053836.csv' WITH csv;
COPY 72
enterprisedb=# \x
Expanded display is on.
enterprisedb=# select * from postgres_log limit 5;
-[ RECORD 1 ]-----------------------------
log_time | 08-APR-23 05:38:36.37 -07:00
user_name |
database_name |
process_id | 26481
connection_from |
session_id | 6431604c.6771
session_line_num | 1
command_tag |
session_start_time | 08-APR-23 05:38:36 -07:00
virtual_transaction_id |
transaction_id | 0
error_severity | LOG
sql_state_code | 00000
message | ending log output to stderr
detail |
hint | Future log output will go to log destination "csvlog".
internal_query |
internal_query_pos |
context |
query |
query_pos |
location |
application_name |
backend_type | postmaster
leader_pid |
query_id | 0
-[ RECORD 2 ]-----------------------------
log_time | 08-APR-23 05:38:36.37 -07:00
user_name |
database_name |
process_id | 26481
connection_from |
session_id | 6431604c.6771
session_line_num | 2
command_tag |
session_start_time | 08-APR-23 05:38:36 -07:00
virtual_transaction_id |
transaction_id | 0
error_severity | LOG
sql_state_code | 00000
message | starting PostgreSQL 15.2 (EnterpriseDB Advanced Server 15.2.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
detail |
hint |
internal_query |
internal_query_pos |
context |
query |
query_pos |
location |
application_name |
backend_type | postmaster
leader_pid |
query_id | 0
-[ RECORD 3 ]-----------------------------
log_time | 08-APR-23 05:38:36.37 -07:00
user_name |
database_name |
process_id | 26481
connection_from |
session_id | 6431604c.6771
session_line_num | 3
command_tag |
session_start_time | 08-APR-23 05:38:36 -07:00
virtual_transaction_id |
transaction_id | 0
error_severity | LOG
sql_state_code | 00000
message | listening on IPv4 address "0.0.0.0", port 5444
detail |
hint |
internal_query |
internal_query_pos |
context |
query |
query_pos |
location |
application_name |
backend_type | postmaster
leader_pid |
query_id | 0
-[ RECORD 4 ]-----------------------------
log_time | 08-APR-23 05:38:36.37 -07:00
user_name |
database_name |
process_id | 26481
connection_from |
session_id | 6431604c.6771
session_line_num | 4
command_tag |
session_start_time | 08-APR-23 05:38:36 -07:00
virtual_transaction_id |
transaction_id | 0
error_severity | LOG
sql_state_code | 00000
message | listening on IPv6 address "::", port 5444
detail |
hint |
internal_query |
internal_query_pos |
context |
query |
query_pos |
location |
application_name |
backend_type | postmaster
leader_pid |
query_id | 0
For comparison, here is the corresponding raw log file:
bash-4.2$ head -10 edb-2023-04-08_053836.csv
2023-04-08 05:38:36.370 PDT,,,26481,,6431604c.6771,1,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,"","postmaster",,0
2023-04-08 05:38:36.370 PDT,,,26481,,6431604c.6771,2,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"starting PostgreSQL 15.2 (EnterpriseDB Advanced Server 15.2.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit",,,,,,,,,"","postmaster",,0
2023-04-08 05:38:36.370 PDT,,,26481,,6431604c.6771,3,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"listening on IPv4 address ""0.0.0.0"", port 5444",,,,,,,,,"","postmaster",,0
2023-04-08 05:38:36.370 PDT,,,26481,,6431604c.6771,4,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"listening on IPv6 address ""::"", port 5444",,,,,,,,,"","postmaster",,0
2023-04-08 05:38:36.374 PDT,,,26481,,6431604c.6771,5,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"listening on Unix socket ""/tmp/.s.PGSQL.5444""",,,,,,,,,"","postmaster",,0
2023-04-08 05:38:36.376 PDT,,,26481,,6431604c.6771,6,,2023-04-08 05:38:36 PDT,,0,LOG,00000,"
bash-4.2$
Here is the link to the original documentation: https://www.postgresql.org/docs/current/runtime-config-logging.html
It is also possible to access the file as a foreign table, using the supplied file_fdw module. There are a few things you need to do to simplify importing CSV log files:
- Set log_filename and log_rotation_age to provide a consistent, predictable naming scheme for your log files. This lets you predict what the file name will be and know when an individual log file is complete and therefore ready to be imported.
- Set log_rotation_size to 0 to disable size-based log rotation, as it makes the log file name difficult to predict.
- Set log_truncate_on_rotation to on so that old log data isn't mixed with the new in the same file.
- The table definition above includes a primary key specification. This is useful to protect against accidentally importing the same information twice. The COPY command commits all of the data it imports at one time, so any error will cause the entire import to fail. If you import a partial log file and later import the file again when it is complete, the primary key violation will cause the import to fail. Wait until the log is complete and closed before importing. This procedure will also protect against accidentally importing a partial line that hasn't been completely written, which would also cause COPY to fail.
A postgresql.conf sketch that follows these recommendations is shown below.
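This is a minimal sketch of those settings, assuming daily log rotation; the exact log_filename pattern is a choice, not a requirement:

# postgresql.conf: predictable CSV log files for importing
log_destination = 'csvlog'
logging_collector = on
log_filename = 'postgresql-%Y-%m-%d'   # escapes expand to the date; .csv is appended
log_rotation_age = '1d'                # rotate once a day
log_rotation_size = 0                  # disable size-based rotation
log_truncate_on_rotation = on          # overwrite rather than append on rotation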
One of the obvious uses for file_fdw is to make the PostgreSQL activity log available as a table for querying. To do this, you must first be logging to a CSV file; the commands below use a placeholder file name that you should replace with your latest CSV log file.
- First, install file_fdw as an extension:
enterprisedb=# CREATE EXTENSION file_fdw;
- Then create a foreign server:
enterprisedb=# CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;
- Now you are ready to create the foreign data table. Using the CREATE FOREIGN TABLE command, you will need to define the columns for the table, the CSV file name, and its format:
enterprisedb=# CREATE FOREIGN TABLE pglog (
log_time timestamp(3) with time zone,
user_name text,
database_name text,
process_id integer,
connection_from text,
session_id text,
session_line_num bigint,
command_tag text,
session_start_time timestamp with time zone,
virtual_transaction_id text,
transaction_id bigint,
error_severity text,
sql_state_code text,
message text,
detail text,
hint text,
internal_query text,
internal_query_pos integer,
context text,
query text,
query_pos integer,
location text,
application_name text,
backend_type text,
leader_pid integer,
query_id bigint
) SERVER pglog
OPTIONS ( filename 'log/latestlogfilename.csv', format 'csv' );
Done, now you can query your log directly. In production, of course, you would need to define some way to deal with log rotation; one possible approach is sketched at the end of this article.
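For example, here is a quick way to pull recent problems out of the live log; the severity values are standard PostgreSQL levels:

-- Most recent errors, straight from the current CSV log file
SELECT log_time, user_name, error_severity, message
FROM pglog
WHERE error_severity IN ('ERROR', 'FATAL', 'PANIC')
ORDER BY log_time DESC
LIMIT 20;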
You can list the table using the \dE[S+] and \detr commands.
enterprisedb=# \dE[S+]
                                  List of relations
 Schema | Name  |     Type      |    Owner     | Persistence |  Size   | Description
--------+-------+---------------+--------------+-------------+---------+-------------
 public | pglog | foreign table | enterprisedb | permanent   | 0 bytes |
enterprisedb=# \detr
  List of foreign tables
 Schema | Table | Server
--------+-------+--------
 public | pglog | pglog
Before PostgreSQL 11, only a superuser could use file_fdw to read files from the operating system. From PostgreSQL 11 onwards, the following default roles are available to enable file_fdw usage by non-superusers (a GRANT example follows the list):
- pg_read_server_files
- pg_write_server_files
- pg_execute_server_program
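For read-only access to server-side log files through file_fdw, the first role is the relevant one. A minimal sketch, assuming a hypothetical role named report_reader:

-- Allow a non-superuser to read server-side files via file_fdw
GRANT pg_read_server_files TO report_reader;

-- USAGE on the foreign server is needed to create foreign tables on it;
-- querying an existing foreign table only needs SELECT on the table
GRANT USAGE ON FOREIGN SERVER pglog TO report_reader;
GRANT SELECT ON pglog TO report_reader;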
Here is the link to the original documentation: https://www.postgresql.org/docs/current/file-fdw.html
- A foreign table can be used in queries just like a normal table, but a foreign table has no storage in the PostgreSQL server. Whenever it is used, PostgreSQL asks the foreign data wrapper to fetch data from the external source, or transmit data to the external source in the case of update commands.
- You see the data at runtime with the second method (file_fdw), whereas with the first method, if you want to look at the latest logs, you have to re-run the COPY command against the newest log file to refresh the table (see the example below).
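For example, to bring the static table up to date once a newer log file is complete (the file name here is illustrative):

-- Import the next completed CSV log file into the static table;
-- re-importing the same file would fail on the primary key, which
-- protects against loading the same data twice
\COPY postgres_log FROM '/var/lib/edb/as15/data/log/edb-2023-06-30_043358.csv' WITH csv;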
The demo below shows how the pglog table (created using the FDW) reflects logs that were written after the table was created.
enterprisedb=# create schema newme;
CREATE SCHEMA
enterprisedb=# create table newme.tt(id int);
CREATE TABLE
enterprisedb=# insert into newme.tt values(101);
INSERT 0 1
enterprisedb=# drop table newme.tt;
DROP TABLE
enterprisedb=# create user u1;
CREATE ROLE
enterprisedb=# select now();
               now
----------------------------------
 30-JUN-23 06:46:35.184503 -07:00
(1 row)
enterprisedb=# select log_time,message from postgres_log order by 1 DESC limit 5;
          log_time           |            message
-----------------------------+---------------------------------
30-JUN-23 06:40:37.82 -07:00 | statement: drop user supriya;
30-JUN-23 06:40:28.895 -07:00 | role "supriya" already exists
30-JUN-23 06:40:28.895 -07:00 | statement: create user supriya;
30-JUN-23 06:40:23.48 -07:00 | statement: drop table t;
30-JUN-23 06:40:23.48 -07:00 | table "t" does not exist
(5 rows)
enterprisedb=# select log_time,message from pglog order by 1 DESC limit 5;
          log_time           |            message
-----------------------------+---------------------------------
30-JUN-23 06:46:20.111 -07:00 | statement: create user u1;
30-JUN-23 06:46:12.947 -07:00 | statement: drop table newme.tt;
30-JUN-23 06:45:39.738 -07:00 | statement: create table newme.tt(id int);
30-JUN-23 06:45:26.68 -07:00 | statement: create schema newme;
30-JUN-23 06:44:39.32 -07:00 | statement: CREATE FOREIGN TABLE pglog ( +
| log_time timestamp(3) with time zone, +
| user_name text, +
| database_name text, +
| process_id integer, +
| connection_from text, +
| session_id text, +
| session_line_num bigint, +
| command_tag text, +
| session_start_time timestamp with time zone, +
| virtual_transaction_id text, +
| transaction_id bigint, +
| error_severity text, +
| sql_state_code text, +
| message text, +
| detail text, +
| hint text, +
| internal_query text, +
| internal_query_pos integer, +
| context text, +
| query text, +
| query_pos integer, +
| location text, +
| application_name text, +
| backend_type text, +
| leader_pid integer, +
| query_id bigint +
| ) SERVER pglog +
| OPTIONS ( filename '/var/lib/edb/as15/data/log/edb-2023-06-30_043358.csv', format 'csv');
(5 rows)
The foreign table was created at 30-JUN-23 06:44:39.32, yet it still shows everything logged after that time. This means that through the pglog table you can view the most recent data in the log file, which was not the case with the previous method.
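As for the log rotation concern mentioned earlier, one possible approach (a sketch, not from the original article) is to repoint the foreign table at the newest CSV file after each rotation:

-- Repoint the foreign table after log rotation; the file name is
-- illustrative and would normally be derived from your log_filename pattern
ALTER FOREIGN TABLE pglog
  OPTIONS (SET filename '/var/lib/edb/as15/data/log/edb-2023-07-01_000000.csv');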