There are many reason why PostgreSQL optimizer cannot use an index:
- Index is not present
- Index is INVALID state (e.g. after aborted
CREATE CONCURRENT INDEX
statement) - Function/expression is applied on an indexed column (and the function/expressions is not indexed)
- Datatype mismatch
- According to the PostgreSQL statistics, index wouldn't accelerate the query
- The index type doesn't support operator used by the query
Here, were are going to concentrate on showing effect of not using the index due to JDBC <-> PostgreSQL data type mismatch.
Create table (note the "bigint" datatype used here):
create table t (id bigint primary key, val text);
insert into t select g, 'sampledata' || g from generate_series(1, 2000000) g(i);
analyze t;
alter table t owner to app;
As seen above there is index for primary key.
Install and configure properly auto_explain:
alter system set auto_explain.log_min_duration = '50ms';
alter system set auto_explain.log_analyze = on;
alter system set auto_explain.log_buffers = on;
alter system set auto_explain.log_verbose = on;
alter system set auto_explain.log_timing = on;
Prepare problematic application:
import java.sql.*;
import java.math.BigDecimal;
public class JdbcDataTypeMismatch
{
public static void main(String[] args)
{
try
{
String url = "jdbc:postgresql://127.0.0.1:4444/edb";
String user = "app";
String password = "abc123";
BigDecimal max = new BigDecimal(10000);
Class.forName("org.postgresql.Driver");
Connection c = DriverManager.getConnection(url, user, password);
c.setAutoCommit(false);
PreparedStatement ps = c.prepareStatement("select val from t where id = ?");
for(BigDecimal i = new BigDecimal(0); i.compareTo(max) < 0; i = i.add(new BigDecimal(1))) {
ps.setBigDecimal(1, i);
ResultSet rs = ps.executeQuery();
while (rs.next()) {
}
rs.close();
c.commit();
if(i.remainder(new BigDecimal(1000)).compareTo(BigDecimal.ZERO) == 0) {
System.out.println(i + "/" + max);
}
}
}
catch(ClassNotFoundException e)
{
System.out.println("Class Not Found : " + e.getMessage());
}
catch(SQLException e)
{
System.out.println("SQL Exception: " + e.getMessage());
}
}
}
Compile and run it using:
javac JdbcDataTypeMismatch.java
java -cp postgresql-42.6.0.jar:. JdbcDataTypeMismatch
One will see in PostgreSQL logs:
2023-09-20 11:47:16 CEST LOG: duration: 410.446 ms plan:
Query Text: select val from t where id = $1
Seq Scan on public.t (cost=0.00..44608.00 rows=10000 width=17) (actual time=0.121..410.438 rows=1 loops=1)
Output: val
Filter: ((t.id)::numeric = '4'::numeric)
Rows Removed by Filter: 1999999
Buffers: shared hit=14608
2023-09-20 11:47:17 CEST LOG: duration: 469.929 ms plan:
Query Text: select val from t where id = $1
Seq Scan on public.t (cost=0.00..44608.00 rows=10000 width=17) (actual time=0.019..469.920 rows=1 loops=1)
Output: val
Filter: ((t.id)::numeric = '5'::numeric)
Rows Removed by Filter: 1999999
Buffers: shared hit=14608
2023-09-20 11:47:17 CEST LOG: duration: 459.444 ms plan:
Query Text: select val from t where id = $1
Seq Scan on public.t (cost=0.00..44608.00 rows=10000 width=17) (actual time=0.018..459.438 rows=1 loops=1)
Output: val
Filter: ((t.id)::numeric = '6'::numeric)
Rows Removed by Filter: 1999999
Buffers: shared hit=14608
[..]
As evidenced above the PostgreSQL optimizer applies Filter: ((t.id)::numeric = '6'::numeric)
because the JDBC BigDecimal data type supplied by the client (JDBC) is sent as numeric
and needs to be cast (converted), which causes index NOT to be used.
If we change the JDBC data type used from BigDecimal
to Long
to match with the PostgreSQL schema's data type (bigint/int(8)), one can measure the improved query taking much less time per average execution (using Prepared Statements) as low as 0.01 ms:
edb=# select query, calls, mean_exec_time, stddev_exec_time from pg_stat_statements where query like 'select val from t where id = %';
query | calls | mean_exec_time | stddev_exec_time
---------------------------------+-------+----------------------+----------------------
select val from t where id = $1 | 10000 | 0.016391058100000012 | 0.014163808369450812
and once understood it can be simulated by properly misaligning casts even from psql(1):
edb=# explain analyze select val from t where id = 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Index Scan using t_pkey on t (cost=0.43..8.45 rows=1 width=17) (actual time=0.028..0.029 rows=1 loops=1)
Index Cond: (id = 1)
Planning Time: 0.203 ms
Execution Time: 0.051 ms
(4 rows)
edb=# explain analyze select val from t where id = 1::numeric;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..44608.00 rows=10000 width=17) (actual time=0.013..558.057 rows=1 loops=1)
Filter: ((id)::numeric = '1'::numeric)
Rows Removed by Filter: 1999999
Planning Time: 0.132 ms
Execution Time: 558.685 ms
(5 rows)
edb=#
Here's the improved (fast) code:
import java.sql.*;
public class JdbcDataTypeMismatch
{
public static void main(String[] args)
{
try
{
String url = "jdbc:postgresql://127.0.0.1:4444/edb";
String user = "app";
String password = "abc123";
Long max = new Long(10000);
Class.forName("org.postgresql.Driver");
Connection c = DriverManager.getConnection(url, user, password);
c.setAutoCommit(false);
PreparedStatement ps = c.prepareStatement("select val from t where id = ?");
for(Long i = new Long(0); i < max; i++) {
ps.setLong(1, i);
ResultSet rs = ps.executeQuery();
while (rs.next()) {
}
rs.close();
c.commit();
if(i % 1000 == 0) {
System.out.println(i + "/" + max);
}
}
}
catch(ClassNotFoundException e)
{
System.out.println("Class Not Found : " + e.getMessage());
}
catch(SQLException e)
{
System.out.println("SQL Exception: " + e.getMessage());
}
}
}
More info about mapping data types can be found in JDBC™ 4.3 Specification in the following section "Appendix B / Data Type Conversion Tables"
In short for purposes of this example we derive the following mapping table :
PostgreSQL Data Type JDBC Type Java Language Types
bigint BIGINT long (Long)
numeric NUMERIC java.math.BigDecimal