PostgreSQL’s EXPLAIN command is your window into understanding how the database engine executes your queries. Let’s explore how to effectively analyze query performance using this powerful tool.

The basic syntax for analyzing a query starts with prefixing your SQL statement with EXPLAIN:


EXPLAIN SELECT * FROM users WHERE age > 25;

However, to get real execution statistics rather than just the planned behavior, you’ll want to use EXPLAIN ANALYZE:


EXPLAIN ANALYZE SELECT * FROM users WHERE age > 25;

When examining the output, PostgreSQL presents execution information in a tree-like structure, reading from the inside out. Each node represents an operation, with indentation showing how operations nest within each other. The most deeply indented operations execute first, feeding their results upward to parent operations.

Let’s decode the key metrics you’ll encounter:


-- Example output with annotations
EXPLAIN ANALYZE
SELECT * FROM users WHERE age > 25;

Seq Scan on users  (cost=0.00..1.14 rows=10 width=244) (actual time=0.012..0.014 rows=8 loops=1)
  Filter: (age > 25)
  Rows Removed by Filter: 2
Planning Time: 0.095 ms
Execution Time: 0.027 ms

The cost values (0.00..1.14) represent PostgreSQL’s estimation of the relative computational expense. The first number is the startup cost – how long before the first row is returned. The second number is the total cost to return all rows. These aren’t actual milliseconds but rather arbitrary units PostgreSQL uses internally.

The actual time values (0.012..0.014) show the real execution time in milliseconds. The first number is the startup time, and the second is the total time. The “rows” value indicates how many rows were processed, while “loops” shows how many times this operation was performed.

For more detailed analysis, you can add options to EXPLAIN:


EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) 
SELECT * FROM users WHERE age > 25;

BUFFERS shows how many disk blocks were read or written, helping you identify I/O-heavy operations. FORMAT JSON (or FORMAT XML, FORMAT YAML) provides the output in a structured format, which can be easier to parse programmatically.

When optimizing queries, pay special attention to:

  1. Sequential scans on large tables (they might benefit from an index)
  2. Large differences between estimated and actual row counts (might indicate outdated statistics)
  3. High startup costs (problematic for applications needing quick first results)
  4. Multiple loops (potential for optimization through different join strategies)

Remember to run ANALYZE on your tables periodically:


ANALYZE users;

When sharing query plans with others or documenting performance issues, always include the complete EXPLAIN ANALYZE output and relevant table statistics. This provides crucial context for understanding the query’s behavior in your specific environment.
A thorough analysis often involves comparing different query approaches. For example, if you’re querying a range of dates, you might compare these alternatives:


-- Compare performance of different approaches
EXPLAIN ANALYZE SELECT * FROM events WHERE date_column >= '2024-01-01' AND date_column < '2024-02-01';
EXPLAIN ANALYZE SELECT * FROM events WHERE date_trunc('month', date_column) = '2024-01-01';

By understanding how to read and interpret EXPLAIN ANALYZE output, you can make informed decisions about query optimization and index creation, leading to better performing database applications.

Similar content:

Categorized in:

Short answers,

Last Update: 22/01/2025