The Performance Diagnosis Framework
Before optimizing anything, you must diagnose. PostgreSQL performance issues fall into four categories: slow queries (missing indexes, bad joins), configuration problems (default settings are for development, not production), connection exhaustion (too many clients, no pooler), and hardware bottlenecks (I/O, memory, CPU).
The diagnosis order: (1) Run pg_stat_statements to find your slowest queries, (2) EXPLAIN ANALYZE those queries to understand execution plans, (3) Add indexes based on the plan analysis, (4) Tune postgresql.conf for your workload.
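Before diving into individual queries, one quick sanity check worth running is the buffer cache hit ratio from pg_stat_database (a standard catalog view, not specific to any workload). A sustained value well below 99% on an OLTP system suggests the working set does not fit in shared_buffers:

```sql
-- Buffer cache hit ratio across all databases
-- (the 99% threshold is a rule of thumb, not a hard limit)
SELECT round(
         sum(blks_hit) * 100.0 / nullif(sum(blks_hit) + sum(blks_read), 0),
         2
       ) AS cache_hit_pct
FROM pg_stat_database;
```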
PostgreSQL 17 adds several performance features: faster B-Tree lookups for multi-value (IN-list) searches, a redesigned vacuum memory structure that uses far less memory, streaming I/O for sequential scans and ANALYZE, and improved logical replication (including failover control for replication slots).
EXPLAIN ANALYZE: Reading Execution Plans
EXPLAIN ANALYZE runs the query and shows both the planned and the actual execution strategy (note that it really executes the statement, so wrap data-modifying queries in a transaction you roll back). The key things to look for: Sequential Scans on large tables (missing index), Nested Loops over large result sets (consider a Hash Join), a large gap between estimated rows and actual rows (stale statistics — run ANALYZE), and "Sort Method: external merge" (insufficient work_mem).
The cost numbers (e.g., cost=0.00..123.45) are the planner's abstract cost units, calibrated so that 1.0 equals one sequential page read — lower is better, but they are estimates. The actual time in milliseconds is what matters for real performance.
-- Enable pg_stat_statements extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find top 10 slowest queries by total execution time
SELECT calls,
       round(total_exec_time::numeric, 2) AS total_ms,
       round(mean_exec_time::numeric, 2) AS avg_ms,
       round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct,
       substring(query, 1, 100) AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- EXPLAIN ANALYZE: See actual execution plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(p.id) AS post_count
FROM users u
LEFT JOIN posts p ON p.author_id = u.id
WHERE u.created_at > '2026-01-01'
GROUP BY u.id
ORDER BY post_count DESC
LIMIT 20;

-- Sample output:
-- Limit (actual time=245.3..245.4 rows=20)
--   -> Sort (actual time=245.3..245.3 rows=20)
--        Sort Key: (count(p.id)) DESC
--        -> HashAggregate (actual time=243.1..244.2 rows=5000)
--             -> Hash Left Join (actual time=12.5..180.3 rows=45000)
--                  -> Seq Scan on users u (actual time=0.1..8.2 rows=5000)  ← Needs index!
--                       Filter: (created_at > '2026-01-01')
--                       Rows Removed by Filter: 95000
--                  -> Hash (actual time=3.2..3.2 rows=45000)
--                       -> Seq Scan on posts p (actual time=0.1..2.1 rows=45000)
Indexing Strategies: B-Tree, GIN, BRIN & Partial
Indexes are the most powerful performance tool, but they are not free. Each index slows down INSERT/UPDATE/DELETE operations and consumes disk space. The goal is the minimum set of indexes that covers your critical query patterns.
B-Tree (default): best for equality and range queries (WHERE email = ?, WHERE age > 30). GIN: best for full-text search, JSONB containment, and array overlap. BRIN: best for naturally ordered large tables (time-series data). Partial: indexes only rows matching a condition (WHERE deleted = false).
PostgreSQL can also exploit multi-column B-Tree indexes via incremental sort (available since version 13): when a query's ORDER BY extends a sort order the index already provides, the planner reuses the presorted prefix and sorts only the remaining columns within each group.
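A sketch of what that looks like in practice — the index and column names here are illustrative, and whether the planner picks this plan depends on table size and statistics:

```sql
-- Given an index whose column list is a prefix of the ORDER BY:
CREATE INDEX idx_posts_author_date ON posts(author_id, created_at);

EXPLAIN
SELECT id, title
FROM posts
ORDER BY author_id, created_at, title
LIMIT 50;
-- If chosen, the plan shows an "Incremental Sort" node with
-- "Presorted Key: author_id, created_at" above the index scan,
-- sorting only the 'title' ties within each presorted group.
```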
-- B-Tree: Equality and range queries (most common)
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
CREATE INDEX CONCURRENTLY idx_posts_created ON posts(created_at DESC);

-- Composite index: WHERE author_id = ? AND published = true ORDER BY created_at
CREATE INDEX CONCURRENTLY idx_posts_author_published
  ON posts(author_id, published, created_at DESC);

-- Partial index: Only index active users (skips 90% of rows)
CREATE INDEX CONCURRENTLY idx_active_users ON users(email)
  WHERE deleted_at IS NULL;

-- GIN: Full-text search
CREATE INDEX CONCURRENTLY idx_posts_search
  ON posts USING gin(to_tsvector('english', title || ' ' || content));

-- GIN: JSONB containment (@> operator)
CREATE INDEX CONCURRENTLY idx_posts_tags ON posts USING gin(tags);
-- Enables: WHERE tags @> '["react", "performance"]'

-- BRIN: Time-series data (very compact, good for 100M+ row tables)
CREATE INDEX CONCURRENTLY idx_logs_created
  ON audit_logs USING brin(created_at) WITH (pages_per_range = 32);

-- Covering index: includes extra columns to avoid table lookup
CREATE INDEX CONCURRENTLY idx_users_email_covering
  ON users(email) INCLUDE (name, role);
-- SELECT name, role FROM users WHERE email = ? → index-only scan!
Query Optimization Techniques
Beyond indexes, query-level optimizations can deliver order-of-magnitude gains. The most impactful: avoid SELECT * (select only the columns you need), use EXISTS instead of IN for large subqueries, avoid functions on indexed columns (WHERE LOWER(email) = ... bypasses a plain index on email), and use CTEs judiciously (before PostgreSQL 12 every CTE was an optimization fence; since 12, single-reference CTEs are inlined unless marked MATERIALIZED).
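To illustrate the CTE point (table names follow the article's running example): since PostgreSQL 12, a non-recursive CTE referenced once is inlined into the outer query by default, so predicates can be pushed down into it; writing MATERIALIZED restores the old fencing behavior.

```sql
-- Inlined by default (PostgreSQL 12+): the planner can push author_id = 123
-- inside the CTE and use an index on posts(author_id)
WITH recent AS (
  SELECT * FROM posts WHERE created_at > now() - interval '7 days'
)
SELECT id, title FROM recent WHERE author_id = 123;

-- Forces the optimization fence: 'recent' is computed in full first
WITH recent AS MATERIALIZED (
  SELECT * FROM posts WHERE created_at > now() - interval '7 days'
)
SELECT id, title FROM recent WHERE author_id = 123;
```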
For pagination, use keyset pagination (WHERE id > last_id) instead of OFFSET. OFFSET scans and discards rows — OFFSET 100000 means reading 100,000 rows just to skip them. Keyset pagination costs the same on any page: a single index seek, regardless of depth.
-- ❌ BAD: OFFSET pagination (slow for deep pages)
SELECT * FROM posts
ORDER BY created_at DESC
OFFSET 100000 LIMIT 20;  -- Scans 100,020 rows to return 20!

-- ✅ GOOD: Keyset pagination (constant speed on any page)
SELECT id, title, created_at
FROM posts
WHERE created_at < '2026-03-15T10:30:00Z'  -- Last item from previous page
ORDER BY created_at DESC
LIMIT 20;

-- ❌ BAD: Function on indexed column (kills the index)
SELECT * FROM users WHERE LOWER(email) = 'ashutosh@ih.com';

-- ✅ GOOD: Use a functional index if you must
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- ❌ BAD: SELECT * (fetches all columns including large text/jsonb)
SELECT * FROM posts WHERE author_id = 123;

-- ✅ GOOD: Select only needed columns
SELECT id, title, excerpt, created_at FROM posts WHERE author_id = 123;

-- ❌ BAD: IN with large subquery
SELECT * FROM orders WHERE user_id IN (SELECT id FROM users WHERE active = true);

-- ✅ GOOD: EXISTS (stops at first match)
SELECT * FROM orders o
WHERE EXISTS (
  SELECT 1 FROM users u WHERE u.id = o.user_id AND u.active = true
);

-- ✅ Batch operations: INSERT multiple rows in one statement
INSERT INTO posts (title, content, author_id) VALUES
  ('Post 1', 'Content 1', 1),
  ('Post 2', 'Content 2', 1),
  ('Post 3', 'Content 3', 2);
-- 1 network roundtrip instead of 3
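One refinement to the keyset pattern: created_at values may not be unique, so a page boundary can drop or duplicate rows. Paging on a (created_at, id) row-value comparison fixes that — the literal values below are illustrative:

```sql
-- Keyset with a tie-breaker: page on (created_at, id) so ordering is total
SELECT id, title, created_at
FROM posts
WHERE (created_at, id) < ('2026-03-15T10:30:00Z', 48210)  -- last row of previous page
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Backed by a matching composite index:
CREATE INDEX idx_posts_created_id ON posts(created_at DESC, id DESC);
```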
Connection Pooling with PgBouncer
Each PostgreSQL connection is a separate server process consuming roughly 10MB of memory. The default max_connections is 100, meaning ~1GB of RAM just for connections. In a high-traffic web app with 1,000 concurrent users, you will hit "FATAL: sorry, too many clients already" errors instantly.
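A quick way to see how close you are to that limit, using the standard pg_stat_activity view (nothing here is app-specific):

```sql
-- Current connection usage vs. the configured ceiling
SELECT count(*) AS total_connections,
       count(*) FILTER (WHERE state = 'idle') AS idle,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;
```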
PgBouncer sits between your application and PostgreSQL, multiplexing hundreds of application connections into a small pool of actual database connections. Transaction pooling (pool_mode = transaction) is recommended: a server connection is returned to the pool after each transaction, rather than held for the client's entire session. The trade-off: session state (SET, LISTEN/NOTIFY, temporary tables) does not survive across transactions in this mode.
Supabase, Railway, and Neon all include built-in connection poolers. If you self-host PostgreSQL, PgBouncer or pgcat is essential.
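A minimal pgbouncer.ini sketch for the transaction-pooling setup described above — the host, paths, and pool sizes are illustrative and need adjusting to your environment:

```ini
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432                ; apps connect here instead of 5432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction           ; release server connection after each transaction
max_client_conn = 1000            ; client-side connections PgBouncer will accept
default_pool_size = 20            ; real PostgreSQL connections per (db, user) pair
```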
Production Configuration (postgresql.conf)
PostgreSQL ships with conservative defaults optimized for a laptop. Production servers need tuning based on available RAM, CPU cores, and workload type (OLTP vs analytics).
The most impactful settings: shared_buffers (~25% of RAM), effective_cache_size (~75% of RAM — a planner hint, not an allocation), work_mem (per sort/hash operation, so a single complex query can use several multiples of it), maintenance_work_mem (for VACUUM and index creation), and wal_buffers (for write-heavy workloads).
# postgresql.conf — Production tuning for 16GB RAM, 4 CPU cores

# Memory
shared_buffers = 4GB                    # 25% of RAM
effective_cache_size = 12GB             # 75% of RAM (includes OS cache)
work_mem = 64MB                         # Per-operation sort/hash memory
maintenance_work_mem = 1GB              # For VACUUM and CREATE INDEX

# WAL (Write-Ahead Log)
wal_buffers = 64MB
max_wal_size = 4GB
min_wal_size = 1GB
checkpoint_completion_target = 0.9

# Query planner
random_page_cost = 1.1                  # SSD (default 4.0 is for spinning disks!)
effective_io_concurrency = 200          # SSD parallel reads

# Parallelism
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4

# Autovacuum (prevent table bloat)
autovacuum_vacuum_scale_factor = 0.05   # Vacuum more frequently
autovacuum_analyze_scale_factor = 0.02  # Update stats more frequently

# Logging (for pg_stat_statements)
shared_preload_libraries = 'pg_stat_statements'
log_min_duration_statement = 200        # Log queries slower than 200ms
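Most of these can also be changed without hand-editing the file, via ALTER SYSTEM (which writes postgresql.auto.conf) followed by a reload. Note that shared_buffers and shared_preload_libraries still require a full server restart:

```sql
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();  -- reload-able settings take effect immediately

-- See which changed settings are still waiting on a restart
SELECT name, setting FROM pg_settings WHERE pending_restart;
```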
Key Takeaways
PostgreSQL performance tuning follows a clear path: diagnose with pg_stat_statements, analyze with EXPLAIN ANALYZE, fix with indexes, optimize queries, and tune configuration. Always measure before and after — gut feelings are wrong.
The highest-impact techniques: keyset pagination (not OFFSET), partial indexes (only index what you query), covering indexes (avoid table lookups), connection pooling (PgBouncer), and production-appropriate postgresql.conf settings.
For interviews: explain B-Tree vs GIN vs BRIN index selection, demonstrate EXPLAIN ANALYZE reading, discuss keyset vs offset pagination, and know the top postgresql.conf settings and why they matter.