SQL Optimization
Optimize SQL query performance across PostgreSQL, MySQL, and SQL Server through execution plan analysis, strategic indexing, and query rewriting.
When to Use
Use when:
- Debugging slow queries or database timeouts
- Analyzing EXPLAIN plans or execution plans
- Determining index requirements
- Rewriting inefficient queries
- Identifying query anti-patterns (N+1, SELECT *, correlated subqueries)
- Database-specific optimization needs (PostgreSQL, MySQL, SQL Server)
Key Features
Core Optimization Workflow
1. Analyze Query Performance: Run EXPLAIN ANALYZE to identify bottlenecks
2. Identify Opportunities: Look for sequential scans, high row counts, and inefficient joins
3. Apply Indexing Strategies: Add strategic indexes based on WHERE, JOIN, and ORDER BY columns
4. Rewrite Queries: Eliminate anti-patterns and optimize query structure
5. Verify Improvement: Re-run EXPLAIN and compare performance
Common Red Flags
| Indicator | Problem | Solution |
|---|---|---|
| Seq Scan / Table Scan | Full table scan on large table | Add index on filter columns |
| High row count | Processing excessive rows | Add WHERE filter or index |
| Nested Loop with large outer | Inefficient join algorithm | Index join columns |
| Correlated subquery | Subquery executes per row | Rewrite as JOIN or EXISTS |
| Sort on large result set | Expensive sorting | Add index matching ORDER BY |
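For the join-related flags, indexing the join key on the probed (inner) table typically lets the planner use an index lookup instead of rescanning that table for every outer row. A minimal sketch with illustrative table and column names:
-- Hypothetical schema: order_items.order_id references orders.id
CREATE INDEX idx_order_items_order_id ON order_items (order_id);
-- The join can now probe the index rather than scan order_items per outer row
SELECT o.id, oi.product_id
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
WHERE o.customer_id = 123;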
Quick Start
Analyze Query Performance
PostgreSQL:
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 123
ORDER BY created_at DESC
LIMIT 10;
Output Analysis:
Seq Scan on orders (cost=0.00..2500.00 rows=10)
Filter: (customer_id = 123)
Rows Removed by Filter: 99990
Problem: A sequential scan reads the whole table and discards 99,990 rows to return 10
Add Composite Index
CREATE INDEX idx_orders_customer_created
ON orders (customer_id, created_at DESC);
Verify Improvement
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 123
ORDER BY created_at DESC
LIMIT 10;
Result: 200x faster (2000ms → 10ms)
Index Scan using idx_orders_customer_created (cost=0.42..12.44 rows=10)
Index Cond: (customer_id = 123)
Indexing Strategies
Index Decision Framework
Is column used in WHERE, JOIN, ORDER BY, or GROUP BY?
├─ YES → Is column selective (many unique values)?
│ ├─ YES → Is table frequently queried?
│ │ ├─ YES → ADD INDEX
│ │ └─ NO → Consider based on query frequency
│ └─ NO (low selectivity) → Skip index
└─ NO → Skip index
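To gauge selectivity before adding an index, compare distinct values to total rows (table and column names are placeholders; the expression is portable across the three databases):
-- Values near 1.0 indicate high selectivity; values near 0 indicate low selectivity
SELECT COUNT(DISTINCT status) * 1.0 / COUNT(*) AS selectivity
FROM orders;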
Composite Index Design
Column Order Matters:
- Equality filter columns first, starting with the most selective
- Additional equality filters next, ordered by selectivity
- Range filters or ORDER BY columns last
Example:
-- Query pattern
SELECT * FROM orders
WHERE customer_id = 123 AND status = 'shipped'
ORDER BY created_at DESC
LIMIT 10;
-- Optimal composite index
CREATE INDEX idx_orders_customer_status_created
ON orders (customer_id, status, created_at DESC);
Index Types by Use Case
PostgreSQL:
- B-tree (default): General-purpose, supports <, ≤, =, ≥, >, BETWEEN, IN
- GIN: Full-text search, JSONB, arrays
- GiST: Spatial data, geometric types
- BRIN: Very large tables with naturally ordered data
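For example, assuming a JSONB attributes column and a large append-only events table (names are illustrative):
-- GIN index for containment queries on a JSONB column
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
-- BRIN index for a huge, insert-ordered timestamp column
CREATE INDEX idx_events_created_brin ON events USING BRIN (created_at);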
MySQL:
- B-tree (default): General-purpose index
- Full-text: Text search on VARCHAR/TEXT columns
- Spatial: Spatial data types
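A full-text sketch, assuming an articles table with title and body columns:
CREATE FULLTEXT INDEX idx_articles_search ON articles (title, body);
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('index tuning' IN NATURAL LANGUAGE MODE);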
Query Anti-Patterns
1. SELECT * (Over-fetching)
-- ❌ Bad: Fetches all columns
SELECT * FROM users WHERE id = 1;
-- ✅ Good: Fetch only needed columns
SELECT id, name, email FROM users WHERE id = 1;
2. N+1 Queries
-- ❌ Bad: 1 + N queries
SELECT * FROM users LIMIT 100;
-- Then in loop: SELECT * FROM posts WHERE user_id = ?;
-- ✅ Good: Single JOIN
SELECT users.*, posts.id AS post_id, posts.title
FROM users
LEFT JOIN posts ON users.id = posts.user_id;
3. Non-Sargable Queries
Applying a function to an indexed column prevents the index from being used (YEAR() is MySQL/SQL Server syntax; PostgreSQL's EXTRACT(YEAR FROM created_at) has the same problem):
-- ❌ Bad: Function prevents index usage
SELECT * FROM orders WHERE YEAR(created_at) = 2025;
-- ✅ Good: Sargable range condition
SELECT * FROM orders
WHERE created_at >= '2025-01-01'
AND created_at < '2026-01-01';
4. Correlated Subqueries
-- ❌ Bad: Subquery executes per row
SELECT name,
(SELECT COUNT(*) FROM orders WHERE orders.user_id = users.id)
FROM users;
-- ✅ Good: JOIN with GROUP BY
SELECT users.name, COUNT(orders.id) AS order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
GROUP BY users.id, users.name;
Database-Specific Optimizations
PostgreSQL-Specific Features
Partial Indexes (index subset of rows):
CREATE INDEX idx_active_users_login
ON users (last_login)
WHERE status = 'active';
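Queries only benefit when their WHERE clause implies the index predicate, e.g.:
-- Can use idx_active_users_login; a query without status = 'active' cannot
SELECT id, last_login
FROM users
WHERE status = 'active'
AND last_login < NOW() - INTERVAL '90 days';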
Expression Indexes (index computed values):
CREATE INDEX idx_users_email_lower
ON users (LOWER(email));
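Queries must use the same expression to hit the index:
-- Matches the indexed expression LOWER(email)
SELECT id, name FROM users WHERE LOWER(email) = 'alice@example.com';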
Covering Indexes (avoid heap access):
CREATE INDEX idx_users_email_covering
ON users (email) INCLUDE (id, name);
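A query that reads only indexed or INCLUDEd columns can be answered entirely from the index (shown as an Index Only Scan in EXPLAIN):
SELECT id, name FROM users WHERE email = 'alice@example.com';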
MySQL-Specific Features
Index Hints (override optimizer):
SELECT * FROM orders USE INDEX (idx_orders_customer)
WHERE customer_id = 123;
SQL Server-Specific Features
Query Store (track query performance over time):
ALTER DATABASE YourDatabase SET QUERY_STORE = ON;
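Once enabled, the Query Store catalog views can be queried directly, for example to list the slowest captured queries (a minimal sketch):
SELECT TOP 10
  qt.query_sql_text,
  rs.avg_duration
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
ORDER BY rs.avg_duration DESC;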
Advanced Techniques
EXISTS vs IN for Subqueries
EXISTS can stop at the first matching row, which often helps with large subquery results; many modern planners rewrite both forms as semi-joins, so verify with EXPLAIN:
-- ✅ Good: EXISTS stops at first match
SELECT * FROM users
WHERE EXISTS (SELECT 1 FROM orders WHERE orders.user_id = users.id);
-- ❌ Often less efficient: IN may materialize the full subquery result
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders);
Common Table Expressions (CTEs)
Break complex queries into readable parts:
WITH active_customers AS (
SELECT id, name FROM customers WHERE status = 'active'
),
recent_orders AS (
SELECT customer_id, COUNT(*) AS order_count
FROM orders
WHERE created_at > NOW() - INTERVAL '30 days'
GROUP BY customer_id
)
SELECT ac.name, COALESCE(ro.order_count, 0) AS orders
FROM active_customers ac
LEFT JOIN recent_orders ro ON ac.id = ro.customer_id;
Monitoring and Maintenance
Regular Optimization Tasks:
- Review slow query logs weekly
- Update database statistics regularly (ANALYZE in PostgreSQL)
- Monitor index usage and drop unused indexes (see the query after this list)
- Archive old data to keep tables manageable
- Review execution plans for critical queries quarterly
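To find candidates during the index-usage review, PostgreSQL tracks per-index scan counts in pg_stat_user_indexes (verify against replicas and recent statistics resets before dropping anything):
-- Indexes never scanned since the last statistics reset
SELECT relname AS table_name, indexrelname AS index_name, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;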
PostgreSQL Statistics Update:
ANALYZE table_name;
MySQL Statistics Update:
ANALYZE TABLE table_name;
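The weekly slow-query review can be fed by each engine's built-in slow-statement logging; thresholds below are illustrative:
-- PostgreSQL: log statements slower than 500 ms (reload the config afterwards)
ALTER SYSTEM SET log_min_duration_statement = 500;
-- MySQL: capture statements slower than 1 second in the slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;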
Related Skills
- Data Architecture - Schema design and database fundamentals
- Performance Engineering - Application performance profiling
- Data Transformation - dbt and query optimization