Advanced SQL: Beyond Basic Queries
-- Example: Recursive CTE for hierarchical data
WITH RECURSIVE org_tree AS (
SELECT id, name, manager_id
FROM employees
WHERE id = 1 -- CEO
UNION ALL
SELECT e.id, e.name, e.manager_id
FROM employees e
JOIN org_tree ot ON e.manager_id = ot.id
)
SELECT * FROM org_tree;
Window Functions: Powerful Data Analysis
Common Window Functions
Function | Description |
---|---|
ROW_NUMBER() | Unique sequential row numbers |
RANK() | Rank with gaps for ties |
DENSE_RANK() | Rank without gaps |
LEAD()/LAG() | Access next/previous row values |
FIRST_VALUE() | First value in window frame |
SUM() OVER() | Running/cumulative sums |
-- Running total example
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;
Common Table Expressions (CTEs)
Types of CTEs
-
Basic CTE
WITH regional_sales AS ( SELECT region, SUM(amount) AS total_sales FROM orders GROUP BY region ) SELECT * FROM regional_sales;
-
Recursive CTE (for hierarchical data)
WITH RECURSIVE employee_hierarchy AS ( -- Base case SELECT id, name, manager_id, 1 AS level FROM employees WHERE manager_id IS NULL UNION ALL -- Recursive case SELECT e.id, e.name, e.manager_id, eh.level + 1 FROM employees e JOIN employee_hierarchy eh ON e.manager_id = eh.id ) SELECT * FROM employee_hierarchy;
-
Materialized CTE (PostgreSQL)
WITH MATERIALIZED expensive_query AS ( SELECT * FROM large_table WHERE complex_condition ) SELECT * FROM expensive_query JOIN other_table...;
Indexing Strategies
When to Index
- Columns frequently in
WHERE
clauses - Join conditions
- Columns used in
ORDER BY
orGROUP BY
Index Types
Type | Best For |
---|---|
B-Tree | Default, range queries |
Hash | Exact matches (PostgreSQL) |
GiST | Geometric data, full-text search |
GIN | Array/JSON operations |
BRIN | Very large tables with sorting |
-- Create a partial index
CREATE INDEX idx_active_users ON users(email)
WHERE active = true;
-- Multi-column index
CREATE INDEX idx_name_department ON employees(last_name, department_id);
Query Optimization Techniques
Execution Plan Analysis
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 123;
Optimization Tips
-
Avoid SELECT
-- Instead of: SELECT * FROM customers; -- Use: SELECT id, name, email FROM customers;
-
Use JOINs properly
-- Prefer: SELECT c.name, o.amount FROM customers c JOIN orders o ON c.id = o.customer_id; -- Over: SELECT c.name, o.amount FROM customers c, orders o WHERE c.id = o.customer_id;
-
Limit early
SELECT * FROM large_table WHERE condition LIMIT 10;
-
Batch operations
-- Instead of multiple INSERTs: INSERT INTO users (name) VALUES ('Alice'), ('Bob'), ('Charlie');
Advanced Joins
Lateral Joins
-- Get top 3 products for each category
SELECT c.name, p.product_name, p.price
FROM categories c
CROSS JOIN LATERAL (
SELECT product_name, price
FROM products
WHERE category_id = c.id
ORDER BY price DESC
LIMIT 3
) p;
Self Joins
-- Find employees and their managers
SELECT e.name AS employee, m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
Transaction Isolation Levels
Level | Dirty Reads | Non-Repeatable Reads | Phantom Reads |
---|---|---|---|
READ UNCOMMITTED | Possible | Possible | Possible |
READ COMMITTED | No | Possible | Possible |
REPEATABLE READ | No | No | Possible |
SERIALIZABLE | No | No | No |
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- Critical operations
COMMIT;
JSON Support in Modern SQL
PostgreSQL JSON Operations
-- Extract value
SELECT data->>'name' FROM users;
-- Query inside JSON
SELECT *
FROM products
WHERE metadata->>'color' = 'blue';
-- JSON aggregation
SELECT
department_id,
jsonb_agg(jsonb_build_object('id', id, 'name', name)) AS employees
FROM employees
GROUP BY department_id;
Temporal Tables (SQL:2011)
-- System-versioned temporal table
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(100),
price DECIMAL(10,2),
valid_from TIMESTAMP(6) GENERATED ALWAYS AS ROW START,
valid_to TIMESTAMP(6) GENERATED ALWAYS AS ROW END,
PERIOD FOR SYSTEM_TIME (valid_from, valid_to)
) WITH SYSTEM VERSIONING;
-- Query historical data
SELECT * FROM products
FOR SYSTEM_TIME AS OF TIMESTAMP '2023-01-01';
Further Reading
- SQL Performance Explained (opens in a new tab)
- PostgreSQL Documentation (opens in a new tab)
- SQL Antipatterns Book (opens in a new tab)
- Modern SQL Features (opens in a new tab)
Last updated on April 10, 2025