PostgreSQL Table Partitioning

Table partitioning in PostgreSQL is a technique to divide large tables into smaller, more manageable pieces called partitions. Partitioning improves query performance, maintenance, and scalability for very large datasets.

1. Benefits of Partitioning

  • Improved query performance through partition pruning.
  • Efficient management of large tables and historical data.
  • Easier maintenance tasks like vacuuming and backup on smaller partitions.
  • Supports parallel processing for queries and indexes.

2. Partitioning Methods

  • Range Partitioning: Data is divided based on a range of values (e.g., date ranges).
  • List Partitioning: Data is divided based on discrete values (e.g., regions or categories).
  • Hash Partitioning: Data is distributed across partitions using a hash function. Good for evenly distributing data.
  • Composite Partitioning: Combines multiple partitioning methods, such as range + hash.

3. Creating Range Partitions

-- Create a partitioned table
            CREATE TABLE orders (
                id SERIAL PRIMARY KEY,
                customer_id INT NOT NULL,
                order_date DATE NOT NULL,
                amount NUMERIC
            ) PARTITION BY RANGE (order_date);

            -- Create partitions for specific date ranges
            CREATE TABLE orders_2023 PARTITION OF orders
            FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

            CREATE TABLE orders_2024 PARTITION OF orders
            FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

4. Creating List Partitions

-- Partition table by region
            CREATE TABLE sales (
                id SERIAL PRIMARY KEY,
                region TEXT NOT NULL,
                amount NUMERIC
            ) PARTITION BY LIST (region);

            CREATE TABLE sales_north PARTITION OF sales FOR VALUES IN ('North');
            CREATE TABLE sales_south PARTITION OF sales FOR VALUES IN ('South');

5. Creating Hash Partitions

-- Partition table using hash function
            CREATE TABLE users (
                id SERIAL PRIMARY KEY,
                username TEXT NOT NULL
            ) PARTITION BY HASH (id);

            CREATE TABLE users_part1 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 0);
            CREATE TABLE users_part2 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 1);

6. Querying Partitioned Tables

Queries on partitioned tables automatically use partition pruning, so only relevant partitions are scanned:

-- Query orders for 2023 only
            SELECT * FROM orders WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';

7. Indexing Partitions

  • Indexes can be created on individual partitions.
  • Global indexes are not supported in PostgreSQL; use local indexes on each partition.
  • -- Create index on a partition
                CREATE INDEX idx_orders_2023_customer ON orders_2023(customer_id);

8. Maintenance and Best Practices

  • Use partitioning for very large tables where performance is critical.
  • Plan partition keys and ranges carefully to balance data across partitions.
  • Regularly vacuum and analyze individual partitions for query performance.
  • Combine partitioning with indexes and proper constraints for efficient data management.
  • Monitor query plans to ensure partition pruning is happening as expected.

Conclusion

PostgreSQL partitioning is a powerful feature to manage large datasets efficiently. By using range, list, hash, or composite partitions, you can improve query performance, simplify maintenance, and scale your database effectively. Proper planning of partitions, indexes, and constraints ensures optimal performance for both read and write operations.