Postgres Partitioning by Epoch

For high-volume time-series data (logs, events, metrics), partitioning by integer Unix timestamps offers a high-performance,timezone-agnostic strategy for managing data retention and query speed in PostgreSQL.

The Strategy

Instead of using Postgres's native TIMESTAMPTZ type for partitioning, we use a BIGINTcolumn storing the Unix timestamp. We then define range partitions based on these integer values.

  • Retention: Dropping old data is instant (DROP TABLE ...).
  • Performance: Integer math is CPU-friendly.
  • Timezones: Data is inherently UTC. No implicit conversions.

Implementation

SQL
-- 1. Create the parent partitioned table
CREATE TABLE events (
    event_id UUID NOT NULL,
    user_id UUID,
    event_data JSONB,
    created_at BIGINT NOT NULL -- Storing ms or seconds
) PARTITION BY RANGE (created_at);

-- 2. Create indexes (automatically propagate to partitions)
CREATE INDEX idx_events_created_at ON events (created_at);
CREATE INDEX idx_events_user_id ON events (user_id);

-- 3. Create partitions (Example: Daily partitions for Jan 2024)
-- Note: Bounds are [start, end)
-- 1704067200 = 2024-01-01 00:00:00 UTC
-- 1704153600 = 2024-01-02 00:00:00 UTC

CREATE TABLE events_2024_01_01 PARTITION OF events
    FOR VALUES FROM (1704067200) TO (1704153600);

CREATE TABLE events_2024_01_02 PARTITION OF events
    FOR VALUES FROM (1704153600) TO (1704240000);

Automating Partition Creation

You can use pg_partman (recommended) or a simple PL/pgSQL function to manage this. Here is a simplified manual logic:

PL/pgSQL Concept
DO $$
DECLARE
    start_ts BIGINT := 1704067200;
    one_day  BIGINT := 86400;
    i        INT;
BEGIN
    FOR i IN 0..30 LOOP
        EXECUTE format(
            'CREATE TABLE IF NOT EXISTS events_%s PARTITION OF events 
             FOR VALUES FROM (%s) TO (%s)',
            to_char(to_timestamp(start_ts + (i * one_day)), 'YYYY_MM_DD'),
            (start_ts + (i * one_day)),
            (start_ts + ((i + 1) * one_day))
        );
    END LOOP;
END $$;

Querying Partitions

To get the benefit of Partition Pruning (where Postgres ignores irrelevant tables), you must filter by the partition key.

SQL
-- ✅ Good Query (Scans only relevant partitions)
SELECT * FROM events 
WHERE created_at >= 1704067200 AND created_at < 1704153600;

-- ❌ Bad Query (Scans ALL partitions)
SELECT * FROM events 
WHERE user_id = 'some-uuid';
-- (Unless user_id is also part of the partition key)

Frequently Asked Questions

Why partition by integer timestamp instead of DATE?

Integer comparisons are extremely fast, and storing time as integers (4 or 8 bytes) is often more compact than timestamp types. It also decouples your data from timezone logic at the storage layer.

How do I query partitions efficiently?

Always include the partition key (your timestamp column) in your WHERE clause. This allows the Postgres query planner to skip scanning partitions that dont match the time range (Partition Pruning).

Should I use BIGINT or INTEGER for timestamps?

Use BIGINT (int8) if you store milliseconds or need to support dates after 2038. Use INTEGER (int4) only if you store seconds and dont care about dates after 2038.