Implementing Eventual Consistency in Partitioned PostgreSQL
Achieving reliable eventual consistency across declarative partitions requires a disciplined approach to asynchronous write propagation, conflict resolution, and lag-aware reconciliation. This guide provides a zero-downtime execution path for backend engineers, DBAs, and platform teams deploying logical replication across partitioned PostgreSQL clusters.
Before enabling replication, define partition sync boundaries and async propagation topology. Align partition keys with consistency requirements by reviewing Database Partitioning Fundamentals & Architecture to establish a solid baseline. Configure publisher/subscriber nodes with tuned WAL parameters and replication slots. Implement idempotent merge logic to safely handle out-of-order delivery and transient network partitions.
Partition Topology & Sync Boundary Design
Mapping partition keys to replication groups dictates both network overhead and convergence speed. Group partitions by tenant, region, or time-range to minimize cross-node replication traffic and isolate blast radii during network partitions. Evaluate whether declarative partitioning or external sharding aligns with your consistency model before committing to a topology.
Each active logical replication slot consumes WAL retention on the publisher; unbounded slot growth will trigger disk exhaustion and halt writes. Define explicit staleness windows per partition group — for example, <500ms for hot transactional partitions and <5s for analytical rollups — and document them as operational SLAs. These boundaries dictate when automated reconciliation triggers and how aggressively the system prioritizes catch-up over fresh writes.
Logical Replication Configuration
Deploy publisher/subscriber infrastructure to propagate partition writes asynchronously without blocking production workloads. Zero-downtime initialization requires careful WAL tuning and targeted publication scoping.
- Enable Logical WAL & Scale Slots: Set
wal_level = logicalinpostgresql.conf. Increasemax_replication_slotsandmax_wal_sendersto accommodate concurrent partition sync channels. - Scope Publications: Create targeted publications per partition group to avoid full-table replication overhead and unnecessary WAL streaming.
- Initialize Subscriptions Safely: Use
copy_data = falseduring subscription creation to prevent blocking initial partition writes. The subscriber will begin streaming WAL changes immediately while historical data syncs out-of-band if required. - Tune Commit Behavior: For eventual consistency, set
synchronous_commit = offon the publisher to maximize write throughput, accepting a small durability window in exchange for horizontal scaling.
-- Publisher: scope replication to specific child partitions only
CREATE PUBLICATION part_sync FOR TABLE orders_2024_q1, orders_2024_q2;
-- Subscriber: initialize without blocking writes
CREATE SUBSCRIPTION part_sync_sub
CONNECTION 'host=replica.internal dbname=analytics user=replicator password=***'
PUBLICATION part_sync
WITH (copy_data = false, create_slot = true, slot_name = 'part_sync_slot');
This establishes an async logical replication channel scoped to specific partitions, avoiding full-table copy overhead and enabling immediate write availability.
Important limitation: PostgreSQL logical replication does not automatically replicate DDL. When you add new child partitions to the publisher, you must manually create the corresponding tables on the subscriber and update the publication before the new partition’s rows will replicate.
Conflict Resolution & Idempotent Merge Logic
Asynchronous propagation guarantees out-of-order delivery during network partitions or high-throughput bursts. Without explicit conflict resolution, older WAL events can silently overwrite newer writes, causing data divergence.
Implement ON CONFLICT DO UPDATE with GREATEST() timestamp comparison or version-number gating. Use pg_replication_origin to track LSN progression and deduplicate replayed events. Enforce application-level idempotency keys to guarantee safe retries during transient failures.
-- Idempotent upsert with version-gated conflict resolution
INSERT INTO orders (id, payload, updated_at, version)
VALUES (101, '{"status":"shipped"}', NOW(), 3)
ON CONFLICT (id) DO UPDATE
SET payload = EXCLUDED.payload,
updated_at = GREATEST(orders.updated_at, EXCLUDED.updated_at),
version = EXCLUDED.version
WHERE EXCLUDED.version > orders.version;
This ensures only newer writes overwrite stale data during eventual convergence, preventing lost updates from out-of-order replication delivery.
Replication Lag Monitoring & Catch-Up
Eventual consistency requires continuous visibility into async propagation delay. Query pg_stat_subscription to calculate apply_lag against the publisher’s current WAL position. Alert on sustained lag exceeding your defined staleness window.
-- Monitor replication lag per subscription
SELECT
subname,
last_msg_receive_time,
received_lsn,
latest_end_lsn,
pg_wal_lsn_diff(pg_current_wal_lsn(), received_lsn) AS apply_lag_bytes,
EXTRACT(EPOCH FROM (now() - last_msg_receive_time)) AS lag_seconds
FROM pg_stat_subscription
WHERE subname LIKE 'part_sync%';
Automate reconciliation by running this query in a cron job or background worker. Compare lag_seconds against your staleness SLA and escalate to PagerDuty if thresholds persist across three consecutive checks. Deploy periodic checksum validation scripts (pg_checksums or application-level row-count comparisons) to detect silent divergence early.
Common Failure Modes & Anti-Patterns
| Failure Mode | Root Cause | Operational Impact | Remediation |
|---|---|---|---|
| Synchronous commit left on | synchronous_commit = on on partition sync channels |
Blocks write throughput, introduces cascading latency, defeats horizontal scaling | Set synchronous_commit = off for partition sync connections |
Missing wal_level = logical |
Default replica or minimal WAL level |
Subscription initialization fails; data loss risk during network partitions | Configure at cluster bootstrap; restart required to take effect |
| No DDL sync for new partitions | Logical replication skips DDL | New child partitions silently missing on subscriber | Create child partition on subscriber first, then add to publication |
Blind ON CONFLICT DO UPDATE without guards |
No version or timestamp comparison | Older replayed events overwrite newer writes | Use WHERE EXCLUDED.version > orders.version or GREATEST() guards |
FAQ
How do I handle write conflicts during partition sync?
Use logical replication combined with ON CONFLICT DO UPDATE and application-level version numbers or updated_at timestamps to ensure idempotent convergence. Always gate updates with a version or timestamp comparison to prevent stale overwrites.
Does PostgreSQL support native cross-partition eventual consistency?
No single-command toggle exists. It requires combining declarative partitioning with logical replication, custom conflict resolution logic, and monitoring of pg_stat_subscription to manage async propagation.
What is the acceptable replication lag for eventual consistency? Depends on business SLAs. Typically 500ms–2s for internal services, but must be explicitly defined per partition group and monitored continuously. Lag exceeding your defined staleness window should trigger automated reconciliation or circuit-breaker fallbacks.