Database Replication — Primary-Replica, Failover & Read Scaling

A single database server is a single point of failure. One hardware fault, one crashed process, and your application is down. Replication copies your data to multiple servers so that if one fails, another takes over — and while everything is running, it lets you distribute the read load across multiple machines.

Why Replication?

Two goals, one mechanism:

  1. High availability: if the primary database fails, a replica can take over (failover) and keep the application running
  2. Read scaling: direct read queries to replicas, freeing the primary for writes
Without replicationSingle DBR/WSingle point of failureWith replicationPrimaryWritesReadsReplica 1Replica 2replicatereadsreads

Primary-Replica Architecture

The primary (historically called master) accepts all writes. The replicas (historically called slaves) receive a stream of changes from the primary and apply them, maintaining a copy of the data.

App #1App #2App #3Primary(RW)Load Balancer(read replicas only)Replica 1(RO)Replica 2(RO)writesreadsreadsreplication

How it works:

  1. A write arrives at the primary
  2. Primary writes to its write-ahead log (WAL in PostgreSQL) or binary log (binlog in MySQL)
  3. The log is streamed to each replica
  4. Replicas apply the log entries in order, maintaining a consistent copy

Synchronous Replication

The primary waits for at least one replica to confirm the write before acknowledging success to the client.

ClientPrimaryReplicaINSERTlogapplyACKSUCCESS(only after replica confirms)

Pros:

  • Zero data loss: if primary crashes after acknowledging, the replica already has the data
  • Strong consistency: replicas are never behind

Cons:

  • Higher write latency: every write waits for the network round-trip to at least one replica
  • Availability risk: if the replica is slow or unreachable, writes block

PostgreSQL Synchronous Replication

sql
Loading...

Asynchronous Replication

The primary writes to its log and immediately acknowledges success. It streams the log to replicas in the background without waiting for confirmation.

ClientPrimaryReplicaINSERTlog (async)SUCCESSapply(success before replica applies)

Pros:

  • Low write latency: no waiting for network round-trips
  • Replica failures don't affect write availability

Cons:

  • Data loss on primary failure: writes acknowledged but not yet on replica are lost
  • Replication lag: replicas may be seconds or more behind

When Async is Acceptable

For most web applications, asynchronous replication is the right default. Losing a few seconds of data is an acceptable trade-off for write performance. Use synchronous replication for financial transactions or any data where loss is unacceptable.


Semi-Synchronous Replication

A middle ground: the primary waits for at least one replica to receive (but not necessarily apply) the data.

sql
Loading...

If no replica responds within the timeout, it falls back to asynchronous mode automatically. This prevents data loss in the most common failure scenario (primary dies before replicating) while tolerating replica failures gracefully.


Replication Lag

Replication lag is the delay between when a write is committed on the primary and when it appears on a replica. It's measured in seconds (ideally milliseconds).

text
Loading...

How to Measure

sql
Loading...

Consequences of Lag

Stale reads: a user updates their profile and immediately reads it — the read hits a lagged replica and returns old data. This violates read-your-writes consistency.

text
Loading...

Solutions:

  • Route reads-after-writes to the primary
  • Use session-level consistency: after a write, track a "read timestamp" and only use replicas that have caught up
  • Add a brief sleep/retry before reading back

Failover: Promoting a Replica

When the primary fails, one replica must be promoted to become the new primary.

text
Loading...

Manual Failover (PostgreSQL)

bash
Loading...

Automatic Failover

Tools like Patroni (PostgreSQL), MHA (MySQL), or cloud-managed databases handle this automatically:

  1. Health check detects primary is unreachable
  2. Consensus algorithm (Raft, etcd) elects the most up-to-date replica
  3. Elected replica is promoted
  4. Other replicas repoint to the new primary
  5. Application connection pool is notified (via DNS update or proxy layer)
etcd / Consulstores leader statePatroni monitorsPrimary(Patroni agent)Replica(Patroni agent)

Failover Challenges

  • Split-brain: both old primary and new primary accept writes simultaneously → data divergence. Fencing mechanisms (STONITH — Shoot The Other Node In The Head) prevent this.
  • Data loss window: with async replication, some writes on the old primary may not have reached the promoted replica.

Multi-Primary (Multi-Master) Replication

Every node can accept writes. Changes are replicated to all other nodes.

Primary 1(RW — Region A)Primary 2(RW — Region B)replicateApp writesApp writes

Advantage: write availability even if one region goes down. No single point of failure.

Problem: write conflicts. Two clients update the same row on different nodes simultaneously. Conflict resolution strategies:

  • Last-write-wins (LWW): the write with the later timestamp wins. Can silently drop data.
  • Application-level resolution: conflicts are flagged and the application resolves them.
  • CRDTs (Conflict-free Replicated Data Types): data structures designed to merge without conflicts (e.g., counters, sets).

Multi-primary is complex to operate correctly. Use it only when geography-distributed writes are required.


Sharding vs Replication

These solve different problems and are often used together:

ConceptWhat it doesProblem solved
ReplicationCopies data to multiple nodesHigh availability, read scaling
ShardingSplits data across multiple nodesWrite scaling, storage scaling
text
Loading...

A production system often does both: data is sharded across N nodes, and each shard has its own primary-replica set.


Consistency Models

ModelGuaranteeCommon Use
Strong consistencyEvery read returns the most recent writeFinancial systems, coordination
Eventual consistencyAll replicas converge eventuallySocial media likes, DNS
Read-your-writesA client always sees its own writesUser profile updates
Monotonic readsA client never reads older data after reading newer dataFeed readers
Causal consistencyCausally related operations appear in orderMessaging

Cloud Read Replicas

AWS RDS Read Replicas

text
Loading...
  • Up to 5 read replicas per RDS instance
  • Cross-region read replicas for geographic read distribution
  • Read replicas can be promoted to standalone databases in seconds

Amazon Aurora

Aurora uses a shared distributed storage layer. All instances (up to 15 read replicas) share the same storage — replicas are near-instantaneous because they read from the same storage volume, no data copying required.

text
Loading...

Google Cloud SQL

bash
Loading...

Python: Routing Writes and Reads

python
Loading...

Checking Replication Lag in Python

python
Loading...

Summary

  • Replication copies data to multiple servers for high availability and read scaling
  • Synchronous: zero data loss, higher latency — use for financial-critical writes
  • Asynchronous: low latency, possible data loss — the default for most systems
  • Replication lag causes stale reads — measure it, and route reads-after-writes to the primary
  • Failover promotes a replica to primary — automate it with Patroni, Orchestrator, or use a managed cloud database
  • Multi-primary enables geographic writes but introduces complex conflict resolution
  • Sharding solves write/storage scale; replication solves availability and read scale — use both in production
  • Cloud databases (RDS, Aurora, Cloud SQL) manage replication and failover for you at the cost of less control