Database Scaling
Overview
Database scaling tackles read/write growth beyond single-node limits through replication, partitioning, specialized engines, and application-level patterns.
Why This Exists
The database is often the hardest component to scale; its data model and consistency constraints ripple through the whole architecture.
How It Works
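Scaling work falls into three areas:
- Read scaling: replicas, caching, and CQRS.
- Write scaling: sharding, choosing partitioning keys, and avoiding hot partitions.
- Operational: backups, failover, and online schema changes.
Keep these choices aligned with the guidance in Databases: scaling.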
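As a minimal sketch of the read-scaling idea, the snippet below sends writes to a primary and rotates reads across replicas. The `ReadWriteRouter` class, the node names, and the "starts with SELECT" check are all hypothetical and chosen only for illustration; a real router would also have to account for replication lag when a client must read its own writes.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Naive classification for illustration: only plain SELECTs count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM orders WHERE tenant_id = 1"),
    router.route("SELECT * FROM orders WHERE tenant_id = 2"),
    router.route("UPDATE orders SET status = 'paid' WHERE id = 7"),
]
print(targets)  # ['replica-1', 'replica-2', 'primary']
```

Adding replicas in this sketch increases read capacity only; every write still lands on the single primary, which is why write scaling needs sharding.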
Architecture

flowchart LR
Router[Shard router] --> S1[(Shard 1)]
Router --> S2[(Shard 2)]
S1 --> R1[Replica]
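The shard router in the diagram could be sketched roughly as follows, assuming hash-based placement. The shard names and the `shard_for` helper are hypothetical; the diagram itself does not say how the router picks a shard.

```python
import hashlib

SHARDS = ["shard-1", "shard-2"]  # hypothetical names mirroring the diagram


def shard_for(key: str, shards=SHARDS) -> str:
    """Map a key to a shard with a stable hash.

    hashlib is used because Python's built-in hash() for strings is salted
    per process, so it would not give the same placement across restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


placement = {key: shard_for(key) for key in ("tenant-a", "tenant-b", "tenant-c")}
print(placement)
```

Plain modulo placement remaps most keys when the number of shards changes, so a design that expects to add shards may prefer consistent hashing or a lookup directory instead.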
Key Concepts
Choose the shard key wisely
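A skewed shard key concentrates traffic on a few shards and undoes the gains of horizontal scaling. Measure the key distribution before committing to a key, and rebalance with operational tooling when it drifts.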
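One way to measure distribution is to replay a sample of request keys through the placement function and compare the busiest shard with the average. The sketch below assumes hash-modulo placement and a made-up workload in which one tenant issues 90% of requests; `shard_index` and `skew_ratio` are hypothetical helpers.

```python
from collections import Counter
import hashlib


def shard_index(key: str, shard_count: int) -> int:
    """Stable hash-modulo placement (an assumption for this sketch)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def skew_ratio(keys, shard_count: int) -> float:
    """Return busiest-shard load divided by mean load (1.0 is perfectly even)."""
    load = Counter(shard_index(k, shard_count) for k in keys)
    mean = len(keys) / shard_count
    return max(load.values()) / mean


# Hypothetical workload: one tenant issues 900 of 1000 requests.
requests = ["tenant-hot"] * 900 + [f"tenant-{i}" for i in range(100)]
ratio = skew_ratio(requests, 4)
print(f"skew by tenant: {ratio:.2f}")
```

With 1000 requests over 4 shards the mean load is 250, and the hot tenant alone puts at least 900 on one shard, so the ratio here is at least 3.6 regardless of where the other tenants land. A ratio close to 1.0 suggests the key spreads load evenly.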
Code Examples
-- Include tenant_id in every query so the planner can prune to one partition.
-- $1 and $2 are bind parameters (tenant id, order id).
SELECT * FROM orders WHERE tenant_id = $1 AND id = $2;
Interview Questions
What is the difference between replication and sharding?
Replication copies the same dataset for availability and read scale; sharding splits different subsets of data across nodes.
How does two-phase commit relate to scaling?
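2PC makes a transaction that spans shards commit atomically, but it adds coordination round trips and can leave participants blocked if the coordinator fails. Systems that scale out often avoid it in favor of sagas built from idempotent steps.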
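The saga alternative can be sketched as a list of steps, each paired with a compensating action. This is a minimal in-memory illustration, not a production pattern: `run_saga`, `SagaFailed`, and the step names are hypothetical, and the `completed` set stands in for a durable log.

```python
class SagaFailed(Exception):
    """Raised with the name of the step that failed."""


def run_saga(steps, completed):
    """Run (name, action, compensate) steps in order.

    Steps already recorded in `completed` are skipped, so resuming after a
    crash does not apply them twice. On failure, earlier completed steps are
    compensated in reverse order.
    """
    for index, (name, action, _) in enumerate(steps):
        if name in completed:
            continue  # applied in an earlier attempt
        try:
            action()
        except Exception as exc:
            for prev_name, _, compensate in reversed(steps[:index]):
                if prev_name in completed:
                    compensate()
                    completed.discard(prev_name)
            raise SagaFailed(name) from exc
        completed.add(name)


log = []


def charge_card():
    raise RuntimeError("payment declined")


steps = [
    ("reserve_stock", lambda: log.append("reserve"), lambda: log.append("release")),
    ("charge_card", charge_card, lambda: log.append("refund")),
]

completed = set()
try:
    run_saga(steps, completed)
except SagaFailed as err:
    log.append(f"failed at {err}")
print(log)  # ['reserve', 'release', 'failed at charge_card']
```

Unlike 2PC, no participant holds locks while waiting for a coordinator; the trade-off is that other readers can observe the intermediate state (stock reserved, card not yet charged) before compensation runs.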
Practice Problems
- Shard a multi-tenant messaging system by conversation id
- Evaluate CQRS for a read-heavy analytics dashboard
Resources
- Designing Data-Intensive Applications
- Google Spanner paper — global SQL at scale