Load Balancers (System Design)

Overview

Load balancers distribute traffic across healthy instances to balance utilization and improve availability. They can terminate TLS, enforce routing rules, and shield origins from overload.

Why This Exists

Spreading requests across pools with health-aware routing mitigates single points of failure and keeps any one instance from saturating.

How It Works

The core algorithms and features mirror Networking — load balancing; at the system design level, the focus shifts to session affinity, blue/green deployments, canary routing, and global versus regional load balancing.
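Canary routing can be reduced to a weighted choice between backend pools. A minimal sketch, assuming illustrative pool names and a 95/5 traffic split (both are hypothetical, not from the original):

```python
import random

# Hypothetical weights: send 5% of traffic to the canary pool,
# the rest to the stable pool.
POOLS = {"stable": 95, "canary": 5}

def pick_pool(pools):
    """Weighted random choice over pool names."""
    names = list(pools)
    weights = [pools[n] for n in names]
    return random.choices(names, weights=weights, k=1)[0]

# Route a batch of requests and count the split.
counts = {"stable": 0, "canary": 0}
for _ in range(10_000):
    counts[pick_pool(POOLS)] += 1
print(counts)  # roughly a 95/5 split
```

In practice the weights live in the load balancer's config and are shifted gradually (5% → 25% → 100%) as the canary proves healthy.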

Architecture


```mermaid
flowchart LR
    G[Global LB] --> R1[Region 1]
    G --> R2[Region 2]
    R1 --> LB[Regional LB]
    LB --> Pod[Pods]
```

Key Concepts

Sticky sessions: pinning users to specific instances can break even load distribution during deploys; prefer stateless services with a shared session store when possible.
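The shared-session-store idea can be sketched as follows. This is a minimal illustration: the in-memory dict stands in for an external store such as Redis, and the instance and session names are hypothetical.

```python
# A shared session store (an in-memory dict standing in for Redis or
# similar), so any instance can serve any user without pinning.
class SessionStore:
    def __init__(self):
        self._sessions = {}

    def get(self, session_id):
        return self._sessions.get(session_id)

    def put(self, session_id, data):
        self._sessions[session_id] = data

def handle_request(instance_name, session_id, store):
    """Any instance can pick up the session, mutate it, and write it back."""
    session = store.get(session_id) or {"visits": 0}
    session["visits"] += 1
    store.put(session_id, session)
    return f"{instance_name} served visit {session['visits']}"

# Two "instances" share the same store; either can handle the next request.
store = SessionStore()
print(handle_request("instance-a", "user-1", store))  # instance-a served visit 1
print(handle_request("instance-b", "user-1", store))  # instance-b served visit 2
```

Because session state lives outside the instances, the load balancer is free to route each request anywhere, and deploys no longer strand pinned users.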

Code Examples

A minimal Kubernetes Ingress routing all paths on a host to one backend service (the host, service name, and the `api-ingress` metadata name are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
```

Interview Questions

How do you drain connections during deploy?

Remove the instance from rotation, wait for active connections to finish or time out, then deploy; coordinate with health checks so the balancer stops sending new traffic first.
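The wait-then-timeout step of draining can be sketched as a small loop. This is a toy model: the `active_connections` callback and the simulated connection counter are hypothetical stand-ins for a real balancer's connection tracking.

```python
import time

def drain(active_connections, timeout_s, poll_s=0.01, now=time.monotonic):
    """Wait until the in-flight connection count reaches zero or the
    timeout expires. Returns True if the instance drained cleanly."""
    deadline = now() + timeout_s
    while active_connections() > 0 and now() < deadline:
        time.sleep(poll_s)
    return active_connections() == 0

# Simulate connections that finish one per poll.
remaining = [3]
def fake_active():
    if remaining[0] > 0:
        remaining[0] -= 1
    return remaining[0]

print(drain(fake_active, timeout_s=1.0))  # True once the count reaches zero
```

The timeout matters: connections that outlive it are cut, so it should be set longer than the slowest legitimate request.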

What is anycast?

The same IP is announced from multiple locations; BGP routes each client to the nearest point of presence (POP). Anycast underpins global load balancing and CDNs.

Practice Problems

  • Compare active-passive vs active-active multi-region setups
  • Design failover when regional database is unavailable
