Load Balancing Patterns

Distribute traffic across infrastructure using appropriate load balancing approaches, from simple round-robin to global multi-region failover.

When to Use

Use when:

  • Distributing traffic across multiple application servers
  • Implementing high availability and failover
  • Routing traffic based on URLs, headers, or geographic location
  • Managing session persistence across stateless backends
  • Deploying applications to Kubernetes clusters
  • Configuring global traffic management across regions
  • Implementing zero-downtime deployments (blue-green, canary)

Core Concepts

Layer 4 vs Layer 7

Layer 4 (Transport Layer):

  • Routes based on IP address and port (TCP/UDP packets)
  • No application data inspection
  • Lower latency, higher throughput
  • Client source IP preservation
  • Use for: Database connections, video streaming, gaming, non-HTTP protocols

Layer 7 (Application Layer):

  • Routes based on HTTP URLs, headers, cookies, request body
  • Full application data visibility
  • SSL/TLS termination, caching, WAF integration
  • Content-based routing capabilities
  • Use for: Web applications, REST APIs, microservices, complex routing logic
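As a rough illustration, the same NGINX instance can balance at either layer: a stream block forwards raw TCP (L4), while an http block routes on the request path (L7). A minimal sketch, not a full configuration; all hostnames, addresses, and ports are placeholders.

# L4: forward raw TCP to database replicas -- no application-level inspection
stream {
    upstream pg_replicas {
        server db1.internal:5432;
        server db2.internal:5432;
    }
    server {
        listen 5432;
        proxy_pass pg_replicas;
    }
}

# L7: route on the URL path -- requires parsing the HTTP request
http {
    upstream api_servers { server 10.0.0.21:8080; }
    upstream web_servers { server 10.0.0.31:8080; }

    server {
        listen 80;
        location /api/ { proxy_pass http://api_servers; }
        location /     { proxy_pass http://web_servers; }
    }
}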

Load Balancing Algorithms

Algorithm             | Distribution Method        | Use Case
Round Robin           | Sequential                 | Stateless, similar servers
Weighted Round Robin  | Capacity-based             | Different server specs
Least Connections     | Fewest active connections  | Long-lived connections
Least Response Time   | Fastest server             | Performance-sensitive
IP Hash               | Client IP-based            | Session persistence
Resource-Based        | CPU/memory metrics         | Varying workloads
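In most load balancers the algorithm and per-server weights are a one- or two-line configuration change. A hypothetical HAProxy backend combining least-connections with weights (addresses and names are assumptions):

backend api_servers
    balance leastconn                            # new connections go to the least-loaded server
    server api1 10.0.0.11:8080 check weight 3    # larger instance takes roughly 3x the traffic
    server api2 10.0.0.12:8080 check weight 1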

Health Check Types

Shallow (Liveness): Is the process alive?

  • Endpoint: /health/live or /live
  • Returns: 200 if process running
  • Use for: Process monitoring, container health

Deep (Readiness): Can the service handle requests?

  • Endpoint: /health/ready or /ready
  • Validates: Database, cache, external API connectivity
  • Use for: Load balancer routing decisions
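In Kubernetes, the two check types map directly onto liveness and readiness probes. A sketch, assuming the container serves the endpoints above on port 8080:

livenessProbe:            # shallow check: restart the container if the process is dead
  httpGet:
    path: /health/live
    port: 8080
  periodSeconds: 10
readinessProbe:           # deep check: only receive traffic when dependencies are reachable
  httpGet:
    path: /health/ready
    port: 8080
  periodSeconds: 5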

Health Check Hysteresis:

  • Different thresholds for marking up vs down
  • Example: 3 failures to mark down, 2 successes to mark up
  • Prevents flapping
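Most load balancers expose these thresholds directly. In HAProxy, for example, the "3 down / 2 up" policy above is written with fall and rise (addresses are assumptions):

backend web_servers
    option httpchk GET /health/ready
    server web1 192.168.1.101:8080 check inter 5s fall 3 rise 2   # down after 3 failures, up after 2 successes
    server web2 192.168.1.102:8080 check inter 5s fall 3 rise 2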

Cloud Load Balancers

AWS

Application Load Balancer (ALB) - Layer 7:

  • HTTP/HTTPS applications, microservices, WebSocket
  • Path/host/header routing, AWS WAF integration
  • Choose when: Content-based routing needed

Network Load Balancer (NLB) - Layer 4:

  • Ultra-low latency (<1ms), TCP/UDP, static IPs
  • Millions of requests per second
  • Choose when: Non-HTTP protocols, performance critical

Global Accelerator - Layer 4 Global:

  • Multi-region applications, global users
  • Anycast IPs, automatic regional failover

GCP

  • Application Load Balancer (L7): Global HTTPS load balancing, Cloud CDN integration, Cloud Armor (WAF/DDoS)
  • Network Load Balancer (L4): Regional TCP/UDP, pass-through balancing, session affinity
  • Cloud Load Balancing: Single anycast IP, global distribution

Azure

  • Application Gateway (L7): WAF integration, URL-based routing, SSL termination
  • Load Balancer (L4): Basic and Standard SKUs, health probes
  • Traffic Manager (Global): DNS-based routing (priority, weighted, performance)

Self-Managed Load Balancers

NGINX

Best for: General-purpose HTTP/HTTPS load balancing

upstream backend {
    least_conn;                           # prefer the server with the fewest active connections
    server backend1.example.com:8080 weight=3;
    server backend2.example.com:8080 weight=2;
    keepalive 32;                         # pool of idle upstream connections to reuse
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;           # required for upstream keepalive
        proxy_set_header Connection "";   # clear the Connection header so keepalive is used
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

HAProxy

Best for: Maximum performance, database load balancing

frontend http_front
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health
    server web1 192.168.1.101:8080 check
    server web2 192.168.1.102:8080 check
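For the database use case, HAProxy can also balance at L4 in TCP mode; a sketch with assumed replica addresses:

listen postgres_replicas
    bind *:5432
    mode tcp                    # L4: no HTTP parsing
    balance leastconn
    server pg1 10.0.1.11:5432 check
    server pg2 10.0.1.12:5432 check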

Envoy

Best for: Microservices, Kubernetes, service mesh integration

  • Cloud-native design with dynamic configuration (xDS APIs)
  • Circuit breakers, retries, timeouts
  • Advanced health checks (TCP, HTTP, gRPC)
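A minimal static Envoy cluster illustrating the health-check and circuit-breaker settings; names, addresses, and thresholds are assumptions:

clusters:
- name: backend
  type: STRICT_DNS
  connect_timeout: 1s
  lb_policy: LEAST_REQUEST
  load_assignment:
    cluster_name: backend
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: backend.internal, port_value: 8080 }
  circuit_breakers:
    thresholds:
    - max_connections: 1024
      max_pending_requests: 256
  health_checks:
  - timeout: 2s
    interval: 5s
    unhealthy_threshold: 3     # mark down after 3 failures
    healthy_threshold: 2       # mark up after 2 successes
    http_health_check:
      path: /health/ready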

Traefik

Best for: Docker/Kubernetes environments, dynamic configuration

  • Automatic service discovery
  • Native Kubernetes integration
  • Built-in Let's Encrypt support
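With Docker, routing is typically declared as container labels and picked up automatically; a hypothetical Compose service (image, hostname, and resolver name are assumptions):

services:
  api:
    image: example/api:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=Host(`api.example.com`)"
      - "traefik.http.routers.api.tls.certresolver=letsencrypt"   # assumes a resolver named "letsencrypt" is configured
      - "traefik.http.services.api.loadbalancer.server.port=8080"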

Session Persistence

Sticky Sessions (Use Sparingly)

Cookie-Based:

  • Load balancer sets cookie to track server affinity
  • Accurate routing, works with NAT/proxies
  • HTTP only, adds cookie overhead
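If cookie-based affinity is unavoidable, it is usually a few lines of load balancer configuration; a hypothetical HAProxy backend:

backend web_servers
    balance roundrobin
    cookie SERVERID insert indirect nocache      # load balancer sets and strips the affinity cookie
    server web1 192.168.1.101:8080 check cookie web1
    server web2 192.168.1.102:8080 check cookie web2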

IP Hash:

  • Hash client IP to select backend server
  • No cookie required, works for non-HTTP
  • Poor distribution with NAT/proxies
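In NGINX, IP-hash affinity is a single directive in the upstream block (hostnames are placeholders):

upstream backend {
    ip_hash;                    # same client IP maps to the same backend while it stays healthy
    server backend1.example.com:8080;
    server backend2.example.com:8080;
}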

Drawbacks:

  • Uneven load distribution
  • Session lost on server failure
  • Complicates scaling

Centralized Session Storage (Recommended)

Architecture: Stateless application servers + centralized session storage (Redis, Memcached); a code sketch follows the list below.

Benefits:

  • No sticky sessions needed
  • True load balancing
  • Server failures don't lose sessions
  • Trivial horizontal scaling
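A minimal sketch of central session storage using Redis directly (assumes the redis-py client and a Redis instance at localhost:6379; key names and TTL are illustrative):

import json
import uuid

import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def create_session(user_id: str, ttl_seconds: int = 3600) -> str:
    """Store session data centrally so any app server behind the LB can read it."""
    session_id = str(uuid.uuid4())
    r.setex(f"session:{session_id}", ttl_seconds, json.dumps({"user_id": user_id}))
    return session_id

def load_session(session_id: str):
    """Return the session dict, or None if it expired or never existed."""
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None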

Client-Side Tokens (Best for APIs)

JWT (JSON Web Tokens): The server issues a signed token; the client stores it and sends it with each request (see the sketch below).

Benefits:

  • Fully stateless servers
  • Even load distribution with no server affinity
  • No session storage needed
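A sketch of issuing and verifying a token with the PyJWT library (the secret, claims, and lifetime are illustrative assumptions):

import time

import jwt  # PyJWT

SECRET = "replace-with-a-real-secret"

def issue_token(user_id: str) -> str:
    """Sign a short-lived token; no server-side session record is created."""
    now = int(time.time())
    return jwt.encode({"sub": user_id, "iat": now, "exp": now + 3600}, SECRET, algorithm="HS256")

def verify_token(token: str) -> dict:
    """Any app server behind the load balancer can verify the signature statelessly."""
    return jwt.decode(token, SECRET, algorithms=["HS256"])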

Kubernetes Ingress Controllers

Controller                | Best For              | Strengths
NGINX Ingress (F5)        | General purpose       | Stability, wide adoption
Traefik                   | Dynamic environments  | Easy configuration, service discovery
HAProxy Ingress           | High performance      | Advanced L7 routing
Envoy (Contour/Gateway)   | Service mesh          | Rich L7 features, extensibility
Kong                      | API-heavy apps        | JWT auth, rate limiting, plugins
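Whichever controller is chosen, routing is declared with the standard Ingress resource; a sketch for the NGINX Ingress controller (hostname and service names are assumptions):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api
            port:
              number: 8080
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web
            port:
              number: 80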

Progressive Delivery

Canary Deployment

# Istio VirtualService
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: backend-canary
spec:
  hosts:
  - backend
  http:
  - route:
    - destination:
        host: backend
        subset: v1
      weight: 90
    - destination:
        host: backend
        subset: v2
      weight: 10
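The v1/v2 subsets referenced above must be defined in a matching DestinationRule; a minimal sketch, assuming the pods are labeled version: v1 and version: v2:

apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: backend
spec:
  host: backend
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2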

Blue/Green Deployment

  • Instant cutover with quick rollback
  • Deploy green alongside blue
  • Test green, then instant switch
  • Roll back to blue if needed
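On Kubernetes, one common way to implement the cutover is to flip a Service selector between the two Deployments; a sketch where the slot label is an assumption:

apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  selector:
    app: backend
    slot: blue          # change to "green" for instant cutover; change back to roll back
  ports:
  - port: 80
    targetPort: 8080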

Quick Selection Guide

Use Case                  | Recommended Solution
HTTP web app (AWS)        | ALB
Non-HTTP protocol (AWS)   | NLB
Kubernetes HTTP ingress   | NGINX Ingress or Traefik
Maximum performance       | HAProxy
Service mesh              | Envoy
Multi-cloud portable      | NGINX or HAProxy
Global distribution       | Cloudflare, AWS Global Accelerator

References

  • Full Skill Documentation
  • L4 vs L7 Comparison: references/l4-vs-l7-comparison.md
  • Health Check Strategies: references/health-check-strategies.md
  • Cloud Load Balancers: references/cloud-load-balancers.md
  • Session Persistence: references/session-persistence.md
  • Kubernetes Ingress: references/kubernetes-ingress.md