Load Balancing Patterns

Distribute traffic across infrastructure using appropriate load balancing approaches, from simple round-robin to global multi-region failover.

When to Use

Use when:

  • Distributing traffic across multiple application servers
  • Implementing high availability and failover
  • Routing traffic based on URLs, headers, or geographic location
  • Managing session persistence across stateless backends
  • Deploying applications to Kubernetes clusters
  • Configuring global traffic management across regions
  • Implementing zero-downtime deployments (blue-green, canary)

Core Concepts

Layer 4 vs Layer 7

Layer 4 (Transport Layer):

  • Routes based on IP address and port (TCP/UDP packets)
  • No application data inspection
  • Lower latency, higher throughput
  • Client source IP preservation
  • Use for: Database connections, video streaming, gaming, non-HTTP protocols

Layer 7 (Application Layer):

  • Routes based on HTTP URLs, headers, cookies, request body
  • Full application data visibility
  • SSL/TLS termination, caching, WAF integration
  • Content-based routing capabilities
  • Use for: Web applications, REST APIs, microservices, complex routing logic
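As a rough illustration, the same NGINX instance can balance at either layer: a stream block forwards raw TCP (L4), while an http block routes on the request path (L7). A minimal sketch, not a full configuration; all hostnames, addresses, and ports are placeholders.

# L4: forward raw TCP to database replicas -- no application-level inspection
stream {
    upstream pg_replicas {
        server db1.internal:5432;
        server db2.internal:5432;
    }
    server {
        listen 5432;
        proxy_pass pg_replicas;
    }
}

# L7: route on the URL path -- requires parsing the HTTP request
http {
    upstream api_servers { server 10.0.0.21:8080; }
    upstream web_servers { server 10.0.0.31:8080; }

    server {
        listen 80;
        location /api/ { proxy_pass http://api_servers; }
        location /     { proxy_pass http://web_servers; }
    }
}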

Load Balancing Algorithms

Algorithm             | Distribution Method        | Use Case
Round Robin           | Sequential                 | Stateless, similar servers
Weighted Round Robin  | Capacity-based             | Different server specs
Least Connections     | Fewest active connections  | Long-lived connections
Least Response Time   | Fastest server             | Performance-sensitive
IP Hash               | Client IP-based            | Session persistence
Resource-Based        | CPU/memory metrics         | Varying workloads
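In most load balancers the algorithm and per-server weights are a one- or two-line configuration change. A hypothetical HAProxy backend combining least-connections with weights (addresses and names are assumptions):

backend api_servers
    balance leastconn                            # new connections go to the least-loaded server
    server api1 10.0.0.11:8080 check weight 3    # larger instance takes roughly 3x the traffic
    server api2 10.0.0.12:8080 check weight 1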

Health Check Types

Shallow (Liveness): Is the process alive?

  • Endpoint: /health/live or /live
  • Returns: 200 if process running
  • Use for: Process monitoring, container health

Deep (Readiness): Can the service handle requests?

  • Endpoint: /health/ready or /ready
  • Validates: Database, cache, external API connectivity
  • Use for: Load balancer routing decisions
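In Kubernetes, the two check types map directly onto liveness and readiness probes. A sketch, assuming the container serves the endpoints above on port 8080:

livenessProbe:            # shallow check: restart the container if the process is dead
  httpGet:
    path: /health/live
    port: 8080
  periodSeconds: 10
readinessProbe:           # deep check: only receive traffic when dependencies are reachable
  httpGet:
    path: /health/ready
    port: 8080
  periodSeconds: 5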

Health Check Hysteresis:

  • Different thresholds for marking up vs down
  • Example: 3 failures to mark down, 2 successes to mark up
  • Prevents flapping
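Most load balancers expose these thresholds directly. In HAProxy, for example, the "3 down / 2 up" policy above is written with fall and rise (addresses are assumptions):

backend web_servers
    option httpchk GET /health/ready
    server web1 192.168.1.101:8080 check inter 5s fall 3 rise 2   # down after 3 failures, up after 2 successes
    server web2 192.168.1.102:8080 check inter 5s fall 3 rise 2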

Cloud Load Balancers

AWS

Application Load Balancer (ALB) - Layer 7:

  • HTTP/HTTPS applications, microservices, WebSocket
  • Path/host/header routing, AWS WAF integration
  • Choose when: Content-based routing needed

Network Load Balancer (NLB) - Layer 4:

  • Ultra-low latency (<1ms), TCP/UDP, static IPs
  • Millions of requests per second
  • Choose when: Non-HTTP protocols, performance critical

Global Accelerator - Layer 4 Global:

  • Multi-region applications, global users
  • Anycast IPs, automatic regional failover

GCP

  • Application Load Balancer (L7): Global HTTPS load balancing, Cloud CDN integration, Cloud Armor (WAF/DDoS)
  • Network Load Balancer (L4): Regional TCP/UDP, pass-through balancing, session affinity
  • Cloud Load Balancing: Single anycast IP, global distribution

Azure

  • Application Gateway (L7): WAF integration, URL-based routing, SSL termination
  • Load Balancer (L4): Basic and Standard SKUs, health probes
  • Traffic Manager (Global): DNS-based routing (priority, weighted, performance)

Self-Managed Load Balancers

NGINX

Best for: General-purpose HTTP/HTTPS load balancing

upstream backend {
    least_conn;                           # prefer the server with the fewest active connections
    server backend1.example.com:8080 weight=3;
    server backend2.example.com:8080 weight=2;
    keepalive 32;                         # pool of idle upstream connections to reuse
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;           # required for upstream keepalive
        proxy_set_header Connection "";   # clear the Connection header so keepalive is used
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

HAProxy

Best for: Maximum performance, database load balancing

frontend http_front
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health
    server web1 192.168.1.101:8080 check
    server web2 192.168.1.102:8080 check
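For the database use case, HAProxy can also balance at L4 in TCP mode; a sketch with assumed replica addresses:

listen postgres_replicas
    bind *:5432
    mode tcp                    # L4: no HTTP parsing
    balance leastconn
    server pg1 10.0.1.11:5432 check
    server pg2 10.0.1.12:5432 check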

Envoy

Best for: Microservices, Kubernetes, service mesh integration

  • Cloud-native design with dynamic configuration (xDS APIs)
  • Circuit breakers, retries, timeouts
  • Advanced health checks (TCP, HTTP, gRPC)
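A minimal static Envoy cluster illustrating the health-check and circuit-breaker settings; names, addresses, and thresholds are assumptions:

clusters:
- name: backend
  type: STRICT_DNS
  connect_timeout: 1s
  lb_policy: LEAST_REQUEST
  load_assignment:
    cluster_name: backend
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: backend.internal, port_value: 8080 }
  circuit_breakers:
    thresholds:
    - max_connections: 1024
      max_pending_requests: 256
  health_checks:
  - timeout: 2s
    interval: 5s
    unhealthy_threshold: 3     # mark down after 3 failures
    healthy_threshold: 2       # mark up after 2 successes
    http_health_check:
      path: /health/ready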

Traefik

Best for: Docker/Kubernetes environments, dynamic configuration

  • Automatic service discovery
  • Native Kubernetes integration
  • Built-in Let's Encrypt support
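With Docker, routing is typically declared as container labels and picked up automatically; a hypothetical Compose service (image, hostname, and resolver name are assumptions):

services:
  api:
    image: example/api:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=Host(`api.example.com`)"
      - "traefik.http.routers.api.tls.certresolver=letsencrypt"   # assumes a resolver named "letsencrypt" is configured
      - "traefik.http.services.api.loadbalancer.server.port=8080"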

Session Persistence

Sticky Sessions (Use Sparingly)

Cookie-Based:

  • Load balancer sets cookie to track server affinity
  • Accurate routing, works with NAT/proxies
  • HTTP only, adds cookie overhead
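If cookie-based affinity is unavoidable, it is usually a few lines of load balancer configuration; a hypothetical HAProxy backend:

backend web_servers
    balance roundrobin
    cookie SERVERID insert indirect nocache      # load balancer sets and strips the affinity cookie
    server web1 192.168.1.101:8080 check cookie web1
    server web2 192.168.1.102:8080 check cookie web2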

IP Hash:

  • Hash client IP to select backend server
  • No cookie required, works for non-HTTP
  • Poor distribution with NAT/proxies
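In NGINX, IP-hash affinity is a single directive in the upstream block (hostnames are placeholders):

upstream backend {
    ip_hash;                    # same client IP maps to the same backend while it stays healthy
    server backend1.example.com:8080;
    server backend2.example.com:8080;
}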

Drawbacks:

  • Uneven load distribution
  • Session lost on server failure
  • Complicates scaling

Centralized Session Storage (Recommended)

Architecture: Stateless application servers + centralized session storage (Redis, Memcached); a code sketch follows the list below.

Benefits:

  • No sticky sessions needed
  • True load balancing
  • Server failures don't lose sessions
  • Trivial horizontal scaling
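A minimal sketch of central session storage using Redis directly (assumes the redis-py client and a Redis instance at localhost:6379; key names and TTL are illustrative):

import json
import uuid

import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def create_session(user_id: str, ttl_seconds: int = 3600) -> str:
    """Store session data centrally so any app server behind the LB can read it."""
    session_id = str(uuid.uuid4())
    r.setex(f"session:{session_id}", ttl_seconds, json.dumps({"user_id": user_id}))
    return session_id

def load_session(session_id: str):
    """Return the session dict, or None if it expired or never existed."""
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None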

Client-Side Tokens (Best for APIs)

JWT (JSON Web Tokens): The server issues a signed token; the client stores it and sends it with each request (see the sketch below).

Benefits:

  • Fully stateless servers
  • Even load distribution with no server affinity
  • No session storage needed
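A sketch of issuing and verifying a token with the PyJWT library (the secret, claims, and lifetime are illustrative assumptions):

import time

import jwt  # PyJWT

SECRET = "replace-with-a-real-secret"

def issue_token(user_id: str) -> str:
    """Sign a short-lived token; no server-side session record is created."""
    now = int(time.time())
    return jwt.encode({"sub": user_id, "iat": now, "exp": now + 3600}, SECRET, algorithm="HS256")

def verify_token(token: str) -> dict:
    """Any app server behind the load balancer can verify the signature statelessly."""
    return jwt.decode(token, SECRET, algorithms=["HS256"])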

Kubernetes Ingress Controllers

Controller                | Best For              | Strengths
NGINX Ingress (F5)        | General purpose       | Stability, wide adoption
Traefik                   | Dynamic environments  | Easy configuration, service discovery
HAProxy Ingress           | High performance      | Advanced L7 routing
Envoy (Contour/Gateway)   | Service mesh          | Rich L7 features, extensibility
Kong                      | API-heavy apps        | JWT auth, rate limiting, plugins
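Whichever controller is chosen, routing is declared with the standard Ingress resource; a sketch for the NGINX Ingress controller (hostname and service names are assumptions):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api
            port:
              number: 8080
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web
            port:
              number: 80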

Progressive Delivery

Canary Deployment

# Istio VirtualService
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: backend-canary
spec:
  hosts:
  - backend
  http:
  - route:
    - destination:
        host: backend
        subset: v1
      weight: 90
    - destination:
        host: backend
        subset: v2
      weight: 10
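The v1/v2 subsets referenced above must be defined in a matching DestinationRule; a minimal sketch, assuming the pods are labeled version: v1 and version: v2:

apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: backend
spec:
  host: backend
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2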

Blue/Green Deployment

  • Instant cutover with quick rollback
  • Deploy green alongside blue
  • Test green, then instant switch
  • Roll back to blue if needed
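On Kubernetes, one common way to implement the cutover is to flip a Service selector between the two Deployments; a sketch where the slot label is an assumption:

apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  selector:
    app: backend
    slot: blue          # change to "green" for instant cutover; change back to roll back
  ports:
  - port: 80
    targetPort: 8080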

Quick Selection Guide

Use Case                  | Recommended Solution
HTTP web app (AWS)        | ALB
Non-HTTP protocol (AWS)   | NLB
Kubernetes HTTP ingress   | NGINX Ingress or Traefik
Maximum performance       | HAProxy
Service mesh              | Envoy
Multi-cloud portable      | NGINX or HAProxy
Global distribution       | Cloudflare, AWS Global Accelerator

References

  • Full Skill Documentation
  • L4 vs L7 Comparison: references/l4-vs-l7-comparison.md
  • Health Check Strategies: references/health-check-strategies.md
  • Cloud Load Balancers: references/cloud-load-balancers.md
  • Session Persistence: references/session-persistence.md
  • Kubernetes Ingress: references/kubernetes-ingress.md