⚖️ Elastic Load Balancing & Auto Scaling#
Learning Objectives#
- Choose between ALB, NLB, and CLB based on requirements
- Configure target groups, listeners, and health checks
- Implement Auto Scaling policies (simple, step, target tracking)
- Design for high availability across AZs
1. Elastic Load Balancing (ELB)#
1.1 Load Balancer Types#
| Feature | ALB (Layer 7) | NLB (Layer 4) | CLB (Legacy) |
|---|---|---|---|
| OSI Layer | 7 (HTTP/HTTPS) | 4 (TCP/UDP) | 4 & 7 |
| Protocols | HTTP, HTTPS, gRPC | TCP, UDP, TLS | HTTP, HTTPS, TCP, SSL |
| Target Type | Instance, IP, Lambda | Instance, IP, ALB | Instance only |
| SSL/TLS | Yes (termination) | Yes (TLS listener termination or TCP passthrough) | Yes |
| WebSocket | Yes | Yes (as TCP passthrough) | No |
| Sticky Sessions | Yes (cookies) | Yes (source IP) | Yes (cookies) |
| Fixed Response | Yes | No | No |
| Path-based routing | Yes | No | No |
| Host-based routing | Yes | No | No |
| Static IP | No (front with an NLB or Global Accelerator) | Yes (one per AZ, Elastic IP optional) | No |
| Slow start | Yes | No | No |
| Price (hourly, us-east-1; usage charges extra) | $0.0225/hr | $0.0225/hr | $0.025/hr |
| Use Case | Microservices, containers | TCP/UDP, extreme performance | Legacy apps |
⚡ Exam Tip: ALB for HTTP/HTTPS with path-based routing. NLB for TCP/UDP with static IPs or extreme performance. CLB is legacy — avoid on new projects.
1.2 ALB Deep Dive#
graph TD
User["User / Client"]
Route53["Route53 DNS"]
ALB["ALB (Layer 7)\nmy-alb-123.elb.amazonaws.com"]
Listener["Listener: Port 443 (HTTPS)\nCertificate: ACM"]
Rules["Rules:\n/api/* → API-TG\n/* → Web-TG"]
TG_WEB["Web Target Group\nEC2 × 3 (t3.medium)\nHealth: /health"]
TG_API["API Target Group\nECS Fargate × 2\nHealth: /api/health"]
User --> Route53
Route53 --> ALB
ALB --> Listener
Listener --> Rules
Rules --> TG_WEB
Rules --> TG_API
style User fill:#888,color:#fff
style ALB fill:#ff9900,color:#fff
style TG_WEB fill:#527fff,color:#fff
style TG_API fill:#01ab5c,color:#fff
Connection Flow (ALB → Target):
sequenceDiagram
participant User as User
participant DNS as Route53
participant ALB as ALB
participant TG as Target Group
participant EC2 as EC2 Instance
User->>DNS: myapp.com
DNS->>User: ALB DNS name
User->>ALB: HTTPS request (TLS termination)
ALB->>ALB: Evaluate listener rules
ALB->>ALB: Path-based routing (/api/*)
ALB->>TG: Forward to target group
TG->>EC2: HTTP to instance:80
EC2-->>TG: HTTP 200 OK
TG-->>ALB: Response
ALB-->>User: HTTPS response
Note over ALB,EC2: Health checks every 30s
Create ALB:
# Create target group
aws elbv2 create-target-group \
--name web-targets \
--protocol HTTP \
--port 80 \
--vpc-id vpc-abc123 \
--health-check-path /health \
--healthy-threshold-count 3 \
--unhealthy-threshold-count 3 \
--matcher HttpCode="200,301"
# Create ALB
aws elbv2 create-load-balancer \
--name my-alb \
--subnets subnet-abc subnet-def \
--security-groups sg-web
# Create listener
aws elbv2 create-listener \
--load-balancer-arn arn:aws:elasticloadbalancing:...:loadbalancer/app/my-alb/abc \
--protocol HTTPS \
--port 443 \
--certificates CertificateArn=arn:aws:acm:...certificate/abc \
--default-actions Type=forward,TargetGroupArn=arn:aws:...:targetgroup/web-targets/abc
# Register targets
aws elbv2 register-targets \
--target-group-arn arn:aws:...:targetgroup/web-targets/abc \
--targets Id=i-abc123 Id=i-def456
1.3 Sticky Sessions (Session Affinity)#
- ALB: Uses cookies (AWSALB or custom app cookie)
- NLB: Uses source IP (stickiness based on client IP)
Use case: Stateful apps where user sessions must stay on the same instance.
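For the NLB case, stickiness is set through the same target group attributes call, with the type switched to source IP. The sketch below assumes an NLB target group already exists. The function name `enable_nlb_stickiness` is illustrative and not part of the original text, and the target group ARN is supplied by the caller rather than hard-coded.

```shell
# Hypothetical helper: enable source-IP stickiness on an NLB target group.
# $1 = target group ARN (passed in by the caller)
enable_nlb_stickiness() {
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$1" \
    --attributes Key=stickiness.enabled,Value=true \
                 Key=stickiness.type,Value=source_ip
}

# Usage: enable_nlb_stickiness "$NLB_TARGET_GROUP_ARN"
```

Wrapping the call in a function keeps the ARN out of the script body, which makes it easier to reuse across environments.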
# Enable sticky sessions on ALB
aws elbv2 modify-target-group-attributes \
--target-group-arn arn:aws:...:targetgroup/web-targets/abc \
--attributes Key=stickiness.enabled,Value=true \
Key=stickiness.type,Value=lb_cookie \
Key=stickiness.lb_cookie.duration_seconds,Value=86400
1.4 Cross-Zone Load Balancing#
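- ALB: Enabled by default (each load balancer node distributes requests across registered targets in all enabled AZs)
- NLB: Disabled by default (each load balancer node sends traffic only to targets in its own AZ)
- When disabled, client traffic is split evenly across the AZ nodes (50/50 with two AZs) and stays within each AZ, so instances in a smaller AZ carry a larger share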
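Because the NLB default is off, cross-zone load balancing has to be switched on explicitly when even distribution matters. A minimal sketch follows, assuming an existing NLB. The function name `enable_cross_zone` is illustrative, and the load balancer ARN is passed in by the caller.

```shell
# Hypothetical helper: turn on cross-zone load balancing for an NLB.
# $1 = load balancer ARN (passed in by the caller)
enable_cross_zone() {
  aws elbv2 modify-load-balancer-attributes \
    --load-balancer-arn "$1" \
    --attributes Key=load_balancing.cross_zone.enabled,Value=true
}

# Usage: enable_cross_zone "$NLB_ARN"
```

Note that enabling cross-zone on an NLB can add inter-AZ data transfer charges, so it is worth weighing against the imbalance it removes.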
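Cross-Zone ON (traffic spread evenly across all registered targets):
us-east-1a: [EC2] [EC2] (~66.7% traffic, ~33.3% per instance)
us-east-1b: [EC2] (~33.3% traffic)
Cross-Zone OFF (each AZ's node gets 50% and uses only its local targets):
us-east-1a: [EC2] [EC2] (50% traffic, 25% per instance)
us-east-1b: [EC2] (50% traffic, all to 1 instance)
2. Auto Scaling#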
2.1 Auto Scaling Groups (ASG)#
┌─────────────────────────────────────────────────────┐
│ Auto Scaling Group │
│ │
│ Launch Template: │
│ ├── AMI: ami-0abc123 (latest app version) │
│ ├── Instance Type: t3.medium │
│ ├── Security Group: sg-web │
│ ├── IAM Role: EC2-AppRole │
│ └── User Data: bootstrap script │
│ │
│ Scaling Config: │
│ ├── Min: 2 │
│ ├── Desired: 2 │
│ └── Max: 10 │
│ │
│ Health Check: ELB (grace period: 300s) │
│ Scaling Policy: Target tracking (CPU @ 70%) │
└─────────────────────────────────────────────────────┘
Create ASG:
# Create launch template
aws ec2 create-launch-template \
--launch-template-name web-template \
--version-description v1 \
--launch-template-data '{"ImageId": "ami-0abcdef1234567890", "InstanceType": "t3.medium", "SecurityGroupIds": ["sg-abc123"], "IamInstanceProfile": {"Name": "EC2-AppRole"},
"UserData": "'$(base64 -w0 bootstrap.sh)'"
}'
# Create Auto Scaling Group
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name web-asg \
--launch-template LaunchTemplateName=web-template,Version=1 \
--min-size 2 \
--max-size 10 \
--desired-capacity 2 \
--vpc-zone-identifier "subnet-abc,subnet-def" \
--target-group-arns arn:aws:...:targetgroup/web-targets/abc \
--health-check-type ELB \
--health-check-grace-period 300
2.2 Scaling Policies#
| Policy Type | How it Works | Use Case |
|---|---|---|
| Simple | Set threshold → scale by X → cool down | Basic scenarios |
| Step | Multiple thresholds → different scaling amounts | Fine-grained control |
| Target Tracking | Automatically maintain a target metric | Most common (CPU 70%) |
| Scheduled | Change capacity at fixed times or on a recurring schedule | Known traffic patterns |
| Predictive | ML-based forecast + proactive scaling | Cyclic workloads |
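The Scheduled row has no command of its own in this section, so here is a minimal sketch of a recurring scheduled action. The function name `schedule_scale_out`, the action name, and the cron expression in the usage line are illustrative assumptions; only `web-asg` comes from the examples in this chapter.

```shell
# Hypothetical helper: create a recurring scheduled scaling action.
# $1 = ASG name, $2 = action name, $3 = cron recurrence,
# $4 = min size, $5 = desired capacity
schedule_scale_out() {
  aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name "$1" \
    --scheduled-action-name "$2" \
    --recurrence "$3" \
    --min-size "$4" \
    --desired-capacity "$5"
}

# Usage (illustrative values): raise the floor to 4 instances at 08:00 on weekdays
# schedule_scale_out web-asg pre-warm "0 8 * * 1-5" 4 4
```

The recurrence is evaluated in UTC unless a time zone is supplied with `--time-zone`, so the cron expression usually needs adjusting for local business hours.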
Target Tracking Policy:
aws autoscaling put-scaling-policy \
--auto-scaling-group-name web-asg \
--policy-name cpu-target \
--policy-type TargetTrackingScaling \
--target-tracking-configuration '{"TargetValue": 70.0, "PredefinedMetricSpecification": { "PredefinedMetricType": "ASGAverageCPUUtilization" }
}'
Step Scaling Policy (triggered by a CloudWatch alarm; the step bounds are offsets from the alarm threshold):
aws autoscaling put-scaling-policy \
--auto-scaling-group-name web-asg \
--policy-name cpu-step \
--policy-type StepScaling \
--adjustment-type ChangeInCapacity \
--step-adjustments '[
{"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 20, "ScalingAdjustment": 1},
{"MetricIntervalLowerBound": 20, "ScalingAdjustment": 3}
]'
2.3 Lifecycle Hooks#
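Pause instance launch/termination for custom actions:
1. Scale out event → Pending:Wait (held until the heartbeat timeout expires; default 3600s, set to 600s in the command that follows)
2. Lambda invoked, for example through the SNS notification (configures app, registers with monitoring)
3. Complete lifecycle action → InService
# Hold launching instances in Pending:Wait for up to 600s
# Note: an SNS or SQS notification target also requires --role-arn (not shown in the original)
aws autoscaling put-lifecycle-hook \
--lifecycle-hook-name web-configure \
--auto-scaling-group-name web-asg \
--lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
--default-result CONTINUE \
--heartbeat-timeout 600 \
--notification-target-arn arn:aws:sns:...:asg-lifecycle
3. Real-World Use Cases#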
Use Case 1: E-Commerce Traffic Spike#
Scenario: Your e-commerce site gets 10x traffic during Black Friday.
Solution:
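- ALB with path-based routing: /api/* → API servers, /static/* → S3 (via CloudFront or a redirect, since an ALB cannot forward to S3 directly)
- ASG with CPU target tracking (target: 60%)
- Predictive scaling based on recent load history (it forecasts from roughly the last 14 days of metrics, not from last year's event)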
- Scheduled scaling to pre-warm before the event
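The path-based routing in this solution is expressed as a listener rule. A minimal sketch is shown here, assuming an HTTPS listener and an API target group already exist. The function name `add_api_path_rule` and the default priority of 10 are illustrative choices, and both ARNs are passed in by the caller.

```shell
# Hypothetical helper: forward /api/* requests to a dedicated target group.
# $1 = listener ARN, $2 = API target group ARN, $3 = rule priority (optional, defaults to 10)
add_api_path_rule() {
  aws elbv2 create-rule \
    --listener-arn "$1" \
    --priority "${3:-10}" \
    --conditions Field=path-pattern,Values='/api/*' \
    --actions Type=forward,TargetGroupArn="$2"
}

# Usage: add_api_path_rule "$LISTENER_ARN" "$API_TG_ARN"
```

Requests that match no rule fall through to the listener's default action, so the web target group can stay as the default.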
Use Case 2: Blue/Green Deployment#
Blue (Current): ASG with v1 instances
Green (New): ASG with v2 instances
Swap: Update ALB listener to point to Green TG
┌──────────────────────────────────────────────────┐
│ ALB │
│ /v1/* → Blue-TG (instances running v1) │
│ /v2/* → Green-TG (instances running v2) │
│ /* → Blue-TG (until cutover) │
└──────────────────────────────────────────────────┘
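Instead of an all-at-once swap, the default action can split traffic between the Blue and Green target groups by weight. The sketch below is one way to do that under stated assumptions: the function name `shift_traffic` is illustrative, the three ARNs are passed in by the caller, and the weights are treated as percentages that sum to 100.

```shell
# Hypothetical helper: send a percentage of traffic to Green, the rest to Blue.
# $1 = listener ARN, $2 = Blue TG ARN, $3 = Green TG ARN, $4 = Green weight (0-100)
shift_traffic() {
  if [ "$4" -lt 0 ] || [ "$4" -gt 100 ]; then
    echo "green weight must be between 0 and 100" >&2
    return 1
  fi
  blue_weight=$((100 - $4))
  # Build the weighted forward action as JSON
  actions=$(printf '[{"Type":"forward","ForwardConfig":{"TargetGroups":[{"TargetGroupArn":"%s","Weight":%d},{"TargetGroupArn":"%s","Weight":%d}]}}]' \
    "$2" "$blue_weight" "$3" "$4")
  aws elbv2 modify-listener --listener-arn "$1" --default-actions "$actions"
}

# Usage: shift_traffic "$LISTENER_ARN" "$BLUE_TG_ARN" "$GREEN_TG_ARN" 10
```

Calling it repeatedly with 10, 50, then 100 gives a gradual cutover, and calling it with 0 rolls back to Blue.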
Switch: Update default rule to Green-TG → Shift traffic
Use Case 3: Spot + On-Demand Mix#
Scenario: Save costs by using Spot Instances for workers, On-Demand for critical services.
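# Group name and sizes are illustrative placeholders; subnets reuse the earlier example
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name worker-asg \
--min-size 2 --max-size 10 \
--vpc-zone-identifier "subnet-abc,subnet-def" \
--mixed-instances-policy '{"LaunchTemplate": {"LaunchTemplateSpecification": {"LaunchTemplateName": "worker", "Version": "1"}},
"InstancesDistribution": {"OnDemandBaseCapacity": 2, "OnDemandPercentageAboveBaseCapacity": 20, "SpotAllocationStrategy": "capacity-optimized"}
}'
4. ⚡ Exam Tips#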
- ALB vs NLB: ALB = HTTP/HTTPS with path-based routing. NLB = TCP/UDP with static IPs
- Cross-zone LB: ALB = on by default. NLB = off by default
- Sticky sessions: ALB uses cookies. NLB uses source IP
- Target tracking: Easiest scaling policy (just set the target value)
- Health check grace period: Lets the app start before failed health checks count against a new instance (300s default in the console; 0s when created through the CLI or SDK unless set)
- Cooldown: Time between scaling activities (default 300s for simple scaling)
- ELB with Lambda: ALB can directly invoke Lambda as a target
- NLB + static IP: NLB provides one static IP per AZ; add Global Accelerator for fixed anycast IPs across Regions
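The grace period and cooldown from the tips above are both plain ASG settings, so they can be adjusted on an existing group. A minimal sketch follows. The function name `set_asg_timers` is illustrative, and the values in the usage line simply mirror the 300s figures quoted in this chapter.

```shell
# Hypothetical helper: set health check grace period and default cooldown on an ASG.
# $1 = ASG name, $2 = grace period in seconds, $3 = default cooldown in seconds
set_asg_timers() {
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$1" \
    --health-check-grace-period "$2" \
    --default-cooldown "$3"
}

# Usage: set_asg_timers web-asg 300 300
```

Setting the grace period explicitly avoids depending on which default applies to the way the group was created.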
✅ Chapter Quiz#
1. You need a public-facing load balancer with path-based routing. Which type?
- A) CLB
- B) NLB
- C) ALB
- D) Route53
2. What is the default health check grace period in Auto Scaling?
- A) 60 seconds
- B) 120 seconds
- C) 300 seconds
- D) 600 seconds
3. Which scaling policy automatically maintains a target metric?
- A) Simple
- B) Step
- C) Target Tracking
- D) Scheduled
4. By default, cross-zone load balancing is enabled for which ELB type?
- A) NLB only
- B) ALB only
- C) Both ALB and NLB
- D) Neither
5. Which ELB type can directly invoke Lambda functions as targets?
- A) CLB
- B) NLB
- C) ALB
- D) None
6. A web application behind an ALB returns intermittent 503 errors during traffic spikes. What is the MOST likely cause?
- A) The SSL certificate has expired
- B) The target group has insufficient healthy capacity
- C) Cross-zone load balancing is disabled
- D) The security group blocks inbound traffic
7. A company needs to route TCP traffic with static IP addresses across multiple AWS regions. Which combination of services should be used?
- A) ALB + Route53
- B) NLB + Global Accelerator
- C) CLB + CloudFront
- D) NLB + Route53 latency routing
8. An Auto Scaling group has min=2, desired=2, max=10. A target tracking policy is configured for 60% CPU. During a deployment, new instances fail health checks and are terminated. What happens next?
- A) The ASG scales down to 0 instances
- B) The ASG launches replacement instances to maintain desired capacity
- C) The ASG suspends all scaling activities
- D) The ASG remains at the current count
9. A solutions architect needs user sessions to persist on the same backend instance across multiple requests behind an ALB. Which feature should be enabled?
- A) Cross-zone load balancing
- B) Sticky sessions
- C) Connection draining
- D) Slow start
10. After placing an application behind an ALB, all access logs show the ALB’s private IP instead of the client IP. How can the application retrieve the original client IP?
- A) Enable Proxy Protocol on the ALB
- B) Read the X-Forwarded-For header
- C) Enable VPC Flow Logs
- D) Configure the ALB to use the client’s source IP
11. An ASG using simple scaling adds 2 instances when CPU exceeds 80%. After scaling out, CPU drops below the threshold. The ASG waits 300 seconds before allowing another scaling activity. What is this period called?
- A) Health check grace period
- B) Cooldown period
- C) Warm-up period
- D) Termination delay
12. A solutions architect needs to route /api/ requests to backend services and / requests to web servers. Which load balancer type supports this?
- A) NLB
- B) CLB
- C) ALB
- D) Gateway Load Balancer
13. A web application behind an ALB must be protected from SQL injection and cross-site scripting attacks. Which service should be associated with the ALB?
- A) Network ACLs
- B) Security Groups
- C) AWS WAF
- D) AWS Shield Advanced
14. During rolling updates, in-flight requests are interrupted when instances deregister from a target group. What feature should be configured to allow requests to complete?
- A) Slow start
- B) Connection draining (deregistration delay)
- C) Sticky sessions
- D) Cross-zone load balancing
15. An ASG is at max capacity (10 instances) of t3.micro instances, yet the application remains CPU-bound during peak hours. What is the MOST effective solution?
- A) Increase max capacity to 20
- B) Change the instance type to a larger size
- C) Switch from target tracking to step scaling
- D) Reduce the cooldown period
16. A company uses lifecycle hooks to run a custom configuration script before instances serve traffic. Which lifecycle state should the hook target?
- A) InService
- B) Pending:Wait
- C) Terminating:Wait
- D) Detaching:Wait
17. An NLB target group hosts HTTPS services on port 443. What health check type is MOST appropriate?
- A) HTTP on port 80
- B) HTTPS with certificate validation
- C) TCP on port 443
- D) HTTP with status code matcher
18. An ALB distributes traffic to 4 instances across 3 AZs (us-east-1a has 2, us-east-1b has 1, us-east-1c has 1). Cross-zone load balancing is enabled. How is traffic distributed?
- A) Each AZ receives 33.3% of traffic
- B) Each instance receives 25% of traffic
- C) us-east-1a receives 50%, others receive 25% each
- D) Traffic is distributed randomly
19. A company needs to scale infrastructure in anticipation of known traffic patterns, such as a flash sale starting at 9:00 AM. Which scaling policy should be used?
- A) Target tracking
- B) Step scaling
- C) Scheduled scaling
- D) Simple scaling
20. A gaming company runs a UDP-based multiplayer game. Which load balancer handles UDP traffic with the lowest latency?
- A) ALB
- B) NLB
- C) CLB
- D) Gateway Load Balancer
21. An ASG has min=3, max=10 with a step scaling policy: add 2 instances when CPU > 70%, remove 1 when CPU < 30%. After a spike, the ASG is at 8 instances. Traffic normalizes and CPU drops below 30%. How many instances terminate in the next scale-in event?
- A) 0
- B) 1
- C) 5
- D) 8
22. A company wants to store ALB access logs in S3 for analysis. What must be configured?
- A) Enable access logging on the ALB and specify an S3 bucket
- B) Install the CloudWatch agent on the ALB
- C) Enable VPC Flow Logs to S3
- D) Configure CloudTrail for the ALB
23. An ALB in public subnets must accept HTTPS traffic only. Which configurations are required? (Choose two.)
- A) Security group allowing inbound HTTPS from 0.0.0.0/0
- B) Network ACL allowing HTTPS on ALB subnets
- C) HTTPS listener with an ACM certificate
- D) Cross-zone load balancing enabled
- E) Sticky sessions configured
24. An ASG uses a mixed instances policy with On-Demand and Spot Instances. During a Spot interruption, what does the ASG do?
- A) Terminates without replacement
- B) Launches replacement instances to maintain desired capacity
- C) Reduces desired capacity
- D) Switches all instances to On-Demand
25. A company needs to gradually shift 10% of traffic to a new application version behind an ALB. Which feature supports this?
- A) Multiple target groups with weighted routing on the listener rule
- B) Route53 weighted routing
- C) Auto Scaling instance refresh
- D) ALB connection draining
📝 Answer Key
1. C: ALB supports path-based and host-based routing at Layer 7.
2. C: 300 seconds (5 minutes) is the default health check grace period in the console; groups created through the CLI or SDK default to 0 unless a value is set.
3. C: Target Tracking automatically maintains the target value (e.g., 70% CPU).
4. B: Cross-zone is enabled by default on ALB, disabled by default on NLB.
5. C: ALB can target Lambda functions, making it suitable for serverless APIs.
6. B: An ALB returns 503 when the target group has no targets able to serve the request; during spikes this typically means insufficient healthy capacity.
7. B: NLB handles TCP/UDP traffic and Global Accelerator provides 2 static anycast IPs with global traffic optimization.
8. B: The ASG automatically replaces terminated instances to maintain the desired capacity.
9. B: Sticky sessions (session affinity) route the same client to the same target using cookies.
10. B: ALB adds the X-Forwarded-For header with the client’s IP; the application must read this header.
11. B: The cooldown period (default 300s for simple scaling) prevents additional scaling actions before the previous one takes effect.
12. C: ALB supports path-based routing rules to direct requests to different target groups.
13. C: AWS WAF protects web applications from SQL injection, XSS, and other web exploits when associated with an ALB.
14. B: Connection draining (deregistration delay) allows in-flight requests to complete while the instance is being deregistered.
15. B: When the ASG is at max capacity and instances are still CPU-bound, a larger instance type provides more resources per instance.
16. B: Pending:Wait allows custom actions after launch but before the instance enters InService.
17. C: NLB health checks operate at Layer 4; a TCP health check on port 443 verifies the target accepts connections.
18. B: With cross-zone LB enabled, traffic is distributed evenly across all instances regardless of AZ distribution.
19. C: Scheduled scaling allows time-based scaling actions for known traffic patterns.
20. B: NLB handles UDP traffic at Layer 4 with extreme performance and low latency.
21. B: Step scaling policies execute exact adjustments per event; the policy specifies removing 1 instance.
22. A: ALB can be configured to deliver access logs directly to a specified S3 bucket.
23. A, C: The security group must allow HTTPS inbound, and an HTTPS listener with an ACM certificate is required.
24. B: When a Spot Instance is interrupted, the ASG launches a replacement to maintain desired capacity.
25. A: ALB supports weighted target groups on a listener rule, enabling gradual traffic shifting between versions.