🖥️ EC2 & Compute Services#
Learning Objectives#
- Understand EC2 instance types, purchasing options, and placement groups
- Configure EC2 with user data, security groups, and IAM roles
- Implement Auto Scaling and load balancing
- Choose between EC2, Lightsail, and Bare Metal
1. Amazon EC2 Overview#
Amazon Elastic Compute Cloud (EC2) provides virtual servers in the cloud. You can provision and scale compute capacity within minutes.
EC2 Architecture#
┌─────────────────────────────────────────────────────┐
│ VPC / Subnet │
│ │
│ ┌──────────────────────────────────────────────┐ │
│ │ EC2 Instance (i-abc123) │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ vCPU │ │ Memory │ │ EBS │ │ │
│ │ │ (2-128) │ │ (1-384GB)│ │ Volume │ │ │
│ │ └──────────┘ └──────────┘ └────┬─────┘ │ │
│ │ │ │ │
│ │ ┌──────────┐ ┌──────────┐ │ │ │
│ │ │ ENI │ │ Instance │ │ │ │
│ │ │ (Network)│ │ Store │ │ │ │
│ │ └──────────┘ └──────────┘ │ │ │
│ └────────────────────────────────────┼──────────┘ │
│ │ │
│ ┌────────────────────────────────────┴──────────┐ │
│ │ Security Group (Firewall) │ │
│ └───────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
1.1 EC2 Instance Types#
AWS provides instance families optimized for different workloads:
| Category | Instance Families | Use Case | Example |
|---|---|---|---|
| General Purpose | T3, T4g, M6i, M7g | Web servers, small DBs | t3.micro (Free tier) |
| Compute Optimized | C6i, C7g | Batch processing, gaming | c6i.large |
| Memory Optimized | R6i, X2iedn | In-memory caches, large DBs | r6i.large |
| Storage Optimized | I3, D2 | Data warehousing, logs | i3.large |
| Accelerated | P4, G4ad, F1 | ML training, video transcoding | p4d.24xlarge |
Instance Naming Convention:
m5.large
│ │ │
│ │ └── Instance size (small, medium, large, xlarge, etc.)
│ └── Generation (5th gen)
└── Instance family (General purpose)
⚡ Exam Tip: Know which instance family fits which workload: T/M = general, C = compute, R/X = memory, I/D = storage, P/G/F = GPU/FPGA.
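The naming convention above can be checked mechanically. The following is a minimal sketch, assuming names follow the common `<family><generation><attributes>.<size>` pattern; `parse_instance_type` is a hypothetical helper (not an AWS CLI command), and unusual names such as `u-6tb1.metal` simply produce no output.

```shell
# Split an EC2 instance type name into its parts.
# parse_instance_type is a hypothetical helper, not part of the AWS CLI.
parse_instance_type() {
  # Pattern: <family letters><generation digits><optional attribute letters>.<size>
  printf '%s\n' "$1" | sed -n -E \
    's/^([a-z]+)([0-9]+)([a-z-]*)\.([a-z0-9]+)$/family=\1 generation=\2 attributes=\3 size=\4/p'
}

parse_instance_type m5.large      # family=m generation=5 attributes= size=large
parse_instance_type c7g.xlarge    # family=c generation=7 attributes=g size=xlarge
```

The attribute letters (for example `g` for Graviton or `d` for local NVMe storage) come after the generation digit, which is why `c7g` parses as family `c`, generation `7`, attribute `g`.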
1.2 EC2 Purchasing Options#
| Option | Pricing | Commitment | Use Case |
|---|---|---|---|
| On-Demand | Highest ($/hr) | None | Short-term, spiky workloads |
| Reserved (RI) | Up to 72% off | 1 or 3 years | Steady-state production |
| Savings Plans | Up to 72% off | 1 or 3 years ($/hr commitment) | Flexible across instance families |
| Spot | Up to 90% off | None (can be terminated) | Fault-tolerant, batch jobs |
| Dedicated Host | Billed per physical host | On-Demand, or 1- or 3-year reservation | Licensing (BYOL), compliance |
| Dedicated Instance | Single-tenant HW | On-demand or RI | Isolation requirements |
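To make the discount percentages in the table concrete, here is a small arithmetic sketch. The hourly rate is a made-up figure rather than a real AWS price, and the percentages are the "up to" ceilings from the table, so actual savings will usually be smaller.

```shell
# Estimate a bill after a percentage discount.
# Usage: discounted_cost <on_demand_rate_per_hour> <hours> <discount_percent>
discounted_cost() {
  awk -v rate="$1" -v hours="$2" -v pct="$3" \
    'BEGIN { printf "%.2f\n", rate * hours * (100 - pct) / 100 }'
}

RATE=0.10   # hypothetical On-Demand price in $/hr, not a real quote
HOURS=730   # roughly one month of continuous use

echo "On-Demand:            \$$(discounted_cost "$RATE" "$HOURS" 0)"
echo "Reserved/Savings max: \$$(discounted_cost "$RATE" "$HOURS" 72)"
echo "Spot max:             \$$(discounted_cost "$RATE" "$HOURS" 90)"
```

With these assumed inputs, one month costs 73.00 On-Demand, 20.44 at the maximum Reserved or Savings Plans discount, and 7.30 at the maximum Spot discount.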
Spot Instance Lifecycle:
Request Spot → Active → Spot Instance Interruption Notice (2 min)
↓
Provisioned → Running → Interrupted (EC2 reclaims the capacity, or the Spot price exceeds your maximum price)
# Request a Spot Instance
aws ec2 request-spot-instances \
--spot-price "0.05" \
--instance-count 2 \
--launch-specification file://spot-config.json
# Check spot price history
aws ec2 describe-spot-price-history \
--instance-types m5.large \
--product-description "Linux/UNIX" \
--availability-zone us-east-1a
⚡ Exam Tip: Spot Instances are NOT suitable for stateful workloads, databases, or anything that can’t handle interruption. Use them for batch processing, CI/CD workers, and stateless web servers.
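A workload can react to the 2-minute interruption notice by polling the instance metadata path `/latest/meta-data/spot/instance-action`, which returns 404 until an interruption is scheduled and a small JSON document afterwards. The sketch below only shows the parsing step on a sample payload; `spot_action` is a hypothetical helper, the timestamp is invented, and a real poller would fetch the body with an IMDSv2 token first.

```shell
# Extract the "action" field from a Spot interruption notice (JSON body).
# spot_action is a hypothetical helper for this sketch.
spot_action() {
  printf '%s\n' "$1" |
    sed -n 's/.*"action"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample payload in the documented shape; the timestamp is made up.
notice='{"action": "terminate", "time": "2030-01-01T08:22:00Z"}'

case "$(spot_action "$notice")" in
  terminate|stop|hibernate) echo "interruption scheduled: checkpoint now" ;;
  *)                        echo "no interruption pending" ;;
esac
```

Using `sed` keeps the sketch dependency-free; if `jq` is available on the instance, it is a sturdier way to read the field.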
1.3 Placement Groups#
| Type | Strategy | Use Case |
|---|---|---|
| Cluster | Low latency, high throughput (same rack) | HPC, big data analytics |
| Spread | Isolated hardware (max 7 instances per AZ) | Critical applications |
| Partition | Isolated racks (per partition) | Cassandra, Kafka, Hadoop |
graph LR
subgraph Cluster["Cluster Placement Group"]
C1["EC2 App #1"] --- C2["EC2 App #2"]
C2 --- C3["EC2 App #3"]
end
subgraph Spread["Spread Placement Group"]
S1["EC2 (Rack A)"] -.- S2["EC2 (Rack B)"]
S2 -.- S3["EC2 (Rack C)"]
end
subgraph Partition["Partition Placement Group"]
P1["Partition 1: [EC2][EC2]"] --- P2["Partition 2: [EC2][EC2]"]
P2 --- P3["Partition 3: [EC2][EC2]"]
end
| Type | Key Characteristics |
|---|---|
| Cluster | ✅ Same rack — lowest latency ⚠️ Single rack failure risk |
| Spread | ✅ Different hardware — fault isolation ⚠️ Max 7 per AZ |
| Partition | ✅ Per-partition isolation ✅ Good for Cassandra, Kafka |
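The 7-instances-per-AZ limit on spread placement groups turns capacity planning into simple ceiling division. A minimal sketch, where `min_azs_for_spread` is a hypothetical helper name:

```shell
# Spread placement groups allow at most 7 running instances per AZ per group.
SPREAD_MAX_PER_AZ=7

# Smallest number of AZs needed to place <count> instances in one spread group.
min_azs_for_spread() {
  echo $(( ($1 + SPREAD_MAX_PER_AZ - 1) / SPREAD_MAX_PER_AZ ))   # ceiling division
}

min_azs_for_spread 7    # prints 1
min_azs_for_spread 15   # prints 3
```

If the result exceeds the number of AZs in the Region, a partition placement group is likely the better fit.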
2. EC2 Networking & Security#
2.1 Security Groups (Stateful Firewall)#
Security groups act as a virtual firewall for EC2 instances:
| Feature | Security Group | NACL |
|---|---|---|
| State | Stateful | Stateless |
| Rules | Allow only | Allow + Deny |
| Evaluation | All rules evaluated | Rule number order |
| Scope | Instance-level | Subnet-level |
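The "Evaluation" row is the one that most often causes confusion, so here is a toy model of it. This is only an illustration under simplifying assumptions: rules match on port alone, and the `NUMBER ACTION PORT` rule format and both function names are invented for this sketch rather than taken from any AWS API.

```shell
# NACL: rules are checked in ascending rule-number order; the first match wins.
nacl_decide() {   # usage: nacl_decide <port> <rules, one per line>
  result=$(printf '%s\n' "$2" | sort -n |
    awk -v port="$1" '$3 == port { print $2; exit }')
  echo "${result:-deny}"   # no matching rule falls through to the default deny
}

# Security group: allow rules only; traffic passes if any rule matches.
sg_decide() {     # usage: sg_decide <port> <space-separated allowed ports>
  for p in $2; do
    if [ "$p" = "$1" ]; then echo allow; return; fi
  done
  echo deny
}

rules='200 allow 22
100 deny 22
300 allow 443'

nacl_decide 22 "$rules"    # prints deny: rule 100 is evaluated before rule 200
nacl_decide 443 "$rules"   # prints allow
sg_decide 22 "80 443 22"   # prints allow
```

The model leaves out statefulness: a security group also admits return traffic for an allowed connection automatically, whereas a NACL needs an explicit rule in each direction.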
# Create security group
aws ec2 create-security-group \
--group-name web-sg \
--description "Web server security group" \
--vpc-id vpc-abc123
# Add inbound rules
aws ec2 authorize-security-group-ingress \
--group-id sg-abc123 \
--protocol tcp --port 80 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
--group-id sg-abc123 \
--protocol tcp --port 443 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
--group-id sg-abc123 \
--protocol tcp --port 22 --cidr 203.0.113.0/24 # SSH only from office
2.2 EC2 User Data (Bootstrapping)#
Run scripts at instance launch:
#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable httpd
systemctl start httpd
echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html
# Launch with user data
aws ec2 run-instances \
--image-id ami-0abcdef1234567890 \
--instance-type t3.micro \
--user-data file://bootstrap.sh \
--security-group-ids sg-abc123 \
--subnet-id subnet-abc123 \
--iam-instance-profile Name=EC2-WebRole
2.3 EC2 Instance Metadata (IMDS)#
Access instance metadata from within the instance:
# Get instance metadata (IMDSv1)
curl http://169.254.169.254/latest/meta-data/
# Get instance ID
curl http://169.254.169.254/latest/meta-data/instance-id
# Get public IP
curl http://169.254.169.254/latest/meta-data/public-ipv4
# Get IAM role credentials
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/MyRole
# IMDSv2 (token-based, more secure)
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" \
-H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -H "X-aws-ec2-metadata-token: $TOKEN" \
http://169.254.169.254/latest/meta-data/
⚡ Exam Tip: IMDSv2 is session-based and more secure. Always prefer IMDSv2 over IMDSv1. You can enforce IMDSv2 at instance launch.
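For an instance that is already running, IMDSv2 can be made mandatory with `aws ec2 modify-instance-metadata-options`; at launch, `run-instances` accepts `--metadata-options "HttpTokens=required"` for the same effect. The sketch below prints the command instead of running it so it can be reviewed first. `imdsv2_enforce_cmd` is a hypothetical helper and `i-abc123` is the placeholder ID used elsewhere in this chapter.

```shell
# Print (not run) the AWS CLI call that makes IMDSv2 mandatory on an instance.
# imdsv2_enforce_cmd is a hypothetical helper; the flags belong to the AWS CLI.
imdsv2_enforce_cmd() {
  printf 'aws ec2 modify-instance-metadata-options --instance-id %s --http-tokens required --http-endpoint enabled\n' "$1"
}

# Review the command first, then run it yourself or pipe it to sh.
imdsv2_enforce_cmd i-abc123
```

Once `--http-tokens required` is in effect, the token-less IMDSv1 requests shown earlier are rejected.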
3. Elastic Block Store (EBS)#
3.1 EBS Volume Types#
| Type | Max IOPS | Max Throughput | Use Case |
|---|---|---|---|
| gp3 (SSD) | 16,000 | 1,000 MB/s | General purpose, boot volumes |
| io2 Block Express (SSD) | 256,000 | 4,000 MB/s | Large databases, SAP |
| st1 (HDD) | 500 | 500 MB/s | Big data, logs (throughput-optimized) |
| sc1 (HDD) | 250 | 250 MB/s | Cold data, infrequent access |
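The IOPS ceilings in the table translate into a simple selection rule for SSD volumes. A rough sketch, using only the limits listed above and ignoring cost, throughput, and volume size; `pick_ssd_volume` is a hypothetical helper name:

```shell
# Suggest an SSD volume type from the IOPS ceilings in the table above.
pick_ssd_volume() {   # usage: pick_ssd_volume <required_iops>
  if [ "$1" -le 16000 ]; then
    echo "gp3"
  elif [ "$1" -le 256000 ]; then
    echo "io2 Block Express"
  else
    echo "stripe multiple volumes"
  fi
}

pick_ssd_volume 3000     # prints gp3
pick_ssd_volume 64000    # prints io2 Block Express
```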
3.2 EBS Snapshots#
# Create snapshot
aws ec2 create-snapshot \
--volume-id vol-abc123 \
--description "Web server backup - $(date +%Y-%m-%d)"
# Copy snapshot to another region
aws ec2 copy-snapshot \
--source-region us-east-1 \
--source-snapshot-id snap-abc123 \
--destination-region eu-west-1
# Create AMI from snapshot
aws ec2 register-image \
--name "MyApp-v1.0.0" \
--root-device-name /dev/xvda \
--block-device-mappings "DeviceName=/dev/xvda,Ebs={SnapshotId=snap-abc123}"
Snapshot Features:
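- Incremental: only blocks changed since the previous snapshot are saved
- Hibernation: an instance feature rather than a snapshot feature; it writes RAM contents to the root EBS volume for a fast resume
- Recycle Bin: recovers accidentally deleted snapshots (retention from 1 day to 1 year)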
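The effect of incremental snapshots is easiest to see with numbers. The sketch below compares stored data for incremental snapshots against naive full copies, under the simplifying assumption that the volume is fully used; `snapshot_storage` and the sizes are illustrative only.

```shell
# Compare storage consumed by incremental snapshots with naive full copies.
# Arguments: <volume_gib> <changed_gib_snapshot_2> <changed_gib_snapshot_3> ...
snapshot_storage() {
  full=$1; shift
  incremental=$full        # the first snapshot stores every used block
  copies=1
  for changed in "$@"; do
    incremental=$(( incremental + changed ))   # later ones add only changed blocks
    copies=$(( copies + 1 ))
  done
  echo "incremental=${incremental}GiB full_copies=$(( full * copies ))GiB"
}

snapshot_storage 100 5 3 2   # prints incremental=110GiB full_copies=400GiB
```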
3.3 EBS Encryption#
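- EBS encryption by default is an opt-in, per-Region account setting; once enabled, every new volume and snapshot copy in that Region is encrypted
- Uses KMS keys: the AWS managed key (aws/ebs) by default, or a customer managed key you specify
- Copying an unencrypted snapshot with encryption enabled produces an encrypted snapshot; volumes created from it are encrypted
- Encrypted and unencrypted volumes deliver the same IOPS, with minimal effect on latency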
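An existing unencrypted volume cannot be encrypted in place; the usual route is snapshot, encrypted copy, new volume. The sketch below prints those three CLI steps rather than running them. `encrypt_volume_plan` is a hypothetical helper, the IDs are placeholders, and the snapshot IDs stay as `<placeholders>` because they are only known after each step finishes.

```shell
# Print (not run) the three CLI steps for re-creating a volume as encrypted.
encrypt_volume_plan() {   # usage: encrypt_volume_plan <volume_id> <region> <az>
  cat <<EOF
aws ec2 create-snapshot --volume-id $1
aws ec2 copy-snapshot --source-region $2 --source-snapshot-id <snapshot-id> --encrypted
aws ec2 create-volume --snapshot-id <encrypted-snapshot-id> --availability-zone $3
EOF
}

encrypt_volume_plan vol-abc123 us-east-1 us-east-1a
```

Adding `--kms-key-id` to the copy step selects a customer managed key; without it, the copy uses the account's default EBS key.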
4. Real-World Use Cases#
Use Case 1: Web Application with Auto Scaling#
┌──────────────┐
│ Route53 DNS │
└──────┬───────┘
│
┌──────┴───────┐
│ ALB │
│ (HTTPS) │
└──────┬───────┘
│
┌─────────────┼─────────────┐
│ │ │
┌─────┴──┐ ┌─────┴──┐ ┌─────┴──┐
│ EC2 #1 │ │ EC2 #2 │ │ EC2 #3 │
│(t3.med)│ │(t3.med)│ │(t3.med)│
├────────┤ ├────────┤ ├────────┤
│Web App │ │Web App │ │Web App │
└────┬───┘ └────┬───┘ └────┬───┘
│ │ │
┌────┴────────────┴────────────┴───┐
│ RDS Database │
│ (Multi-AZ) │
└──────────────────────────────────┘
Auto Scaling Config:
{
  "AutoScalingGroupName": "web-asg",
  "MinSize": 2,
  "MaxSize": 10,
  "DesiredCapacity": 2,
  "VPCZoneIdentifier": "subnet-1a,subnet-1b",
  "TargetGroupARNs": ["arn:aws:elasticloadbalancing:...:targetgroup/web-tg/abc"],
  "HealthCheckType": "ELB",
  "HealthCheckGracePeriod": 300
}
Scaling Policy:
{
  "PolicyName": "scale-out-cpu",
  "PolicyType": "TargetTrackingScaling",
  "TargetTrackingConfiguration": {
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 70.0
  }
}
Use Case 2: Batch Processing with Spot Instances#
Scenario: Process 10 TB of nightly log files. Can be interrupted and restarted.
Solution: Spot Fleet with mixed instance types + EBS-optimized instances + checkpointing to S3
Use Case 3: Burstable vs Non-Burstable#
Scenario: A development web server has low CPU most of the time but occasional spikes.
Solution: Use t3.medium with CPU credits. It accumulates credits while idle and spends them during bursts.
CPU Usage:
100% ┤ ██
75% ┤ ██████
50% ┤ ████████████
25% ┤████████████████████████████████████
└─────────────────────────────────→ Time
Burst Idle Burst Idle (T3 Unlimited)
⚡ Exam Tip: T2/T3 instances have CPU credits. Unlimited mode lets you burst beyond credits (extra cost). Use for variable workloads, not consistent high CPU.
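The credit mechanics can be estimated with a little arithmetic: one CPU credit is one vCPU running at 100% for one minute. The sketch below assumes t3.medium figures of 2 vCPUs, 24 credits earned per hour, and a 576-credit maximum balance; treat these as assumptions to check against the current AWS credit table. `hours_until_throttle` is a hypothetical helper.

```shell
# Estimate how long a burstable instance in standard mode can sustain a CPU
# level before its credit balance runs out.
hours_until_throttle() {  # usage: <balance> <credits_earned_per_hour> <vcpus> <util_percent>
  awk -v bal="$1" -v earn="$2" -v vcpus="$3" -v util="$4" 'BEGIN {
    spend = util * vcpus * 60 / 100      # credits burned per hour
    if (spend <= earn) { print "never"; exit }
    printf "%.2f\n", bal / (spend - earn)
  }'
}

# Assumed t3.medium figures: 2 vCPUs, 24 credits/hour, 576 maximum balance.
hours_until_throttle 576 24 2 90   # prints 6.86
hours_until_throttle 576 24 2 20   # prints never (20% is the baseline)
```

At a sustained 90% the instance burns 108 credits per hour while earning 24, so a full balance lasts a little under seven hours. That is why consistently busy workloads belong on a non-burstable family.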
5. ⚡ Exam Tips#
- Termination Protection: disabled by default. Once enabled, it must be turned off before the instance can be terminated
- Instance Metadata: 169.254.169.254; always use IMDSv2
- ENA vs VF: ENA (Elastic Network Adapter) is the current enhanced networking interface; the Intel 82599 Virtual Function (VF) is the older option on previous-generation instances
- Hibernate: saves RAM to the root EBS volume, so resuming is faster than a cold start. Max 60 days
- Placement Groups: Cluster (low latency), Spread (isolation), Partition (big data)
- EBS vs Instance Store: EBS is persistent. Instance store is ephemeral but offers higher performance
- Nitro System: underlying virtualization for newer instances. Better performance and security
- Dedicated Hosts: physical server for existing socket/core licenses (BYOL)
✅ Chapter Quiz#
1. Which EC2 purchase option is best for a batch processing job that can be interrupted?
   - A) On-Demand
   - B) Reserved
   - C) Spot
   - D) Dedicated
2. You need the lowest latency between EC2 instances for HPC. Which placement group?
   - A) Spread
   - B) Partition
   - C) Cluster
   - D) Distributed
3. How does an EBS snapshot store data?
   - A) Full copy on every snapshot
   - B) Incremental (only changed blocks)
   - C) Differential (changed since last full)
   - D) Always encrypted
4. Which EC2 feature preserves RAM state for faster resume?
   - A) Stop
   - B) Terminate
   - C) Hibernate
   - D) Reboot
5. What IP address do you use to access EC2 instance metadata from within the instance?
   - A) 10.0.0.1
   - B) 169.254.169.254
   - C) 127.0.0.1
   - D) 192.168.1.1
6. A company needs to launch EC2 instances physically close together within a single AZ for low-latency networking. Which placement group should be used?
   - A) Cluster
   - B) Spread
   - C) Partition
   - D) Distributed
7. You need to attach 50 TB of block storage to a single EC2 instance. Which storage option supports this requirement?
   - A) EBS volumes striped via RAID 0
   - B) Instance Store
   - C) EFS
   - D) S3
8. An EC2 instance hosting a critical application failed. The administrator cannot connect via SSH and the system log shows a kernel panic. What should be done to restore the application?
   - A) Stop the instance and start it again
   - B) Terminate the instance and launch a new one from the same AMI
   - C) Use EC2 Auto Recovery
   - D) Reboot the instance
9. An application running on a t3.medium instance consistently uses 90% CPU for extended periods. What happens when CPU credit balance is exhausted in standard mode?
   - A) The instance is stopped
   - B) The instance is throttled to baseline CPU
   - C) The instance is automatically upgraded to a larger type
   - D) AWS charges additional fees for burst performance
10. A company wants to migrate a legacy application that requires a specific physical server for licensing purposes. Which EC2 option should be used?
    - A) Reserved Instance
    - B) Dedicated Host
    - C) Dedicated Instance
    - D) Spot Instance
11. A company runs a production database on an EC2 instance and needs the highest possible IOPS. Which EBS volume type should be selected?
    - A) gp3
    - B) io2 Block Express
    - C) st1
    - D) sc1
12. An application needs to be highly available across multiple Availability Zones. Which EC2 feature distributes instances across different physical locations?
    - A) Placement groups
    - B) Auto Scaling groups with instances in multiple AZs
    - C) EC2 Dedicated Hosts
    - D) EC2 Instance Store
13. A batch processing workload runs for 6 hours each night and can tolerate interruptions. Which purchasing option provides the LOWEST cost?
    - A) On-Demand Instances
    - B) Reserved Instances
    - C) Spot Instances
    - D) Dedicated Hosts
14. A company needs to migrate an application to AWS that requires a specific CPU socket and core license. Which EC2 option should be used?
    - A) On-Demand Instances
    - B) Dedicated Hosts
    - C) Spot Instances
    - D) Reserved Instances
15. Which EC2 instance family is optimized for memory-intensive workloads like in-memory databases and real-time analytics?
    - A) C family (Compute Optimized)
    - B) R family (Memory Optimized)
    - C) I family (Storage Optimized)
    - D) T family (Burstable)
16. An EC2 instance launched in a private subnet needs to download security patches from the internet. Which component must be configured?
    - A) An Internet Gateway attached to the VPC
    - B) A NAT Gateway in a public subnet with a route from the private subnet
    - C) A VPC Peering connection to a public subnet
    - D) An AWS Direct Connect connection
17. Which statements about Security Groups are correct? (Select TWO)
    - A) Security Groups are stateless
    - B) Security Groups support allow rules only
    - C) Security Groups support both allow and deny rules
    - D) Security Groups are stateful
    - E) Security Groups filter traffic at the subnet level
18. A solutions architect needs to create a custom AMI from a running EC2 instance to standardize future deployments. What is the correct approach?
    - A) Stop the instance, create an EBS snapshot of the root volume, and register it as an AMI
    - B) Create an EBS snapshot from the root volume while the instance is running, then register the snapshot as an AMI
    - C) Copy the root volume to a new instance and snapshot that
    - D) Use AWS CloudFormation to create the AMI from the running instance
19. What is the key difference between EBS volumes and Instance Store volumes?
    - A) EBS is ephemeral; Instance Store is persistent
    - B) EBS is persistent; Instance Store is ephemeral
    - C) Both provide persistent storage
    - D) Both provide ephemeral storage
20. An EC2 instance with a 500 GB gp3 EBS volume is running low on disk space. How can the volume be resized with minimal downtime?
    - A) Launch a new instance with a larger volume and migrate data
    - B) Modify the EBS volume size, then extend the file system
    - C) Add an Instance Store volume
    - D) Create a new larger volume and attach it in place of the old volume
21. A web application behind an Application Load Balancer needs to scale EC2 instances based on average CPU utilization. Which Auto Scaling policy type is MOST appropriate?
    - A) Simple scaling policy
    - B) Target tracking scaling policy
    - C) Scheduled scaling policy
    - D) Manual scaling
22. Which placement group type provides the lowest possible network latency and highest throughput between EC2 instances?
    - A) Spread placement group
    - B) Partition placement group
    - C) Cluster placement group
    - D) Distributed placement group
23. How does an application running on an EC2 instance with an IAM role obtain temporary AWS credentials?
    - A) Read from an environment variable set at launch
    - B) Retrieve them from the EC2 instance metadata service (IMDS)
    - C) Read from a configuration file downloaded from S3
    - D) Retrieve them from AWS Secrets Manager
24. What happens when a t3.micro EC2 instance in standard mode exhausts its accumulated CPU credits?
    - A) The instance is immediately stopped
    - B) The instance continues running but CPU performance is throttled to the baseline
    - C) The instance automatically acquires additional credits at no charge
    - D) The instance is terminated and relaunched
25. A company needs to encrypt an existing unencrypted EBS volume. What is the MOST efficient approach?
    - A) Enable encryption directly on the volume using the EC2 console
    - B) Create a snapshot of the volume, copy the snapshot with encryption enabled, and create a new encrypted volume from the copied snapshot
    - C) Format the volume with an encrypted file system
    - D) Attach the volume to an instance and use OS-level encryption tools
📝 Answer Key
1. C: Spot Instances are up to 90% cheaper than On-Demand but can be interrupted.
2. C: Cluster placement group places instances in a single AZ for low latency.
3. B: EBS snapshots are incremental (only changed blocks).
4. C: Hibernate preserves RAM state to EBS for faster resume.
5. B: 169.254.169.254 is the link-local address for instance metadata.
6. A: Cluster placement groups provide low-latency networking within a single AZ.
7. A: Multiple EBS volumes can be striped (RAID 0) to provide block storage exceeding single volume limits.
8. B: A kernel panic is an OS-level failure that a reboot may not clear; launching a fresh instance from the known-good AMI restores service.
9. B: T2/T3 instances in standard mode are throttled to baseline CPU when credits are exhausted.
10. B: Dedicated Hosts provide physical servers for BYOL and server-bound software licenses.
11. B: io2 Block Express provides up to 256,000 IOPS for latency-sensitive, high-throughput database workloads.
12. B: Auto Scaling groups configured with multiple AZs distribute instances across Availability Zones for HA.
13. C: Spot Instances provide up to 90% discount but can be interrupted with a 2-minute warning.
14. B: Dedicated Hosts provide physical servers for existing socket/core licenses and compliance needs.
15. B: The R instance family (e.g., R6i) is optimized for memory-intensive workloads; the X family (e.g., X2iedn) is also memory optimized.
16. B: A NAT Gateway in a public subnet with a route from the private subnet provides outbound internet access.
17. B, D: Security Groups are stateful and only support allow rules (no explicit deny rules).
18. B: Amazon EC2 can create an AMI from a running instance by taking snapshots and registering them.
19. B: EBS volumes are persistent (data survives instance termination); Instance Store is ephemeral.
20. B: Modify the EBS volume size (elastic volumes) and extend the file system online with minimal downtime.
21. B: Target tracking scaling policy automatically adjusts capacity based on a target metric like CPU.
22. C: Cluster placement groups place instances in the same rack within a single AZ for lowest latency.
23. B: EC2 instance metadata at 169.254.169.254/latest/meta-data/iam/security-credentials/ provides temporary credentials.
24. B: In standard mode, the instance is throttled to baseline CPU when credits are exhausted.
25. B: Create an unencrypted snapshot, copy it with encryption enabled, then create a new encrypted volume.