Microservices on EKS
A Kubernetes-based microservices architecture using Amazon EKS, ALB, RDS, SQS, and a service mesh (App Mesh), designed for teams that need portability, service discovery, and fine-grained traffic control.
Architecture Overview
          ┌─────────────────────────────────────┐
          │            Route53 (DNS)            │
          └──────────────────┬──────────────────┘
                             │
          ┌──────────────────▼──────────────────┐
          │    AWS Load Balancer Controller     │
          │          (Ingress -> ALB)           │
          └──────┬───────────┬───────────┬──────┘
                 │           │           │
        ┌────────▼────┐ ┌────▼────┐ ┌────▼────────┐
        │  Service A  │ │Service B│ │  Service C  │
        │  (Orders)   │ │ (Users) │ │ (Inventory) │
        │ Pods: 3-10  │ │Pods: 2-5│ │  Pods: 2-6  │
        └──────┬──────┘ └────┬────┘ └──────┬──────┘
               │             │             │
        ┌──────▼──────┐      │    ┌────────▼──────┐
        │ SQS (async) │      │    │  RDS Aurora   │
        └─────────────┘      │    │  (Multi-AZ)   │
                             │    └───────────────┘
                     ┌───────▼───────┐
                     │  ElastiCache  │
                     │    (Redis)    │
                     └───────────────┘

Services Used
| Service | Purpose | Configuration |
|---|---|---|
| EKS | Kubernetes control plane | Managed control plane, node groups in private subnets |
| AWS Load Balancer Controller (formerly ALB Ingress Controller) | Layer 7 routing | Provisions an ALB from Ingress resources; path-based routing to services, SSL termination |
| RDS Aurora | Relational database | Multi-AZ, 1 read replica, automated backups (35-day retention) |
| ElastiCache Redis | Distributed caching & sessions | Cluster mode, 2 shards, encryption in transit |
| SQS | Async service communication | Standard queues, DLQ per service, 14-day retention |
| SNS | Event notifications | Fan-out to multiple subscribers per event type |
| Cloud Map | Service discovery | DNS-based service registry for inter-service communication |
| App Mesh | Service mesh | Envoy sidecars, traffic splitting, observability |
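
The SQS row above (standard queue, one DLQ per service, 14-day retention) could be expressed roughly as the CloudFormation fragment below. This is a minimal sketch, not the source's template: the resource names, queue names, and the `maxReceiveCount` of 5 are illustrative assumptions.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  # Hypothetical DLQ for the Orders service; one such queue per service.
  OrdersDeadLetterQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: orders-events-dlq          # assumed name
      MessageRetentionPeriod: 1209600       # 14 days in seconds (SQS maximum)

  # Standard queue that buffers asynchronous work for the Orders service.
  OrdersQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: orders-events              # assumed name
      MessageRetentionPeriod: 1209600       # 14 days in seconds
      RedrivePolicy:
        deadLetterTargetArn: !GetAtt OrdersDeadLetterQueue.Arn
        maxReceiveCount: 5                  # assumed: move to DLQ after 5 failed receives
```

Keeping a separate DLQ per service means a poison message in one service's queue does not obscure failures in another's.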
Key Design Decisions
| Decision | Rationale |
|---|---|
| EKS over ECS | Kubernetes portability, larger ecosystem, multi-cloud flexibility |
| ALB Ingress over NLB | Path-based routing, SSL termination, WAF integration |
| Managed node groups | Nodes run in EKS-managed Auto Scaling groups, AMI updates roll out with automatic node draining, less operational overhead |
| SQS for async communication | Decouples services, buffers traffic spikes, DLQ for failures |
| App Mesh for observability | Envoy sidecars provide tracing, metrics, and traffic control without code changes |
| Cloud Map for service discovery | DNS-based resolution, health checks auto-remove unhealthy instances |
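
As an illustration of the ALB Ingress decision, the sketch below shows a ClusterIP Service for the Orders pods and an Ingress that the AWS Load Balancer Controller would turn into an ALB with path-based routing. It assumes the controller is already installed in the cluster. The `users-service` and `inventory-service` names, the URL paths, and the certificate ARN are placeholders, not values from the source.

```yaml
---
# Exposes the Orders pods (label app: orders, container port 3000) inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: orders-service
  namespace: microservices
spec:
  type: ClusterIP
  selector:
    app: orders
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: microservices-ingress        # assumed name
  namespace: microservices
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip            # route directly to pod IPs
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
    # Placeholder ARN; substitute a real ACM certificate for SSL termination.
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:123456789012:certificate/EXAMPLE
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: orders-service
                port:
                  number: 80
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: users-service        # hypothetical Service
                port:
                  number: 80
          - path: /inventory
            pathType: Prefix
            backend:
              service:
                name: inventory-service    # hypothetical Service
                port:
                  number: 80
```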
Kubernetes Manifest Example
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service
  namespace: microservices
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
    spec:
      containers:
        - name: orders-api
          image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/orders:v1.2.3
          ports:
            - containerPort: 3000
          env:
            - name: DB_HOST
              value: aurora-cluster.cluster-xxx.us-east-1.rds.amazonaws.com
            - name: REDIS_HOST
              value: redis-cluster.xxx.0001.use1.cache.amazonaws.com
          resources:
            requests: { cpu: "256m", memory: "512Mi" }
            limits: { cpu: "512m", memory: "1Gi" }
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: orders-hpa
  namespace: microservices
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Real-World Use Case
Scenario: A fintech startup with 5 engineering teams is building a payment platform and needs independent deploy cycles and polyglot services.
How this architecture handles it:
- Independent scaling: Payment service scales to 20 pods during peak, Notification stays at 2
- Polyglot services: Orders (Node.js), Users (Go), Reports (Python) on the same cluster
- Canary deployments: App Mesh routes 5% of traffic to the new version, then shifts to 100% once error rates stay within acceptable limits
- Async processing: SQS buffers payment events during traffic bursts, prevents downstream overload
- Database isolation: Each service gets its own schema, connection pooling via RDS Proxy
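
The canary step in the list above might look like the following App Mesh `VirtualRouter`, written against the App Mesh controller's `appmesh.k8s.aws/v1beta2` CRDs. Treat it as a sketch under assumptions: it presumes the controller is installed, the namespace is already associated with a mesh, and two VirtualNodes named `orders-v1` and `orders-v2` exist. Those names and the listener port are hypothetical.

```yaml
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  name: orders-router              # assumed name
  namespace: microservices
spec:
  listeners:
    - portMapping:
        port: 3000                 # assumed to match the Orders container port
        protocol: http
  routes:
    - name: orders-canary
      httpRoute:
        match:
          prefix: /
        action:
          weightedTargets:
            # Stable version keeps 95% of requests.
            - virtualNodeRef:
                name: orders-v1    # hypothetical VirtualNode
              weight: 95
            # Canary receives 5%; raise this weight as error rates stay healthy.
            - virtualNodeRef:
                name: orders-v2    # hypothetical VirtualNode
              weight: 5
```

Promoting the canary is then a matter of editing the two weights (for example 50/50, then 0/100) and re-applying the manifest. App Mesh itself does not automate the promotion.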
Cost Estimate (Monthly)
| Service | Estimated Cost |
|---|---|
| EKS control plane | ~$73 |
| EC2 node group (6 m5.large) | ~$420 |
| ALB | ~$22 |
| RDS Aurora (db.r5.large) | ~$300 |
| ElastiCache (cache.r5.large) | ~$150 |
| ECR storage | ~$5 |
| Total | ~$970/month |
Key Exam Takeaways
- EKS = managed Kubernetes – AWS handles the control plane, you manage worker nodes
- AWS Load Balancer Controller (formerly ALB Ingress Controller) = provisions ALBs from Ingress resources for path-based routing to Kubernetes services
- HPA (Horizontal Pod Autoscaler) scales pods based on CPU/memory/custom metrics
- Cluster Autoscaler adjusts the number of worker nodes to fit pending pods
- App Mesh provides service mesh features without code changes
- EKS vs ECS: EKS for Kubernetes expertise/compliance; ECS for simplicity/AWS-native
- EKS on Fargate removes worker-node management but usually costs more per unit of compute than EC2 nodes
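
To make the worker-node and Fargate takeaways concrete, here is a rough `eksctl` cluster definition. The source does not say how the cluster is provisioned, so `eksctl` itself is an assumption, as are the cluster name, the node group bounds, and the `batch` namespace used for the Fargate profile. The instance type and count follow the cost table (6 m5.large).

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: microservices-cluster      # assumed name
  region: us-east-1

managedNodeGroups:
  - name: general                  # assumed name
    instanceType: m5.large
    desiredCapacity: 6
    minSize: 3                     # assumed lower bound
    maxSize: 10                    # assumed upper bound for a node autoscaler
    privateNetworking: true        # place nodes in private subnets

fargateProfiles:
  # Pods in this namespace run on Fargate, with no nodes to manage.
  - name: batch                    # assumed name
    selectors:
      - namespace: batch           # hypothetical namespace
```

Note that `minSize` and `maxSize` only set the bounds of the underlying Auto Scaling group. Something like Cluster Autoscaler still has to change the desired capacity when pods are pending.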