mirror of
https://github.com/ghndrx/terraform-foundation.git
synced 2026-02-10 06:45:06 +00:00
feat: Terraform Foundation - AWS Landing Zone
Enterprise-grade multi-tenant AWS cloud foundation. Modules: - GitHub OIDC for keyless CI/CD authentication - IAM account settings and security baseline - AWS Config Rules for compliance - ABAC (Attribute-Based Access Control) - SCPs (Service Control Policies) Features: - Multi-account architecture - Cost optimization patterns - Security best practices - Comprehensive documentation Tech: Terraform, AWS Organizations, IAM Identity Center
This commit is contained in:
212
docs/COST-OPTIMIZATION.md
Normal file
212
docs/COST-OPTIMIZATION.md
Normal file
@@ -0,0 +1,212 @@
|
||||
# Cost Optimization Guide
|
||||
|
||||
This document outlines cost-saving strategies implemented in this foundation and recommendations for further optimization.
|
||||
|
||||
## Built-In Cost Savings
|
||||
|
||||
### 1. Shared VPC Architecture
|
||||
|
||||
**Savings: ~$256/month per 3 tenants**
|
||||
|
||||
| Approach | NAT Gateways | Monthly Cost |
|
||||
|----------|--------------|--------------|
|
||||
| VPC per tenant (3 tenants, 2 AZ) | 6 | ~$192 |
|
||||
| **Shared VPC (single NAT)** | 1 | ~$32 |
|
||||
|
||||
The shared VPC with tenant isolation via security groups provides the same logical separation at a fraction of the cost.
|
||||
|
||||
### 2. Single NAT Gateway
|
||||
|
||||
For non-production or cost-sensitive workloads:
|
||||
|
||||
```hcl
|
||||
# terraform/02-network/main.tf
|
||||
variable "enable_nat" {
|
||||
default = true # Set to false to save ~$32/mo (no private subnet egress)
|
||||
}
|
||||
```
|
||||
|
||||
**Alternative**: NAT Instance (~$3/mo for t4g.nano) for dev environments.
|
||||
|
||||
### 3. GP3 Storage (Default)
|
||||
|
||||
All EBS and RDS storage uses GP3:
|
||||
- 20% cheaper than GP2
|
||||
- 3,000 IOPS included (vs 100 IOPS/GB for GP2)
|
||||
- Configurable IOPS and throughput
|
||||
|
||||
### 4. Fargate Spot (ECS)
|
||||
|
||||
```hcl
|
||||
# Configured in ECS template
|
||||
default_capacity_provider_strategy {
|
||||
base = 1 # 1 On-Demand for availability
|
||||
weight = 100
|
||||
capacity_provider = "FARGATE" # Change to FARGATE_SPOT for 70% savings
|
||||
}
|
||||
```
|
||||
|
||||
**Savings**: Up to 70% on Fargate compute.
|
||||
|
||||
### 5. EKS Spot Instances
|
||||
|
||||
```hcl
|
||||
# Uncomment in EKS template
|
||||
node_groups = {
|
||||
spot = {
|
||||
instance_types = ["t3.medium", "t3.large", "t3a.medium"] # Diversify!
|
||||
capacity_type = "SPOT"
|
||||
# ...
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Savings**: Up to 90% on EC2 compute.
|
||||
|
||||
### 6. S3 Intelligent-Tiering
|
||||
|
||||
For logs bucket (already configured):
|
||||
|
||||
```hcl
|
||||
lifecycle_configuration {
|
||||
rule {
|
||||
transition {
|
||||
days = 90
|
||||
storage_class = "GLACIER"
|
||||
}
|
||||
expiration {
|
||||
days = 2555 # 7 years
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 7. CloudWatch Log Retention
|
||||
|
||||
All log groups configured with retention (default 30 days):
|
||||
|
||||
```hcl
|
||||
retention_in_days = 30 # Adjust based on compliance needs
|
||||
```
|
||||
|
||||
**Cost**: ~$0.03/GB/month for ingestion + storage.
|
||||
|
||||
## Recommendations
|
||||
|
||||
### Compute Right-Sizing
|
||||
|
||||
1. **Start Small**: Use `t3.micro` or `t3.small` for non-prod
|
||||
2. **Monitor**: Use CloudWatch Container Insights / Compute Optimizer
|
||||
3. **Scale Down**: Reduce replica counts in dev/staging
|
||||
|
||||
### Reserved Capacity
|
||||
|
||||
| Resource | Savings | Commitment |
|
||||
|----------|---------|------------|
|
||||
| EC2 Reserved | 30-72% | 1-3 years |
|
||||
| RDS Reserved | 30-60% | 1-3 years |
|
||||
| Savings Plans (Compute) | 20-66% | 1-3 years |
|
||||
| ElastiCache Reserved | 30-55% | 1-3 years |
|
||||
|
||||
**Recommendation**: After 3 months of stable usage, purchase Compute Savings Plans.
|
||||
|
||||
### Database Optimization
|
||||
|
||||
1. **Aurora Serverless v2**: For variable workloads (scales to 0.5 ACU)
|
||||
2. **RDS Proxy**: Pool connections, reduce instance size
|
||||
3. **Read Replicas**: Only for read-heavy workloads
|
||||
4. **Stop Dev Databases**: Use Lambda to stop/start on schedule
|
||||
|
||||
```hcl
|
||||
# Example: Smaller dev database
|
||||
locals {
|
||||
instance_class = local.env == "prod" ? "db.r6g.large" : "db.t3.micro"
|
||||
}
|
||||
```
|
||||
|
||||
### Networking
|
||||
|
||||
1. **VPC Endpoints**: For S3, ECR, Secrets Manager (~$7/mo each, but saves NAT costs)
|
||||
2. **PrivateLink**: For high-volume AWS service access
|
||||
3. **CloudFront**: Cache static content, reduce origin load
|
||||
|
||||
### Monitoring Cost Control
|
||||
|
||||
```hcl
|
||||
# Reduce metric granularity in non-prod
|
||||
enhanced_monitoring_interval = local.env == "prod" ? 60 : 0
|
||||
|
||||
# Disable Performance Insights in dev
|
||||
performance_insights = local.env != "dev"
|
||||
```
|
||||
|
||||
### EKS Specific
|
||||
|
||||
1. **Karpenter**: Better bin-packing than Cluster Autoscaler
|
||||
2. **Bottlerocket OS**: Smaller footprint, faster boot
|
||||
3. **Fargate for Batch**: No idle nodes
|
||||
|
||||
## Cost Monitoring
|
||||
|
||||
### AWS Tools
|
||||
|
||||
1. **Cost Explorer**: Built-in, tag-based analysis
|
||||
2. **Budgets**: Already configured per-tenant
|
||||
3. **Cost Anomaly Detection**: ML-based alerts
|
||||
|
||||
### Third-Party
|
||||
|
||||
1. **Infracost**: PR-level cost estimation (in Makefile)
|
||||
2. **Kubecost**: Kubernetes cost allocation
|
||||
3. **Spot.io**: Spot instance management
|
||||
|
||||
## Environment-Based Defaults
|
||||
|
||||
```hcl
|
||||
locals {
|
||||
# Automatically scale down non-prod
|
||||
instance_class = {
|
||||
prod = "db.r6g.large"
|
||||
staging = "db.t3.small"
|
||||
dev = "db.t3.micro"
|
||||
}[local.env]
|
||||
|
||||
desired_count = {
|
||||
prod = 3
|
||||
staging = 2
|
||||
dev = 1
|
||||
}[local.env]
|
||||
|
||||
multi_az = local.env == "prod"
|
||||
}
|
||||
```
|
||||
|
||||
## Estimated Monthly Costs
|
||||
|
||||
### Minimal Setup (Dev/POC)
|
||||
|
||||
| Resource | Spec | Est. Cost |
|
||||
|----------|------|-----------|
|
||||
| NAT Gateway | 1 | $32 |
|
||||
| RDS | db.t3.micro | $13 |
|
||||
| ECS Fargate | 0.25 vCPU, 0.5GB x 2 | $15 |
|
||||
| ALB | 1 | $16 |
|
||||
| S3 + CloudWatch | Minimal | $5 |
|
||||
| **Total** | | **~$80/mo** |
|
||||
|
||||
### Production (Small)
|
||||
|
||||
| Resource | Spec | Est. Cost |
|
||||
|----------|------|-----------|
|
||||
| NAT Gateway | 1 | $32 |
|
||||
| RDS | db.r6g.large, Multi-AZ | $350 |
|
||||
| ECS Fargate | 1 vCPU, 2GB x 4 | $120 |
|
||||
| ALB | 1 | $25 |
|
||||
| EKS | Control plane | $73 |
|
||||
| EKS Nodes | 2x t3.medium | $60 |
|
||||
| S3 + CloudWatch | Moderate | $30 |
|
||||
| **Total** | | **~$690/mo** |
|
||||
|
||||
### Production (With Savings Plans)
|
||||
|
||||
Same as above with 1-year Compute Savings Plan: **~$480/mo** (30% savings)
|
||||
Reference in New Issue
Block a user