AWS Cloud Architecture: Building Scalable Systems
Amazon Web Services (AWS) provides a comprehensive suite of cloud services that enable you to build scalable, reliable, and cost-effective applications. This guide covers essential AWS services and architectural patterns.
Core AWS Services Overview
Compute Services
- EC2: Virtual servers in the cloud
- Lambda: Serverless compute service
- ECS/EKS: Container orchestration
- Elastic Beanstalk: Platform-as-a-Service
Storage Services
- S3: Object storage service
- EBS: Block storage for EC2
- EFS: Managed file system
- Glacier: Long-term archival storage
Database Services
- RDS: Managed relational databases
- DynamoDB: NoSQL database service
- ElastiCache: In-memory caching
- Redshift: Data warehousing
Networking Services
- VPC: Virtual Private Cloud
- CloudFront: Content Delivery Network
- Route 53: DNS service
- API Gateway: API management
Building a Three-Tier Architecture
Web Tier (Presentation Layer)
# CloudFormation template for web tier
Resources:
  # Application Load Balancer
  ApplicationLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Name: WebTierALB
      Scheme: internet-facing
      Type: application
      SecurityGroups:
        - !Ref ALBSecurityGroup
      Subnets:
        - !Ref PublicSubnet1
        - !Ref PublicSubnet2
  # Auto Scaling Group for web servers
  WebTierAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      VPCZoneIdentifier:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
      LaunchTemplate:
        LaunchTemplateId: !Ref WebTierLaunchTemplate
        Version: !GetAtt WebTierLaunchTemplate.LatestVersionNumber
      MinSize: 2
      MaxSize: 10
      DesiredCapacity: 4
      TargetGroupARNs:
        - !Ref WebTierTargetGroup
      HealthCheckType: ELB
      HealthCheckGracePeriod: 300
  # Launch Template
  WebTierLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateName: WebTierTemplate
      LaunchTemplateData:
        ImageId: ami-0abcdef1234567890 # Amazon Linux 2
        InstanceType: t3.medium
        SecurityGroupIds:
          - !Ref WebTierSecurityGroup
        IamInstanceProfile:
          Arn: !GetAtt WebTierInstanceProfile.Arn
        UserData:
          Fn::Base64: !Sub |
            #!/bin/bash
            yum update -y
            # On Amazon Linux 2, nginx is installed from the amazon-linux-extras catalog
            amazon-linux-extras install -y nginx1
            systemctl start nginx
            systemctl enable nginx
            # Install CloudWatch agent
            yum install -y amazon-cloudwatch-agent
            # Configure application
            mkdir -p /etc/app
            aws s3 cp s3://${ConfigBucket}/app-config.json /etc/app/
Application Tier (Business Logic)
  # Application Tier Auto Scaling Group
  AppTierAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      VPCZoneIdentifier:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
      LaunchTemplate:
        LaunchTemplateId: !Ref AppTierLaunchTemplate
        Version: !GetAtt AppTierLaunchTemplate.LatestVersionNumber
      MinSize: 2
      MaxSize: 8
      DesiredCapacity: 3
      TargetGroupARNs:
        - !Ref AppTierTargetGroup
  # Application Load Balancer for app tier
  AppTierLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Name: AppTierALB
      Scheme: internal
      Type: application
      SecurityGroups:
        - !Ref AppTierSecurityGroup
      Subnets:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
  # Launch Template for app servers
  AppTierLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateName: AppTierTemplate
      LaunchTemplateData:
        ImageId: ami-0abcdef1234567890
        InstanceType: t3.large
        SecurityGroupIds:
          - !Ref AppTierSecurityGroup
        IamInstanceProfile:
          Arn: !GetAtt AppTierInstanceProfile.Arn
        UserData:
          Fn::Base64: !Sub |
            #!/bin/bash
            yum update -y
            yum install -y docker
            systemctl start docker
            systemctl enable docker
            # Authenticate to ECR before pulling (assumes ECRRepository is a full repository URI)
            aws ecr get-login-password --region ${AWS::Region} | \
              docker login --username AWS --password-stdin "$(echo ${ECRRepository} | cut -d/ -f1)"
            # Pull and run application container
            docker pull ${ECRRepository}:latest
            docker run -d -p 8080:8080 \
              -e DB_HOST=${DatabaseEndpoint} \
              -e REDIS_HOST=${RedisEndpoint} \
              ${ECRRepository}:latest
Database Tier
  # RDS Database Subnet Group
  DBSubnetGroup:
    Type: AWS::RDS::DBSubnetGroup
    Properties:
      DBSubnetGroupDescription: Subnet group for RDS database
      SubnetIds:
        - !Ref DatabaseSubnet1
        - !Ref DatabaseSubnet2
      Tags:
        - Key: Name
          Value: DatabaseSubnetGroup
  # RDS Instance
  DatabaseInstance:
    Type: AWS::RDS::DBInstance
    Properties:
      DBInstanceIdentifier: production-database
      DBInstanceClass: db.r5.xlarge
      Engine: postgres
      # Quoted so YAML keeps the version as a string rather than a number
      EngineVersion: '13.7'
      AllocatedStorage: 100
      StorageType: gp2
      StorageEncrypted: true
      MasterUsername: postgres
      MasterUserPassword: !Ref DatabasePassword
      VPCSecurityGroups:
        - !Ref DatabaseSecurityGroup
      DBSubnetGroupName: !Ref DBSubnetGroup
      BackupRetentionPeriod: 7
      PreferredBackupWindow: '03:00-04:00'
      PreferredMaintenanceWindow: 'sun:04:00-sun:05:00'
      DeletionProtection: true
      MultiAZ: true
  # ElastiCache Redis Cluster
  RedisSubnetGroup:
    Type: AWS::ElastiCache::SubnetGroup
    Properties:
      Description: Subnet group for Redis cluster
      SubnetIds:
        - !Ref DatabaseSubnet1
        - !Ref DatabaseSubnet2
  RedisCluster:
    Type: AWS::ElastiCache::ReplicationGroup
    Properties:
      ReplicationGroupId: production-redis
      ReplicationGroupDescription: Redis cluster for caching
      NumCacheClusters: 2
      Engine: redis
      CacheNodeType: cache.r5.large
      Port: 6379
      SecurityGroupIds:
        - !Ref RedisSecurityGroup
      CacheSubnetGroupName: !Ref RedisSubnetGroup
      AutomaticFailoverEnabled: true
      MultiAZEnabled: true
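The app tier containers receive both DB_HOST and REDIS_HOST, which suggests a cache-aside read path: check Redis first, fall back to PostgreSQL on a miss, then populate the cache. The following is a minimal sketch of that pattern only. The `TTLCache` class, `get_product_cached`, and `fake_db_lookup` are hypothetical names, and in-memory objects stand in for Redis and the database so the example runs without any AWS resources.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a Redis-style cache with expiry (hypothetical helper)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries behave like misses
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: dict, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)


def get_product_cached(product_id: str, cache: TTLCache,
                       load_from_db: Callable[[str], Optional[dict]],
                       ttl_seconds: float = 60.0) -> Optional[dict]:
    """Cache-aside read: try the cache, fall back to the database, then fill the cache."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = load_from_db(product_id)
    if row is not None:
        cache.set(key, row, ttl_seconds)
    return row


# Usage with a fake database lookup that records how often it is called
db_calls = []


def fake_db_lookup(product_id: str) -> dict:
    db_calls.append(product_id)
    return {"id": product_id, "name": "example"}


cache = TTLCache()
first = get_product_cached("p1", cache, fake_db_lookup)
second = get_product_cached("p1", cache, fake_db_lookup)  # served from the cache
```

The TTL bounds how stale a cached row can get. A real deployment would also need to decide whether writes invalidate or overwrite the cached entry, which this sketch leaves out.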
Serverless Architecture with Lambda
API Gateway + Lambda + DynamoDB
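# lambda_function.py
import json
import os
import uuid
from datetime import datetime
from decimal import Decimal

import boto3

dynamodb = boto3.resource('dynamodb')
# The SAM template passes the table name in DYNAMODB_TABLE; fall back to the bare name otherwise
table = dynamodb.Table(os.environ.get('DYNAMODB_TABLE', 'ProductsTable'))
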
def lambda_handler(event, context):
    http_method = event['httpMethod']
    path = event['path']
    try:
        if http_method == 'GET' and path == '/products':
            return get_products()
        elif http_method == 'GET' and path.startswith('/products/'):
            product_id = path.split('/')[-1]
            return get_product(product_id)
        elif http_method == 'POST' and path == '/products':
            return create_product(json.loads(event['body']))
        elif http_method == 'PUT' and path.startswith('/products/'):
            product_id = path.split('/')[-1]
            return update_product(product_id, json.loads(event['body']))
        elif http_method == 'DELETE' and path.startswith('/products/'):
            product_id = path.split('/')[-1]
            return delete_product(product_id)
        else:
            return {
                'statusCode': 404,
                'headers': {
                    'Content-Type': 'application/json',
                    'Access-Control-Allow-Origin': '*'
                },
                'body': json.dumps({'error': 'Not found'})
            }
    except Exception as e:
        print(f"Error: {str(e)}")
        return {
            'statusCode': 500,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({'error': 'Internal server error'})
        }

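def get_products():
    # A single Scan call returns at most 1 MB of data, so follow
    # LastEvaluatedKey until the whole table has been read
    response = table.scan()
    products = response['Items']
    while 'LastEvaluatedKey' in response:
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
        products.extend(response['Items'])
    # Convert Decimal to float for JSON serialization
    for product in products:
        if 'price' in product:
            product['price'] = float(product['price'])
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'
        },
        'body': json.dumps(products)
    }
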
def get_product(product_id):
    response = table.get_item(Key={'id': product_id})
    if 'Item' not in response:
        return {
            'statusCode': 404,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({'error': 'Product not found'})
        }
    product = response['Item']
    if 'price' in product:
        product['price'] = float(product['price'])
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'
        },
        'body': json.dumps(product)
    }

def create_product(product_data):
    product_id = str(uuid.uuid4())
    timestamp = datetime.utcnow().isoformat()
    item = {
        'id': product_id,
        'name': product_data['name'],
        'description': product_data.get('description', ''),
        'price': Decimal(str(product_data['price'])),
        'category': product_data.get('category', ''),
        'created_at': timestamp,
        'updated_at': timestamp
    }
    table.put_item(Item=item)
    # Convert Decimal back to float for response
    item['price'] = float(item['price'])
    return {
        'statusCode': 201,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'
        },
        'body': json.dumps(item)
    }

def update_product(product_id, updates):
    # Check if product exists
    response = table.get_item(Key={'id': product_id})
    if 'Item' not in response:
        return {
            'statusCode': 404,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({'error': 'Product not found'})
        }
    # Build update expression. Attribute-name placeholders (#key) are needed
    # because fields such as "name" are DynamoDB reserved words.
    update_expression = "SET updated_at = :timestamp"
    expression_names = {}
    expression_values = {':timestamp': datetime.utcnow().isoformat()}
    for key, value in updates.items():
        # Don't allow updating the ID, and skip updated_at because it is set above
        if key in ('id', 'updated_at'):
            continue
        update_expression += f", #{key} = :{key}"
        expression_names[f'#{key}'] = key
        if key == 'price':
            expression_values[f':{key}'] = Decimal(str(value))
        else:
            expression_values[f':{key}'] = value
    update_kwargs = {
        'Key': {'id': product_id},
        'UpdateExpression': update_expression,
        'ExpressionAttributeValues': expression_values
    }
    # DynamoDB rejects an empty ExpressionAttributeNames map
    if expression_names:
        update_kwargs['ExpressionAttributeNames'] = expression_names
    table.update_item(**update_kwargs)
    # Get updated item
    response = table.get_item(Key={'id': product_id})
    product = response['Item']
    if 'price' in product:
        product['price'] = float(product['price'])
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'
        },
        'body': json.dumps(product)
    }

def delete_product(product_id):
    # Check if product exists
    response = table.get_item(Key={'id': product_id})
    if 'Item' not in response:
        return {
            'statusCode': 404,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({'error': 'Product not found'})
        }
    table.delete_item(Key={'id': product_id})
    return {
        'statusCode': 204,
        'headers': {
            'Access-Control-Allow-Origin': '*'
        }
    }
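Every handler above repeats the same CORS headers and converts only the price attribute from Decimal, so any other numeric attribute read from DynamoDB would make json.dumps raise a TypeError. One possible way to centralize both concerns is sketched below. `DecimalEncoder`, `build_response`, and `CORS_HEADERS` are hypothetical names that do not appear in the handler code; treat this as an illustration rather than a drop-in replacement.

```python
import json
from decimal import Decimal

# Shared headers, matching the values used by the handlers
CORS_HEADERS = {
    'Content-Type': 'application/json',
    'Access-Control-Allow-Origin': '*',
}


class DecimalEncoder(json.JSONEncoder):
    """Serialize Decimal values: whole numbers as int, everything else as float."""

    def default(self, obj):
        if isinstance(obj, Decimal):
            return int(obj) if obj == obj.to_integral_value() else float(obj)
        return super().default(obj)


def build_response(status_code, body=None):
    """Build an API Gateway proxy response with the shared CORS headers."""
    response = {'statusCode': status_code, 'headers': dict(CORS_HEADERS)}
    if body is not None:
        response['body'] = json.dumps(body, cls=DecimalEncoder)
    return response


# Usage: any Decimal attribute is handled, not only "price"
ok = build_response(200, {'id': 'abc', 'price': Decimal('19.99'), 'stock': Decimal('3')})
not_found = build_response(404, {'error': 'Product not found'})
no_content = build_response(204)
```

Converting Decimal to float loses exactness for values that have no exact binary representation, so an API that needs exact amounts might return them as strings instead.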
SAM Template for Serverless App
# template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Parameters:
  Environment:
    Type: String
    Default: dev
    AllowedValues: [dev, staging, prod]
Globals:
  Function:
    Timeout: 30
    Runtime: python3.9
    Environment:
      Variables:
        ENVIRONMENT: !Ref Environment
        DYNAMODB_TABLE: !Ref ProductsTable
Resources:
  # API Gateway
  ProductsApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: !Ref Environment
      Cors:
        AllowMethods: "'GET,POST,PUT,DELETE,OPTIONS'"
        AllowHeaders: "'Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token'"
        AllowOrigin: "'*'"
  # Lambda Function
  ProductsFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: lambda_function.lambda_handler
      Events:
        GetProducts:
          Type: Api
          Properties:
            RestApiId: !Ref ProductsApi
            Path: /products
            Method: get
        GetProduct:
          Type: Api
          Properties:
            RestApiId: !Ref ProductsApi
            Path: /products/{id}
            Method: get
        CreateProduct:
          Type: Api
          Properties:
            RestApiId: !Ref ProductsApi
            Path: /products
            Method: post
        UpdateProduct:
          Type: Api
          Properties:
            RestApiId: !Ref ProductsApi
            Path: /products/{id}
            Method: put
        DeleteProduct:
          Type: Api
          Properties:
            RestApiId: !Ref ProductsApi
            Path: /products/{id}
            Method: delete
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref ProductsTable
  # DynamoDB Table
  ProductsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: !Sub 'ProductsTable-${Environment}'
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES
  # S3 Bucket for file uploads
  FilesBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub 'products-files-${Environment}-${AWS::AccountId}'
      CorsConfiguration:
        CorsRules:
          - AllowedHeaders: ['*']
            AllowedMethods: [GET, PUT, POST, DELETE]
            AllowedOrigins: ['*']
            MaxAge: 3000
Outputs:
  ApiUrl:
    Description: API Gateway endpoint URL
    Value: !Sub 'https://${ProductsApi}.execute-api.${AWS::Region}.amazonaws.com/${Environment}'
    Export:
      Name: !Sub '${AWS::StackName}-ApiUrl'
  TableName:
    Description: DynamoDB table name
    Value: !Ref ProductsTable
    Export:
      Name: !Sub '${AWS::StackName}-TableName'
Monitoring and Observability
CloudWatch Dashboards
{
  "widgets": [
    {
      "type": "metric",
      "properties": {
        "metrics": [
          ["AWS/ApplicationELB", "RequestCount", "LoadBalancer", "app/WebTierALB/1234567890"],
          [".", "TargetResponseTime", ".", ".", {"stat": "Average"}],
          [".", "HTTPCode_Target_2XX_Count", ".", "."],
          [".", "HTTPCode_Target_4XX_Count", ".", "."],
          [".", "HTTPCode_Target_5XX_Count", ".", "."]
        ],
        "period": 300,
        "stat": "Sum",
        "region": "us-east-1",
        "title": "ALB Metrics"
      }
    },
    {
      "type": "metric",
      "properties": {
        "metrics": [
          ["AWS/EC2", "CPUUtilization", "AutoScalingGroupName", "WebTierAutoScalingGroup"],
          [".", "NetworkIn", ".", "."],
          [".", "NetworkOut", ".", "."]
        ],
        "period": 300,
        "stat": "Average",
        "region": "us-east-1",
        "title": "EC2 Metrics"
      }
    },
    {
      "type": "metric",
      "properties": {
        "metrics": [
          ["AWS/RDS", "CPUUtilization", "DBInstanceIdentifier", "production-database"],
          [".", "DatabaseConnections", ".", "."],
          [".", "ReadLatency", ".", "."],
          [".", "WriteLatency", ".", "."]
        ],
        "period": 300,
        "stat": "Average",
        "region": "us-east-1",
        "title": "RDS Metrics"
      }
    }
  ]
}
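Dashboard bodies like the one above become repetitive as widgets are added, so it can help to generate them. The sketch below builds the RDS widget programmatically, using the same "." shorthand to repeat the namespace and dimension from the previous metric row. The `metric_widget` helper is a hypothetical name introduced for illustration.

```python
import json


def metric_widget(title, namespace, dimension_name, dimension_value, metrics,
                  stat='Average', region='us-east-1', period=300):
    """Build one CloudWatch dashboard metric widget as a plain dict.

    The first row spells out the namespace and dimension; later rows use the
    "." shorthand, which repeats the corresponding field of the previous row.
    """
    rows = []
    for index, metric_name in enumerate(metrics):
        if index == 0:
            rows.append([namespace, metric_name, dimension_name, dimension_value])
        else:
            rows.append(['.', metric_name, '.', '.'])
    return {
        'type': 'metric',
        'properties': {
            'metrics': rows,
            'period': period,
            'stat': stat,
            'region': region,
            'title': title,
        },
    }


# Rebuild the RDS widget from the dashboard definition
dashboard_body = json.dumps({
    'widgets': [
        metric_widget(
            'RDS Metrics', 'AWS/RDS', 'DBInstanceIdentifier', 'production-database',
            ['CPUUtilization', 'DatabaseConnections', 'ReadLatency', 'WriteLatency'],
        ),
    ]
}, indent=2)
```

The resulting string is the kind of value the CloudWatch PutDashboard API accepts as its dashboard body.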
X-Ray Tracing
# Lambda with X-Ray tracing
import traceback

from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all

# Patch AWS SDK calls
patch_all()


@xray_recorder.capture('lambda_handler')
def lambda_handler(event, context):
    subsegment = xray_recorder.begin_subsegment('business_logic')
    try:
        # Your business logic here
        result = process_request(event)
        subsegment.put_annotation('result_count', len(result))
        return result
    except Exception as e:
        # add_exception takes the stack trace alongside the exception
        subsegment.add_exception(e, traceback.extract_stack())
        raise
    finally:
        xray_recorder.end_subsegment()


@xray_recorder.capture('process_request')
def process_request(event):
    # Process the request
    return {'status': 'success'}
Security Best Practices
IAM Roles and Policies
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:DeleteItem",
        "dynamodb:Query",
        "dynamodb:Scan"
      ],
      "Resource": [
        "arn:aws:dynamodb:*:*:table/ProductsTable",
        "arn:aws:dynamodb:*:*:table/ProductsTable/index/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::products-files-*/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
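The policy above uses wildcards for the Region and account, and its table ARN names ProductsTable while the SAM template creates ProductsTable-<environment>. A tighter variant can be generated per deployment. The sketch below is one way to do that; `products_policy` is a hypothetical helper, the account ID is the placeholder value used in AWS documentation, and the CloudWatch Logs statement is left out for brevity.

```python
import json


def products_policy(region, account_id, table_name, bucket_name):
    """Return an IAM policy document scoped to one table and one bucket."""
    table_arn = f'arn:aws:dynamodb:{region}:{account_id}:table/{table_name}'
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                'Effect': 'Allow',
                'Action': [
                    'dynamodb:GetItem',
                    'dynamodb:PutItem',
                    'dynamodb:UpdateItem',
                    'dynamodb:DeleteItem',
                    'dynamodb:Query',
                    'dynamodb:Scan',
                ],
                # The table itself plus any secondary indexes on it
                'Resource': [table_arn, f'{table_arn}/index/*'],
            },
            {
                'Effect': 'Allow',
                'Action': ['s3:GetObject', 's3:PutObject', 's3:DeleteObject'],
                # Object-level actions apply to keys inside the bucket
                'Resource': f'arn:aws:s3:::{bucket_name}/*',
            },
        ],
    }


# Usage with placeholder values for a "prod" deployment
policy = products_policy(
    region='us-east-1',
    account_id='123456789012',
    table_name='ProductsTable-prod',
    bucket_name='products-files-prod-123456789012',
)
policy_json = json.dumps(policy, indent=2)
```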
Security Groups
  WebTierSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for web tier
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          SourceSecurityGroupId: !Ref ALBSecurityGroup
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          SourceSecurityGroupId: !Ref ALBSecurityGroup
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          SourceSecurityGroupId: !Ref BastionSecurityGroup
  DatabaseSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for database tier
      VpcId: !Ref VPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 5432
          ToPort: 5432
          SourceSecurityGroupId: !Ref AppTierSecurityGroup
Cost Optimization
Reserved Instances and Savings Plans
# AWS CLI commands for cost optimization
# Get Reserved Instance recommendations
aws ce get-reservation-purchase-recommendation \
  --service "Amazon Elastic Compute Cloud - Compute" \
  --account-scope PAYER
# Get Savings Plans recommendations (term, payment option, and lookback period are required)
aws ce get-savings-plans-purchase-recommendation \
  --savings-plans-type COMPUTE_SP \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT \
  --lookback-period-in-days THIRTY_DAYS
# Get cost and usage for January 2024 (the End date is exclusive)
aws ce get-cost-and-usage \
  --time-period Start=2024-01-01,End=2024-02-01 \
  --granularity MONTHLY \
  --metrics BlendedCost \
  --group-by Type=DIMENSION,Key=SERVICE
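The get-cost-and-usage output groups costs by service within each time period, and amounts arrive as strings. A short sketch of ranking services by spend follows. `top_services` is a hypothetical helper, and the sample response contains made-up amounts that only mirror the shape of the API output (ResultsByTime, Groups, Keys, Metrics).

```python
from decimal import Decimal


def top_services(cost_response, metric='BlendedCost', limit=3):
    """Sum a cost metric per service across all periods and return the largest."""
    totals = {}
    for period in cost_response.get('ResultsByTime', []):
        for group in period.get('Groups', []):
            service = group['Keys'][0]
            # Amounts are strings in the response; Decimal avoids float rounding
            amount = Decimal(group['Metrics'][metric]['Amount'])
            totals[service] = totals.get(service, Decimal('0')) + amount
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Illustrative response with invented amounts, not real billing data
sample_response = {
    'ResultsByTime': [{
        'TimePeriod': {'Start': '2024-01-01', 'End': '2024-02-01'},
        'Groups': [
            {'Keys': ['Amazon Relational Database Service'],
             'Metrics': {'BlendedCost': {'Amount': '310.50', 'Unit': 'USD'}}},
            {'Keys': ['Amazon Elastic Compute Cloud - Compute'],
             'Metrics': {'BlendedCost': {'Amount': '512.25', 'Unit': 'USD'}}},
            {'Keys': ['Amazon DynamoDB'],
             'Metrics': {'BlendedCost': {'Amount': '42.00', 'Unit': 'USD'}}},
        ],
    }]
}

ranked = top_services(sample_response, limit=2)
```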
Auto Scaling Policies
  # Target Tracking Scaling Policy
  CPUTargetTrackingPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref WebTierAutoScalingGroup
      PolicyType: TargetTrackingScaling
      # EC2 Auto Scaling target tracking takes an instance warm-up time;
      # ScaleOutCooldown and ScaleInCooldown belong to Application Auto Scaling policies
      EstimatedInstanceWarmup: 300
      TargetTrackingConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ASGAverageCPUUtilization
        TargetValue: 70.0
  # Step Scaling Policy
  # Invoked by a CloudWatch alarm; interval bounds are offsets from the alarm threshold
  RequestCountScalingPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AutoScalingGroupName: !Ref WebTierAutoScalingGroup
      PolicyType: StepScaling
      AdjustmentType: ChangeInCapacity
      StepAdjustments:
        - MetricIntervalLowerBound: 0
          MetricIntervalUpperBound: 50
          ScalingAdjustment: 1
        - MetricIntervalLowerBound: 50
          ScalingAdjustment: 2
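The step adjustments in the step scaling policy are defined relative to the threshold of the CloudWatch alarm that triggers the policy: the size of the breach selects the step. The sketch below models that selection for a scale-out policy, assuming an inclusive lower bound, an exclusive upper bound, and a missing bound meaning unbounded. The function name and the threshold of 1000 are hypothetical values chosen for illustration.

```python
from typing import Dict, List, Optional


def select_step_adjustment(metric_value: float, alarm_threshold: float,
                           step_adjustments: List[Dict[str, float]]) -> Optional[float]:
    """Return the ScalingAdjustment whose interval contains the breach, if any."""
    breach = metric_value - alarm_threshold
    for step in step_adjustments:
        lower = step.get('MetricIntervalLowerBound')
        upper = step.get('MetricIntervalUpperBound')
        if lower is not None and breach < lower:
            continue  # breach is below this step's interval
        if upper is not None and breach >= upper:
            continue  # breach is at or above this step's interval
        return step['ScalingAdjustment']
    return None


# The two steps from the step scaling policy
steps = [
    {'MetricIntervalLowerBound': 0, 'MetricIntervalUpperBound': 50, 'ScalingAdjustment': 1},
    {'MetricIntervalLowerBound': 50, 'ScalingAdjustment': 2},
]

# With an assumed alarm threshold of 1000 requests
small_breach = select_step_adjustment(1020, 1000, steps)  # breach of 20
large_breach = select_step_adjustment(1200, 1000, steps)  # breach of 200
no_breach = select_step_adjustment(990, 1000, steps)      # below the threshold
```

Under these assumptions, a breach of 20 adds one instance, a breach of 200 adds two, and a value below the threshold matches no step.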
Disaster Recovery
Multi-Region Setup
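  # Cross-region replication for S3
  # Assumes SourceBucket and DestinationBucket resolve to bucket names
  S3ReplicationRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: s3.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: ReplicationPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - s3:GetReplicationConfiguration
                  - s3:ListBucket
                Resource: !Sub 'arn:aws:s3:::${SourceBucket}'
              - Effect: Allow
                Action:
                  - s3:GetObjectVersionForReplication
                  - s3:GetObjectVersionAcl
                Resource: !Sub 'arn:aws:s3:::${SourceBucket}/*'
              - Effect: Allow
                Action:
                  - s3:ReplicateObject
                  - s3:ReplicateDelete
                Resource: !Sub 'arn:aws:s3:::${DestinationBucket}/*'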
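  # RDS Cross-Region Backup
  # CloudFormation has no AWS::RDS::DBSnapshot resource type, so this sketch uses a
  # read replica declared in a stack deployed to the DR Region. PrimaryDatabaseArn,
  # PrimaryRegion, and ReplicaKmsKeyArn are assumed stack parameters.
  DatabaseReadReplica:
    Type: AWS::RDS::DBInstance
    Properties:
      SourceDBInstanceIdentifier: !Ref PrimaryDatabaseArn
      SourceRegion: !Ref PrimaryRegion
      DBInstanceClass: db.r5.xlarge
      # The primary is encrypted, so the replica needs a KMS key in its own Region
      KmsKeyId: !Ref ReplicaKmsKeyArn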
Conclusion
AWS provides a rich ecosystem of services for building scalable, reliable, and cost-effective cloud applications. By following well-architected principles and implementing proper monitoring, security, and cost optimization strategies, you can build robust systems that scale with your business needs.
Start with simple architectures and add complexity only as your requirements grow. Always consider the six pillars of the Well-Architected Framework: operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability.