AWS Auto Scaling Cheat Sheet

AWS Auto Scaling monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. It helps you build applications that are highly available and fault-tolerant.

Key Components

1. Auto Scaling Groups (ASG)

A collection of EC2 instances treated as a logical grouping for the purposes of automatic scaling and management.
You define a minimum, maximum, and desired number of instances in each group.
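As a rough illustration of how the minimum and maximum bound the capacity a scaling policy can request, here is a minimal sketch. The `GroupSize` class and `clamp_desired` function are hypothetical names for this example, not part of any AWS SDK.

```python
from dataclasses import dataclass


@dataclass
class GroupSize:
    """Hypothetical container for the three size settings of a group."""
    min_size: int
    max_size: int
    desired: int


def clamp_desired(size: GroupSize, requested: int) -> int:
    """Limit a requested capacity to the [min_size, max_size] range."""
    if size.min_size > size.max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(size.min_size, min(requested, size.max_size))


group = GroupSize(min_size=2, max_size=10, desired=4)
print(clamp_desired(group, 15))  # request above the maximum is capped
print(clamp_desired(group, 0))   # request below the minimum is raised
```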

2. Launch Templates & Launch Configurations

Both specify the configuration of the EC2 instances that the Auto Scaling group launches.
Launch Template (Recommended): The newer version. Allows you to have multiple versions of a template and is more flexible. You can specify AMI ID, instance type, key pairs, security groups, block device mappings, etc.
Launch Configuration (Legacy): The older version. An ASG can use only one launch configuration at a time, and a launch configuration cannot be edited after creation. To change settings, you create a new launch configuration and update the ASG to use it.

Scaling Options & Policies

You can configure an Auto Scaling group to scale in several ways:

1. Maintain Current Instance Levels

The ASG will maintain the desired capacity at all times. It will perform health checks and replace any unhealthy instances.

2. Manual Scaling

You manually specify changes to the maximum, minimum, or desired capacity of the ASG.

3. Scheduled Scaling

Scale your application in response to predictable load changes.
You create a scheduled action that tells the ASG to perform a scaling action at a specific time (either one-time or on a recurring schedule).
Use Case: Scaling out before the start of the business day and scaling in at the end.
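A business-hours schedule could be expressed as two recurring actions, sketched below. The dictionary keys follow the parameter names of boto3's `put_scheduled_update_group_action` call; the group name, action names, times, and capacities are assumptions made up for illustration. The function only builds the parameters and does not call AWS.

```python
from typing import Any, Dict, List


def business_hours_actions(group_name: str, day_capacity: int,
                           night_capacity: int) -> List[Dict[str, Any]]:
    """Build two recurring scheduled actions for weekdays (cron format)."""
    return [
        {
            "AutoScalingGroupName": group_name,
            "ScheduledActionName": "scale-out-morning",  # hypothetical name
            "Recurrence": "0 8 * * 1-5",   # 08:00, Monday to Friday
            "DesiredCapacity": day_capacity,
        },
        {
            "AutoScalingGroupName": group_name,
            "ScheduledActionName": "scale-in-evening",   # hypothetical name
            "Recurrence": "0 18 * * 1-5",  # 18:00, Monday to Friday
            "DesiredCapacity": night_capacity,
        },
    ]


for action in business_hours_actions("web-asg", day_capacity=6, night_capacity=2):
    print(action["ScheduledActionName"], action["Recurrence"])
```

Each dictionary could then be passed as keyword arguments to the boto3 call. Recurrence times are interpreted in UTC unless a time zone is specified.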

4. Dynamic Scaling

Responds to changing demand by scaling based on a metric.
Target Tracking Scaling: The most common and recommended policy. You select a metric (like average CPU utilization) and set a target value. The ASG automatically calculates the number of instances needed to keep the metric at, or close to, the target value.
Step Scaling: You define a set of scaling adjustments (e.g., "add 2 instances") that vary based on the size of the alarm breach.
Simple Scaling: The oldest policy. Scales based on a single adjustment in response to an alarm. After each scaling activity, the group waits for the cooldown period to end before it responds to further alarms, which prevents rapid, successive scaling actions.
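The difference between target tracking and step scaling can be sketched in a few lines. This is a simplified model, not the service's actual algorithm: it assumes the average metric falls in proportion to the number of instances, and the function names and step bounds are hypothetical example values.

```python
import math


def target_tracking_capacity(current: int, metric: float, target: float) -> int:
    """Estimate the capacity that brings an average metric back to target.

    Assumes the metric is inversely proportional to instance count.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    return max(1, math.ceil(current * metric / target))


# (lower_bound, upper_bound, instances_to_add); bounds are offsets from the
# alarm threshold, lower inclusive and upper exclusive, None means unbounded.
STEPS = [(0, 10, 1), (10, 20, 2), (20, None, 4)]


def step_adjustment(metric: float, threshold: float, steps=STEPS) -> int:
    """Pick the adjustment whose interval contains the size of the breach."""
    breach = metric - threshold
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0  # alarm threshold not breached


print(target_tracking_capacity(current=4, metric=75.0, target=50.0))
print(step_adjustment(metric=75.0, threshold=60.0))
```

Target tracking derives the capacity from the target value, while step scaling looks up a fixed adjustment that grows with the size of the breach.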

5. Predictive Scaling

Uses machine learning to analyze historical traffic data and forecast future load demands.
It proactively schedules scaling actions to ensure capacity is available before it's needed.
Only available for EC2 Auto Scaling groups.

Auto Scaling Lifecycle & Features

Lifecycle Hooks

Allow you to perform custom actions as the Auto Scaling group launches or terminates instances.
When a hook is triggered, the instance is paused in a wait state (e.g., Pending:Wait or Terminating:Wait). You can perform custom actions on the instance (like installing software or extracting logs) before it continues.
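The wait states can be pictured as stops on a fixed path. The sketch below walks an instance through the launch or termination path and runs a custom action while it is in the wait state. It models only the case where the hook completes successfully, and `run_hook` is a hypothetical helper, not an AWS API.

```python
from typing import Callable, List

LAUNCH_PATH = ["Pending", "Pending:Wait", "Pending:Proceed", "InService"]
TERMINATE_PATH = ["Terminating", "Terminating:Wait",
                  "Terminating:Proceed", "Terminated"]


def run_hook(path: List[str], custom_action: Callable[[str], None]) -> List[str]:
    """Walk through the states, running custom_action in the wait state."""
    visited = []
    for state in path:
        visited.append(state)
        if state.endswith(":Wait"):
            # The instance is paused here until the custom action finishes.
            custom_action(state)
    return visited


run_hook(LAUNCH_PATH, lambda state: print("installing software during", state))
run_hook(TERMINATE_PATH, lambda state: print("extracting logs during", state))
```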

Cooldown Period

Ensures that the ASG doesn't launch or terminate additional instances before the previous scaling activity has taken effect.
Applies to Simple Scaling policies.
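The cooldown check amounts to a time comparison, sketched here. The function name is hypothetical; the 300-second default matches the default cooldown of an Auto Scaling group.

```python
def can_scale(now: float, last_activity_completed: float,
              cooldown_seconds: float = 300.0) -> bool:
    """Return True once the cooldown since the last scaling activity is over."""
    return now - last_activity_completed >= cooldown_seconds


print(can_scale(now=1_000.0, last_activity_completed=900.0))  # still cooling down
print(can_scale(now=1_300.0, last_activity_completed=900.0))  # cooldown finished
```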

Instance Warm-up

Specifies the time, in seconds, until a newly launched instance can contribute to the CloudWatch metrics for the group. This prevents premature scaling actions based on data from an instance that is still initializing.
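One way to picture the warm-up is as a filter on which instances count toward the group's aggregated metrics. The sketch below uses a hypothetical function and made-up launch times; it is not how the service is implemented.

```python
from typing import Dict, List


def instances_counted_in_metrics(launch_times: Dict[str, float], now: float,
                                 warmup_seconds: float) -> List[str]:
    """Return the IDs of instances whose warm-up time has elapsed."""
    return sorted(
        instance_id
        for instance_id, launched_at in launch_times.items()
        if now - launched_at >= warmup_seconds
    )


launches = {"i-old": 0.0, "i-new": 550.0}  # hypothetical IDs and timestamps
print(instances_counted_in_metrics(launches, now=600.0, warmup_seconds=120.0))
```

Here only `i-old` is counted, because `i-new` was launched 50 seconds ago and is still within its 120-second warm-up.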

Warm Pools

A pool of pre-initialized EC2 instances that sits alongside your ASG.
When the ASG needs to scale out, it pulls instances from the warm pool, significantly reducing the time it takes for an instance to be ready to serve traffic.

Health Checks

An ASG determines an instance's health status using:
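- EC2 Status Checks: The default check. Monitors the status of the instance and its underlying system.
- Elastic Load Balancing (ELB) Health Checks: Optional. If the ASG is attached to a load balancer and ELB health checks are enabled for the group, an instance that fails the load balancer's health check is also marked as unhealthy.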
If an instance is marked as unhealthy, the ASG will terminate it and launch a replacement.
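The way the two checks combine can be sketched as follows. The function is a hypothetical illustration in which `None` stands for "ELB health checks are not enabled for the group".

```python
from typing import Optional


def is_healthy(ec2_status_ok: bool, elb_health_ok: Optional[bool] = None) -> bool:
    """An instance is healthy only if every enabled check passes."""
    if not ec2_status_ok:
        return False
    if elb_health_ok is None:
        # ELB health checks are not enabled, so only the EC2 check counts.
        return True
    return elb_health_ok


print(is_healthy(ec2_status_ok=True))                       # EC2 check only
print(is_healthy(ec2_status_ok=True, elb_health_ok=False))  # fails the ELB check
```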

Termination Policies

When an ASG needs to scale in, it uses a termination policy to decide which instance(s) to terminate.
Default Policy: Aims to balance instances across Availability Zones to maintain high availability.
Other Predefined Policies:
- OldestInstance: Terminate the oldest instance.
- NewestInstance: Terminate the newest instance.
- OldestLaunchConfiguration / OldestLaunchTemplate: Terminate instances with the oldest configuration/template.
- ClosestToNextInstanceHour: Terminate instances closest to the next billing hour (cost-saving).
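The selection logic for OldestInstance and NewestInstance can be sketched as below. This is a simplified model with hypothetical names: it first narrows the candidates to the Availability Zone(s) with the most instances, reflecting that zone balancing takes priority, and then applies the policy. Tie-breaking between zones is not modeled.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    instance_id: str
    zone: str
    launch_time: float  # seconds since epoch


def pick_for_termination(instances: List[Instance], policy: str) -> Instance:
    """Choose one instance to terminate during scale-in."""
    counts = Counter(i.zone for i in instances)
    busiest = max(counts.values())
    # Only instances in the most populated zone(s) are candidates.
    candidates = [i for i in instances if counts[i.zone] == busiest]
    if policy == "OldestInstance":
        return min(candidates, key=lambda i: i.launch_time)
    if policy == "NewestInstance":
        return max(candidates, key=lambda i: i.launch_time)
    raise ValueError(f"policy not covered by this sketch: {policy}")


fleet = [
    Instance("i-a", "az-1", 100.0),
    Instance("i-b", "az-1", 200.0),
    Instance("i-c", "az-2", 50.0),
]
print(pick_for_termination(fleet, "OldestInstance").instance_id)
```

In this example `i-c` is the oldest instance overall, but it is not chosen because its zone has fewer instances than `az-1`.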

Instance Protection

You can protect specific instances from being terminated during a scale-in event.

Standby State

You can put an instance in the Standby state to temporarily take it out of service without removing it from the ASG. The instance keeps running, but it stops receiving traffic from any attached load balancer, and the ASG does not perform health checks on it. You can troubleshoot or update it and then return it to service.
