What Problem Does Aurora Serverless Solve?
Traditionally, managing a database requires you to provision a specific server size and capacity. This leads to common challenges:
- Over-provisioning: To handle peak load, you often have to provision expensive, powerful instances that are underutilized most of the time.
- Under-provisioning: To save costs, you might provision less capacity, which can lead to poor performance or downtime during traffic spikes.
- Manual Scaling: Constant monitoring and manual intervention are needed to scale database capacity up or down as application load changes.
Aurora Serverless is designed to solve these problems by introducing a "pay-per-use" model that automatically adapts to your application's needs.
How It Works: The Serverless v1 Architecture
Aurora Serverless v1 works by decoupling compute from storage and introducing a proxy layer.
- Distributed Storage: Like standard Aurora, the storage layer is a distributed, fault-tolerant volume that scales automatically.
- Proxy Fleet: This is the key innovation. Instead of connecting directly to a database instance, your application connects to a highly available proxy fleet. This fleet manages client connections and routes them to the appropriate compute resources.
- Warm Compute Pool: AWS maintains a pool of "warm" compute resources (the database engines). The proxy fleet can rapidly assign these resources to handle an incoming workload and release them when the workload subsides.
This architecture allows the database to "appear" serverless to the application, scaling compute up or down without dropping connections.
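The interaction between the proxy fleet and the warm compute pool can be sketched in a few lines. This is a toy simulation to make the routing idea concrete, not AWS's implementation; the class and engine names are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class WarmPool:
    """Pool of pre-warmed database engines the proxy can draw from."""
    available: list = field(default_factory=lambda: ["engine-a", "engine-b", "engine-c"])
    assigned: dict = field(default_factory=dict)  # client -> engine

class ProxyFleet:
    """Accepts client connections and routes them to warm compute."""
    def __init__(self, pool: WarmPool):
        self.pool = pool

    def connect(self, client: str) -> str:
        # Reuse an existing assignment, or grab an engine from the warm pool.
        if client not in self.pool.assigned:
            self.pool.assigned[client] = self.pool.available.pop()
        return self.pool.assigned[client]

    def release(self, client: str) -> None:
        # Return the engine to the warm pool when the workload subsides.
        engine = self.pool.assigned.pop(client)
        self.pool.available.append(engine)

fleet = ProxyFleet(WarmPool())
engine = fleet.connect("app-1")  # the app talks to the proxy, never the engine
fleet.release("app-1")           # capacity goes back to the pool
```

Because the client's connection terminates at the proxy, the engine behind it can be swapped for a larger or smaller one without the application noticing.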
Key Concepts
- Aurora Capacity Unit (ACU): The fundamental unit of measure for Aurora Serverless. An ACU is a combination of processing (vCPU) and memory capacity; each ACU provides approximately 2 GB of memory with corresponding CPU and networking. You don't provision instances; you set a range of ACUs.
- Min/Max Capacity: When you create a Serverless DB cluster, you define a minimum and maximum number of ACUs. The database will automatically scale within this range based on the current load.
- Scaling to Zero: A key feature is the ability to automatically pause compute capacity after a specified period of inactivity (e.g., 5 minutes). When paused, the cluster "scales to zero," and you only pay for storage. When a new connection request comes in, the proxy fleet automatically resumes the compute resources; note that resuming from a paused state is not instantaneous and can take on the order of tens of seconds, which the first connecting client experiences as added latency.
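The three concepts above can be modeled together: capacity follows demand but is clamped to the configured [min, max] ACU range, and sustained inactivity pauses compute entirely. This is a simplified sketch of the behavior (real scaling decisions are driven by CPU, connections, and memory metrics, and use graduated capacity steps); all names here are illustrative:

```python
AUTO_PAUSE_SECONDS = 300  # e.g., pause after 5 minutes of inactivity

class ServerlessCluster:
    def __init__(self, min_acu: int, max_acu: int):
        self.min_acu, self.max_acu = min_acu, max_acu
        self.acu = min_acu      # current capacity
        self.paused = False
        self.idle_seconds = 0

    def scale_for_load(self, demanded_acu: int) -> int:
        """Scale toward demand, clamped to the configured [min, max] range."""
        self.paused = False
        self.idle_seconds = 0
        self.acu = max(self.min_acu, min(self.max_acu, demanded_acu))
        return self.acu

    def tick_idle(self, seconds: int) -> None:
        """Track inactivity; past the threshold, compute scales to zero."""
        self.idle_seconds += seconds
        if self.idle_seconds >= AUTO_PAUSE_SECONDS:
            self.paused = True
            self.acu = 0        # only storage is billed while paused

cluster = ServerlessCluster(min_acu=2, max_acu=16)
cluster.scale_for_load(64)  # demand above the max is capped at 16 ACUs
cluster.tick_idle(300)      # after 5 idle minutes the cluster pauses
cluster.scale_for_load(8)   # a new request resumes compute automatically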
Core Benefits
- No Instance Management: You don't have to choose DB instance types or provision servers.
- Autoscaling: The database automatically starts up, shuts down, and scales capacity up or down based on your application's needs.
- Cost-Effective: By automatically pausing when idle and scaling to match demand, you avoid paying for unused capacity.
- Always Encrypted: Encryption at rest is enabled by default and cannot be disabled.
- High Availability: Uses the same fault-tolerant storage as standard Aurora, replicated across multiple Availability Zones; if the compute capacity fails, it is automatically recreated, in another AZ if necessary.
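To make the cost-effectiveness point concrete, compare a spiky workload billed per ACU-hour against an always-on provisioned instance. The rates below are purely hypothetical placeholders (real Aurora pricing varies by region and engine); only the shape of the comparison matters:

```python
# Hypothetical rates for illustration only; real pricing differs.
ACU_HOUR_RATE = 0.06         # assumed $ per ACU-hour
PROVISIONED_HOUR_RATE = 1.0  # assumed $ per hour for an always-on instance

def serverless_cost(acus_by_hour):
    """Pay only for ACU-hours actually consumed (0 while paused)."""
    return sum(acus * ACU_HOUR_RATE for acus in acus_by_hour)

def provisioned_cost(hours):
    """An always-on instance bills every hour regardless of load."""
    return hours * PROVISIONED_HOUR_RATE

# A spiky day: busy 2 hours at 8 ACUs, paused the other 22 hours.
day = [8, 8] + [0] * 22
print(serverless_cost(day))        # 2 h x 8 ACU x rate
print(provisioned_cost(len(day)))  # 24 h x rate
```

With steady round-the-clock load the comparison flips, which is exactly the "When to Avoid" case below.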
When to Use Aurora Serverless v1
Aurora Serverless v1 is ideal for specific types of workloads:
- Infrequent or Intermittent Workloads: Applications that are used only a few times a day or week.
- Unpredictable Workloads: Applications with sudden, unpredictable traffic spikes, such as a flash sale or a new product launch.
- New Applications: When you are unsure of the required database capacity and instance size for a new application.
- Development and Testing: Environments that are not used continuously, allowing them to scale down to zero when idle to save costs.
When to Avoid Aurora Serverless v1
- Highly Predictable Workloads: If your application has a very stable and predictable traffic pattern, a provisioned Aurora cluster with Reserved Instances might be more cost-effective.
- Long-Running Queries: Workloads with very long queries or transactions that require a persistent connection may not be a good fit for the proxy-based architecture, which is better suited for shorter, more frequent queries.