AWS Compute Services

AWS Lambda

Updated June 21, 2025

The Core Paradigm: Event-Driven, Serverless Functions

The fundamental principle of Lambda is simple: You write the code, and Lambda runs it in response to events.

  • Serverless: You never have to worry about the underlying servers. AWS handles all the operational and administrative tasks, including OS patching, capacity provisioning, auto-scaling, code monitoring, and logging.
  • Event-Driven: A Lambda function remains idle until it is triggered by an event. An event source is the AWS service or custom application that publishes events to trigger your function.

Common event sources include:

  • API Gateway: An HTTP request triggers the function (for building REST APIs).
  • Amazon S3: An object creation or deletion event triggers the function (for image processing, data validation).
  • Amazon DynamoDB: A change to a table item triggers the function (via DynamoDB Streams).
  • Amazon SQS: A new message in a queue triggers the function (for decoupling and background processing).
  • Amazon EventBridge (CloudWatch Events): A scheduled event (cron job) or a custom event triggers the function.
  • AWS Step Functions: A state in a workflow triggers the function.
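Whatever the event source, the contract is the same: Lambda calls your handler with the event payload and a context object. A minimal sketch in Python, assuming the standard S3 notification event shape (the bucket and key names here are placeholders):

```python
import json

def lambda_handler(event, context):
    """Handle S3 object-created notifications.

    `event` follows the S3 notification format (a "Records" list);
    `context` is the Lambda-provided runtime object (unused here).
    """
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        results.append(f"s3://{bucket}/{key}")
    return {"statusCode": 200, "body": json.dumps(results)}
```

The same handler signature works for any event source; only the shape of `event` changes.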

The Lambda Execution Lifecycle: Cold vs. Warm Starts

When a function is triggered for the first time or after a period of inactivity, Lambda creates a new execution environment. This process is known as a cold start. It involves:

  1. Downloading your code.
  2. Starting a new secure micro-VM (Firecracker).
  3. Initializing the function runtime (e.g., the JVM or Python interpreter).
  4. Running your function's initialization code (code outside the main handler).

This setup process can add latency to the first request. After the function runs, Lambda keeps the execution environment "warm" for a period, ready to process subsequent requests instantly. A request that hits a warm environment has much lower latency.
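This is why expensive setup (SDK clients, database connections, loaded config) conventionally lives at module level, outside the handler: it runs once per cold start and is reused by every warm invocation. A small sketch of the pattern, with a dict standing in for an expensive initialization step:

```python
import time

# Module-level code runs once per execution environment (during the
# cold start); warm invocations reuse everything defined here.
_CONFIG = {"loaded_at": time.time()}  # stands in for expensive setup
_invocation_count = 0

def lambda_handler(event, context):
    global _invocation_count
    _invocation_count += 1
    # On a warm invocation, _CONFIG is already populated and the
    # counter shows the environment has been reused.
    return {
        "cold_start": _invocation_count == 1,
        "config_age_s": time.time() - _CONFIG["loaded_at"],
    }
```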


Core Function Configuration

When creating a Lambda function, you define several key parameters:

  • Runtime: The language environment for your code (e.g., Node.js, Python, Java, Go, Ruby, .NET). You can also provide a custom runtime.
  • Memory: You can allocate memory from 128 MB to 10,240 MB. CPU power is allocated proportionally to the memory you select.
  • Timeout: The maximum amount of time (up to 900 seconds / 15 minutes) that a single invocation of your function is allowed to run before being terminated.
  • Execution Role: A mandatory IAM role that grants your function permissions to interact with other AWS services. For example, a role might grant s3:GetObject permission to read from an S3 bucket and logs:CreateLogStream to write logs to CloudWatch.
  • Environment Variables: Key-value pairs that you can pass to your function code without changing the code itself, useful for storing configuration settings.
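In the Python runtime, environment variables set in the function's configuration simply appear in `os.environ`. A sketch (the `TABLE_NAME` and `STAGE` variables are hypothetical settings, with fallbacks for local runs):

```python
import os

def lambda_handler(event, context):
    # Values configured on the function are injected as ordinary
    # environment variables; the defaults only apply when running
    # the code outside Lambda (e.g., local tests).
    table = os.environ.get("TABLE_NAME", "dev-table")
    stage = os.environ.get("STAGE", "dev")
    return {"table": table, "stage": stage}
```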

Managing Deployments and Dependencies

Versions and Aliases

Lambda provides powerful features for managing deployments safely.

  • Versions: Every time you upload new code to a function, you can publish a new, numbered Version. Published versions are immutable snapshots of your function's code and configuration. The unpublished $LATEST version is mutable and always points to your most recently uploaded code.
  • Aliases: An Alias is a pointer to a specific function version. You can create aliases like dev, staging, and prod. To promote code, you simply repoint the prod alias from an old version to a new, tested version. Aliases also support weighted routing, allowing you to perform gradual rollouts (e.g., sending 10% of traffic to the new version and 90% to the old one) for blue/green or canary deployments.
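Weighted routing is just a probabilistic split between two versions. A toy simulation of the idea (not AWS's implementation; the version names and 10% weight are illustrative):

```python
import random

def pick_version(primary, secondary, secondary_weight, rng=random.random):
    """Simulate a weighted alias: route `secondary_weight` (0.0-1.0)
    of traffic to the new version, the rest to the old one."""
    return secondary if rng() < secondary_weight else primary

# Canary rollout: ~10% of invocations hit v6, ~90% stay on v5.
counts = {"v5": 0, "v6": 0}
rng = random.Random(42).random  # seeded for reproducibility
for _ in range(10_000):
    counts[pick_version("v5", "v6", 0.10, rng)] += 1
```

If the new version's error rate stays flat, you shift the weight toward it until it takes 100% of traffic.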

Lambda Layers

A Layer is a ZIP archive that can contain libraries, a custom runtime, or other dependencies.

  • Purpose: Layers allow you to manage your function's dependencies separately and share them across multiple functions. This keeps your deployment package size small and simplifies updates. A function can use up to 5 layers.
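For Python runtimes, Lambda extracts layers into /opt and puts /opt/python on the import path, so shared modules must sit under a top-level python/ directory inside the layer ZIP. A sketch that builds such an archive in memory (the `helpers.py` module is a hypothetical shared dependency):

```python
import io
import zipfile

# Build a layer archive whose contents live under "python/", the
# directory Lambda's Python runtimes add to sys.path.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr(
        "python/helpers.py",
        "def greet(name):\n    return f'hello {name}'\n",
    )

layer_entries = zipfile.ZipFile(buf).namelist()
```

In practice you would write this ZIP to disk (or pip-install packages into a local `python/` directory and zip it) and publish it as a layer version.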

Concurrency, Networking, and Error Handling

Concurrency and Scaling

  • Concurrency is the number of requests that your function is serving at any given time. When your function is invoked, Lambda launches an instance of it to process the event. If other events come in while the first is still being processed, Lambda launches more instances.
  • By default, your account has a concurrency limit of 1,000 across all functions in a region.
  • You can configure Reserved Concurrency for a function to guarantee a certain number of concurrent executions are available for it at all times. This also acts as a "throttle" to prevent a function from overwhelming a downstream resource (like a database).
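A useful back-of-the-envelope for sizing against these limits is Little's law: required concurrency is roughly arrival rate times average duration. A sketch:

```python
def estimated_concurrency(requests_per_second, avg_duration_seconds):
    """Little's law estimate: concurrent executions needed is
    approximately arrival rate x average invocation duration."""
    return requests_per_second * avg_duration_seconds

# 200 req/s with a 0.5 s average duration needs ~100 concurrent
# executions -- comfortably under the default 1,000-per-region limit.
needed = estimated_concurrency(200, 0.5)
```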

Networking

  • By default, a Lambda function can access the public internet but cannot access resources inside your VPC.
  • To access resources in a VPC (like an RDS database or an ElastiCache cluster), you must configure your function to connect to the VPC. When you do this, Lambda creates Elastic Network Interfaces (ENIs) in the subnets you specify, allowing the function to communicate securely within your private network.
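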

Error Handling and Dead-Letter Queues (DLQ)

  • Synchronous Invocations (e.g., from API Gateway): Errors are returned directly to the caller.
  • Asynchronous Invocations (e.g., from S3): If the function fails, Lambda will automatically retry the invocation twice. If all retries fail, the event is discarded. To avoid losing failed events, you can configure a Dead-Letter Queue (DLQ). A DLQ is an SQS queue or SNS topic where Lambda sends the failed event payload for later analysis and reprocessing.
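The asynchronous policy can be sketched as: one initial attempt, up to two retries, and on total failure the event goes to the DLQ. A toy model (a plain list stands in for the SQS queue, and the flaky handler is illustrative):

```python
def invoke_async(handler, event, dlq, max_retries=2):
    """Model Lambda's asynchronous retry policy: one initial attempt
    plus up to `max_retries` retries; if every attempt fails, hand
    the event payload to the dead-letter queue."""
    for _attempt in range(1 + max_retries):
        try:
            return handler(event)
        except Exception:
            continue  # real Lambda also waits between attempts
    dlq.append(event)  # stands in for sending to SQS/SNS
    return None

def flaky_handler(event):
    raise RuntimeError("downstream unavailable")

dlq = []
invoke_async(flaky_handler, {"order_id": 1}, dlq)
```

The real service also waits between retry attempts; the point here is only the attempt count and the DLQ hand-off.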

Lambda Function URLs and Pricing

  • Function URLs: A newer, built-in feature that provides a dedicated HTTPS endpoint for your Lambda function. It's a simpler and more direct way to invoke a function via an HTTP request without needing to configure a separate API Gateway. Ideal for webhooks or simple, single-function APIs.
  • Pricing Model: Lambda has a generous free tier. Beyond that, you pay for two things:
    1. Number of Requests: A flat rate per million requests.
    2. Duration: The total compute time (in GB-seconds) consumed by your functions. This is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest millisecond, and multiplied by the memory you configured.
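The two billed dimensions make cost estimation straightforward arithmetic. A sketch, ignoring the free tier; the per-unit rates below are assumptions based on published us-east-1 pricing at the time of writing and vary by region, so check the current price list:

```python
def estimated_cost(requests, avg_ms, memory_mb,
                   per_million_requests=0.20,    # assumed us-east-1 rate
                   per_gb_second=0.0000166667):  # assumed us-east-1 rate
    """Estimate Lambda cost from its two billed dimensions:
    request count and GB-seconds of compute (free tier ignored)."""
    gb_seconds = requests * (avg_ms / 1000) * (memory_mb / 1024)
    request_cost = (requests / 1_000_000) * per_million_requests
    duration_cost = gb_seconds * per_gb_second
    return request_cost + duration_cost

# 1M invocations at 120 ms average on 512 MB = 60,000 GB-seconds.
cost = estimated_cost(1_000_000, 120, 512)
```

Note that doubling memory doubles the GB-seconds per invocation, but the extra CPU often shortens duration enough to offset much of that.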