Core Architecture
Amazon DocumentDB features a cloud-native architecture that decouples storage and compute, allowing each to scale independently. A DocumentDB cluster consists of a cluster volume and DB instances.
- Cluster Volume (Storage Layer): This is the database storage layer. It's a single, virtual volume that spans multiple Availability Zones (AZs). DocumentDB automatically grows the storage size as your data increases, up to 128 TiB.
- Instances (Compute Layer): These are the compute resources that provide the processing power to handle database read and write operations. You can have a primary instance and up to 15 replica instances.
Attribute | Cluster Volume | Instance (Local Storage) |
---|---|---|
Data Type | Persistent data | Temporary data, logs, cache |
Scalability | Automatically scales up to 128 TiB | Limited to the DB Instance class |
Endpoints
DocumentDB provides different endpoints to connect to the appropriate instances for different use cases.
- Cluster Endpoint:
  - Always connects to the cluster's current primary instance.
  - This is the main endpoint used for both read and write operations (e.g., `find`, `insert`, `update`, `delete`).
- Reader Endpoint:
  - Load-balances read-only connections across all available replica instances in the cluster.
  - Use this endpoint for read-intensive workloads to distribute the load.
- Instance Endpoint:
  - Connects to a specific DB instance within the cluster.
  - Useful for specialized workloads or administrative tasks that need to target a single, specific instance (either the primary or a particular replica).
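As a rough illustration of how an application might keep separate connection strings for the cluster and reader endpoints, here is a minimal sketch that only builds the URIs. The host names, user name, and password are hypothetical placeholders, and the `tls=true` and `retryWrites=false` options are assumptions based on commonly documented DocumentDB connection strings; check the connection string shown for your own cluster before relying on them.

```python
from urllib.parse import quote_plus, urlencode

# Hypothetical host names; real ones are listed in the cluster's details.
CLUSTER_ENDPOINT = "sample-cluster.cluster-abc123.us-east-1.docdb.amazonaws.com"
READER_ENDPOINT = "sample-cluster.cluster-ro-abc123.us-east-1.docdb.amazonaws.com"


def build_uri(host: str, user: str, password: str, port: int = 27017) -> str:
    """Build a MongoDB-style connection URI for one DocumentDB endpoint."""
    # Assumed options: TLS on, retryable writes off.
    options = urlencode({"tls": "true", "retryWrites": "false"})
    # Percent-encode credentials so characters like '@' or '/' stay valid.
    return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/?{options}"


# Reads and writes go to the primary through the cluster endpoint.
write_uri = build_uri(CLUSTER_ENDPOINT, "app_user", "p@ss/word")
# Read-only traffic is spread across replicas through the reader endpoint.
read_uri = build_uri(READER_ENDPOINT, "app_user", "p@ss/word")
```

Keeping two URIs lets the application send writes to the primary while read-heavy code paths use the reader endpoint.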
Key Features
MongoDB Compatibility
- Amazon DocumentDB emulates the MongoDB 3.6, 4.0, or 5.0 APIs, allowing you to use most of the same drivers, applications, and tools you use with MongoDB.
Performance
- Designed to deliver high throughput and millisecond latency. According to AWS, it can achieve millions of requests per second and offers up to twice the throughput of self-managed MongoDB.
Scalability
- Storage Scaling: Storage starts at 10 GB and automatically scales up to 128 TiB in 10 GB increments with no impact on cluster performance.
- Compute Scaling: You can scale compute resources vertically by changing the instance class for the DB instances in your cluster.
- Read Scaling: You can horizontally scale read throughput by adding up to 15 low-latency read replicas. Replication lag is typically under 100 milliseconds.
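The storage-scaling rule above amounts to simple arithmetic. The sketch below models it under two stated assumptions: allocation is rounded up to the next 10 GB step, and 1 TiB is treated as 1024 GB for simplicity. The function name is illustrative and is not part of any AWS API.

```python
import math

INCREMENT_GB = 10            # growth step described above
MAX_STORAGE_GB = 128 * 1024  # 128 TiB, assuming 1 TiB = 1024 GB


def allocated_storage_gb(data_gb: float) -> int:
    """Estimate the allocated volume size for a given amount of data."""
    if data_gb < 0:
        raise ValueError("data size cannot be negative")
    if data_gb > MAX_STORAGE_GB:
        raise ValueError("data exceeds the maximum cluster volume size")
    # The volume starts at one increment and grows in whole increments.
    steps = max(1, math.ceil(data_gb / INCREMENT_GB))
    return steps * INCREMENT_GB
```

For example, 95 GB of data would map to 100 GB of allocated storage in this model.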
High Availability & Reliability
- Multi-AZ Durability: The cluster volume maintains six copies of your data across three Availability Zones, providing high durability.
- Automatic Failover: DocumentDB supports automatic failover. If the primary instance fails, a replica is automatically promoted to primary. You can set a promotion priority tier for replicas to influence which one is chosen.
- Fast Recovery: In the event of a crash, database restart time is typically under a minute.
- Fault-Tolerant Replicas: Replicas can serve as failover targets with no data loss. For maximum availability, create replicas in multiple AZs.
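To make the promotion-tier idea concrete, here is a simplified model of choosing a failover target. It assumes tiers run from 0 to 15 with lower numbers meaning higher priority. The tie-break on instance ID is purely illustrative and does not describe how the service resolves ties. The `Replica` class and instance IDs are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Replica:
    instance_id: str
    promotion_tier: int  # assumed range 0 (highest priority) to 15 (lowest)
    available: bool = True


def pick_failover_target(replicas: Iterable[Replica]) -> Optional[Replica]:
    """Return the available replica with the lowest promotion tier."""
    candidates = [r for r in replicas if r.available]
    if not candidates:
        return None
    # Illustrative tie-break only: sort by instance ID for determinism.
    return min(candidates, key=lambda r: (r.promotion_tier, r.instance_id))


replicas = [
    Replica("replica-a", promotion_tier=2),
    Replica("replica-b", promotion_tier=1),
    Replica("replica-c", promotion_tier=0, available=False),
]
target = pick_failover_target(replicas)
```

Here `replica-c` has the best tier but is unavailable, so the model selects `replica-b`.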
Backup & Restore
- Automated Backups: Continuous automated backups are always enabled and have no impact on performance.
- Point-In-Time Restoration (PITR): You can restore your cluster to any point in time within your backup retention period. The latest restorable time is typically within the last five minutes.
- Snapshots: You can create manual cluster snapshots for long-term archival. Encrypted manual snapshots can be shared with other AWS accounts, and you can copy snapshots across AWS Regions for disaster recovery.
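A restore request can be sanity-checked against the retention window before it is submitted. The sketch below is a local check only and calls no AWS API. It assumes the latest restorable time trails the current time by about five minutes; the function and parameter names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def is_restorable(
    target: datetime,
    now: datetime,
    retention_days: int,
    latest_lag: timedelta = timedelta(minutes=5),  # assumed lag
) -> bool:
    """Check whether `target` falls inside the point-in-time restore window."""
    earliest = now - timedelta(days=retention_days)
    latest = now - latest_lag
    return earliest <= target <= latest


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
three_days_ago = now - timedelta(days=3)
ok = is_restorable(three_days_ago, now, retention_days=7)
```

With a seven-day retention period, a target three days in the past falls inside the window, while one eight days in the past does not.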
Security
- Authentication:
  - Connections are authenticated using the standard MongoDB SCRAM (Salted Challenge Response Authentication Mechanism).
  - Management APIs are authenticated and authorized using AWS IAM users, roles, and policies.
- Encryption:
  - Encryption at Rest: Data is encrypted at rest by default using keys you manage through the AWS Key Management Service (KMS).
  - Encryption in Transit: Data is encrypted in transit using Transport Layer Security (TLS).
- Network Isolation:
  - DocumentDB clusters are deployed within an Amazon VPC, allowing you to isolate them in your own private network.
- Role-Based Access Control (RBAC):
  - DocumentDB supports RBAC using MongoDB-compatible, built-in roles to enforce the principle of least privilege for database users.
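As one way to picture least-privilege users, the sketch below builds a MongoDB-style `createUser` command document that grants built-in roles per database. The user name, password, and database names are hypothetical, and the code only constructs the document. Sending it through a driver (for example with pymongo's `db.command(...)`) is left out and assumed to happen elsewhere.

```python
from typing import Dict


def create_user_command(username: str, password: str, grants: Dict[str, str]) -> dict:
    """Build a createUser command; `grants` maps database name -> built-in role."""
    return {
        "createUser": username,
        "pwd": password,
        # One role entry per database keeps each grant narrowly scoped.
        "roles": [{"role": role, "db": db} for db, role in grants.items()],
    }


# Hypothetical reporting user: read-only on one database, read-write on another.
command = create_user_command(
    "report_user",
    "example-password",
    {"sales": "read", "scratch": "readWrite"},
)
```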
Pricing
You are billed based on four main components:
- On-Demand DB Instances: Priced per second of usage (with a 10-minute minimum).
- Database Storage: Billed per GB-month.
- I/O Operations: Billed per million requests.
- Backup Storage: Billed per GB-month for storage exceeding your provisioned database storage size.
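The four components can be combined into a rough monthly estimate. The sketch below uses made-up rates purely for illustration; they are not AWS prices. It also assumes backup storage is charged only for the portion that exceeds the cluster's database storage, as described above.

```python
from dataclasses import dataclass
from typing import Iterable

MIN_BILLED_SECONDS = 600  # 10-minute minimum per instance run


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices; look up real ones for your Region."""
    instance_per_hour: float
    storage_per_gb_month: float
    io_per_million: float
    backup_per_gb_month: float


def billable_seconds(seconds_running: int) -> int:
    """Apply the 10-minute minimum to one instance run."""
    return max(MIN_BILLED_SECONDS, seconds_running) if seconds_running > 0 else 0


def monthly_cost(rates: Rates, instance_runs: Iterable[int],
                 storage_gb: float, io_requests: int, backup_gb: float) -> float:
    """Sum instance, storage, I/O, and excess backup storage charges."""
    hours = sum(billable_seconds(s) for s in instance_runs) / 3600
    instance = hours * rates.instance_per_hour
    storage = storage_gb * rates.storage_per_gb_month
    io = io_requests / 1_000_000 * rates.io_per_million
    # Only backup storage beyond the database storage size is charged.
    backup = max(0.0, backup_gb - storage_gb) * rates.backup_per_gb_month
    return instance + storage + io + backup


example_rates = Rates(1.0, 0.10, 0.20, 0.02)  # illustrative numbers only
total = monthly_cost(example_rates, [3600], storage_gb=100,
                     io_requests=5_000_000, backup_gb=150)
```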
Limitations
- While highly compatible, DocumentDB does not support every MongoDB API and aggregation pipeline stage. Always verify that your specific application's required operations are supported.
- Server-side JavaScript execution (e.g., `db.eval()` and the `$where` operator) is not supported for security reasons.
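One lightweight way to catch unsupported operators before they reach the database is to scan query documents in application code or tests. The sketch below is a minimal version of that idea. The operator set contains only `$where` from the list above and is meant to be extended after checking the compatibility documentation; the function name is illustrative.

```python
from typing import Any, FrozenSet, Set

# Starting point only; extend after checking the compatibility docs.
UNSUPPORTED_OPERATORS: FrozenSet[str] = frozenset({"$where"})


def find_unsupported(query: Any,
                     unsupported: FrozenSet[str] = UNSUPPORTED_OPERATORS) -> Set[str]:
    """Recursively collect unsupported operator keys used in a query document."""
    found: Set[str] = set()
    if isinstance(query, dict):
        for key, value in query.items():
            if key in unsupported:
                found.add(key)
            found |= find_unsupported(value, unsupported)
    elif isinstance(query, (list, tuple)):
        for item in query:
            found |= find_unsupported(item, unsupported)
    return found


query = {"$or": [{"age": {"$gt": 30}}, {"$where": "this.a > this.b"}]}
problems = find_unsupported(query)
```

A check like this could run in a test suite so that incompatible queries are flagged before deployment.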