AWS Database Migration Service

Core Components

AWS DMS consists of three main components that work together to perform a migration.

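Replication Instance: This is a managed EC2 instance that hosts the replication tasks. It connects to your source and target, performs the data conversion and migration, and caches change logs until they can be applied. Choose the instance class and storage based on the workload: the number of tables, the volume of changes, and the size of large objects all drive CPU, memory, and disk needs.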
Source Endpoint: A set of connection parameters that tells DMS where to find and how to connect to your source database. This can be an on-premises database, a database on an EC2 instance, an RDS instance, or a database from another cloud provider.
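Target Endpoint: A set of connection parameters that tells DMS where to find and how to connect to your target database. The target is usually a service within AWS, such as RDS, Redshift, S3, or a database on EC2. An on-premises target is also possible, provided the source is in AWS, because at least one of the two endpoints must be on an AWS service.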
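The three components can be sketched as plain request dictionaries. This is a minimal illustration, assuming the keyword names used by boto3's DMS client (create_replication_instance and create_endpoint); the identifiers, host names, instance class, and storage size are hypothetical placeholders, and nothing here calls AWS.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    identifier: str
    role: str      # "source" or "target"
    engine: str    # e.g. "mysql", "postgres", "oracle"
    host: str
    port: int
    database: str

    def to_request(self, username: str, password: str) -> dict:
        """Build keyword arguments shaped like dms.create_endpoint(**kwargs)."""
        if self.role not in ("source", "target"):
            raise ValueError(f"role must be 'source' or 'target', got {self.role!r}")
        return {
            "EndpointIdentifier": self.identifier,
            "EndpointType": self.role,
            "EngineName": self.engine,
            "ServerName": self.host,
            "Port": self.port,
            "DatabaseName": self.database,
            "Username": username,
            "Password": password,
        }


def replication_instance_request(identifier: str,
                                 instance_class: str = "dms.t3.medium",
                                 storage_gb: int = 50) -> dict:
    """Build keyword arguments shaped like dms.create_replication_instance(**kwargs)."""
    return {
        "ReplicationInstanceIdentifier": identifier,
        "ReplicationInstanceClass": instance_class,  # size this for the workload
        "AllocatedStorage": storage_gb,              # disk for cached change logs
    }


# Hypothetical example: on-premises MySQL source, RDS for MySQL target.
source = Endpoint("onprem-mysql", "source", "mysql", "db.example.internal", 3306, "sales")
target = Endpoint("rds-mysql", "target", "mysql", "sales.example.rds.amazonaws.com", 3306, "sales")
instance = replication_instance_request("migration-instance")
```

With boto3 installed and credentials configured, each dictionary could be passed to the matching client call; in practice the password would come from a secrets store rather than source code.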

The Migration Process

A typical database migration with DMS involves two phases: schema migration and data migration.

1. Schema Migration (using AWS Schema Conversion Tool - SCT)

For heterogeneous migrations (e.g., Oracle to PostgreSQL), the database schema and code objects (views, stored procedures, functions) must be converted.

AWS Schema Conversion Tool (SCT): The recommended tool for complex migrations.
- It analyzes your source database schema and automatically converts it to a format compatible with your target database.
- It provides an assessment report detailing the conversion complexity.
- SCT can also scan application source code to convert embedded SQL statements.
Basic Schema Copy: A feature within DMS for simpler migrations.
- It can automatically create tables and primary keys at the target.
- It does not migrate secondary indexes, foreign keys, or stored procedures. Use SCT for these.

For homogeneous migrations (e.g., MySQL to MySQL), no schema conversion is needed, so you can typically use native database tools for schema export/import. SCT remains an option if you also want its assessment report.
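How the schema reaches the target affects how the DMS task should treat existing tables. The sketch below assumes the TargetTablePrepMode option in the task's FullLoadSettings: DROP_AND_CREATE lets DMS create the tables itself (the basic schema copy described above), while DO_NOTHING leaves tables created by SCT or native tools untouched. The helper name is hypothetical.

```python
import json


def full_load_settings(schema_precreated: bool) -> str:
    """Return a task-settings JSON string for the full-load phase.

    schema_precreated=True  : tables already exist on the target (SCT or native tools).
    schema_precreated=False : let DMS create tables and primary keys itself.
    """
    mode = "DO_NOTHING" if schema_precreated else "DROP_AND_CREATE"
    return json.dumps({"FullLoadSettings": {"TargetTablePrepMode": mode}})


# Schema was converted and applied with SCT beforehand, so keep it as is.
settings_after_sct = full_load_settings(schema_precreated=True)
```

The resulting string is the kind of value a task's settings document would carry; a third mode, TRUNCATE_BEFORE_LOAD, keeps the table definitions but empties the tables before loading.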

2. Data Migration (using AWS DMS Tasks)

Once the schema is ready on the target, you create a DMS task to move the data. DMS offers three primary migration modes.

DMS Migration Modes (Replication Tasks)

Full Load (migrate-existing-data; API migration type: full-load):
- This mode performs a one-time migration of all existing data from the source to the target.
- It's suitable when writes to the source can be paused, or the source taken offline, for the duration of the migration, because changes made during the load are not captured.
- During a full load, DMS loads several tables in parallel (eight at a time by default, controlled by the MaxFullLoadSubTasks task setting).
Change Data Capture - CDC (replicate-data-changes-only; API migration type: cdc):
- This mode only replicates ongoing changes from the source to the target. It does not migrate any existing data.
- It's used to set up continuous replication or to sync databases after an initial data load has been completed through other means.
- The source database must be configured to produce replication logs (e.g., binlog for MySQL, transaction logs for SQL Server).
Full Load + CDC (migrate-existing-data-and-replicate-ongoing-changes; API migration type: full-load-and-cdc):
- This is the most common mode for migrations with minimal downtime.
- DMS first performs a full load of all existing data.
- Changes made on the source during the full load are cached on the replication instance. Once the full load is complete, DMS applies the cached changes and then continues capturing and applying ongoing changes.
- This keeps the source and target databases in sync, allowing you to switch over at a planned time.
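The three modes map onto a single MigrationType field when a task is created through the API. The sketch below builds the arguments for boto3's create_replication_task without calling AWS; the ARNs, task name, and schema name are hypothetical placeholders, and the table mapping is the simplest possible selection rule (include every table in one schema).

```python
import json

# Mode names as used in this document, mapped to DMS API MigrationType values.
MIGRATION_TYPES = {
    "migrate-existing-data": "full-load",
    "replicate-data-changes-only": "cdc",
    "migrate-existing-data-and-replicate-ongoing-changes": "full-load-and-cdc",
}


def select_all_tables(schema: str) -> str:
    """Return a table-mapping JSON string that includes every table in a schema."""
    rule = {
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-all",
        "object-locator": {"schema-name": schema, "table-name": "%"},
        "rule-action": "include",
    }
    return json.dumps({"rules": [rule]})


def replication_task_request(task_id: str, mode: str, source_arn: str,
                             target_arn: str, instance_arn: str, schema: str) -> dict:
    """Build keyword arguments shaped like dms.create_replication_task(**kwargs)."""
    return {
        "ReplicationTaskIdentifier": task_id,
        "MigrationType": MIGRATION_TYPES[mode],
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "TableMappings": select_all_tables(schema),
    }


# Hypothetical ARNs, for illustration only.
task = replication_task_request(
    task_id="sales-minimal-downtime",
    mode="migrate-existing-data-and-replicate-ongoing-changes",
    source_arn="arn:aws:dms:us-east-1:123456789012:endpoint:SOURCEEXAMPLE",
    target_arn="arn:aws:dms:us-east-1:123456789012:endpoint:TARGETEXAMPLE",
    instance_arn="arn:aws:dms:us-east-1:123456789012:rep:INSTANCEEXAMPLE",
    schema="sales",
)
```

Switching between the three modes only changes the mode argument; the endpoints, instance, and table mappings stay the same.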

Common Use Cases

Homogeneous Migrations: Migrating between the same database engines (e.g., on-premises MySQL to Amazon RDS for MySQL).
Heterogeneous Migrations: Migrating between different database engines (e.g., on-premises Oracle to Amazon Aurora PostgreSQL). The schema and code objects must be converted first, typically with SCT.
Database Consolidation: Streaming data from multiple source databases into a central data warehouse like Amazon Redshift or a data lake built on Amazon S3.
Continuous Data Replication: For disaster recovery, geographic distribution, or feeding live data to analytical systems. Replication can be from on-premises to AWS or between AWS regions.
- Limitation: DMS does not support replication from an on-premises source to another on-premises target.
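The limitation above amounts to a simple rule: at least one endpoint must be in AWS. A minimal check, with a hypothetical function name, might look like this:

```python
def is_supported_topology(source_in_aws: bool, target_in_aws: bool) -> bool:
    """Return True when at least one endpoint is on an AWS service."""
    return source_in_aws or target_in_aws


# On-premises to AWS is fine; on-premises to on-premises is not.
onprem_to_aws = is_supported_topology(source_in_aws=False, target_in_aws=True)
onprem_to_onprem = is_supported_topology(source_in_aws=False, target_in_aws=False)
```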

Security

Encryption in Transit: DMS supports SSL/TLS to encrypt connections between the replication instance and the source and target endpoints. Encryption is configured per endpoint through its SSL mode (none, require, verify-ca, or verify-full); not every database engine supports every mode.
Encryption at Rest:
- The replication instance storage and target database can be encrypted using an AWS Key Management Service (KMS) key.
- This protects the cached change logs and other data stored on the replication instance's disk.
Network Security: DMS replication instances are deployed within your VPC. You use VPC security groups and network ACLs to control traffic to and from the replication instance and your database endpoints.
Identity and Access Management (IAM): Use IAM roles and policies to manage permissions for accessing DMS resources and other related AWS services.
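The security points above can be turned into a small pre-flight review of the request dictionaries before anything is created. This is only a sketch: the function name and finding messages are hypothetical, and the defaults it assumes (SslMode "none", PubliclyAccessible true, and a default KMS key when KmsKeyId is omitted) reflect the DMS API defaults as the author of this sketch understands them.

```python
def security_findings(instance: dict, endpoints: list) -> list:
    """Return human-readable findings for a replication instance and its endpoints."""
    findings = []
    # Assumed API default: the instance is publicly accessible unless told otherwise.
    if instance.get("PubliclyAccessible", True):
        findings.append("replication instance is publicly accessible")
    # Storage is still encrypted without this key, but with the default DMS key.
    if not instance.get("KmsKeyId"):
        findings.append("no customer-managed KMS key set; the default key will be used")
    for endpoint in endpoints:
        # Assumed API default: no SSL unless SslMode is set on the endpoint.
        if endpoint.get("SslMode", "none") == "none":
            name = endpoint.get("EndpointIdentifier", "<unnamed>")
            findings.append(f"endpoint {name} does not use SSL/TLS")
    return findings


# Hypothetical configuration: private instance, one endpoint still unencrypted.
example_instance = {"PubliclyAccessible": False, "KmsKeyId": "alias/migration-key"}
example_endpoints = [
    {"EndpointIdentifier": "onprem-mysql", "SslMode": "verify-ca"},
    {"EndpointIdentifier": "rds-mysql"},
]
example_findings = security_findings(example_instance, example_endpoints)
```

Network controls (security groups, network ACLs) and IAM policies are not covered by this check; they would need to be reviewed separately.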

Monitoring & High Availability

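Monitoring: DMS is integrated with Amazon CloudWatch. You can monitor replication instance metrics (e.g., CPUUtilization, FreeableMemory, FreeStorageSpace) and replication task metrics (e.g., CDCLatencySource and CDCLatencyTarget for latency, CDCThroughputRowsSource and CDCThroughputRowsTarget for throughput).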
High Availability: You can enable Multi-AZ for the DMS replication instance. This creates a standby replica in a different Availability Zone, which takes over automatically if the primary instance fails, providing resilience for ongoing replication tasks.
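As an illustration of task monitoring, the sketch below builds the arguments for a CloudWatch alarm on target apply latency, shaped like boto3's cloudwatch.put_metric_alarm. The namespace AWS/DMS and the metric CDCLatencyTarget (in seconds) are real; the alarm name, the threshold, and the identifier values are hypothetical, and the dimension values should be checked against what CloudWatch actually lists for your task.

```python
def cdc_latency_alarm(instance_id: str, task_id: str,
                      threshold_seconds: float = 300.0) -> dict:
    """Build keyword arguments shaped like cloudwatch.put_metric_alarm(**kwargs)."""
    return {
        "AlarmName": f"dms-{task_id}-target-latency",
        "Namespace": "AWS/DMS",
        "MetricName": "CDCLatencyTarget",
        "Dimensions": [
            {"Name": "ReplicationInstanceIdentifier", "Value": instance_id},
            {"Name": "ReplicationTaskIdentifier", "Value": task_id},
        ],
        "Statistic": "Average",
        "Period": 60,             # evaluate one-minute averages
        "EvaluationPeriods": 5,   # alarm after five consecutive breaches
        "Threshold": threshold_seconds,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Hypothetical identifiers, for illustration only.
alarm = cdc_latency_alarm("migration-instance", "EXAMPLETASKID")
```

A sustained rise in this metric usually means the target cannot keep up with the change stream, which is worth knowing before scheduling a cutover.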
