Amazon Aurora Machine Learning

The Problem It Solves

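Traditionally, getting an ML prediction for data stored in a database required a slow, multi-step process: a custom application had to read the data out of the database, send it to an ML service, and write the predictions back.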

Aurora ML simplifies this entire workflow into a single SQL query, eliminating the need for custom applications or data movement.

[Figure: simplified workflow with Aurora ML]

Aurora ML is natively integrated with two key AWS Machine Learning services, Amazon SageMaker and Amazon Comprehend:

Amazon SageMaker

What it is: A fully managed service for building, training, and deploying any kind of ML model.
Aurora Integration: Allows you to invoke your custom SageMaker models directly from Aurora.
Common Use Cases:
- Fraud Detection: Pass transaction data to a classification model to get a real-time fraud score.
- Predicting Customer Churn: Analyze customer activity data to predict which customers are likely to leave.
- Product Recommendations: Recommend products to users based on their browsing history and past purchases.
- Credit Risk Assessment: Evaluate loan applications by passing applicant data to a risk model.
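
As a rough illustration of the SageMaker integration on Aurora MySQL, the sketch below uses Python only to assemble the SQL text; it never contacts AWS. The `CREATE FUNCTION ... ALIAS AWS_SAGEMAKER_INVOKE_ENDPOINT ... ENDPOINT NAME` form follows the Aurora MySQL syntax. The function name `fraud_score`, the endpoint name `fraud-detection-endpoint`, and the `transactions` table with its columns are hypothetical placeholders. Aurora PostgreSQL uses a different mechanism that is not shown here.

```python
def sagemaker_function_ddl(func_name, params, return_type, endpoint_name):
    """Render an Aurora MySQL statement that maps a SQL function to a SageMaker endpoint.

    params is a list of (name, sql_type) pairs. Every name passed in is a
    placeholder chosen by the caller; this helper only builds a string.
    """
    param_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in params)
    return (
        f"CREATE FUNCTION {func_name}({param_sql}) RETURNS {return_type}\n"
        f"  ALIAS AWS_SAGEMAKER_INVOKE_ENDPOINT\n"
        f"  ENDPOINT NAME '{endpoint_name}';"
    )


# Hypothetical fraud-detection example: all names are illustrative only.
ddl = sagemaker_function_ddl(
    "fraud_score",
    [("amount", "DOUBLE"), ("merchant_id", "INT")],
    "DOUBLE",
    "fraud-detection-endpoint",
)

# Once the function exists, the model is called like any other SQL function.
query = (
    "SELECT transaction_id, fraud_score(amount, merchant_id) AS score\n"
    "FROM transactions\n"
    "WHERE created_at > NOW() - INTERVAL 1 DAY;"
)

print(ddl)
print(query)
```

The function's parameter types and return type would need to match what the deployed model actually accepts and returns; the `DOUBLE` and `INT` types here are assumptions.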

Amazon Comprehend

What it is: A Natural Language Processing (NLP) service that uses ML to find insights in text.
Aurora Integration: Allows you to analyze text stored in your database columns.
Common Use Cases:
- Sentiment Analysis: Determine if a product review or social media comment stored in a VARCHAR column is positive, negative, neutral, or mixed. This is the most common use case.
- Entity Detection: Extract key entities like people, places, and brands from text.
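
For sentiment analysis on Aurora MySQL, a query might look roughly like the one assembled below. Python is used here only to build the SQL string. `aws_comprehend_detect_sentiment` and `aws_comprehend_detect_sentiment_confidence` are the built-in Aurora MySQL function names; the `product_reviews` table and its `review_id` and `review_text` columns are hypothetical.

```python
def sentiment_query(table, id_col, text_col, language_code="en"):
    """Render a SELECT that returns a sentiment label and confidence for each row.

    The table and column names are caller-supplied placeholders; the two
    aws_comprehend_* functions are Aurora MySQL built-ins.
    """
    return (
        f"SELECT {id_col},\n"
        f"       aws_comprehend_detect_sentiment({text_col}, '{language_code}') AS sentiment,\n"
        f"       aws_comprehend_detect_sentiment_confidence({text_col}, '{language_code}') AS confidence\n"
        f"FROM {table};"
    )


# Hypothetical reviews table with a VARCHAR column holding English text.
sql = sentiment_query("product_reviews", "review_id", "review_text")
print(sql)
```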

How It Works

1. Grant Permissions: An administrator grants the Aurora DB cluster permission to access SageMaker and/or Comprehend via an IAM role.
2. Define a Function: For SageMaker, a database user defines a SQL function that points to a specific model endpoint. For Comprehend, Aurora provides built-in functions for supported actions (e.g., sentiment analysis).
3. Invoke via SQL: The user calls the function within a standard SQL query, passing one or more table columns as inputs.
4. Batch & Predict: Aurora automatically gathers the rows selected by the query, sends them to the AWS ML service in optimized batches, and collects the predictions.
5. Return Results: The predictions are returned to the user as a new column or value within the query results.
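
The invoke, batch, and return steps can be pictured with a small simulation. This is a conceptual sketch only, not how Aurora is implemented: `fake_sentiment` is a made-up stand-in for the remote ML service, the rows are invented, and the batch size of 2 is arbitrary.

```python
def batched(rows, batch_size):
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def enrich_with_predictions(rows, input_col, output_col, predict_batch, batch_size):
    """Call predict_batch once per batch and attach each result as a new column."""
    enriched = []
    for batch in batched(rows, batch_size):
        predictions = predict_batch([row[input_col] for row in batch])
        for row, prediction in zip(batch, predictions):
            # Copy the row so the original data is left untouched.
            enriched.append({**row, output_col: prediction})
    return enriched


calls = []  # records the size of each simulated service call


def fake_sentiment(texts):
    """Stand-in for the ML service: a toy keyword rule, not a real model."""
    calls.append(len(texts))
    return ["POSITIVE" if "great" in t.lower() else "NEUTRAL" for t in texts]


reviews = [
    {"review_id": 1, "review_text": "Great product"},
    {"review_id": 2, "review_text": "Arrived on Tuesday"},
    {"review_id": 3, "review_text": "Works great so far"},
]

result = enrich_with_predictions(
    reviews, "review_text", "sentiment", fake_sentiment, batch_size=2
)
for row in result:
    print(row)
```

Three rows with a batch size of 2 lead to two service calls instead of three, which is the point of batching: fewer round trips for the same number of predictions.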

Key Benefits

Real-Time Predictions: Because Aurora calls the ML services directly and batches requests, latency is low enough to enrich application data with ML predictions in real time.
Simplified Architecture: No need for "middleman" applications or complex data pipelines to move data for inference.
Improved Security: Data doesn't have to leave your VPC to get predictions (when using VPC Endpoints for the ML services), enhancing your security posture.
Ease of Use: Any developer or DBA who knows SQL can add ML capabilities to an application without needing deep ML expertise.

Supported Engines:
- Aurora MySQL (version 2.07.0 and higher, compatible with MySQL 5.7)
- Aurora PostgreSQL (version 1.1 and higher, compatible with PostgreSQL 10 and 11+)
Pricing: There is no additional charge for the Aurora Machine Learning feature itself. You only pay for the underlying usage of the SageMaker or Comprehend services that you invoke.