
AWS S3 SELECT, Aurora Multi-Master & Serverless, DynamoDB Backup & Global Tables, and Neptune Overview

The article introduces AWS S3 SELECT for in‑object SQL filtering, highlights Aurora's new Multi‑Master and Serverless capabilities, describes DynamoDB's on‑demand backup and Global Tables, and provides a brief overview of the managed graph database Neptune, emphasizing performance and cost benefits.

Liulishuo Tech Team

Storage

As more users adopt S3 as a data lake, AWS has released S3 SELECT, which pushes a SQL SELECT statement down to the S3 object so that only the matching data is returned. This reduces data transfer and can deliver up to a fourfold performance improvement.

# Official sample code
import boto3
from s3select import ResponseHandler

class PrintingResponseHandler(ResponseHandler):
    def handle_records(self, record_data):
        print(record_data.decode('utf-8'))

handler = PrintingResponseHandler()
s3 = boto3.client('s3')
response = s3.select_object_content(
    Bucket="super-secret-reinvent-stuff",
    Key="stuff.csv",
    SelectRequest={
        'ExpressionType': 'SQL',
        'Expression': 'SELECT s._1 FROM S3Object AS s',
        'InputSerialization': {
            'CompressionType': 'NONE',
            'CSV': {
                'FileHeaderInfo': 'IGNORE',
                'RecordDelimiter': '\n',
                'FieldDelimiter': ',',
            }
        },
        'OutputSerialization': {
            'CSV': {
                'RecordDelimiter': '\n',
                'FieldDelimiter': ',',
            }
        }
    }
)
handler.handle_response(response['Body'])
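The sample above targets the preview SDK. As a hedged sketch, the same query against the released boto3 API might look like the following: the request fields are passed as top-level keyword arguments, and the result arrives as an event stream under `Payload`. The helper names `build_select_kwargs`, `collect_records`, and `run_select` are illustrative assumptions and do not appear in the official sample.

```python
def build_select_kwargs(bucket, key, expression):
    """Build keyword arguments for the S3 client's select_object_content call."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {
            "CompressionType": "NONE",
            "CSV": {
                "FileHeaderInfo": "IGNORE",
                "RecordDelimiter": "\n",
                "FieldDelimiter": ",",
            },
        },
        "OutputSerialization": {
            "CSV": {"RecordDelimiter": "\n", "FieldDelimiter": ","},
        },
    }


def collect_records(event_stream):
    """Join the payload bytes of every Records event and decode them as UTF-8."""
    chunks = []
    for event in event_stream:
        # Other event types (for example Stats and End) carry no row data.
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


def run_select(s3_client, bucket, key, expression):
    """Run the query with a boto3 S3 client and return the selected rows as text."""
    response = s3_client.select_object_content(
        **build_select_kwargs(bucket, key, expression)
    )
    return collect_records(response["Payload"])
```

With credentials configured, a call such as `run_select(boto3.client("s3"), "super-secret-reinvent-stuff", "stuff.csv", "SELECT s._1 FROM S3Object AS s")` would return the first column of the CSV object as text.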

The preview currently supports CSV and JSON objects, and support for more formats is expected. S3 SELECT is free during the preview. For Athena users, it can significantly reduce the amount of data scanned and therefore lower costs.

The surrounding ecosystem also needs to adopt the feature; vendors such as Cloudera and Databricks are expected to add support.

Glacier offers a similar capability called Glacier SELECT, which pushes SELECT logic to Glacier for faster data access.

Databases

Database updates include:

Aurora now supports Multi‑Master and Serverless.

DynamoDB adds on‑demand backup/recovery and Global Tables.

Amazon Neptune, a managed graph database, has been launched.

Aurora Multi‑Master

Previously, Aurora allowed only a single master. It now supports multiple masters across Availability Zones, which provides automatic failover with zero downtime. Write throughput is expected to improve, but the latency cost of conflict resolution remains to be seen.

Aurora Serverless

Aurora Serverless offers true on‑demand, pay‑as‑you‑go OLTP capabilities, automatically scaling compute resources based on load without user‑managed nodes.

For more Aurora details, see the AWS Database team session “Deep Dive on the Amazon Aurora MySQL version”.

DynamoDB

DynamoDB now offers a one‑click backup feature that can back up tables of any size in seconds, even up to petabyte scale.
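As a minimal sketch, an on-demand backup can also be requested programmatically through boto3's `create_backup` call on the DynamoDB client. The helper name `backup_table` and the table and backup names in the usage note are hypothetical.

```python
def backup_table(dynamodb_client, table_name, backup_name):
    """Request an on-demand backup and return its ARN and reported status."""
    response = dynamodb_client.create_backup(
        TableName=table_name,
        BackupName=backup_name,
    )
    details = response["BackupDetails"]
    return details["BackupArn"], details["BackupStatus"]
```

Assuming credentials are configured, `backup_table(boto3.client("dynamodb"), "orders", "orders-snapshot")` would start a backup of a hypothetical `orders` table.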

Global Tables provide cross‑region data availability with automatic conflict resolution, akin to an AWS‑managed version of Google Spanner.
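Global Tables reconcile concurrent updates with a last-writer-wins rule. The sketch below is a simplified local illustration of that idea, not DynamoDB's actual implementation. The `Version` type, its fields, the region names, and the tie-break on region name are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # hypothetical write time, in seconds
    region: str       # hypothetical region where the write originated


def resolve(a, b):
    """Pick the later write; ties fall back to region name so replicas agree."""
    return max(a, b, key=lambda v: (v.timestamp, v.region))


def merge_replicas(*replicas):
    """Merge per-region {key: Version} maps into one converged view."""
    merged = {}
    for replica in replicas:
        for key, version in replica.items():
            if key in merged:
                merged[key] = resolve(merged[key], version)
            else:
                merged[key] = version
    return merged
```

Because `resolve` depends only on the two versions being compared, merging the replicas in any order yields the same result, which is the property that lets every region converge on the same item.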

Amazon Neptune

Neptune is a fully managed graph database with standard access interfaces, positioned as the “Aurora of the graph database space”.

[1] https://www.youtube.com/watch?v=rPmKo2g9znA

Stay tuned for more engineer insights in the coming days.
