
How the New SLS Data Processing Boosts Performance, Cuts Cost, and Simplifies Debugging with SPL

This article explains how Alibaba Cloud's SLS data processing resolves the tension between simple log collection and the need for structured, analyzable data by introducing a unified SPL syntax, delivering over tenfold performance gains, reducing costs to one‑third, and providing powerful debugging tools for cloud‑native log analytics.

Alibaba Cloud Observability

Jul 31, 2024


In system development and operations, logs are essential, but they create a tension: applications want to emit and collect logs as simply as possible, while analysis needs data that is structured and stored in a consistent format.

High‑performance data pipelines such as Alibaba Cloud SLS or Kafka handle the collection side. SLS data processing then turns the collected logs into normalized data for downstream business analysis.

Typical Data‑Processing Scenarios

Normalization: Extract key information from raw logs and convert it to a structured format.

Enrichment: Join additional details (e.g., product info) to raw IDs.

Masking: Comply with data‑security regulations by anonymizing sensitive fields.

Splitting: Separate combined log entries into individual records before analysis.

Distribution: Route different data types to specific targets for downstream use.
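
The five scenarios above can be sketched in plain Python to make the data flow concrete. This is only an illustration, not SLS or SPL code: the log format, the `PRODUCTS` lookup table, the target names, and every function name are assumptions made up for the example.

```python
import json
import re

# Hypothetical raw log line; the format is an assumption for illustration.
RAW = '2024-07-31 12:00:00 ERROR user=alice phone=13812345678 items=["sku1","sku2"]'

LINE_RE = re.compile(
    r"(?P<time>\S+ \S+) (?P<level>\w+) user=(?P<user>\S+) "
    r"phone=(?P<phone>\d+) items=(?P<items>\[.*\])"
)

# Hypothetical lookup table used for enrichment.
PRODUCTS = {"sku1": "keyboard", "sku2": "mouse"}


def normalize(raw):
    """Normalization: extract named fields from the raw text."""
    match = LINE_RE.match(raw)
    if match is None:
        raise ValueError("unparseable log line")
    return match.groupdict()


def mask(record):
    """Masking: keep only the last four digits of the phone number."""
    masked = dict(record)
    phone = masked["phone"]
    masked["phone"] = "*" * (len(phone) - 4) + phone[-4:]
    return masked


def split(record):
    """Splitting: emit one record per item in the combined entry."""
    out = []
    for item in json.loads(record["items"]):
        rec = {k: v for k, v in record.items() if k != "items"}
        rec["item"] = item
        out.append(rec)
    return out


def enrich(record):
    """Enrichment: join a product name onto the raw item ID."""
    return {**record, "product": PRODUCTS.get(record["item"], "unknown")}


def route(record):
    """Distribution: choose a target by log level (target names are made up)."""
    return "error-logstore" if record["level"] == "ERROR" else "default-logstore"


def process(raw):
    """Run all five steps and return (target, record) pairs."""
    records = split(mask(normalize(raw)))
    return [(route(r), enrich(r)) for r in records]
```

Calling `process(RAW)` yields one record per item, each with the phone number masked, a product name attached, and a target chosen from the log level.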

Improvements in the New Data‑Processing Engine

Integrated SPL with a unified syntax, offering line‑by‑line debugging and IDE‑like code hints.

Performance boost of over 10×, handling massive data spikes more smoothly.

Cost reduction to one‑third of the previous version.

New Engine Architecture

The engine hosts real‑time consumption tasks, uses SPL rules to process logs, and writes results to target Logstores. A scheduler launches one or more instances per task; each instance consumes one or more source shards, scaling elastically up to the number of source shards.
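
The scaling limit described above (each instance consumes at least one shard, so there is never more than one instance per shard) can be illustrated with a small Python sketch. The round‑robin assignment and the function name `plan_instances` are assumptions for illustration; the article does not say how the scheduler actually distributes shards.

```python
def plan_instances(shard_ids, desired_instances):
    """Spread source shards over instances, capped at one instance per shard.

    Round-robin assignment is an assumption, not the documented SLS behavior.
    """
    if not shard_ids:
        return []
    # Never run more instances than there are shards, and always at least one.
    count = max(1, min(desired_instances, len(shard_ids)))
    assignments = [[] for _ in range(count)]
    for index, shard in enumerate(sorted(shard_ids)):
        assignments[index % count].append(shard)
    return assignments
```

For example, asking for ten instances on a three‑shard source still produces three instances, each consuming a single shard.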

SPL vs. Legacy DSL

SPL uses a shell‑like pipe syntax, which is less verbose than the Python‑subset legacy DSL. A filter and the general command form look like this:

| where field='ERROR'

| cmd arg1, arg2

SPL also preserves temporary field types and supports built‑in SQL functions.
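
To show why pipe‑style composition and preserved field types are convenient, here is a rough Python model of a pipeline. It is not SPL and does not reflect the engine's implementation: the helpers `where`, `extend`, and `pipeline` are illustrative Python functions, and "preserving temporary field types" is read here as "a computed field keeps its numeric type for later stages", which is one interpretation of the article's claim.

```python
from functools import reduce


def where(predicate):
    """Stage that keeps only records matching the predicate."""
    return lambda records: (r for r in records if predicate(r))


def extend(name, fn):
    """Stage that adds a computed field, keeping the computed value's type."""
    return lambda records: ({**r, name: fn(r)} for r in records)


def pipeline(*stages):
    """Compose stages left to right, like commands joined with '|'."""
    return lambda records: reduce(lambda acc, stage: stage(acc), stages, records)


# Raw log fields arrive as strings in this sketch.
logs = [
    {"level": "ERROR", "latency_ms": "1500"},
    {"level": "INFO", "latency_ms": "20"},
    {"level": "ERROR", "latency_ms": "300"},
]

run = pipeline(
    where(lambda r: r["level"] == "ERROR"),
    # Cast once; latency_s is a float from this stage onward.
    extend("latency_s", lambda r: int(r["latency_ms"]) / 1000),
    # The later stage compares the float directly, with no second cast.
    where(lambda r: r["latency_s"] > 1),
)

result = list(run(logs))
```

Each stage receives the output of the previous one, so a type conversion made once in an early stage does not have to be repeated downstream.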

Debugging Features

Run: execute the entire SPL script.

Debug: start debugging, pause at the first breakpoint, then step line‑by‑line.

Next Breakpoint / Next Line: resume execution until the next breakpoint, or advance by a single line.

Stop: terminate debugging.

Breakpoints are set by clicking the gutter next to line numbers; clicking again removes them.

Future Iterations

Upcoming iterations will add support for data distribution, enrichment, IP parsing, and cross‑region synchronization, and will provide seamless migration of legacy DSL tasks to SPL through AST‑based translation.

