
How to Seamlessly Fuse Multi‑Cloud Data with Alibaba SLS Object Import

This guide explains the challenges of multi‑cloud log aggregation and shows how the Object Import feature of Alibaba Cloud Log Service (SLS) addresses them. A two‑stage parallel architecture, smart file discovery, elastic scaling, and support for a range of file and compression formats enable fast, reliable data integration from OSS and S3. Step‑by‑step walkthroughs and real‑world use cases, such as billing and operation audits, show the feature in practice.

Alibaba Cloud Native

Challenges in Multi‑Cloud Log Integration

Timely file discovery: Object storage APIs offer only bucket traversal, with no way to list objects by time, so newly added files are hard to detect among billions of objects.

Elastic scaling: Log volume fluctuates with business cycles. Without auto‑scaling, resources are either wasted during low traffic or insufficient during peaks, which causes ingestion latency.

Heterogeneous data formats: Different clouds and services emit logs with varying schemas and file types, so the data must be transformed into a unified form.

SLS Object Import Architecture

File discovery stage: Combines several strategies to capture all new objects while minimizing overhead.

Data pulling stage: Pulls file contents at high speed, independently of discovery.

Parallel execution: Both stages run concurrently, eliminating the discovery bottleneck of traditional pipelines.
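The two-stage design can be pictured as a producer/consumer pipeline. The following is a minimal sketch under stated assumptions, not SLS's actual implementation: `run_pipeline`, `list_new_keys`, and `fetch` are hypothetical names, and an in-memory queue stands in for whatever hand-off mechanism the service uses.

```python
import queue
import threading

_DONE = object()  # sentinel telling a pull worker to stop


def run_pipeline(list_new_keys, fetch, workers=4):
    """Run discovery and pulling concurrently; return {key: content}.

    list_new_keys: hypothetical callable yielding newly discovered object keys.
    fetch: hypothetical callable that downloads one object by key.
    """
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def discover():
        # Discovery stage: publish keys as soon as they are found.
        for key in list_new_keys():
            tasks.put(key)
        for _ in range(workers):
            tasks.put(_DONE)

    def pull():
        # Pulling stage: starts working before discovery has finished.
        while True:
            key = tasks.get()
            if key is _DONE:
                break
            data = fetch(key)
            with lock:
                results[key] = data

    threads = [threading.Thread(target=discover)]
    threads += [threading.Thread(target=pull) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the pull workers consume keys while discovery is still listing, a slow traversal delays only the files it has not yet reached rather than the whole batch.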

Fast New‑File Detection (≤ 1 minute)

Periodic full‑bucket traversal as a reliable baseline.

Incremental traversal in lexicographic (dictionary) order for buckets whose object names increase monotonically.

Leverage OSS metadata indexing to accelerate detection in Alibaba Cloud OSS.

Enable AWS SQS notifications for S3 to receive instant change events.
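Incremental traversal relies on the fact that object listings come back in lexicographic key order, so a saved checkpoint key is enough to resume from. The sketch below simulates this over an in-memory sorted list; `list_after` and `discover_incremental` are hypothetical helpers, with `list_after` standing in for a paged listing call that accepts a start marker (comparable in spirit to the `StartAfter` parameter of S3's ListObjectsV2). How SLS stores its checkpoint is not described in this article.

```python
import bisect


def list_after(sorted_keys, marker, limit=1000):
    """Simulated listing call: up to `limit` keys strictly greater than `marker`."""
    start = bisect.bisect_right(sorted_keys, marker)
    return sorted_keys[start:start + limit]


def discover_incremental(sorted_keys, checkpoint):
    """Return (new_keys, new_checkpoint) for keys that sort after `checkpoint`."""
    new_keys = []
    marker = checkpoint
    while True:
        page = list_after(sorted_keys, marker)
        if not page:
            break
        new_keys.extend(page)
        marker = page[-1]  # the last key seen becomes the next start marker
    return new_keys, marker
```

This only finds every new file when new keys always sort after existing ones. A key written "in the past" of the checkpoint would be skipped, which is why a periodic full traversal remains the baseline.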

Supported File and Compression Formats

Log formats: single‑line text, multi‑line text, single‑line JSON, CSV, ORC, Parquet.

Compression: zip, gzip, zstd, lz4, snappy, bzip2, deflate.
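To illustrate how a compression setting and a log format combine, here is a small sketch that decodes a single-line JSON file. It covers only the codecs available in Python's standard library (gzip, bzip2, deflate); zstd, lz4, and snappy would need third-party packages and are left out. `read_json_lines` and the codec names are hypothetical and do not mirror SLS configuration values.

```python
import bz2
import gzip
import json
import zlib

# Codec name -> decompression function. Standard-library codecs only.
DECOMPRESSORS = {
    "none": lambda payload: payload,
    "gzip": gzip.decompress,
    "bzip2": bz2.decompress,
    "deflate": zlib.decompress,  # assumption: zlib-wrapped deflate stream
}


def read_json_lines(payload: bytes, compression: str = "none"):
    """Decompress `payload` and parse it as single-line JSON (one object per line)."""
    if compression not in DECOMPRESSORS:
        raise ValueError(f"compression not covered by this sketch: {compression}")
    text = DECOMPRESSORS[compression](payload).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```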

Elastic Scaling for Bursty Workloads

The import task automatically adjusts concurrency based on the number of files. Small files receive more parallel pull workers, while the system scales down during low‑traffic periods to avoid resource waste.
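The scaling rule itself is not published in this article, so the following is only an illustration of the idea: derive a worker count from the backlog of pending files and clamp it to a range. `target_workers` and every threshold in it are made-up values.

```python
import math


def target_workers(pending_files, files_per_worker=50, min_workers=1, max_workers=64):
    """Pick a pull concurrency proportional to the backlog, clamped to a range.

    All parameters are hypothetical; SLS's real policy is not documented here.
    """
    wanted = math.ceil(pending_files / files_per_worker)
    return max(min_workers, min(max_workers, wanted))
```

With these sample values, an empty backlog keeps a single worker alive, while a burst of many small files quickly reaches the upper bound.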

OSS Import Workflow

Grant the target Alibaba Cloud account the required OSS read permissions.

In the SLS console, create a new OSS import task and select the destination logstore (e.g., oss‑ingestion‑test).

Choose the source bucket region (e.g., Hangzhou) and the bucket name.

Specify a file‑path prefix (e.g., ingestion‑test/2025/04) to import only the desired directory.

Select the data format (e.g., single‑line JSON) and compression (e.g., snappy).

Optionally disable periodic checks for one‑time historical imports.

Preview the import result; if satisfactory, create the task.

By default, the log timestamp is the ingestion time. To take the timestamp from a field in the data instead, specify that field (e.g., __time__) and describe its format (e.g., epoch) in the “Time Format” setting.
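The timestamp rule can be sketched as a small function: use the ingestion time unless a time field is configured and parseable. `resolve_log_time` is a hypothetical helper; the fallback to ingestion time on a parse failure and the treatment of timezone-less values as UTC are assumptions, not documented SLS behavior.

```python
import time
from datetime import datetime, timezone


def resolve_log_time(record, time_field=None, time_format="epoch", now=None):
    """Return the log time in epoch seconds.

    time_format is either "epoch" or a strptime pattern such as "%Y-%m-%d %H:%M:%S".
    `now` lets callers inject the ingestion time (useful for testing).
    """
    ingestion_time = int(now if now is not None else time.time())
    if not time_field or time_field not in record:
        return ingestion_time
    raw = record[time_field]
    try:
        if time_format == "epoch":
            return int(float(raw))
        parsed = datetime.strptime(str(raw), time_format)
    except (ValueError, TypeError):
        return ingestion_time  # assumption: unparseable values fall back
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return int(parsed.timestamp())
```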

S3 Import Workflow

Provide an AWS AccessKey ID and SecretAccessKey with read access to the target S3 bucket.

Create a new S3 import task in SLS, select the destination logstore (e.g., s3‑ingestion‑test), and set the bucket region (e.g., ap‑northeast‑1).

Enable SQS notifications when the bucket contains more than one million objects; otherwise the system falls back to full traversal.

Define file‑path prefix, optional regex filters, and time‑range filters as needed.

Configure data format, compression, and encoding, then preview and create the task.
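The three filters in the steps above narrow the set of objects an import task will pull. The sketch below shows one plausible way they compose. `select_objects` is a hypothetical helper, and applying the time range to each object's last-modified time with a half-open interval is an assumption, since the article does not say which timestamp the filter uses.

```python
import re
from datetime import datetime, timezone


def select_objects(objects, prefix="", pattern=None, start=None, end=None):
    """Filter (key, last_modified) pairs by prefix, regex, and time range.

    The time range is half-open: start <= last_modified < end.
    """
    regex = re.compile(pattern) if pattern else None
    selected = []
    for key, modified in objects:
        if not key.startswith(prefix):
            continue
        if regex and not regex.search(key):
            continue
        if start and modified < start:
            continue
        if end and modified >= end:
            continue
        selected.append(key)
    return selected
```

Applying the prefix first is the cheap step, since object storage can serve prefix listings directly, while the regex and time checks have to run on each listed key.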

Use Cases

Cross‑Cloud Billing Audit

Alibaba Cloud billing data:

* | project-rename product=ProductCode
| extend cost=PretaxGrossAmount
| extend originProduct='aliyun'
| project product, cost, originProduct

AWS billing data:

* | project-rename product=line_item_product_code
| extend cost=pricing_public_on_demand_cost
| extend originProduct='aws'
| project product, cost, originProduct

Aggregated queries can compute daily costs per provider or per product, applying currency conversion as needed.
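For readers who prefer to see the normalization outside of SPL, here is a Python sketch of the same idea: map each provider's fields onto a shared schema (product, cost, originProduct), then total the cost per provider. The field names come from the queries in this section; `normalize`, `total_by_provider`, and the conversion rates passed in are hypothetical placeholders.

```python
from collections import defaultdict

# Source field names per provider, taken from the queries in this section.
FIELD_MAP = {
    "aliyun": ("ProductCode", "PretaxGrossAmount"),
    "aws": ("line_item_product_code", "pricing_public_on_demand_cost"),
}


def normalize(record, origin):
    """Map a provider-specific billing record onto the shared schema."""
    if origin not in FIELD_MAP:
        raise ValueError(f"unknown provider: {origin}")
    product_field, cost_field = FIELD_MAP[origin]
    return {
        "product": record[product_field],
        "cost": float(record[cost_field]),
        "originProduct": origin,
    }


def total_by_provider(rows, rates):
    """Sum cost per provider after converting to one currency.

    rates: hypothetical {provider: multiplier}; supply real exchange rates.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["originProduct"]] += row["cost"] * rates[row["originProduct"]]
    return dict(totals)
```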

Cross‑Cloud Operation Audit

AWS operation logs:

* | expand-values -path='$.Records' content as item
| parse-json item
| project-away item
| extend originProduct='aws'

Alibaba Cloud operation logs:

* | parse-json event
| project-away event
| extend originProduct='aliyun'

Typical analyses include hourly operation counts per product, detection of delete actions, and alerting on risky operations.
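The same flattening can be sketched in Python to make the data shapes explicit: the AWS file content holds a JSON document with a Records array, while each Alibaba Cloud log carries a JSON string in its event field. `expand_aws`, `expand_aliyun`, and `delete_actions` are hypothetical helpers that only approximate the queries in this section, and the use of an `eventName` field on the Alibaba Cloud side is an assumption.

```python
import json
from collections import Counter


def expand_aws(content):
    """Turn one file's `content` JSON into one tagged row per entry in Records."""
    records = json.loads(content)["Records"]
    return [dict(item, originProduct="aws") for item in records]


def expand_aliyun(log):
    """Lift the JSON in the `event` field to top-level fields and drop `event`."""
    fields = {k: v for k, v in log.items() if k != "event"}
    fields.update(json.loads(log["event"]))
    fields["originProduct"] = "aliyun"
    return fields


def delete_actions(events):
    """Count events per provider whose eventName starts with 'Delete'."""
    return Counter(
        e["originProduct"]
        for e in events
        if str(e.get("eventName", "")).lower().startswith("delete")
    )
```

Once both sources share `originProduct` and a common action field, a single counter or alert rule can cover delete operations across clouds.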

Best Practices

Compress large files with zstd (≈ 3.5× size reduction) to lower public‑network transfer costs.

When a bucket exceeds one million objects, use dictionary‑order naming for new files to guarantee discovery within two minutes.

Prefer many small files over few large ones; SLS allocates one concurrent pull per file, so smaller files increase overall throughput.

Organize logs into domain‑specific directories and run multiple import tasks in parallel for near‑real‑time ingestion.

Never append data to an existing file; write new data to a new file instead. SLS treats each new file as a distinct event to avoid duplicate ingestion.
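Two of these practices, dictionary-order naming and writing new files instead of appending, can be combined in the object key itself. The sketch below builds keys whose lexicographic order matches creation order by using zero-padded UTC timestamps and a zero-padded sequence number. `object_key`, the directory layout, and the `.json.zst` suffix are illustrative assumptions rather than a naming scheme required by SLS.

```python
from datetime import datetime, timezone


def object_key(prefix, created, sequence):
    """Build a key that sorts lexicographically in creation order.

    Fixed-width fields guarantee that string order equals
    chronological order. `sequence` separates files written in the same second.
    """
    stamp = created.astimezone(timezone.utc).strftime("%Y/%m/%d/%Y%m%dT%H%M%SZ")
    return f"{prefix}/{stamp}-{sequence:06d}.json.zst"
```

Each batch of logs gets a fresh key, so no existing object is ever modified, and every new key sorts after all earlier ones, which is the property incremental traversal depends on.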

Conclusion

SLS Object Import provides a cloud‑native solution for unifying multi‑cloud log data. By separating file discovery from data pulling, supporting parallel execution, and offering elastic scaling, it achieves near‑real‑time visibility of new data from OSS and S3 while handling a wide range of file formats and compressions. Future releases will extend support to additional cloud providers and more complex ingestion scenarios.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

