How to Secure Cloud Logs: End‑to‑End Sensitive Data Scanning and Masking on Alibaba Cloud

This guide walks through why enterprises need to scan and protect sensitive log data, explains the regulatory background, and provides a step‑by‑step implementation on Alibaba Cloud using Data Security Center, Logtail, SPL, Ingest Processors, SDKs, StoreView queries and periodic scans to achieve comprehensive data security and governance.

Alibaba Cloud Native

Background

Scanning and protecting sensitive data (personal identifiers, financial records, medical information) in log assets helps prevent unauthorized access, meet regulatory requirements such as GDPR, and improve overall data governance.

Solution Architecture

The solution uses Alibaba Cloud Data Security Center (DSC) to discover and classify sensitive data in Log Service (SLS) logs. It consists of two phases: an integration phase that configures data collection and desensitization, and an iteration/optimization phase that refines the desensitization rules based on DSC scan results.

Integration Phase

Analyze sample logs: Manually inspect a subset of logs to identify sensitive fields (e.g., name, email, IP, credit‑card, phone) and define industry‑specific tags.

SLS desensitization options:

Logtail‑side desensitization (Data Flow 1) – configure Logtail with a processing plugin that masks fields on the client, using regular expressions or SPL, before the logs leave the host.

Logtail + Ingest Processor (Data Flow 2) – collect raw logs with Logtail, then apply SPL‑based masking in an Ingest Processor on the server side.

SDK + Ingest Processor (Data Flow 3) – write logs via Alibaba Cloud SDKs (Java, Python, Go, etc.) and configure the same SPL rules in an Ingest Processor.

DSC sensitive scanning: Authorize the SLS logstore in the DSC console, create a custom identification task, select the asset range and time window, and run the scan. The scan returns sensitivity levels and sample data for each field.

Iterative Optimization

Based on the scan results, continuously adjust the desensitization rules and schedule periodic scans (daily, weekly, or monthly) so that protection stays up to date as new log types appear.

Mock Data Construction

A synthetic e‑commerce transaction log is generated with Mockaroo. Example JSON record:

{"customer_name":"Clarita Bassick","product_name":"Wine - Rhine Riesling Wolf Blass","quantity":88,"price":865.08,"purchase_date":"11/21/2024","shipping_address":"4 Twin Pines Terrace","payment_method":"PayPal","transaction_id":1,"delivery_status":"In Transit","email":"[email protected]","ip_address":"199.224.149.82","creadit_card":"56022548472842990","phone_number":"+27 380 246 6745"}

These logs are ingested into SLS via Logtail or SDK.
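The manual sample analysis from the integration phase can be partly automated before any cloud configuration is touched. The following is a minimal sketch, assuming a handful of illustrative regular expressions; the `DETECTORS` table and `find_sensitive_fields` helper are hypothetical names and do not reproduce DSC's identification rules. The email value is a placeholder, because the address in the sample record above is redacted.

```python
import json
import re

# Illustrative detectors only (assumption): these are not DSC's built-in rules.
DETECTORS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "ipv4": re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$"),
    "card_number": re.compile(r"^\d{13,19}$"),
    "phone": re.compile(r"^\+\d{1,3}(?: \d{2,4}){2,4}$"),
}


def find_sensitive_fields(record):
    """Return {field_name: detector_name} for string values that match a detector."""
    hits = {}
    for field, value in record.items():
        if not isinstance(value, str):
            continue  # numeric fields such as quantity or price are skipped
        for name, pattern in DETECTORS.items():
            if pattern.match(value):
                hits[field] = name
                break
    return hits


# A trimmed copy of the mock record; field names are kept exactly as generated.
sample = json.loads("""
{"customer_name": "Clarita Bassick", "quantity": 88,
 "ip_address": "199.224.149.82", "creadit_card": "56022548472842990",
 "phone_number": "+27 380 246 6745", "email": "user@example.com"}
""")

print(find_sensitive_fields(sample))
```

Free‑text fields such as customer_name do not follow a fixed pattern, so they still need manual tagging or a dictionary‑based rule.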

Desensitization Implementations

Logtail Side‑Desensitization

Configure Logtail with a JSON parsing plugin followed by a data‑desensitization plugin. The plugin masks fields according to the rules defined in the integration step.

SPL‑Based Desensitization

Use SPL pipelines to parse JSON and replace sensitive values with masked patterns:

* | parse-json content
| extend creadit_card=regexp_replace(creadit_card, '\d{12}$','***********')
| extend customer_name=regexp_replace(customer_name, '\S{4}$','******')
| extend phone_number=regexp_replace(phone_number, '\S{3}\s\S{4}$','*** ****')
| extend ip_address=regexp_replace(ip_address, '\d{1,3}\.\d{1,3}$','**.**')
| extend email=regexp_replace(email, '\S+@','****@')
| project-away content
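Before deploying rules like these, it can help to check them against sample values locally. The sketch below mirrors the five masking rules with Python's `re` module; `MASK_RULES` and `mask_record` are hypothetical names, the dot in the IP rule is escaped here so that it matches a literal dot, and the regex engine used by SLS may differ from Python's in edge cases. The email value is a placeholder.

```python
import re

# (field, pattern, replacement) triples mirroring the SPL masking rules.
MASK_RULES = [
    ("creadit_card", r"\d{12}$", "***********"),
    ("customer_name", r"\S{4}$", "******"),
    ("phone_number", r"\S{3}\s\S{4}$", "*** ****"),
    ("ip_address", r"\d{1,3}\.\d{1,3}$", "**.**"),
    ("email", r"\S+@", "****@"),
]


def mask_record(record):
    """Return a copy of the record with every configured field masked."""
    masked = dict(record)
    for field, pattern, replacement in MASK_RULES:
        if field in masked:
            masked[field] = re.sub(pattern, replacement, str(masked[field]))
    return masked


record = {
    "customer_name": "Clarita Bassick",
    "creadit_card": "56022548472842990",
    "phone_number": "+27 380 246 6745",
    "ip_address": "199.224.149.82",
    "email": "user@example.com",  # placeholder address
}

for field, value in mask_record(record).items():
    print(field, "=", value)
```

Note that the replacement strings have a fixed length, so the masked value does not reveal how many characters were hidden.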

Logtail + Ingest Processor

Collect raw logs with Logtail, then attach an Ingest Processor to the target logstore and paste the same SPL statements. This offloads desensitization to the server side, preserving client performance.

SDK + Ingest Processor

Write logs using Alibaba Cloud SDKs (Java, Python, Go, etc.) and configure the SPL rules in an Ingest Processor attached to the logstore. The processing flow is identical to the Logtail + Ingest Processor case.
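As a rough illustration of the write path, the sketch below assumes the `aliyun-log-python-sdk` package and its `LogClient`, `LogItem`, and `PutLogsRequest` classes; the endpoint, credentials, project, and logstore arguments are placeholders supplied by the caller, and `build_contents` and `send_record` are hypothetical helper names. The record is packed into a single `content` field so that the `parse-json content` step of the SPL rules can unpack it on the server side.

```python
import json
import time


def build_contents(record):
    """Pack the whole record as one JSON string in a `content` field."""
    return [("content", json.dumps(record, ensure_ascii=False))]


def send_record(record, endpoint, access_key_id, access_key, project, logstore):
    """Write one raw record to SLS; masking is left to the Ingest Processor."""
    # Imported lazily so build_contents works without the SDK installed.
    # Assumption: the aliyun-log-python-sdk package provides these classes.
    from aliyun.log import LogClient, LogItem, PutLogsRequest

    client = LogClient(endpoint, access_key_id, access_key)
    item = LogItem(timestamp=int(time.time()), contents=build_contents(record))
    request = PutLogsRequest(project, logstore, topic="", source="", logitems=[item])
    return client.put_logs(request)
```

Because the client sends raw values, transport security and access control on the write path still matter; only the stored copy is masked.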

Sensitive Data Scanning with DSC

Create Scan Task

In DSC, authorize the SLS logstore, select a built‑in or custom identification template (e.g., Internet industry template), and launch a scan.

Pre‑ and Post‑Desensitization Comparison

Before desensitization, scans reveal fields such as email, IP, phone, and credit‑card numbers. After applying the desensitization rules, only non‑sensitive artifacts (e.g., internal IP) are reported, demonstrating effective privacy protection.
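The same before/after comparison can be approximated locally as a regression check whenever the masking rules change. This is a minimal sketch under stated assumptions: the `PATTERNS` table and `count_findings` helper are hypothetical, the patterns are illustrative rather than DSC's templates, and the sample lines are hand‑written stand‑ins for raw and masked log values.

```python
import re

# Illustrative patterns only (assumption): DSC's templates are not reproduced here.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "card_number": re.compile(r"\b\d{13,19}\b"),
    "phone": re.compile(r"\+\d{1,3}(?: \d{2,4}){3}"),
}


def count_findings(lines):
    """Count pattern matches per category across all lines."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            counts[name] += len(pattern.findall(line))
    return counts


raw = ["user@example.com 199.224.149.82 56022548472842990 +27 380 246 6745"]
masked = ["****@example.com 199.224.**.** 56022*********** +27 380 *** ****"]

print("before:", count_findings(raw))
print("after: ", count_findings(masked))
```

A non‑zero count after masking points to a rule that misses some value shapes and should be revisited before the next scheduled scan.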

Periodic Scanning

Configure recurring scan tasks in DSC to run automatically at chosen intervals, ensuring continuous monitoring of newly ingested logs.

Query‑Time Desensitization via StoreView

StoreView creates a virtual view that joins multiple logstores. By adding the same SPL masking statements to the StoreView query, analysts can retrieve combined data without exposing raw sensitive fields.

Cross‑Store Query Example

Join advertising logs with transaction logs in a StoreView and apply SPL masking to hide email, credit‑card, and phone numbers before returning results to business users.

Summary

By combining Alibaba Cloud Data Security Center, Log Service, Logtail, SPL, Ingest Processors, and StoreView, organizations can systematically discover, mask, and monitor sensitive data in cloud logs, achieve regulatory compliance, reduce leakage risk, and maintain data governance while still supporting analytics.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
