How to Secure Cloud Logs: End‑to‑End Sensitive Data Scanning and Masking on Alibaba Cloud
This guide walks through why enterprises need to scan and protect sensitive log data, explains the regulatory background, and provides a step‑by‑step implementation on Alibaba Cloud using Data Security Center, Logtail, SPL, Ingest Processors, SDKs, StoreView queries and periodic scans to achieve comprehensive data security and governance.
Background
Scanning and protecting sensitive data (personal identifiers, financial records, medical information) in log assets helps prevent unauthorized access, meet regulatory requirements such as GDPR, and improve overall data governance.
Solution Architecture
The solution uses Alibaba Cloud Data Security Center (DSC) to discover and classify sensitive data in Log Service (SLS) logs. It consists of two phases: an integration phase that configures data collection and desensitization, and an iteration/optimization phase that refines the desensitization rules based on DSC scan results.
Integration Phase
Analyze sample logs: Manually inspect a subset of logs to identify sensitive fields (e.g., name, email, IP address, credit-card number, phone number) and define industry-specific tags.
SLS desensitization options:
Logtail side‑desensitization (Data Flow 1) – configure Logtail with a processing plugin that masks fields using regular expressions or SPL.
Logtail + Ingest Processor (Data Flow 2) – collect raw logs with Logtail, then apply SPL‑based masking in an Ingest Processor on the server side.
SDK + Ingest Processor (Data Flow 3) – write logs via Alibaba Cloud SDKs (Java, Python, Go, etc.) and configure the same SPL rules in an Ingest Processor.
DSC sensitive scanning: Authorize the SLS logstore in the DSC console, create a custom identification task, select the asset range and time window, and run the scan. The scan returns sensitivity levels and sample data for each field.
Iterative Optimization
Based on scan results, continuously adjust desensitization rules and schedule periodic scans (daily, weekly, or monthly) to keep protection up‑to‑date with new log types.
Mock Data Construction
A synthetic e-commerce transaction log is generated with Mockaroo. An example JSON record:
{"customer_name":"Clarita Bassick","product_name":"Wine - Rhine Riesling Wolf Blass","quantity":88,"price":865.08,"purchase_date":"11/21/2024","shipping_address":"4 Twin Pines Terrace","payment_method":"PayPal","transaction_id":1,"delivery_status":"In Transit","email":"[email protected]","ip_address":"199.224.149.82","creadit_card":"56022548472842990","phone_number":"+27 380 246 6745"}
These logs are ingested into SLS via Logtail or an SDK.
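The manual sample analysis from the integration phase can be partly automated. The following is a minimal sketch, not part of the original solution: it checks each value of a sample record against a few regular expressions and reports which fields look sensitive. The pattern set, the `classify_fields` helper, and the placeholder email address are assumptions made for illustration; free-text fields such as names or street addresses are not caught by patterns like these and still need manual tagging.

```python
import json
import re

# Hypothetical detection patterns; tune them to your own log formats.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "ipv4": re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$"),
    "phone": re.compile(r"^\+\d{1,3}(?: \d{2,4}){2,4}$"),
    "card_number": re.compile(r"^\d{13,19}$"),
}


def classify_fields(record):
    """Return {field_name: pattern_label} for values that match a pattern."""
    hits = {}
    for field, value in record.items():
        text = str(value)
        for label, pattern in PATTERNS.items():
            if pattern.match(text):
                hits[field] = label
                break
    return hits


# A shortened version of the mock record; the email value is a placeholder.
sample = json.loads(
    '{"customer_name": "Clarita Bassick", "quantity": 88, '
    '"email": "user@example.com", "ip_address": "199.224.149.82", '
    '"creadit_card": "56022548472842990", "phone_number": "+27 380 246 6745"}'
)
print(classify_fields(sample))
```

The fields reported here are the candidates for masking rules; fields that are missing from the output but known to be sensitive (such as customer_name) have to be added by hand.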
Desensitization Implementations
Logtail Side‑Desensitization
Configure Logtail with a JSON parsing plugin followed by a data‑desensitization plugin. The plugin masks fields according to the rules defined in the integration step.
SPL‑Based Desensitization
Use SPL pipelines to parse JSON and replace sensitive values with masked patterns:
* | parse-json content
| extend creadit_card=regexp_replace(creadit_card, '\d{12}$','***********')
| extend customer_name=regexp_replace(customer_name, '\S{4}$','******')
| extend phone_number=regexp_replace(phone_number, '\S{3}\s\S{4}$','*** ****')
| extend ip_address=regexp_replace(ip_address, '\d{1,3}\.\d{1,3}$','**.**')
| extend email=regexp_replace(email, '\S+@','****@')
| project-away content
Logtail + Ingest Processor
Collect raw logs with Logtail, then attach an Ingest Processor to the target logstore and paste the same SPL statements. This offloads desensitization to the server side, preserving client performance.
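Before pasting masking rules into an Ingest Processor, it can help to try them on sample records offline. The sketch below is an assumption-laden approximation rather than the SPL engine itself: it applies the same regular expressions with Python's `re` module, whose syntax may differ from SPL's `regexp_replace` in edge cases. The `RULES` table and `mask_record` helper are hypothetical names, the dot in the IP rule is escaped so that it matches a literal dot, and the email value is a placeholder.

```python
import re

# (field, regex, replacement) triples approximating the SPL extend statements.
RULES = [
    ("creadit_card", r"\d{12}$", "*" * 11),
    ("customer_name", r"\S{4}$", "******"),
    ("phone_number", r"\S{3}\s\S{4}$", "*** ****"),
    ("ip_address", r"\d{1,3}\.\d{1,3}$", "**.**"),
    ("email", r"\S+@", "****@"),
]


def mask_record(record):
    """Return a copy of the record with every rule applied to its field."""
    masked = dict(record)
    for field, pattern, replacement in RULES:
        if field in masked:
            masked[field] = re.sub(pattern, replacement, str(masked[field]))
    return masked


raw = {
    "customer_name": "Clarita Bassick",
    "email": "user@example.com",  # placeholder value
    "ip_address": "199.224.149.82",
    "creadit_card": "56022548472842990",
    "phone_number": "+27 380 246 6745",
}
for field, value in mask_record(raw).items():
    print(field, "->", value)
```

Running the rules locally makes it easy to see how much of each value survives, for example that the first two octets of the IP address and the leading digits of the card number remain visible.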
SDK + Ingest Processor
Write logs using Alibaba Cloud SDKs (Java, Python, Go, etc.) and configure the SPL rules in an Ingest Processor attached to the logstore. The processing flow is identical to the Logtail + Ingest Processor case.
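As a rough illustration of the SDK path, the sketch below writes one record with the Python SDK (the `aliyun-log-python-sdk` package). It assumes the record is sent as a single `content` field holding the JSON string, so that SPL rules starting with `parse-json content` apply unchanged; the helper names, the empty topic and source, and all connection parameters are placeholders. The `LogClient`, `LogItem`, and `PutLogsRequest` usage reflects the SDK as commonly documented and should be checked against the installed version.

```python
import json
import time


def to_contents(record):
    """Wrap the record as one 'content' field holding its JSON text."""
    return [("content", json.dumps(record, ensure_ascii=False))]


def send_record(record, endpoint, access_key_id, access_key_secret,
                project, logstore):
    """Send one record to SLS; all arguments are caller-supplied placeholders."""
    # Imported lazily so to_contents() works without the SDK installed.
    from aliyun.log import LogClient, LogItem, PutLogsRequest

    client = LogClient(endpoint, access_key_id, access_key_secret)
    item = LogItem(timestamp=int(time.time()), contents=to_contents(record))
    request = PutLogsRequest(
        project=project,
        logstore=logstore,
        topic="",
        source="",
        logitems=[item],
    )
    return client.put_logs(request)
```

Masking still happens on the server side in this flow: the client sends the raw record, and the Ingest Processor attached to the logstore rewrites it before storage.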
Sensitive Data Scanning with DSC
Create Scan Task
In DSC, authorize the SLS logstore, select a built‑in or custom identification template (e.g., Internet industry template), and launch a scan.
Pre‑ and Post‑Desensitization Comparison
Before desensitization, scans reveal fields such as email, IP, phone, and credit‑card numbers. After applying the desensitization rules, only non‑sensitive artifacts (e.g., internal IP) are reported, demonstrating effective privacy protection.
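A similar before-and-after comparison can be approximated locally as a quick sanity check ahead of a DSC scan. The sketch below is not DSC and does not use its templates: the `DETECTORS` patterns and the `residual_findings` helper are illustrative assumptions, and the sample values are placeholders. It lists every field whose value still matches a sensitive pattern.

```python
import re

# Hypothetical detectors; the email pattern ignores fully starred local parts.
DETECTORS = {
    "email": re.compile(r"[^@\s*]+@[^@\s]+\.[^@\s]+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "card_number": re.compile(r"\b\d{13,19}\b"),
}


def residual_findings(records):
    """Return (record_index, field, label) for every value that still matches."""
    findings = []
    for index, record in enumerate(records):
        for field, value in record.items():
            for label, pattern in DETECTORS.items():
                if pattern.search(str(value)):
                    findings.append((index, field, label))
    return findings


before = [{
    "email": "user@example.com",
    "ip_address": "199.224.149.82",
    "creadit_card": "56022548472842990",
}]
after = [{
    "email": "****@example.com",
    "ip_address": "199.224.**.**",
    "creadit_card": "56022***********",
}]
print("before:", residual_findings(before))
print("after:", residual_findings(after))
```

An empty result for the masked records suggests the rules cover the known patterns, but it does not replace the DSC scan, which applies its own identification templates.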
Periodic Scanning
Configure recurring scan tasks in DSC to run automatically at chosen intervals, ensuring continuous monitoring of newly ingested logs.
Query‑Time Desensitization via StoreView
StoreView creates a virtual view that joins multiple logstores. By adding the same SPL masking statements to the StoreView query, analysts can retrieve combined data without exposing raw sensitive fields.
Cross‑Store Query Example
Join advertising logs with transaction logs in a StoreView and apply SPL masking to hide email, credit‑card, and phone numbers before returning results to business users.
Summary
By combining Alibaba Cloud Data Security Center, Log Service, Logtail, SPL, Ingest Processors, and StoreView, organizations can systematically discover, mask, and monitor sensitive data in cloud logs, achieve regulatory compliance, reduce leakage risk, and maintain data governance while still supporting analytics.
Alibaba Cloud Native
