How Alibaba Cloud SLS Soft Delete Enables Instant, Low‑Cost Data Cleanup
This article explains Alibaba Cloud's Log Service (SLS) soft‑delete feature, describing its mark‑and‑filter mechanism, implementation steps, and real‑world scenarios where it replaces costly hard‑delete or ETL solutions with near‑instant, low‑impact data removal for compliance, emergencies, and test‑data contamination.
Log Service (SLS) is a cloud‑native observability platform that handles log, metric, and trace data at large scale, at low cost, and in real time.
Its built‑in lifecycle management covers typical needs, but some cases are costly and risky to handle with traditional solutions, such as:
Accidentally writing user phone numbers in plaintext to terabytes of logs during deployment.
Test data contaminating production analytics during version upgrades.
Rapid removal of unexpected data after incidents.
SLS introduces a new “soft delete” feature whose performance is close to that of an index query. It addresses emergency deletion and dirty‑data governance, and it takes only a couple of minutes to learn.
What is Soft Delete
Traditional hard delete is not supported because SLS is designed for massive write and query performance; physically removing data requires locating files and indexes, which is costly and disrupts real‑time guarantees in distributed systems.
Soft delete uses a “mark + filter” mechanism: the data remains physically stored but is hidden from users, which keeps the system stable while meeting urgent deletion needs.
Soft Delete Implementation Principle
Soft delete works in two steps, with performance close to that of an index query:
1. Delete operation: a query quickly selects the log rows to delete, and the selected rows are marked as deleted.
2. Query filtering: marked rows are automatically excluded from results. The exclusion takes effect immediately, and normal queries and SQL continue to work.
It is analogous to labeling unwanted items in a house (soft delete) and letting a garbage‑collection service (TTL expiration) physically remove them later, keeping the view clean without immediate cost.
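The mark, filter, and expire flow can be illustrated with a toy in‑memory model. This is only a sketch of the idea, not how SLS is implemented: ToyLogstore, its method names, and the half‑open time range are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLogstore:
    """In-memory stand-in for a logstore (hypothetical; not the SLS implementation)."""
    ttl_seconds: int
    rows: list = field(default_factory=list)    # (timestamp, record) pairs
    deleted: set = field(default_factory=set)   # indexes of rows marked as deleted

    def write(self, timestamp, record):
        self.rows.append((timestamp, record))

    def soft_delete(self, from_time, to_time, predicate):
        """Select rows by time range and predicate, then mark them. Returns rows marked."""
        marked = 0
        for i, (ts, rec) in enumerate(self.rows):
            if from_time <= ts < to_time and predicate(rec) and i not in self.deleted:
                self.deleted.add(i)
                marked += 1
        return marked

    def query(self, predicate=lambda rec: True):
        """Reads skip marked rows, so a deletion is visible immediately."""
        return [rec for i, (ts, rec) in enumerate(self.rows)
                if i not in self.deleted and predicate(rec)]

    def expire(self, now):
        """TTL pass: physically drop rows older than ttl_seconds, marked or not."""
        cutoff = now - self.ttl_seconds
        kept, kept_deleted = [], set()
        for i, row in enumerate(self.rows):
            if row[0] >= cutoff:
                if i in self.deleted:
                    kept_deleted.add(len(kept))  # re-index the mark for the compacted list
                kept.append(row)
        self.rows, self.deleted = kept, kept_deleted


store = ToyLogstore(ttl_seconds=3600)
store.write(100, {"msg": "order created", "phoneNumber": "13800000000"})
store.write(200, {"msg": "order paid"})

marked = store.soft_delete(0, 300, lambda rec: "phoneNumber" in rec)
visible = store.query()            # the marked row is already hidden
physical_before = len(store.rows)  # but it is still physically stored
store.expire(now=3750)             # cutoff = 150, so the row written at t=100 ages out
physical_after = len(store.rows)
```

The point of the design is that soft_delete only touches a small set of marks, while the expensive physical removal is left to the TTL pass that would run anyway.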
Elegant Solution to User Pain Points
Scenario 1: Emergency response at 3 AM
An e‑commerce ops engineer discovered that a newly released order system wrote user phone numbers in plaintext to logs for two hours.
Traditional solution: stop the service and rewrite the logs via SPL, causing six hours of downtime for terabytes of data.
Soft delete solution: specify a time range and a query (e.g., phoneNumber:*) and delete the matching logs in seconds. Queries hide the sensitive data immediately, the business continues without interruption, and the data is later physically removed by the logstore TTL.
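The time range for this scenario can be derived from how long the incident lasted. The following is a minimal sketch, assuming the delete request takes Unix timestamps in seconds, as the SDK snippet in this article does. incident_window is a hypothetical helper name, not part of any SDK.

```python
import time


def incident_window(duration_hours, now=None):
    """Return (from_time, to_time) in Unix seconds covering the last `duration_hours`."""
    if duration_hours <= 0:
        raise ValueError("duration_hours must be positive")
    to_time = int(time.time()) if now is None else int(now)
    from_time = to_time - int(duration_hours * 3600)
    return from_time, to_time


# A fixed `now` keeps the example reproducible; omit it to use the current time.
from_time, to_time = incident_window(2, now=1_700_000_000)
query = "phoneNumber:*"  # the example query from the scenario
```

Computing to_time once and deriving from_time from it avoids the small drift that two separate calls to time.time() would introduce.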
Scenario 2: Test data polluting production
A financial analyst found test data leaking into production logs, corrupting risk‑model training.
Traditional solutions: costly ETL cleaning, stopping analysis tasks, and rebuilding indexes, taking 2‑3 days.
Soft delete solution: precisely mark test data for deletion (e.g., dataSource:testEnv) within seconds; analysis resumes immediately; TTL handles physical cleanup.
Scenario 3: Precise removal of abnormal logs
A SaaS provider’s new version generated massive error logs (error_code 500/502, event_type file_upload_error) that polluted monitoring and analysis.
Traditional solutions: ETL cleaning or adding ad‑hoc filter conditions to every query, leading to high overhead and possibly hitting SLS limits.
Soft delete solution: use a complex query (e.g., version>=2.1 and version<2.3 and (error_code:500 or error_code:502) and event_type:file_upload_error) to delete the matching logs in seconds. The backend merges and caches deletions with negligible impact on query performance, and reports automatically reflect the corrected data.
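To make the boolean structure of that example query explicit, here is a plain‑Python mirror of the condition. It is a sketch for reasoning about which records would match, not a reimplementation of SLS query semantics. It assumes version is a numeric field and error_code an integer field, and matches_scenario3 is a hypothetical name.

```python
def matches_scenario3(record):
    """Plain-Python mirror of the example condition (illustrative only)."""
    try:
        version = float(record.get("version", "nan"))
        error_code = int(record.get("error_code", -1))
    except (TypeError, ValueError):
        return False  # unparseable fields never match
    return (
        2.1 <= version < 2.3                                   # version>=2.1 and version<2.3
        and error_code in (500, 502)                           # error_code:500 or error_code:502
        and record.get("event_type") == "file_upload_error"    # event_type:file_upload_error
    )


sample = [
    {"version": "2.2", "error_code": "500", "event_type": "file_upload_error"},
    {"version": "2.3", "error_code": "500", "event_type": "file_upload_error"},
    {"version": "2.1", "error_code": "404", "event_type": "file_upload_error"},
    {"version": "2.1", "error_code": "502", "event_type": "login"},
]
to_delete = [r for r in sample if matches_scenario3(r)]
```

Only the first sample record satisfies all three conditions. The others fail on the version upper bound, the error code, and the event type respectively.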
Soft delete thus provides instant, low‑cost data removal while preserving system stability. It is already available in the Singapore and China (North 6) regions, with more regions rolling out gradually.
For detailed usage, refer to the official documentation.
The three scenarios above map to the following Python SDK calls. The snippet assumes that client, project, and logstore are already initialized, and that DeleteLogsRequest and DeleteLogsResponse are imported from the SLS Python SDK as described in the official documentation.

    import time

    # Scenario 1: hide logs from the last 2 hours that contain a phone number field.
    from_time = int(time.time()) - 2 * 3600
    to_time = int(time.time())
    toDeleteQuery = "phoneNumber:*"
    request = DeleteLogsRequest(project, logstore, from_time, to_time, query=toDeleteQuery)
    res: DeleteLogsResponse = client.delete_logs(request)

    # Scenario 2: hide test-environment data from the last 2 days.
    from_time = int(time.time()) - 2 * 24 * 3600
    to_time = int(time.time())
    toDeleteQuery = "dataSource:testEnv"
    request = DeleteLogsRequest(project, logstore, from_time, to_time, query=toDeleteQuery)
    res: DeleteLogsResponse = client.delete_logs(request)

    # Scenario 3: hide specific upload-error logs from the last 7 days.
    from_time = int(time.time()) - 7 * 24 * 3600
    to_time = int(time.time())
    toDeleteQuery = '''version>=2.1 and version < 2.3 and __tag__:__path__: "/user/actiontrail.LOG" and (error_code:500 or error_code:502) and event_type:file_upload_error'''
    request = DeleteLogsRequest(project, logstore, from_time, to_time, query=toDeleteQuery)
    res: DeleteLogsResponse = client.delete_logs(request)
Alibaba Cloud Observability
Driving continuous progress in observability technology!
