How a Global Enterprise Cut Log Analytics Costs by 87% with Alibaba Cloud SLS
A large multinational company migrated its multi‑cloud log pipeline from a fragmented AWS stack to Alibaba Cloud Log Service (SLS), achieving unified data processing, query, visualization and alerting while reducing total monthly cost by over 87% and gaining additional free storage and feature benefits.
Background and Challenges
A global enterprise with traffic across Europe, APAC and North America used Cloudflare for CDN and WAF, pushing logs to AWS S3 via Logpush. Their existing AWS stack (S3, Athena, CloudWatch Logs, QuickSight, Lambda/Glue) suffered from data silos, high query costs, complex ETL maintenance, and fragmented alerting.
SLS Solution Overview
The Alibaba Cloud SLS solution consolidates the entire observability workflow: data ingestion from S3, SPL‑based processing (time standardization, IP‑to‑Geo enrichment, IP masking, risk tagging), centralized storage in a domestic Logstore, interactive query, dashboards, and intelligent alerting.
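The processing steps named above (time standardization, IP masking, risk tagging) can be approximated outside SLS for local testing. The following is a minimal Python sketch, not the SLS implementation: the function names are hypothetical, the field names follow the Cloudflare fields used in this article's SPL example, and IP-to-Geo enrichment is left out because it needs an IP database.

```python
import base64
import hashlib
from datetime import datetime, timezone


def standardize_time(edge_start: str) -> int:
    """Parse a UTC timestamp such as 2024-01-02T03:04:05Z into epoch seconds."""
    parsed = datetime.strptime(edge_start, "%Y-%m-%dT%H:%M:%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def fingerprint_ip(client_ip: str) -> str:
    """Replace a raw IP with the base64-encoded SHA-256 digest of its UTF-8 bytes."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def is_high_risk(method, referer, action) -> int:
    """Tag POST requests that have no referer or that were blocked."""
    return 1 if method == "POST" and (referer is None or action == "block") else 0


def process(record: dict) -> dict:
    """Apply the three steps to one log record and drop the raw IP."""
    out = dict(record)
    out["__time__"] = standardize_time(record["EdgeStartTimestamp"])
    out["ClientFingerprint"] = fingerprint_ip(record["ClientIP"])
    out["IsHighRisk"] = is_high_risk(
        record.get("ClientRequestMethod"),
        record.get("ClientRequestReferer"),
        record.get("SecurityAction"),
    )
    out.pop("ClientIP", None)  # the raw IP must not reach storage
    return out
```

Calling `process` on a record with `EdgeStartTimestamp`, `ClientIP` and `ClientRequestMethod` set returns a copy that carries `__time__`, `ClientFingerprint` and `IsHighRisk` and no longer contains `ClientIP`.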
Key Design Points
Dual‑mode file discovery: S3 event‑driven via SQS for low‑latency files plus periodic full scans to avoid missed data.
Elastic scaling: Auto‑scaling based on data volume rather than manual tuning; tasks balanced by data size.
Format handling: Automatic compression detection and explicit data format specification to avoid costly format guessing.
Reliability and replay: Point‑in‑time tracking, retry queues, idempotent processing, and per‑object deduplication.
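The interplay between dual-mode discovery and per-object deduplication can be illustrated with a small in-memory sketch. It is not how SLS implements the feature: the `S3Object` and `Discovery` names are hypothetical, and using the pair (key, ETag) as the object identity is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class S3Object:
    key: str
    etag: str


@dataclass
class Discovery:
    # Identities of objects already handed to the import tasks (assumed: key + ETag).
    seen: set = field(default_factory=set)

    def _admit(self, obj: S3Object) -> bool:
        """Return True only the first time an object identity is observed."""
        ident = (obj.key, obj.etag)
        if ident in self.seen:
            return False
        self.seen.add(ident)
        return True

    def from_events(self, events):
        """Low-latency path: objects announced by event notifications."""
        return [obj for obj in events if self._admit(obj)]

    def from_full_scan(self, listing):
        """Safety-net path: a periodic listing that catches missed events."""
        return [obj for obj in listing if self._admit(obj)]


discovery = Discovery()
a, b, c = S3Object("logs/a.gz", "e1"), S3Object("logs/b.gz", "e2"), S3Object("logs/c.gz", "e3")
via_events = discovery.from_events([a, b, a])   # duplicate notification for a
via_scan = discovery.from_full_scan([a, b, c])  # c had no event delivered
```

In this run the event path yields `a` and `b` once each, and the full scan yields only `c`, so every object is imported exactly once regardless of which path found it first.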
Data Processing Example (SPL)
* | extend __time__ = cast(to_unixtime(date_parse(EdgeStartTimestamp, '%Y-%m-%dT%H:%i:%SZ')) as bigint)
| extend RequestId = RayID
| extend RequestPath = url_extract_path(ClientRequestURI)
| extend GeoCountry = ip_to_country(ClientIP), GeoRegion = ip_to_province(ClientIP), GeoCity = ip_to_city(ClientIP)
| extend ClientFingerprint = to_base64(sha256(to_utf8(ClientIP)))
| expand-values -keep SecuritySources
| parse-json -prefix 'Security' SecuritySources
| extend IsHighRisk = if(ClientRequestMethod='POST' and (ClientRequestReferer is null or SecurityAction='block'), 1, 0)
| project-away ClientIP, OriginIP, ResponseHeaders, RayID
Sample Queries
WAF rule hit statistics
* | SELECT SecurityRuleID, count(*) AS TotalHits, count_if(IsHighRisk=1) AS HighRiskHits, approx_distinct(ClientFingerprint) AS UniqueClients FROM log WHERE SecurityRuleID IS NOT NULL AND SecurityRuleID <> '' GROUP BY SecurityRuleID ORDER BY TotalHits DESC
Top attack source regions
* | SELECT GeoCountry, GeoCity, count(*) AS AttackCount, approx_distinct(ClientFingerprint) AS UniqueAttackers FROM log WHERE SecurityAction='block' GROUP BY GeoCountry, GeoCity ORDER BY AttackCount DESC LIMIT 10
Origin 5xx error trend
* | SELECT time_series(__time__, '1m', '%Y-%m-%d %H:%i:%s', '0') AS TimeBucket, count_if(OriginResponseStatus>=500) AS Origin5xxCount, count_if(OriginResponseStatus>=500)*100.0/count(*) AS Origin5xxRate, count(*) AS TotalRequests FROM log GROUP BY TimeBucket ORDER BY TimeBucket
Request latency percentiles
* | SELECT RequestPath, count(*) AS RequestCount, approx_percentile(OriginResponseTime,0.50) AS LatencyP50, approx_percentile(OriginResponseTime,0.90) AS LatencyP90, approx_percentile(OriginResponseTime,0.99) AS LatencyP99 FROM log WHERE OriginResponseTime IS NOT NULL GROUP BY RequestPath HAVING count(*)>100 ORDER BY LatencyP99 DESC LIMIT 20
Alert Rules
* | SELECT count_if(OriginResponseStatus>=500)*100.0/count(*) AS Origin5xxRate FROM log HAVING Origin5xxRate>5
* | SELECT count_if(IsHighRisk=1) AS HighRiskCount, count_if(IsHighRisk=1)*100.0/count(*) AS HighRiskRate FROM log HAVING HighRiskCount>100 OR HighRiskRate>10
* | SELECT count_if(SecurityAction='block') AS BlockCount, approx_distinct(ClientFingerprint) AS UniqueAttackers FROM log HAVING BlockCount>1000 OR UniqueAttackers>50
Cost Comparison
Assumptions: 20 TB of raw logs per day (compressed at roughly 10:1, so about 2 TB transferred per day), 14‑day retention, 20 000 queries per day (≈100 TB scanned per day), 20 alert rules (each evaluated every 3 min), ~100 dashboards.
SLS Pricing (pay‑by‑ingest)
Ingestion: 20 480 GB /day × $0.061 / GB × 30 days ≈ $37 478.40 / month
Outbound S3 transfer (compressed): 2 048 GB /day × 30 days = 61 440 GB; first 100 GB free, remaining billed per AWS tier → $5 113.00 / month
Storage (first 30 days free): $0
Processing, alerts, dashboards: no extra charge
Total SLS ≈ $42 591.40 / month
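The S3 transfer figure above depends on tiered pricing that the article does not spell out. The sketch below assumes the commonly published AWS internet egress tiers (first 10 TB at $0.09/GB, next 40 TB at $0.085/GB, next 100 TB at $0.07/GB, beyond that $0.05/GB) applied after the 100 GB free allowance. These rates are an assumption and vary by region, but they reproduce the $5 113.00 stated above.

```python
# Assumed egress tiers as (tier size in GB, USD per GB); not taken from the article.
TIERS = [
    (10 * 1024, 0.09),
    (40 * 1024, 0.085),
    (100 * 1024, 0.07),
    (float("inf"), 0.05),
]
FREE_GB = 100  # monthly free allowance mentioned in the article


def egress_cost(total_gb: float) -> float:
    """Bill the volume above the free allowance tier by tier."""
    remaining = max(total_gb - FREE_GB, 0)
    cost = 0.0
    for tier_size, rate in TIERS:
        billed = min(remaining, tier_size)
        cost += billed * rate
        remaining -= billed
        if remaining <= 0:
            break
    return round(cost, 2)


monthly_transfer_gb = 2048 * 30                   # 61 440 GB of compressed logs
transfer_cost = egress_cost(monthly_transfer_gb)  # 5113.0 under the assumed tiers
ingestion_cost = round(20480 * 0.061 * 30, 2)     # SLS ingestion line item
sls_total = round(ingestion_cost + transfer_cost, 2)
```

Under these assumptions the three tiers contribute $921.60, $3 481.60 and $709.80, and `sls_total` matches the $42 591.40 monthly total.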
AWS CloudWatch Equivalent
Logs ingestion: 20 480 GB /day × $0.50 / GB × 30 days = $307 200 / month
Log storage (14 days): 286 720 GB × $0.03 / GB‑month = $8 601.60 / month
Logs Insights scanned: 102 400 GB /day × $0.005 / GB × 30 days = $15 360.00 / month
Standard alarms: 20 × $0.10 = $2.00 / month
Dashboards: 100 × $3.00 = $300.00 / month
Total AWS ≈ $331 463.60 / month
Result
Monthly savings ≈ $288 872.20, a cost reduction of about 87 % while gaining additional free storage days and unified capabilities.
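The comparison can be re-derived from the line items listed in the two pricing sections. The short Python sketch below uses only the unit prices and volumes stated in this article. The S3 transfer cost is taken as the given $5 113.00 rather than recomputed, and the variable names are illustrative.

```python
DAYS = 30
RAW_GB_PER_DAY = 20 * 1024          # 20 TB of raw logs per day
SCANNED_GB_PER_DAY = 100 * 1024     # about 100 TB scanned per day
RETENTION_DAYS = 14

sls = {
    "ingestion": RAW_GB_PER_DAY * 0.061 * DAYS,
    "s3_transfer": 5113.00,         # figure as stated in the article
    "storage": 0.0,                 # first 30 days free
}

aws = {
    "ingestion": RAW_GB_PER_DAY * 0.50 * DAYS,
    "storage": RAW_GB_PER_DAY * RETENTION_DAYS * 0.03,
    "insights_scan": SCANNED_GB_PER_DAY * 0.005 * DAYS,
    "alarms": 20 * 0.10,
    "dashboards": 100 * 3.00,
}

sls_total = round(sum(sls.values()), 2)
aws_total = round(sum(aws.values()), 2)
savings = round(aws_total - sls_total, 2)
savings_pct = savings / aws_total * 100

print(f"SLS: ${sls_total:,.2f}  AWS: ${aws_total:,.2f}")
print(f"Savings: ${savings:,.2f} ({savings_pct:.1f}%)")
```

Because the ingestion line dominates both totals, the result is most sensitive to the per-GB ingestion price ($0.061 versus $0.50) rather than to storage, alarms, or dashboards.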
Future Outlook
The solution also supports optional CloudFront acceleration for cross‑region transfers, and GCP/Azure ingestion is planned to extend it to true multi‑cloud observability.
References
Alibaba Cloud Log Service Pricing – https://www.alibabacloud.com/zh/product/log-service/pricing
AWS CloudWatch Pricing – https://aws.amazon.com/cn/cloudwatch/pricing/
AWS S3 Pricing – https://aws.amazon.com/cn/s3/pricing/
Alibaba Cloud Observability
