Tag

SLS


DaTaobao Tech
Jan 29, 2024 · Cloud Native

Observability: Logging, Metrics, and Tracing in Distributed Systems

Observability in distributed systems combines three signals: event logging, aggregated metrics, and request tracing. Each offers distinct trade-offs in detail, storage, and overhead. The ELK stack dominates log and metric handling, whereas tracing solutions such as EagleEye and SkyWalking differ by protocol and language support. As a result, many teams adopt unified, cloud-native platforms such as Alibaba Cloud's Log Service (SLS) for lower cost, real-time analysis, and simplified management.

ELK · SLS · cloud-native
32 min read
Full-Stack Internet Architecture
Dec 10, 2020 · Operations

Root Cause Analysis and Resolution of Disk Exhaustion During a Promotion Event

During a large-scale promotion, an online service suffered severe disk usage spikes caused by log files that were never cleaned up because an SLS process held them open. The issue was resolved by identifying the open file handles and terminating the process, and log-level controls were then introduced to prevent a recurrence.
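
The handle-identification step can be sketched roughly as follows. This is a minimal illustration and not the procedure from the article: it assumes a Linux host with a `/proc` filesystem, and the function name `open_handles_under` and the `/var/log/app` directory are hypothetical. It lists every file descriptor that points into a log directory and flags those whose file has already been unlinked, since such files keep occupying disk space until the owning process closes them or exits.

```python
import os

DELETED_SUFFIX = " (deleted)"  # Linux appends this to /proc/<pid>/fd targets of unlinked files


def open_handles_under(log_dir, proc_root="/proc"):
    """Return sorted (pid, fd, path, deleted) tuples for descriptors under log_dir.

    Hypothetical helper for illustration; proc_root is a parameter so the
    scan can be pointed at a test fixture instead of the real /proc.
    """
    prefix = log_dir.rstrip("/") + "/"
    hits = []
    for pid in os.listdir(proc_root):
        if not pid.isdigit():  # skip non-process entries such as "self" or "meminfo"
            continue
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:  # process exited, or we lack permission to inspect it
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            deleted = target.endswith(DELETED_SUFFIX)
            path = target[: -len(DELETED_SUFFIX)] if deleted else target
            if path.startswith(prefix):
                hits.append((int(pid), int(fd), path, deleted))
    return sorted(hits)


if __name__ == "__main__":
    # "/var/log/app" is an assumed location, not one named in the article.
    for pid, fd, path, deleted in open_handles_under("/var/log/app"):
        print(pid, fd, path, "deleted" if deleted else "present")
```

Seeing other users' processes generally requires root. The PIDs it reports are candidates for a restart or termination, which is the remediation the summary describes.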

Disk Usage · Log Management · SLS
7 min read