
How Does Distributed Link Tracing Work? Inside SkyWalking’s Architecture

This article explains distributed link tracing: its principles, key metrics, and implementation in monolithic and microservice systems, the OpenTracing standard, and how SkyWalking handles automatic span collection, context propagation, unique trace IDs, and sampling overhead.

Architect's Guide

In distributed systems, especially microservice architectures, a single external request often passes through multiple internal modules, middleware components, and machines. Link tracing addresses the problem of determining which applications, modules, and nodes are involved, in what order they are called, and how each one performs.

What Is Link Tracing?

Link tracing reconstructs a distributed request into a call chain, displaying each service node’s latency, target machine, and request status.

Principles of Link Tracing

Key metrics for an interface include:

Response time (RT)

Exception responses

Location of slow requests

Monolithic Architecture

In early stages, systems are monolithic. Using AOP (Aspect‑Oriented Programming), we can record start and end times around business logic to calculate total latency and capture exceptions with minimal code intrusion.
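The AOP idea above can be sketched with a Python decorator, which plays the role of an "around" advice. This is only an illustration under assumptions: the names traced, RECORDS, and place_order are hypothetical, and a real Java system would typically use an AOP framework instead.

```python
import functools
import time

# Hypothetical in-memory sink; a real system would ship these records elsewhere.
RECORDS = []


def traced(func):
    """Wrap func so every call records its latency and any exception."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            error = repr(exc)
            raise  # the aspect observes the failure but does not swallow it
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            RECORDS.append({"name": func.__name__, "ms": elapsed_ms, "error": error})
    return wrapper


@traced
def place_order(quantity):
    # Business logic stays free of timing code; only the decorator is added.
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return quantity * 2
```

The business function only gains one line (the decorator), which is the "minimal code intrusion" the paragraph refers to.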

Microservice Architecture

As services grow, they split into microservices. When a page is slow, the request may traverse A → C → B → D across many machines, making it hard to pinpoint the problematic service or node.

Link tracing solves three main pain points:

Difficult and lengthy issue diagnosis

Hard-to‑reproduce scenarios

Complex performance bottleneck analysis

It automatically collects data, builds a complete call chain, and visualizes component performance.

OpenTracing Standard

OpenTracing provides a lightweight, vendor‑agnostic API layer between applications and tracing systems, similar to JDBC’s standard interface.

Its data model consists of:

Trace: a complete request chain

Span: a single call with start and end timestamps

SpanContext: global context (e.g., traceId) passed between spans

These concepts enable distributed tracing systems to capture and correlate calls across services.
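To make the three concepts concrete, here is a minimal sketch of the data model and of passing a SpanContext across a service boundary. It is not the OpenTracing API itself: the class layout, the start_span, inject, and extract helpers, and the x-trace-id and x-span-id header names are all assumptions chosen for illustration.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class SpanContext:
    trace_id: str  # shared by every span in the same trace
    span_id: str   # identifies one call


@dataclass
class Span:
    name: str
    context: SpanContext
    parent_span_id: Optional[str] = None
    start: float = field(default_factory=time.time)
    end: Optional[float] = None

    def finish(self) -> None:
        self.end = time.time()


def start_span(name: str, parent: Optional[SpanContext] = None) -> Span:
    # A root span creates a new trace_id; a child span inherits its parent's.
    trace_id = parent.trace_id if parent else uuid.uuid4().hex
    ctx = SpanContext(trace_id=trace_id, span_id=uuid.uuid4().hex[:16])
    return Span(name, ctx, parent.span_id if parent else None)


def inject(ctx: SpanContext, carrier: Dict[str, str]) -> None:
    # Hypothetical header names, not those of any real tracer.
    carrier["x-trace-id"] = ctx.trace_id
    carrier["x-span-id"] = ctx.span_id


def extract(carrier: Dict[str, str]) -> Optional[SpanContext]:
    if "x-trace-id" not in carrier or "x-span-id" not in carrier:
        return None
    return SpanContext(carrier["x-trace-id"], carrier["x-span-id"])


# Caller side: start a root span and copy its context into outgoing headers.
root = start_span("gateway")
headers: Dict[str, str] = {}
inject(root.context, headers)

# Callee side: rebuild the context from the headers and continue the trace.
child = start_span("order-service", parent=extract(headers))
child.finish()
root.finish()
```

Because the child inherits trace_id and records its parent's span_id, a backend can later stitch both spans into one chain.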

Collector Role

The collector gathers:

A global trace_id

A span_id to identify each call

A parent_span_id to link child calls to their parents

Collected data is stored in Elasticsearch, MySQL, etc., for visualization.
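As a rough sketch of how a backend could turn stored span records into a call chain, the function below groups records by parent_span_id and walks them from the root. The record layout, the sample latencies, and the build_chain name are assumptions for illustration; the A → C → B → D path mirrors the example used earlier in the article.

```python
from collections import defaultdict

# Hypothetical span records as a collector might store them.
spans = [
    {"trace_id": "t1", "span_id": "1", "parent_span_id": None, "service": "A", "ms": 120},
    {"trace_id": "t1", "span_id": "2", "parent_span_id": "1", "service": "C", "ms": 90},
    {"trace_id": "t1", "span_id": "3", "parent_span_id": "2", "service": "B", "ms": 70},
    {"trace_id": "t1", "span_id": "4", "parent_span_id": "3", "service": "D", "ms": 65},
    {"trace_id": "t2", "span_id": "1", "parent_span_id": None, "service": "A", "ms": 8},
]


def build_chain(records, trace_id):
    """Return the call chain of one trace as indented text lines."""
    # Index the spans of this trace by their parent so children are easy to find.
    children = defaultdict(list)
    for rec in records:
        if rec["trace_id"] == trace_id:
            children[rec["parent_span_id"]].append(rec)

    lines = []

    def walk(parent_id, depth):
        for rec in children.get(parent_id, []):
            lines.append(f"{'  ' * depth}{rec['service']} ({rec['ms']} ms)")
            walk(rec["span_id"], depth + 1)

    walk(None, 0)  # root spans have no parent
    return lines


for line in build_chain(spans, "t1"):
    print(line)
```

With the sample data, each level is indented under its caller, so the slowest hop in the chain is easy to spot.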

SkyWalking Architecture

SkyWalking uses a plugin‑based Java agent to automatically collect spans without code changes. Context is propagated via headers (e.g., Dubbo attachments). It generates unique trace IDs using a Snowflake‑like algorithm, handling clock rollback by falling back to random IDs.
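The ID scheme described above can be illustrated with a small generator. This is a sketch under assumptions, not SkyWalking's actual ID format or code: the TraceIdGenerator class, the instance.millis.sequence layout, and the ".r." marker for random IDs are invented for the example.

```python
import random
import threading
import time


class TraceIdGenerator:
    """Snowflake-style IDs of the form <instance>.<millis>.<sequence>."""

    def __init__(self, instance_id, clock=None):
        self.instance_id = instance_id
        # The clock is injectable so that rollback can be simulated in tests.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._seq = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock moved backwards: a timestamp-based ID could collide with
                # one issued earlier, so fall back to a random ID instead.
                return f"{self.instance_id}.r.{random.getrandbits(64):016x}"
            if now == self._last_ms:
                self._seq += 1  # same millisecond: bump the sequence
            else:
                self._last_ms = now
                self._seq = 0
            return f"{self.instance_id}.{now}.{self._seq}"


gen = TraceIdGenerator("node-1")
print(gen.next_id())
```

Combining an instance identifier, a timestamp, and a per-millisecond sequence keeps IDs unique without any coordination between machines, which is the main appeal of Snowflake-style schemes.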

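To reduce overhead, the agent samples a bounded number of traces per time window; the source describes 3 samples every 3 seconds (in the Java agent this is the agent.sample_n_per_3_secs setting, which is off unless configured). When an upstream service has already sampled a request, downstream services are forced to collect it as well so that the trace stays complete.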
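A fixed-window sampler with forced downstream collection might look like the sketch below. The WindowSampler class and its parameters are hypothetical, and treating forced samples as not consuming the local budget is an assumption of this example rather than documented SkyWalking behavior.

```python
import time


class WindowSampler:
    """Allow at most max_per_window locally initiated samples per window."""

    def __init__(self, max_per_window=3, window_seconds=3.0, clock=time.monotonic):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the window can be advanced in tests
        self._window_start = clock()
        self._count = 0

    def should_sample(self, upstream_sampled=False):
        if upstream_sampled:
            # The caller already sampled this request, so collect it here too;
            # otherwise the trace would have a hole in the middle.
            return True
        now = self._clock()
        if now - self._window_start >= self.window_seconds:
            # A new window begins: reset the budget.
            self._window_start = now
            self._count = 0
        if self._count < self.max_per_window:
            self._count += 1
            return True
        return False


sampler = WindowSampler()
decisions = [sampler.should_sample() for _ in range(5)]
print(decisions)
```

This sketch is not thread-safe; a production agent would need a lock or an atomic counter around the window state.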

Performance Evaluation

Benchmarks cited by the source show SkyWalking adding negligible CPU, memory, and latency overhead at 5000 TPS. Compared with Zipkin and Pinpoint, SkyWalking achieves significantly lower response times (22 ms versus 117 ms and 201 ms, respectively) and offers non‑intrusive instrumentation.

Additional advantages include multi‑language support (Java, .NET Core, PHP, Node.js, Go, Lua) and extensible plugins.
