
Master Apache SkyWalking: Setup, Performance Comparison, and Advanced Tracing

This guide introduces the distributed tracing challenges of large microservice systems, explains what Apache SkyWalking is, and compares it with Zipkin, Pinpoint, and CAT, including performance test results. It then walks through installation, configuration, custom tracing, log integration, alerting, and high-availability deployment.

Software Development Quality

Link Tracing Introduction

In large microservice architectures consisting of dozens or hundreds of services, common problems include:

How can the entire call chain be connected so that issues are located quickly?

How can the dependencies between microservices be clarified?

How can the performance of each microservice interface be analyzed?

How can the processing order of a whole business flow be traced?

1. What is SkyWalking

1.1 SkyWalking Introduction

Apache SkyWalking is an open-source application performance monitoring (APM) and distributed tracing system. It supports multiple languages and frameworks, including Java, Go, Node.js, and Python, and traces the calls between and within services to give a complete picture of performance.

SkyWalking offers powerful features such as performance monitoring, fault diagnosis, debugging, data analysis, and alerting. It can integrate with Grafana and Elasticsearch for stronger monitoring and analysis capabilities.

In short, Apache SkyWalking is a powerful and easy‑to‑use APM tool that helps developers and DevOps teams understand application behavior and act promptly on performance problems.

Official website: https://skywalking.apache.org/

Download: https://skywalking.apache.org/downloads/

Github: https://github.com/apache/skywalking

Documentation: https://skywalking.apache.org/docs/main/v8.5.0/readme/

Chinese documentation: https://skyapm.github.io/document-cn-translation-of-skywalking/

1.2 Comparison of Link Tracing Frameworks

Zipkin is Twitter's open‑source tracing tool, lightweight and easy to deploy.

Pinpoint is a Korean open‑source tracing and monitoring tool based on bytecode injection, supporting many plugins and a powerful UI, with no code intrusion on the client side.

SkyWalking is an open-source tracing and monitoring tool that originated in China. It is based on bytecode injection, supports many plugins, has a strong UI, and requires no code intrusion. It has graduated from the Apache Incubator and is now an Apache top-level project.

CAT is Dianping's open‑source platform covering tracing, monitoring, log collection, and alerting.

1.3 Performance Comparison

The test used JMeter to simulate three concurrency levels (500, 750, and 1000 threads); each thread sent 30 requests with a 10 ms think time. Every tracer sampled 100% of requests: Zipkin's default is already 100%, and Pinpoint's default of 20% was raised to 100%. With an untraced baseline plus the three tracers at each of the three concurrency levels, this gives 12 test scenarios.

The results show that, among the three tracing components, SkyWalking's probe has the smallest impact on throughput, Zipkin is moderate, and Pinpoint reduces throughput significantly (for example, at 500 concurrent users service throughput drops from 1385 to 774, a reduction of roughly 44%). For all three, the impact on CPU and memory stays within about 10%.

1.4 Main Features of SkyWalking

1. Multiple monitoring methods via language probes and service mesh.

2. Supports automatic probes for Java, .NET Core, and Node.js.

3. Lightweight and efficient; no need for big‑data platforms or many servers.

4. Modular design – UI, storage, and cluster management have multiple selectable mechanisms.

5. Alerting support.

6. Excellent visualization solutions.

2. SkyWalking Environment Setup and Deployment

SkyWalking consists of four main components:

SkyWalking agent – binds with the business system to collect monitoring data.

SkyWalking OAP service – processes and stores data, provides APIs for the UI.

SkyWalking webapp – the front‑end UI for displaying data.

Database (MySQL, Elasticsearch, etc.) – stores the monitoring data.

2.1 Download SkyWalking

Download: http://skywalking.apache.org/downloads/

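The unpacked distribution contains, among others, the directories referenced throughout this guide: agent, bin, config, oap-libs, and webapp.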

2.2 Deploy SkyWalking OAP Service

Start script:

bin/startup.sh

Log files are stored in the logs directory.

After a successful startup, two services are running: skywalking-oap-server (port 11800 for data collection over gRPC and port 12800 for UI requests over HTTP) and skywalking-webapp (default port 8080). The OAP ports can be changed in config/application.yml, and the webapp port in webapp/webapp.yml.

2.3 Three Core Concepts

Service: a set of workloads providing the same behavior; the name can be defined when using the agent.

Service Instance: each individual workload (a real process) within a service.

Endpoint: the request path of a specific service, such as an HTTP URI or a gRPC class‑method signature.

3. SkyWalking Integration with Microservices

3.1 Linux – Jar Deployment

Prepare a Spring Boot executable jar and start it with the SkyWalking agent via the -javaagent parameter.

<code>#!/bin/sh
# SkyWalking Agent configuration
export SW_AGENT_NAME=springboot-skywalking-demo
export SW_AGENT_COLLECTOR_BACKEND_SERVICES=127.0.0.1:11800
export SW_AGENT_SPAN_LIMIT=2000
export JAVA_AGENT=-javaagent:/usr/local/soft/apache-skywalking-apm-bin-es7/agent/skywalking-agent.jar
java $JAVA_AGENT -jar springboot-skywalking-demo-0.0.1-SNAPSHOT.jar
</code>

Equivalent command:

<code>java -javaagent:/usr/local/soft/apache-skywalking-apm-bin-es7/agent/skywalking-agent.jar \
-DSW_AGENT_COLLECTOR_BACKEND_SERVICES=127.0.0.1:11800 \
-DSW_AGENT_NAME=springboot-skywalking-demo -jar springboot-skywalking-demo-0.0.1-SNAPSHOT.jar
</code>

These parameters correspond to properties in agent/config/agent.config:

<code># The service name in UI
agent.service_name=${SW_AGENT_NAME:Your_ApplicationName}
# Backend service addresses.
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:127.0.0.1:11800}
</code>
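The `${NAME:default}` syntax above is what lets the same setting come from an exported variable, a `-D` option, or the file's default. The following is a simplified sketch of that kind of resolution, not the agent's actual code: the class name `PlaceholderSketch` is hypothetical, and the lookup order (JVM system property, then environment variable, then the default after the colon) is an assumption made for illustration.

```java
import java.util.Map;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of SkyWalking.
class PlaceholderSketch {
    // Matches ${NAME:default}. The default may itself contain colons (e.g. host:port).
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^:}]+):([^}]*)\\}");

    // Replaces every placeholder in raw, consulting sysProps first, then env, then the default.
    static String resolve(String raw, Properties sysProps, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String value = sysProps.getProperty(name);
            if (value == null) {
                value = env.get(name);
            }
            if (value == null) {
                value = m.group(2); // fall back to the default after the colon
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Resolves against the real JVM properties and environment of this process.
        String name = resolve("${SW_AGENT_NAME:Your_ApplicationName}",
                System.getProperties(), System.getenv());
        System.out.println("agent.service_name=" + name);
    }
}
```

With neither a property nor a variable set, the service name falls back to Your_ApplicationName, which is why setting SW_AGENT_NAME explicitly matters when several services report to the same backend.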

3.2 Windows – IDEA

Configure the JVM parameters (VM options) in the IDE's run configuration. Use -DSW_AGENT_COLLECTOR_BACKEND_SERVICES to specify the remote collector address; -javaagent must point to the local path of skywalking-agent.jar.

3.3 Tracing Across Multiple Microservices

To trace across multiple services, add the -javaagent parameter to each microservice (e.g., mall-gateway, mall-order, mall-user) and test via http://localhost:8888/user/findOrderByUserId/1.

3.4 Copy Gateway Plugin

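Requests that pass through Spring Cloud Gateway are traced only when the gateway plugin is active, and it is not enabled by default. Copy the gateway plugin JAR that matches your Spring Cloud Gateway version from agent/optional-plugins to agent/plugins, then restart the gateway service.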

4. SkyWalking Persistence of Trace Data

By default SkyWalking uses an H2 database (configured in config/application.yml).

4.1 MySQL Persistence

Modify config/application.yml to use MySQL as the storage backend and update the connection settings. Add the MySQL driver JAR to oap-libs because it is not included by default.

<code>storage:
  selector: ${SW_STORAGE:mysql}
  mysql:
    properties:
      jdbcUrl: ${SW_JDBC_URL:"jdbc:mysql://localhost:3306/swtest"}
      dataSource.user: ${SW_DATA_SOURCE_USER:root}
      dataSource.password: ${SW_DATA_SOURCE_PASSWORD:root}
</code>

After starting SkyWalking, tables are created in the swtest database.

5. Custom SkyWalking Tracing

Add the tracing toolkit dependency, keeping its version aligned with the agent version you run:

<code>&lt;dependency&gt;
  &lt;groupId&gt;org.apache.skywalking&lt;/groupId&gt;
  &lt;artifactId&gt;apm-toolkit-trace&lt;/artifactId&gt;
  &lt;version&gt;8.4.0&lt;/version&gt;
&lt;/dependency&gt;
</code>

5.1 @Trace Annotation

Annotate business methods with @Trace to make them appear in the UI trace view.

5.2 @Tag / @Tags

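Use @Tag or @Tags to attach extra information to the span, such as parameters and return values. In the value expression, arg[n] refers to the n-th method parameter (zero-based) and returnedObj refers to the return value.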

<code>@Trace
@Tag(key = "list", value = "returnedObj")
public List&lt;User&gt; list(){
    return userMapper.list();
}

@Trace
@Tags({@Tag(key = "param", value = "arg[0]"),
       @Tag(key = "user", value = "returnedObj")})
public User getById(Integer id){
    return userMapper.getById(id);
}
</code>

6. SkyWalking Log Integration

Add the logback toolkit dependency:

<code>&lt;dependency&gt;
  &lt;groupId&gt;org.apache.skywalking&lt;/groupId&gt;
  &lt;artifactId&gt;apm-toolkit-logback-1.x&lt;/artifactId&gt;
  &lt;version&gt;8.5.0&lt;/version&gt;
&lt;/dependency&gt;
</code>

Configure logback-spring.xml to include the %tid placeholder:

<code>&lt;configuration&gt;
  &lt;include resource="org/springframework/boot/logging/logback/defaults.xml"/&gt;
  &lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&gt;
    &lt;encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder"&gt;
      &lt;layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout"&gt;
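        &lt;!-- %tid is replaced with the SkyWalking trace ID; Spring Boot's default CONSOLE_LOG_PATTERN does not contain it --&gt;
        &lt;Pattern&gt;%d{yyyy-MM-dd HH:mm:ss.SSS} [%tid] [%thread] %-5level %logger{36} -%msg%n&lt;/Pattern&gt;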
      &lt;/layout&gt;
    &lt;/encoder&gt;
  &lt;/appender&gt;
  &lt;root level="INFO"&gt;
    &lt;appender-ref ref="console"/&gt;
  &lt;/root&gt;
&lt;/configuration&gt;
</code>

Enable gRPC log reporting (available from v8.4.0):

<code>&lt;appender name="grpc-log" class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.log.GRPCLogClientAppender"&gt;
  &lt;encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder"&gt;
    &lt;layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.mdc.TraceIdMDCPatternLogbackLayout"&gt;
      &lt;Pattern&gt;%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{tid}] [%thread] %-5level %logger{36} -%msg%n&lt;/Pattern&gt;
    &lt;/layout&gt;
  &lt;/encoder&gt;
&lt;/appender&gt;
&lt;root level="info"&gt;
  &lt;appender-ref ref="grpc-log"/&gt;
&lt;/root&gt;
</code>

Agent configuration for gRPC log reporting:

<code>plugin.toolkit.log.grpc.reporter.server_host=${SW_GRPC_LOG_SERVER_HOST:192.168.3.100}
plugin.toolkit.log.grpc.reporter.server_port=${SW_GRPC_LOG_SERVER_PORT:11800}
plugin.toolkit.log.grpc.reporter.max_message_size=${SW_GRPC_LOG_MAX_MESSAGE_SIZE:10485760}
plugin.toolkit.log.grpc.reporter.upstream_timeout=${SW_GRPC_LOG_GRPC_UPSTREAM_TIMEOUT:30}
</code>

7. SkyWalking Alerting

Alert rules are defined in config/alarm-settings.yml. Example rules include:

Service average response time > 1 s in the last 3 minutes.

Service success rate < 80 % in the last 2 minutes.

Service response time percentile > 1 s in the last 3 minutes.

Instance average response time > 1 s with name matching a regex in the last 2 minutes.

Endpoint average response time > 1 s in the last 2 minutes.

Database access average response time > 1 s in the last 2 minutes.

Each rule contains fields such as rule name, metric name, include/exclude names, threshold, operator, period, count, silence period, and message.
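As a hedged illustration of how these fields fit together, the fragment below sketches the first example rule (service average response time above 1 s). It follows the rule layout used by SkyWalking 8.x; key names and the message placeholder may differ in other versions, so check it against the alarm-settings.yml shipped with your distribution. The period of 10 and silence-period of 5 are assumed values.

```yaml
rules:
  # The rule name must be unique and end with "_rule".
  service_resp_time_rule:
    metrics-name: service_resp_time   # metric to evaluate
    op: ">"                           # comparison operator
    threshold: 1000                   # milliseconds
    period: 10                        # evaluation window, in minutes
    count: 3                          # breaches within the window needed to fire
    silence-period: 5                 # minutes to stay quiet after firing
    message: Response time of service {name} is more than 1000ms in 3 minutes of last 10 minutes.
```

Here count is the number of checks within the period window that must breach the threshold before an alarm fires, and silence-period suppresses repeats of the same alarm for the given number of minutes.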

Webhook

When an alarm triggers, SkyWalking sends an HTTP POST request with a JSON payload (Content-Type: application/json) to each configured webhook URL.

<code>[
  {
    "scopeId": 1,
    "scope": "SERVICE",
    "name": "serviceA",
    "id0": "12",
    "id1": "",
    "ruleName": "service_resp_time_rule",
    "alarmMessage": "alarmMessage xxxx",
    "startTime": 1560524171000
  },
  {
    "scopeId": 1,
    "scope": "SERVICE",
    "name": "serviceB",
    "id0": "23",
    "id1": "",
    "ruleName": "service_resp_time_rule",
    "alarmMessage": "alarmMessage yyy",
    "startTime": 1560524171000
  }
]
</code>

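Fields explained:

scopeId, scope: the scope of the alarmed entity (SERVICE in the example above).

name: the name of the entity that triggered the alarm.

id0: the ID of the entity matching name.

id1: unused in this example (empty).

ruleName: the name of the rule configured in alarm-settings.yml.

alarmMessage: the alarm text.

startTime: the alarm time in milliseconds since the Unix epoch.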

Email Alert Implementation

Add the spring-boot-starter-mail dependency and configure the SMTP settings.

<code>&lt;dependency&gt;
  &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
  &lt;artifactId&gt;spring-boot-starter-mail&lt;/artifactId&gt;
&lt;/dependency&gt;
</code>
<code>server:
  port: 9134
spring:
  mail:
    host: smtp.163.com
    username: your_account@163.com
    password: your_email_service_key
    default-encoding: utf-8
    port: 465
    protocol: smtp
    properties:
      mail:
        debug: false
        smtp:
          socketFactory:
            class: javax.net.ssl.SSLSocketFactory
</code>

Define a DTO and a controller to receive alarms and send email:

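<code>import java.util.List;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.mail.SimpleMailMessage;
import org.springframework.mail.javamail.JavaMailSender;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/alarm")
public class SwAlarmController {
    private final JavaMailSender sender;
    @Value("${spring.mail.username}")
    private String from;

    // Constructor injection: a final field must be assigned here to compile.
    public SwAlarmController(JavaMailSender sender) {
        this.sender = sender;
    }

    // Receives the alarm list posted by the SkyWalking webhook and mails it.
    @PostMapping("/receive")
    public void receive(@RequestBody List&lt;SwAlarmDTO&gt; alarmList) {
        SimpleMailMessage message = new SimpleMailMessage();
        message.setFrom(from);
        message.setTo(from); // sends to the same mailbox that is used as sender
        message.setSubject("Alarm Email");
        message.setText(getContent(alarmList));
        sender.send(message);
    }

    // Formats each alarm as a plain-text block.
    private String getContent(List&lt;SwAlarmDTO&gt; alarmList) {
        StringBuilder sb = new StringBuilder();
        for (SwAlarmDTO dto : alarmList) {
            sb.append("scopeId: ").append(dto.getScopeId())
              .append("\nscope: ").append(dto.getScope())
              .append("\nName: ").append(dto.getName())
              .append("\nID: ").append(dto.getId0())
              .append("\nRule: ").append(dto.getRuleName())
              .append("\nMessage: ").append(dto.getAlarmMessage())
              .append("\nTime: ").append(dto.getStartTime())
              .append("\n\n----------\n\n");
        }
        return sb.toString();
    }
}
</code>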
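The controller refers to SwAlarmDTO, which the text does not show. A minimal sketch of such a DTO follows, assuming one field per key in the webhook JSON above; the field types and the startInstant() helper are additions for illustration rather than part of the original code. In a real project the class would be public and live in its own file.

```java
import java.time.Instant;

// Hypothetical DTO mirroring the webhook JSON keys; types are assumptions.
class SwAlarmDTO {
    private int scopeId;
    private String scope;
    private String name;
    private String id0;
    private String id1;
    private String ruleName;
    private String alarmMessage;
    private long startTime; // epoch milliseconds

    public int getScopeId() { return scopeId; }
    public void setScopeId(int scopeId) { this.scopeId = scopeId; }
    public String getScope() { return scope; }
    public void setScope(String scope) { this.scope = scope; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getId0() { return id0; }
    public void setId0(String id0) { this.id0 = id0; }
    public String getId1() { return id1; }
    public void setId1(String id1) { this.id1 = id1; }
    public String getRuleName() { return ruleName; }
    public void setRuleName(String ruleName) { this.ruleName = ruleName; }
    public String getAlarmMessage() { return alarmMessage; }
    public void setAlarmMessage(String alarmMessage) { this.alarmMessage = alarmMessage; }
    public long getStartTime() { return startTime; }
    public void setStartTime(long startTime) { this.startTime = startTime; }

    // Illustrative helper: converts the epoch-millisecond startTime to an Instant.
    public Instant startInstant() {
        return Instant.ofEpochMilli(startTime);
    }

    public static void main(String[] args) {
        SwAlarmDTO dto = new SwAlarmDTO();
        dto.setName("serviceA");
        dto.setStartTime(1560524171000L); // value from the sample payload
        System.out.println(dto.getName() + " alarmed at " + dto.startInstant());
        // prints: serviceA alarmed at 2019-06-14T14:56:11Z
    }
}
```

Using startInstant() instead of the raw getStartTime() value in the email body would make the alarm time readable without changing the payload contract.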

Add the webhook URL to config/alarm-settings.yml:

<code>webhooks:
  - http://127.0.0.1:9134/alarm/receive
</code>

Test by adding a 2‑second sleep in a service method, invoking the endpoint, and confirming that an email is received.

8. SkyWalking High Availability

In production, the backend should support high throughput and high availability. Deploy a SkyWalking OAP cluster registered with Nacos; as long as at least one OAP instance is running, tracing continues.

Requirements:

At least one Nacos instance (or Nacos cluster).

At least one Elasticsearch or MySQL instance (or a cluster).

At least two SkyWalking OAP services.

At least one UI service (UI can be clustered behind Nginx).

Configure the cluster section of config/application.yml so that the OAP nodes coordinate through Nacos:

<code>cluster:
  selector: ${SW_CLUSTER:nacos}
  nacos:
    serviceName: SkyWalking_OAP_Cluster
    hostPort: 127.0.0.1:8848
    namespace: skywalking
    username: nacos
    password: nacos
</code>

Set the storage selector to Elasticsearch 7:

<code>storage:
  selector: ${SW_STORAGE:elasticsearch7}
  elasticsearch7:
    nameSpace: skywalking
    clusterNodes: 127.0.0.1:9200
</code>

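Configure the UI's webapp/webapp.yml with the list of OAP servers. The UI talks to the OAP port for UI requests (12800), not the data collection port (11800):

<code>collector:
  ribbon:
    listOfServers: 192.168.3.10:12800,192.168.3.12:12800
</code>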

Start each microservice with the agent's JVM parameter pointing to both OAP backends:

<code>-DSW_AGENT_COLLECTOR_BACKEND_SERVICES=192.168.3.10:11800,192.168.3.12:11800
</code>
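Listing both backends is what allows reporting to continue when one OAP node goes down. The sketch below shows, under stated assumptions, how a client could parse such a comma-separated list and move to the next address after a failure. The class BackendSelector is hypothetical, and the round-robin order is an assumption for illustration; the real agent's selection strategy may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the SkyWalking agent.
class BackendSelector {
    private final List<String> backends = new ArrayList<>();
    private int index = 0;

    // Parses "host:port,host:port" into a list, ignoring blank entries.
    BackendSelector(String commaSeparated) {
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                backends.add(trimmed);
            }
        }
        if (backends.isEmpty()) {
            throw new IllegalArgumentException("no backend addresses given");
        }
    }

    // Address currently in use.
    String current() {
        return backends.get(index);
    }

    // Called when the current backend is unreachable; advances to the next one.
    String failover() {
        index = (index + 1) % backends.size();
        return current();
    }

    public static void main(String[] args) {
        BackendSelector selector =
                new BackendSelector("192.168.3.10:11800,192.168.3.12:11800");
        System.out.println("reporting to " + selector.current());
        System.out.println("after a failure, reporting to " + selector.failover());
    }
}
```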
Written by

Software Development Quality
