Design and Implementation of a Distributed Log Service: Tianyan vs ELK

This article examines the challenges of building a high‑performance log service for distributed systems and compares the traditional ELK stack with the Tianyan platform. It details Tianyan's architecture (ingest, storage, and consumer components), the SDK and Minos collection methods, high‑throughput transmission with Disruptor and Bigpipe, log retrieval, resource isolation, and dynamic cleaning, and closes with best‑practice recommendations.

Top Architect

Sep 20, 2023

The article begins by outlining the major challenges of log services in distributed environments, such as massive log volume, diverse formats, and the need for scalable, reliable collection and storage.

It then reviews the common ELK solution and highlights how the Tianyan log platform differs: it offers easier integration, customizable resources, and better scalability.

2. ELK Common Solution and Tianyan Architecture

Section 2.1 introduces the Elastic Stack components (Ingest, Shippers, Queues, Processors) and tools like Elastic Agent, Fleet, APM, Beats, Logstash, and Elasticsearch, illustrating their roles with diagrams.

Section 2.2 details the Store component, emphasizing Elasticsearch as the core storage engine.

3. Tianyan Log Service

Section 3.1 describes the overall system architecture, showing how logs are collected, transmitted, stored, and isolated per product line.

Section 3.2 focuses on log collection methods: the SDK (Java Appender) and Minos (Baidu's streaming log transport).

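import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.AppenderBase;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Logback appender that hands each accepted event to the SDK's sender.
// filter(...), MessageLogSender, LogbackTask and LogNodeFactory are Tianyan SDK internals not shown in this article.
public class LogClientAppender<E> extends AppenderBase<E> {
    private static final Logger LOGGER = LoggerFactory.getLogger(LogClientAppender.class);
    @Override
    protected void append(E eventObject) {
        // A null result from filter(...) means the event is skipped.
        ILoggingEvent event = filter(eventObject);
        if (event != null) {
            // Submit to the sender's executor instead of sending inline on the logging thread.
            MessageLogSender.getExecutor().submit(new LogbackTask(event, LogNodeFactory.getLogNodeSyncDto()));
        }
    }
}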

The SDK supports Log4j, Logback, and Log4j2, and forwards log events to a high‑performance Disruptor queue.

TraceFactory.getSqltracer().end(returnObj, className, methodName, realParams, dbType, sqlType, sql, sqlUrl);

For MyBatis tracing, the interceptor is registered as:

sqlSessionFactory.getConfiguration().addInterceptor(new IlogMybatisPlugin());

Section 3.3 explains the high‑concurrency transmission pipeline: logs are first placed into a Disruptor ring buffer, then into a Bigpipe offline queue, with a fallback BigQueue for rare failures.
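The buffer-then-send flow with a local fallback can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Tianyan's implementation: `ArrayBlockingQueue` stands in for the Disruptor ring buffer, the `remoteSender` predicate stands in for a Bigpipe publish call, and an in-memory list stands in for BigQueue. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Predicate;

/**
 * Sketch of a buffered log pipeline with a local fallback.
 * Stand-ins (assumptions): ArrayBlockingQueue for the Disruptor ring buffer,
 * remoteSender for a Bigpipe publish call, an in-memory list for BigQueue.
 */
class LogPipelineSketch {
    private final BlockingQueue<String> buffer;
    private final Predicate<String> remoteSender; // returns false when the send fails
    private final List<String> fallback = new ArrayList<>();

    LogPipelineSketch(int capacity, Predicate<String> remoteSender) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
        this.remoteSender = remoteSender;
    }

    /** Producer side: does not block the caller; returns false if the buffer is full. */
    boolean publish(String logLine) {
        return buffer.offer(logLine);
    }

    /** Consumer side: drains the buffer and diverts failed sends to the fallback store. */
    int drain() {
        List<String> batch = new ArrayList<>();
        buffer.drainTo(batch);
        int sent = 0;
        for (String line : batch) {
            if (remoteSender.test(line)) {
                sent++;
            } else {
                fallback.add(line); // kept locally for a later retry
            }
        }
        return sent;
    }

    /** Returns a copy of the lines currently held in the fallback store. */
    List<String> fallbackContents() {
        return new ArrayList<>(fallback);
    }
}
```

In this sketch a full buffer causes `publish` to return `false`, so the caller decides whether to drop or retry. How Tianyan handles a full ring buffer is not stated in the article.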

Section 3.4 describes log retrieval via Kibana and Elasticsearch, which supports text, term, phrase, prefix, and logical queries. An example DSL query that searches the message and exception fields:

{
  "query": {
    "bool": {
      "must": [{
        "multi_match": {
          "query": "searchValue",
          "fields": ["message", "exception"],
          "type": "best_fields"
        }
      }]
    }
  }
}
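The other query types mentioned above (term, phrase, prefix, and logical combinations) map onto the Elasticsearch `term`, `match_phrase`, `prefix`, and `bool` queries. The following Java sketch assembles such request bodies as JSON strings. The class and method names are hypothetical, the `level` field in the usage note is an assumed field name, and the escaping only covers quotes and backslashes, so a real client would more likely rely on a JSON library or an Elasticsearch client.

```java
/**
 * Sketch: builds Elasticsearch query DSL fragments as JSON strings.
 * Class and method names are hypothetical, not part of Tianyan or Kibana.
 */
class QueryDslSketch {
    /** Escapes backslashes and double quotes only; control characters are not handled. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds a single-field leaf query such as {"term":{"field":"value"}}. */
    private static String leaf(String type, String field, String value) {
        return "{\"" + type + "\":{\"" + escape(field) + "\":\"" + escape(value) + "\"}}";
    }

    /** Exact match on a keyword-style field. */
    static String term(String field, String value) {
        return leaf("term", field, value);
    }

    /** Matches the words of the phrase in order. */
    static String matchPhrase(String field, String phrase) {
        return leaf("match_phrase", field, phrase);
    }

    /** Matches values that begin with the given characters. */
    static String prefix(String field, String start) {
        return leaf("prefix", field, start);
    }

    /** Logical AND: wraps the clauses in a bool/must query under "query". */
    static String boolMust(String... clauses) {
        return "{\"query\":{\"bool\":{\"must\":[" + String.join(",", clauses) + "]}}}";
    }
}
```

For example, `QueryDslSketch.boolMust(QueryDslSketch.term("level", "ERROR"), QueryDslSketch.matchPhrase("message", "timeout"))` produces a request body that requires both conditions to hold.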

5. Resource Isolation

Tianyan isolates transmission and storage resources per product line to avoid contention. The article describes a five‑step workflow from log generation to ES storage.
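One way to picture per-product-line isolation is a router that gives each product line its own queue and its own index namespace, so a burst from one line cannot fill another line's buffer. The sketch below is illustrative only: the class, the queue-per-line layout, and the `productline-date` index naming are assumptions, not details taken from Tianyan.

```java
import java.time.LocalDate;
import java.util.Locale;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/**
 * Sketch of per-product-line isolation: each product line gets a dedicated
 * in-memory queue (transmission) and index name (storage). Hypothetical names.
 */
class ProductLineRouter {
    private final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    /** Places the log line on the queue that belongs to its product line. */
    void route(String productLine, String logLine) {
        queues.computeIfAbsent(productLine, k -> new ConcurrentLinkedQueue<>()).add(logLine);
    }

    /** Number of lines waiting for one product line; 0 if the line is unknown. */
    int pending(String productLine) {
        Queue<String> q = queues.get(productLine);
        return q == null ? 0 : q.size();
    }

    /** Assumed naming scheme: lower-cased product line plus ISO date, e.g. pay-2023-09-20. */
    static String indexName(String productLine, LocalDate day) {
        return productLine.toLowerCase(Locale.ROOT) + "-" + day;
    }
}
```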

6. Dynamic Cleaning and Storage Downgrade

The platform monitors ES cluster usage, automatically deletes the oldest indices when thresholds are exceeded, and periodically snapshots data to low‑cost object storage (BOS) for long‑term retention.
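The oldest-first cleanup policy can be sketched as a pure selection function: given the current indices and a usage threshold, pick the oldest indices until projected usage falls back under the threshold. This is a simplified illustration. The `IndexInfo` type, the byte-based capacity, and the threshold parameter are assumptions, and the actual deletion and BOS snapshot calls are left out.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch of threshold-driven, oldest-first index cleanup. Hypothetical names. */
class IndexCleanerSketch {
    static final class IndexInfo {
        final String name;
        final LocalDate created;
        final long sizeBytes;

        IndexInfo(String name, LocalDate created, long sizeBytes) {
            this.name = name;
            this.created = created;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Returns the names of the oldest indices to delete so that
     * used / capacityBytes drops to the threshold or below.
     */
    static List<String> selectForDeletion(List<IndexInfo> indices, long capacityBytes, double threshold) {
        long used = 0;
        for (IndexInfo i : indices) {
            used += i.sizeBytes;
        }
        List<IndexInfo> oldestFirst = new ArrayList<>(indices);
        oldestFirst.sort(Comparator.comparing((IndexInfo i) -> i.created));

        List<String> toDelete = new ArrayList<>();
        for (IndexInfo i : oldestFirst) {
            if ((double) used / capacityBytes <= threshold) {
                break; // usage is acceptable again
            }
            toDelete.add(i.name);
            used -= i.sizeBytes;
        }
        return toDelete;
    }
}
```

Keeping the selection separate from the deletion makes the policy easy to test and lets a scheduler confirm that a snapshot exists before removing anything.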

7. Best Practices

Practical guidance includes product‑line onboarding, log filtering rules (content, name, combined), and operational tips for maintaining high availability and performance.
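The three kinds of filtering rules can be modeled as composable predicates. The sketch below assumes that a "name" rule refers to the logger name and that a "combined" rule requires both conditions. The article does not spell out either point, so treat the types and semantics here as hypothetical.

```java
import java.util.function.Predicate;

/** Sketch of content, name, and combined log filtering rules. Hypothetical names. */
class LogFilterSketch {
    static final class LogEntry {
        final String loggerName;
        final String message;

        LogEntry(String loggerName, String message) {
            this.loggerName = loggerName;
            this.message = message;
        }
    }

    /** Content rule: matches entries whose message contains the keyword. */
    static Predicate<LogEntry> contentRule(String keyword) {
        return e -> e.message.contains(keyword);
    }

    /** Name rule (assumed to mean logger name): matches entries whose logger name starts with the prefix. */
    static Predicate<LogEntry> nameRule(String prefix) {
        return e -> e.loggerName.startsWith(prefix);
    }

    /** Combined rule: matches only when both the name rule and the content rule match. */
    static Predicate<LogEntry> combinedRule(String prefix, String keyword) {
        return nameRule(prefix).and(contentRule(keyword));
    }
}
```

A collector could then skip any entry for which a configured rule returns `true`, for example `combinedRule("com.example.health", "heartbeat")` to suppress noisy health-check logs.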

Overall, the article provides a comprehensive technical guide to building, operating, and optimizing a distributed log service, contrasting traditional ELK with the Tianyan solution.
