Tagged articles
27 articles
MaGe Linux Operations
May 10, 2026 · Cloud Native

Docker Container Fails to Start? Common Causes and Troubleshooting Commands

This guide walks operators through a systematic, step‑by‑step process for diagnosing Docker container startup failures, covering status checks, log inspection, detailed use of docker inspect, and categorized troubleshooting of image, configuration, resource, permission, network, and volume issues with concrete commands and examples.

Configuration, Container, Docker
0 likes · 27 min read
Xiao Liu Lab
Jan 3, 2026 · Operations

How to Quickly Identify Unexpected Linux Server Reboots and Their Causes

This guide shows Linux administrators step‑by‑step how to locate reboot timestamps, retrieve full reboot histories, examine log files, analyze kernel and crash logs, check service and resource issues, and investigate human or scheduled actions, enabling fast root‑cause diagnosis of unplanned server restarts.

Operations, Reboot, Server
0 likes · 9 min read
Xuanwu Backend Tech Stack
Nov 24, 2025 · Databases

Mastering MySQL’s Three Core Logs: Redo, Undo, and Binlog Explained

This article provides a comprehensive guide to MySQL’s three essential logs—redo, undo, and binlog—detailing their hierarchy, purposes, write mechanisms, configuration parameters, and how they cooperate during transaction processing and replication, while also offering troubleshooting tips for common issues.

Binlog, database, logs
0 likes · 17 min read
JakartaEE China Community
Nov 4, 2025 · Operations

How Logs, Traces, and Metrics Differ—and Why It Matters

Logs, traces, and metrics each serve distinct monitoring goals: logs capture discrete events for debugging and audit, traces map request flows to pinpoint performance bottlenecks, and metrics provide time‑series health data. Understanding their differences and integrating tools like ELK, OpenTelemetry, Prometheus, and Grafana enables robust observability.

ELK, Grafana, Metrics
0 likes · 7 min read
Efficient Ops
May 7, 2025 · Operations

Why Choose SigNoz for Open‑Source Observability? A Deep Dive

This article introduces SigNoz, a self‑hosted open‑source observability platform that unifies metrics, logs, and traces, outlines its core capabilities, shows how to install it with Docker, and compares its resource efficiency to commercial solutions like Datadog and Elastic.

Metrics, Observability, OpenTelemetry
0 likes · 4 min read
Liangxu Linux
Jan 16, 2025 · Databases

Inside MySQL: How Buffer Pools, Indexes, and Logs Power Modern Databases

This article explains MySQL’s internal architecture, covering how data pages, B+‑tree and hash indexes, the Buffer Pool, Adaptive Hash Index, Change Buffer, Undo/Redo logs, the InnoDB storage engine, and the server layer work together to provide fast, reliable CRUD operations and support replication.

Database Architecture, InnoDB, buffer pool
0 likes · 14 min read
DataFunSummit
May 22, 2024 · Operations

Building an Observability System: Practices and Solutions from Yanhuang Data

This article explains how to build a robust observability system for cloud‑native microservice architectures, detailing the three core signals—metrics, traces, and logs—common challenges such as complexity and data silos, and presents Yanhuang Data’s integrated platform with unified data collection, storage, analysis, and visualization solutions.

Kubernetes, Metrics, Observability
0 likes · 23 min read
DataFunTalk
Jan 21, 2024 · Cloud Native

Building a System Observability Framework with YHP: Practices, Challenges, and Integrated Solutions

This article explains how YHP enables cloud‑native systems to achieve comprehensive observability by defining the three core signals—metrics, traces, and logs—addressing common enterprise pain points, and presenting an integrated platform that unifies data collection, storage, analysis, and visualization for efficient fault diagnosis and performance monitoring.

Cloud Native, Data Platform, Metrics
0 likes · 22 min read
Liangxu Linux
Jul 23, 2023 · Cloud Native

How to Retrieve Crash Logs of a Restarted Pod with kubectl --previous

When a Kubernetes pod repeatedly crashes and the container keeps restarting, the standard kubectl logs command may miss the previous instance's output, but using the --previous flag lets you fetch logs from the last terminated container by reading the symlinked files under /var/log/pods.

container crash, kubectl, logs
0 likes · 7 min read
Architecture Digest
Oct 21, 2022 · Operations

Benchmarking and Sizing Your Elasticsearch Cluster for Logs and Metrics

This article explains how to assess hardware resources, calculate required Elasticsearch cluster size based on data volume, and perform indexing and search benchmark tests to ensure stable performance and optimal throughput for log and metric workloads in production environments.

Benchmarking, Cluster Sizing, Elasticsearch
0 likes · 10 min read
Xiaolei Talks DB
Jun 6, 2022 · Databases

Master TiDB 6.0 Troubleshooting with PingCAP Clinic and Diag

This article explains how to streamline TiDB 6.0 fault diagnosis by replacing manual screenshot and log collection with PingCAP's Clinic service and the TiUP Diag tool, covering data collection, upload procedures, security considerations, and additional diagnostic capabilities.

Clinic, TiDB, TiUP
0 likes · 19 min read
dbaplus Community
Apr 25, 2022 · Operations

From Monitoring to Observability: Expert Insights on Evolving Cloud‑Native Operations

In this interview series, three industry experts explain how monitoring differs from observability, the shifts required for ops, developers, and architects, the core methodologies and technologies behind metrics, traces, and logs, and practical guidance for selecting and integrating observability tools in cloud‑native environments.

Metrics, Observability, Operations
0 likes · 16 min read
Alibaba Cloud Native
Apr 13, 2022 · Cloud Native

From Dapper to OpenTelemetry: A Practical Guide to Distributed Tracing and Observability

This article explains the challenges of long request chains in micro‑service architectures, reviews Google’s Dapper tracing requirements, introduces OpenTracing and OpenCensus standards, compares their strengths, and details how OpenTelemetry unifies tracing, metrics and logs with practical integration steps and best‑practice guidance.

Cloud Native, Distributed Tracing, Metrics
0 likes · 24 min read
Ops Development Stories
Apr 19, 2021 · Cloud Native

Mastering Kubernetes Component Troubleshooting with pprof and Log Analysis

Learn a systematic approach to diagnosing Kubernetes core component issues by identifying faulty nodes, analyzing logs via systemd or static pods, and leveraging Go's pprof tool for performance profiling, including step‑by‑step commands and UI visualizations for components like kube‑apiserver, scheduler, controller‑manager, and kubelet.

Cloud Native, Kubernetes, logs
0 likes · 9 min read
dbaplus Community
Feb 22, 2020 · Databases

How to Perform Daily Maintenance on GaussDB T Clusters Without Pitfalls

This guide walks you through the essential daily maintenance tasks for GaussDB T clusters, covering ETCD startup, cluster health checks, host resource monitoring, tablespace usage, abnormal wait events, log inspection, and common error troubleshooting with concrete commands and SQL examples.

Cluster Management, Database Maintenance, Error Handling
0 likes · 11 min read
21CTO
Jul 12, 2017 · Fundamentals

Why Logs Are the Hidden Backbone of Distributed Systems and Real‑Time Data

This note distills Jay Kreps' extensive blog on logs, explaining their core role in distributed databases, real‑time data pipelines, replication, and state‑machine consistency, and showing how logs unify concepts from version control to streaming architectures.

data replication, logs, real-time data
0 likes · 12 min read
ITPUB
Jun 22, 2016 · Databases

Understanding MySQL Architecture: Files, Logs, and Configuration Explained

This article provides a comprehensive overview of MySQL's architecture, detailing its database and instance components, various file types, configuration parameters, and the roles of different log files such as error, slow query, binary, and redo logs.

Configuration, Database Files, logs
0 likes · 15 min read
Qunar Tech Salon
Jul 8, 2015 · Big Data

Understanding Logs: The Foundation of Distributed Systems, Data Integration, and Stream Processing

This article explains how logs—simple, append‑only, time‑ordered records—serve as the core abstraction behind databases, distributed systems, data integration pipelines, and modern stream‑processing platforms such as Kafka and Hadoop, illustrating their design, scalability, and practical challenges.

Big Data, Data Integration, Distributed Systems
0 likes · 45 min read
Architect
Jul 6, 2015 · Big Data

Understanding Logs: The Core of Distributed Systems and Data Integration

This article explains how logs—simple, append‑only, time‑ordered records—serve as the fundamental abstraction behind databases, distributed systems, data integration pipelines, and stream‑processing platforms like Kafka and Hadoop, illustrating their role in ordering, replication, scalability, and real‑time analytics.

Data Integration, Distributed Systems, Hadoop
0 likes · 48 min read