Tagged articles

1273 articles

Mar 20, 2019 · Fundamentals

Why Disk I/O Speed Depends on Sequential Access: Page Cache, Scheduling, and B+Tree Insights

This article explains how disk I/O performance is shaped by page cache behavior, sequential versus random access, elevator scheduling, and storage data structures such as B+Tree and LSM trees, showing why operating systems and databases must design for sequential reads and writes to achieve optimal throughput.

B+Tree · Disk I/O · Kafka

0 likes · 18 min read

DataFunTalk

Mar 7, 2019 · Big Data

Design and Evolution of Didi's Real‑Time Data Computing Platform

The article details how Didi built and iterated its real‑time data platform, describing the shift from MySQL‑based batch processing to a Kafka‑Samza‑Druid architecture with Spark Streaming and Flink, the challenges addressed, and the current capabilities and operational metrics.

Big Data · Druid · Flink

0 likes · 9 min read

58 Tech

Feb 13, 2019 · Databases

Design and Implementation of a MySQL Slow Query Log Analysis System on the CloudDB Platform

The article describes the architecture, core functions, and workflow of a MySQL slow‑query analysis system built for the CloudDB platform, covering both daily report generation and real‑time slow‑SQL monitoring using tools such as pt‑query‑digest, ELK, and Kafka.

Database Optimization · ELK · Kafka

0 likes · 7 min read

ITFLY8 Architecture Home

Jan 29, 2019 · Operations

How to Optimize Large-Scale Log Systems for Real-Time Monitoring and Scalability

This article examines the design, deployment, and optimization of massive log systems, comparing architectures, discussing real‑time versus near‑real‑time requirements, and presenting practical improvements such as memory, CPU, network tuning, data partitioning, storage reduction, and component upgrades using ELK, Kafka, Fluentd, and HBase.

Big Data · ELK · Fluentd

0 likes · 18 min read

dbaplus Community

Dec 12, 2018 · Backend Development

How to Choose the Right Message Queue: RabbitMQ vs Kafka

This article examines the role of message‑queue middleware in high‑concurrency IM systems, compares popular open‑source options such as ActiveMQ, RabbitMQ, Kafka, RocketMQ and ZeroMQ, and provides a detailed multi‑dimensional framework—including functionality, performance, reliability, operational management, and ecosystem factors—to help engineers select the most suitable queue for their specific business needs.

Kafka · Message Queue · Middleware Selection

0 likes · 28 min read

Manbang Technology Team

Dec 12, 2018 · Big Data

Kafka Overview: Core Concepts, Architecture, Configuration, and Usage in Real-Time Computing

This article provides a comprehensive technical overview of Kafka, covering its core concepts, producer and consumer models, architecture, configuration parameters, replication mechanisms, performance optimizations, operational monitoring, tooling scripts, and related product implementations for real-time data processing.

Big Data · Kafka · Message Queue

0 likes · 18 min read

ITPUB

Dec 10, 2018 · Big Data

How Meituan Syncs MySQL to Hive in Real-Time Using Binlog, Canal, and Camus

This article explains Meituan's architecture for accurately and efficiently moving MySQL data into a Hive data warehouse by capturing binlog streams with Canal, transporting them via Kafka, and restoring them offline with Camus and a merge process that handles inserts, updates, and deletes.

Binlog · Kafka · hive

0 likes · 14 min read

Meituan Technology Team

Dec 6, 2018 · Big Data

Real-time Binlog Collection and Offline MySQL Data Restoration for Data Warehousing

The article presents a CDC solution that combines Alibaba’s Canal for real‑time MySQL binlog capture into Kafka with LinkedIn’s Camus for hourly Kafka‑to‑Hive loading, then merges snapshots and incremental binlog data to accurately and efficiently rebuild ODS tables, supporting sharding and delete events.

Binlog · CDC · Camus

0 likes · 14 min read

UCloud Tech

Nov 29, 2018 · Operations

How UCloud’s Physical Network Orchestrator Cuts IDC Build Time from Days to Hours

UCloud’s physical network orchestrator automates large‑scale data‑center switch configuration, reducing IDC network build cycles from 2‑3 days to 2‑3 hours, boosting success rates to 99%, while handling 3000+ switches, 200 Gbps access throughput, and supporting hybrid‑cloud real‑time connectivity through a scenario‑driven, Kafka‑backed architecture.

Configuration Management · Kafka · UCloud

0 likes · 14 min read

Programmer DD

Nov 27, 2018 · Backend Development

How to Prevent Duplicate Message Consumption with Spring Cloud Stream Consumer Groups

This article explains why duplicate message consumption occurs when using Spring Cloud Stream with RabbitMQ or Kafka, introduces the concept of consumer groups, and provides a step‑by‑step Java example showing how to configure and use consumer groups to ensure each message is processed by only one instance.

Kafka · consumer-group · spring-boot

0 likes · 6 min read

dbaplus Community

Nov 20, 2018 · Backend Development

20 Proven Kafka Best Practices for High‑Throughput Clusters

This guide presents New Relic’s 20 practical best‑practice recommendations—covering partitions, consumers, producers, and brokers—to help engineers design, tune, and monitor Apache Kafka deployments for reliable, high‑throughput performance.

Brokers · Consumers · High Throughput

0 likes · 15 min read

21CTO

Nov 20, 2018 · Big Data

What Languages and Tools Do Big Data Experts Use? Insights from 31 IT Leaders

Based on interviews with 31 IT leaders from 28 organizations, this article reveals the most popular programming languages, frameworks, and platforms—such as Python, Scala, Spark, Kafka, TensorFlow, and Tableau—currently driving big‑data extraction, analysis, and reporting, and highlights emerging trends and tool preferences.

Big Data · Kafka · Python

0 likes · 12 min read

360 Tech Engineering

Oct 18, 2018 · Big Data

KafkaBridge: A Multi‑Language Kafka Client SDK for Simplified Read/Write Operations

KafkaBridge is an open‑source, multi‑language SDK built on librdkafka that offers a minimal, easy‑to‑use interface for producing and consuming messages in Apache Kafka, with optimizations for PHP‑FPM, extensive language support, and detailed performance benchmarks.

Golang · Kafka · PHP

0 likes · 7 min read

Ctrip Technology

Oct 17, 2018 · Big Data

Design and Evolution of Ctrip Flight Ticket Log Tracking System

This article describes how Ctrip's flight ticket team built a massive log‑tracking platform using Elasticsearch, Kafka, and Spark, evaluated storage options such as Cassandra and HBase, introduced secondary indexing and hot‑cold data separation, and continuously evolved the architecture to balance resource usage and query performance.

Kafka · Log Analytics · architecture

0 likes · 7 min read

DataFunTalk

Oct 14, 2018 · Big Data

Exploring Real-Time Data Warehouse Practices Based on HBase

The article details the evolution from an offline to a real‑time HBase data warehouse, covering business scenarios, the use of Maxwell for MySQL‑to‑Kafka ingestion, Phoenix for SQL access, CDH cluster tuning, monitoring, and several production case studies.

HBase · Kafka · Phoenix

0 likes · 14 min read

360 Tech Engineering

Oct 14, 2018 · Big Data

KafkaBridge: A Multi‑language Kafka Client SDK for High‑Performance Data Production and Consumption

KafkaBridge is a lightweight, multi‑language SDK that wraps Kafka read/write operations, offering a minimal, reliable API for C/C++, PHP, Python and Go, with php‑fpm long‑connection optimizations, and includes compilation, usage, and performance testing details.

Golang · Kafka · PHP

0 likes · 6 min read

Efficient Ops

Oct 13, 2018 · Big Data

Boost Your Kafka Integration with KafkaBridge: Multi-Language SDK Overview

KafkaBridge is a lightweight, multi-language SDK that simplifies Kafka read/write operations, offering unified interfaces, long‑connection reuse for PHP‑FPM, and reliable message delivery, with detailed compilation steps, usage examples, and performance benchmarks across C++, Python, PHP, and Go.

Golang · Kafka · PHP

0 likes · 7 min read

Architecture Talk

Sep 30, 2018 · Backend Development

Why Event‑Driven Architecture Beats Command‑Driven Design in Microservices

This article explains how shifting from synchronous command‑driven interactions to asynchronous event‑driven flows reduces coupling, improves scalability, and enables flexible querying in distributed systems, while also discussing hybrid patterns, the single‑writer principle, and practical advantages illustrated with Kafka‑based examples.

Event-Driven Architecture · Events · Kafka

0 likes · 13 min read

MaGe Linux Operations

Sep 29, 2018 · Information Security

Build a Real-Time Security Log Collection & Alert System with ELK, Kafka, and Sentinl

This guide walks through collecting security device and Nginx logs using ELK 5.5.2, Logstash grok patterns, Kafka and Flume pipelines on CentOS 7, and configuring Sentinl or ElastAlert for DingTalk and email alerts, complete with code snippets and deployment commands.

DingTalk · ELK · ElastAlert

0 likes · 16 min read

21CTO

Sep 14, 2018 · Backend Development

How Message Queues Enable Near Real‑Time Incremental Indexing in Search Engines

This article examines the high‑real‑time requirements of incremental data ingestion for search engines, compares three update schemes, and details how adopting a Kafka subscription‑based message‑queue approach dramatically improves latency and flexibility for the Nuomi search framework.

Kafka · Message Queue · incremental indexing

0 likes · 8 min read

JD Tech

Sep 4, 2018 · Backend Development

Design and Evolution of an Order Dispatch System for Instant Delivery Platforms

This article describes the evolution, architectural design, and key implementation details of an order dispatch system for instant‑delivery services, covering problem analysis, delay‑task mechanisms such as database polling, DelayQueue and TimingWheel, and the final solution that combines Redis with a timing‑wheel scheduler and asynchronous processing.

Kafka · delay queue · instant delivery

0 likes · 11 min read

Architecture Digest

Sep 2, 2018 · Backend Development

Design and Implementation of a Real‑Time Log Collection System Using Go, Etcd, and Kafka

This article describes the shortcomings of a legacy log‑collection architecture, proposes a streamlined real‑time design that centralises configuration in Etcd, uses a single Go‑based logagent to tail files, applies per‑service rate limiting, and forwards logs to Kafka for downstream processing.

Go · Kafka · Real-Time

0 likes · 17 min read

dbaplus Community

Aug 8, 2018 · Big Data

How to Build a Real‑Time Data Platform: Tech Stack & Design Patterns

This article explains the architecture of a Real‑Time Data Platform (RTDP), details the technical selection of core components such as DBus, Kafka, Wormhole, Moonbox and Davinci, and discusses data management, security, operations, and four deployment modes—synchronization, flow, rotation and intelligent—illustrating how each fits different business scenarios.

Big Data Architecture · Data Integration · Kafka

0 likes · 24 min read

Architecture Digest

Aug 7, 2018 · Big Data

Apache Kafka Overview, Architecture, and Sample Producer/Consumer Code

This article provides a comprehensive overview of Apache Kafka, comparing it with ActiveMQ, explaining its distributed architecture, topics, partitions, consumption models, high‑availability mechanisms, exactly‑once semantics, and includes detailed Java producer and consumer code examples for practical implementation.

Big Data · Consumer · Distributed Messaging

0 likes · 22 min read

Full-Stack Internet Architecture

Jul 6, 2018 · Backend Development

Performance Comparison of Kafka, RabbitMQ, and RocketMQ for Small Message Sending

This article evaluates the server‑side performance of three popular message middleware platforms—Kafka, RabbitMQ, and RocketMQ—by measuring throughput and latency when sending small 124‑byte messages, revealing that Kafka leads, followed by RocketMQ and then RabbitMQ.

Distributed Systems · Kafka · Message Queue

0 likes · 6 min read

Meituan Technology Team

Jul 5, 2018 · Big Data

Meituan Dianping User Action System (UAS): Architecture and Implementation for Real-time User Behavior Processing

Meituan‑Dianping’s User Action System unifies disparate user‑behavior events with a 5W1H format, ingests them via a proprietary MAPI channel into Kafka, processes them in real‑time using Storm and a Lambda batch‑speed architecture, and delivers millisecond‑level responses for billions of daily events while offering flexible, modular query and storage options.

Kafka · Lambda architecture · Storm

0 likes · 17 min read

Architecture Digest

Jun 26, 2018 · Backend Development

Message Queue Middleware: Why Use It, Drawbacks, Selection, High Availability, and Reliability

This article reviews the essential concepts of message‑queue middleware, covering why it is used, its disadvantages, how to choose among popular MQs, and practical techniques for ensuring high availability, avoiding duplicate consumption, reliable transmission, and ordered processing.

Kafka · Message Queue · RabbitMQ

0 likes · 17 min read

Architecture Digest

Jun 19, 2018 · Backend Development

Understanding Kafka: Architecture, Message Integrity, and Performance Considerations

This article explains Kafka's role as a distributed message queue, covering its architecture, replication mechanisms, producer‑consumer workflow, message integrity guarantees, schema management, and performance tuning for high‑throughput, low‑latency backend systems.

Distributed Systems · Kafka · Message Queue

0 likes · 6 min read

Architecture Digest

Jun 18, 2018 · Operations

Design and Optimization of Large‑Scale Log Systems

This article examines the challenges of handling massive log data in high‑traffic e‑commerce platforms and presents a comprehensive architecture, optimization strategies, and practical implementations—including Rsyslog, Kafka, Fluentd, and the ELK stack—to improve scalability, performance, and reliability of log management systems.

Big Data · ELK · Fluentd

0 likes · 17 min read

Architecture Digest

Jun 5, 2018 · Fundamentals

Kafka and RocketMQ Architecture: Availability, Reliability, and Design Considerations

This article compares the architectures of Kafka and RocketMQ, examines their availability and reliability mechanisms, evaluates their strengths and weaknesses, and proposes hybrid designs and simplified MQ architectures for building highly available and reliable messaging systems.

Availability · Kafka · Message Queue

0 likes · 12 min read

Programmer DD

Jun 3, 2018 · Backend Development

Designing a China‑Style Microservice Stack 2.0: Practical Component Guide

This article presents a practical, China‑focused microservice reference stack built on Spring Cloud, detailing core support components such as Zuul, Eureka, Apollo, and Spring Boot, as well as monitoring tools like Kafka, ELK, CAT, KairosDB, ZMon, and Hystrix, and explains when and how to apply each in production environments.

Apollo · Backend Architecture · Kafka

0 likes · 20 min read

Java Captain

May 24, 2018 · Big Data

Debugging a Kafka Data Drop: A Step‑by‑Step Troubleshooting Case Study

After a recent feature release caused a sharp decline in a key data metric, the team followed a systematic, fourteen‑step troubleshooting process—including verification, code review, DBA involvement, local debugging, environment comparison, logging, packet capture, service restarts, request mode changes, load testing, and partition resizing—to identify and resolve a Kafka‑related throughput bottleneck.

Kafka · Load Testing · Performance debugging

0 likes · 8 min read

Architecture Digest

May 14, 2018 · Backend Development

Implementing and Optimizing a High‑Concurrency Flash Sale System with Optimistic Lock, Distributed Rate Limiting, Redis Cache, and Kafka

This article walks through building a Java‑based flash‑sale service, diagnosing overselling issues, and progressively enhancing it with optimistic locking, distributed rate limiting, Redis caching, and asynchronous Kafka processing to achieve higher throughput and data consistency under heavy concurrency.

Kafka · Performance Testing · distributed rate limiting

0 likes · 14 min read

Java Backend Technology

May 10, 2018 · Backend Development

How to Build a Real‑Time 8‑Hour Hot‑Article Ranking System at Massive Scale

This article explains how to design a distributed backend that ingests millions of clicks per second, stores data with Kafka and HDFS, computes top‑N articles using sliding windows and periodic aggregation, handles node failures, and mitigates click fraud, all while balancing accuracy and resource usage.

Backend · Kafka · Real-Time

0 likes · 13 min read

Tencent Cloud Developer

May 3, 2018 · Operations

Tencent Cloud Kafka Automated Operations Practices

Tencent Cloud’s senior engineer Yang Yuan explains how their managed Kafka service tackles version diversity, resource allocation, dynamic scaling, broker addition/removal, and partition migration using versioned clusters, bin‑packing algorithms, penalty weighting, and predictive scheduling to sustain trillions of messages and billions of messages per minute.

Kafka · Operations Automation · Resource Management

0 likes · 14 min read

Qunar Tech Salon

May 3, 2018 · Big Data

Understanding Kafka Message Formats Across Versions 0.7.x, 0.8.x, and 0.10.x

This article explains the evolution of Kafka message formats from version 0.7.x through 0.8.x (including 0.9.x) to 0.10.x, detailing each field, compression handling, and the design motivations behind the changes.

Big Data · Kafka · Message Format

0 likes · 9 min read

21CTO

Apr 28, 2018 · Big Data

Why Kafka Dominates Real-Time Data Streaming in the Big Data Era

This article explains why Kafka has become essential for real‑time data streaming in the big‑data era, detailing its performance advantages, core use cases, major adopters, multilingual support, and how its scalable storage and retention mechanisms empower modern data pipelines.

Kafka · Real-time Streaming

0 likes · 10 min read

ITFLY8 Architecture Home

Apr 23, 2018 · Big Data

How Ctrip Built a Scalable Real‑Time User Behavior System with Kafka, Storm, and Redis

Ctrip’s real‑time user behavior service, a core foundation for recommendations, ads, and user profiling, was redesigned with a Java‑based stack (Kafka, Storm, Redis, MySQL) to achieve millisecond‑level latency, high availability, and ten‑fold scalability across processing and output flows.

Kafka · Real-time Streaming · Storm

0 likes · 12 min read

Huawei Cloud Developer Alliance

Apr 17, 2018 · Big Data

How a Big Data Platform Powers Real‑Time Facial Recognition for Billion‑Scale Face Libraries

This case study details how Beijing 恒远华信息技术有限公司 built a dynamic face‑capture and real‑time recognition solution on Huawei FusionInsight HD, leveraging deep‑learning algorithms, distributed storage, and stream processing to handle hundreds of millions of faces with high speed, efficiency, and security.

Apache Storm · HBase · Huawei FusionInsight

0 likes · 17 min read

Didi Tech

Apr 11, 2018 · Backend Development

How to Turn Synchronous RPC into Asynchronous Queues for Reliable Microservices

The article examines the reliability challenges of microservice architectures that rely heavily on synchronous RPC calls, and proposes a comprehensive solution that converts failing RPCs to asynchronous message‑queue workflows, introduces a write‑ahead‑queue for transactional consistency between databases and queues, and outlines offset management to ensure end‑to‑end fault tolerance.

Kafka · Message Queue · Microservices

0 likes · 12 min read

StarRing Big Data Open Lab

Mar 30, 2018 · Operations

How Milano Transforms Large-Scale Cluster Log Analysis with ELK and Kafka

Milano, a distributed log collection and analysis platform built on the ELK stack, leverages Filebeat, Kafka, Logstash, Elasticsearch, and Kibana to provide high‑throughput, low‑latency, secure, and visual log management for massive clusters, addressing the challenges of traditional manual log inspection.

Big Data · Distributed Systems · ELK

0 likes · 8 min read

Snowball Engineer Team

Mar 23, 2018 · Big Data

Redesigning Snowball's Log Collection Architecture During Hadoop Cluster Expansion

The article details Snowball's challenges with a saturated CDH Hadoop cluster, outlines the limitations of the original Kafka‑based log pipeline, and explains how a comprehensive redesign using FlumeNG, Spillable Memory Channels, and custom HDFS sinks resolves latency, data loss, and high‑load issues while supporting future growth.

Cluster Migration · FlumeNG · Hadoop

0 likes · 6 min read

Java Architect Essentials

Mar 15, 2018 · Backend Development

Message Queue Middleware: Concepts, Use Cases, and Common Implementations

This article introduces message queue middleware, explains its role in distributed systems for decoupling, asynchronous processing, traffic shaping, and log handling, and reviews typical use cases, architectural patterns, JMS models, and popular products such as ActiveMQ, RabbitMQ, ZeroMQ, and Kafka.

Distributed Systems · JMS · Kafka

0 likes · 20 min read

Programmer DD

Mar 12, 2018 · Backend Development

How to Choose the Right Message Queue: Practical Insights Beyond the Hype

This article shares a seasoned developer’s perspective on selecting a message‑queue middleware, outlining typical adoption stages, three key evaluation criteria—coder expertise, current and future requirements, and community/ecosystem health—and offering candid advice on avoiding common pitfalls.

Backend Architecture · Kafka · MQ selection

0 likes · 9 min read

Beike Product & Technology

Mar 9, 2018 · Big Data

How Lianjia Built a Low‑Latency Real‑Time Data Platform with Spark Streaming

This article details Lianjia's journey of designing and implementing a low‑latency, stable real‑time computing platform using Spark Streaming on YARN, covering technical selection, architecture components, version compatibility challenges, exactly‑once semantics, graceful shutdown, Kafka tuning, and future enhancements.

Big Data · Exactly-Once · Kafka

0 likes · 11 min read

Huawei Cloud Developer Alliance

Feb 27, 2018 · Big Data

Master Vehicle IoT in 5 Minutes: Data Open & Collection Explained

This article introduces vehicle networking concepts, explains Huawei's OceanConnect solution, and details how data open and data collection are implemented using platforms like DAP, Kafka, and Hadoop to provide reliable, real‑time vehicle information for various applications.

Data Open · Hadoop · Huawei OceanConnect

0 likes · 6 min read

ITFLY8 Architecture Home

Feb 25, 2018 · Big Data

Building Scalable Data Platforms with SMACK: Spark, Mesos, Akka, Cassandra & Kafka

Learn how to construct a scalable data processing platform using the SMACK stack—Spark, Mesos, Akka, Cassandra, and Kafka—covering storage design, processing workflows, resource management, deployment options, and fault‑tolerant task execution for both batch and streaming workloads.

Akka · Kafka · Mesos

0 likes · 14 min read

iQIYI Technical Product Team

Jan 31, 2018 · Big Data

Evolution of iQIYI Real-Time Big Data Collection System

iQIYI’s big‑data collection system has progressed from simple HTTP log uploads to a Flume‑Kafka pipeline and finally to a custom Venus‑Agent architecture with centralized configuration, persistent offsets, dual‑Kafka streams and Flink processing, now handling tens of millions of queries per second and over three hundred billion records daily to power its AI‑driven services.

Big Data · Flink · Flume

0 likes · 15 min read

Hujiang Technology

Jan 29, 2018 · Operations

Design and Implementation of a Low‑Impact Distributed Tracing System for Service Calls

This article describes the background, design goals, architecture, implementation details, and lessons learned from building a low‑overhead, low‑intrusion distributed tracing system using Kafka, Elasticsearch, and OpenTracing to monitor microservice interactions and support performance analysis and DevOps decision‑making.

Distributed Tracing · Elasticsearch · Kafka

0 likes · 9 min read

dbaplus Community

Jan 16, 2018 · Big Data

Kafka MirrorMaker Mastery: Real‑Time Sync, Tuning & Troubleshooting

Kafka MirrorMaker provides near‑real‑time cross‑data‑center replication by consuming from a source cluster and producing to a target cluster, and this guide explains its core features, new vs. old consumer APIs, partition assignment strategies, performance tuning, network considerations, and practical command‑line examples.

Consumer API · Kafka · MirrorMaker

0 likes · 13 min read

Meituan Technology Team

Jan 12, 2018 · Backend Development

Design and Implementation of Meituan Hotel Full-Chain Log and Trace System

To cope with Meituan Hotel’s exploding micro‑service complexity, the infrastructure team built the Satellite System—combining MTrace and a selective, zero‑intrusion Log4j2‑based logging pipeline that streams enriched logs through Kafka, Storm, Redis and Elasticsearch, delivering second‑level trace‑log queries and six‑month retention, dramatically speeding up debugging.

Distributed Tracing · Elasticsearch · Kafka

0 likes · 11 min read

MaGe Linux Operations

Dec 11, 2017 · Big Data

Master Kafka Basics: Architecture, Core Concepts, and Hands‑On Python Experiments

This article explains Kafka’s core concepts—including producers, consumers, topics, partitions, brokers, and consumer groups—describes its distributed architecture with leader‑follower replication, and provides three hands‑on kafka‑python experiments that demonstrate basic messaging, fault‑tolerant consumer groups, and offset management for reliable consumption.

Distributed Streaming · Kafka · Offset Management

0 likes · 9 min read

ITFLY8 Architecture Home

Nov 29, 2017 · Backend Development

How Kafka Stores Messages: Partitions, Segments, and Sparse Indexes Explained

This article explains Kafka's internal message storage mechanism, detailing how topics are divided into partitions, how partitions are segmented into LogSegments with data and index files, and how sparse indexing enables efficient offset lookups.

Kafka · LogSegment · Message Storage

0 likes · 9 min read

21CTO

Nov 11, 2017 · Big Data

How We Built a Scalable Seller Log System with Kafka, Storm, ES & HBase

This article explains the design and implementation of a unified seller‑operation logging platform that uses Kafka for ingestion, Storm for real‑time processing, Elasticsearch for hot‑data search, and HBase for cold‑data storage, detailing the challenges faced and the optimizations applied.

Big Data · Elasticsearch · HBase

0 likes · 12 min read

Architecture Digest

Nov 11, 2017 · Big Data

Design and Implementation of a Seller Log System Using Kafka, Storm, Elasticsearch, and HBase

This article describes the design and implementation of a seller log system, detailing the use of Kafka for high‑throughput messaging, Storm for real‑time stream processing, Elasticsearch for hot‑data search, and HBase for cold‑data storage, along with challenges faced and optimization solutions.

Kafka · Storm · Streaming

0 likes · 12 min read

360 Zhihui Cloud Developer

Oct 26, 2017 · Backend Development

Understanding Kafka’s NIO Selector: How the Selector Class Manages Connections

This article delves into Kafka’s network layer implementation, explaining the Selector class’s role in registering socket channels, handling connection events, and orchestrating reads and writes via KafkaChannel and TransportLayer, while illustrating packet structures and providing code snippets for key functions like register, connect, poll, and send.

Kafka · Network I/O · backend-development

0 likes · 7 min read

Hujiang Technology

Oct 17, 2017 · Operations

Design and Implementation of a Distributed Real-Time Log Collection and Analysis System Using the ELK/EFK Stack

This article describes the background, requirements, architecture choices, performance testing, and lessons learned from building a large‑scale, distributed log collection and analysis platform at Hujiang using Elasticsearch, Logstash, Kibana, Filebeat, and Kafka to handle billions of log entries daily.

ELK · Filebeat · Kafka

0 likes · 12 min read

dbaplus Community

Oct 15, 2017 · Big Data

How JD Built a Scalable Seller Log Platform with Kafka, Storm, ES & HBase

This article details JD's end‑to‑end seller log system architecture, explaining why Kafka, Storm, Elasticsearch and HBase were chosen, the challenges faced during scaling, and the practical solutions implemented to achieve a unified, high‑throughput logging platform for merchants and operations.

Big Data · Elasticsearch · HBase

0 likes · 13 min read

StarRing Big Data Open Lab

Oct 13, 2017 · Artificial Intelligence

Real‑Time KMeans Streaming with Sophon & Slipstream: From Model Training to Kafka Prediction

This guide demonstrates how to train a KMeans model with Transwarp Sophon and deploy it in Slipstream for real‑time streaming predictions on Kafka data, covering model export, stream creation, SQL‑based inference, and result persistence.

KMeans · Kafka · Slipstream

0 likes · 7 min read

Dada Group Technology

Sep 29, 2017 · Operations

Overwatch: A Distributed System Monitoring Platform for Real‑Time RPC Visibility

Overwatch is an open‑source distributed monitoring platform built by Dada‑Jingdong Home that collects, aggregates, and visualizes RPC traffic across thousands of micro‑services in real time, enabling engineers to quickly pinpoint the root cause of system failures using directed‑graph visualizations and CQRS‑based data queries.

CQRS · Kafka · RPC

0 likes · 10 min read

Qunar Tech Salon

Sep 25, 2017 · Big Data

Comprehensive Guide to Spark Ecosystem: Data Warehouse, Machine Learning, Streaming, and Enterprise Use Cases

This article provides an extensive overview of Apache Spark’s ecosystem—including its data‑warehouse capabilities, ML/MLlib libraries, streaming with Spark Streaming, external frameworks, and real‑world enterprise case studies—while also noting a promotional announcement for a React Native conference.

Big Data · Kafka · Spark

0 likes · 21 min read

CoolHome R&D Department

Sep 18, 2017 · Big Data

How Vimur Leverages Kafka for Real‑Time Data Migration and Synchronization

This article details how Vimur, a Kafka‑based real‑time data pipeline, addresses the challenges of service splitting, database sharding, data migration, and synchronization by using CDC, a unified Avro format, and a change distribution platform to support search indexing, cache refresh, and reactive architectures.

Avro · CDC · Data Migration

0 likes · 21 min read

21CTO

Sep 14, 2017 · Backend Development

How PhxQueue Achieves High‑Availability, High‑Throughput Distributed Queuing with Paxos

PhxQueue is an open‑source, Paxos‑based distributed queue from Tencent that provides at‑least‑once delivery, synchronous disk flushing, strict ordering, multi‑subscription, and high throughput. The article argues that it outperforms Kafka in reliability and failover scenarios while supporting massive workloads such as WeChat Pay.

Kafka · Paxos · WeChat

0 likes · 17 min read

dbaplus Community

Sep 13, 2017 · Big Data

How Kafka’s High‑Level Consumer Works, Rebalance Challenges, and the Next‑Gen Design

This article explains Kafka’s High‑Level and Low‑Level consumer models, the semantics of Consumer Groups, the rebalance algorithm and its drawbacks, and outlines the planned redesign in Kafka 0.9.x that introduces a central Coordinator to solve herd and split‑brain issues.

Consumer · High-Level Consumer · Kafka

0 likes · 20 min read

Java High-Performance Architecture

Sep 12, 2017 · Big Data

What Is KSQL? A Beginner’s Guide to Real‑Time Stream SQL on Kafka

KSQL is an open‑source, distributed SQL engine for Apache Kafka that enables continuous, real‑time queries on streaming data, lowering the barrier for analysts to perform stream processing, monitoring, security checks, and analytics without writing code.

KSQL · Kafka · Real-time analytics

0 likes · 6 min read

WeChat Backend Team

Sep 12, 2017 · Backend Development

How PhxQueue Achieves High‑Throughput, High‑Reliability Distributed Queuing with Paxos

PhxQueue, an open‑source, Paxos‑based distributed queue from WeChat, delivers at‑least‑once delivery, synchronous disk flushing, strict ordering, multi‑subscription, and high availability, outperforming Kafka in reliability and latency while maintaining comparable throughput, as demonstrated through detailed design, performance, and failover analyses.

Distributed Systems · Kafka · Paxos

0 likes · 26 min read

Architecture Digest

Sep 9, 2017 · Backend Development

Jkes: A Java‑Kafka‑ElasticSearch Search Framework – Installation, Configuration, and Usage Guide

Jkes is a Java‑based search framework built on Kafka and ElasticSearch that provides annotation‑driven JPA‑style mapping, REST APIs for indexing and searching, and detailed integration with Spring Boot, offering developers a complete backend solution for scalable document search.

Elasticsearch · Kafka · Search Framework

0 likes · 12 min read

Architecture Digest

Sep 7, 2017 · Big Data

Design and Implementation of Bilibili's Lancer Log Collection System

The article presents the architecture, component design, optimizations, and reliability guarantees of Bilibili's Lancer log collection system, a Flume‑based distributed pipeline that handles both real‑time and offline data streams for billions of events daily.

Big Data · Distributed Systems · Flume

0 likes · 13 min read

dbaplus Community

Sep 5, 2017 · Big Data

Why Kafka Needs High Availability: Deep Dive into Replication and Leader Election

This article explains why Kafka introduced High Availability in version 0.8, covering the necessity of data replication and leader election, the internal replication and ACK mechanisms, Zookeeper metadata structures, broker failover procedures, and the command‑line tools that help manage and rebalance a Kafka cluster.

Kafka · Replication · high availability

0 likes · 36 min read

BiCaiJia Technology Team

Sep 2, 2017 · Backend Development

Integrate Kafka with Spring Boot 1.4 Using Spring Integration – Step‑by‑Step Guide

This guide walks you through setting up Kafka and Zookeeper, adding Spring Integration dependencies, configuring application.yml, creating producer and consumer configurations with @Configuration and @EnableKafka, implementing a @KafkaListener, and testing the integration via a Spring MVC endpoint, while highlighting common pitfalls.

Kafka · Messaging · Spring Boot

0 likes · 6 min read

BiCaiJia Technology Team

Sep 2, 2017 · Big Data

How to Install and Test Kafka on CentOS: A Step‑by‑Step Guide

This guide walks you through installing Zookeeper and Kafka on a CentOS server, configuring essential settings, creating topics, and running producers and consumers, while highlighting common pitfalls and providing the exact commands needed for a successful deployment.

Big Data · CentOS · Installation

0 likes · 6 min read

Architecture Digest

Aug 29, 2017 · Big Data

Introduction to Apache Kafka: Concepts, Architecture, and Core APIs

This article provides a comprehensive overview of Apache Kafka, explaining its role in real‑time data pipelines and stream processing, describing key concepts such as topics, partitions, logs, producers, consumers, replication, guarantees, and how Kafka functions as both a messaging and storage system.

Consumer API · Distributed Streaming · Kafka

0 likes · 13 min read

21CTO

Jul 23, 2017 · Backend Development

Comparing Kafka and RocketMQ: Architecture, Availability, and Reliability Insights

This article examines the architectures of Kafka and RocketMQ, analyzes their availability and reliability mechanisms, evaluates their strengths and weaknesses, and proposes a hybrid MQ design that combines the benefits of both systems while simplifying dependencies and improving fault tolerance.

Availability · Kafka · Message Queue

0 likes · 13 min read

21CTO

Jul 20, 2017 · Backend Development

How Ctrip Built a Real-Time User Data Collection System with Netty and Kafka

This article details Ctrip's design and implementation of a high‑throughput, low‑latency user data collection platform that leverages Java NIO, Netty, and a custom Kafka‑based messaging layer, covering architecture, encryption, compression, disaster‑recovery, performance testing, and downstream analytics products.

Avro · Backend Architecture · Data Streaming

0 likes · 17 min read

Architecture Digest

Jul 20, 2017 · Backend Development

Kafka and RocketMQ Architecture: Availability, Reliability, and Design Insights

This article examines the architectures of Kafka and RocketMQ, analyzes their availability and reliability mechanisms, compares their strengths and weaknesses, and proposes hybrid designs and simplified MQ solutions for building robust message‑queue systems.

Kafka · Reliability · RocketMQ

0 likes · 14 min read

Architecture Digest

Jul 18, 2017 · Backend Development

Design and Implementation of Ctrip Real‑Time User Data Collection System

This article describes the design, technology selection, and performance evaluation of Ctrip's real‑time user behavior data collection platform, covering Netty‑based network handling, Kafka/Hermes messaging, encryption, compression, Avro backup, and related analytics products, with detailed feasibility analysis and benchmark results.

Backend Architecture · Distributed Systems · Kafka

0 likes · 17 min read

21CTO

Jul 8, 2017 · Big Data

Ctrip’s Scalable Real‑Time User Behavior System with Kafka, Storm, Redis

This article details Ctrip’s redesign of its real‑time user behavior service, covering the new architecture, data flow, use of Java, Kafka, Storm, Redis, and MySQL, and how it achieves high real‑time performance, availability, scalability, and fault‑tolerance to support massive travel‑industry traffic.

Kafka · Real-Time · Storm

0 likes · 12 min read

21CTO

Jun 11, 2017 · Big Data

How Kafka Guarantees High Reliability – Architecture, Replication & Benchmarks

This article explains Kafka's distributed architecture, topic‑partition model, replication and ISR mechanisms, data durability settings, delivery guarantees, deduplication strategies, and presents benchmark results that illustrate how configuration choices affect throughput and latency in real‑world deployments.

Distributed Messaging · Kafka · Replication

0 likes · 33 min read

Architecture Digest

Jun 11, 2017 · Big Data

Kafka High‑Reliability Architecture, Storage Mechanisms, Replication, and Benchmark Analysis

This article explains Kafka's distributed architecture, its topic‑partition storage model, replication and synchronization mechanisms, reliability guarantees such as ISR and high‑watermark, and presents benchmark results that illustrate how replication factor, acks settings, and partition count affect throughput and latency.

Kafka · Reliability · benchmark

0 likes · 34 min read

Architecture Digest

Jun 9, 2017 · Big Data

A Comprehensive Guide for Big Data Beginners: From Hadoop Fundamentals to Machine Learning

This guide walks beginners through the entire big‑data ecosystem, covering the 4V characteristics, core open‑source frameworks, Hadoop setup, Hive and SQL on Hadoop, data ingestion and export tools, task scheduling, real‑time processing with Kafka, Storm and Spark Streaming, and an introduction to machine‑learning applications.

Hadoop · Kafka · Spark

0 likes · 17 min read

Tongcheng Travel Technology Center

Jun 5, 2017 · Backend Development

Evolution of a High‑Scale Push Notification System at Tongcheng Travel

This article chronicles the multi‑year architectural evolution of Tongcheng Travel's push notification platform, detailing early batch‑job designs, successive redesigns using Redis, MongoDB, Kafka, Go, and .NET, and the performance, scalability, and operational improvements achieved through each major version.

Go · Kafka · MongoDB

0 likes · 12 min read

MaGe Linux Operations

May 28, 2017 · Backend Development

Understanding Kafka’s Architecture: Topics, Partitions, and Reliability

This article explains Kafka’s core architecture—including brokers, topics, partitions, offsets, producer and consumer mechanics, replication, availability, consistency, persistence, performance optimizations, and Zookeeper integration—providing a comprehensive guide for building reliable distributed messaging systems.

Distributed Messaging · Kafka · OFFSET

0 likes · 15 min read

Architecture Digest

May 18, 2017 · Backend Development

Design and Architecture of Ctrip's Real‑Time User Behavior Service

The article describes how Ctrip rebuilt its real‑time user behavior platform using a Java‑based stack (Kafka, Storm, Redis, MySQL) to achieve millisecond‑level latency, high availability, scalable performance, and robust handling of traffic spikes, failures, and data back‑pressure.

Backend Architecture · Kafka · Real-Time

0 likes · 12 min read

Tongcheng Travel Technology Center

May 11, 2017 · Operations

Design and Experience of a Near Real-Time Log System Based on Kafka and Elasticsearch

This article describes the architecture, deployment, configuration, maintenance, and performance results of a large‑scale near real‑time logging platform built with Kafka, Flume, and Elasticsearch, highlighting practical lessons and future plans for resource‑efficient operation.

Elasticsearch · Kafka · Log Management

0 likes · 6 min read

Architects' Tech Alliance

May 7, 2017 · Big Data

Building a Complete Big Data Platform: From Hadoop Basics to Real‑Time Analytics

This guide walks beginners through the entire big‑data ecosystem—explaining the 4V characteristics, listing essential open‑source components, teaching Hadoop setup, Hive and SparkSQL usage, data ingestion with Sqoop, Flume and Kafka, task scheduling with Oozie, and real‑time processing with Storm and Spark Streaming.

Big Data · Hadoop · Kafka

0 likes · 20 min read

Architecture Digest

Apr 27, 2017 · Big Data

Kafka High‑Reliability Architecture, Storage Mechanisms, and Performance Benchmark

This article explains Kafka's distributed architecture, its topic‑partition storage model, replication and ISR mechanisms, leader election, delivery guarantees, configuration for high reliability, and presents extensive benchmark results showing how replication factor, acks settings, and partition count affect throughput and latency.

Kafka · high reliability · performance benchmark

0 likes · 39 min read

ITFLY8 Architecture Home

Apr 21, 2017 · Backend Development

Mastering Kafka: Producer‑Consumer vs Pub/Sub Patterns for Scalable Backend Design

This article explains Kafka's core concepts and compares producer‑consumer and publish‑subscribe models, illustrating how to apply each pattern for data ingestion and event distribution in distributed backend systems, and offers practical design alternatives when Kafka’s native capabilities fall short.

Backend Architecture · Kafka · Message Queue

0 likes · 10 min read

Qunar Tech Salon

Apr 21, 2017 · Big Data

Ensuring Exact‑Once Semantics in Spark Streaming with Kafka: Offline Repair and Data Deduplication Strategies

This article explains why Spark Streaming combined with Kafka can only guarantee at‑least‑once delivery, outlines the challenges of delayed and out‑of‑order events, and presents practical offline‑repair, deduplication, and output‑format techniques (with code examples) to achieve exactly‑once semantics in big‑data pipelines.

Exact-Once · HBase · HDFS

0 likes · 11 min read

Tongcheng Travel Technology Center

Apr 10, 2017 · Operations

Sentinel Monitoring System: Real‑Time Business Log Monitoring and Incident Detection for an Airline Ticket Platform

The Sentinel system was built to provide real‑time, zero‑modification monitoring of airline ticket business services by consuming Tianwang logs through a Storm cluster, offering flexible rule configuration, addressing performance pitfalls, and planning future enhancements such as custom monitoring scripts and visual dashboards.

Kafka · Log Processing · Real-Time

0 likes · 6 min read

StarRing Big Data Open Lab

Apr 1, 2017 · Big Data

How Inceptor StreamSQL Simplifies Real-Time Data Processing with SQL

This article introduces Inceptor StreamSQL, explains its core concepts of Stream, StreamJob, and Application, and provides a step‑by‑step tutorial—from creating a Kafka source to launching a StreamJob and querying results—highlighting its ease of use and performance benefits.

Inceptor · Kafka · StreamSQL

0 likes · 10 min read

ITFLY8 Architecture Home

Mar 26, 2017 · Big Data

How to Build Scalable Log Monitoring and Analytics with ELK, Kafka, and Spark

This article explains various enterprise log types, recommends monitoring tools like Cacti, Zabbix, Splunk, and the ELK stack, and details architectures for handling server, application, and user‑click logs using technologies such as Logstash, Elasticsearch, Kibana, Kafka, Flume, and Spark.

Analytics · Big Data · ELK

0 likes · 26 min read

Efficient Ops

Mar 20, 2017 · Big Data

How eBay Built a Scalable Kafka‑Based Real‑Time Data Transmission Platform

This article details eBay's year‑long development of an enterprise‑grade, Kafka‑driven data transmission platform, covering its architecture, core services, monitoring and automation strategies, as well as performance tuning techniques that enable high throughput, low latency, and reliable cross‑data‑center replication.

Data Streaming · Kafka · Real-time Processing

0 likes · 22 min read

Qunar Tech Salon

Mar 1, 2017 · Big Data

Building Prism: Qunar’s Real‑Time Data Platform and DevOps Journey

The article describes how Qunar designed and evolved its Prism real‑time data platform—leveraging ELK, Kafka, Spark, Docker, and Mesos—to improve data collection, monitoring, and analysis, reduce deployment time, and support scalable DevOps operations across the company.

Big Data · ELK · Kafka

0 likes · 11 min read

ITFLY8 Architecture Home

Feb 24, 2017 · Big Data

How ELK, Kafka, and Spark Streaming Revolutionize Log Management in Big Data Environments

This article explores the evolution of log processing in the big‑data era, detailing how ELK Stack, Kafka, and Spark Streaming work together to provide scalable, real‑time log collection, analysis, and visualization for modern cloud‑native operations.

Big Data · ELK · Kafka

0 likes · 12 min read

Efficient Ops

Feb 23, 2017 · Operations

How Qunar Built Prism: A Real‑Time Data Platform That Halves Deployment Time

This article describes how Qunar’s Prism platform combines ELK, Kafka, Spark, Docker and other open‑source tools to create a real‑time data pipeline that speeds up problem localization, reduces deployment time, and improves resource utilization across development and operations teams.

DevOps · Docker · ELK

0 likes · 14 min read

Tencent Cloud Developer

Feb 14, 2017 · Databases

TDSQL Audit Capability: Architecture, Kafka Integration, and Consistency Hash Implementation

TDSQL’s cloud‑based audit solution combines a three‑proxy high‑availability layer, Kafka’s O(1) persistent messaging, and a distributed audit‑server that uses consistent hashing and multi‑coroutine processing to consume data within seconds. Fault‑tolerant offsets, majority acknowledgments, and Tencent Cloud MongoDB storage keep the audit log secure, ordered, scalable, and highly reliable.

Kafka · MongoDB · TDSQL

0 likes · 7 min read

dbaplus Community

Feb 13, 2017 · Backend Development

Why Message Queues Are Essential for Scalable Distributed Systems

Message queues are a key middleware component in distributed systems: they reduce coupling, enable asynchronous processing, smooth traffic spikes, and improve availability. The article walks through real-world scenarios (asynchronous handling, decoupling, traffic throttling, logging, and communication) and reviews popular options such as ActiveMQ, RabbitMQ, ZeroMQ, Kafka, and JMS.

Backend Architecture · JMS · Kafka

0 likes · 20 min read

StarRing Big Data Open Lab

Feb 10, 2017 · Information Security

Securing Kafka with Kerberos and ACLs: A Practical Guide

This article explains Kafka's architecture, identifies its security vulnerabilities, and presents Transwarp's Kerberos authentication and ACL-based authorization solutions, including configuration steps, code examples, and best practices for building a secure Kafka service.

ACL · Kafka · Kerberos

0 likes · 12 min read

21CTO

Jan 18, 2017 · Big Data

Build a Lightweight, High‑Availability Real‑Time Stream Processing System

Learn how to construct a simple, high‑availability real‑time stream processing platform using lightweight components such as Kafka, Zookeeper, Thrift/Avro, and optional storage like MongoDB or Elasticsearch, offering a practical alternative to heavyweight frameworks like Storm and Spark Streaming for small‑to‑medium enterprises.

Big Data · Kafka · Real-Time

0 likes · 5 min read

Java High-Performance Architecture

Jan 12, 2017 · Big Data

How Does Kafka’s Cluster Architecture Enable Scalable Messaging?

This article explains Kafka’s overall cluster structure, how producers write messages to partitions, and how consumer groups read messages, covering topics, partitions, leaders, followers, offsets, and load‑balancing mechanisms in a concise technical overview.

Backend · Kafka · Message Queue

0 likes · 5 min read
