Tagged articles
34 articles
DataFunTalk
Nov 6, 2025 · Cloud Native

How Tencent Music Cut Kafka Costs by 50% with Cloud‑Native AutoMQ

Tencent Music migrated its massive Kafka streaming infrastructure to the cloud‑native AutoMQ platform, cutting operational costs by more than half, completing partition migration in seconds, and substantially improving scaling efficiency while maintaining high‑throughput, low‑latency data processing for its music services.

AutoMQ · Cost Optimization · Data Streaming
0 likes · 16 min read
High Availability Architecture
Oct 30, 2025 · Operations

How Tencent Music Cut Kafka Costs by 50% with Cloud‑Native AutoMQ

Tencent Music replaced its traditional Kafka clusters with the cloud‑native AutoMQ platform, cutting infrastructure costs by more than half, completing partition migration in seconds, and substantially simplifying operations while maintaining high‑throughput, low‑latency data streams for its massive music services.

AutoMQ · Cloud Native · Data Streaming
0 likes · 17 min read
Sanyou's Java Diary
Dec 2, 2024 · Big Data

Understanding Kafka: Core Architecture, Storage, and Reliability Explained

This article provides a comprehensive overview of Kafka, covering its overall structure, key components such as brokers, producers, consumers, topics, partitions, replicas, leader‑follower mechanics, logical and physical storage models, producer and consumer workflows, configuration parameters, partition assignment strategies, rebalancing, log retention and compaction, indexing, zero‑copy transmission, and the reliability concepts that ensure data durability.

Data Streaming · Distributed Systems · Kafka
0 likes · 18 min read
Code Mala Tang
Jul 5, 2024 · Frontend Development

Master TransformStream: Real-World Uses, Code Samples, and Common Pitfalls

This article introduces TransformStream, a core component of the Streams API that lets developers process and convert data chunks on the fly, with examples ranging from simple uppercase text conversion to compression, video transcoding, and real-time IoT filtering, plus guidance on common pitfalls such as error handling and backpressure.

Data Streaming · Node.js · TransformStream
0 likes · 13 min read
Java Architect Essentials
Jun 26, 2024 · Databases

Why Organizations Should Consider Using Apache Kafka Instead of Relational Databases

This article explains why organizations may replace traditional relational databases with Apache Kafka as a system of record, highlighting Kafka's economical, scalable, immutable log, its event replay, its flexibility across diverse use cases, and its suitability for highly regulated, data‑intensive environments.

Data Streaming · Event-Driven Architecture · Immutable Log
0 likes · 10 min read
Volcano Engine Developer Services
Nov 16, 2023 · Big Data

Why Replace Logstash with Flink? Boost Log Processing Performance and Reliability

This article examines the shortcomings of Logstash in log collection—data loss, poor performance, high troubleshooting cost, and lack of dynamic scaling—and demonstrates how migrating to Flink can provide at‑least‑once semantics, flexible error handling, high‑throughput low‑latency processing, automatic resource scaling, and advanced analytics within the ELK ecosystem.

Data Streaming · ELK · Flink
0 likes · 9 min read
21CTO
Oct 4, 2023 · Artificial Intelligence

How LangStream Merges Data Streams with Generative AI for Real‑Time LLM Apps

LangStream, the new open‑source framework from DataStax, combines event‑driven data streaming with generative AI, offering seamless integration with vector databases like Astra DB, Milvus, and Pinecone, and providing a Kubernetes‑based runtime that enables real‑time LLM applications without extensive coding.

Data Streaming · Kubernetes · LLM
0 likes · 7 min read
Sanyou's Java Diary
Sep 21, 2023 · Big Data

Understanding Kafka: Core Concepts, Architecture, and Reliability Explained

This article provides a comprehensive overview of Kafka, covering its overall architecture, key components such as brokers, producers, consumers, topics, partitions, replicas, and ZooKeeper, as well as logical and physical storage mechanisms, producer and consumer workflows, configuration parameters, partition assignment strategies, rebalancing, and the replication model that ensures data reliability.

Data Streaming · Distributed Systems · Kafka
0 likes · 18 min read
Baidu Geek Talk
Sep 18, 2023 · Big Data

How Real‑Time Interception and Bitmap UV Calculation Boost Mobile App Quality

This article explains how a performance middle‑platform for mobile apps uses real‑time change interception, unique color IDs, bitmap‑based UV counting, exception de‑obfuscation, and a multi‑stage data pipeline to detect and isolate problems early, reduce user impact, and improve overall app reliability.

Data Streaming · bitmap UV · caching
0 likes · 21 min read
DataFunTalk
Jan 20, 2023 · Big Data

Introduction to Flink CDC: Incremental Snapshot Algorithm and Framework

This article introduces Flink CDC, explains its incremental snapshot algorithm and the 2.0 framework design, compares it with traditional CDC pipelines, discusses the core API and dialect concept, and outlines community growth and future plans, providing a comprehensive technical overview for data engineers.

Apache Flink · Big Data · Change Data Capture
0 likes · 13 min read
21CTO
Nov 1, 2022 · Backend Development

Inside Netflix’s Scalable Backend: Microservices, CDN, and Data Pipelines

This article dissects Netflix’s massive backend system—covering its dual‑cloud deployment, Open Connect CDN, micro‑service architecture, API gateway, container platform, caching layers, data stores, and real‑time streaming pipelines—to reveal how the streaming giant achieves extreme scalability, reliability, and performance.

Cloud Native · Data Streaming · Microservices
0 likes · 16 min read
IT Architects Alliance
Oct 9, 2022 · Backend Development

Event‑Driven Messaging Patterns at Wix: Consumption, Projection, End‑to‑End Streaming, In‑Memory KV Stores, Scheduling, Transactions, and Aggregation

The article describes how Wix engineers built a robust, Kafka‑based event‑driven messaging infrastructure for over 1,400 microservices, detailing patterns such as consumption and projection, end‑to‑end streaming with websockets, in‑memory KV stores, schedule‑and‑forget jobs, exactly‑once transactions, and event aggregation to achieve scalability, resilience, and low‑latency data access.

Data Streaming · Distributed Systems · Event-Driven Architecture
0 likes · 16 min read
ITFLY8 Architecture Home
Sep 29, 2022 · Backend Development

Scaling Event‑Driven Messaging at Wix with Kafka: Key Patterns

This article explains how Wix uses Kafka‑based event‑driven messaging to decouple microservices, improve scalability, and achieve exactly‑once processing through patterns such as consume‑and‑project, end‑to‑end event streams, in‑memory KV stores, scheduled jobs, transactional events, and event aggregation.

Data Streaming · Distributed Systems · Event-Driven Architecture
0 likes · 16 min read
SQB Blog
Sep 22, 2022 · Big Data

How We Built a Low‑Latency Advertising Billing System with Kafka Streams

This article describes the design, implementation, and performance of ShouQianBa's advertising billing system, detailing the migration from Apache Druid to Kafka Streams, the architecture for real‑time event processing, data aggregation, persistence, fault tolerance, and the achieved low‑latency, high‑throughput metrics.

Advertising · Data Streaming · Real-time Billing
0 likes · 15 min read
Ops Development Stories
Aug 19, 2022 · Big Data

Master Kafka: From Core Concepts to Advanced Operations and Performance Tuning

This comprehensive guide explains Kafka’s origins, core architecture, data structures, write and read workflows, operational commands for topic and consumer‑group management, and practical performance‑tuning tips such as disk layout, JVM settings, flush policies, and log retention, providing a complete reference for engineers working with distributed streaming platforms.

Data Streaming · Distributed Messaging · Kafka
0 likes · 32 min read
DataFunTalk
Jul 31, 2022 · Big Data

Design, Evolution, and Optimization of NetEase's Log Collection and Transmission Service (Datastream‑NG)

This article presents a comprehensive overview of NetEase's log collection and transmission platform, detailing its evolution from 2011 to the current Datastream‑NG architecture, the system's design goals, core component optimizations, operational monitoring, and future plans for intelligent scaling and diagnostics.

Big Data · Cloud Native · Data Streaming
0 likes · 23 min read
Alibaba Cloud Native
Jul 17, 2022 · Cloud Native

Build Real-Time CDC Pipelines on Alibaba Cloud EventBridge with DTS

This article explains Change Data Capture (CDC) concepts, compares open‑source CDC tools, and shows how to leverage Alibaba Cloud EventBridge and DTS to build real‑time CDC pipelines, covering setup steps, event‑bus vs event‑stream choices, best‑practice scenarios such as CQRS, microservice decoupling, database backup, and SQL auditing.

CDC · Cloud Native · DTS
0 likes · 12 min read
Tencent Tech
Jun 23, 2022 · Big Data

Why Apache InLong’s Graduation Marks a New Era for Big Data Integration

Apache InLong, originally contributed by Tencent, has graduated to an Apache top‑level project, offering a one‑stop framework for petabyte‑scale data ingestion, processing, and reliable streaming, and is now widely adopted across advertising, payment, social, gaming, and AI industries.

Apache · Big Data Integration · Data Streaming
0 likes · 5 min read
Big Data Technology & Architecture
Feb 16, 2022 · Big Data

Using Flink CDC to Capture MySQL Changes and Sync Them to ClickHouse

This article introduces Change Data Capture (CDC), compares query‑based and log‑based approaches, explains Debezium and ClickHouse, and provides detailed Flink CDC and Flink SQL CDC examples—including Java source code, custom deserialization schema, ClickHouse sink implementation, and required Maven dependencies—to synchronize MySQL data into ClickHouse in real time.

Big Data · CDC · ClickHouse
0 likes · 17 min read
Big Data Technology & Architecture
Dec 22, 2021 · Big Data

Using Flink CDC to Capture MySQL Changes and Sink Them into ClickHouse

This article explains Change Data Capture (CDC), compares query‑based and log‑based approaches, introduces Debezium and ClickHouse, and provides step‑by‑step Flink CDC and Flink SQL CDC examples—including Java source, deserialization, sink code and required Maven dependencies—to stream MySQL binlog changes into ClickHouse for real‑time analytics.

Big Data · CDC · ClickHouse
0 likes · 14 min read
MaGe Linux Operations
Jun 3, 2021 · Big Data

Why Kafka Handles Billions of Messages: Architecture, Use Cases, and Fast Performance

This article introduces Kafka, LinkedIn’s high‑throughput distributed messaging system, explains its core concepts such as brokers, topics, partitions, offsets, producers, consumers, and consumer groups, outlines common use cases like asynchronous decoupling and data‑stream processing, and details its fast performance mechanisms, fault‑tolerance, installation, and configuration steps.

Big Data · Data Streaming · Installation
0 likes · 11 min read
DataFunTalk
May 4, 2021 · Big Data

Design and Implementation of a Real-Time Data Transmission Platform Based on Apache Flink at AutoHome

This article presents the background, requirements, architectural design, component interaction, and implementation details of AutoHome's real‑time data transmission platform built on Apache Flink, highlighting its high availability, exactly‑once semantics, scalability, DDL handling, and integration with existing streaming services.

Apache Flink · Big Data · Data Streaming
0 likes · 18 min read
Architecture Digest
Mar 25, 2021 · Big Data

Uber's Multi-Region Kafka Architecture and Disaster Recovery

This article explains how Uber built a multi‑region Kafka infrastructure with disaster‑recovery capabilities, detailing its replication topology, active/active and active/passive consumption modes, offset‑management service, and the challenges of ensuring reliable, low‑latency data streaming across regions.

Data Streaming · Kafka · Offset Management
0 likes · 9 min read
dbaplus Community
Oct 15, 2019 · Big Data

How to Build Real‑Time Data Pipelines for E‑Commerce Promotions

This article examines the surge in real‑time data demands for e‑commerce promotions, outlines how to collect, compute, and deliver streaming data, compares batch and stream processing, lists typical use cases, and discusses the challenges of building scalable, low‑latency pipelines.

Data Streaming · Real-Time · monitoring
0 likes · 11 min read
21CTO
Jul 20, 2017 · Backend Development

How Ctrip Built a Real-Time User Data Collection System with Netty and Kafka

This article details Ctrip's design and implementation of a high‑throughput, low‑latency user data collection platform that leverages Java NIO, Netty, and a custom Kafka‑based messaging layer, covering architecture, encryption, compression, disaster‑recovery, performance testing, and downstream analytics products.

Avro · Backend Architecture · Data Streaming
0 likes · 17 min read
StarRing Big Data Open Lab
Mar 21, 2017 · Big Data

How Real-Time Data Streaming Is Transforming Industries Today

This article explains how real‑time data streaming turns massive, continuously growing datasets into actionable insights across finance, energy, and e‑commerce, showcasing early adopters like ConocoPhillips and DHL while urging businesses to rethink models for the next wave of data management.

Big Data · Data Streaming · Real-time analytics
0 likes · 7 min read
Efficient Ops
Mar 20, 2017 · Big Data

How eBay Built a Scalable Kafka‑Based Real‑Time Data Transmission Platform

This article details eBay's year‑long development of an enterprise‑grade, Kafka‑driven data transmission platform, covering its architecture, core services, monitoring and automation strategies, as well as performance tuning techniques that enable high throughput, low latency, and reliable cross‑data‑center replication.

Data Streaming · Kafka · Real-time Processing
0 likes · 22 min read