Tagged articles

252 articles


May 31, 2018 · Backend Development

Design and Architecture of a Unified MySQL Data Synchronization Platform

This article details the design of a unified MySQL data synchronization platform that consolidates offline sync, real‑time subscription, and real‑time sync into BatchJob, StreamJob, and PieJob abstractions, describing task implementations, cluster architecture, high‑availability mechanisms, and evolution challenges such as file loss and metadata handling.

Backend Architecture · Batch Processing · data synchronization

0 likes · 10 min read


Java High-Performance Architecture

May 22, 2018 · Big Data

Is Apache Kafka Right for You? Core Features, Stream Processing, and Use Cases

This article explains Apache Kafka’s evolution and adoption by Fortune‑500 firms, outlines its two core capabilities, messaging (queue and publish/subscribe) and stream processing via the Java‑based Kafka Streams API, and provides example code, typical use cases, and guidance on scenarios where Kafka may not be the best solution.

Apache Kafka · Use Cases · stream processing

0 likes · 5 min read


Alibaba Cloud Developer

May 21, 2018 · Databases

How TcpRT Enables Real‑Time Service Quality Monitoring for Massive Cloud Databases

TcpRT is a real‑time instrumentation and diagnostic system for Alibaba Cloud RDS that non‑intrusively collects TCP trace data, aggregates billions of records per day, applies statistical and Cauchy‑based anomaly detection, and pinpoints root causes across hosts, proxies, and network devices at massive scale.

Cloud Databases · SIGMOD · anomaly detection

0 likes · 27 min read


Architecture Digest

Mar 14, 2018 · Big Data

Attributes Matrix and Data Flow Models of Apache Streaming Platforms

This article presents a comprehensive attributes matrix and data‑flow model overview for major Apache streaming platforms, comparing versions, sponsors, event handling, fault tolerance, processing order, latency, resource management, APIs, and supported connectors to aid practical technology selection.

Apache · Big Data · attributes matrix

0 likes · 16 min read


Meituan Technology Team

Jan 26, 2018 · Big Data

Design and Implementation of a Real-Time Data Processing System at Meituan

Meituan designed a Storm‑based real‑time data processing platform that guarantees at‑least‑once delivery and high availability, employs a custom spout, regression‑driven traffic smoothing, and a low‑latency KV store with atomic operations, persisting results in Kafka, MySQL and Cellar to power merchant dashboards and heat‑tag analytics, while planning broader real‑time analytics expansion.

Big Data · Distributed Systems · Storm

0 likes · 10 min read


Alibaba Cloud Developer

Jan 25, 2018 · Big Data

How Alibaba’s Real‑Time Big Data Engine Powered a Record‑Breaking Double 11

This article explains how Alibaba built a massive real‑time computing platform using Flink and its Blink extensions, detailing the challenges of ultra‑low latency, exactly‑once guarantees, and high throughput, and showing how these technologies powered the record‑breaking Double 11 shopping festival.

Flink · Real‑Time Computing · SQL

0 likes · 20 min read


dbaplus Community

Dec 26, 2017 · Big Data

Turning Raw Logs into Structured Data with DBus Visual Rule Operators

This article explains how the open‑source DBus platform, combined with the Wormhole streaming engine, captures raw application logs, lets users configure visual rule operators, and transforms the unstructured message part into schema‑driven, Kafka‑ready data for downstream analytics.

Big Data · DBus · Log Processing

0 likes · 15 min read


StarRing Big Data Open Lab

Dec 22, 2017 · Big Data

Slipstream 5.1 Unveiled: New CEP, Session Windows & Event‑Driven Engine

Slipstream 5.1 expands its real‑time stream processing capabilities with richer Complex Event Processing syntax, introduces Session Window support for session‑based analytics, and enhances the Morphling event‑driven engine, all accessible via SQL, making advanced streaming applications easier for both developers and business users.

Real-time analytics · complex event processing · session window

0 likes · 8 min read


Architecture Digest

Dec 16, 2017 · Big Data

Performance Comparison of Apache Flink and Apache Storm for Real‑Time Stream Processing

This report presents a systematic performance evaluation of Apache Flink and Apache Storm across multiple real‑time processing scenarios, measuring throughput, latency, message‑delivery semantics, and state‑backend effects, and provides recommendations for selecting the most suitable engine based on the observed results.

Big Data · Flink · Real-time analytics

0 likes · 21 min read


Qunar Tech Salon

Nov 30, 2017 · Big Data

Performance Comparison of Apache Flink and Apache Storm for Real-Time Stream Processing

This article presents a comprehensive performance evaluation of Apache Flink versus Apache Storm across multiple real‑time processing scenarios, measuring throughput, latency, and the impact of different configurations and delivery semantics to guide framework selection and optimization.

Exactly-Once · Flink · Real-time analytics

0 likes · 16 min read

ITPUB

Nov 23, 2017 · Big Data

7 Typical Big Data Projects Every Hadoop Engineer Should Know

The article outlines seven common big‑data initiatives—data integration, specialized analytics, Hadoop‑as‑a‑service, stream processing, complex event handling, ETL pipelines, and SAS replacement—explaining their goals, typical technologies such as HDFS, Hive, Spark, Storm, Kafka, and practical considerations for enterprises adopting Hadoop ecosystems.

Data Integration · Hadoop · project types

0 likes · 8 min read


Alibaba Cloud Developer

Nov 21, 2017 · Big Data

Inside Alibaba’s Stream Computing: 472 M Events/sec & 256 K Payments/sec on Double 11

Alibaba’s Double 11 showcase reveals how its upgraded stream computing platform handled a 100% year‑over‑year data surge, achieving 256 K successful payments per second and processing 472 million events per second in real time through a highly optimized Flink‑based architecture.

Alibaba · Big Data · Flink

0 likes · 10 min read


ITPUB

Nov 13, 2017 · Big Data

How Real‑Time Big Data Stream Computing Powers Double 11 E‑Commerce Success

The article explains how NetEase’s real‑time big‑data stream computing platform, Sloth, handles massive, continuously generated data during China’s Double 11 shopping festival, covering use cases, architectural shifts from batch to incremental processing, technical challenges, and the role of stream‑SQL for easier development.

Distributed Systems · Real‑Time Computing · SQL

0 likes · 16 min read


ITPUB

Nov 13, 2017 · Big Data

How Real-Time Big Data Streaming Powers Double 11 E‑Commerce Success

The article explains how continuous data generation and real‑time stream processing enable e‑commerce platforms like NetEase Kaola to handle massive Double 11 traffic, showcasing use cases, architectural shifts from batch to incremental computing, and the technical challenges of latency, accuracy, and fault tolerance.

Distributed Systems · Real-time Streaming · SQL

0 likes · 15 min read


dbaplus Community

Oct 15, 2017 · Big Data

How JD Built a Scalable Seller Log Platform with Kafka, Storm, ES & HBase

This article details JD's end‑to‑end seller log system architecture, explaining why Kafka, Storm, Elasticsearch and HBase were chosen, the challenges faced during scaling, and the practical solutions implemented to achieve a unified, high‑throughput logging platform for merchants and operations.

Big Data · Elasticsearch · HBase

0 likes · 13 min read


Java High-Performance Architecture

Sep 12, 2017 · Big Data

What Is KSQL? A Beginner’s Guide to Real‑Time Stream SQL on Kafka

KSQL is an open‑source, distributed SQL engine for Apache Kafka that enables continuous, real‑time queries on streaming data, lowering the barrier for analysts to perform stream processing, monitoring, security checks, and analytics without writing code.

KSQL · Kafka · Real-time analytics

0 likes · 6 min read


Baixing.com Technical Team

Sep 4, 2017 · Big Data

How Flink SQL Simplifies Real-Time Data Cleaning Compared to Storm

This article introduces Flink’s background, architecture, and ecosystem, then demonstrates a step‑by‑step tutorial on using Flink SQL to clean and transform streaming data from Kafka, highlighting its advantages over Storm for real‑time ETL.

Apache Flink · Flink · SQL

0 likes · 12 min read


Architecture Digest

Sep 3, 2017 · Big Data

An Overview of Big Data Processing Frameworks: Batch, Stream, and Hybrid Systems

This article introduces the evolution of big‑data processing from Google’s MapReduce concept to modern open‑source frameworks, defines big data and its 3V characteristics, outlines typical processing pipelines, and compares batch, stream, and hybrid systems such as Hadoop, Storm, Samza, Spark, and Flink.

Batch Processing · Big Data · Flink

0 likes · 20 min read


21CTO

Aug 14, 2017 · Big Data

Unveiling Flink’s Multi‑Layer Execution Graph: From StreamGraph to Physical Deployment

This article explains Flink’s architecture, detailing the roles of Client, JobManager and TaskManager, walks through a SocketTextStreamWordCount example, and clarifies the four‑layer graph model—StreamGraph, JobGraph, ExecutionGraph, and the physical execution graph—highlighting why each layer exists.

Big Data · Execution Graph · Flink

0 likes · 9 min read


Alibaba Cloud Developer

Jun 8, 2017 · Big Data

Flink Forward 2017: Stream Processing Insights from Alibaba, Uber & Netflix

The article recounts the 2017 Flink Forward conference in San Francisco, highlighting key sessions from DataArtisans, Uber, Netflix and Alibaba, and discusses real‑time stream processing use cases, large‑scale deployments, runtime and TableAPI/SQL improvements, and the growing adoption of Flink in the industry.

Apache Flink · Big Data · Flink

0 likes · 16 min read


Alibaba Cloud Developer

May 25, 2017 · Big Data

How Alibaba’s Blink Engine Redefines Real‑Time Big Data Processing

This article explains how Alibaba’s Blink, built on Apache Flink, transforms batch‑oriented big‑data platforms into a unified, high‑performance real‑time computing engine, detailing its architecture, state management, checkpointing, and successful deployment in e‑commerce, search, recommendation, and online machine‑learning scenarios.

Alibaba · Big Data · Flink

0 likes · 17 min read


Suning Technology

May 18, 2017 · Big Data

Why Apache Flink Beats Spark and Storm in Stream Processing

This article examines Apache Flink's stream‑processing architecture, compares its native streaming model, fault‑tolerance, performance and SQL capabilities with Spark and Storm, and concludes that Flink offers a more powerful and efficient solution despite some maturity gaps.

Apache Flink · Spark · Storm

0 likes · 12 min read


Architecture Digest

May 18, 2017 · Backend Development

Design and Architecture of Ctrip's Real‑Time User Behavior Service

The article describes how Ctrip rebuilt its real‑time user behavior platform using a Java‑based stack (Kafka, Storm, Redis, MySQL) to achieve millisecond‑level latency, high availability, scalable performance, and robust handling of traffic spikes, failures, and data back‑pressure.

Backend Architecture · Kafka · Real-Time

0 likes · 12 min read


Architecture Digest

Feb 11, 2017 · Big Data

LeKe Sports Big Data Platform Evolution: From Early ETL Reporting to 2.0 Streaming Architecture

The article describes how LeKe Sports built and continuously upgraded its Hadoop‑based big data platform—from a manual ETL‑to‑Elasticsearch reporting system to a 2.0 architecture featuring Spark Streaming, SQL‑based query layers, Elasticsearch indexing, and cloud‑native storage and backup solutions—to meet rapidly growing PB‑scale data demands.

Big Data · Data Platform · ETL

0 likes · 5 min read


21CTO

Jan 18, 2017 · Big Data

Build a Lightweight, High‑Availability Real‑Time Stream Processing System

Learn how to construct a simple, high‑availability real‑time stream processing platform using lightweight components such as Kafka, Zookeeper, Thrift/Avro, and optional storage like MongoDB or Elasticsearch, offering a practical alternative to heavyweight frameworks like Storm and Spark Streaming for small‑to‑medium enterprises.

Big Data · Kafka · Real-Time

0 likes · 5 min read


Alibaba Cloud Developer

Jan 9, 2017 · Big Data

How Alibaba Scaled Real‑Time Data Processing for Double 11: Architecture & Lessons

This article details Alibaba's real‑time computing architecture for the 2016 Double 11 event, covering background, core components such as DRC, TT, Galaxy, OTS, XTool and OneService, and explains optimization techniques, fault‑tolerance strategies, stress‑testing practices, and future upgrade plans to handle massive streaming data workloads.

Big Data · Performance Optimization · Real‑Time Computing

0 likes · 14 min read


GF Securities FinTech

Sep 21, 2016 · Big Data

How GF Securities Leverages Lambda/Kappa Architectures for Real-Time Stock Analytics

This article explains how GF Securities built a customized Lambda/Kappa‑style big‑data platform that integrates CEP, Spark, Flink and Kafka to deliver low‑latency stock price alerts, real‑time news, and capital‑flow trading strategies for the finance industry.

CEP · Lambda architecture · Spark

0 likes · 18 min read


Architecture Digest

Sep 10, 2016 · Big Data

Designing a Real-Time Stream Computing Platform for E‑commerce Peak Traffic at Yihaodian

The article describes how Yihaodian built a low‑latency, highly available, and easily scalable streaming computation platform using Storm, Kafka, Linux containers and a custom CGroup management framework to handle massive e‑commerce traffic spikes and real‑time analytics.

Kafka · Resource Isolation · Storm

0 likes · 9 min read


dbaplus Community

Aug 18, 2016 · Big Data

How Zhejiang Mobile Scaled Billion‑Level Real‑Time Stream Processing with Storm

This article details Zhejiang Mobile's architecture and practical experience in building a billion‑scale real‑time stream computing platform using Storm, Kafka, Flume, and Redis, covering use cases, system design, performance bottlenecks, optimization techniques, and monitoring strategies.

Apache Storm · Big Data Architecture · Flume

0 likes · 20 min read


Ctrip Technology

Aug 12, 2016 · Big Data

Ctrip's Real-Time Data Platform: Architecture, Practices, and Lessons Learned

This article details Ctrip's journey building a unified real-time data platform—covering business motivations, architectural requirements, technology choices like Kafka and Storm, implementation of Avro schemas, monitoring, alerting, operational lessons, and future explorations such as Streaming CQL and JStorm.

Alerting · Big Data · Kafka

0 likes · 15 min read

Architect

Jul 14, 2016 · Big Data

Understanding Custom Stream IDs and Topology Building in Apache Storm

This article explains how to construct Apache Storm topologies with custom stream IDs, demonstrates the classic WordCountTopology example, and provides detailed Java code snippets illustrating spout and bolt configurations, stream declarations, and grouping strategies for real‑time stream processing.

Apache Storm · Big Data · Custom Stream ID

0 likes · 8 min read


Art of Distributed System Architecture Design

Jun 11, 2016 · Big Data

Overview of Open-Source Real-Time Stream Processing Systems

This article provides a concise overview of several open‑source real‑time stream processing platforms—including S4, Storm, StreamBase, HStreaming, Esper/NEsper, Kafka, Scribe, and Flume—highlighting their primary features, programming languages, and project links for future technical research.

Apache · Big Data · Real-Time

0 likes · 5 min read


Architect

May 22, 2016 · Big Data

Understanding Flink Execution Resources: Operator Chains, Task Slots, Slot Sharing and CoLocation

This article explains Flink's core execution‑resource concepts—including operator chaining, task slots, slot‑sharing groups and co‑location groups—detailing their conditions, API controls, internal implementation, and how they together maximize throughput and resource utilization in stream processing.

Big Data · Flink · Resource Management

0 likes · 11 min read


Big Data and Microservices

Apr 19, 2016 · Industry Insights

Designing a Scalable Real‑Time Stock Prediction Architecture with Open‑Source Tools

This article outlines a reference architecture for a low‑latency, horizontally scalable real‑time stock prediction system built with open‑source components such as Spring Cloud Data Flow, Apache Geode, Spark MLlib, and Hadoop, and discusses data flow steps, simplified deployment, and algorithm choices for market forecasting.

Big Data · Real-Time · Stock Prediction

0 likes · 7 min read

Architect

Mar 29, 2016 · Big Data

Understanding Apache Storm Architecture, Stream Groupings, and the Acker Mechanism

This article provides a comprehensive overview of Apache Storm’s architecture, including the roles of Nimbus, Supervisor, and ZooKeeper, explains various stream groupings, details the Acker mechanism, and describes task execution, parallelism calculation, and internal data flow within the Storm cluster.

Apache Storm · Big Data · Real-time analytics

0 likes · 19 min read


Qunar Tech Salon

Feb 24, 2016 · Artificial Intelligence

Overview and Architecture of Pora: A Real‑Time Personalization Analytics Platform

The article introduces Pora, a real‑time analytics system for personalized search that combines high‑throughput stream processing, low‑latency computation, online learning algorithms, and a modular architecture to support continuous 24/7 operation and large‑scale performance optimizations.

AI · Online Learning · Real-time analytics

0 likes · 6 min read


21CTO

Jan 25, 2016 · Big Data

How Alibaba’s Pora Powers Real‑Time Personalization at Massive Scale

Pora (Personal Offline Realtime Analyze) is a high‑throughput, low‑latency platform that captures user behavior in real time, enabling Alibaba’s search engine to deliver personalized results, support online learning, and run 24/7 with massive data volumes.

Alibaba · Big Data · Pora

0 likes · 6 min read


Efficient Ops

Jan 5, 2016 · Information Security

How Apache Eagle Secures Hadoop: Real‑Time Big Data Threat Detection

Apache Eagle is an open‑source, distributed, real‑time security monitoring platform for Hadoop that combines stream‑processing, scalable policy enforcement, and machine‑learning user profiling to protect massive data assets across eBay’s production clusters.

Apache Eagle · Big Data · Hadoop

0 likes · 19 min read


Qunar Tech Salon

Dec 15, 2015 · Big Data

Real-Time Computing with Apache Storm: Architecture, Code Samples, and Fault Tolerance

This article explains the principles of real-time computing, compares it with offline batch processing, and demonstrates a practical solution using Kafka for ingestion, Apache Storm for continuous computation, and various storage options, while also covering streaming concepts and Storm's high‑availability mechanisms.

Apache Storm · Kafka · Real‑Time Computing

0 likes · 8 min read


Efficient Ops

Nov 26, 2015 · Big Data

Expert Insights on User Profiling and Stream Processing in Big Data

This article presents expert Q&A on effective user behavior analysis techniques for building detailed user profiles and compares mainstream stream‑processing solutions, outlining key factors such as latency, throughput, parallelism, and fault tolerance for selecting the right real‑time data platform.

Big Data · stream processing · user profiling

0 likes · 11 min read


21CTO

Nov 23, 2015 · Big Data

How Dianping Scales Real‑Time Analytics with Apache Storm

This article explains how Dianping built a millisecond‑level real‑time computation platform using Apache Storm, covering use cases, system architecture, core Storm concepts, performance tuning, best practices, and a detailed Q&A on their production deployment.

Apache Storm · Big Data · Real-time analytics

0 likes · 23 min read


Art of Distributed System Architecture Design

Sep 25, 2015 · Big Data

Understanding Storm: A Distributed Real-Time Computation System

The article explains the need for low‑latency, high‑performance, distributed real‑time processing, outlines the challenges such systems must address, and introduces Storm as a Hadoop‑like framework for stream processing, detailing its architecture, fault‑tolerance mechanisms, transactional topology, and large‑scale deployment at Taobao.

Big Data · Distributed Systems · Real-time Processing

0 likes · 14 min read


21CTO

Sep 24, 2015 · Big Data

Comparing Apache Storm, Spark, and Samza: Which Real‑Time Stream Processor Fits Your Needs?

Apache Storm, Spark Streaming, and Samza are three open‑source, low‑latency, scalable distributed systems for real‑time data processing; this article outlines their architectures, key concepts, differences in data handling, state management, delivery guarantees, and typical use‑cases to help you choose the right framework.

Apache Samza · Apache Storm · Big Data

0 likes · 7 min read


Art of Distributed System Architecture Design

Sep 24, 2015 · Big Data

Comparative Overview of Apache Storm, Spark Streaming, and Samza for Real-Time Data Processing

This article introduces Apache Storm, Spark Streaming, and Samza, explains their architectures, common features, key differences such as delivery guarantees and state management, and provides guidance on selecting the most suitable framework for various real‑time big‑data use cases.

Apache Storm · Big Data · Comparison

0 likes · 8 min read


Art of Distributed System Architecture Design

Sep 23, 2015 · Big Data

Overview of Open-Source Real-Time Stream Processing Systems

This article provides a concise overview of several open‑source real‑time stream processing platforms—including S4, Storm, StreamBase, HStreaming, Esper/NEsper, Kafka, Scribe, and Flume—highlighting their main features, programming languages, and project links for further reference.

Big Data · Kafka · Real-Time

0 likes · 5 min read

Java High-Performance Architecture

Sep 13, 2015 · Frontend Development

Why Gulp Outperforms Grunt: A Clear Comparison for Frontend Build Tasks

The article compares Gulp and Grunt, highlighting Gulp's clearer configuration and higher efficiency through stream processing, illustrated with CSS concatenation and minification code examples, and explains how Grunt's multiple I/O operations make it slower.

Frontend Build · Task Runner · css minification

0 likes · 2 min read


Qunar Tech Salon

Jul 8, 2015 · Big Data

Understanding Logs: The Foundation of Distributed Systems, Data Integration, and Stream Processing

This article explains how logs—simple, append‑only, time‑ordered records—serve as the core abstraction behind databases, distributed systems, data integration pipelines, and modern stream‑processing platforms such as Kafka and Hadoop, illustrating their design, scalability, and practical challenges.

Big Data · Data Integration · Distributed Systems

0 likes · 45 min read


Architect

Jul 6, 2015 · Big Data

Understanding Logs: The Core of Distributed Systems and Data Integration

This article explains how logs—simple, append‑only, time‑ordered records—serve as the fundamental abstraction behind databases, distributed systems, data integration pipelines, and stream‑processing platforms like Kafka and Hadoop, illustrating their role in ordering, replication, scalability, and real‑time analytics.

Data Integration · Distributed Systems · Hadoop

0 likes · 48 min read


Art of Distributed System Architecture Design

Jun 19, 2015 · Big Data

Storm vs Spark: Which Real‑Time Analytics Platform Wins for Your Business?

The article compares Apache Storm and Apache Spark, examining their origins, architecture, language support, integration capabilities, and performance characteristics, and offers guidance on selecting the right platform for real‑time business intelligence based on specific workload and infrastructure needs.

Apache Spark · Apache Storm · Big Data

0 likes · 11 min read


High Availability Architecture

May 15, 2015 · Big Data

Real-Time Computing at Dianping: Architecture, Use Cases, and Best Practices

During a detailed live session, senior Dianping engineer Wang Xinchun explains the company's real‑time computing platform built on Apache Storm, covering use cases such as dashboards, search and recommendation, system architecture, data ingestion tools like Blackhole and Puma, performance tuning, monitoring, and practical best‑practice recommendations.

Apache Storm · Big Data · Real‑Time Computing

0 likes · 21 min read


Art of Distributed System Architecture Design

Apr 15, 2015 · Big Data

Understanding Stream Processing, Event Sourcing, and Complex Event Processing

The article explains the fundamentals of stream processing, event sourcing, and complex event processing, comparing raw event storage with aggregated results, illustrating architectures with Kafka, Samza, and other frameworks, and highlighting benefits such as scalability, flexibility, and decoupling for modern data‑driven systems.

Apache Kafka · Apache Samza · Big Data

0 likes · 11 min read


Qunar Tech Salon

Mar 16, 2015 · Big Data

Comparison of Apache Storm, Spark Streaming, and Samza for Real‑Time Data Processing

This article introduces Apache Storm, Spark Streaming, and Apache Samza, outlines their architectures, highlights commonalities and differences such as delivery guarantees and state management, and offers guidance on selecting the most suitable framework for various real‑time big‑data use cases.

Apache Samza · Apache Storm · Big Data

0 likes · 8 min read
