Tagged articles
278 articles
Xianyu Technology
Oct 13, 2020 · Industry Insights

How Xianyu Uses Real‑Time Data Analytics to Accelerate Operations

This case study explains how Xianyu built a real‑time data analysis platform called Nanomirror to democratize data science, enabling dynamic drill‑down, intelligent facet analysis, AB‑bucket evaluation, and metric prediction, thereby shortening experiment cycles and improving operational decision‑making.

Case Study · Real-time analytics · facet analysis
0 likes · 11 min read
ITPUB
Sep 14, 2020 · Big Data

How Alibaba’s DChain Data Converger Auto‑Generates Real‑Time Wide Tables with SQL Pipelines

This article explains how the ADC (Alibaba DChain Data Converger) project automatically creates large real‑time tables by letting users configure metrics on the front‑end, then generating and publishing SQL through a pipeline that leverages design patterns, priority queues, and tree‑based data structures for efficient cross‑database processing.

Design Patterns · Flink · Real-time analytics
0 likes · 15 min read
DataFunTalk
Aug 25, 2020 · Databases

Real‑time Data Ingestion and Optimization with ClickHouse at ByteDance

This article details ByteDance's engineering practices for using ClickHouse to ingest, store, and query massive real‑time recommendation and advertising data, covering early external‑transaction mechanisms, the risks of direct INSERTs, the design and evaluation of Kafka Engine versus Flink pipelines, and a series of performance and reliability improvements implemented to support high‑frequency workloads.

ClickHouse · Database Optimization · Kafka
0 likes · 20 min read
IT Architects Alliance
Aug 12, 2020 · Big Data

Introduction to Confluent KSQL for Real-Time Stream Processing

This article introduces Confluent KSQL, a SQL‑based real‑time stream processing engine for Kafka, covering its architecture, stream vs table concepts, query lifecycle, Docker‑based setup, DDL commands, example joins, windowed aggregations, connectors, and its advantages and limitations.

Big Data · Docker · KSQL
0 likes · 9 min read
Youku Technology
Aug 6, 2020 · Big Data

Alibaba Entertainment Data Platform: The Journey Ahead

The presentation outlines how Alibaba's entertainment data platform has evolved to meet the real‑time, low‑cost, and scalable analytics demands of campaigns such as Double 11 and 618, detailing its architecture, real‑time processing, pre‑computed data cubes, practical design choices, and lessons learned from implementation challenges.

Big Data · Real-time analytics
0 likes · 1 min read
Tencent Cloud Developer
Jul 13, 2020 · Big Data

Building MVP: A Lightweight Big Data Analysis System for Product Growth

The article describes how a lightweight big‑data analysis platform called MVP was built from scratch—using a User‑Event‑Config model, HDFS + ClickHouse + Spark, and four modules for metric monitoring, root‑cause alerts, deep growth analysis, and A/B testing—enabling real‑time insights in seconds instead of days and dramatically accelerating product‑growth operations.

AARRR Model · ClickHouse · HDFS
0 likes · 9 min read
DataFunTalk
Jul 10, 2020 · Big Data

Apache Flink Practice at NetEase: Architecture, Scale, and Future Directions

This article details NetEase's evolution from Storm to Flink for real‑time computing, describing the Sloth platform's architecture, large‑scale deployment, diverse business scenarios, monitoring, alerting, and future development plans, illustrating how Flink powers data synchronization, real‑time warehousing, and e‑commerce analytics and recommendation.

Data Warehouse · Flink · NetEase
0 likes · 15 min read
DataFunTalk
Jun 18, 2020 · Big Data

Real-time Data Processing at QuTouTiao: Flink + ClickHouse Architecture and Practices

QuTouTiao uses Flink and ClickHouse to build a high‑performance real‑time analytics platform that supports hourly Hive pipelines and ClickHouse queries, answering 80% of requests in under a second through streaming ingestion, exactly‑once semantics, multi‑cluster coordination, and optimized ClickHouse storage and connector designs.

Big Data · ClickHouse · Flink
0 likes · 16 min read
Big Data Technology Architecture
Jun 18, 2020 · Big Data

Understanding Data Lakes, Data Warehouses, and Real-Time Analytics with Hologres

This article analyzes the challenges of traditional data lake and warehouse architectures, explains why unified storage and compute are needed for real‑time and batch workloads, and introduces Hologres as a cloud‑native, high‑performance engine that combines PostgreSQL compatibility with Flink‑driven analytics to deliver a true real‑time data warehouse solution.

Data Warehouse · Flink · Hologres
0 likes · 13 min read
Big Data Technology Architecture
Jun 16, 2020 · Big Data

Real-time Multi-dimensional Analytics and SlimBase State Backend at Kuaishou: Flink Applications and Optimizations

This article describes how Kuaishou leverages Apache Flink for large‑scale real‑time multi‑dimensional analytics, details the architecture of its analytics platform using Kudu storage and KwaiBI, and introduces SlimBase—a lightweight, embedded shared state backend that replaces RocksDB to reduce I/O, latency, and CPU overhead.

Flink · Kuaishou · Kudu
0 likes · 17 min read
DataFunTalk
Jun 11, 2020 · Big Data

Real-time Multi-dimensional Analytics and SlimBase State Backend at Kuaishou: Flink Applications and Optimizations

This article presents Kuaishou's extensive use of Apache Flink for real-time multi-dimensional analytics, detailing the platform's architecture, cluster scale, data processing pipelines, the design of a shared state storage engine called SlimBase, and performance improvements achieved through replacing RocksDB with a customized HBase‑based solution.

Big Data · Flink · Kuaishou
0 likes · 15 min read
Big Data Technology Architecture
Jun 4, 2020 · Big Data

Building a Real-Time OLAP Analytics Platform for QQ Music with ClickHouse and Tencent Cloud EMR

QQ Music’s data team tackled massive PB‑scale, real‑time analytics challenges by migrating from Hive to a ClickHouse‑based OLAP platform integrated with Tencent Cloud EMR and Superset, achieving low‑latency, high‑availability data processing, self‑service visualization, and efficient read/write scaling for billions of daily events.

ClickHouse · Cloud EMR · Data visualization
0 likes · 11 min read
Meituan Technology Team
Apr 23, 2020 · Big Data

Design and Evolution of Meituan's OCTO Data Center (Watt) for Trillion‑Scale Real‑Time Analytics

Meituan’s self‑built OCTO data center, codenamed Watt, transforms over ten trillion daily records into multi‑dimensional, real‑time metrics by using stateless, horizontally‑scalable compute nodes, hierarchical aggregation, and lock‑free processing, achieving sub‑second latency, five‑nine availability, and reducing weekly operations from twenty hours to ten minutes.

Meituan · Real-time analytics
0 likes · 17 min read
58 Tech
Mar 4, 2020 · Big Data

Applying Flink State Management to Real‑Time Recommendation Scenarios

This article explains how Flink's flexible state management, including Broadcast, Keyed, and Operator states, can be used to solve real‑time recommendation challenges such as per‑minute UV, click, and exposure counting, while addressing locality mapping and data‑delay issues with Druid as the downstream store.

Broadcast State · Druid · Flink
0 likes · 13 min read
Suning Technology
Feb 22, 2020 · Big Data

How SuNing’s Big Data Engine Powers Health‑Code Pandemic Management

During the COVID‑19 pandemic, SuNing launched a public travel information registration system that leverages massive big‑data processing, high‑concurrency architecture, Kafka streaming, and real‑time analytics to create a city‑wide health‑code network, enabling precise epidemic control, mobility tracking, and robust data privacy safeguards.

Big Data · Health Code · Real-time analytics
0 likes · 5 min read
HomeTech
Feb 6, 2020 · Product Management

AutoBI One‑Stop Data Visualization Platform: Architecture, Technical Highlights, and Use Cases

The document outlines AutoBI, a company‑wide one‑stop data visualization platform, detailing its background, overall architecture, key technical components such as real‑time/offline data switching and query processing, integration capabilities, and practical case studies, highlighting efficiency gains and future development plans.

Backend · Big Data · Dashboard
0 likes · 8 min read
iQIYI Technical Product Team
Jan 9, 2020 · Big Data

Design and Evolution of iQIYI Real-Time Analysis Platform (RAP)

iQIYI’s Real‑Time Analysis Platform (RAP) combines Apache Druid with Spark/Flink to deliver minute‑level, low‑latency multidimensional analytics via a web wizard, supporting hundreds of streaming tasks and thousands of reports across membership, recommendation, and TV monitoring, while simplifying development and maintenance.

Apache Druid · Big Data · Flink
0 likes · 13 min read
dbaplus Community
Jan 7, 2020 · Databases

Why ClickHouse Beats Presto for Real‑Time Metrics: A Deep Dive

This article examines the shortcomings of a Storm‑based real‑time metric platform, outlines the requirements for a stable, SQL‑driven, fast engine, and explains why ClickHouse was chosen over Presto, detailing performance benchmarks, architectural advantages, cluster configuration, engine options, best practices, and common operational issues.

ClickHouse · Presto · Real-time analytics
0 likes · 18 min read
Meituan Technology Team
Nov 28, 2019 · Backend Development

Wedge: Design and Implementation of an Advertising Experiment Configuration Platform

Wedge is a Meituan‑Dianping advertising experiment configuration platform that provides extensible, flow‑based A/B testing with version control, real‑time monitoring, and a user‑friendly UI, enabling algorithm, engineering, and business teams to rapidly iterate, audit, and roll back complex vertical and horizontal experiments.

AB testing · Advertising · Backend
0 likes · 12 min read
Big Data Technology & Architecture
Oct 22, 2019 · Big Data

Real-Time Data Verification: Building a Log Comparison Solution with Flink, Elasticsearch, and Hive

This article explains how to design and implement a real‑time data verification framework using Flink to generate wide tables, storing detailed records in Elasticsearch or HDFS with Hive for cross‑checking against offline data, ensuring trustworthy metrics for dashboards and stakeholders.

Big Data · Data verification · Elasticsearch
0 likes · 7 min read
Big Data Technology Architecture
Oct 15, 2019 · Big Data

Introduction to Apache Kylin: A Fast Big Data OLAP Engine

Apache Kylin is an open‑source, Hadoop‑based OLAP engine that provides sub‑second, multi‑dimensional SQL queries on massive datasets, with features such as cube pre‑computation, real‑time analytics, and seamless BI tool integration, and its latest v2.6.4 release adds numerous fixes and improvements.

Apache Kylin · BI Integration · Hadoop
0 likes · 4 min read
JD Retail Technology
Oct 14, 2019 · Databases

Overview of JDNoSQL Platform and Its Real-Time Advertising Use Cases

The article introduces JDNoSQL, a distributed column‑oriented key‑value store built on HDFS, outlines its core features, describes various business scenarios including real‑time ad computation, details the system architecture with Kafka and Flink, and presents table designs for ad impression and click statistics.

Big Data · Flink · Kafka
0 likes · 13 min read
Xianyu Technology
Jul 23, 2019 · Operations

Automated Service Fault Localization System Architecture

The automated service fault localization system ingests massive real‑time instrumentation data, builds call‑chain graphs, and instantly pinpoints the exact component causing timeouts or other errors, achieving developer‑level accuracy within seconds instead of minutes while remaining simple, fast, and fully automated.

Big Data · Fault Localization · Operations
0 likes · 8 min read
360 Tech Engineering
Jul 18, 2019 · Databases

Principles and Practices of Apache Doris: Architecture, Key Technologies, and Real‑World Use Cases

This article presents a comprehensive overview of Apache Doris, covering its positioning as a distributed MPP analytical database, core architecture with FE and BE nodes, key technologies such as vectorized execution and materialized views, integration with Kafka and Elasticsearch, additional features, roadmap, and detailed case studies from Baidu Statistics and Meituan, illustrating its practical deployment and performance characteristics.

Apache Doris · Columnar Storage · Data Warehouse
0 likes · 25 min read
ITPUB
Jul 2, 2019 · Databases

How ClickHouse Powers Ctrip’s Hotel Data Platform for Billions of Daily Updates

This article explains how Ctrip’s hotel data intelligence platform handles over ten billion daily data updates and nearly a million queries by adopting ClickHouse, detailing the system's background, the reasons for choosing ClickHouse over other solutions, the data ingestion pipelines, monitoring strategies, operational practices, and performance outcomes.

Big Data · ClickHouse · Real-time analytics
0 likes · 13 min read
Ctrip Technology
Jun 26, 2019 · Databases

Applying ClickHouse for a High‑Performance Hotel Data Intelligence Platform

This article describes how Ctrip Hotel's data intelligence platform leverages ClickHouse to achieve real‑time analytics on billions of daily updates and millions of queries, detailing the system architecture, data ingestion pipelines, monitoring, and operational lessons learned for large‑scale, high‑availability data services.

Data Warehouse · Real-time analytics · data pipeline
0 likes · 12 min read
ITPUB
May 29, 2019 · Big Data

How to Build a Trillion-Scale Real-Time Data Platform: Lessons from DTCC 2019

In a DTCC 2019 keynote, Zhao Qun, director of big‑data platform at Percent Point, outlines the challenges of trillion‑scale real‑time analytics and presents a transparent, fine‑grained architecture built on Kafka, Spark Streaming, ClickHouse, HBase, Ceph and Elasticsearch, detailing design principles, component sizing, multi‑center deployment, performance testing and operational safeguards.

Big Data · Kafka · Real-time analytics
0 likes · 17 min read
DataFunTalk
Mar 7, 2019 · Big Data

Design and Evolution of Didi's Real‑Time Data Computing Platform

The article details how Didi built and iterated its real‑time data platform, describing the shift from MySQL‑based batch processing to a Kafka‑Samza‑Druid architecture with Spark Streaming and Flink, the challenges addressed, and the current capabilities and operational metrics.

Big Data · Druid · Flink
0 likes · 9 min read
AntTech
Mar 6, 2019 · Databases

How Ant Financial Scaled the 2019 Alipay New Year Red Envelope Event with GeaBase Graph Database and Real‑Time Data Intelligence

The 2019 Alipay New Year "Five Blessings" red‑envelope campaign, serving 450 million users, leveraged Ant Financial's GeaBase distributed graph database, a real‑time data‑intelligence platform, and OceanBase elastic resources to achieve millisecond‑level ranking, seconds‑level transaction audit, and seamless high‑concurrency performance.

Alipay · Backend · Big Data
0 likes · 10 min read
dbaplus Community
Mar 5, 2019 · Databases

How HTAP and DRDS HTAP Enable Real‑Time OLTP/OLAP Integration

This article explains the concepts of OLTP, OLAP and HTAP, describes the DRDS HTAP architecture—including its engine and storage layers, Fireworks Spark‑based engine, optimizer stages, and streaming capabilities—and demonstrates cross‑database MPP queries and streaming joins while outlining suitable use cases and limitations.

DRDS · Database Architecture · HTAP
0 likes · 17 min read
Alibaba Cloud Developer
Jan 3, 2019 · Big Data

How Apache Flink Powers Real‑Time Big Data at Alibaba and Beyond

The 2018 Flink Forward China conference in Beijing showcased Apache Flink’s evolution, Alibaba’s massive contributions—including the Blink fork, real‑time BI, online learning and city‑level analytics—and highlighted how industry leaders like Alibaba, Didi and others leverage Flink for scalable, low‑latency big‑data processing across diverse use cases.

Apache Flink · Batch-Stream Fusion · Real-time analytics
0 likes · 19 min read
58 Tech
Nov 26, 2018 · Big Data

Big Data OLAP Applications and Practices: Insights from Xiaomi and 58.com

The article reviews the 2018 58 Group technology salon on big‑data OLAP, summarizing Xiaomi’s one‑stop OLAP architecture, 58.com’s challenges and solutions using Kylin, Druid, and UnionSQL, and the practical implementations and optimizations that illustrate modern OLAP practices.

Data Warehouse · Druid · Kylin
0 likes · 12 min read
21CTO
Nov 7, 2018 · Big Data

Why Data Streams Are the Backbone of Real-Time Big Data Analytics

Data streams, akin to endless rivers, enable continuous, real-time processing of diverse sources such as IoT telemetry, web logs, and e-commerce events, offering advantages over batch processing, while presenting challenges like scalability and fault tolerance, and are supported by tools like Kinesis, Kafka, Flink, and Storm.

Amazon Kinesis · Apache Kafka · Big Data
0 likes · 6 min read
Big Data and Microservices
Sep 2, 2018 · Industry Insights

How Big Data and AI Are Transforming Financial Services: 5 Key Applications

The article explores how big data and artificial intelligence are reshaping finance through automated risk management, advanced customer data handling, personalized services, predictive analytics, and real‑time analysis, highlighting practical methods, benefits, and future trends for financial institutions.

Real-time analytics · artificial intelligence · finance
0 likes · 9 min read
Beike Product & Technology
Jun 22, 2018 · Big Data

Beike Zhaofang's 秒X Real‑Time Analytics Platform: Architecture, Implementation, and Use Cases

The article details the design and deployment of the 秒X real‑time analytics platform at Beike Zhaofang, covering its background, Spark Streaming‑based architecture, fast configuration, data processing pipeline, monitoring, visualization, practical applications, and future development plans.

Druid · Elasticsearch · Real-time analytics
0 likes · 7 min read
Tencent Cloud Developer
Jun 1, 2018 · Backend Development

Building Tencent Xinge: Architecture and Practices for Massive Mobile Push Service

The talk details Tencent Xinge’s architecture and cloud‑native practices that enable hundred‑billion‑level mobile push, combining terminal integration, real‑time backend filtering, distributed bitmap selection, precise‑push AI models, and DevOps pipelines to deliver fast, scalable, data‑driven notifications with effect tracking.

Backend Architecture · Big Data · Distributed Systems
0 likes · 18 min read
Ctrip Technology
May 9, 2018 · Artificial Intelligence

Ctrip's Real-Time Anti-Fraud System: Architecture, Big Data, and AI Innovations

The article details Ctrip's mature real‑time anti‑fraud platform, describing its big‑data parallel processing, AI‑driven models, device‑fingerprinting, CDNA service, and evolving architecture that together achieve sub‑150 ms decision latency while handling billions of daily transactions.

Ctrip · Real-time analytics · artificial intelligence
0 likes · 10 min read
JD Tech
Feb 28, 2018 · Operations

CallGraph: JD.com's Distributed Tracing and Service Governance Platform

CallGraph is JD.com's internally developed distributed tracing and service governance platform that addresses the challenges of monitoring complex microservice architectures by providing low‑intrusion, low‑latency tracing, real‑time analytics, configurable sampling, and integration with JMQ, Storm, Spark, HBase, and JimDB for both operational insight and performance optimization.

Big Data · Distributed Tracing · Microservices
0 likes · 12 min read
ITPUB
Jan 18, 2018 · Operations

How to Build Real‑Time User Login Dashboards with MySQL Binlog & Logtail

This guide walks through enabling MySQL binlog, installing Logtail, configuring data collection, indexing, previewing logs, writing custom SQL queries for user login analysis, constructing real‑time dashboards, setting abnormal‑login alerts, and backing up data to OSS for long‑term storage.

Alerting · Binlog · Dashboard
0 likes · 10 min read
StarRing Big Data Open Lab
Dec 22, 2017 · Big Data

Slipstream 5.1 Unveiled: New CEP, Session Windows & Event‑Driven Engine

Slipstream 5.1 expands its real‑time stream processing capabilities with richer Complex Event Processing syntax, introduces Session Window support for session‑based analytics, and enhances the Morphling event‑driven engine, all accessible via SQL, making advanced streaming applications easier for both developers and business users.

Real-time analytics · complex event processing · session window
0 likes · 8 min read
Qunar Tech Salon
Dec 21, 2017 · Big Data

Experience and Optimization Strategies for Apache Kylin in Real-Time OLAP

This article shares a data engineer's three‑year experience using Apache Kylin for real‑time OLAP on petabyte‑scale data, describing the business background, challenges of pre‑computation, cube modeling, dimension reduction, and various optimization techniques such as hierarchy, mandatory, and joint dimensions, as well as precise count‑distinct handling.

Apache Kylin · Big Data · Cube Optimization
0 likes · 13 min read
Architecture Digest
Dec 16, 2017 · Big Data

Performance Comparison of Apache Flink and Apache Storm for Real‑Time Stream Processing

This report presents a systematic performance evaluation of Apache Flink and Apache Storm across multiple real‑time processing scenarios, measuring throughput, latency, message‑delivery semantics, and state‑backend effects, and provides recommendations for selecting the most suitable engine based on the observed results.

Big Data · Flink · Real-time analytics
0 likes · 21 min read
dbaplus Community
Jul 16, 2017 · Big Data

How Vipshop Scaled Real‑Time OLAP: From GreenPlum to Presto, Kylin, and Redis

Vipshop faced massive data growth that broke traditional RDBMS, causing slow OLAP queries, inefficient ETL, and long development cycles, so it iteratively rebuilt its analytics stack—adding Hadoop/Hive, a self‑service UI, Presto, Kylin, and Redis—to achieve sub‑second query responses, higher concurrency, and a flexible, low‑latency BI solution.

Data Warehouse · Kylin · OLAP
0 likes · 23 min read
ITFLY8 Architecture Home
Jul 10, 2017 · Databases

How Didi Scales HBase for Real‑Time Orders, Geo‑Tracking, and ETA

This article explains how Didi leverages HBase across multiple business scenarios—including order lifecycle queries, driver‑passenger trajectory tracking, ETA calculations, and cluster monitoring—while addressing multi‑language support, rowkey design, GeoHash indexing, and multi‑tenant resource management.

Database design · GeoHash · HBase
0 likes · 13 min read
21CTO
Jun 19, 2017 · Databases

How Didi Scales HBase for Real‑Time Orders, Geo‑Tracking, ETA and Monitoring

This article explains how Didi leverages HBase’s distributed architecture, multi‑language APIs, and custom rowkey designs to support online order queries, driver‑passenger trajectory tracking with GeoHash, real‑time ETA calculations, and a monitoring platform, while managing multi‑tenant resources through DHS and RS Group.

Didi · GeoHash · HBase
0 likes · 13 min read
StarRing Big Data Open Lab
Mar 21, 2017 · Big Data

How Real-Time Data Streaming Is Transforming Industries Today

This article explains how real‑time data streaming turns massive, continuously growing datasets into actionable insights across finance, energy, and e‑commerce, showcasing early adopters like ConocoPhillips and DHL while urging businesses to rethink models for the next wave of data management.

Big Data · Data Streaming · Real-time analytics
0 likes · 7 min read
StarRing Big Data Open Lab
Nov 1, 2016 · Big Data

Will SQL on Hadoop Replace Hybrid Architectures? Key Big Data Trends Unveiled

The article analyzes four major big‑data evolution trends—SQL on Hadoop overtaking hybrid architectures, SSDs becoming cache in Hadoop clusters, the rise of real‑time analytics, and the convergence of cloud computing with big data—while presenting supporting data, predictions, and architectural diagrams.

Big Data · Real-time analytics · SQL on Hadoop
0 likes · 15 min read
StarRing Big Data Open Lab
Oct 8, 2016 · Big Data

Evolving Data Warehouses with Hadoop & Spark: Core Technologies

Data warehouses centralize and transform enterprise data for multidimensional analysis, and modern demands have spawned four types—traditional, real‑time, associative discovery, and data marts—each with distinct technical requirements, while Hadoop‑based solutions like Transwarp Data Hub address challenges of scale, variety, latency, and security.

Big Data · Hadoop · Real-time analytics
0 likes · 21 min read
Ctrip Technology
Sep 2, 2016 · Big Data

Why Druid? Architecture, Indexing, Use Cases, and Lessons Learned

This article introduces Druid as an open‑source, distributed column‑store OLAP engine, explains its architecture and indexing mechanisms, discusses real‑time and batch data ingestion for order analytics at Qunar, compares it with other engines, and shares practical tips and pitfalls.

Caravel · Druid · OLAP
0 likes · 8 min read
Efficient Ops
Jun 30, 2016 · Big Data

How Spark Enables Real‑Time Microservice Performance Profiling

This article explains how IBM Research and Cloudinsight use Apache Spark to capture, analyze, and visualize microservice communication in real time, addressing challenges of observability, bottleneck detection, and latency attribution in large‑scale cloud environments.

Operational Monitoring · Real-time analytics · Spark
0 likes · 10 min read
Architect
Mar 29, 2016 · Big Data

Understanding Apache Storm Architecture, Stream Groupings, and the Acker Mechanism

This article provides a comprehensive overview of Apache Storm’s architecture, including the roles of Nimbus, Supervisor, and ZooKeeper, explains various stream groupings, details the Acker mechanism, and describes task execution, parallelism calculation, and internal data flow within the Storm cluster.

Apache Storm · Big Data · Real-time analytics
0 likes · 19 min read
21CTO
Mar 20, 2016 · Operations

How CAT Powers Real‑Time Distributed Monitoring at Scale

This article introduces CAT, a Java‑based open‑source distributed real‑time monitoring system, covering its origins, design goals, architecture, message processing pipeline, instrumentation model, and how it achieves high availability, scalability, and low‑latency analytics for large‑scale internet services.

CAT · Distributed Monitoring · Java
0 likes · 17 min read
ITPUB
Mar 11, 2016 · Databases

Unlock Real-Time Analytics with Oracle 12c In-Memory: Architecture & Best Practices

This article explains how Oracle 12c's In-Memory feature enables hybrid OLTP/OLAP workloads by storing columnar data in a dedicated memory area, covering its architecture, data loading, consistency mechanisms, query acceleration techniques, and integration with RAC for high‑availability deployments.

In-Memory · OLAP · Oracle
0 likes · 19 min read
Qunar Tech Salon
Feb 24, 2016 · Artificial Intelligence

Overview and Architecture of Pora: A Real‑Time Personalization Analytics Platform

The article introduces Pora, an analytics system for personalized search that unifies offline and real‑time analysis, combining high‑throughput stream processing, low‑latency computation, online learning algorithms, and a modular architecture to support continuous 24/7 operation and large‑scale performance optimizations.

AI · Online Learning · Real-time analytics
0 likes · 6 min read
21CTO
Jan 25, 2016 · Big Data

How Alibaba’s Pora Powers Real‑Time Personalization at Massive Scale

Pora (Personal Offline Realtime Analyze) is a high‑throughput, low‑latency platform that captures user behavior in real time, enabling Alibaba’s search engine to deliver personalized results, support online learning, and run 24/7 with massive data volumes.

Alibaba · Big Data · Pora
0 likes · 6 min read
21CTO
Nov 23, 2015 · Big Data

How Dianping Scales Real‑Time Analytics with Apache Storm

This article explains how Dianping built a millisecond‑level real‑time computation platform using Apache Storm, covering use cases, system architecture, core Storm concepts, performance tuning, best practices, and a detailed Q&A on their production deployment.

Apache Storm · Big Data · Real-time analytics
0 likes · 23 min read
21CTO
Nov 4, 2015 · Big Data

How We Built a Real‑Time Log Analytics Platform with Storm and Cardinality Counting

To monitor hundreds of web apps on UAE’s PaaS platform in near‑real time, we combined Storm with lightweight log transport, a memcached‑based fqueue, and adaptive cardinality counting to efficiently compute PV, UV, response times, and custom metrics while handling cross‑cluster log aggregation.

Big Data · Cardinality counting · Log Processing
0 likes · 9 min read

Storm vs Spark: Which Real‑Time Analytics Platform Wins for Your Business?

The article compares Apache Storm and Apache Spark, examining their origins, architecture, language support, integration capabilities, and performance characteristics, and offers guidance on selecting the right platform for real‑time business intelligence based on specific workload and infrastructure needs.

Apache Spark · Apache Storm · Big Data
0 likes · 11 min read