Tagged articles
73 articles
ByteDance Data Platform
Feb 2, 2026 · Big Data

How StreamShield Powers Production‑Grade Resilience for Apache Flink at Massive Scale

ByteDance’s StreamShield delivers a three‑layer resiliency framework—engine self‑healing, hybrid replication at the cluster level, and chaos‑tested releases—that enables over 70,000 concurrent Flink jobs on 11 million CPU cores to meet strict SLAs with second‑level startup and robust fault tolerance.

Apache Flink · ByteDance · Real‑Time Computing
0 likes · 6 min read
Baidu Geek Talk
Sep 1, 2025 · Big Data

How Baidu Netdisk Built a High‑Performance Real‑Time Engine with Flink

This article explains how Baidu Netdisk transitioned from Spark Streaming to a Flink‑based Tiangong real‑time computing engine, detailing the evolution, reasons for choosing Flink, architecture, configuration examples, business use cases, technical challenges, and future platform plans.

Baidu Netdisk · Big Data · Flink
0 likes · 16 min read
Alibaba Cloud Big Data AI Platform
Aug 7, 2025 · Operations

How Alibaba Scales Flink to Millions of Cores: Real‑Time Ops Secrets

This article details Alibaba's decade‑long evolution of its real‑time computing platform, the massive operational challenges of managing Flink clusters at million‑core scale, and the comprehensive strategies—including SLA metrics, self‑healing services, cloud‑native redesign, and job‑level advisory tools—used to ensure stability, cost efficiency, and performance during peak events like Double‑11.

Apache Flink · Cloud Native · Job Advisory
0 likes · 19 min read
Alibaba Cloud Infrastructure
Nov 22, 2024 · Cloud Native

Large‑Scale Cloud‑Edge Collaborative Technology Based on Cloud‑Native Wins Zhejiang Province Science and Technology Progress Award

Alibaba Cloud, together with Zhejiang University, Alipay and Xieyun Technology, received the Zhejiang Province Science and Technology Progress First Prize for their cloud‑native large‑scale cloud‑edge collaborative platform, which addresses edge resource constraints, real‑time computing, and massive node management, and has been widely applied across multiple industries.

CNCF · Container · Real‑Time Computing
0 likes · 5 min read
Alibaba Cloud Big Data AI Platform
Aug 2, 2024 · Big Data

How Real-Time Computing Transforms Finance, Automotive, Logistics, and Retail

Businesses across finance, automotive, logistics, and retail are increasingly adopting real-time computing with Flink and Hologres to meet growing data volume and latency demands, enabling instant analytics, risk monitoring, dynamic recommendations, and efficient operations, while cloud architectures evolve to support massive, low‑latency data streams.

Flink · Hologres · Real‑Time Computing
0 likes · 19 min read
Big Data Technology & Architecture
Jan 31, 2024 · Big Data

2023 Data Development Trends and Outlook for 2024

The article reviews how data development accelerated in 2023—with mature offline computing, rapid adoption of real‑time and lake‑warehouse solutions, and a clearer technical layering—while offering practical insights and future directions for professionals entering 2024.

Big Data · Real‑Time Computing · data engineering
0 likes · 8 min read
Tongcheng Travel Technology Center
Dec 27, 2023 · Big Data

Recap of Tongcheng Travel’s 7th Big Data Technology Salon – Talks on StarRocks, Paimon, Iceberg, Data+AI, Vector Retrieval, Real‑Time Computing, and Hotel Ranking

The 7th Tongcheng Travel Big Data Technology Salon in Beijing featured a series of expert talks covering StarRocks architecture evolution, lake‑house solutions with Paimon, Iceberg real‑time upsert, Data+AI for travel recommendation, vector retrieval in AI, JD Logistics real‑time computing governance, and multi‑task hotel ranking modeling, providing deep technical insights and future roadmaps.

AI · Big Data · Lakehouse
0 likes · 10 min read
Didi Tech
Sep 21, 2023 · Cloud Native

OBC: A Cloud-Native Real-Time Computing Engine for Metrics at Didi

To replace costly, duplicated Flink jobs, Didi built Observe‑Compute (OBC), a cloud‑native, PromQL‑driven real‑time metric engine with centralized policy management, scalable containerized workers, and zero‑downtime scaling, achieving million‑RMB annual savings while handling 10 M points per second.

Flink alternative · OBC · PromQL
0 likes · 17 min read
DataFunSummit
Aug 5, 2023 · Big Data

Manbang Group's Real-Time Computing, Data Architecture, and Product Practices

Manbang Group shares its practical experiences and insights on real-time computing, multi‑cloud platform architecture, data warehousing with Flink and Holo, real‑time decision and feature platforms, and future plans for scaling these systems to support logistics and recommendation algorithms.

Cloud Native · Data Architecture · Flink
0 likes · 16 min read
NetEase Media Technology Team
May 23, 2023 · Cloud Native

How NetEase Media Scaled Flink with Kubernetes: Architecture, Optimizations, and Lessons Learned

This article details NetEase Media's migration of most Flink jobs to a self‑built real‑time platform on Kubernetes, covering the benefits of K8s isolation, the chosen native deployment mode, performance‑critical optimizations, monitoring, resource‑recommendation, and future directions for cloud‑native streaming workloads.

Cloud Native · Flink · Kubernetes
0 likes · 20 min read
Baidu Geek Talk
Mar 27, 2023 · Big Data

Precise Watermark Design and Implementation in Baidu's Unified Streaming-Batch Data Warehouse

The article details Baidu's precise watermark design for its unified streaming‑batch data warehouse, describing how a centralized watermark server and client ensure end‑to‑end data completeness, align real‑time and batch windows with 99.9‑99.99% precision, and support accurate anti‑fraud calculations within the broader big‑data ecosystem.

Apache Flink · Baidu · Big Data
0 likes · 14 min read
DataFunTalk
Jan 1, 2023 · Big Data

Zhihu's Real-Time Computing Platform: From Skytree 1.0 to Mipha 2.0

Zhihu’s real‑time computing platform, initially built as Skytree 1.0 on Kubernetes and later re‑engineered as Mipha 2.0 with Flink SQL, unified metadata management, dynamic jar loading, UDF support, Protobuf format, CDC integration, and extensive operational optimizations, now processes petabyte‑scale data with high reliability.

Flink · Kubernetes · Real‑Time Computing
0 likes · 21 min read
vivo Internet Technology
Dec 28, 2022 · Big Data

Vivo Real-Time Computing Platform: Architecture, Practices, and Applications

The Vivo Real‑Time Computing Platform, built on Apache Flink, delivers a one‑stop data construction and governance solution that processes up to 5 PB daily. It offers high‑availability submission and control services, robust stability, rich SQL usability, efficient Kubernetes deployment, and strong security; supports real‑time warehouses and short‑video recommendation; and targets future elastic scaling and lake‑house unification.

Apache Flink · Data Platform · Real‑Time Computing
0 likes · 18 min read
NetEase Media Technology Team
Nov 17, 2022 · Backend Development

Design and Evolution of NetEase Advertising Engine Platform

NetEase’s advertising engine platform evolved from a monolithic, high‑concurrency system handling over a billion daily requests into a layered, distributed architecture that unifies indexing, billing, user‑tagging, and monitoring services, leverages Elasticsearch and custom extensions for fast retrieval, and plans further upgrades such as a custom retrieval kernel and Go‑based services.

Backend Architecture · Distributed Systems · Indexing Service
0 likes · 21 min read
vivo Internet Technology
Nov 16, 2022 · Industry Insights

Vivo 2022 Dev Conference: Frontend Compiler, Low‑Code, Real‑Time & Cloud‑Native

The 2022 Vivo developer conference showcased a series of technical breakthroughs—including a custom wepy‑chameleon compiler for frontend upgrades, low‑code platforms for backend and game development, a real‑time computing platform built on Flink, advanced graph scheduling, cloud‑native container strategies, monitoring enhancements, database automation, and large‑scale messaging middleware—highlighting Vivo's comprehensive push toward efficiency and innovation across its internet services.

Cloud Native · Container · Messaging
0 likes · 14 min read
DataFunSummit
Oct 10, 2022 · Big Data

Stability Optimization Practices for Flink Jobs at Tencent

This article presents Tencent's practical experience in improving Flink job stability, covering the Oceanus platform, stability challenges, and concrete optimization techniques such as reducing failures, minimizing impact, accelerating recovery, and proactive issue detection, followed by a summary and future outlook.

Big Data · Flink · Real‑Time Computing
0 likes · 12 min read
Zuoyebang Tech Team
Jun 17, 2022 · Big Data

How FlinkSQL Auto‑Tuning Saves Resources and Guarantees SLA

This article describes the design and implementation of an automated FlinkSQL tuning system that monitors metrics, evaluates task health with rule‑based logic, calculates optimal resource adjustments, and performs fast scaling to reduce cluster waste, lower operational costs, and maintain SLA compliance.

Akka · Auto Scaling · Flink
0 likes · 15 min read
HomeTech
Apr 27, 2022 · Big Data

AutoStream Real‑Time Computing Platform: Architecture, Resource Management, Scaling, Lakehouse Integration, and PyFlink Practices

This article details Autohome's AutoStream platform evolution from Storm to Flink‑based versions, covering real‑time application scenarios, strict budget‑controlled resource management, automatic scaling, lake‑house architecture with Iceberg, PyFlink integration, and future plans for resource optimisation and batch‑stream unification.

AutoStream · Flink · Lakehouse
0 likes · 15 min read
DataFunSummit
Apr 22, 2022 · Big Data

Huya Real-Time Computing SLA Practice: Platform Evolution, Core SLA Definition, Capability Building, and Future Outlook

The talk details Huya’s real‑time computing platform evolution from chaotic early stages to a unified, containerized system, defines core SLA metrics focused on latency compliance, describes capability enhancements such as demand monitoring, task analysis, and dynamic scaling, and outlines future goals for usability, stability, openness, and unified stream‑batch processing.

Flink · Real‑Time Computing · SLA
0 likes · 12 min read
DataFunTalk
Apr 15, 2022 · Big Data

Huya Real-Time Computing SLA Practices: Platform Evolution, Core SLA Definition, Capability Building, and Future Outlook

This article details Huya's real‑time computing platform evolution, core SLA definitions focused on latency compliance, and capability enhancements such as demand management, task analysis, and dynamic resource scaling, and outlines future directions emphasizing usability, stability, openness, and unified batch‑stream processing.

Flink · Real‑Time Computing · SLA
0 likes · 13 min read
dbaplus Community
Feb 23, 2022 · Big Data

Inside OPPO’s Real‑Time Computing Platform: Architecture, Practices, and Future Roadmap

This article details OPPO’s real‑time computing platform, covering its business scope, big‑data architecture built on Flink, Spark and Trino, the end‑to‑end job development lifecycle, SQL IDE features, diagnostic and monitoring mechanisms, link latency tracking, SLA guarantees, practical use cases, and upcoming lakehouse and cloud‑native evolution.

Flink · Real‑Time Computing · big data platform
0 likes · 23 min read
Big Data Technology & Architecture
Jan 4, 2022 · Big Data

Big Data Mastery Roadmap: Learning Path, Resources, Future Trends and Interview Guidance

This comprehensive guide outlines a step‑by‑step learning roadmap for aspiring big data professionals, covering fundamentals, programming languages, Linux, databases, distributed theory, networking, offline and real‑time computing, data governance, warehouses, toolchains, video/book recommendations, future industry trends, interview tips, and community resources.

Big Data · Data Governance · Distributed Systems
0 likes · 42 min read
DataFunTalk
Jan 1, 2022 · Big Data

JD's Flink Journey: Evolution, Optimizations, and Future Directions

This article details JD's adoption of Flink for real‑time computing, covering its evolution from Storm to Flink on Kubernetes, the platform architecture, and major optimization techniques such as preview topology, backpressure handling, dynamic rebalance, and checkpoint‑as‑savepoint, and outlines future plans including stream‑batch integration, stability improvements, intelligent operations, and AI integration.

Big Data · Flink · JD
0 likes · 10 min read
Youzan Coder
Dec 17, 2021 · Big Data

Upgrading Real-Time Computing Engine from Flink 1.10 to 1.13: Practices and Challenges

Youzan upgraded its real‑time computing engine from Flink 1.10 to 1.13 to meet rising SQL and containerization demands, gaining enhanced SQL syntax, time‑function handling, Window TVF standardization, Hive integration, K8s stability, elastic scaling, richer Kafka and format support, improved metrics and debugging tools, and successfully migrated all custom connectors, UDFs, and SQL jobs to the new Kubernetes‑based platform.

Flink · Real‑Time Computing · containerization
0 likes · 22 min read
DataFunTalk
Dec 10, 2021 · Big Data

Building and Evolving NetEase Yanxuan Real-Time Computing Platform: Architecture, SQLization, Serviceization, and Data Governance

This article details NetEase Yanxuan's real-time computing platform development from 2017 to present, covering its architecture, Flink‑SQL development environment, service‑oriented deployment, resource optimization, cloud‑native migration, comprehensive data governance, and future plans for stream‑batch integration and intelligent job diagnostics.

Big Data · Cloud Native · Data Governance
0 likes · 14 min read
Big Data Technology Architecture
Sep 17, 2021 · Big Data

Real‑time Computing Platform Architecture, Flink Migration, and One‑stop Platform at 58.com

This article details the design and implementation of 58.com’s real‑time computing platform, covering its architecture, data ingestion, storage, Flink‑based stream processing, SQL extensions, performance optimizations, Storm‑to‑Flink migration tools, the Wstream management console, state handling, monitoring, and future roadmap.

Data Platform · Flink · Real‑Time Computing
0 likes · 16 min read
Alibaba Cloud Developer
Sep 15, 2021 · Big Data

How to Pick Real-Time Dimension & Result Tables for Cloud‑Native Big Data

This article examines the evolution of big‑data architectures toward cloud‑native, real‑time processing, and provides a detailed comparison of dimension‑table and result‑table options—including MySQL, Redis, and Alibaba Cloud Tablestore—along with their performance, cost, and scalability characteristics for Flink SQL workloads.

Big Data · Flink SQL · Real‑Time Computing
0 likes · 28 min read
Big Data Technology & Architecture
Apr 6, 2021 · Big Data

Real-Time Computing and Data Warehouse Solutions with Apache Flink: Architecture, Technology Selection, and Implementation

This article explores the evolution of real-time computing in the big data domain, detailing Apache Flink's capabilities, architectural designs, and technology selections such as Kafka, Canal, HBase, and ClickHouse, and provides practical implementation guides and case studies from Alibaba, Tencent, and other enterprises.

Flink · Real‑Time Computing · data-warehouse
0 likes · 33 min read
Youzan Coder
Jan 13, 2021 · Big Data

Flink Real-time Task Resource Optimization Practice at Youzan

At Youzan, Flink real‑time tasks running on Kubernetes are optimized by daily GC‑log memory analysis and Kafka‑throughput monitoring, which compute recommended heap sizes and parallelism adjustments to eliminate over‑provisioned CPU and memory, automate alerts, and pave the way for fully automated resource tuning.

Flink · GC tuning · Kubernetes
0 likes · 16 min read
DataFunTalk
Dec 10, 2020 · Artificial Intelligence

Evolution and Architecture of Beike Commercial Strategy Algorithm Platform

This article details the evolution of Beike's commercial strategy algorithm platform, describing its business scenarios, bidding mechanisms, architecture redesign across online, near‑real‑time, and offline layers, model training, vector retrieval, service governance, and the performance and stability improvements achieved.

Algorithm Platform · Beike · Microservices
0 likes · 19 min read
DataFunTalk
Dec 7, 2020 · Big Data

Jingdong's Flink Real‑Time Computing Platform: Containerization, Optimizations, and Future Roadmap

This article details Jingdong's evolution from Storm to Flink, the architecture of its Kubernetes‑based real‑time computing platform, extensive containerization practices, performance and stability optimizations, and the future plan to unify batch‑stream processing while expanding SQL support and intelligent operations.

Batch-Stream Integration · Flink · Kubernetes
0 likes · 16 min read
Tencent Cloud Developer
Sep 9, 2020 · Big Data

Tencent Game Marketing Deduplication Service: Technical Evolution from TDW to ClickHouse

Tencent’s game marketing analysis system “EAS” evolved from inefficient TDW HiveSQL jobs and file‑heavy real‑time pipelines to a scalable ClickHouse‑based deduplication service that processes hundreds of thousands of daily activity counts in sub‑second time, offering fast, reliable, and maintainable participant deduplication for massive marketing campaigns.

LevelDB · MPP · OLAP
0 likes · 10 min read
JD Tech Talk
Aug 21, 2020 · Artificial Intelligence

JD Digits' Intelligent Anti‑Fraud Platform: AI‑Driven Real‑Time Fraud Detection and Knowledge‑Graph Solutions

JD Digits' intelligent anti‑fraud platform leverages machine learning, big‑data processing, graph neural networks and small‑sample knowledge‑graph algorithms to provide millisecond‑level, real‑time protection across 600+ scenarios, while also offering AI‑powered solutions to banks and publishing research at top conferences.

AI · Graph Neural Network · Real‑Time Computing
0 likes · 6 min read
DataFunTalk
Jul 22, 2020 · Big Data

Building a Real-Time Computing Platform with Apache Flink at iQIYI: Architecture, Improvements, and Business Cases

iQIYI’s senior data engineer shares the evolution of its big‑data services from Hadoop to a Flink‑based real‑time computing platform, detailing architecture, monitoring improvements, StreamingSQL capabilities, business use cases like recommendation and deep‑learning data generation, and future plans for unified stream‑batch processing.

Apache Flink · Data Platform · Flink
0 likes · 11 min read
Beike Product & Technology
Jun 12, 2020 · Big Data

Design and Implementation of SQL on Streaming (SQL 1.0 → SQL 2.0) in a Real‑Time Computing Platform

This article describes the evolution of a real‑time computing platform from SQL 1.0 built on Spark Structured Streaming to SQL 2.0 powered by Flink‑SQL, covering dynamic tables, continuous queries, dimension‑table joins, cache optimization, DDL extensions, platformization, operational challenges and future roadmap.

Big Data · Dimension Table · Flink
0 likes · 19 min read
Didi Tech
Apr 30, 2020 · Big Data

Didi’s Real‑Time Computing Practices with Apache Flink and StreamSQL

Didi has unified its real‑time computing on Apache Flink, creating an enhanced StreamSQL service with extended DDL, built‑in parsers and UDX, supporting thousands of nodes, millions of jobs, and trillions of daily records, while addressing state management, high availability, multi‑language UDFs, and pursuing real‑time ML and data‑warehouse integration.

Apache Flink · Big Data · Didi
0 likes · 13 min read
Tencent Cloud Developer
Apr 26, 2020 · Backend Development

Design and Evolution of Ctrip Flight Search System: High‑Throughput Caching, Real‑Time Computing, Load Balancing and AI

Ctrip’s flight search service processes two billion daily queries by employing a multi‑level Redis cache, machine‑learning‑driven TTLs, distributed pooling and overload protection, AI‑based anti‑scraping, and robust load‑balancing across three data centers, delivering sub‑second latency, up to three‑fold throughput gains and significant cost reductions.

AI · Distributed Systems · Real‑Time Computing
0 likes · 23 min read
DataFunTalk
Apr 22, 2020 · Big Data

Didi's Real-Time Computing Practices with Apache Flink: Architecture, StreamSQL, and Operational Insights

Senior Didi technology expert Liang Li-yin shares how Didi leverages Apache Flink for large‑scale real‑time computing, covering service architecture, StreamSQL advantages, multi‑cluster management, task control, monitoring, meta‑store integration, challenges, and future plans such as high availability, real‑time ML, and unified batch‑stream processing.

Apache Flink · Big Data · Real‑Time Computing
0 likes · 14 min read
Dada Group Technology
Apr 15, 2020 · Big Data

Practice Experience of Dada Group's Real-Time Computation SQLization Using Dada Flink SQL

This article details Dada Group's development of the Dada Flink SQL engine, describing its background, architecture, parser design, dimension‑table join strategies, numerous enhancements such as HA support, Kafka keyword handling, metadata integration, Redis and ClickHouse sinks, BINLOG simplification, and future migration plans toward Flink 1.10.

Flink · Real‑Time Computing · SQL Engine
0 likes · 12 min read
HomeTech
Mar 11, 2020 · Big Data

Streaming SQL with Apache Flink: Theory, Platform Optimizations, and Real‑Time Use Cases

This article introduces Apache Flink's Streaming SQL, explains its theoretical foundations such as the table‑stream relationship and watermark semantics, describes the platform's practical enhancements—including source/sink wrappers, built‑in functions, and native Retract Stream support—and showcases several real‑time computation examples.

Apache Flink · DataStream · Real‑Time Computing
0 likes · 31 min read
Youzan Coder
Feb 28, 2020 · Big Data

Flink Checkpoint Principle Analysis and Failure Cause Investigation

The article thoroughly explains Apache Flink’s checkpoint mechanism—including state types, coordinator workflow, exactly‑once versus at‑least‑once semantics, common failure sources such as code exceptions, storage or network issues, and practical configuration tips like interval settings, local recovery and externalized checkpoints.

Apache Flink · Checkpoint · Exactly-Once
0 likes · 15 min read
DataFunTalk
Nov 7, 2019 · Big Data

Real-Time Computing Engine at Beike: Architecture, Practices, and Future Plans

This article details Beike's real‑time computing engine, covering its background, streaming platform built on Spark Streaming and Flink, data ingestion via Kafka, metadata handling, SQL‑based task development, monitoring, storage solutions, and future roadmap for resource management and AI‑enhanced monitoring.

Big Data · Flink · Kafka
0 likes · 14 min read
Tongcheng Travel Technology Center
Sep 3, 2019 · Big Data

Practical Experiences and Lessons Learned in Building a Flink‑Based Real‑Time Computing Platform at Tongcheng‑Elong

This article details the design, implementation, and optimization of a Flink‑based real‑time computing platform at Tongcheng‑Elong, covering the evolution from Storm to Flink, support for FlinkSQL and FlinkStream, metric collection, logging, data lineage, savepoint management, and numerous stability fixes contributed back to the open‑source community.

Big Data · Data Lineage · Flink
0 likes · 16 min read
AntTech
Jul 24, 2019 · Artificial Intelligence

From Silicon Valley to Ant Financial: He Changhua’s Journey and the Rise of AI‑Driven Real‑Time Big Data Platforms

The article chronicles former Google engineer He Changhua’s return to China to lead Ant Financial’s AI and big‑data initiatives, highlighting the BASIC strategy, the open‑source SQLFlow tool, and the ambitious development of a fully real‑time, AI‑powered big‑data computing platform.

AI · Ant Financial · Real‑Time Computing
0 likes · 8 min read
DataFunTalk
Jul 15, 2019 · Big Data

Key Infrastructure Considerations for Autonomous Driving: Storage, Computing, and Services

The article reviews the essential infrastructure for autonomous driving, covering massive sensor data storage strategies, the role of metadata, offline and real‑time computing platforms, basic micro‑service components, and various business scenarios, highlighting why robust big‑data handling is critical.

Big Data · Real‑Time Computing · autonomous driving
0 likes · 14 min read
Qunar Tech Salon
Jun 25, 2019 · Databases

Recap of QInfrarch 2019: Technical Talks on Private Cloud, Real‑time Hotel Computing, Intelligent Ticket Customer Service, and OceanBase

The 2019 QInfrarch technical carnival in Suzhou gathered nearly 160 engineers from Qunar and partner companies to share in‑depth sessions on private cloud databases, real‑time hotel computation, AI‑driven ticket customer service, and OceanBase architecture, concluding with networking, tea breaks, and a lucky‑draw for attendees.

AI Customer Service · OceanBase · QInfrarch
0 likes · 5 min read
Alibaba Cloud Developer
Jun 6, 2019 · Artificial Intelligence

How SQLFlow Is Making AI as Simple as Writing a SQL Query

Ant Group's Vice‑CTO announced the open‑source SQLFlow tool that merges SQL simplicity with machine‑learning power, aiming to lower AI adoption barriers, while chief architect He Changhua outlines a real‑time big‑data platform that fuses OLTP, OLAP, and AI for universal data intelligence.

AI · Data Platform · Real‑Time Computing
0 likes · 11 min read
Big Data Technology & Architecture
May 29, 2019 · Cloud Native

Real-Time Computing Solutions with Flink and HBase: Architecture, Market Analysis, and Use Cases

The article presents Alibaba Cloud's real-time computing solution based on Flink and HBase, covering market competition, open‑source ecosystem, containerized architecture on Kubernetes, and typical applications such as online education video analysis, city‑brain traffic management, and fraud detection.

Big Data · Cloud Native · Flink
0 likes · 12 min read
DataFunTalk
Mar 1, 2019 · Big Data

Renrenche Mobile Data Platform: Architecture, Real‑Time Computing, and BI Solutions

The article presents Renrenche’s end‑to‑end mobile data platform, detailing its overall architecture, real‑time Spark‑based computation engine, Web IDE, metadata management, BI reporting built on ClickHouse, and how data‑driven practices empower both online and offline business operations.

BI reporting · Big Data · Real‑Time Computing
0 likes · 15 min read
HomeTech
Jan 18, 2019 · Big Data

Data Mill: A Real‑Time Spark Streaming Framework for DSP Business Support

Data Mill is a Spark‑Streaming‑based real‑time computation framework that abstracts tasks as DataFrames, enables SQL‑driven development, and supports DSP business requirements by reducing latency to 15‑30 minutes while providing a scalable architecture, caching strategy, and automated fault handling.

Cache · DSP · Real‑Time Computing
0 likes · 10 min read
Didi Tech
Dec 18, 2018 · Big Data

Evolution and Architecture of Didi's Real-Time Computing Platform

From early self‑built Storm and Spark Streaming clusters to a unified YARN‑based Spark platform and finally a low‑latency Flink system with extended CEP and StreamSQL capabilities, Didi’s real‑time computing platform evolved through three stages, delivering multi‑tenant isolation, rich SQL processing, and dramatically reduced development costs.

Big Data · CEP · Flink
0 likes · 9 min read
Manbang Technology Team
Dec 12, 2018 · Big Data

Kafka Overview: Core Concepts, Architecture, Configuration, and Usage in Real-Time Computing

This article provides a comprehensive technical overview of Kafka, covering its core concepts, producer and consumer models, architecture, configuration parameters, replication mechanisms, performance optimizations, operational monitoring, tooling scripts, and related product implementations for real-time data processing.

Big Data · Kafka · Message Queue
0 likes · 18 min read
Xianyu Technology
Jul 28, 2018 · Big Data

Real-Time Computation Architecture for Non-Timeline Feed Ranking

The paper presents a real‑time computation architecture on Alibaba Cloud Blink that scores and ranks non‑timeline feed items within a sliding 72‑hour window, updating rankings every few minutes, using Redis ZSET for fast retrieval, and discusses scaling optimizations such as interval tuning and external join‑and‑rank services.

Big Data · Real‑Time Computing · feed ranking
0 likes · 6 min read
Ctrip Technology
Jul 17, 2018 · Big Data

Meteor: A Real-Time Computation Platform Based on Storm for Ctrip Marketing

The article introduces Meteor, a Storm‑based real‑time computation platform developed by Ctrip Marketing to simplify topology management, automate deployment, and improve resource efficiency for complex marketing scenarios, highlighting its architecture, features, and measurable business impact.

Real‑Time Computing · Storm · marketing platform
0 likes · 10 min read
JD Tech
Nov 30, 2017 · Artificial Intelligence

Interview with JD Infrastructure Chief Architect He Xiaofeng on Real‑time Computing and Product Data Mining

He Xiaofeng, JD Mall Infrastructure chief architect, discusses his role in building a real‑time computing platform, how streaming frameworks, machine learning, and knowledge‑graph techniques are applied to product data mining to improve search accuracy, and future research directions.

Infrastructure · JD.com · Real‑Time Computing
0 likes · 5 min read
ITPUB
Nov 13, 2017 · Big Data

How Real‑Time Big Data Stream Computing Powers Double 11 E‑Commerce Success

The article explains how NetEase’s real‑time big‑data stream computing platform, Sloth, handles massive, continuously generated data during China’s Double 11 shopping festival, covering use cases, architectural shifts from batch to incremental processing, technical challenges, and the role of stream‑SQL for easier development.

Distributed Systems · Real‑Time Computing · e‑commerce
0 likes · 16 min read
Alibaba Cloud Developer
May 25, 2017 · Big Data

How Alibaba’s Blink Engine Redefines Real‑Time Big Data Processing

This article explains how Alibaba’s Blink, built on Apache Flink, transforms batch‑oriented big‑data platforms into a unified, high‑performance real‑time computing engine, detailing its architecture, state management, checkpointing, and successful deployment in e‑commerce, search, recommendation, and online machine‑learning scenarios.

Alibaba · Big Data · Flink
0 likes · 17 min read
21CTO
Apr 19, 2017 · Artificial Intelligence

How Alibaba Transformed E‑Commerce Search with Real‑Time AI and Reinforcement Learning

Alibaba’s e‑commerce search engine evolved over three years from offline batch models to a sophisticated AI-driven system that integrates real‑time feature ingestion, online learning, deep and reinforcement learning, enabling dynamic personalization and decision‑making that boosts conversion during high‑traffic events like Double 11.

AI · Online Learning · Real‑Time Computing
0 likes · 15 min read
Alibaba Cloud Developer
Jan 9, 2017 · Big Data

How Alibaba Scaled Real‑Time Data Processing for Double 11: Architecture & Lessons

This article details Alibaba's real‑time computing architecture for the 2016 Double 11 event, covering background, core components such as DRC, TT, Galaxy, OTS, XTool and OneService, and explains optimization techniques, fault‑tolerance strategies, stress‑testing practices, and future upgrade plans to handle massive streaming data workloads.

Big Data · Real‑Time Computing · architecture
0 likes · 14 min read
Ctrip Technology
Jan 5, 2017 · Artificial Intelligence

Design and Implementation of a Billion‑Scale Generalized Recommendation System at Tencent Cloud

This article explains how Tencent built a billion‑scale, generalized recommendation system by designing a reusable algorithm library, deploying a low‑latency, highly available real‑time streaming platform (R2), and offering a cloud‑based recommendation engine that simplifies integration for internet businesses.

AI · Real‑Time Computing · cloud computing
0 likes · 11 min read
21CTO
Jan 16, 2016 · Artificial Intelligence

How Alibaba’s Dual-Path Real-Time Computing Powers Search During Double 11

This article explains Alibaba’s dual‑link real‑time computing framework, detailing its micro‑ and macro‑level pipelines, key components such as Pora, iGraph and SP, online learning architectures, pointwise and pairwise ranking models, bandit‑based strategy optimization, PID‑controlled traffic balancing, and the impressive performance gains achieved during the Double 11 shopping festival.

Alibaba · Online Learning · PID control
0 likes · 22 min read
Architect
Jan 16, 2016 · Artificial Intelligence

Real‑Time Computing System for Alibaba Search: Architecture, Online Learning, and Strategy Optimization

The article presents Alibaba's real‑time computing platform for search, detailing its micro‑ and macro‑level architectures, online learning frameworks, point‑wise and pair‑wise ranking models, bandit‑based strategy optimization, and PID‑controlled traffic regulation, and reports significant performance gains during the Double‑11 shopping festival.

Online Learning · PID control · Real‑Time Computing
0 likes · 22 min read
Qunar Tech Salon
Dec 15, 2015 · Big Data

Real-Time Computing with Apache Storm: Architecture, Code Samples, and Fault Tolerance

This article explains the principles of real-time computing, compares it with offline batch processing, and demonstrates a practical solution using Kafka for ingestion, Apache Storm for continuous computation, and various storage options, while also covering streaming concepts and Storm's high‑availability mechanisms.

Apache Storm · Kafka · Real‑Time Computing
0 likes · 8 min read
High Availability Architecture
May 15, 2015 · Big Data

Real-Time Computing at Dianping: Architecture, Use Cases, and Best Practices

During a detailed live session, senior Dianping engineer Wang Xinchun explains the company's real‑time computing platform built on Apache Storm, covering use cases such as dashboards, search and recommendation, system architecture, data ingestion tools like Blackhole and Puma, performance tuning, monitoring, and practical best‑practice recommendations.

Apache Storm · Big Data · Real‑Time Computing
0 likes · 21 min read