Tagged articles
135 articles
Big Data Technology & Architecture
Jul 2, 2019 · Big Data

Integrating Apache Flink with Apache Pulsar for Scalable Elastic Data Processing

This article explains how Apache Pulsar and Apache Flink can be combined to provide a unified, scalable, and fault‑tolerant data processing platform, covering Pulsar's architecture, its differences from other messaging systems, various integration patterns, and concrete code examples for stream and batch workloads.

Apache Flink · Apache Pulsar · Big Data
0 likes · 13 min read
Big Data Technology & Architecture
Jun 20, 2019 · Big Data

Comprehensive Guide to Flink SQL: Background, New Features, Programming Model, Operators, Functions, and a Practical NBA Scoring Leader Example

This article provides an in‑depth overview of Flink SQL, covering its origins, the latest 1.7.0 and 1.8.0 enhancements, the underlying programming model, common operators and built‑in functions, and a complete end‑to‑end example that analyzes NBA scoring‑leader data using Flink SQL.

Apache Flink · Big Data · Flink SQL
0 likes · 27 min read
Big Data Technology & Architecture
May 22, 2019 · Big Data

Key Changes and New Features in Apache Flink 1.8.0 Release

Apache Flink 1.8.0 introduces incremental state cleanup with TTL, updates Hadoop support, deprecates TableEnvironment static methods, adds a new Kafka deserialization schema, modifies Maven dependencies, and provides several configuration and Table API enhancements for better stream‑processing performance and compatibility.

Apache Flink · Hadoop · Table API
0 likes · 7 min read
Big Data Technology & Architecture
May 19, 2019 · Big Data

Implementing End-to-End Exactly-Once Semantics in Apache Flink with Apache Kafka Using Two-Phase Commit Sink

This article explains how Apache Flink’s TwoPhaseCommitSinkFunction, introduced in version 1.4, enables end-to-end exactly-once semantics when integrated with Apache Kafka, detailing the checkpoint mechanism and the two-phase commit protocol that ensures reliable data processing.

Apache Flink · Apache Kafka · Big Data
0 likes · 4 min read
G7 EasyFlow Tech Circle
Apr 23, 2019 · Big Data

How We Scaled Fatigue Event Processing to 45K TPS with Apache Flink

By iteratively redesigning the fatigue‑event detection pipeline and leveraging Apache Flink’s stateful stream processing, the team reduced network overhead, cut resource usage to a third, and achieved a stable 45,000 TPS throughput on six containers with 20 GB memory, while outlining three optimization phases and practical lessons.

Apache Flink · Fatigue Detection · IoT
0 likes · 13 min read
Big Data Technology Architecture
Apr 22, 2019 · Big Data

Comparison of Apache Spark and Apache Flink: Programming Models, Streaming, State Management, and Exactly-Once Semantics

This article compares Apache Spark and Apache Flink, outlining their programming models, streaming mechanisms, state management, time semantics, and exactly‑once guarantees, and highlights the strengths and differences of each framework for batch and real‑time big‑data processing.

Apache Flink · Apache Spark · Exactly-Once
0 likes · 8 min read
Big Data Technology & Architecture
Mar 29, 2019 · Big Data

Weekly Knowledge Digest: Apache Flink Deep Dives on JOIN LATERAL, TimeInterval, Temporal Table, and State Management

This week's digest shares a personal anecdote and a series of technical deep‑dives into Apache Flink, covering JOIN LATERAL, TimeInterval JOIN, Temporal Table JOIN, state management, and related code examples, while also previewing upcoming work schedules and recommended Flink reference articles.

Apache Flink · Big Data · SQL Join
0 likes · 5 min read
ITPUB
Mar 28, 2019 · Big Data

Why Pravega Matters: Native Stream Storage for Low‑Latency, Exactly‑Once Data Pipelines

Pravega, Dell’s native stream storage project, addresses the challenges of modern low‑latency, exactly‑once stream processing by combining tiered storage, Apache BookKeeper, and seamless Flink integration, offering a unified solution that reduces development, storage, and operational costs compared to traditional messaging systems like Kafka.

Apache Flink · Exactly-Once · Kafka Comparison
0 likes · 10 min read
Big Data Technology & Architecture
Mar 22, 2019 · Big Data

Weekly Knowledge Points: Apache Flink Continuous Queries, Kafka Connectors, SQL Overview, JOIN Operator, and Table API

This weekly briefing introduces Apache Flink's continuous query mechanism, demonstrates how to integrate Kafka as a DataStream connector, provides an overview of Flink SQL features, explains the implementation and optimization of dual‑stream JOIN operators, and showcases the Table API with end‑to‑end examples.

Apache Flink · Big Data · Table API
0 likes · 3 min read
Big Data Technology & Architecture
Mar 20, 2019 · Databases

Understanding JOIN Operators: From Traditional Databases to Apache Flink Streaming

This article explains the purpose and types of SQL JOIN operators, demonstrates their syntax and semantics with examples, compares traditional database joins to Apache Flink's streaming two‑stream join implementation, and discusses optimization techniques such as state management, shuffle handling, and join reordering.

Apache Flink · State Management · Streaming
0 likes · 22 min read
Big Data Technology & Architecture
Mar 19, 2019 · Big Data

Comprehensive Overview of SQL and Apache Flink SQL Features with Practical Code Examples

This article provides an in-depth introduction to SQL, its history and ANSI standards, then details Apache Flink's SQL capabilities—including SELECT, WHERE, GROUP BY, UNION, JOIN, window functions, and user-defined functions—accompanied by extensive code examples and a complete end‑to‑end Flink job implementation.

Apache Flink · Big Data · Streaming
0 likes · 34 min read
Big Data Technology & Architecture
Mar 17, 2019 · Big Data

Understanding Continuous Queries in Apache Flink: From Static Queries to Dynamic Tables and Trigger Simulations

This article explains how Apache Flink implements continuous queries for unbounded stream processing, compares static and continuous query semantics, demonstrates how MySQL triggers can simulate continuous queries in append‑only and update scenarios, and discusses Flink's connector, source, sink, and retraction mechanisms for correct incremental computation.

Apache Flink · Big Data · Continuous Query
0 likes · 18 min read
Big Data Technology & Architecture
Mar 13, 2019 · Big Data

Understanding Fault Tolerance and Exactly-Once Semantics in Apache Flink

This article explains Apache Flink's fault‑tolerance mechanisms, including checkpointing, barrier alignment, the differences between At‑Least‑Once and Exactly‑Once semantics, configuration options, incremental checkpointing, and the requirements for external sources and sinks to achieve end‑to‑end exactly‑once processing.

Apache Flink · Big Data · Exactly-Once
0 likes · 15 min read
Big Data Technology & Architecture
Mar 12, 2019 · Big Data

Understanding Apache Flink’s Core Design: “Batch Is a Special Case of Stream” and Its Architecture

This article explains Apache Flink’s fundamental design principle that treats batch as a special case of stream, compares native streaming with micro‑batching, describes its deployment modes, fault‑tolerance mechanisms, unified data and scheduling layers, and outlines Alibaba’s architectural optimizations for the platform.

Apache Flink · Batch Processing · Native Streaming
0 likes · 15 min read
Alibaba Cloud Developer
Jan 3, 2019 · Big Data

How Apache Flink Powers Real‑Time Big Data at Alibaba and Beyond

The 2018 Flink Forward China conference in Beijing showcased Apache Flink’s evolution, Alibaba’s massive contributions—including the Blink fork, real‑time BI, online learning and city‑level analytics—and highlighted how industry leaders like Alibaba, Didi and others leverage Flink for scalable, low‑latency big‑data processing across diverse use cases.

Apache Flink · Batch-Stream Fusion · Open-source
0 likes · 19 min read
Qunar Tech Salon
Oct 25, 2018 · Big Data

Why Alibaba Chose Apache Flink: Architecture, Scale, and Future Directions

This article explains how Alibaba adopted Apache Flink as a unified, low‑latency, high‑throughput big‑data engine, detailing its stream‑first design, state management, checkpointing, massive production deployment, community contributions, and upcoming plans for a unified API, SQL layer, broader language support, and AI integration.

Alibaba · Apache Flink · Big Data
0 likes · 13 min read
Alibaba Cloud Developer
Oct 15, 2018 · Big Data

Why Alibaba Chose Apache Flink: A Deep Dive into Its Big Data Journey

This article explains how Alibaba adopted Apache Flink as a unified, low‑latency, high‑throughput big data engine, covering its origins, technical advantages over Spark, large‑scale deployment, state management, checkpointing, API unification, and future directions in streaming and batch processing.

Alibaba · Apache Flink · Unified Engine
0 likes · 14 min read
JD Tech Talk
Aug 2, 2018 · Big Data

Real-Time Order Statistics with Apache Flink in a Data Aggregation Platform

This article explains how the data aggregation platform adopts Apache Flink for high‑throughput, low‑latency stream processing, covering the complete workflow from data source integration, transformation operations, windowing and time concepts, to a concrete order‑count example with custom aggregation logic.

Apache Flink · Event Time · Flink
0 likes · 10 min read
Meituan Technology Team
Nov 16, 2017 · Big Data

Performance Comparison of Apache Flink and Apache Storm for Real-Time Stream Processing

The study benchmarks Apache Flink against Apache Storm on a shared cluster, showing Flink delivering three‑to‑five times higher throughput and roughly half the latency across simple, sleep‑induced, and windowed workloads, with modest throughput loss for exactly‑once semantics, leading to a recommendation of Flink for high‑performance, stateful real‑time stream processing.

Apache Flink · Apache Storm · Exactly-Once
0 likes · 19 min read
Suning Technology
May 18, 2017 · Big Data

Why Apache Flink Beats Spark and Storm in Stream Processing

This article examines Apache Flink's stream‑processing architecture, compares its native streaming model, fault‑tolerance, performance and SQL capabilities with Spark and Storm, and concludes that Flink offers a more powerful and efficient solution despite some maturity gaps.

Apache Flink · Spark · Storm
0 likes · 12 min read