
Curated Collection of Big Data, Flink, Hadoop and Real‑Time Computing Articles from the “Big Data Technology and Architecture” Series

This article is an organized catalogue of over a hundred technical posts. It covers Flink source‑code analysis, big‑data fundamentals and advanced topics, Hadoop ecosystem components, real‑time streaming with Spark and Kafka, system design guidelines, and miscellaneous insights. Each entry links to its original publication for easy reference.

Big Data Technology & Architecture

Jun 30, 2019

The author reflects on the half‑year effort of producing original and curated high‑quality articles under the "Big Data Technology and Architecture" brand, and provides a categorized index for readers.

Flink from Beginner to Advanced – A series of deep‑dive posts on Flink components, execution plans, JobManager, TaskManager, operators, windows, time handling, connectors, SQL, state management, fault tolerance, and real‑world case studies.

Big Data Fundamentals – Articles on core Java collections, concurrent data structures, and their performance characteristics.

Big Data Advanced Topics – Tutorials on JVM & NIO basics, distributed theory, ZooKeeper, RPC, Netty, Linux, and more.

Flink Introductory Series – Guides on Flink basics, DataSet/DataStream APIs, cluster deployment, restart strategies, broadcast variables, timestamps, watermarks, and integration with Kafka, Redis, and MySQL.

Flink Advanced Series – In‑depth discussions on fault tolerance, stream‑table duality, continuous queries, connectors, SQL, joins, state handling, and performance optimizations.

Hadoop Ecosystem Series – Coverage of Hadoop basics, MapReduce, HDFS, YARN, Hive, HBase, Phoenix, compression formats, and related best practices.

Real‑Time Computing Series (Spark, Kafka, etc.) – Articles on Spark Streaming, Structured Streaming, Kafka fundamentals, exactly‑once semantics, back‑pressure, data skew, memory tuning, and emerging technologies like Apache Pulsar.

Standards and System Design – Design case studies of large‑scale log systems, Redis development standards, interview preparation, and data‑processing methods.

Miscellaneous – Personal reflections, growth advice, and community engagement notes.

Each entry includes a direct link to the original article, enabling readers to explore the full content.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
