Tagged articles

560 articles

Mar 5, 2019 · Databases

How HTAP and DRDS HTAP Enable Real‑Time OLTP/OLAP Integration

This article explains the concepts of OLTP, OLAP and HTAP, describes the DRDS HTAP architecture—including its engine and storage layers, Fireworks Spark‑based engine, optimizer stages, and streaming capabilities—and demonstrates cross‑database MPP queries and streaming joins while outlining suitable use cases and limitations.

DRDS · Database Architecture · HTAP

17 min read

Big Data Technology & Architecture

Mar 5, 2019 · Big Data

Real-time Top‑N Book Ranking with Apache Flink

This tutorial explains how to implement a real‑time top‑N hot‑selling book ranking that outputs the most clicked books every five seconds using Apache Flink, Kafka, sliding processing‑time windows, and a custom TopN aggregation function.

Flink · Streaming · TopN

7 min read

Big Data Technology & Architecture

Mar 3, 2019 · Big Data

Getting Started with Flink Kafka Connector: Concepts, Setup, and Sample Code

This article introduces the Flink‑Kafka connector, explains essential Kafka concepts, shows how to configure checkpointing, provides Maven dependencies, and includes complete Java examples for both producing to and consuming from Kafka within a Flink streaming job.

Big Data · Connector · Flink

8 min read

Big Data Technology & Architecture

Feb 28, 2019 · Big Data

Understanding Flink Window Types and Their Implementations

This article explains Flink's window concepts—including time‑based, count‑based, tumbling, sliding, and session windows—provides practical Scala code examples for each type, and links to related resources on Flink basics, APIs, deployment, and advanced features.

Big Data · Flink · Scala

5 min read

Big Data Technology & Architecture

Feb 25, 2019 · Big Data

Understanding Flink DataSetAPI and DataStreamAPI

This article introduces Apache Flink's DataSetAPI and DataStreamAPI, explains their source, transformation, and sink concepts, highlights the key differences in transformation handling, and notes the series' goal of publishing over 500 big‑data tutorials for learners from beginner to expert.

Big Data · DataSetAPI · DataStreamAPI

2 min read

58 Tech

Jan 24, 2019 · Mobile Development

Integrating WebRTC Real‑Time Audio/Video with WeChat Mini Programs: Architecture and Implementation Details

This article describes a comprehensive solution for enabling real‑time audio and video communication between existing WebRTC endpoints and WeChat Mini Programs by introducing a WebRTC Gateway and Streaming Server, detailing architecture, signaling flows, media conversion, performance optimizations, and session reliability mechanisms.

RTMP · RTP · Streaming

12 min read

Alibaba Cloud Developer

Jan 3, 2019 · Big Data

How Apache Flink Powers Real‑Time Big Data at Alibaba and Beyond

The 2018 Flink Forward China conference in Beijing showcased Apache Flink’s evolution, Alibaba’s massive contributions—including the Blink fork, real‑time BI, online learning and city‑level analytics—and highlighted how industry leaders like Alibaba, Didi and others leverage Flink for scalable, low‑latency big‑data processing across diverse use cases.

Apache Flink · Batch-Stream Fusion · Real-time analytics

19 min read

Big Data Technology & Architecture

Jan 2, 2019 · Big Data

Understanding Spark Streaming Backpressure Mechanism

The article explains how Spark Streaming backpressure, introduced in version 1.5, automatically adjusts data ingestion rates based on processing delays, replaces manual rate limits, and details its architecture, configuration parameters, and usage for preventing data backlog and executor OOM.

Big Data · Rate Control · Spark

6 min read

Xianyu Technology

Dec 20, 2018 · Operations

Optimizing Short Video Playback with Preloading and Proxy Caching

The system preloads the MP4 header and initial frames, moves the moov box to the start of the file (or fetches it separately), and routes playback through a local proxy that caches range‑requested segments in an LRU disk store, cutting short‑video start‑up latency to roughly 800 ms.

Proxy · Streaming · caching

13 min read

Didi Tech

Dec 18, 2018 · Big Data

Evolution and Architecture of Didi's Real-Time Computing Platform

From early self‑built Storm and Spark Streaming clusters to a unified YARN‑based Spark platform and finally a low‑latency Flink system with extended CEP and StreamSQL capabilities, Didi’s real‑time computing platform evolved through three stages, delivering multi‑tenant isolation, rich SQL processing, and dramatically reduced development costs.

Big Data · CEP · Flink

9 min read

DataFunTalk

Dec 18, 2018 · Big Data

Flink-based Real-time Data Warehouse Practice at Yanxuan

This talk presents Yanxuan’s real‑time data warehouse built on Flink, covering background challenges, overall architecture and implementation, data quality measures, monitoring, and practical application scenarios, while highlighting design goals of flexibility, high development efficiency, and stringent data quality requirements.

Flink · Streaming · real-time data warehouse

14 min read

21CTO

Nov 27, 2018 · Big Data

How Netflix’s Data‑Driven Playbook Is Challenging Hollywood’s Creative Rules

Netflix’s data‑driven strategy, which uses massive subscriber analytics to shape original content and marketing, has sparked a clash with Hollywood’s traditional, relationship‑focused approach, leading to internal power struggles, leadership changes, and a broader debate over algorithmic versus human intuition in entertainment.

Data Analytics · Netflix · Streaming

8 min read

Alibaba Cloud Developer

Nov 14, 2018 · Backend Development

Inside Ant's Real-Time Video Call System: Architecture & Optimizations

This article explores Ant Financial's real-time video call platform, detailing its technical choices, system architecture, signaling reliability design, network optimization strategies, and future directions for multi‑party video conferencing and interactive live streaming.

Ant Financial · Real-time Video · Signal Reliability

19 min read

Youku Technology

Oct 29, 2018 · Artificial Intelligence

Improving Online Video Experience: Youku’s End‑to‑End Video Quality Enhancement Techniques

Youku enhances online video by applying intelligent post‑production contrast mapping, device‑specific HDR tone‑mapping, high‑frame‑rate restoration through frame‑rate conversion, and ROI‑aware encoding that allocates bitrate to key visual areas, complemented by audio processing, to deliver cinema‑grade quality across diverse screens.

HDR · ROI encoding · Streaming

9 min read

ITPUB

Oct 23, 2018 · Big Data

How Meituan Built a Scalable Real‑Time Data Warehouse with Flink

This article explains how Meituan tackled growing real‑time data demands by redesigning its streaming platform, adopting a layered real‑time data warehouse architecture, selecting storage and compute technologies such as Cellar, Elasticsearch, Druid and Flink, and sharing practical tips on dimension expansion, joins, and aggregation to achieve higher throughput and lower latency.

Data Architecture · Flink · Meituan

15 min read

Meituan Technology Team

Oct 18, 2018 · Big Data

Building a Real-Time Data Warehouse with Flink at Meituan

Meituan replaced its Storm‑based pipeline with a four‑layer real‑time data warehouse powered by Flink, using hybrid storage (Cellar KV, Elasticsearch, Druid, MySQL) to deliver low‑latency, high‑throughput services, dramatically simplifying SQL‑driven development, unifying metrics, cutting compute costs, and paving the way for offline‑grade accuracy and reliability.

Flink · Meituan · Streaming

16 min read

360 Tech Engineering

Oct 18, 2018 · Big Data

KafkaBridge: A Multi‑Language Kafka Client SDK for Simplified Read/Write Operations

KafkaBridge is an open‑source, multi‑language SDK built on librdkafka that offers a minimal, easy‑to‑use interface for producing and consuming messages in Apache Kafka, with optimizations for PHP‑FPM, extensive language support, and detailed performance benchmarks.

Golang · Kafka · PHP

7 min read

Efficient Ops

Oct 13, 2018 · Big Data

Boost Your Kafka Integration with KafkaBridge: Multi-Language SDK Overview

KafkaBridge is a lightweight, multi-language SDK that simplifies Kafka read/write operations, offering unified interfaces, long‑connection reuse for PHP‑FPM, and reliable message delivery, with detailed compilation steps, usage examples, and performance benchmarks across C++, Python, PHP, and Go.

Golang · Kafka · PHP

7 min read

Alibaba Cloud Developer

Oct 12, 2018 · Artificial Intelligence

How Alibaba’s ‘Ali Xiaomi’ Prediction Platform Boosts Smart Customer Service with AI

Alibaba’s Ali Xiaomi prediction platform leverages AI techniques—including order and issue prediction, deep CTR models, reinforcement learning, and streaming computation—to proactively anticipate user intents, improve click‑through, resolution and satisfaction rates across multiple chatbot services, while addressing code duplication and model deployment challenges.

AI · Prediction · Streaming

14 min read

Big Data and Microservices

Sep 4, 2018 · Big Data

Exploring Five Big Data Architectures—from Traditional to Unified AI Designs

The article examines the evolution of big‑data processing by comparing five prevalent architectures—traditional Hadoop‑based stacks, streaming‑only designs, Kappa, Lambda, and the unified Unifield model—highlighting their strengths, weaknesses, and suitable scenarios while discussing the limitations of classic BI systems and the role of distributed storage, computation, and machine‑learning integration.

Big Data · Data Architecture · Hadoop

14 min read

Meitu Technology

Aug 14, 2018 · Big Data

Meitu Data Platform Architecture and Practices

Meitu’s data platform, serving dozens of apps with 500 million monthly active users and billions of daily events, combines the Arachnia log‑collection system, Kafka ingestion, multi‑layer storage (HDFS, MongoDB, HBase, Elasticsearch), offline Hive/MapReduce processing and real‑time Storm/Flink/Naix pipelines, supported by data‑workshop tools, staged evolution for scalability, and robust security and query‑validation mechanisms.

Big Data · Data Platform · ETL

16 min read

Didi Tech

Aug 14, 2018 · Databases

Recap of the Open Source Salon: Latest Developments in Open Source Databases and Streaming Processing

On August 4, 2018, Didi Open Source and the Open Source Database Forum hosted a salon where five industry experts presented the latest advances in open‑source databases—covering Redis multi‑data‑center deployment, MySQL InnoDB Cluster, streaming‑processing architecture, PostGIS GIS solutions, and Redis 5.0 features—followed by a Q&A, a prize draw, and a showcase of Didi’s growing open‑source portfolio.

PostGIS · Streaming · database

4 min read

360 Tech Engineering

Jul 9, 2018 · Fundamentals

Fundamentals of Audio and Video: Basics, Encoding, Processing, and Real‑Time Communication

This technical sharing session by a senior audio‑video engineer from 360 Video Cloud explains core concepts of video and audio, their encoding pipelines, media processing techniques, streaming protocols, and the challenges and key technologies behind real‑time communication (RTC).

RTC · Streaming · audio

8 min read

AntTech

Jul 3, 2018 · Backend Development

Evolution of Financial‑Grade Message Queues at Ant Financial

The article reviews the ten‑year evolution of Ant Financial's message queue, detailing its core reliability, consistency, availability and performance requirements, the architectural mechanisms built to meet them, the shift to pull‑mode and API‑mode designs, and the recent integration of compute capabilities to create a smart data transmission platform.

Big Data · Distributed Systems · Message Queue

13 min read

ITPUB

Jun 2, 2018 · Big Data

Mastering Spark: Core Concepts, Architecture, Streaming & Performance Tuning

This comprehensive guide explains Spark's ecosystem, execution principles, key features, deployment architectures, core concepts like RDD, Transformations, Actions, Jobs, Stages, Shuffle and Cache, as well as Spark Streaming mechanics and practical resource‑tuning tips for optimal big‑data processing.

Big Data · Cluster · RDD

15 min read

Qunar Tech Salon

May 22, 2018 · Frontend Development

Understanding React 16 Server‑Side Rendering (SSR) and Its Rendering Strategies

This article explains how React 16 rewrote its server‑side rendering layer to support four rendering modes—including string and streaming approaches—by introducing the ReactPartialRenderer and ReactMarkupReadableStream classes, and details the process of converting virtual DOM trees into HTML strings.

JavaScript · React · SSR

8 min read

Qunar Tech Salon

May 3, 2018 · Big Data

Understanding Kafka Message Formats Across Versions 0.7.x, 0.8.x, and 0.10.x

This article explains the evolution of Kafka message formats from version 0.7.x through 0.8.x (including 0.9.x) to 0.10.x, detailing each field, compression handling, and the design motivations behind the changes.

Big Data · Kafka · Message Format

9 min read

UCloud Tech

Jan 8, 2018 · Fundamentals

How Modern Video Players Work: Architecture, Engines, and Cross‑Platform Strategies

This article explains the functional architecture of video players, details the multimedia engine workflow, compares web, Flash, Android, and iOS playback technologies, and addresses common challenges such as audio‑video synchronization, fast start, low latency, and buffering.

Audio-Video Sync · Streaming · cross‑platform

14 min read

Architecture Digest

Nov 11, 2017 · Big Data

Design and Implementation of a Seller Log System Using Kafka, Storm, Elasticsearch, and HBase

This article describes the design and implementation of a seller log system, detailing the use of Kafka for high‑throughput messaging, Storm for real‑time stream processing, Elasticsearch for hot‑data search, and HBase for cold‑data storage, along with challenges faced and optimization solutions.

Kafka · Storm · Streaming

12 min read

iQIYI Technical Product Team

Oct 20, 2017 · Big Data

Evolution and Architecture of iQiyi's Big Playback Core

iQiyi’s big playback core, created in 2013 under architect Gavin, unified fragmented players across PC, mobile and TV by evolving from a C/C++ XBMC‑based V1 to feature‑rich V3 with DRM, Dolby, hybrid P2P‑CDN, VR, multi‑instance support and major performance gains, paving the way for an intelligent next‑gen native player.

Cross‑platform development · Media Engine · Software Architecture

11 min read

Qunar Tech Salon

Sep 25, 2017 · Big Data

Comprehensive Guide to Spark Ecosystem: Data Warehouse, Machine Learning, Streaming, and Enterprise Use Cases

This article provides an extensive overview of Apache Spark’s ecosystem—including its data‑warehouse capabilities, ML/MLlib libraries, streaming with Spark Streaming, external frameworks, and real‑world enterprise case studies—while also noting a promotional announcement for a React Native conference.

Big Data · Kafka · Spark

21 min read

Java Backend Technology

Aug 16, 2017 · Big Data

Understanding Apache Kafka: Architecture, Core Principles, and Use Cases

This article introduces Apache Kafka as a fast, scalable distributed publish‑subscribe system, explains its core components, ZooKeeper coordination, startup workflow, key features, and common scenarios such as log collection, activity tracking, and stream processing.

Distributed Messaging · Streaming · ZooKeeper

7 min read

Java Backend Technology

Aug 15, 2017 · Backend Development

Why Apache Kafka Outperforms Traditional Message Queues: Architecture & Use Cases

This article explains Apache Kafka’s distributed publish‑subscribe design, its core components, storage model, broker behavior, integration with ZooKeeper, performance comparisons with RabbitMQ and ActiveMQ, and provides a practical example application illustrating producer and consumer APIs.

Apache Kafka · Streaming · backend-development

17 min read

21CTO

Jul 8, 2017 · Big Data

Ctrip’s Scalable Real‑Time User Behavior System with Kafka, Storm, Redis

This article details Ctrip’s redesign of its real‑time user behavior service, covering the new architecture, data flow, use of Java, Kafka, Storm, Redis, and MySQL, and how it achieves high real‑time performance, availability, scalability, and fault‑tolerance to support massive travel‑industry traffic.

Kafka · Real-Time · Storm

12 min read

Huawei Cloud Developer Alliance

Jun 29, 2017 · Mobile Development

Master Video App Development in 3 Simple Steps – Huawei’s Secret Guide

This guide walks Huawei competition participants through the essential steps for building a video app, covering user‑need analysis, UI simplification, backend setup, performance considerations, screen adaptation, messaging, precise marketing, and experience measurement, while providing useful resource links.

Backend · Huawei · Mobile Development

5 min read

MaGe Linux Operations

May 24, 2017 · Big Data

Demystifying Big Data: From HDFS to Spark, Hive, and Real‑Time Streaming

This article explains how big data challenges traditional storage, introduces HDFS for distributed file management, describes parallel processing frameworks like MapReduce, Tez, and Spark, compares higher‑level tools such as Hive and Pig, and explores real‑time streaming and key‑value stores for low‑latency analytics.

Hadoop · MapReduce · Spark

9 min read

MaGe Linux Operations

May 3, 2017 · Big Data

From Storage to Real‑Time: The Evolution of Big Data Technologies

This article outlines the three historical stages of big data technology—from early storage and batch processing, through market‑driven integration with Hive, to today’s focus on speed with Spark, Impala and streaming—while detailing the Hadoop ecosystem components such as HDFS, MapReduce, KV stores and emerging solutions like YDB.

HDFS · Hadoop · MapReduce

13 min read

360 Quality & Efficiency

Apr 17, 2017 · Operations

St-load: Streaming Load Testing Tool – Installation, Configuration, and Usage Guide

This guide introduces St-load, a Linux‑based streaming load testing tool, and provides step‑by‑step instructions for installing, compiling, and using its RTMP, HTTP, and HLS testing features, including command‑line examples and parameter explanations for both push and pull scenarios.

Linux · Load Testing · RTMP

5 min read

StarRing Big Data Open Lab

Apr 14, 2017 · Big Data

How StreamSQL Applications Enable Secure Resource Isolation in Real-Time Data Streams

This article explains the StreamSQL Application concept, its resource‑sharing and isolation mechanisms, the DDL syntax for managing Applications, and a step‑by‑step example showing how a new user can create, run, and isolate StreamJobs without affecting existing production workloads.

Resource Isolation · StreamSQL · Streaming

8 min read

21CTO

Mar 10, 2017 · Big Data

Inside Tencent Analytics: How TA Handles TB‑Scale Real‑Time Web Data

Tencent Analytics (TA) is a free web analytics platform that processes terabytes of daily data in real time, using a custom architecture featuring JavaScript collection, event streaming, in‑memory computation, and NoSQL storage with Redis and LevelDB, offering site owners instant insights and high availability.

Big Data · LevelDB · Real-time Processing

12 min read

Architecture Digest

Feb 28, 2017 · Big Data

Architecture and Real‑Time Processing Design of Tencent Analytics (TA)

This article explains the architecture, real‑time computation framework, and storage solutions of Tencent Analytics, detailing how massive TB‑level web‑traffic data are collected via JavaScript, processed in memory‑centric streaming components, and stored using Redis and LevelDB to achieve second‑level updates.

Big Data · LevelDB · NoSQL

13 min read

Architects Research Society

Nov 27, 2016 · Big Data

An Introduction to Apache Beam and Its Beam Model for Unified Batch and Stream Processing

This article introduces Apache Beam, its Beam Model, and how the Beam SDK enables developers to write unified, flexible pipelines for both bounded batch jobs and unbounded streaming workloads, illustrating concepts with mobile‑gaming examples and detailed code snippets.

Apache Beam · Batch Processing · Beam Model

19 min read

21CTO

Sep 11, 2016 · Fundamentals

Why Video Compression Matters: Decoding Formats, Codecs, and Streaming Essentials

This article explores the evolution of live‑streaming platforms, explains common video file extensions, details compression techniques and standards such as MPEG and H.264, and highlights the bandwidth and storage challenges that drive modern video encoding decisions.

H.264 · MPEG · Streaming

11 min read

360 Quality & Efficiency

Jul 28, 2016 · Fundamentals

Fundamentals of Audio/Video Encoding and ffmpeg Command Basics

This article introduces ffmpeg as a powerful multimedia framework, explains container formats, bitrate, resolution, and frame rate concepts, outlines key live‑streaming performance metrics, and provides essential ffmpeg command‑line options and examples for streaming and transcoding.

Streaming · Video Encoding · audio encoding

6 min read

Architecture Digest

Jul 21, 2016 · Fundamentals

Design and Implementation of Low‑Latency Real‑Time Streaming Protocols: RTP, RTCP, and Packet‑Loss Solutions

The article explains why TCP‑based protocols cannot meet low‑latency requirements for live‑streaming conferences and introduces RTP, RTCP, jitter, round‑trip time, and three packet‑loss mitigation strategies—retransmission, forward error correction, and cross‑transport—along with a brief overview of DCCP for congestion control.

DCCP · FEC · Low-Latency

14 min read

21CTO

May 15, 2016 · Big Data

How LinkedIn Scales Kafka to Trillions of Messages: Lessons in Reliability, Cost, and Security

LinkedIn’s engineering team details how they have scaled Apache Kafka from billions to over a trillion daily messages, focusing on quotas, a new ZooKeeper‑free consumer, reliability enhancements, security features, monitoring frameworks, and ecosystem integrations to improve cost, availability, and performance.

Kafka · LinkedIn · Reliability

13 min read

Art of Distributed System Architecture Design

May 11, 2016 · Industry Insights

How LinkedIn Scales Kafka to Over 1 Trillion Messages Daily

LinkedIn’s engineering team details how they scaled Kafka from a few billion to over a trillion daily messages, covering quotas, a new ZooKeeper‑free consumer, reliability upgrades, security roadmaps, monitoring frameworks, failure testing, cluster balancing, and ecosystem integrations.

Kafka · LinkedIn · Reliability

12 min read

Meituan Technology Team

Apr 29, 2016 · Big Data

Introduction to Spark in Big Data

Apache Spark, a versatile big‑data platform supporting batch processing, SQL queries, real‑time streaming, and machine‑learning workloads, dramatically accelerates data‑intensive jobs, as demonstrated by Meituan‑Dianping, where its high‑performance engine reduces execution times and enhances scalability across diverse analytical and operational pipelines.

Batch Processing · Big Data · Spark

1 min read

ITPUB

Apr 29, 2016 · Databases

How to Stream Large MySQL Query Results Without Running Out of Memory

MySQL clients normally buffer an entire query result in memory, which can cause out‑of‑memory errors on large tables, but by adding the -q option in the console, enabling useCursorFetch in JDBC URLs, and setting stmt.setFetchSize(Integer.MIN_VALUE), you can switch to a streaming mode that returns rows one at a time.

JDBC · Memory · ResultSet

3 min read

Art of Distributed System Architecture Design

Mar 30, 2016 · Big Data

The Growing Role of Apache Kafka in Modern Big Data Architectures

The article explains how Apache Kafka has become a pivotal, highly scalable publish‑subscribe system in the big‑data ecosystem, addressing the limitations of traditional databases, enabling real‑time data integration across specialized distributed systems, and shaping future data‑governance practices.

Apache Kafka · Data Integration · Streaming

7 min read

Node Underground

Mar 25, 2016 · Backend Development

How to Implement Bigpipe with HTTP Chunked Transfer in Node.js, PHP, and Java

This article explores the Bigpipe technique for accelerating first‑screen rendering by leveraging HTTP 1.1 chunked transfer, comparing implementations in PHP, Java, Node.js (including Express and Koa), and demonstrating parallel module flushing with async patterns such as callbacks, async parallel, co, and async/await.

Bigpipe · HTTP chunked · Node.js

17 min read

Architect

Mar 8, 2016 · Big Data

In‑Depth Analysis of Apache Kafka: Architecture, Core Concepts, and Benchmark

This article provides a comprehensive technical overview of Apache Kafka, covering its architecture, core concepts, design goals, comparison with other message queues, replication, consumer groups, delivery guarantees, and performance benchmarking, making it a valuable resource for big‑data engineers.

Big Data · Kafka · Replication

30 min read

21CTO

Mar 7, 2016 · Backend Development

When to Choose Kafka Over RabbitMQ: A Practical Comparison

This article compares Kafka and RabbitMQ, examining their design philosophies, throughput capabilities, consumer diversity, message ordering, and handling of individual messages, to help engineers decide which system suits high-volume or flexible-consumer scenarios and understand the trade-offs of each technology.

Kafka · RabbitMQ · Streaming

7 min read

ITPUB

Jan 19, 2016 · Databases

Surprising PostgreSQL Features That Redefine What a Database Can Do

This article showcases seven remarkable PostgreSQL extensions—including multi‑master replication, Greenplum MPP OLAP, pg_shard/FDW sharding, PostGIS 3D GIS, GPU‑accelerated PG‑Strom, PipelineDB streaming, and the versatile FDW interface—illustrating how they enable high‑availability, massive analytics, geographic intelligence, and real‑time data processing.

Database Extensions · FDW · GIS

5 min read

High Availability Architecture

Jan 6, 2016 · Big Data

Spark Latest Features, Tungsten Project, and Hulu’s Production Practices

This article reviews Spark's evolution from version 1.2 to 1.6, explains the DataFrame and Tungsten projects, shares Hulu’s real‑world Spark deployments, and discusses performance‑related challenges such as stack overflow, streaming receiver latency, and class‑loader deadlocks.

DataFrames · Dataset API · Hulu

17 min read

Architect

Dec 30, 2015 · Big Data

Real-Time Big Data Processing with Storm and Kafka on Alibaba Cloud

This article explains how to build a large‑scale, real‑time vehicle monitoring system using Apache Storm and Kafka on Alibaba Cloud, covering the challenges of big‑data ingestion, system architecture, deployment steps, performance testing, and practical lessons learned.

Alibaba Cloud · Big Data · Kafka

12 min read

dbaplus Community

Nov 27, 2015 · Big Data

Why Spark Is the Next Big Thing in Big Data: Core Concepts Explained

This article provides a comprehensive overview of Apache Spark, covering its origins, core concepts such as RDDs, transformations, actions, dependencies, execution modes, and key components like Spark SQL, Streaming, MLlib, and GraphX, while also offering practical code examples and visual illustrations.

DataFrames · GraphX · MLlib

18 min read

21CTO

Sep 30, 2015 · Operations

How LinkedIn Scaled Kafka to Process Over 1 Trillion Messages Daily

Since 2011, LinkedIn has expanded its Kafka deployment from handling billions to over a trillion messages per day, focusing on quotas, a new ZooKeeper‑free consumer, reliability enhancements, security, monitoring frameworks, fault‑injection testing, cluster balancing, and ecosystem integrations, offering valuable lessons for large‑scale streaming systems.

Kafka · LinkedIn · Reliability

12 min read

Qunar Tech Salon

Aug 18, 2015 · Big Data

Overview of Spark Big Data Analytics Framework Components

Spark’s big‑data analytics ecosystem comprises core components such as the in‑memory RDD data structure, Streaming for real‑time processing, GraphX for graph analytics, MLlib for machine‑learning, Spark SQL for querying, the Tachyon file system, and SparkR, each enabling scalable, distributed computation.

Big Data · GraphX · MLlib

5 min read

Qunar Tech Salon

Mar 10, 2015 · Big Data

Kafka Overview: Architecture, Core Concepts, and Comparison with Other Message Queues

This article provides a comprehensive overview of Kafka, covering its background, design goals, architecture, key terminology, message routing, consumer groups, delivery guarantees, and a comparison with other popular message queue systems such as RabbitMQ, Redis, ZeroMQ, and ActiveMQ.

Consumer · Kafka · Message Queue

21 min read