Tagged articles
560 articles
dbaplus Community
Mar 5, 2019 · Databases

How HTAP and DRDS HTAP Enable Real‑Time OLTP/OLAP Integration

This article explains the concepts of OLTP, OLAP and HTAP, describes the DRDS HTAP architecture—including its engine and storage layers, Fireworks Spark‑based engine, optimizer stages, and streaming capabilities—and demonstrates cross‑database MPP queries and streaming joins while outlining suitable use cases and limitations.

DRDS · Database Architecture · HTAP
0 likes · 17 min read
Big Data Technology & Architecture
Feb 25, 2019 · Big Data

Understanding Flink DataSetAPI and DataStreamAPI

This article introduces Apache Flink's DataSetAPI and DataStreamAPI, explains their source, transformation, and sink concepts, highlights the key differences in transformation handling, and notes the series' goal of publishing over 500 big‑data tutorials for learners from beginner to expert.

Big Data · DataSetAPI · DataStreamAPI
0 likes · 2 min read
58 Tech
Jan 24, 2019 · Mobile Development

Integrating WebRTC Real‑Time Audio/Video with WeChat Mini Programs: Architecture and Implementation Details

This article describes a comprehensive solution for enabling real‑time audio and video communication between existing WebRTC endpoints and WeChat Mini Programs by introducing a WebRTC Gateway and Streaming Server, detailing architecture, signaling flows, media conversion, performance optimizations, and session reliability mechanisms.

RTMP · RTP · Streaming
0 likes · 12 min read
Alibaba Cloud Developer
Jan 3, 2019 · Big Data

How Apache Flink Powers Real‑Time Big Data at Alibaba and Beyond

The 2018 Flink Forward China conference in Beijing showcased Apache Flink’s evolution, Alibaba’s massive contributions—including the Blink fork, real‑time BI, online learning and city‑level analytics—and highlighted how industry leaders like Alibaba, Didi and others leverage Flink for scalable, low‑latency big‑data processing across diverse use cases.

Apache Flink · Batch-Stream Fusion · Real-time analytics
0 likes · 19 min read
Big Data Technology & Architecture
Jan 2, 2019 · Big Data

Understanding Spark Streaming Backpressure Mechanism

The article explains how Spark Streaming backpressure, introduced in version 1.5, automatically adjusts data ingestion rates based on processing delays, replaces manual rate limits, and details its architecture, configuration parameters, and usage for preventing data backlog and executor OOM.

Big Data · Rate Control · Spark
0 likes · 6 min read
Xianyu Technology
Dec 20, 2018 · Operations

Optimizing Short Video Playback with Preloading and Proxy Caching

By preloading the MP4 header and initial frames and routing playback through a local proxy that caches range‑requested segments in an LRU disk store, the system moves the moov box to the file start (or fetches it separately), cutting short‑video start‑up latency to roughly 800 ms and delivering near‑instant playback.

Proxy · Streaming · caching
0 likes · 13 min read
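
The proxy-side caching mentioned in the summary above can be pictured with a small sketch. This is not the article's implementation: the class name `SegmentCache`, the `(url, start, end)` key, and the byte budget are all hypothetical, and the store lives in memory rather than on disk to keep the example self-contained.

```python
from collections import OrderedDict


class SegmentCache:
    """LRU cache for byte-range segments, bounded by total bytes (hypothetical sketch)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Maps (url, start, end) -> segment bytes; insertion order tracks recency.
        self._segments = OrderedDict()

    def get(self, url, start, end):
        """Return the cached segment, or None on a miss."""
        key = (url, start, end)
        data = self._segments.get(key)
        if data is not None:
            self._segments.move_to_end(key)  # mark as most recently used
        return data

    def put(self, url, start, end, data):
        """Store a segment, evicting least recently used ones when over budget."""
        key = (url, start, end)
        if key in self._segments:
            self.used_bytes -= len(self._segments.pop(key))
        self._segments[key] = data
        self.used_bytes += len(data)
        while self.used_bytes > self.capacity_bytes and self._segments:
            _, evicted = self._segments.popitem(last=False)
            self.used_bytes -= len(evicted)
```

A real proxy would also need to persist segments and merge overlapping ranges; the sketch only shows the eviction order.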
Didi Tech
Dec 18, 2018 · Big Data

Evolution and Architecture of Didi's Real-Time Computing Platform

From early self‑built Storm and Spark Streaming clusters to a unified YARN‑based Spark platform and finally a low‑latency Flink system with extended CEP and StreamSQL capabilities, Didi’s real‑time computing platform evolved through three stages, delivering multi‑tenant isolation, rich SQL processing, and dramatically reduced development costs.

Big Data · CEP · Flink
0 likes · 9 min read
DataFunTalk
Dec 18, 2018 · Big Data

Flink-based Real-time Data Warehouse Practice at Yanxuan

This talk presents Yanxuan’s real‑time data warehouse built on Flink, covering background challenges, overall architecture and implementation, data quality measures, monitoring, and practical application scenarios, while highlighting design goals of flexibility, high development efficiency, and stringent data quality requirements.

Flink · Streaming · real-time data warehouse
0 likes · 14 min read
21CTO
Nov 27, 2018 · Big Data

How Netflix’s Data‑Driven Playbook Is Challenging Hollywood’s Creative Rules

Netflix’s data‑driven strategy, which uses massive subscriber analytics to shape original content and marketing, has sparked a clash with Hollywood’s traditional, relationship‑focused approach, leading to internal power struggles, leadership changes, and a broader debate over algorithmic versus human intuition in entertainment.

Data Analytics · Netflix · Streaming
0 likes · 8 min read
Alibaba Cloud Developer
Nov 14, 2018 · Backend Development

Inside Ant's Real-Time Video Call System: Architecture & Optimizations

This article explores Ant Financial's real-time video call platform, detailing its technical choices, system architecture, signaling reliability design, network optimization strategies, and future directions for multi‑party video conferencing and interactive live streaming.

Ant Financial · Real-time Video · Signal Reliability
0 likes · 19 min read
Youku Technology
Oct 29, 2018 · Artificial Intelligence

Improving Online Video Experience: Youku’s End‑to‑End Video Quality Enhancement Techniques

Youku enhances online video by applying intelligent post‑production contrast mapping, device‑specific HDR tone‑mapping, high‑frame‑rate restoration through frame‑rate conversion, and ROI‑aware encoding that allocates bitrate to key visual areas, complemented by audio processing, to deliver cinema‑grade quality across diverse screens.

HDR · ROI encoding · Streaming
0 likes · 9 min read
ITPUB
Oct 23, 2018 · Big Data

How Meituan Built a Scalable Real‑Time Data Warehouse with Flink

This article explains how Meituan tackled growing real‑time data demands by redesigning its streaming platform, adopting a layered real‑time data warehouse architecture, selecting storage and compute technologies such as Cellar, Elasticsearch, Druid and Flink, and sharing practical tips on dimension expansion, joins, and aggregation to achieve higher throughput and lower latency.

Data Architecture · Flink · Meituan
0 likes · 15 min read
Meituan Technology Team
Oct 18, 2018 · Big Data

Building a Real-Time Data Warehouse with Flink at Meituan

Meituan replaced its Storm‑based pipeline with a four‑layer real‑time data warehouse powered by Flink, using hybrid storage (Cellar KV, Elasticsearch, Druid, MySQL) to deliver low‑latency, high‑throughput services, dramatically simplifying SQL‑driven development, unifying metrics, cutting compute costs, and paving the way for offline‑grade accuracy and reliability.

Flink · Meituan · Streaming
0 likes · 16 min read
Efficient Ops
Oct 13, 2018 · Big Data

Boost Your Kafka Integration with KafkaBridge: Multi-Language SDK Overview

KafkaBridge is a lightweight, multi-language SDK that simplifies Kafka read/write operations, offering unified interfaces, long‑connection reuse for PHP‑FPM, and reliable message delivery, with detailed compilation steps, usage examples, and performance benchmarks across C++, Python, PHP, and Go.

Golang · Kafka · PHP
0 likes · 7 min read
Alibaba Cloud Developer
Oct 12, 2018 · Artificial Intelligence

How Alibaba’s ‘Ali Xiaomi’ Prediction Platform Boosts Smart Customer Service with AI

Alibaba’s Ali Xiaomi prediction platform leverages AI techniques—including order and issue prediction, deep CTR models, reinforcement learning, and streaming computation—to proactively anticipate user intents, improve click‑through, resolution and satisfaction rates across multiple chatbot services, while addressing code duplication and model deployment challenges.

AI · Prediction · Streaming
0 likes · 14 min read
Big Data and Microservices
Sep 4, 2018 · Big Data

Exploring Five Big Data Architectures—from Traditional to Unified AI Designs

The article examines the evolution of big‑data processing by comparing five prevalent architectures (traditional Hadoop‑based stacks, streaming‑only designs, Kappa, Lambda, and Unified), highlighting their strengths, weaknesses, and suitable scenarios while discussing the limitations of classic BI systems and the role of distributed storage, computation, and machine‑learning integration.

Big Data · Data Architecture · Hadoop
0 likes · 14 min read
Meitu Technology
Aug 14, 2018 · Big Data

Meitu Data Platform Architecture and Practices

Meitu’s data platform, serving dozens of apps with 500 million monthly active users and billions of daily events, combines the Arachnia log‑collection system, Kafka ingestion, multi‑layer storage (HDFS, MongoDB, HBase, Elasticsearch), offline Hive/MapReduce processing and real‑time Storm/Flink/Naix pipelines, supported by data‑workshop tools, staged evolution for scalability, and robust security and query‑validation mechanisms.

Big Data · Data Platform · ETL
0 likes · 16 min read
Didi Tech
Aug 14, 2018 · Databases

Recap of the Open Source Salon: Latest Developments in Open Source Databases and Streaming Processing

On August 4 2018, Didi Open Source and the Open Source Database Forum hosted a salon where five industry experts presented the latest advances in open‑source databases—covering Redis multi‑data‑center deployment, MySQL InnoDB Cluster, streaming‑processing architecture, PostGIS GIS solutions, and Redis 5.0 features—followed by a Q&A, a prize draw, and a showcase of Didi’s growing open‑source portfolio.

PostGIS · Streaming · database
0 likes · 4 min read
AntTech
Jul 3, 2018 · Backend Development

Evolution of Financial‑Grade Message Queues at Ant Financial

The article reviews the ten‑year evolution of Ant Financial's message queue, detailing its core reliability, consistency, availability and performance requirements, the architectural mechanisms built to meet them, the shift to pull‑mode and API‑mode designs, and the recent integration of compute capabilities to create a smart data transmission platform.

Big Data · Distributed Systems · Message Queue
0 likes · 13 min read
ITPUB
Jun 2, 2018 · Big Data

Mastering Spark: Core Concepts, Architecture, Streaming & Performance Tuning

This comprehensive guide explains Spark's ecosystem, execution principles, key features, deployment architectures, core concepts like RDD, Transformations, Actions, Jobs, Stages, Shuffle and Cache, as well as Spark Streaming mechanics and practical resource‑tuning tips for optimal big‑data processing.

Big Data · Cluster · RDD
0 likes · 15 min read
iQIYI Technical Product Team
Oct 20, 2017 · Big Data

Evolution and Architecture of iQiyi's Big Playback Core

iQiyi’s big playback core, created in 2013 under architect Gavin, unified fragmented players across PC, mobile and TV by evolving from a C/C++ XBMC‑based V1 to feature‑rich V3 with DRM, Dolby, hybrid P2P‑CDN, VR, multi‑instance support and major performance gains, paving the way for an intelligent next‑gen native player.

Cross‑platform development · Media Engine · Software Architecture
0 likes · 11 min read
Qunar Tech Salon
Sep 25, 2017 · Big Data

Comprehensive Guide to Spark Ecosystem: Data Warehouse, Machine Learning, Streaming, and Enterprise Use Cases

This article provides an extensive overview of Apache Spark’s ecosystem—including its data‑warehouse capabilities, ML/MLlib libraries, streaming with Spark Streaming, external frameworks, and real‑world enterprise case studies—while also noting a promotional announcement for a React Native conference.

Big Data · Kafka · Spark
0 likes · 21 min read
21CTO
Jul 8, 2017 · Big Data

Ctrip’s Scalable Real‑Time User Behavior System with Kafka, Storm, Redis

This article details Ctrip’s redesign of its real‑time user behavior service, covering the new architecture, data flow, use of Java, Kafka, Storm, Redis, and MySQL, and how it achieves high real‑time performance, availability, scalability, and fault‑tolerance to support massive travel‑industry traffic.

Kafka · Real-Time · Storm
0 likes · 12 min read
Huawei Cloud Developer Alliance
Jun 29, 2017 · Mobile Development

Master Video App Development in 3 Simple Steps – Huawei’s Secret Guide

This guide walks Huawei competition participants through the essential steps for building a video app, covering user‑need analysis, UI simplification, backend setup, performance considerations, screen adaptation, messaging, precise marketing, and experience measurement, while providing useful resource links.

Backend · Huawei · Mobile Development
0 likes · 5 min read
MaGe Linux Operations
May 24, 2017 · Big Data

Demystifying Big Data: From HDFS to Spark, Hive, and Real‑Time Streaming

This article explains how big data challenges traditional storage, introduces HDFS for distributed file management, describes parallel processing frameworks like MapReduce, Tez, and Spark, compares higher‑level tools such as Hive and Pig, and explores real‑time streaming and key‑value stores for low‑latency analytics.

Hadoop · MapReduce · Spark
0 likes · 9 min read
MaGe Linux Operations
May 3, 2017 · Big Data

From Storage to Real‑Time: The Evolution of Big Data Technologies

This article outlines the three historical stages of big data technology—from early storage and batch processing, through market‑driven integration with Hive, to today’s focus on speed with Spark, Impala and streaming—while detailing the Hadoop ecosystem components such as HDFS, MapReduce, KV stores and emerging solutions like YDB.

HDFS · Hadoop · MapReduce
0 likes · 13 min read
21CTO
Mar 10, 2017 · Big Data

Inside Tencent Analytics: How TA Handles TB‑Scale Real‑Time Web Data

Tencent Analytics (TA) is a free web analytics platform that processes terabytes of daily data in real time, using a custom architecture featuring JavaScript collection, event streaming, in‑memory computation, and NoSQL storage with Redis and LevelDB, offering site owners instant insights and high availability.

Big Data · LevelDB · Real-time Processing
0 likes · 12 min read
Architecture Digest
Feb 28, 2017 · Big Data

Architecture and Real‑Time Processing Design of Tencent Analytics (TA)

This article explains the architecture, real‑time computation framework, and storage solutions of Tencent Analytics, detailing how massive TB‑level web‑traffic data are collected via JavaScript, processed in memory‑centric streaming components, and stored using Redis and LevelDB to achieve second‑level updates.

Big Data · LevelDB · NoSQL
0 likes · 13 min read
360 Quality & Efficiency
Jul 28, 2016 · Fundamentals

Fundamentals of Audio/Video Encoding and ffmpeg Command Basics

This article introduces ffmpeg as a powerful multimedia framework, explains container formats, bitrate, resolution, and frame rate concepts, outlines key live‑streaming performance metrics, and provides essential ffmpeg command‑line options and examples for streaming and transcoding.

Streaming · Video Encoding · audio encoding
0 likes · 6 min read
Architecture Digest
Jul 21, 2016 · Fundamentals

Design and Implementation of Low‑Latency Real‑Time Streaming Protocols: RTP, RTCP, and Packet‑Loss Solutions

The article explains why TCP‑based protocols cannot meet low‑latency requirements for live‑streaming conferences and introduces RTP, RTCP, jitter, round‑trip time, and three packet‑loss mitigation strategies—retransmission, forward error correction, and cross‑transport—along with a brief overview of DCCP for congestion control.

DCCP · FEC · Low-Latency
0 likes · 14 min read
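
Of the packet-loss strategies named in the summary above, forward error correction is the easiest to show in a few lines. The sketch below uses single-packet XOR parity, one of the simplest FEC schemes; it is an illustration rather than the scheme the article describes, the function names are hypothetical, and it assumes every packet in a group has the same length.

```python
def xor_parity(packets):
    """Return a parity packet: the byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received, parity):
    """Rebuild the one lost packet (marked None) from the survivors plus parity."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet per group")
    present = [packet for packet in received if packet is not None]
    # XOR of all surviving packets and the parity cancels everything but the lost one.
    return xor_parity(present + [parity])
```

The sender transmits the parity packet alongside each group, so a single loss per group is repaired without waiting a round trip for retransmission, at the cost of extra bandwidth.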
Meituan Technology Team
Apr 29, 2016 · Big Data

Introduction to Spark in Big Data

Apache Spark, a versatile big‑data platform supporting batch processing, SQL queries, real‑time streaming, and machine‑learning workloads, dramatically accelerates data‑intensive jobs, as demonstrated by Meituan‑Dianping, where its high‑performance engine reduces execution times and enhances scalability across diverse analytical and operational pipelines.

Batch Processing · Big Data · Spark
0 likes · 1 min read
ITPUB
Apr 29, 2016 · Databases

How to Stream Large MySQL Query Results Without Running Out of Memory

MySQL normally loads an entire query result into memory, which can cause out‑of‑memory errors on large tables, but by adding the -q option in the console, enabling useCursorFetch in JDBC URLs, and setting stmt.setFetchSize(Integer.MIN_VALUE), you can switch to a streaming mode that returns rows one at a time.

JDBC · Memory · ResultSet
0 likes · 3 min read

The Growing Role of Apache Kafka in Modern Big Data Architectures

The article explains how Apache Kafka has become a pivotal, highly scalable publish‑subscribe system in the big‑data ecosystem, addressing the limitations of traditional databases, enabling real‑time data integration across specialized distributed systems, and shaping future data‑governance practices.

Apache Kafka · Data Integration · Streaming
0 likes · 7 min read
Node Underground
Mar 25, 2016 · Backend Development

How to Implement Bigpipe with HTTP Chunked Transfer in Node.js, PHP, and Java

This article explores the Bigpipe technique for accelerating first‑screen rendering by leveraging HTTP 1.1 chunked transfer, comparing implementations in PHP, Java, Node.js (including Express and Koa), and demonstrating parallel module flushing with async patterns such as callbacks, async parallel, co, and async/await.

Bigpipe · HTTP chunked · Node.js
0 likes · 17 min read
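
The chunked transfer coding that Bigpipe relies on has a simple wire format, sketched below in Python for brevity (the article's own examples are in Node.js, PHP, and Java). The helper names are hypothetical; the framing itself follows the HTTP/1.1 rule that each chunk is its length in hexadecimal, CRLF, the payload, CRLF, with a zero-length chunk closing the body.

```python
LAST_CHUNK = b"0\r\n\r\n"  # zero-length chunk that terminates the body


def encode_chunk(data):
    """Frame one chunk: hex length, CRLF, payload, CRLF."""
    return f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"


def chunked_body(parts):
    """Yield wire-format chunks for each page fragment, then the terminator."""
    for part in parts:
        if part:  # skip empty fragments: a zero-length chunk would end the body early
            yield encode_chunk(part)
    yield LAST_CHUNK
```

Because each fragment is framed independently, a server can flush the page skeleton first and send each module as soon as it is ready, which is the behavior Bigpipe exploits.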
Architect
Mar 8, 2016 · Big Data

In‑Depth Analysis of Apache Kafka: Architecture, Core Concepts, and Benchmark

This article provides a comprehensive technical overview of Apache Kafka, covering its architecture, core concepts, design goals, comparison with other message queues, replication, consumer groups, delivery guarantees, and performance benchmarking, making it a valuable resource for big‑data engineers.

Big Data · Kafka · Replication
0 likes · 30 min read
21CTO
Mar 7, 2016 · Backend Development

When to Choose Kafka Over RabbitMQ: A Practical Comparison

This article compares Kafka and RabbitMQ, examining their design philosophies, throughput capabilities, consumer diversity, message ordering, and handling of individual messages, to help engineers decide which system suits high-volume or flexible-consumer scenarios and understand the trade-offs of each technology.

Kafka · RabbitMQ · Streaming
0 likes · 7 min read
ITPUB
Jan 19, 2016 · Databases

Surprising PostgreSQL Features That Redefine What a Database Can Do

This article showcases seven remarkable PostgreSQL extensions—including multi‑master replication, Greenplum MPP OLAP, pg_shard/FDW sharding, PostGIS 3D GIS, GPU‑accelerated PG‑Strom, PipelineDB streaming, and the versatile FDW interface—illustrating how they enable high‑availability, massive analytics, geographic intelligence, and real‑time data processing.

Database Extensions · FDW · GIS
0 likes · 5 min read
Architect
Dec 30, 2015 · Big Data

Real-Time Big Data Processing with Storm and Kafka on Alibaba Cloud

This article explains how to build a large‑scale, real‑time vehicle monitoring system using Apache Storm and Kafka on Alibaba Cloud, covering the challenges of big‑data ingestion, system architecture, deployment steps, performance testing, and practical lessons learned.

Alibaba Cloud · Big Data · Kafka
0 likes · 12 min read
dbaplus Community
Nov 27, 2015 · Big Data

Why Spark Is the Next Big Thing in Big Data: Core Concepts Explained

This article provides a comprehensive overview of Apache Spark, covering its origins, core concepts such as RDDs, transformations, actions, dependencies, execution modes, and key components like Spark SQL, Streaming, MLlib, and GraphX, while also offering practical code examples and visual illustrations.

DataFrames · GraphX · MLlib
0 likes · 18 min read
21CTO
Sep 30, 2015 · Operations

How LinkedIn Scaled Kafka to Process Over 1 Trillion Messages Daily

Since 2011, LinkedIn has expanded its Kafka deployment from handling billions to over a trillion messages per day, focusing on quotas, a new ZooKeeper‑free consumer, reliability enhancements, security, monitoring frameworks, fault‑injection testing, cluster balancing, and ecosystem integrations, offering valuable lessons for large‑scale streaming systems.

Kafka · LinkedIn · Reliability
0 likes · 12 min read
Qunar Tech Salon
Aug 18, 2015 · Big Data

Overview of Spark Big Data Analytics Framework Components

Spark’s big‑data analytics ecosystem comprises core components such as the in‑memory RDD data structure, Streaming for real‑time processing, GraphX for graph analytics, MLlib for machine‑learning, Spark SQL for querying, the Tachyon file system, and SparkR, each enabling scalable, distributed computation.

Big Data · GraphX · MLlib
0 likes · 5 min read