Tagged articles
16 articles
Alibaba Cloud Developer
May 27, 2024 · Big Data

How MaxCompute’s New Offline‑Near‑Real‑Time Architecture Revolutionizes Big Data Workloads

This article explains how MaxCompute’s integrated offline‑and‑near‑real‑time architecture, built on Delta Table, solves complex big‑data scenarios by providing unified storage, ACID transactions, upsert, time‑travel, automatic data‑file governance and low‑latency query capabilities while reducing cost and operational complexity.

Delta Table · MaxCompute · data-warehouse
27 min read
Alibaba Cloud Big Data AI Platform
Apr 16, 2024 · Big Data

MaxCompute’s Integrated Offline & Near‑Real‑Time Architecture: Transaction Table 2.0 Explained

This article explains MaxCompute’s new integrated offline‑and‑near‑real‑time architecture, Transaction Table 2.0, detailing its unified storage and compute design, automatic data governance, schema evolution, upsert and time‑travel capabilities, and how it simplifies complex big‑data pipelines while delivering minute‑level latency and lower costs.

Big Data · Data Governance · MaxCompute
27 min read
Senior Tony
Aug 20, 2023 · Fundamentals

Why Elasticsearch Is Called Near‑Real‑Time and How It Works Under the Hood

This article explains Elasticsearch’s near‑real‑time nature, its core mechanisms such as inverted indexes and tokenizers, common interview scenarios, search types, refresh strategies, and the difference between query and filter, helping readers understand when and why to choose ES for full‑text and complex queries.

Query vs Filter · Search Types · inverted index
15 min read
ByteDance Data Platform
Nov 16, 2022 · Big Data

How ByteDance’s Data Lake Powers Near‑Real‑Time E‑Commerce Analytics

This article explains ByteDance’s data lake technology, its Apache Hudi‑based features, near‑real‑time architecture, and practical e‑commerce use cases such as marketing promotion, traffic diagnosis, logistics monitoring, risk governance, and operational monitoring, while outlining future challenges and plans.

Apache Hudi · Big Data Architecture · Data Lake
15 min read
DataFunTalk
Oct 4, 2022 · Big Data

Near‑Real‑Time Data Lake Practices in TikTok E‑commerce Data Warehouse

The presentation by TikTok e‑commerce data‑warehouse engineer Ma Wenyuan explains data‑lake characteristics, near‑real‑time architecture, and practical e‑commerce use cases, highlighting Apache Hudi features, hybrid batch‑stream processing, and future challenges for scaling and integration.

Data Lake · Hudi · Streaming
13 min read
Intelligent Backend & Architecture
Apr 23, 2021 · Big Data

Mastering Elasticsearch: Core Concepts, Architecture, and Performance Tips

This comprehensive guide explains Elasticsearch’s fundamentals, including its distributed architecture, indexing process, shard and replica mechanisms, query execution, near‑real‑time search, segment management, and practical optimization techniques, providing developers and engineers with the knowledge needed to design, operate, and troubleshoot large‑scale search clusters.

Distributed Systems · indexing · near real-time
71 min read
Programmer DD
Jan 28, 2021 · Databases

How Elasticsearch Writes, Reads, and Searches Data: Inside the Engine

This article explains Elasticsearch's internal mechanisms for writing, reading, and searching data, covering the roles of coordinating nodes, primary and replica shards, buffers, translog, segment files, refresh cycles, commit and flush operations, as well as Lucene's inverted index and how deletions and updates are handled.

Elasticsearch · Segment · inverted index
10 min read
Programmer DD
Aug 8, 2020 · Artificial Intelligence

How Elasticsearch Handles Write, Read, and Search: Inside the Engine

This article explains Elasticsearch's internal mechanisms for indexing, querying, and retrieving data, covering the roles of coordinating nodes, primary and replica shards, the refresh and commit cycles, near‑real‑time search, and the underlying Lucene inverted index.

Elasticsearch · data ingestion · indexing
12 min read
JavaEdge
Jun 26, 2019 · Backend Development

How Does Elasticsearch Write and Query Data? A Deep Dive into ES Internals

This article explains the complete workflow of Elasticsearch write, read, search, delete, and update operations, covering coordinating nodes, shard routing, buffer refresh, translog, segment files, commit/flush processes, and the underlying inverted index mechanism.

Elasticsearch · Search Architecture · near real-time
10 min read
System Architect Go
Jul 29, 2018 · Databases

What Is Elasticsearch? Core Concepts and Fundamentals

Elasticsearch is an open‑source, scalable, highly available distributed full‑text search engine that operates in near real time, using clusters of nodes, indexes, documents, shards, and replicas to store and retrieve large volumes of data efficiently.

Cluster · Distributed Systems · Elasticsearch
4 min read
StarRing Big Data Open Lab
Jul 28, 2017 · Big Data

How Transwarp Transporter Enables Near‑Real‑Time ETL in Big Data Pipelines

The article introduces Transwarp Transporter, a near‑real‑time ETL tool for TDH 5.x, and explains its architecture, visual dashboard, drag‑and‑drop data‑flow design, debugging features, and parameter management, highlighting how it empowers business users to achieve fast, reliable data migration in big‑data environments.

Data Integration · ETL · Transwarp
7 min read