Tag

big data analytics


Shopee Tech Team
Oct 25, 2024 · Big Data

StarRocks at Shopee: Practical Use Cases and Performance Analysis

Shopee’s deployment of StarRocks across DataService, DataGo, and DataStudio demonstrates that its vectorized engine, cost‑based optimizer, and materialized‑view caching can query Hive, Iceberg, Delta Lake and Hudi up to 20,000× faster than Presto, cutting CPU usage and delivering consistently lower latency for complex analytics.

Hive · MPP · Performance Benchmark
11 min read
Baidu Tech Salon
Oct 22, 2024 · Big Data

TDE-ClickHouse: Baidu MEG's High-Performance Big Data Analytics Engine

TDE‑ClickHouse, the core engine of Baidu MEG’s Turing 3.0 ecosystem, delivers sub‑second, self‑service analytics on petabyte‑scale data through decoupled compute, multi‑level aggregation, high‑cardinality and rule‑based optimizations, a two‑phase bulk‑load pipeline, cloud‑native deployment, and a lightweight meta service. The deployment now spans over 350,000 cores and 10 PB of storage and serves more than 150,000 daily BI queries with average response times under three seconds.

ClickHouse · Database Architecture · Performance Tuning
19 min read
vivo Internet Technology
Apr 17, 2024 · Big Data

Retention Analysis Model Practice Based on ClickHouse

The article explains retention analysis models and their importance for user loyalty, outlines an offline Hive architecture, and then shows how ClickHouse’s retention() function and columnar storage dramatically speed up multi‑day retention calculations, with SQL examples and practical guidance for product analytics.

ClickHouse · Data Modeling · Hive
17 min read
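To make the retention() summary above concrete, here is a minimal Python sketch that emulates the documented semantics of ClickHouse's retention() aggregate: for each group, the first flag is 1 if the first condition matched any row, and each later flag is 1 only if both the first condition and that condition matched. The helper name, the sample events, and the day-based conditions are illustrative assumptions, not code taken from the article.

```python
from collections import defaultdict
from datetime import date


def retention(rows, conditions):
    """Emulate ClickHouse retention() for one group of rows (hypothetical helper).

    Flag i is 1 only if conditions[0] and conditions[i] each matched
    at least one row in the group; otherwise it is 0.
    """
    met = [any(cond(row) for row in rows) for cond in conditions]
    return [int(met[0] and m) for m in met]


# Assumed sample data: (user_id, activity_date) pairs.
events = [
    ("u1", date(2024, 1, 1)),
    ("u1", date(2024, 1, 2)),
    ("u2", date(2024, 1, 1)),
    ("u2", date(2024, 1, 3)),
    ("u3", date(2024, 1, 2)),
]

# One condition per day: day 0 (the cohort day), day 1, day 2.
days = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)]
conditions = [lambda row, d=d: row[1] == d for d in days]

# Group rows by user, mirroring GROUP BY user_id.
by_user = defaultdict(list)
for row in events:
    by_user[row[0]].append(row)

per_user = {user: retention(rows, conditions) for user, rows in by_user.items()}

# Sum each flag column to get cohort size and retained users per day.
totals = [sum(column) for column in zip(*per_user.values())]
```

With this sample, `u3` is excluded from every column because the user was not active on the cohort day, which is the behavior that makes the function convenient for multi-day retention.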
AntTech
Dec 13, 2023 · Information Security

Graph-Based Intelligent Risk Control: Technologies, Infrastructure, and Real‑World Cases

The article reviews the rise of graph‑based intelligent risk control in the digital economy, outlining its technological evolution, key algorithmic capabilities, underlying infrastructure requirements, and practical case studies that demonstrate its impact on financial security and high‑concurrency scenarios.

Graph Neural Networks · big data analytics · financial security
9 min read
Youzan Coder
Jul 7, 2022 · Big Data

Optimizing Apache Doris Performance: A Case Study in Query Processing

Youzan replaced ClickHouse and Druid with Apache Doris and refined its vectorized engine by eliminating deserialization overhead in the merge‑aggregation phase, improving query performance by roughly 30%. The team validated compatibility through SQL rewriting and traffic replay, and plans further SIMD‑based optimizations and broader adoption.

Apache Doris · ClickHouse · Druid
8 min read
DataFunTalk
Sep 22, 2021 · Big Data

Distributed Storage and Application Solutions for Massive Spatiotemporal Data

This article explains the rapid growth of global spatiotemporal data, the limitations of traditional GIS, and presents SuperMap's distributed storage architecture, unified data access APIs, dynamic rendering techniques, and geographic processing modeling with real‑world case studies to address performance and scalability challenges.

Distributed Storage · GIS · big data analytics
16 min read
vivo Internet Technology
Aug 20, 2018 · Big Data

Circos: The Beauty of Circle - Data Visualization with Circos

Yang Zhentao’s 2018 conference talk surveys data‑visualization fundamentals, highlights the multidisciplinary skills required, introduces the open‑source Circos tool and its polar‑coordinate workflow, showcases genomic and business use cases, and compares alternative platforms, emphasizing data quality, query capability, and proper view selection.

Circos · Data Visualization · SVG
21 min read
Efficient Ops
Feb 20, 2017 · Information Security

Inside YY's Security Ops: Real-World Incident Stories and Architecture

This article shares YY's security operations journey, detailing real incident response scenarios, the evolution of their security infrastructure from 2012 onward, and the key factors considered when building a robust security ops system, including DDoS protection, WAF, vulnerability scanning, intrusion detection, and data‑driven automation.

DDoS protection · big data analytics · incident response
24 min read