Tagged articles
558 articles
Big Data Technology & Architecture
Aug 5, 2019 · Big Data

Apache Spark Latest Technological Developments and Outlook for Spark 3.0+

The article provides a comprehensive overview of recent Apache Spark advancements—including Delta Lake, Data Source V2, runtime optimizations, relational cache, cloud‑native challenges, AI integration via Project Hydrogen, and the anticipated features of Spark 3.0—highlighting how these innovations address modern data‑warehouse, cloud, and machine‑learning workloads.

Apache Spark · Big Data · Data Warehouse
0 likes · 17 min read
Big Data Technology & Architecture
Jul 29, 2019 · Databases

Comprehensive Comparison of Apache Kylin and Apache Doris: Architecture, Data Models, Storage, Query, and Operations

This article provides an in‑depth technical comparison of Apache Kylin and Apache Doris, covering their system architectures, aggregation and detail data models, storage engines, data import processes, query execution, deduplication, metadata handling, performance, high availability, maintainability, usability, schema‑change capabilities, features, and community ecosystems.

Apache Doris · Apache Kylin · Big Data
0 likes · 21 min read
21CTO
Jul 18, 2019 · Databases

Why Snowflake Is Overtaking Oracle: A Deep Dive into Modern Data Warehousing

This article examines Snowflake’s rapid rise as a cloud‑native data warehouse, contrasts its architecture and operational advantages with Oracle’s legacy system, and explains how shifting market dynamics, open‑source alternatives, and cloud adoption are reshaping the database landscape.

Data Warehouse · Oracle · cloud computing
0 likes · 11 min read
360 Tech Engineering
Jul 18, 2019 · Databases

Principles and Practices of Apache Doris: Architecture, Key Technologies, and Real‑World Use Cases

This article presents a comprehensive overview of Apache Doris, covering its positioning as a distributed MPP analytical database, core architecture with FE and BE nodes, key technologies such as vectorized execution and materialized views, integration with Kafka and Elasticsearch, additional features, roadmap, and detailed case studies from Baidu Statistics and Meituan, illustrating its practical deployment and performance characteristics.

Apache Doris · Columnar Storage · Data Warehouse
0 likes · 25 min read
Alibaba Cloud Developer
Jul 9, 2019 · Big Data

How Youku Cut Costs and Boosted Performance by Migrating to MaxCompute

This article explains how Youku processed billions of daily logs, migrated from Hadoop to Alibaba Cloud MaxCompute in 2017, and achieved lower compute and storage costs, faster data delivery, and greater operational flexibility through a robust big‑data platform tailored to its complex business needs.

Cost Optimization · Data Migration · Data Warehouse
0 likes · 12 min read
Ctrip Technology
Jun 26, 2019 · Databases

Applying ClickHouse for a High‑Performance Hotel Data Intelligence Platform

This article describes how Ctrip Hotel's data intelligence platform leverages ClickHouse to achieve real‑time analytics on billions of daily updates and millions of queries, detailing the system architecture, data ingestion pipelines, monitoring, and operational lessons learned for large‑scale, high‑availability data services.

Data Warehouse · Real-time analytics · data pipeline
0 likes · 12 min read
Dada Group Technology
Jun 11, 2019 · Big Data

Building and Evolving the Dada‑JD Daojia Big Data Platform: Architecture, Strategies, and Lessons Learned

This article presents a comprehensive case study of the Dada‑JD Daojia big data platform, detailing its evolution from a MySQL‑based warehouse to a multi‑layered One Data, One Platform, One Service, Many Apps architecture, the technical challenges faced, and the strategic approaches adopted to ensure coverage, accuracy, stability, and scalability.

Big Data · Case Study · Data Governance
0 likes · 14 min read
21CTO
Jan 26, 2019 · Big Data

Data Lake vs Data Warehouse: Which One Powers Your Business?

This article explains the core differences between data lakes and data warehouses, their respective strengths, and how they complement each other to support both exploratory analytics and routine business reporting.

Analytics · Big Data · Data Lake
0 likes · 5 min read
58 Tech
Dec 28, 2018 · Big Data

Kylin OLAP Platform Architecture, Optimizations, and 58.com Case Study

This article introduces Kylin, an HBase‑based multidimensional analysis platform, and explains its architecture and performance optimizations (multi‑tenant support, dimension dictionary handling, and cube size estimation), illustrated by a real‑world deployment at 58.com.

Cube Optimization · Data Warehouse · HBase
0 likes · 14 min read
Tencent Cloud Developer
Dec 25, 2018 · Big Data

Open Source Practices and Trends in Big Data and AI – Insights from the Tencent Cloud Developer Conference

At the inaugural Tencent Cloud+ Community Developer Conference, experts highlighted open‑source big‑data and AI trends—from the Cloudera‑Hortonworks merger and evolving licenses to Tencent Cloud’s contributions such as Sparkling, Spark‑Hydrogen, and Angel—emphasizing the need to nurture both visible features and hidden ecosystem foundations through active community stewardship.

AI · Data Warehouse · Tencent Cloud
0 likes · 14 min read
Sohu Tech Products
Dec 12, 2018 · Databases

Optimizing MySQL Performance with Read/Write Splitting, Columnar Storage, and Dynamic Scheduling

The article details a real‑world MySQL performance case where a sudden 100‑fold load increase was mitigated through read/write splitting, replica‑based statistics, limited index tuning, middleware‑driven sharding, and finally a columnar storage layer (Infobright) with scripted dynamic data synchronization, achieving dramatic latency reductions and scalable architecture.

Columnar Storage · Data Warehouse · Infobright
0 likes · 12 min read
JD Retail Technology
Dec 12, 2018 · Big Data

Construction and Architecture of JD Overseas Data Analysis Platform (Columbus Platform)

JD.com’s overseas data analysis platform, dubbed the Columbus platform, combines a lightweight data warehouse deployment with standardized, customizable BI tools to provide real‑time and offline analytics, visualization, KPI management, and future self‑service reporting and predictive capabilities for its global e‑commerce operations.

Analytics · BI · Big Data
0 likes · 9 min read
NetEase Game Operations Platform
Dec 5, 2018 · Big Data

Presto + Alluxio Architecture for Interactive Ad‑hoc Queries in NetEase Game Data Warehouse

This article describes how NetEase Games built a Presto‑based interactive ad‑hoc query platform backed by Alluxio caching to achieve sub‑10‑second query latency, outlines the architectural design, performance comparisons with other Hadoop‑based solutions, encountered issues, and future improvement plans.

Alluxio · Big Data · Data Warehouse
0 likes · 10 min read
58 Tech
Nov 26, 2018 · Big Data

Big Data OLAP Applications and Practices: Insights from Xiaomi and 58.com

The article reviews the 2018 58 Group technology salon on big‑data OLAP, summarizing Xiaomi’s one‑stop OLAP architecture, 58.com’s challenges and solutions using Kylin, Druid, and UnionSQL, and the practical implementations and optimizations that illustrate modern OLAP practices.

Data Warehouse · Druid · Kylin
0 likes · 12 min read
dbaplus Community
Nov 4, 2018 · Databases

How Spark Turns Traditional Databases into Powerful OLAP Engines

This article examines why traditional relational databases like MySQL struggle with analytical workloads, compares ROLAP and MOLAP approaches, explains Spark’s architecture and its advantages for OLAP, and details how Alibaba Cloud’s DRDS HTAP leverages a Spark‑based engine to deliver real‑time distributed query processing.

Data Warehouse · Distributed Systems · HTAP
0 likes · 11 min read
Tencent Cloud Developer
Oct 30, 2018 · Big Data

Big Data Technology Trends and Cloud Data Warehouse Architecture Practices

The article reviews recent big-data trends—from Hadoop’s evolution and Spark’s in-memory advances to emerging storage like Ozone—while detailing data-warehouse models, query-optimizer techniques, and cloud-native architectures that integrate diverse data sources, enabling scalable, AI-ready analytics and modern data-lake capabilities.

Big Data · Data Lake · Data Warehouse
0 likes · 30 min read
Youzan Coder
Sep 15, 2018 · Big Data

How Data Empowers Operations: Insights from Youzan & NetEase’s Big Data Summit

On September 15, Youzan’s big-data team and NetEase YouShu hosted a technical sharing titled “The Road to Data-Driven Operations,” where speakers explored the evolution of Youzan’s data warehouse metadata system, the architecture of its big-data development platform, and the application of functional programming in visual data analysis, highlighting current trends and future directions.

Data Warehouse · Data visualization · Operations
0 likes · 4 min read
Qizhuo Club
Sep 11, 2018 · Artificial Intelligence

How 360 Mobile Assistant Built a Scalable AI‑Powered App Recommendation System

This article details the design, architecture, and key components of 360 Mobile Assistant's recommendation system, covering business scenarios, data warehouse and computing layers, feature engineering, model selection, and online deployment strategies to improve app discovery and user engagement.

CTR prediction · Data Warehouse · feature engineering
0 likes · 19 min read
Youzan Coder
Aug 3, 2018 · Big Data

Youzan Data Warehouse Metadata System: From Manual Tables to Metadata‑Driven Architecture

Youzan’s data‑warehouse metadata system evolved from manually maintained tables to an automated data dictionary and finally to a metadata‑driven architecture that automatically captures technical, business, and process metadata, visualizes lineage, tracks resource usage, manages synchronization rules and permissions, and now aims to improve novice usability with visual models and impact‑analysis tools.

Big Data · Data Warehouse · Hive
0 likes · 11 min read
Architects Research Society
Jul 27, 2018 · Big Data

Overview of Apache Hive Features, Usage, and Management

Apache Hive is an open‑source data‑warehouse system built on Hadoop that enables users to read, write, and manage large distributed datasets using SQL‑like queries, offering features such as ETL support, various file‑format connectors, extensible UDFs, and integration with tools like Tez, Spark, and MapReduce.

Apache Hive · Big Data · Data Warehouse
0 likes · 5 min read
Alibaba Cloud Developer
Jul 23, 2018 · Big Data

How Alibaba’s MaxCompute Became the Backbone of 99% of Data Processing

This article reviews Alibaba's MaxCompute evolution from ODPS to a unified, multi‑cluster big‑data platform, detailing its architecture, development tools, large‑scale deployments, performance optimizations, typical workload scenarios, and why it is the preferred choice for enterprise data processing.

Alibaba Cloud · Big Data · Data Platform
0 likes · 22 min read
360 Quality & Efficiency
Jun 28, 2018 · Big Data

An Introduction to Apache Hive: Architecture, Workflow, Storage, Advantages, and Comparison with Traditional Databases

This article provides a concise overview of Apache Hive, covering its definition, Hadoop background, architecture, query workflow, storage model, advantages, disadvantages, and a comparison with traditional relational databases, helping readers understand how Hive enables SQL-like queries on data stored in HDFS.

Data Warehouse · Hadoop · Hive
0 likes · 5 min read
Beike Product & Technology
Apr 26, 2018 · Big Data

Chain Home's OLAP Platform and Kylin Usage

This article details Chain Home's OLAP platform architecture and its use of Kylin, covering the evolution from an early ROLAP design to a MOLAP multidimensional engine, Kylin's basic principles, platform structure, application scenarios, usage specifications, capability extensions, and middleware development.

Apache Kylin · Big Data · Chain Home
0 likes · 11 min read
Qunar Tech Salon
Apr 10, 2018 · Big Data

Design and Implementation of Meituan's Traffic Compass Data Warehouse for Hotel‑Travel Business

The article presents Meituan's Traffic Compass—a data‑warehouse‑driven traffic analysis platform for the hotel‑travel business—detailing its background, challenges, architectural layers, dimensional modeling, Kylin‑based query engine, configuration mechanisms, performance metrics, and future optimization plans.

Analytics · Big Data · Data Warehouse
0 likes · 14 min read
21CTO
Apr 2, 2018 · Big Data

How to Build a Scalable Friend Recommendation System with MaxCompute

This article explains how to leverage Alibaba Cloud's MaxCompute and MapReduce to design, model, and deploy a large‑scale social friend recommendation system, covering data requirements, analysis models, cloud architecture, and practical development steps.

Data Warehouse · Friend Recommendation · MaxCompute
0 likes · 12 min read
Meituan Technology Team
Mar 22, 2018 · Big Data

DataMan: A Data Quality Governance Platform for Meituan's Big Data Ecosystem

Meituan’s DataMan platform provides a unified, closed‑loop data‑quality governance solution that collects demand, refines rules, executes monitoring across offline and real‑time jobs, tracks issues, and builds a knowledge base, improving completeness, accuracy, consistency, and timeliness while optimizing storage, reducing fault resolution time, and supporting data‑driven decisions.

Data Governance · Data Quality · Data Warehouse
0 likes · 17 min read
DevOps
Dec 14, 2017 · Databases

Introduction to TFS Data Warehouse and TFS Analysis Services with Excel Reporting

This guide explains how TFS stores its data in SQL Server, outlines the four core databases and the Analysis Services multidimensional warehouse, and provides step‑by‑step instructions for creating Excel‑based reports using the Tfs_Analysis cube and code churn metrics.

Analysis Services · Data Warehouse · Excel Reporting
0 likes · 5 min read
Meituan Technology Team
Nov 2, 2017 · Big Data

Dashiang Cube: A Multi‑Source BI Reporting Tool with Custom Join Algorithms

Meituan‑Dianping’s Dashiang Cube is a multi‑source BI reporting platform that unifies MySQL, Kylin, Elasticsearch and plain‑text data via a common SQL layer, generates dialect‑specific queries, performs custom back‑tracking inner and left outer joins across heterogeneous sources, supports scripted metric calculations, permission controls, and a reusable UI component library for self‑service reporting.

BI · Data Permissions · Data Warehouse
0 likes · 14 min read
ITPUB
Sep 30, 2017 · Big Data

Designing Scalable Open‑Source ETL Systems: Lessons from Baidu Waimai

This talk details Baidu Waimai's end‑to‑end ETL design, covering demand sources, data flow patterns, multi‑stage system evolution, storage choices, scheduling architecture, configuration‑driven processing, quality monitoring, and how data lineage enables transparent, self‑service data delivery.

Big Data · Data Quality · Data Warehouse
0 likes · 25 min read
Qunar Tech Salon
Sep 25, 2017 · Big Data

Comprehensive Guide to Spark Ecosystem: Data Warehouse, Machine Learning, Streaming, and Enterprise Use Cases

This article provides an extensive overview of Apache Spark’s ecosystem—including its data‑warehouse capabilities, ML/MLlib libraries, streaming with Spark Streaming, external frameworks, and real‑world enterprise case studies—while also noting a promotional announcement for a React Native conference.

Big Data · Data Warehouse · Hive
0 likes · 21 min read
dbaplus Community
Sep 24, 2017 · Databases

Why Splitting a Giant SQL Query Cut Report Time by 6 Seconds

In a high‑pressure performance‑optimization project, a team broke a massive SQL report built from many WITH clauses into smaller queries over temporary tables, applied dimensional modeling, and achieved a 6‑second runtime reduction while handling complex reporting requirements across regions, contractors, and milestones.

Data Warehouse · Query Refactoring · Reporting
0 likes · 10 min read
StarRing Big Data Open Lab
Aug 25, 2017 · Big Data

How to Optimize OLAP Cubes with Rubik: Dimensional Reduction Strategies Explained

This article walks through Rubik's OLAP cube reduction techniques—including aggregation groups, required, combined, derived, hierarchical, and partial cubes—by designing and implementing buyers and suppliers cubes with six tables, demonstrating performance gains through pre‑computed queries and SQL examples.

Cube · Data Warehouse · Dimensional Reduction
0 likes · 10 min read
dbaplus Community
Jul 16, 2017 · Big Data

How Vipshop Scaled Real‑Time OLAP: From GreenPlum to Presto, Kylin, and Redis

Vipshop faced massive data growth that broke traditional RDBMS, causing slow OLAP queries, inefficient ETL, and long development cycles, so it iteratively rebuilt its analytics stack—adding Hadoop/Hive, a self‑service UI, Presto, Kylin, and Redis—to achieve sub‑second query responses, higher concurrency, and a flexible, low‑latency BI solution.

Data Warehouse · Kylin · OLAP
0 likes · 23 min read
MaGe Linux Operations
May 31, 2017 · Big Data

Essential Skills for a Successful Data Career: From Big Data Platforms to AI

This article outlines the critical competencies needed across the data field—from building and maintaining big data platforms and data warehouses to mastering visualization, analysis, mining, and deep learning—offering practical guidance for aspiring data professionals seeking long‑term career growth.

Data Science · Data Warehouse · career guide
0 likes · 15 min read
Architecture Digest
May 25, 2017 · Big Data

Designing Data Warehouse Layers: Principles, Models, and Practical Practices

This article explains why data warehouses should be layered, describes the classic ODS‑DW‑APP model, details each layer’s purpose and implementation techniques, presents an improved layering scheme with dimension and temporary tables, and answers common questions about parallel DWS and DWD processing.

Big Data · Data Architecture · Data Warehouse
0 likes · 17 min read
Qunar Tech Salon
Mar 12, 2017 · Big Data

Essential Skills and Career Paths for Data Professionals: From Big Data Platforms to AI

The article outlines the key competencies, responsibilities, and career development advice for data professionals across the entire data stack—from building big‑data platforms and data warehouses to visualization, analysis, algorithm engineering, and deep‑learning applications—emphasizing the importance of creating business value with data.

Big Data · Data Analyst · Data Warehouse
0 likes · 15 min read
ITPUB
Mar 9, 2017 · Databases

How to Optimize Data Warehouse Indexes for Faster Queries

This article explains practical strategies for indexing dimension and fact tables in a data warehouse, covering when to use clustered versus non‑clustered indexes, how to handle surrogate and business keys, partition considerations, and tips for evolving index designs as data grows.

Data Warehouse · SQL Server · dimensional modeling
0 likes · 7 min read
Ctrip Technology
Mar 8, 2017 · Big Data

Essential Skills and Career Path for Data Professionals: From Big Data Platforms to AI Applications

This article outlines the key competencies and career roadmap for data professionals, covering big‑data infrastructure, data‑warehouse engineering, visualization, analysis, algorithmic mining, and deep‑learning, while emphasizing the importance of business sense, cloud adoption, and continuous learning.

Data Warehouse · Data visualization · career advice
0 likes · 15 min read
dbaplus Community
Jan 8, 2017 · Big Data

How to Build a Cost‑Effective Data Platform for Small‑to‑Medium Enterprises

This article explains why data platforms are essential for modern SMEs, defines what a data platform is, outlines a four‑step methodology (source definition, analysis theme, ETL processing, and reporting), and shares architectural choices, team structures, common pitfalls, and practical advice for rapid, iterative implementation.

Data Architecture · Data Platform · Data Warehouse
0 likes · 15 min read
Architecture Digest
Nov 6, 2016 · Big Data

Evolution of Taobao’s Big Data Platform: From RAC to MaxCompute

The article chronicles Taobao’s 13‑year evolution of its big data platform, detailing three phases—from a single‑node Oracle setup and the Tianwang scheduler, through a Hadoop‑based “Cloud Ladder 1” architecture with real‑time analytics, to the current MaxCompute/ODPS era with cross‑region projects and advanced data services.

Big Data · Data Platform · Data Warehouse
0 likes · 11 min read
Java High-Performance Architecture
Oct 21, 2016 · Big Data

What Is Hive and How Does It Turn SQL into MapReduce?

This article explains Hive as a SQL‑based interface for Hadoop, shows why it simplifies large‑scale data analysis, provides practical command‑line examples for table creation, data loading, and queries, and details how HiveQL is internally converted into MapReduce jobs.

Data Warehouse · Hive · MapReduce
0 likes · 6 min read
Ctrip Technology
Aug 26, 2016 · Big Data

Exploring OLAP Engine with Apache Kylin: Architecture, Theory, and Practical Applications in Flight Ticket Big Data

This article presents a comprehensive overview of the Qdata session on OLAP engine exploration, detailing the limitations of traditional MySQL‑based solutions, the requirements for large‑scale analytics, the architecture and theoretical foundations of Apache Kylin, its cube construction process, storage in HBase, query rewriting, real‑world flight‑ticket data applications, and the encountered challenges with corresponding optimization practices.

Apache Kylin · Cube · Data Warehouse
0 likes · 7 min read
ITPUB
Jul 19, 2016 · Big Data

From Traditional Data Warehouses to Big Data: Practical Techniques and Migration Insights

The talk shares hands‑on experiences and best‑practice methods for traditional data‑warehouse processing, public and behavioral data handling in big‑data environments, and practical guidance for migrating legacy warehouses to modern Hadoop‑based platforms, emphasizing data governance, security, and performance optimization.

Big Data · Data Governance · Data Warehouse
0 likes · 13 min read
ITPUB
Jun 29, 2016 · Big Data

Why OLTP Falls Short for Big Data: OLAP, Hadoop & MPP Explained

The article explains how traditional OLTP systems cannot satisfy modern big‑data analytics needs and compares OLAP, Hadoop, and MPP architectures, highlighting their data processing models, scalability, cloud‑based managed services, and practical recommendations for building effective data warehouses.

Big Data · Cloud Services · Data Warehouse
0 likes · 21 min read
ITPUB
Jan 20, 2016 · Big Data

How Meizu Built an Agile Big Data Platform for Millions of Users

The Meizu Tech Open Day showcased the company's rapid evolution to a data‑driven mobile internet firm, detailing its DW1.0 and DW2.0 data‑warehouse architectures, recommendation pipelines, Spark adoption, and ELK‑based log analytics, while sharing practical lessons and future challenges.

Big Data · Data Architecture · Data Warehouse
0 likes · 11 min read
Baidu Maps Tech Team
Jan 6, 2016 · Big Data

How Baidu Maps Scales Billion‑Row OLAP Queries with Apache Kylin

Baidu Maps’ Data Intelligence team built a large‑scale OLAP platform using Apache Kylin, detailing the challenges of multi‑dimensional analysis on billions of rows, the architecture, custom extensions for task, resource, and monitoring management, and performance optimizations that achieve millisecond‑level SQL responses.

Apache Kylin · Big Data · Data Warehouse
0 likes · 21 min read
21CTO
Dec 3, 2015 · Big Data

How Netflix Scales Its Hadoop Data Warehouse on AWS with Genie PaaS

This article explains how Netflix leverages Amazon S3 and Elastic MapReduce to build a virtually unlimited, dynamically scalable Hadoop data warehouse in the cloud, and introduces Genie—a Hadoop platform‑as‑a‑service that abstracts job submission, resource management, and cluster orchestration.

AWS · Data Warehouse · Elastic MapReduce
0 likes · 15 min read
Architect
Dec 2, 2015 · Big Data

Designing an Agile Data Warehouse Architecture for Internet Companies

The article outlines a practical, end‑to‑end data platform architecture for internet businesses, covering data collection, storage and analysis, sharing, real‑time processing, task scheduling, and the importance of simplicity and agility in building an agile data warehouse.

Big Data · Data Architecture · Data Warehouse
0 likes · 10 min read
21CTO
Nov 4, 2015 · Big Data

Evolution of Dazhong Dianping’s Data Platform (2012‑2014): Key Lessons for Growing Big Data Teams

This article chronicles the step‑by‑step evolution of Dazhong Dianping’s data platform from 2012 to 2014, detailing changes in data models, storage and compute architecture, scheduling, monitoring, and data‑driven applications, offering practical insights for teams building early‑stage big‑data infrastructures.

Big Data Architecture · Data Platform · Data Warehouse
0 likes · 7 min read
Architect
Oct 17, 2015 · Big Data

Designing an Agile Data Warehouse and Data Platform for Internet Companies

The article outlines the purposes, architecture, data ingestion, storage, analysis, sharing, application, real‑time processing, scheduling, monitoring, and best‑practice recommendations for building a fast, flexible, and reliable big‑data platform in the fast‑changing internet industry.

Big Data · Data Warehouse · Hadoop
0 likes · 12 min read
ITPUB
Aug 28, 2014 · Databases

How Oracle DBAs Tackle Performance: Real‑World Stats Tuning and Career Insights

In this interview, veteran Oracle DBA Shi Yuedong shares his career journey, personal philosophy on opportunity, a hands‑on performance‑diagnosis case at Lenovo, experiments revealing Oracle 12.1's default statistics sampling rate, and thoughtful perspectives on big data versus data‑warehouse evolution.

DBA · Data Warehouse · Database Performance
0 likes · 18 min read