Tagged articles
234 articles
Ctrip Technology
Jul 24, 2018 · Backend Development

Design and Implementation of CTran V3: A Multilingual Translation Platform for Ctrip International Business

This article presents a comprehensive case study of CTran V3, a redesigned multilingual translation platform for Ctrip's international business, detailing its architecture, data flow, job scheduling, translation engine, real‑time services, and lessons learned to guide similar large‑scale content localization projects.

Backend · Job Scheduling · content management
21 min read
Dada Group Technology
Jul 24, 2018 · Operations

Building a Scalable Growth Operations Platform: User Grouping, Dynamic Queries, and Automation

The article describes how a growth operations team can improve efficiency by designing a flexible user‑grouping system, dynamic query generation, and automated rule execution, while addressing data latency, real‑time processing, and scalability challenges through a Lambda‑style architecture.

Automation · Dynamic Query · Lambda architecture
14 min read
Efficient Ops
Jun 6, 2018 · Big Data

How Tencent’s Multi‑Dimensional Monitoring Turns Big Data Into Real‑Time Business Insights

This article explains how Tencent’s ZhiYun multi‑dimensional monitoring system evolves from the Mobile Monitor platform, outlines its design principles, data‑factory capabilities, storage choices, and intelligent features, and demonstrates how it enables real‑time, multi‑dimensional analysis and alerting for large‑scale business operations.

Big Data · Druid · Storm
11 min read
Java Captain
May 24, 2018 · Big Data

Debugging a Kafka Data Drop: A Step‑by‑Step Troubleshooting Case Study

After a recent feature release caused a sharp decline in a key data metric, the team followed a systematic, fourteen‑step troubleshooting process—including verification, code review, DBA involvement, local debugging, environment comparison, logging, packet capture, service restarts, request mode changes, load testing, and partition resizing—to identify and resolve a Kafka‑related throughput bottleneck.

Kafka · Load Testing · Performance debugging
8 min read
21CTO
Apr 9, 2018 · Artificial Intelligence

How E‑Commerce Platforms Build Effective Product Recommendation Systems

This article explains the fundamentals and advanced techniques of e‑commerce product recommendation systems, covering conventional and personalized approaches, user profiling, data collection, storage, modeling, the three‑stage pipeline of preprocessing, recall and ranking, as well as system architecture, challenges, and key algorithms such as LR and GBDT.

data pipeline · e‑commerce · machine learning
17 min read
Snowball Engineer Team
Mar 23, 2018 · Big Data

Redesigning Snowball's Log Collection Architecture During Hadoop Cluster Expansion

The article details Snowball's challenges with a saturated CDH Hadoop cluster, outlines the limitations of the original Kafka‑based log pipeline, and explains how a comprehensive redesign using FlumeNG, Spillable Memory Channels, and custom HDFS sinks resolves latency, data loss, and high‑load issues while supporting future growth.

Cluster Migration · FlumeNG · Hadoop
6 min read
Meitu Technology
Dec 19, 2017 · Industry Insights

Inside Meitu’s In‑House Log Collection System Arachnia: Design, Challenges, and Core Mechanisms

This article introduces Meitu’s self‑developed log collection system Arachnia, explaining why a custom solution was needed for massive server‑side user‑behavior logs, the key requirements such as reliability and real‑time throughput, and the core architectural mechanisms that address those challenges.

Arachnia · Big Data · Meitu
2 min read
Efficient Ops
Dec 18, 2017 · Operations

How WiFi Key Built a Million‑User Monitoring Platform: Architecture and Best Practices

This article describes how WiFi Master Key (WiFi 万能钥匙) designed and implemented the Roma monitoring platform to handle billions of daily requests, covering background challenges, architectural principles, component design, data collection, transmission, storage, alerting, and future directions for large‑scale observability.

Microservices · Observability · Operations
16 min read
Meituan Technology Team
Dec 1, 2017 · Big Data

Metric Logic Tree: Automated Anomaly Analysis for Business Metrics

The Metric Logic Tree automates business metric anomaly analysis by integrating heterogeneous data sources (Kylin, MySQL, Elasticsearch, Druid) with a three‑layer architecture—metric calculation, algorithmic analysis (waterfall and Gini‑coefficient methods), and a master‑worker computation service—that parallelizes queries, delivers immediate conclusions, and shortens decision cycles, as demonstrated in Meituan‑Dianping’s hotel‑travel operations.

Big Data · algorithm · anomaly detection
7 min read
Efficient Ops
Nov 15, 2017 · Big Data

How Tencent Built a 10 TB‑Per‑Day Full‑Link Log Monitoring Platform

This article explains how Tencent's ZhiYun full‑link log monitoring platform handles massive daily logs, overcomes challenges of diverse log formats, high throughput, fault‑tolerant design, and provides scalable storage, query, and alerting capabilities for distributed micro‑service environments.

Big Data · Distributed Systems · Log Monitoring
10 min read
ITPUB
Sep 30, 2017 · Big Data

Designing Scalable Open‑Source ETL Systems: Lessons from Baidu Waimai

This talk details Baidu Waimai's end‑to‑end ETL design, covering demand sources, data flow patterns, multi‑stage system evolution, storage choices, scheduling architecture, configuration‑driven processing, quality monitoring, and how data lineage enables transparent, self‑service data delivery.

Big Data · Data Quality · Data Warehouse
25 min read
ITPUB
Sep 29, 2017 · Big Data

Designing an Open ETL System: Baidu Waimai’s Scalable Data Pipeline Practices

In this talk, a Baidu Waimai engineer explains the motivations, requirements, and architectural choices behind their open‑source ETL platform, covering data flow patterns, logical mappings, storage options, scheduling, metadata management, and quality monitoring to achieve scalable, transparent, and explainable data delivery.

Big Data · ETL · Scheduling
26 min read
Meituan Technology Team
Sep 21, 2017 · Big Data

Feature Production Scheduling: Architecture Evolution and Core Technologies

Using Meituan‑Dianping’s hospitality online feature system as a case study, the article describes how feature production scheduling evolved from offline batch ETL to automated, metadata‑driven pipelines and sub‑second streaming, detailing the underlying architecture, incremental updates, storage abstraction, write‑shaving, atomicity, and recovery mechanisms.

Big Data · Real-time Processing · System Architecture
23 min read
Architecture Digest
Sep 15, 2017 · Artificial Intelligence

Overview of Recommendation Systems: Goals, Methods, Architecture, and Practical Considerations

This article explains the objectives of recommendation systems, compares popular recommendation approaches, details the components and algorithms of personalized recommendation pipelines, and discusses practical challenges such as real‑time processing, freshness, cold‑start, diversity, content quality, and surprise handling.

Real-Time · cold start · data pipeline
15 min read
Architecture Digest
Sep 2, 2017 · Big Data

Designing a High‑Availability, High‑Efficiency Distributed Scheduling Platform for Big Data

This article examines the principles, features, and implementation details of distributed scheduling for big‑data ETL pipelines, covering decentralised schedulers, host selection strategies, fault‑tolerance, operator abstraction, elasticity, trigger mechanisms, visual monitoring, alarm handling, data fan‑in/fan‑out, parameter consistency, real‑time quality checks, lineage tracking, and field‑level traceability.

Big Data · Data Lineage · Distributed Scheduling
23 min read
21CTO
Aug 18, 2017 · Big Data

How Ctrip Builds a Scalable User Profile Platform for Personalized Travel

This article explains why Ctrip creates user profiles, describes the product and technical architectures, and details the data collection, computation, storage, high‑availability querying, and monitoring components that power its personalized travel recommendations and services.

Ctrip · Real-time Processing · System Architecture
8 min read
High Availability Architecture
Aug 8, 2017 · Big Data

Practical Big Data Architecture Evolution and Lessons Learned

The article reviews the evolution of big‑data architectures from a simple RDB‑centric pipeline to a SaaS‑based solution, highlighting common bottlenecks such as scaling, integration, cost, and operational complexity, and shares practical experiences and best‑practice recommendations for building efficient, maintainable data platforms.

Big Data · SaaS · architecture
12 min read
StarRing Big Data Open Lab
Jul 28, 2017 · Big Data

How Transwarp Transporter Enables Near‑Real‑Time ETL in Big Data Pipelines

The article introduces Transwarp Transporter, a near‑real‑time ETL tool for TDH 5.x, explains its architecture, visual dashboard, drag‑and‑drop data‑flow design, debugging features, parameter management, and highlights how it empowers business users to achieve fast, reliable data migration in big‑data environments.

Data Integration · ETL · Transwarp
7 min read
Architecture Digest
Jul 18, 2017 · Backend Development

Design and Implementation of Ctrip Real‑Time User Data Collection System

This article describes the design, technology selection, and performance evaluation of Ctrip's real‑time user behavior data collection platform, covering Netty‑based network handling, Kafka/Hermes messaging, encryption, compression, Avro backup, and related analytics products, with detailed feasibility analysis and benchmark results.

Backend Architecture · Distributed Systems · Kafka
17 min read
Qunar Tech Salon
Jul 4, 2017 · Big Data

Design and Evolution of Airbnb's Log Data Storage and Query Platform

The article describes how Airbnb's data infrastructure team built a next‑generation log storage and query platform to improve data quality, timeliness, flexibility, and anomaly detection, outlining the system architecture, key requirements, five improvement areas, and the resulting benefits.

Airbnb · data pipeline · log platform
7 min read
Meituan Technology Team
Mar 2, 2017 · Big Data

Meituan Waimai Feature Archive Platform: Architecture, Tag System, and Data Processing

Meituan Waimai’s Feature Archive platform processes billions of daily orders by managing roughly 200 user tags and 400 merchant tags through a three‑layer architecture built on Hive, Elasticsearch, HBase, and MySQL. It offers visual tag selection, instant self‑service queries, full data extraction, and a predicate‑logic query language, and is designed for future extensibility.

Big Data · Elasticsearch · HBase
14 min read
21CTO
Nov 6, 2016 · Artificial Intelligence

How to Build a Scalable AI-Powered Recommendation System with SOA

This article outlines a service‑oriented architecture for a high‑availability personalized recommendation platform, detailing the front‑end, back‑end, crawler, user‑profile modeling, data collection from logs and client events, and processing pipelines using technologies such as Node.js, Python, RabbitMQ/Kafka, MongoDB and TensorFlow.

SOA · TensorFlow · data pipeline
5 min read
Meituan Technology Team
Aug 5, 2016 · Big Data

Design and Implementation of a Large-Scale User Behavior Analytics Platform

The article outlines Meituan‑Dianping’s “Sensors Analytics” platform, a privately‑deployed, open‑PaaS solution that collects full‑stack user events from iOS, Android, Web and WeChat, maps IDs in near real‑time, stores detailed records in Kudu (real‑time) and Parquet (offline), and serves low‑latency queries via Impala, addressing the architectural and operational challenges of high‑throughput ingestion and data‑security requirements.

Impala · Kafka · Kudu
8 min read
Baidu Maps Tech Team
Feb 3, 2016 · Big Data

How Baidu Maps Powers Its Open Platform with Big Data Architecture

This article explains how Baidu Maps’ open platform handles massive daily location data through real‑time and offline pipelines, Hadoop‑based offline computing, stream processing, and query engines built on MySQL, Redis, and Apache Kylin, while outlining future big‑data enhancements.

Apache Kylin · Baidu Maps · Hadoop
7 min read
21CTO
Nov 21, 2015 · Big Data

Why Build a Kafka System? Core Use Cases and Design Principles

This article explains why Kafka is essential for activity and operational data pipelines, outlines key use cases such as news feeds, relevance ranking, security, monitoring, and reporting, and details its deployment topology, design decisions, and message persistence strategies.

Distributed Messaging · Kafka · Real-time Processing
14 min read