Tagged articles
3 articles
Big Data Technology & Architecture
Sep 1, 2021 · Big Data

Understanding Hadoop Data Splitting and InputFormat Mechanisms

This article explains Hadoop's data splitting concepts and the distinction between HDFS blocks and logical InputSplits, walks through the source code of several InputFormats (TextInputFormat, CombineTextInputFormat, KeyValueTextInputFormat, NLineInputFormat, and custom InputFormats), and provides complete Java examples for Mapper, Reducer, and driver classes.

Data Splitting · Hadoop · InputFormat
0 likes · 24 min read
Architecture Digest
Aug 6, 2021 · Backend Development

GitHub’s Journey from Monolith to Microservices: Practices and Lessons

This article details GitHub’s transition from a 12‑year‑old Ruby on Rails monolith to a microservices architecture, covering growth challenges, modular design, data splitting, core service extraction, operational changes, and strategies for building resilient, asynchronous systems.

Data Splitting · GitHub · service extraction
0 likes · 15 min read
Tencent Advertising Technology
Jun 16, 2017 · Artificial Intelligence

Weekly Champion Insights from the Tencent Social Ads Algorithm Competition – The ThreeIdiots Team

The ThreeIdiots team shares its experience winning a weekly championship in Tencent's social ads algorithm competition, detailing its feature engineering strategy, time‑based data splitting, handling of large‑scale data, and model choices such as LightGBM and FM, and emphasizing that thoughtful feature extraction matters more than extensive parameter tuning.

Data Splitting · Model Selection · Tencent
0 likes · 7 min read