Hulu Beijing
Author

Follow Hulu's official WeChat account for the latest company updates and recruitment information.

105 Articles · 0 Likes · 95 Views · 0 Comments
Recent Articles

Hulu Beijing
Feb 6, 2018 · Artificial Intelligence

Modeling Chinese Word Segmentation with Hidden Markov Models

This article explains how Hidden Markov Models can be used to model Chinese word segmentation, covering the underlying Markov process, model parameters, basic HMM problems, and both supervised and unsupervised training methods.

Chinese Word Segmentation · Hidden Markov Model · machine learning
0 likes · 8 min read
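As a sketch of the decoding step this summary alludes to (not code from the article): Viterbi decoding over the usual B/M/E/S tag set, where B/M/E mark the beginning, middle, and end of a multi-character word and S marks a single-character word. All probabilities below are illustrative, not trained values, and the three-symbol "alphabet" stands in for real characters.

```python
# Viterbi decoding for BMES-style word segmentation with a toy HMM.
# The probability tables below are made up for illustration only.

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    V = [{s: start_p[s] * emit_p[s].get(obs[0], 1e-12) for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p].get(s, 0.0) * emit_p[s].get(obs[t], 1e-12), p)
                for p in states
            )
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

states = ["B", "M", "E", "S"]
start_p = {"B": 0.6, "M": 0.0, "E": 0.0, "S": 0.4}
# BMES structure constrains transitions: B/M can only go to M/E, etc.
trans_p = {
    "B": {"M": 0.3, "E": 0.7},
    "M": {"M": 0.3, "E": 0.7},
    "E": {"B": 0.5, "S": 0.5},
    "S": {"B": 0.5, "S": 0.5},
}
emit_p = {
    "B": {"a": 0.7, "b": 0.2, "c": 0.1},
    "M": {"a": 0.1, "b": 0.7, "c": 0.2},
    "E": {"a": 0.1, "b": 0.2, "c": 0.7},
    "S": {"a": 0.3, "b": 0.3, "c": 0.4},
}

tags = viterbi(list("abc"), states, start_p, trans_p, emit_p)
print(tags)  # the decoded BMES tag sequence
```

A trained segmenter would estimate these tables from a tagged corpus (supervised) or with Baum-Welch (unsupervised), as the article discusses.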
Hulu Beijing
Feb 1, 2018 · Artificial Intelligence

Understanding GANs: Theory, Minimax Game, and Training Challenges

This article introduces Generative Adversarial Networks (GANs), explains their minimax formulation, value function, Jensen‑Shannon divergence, common variants, and practical training issues such as gradient saturation, while also previewing the next topic on Hidden Markov Models.

GAN · Generative Adversarial Networks · Minimax Game
0 likes · 11 min read
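The minimax formulation and Jensen-Shannon connection mentioned in the summary, written out (these are the standard results, not excerpted from the article):

```latex
% GAN minimax objective:
\min_G \max_D V(D,G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% For a fixed G, the optimal discriminator is
D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)},

% and substituting it back into V gives
C(G) = 2\,\mathrm{JSD}\big(p_{\mathrm{data}} \,\|\, p_g\big) - \log 4,

% so the generator is implicitly minimizing the Jensen-Shannon
% divergence between the data and model distributions.
```

The gradient-saturation issue the summary mentions follows from the same objective: early in training, when D rejects samples confidently, the term log(1 - D(G(z))) is nearly flat, so the non-saturating alternative of maximizing log D(G(z)) is commonly used instead.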
Hulu Beijing
Jan 30, 2018 · Artificial Intelligence

Understanding Stochastic Gradient Descent and Mini‑Batch Optimization

This article explains why traditional gradient descent struggles with massive datasets, introduces stochastic gradient descent and mini‑batch gradient descent as efficient alternatives, and provides practical guidance on batch size selection, data shuffling, and learning‑rate scheduling for deep learning models.

Mini-Batch · Optimization · stochastic gradient descent
0 likes · 8 min read
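A minimal sketch of the mini-batch loop described above, with per-epoch shuffling, on a toy 1-D least-squares problem. The data, learning rate, and batch size are illustrative choices, not taken from the article.

```python
# Mini-batch SGD fitting y = 2x + 1: shuffle each epoch, then update
# parameters from the averaged gradient over each mini-batch.
import random

random.seed(0)
xs = [i / 100 for i in range(100)]
ys = [2.0 * x + 1.0 for x in xs]            # target: w = 2, b = 1

w, b = 0.0, 0.0
lr, batch_size = 0.1, 16
for epoch in range(300):
    idx = list(range(len(xs)))
    random.shuffle(idx)                      # reshuffle every epoch
    for start in range(0, len(idx), batch_size):
        batch = idx[start:start + batch_size]
        gw = gb = 0.0
        for i in batch:
            err = (w * xs[i] + b) - ys[i]    # residual of 0.5 * err**2
            gw += err * xs[i]
            gb += err
        w -= lr * gw / len(batch)            # average gradient over batch
        b -= lr * gb / len(batch)

print(round(w, 3), round(b, 3))
```

With batch_size = 1 this degenerates to pure SGD, and with batch_size = len(xs) to full-batch gradient descent; the mini-batch setting trades gradient noise against per-update cost.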
Hulu Beijing
Jan 23, 2018 · Artificial Intelligence

Feature Engineering for Structured Data: Normalization, Encoding & Interaction

This article explains the fundamentals of feature engineering for structured data, covering why and how to normalize numerical features, various categorical encoding techniques, methods for creating high‑dimensional interaction features, and decision‑tree based strategies for efficiently discovering valuable feature combinations.

Normalization · categorical encoding · feature engineering
0 likes · 12 min read
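The two normalization schemes and the simplest categorical encoding covered by the article, sketched library-free for clarity (the example values are made up):

```python
# Min-max scaling, z-score standardization, and one-hot encoding.

def min_max(values):
    """Rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Center to zero mean and scale to unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / var ** 0.5 for v in values]

def one_hot(values):
    """Map each category to a binary indicator vector."""
    cats = sorted(set(values))
    return [[1 if v == c else 0 for c in cats] for v in values]

print(min_max([1, 2, 3]))          # [0.0, 0.5, 1.0]
print(one_hot(["US", "CN", "US"]))
```

For high-cardinality categories, one-hot vectors become very wide, which is one motivation for the hashing and embedding-style encodings such articles typically discuss alongside it.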
Hulu Beijing
Jan 18, 2018 · Artificial Intelligence

Why Accuracy Misleads and How to Pick Better ML Evaluation Metrics

This article uses realistic Hulu business scenarios to illustrate the pitfalls of relying on any single metric, such as accuracy, precision, recall, or RMSE. It then explains how combining complementary measures, including average accuracy, precision-recall curves, ROC, F1-score, and MAPE, gives a more comprehensive assessment of classification, ranking, and regression models.

RMSE · accuracy · feature engineering
0 likes · 12 min read
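The class-imbalance pitfall this summary refers to, in a toy example (the 95/5 split is illustrative, not a Hulu figure): a classifier that predicts the majority class everywhere scores high accuracy yet has zero recall.

```python
# Precision, recall, and F1 from binary predictions, showing why
# accuracy alone misleads on skewed data.

def prf1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 95% negatives: the all-zero predictor gets 95% accuracy, 0 recall.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
p, r, f = prf1(y_true, y_pred)
print(accuracy, p, r, f)  # 0.95 0.0 0.0 0.0
```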
Hulu Beijing
Jan 16, 2018 · Artificial Intelligence

Why PCA Can Be Seen as Linear Regression: The Minimum Square Error Perspective

This article revisits Principal Component Analysis by framing it as a minimum‑square‑error regression problem, showing how the optimal projection line aligns with linear regression, deriving the solution in both two‑dimensional and high‑dimensional spaces, and linking it to the classic maximum‑variance approach.

Linear regression · PCA · minimum square error
0 likes · 5 min read
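The key identity behind the minimum-square-error view of PCA, stated for centered data (a standard derivation, not excerpted from the article):

```latex
% For centered data x_i and a unit direction w, Pythagoras gives
\sum_{i=1}^{n} \big\| x_i - (w^\top x_i)\,w \big\|^2
  = \sum_{i=1}^{n} \|x_i\|^2 - \sum_{i=1}^{n} (w^\top x_i)^2,
\qquad \|w\| = 1.

% The first term is constant, so minimizing reconstruction error is
% equivalent to maximizing the projected variance, and the optimum is
% the top eigenvector of the sample covariance:
w^{*} = \arg\max_{\|w\|=1} \; w^\top \Big( \tfrac{1}{n} \textstyle\sum_i x_i x_i^\top \Big) w .
```

This is exactly the link the article draws: the residuals being orthogonal to the fitted line makes the projection problem a least-squares regression, and the constant total-norm term makes it coincide with the maximum-variance formulation.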
Hulu Beijing
Jan 11, 2018 · Artificial Intelligence

Topic Modeling Explained: pLSA, LDA, and How to Pick the Right Number of Topics

This article introduces the fundamentals of topic modeling, compares the probabilistic latent semantic analysis (pLSA) and latent Dirichlet allocation (LDA) methods, explains their graphical models and inference via EM or Gibbs sampling, and discusses practical strategies for selecting the optimal number of topics using perplexity or hierarchical Dirichlet processes.

LDA · pLSA · perplexity
0 likes · 10 min read
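The perplexity criterion mentioned above, as a minimal sketch: held-out perplexity is the exponential of the negative average per-token log-likelihood, and scanning it over candidate topic counts K is one standard selection strategy. The uniform-model example below is illustrative, not from the article.

```python
# Held-out perplexity of a language/topic model: lower is better.
import math

def perplexity(token_log_probs):
    """token_log_probs: natural-log likelihood of each held-out token."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# Sanity check: a uniform model over a 10-word vocabulary assigns
# every token probability 1/10, so its perplexity is exactly 10.
logs = [math.log(1 / 10)] * 50
print(perplexity(logs))
```

In practice one trains the topic model for several values of K, computes perplexity on held-out documents, and picks the K where the curve flattens; the hierarchical Dirichlet process the article mentions sidesteps the scan by inferring K.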
Hulu Beijing
Jan 9, 2018 · Artificial Intelligence

Mastering SVM: How Kernel Functions and Slack Variables Enable Perfect Classification

This article explains how kernel functions and slack variables let Support Vector Machines fit linearly inseparable data, even to zero training error. It poses three theoretical questions, covering Gaussian kernels, error‑free classification without slack variables, and the impact of the regularization parameter C when using SMO, and works through detailed analytical solutions.

SMO · kernel functions · slack variables
0 likes · 6 min read
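One way to see the Gaussian-kernel claim numerically (an illustrative sketch, not the article's proof): as the bandwidth sigma shrinks, similarities between distinct points vanish and the Gram matrix approaches the identity, which is why an RBF-SVM can separate any set of distinct training points.

```python
# Gaussian (RBF) kernel values between points at two bandwidths.
import math

def rbf(x, y, sigma):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
for sigma in (1.0, 0.1):
    gram = [[rbf(p, q, sigma) for q in pts] for p in pts]
    # First row of the Gram matrix: self-similarity stays 1, while
    # cross terms collapse toward 0 as sigma shrinks.
    print(sigma, [round(v, 4) for v in gram[0]])
```

A near-identity Gram matrix means every training point effectively becomes its own support vector, achieving zero training error at the cost of severe overfitting, which is where the regularization parameter C comes in.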
Hulu Beijing
Jan 4, 2018 · Artificial Intelligence

Why SGD Fails and How Momentum, AdaGrad, and Adam Fix It

This article explains why vanilla Stochastic Gradient Descent often struggles in deep learning, describes the challenges of valleys and saddle points, and introduces three major SGD variants—Momentum, AdaGrad, and Adam—detailing their motivations, update rules, and advantages.

AdaGrad · Adam · Momentum
0 likes · 13 min read
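The three update rules named above, sketched side by side on a toy quadratic f(x) = x^2 (step counts and hyperparameters are illustrative, not the article's settings):

```python
# Momentum, AdaGrad, and Adam update rules, each minimizing
# f(x) = x**2, whose gradient is 2x, from the same start point.
import math

def grad(x):
    return 2.0 * x

def momentum(x, steps=200, lr=0.1, beta=0.9):
    v = 0.0
    for _ in range(steps):
        v = beta * v + lr * grad(x)       # accumulate velocity
        x -= v
    return x

def adagrad(x, steps=200, lr=0.5, eps=1e-8):
    g2 = 0.0
    for _ in range(steps):
        g = grad(x)
        g2 += g * g                       # running sum of squared grads
        x -= lr * g / (math.sqrt(g2) + eps)
    return x

def adam(x, steps=200, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g         # first-moment estimate
        v = b2 * v + (1 - b2) * g * g     # second-moment estimate
        m_hat = m / (1 - b1 ** t)         # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

for opt in (momentum, adagrad, adam):
    print(opt.__name__, round(opt(5.0), 4))
```

Momentum smooths the update direction across steps, AdaGrad scales the step by the accumulated gradient history per coordinate, and Adam combines both ideas with bias-corrected moment estimates, which is the progression the article traces.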