Tagged articles

1881 articles

Oct 11, 2022 · Artificial Intelligence

Are Statistics and Machine Learning Really the Same? Uncover the Real Differences

While many claim that machine learning is merely statistics with a flashy veneer, this article explores the nuanced distinctions between the two fields—examining their goals, methodologies, and examples such as linear regression—to clarify why they are related yet fundamentally different.

Model Evaluation · Prediction · linear regression

0 likes · 17 min read

Programmer DD

Oct 11, 2022 · Artificial Intelligence

Why Is Google Translate Suddenly Failing in Mainland China?

Recent reports reveal that Google Translate’s mainland China domain translate.google.cn now redirects to a search page and ultimately to the Hong Kong site, rendering the service inaccessible for Chinese users, while underlying issues stem from censored training data, outdated infrastructure, and past attempts to revive Google’s search in China.

AI · Censorship · China

0 likes · 5 min read

Model Perspective

Oct 10, 2022 · Fundamentals

Matrix-to-Matrix Derivatives: Definitions, Differential Method & Examples

This article explains the definition of matrix‑to‑matrix derivatives, introduces the vectorization‑based differential approach using Kronecker products, presents key matrix‑vectorization properties, and walks through detailed examples illustrating how to compute such derivatives, highlighting their role and limitations in machine‑learning optimization.

Kronecker product · derivative · machine learning

0 likes · 5 min read

Model Perspective

Oct 9, 2022 · Artificial Intelligence

Why Model Interpretability Matters: Tackling the Black‑Box Problem in AI

This article explains the challenges of black‑box machine‑learning models, illustrates real‑world banking examples, and introduces explainable AI techniques such as intrinsic vs. post‑hoc and local vs. global explanations to improve trust, safety, and fairness.

AI ethics · black-box models · machine learning

0 likes · 13 min read

Model Perspective

Oct 8, 2022 · Artificial Intelligence

How Ensemble Learning Boosts Model Performance: A Comprehensive Overview

Ensemble learning combines multiple individual models—either homogeneous or heterogeneous—using strategies such as boosting, bagging, averaging, voting, or stacking to create a stronger learner, and this article explains its principles, key algorithms, and combination methods in detail.

Stacking · bagging · machine learning

0 likes · 8 min read

Rare Earth Juejin Tech Community

Oct 8, 2022 · Artificial Intelligence

Wasserstein GAN (WGAN): Theory and Hands‑On Implementation

This article explains why traditional GANs suffer from training instability, introduces the Wasserstein (Earth‑Mover) distance as a smoother alternative, derives the WGAN objective, discusses Lipschitz constraints, and provides practical PyTorch code modifications to convert a vanilla GAN into a stable WGAN.

Deep Learning · GAN · PyTorch

0 likes · 21 min read

Model Perspective

Oct 7, 2022 · Artificial Intelligence

Master Gradient Descent: From Intuition to Advanced Variants

This comprehensive guide explains the mathematical foundation, intuition, algorithmic steps, tuning strategies, and variants of gradient descent, comparing it with other optimization methods and illustrating its use in machine‑learning models such as linear regression.

gradient descent · learning rate · linear regression

0 likes · 14 min read

Model Perspective

Oct 6, 2022 · Artificial Intelligence

Mastering the Chain Rule for Vector‑to‑Vector and Scalar‑to‑Matrix Derivatives

This article explains the chain rule for vector‑to‑vector derivatives and for the scalar‑to‑multiple‑vector and scalar‑to‑matrix cases, shows how to handle dimensional compatibility, works through concrete examples such as least‑squares optimization, and summarizes four key matrix‑vector derivative conclusions for efficient machine‑learning calculations.

Derivatives · chain rule · machine learning

0 likes · 5 min read

MaGe Linux Operations

Oct 1, 2022 · Artificial Intelligence

11 Powerful Feature Selection Techniques Every Data Scientist Should Master

This guide walks through a comprehensive set of feature‑selection strategies—from removing unused or missing columns to handling multicollinearity, low‑variance features, and using PCA—complete with Python code examples and visualizations to help you build leaner, more interpretable machine‑learning models.

Python · data preprocessing · dimensionality reduction

0 likes · 18 min read

Model Perspective

Sep 23, 2022 · Artificial Intelligence

How to Fill Missing Data with sklearn’s SimpleImputer and KNNImputer

This guide explains how to use scikit-learn’s SimpleImputer and KNNImputer to fill missing values, covering available strategies such as mean, median, most frequent, and constant, and provides complete Python code examples with expected output.

KNNImputer · Python · SimpleImputer

0 likes · 3 min read

Tencent Advertising Technology

Sep 23, 2022 · Industry Insights

How Tencent Ads’ CONFLUX and MVKE Algorithms Boost Conversion – Insights from KDD2022

Tencent Ads hosted two KDD2022‑focused live sessions showcasing the CONFLUX and MVKE algorithms, explaining their technical foundations, real‑world impact on billions of ad impressions, and answering audience questions about brand versus performance ads, validation methods, and future research directions.

Industry Insights · KDD2022 · advertising technology

0 likes · 6 min read

Model Perspective

Sep 21, 2022 · Fundamentals

Understanding Matrix and Vector Derivatives: Layouts and Jacobians Explained

This article introduces matrix and vector differentiation, explains the nine possible derivative cases, clarifies numerator and denominator layouts, and shows how Jacobian and gradient matrices arise, providing a concise foundation for machine‑learning calculus.

Jacobian · denominator layout · gradient

0 likes · 8 min read

Alimama Tech

Sep 21, 2022 · Artificial Intelligence

EXTR: Click-Through Rate Prediction with Externalities in E-Commerce Sponsored Search

The paper introduces EXTR, a Transformer‑based CTR prediction model that jointly encodes diverse externalities from surrounding organic results and ads and infers missing ad placements via a Potential Allocation Generator, achieving superior AUC, COPC and LogLoss on Taobao data and deployment in Alibaba’s advertising system.

Advertising · Externalities · Transformer

0 likes · 11 min read

NetEase Game Operations Platform

Sep 19, 2022 · Artificial Intelligence

Applying AIOps to Game Operations: Roadmap, Anomaly Detection, and Fault Localization

This article describes NetEase's AIOps journey for game operations, explaining the Gartner definition of intelligent operations, the implementation roadmap, detailed anomaly‑detection techniques for business, performance, and log data, and a comprehensive fault‑localization workflow that combines resource, code, and historical analysis.

Fault Localization · aiops · anomaly detection

0 likes · 12 min read

Model Perspective

Sep 18, 2022 · Artificial Intelligence

How Bayesian Linear Regression Reveals Uncertainty in Model Parameters

This article explains Bayesian linear regression, describing its probabilistic treatment of weights, prior and posterior computation, MAP and numerical solutions, and how it enables uncertainty quantification, online learning, and model comparison through Bayes factors.

Bayesian inference · MAP estimation · MCMC

0 likes · 9 min read

DataFunSummit

Sep 17, 2022 · Artificial Intelligence

Advertising Targeting: From Glory to Sunset – Technical Reflections and Future Directions

This article reviews the evolution of advertising targeting technology, recounts its historical impact, analyzes the underlying machine‑learning models—from early Python classifiers to XGBoost‑Spark, DNN, and attention‑based wide‑deep systems—and discusses why the technique is now waning while outlining possible future integrations with large‑scale recall and cost‑aware optimization.

CTR · machine learning · targeting

0 likes · 26 min read

ITPUB

Sep 15, 2022 · Artificial Intelligence

Why Precise Feature Engineering Still Matters in Recommendation Systems

In the era of deep learning, feature engineering remains crucial for recommendation and search advertising because it bridges raw relational data and models, improves performance, reduces complexity, and handles high‑cardinality, large‑scale, and time‑sensitive scenarios with robust transformations and statistical encoding.

AI · data preprocessing · feature engineering

0 likes · 20 min read

DataFunSummit

Sep 13, 2022 · Artificial Intelligence

Elegant Integration of Ads in Search: An Analysis of Baidu's Mobius Approach

This article examines how search advertising can be seamlessly blended with user queries by balancing relevance and revenue, reviewing the evolution from portal indexing to recommendation systems, and detailing Baidu's Mobius framework that jointly optimizes relevance, CTR, and eCPM in a unified pipeline.

CTR · Mobius · ad ranking

0 likes · 24 min read

Python Crawling & Data Mining

Sep 11, 2022 · Big Data

How Tencent Built Its Massive Big Data Platform Over a Decade

Over more than ten years, Tencent evolved its big data infrastructure through three generations—from early Hadoop-based offline processing, to a hybrid real‑time Spark/Storm system, and finally to a self‑developed, open‑source machine‑learning platform—highlighting the shift from “borrowed” solutions to fully proprietary, AI‑ready architectures.

Architecture · Open-source · data-warehouse

0 likes · 10 min read

DataFunTalk

Sep 9, 2022 · Artificial Intelligence

AI-Powered Music Comment Moderation and Ranking: Models, Challenges, and Business Impact

This article presents a comprehensive overview of AI-driven music comment moderation and ranking systems, detailing business scenarios, model architectures, data processing techniques, performance improvements, and future directions for both QQ Music and K‑Song platforms.

AI · BERT · NLP

0 likes · 17 min read

MaGe Linux Operations

Sep 8, 2022 · Artificial Intelligence

Master 10 Popular Clustering Algorithms in Python with Scikit‑Learn

This tutorial introduces unsupervised clustering, explains its purpose, and walks through installing scikit‑learn and implementing ten popular clustering algorithms—including AffinityPropagation, Agglomerative, BIRCH, DBSCAN, K‑Means, Mini‑Batch K‑Means, MeanShift, OPTICS, Spectral Clustering, and Gaussian Mixture—complete with code examples and visualizations.

Unsupervised Learning · clustering · data mining

0 likes · 27 min read

Alimama Tech

Sep 7, 2022 · Artificial Intelligence

How AI is Revolutionizing Chinese Font Design: Inside Alibaba’s Smart Font Lab

This article examines Alibaba’s AliMama “Smart Font” project, detailing how AI‑driven pipelines, few‑shot generation, and open‑source toolkits are used to create commercial Chinese typefaces, enable font subsetting, and explore dynamic and variable‑font technologies for modern digital media.

AI · Alibaba · Typography

0 likes · 16 min read

Model Perspective

Sep 5, 2022 · Fundamentals

Why Understanding Causal Relationships Is Crucial for Machine Learning

This article explains why causal inference matters beyond prediction, introduces potential outcomes notation, demonstrates how bias separates correlation from causation, and outlines the conditions under which observed differences can be interpreted as true causal effects.

Bias · Prediction · causal inference

0 likes · 16 min read

Python Programming Learning Circle

Sep 5, 2022 · Artificial Intelligence

Ten Essential Python Libraries for AI, Data Processing, and Model Deployment

This article introduces ten powerful Python libraries—including Awkward Array, Jupytext, Gradio, Hub, AugLy, Evidently, YOLOX, LightSeq, Greykite, and Jina/Finetuner—highlighting their key features, performance benefits, and where to find them, offering developers essential tools for data handling, model deployment, and AI research.

AI · Data Science · Open-source

0 likes · 8 min read

HelloTech

Sep 2, 2022 · Artificial Intelligence

Search and Recommendation Algorithms: Evolution, Common Pipelines, and Integrated Engine Design

The article outlines how search and recommendation systems have evolved from simple hot‑list displays to sophisticated, data‑driven pipelines comprising recall, fine‑ranking and re‑ranking stages. It then describes an integrated low‑code engine with standardized features, configurable components and intelligent modules that enables rapid deployment across many scenarios and has delivered notable CTR, GMV and engagement gains at Hello (哈啰).

Data Standardization · Embedding · algorithm architecture

0 likes · 10 min read

Hulu Beijing

Sep 2, 2022 · Artificial Intelligence

How Hulu Eliminated Feature Drift with Server‑Side Feature Logging

This article explains Hulu's server‑side feature logging system that aligns online and offline recommendation features, measures and mitigates feature drift caused by data source, timing, and code differences, and improves model performance while reducing resource consumption.

Hulu · feature drift · feature logging

0 likes · 17 min read

Open Source Linux

Sep 1, 2022 · Operations

What’s New in Zabbix 6.0? Enhanced Monitoring, HA, AI & Cloud Features Explained

Zabbix 6.0 introduces a suite of enhancements—including high‑availability clustering, advanced business‑service monitoring with SLA calculations, root‑cause analysis, machine‑learning‑based anomaly detection, Kubernetes templates, a redesigned audit log, TLS certificate checks, UI improvements, customizable branding, and new integrations—aimed at boosting operational visibility and efficiency across cloud and on‑premise environments.

Kubernetes · Operations · Zabbix

0 likes · 12 min read

Model Perspective

Aug 31, 2022 · Fundamentals

How to Build a Watermelon Sweetness Dataset: From Field to Features

This article describes how the author collected a watermelon dataset, defined measurable features such as size, color, sugar content, seed count, and texture, and documented the process with photos, tables, and a brief discussion of data characteristics for future machine‑learning analysis.

data analysis · data collection · feature engineering

0 likes · 12 min read

Aikesheng Open Source Community

Aug 31, 2022 · Big Data

Tencent's Big Data Construction: Philosophy, Architecture Evolution, and Open‑Source Strategy

The article introduces Tencent's big‑data platform philosophy and overall architecture, detailing three generations of evolution from offline Hadoop‑based processing to real‑time Spark/Storm integration and finally AI‑driven machine‑learning platforms, while also highlighting the team, book publication, and a related giveaway event.

Architecture · Big Data · Cloud Native

0 likes · 12 min read

DataFunTalk

Aug 30, 2022 · Artificial Intelligence

Feature Engineering for Recommendation and Search Advertising

This article explains why meticulous feature engineering remains crucial in recommendation and search advertising, outlines what constitutes good features, describes common transformation techniques such as scaling, binning, and encoding, and provides practical examples and Q&A for practitioners.

AI · data preprocessing · feature engineering

0 likes · 18 min read

DaTaobao Tech

Aug 29, 2022 · Frontend Development

Subtoken‑TranX: Front‑end JavaScript Code Generation for Industrial Use

Subtoken‑TranX, a joint effort by Alibaba’s DaTaobao team and Peking University, converts natural‑language requirements into JavaScript by training on a curated 2,489‑pair dataset, using subtoken‑level AST generation and task‑augmented variable semantics, achieving superior accuracy over standard TranX and Transformer models and now powering Alibaba’s BizCook front‑end production platform.

AST · JavaScript · machine learning

0 likes · 12 min read

DataFunTalk

Aug 27, 2022 · Artificial Intelligence

User Growth Algorithms and Engineering Practices at Huya Live Streaming

This article details Huya's comprehensive user growth framework, covering the full acquisition‑activation‑retention‑revenue funnel, advertising workflow, crowd targeting stages, uplift modeling, virtual callbacks, intelligent bidding, and engineering implementations such as material automation, low‑latency RTA filtering, and dynamic strategy operators.

Huya · Uplift Modeling · advertising algorithms

0 likes · 14 min read

Airbnb Technology Team

Aug 26, 2022 · Information Security

Airbnb Data Privacy and Security Engineering: Inspekt Data Classification Service and Angmar Secret Detection

Airbnb’s second privacy‑security article describes how the Inspekt service automatically classifies personal and sensitive data across diverse stores using regexes, Aho‑Corasick tries, machine‑learning models and custom validators, measures validator quality, and how the Angmar system scans code repositories for secrets via CI checks and pre‑commit hooks, with plans to broaden coverage to more APIs and data stores.

Secret Detection · cloud security · data classification

0 likes · 16 min read

Model Perspective

Aug 25, 2022 · Artificial Intelligence

Mastering Regression: Key Assumptions, Metrics, and Model Evaluation

This article explains the fundamental assumptions of linear regression, compares linear and nonlinear models, discusses multicollinearity, outliers, regularization, heteroscedasticity, VIF, stepwise regression, and reviews essential evaluation metrics such as MAE, MSE, RMSE, R² and Adjusted R².

Metrics · Model Evaluation · linear regression

0 likes · 12 min read

ELab Team

Aug 24, 2022 · Artificial Intelligence

Demystifying AI: From Linear Regression to Neural Networks with TensorFlow.js

This article walks through the fundamentals of artificial intelligence, explaining linear and logistic regression, loss functions, gradient descent, and neural network basics, illustrated with TensorFlow.js code examples, visual analogies, and practical demos, helping readers grasp core concepts and their real‑world applications.

Artificial Intelligence · Neural Networks · TensorFlow.js

0 likes · 18 min read

Python Programming Learning Circle

Aug 23, 2022 · Big Data

10 Python Packages for Automated Exploratory Data Analysis (EDA)

This article introduces ten Python packages that automate exploratory data analysis, explaining each library's capabilities, providing concise usage examples, and showing how they can generate comprehensive data summaries and visualizations with just a few lines of code.

Automated EDA · Data visualization · EDA

0 likes · 8 min read

DevOps

Aug 23, 2022 · Artificial Intelligence

Intelligent Automation Testing: Self‑Healing and Machine‑Learning Techniques

This article reviews the evolution of automated testing toward intelligent solutions, explaining self‑healing mechanisms, machine‑learning‑driven object recognition, computer‑vision and OCR approaches, industry tools such as Healenium and Airtest, and future prospects for zero‑code AI‑powered test automation.

AI · Automation testing · Computer Vision

0 likes · 13 min read

Model Perspective

Aug 18, 2022 · Artificial Intelligence

Master SciPy Clustering: K‑Means and Hierarchical Methods with Python

This guide introduces SciPy's clustering modules, explaining the vector quantization and k‑means algorithm in scipy.cluster.vq, and demonstrates hierarchical clustering with scipy.cluster.hierarchy, accompanied by complete Python code examples and visualizations to help you apply these techniques to real data.

Hierarchical Clustering · K-Means · clustering

0 likes · 4 min read

DataFunSummit

Aug 18, 2022 · Artificial Intelligence

Evolution and Technical Practices of Du Xiaoman Risk Control Decision Engine

This article presents a comprehensive overview of Du Xiaoman's risk control system evolution—from early rule‑based engines to AI‑enhanced intelligent decision engines—detailing technical practices such as strategy iteration acceleration, decision latency reduction, parallel workflow design, and future trends in data quality, automated strategy optimization, and real‑time analytics.

Data Quality · decision engine · machine learning

0 likes · 18 min read

Python Programming Learning Circle

Aug 16, 2022 · Fundamentals

30 Useful Python Packages for Data Workflows

This article introduces thirty unique and practical Python packages that simplify various aspects of data workflows, including model training notifications, progress tracking, data validation, statistical calculations, date handling, and more, providing installation commands and code examples for each tool.

Data Workflow · Packages · Python

0 likes · 15 min read

Bilibili Tech

Aug 16, 2022 · Artificial Intelligence

Bilibili's Intelligent Adaptive Bitrate Algorithm: From Theory to Practice

Bilibili enhanced mobile video streaming by replacing standard ABR methods with an intelligent adaptive bitrate system that uses a real‑time QoE model, refined network‑speed preprocessing, long‑term feature analysis, decision‑tree‑based neural models, and personalized user modes to balance resolution, buffering, and data usage.

Bilibili · Pensieve algorithm · QoE optimization

0 likes · 12 min read

Model Perspective

Aug 14, 2022 · Artificial Intelligence

Mastering Feature Binning with sklearn: Uniform, Quantile, and K‑Means Methods

This article explains why discretizing continuous variables improves model stability, introduces three common binning techniques—equal-width, equal-frequency, and clustering—and demonstrates how to implement each using scikit‑learn's KBinsDiscretizer with Python code examples on a synthetic score dataset.

KBinsDiscretizer · Python · data preprocessing

0 likes · 5 min read

DataFunSummit

Aug 14, 2022 · Artificial Intelligence

Optimizing Pre‑Ranking in Meituan Search: Knowledge Distillation and Neural Architecture Search

This article describes Meituan Search's pre‑ranking (coarse‑ranking) system evolution and presents two major optimization strategies—leveraging knowledge distillation to align coarse‑ranking with fine‑ranking and employing neural architecture search to jointly improve effectiveness and latency—demonstrating significant offline and online performance gains.

Knowledge Distillation · Neural Architecture Search · machine learning

0 likes · 17 min read

Model Perspective

Aug 13, 2022 · Artificial Intelligence

Mastering Outlier Detection: Techniques, Algorithms, and PyOD Implementation

Outlier detection identifies data points far from the norm, using methods such as the 3‑sigma rule, boxplots, K‑Nearest Neighbors, and numerous probabilistic and proximity‑based algorithms, with practical PyOD code examples for training, evaluating, and visualizing models across various techniques.

anomaly detection · machine learning · outlier detection

0 likes · 8 min read

政采云技术

Aug 11, 2022 · Artificial Intelligence

Semi‑Automatic Annotation with Label Studio and YOLOv5: Installation, Project Setup, and Model Training

This guide explains how to combine the open‑source labeling platform Label Studio with the YOLOv5 object‑detection model to achieve semi‑automatic annotation, covering installation of both tools, project creation, dataset configuration, and training a custom YOLOv5 model on your own data.

Label Studio · Python · Semi-Automatic Annotation

0 likes · 11 min read

Model Perspective

Aug 8, 2022 · Artificial Intelligence

Mastering sklearn.svm: Parameters, Grid Search, and Real-World Examples

An in‑depth guide to sklearn.svm explains SVM classification and regression, details key parameters such as C and kernel types, demonstrates how to use GridSearchCV for hyperparameter tuning, and provides complete Python code examples for iris classification and California housing price prediction.

GridSearchCV · Python · machine learning

0 likes · 6 min read

DataFunSummit

Aug 8, 2022 · Artificial Intelligence

Voice Analysis for Financial Risk Control: Feature Extraction, Single-Channel Speech Separation, and Text Tagging

This talk presents the application of voice analysis in financial risk control, covering voice‑based risk feature extraction, single‑channel speech separation techniques, and speech‑text labeling methods, demonstrating how acoustic and textual cues can be leveraged to improve risk detection and model performance.

Audio Processing · machine learning · risk control

0 likes · 12 min read

Model Perspective

Aug 7, 2022 · Artificial Intelligence

Understanding Support Vector Regression: Theory and Formulation

Support Vector Regression (SVR) predicts continuous outputs by fitting a hyperplane that minimizes a loss function while employing an ε‑insensitive loss to reduce overfitting, and the article details its mathematical formulation, penalty terms, Lagrangian dual, and optimization process.

Lagrangian dual · SVR · epsilon-insensitive loss

0 likes · 3 min read

Model Perspective

Aug 7, 2022 · Artificial Intelligence

Mastering Core ML Evaluation Metrics: From Bias‑Variance to ROC Curves

This article explains essential machine‑learning evaluation concepts—including the bias‑variance trade‑off, Gini impurity versus entropy, precision‑recall curves, ROC and AUC, the elbow method for K‑means, PCA scree plots, linear and logistic regression, SVM geometry, normal‑distribution rules, and Student’s t‑distribution—providing clear visual illustrations for each.

PCA · ROC · bias‑variance

0 likes · 7 min read

Model Perspective

Aug 6, 2022 · Artificial Intelligence

How Kernel Functions Enable SVMs to Classify Non‑Linear Data

When training data from two classes overlap heavily, linear SVMs fail, so we map inputs into a high‑dimensional Hilbert (feature) space using kernel functions—such as linear, polynomial, radial basis, and Fourier kernels—to make the data linearly separable, formulate a quadratic programming problem, solve its convex dual, and construct a classifier for unknown samples.

Hilbert space · kernel methods · machine learning

0 likes · 2 min read

Model Perspective

Aug 6, 2022 · Artificial Intelligence

Understanding Activation Functions in Artificial Neural Networks

This article introduces artificial neural networks, explains the role of artificial neurons and their weighted connections, and provides an overview of common activation functions—including linear, nonlinear ramp, threshold/step, and sigmoid forms—highlighting their characteristics and typical saturation values.

Deep Learning · activation function · artificial neural network

0 likes · 2 min read

Model Perspective

Aug 5, 2022 · Artificial Intelligence

What Are the Essential Steps and Types of Machine Learning?

Machine learning involves five core steps—from data collection and preparation to model training, evaluation, and improvement—while encompassing supervised, unsupervised, and reinforcement learning methods, each with distinct algorithms and real-world applications across finance, healthcare, and retail.

Applications · Unsupervised Learning · machine learning

0 likes · 7 min read

Model Perspective

Aug 5, 2022 · Artificial Intelligence

Understanding Generalized Linear‑Separable Support Vector Machines

This article explains how hard‑margin and soft‑margin support vector machines handle perfectly and approximately linearly separable data, introduces slack variables and penalty parameters, derives the quadratic programming and dual formulations, and shows how the resulting classifier works on unseen samples.

Support Vector Machine · machine learning · optimization

0 likes · 3 min read

HelloTech

Aug 5, 2022 · Artificial Intelligence

Intelligent Transaction System Construction for Halu Carpool

In a July 2022 keynote, Halu’s senior algorithm expert Wang Fan outlined the construction of an intelligent transaction system for its car‑pool service, detailing business challenges, a decomposition into matching, pricing, marketing and arbitration, a recommendation‑pipeline architecture, and three‑stage algorithm evolution that boosted order volume by over 20%.

algorithm · carpool · intelligent matching

0 likes · 12 min read

High Availability Architecture

Aug 5, 2022 · Big Data

Innovative Marketing Practices on the Cloud: How an Intelligent Data Lake Enables Flexible and Efficient Marketing Capabilities

The presentation details how Amazon Web Services’ intelligent data lake architecture integrates big data and machine learning to overcome marketing challenges, improve data governance, and provide scalable, real‑time analytics for personalized, data‑driven marketing across enterprises.

AWS · Big Data · Cloud Computing

0 likes · 13 min read

Model Perspective

Aug 4, 2022 · Artificial Intelligence

How Supervised Learning Predicts House Prices – A Hands‑On Guide

Using a real‑world housing example, this article explains supervised and unsupervised learning, walks through building a price‑prediction function, introduces gradient descent for optimizing weights, and highlights pitfalls like overfitting, offering a practical introduction to core machine‑learning concepts.

Python · gradient descent · linear regression

0 likes · 13 min read

Model Perspective

Aug 3, 2022 · Artificial Intelligence

Explore the Most Popular Machine Learning Algorithms: A Comprehensive Guide

This article provides a thorough overview of the most widely used machine learning algorithms, classifying them by learning style and problem type, and highlighting popular methods such as supervised, unsupervised, semi‑supervised, regression, instance‑based, regularization, decision‑tree, Bayesian, clustering, association rule, neural network, deep learning, dimensionality‑reduction, and ensemble techniques.

Algorithms · Deep Learning · Unsupervised Learning

0 likes · 10 min read

Model Perspective

Jul 30, 2022 · Artificial Intelligence

How Decision Trees Predict House Locations: From Intuition to Overfitting

This article explains machine learning fundamentals using a house‑location classification example, illustrating how decision trees create split points from features like elevation and price, grow recursively, achieve high training accuracy, and reveal overfitting when evaluated on unseen test data.

Artificial Intelligence · Data visualization · classification

0 likes · 11 min read

MaGe Linux Operations

Jul 29, 2022 · Artificial Intelligence

Master 10 Popular Clustering Algorithms in Python with Scikit‑Learn

This tutorial introduces clustering, explains why no single algorithm fits all data, and provides step‑by‑step Python examples using scikit‑learn for ten popular unsupervised learning methods, complete with code snippets and visualizations to illustrate results.

Python · Unsupervised Learning · clustering

0 likes · 24 min read

ByteDance Terminal Technology

Jul 29, 2022 · Artificial Intelligence

Pitaya: ByteDance’s End‑Side AI Engineering Platform Overview

Pitaya, built by ByteDance’s Client AI and MLX teams, is a comprehensive end‑side AI engineering platform that provides a full workflow from model development and data preparation to deployment, monitoring, and federated learning, supporting large‑scale commercial scenarios across multiple apps.

AI Platform · Federated Learning · Inference Engine

0 likes · 14 min read

DataFunTalk

Jul 29, 2022 · Artificial Intelligence

Tencent Music Cloud‑Native One‑Stop Machine Learning Platform: Features and Future Roadmap

This article introduces Tencent Music's cloud‑native, one‑stop machine learning platform, detailing its engineering workflow, distributed acceleration, inference closed‑loop, edge computing capabilities, and future plans, while highlighting challenges of traditional ML pipelines and the platform's solutions for resource orchestration, storage, scheduling, and GPU utilization.

AI Platform · Distributed Training · Pipeline

0 likes · 17 min read

GuanYuan Data Tech Team

Jul 28, 2022 · Artificial Intelligence

Unlocking Reinforcement Learning: Core Concepts, Algorithms, and Real‑World Applications

This article introduces reinforcement learning by defining agents, environments, rewards, and policies, explains key concepts such as Markov Decision Processes and Bellman equations, and surveys major algorithms—including dynamic programming, Monte‑Carlo, TD learning, policy gradients, Q‑learning, DQN, and evolution strategies—while highlighting practical challenges and notable case studies like AlphaGo Zero.

Deep Learning · Evolution Strategies · MDP

0 likes · 27 min read

Model Perspective

Jul 25, 2022 · Artificial Intelligence

How to Interpret Logistic Regression Parameters and Odds Ratios with Python

This article explains the concepts of odds and odds ratios, shows how logistic regression estimates them, and demonstrates practical model fitting and prediction using Python's statsmodels and scikit‑learn libraries with real‑world examples.

Python · Statsmodels · logistic regression

0 likes · 8 min read

Model Perspective

Jul 23, 2022 · Artificial Intelligence

LASSO Regression Explained: Theory, Case Studies, and Python Code

This article introduces the mathematical foundations of ordinary least squares, ridge, and LASSO regression, explains why LASSO is typically fitted with coordinate descent, presents two real-world case studies with data, and provides complete Python code for fitting, visualizing, and interpreting LASSO models.

LASSO · Python · machine learning

0 likes · 8 min read

Meituan Technology Team

Jul 21, 2022 · Artificial Intelligence

Overview of Meituan Technical Team Papers Featured at ACM SIGIR 2022 and Related Works

The article highlights ten representative Meituan technical papers accepted at ACM SIGIR 2022, spanning personalized opinion tagging, cross‑domain sentiment classification, dialogue summarization transfer, universal retrieval, CTR prediction, image behavior modeling, and topic segmentation, each summarized with abstracts and download links for researchers.

cross-domain learning · information retrieval · machine learning

0 likes · 25 min read

JD Retail Technology

Jul 18, 2022 · Artificial Intelligence

JD’s Intelligent Product Matching Technology Wins the 11th Wu Wenjun AI Science and Technology Award

On July 16, JD.com’s jointly developed intelligent product‑matching technology received the second‑place Science and Technology Progress Award at the 11th Wu Wenjun Artificial Intelligence Award, highlighting its multi‑dimensional user profiling, cross‑modal product modeling, and precise matching innovations that have driven significant commercial and societal impact.

Artificial Intelligence · JD.com · Wu Wenjun award

0 likes · 4 min read

DaTaobao Tech

Jul 18, 2022 · Artificial Intelligence

Walle: An End-to-End, General-Purpose, Large-Scale Device-Cloud Collaborative Machine Learning System

Walle is Alibaba’s first end‑to‑end, general‑purpose, large‑scale device‑cloud collaborative machine‑learning platform that manages billions of mobile devices, provides a full‑stack data and compute pipeline, cuts cloud load by 87 %, reduces latency to ~100 ms, and already powers over a trillion daily ML invocations across dozens of Alibaba apps.

MNN · OSDI · device-cloud collaboration

0 likes · 11 min read

DataFunTalk

Jul 17, 2022 · Artificial Intelligence

Evolution of OPPO Commercial Advertising Targeting: From Differentiated to Intelligent to Untargeted Practices

This article details OPPO's commercial advertising targeting evolution, covering the background and logic, the multi‑layer targeting system and data modeling, automated intelligent targeting methods, the shift to untargeted crowd recall, and future considerations for ad‑targeting technology.

Advertising · OPPO · machine learning

0 likes · 13 min read

DataFunSummit

Jul 14, 2022 · Artificial Intelligence

Next‑Generation Song Recognition: From Audio Fingerprints to Cover Detection

This article reviews the limitations of traditional audio‑fingerprint song identification, surveys the evolution of cover‑song detection techniques, and details Tencent Music’s Lyra‑CoverNet system—including embedding extraction, sequence retrieval, automated labeling, deployment results, and future research directions—demonstrating how deep learning advances enable more accurate and scalable music recognition.

Embedding · Tencent Music · audio fingerprint

0 likes · 10 min read

DataFunSummit

Jul 11, 2022 · Artificial Intelligence

Optimizing CVR in Sparse High‑Value Travel Recommendation Scenarios

This article presents a comprehensive overview of conversion‑rate (CVR) optimization for Alitrip’s travel recommendation platform, detailing the challenges of extremely sparse user feedback, the design of item, user, query and context features, and a series of model‑level and loss‑function techniques—including generic‑label modeling, global‑transaction modeling, ESMM, rank‑loss approximations, and multi‑task CTR auxiliary training—to improve both CTR and CVR performance in high‑ticket‑price scenarios.

CVR optimization · Sparse Data · e‑commerce

0 likes · 19 min read

Bitu Technology

Jul 8, 2022 · Artificial Intelligence

Applying NLP and Machine Learning to Classify Tubi User Feedback

This article explains how Tubi leverages natural‑language processing, sentence embeddings (USE and BERT), and LightGBM models to automatically categorize large volumes of Net Promoter Score comments and customer‑support tickets, enabling data‑driven product decisions and workflow automation.

LightGBM · NLP · Tubi

0 likes · 11 min read

DaTaobao Tech

Jul 8, 2022 · Frontend Development

Alibaba Front‑End Intelligent Technology: PipCook, DataCook, imgcook and Future Directions

Alibaba Front‑End Intelligent Technology combines PipCook, DataCook, and imgcook to enable data‑driven UI generation, on‑device AI inference via WASM‑Rust‑SIMD and WebGPU, and applications such as code IntelliSense and design‑to‑code, while outlining a roadmap toward unified AI‑powered interfaces for commerce.

AI · TensorFlow.js · Wasm

0 likes · 33 min read

DataFunTalk

Jul 5, 2022 · Artificial Intelligence

Identifying Viral Short‑Video Content on Kuaishou: Models, Features, and Engineering Framework

This article explains how Kuaishou detects and predicts viral short‑video material by defining content types, outlining essential viral elements, describing a two‑stage coarse‑recall and fine‑ranking model that combines speed‑based features, Gaussian mixture modeling, and a lightweight DNN, and showcases real‑world case studies and Q&A.

Kuaishou · machine learning · recommendation system

0 likes · 14 min read

政采云技术

Jul 5, 2022 · Artificial Intelligence

Overview of Natural Language Processing Techniques and Their Evolution

This article provides a comprehensive overview of natural language processing, covering its definition, historical development from one‑hot encoding to modern models such as word2vec, ELMo, GPT, and BERT, and discusses the advantages, limitations, and key concepts of each technique.

Artificial Intelligence · NLP · Word Embedding

0 likes · 23 min read

Python Programming Learning Circle

Jul 4, 2022 · Artificial Intelligence

Building an Advertising Recommendation Model with Python and PyTorch

This article walks through the development of a simple advertising recommendation system using Python, covering data collection, preprocessing with label encoding, text embedding via Torch, constructing an MLP model, and initiating training, while reflecting on the challenges faced by Python developers in the big‑data era.

Embedding · MLP · PyTorch

0 likes · 5 min read

Model Perspective

Jul 3, 2022 · Fundamentals

Explore 20+ Essential Modeling Articles: From Differential Equations to Machine Learning

This curated list groups recent articles on change and predictive models, covering topics such as war dynamics, population, epidemic spread, differential equations, regression, time‑series analysis, machine learning classifiers, and grey‑prediction techniques, providing students with ready references for diverse modeling approaches.

Modeling · Time Series · machine learning

0 likes · 3 min read

DataFunSummit

Jul 3, 2022 · Artificial Intelligence

Graph Neural Network Approaches for Internet Financial Fraud Detection

The talk examines how the COVID‑19 pandemic accelerated online financial services and fraud, outlines the challenges of traditional and internet‑based fraud detection, and presents graph neural network solutions—including PC‑GNN and AO‑GNN—demonstrating their effectiveness on real‑world and public datasets while discussing future research directions.

AUC optimization · financial fraud · fraud detection

0 likes · 12 min read

Python Crawling & Data Mining

Jul 3, 2022 · Artificial Intelligence

Logistic Regression vs KNN: Python Stock Trading Experiment

A Python enthusiast reproduces a Tsinghua University quantitative trading strategy, swapping K‑Nearest Neighbors for logistic regression, fetches three years of Moutai stock data, engineers features, trains and evaluates the model, and finds logistic regression slightly underperforms the original KNN benchmark.

logistic regression · machine learning · stock trading

0 likes · 5 min read

21CTO

Jun 26, 2022 · Artificial Intelligence

Can Babies Teach Us to Build the Next Generation of AI?

Researchers at Trinity College Dublin propose new AI guidelines inspired by infant learning, arguing that babies' experiential, unsupervised learning can overcome current machine learning limitations, and outlining three principles to help develop more efficient, data‑light AI systems.

AI · Unsupervised Learning · infant learning

0 likes · 4 min read

DataFunTalk

Jun 24, 2022 · Artificial Intelligence

Explore‑and‑Exploit (EE) in JD Search: Bias Mitigation, Model Iteration, and Evaluation

The talk presents JD Search's Explore‑and‑Exploit (EE) module, detailing its bias‑mitigation pipeline (including position, popularity, and exposure debiasing), model architecture upgrades with SVGP and causal inference, online A/B metrics, offline evaluation methods, and future research directions to improve search diversity and long‑term value.

SVGP · bias mitigation · explore‑exploit

0 likes · 17 min read

DataFunSummit

Jun 23, 2022 · Artificial Intelligence

Unlocking Data Potential: Automatic Data Augmentation, Denoising, Active Learning, and Data Splitting

The talk explains how to maximize the value of training data, covering background on model generalization, automatic data augmentation techniques, denoising strategies, active learning for selecting unlabeled samples, and robust data splitting methods, and offers practical guidelines for AI practitioners.

AI · Data Quality · active learning

0 likes · 16 min read

Model Perspective

Jun 22, 2022 · Artificial Intelligence

Understanding Model Performance: Precision, Recall, and F1 Score Explained

This article explains how to evaluate classification models by moving beyond simple accuracy to using confusion matrices, precision, recall, and the F1 score, illustrating their trade‑offs and when each metric is most appropriate for different real‑world scenarios.

F1 score · classification · confusion matrix

0 likes · 4 min read
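
The metrics named in the entry above can be computed directly from confusion-matrix counts. What follows is a minimal sketch in plain Python, not code from the article; the function name `precision_recall_f1` and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for one positive class.

    Hypothetical helper: a zero denominator yields 0.0 by assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 0, 0, 0]
    print(precision_recall_f1(y_true, y_pred))
```

In the sample data, accuracy is 5/8 even though only one of the three positives is found, which is the kind of gap between accuracy and recall that the entry refers to.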

Model Perspective

Jun 19, 2022 · Artificial Intelligence

How Decision Trees Work: From Entropy to Gini Index Explained

This article introduces decision tree algorithms, explains their role in supervised learning for classification and regression, details the construction process, compares information gain and Gini index for attribute selection, and reviews popular tree methods such as ID3, C4.5, and CART with illustrative examples.

C4.5 · CART · Gini Index

0 likes · 7 min read

Model Perspective

Jun 18, 2022 · Artificial Intelligence

Understanding Support Vector Machines: Theory, Example, and Python Code

This article explains the fundamentals of Support Vector Machines, describes how they separate data with optimal hyperplanes, provides a 2‑D example with visualizations, and includes Python code using scikit‑learn to generate synthetic data, plot points, and illustrate possible decision boundaries.

Support Vector Machine · classification · machine learning

0 likes · 4 min read

Model Perspective

Jun 17, 2022 · Artificial Intelligence

What Is Classification in Data Mining? Types, Models, and Key Applications

The article explains classification as a data‑analysis task that builds models to assign new observations to predefined categories, outlines its implementation steps, describes various data types (boolean, nominal, ordinal, continuous, discrete), presents common machine‑learning classifiers such as decision trees and neural networks, and highlights practical applications like crime detection, disease risk prediction, and credit assessment.

Model Evaluation · classification · data mining

0 likes · 5 min read

Model Perspective

Jun 17, 2022 · Artificial Intelligence

Understanding Supervised Learning: Regression vs Classification Explained

This article explains the fundamentals of supervised machine learning, distinguishing between regression and classification, describing how algorithms learn mappings from inputs to outputs, and outlining common models such as linear regression, logistic regression, decision trees, SVMs, random forests, and neural networks.

Artificial Intelligence · classification · machine learning

0 likes · 4 min read

Model Perspective

Jun 13, 2022 · Artificial Intelligence

Understanding Decision Trees: From Basic Process to Watermelon Example

This article explains the fundamentals of decision tree learning, describing its recursive construction, the criteria for splitting nodes using information gain based on entropy, and walks through a classic watermelon dataset example to illustrate how attributes are selected and the final tree is built.

ID3 algorithm · Information Gain · classification

0 likes · 8 min read

Efficient Ops

Jun 12, 2022 · Artificial Intelligence

Unlocking AI Success: A Deep Dive into the Model/MLOps Capability Maturity Framework

This article explains the world's first AI model development management standard, the Model/MLOps Capability Maturity Model (Part 1: Development Management), detailing its structure, key domains such as requirement management, test case design, and project planning, and how organizations can assess and improve their AI engineering capabilities.

AI Governance · Capability Maturity Model · MLOps

0 likes · 9 min read

DataFunTalk

Jun 10, 2022 · Artificial Intelligence

Intelligent Risk Control Algorithms for Logistics and Commercial Vehicle Finance

This article examines the rapid growth of financing demand among small‑and‑micro enterprises in logistics and commercial vehicle sectors, outlines the high financial penetration in the industry, and details how AI‑driven intelligent risk‑control frameworks—covering data pipelines, model selection, feature‑portrait systems, and graph‑based applications—address the challenges and opportunities of modern financial risk management.

Artificial Intelligence · graph analytics · logistics finance

0 likes · 17 min read

DataFunSummit

Jun 8, 2022 · Artificial Intelligence

Search Term Recommendation: Scenarios, Algorithm Design, and Future Directions

This article presents a comprehensive overview of search term recommendation in QQ Browser, covering various recommendation scenarios, challenges, query library architecture, multi‑task ranking models, coarse‑to‑fine ranking pipelines, auto‑completion strategies, and future research directions.

AI · machine learning · multi-task learning

0 likes · 14 min read

Laravel Tech Community

Jun 6, 2022 · Artificial Intelligence

What an Open‑Source Twitter Algorithm Would Look Like: Architecture, Data Model, and Engineering Challenges

This article examines the practical aspects of open‑sourcing Twitter’s recommendation algorithm, covering the platform’s data model, timeline views, ranking features, a TypeScript pseudocode illustration, and the major engineering challenges of scale, real‑time processing, reliability, and security.

Twitter · algorithm · large scale

0 likes · 14 min read

Python Programming Learning Circle

Jun 5, 2022 · Artificial Intelligence

The Rise of Hugging Face: From Emoji Logo to Leading AI Platform

From its quirky start as an iPhone chatbot app for teenagers to becoming the central hub for open‑source transformer models, Hugging Face has grown into a fast‑rising AI platform, securing $100 million Series C funding, serving thousands of organizations, and aiming to democratize machine learning.

AI · Funding · Hugging Face

0 likes · 7 min read

Model Perspective

Jun 4, 2022 · Artificial Intelligence

Master K-means Clustering: How the Algorithm Finds Compact Groups

K-means is a classic distance‑based clustering algorithm that iteratively partitions data into k compact, well‑separated groups by minimizing the sum of squared errors, using random centroid initialization and heuristic updates until convergence, making it a fundamental tool in AI and data analysis.

K-Means · Unsupervised Learning · algorithm

0 likes · 3 min read
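
The loop summarized in the entry above (random centroid initialization, assignment, centroid update, repeat until nothing changes) can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the article's code: the function name `kmeans`, the fixed `seed`, and the rule that an empty cluster keeps its previous centroid are all assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Hypothetical minimal k-means; returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initialization from the data
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # assumption: keep as-is
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    # Final assignment and the sum of squared errors being minimized.
    labels = [
        min(range(k), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(data, k=2))
```

Because the result depends on the random starting centroids, practical implementations usually run several initializations and keep the one with the lowest sum of squared errors.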

Model Perspective

Jun 4, 2022 · Artificial Intelligence

Master Systematic Clustering: From Distance Matrix to Multi-Level Groupings

Systematic clustering, a widely used hierarchical clustering technique, builds a dendrogram by iteratively merging the closest sample points based on a distance matrix, allowing analysts to visualize and select groupings at various distance thresholds, from a single cluster to each point as its own class.

Hierarchical Clustering · clustering · distance matrix

0 likes · 3 min read

Model Perspective

Jun 4, 2022 · Fundamentals

Understanding Sample Similarity: Distance Metrics and Cluster Methods

This article explains how to quantify similarity between data samples using distance metrics such as Manhattan, Euclidean, and Chebyshev, outlines the properties these distances must satisfy, and describes common inter‑class measures like single linkage, complete linkage, centroid, group average, and sum‑of‑squares methods.

Minkowski · clustering · distance metrics

0 likes · 4 min read
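
The three sample distances named in the entry above are related: Manhattan and Euclidean distance are the Minkowski distance with p = 1 and p = 2, and Chebyshev distance is its limit as p grows. A minimal sketch in plain Python follows; the function names `minkowski` and `chebyshev` are chosen here for illustration and are not taken from the article.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p (p=1: Manhattan, p=2: Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


def chebyshev(x, y):
    """Chebyshev distance: the largest coordinate-wise difference."""
    return max(abs(a - b) for a, b in zip(x, y))


if __name__ == "__main__":
    x, y = (0, 0), (3, 4)
    print("Manhattan:", minkowski(x, y, 1))
    print("Euclidean:", minkowski(x, y, 2))
    print("Chebyshev:", chebyshev(x, y))
```

For any pair of points the three values are ordered Chebyshev ≤ Euclidean ≤ Manhattan, so the choice of metric changes which samples count as close when clusters are merged.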

Model Perspective

Jun 2, 2022 · Artificial Intelligence

Master Polynomial Regression: Fit Non‑Linear Data with Simple Polynomials

Polynomial regression extends linear models by fitting data with higher‑order polynomial functions, requiring selection of the polynomial degree and its coefficients, and can be applied alongside other nonlinear fitting techniques to capture complex growth trends in real‑world systems.

machine learning · nonlinear fitting · polynomial regression

0 likes · 3 min read

Model Perspective

Jun 2, 2022 · Fundamentals

Understanding Simple and Multivariate Linear Regression Models

This article introduces the basics of simple (univariate) linear regression and extends to multivariate linear regression, explaining their regression equations, the use of the least‑squares method to estimate parameters, and the practical relevance of multiple predictors in modeling real‑world phenomena.

Least Squares · machine learning · multivariate analysis

0 likes · 3 min read

Tencent Cloud Developer

May 31, 2022 · Artificial Intelligence

Scalable Graph Neural Architecture Search System (PaSca) – WWW 2022 Best Student Paper

PaSca, a scalable graph neural architecture search system that separates message aggregation from updates, explores over 150,000 GNN designs with multi‑objective optimization, delivers models that outperform traditional GNNs in accuracy, memory and speed, has been open‑sourced and deployed at Tencent for risk control, recommendation and fraud detection, and earned the WWW 2022 Best Student Paper award.

Big Data · Neural Architecture Search · Scalable Systems

0 likes · 11 min read

HelloTech

May 30, 2022 · Artificial Intelligence

Harbor's Passive Growth Algorithms and Growth Engine: Practices and Insights

Harbor’s growth engine combines a passive, attribution‑driven traffic‑allocation algorithm with componentized ranking, search, and marketing systems, using pairwise/listwise models, multi‑task CTR/CVR prediction, and automated strategy triggers to align short‑term efficiency with long‑term LTV goals while moving toward causal inference and domain‑expert‑driven general models.

AI · algorithm engineering · growth algorithms

0 likes · 11 min read

TAL Education Technology

May 26, 2022 · Artificial Intelligence

GoodFuture International Algorithm Team Wins Champion and Runner‑up in the 5th Educational Data Mining Workshop

The GoodFuture International Algorithm Team, together with Jinan University Guangdong Smart Education Research Institute, distinguished themselves among 95 global teams in the 5th Educational Data Mining in Computer Science Education Workshop, securing a champion title in one task and a runner‑up in another, showcasing advanced AI‑driven predictive and recommendation techniques for intelligent student assessment.

Educational Data Mining · Predictive Modeling · machine learning

0 likes · 6 min read
