Tagged articles

Text Classification

57 articles

Jul 3, 2026 · Artificial Intelligence

NLP Study Notes: How Deep Learning Powers Natural Language Processing

This article explains how deep learning models such as RNNs, LSTMs, GRUs and Transformers enable NLP tasks like machine translation, text classification, question answering and text generation, outlines their advantages over traditional methods, and provides a Keras code example for text classification.

Deep Learning · Keras · Machine Translation

0 likes · 8 min read

Sohu Tech Products

Nov 26, 2025 · Artificial Intelligence

How Cleanlab Cut Data Review by 34×: A Real‑World Text Classification Case Study

This article walks through a real text‑classification project where noisy labels inflated the review workload to over 15,000 samples, and shows how cleanlab’s confident‑learning framework reduced the manual audit set to 438 items, a roughly 34‑fold cut in review effort, while improving model performance.

Data Quality · Data-centric AI · Text Classification

0 likes · 16 min read

Open Source Tech Hub

Aug 22, 2025 · Artificial Intelligence

Automate User Feedback Classification with a Large‑Model API in PHP

This guide shows how to use the Tongyi Qianwen large‑model API with PHP to automatically classify user feedback into predefined categories, eliminating manual analysis and complex NLP development while providing clear steps, code, and result interpretation for rapid business insights.

API · Automation · Large Language Model

0 likes · 7 min read

OPPO Amber Lab

Apr 26, 2024 · Artificial Intelligence

Deploy Efficient Text Classification on Android with TensorFlow Lite

This guide walks you through the end‑to‑end process of building, training, converting, and deploying a TensorFlow Lite text‑classification model on Android, covering data preparation, model selection, performance trade‑offs, and integration using the TFLite Task Library.

Android · TensorFlow Lite · Text Classification

0 likes · 19 min read

Bilibili Tech

Feb 18, 2024 · Artificial Intelligence

Bilibili Personal Attack Content Governance: Background, Goals, Methods, and Effectiveness

Bilibili combats personal‑attack and trolling comments by combining sector‑specific keyword databases, user‑group analysis, advanced word‑matching (including pinyin and homophone detection) and multiple NLP/graph models, which cut personal‑attack reports in entertainment, film and gaming by about 32% and trolling reports by roughly 25% between June and December 2023.

Bilibili · Text Classification · abusive language detection

0 likes · 12 min read

Open Source Tech Hub

Jan 28, 2024 · Artificial Intelligence

How to Fine‑Tune a Text Classification Model with ModelScope’s PyTorch Trainer

This guide explains how to use ModelScope’s trainer components to fine‑tune a pretrained backbone for text classification, covering dataset loading, configuration modification, trainer construction, training, evaluation, prediction, and checkpoint management with concrete code examples.

ModelScope · PyTorch · Text Classification

0 likes · 11 min read

Sohu Tech Products

Sep 6, 2023 · Mobile Development

Building an iOS SMS Spam Filter App with CoreML

This tutorial walks through creating a custom iOS SMS spam filter app, covering extraction of personal SMS data from an iPhone backup, training a CoreML text‑classification model with CreateML, implementing a Message Filter Extension in Xcode, and exploring advanced update strategies.

App Extension · CoreML · Mobile App

0 likes · 12 min read

Sohu Tech Products

Jun 7, 2023 · Artificial Intelligence

Multiscale PU Learning for Detecting AI‑Generated Text

Researchers from Peking University and Huawei present a multiscale positive‑unlabeled learning framework that significantly improves detection of AI‑generated short and long texts, addressing the difficulty of distinguishing AI‑written content from human writing and outperforming existing baselines on multiple benchmarks.

AI detection · Large Language Models · Pu-Learning

0 likes · 8 min read

Bitu Technology

Jul 8, 2022 · Artificial Intelligence

Applying NLP and Machine Learning to Classify Tubi User Feedback

This article explains how Tubi leverages natural‑language processing, sentence embeddings (USE and BERT), and LightGBM models to automatically categorize large volumes of Net Promoter Score comments and customer‑support tickets, enabling data‑driven product decisions and workflow automation.

LightGBM · NLP · Text Classification

0 likes · 11 min read

DataFunSummit

Jun 11, 2022 · Artificial Intelligence

Transforming Regular Expressions into Neural Networks for Text Classification and Slot Filling

This article explains how regular expressions can be converted into equivalent neural network models—FA‑RNN for classification and FST‑RNN for slot filling—by leveraging finite‑state automata, tensor decomposition, and pretrained word embeddings, achieving zero‑shot performance and strong results in low‑resource scenarios.

FA-RNN · Text Classification · neural networks

0 likes · 17 min read

Zuoyebang Tech Team

Apr 15, 2022 · Artificial Intelligence

Zuoyebang’s NLP Platforms: Boosting Online Education with AI

In this interview, Zuoyebang’s NLP lead explains how the company built self‑developed platforms like IQC and FTP to automate text quality inspection and intelligent labeling, outlines their architecture, shares practical deep‑learning applications such as translation and grammar correction, and discusses future research directions in large‑scale multi‑label classification, few‑shot learning, and multimodal models.

AI Platforms · NLP · Text Classification

0 likes · 11 min read

DataFunTalk

Mar 17, 2022 · Artificial Intelligence

A Survey of Text Classification and Intent Recognition: Industrial and Research Perspectives

This article reviews recent developments in text classification and intent recognition, comparing industrial practices such as business‑coupled feature engineering with research trends like pretrained language models, and provides references and practical insights for building effective NLP solutions.

Intent Recognition · NLP · Pretrained Models

0 likes · 13 min read

IEG Growth Platform Technology Team

Feb 14, 2022 · Artificial Intelligence

Multimodal Evolution and Application in Tencent Game Advertising System

This article describes the end‑to‑end multimodal modeling pipeline—covering text, image, and video understanding, model evolution from shallow to deep networks, key‑frame extraction, fine‑tuning, and multimodal fusion—used in Tencent's game ad exchange platform, along with practical deployment challenges and solutions.

Advertising · CNN · Multimodal Learning

0 likes · 22 min read

DataFunSummit

Jan 16, 2022 · Artificial Intelligence

Multimodal Text and Speech Emotion Analysis: Overview, MSCNN‑SPU Model, and Domain Adaptation

This talk presents an overview of text‑plus‑speech multimodal emotion analysis, covering background, single‑modal text and audio models, the MSCNN‑SPU multimodal architecture, domain‑adaptation techniques, and future directions, with detailed model explanations, experimental results, and practical deployment insights.

Audio Processing · Deep Learning · Text Classification

0 likes · 40 min read

Baobao Algorithm Notes

Jan 14, 2022 · Artificial Intelligence

Boosting BERT Text Classification with Label Embedding: How It Works

The paper proposes a simple yet effective method that fuses label embeddings into BERT, enhancing text‑classification performance without increasing computational cost, and validates the approach across six benchmark datasets, also exploring tf‑idf‑based label augmentation and the impact of using [SEP] versus no‑[SEP] inputs.

BERT · Deep Learning · NLP

0 likes · 8 min read

ByteDance Terminal Technology

Jan 7, 2022 · Information Security

Graph-Based Detection of Malicious Webpages: Methods, Experiments, and Future Directions

This article presents a comprehensive study on detecting malicious webpages by constructing heterogeneous graphs from URL redirection and textual features, applying Graph Convolutional Networks and Cluster‑Text‑GCN models, detailing optimization techniques for large‑scale deployment, and outlining future research directions.

GCN · Graph Neural Networks · Text Classification

0 likes · 11 min read

Code DAO

Dec 12, 2021 · Artificial Intelligence

How to Boost Text Analysis Accuracy on a 2‑Billion‑Word Corpus

This article explains practical techniques for improving NLP model accuracy on massive corpora, covering challenges of multi‑field text, word‑embedding choices, a fasttext‑based regression demo with book‑review data, feature engineering tricks, and a comparison with tf‑idf + LASSO.

NLP · Python · Regression

0 likes · 13 min read

DataFunTalk

Aug 14, 2021 · Artificial Intelligence

Multimodal Advertisement Detection System for WeChat "KanKan" Articles

This article introduces a multimodal advertisement detection framework for WeChat KanKan that decomposes the problem into text, image, and article‑structure dimensions, presents novel models for ad text and image recognition, and describes how sequence classification and visualisation are used to filter severe ad‑spam articles.

Multimodal AI · Text Classification · WeChat

0 likes · 16 min read

58 Tech

Aug 10, 2021 · Artificial Intelligence

Active Learning and Model Enhancements for Semantic Tag Mining in 58.com Voice Data

This article presents a comprehensive study on extracting semantic tags from 58.com voice data, detailing the use of active learning to address cold‑start problems, comparing keyword matching, XGBoost, TextCNN, CRNN, and an improved Wide&Deep model, and demonstrating significant reductions in labeling effort and superior F1 scores across multiple experiments.

Active Learning · CRNN · Text Classification

0 likes · 15 min read

Ctrip Technology

Jul 29, 2021 · Artificial Intelligence

NLP Techniques for Classifying Ctrip Ticket Customer Service Conversations

This article presents the background, problem analysis, data preprocessing, modeling approaches and optimization results of applying various NLP methods—including statistical models, word embeddings, attention mechanisms and pretrained language models such as BERT—to improve the accuracy of classifying Ctrip ticket customer service dialogues.

BERT · Deep Learning · NLP

0 likes · 13 min read

NetEase Media Technology Team

Apr 13, 2021 · Artificial Intelligence

Applying BERT for News Timeliness Classification at NetEase

The article describes how NetEase adapts a pre‑trained BERT model to classify news articles into ultra‑short, short, or long timeliness categories by combining rule‑based strong and weak time cues, key‑sentence extraction, domain‑embedding fusion and multi‑layer semantic aggregation, achieving accurate and interpretable predictions for its platform.

Artificial Intelligence · BERT · Model Fusion

0 likes · 12 min read

58 Tech

Mar 1, 2021 · Artificial Intelligence

Intelligent QABot for 58.com: Classification and Retrieval Model Exploration

This article describes how 58.com’s AI Lab built and continuously improved the QABot intelligent customer‑service system by designing classification and retrieval models, evaluating FastText, LSTM‑DSSM, BERT and a self‑developed SPTM framework, and finally fusing them to boost answer rates and user experience.

AI Chatbot · BERT · Model Fusion

0 likes · 9 min read

58 Tech

Sep 21, 2020 · Artificial Intelligence

58.com AI Algorithm Competition: Winning Teams and Their Technical Solutions

The 58.com AI Algorithm Competition showcased intelligent customer‑service technology, with 158 teams competing on text classification and matching tasks, and the top five teams presenting detailed BERT, ELECTRA, focal‑loss and multi‑model fusion solutions along with award ceremonies, video recordings and PPT resources.

AI · BERT · ELECTRA

0 likes · 9 min read

Baobao Algorithm Notes

Aug 28, 2020 · Artificial Intelligence

Avoid Common Pitfalls in Industrial Text Classification: A Practical Guide

This comprehensive guide examines real‑world text classification projects, covering label taxonomy design, data scarcity solutions, efficient annotation, new‑class discovery, algorithm selection, evaluation metrics, OOV handling, model evolution, rule‑model integration, performance‑boosting tricks, and inference under resource constraints.

NLP · Text Classification · adversarial validation

0 likes · 15 min read

58 Tech

Aug 14, 2020 · Artificial Intelligence

Using SPTM in qa_match for the 58 City AI Competition: Data Preparation, Model Training, and Prediction

This article provides a step‑by‑step guide on preparing data, pre‑training the SPTM lightweight model, fine‑tuning a text‑classification model with qa_match, and generating competition‑ready predictions for the 58 City AI Algorithm Contest, including all required shell commands and parameter explanations.

AI · SPTM · Text Classification

0 likes · 9 min read

58 Tech

Aug 12, 2020 · Artificial Intelligence

Guide to Using SPTM (Simple Pre-trained Model) with qa_match for an AI Competition

This article provides a step‑by‑step tutorial on preparing data, pre‑training the SPTM language model, fine‑tuning a text‑classification model, generating predictions, and creating a submission file for the 58.com AI algorithm competition using the open‑source qa_match toolkit.

AI · Model Training · NLP

0 likes · 9 min read

DataFunTalk

Aug 6, 2020 · Artificial Intelligence

Practical Applications of Pretrained Language Models (BERT, GPT, ELMo) in NetEase Yanxuan NLP Tasks

The article reviews the principles of popular pretrained language models, compares their architectures, and details how NetEase Yanxuan applied BERT, GPT and ELMo to classification, matching, sequence labeling and generation tasks, presenting experimental results and deployment insights.

BERT · NLP · Text Classification

0 likes · 20 min read

Tencent Advertising Technology

Jul 30, 2020 · Artificial Intelligence

Winning Strategies for the Tencent Advertising Algorithm Competition: Text Classification with Word2Vec and BiLSTM

The article details the Tencent Advertising Algorithm competition final, explains the chizhu team's approach of converting ad IDs into word sequences for text classification using large‑scale word2vec embeddings and a dual BiLSTM architecture, presents custom loss functions, training tricks, and shares full Python model code, achieving an overall rank of 11.

Advertising · BiLSTM · Deep Learning

0 likes · 9 min read

DataFunTalk

Jun 28, 2020 · Artificial Intelligence

Applying UDA Semi‑Supervised Learning to Financial Text Classification: Experiments and Insights

This article investigates the practical performance of Google’s 2019 Unsupervised Data Augmentation (UDA) framework on real‑world financial text classification tasks, detailing experiments with limited labeled data, out‑of‑domain samples, noisy labels, and comparisons between BERT and lightweight TextCNN models.

BERT · Financial NLP · Semi-supervised Learning

0 likes · 21 min read

Tencent Advertising Technology

Jun 2, 2020 · Artificial Intelligence

Turning Ad Click Sequences into Age & Gender Predictions with Transformers

This article shares a competition winner's step‑by‑step solution for predicting user age and gender from ad click sequences, treating IDs as words, using word2vec embeddings, a custom transformer‑LSTM model, dual‑task loss, and weight‑search post‑processing.

Advertising · NLP · Text Classification

0 likes · 7 min read

DataFunTalk

Apr 5, 2020 · Artificial Intelligence

WeChat Hotspot Mining Platform: Architecture, Detection, and Presentation

This article describes a WeChat hotspot mining platform that integrates multiple data sources, builds quality and prediction models, employs advanced clustering and multi‑granular text matching techniques, and uses generative active learning to efficiently discover, predict, and present news hotspots for users.

Active Learning · Text Classification · WeChat

0 likes · 17 min read

Xueersi Online School Tech Team

Jan 17, 2020 · Artificial Intelligence

Fine‑tuning BERT for Sentence Pair Similarity in an Online Education Platform

This article describes how a BERT‑based model is fine‑tuned to compute sentence‑pair similarity for improving recommendation accuracy in an online school, detailing the architecture, training mechanisms, code implementation, experimental results, and future extensions such as sentiment analysis.

BERT · Chinese NLP · Deep Learning

0 likes · 20 min read

Amap Tech

Jan 3, 2020 · Artificial Intelligence

Machine Learning Solutions for User Feedback Intelligence at Amap (Gaode Maps)

Amap replaced its rule‑based feedback pipeline with a three‑stage, LSTM‑driven system that combines word2vec embeddings and structured fields, achieving over 96% classification accuracy, cutting manual workload by 80%, and slashing per‑task costs while enabling scalable, data‑driven map quality improvements.

Gaode Maps · LSTM · NLP

0 likes · 14 min read

Yanxuan Tech Team

Dec 9, 2019 · Artificial Intelligence

How NetEase Yanxuan Leverages BERT, GPT, and ELMo for Real-World NLP Tasks

This article reviews the evolution of language models from bag‑of‑words to BERT, compares ELMo, GPT, and BERT architectures, and details how NetEase Yanxuan applies pre‑trained models to classification, text matching, sequence labeling, and generative tasks in production.

BERT · ELMo · GPT

0 likes · 19 min read

Ctrip Technology

Nov 21, 2019 · Artificial Intelligence

Designing and Deploying an NLP Model for Airline Ticket Customer Service

This article describes the end‑to‑end development of a multi‑class NLP system for Ctrip airline ticket customer service, covering problem analysis, data preprocessing, sample balancing, model architecture (TextCNN and Bi‑GRU), training strategies, performance evaluation, and online customization to achieve high accuracy in intent recognition.

Bi-GRU · Deep Learning · Model Deployment

0 likes · 16 min read

360 Tech Engineering

Nov 13, 2019 · Artificial Intelligence

Text Anti‑Spam Techniques and TextCNN Model for Real‑Time Spam Detection on the Huajiao Platform

This article introduces the Huajiao platform's text anti‑spam architecture, analyzes spam categories and challenges, compares rule‑based and machine‑learning approaches, details traditional NLP methods and the TextCNN deep‑learning model, provides its TensorFlow implementation, and describes the online deployment workflow.

CNN · NLP · TensorFlow

0 likes · 14 min read

Huajiao Technology

Nov 12, 2019 · Artificial Intelligence

Text Anti‑Spam Detection with TextCNN: From Traditional Methods to Online Deployment

This article introduces the challenges of text‑based spam on the Huajiao platform, reviews traditional rule‑based and machine‑learning classification methods, explains the TextCNN architecture for robust character‑level detection, and details its TensorFlow Serving deployment for real‑time anti‑spam services.

CNN · TensorFlow · Text Classification

0 likes · 16 min read

JD Tech Talk

Jul 24, 2019 · Artificial Intelligence

Absolute Semantic Recognition Competition: Feature Design, Modeling Strategy, and Core Algorithm Insights

This article presents a comprehensive solution to the absolute semantic recognition competition, detailing the problem background, dataset, evaluation metrics, feature engineering, model architecture—including Attention, Capsule, Bi‑GRU, and BERT—and analysis of results and lessons learned.

BERT · Capsule Networks · NLP

0 likes · 11 min read

Tencent Cloud Developer

Jul 19, 2019 · Artificial Intelligence

Multi-turn Dialogue Intent Classification: Data Processing, Model Construction, and Operational Optimization

The article details a multi‑turn dialogue intent classification pipeline that extracts and expands labeled utterances, preprocesses text with custom tokenization, trains a two‑layer CNN‑Highway and a multi‑head self‑attention model, analyzes errors, and achieves up to 98.7% accuracy on a large, balanced dataset.

BERT · CNN · Text Classification

0 likes · 15 min read

iQIYI Technical Product Team

May 17, 2019 · Artificial Intelligence

Kui: AI-Powered Anti-Spam System Architecture and Strategies

Kui is iQiyi’s AI‑driven anti‑spam platform that protects online communities through a three‑layer architecture—service, algorithm strategy, and auxiliary modules—and employs keyword, rule‑based, machine‑learning, and risk‑control strategies to detect advertising, pornographic, abusive and other malicious content while continuously adapting to evolving threats.

AI system · Text Classification · anti-spam

0 likes · 10 min read

58 Tech

Feb 22, 2019 · Artificial Intelligence

Algorithm Evolution and Implementation of 58.com Intelligent QABot for Business Consultation

The article details the design and iterative improvement of 58.com’s intelligent QABot, covering knowledge‑base construction, feature engineering, three generations of classification models—including FastText, Bi‑LSTM, and deep semantic matching—and evaluation metrics that achieve high accuracy and automation rates.

AI · Deep Learning · Intelligent Customer Service

0 likes · 12 min read

iQIYI Technical Product Team

Jan 25, 2019 · Artificial Intelligence

Multimodal Video Quality Assessment Models for Short Video Platforms

The paper presents an integrated multimodal quality assessment system for short‑video platforms that evaluates cover images, video content, and accompanying text using deep‑learning and handcrafted features—combining ResNet‑50, NetVLAD, TSN, VGGish, and XGBoost—to improve user experience, recommendation accuracy, and operational efficiency, with plans for optimization and modular deployment.

Image Analysis · Multimodal Learning · Text Classification

0 likes · 11 min read

Tencent TDS Service

Jan 24, 2019 · Artificial Intelligence

Unlocking BERT: How Its Transformer Engine Powers State-of-the-Art Text Classification

This article explains BERT’s architecture—from its bidirectional Transformer encoder and attention mechanisms to its pre‑training tasks—and presents extensive experiments showing its superior performance on various Chinese and English text‑classification benchmarks across multiple datasets.

BERT · NLP · Text Classification

0 likes · 22 min read

DataFunTalk

Jan 9, 2019 · Artificial Intelligence

Reinforcement Learning in Natural Language Processing: Concepts, Challenges, and Applications

This article introduces reinforcement learning fundamentals, contrasts it with supervised learning, and explores its challenges and advantages in natural language processing, including applications such as text classification, relation extraction from noisy data, and weakly supervised topic segmentation, while summarizing key insights and experimental results.

Text Classification · Weak Supervision · natural language processing

0 likes · 11 min read

AntTech

Aug 16, 2018 · Artificial Intelligence

Deep Learning Approaches for Text Classification in Alipay Complaint Fraud Detection

This article reviews deep‑learning‑based text classification techniques—including TextCNN, BiGRU, Capsule Networks, Attention mechanisms, and the novel cw2vec embedding—applied to Alipay complaint fraud data, presents experimental comparisons, and discusses their advantages, challenges, and future directions.

Alipay · Deep Learning · Text Classification

0 likes · 18 min read

Practical DevOps Architecture

Aug 2, 2018 · Artificial Intelligence

Naive Bayes Text Classification: Theory, Implementation, and Evaluation

This article explains the principles of Naive Bayes text classification, detailing feature representation, model selection, training and testing procedures, probability calculations, code implementation in Python, and evaluation metrics such as accuracy, precision, recall, PR and ROC curves.

Evaluation Metrics · Naive Bayes · Precision

0 likes · 22 min read

Tencent Cloud Developer

Mar 21, 2018 · Artificial Intelligence

Abusive Comment Detection Using TextCNN: A Strategy + Algorithm Approach

The article proposes a hybrid approach that first filters blacklist words and then classifies suspicious comments with a character-level TextCNN, achieving around 89% precision and 87% recall, demonstrating that simple convolutional networks outperform keyword filters and RNNs for short, noisy abusive Chinese text.

Abusive Comment Detection · Deep Learning · NLP

0 likes · 10 min read

360 Zhihui Cloud Developer

Mar 6, 2018 · Artificial Intelligence

Master Naive Bayes: From Theory to Python Text Classification

This article introduces the Naive Bayes classifier, explains its underlying probability formulas—including conditional probability, total probability, and Bayes' theorem—covers the feature independence assumption, Laplace smoothing, and demonstrates both manual and scikit‑learn implementations for email and text classification with Python code.

Naive Bayes · Scikit-learn · Text Classification

0 likes · 11 min read

Baobao Algorithm Notes

Feb 28, 2018 · Artificial Intelligence

Mastering Text Classification: From TF‑IDF to Word Embeddings and Deep Learning

This article provides a comprehensive guide to text classification, covering traditional pipelines, bag‑of‑words and TF‑IDF features, dimensionality‑reduction techniques, word‑embedding models such as GloVe, word2vec and fastText, and modern deep‑learning architectures like CNN, RCNN and HAN.

CNN · Deep Learning · NLP

0 likes · 9 min read

Huawei Cloud Developer Alliance

Jan 16, 2018 · Artificial Intelligence

How to Build a Scalable Spark-Based Text Sentiment Analysis System

This article walks through constructing a Spark-powered text sentiment analysis pipeline—from crawling movie reviews, preprocessing and feature extraction with jieba and TF‑IDF, to training Naive Bayes and SVM classifiers—while discussing Spark's advantages and ways to improve model accuracy.

Big Data · NLP · Python

0 likes · 19 min read

iQIYI Technical Product Team

Dec 15, 2017 · Artificial Intelligence

Sentiment Classification of iQIYI User Comments: Model Selection, Feature Engineering, and Online Deployment

The team built a lightweight three‑class sentiment classifier for iQIYI user comments using a linear‑kernel SVM with high‑dimensional bag‑of‑words features and an expanded ~100k word lexicon, achieving over 96% accuracy across domains, and deployed it as a Spring Boot PMML service with zero‑downtime refresh, while planning GBDT‑enhanced features and word‑embedding optimizations.

NLP · Sentiment Analysis · Text Classification

0 likes · 13 min read

Meituan Technology Team

Oct 12, 2017 · Artificial Intelligence

Machine Learning Q&A: Data Imputation, Feature Selection, Recommendation Systems and More

The article answers ten machine‑learning questions, explaining how to impute missing behavior data, extract and select features, describe Meituan‑Dianping’s recommendation pipeline, suggest a beginner learning path, clarify L1 sparsity, recommend TextCNN for text, discuss search‑ranking sample bias, label generation for wide‑deep models, the shift to deep‑learning video detection, and the use of factorization machines for CTR with open‑source examples.

Deep Learning · L1 Regularization · Recommendation Systems

0 likes · 7 min read

Alibaba Cloud Developer

Jul 6, 2017 · Artificial Intelligence

How Alibaba’s Conv‑RNN Boosts Voice Assistant QA and Text Classification

Alibaba’s Tmall Genie X1 showcases a new semantic encoding model called Conv‑RNN that improves intelligent question answering and text classification, achieving state‑of‑the‑art results on benchmark datasets while illustrating the broader impact of semantic encoding on NLP applications.

NLP · Text Classification · conv-RNN

0 likes · 8 min read

Qunar Tech Salon

Aug 18, 2016 · Artificial Intelligence

Automatic Ticket Classification Using SVM and word2vec at Qunar

At Qunar, the data center algorithm team developed an automatic ticket classification system that combines Support Vector Machine with word2vec embeddings to handle high‑dimensional, low‑sample text data, achieving 89% accuracy and 80% recall while outlining the full machine‑learning pipeline from feature extraction to deployment.

Qunar · Text Classification · Word2Vec

0 likes · 7 min read

21CTO

Feb 12, 2016 · Artificial Intelligence

Can Machine Learning Reveal the True Author of Red Mansions' Final 40 Chapters?

This article uses machine learning to compare lexical patterns between the first 80 and last 40 chapters of 'Dream of the Red Chamber', demonstrating distinct stylistic differences that support the scholarly view that the final chapters were not authored by Cao Xueqin.

Red Mansions · Support Vector Machine · Text Classification

0 likes · 6 min read

Suning Technology

Jun 18, 2015 · Artificial Intelligence

How Suning Uses Naive Bayes for High‑Accuracy Product Classification

This article explains Suning's implementation of a Naive Bayes‑based product classification system, detailing its basic theory, formal definition, step‑by‑step training process, three implementation phases, evaluation results, and error analysis to improve classification accuracy.

Naive Bayes · Suning · Text Classification

0 likes · 6 min read

Meituan Technology Team

Dec 18, 2014 · Artificial Intelligence

Auto-Label Missing POI Categories Using Naive Bayes and Feature Selection

This article details a step‑by‑step machine‑learning pipeline that transforms over one million calibrated POI records into feature vectors, selects discriminative terms via information‑gain and domain rules, trains a Naive Bayes classifier, and achieves 91% accuracy with 84% coverage on unseen POI data.

Chinese NLP · Naive Bayes · POI classification

0 likes · 12 min read
