Tagged articles

BERT

150 articles · Page 1 of 2

Jul 2, 2026 · Artificial Intelligence

NLP Study Notes: How Pre‑trained Models Transform Language Processing

This article reviews the evolution of pre‑trained models in natural language processing, from early word embeddings to Transformer‑based architectures like BERT and its variants, outlines their wide‑ranging applications such as QA, translation, and dialogue, and discusses remaining challenges and future research directions.

AI · BERT · NLP

0 likes · 6 min read

AI Engineer Programming

Apr 26, 2026 · Artificial Intelligence

From Bag‑of‑Words to Semantics: How Embeddings Turn Meaning into Numbers (Part 2)

The article explains how embedding techniques encode semantic information into numeric vectors, covering Word2Vec and GloVe fundamentals, BERT anisotropy, SimCSE contrastive learning, alignment and uniformity metrics, ANN index structures such as HNSW, IVF and PQ, Matryoshka representation learning, practical deployment challenges, and evaluation best practices.

ANN · BERT · Embedding

0 likes · 23 min read

Bighead's Algorithm Notes

Mar 14, 2026 · Artificial Intelligence

Quantitative Finance Paper Digest: AI‑Driven Market Prediction Studies (Mar 7‑13 2026)

This digest summarizes four recent research papers that apply advanced AI techniques—node‑transformer graphs with BERT sentiment analysis, a quantum‑classical LSTM‑Born machine hybrid, large‑language‑model benchmarking for portfolio optimization, and a conditional diffusion model—to improve stock market prediction, volatility forecasting, and investment decision making, providing detailed experimental results and statistical validation.

BERT · Large Language Model · Transformer

0 likes · 10 min read

HyperAI Super Neural

Jan 19, 2026 · Artificial Intelligence

AI Boosts Scientists’ Careers by 1.37 Years While Shrinking Research Scope by 4.63%

A Nature study by Tsinghua and the University of Chicago analyzes 41.3 million papers and 5.37 million scientists, showing that AI tools dramatically increase individual researchers’ output and citation impact but simultaneously contract the overall knowledge breadth and collaborative density of science.

AI · BERT · Knowledge breadth

0 likes · 13 min read

JD Tech

Dec 30, 2025 · Artificial Intelligence

How a Semi‑Supervised Unified Framework Boosts E‑commerce Query Intent Classification

The paper introduces a semi‑supervised, extensible unified framework (SSUF) that integrates knowledge, label, and structural enhancements to overcome data sparsity, label bias, and fragmented sub‑tasks in e‑commerce query intent prediction, achieving superior offline and online performance.

BERT · GCN · Semi-supervised Learning

0 likes · 14 min read

Bighead's Algorithm Notes

Dec 2, 2025 · Artificial Intelligence

Dual-Relation Fusion Network (DRFN) for Accurate Stock Prediction

The paper introduces DRFN, a dual‑relation fusion network that jointly models static and dynamic stock relationships using multimodal BERT and GRU encodings, achieving significantly lower RMSE and MAE than baseline models on both Chinese and US market datasets.

BERT · GRU · Graph Neural Network

0 likes · 11 min read

Instant Consumer Technology Team

Aug 15, 2025 · Artificial Intelligence

Master the iFLYTEK Prohibited Words Classification Challenge: Baselines & BERT

This article introduces the iFLYTEK AI Developer Competition on prohibited‑word classification, outlines the task, dataset, evaluation metric, and provides three baseline solutions—including a logistic‑regression model, a BERT fine‑tuning approach, and a large‑model prompt method—along with code snippets and performance notes.

BERT · Large Language Model · NLP

0 likes · 15 min read

Architects' Tech Alliance

Jun 11, 2025 · Artificial Intelligence

From Transformers to DeepSeek‑R1: The 2017‑2025 Evolution of Large Language Models

This article chronicles the rapid development of large language models from the 2017 Transformer breakthrough through the rise of BERT, GPT‑3, ChatGPT, multimodal systems like GPT‑4V/o, and the recent cost‑efficient DeepSeek‑R1, highlighting key architectural innovations, scaling trends, alignment techniques, and their transformative impact on AI research and industry.

AI alignment · BERT · Cost‑Efficient Inference

0 likes · 26 min read

Alibaba Cloud Big Data AI Platform

Mar 21, 2025 · Artificial Intelligence

How to Build Multimodal Image Tagging with RAM and BERT in DataWorks Notebook

This tutorial walks through using DataWorks Notebook with GPU support to combine the open‑vocabulary visual model RAM and the language model BERT for zero‑shot multimodal image captioning, covering environment setup, model installation, dataset preparation, tagging code, and result visualization.

BERT · DataWorks · Multimodal

0 likes · 13 min read

ZhongAn Tech Team

Dec 28, 2024 · Artificial Intelligence

Weekly AI Digest Issue 8: OpenAI Robotics, ModernBERT Upgrade, Spatial Cognition, LLM Agent Evolution, and GNN‑LLM Fusion

This issue surveys recent AI developments, covering OpenAI's renewed robot program, the ModernBERT encoder upgrade, spatial reasoning advances in multimodal models, automated environment generation for LLM agents, and a novel GNN‑LLM approach for label‑free node classification.

BERT · Graph Neural Networks · LLM

0 likes · 10 min read

DataFunSummit

Jul 22, 2024 · Artificial Intelligence

From BERT to LLM: Language Model Applications in 360 Advertising Recommendation

This talk explores how 360's advertising recommendation system leverages language models—from BERT to large‑scale LLMs—to improve user interest modeling, feature extraction, and conversion‑rate prediction, detailing practical challenges, engineering solutions, experimental results, and future research directions.

Advertising · BERT · LLM

0 likes · 18 min read

Airbnb Technology Team

Jan 31, 2024 · Artificial Intelligence

Airbnb’s Listing Attribute Extraction Platform (LAEP): End-to-End Structured Information Extraction Using Machine Learning and NLP

Airbnb’s Listing Attribute Extraction Platform (LAEP) uses a custom NER model, word‑embedding mapping, and a BERT‑based scorer to automatically pull, normalize, and validate structured attributes from hosts’ unstructured text, boosting coverage for downstream tools and enhancing guest‑host matching at scale.

Airbnb · BERT · NER

0 likes · 11 min read

Rare Earth Juejin Tech Community

Dec 20, 2023 · Artificial Intelligence

BERT Model Overview: Inputs, Encoder, Fine‑tuning, and Variants

This article explains BERT's WordPiece tokenization, input embeddings (token, segment, and position embeddings), encoder architecture for Base and Large models, fine‑tuning strategies for various NLP tasks, and introduces popular variants such as RoBERTa and ALBERT.

BERT · NLP · Transformer

0 likes · 12 min read

Rare Earth Juejin Tech Community

Dec 13, 2023 · Artificial Intelligence

Comprehensive Overview of BERT: Architecture, Pre‑training Tasks, and Applications

This article provides a detailed introduction to BERT, covering its bidirectional transformer encoder design, pre‑training objectives such as Masked Language Modeling and Next Sentence Prediction, model configurations, differences from GPT/ELMo, and a wide range of downstream NLP applications.

BERT · Masked Language Model · NLP

0 likes · 17 min read

Rare Earth Juejin Tech Community

Dec 4, 2023 · Artificial Intelligence

An Overview of BERT: Architecture, Pre‑training Tasks, Comparisons, and Applications

This article provides a comprehensive English overview of BERT, covering its original paper, model architecture, pre‑training objectives (Masked Language Model and Next Sentence Prediction), differences from ELMo, GPT and vanilla Transformers, parameter counts, main contributions, and a range of NLP application scenarios such as text classification, sentiment analysis, NER, and machine translation.

BERT · NLP · Next Sentence Prediction

0 likes · 16 min read

Baidu Geek Talk

Nov 2, 2023 · Artificial Intelligence

AI-Powered Code Defect Detection: Leveraging Code Knowledge Graphs and Large Language Models

The paper presents an AI‑driven static analysis framework that builds code knowledge graphs to extract relevant slices and leverages large language models for multilingual defect prediction, achieving up to 80% F1, detecting 662 defects across 1,100 C++ modules with a 26.9% recall gain over traditional rule‑based scanners.

BERT · code defect detection · code knowledge graph

0 likes · 9 min read

Zhuanzhuan Tech

Oct 11, 2023 · Artificial Intelligence

Building a ChatGPT‑Based Intelligent Customer Service System with BERT Classification and Knowledge Filtering

This article describes how to construct an intelligent customer‑service assistant using ChatGPT for natural‑language understanding, BERT for user‑question classification, and Sentence‑BERT for knowledge‑selection, detailing system architecture, prompt design, model training, performance results, and practical cost reductions.

BERT · ChatGPT · Intelligent Customer Service

0 likes · 16 min read

Baidu Tech Salon

Sep 20, 2023 · Artificial Intelligence

Live Session: Introduction to NVIDIA Nsight Systems and Compute for AI Performance Analysis

In a live session, NVIDIA senior deep‑learning solutions architect Zhai Jian demonstrates how to use Nsight Systems and Nsight Compute to analyze a simple neural‑network training workload, accelerate BERT with mixed precision, and examine matrix‑transpose kernels, with registration via QR code and a detailed event schedule.

AI tools · BERT · GPU performance

0 likes · 2 min read

HelloTech

Sep 13, 2023 · Artificial Intelligence

AI Platform‑Powered Automated Ticket Routing: Modeling Workflow, Feature Engineering, and Intent Recognition

The Haro AI platform automates customer‑service ticket routing by applying a four‑step pipeline—feature processing, model training, evaluation, and deployment—using BERT/ALBERT‑based intent recognition, configurable feature storage, AutoML or expert modes, and FaaS‑style deployment, as demonstrated in the Universal Ticket System case study, dramatically improving accuracy and efficiency.

AI platform · ALBERT · BERT

0 likes · 11 min read

Sohu Tech Products

Jul 26, 2023 · Artificial Intelligence

Attention Mechanism, Transformer Architecture, and BERT: An In-Depth Overview

This article provides a comprehensive overview of the attention mechanism, its mathematical foundations, the transformer model architecture—including encoder and decoder components—and the BERT pre‑training model, detailing their principles, implementations, and applications in natural language processing.

Attention Mechanism · BERT · Encoder-Decoder

0 likes · 13 min read

HomeTech

Jul 7, 2023 · Artificial Intelligence

Multi-Modal Video Understanding and AIGC Video Generation at Autohome

This article presents a comprehensive multi-modal video understanding system for AIGC video generation, detailing technical architecture, GCN-based semi-supervised learning, and practical applications across automotive content scenarios.

AIGC · BERT · NeXtVLAD

0 likes · 8 min read

Xianyu Technology

Feb 22, 2023 · Artificial Intelligence

Integrating Retrieval and Generation Tasks for Deep Semantic Matching in Xianyu Search

The paper introduces SimBert, a late‑fusion model that jointly trains a dual‑tower retrieval component and an auxiliary generation task on the item tower, using a two‑stage pre‑training and fine‑tuning pipeline, which yields a 3.6% relevance boost and reduces bad‑case rates in Xianyu search.

BERT · multi-task training · retrieval-generation

0 likes · 8 min read

DataFunSummit

Feb 3, 2023 · Artificial Intelligence

Interactive BERT for Relevance in Health E‑commerce Search

This article presents an in‑depth exploration of an interactive BERT‑based relevance model for health e‑commerce search, detailing the business context, query and product feature extraction, domain‑specific sample generation, model architecture enhancements, offline and online performance gains, and practical deployment through knowledge distillation.

AI · BERT · Semantic Modeling

0 likes · 14 min read

DataFunTalk

Jan 11, 2023 · Artificial Intelligence

Exploring Interactive BERT for Relevance in Health E‑commerce Search

This article presents a comprehensive overview of Alibaba Health's interactive BERT approach for improving relevance in health e‑commerce search, covering business background, model design, domain‑specific data construction, knowledge‑distilled twin‑tower deployment, experimental results, and a detailed Q&A session.

AI · BERT · Semantic Modeling

0 likes · 14 min read

21CTO

Dec 28, 2022 · Artificial Intelligence

Why Google Is Rerouting Its Teams to Compete with ChatGPT

Google’s CEO Sundar Pichai has ordered a rapid shift of resources toward AI, pulling staff from various projects to counter OpenAI’s ChatGPT, while senior engineers discuss the company’s own language models like LaMDA, BERT, and MUM and the future of search.

AI · BERT · ChatGPT

0 likes · 5 min read

Ctrip Technology

Nov 10, 2022 · Artificial Intelligence

Improving Search Intent Recognition and Term Weighting with Deep Learning and Model Distillation at Ctrip

This article describes how Ctrip's R&D team applied deep‑learning models, BERT‑based embeddings, knowledge distillation, and term‑weighting techniques to enhance e‑commerce search intent recognition and term importance estimation, achieving high accuracy while meeting sub‑10 ms latency requirements.

BERT · Intent Recognition · Search

0 likes · 12 min read

ELab Team

Sep 23, 2022 · Artificial Intelligence

Fine‑Tune a Chinese BERT Model for Cloze Tasks in 30 Minutes

This tutorial walks you through NLP fundamentals, the evolution of BERT, the concept of pre‑trained models, and a step‑by‑step guide to fine‑tune a Chinese BERT on a cloze‑style task, complete with code snippets and verification results.

BERT · Chinese · Cloze Task

0 likes · 13 min read

Zuoyebang Tech Team

Sep 15, 2022 · Artificial Intelligence

How We Replaced BERT with a Lightweight TextCNN to Slash GPU Costs

This article describes the production challenges of using BERT for large‑scale text classification at Zuoyebang, explores lightweight alternatives such as knowledge distillation, pruning and quantization, and details a teacher‑student‑active‑learning pipeline that trains a TextCNN model to match BERT performance while dramatically reducing GPU consumption and improving throughput.

Active Learning · BERT · Model Deployment

0 likes · 13 min read

DataFunTalk

Sep 9, 2022 · Artificial Intelligence

AI-Powered Music Comment Moderation and Ranking: Models, Challenges, and Business Impact

This article presents a comprehensive overview of AI-driven music comment moderation and ranking systems, detailing business scenarios, model architectures, data processing techniques, performance improvements, and future directions for both QQ Music and K‑Song platforms.

AI · BERT · NLP

0 likes · 17 min read

JD Cloud Developers

Aug 15, 2022 · Artificial Intelligence

How FCA Doubles BERT’s Inference Speed with Less Than 1% Accuracy Loss

This article explains how the Fine‑ and Coarse‑Granularity Hybrid Self‑Attention (FCA) mechanism reduces BERT’s computational cost by over 50% while keeping accuracy loss under 1%, detailing the method, experimental results, and its significance for efficient large‑scale language models.

BERT · FCA · Model Efficiency

0 likes · 8 min read

DataFunSummit

Jul 7, 2022 · Artificial Intelligence

Discovering and Enhancing Robustness in Low‑Resource Information Extraction

This article examines the robustness challenges of information extraction tasks such as NER and relation extraction, introduces the Entity Coverage Ratio metric, analyzes why pretrained models like BERT may “take shortcuts,” and proposes evaluation tools and training strategies—including mutual‑information‑based methods, negative‑training, and flooding—to improve model robustness across diverse scenarios.

BERT · Evaluation Metrics · Named Entity Recognition

0 likes · 12 min read

Meituan Technology Team

Jul 6, 2022 · Artificial Intelligence

Improving Search Relevance in Dianping

The article details Meituan‑Dianping's search relevance pipeline, describing how multi‑similarity matrix structures, multi‑stage domain‑adaptive training, POI field summarization, and online inference optimizations together improve a BERT‑based relevance model's offline metrics and reduce the BadCase rate in production.

BERT · Meituan · multi-similarity matrix

0 likes · 31 min read

Ctrip Technology

Jun 16, 2022 · Artificial Intelligence

Entity Linking System for Travel Knowledge Graph at Ctrip AI R&D

The article presents Ctrip's travel AI team's end‑to‑end entity linking solution built on a large‑scale tourism knowledge graph, detailing its background, technical architecture, core modules—including mention detection, candidate generation, and disambiguation using BERT and prefix‑tree techniques—and real‑world applications such as search, intelligent customer service, and POI data maintenance.

BERT · Knowledge Graph · NLP

0 likes · 18 min read

Meituan Technology Team

May 26, 2022 · Artificial Intelligence

Span-Level Dialogue Summarization via Distant Supervision and Machine Reading Comprehension (DSMRC‑S)

The paper reviews classic summarization models, then proposes DSMRC‑S, a span-level extractive dialogue summarization method using distant supervision and a machine‑reading‑comprehension framework, with token‑level labeling and density‑based span selection, achieving state‑of‑the‑art BLEU and ROUGE improvements on a large Meituan dialogue dataset.

BERT · Dialogue Summarization · machine reading comprehension

0 likes · 33 min read

DataFunTalk

May 22, 2022 · Artificial Intelligence

Advances in Information‑Flow Recommendation: Pre‑trained Models and Multimodal User‑Interface Modeling

This article reviews Huawei Noah's Ark Lab's work on modern information‑flow recommendation, covering the evolution from collaborative filtering to deep learning, the application of BERT‑based pre‑training for news ranking, multimodal user‑interface modeling, practical deployment challenges, and future research directions.

AI · BERT · Huawei

0 likes · 19 min read

Liulishuo Tech Team

May 20, 2022 · Artificial Intelligence

Multi‑Scale BERT‑Based Automated Essay Scoring: Architecture, Loss Functions, and Experimental Evaluation

This article surveys automated essay scoring (AES), compares handcrafted, deep‑learning, and pre‑trained language‑model approaches, proposes a multi‑scale BERT architecture with document, token, and segment features, introduces three combined loss functions, and demonstrates superior performance on the ASAP dataset and internal tasks.

ASAP dataset · BERT · Loss Functions

0 likes · 13 min read

Code DAO

May 19, 2022 · Artificial Intelligence

Semi‑Supervised Training Methods for Transformers

This article explains an end‑to‑end semi‑supervised training pipeline for Transformer‑based NLP models, detailing the unsupervised language‑model pre‑training, supervised fine‑tuning, and the internal architecture of embeddings, encoder layers, and downstream tasks such as text classification and NER.

BERT · Masked Language Model · NLP

0 likes · 9 min read

DataFunTalk

May 7, 2022 · Artificial Intelligence

Intelligent Recommendation Selling Point Generation: Architecture, Core AI Techniques, Model Development, and Product Impact

This article explains how JD's intelligent recommendation selling point system leverages NLP, BERT, Transformer and pointer‑generator models to automatically create short, personalized product highlights, describing the technical background, system architecture, model training pipeline, online/offline monitoring, and the resulting business benefits.

BERT · NLP · Recommendation Systems

0 likes · 13 min read

TAL Education Technology

Apr 14, 2022 · Artificial Intelligence

Intelligent Call Recording Quality Inspection Using Dual‑Mode Detection

This article proposes a dual‑mode detection solution for call‑recording quality inspection that combines rule‑based semantic similarity matching with BERT‑based sentence segmentation and RoBERTa multi‑label classification to achieve high accuracy, fast task adaptation, and strong generalization for customer‑service scenarios.

BERT · NLP · RoBERTa

0 likes · 7 min read

ELab Team

Mar 16, 2022 · Artificial Intelligence

Reverse Dictionary Made Easy: Harness WantWords and Hugging Face for Quick NLP Model Integration

This article introduces the open‑source WantWords reverse‑dictionary project, explains its token‑based processing pipeline, walks through Python implementation and model invocation with Hugging Face’s Transformers, reviews NLP’s historical evolution, and shows how front‑end developers can quickly integrate NLP models into products.

BERT · Hugging Face · Model Deployment

0 likes · 13 min read

Baobao Algorithm Notes

Mar 16, 2022 · Artificial Intelligence

How to Boost Kaggle NLP Scores with BERT, Tree Models, and Smart Post‑Processing

The article analyzes a recent Kaggle essay‑segmentation competition, explains why standard BERT‑based models plateau, and shows how a two‑stage pipeline that combines coarse BERT filtering with a feature‑rich tree model and post‑processing scaling can push scores well beyond the 70‑point barrier.

BERT · Kaggle · NLP

0 likes · 5 min read

Tencent Cloud Developer

Mar 3, 2022 · Artificial Intelligence

Model Distillation for Query-Document Matching: Techniques and Optimizations

We applied knowledge distillation to a video query‑document BERT matcher, compressing the 12‑layer teacher into production‑ready 1‑layer ALBERT and tiny TextCNN students using combined soft, hard, and relevance losses plus AutoML‑tuned hyper‑parameters, achieving sub‑5 ms latency and up to 2.4% AUC improvement over the original model.

ALBERT · AutoML · BERT

0 likes · 12 min read

Baobao Algorithm Notes

Jan 20, 2022 · Interview Experience

How to Ace Algorithm Interviews: Insider Tips, Sample Questions, and Evaluation Criteria

The article shares an interviewer's perspective on algorithm hiring, outlining five assessment dimensions—fundamentals, knowledge depth, breadth, business understanding, and communication—providing concrete question examples, a coding challenge, and practical communication tips to help candidates succeed.

BERT · algorithm interview · communication skills

0 likes · 9 min read

Baobao Algorithm Notes

Jan 14, 2022 · Artificial Intelligence

Boosting BERT Text Classification with Label Embedding: How It Works

The paper proposes a simple yet effective method that fuses label embeddings into BERT, enhancing text‑classification performance without increasing computational cost, and validates the approach across six benchmark datasets, also exploring TF‑IDF‑based label augmentation and the impact of using [SEP] versus no‑[SEP] inputs.

BERT · NLP · TF-IDF

0 likes · 8 min read

Baobao Algorithm Notes

Jan 14, 2022 · Artificial Intelligence

BERT Interview Q&A: Decoding CLS, Masks, Complexity, and More

An in‑depth Q&A breaks down core BERT concepts—from the purpose of the [CLS] token and masking strategies to self‑attention complexity, sparse attention tricks, subword handling of OOV words, warm‑up learning rates, GPT’s unidirectional nature, and ALBERT’s parameter sharing—providing concise explanations for each.

BERT · Masking · Self-Attention

0 likes · 7 min read

Beike Product & Technology

Jan 7, 2022 · Artificial Intelligence

Beike Real Estate NLP Team Wins First Place in CCIR Cup 2021 Intelligent Human‑Computer Interaction Track

The Beike Real Estate NLP team secured first place in the CCIR Cup 2021 Intelligent Human‑Computer Interaction track by applying semi‑supervised and transfer learning techniques to small‑sample intent recognition and slot filling, and also presented the large‑scale Mandarin dialect speech benchmark KeSpeech at NeurIPS 2021.

AI competition · BERT · NLP

0 likes · 5 min read

Baobao Algorithm Notes

Jan 7, 2022 · Interview Experience

Essential Transformer Interview Cheat Sheet: 11 Must‑Know Q&A

This concise guide presents eleven frequently asked Transformer interview questions with clear explanations in English covering self‑attention formulas, scaling, alternative designs, LayerNorm vs. BatchNorm, positional embeddings, multi‑head mechanisms, and BPE tokenization, helping candidates deliver solid, theory‑backed answers.

BERT · LayerNorm · NLP interview

0 likes · 6 min read

Ctrip Technology

Dec 30, 2021 · Artificial Intelligence

Semantic Matching Techniques for Intelligent Customer Service at Ctrip

This article presents Ctrip's intelligent customer service system, detailing the evolution of semantic matching methods from traditional lexical models to deep learning approaches such as BERT and ESIM, and describing multi‑stage retrieval, multilingual transfer learning, and KBQA techniques for improving query understanding and response accuracy.

BERT · NLP · customer service

0 likes · 16 min read

Baobao Algorithm Notes

Dec 23, 2021 · Artificial Intelligence

How Pre‑Training Evolved: From word2vec to MAE Across NLP & Vision

This article traces the evolution of deep‑learning pre‑training techniques, starting with word2vec in NLP, moving through ELMo and BERT, then shifting to computer‑vision models such as iGPT, ViT, BEiT, and MAE, highlighting key innovations, challenges, and the convergence of NLP and CV paradigms.

BERT · MAE · NLP

0 likes · 21 min read

Baobao Algorithm Notes

Dec 15, 2021 · Artificial Intelligence

Why Can BERT’s Token, Segment, and Position Embeddings Be Added? A Deep Dive into Positional Encoding

This article revisits the long‑standing question of why BERT’s token, segment, and position embeddings are summed, critiques earlier explanations, and presents findings from the ICLR‑2021 paper “Rethinking Positional Encoding in Language Pre‑training” that show removing the token‑position cross term speeds convergence and improves downstream GLUE scores.

BERT · Embedding · Language Pretraining

0 likes · 6 min read

Meituan Technology Team

Dec 9, 2021 · Artificial Intelligence

Fine-Grained Aspect-Based Sentiment Analysis for Meituan's To‑Restaurant Business

To enhance decision‑making for users and quality monitoring for merchants, Meituan’s to‑restaurant platform implements fine‑grained aspect‑based sentiment analysis that extracts dish, attribute, opinion and polarity tuples from reviews, employing both a BERT‑CRF pipeline and a joint Dual‑MRC model which raise F1 scores from 0.61 to 0.68, and are deployed in dashboards and review‑management tools, with future work targeting efficiency and broader four‑tuple extraction.

ABSA · BERT · NLP

0 likes · 28 min read

Meituan Technology Team

Dec 2, 2021 · Artificial Intelligence

Pretraining Techniques for Search Advertising Relevance at Meituan

Meituan improves search‑ad relevance by applying pre‑trained BERT models enhanced with data‑augmented samples, multi‑task learning, keyword extraction and two‑stage knowledge distillation, producing a lightweight distilled model that, when fused with traditional relevance signals, boosts CTR, lowers Badcase@5 and raises NDCG while preserving revenue.

BERT · Search · advertising relevance

0 likes · 30 min read

58 Tech

Nov 25, 2021 · Artificial Intelligence

Technical Evolution of the “Guess You Want” Recommendation Module in 58 Local Services

This article describes the design, multi‑stage recall strategies, and successive ranking model upgrades—including BERT‑based intent prediction, vector‑based DSSM recall, tag expansion, and multi‑task DeepFM/MMoE/ESMM architectures—that together reduce no‑result rates and significantly improve user conversion for 58's local service platform.

BERT · DSSM · Multi-Task Learning

0 likes · 16 min read

JD Retail Technology

Nov 16, 2021 · Artificial Intelligence

Intelligent Online Selling Point Extraction for E‑Commerce Recommendation (IOSPE) Wins AAAI 2022 Innovation Award

The IOSPE system, which uses BERT‑based scoring, transformer‑pointer generation, and personalized distribution to automatically extract and generate selling points for millions of e‑commerce products, earned the AAAI 2022 Artificial Intelligence Innovation Application Award and has boosted click‑through rates and user dwell time across JD.com platforms.

AI · BERT · Innovation Award

0 likes · 6 min read

DataFunTalk

Nov 16, 2021 · Artificial Intelligence

Hotel Search Relevance Modeling and Architecture at Fliggy (Alibaba)

This article presents a comprehensive overview of Fliggy's hotel search relevance system, covering the business background, multi‑scenario architecture, core factor estimation, entity recognition, text and spatial relevance modeling, multi‑scenario fusion, and future optimization directions.

AI · BERT · Ranking

0 likes · 17 min read

Meituan Technology Team

Oct 28, 2021 · Artificial Intelligence

Supply Standardization for Script‑Murder Business Using a Knowledge Graph

Meituan’s To‑Store Integrated data team built an end‑to‑end supply‑standardization pipeline for the rapidly growing script‑murder market. The pipeline extends the GENE knowledge graph to mine merchant supply, constructs a unified script library through rule‑based, semantic, and multimodal clustering, and links products and user‑generated content to standardized scripts. This enables a dedicated category, personalized recommendations, filter tags, and improved ranking.

BERT · Knowledge Graph · Multimodal Learning

0 likes · 23 min read

Meituan Technology Team

Sep 30, 2021 · Artificial Intelligence

Meituan's Intelligent Customer Service Technology and Practice

Meituan’s intelligent customer service platform serves over 630 million users and 7.7 million merchants. It integrates six core AI capabilities (problem recommendation, problem understanding, dialogue management, answer supply, response recommendation, and session summarization) across pre‑sale, in‑sale, after‑sale, and internal scenarios. The platform builds on multi‑turn dialogue, intent recognition, knowledge‑graph Q&A, and the Moses platform, and targets end‑to‑end and emotionally intelligent interactions in the future.

BERT · Dialogue Systems · Intelligent Customer Service

0 likes · 23 min read

58 Tech

Aug 19, 2021 · Artificial Intelligence

Practical NER Techniques for Business Chatbots on the 58.com Service Platform

This article presents a comprehensive case study of applying named‑entity‑recognition (NER) techniques to the smart chat assistant of 58.com’s yellow‑page service, covering business background, model selection (BiLSTM‑CRF, IDCNN‑CRF, BERT), data‑augmentation, focal loss, fusion of rule‑based and neural methods, context modeling, online performance, and future research directions.

BERT · CRF · Data Augmentation

0 likes · 16 min read

58 Tech

Aug 5, 2021 · Artificial Intelligence

Exploration and Practice of Text Representation Algorithms in the 58 Security Scenario

This article presents a comprehensive study of text representation techniques—from weighted word‑vector methods to supervised SimBert and unsupervised contrastive learning models—applied to large‑scale unstructured data in 58's information‑security workflows, evaluating their effectiveness for classification and content‑recall tasks.

BERT · Pretrained Models · SimCSE

0 likes · 11 min read

Ctrip Technology

Jul 29, 2021 · Artificial Intelligence

NLP Techniques for Classifying Ctrip Ticket Customer Service Conversations

This article presents the background, problem analysis, data preprocessing, modeling approaches and optimization results of applying various NLP methods—including statistical models, word embeddings, attention mechanisms and pretrained language models such as BERT—to improve the accuracy of classifying Ctrip ticket customer service dialogues.

BERT · NLP · Text Classification

0 likes · 13 min read

DataFunSummit

Jul 25, 2021 · Artificial Intelligence

Advances in Query Understanding and Semantic Retrieval at Zhihu Search

This article details Zhihu Search's engineering solutions for long‑tail query challenges, covering historical development, term weighting, synonym expansion, query rewriting with reinforcement learning, and semantic recall using BERT‑based models, while also outlining future research directions such as GAN‑based rewriting and lightweight pre‑training.

BERT · Embedding Retrieval · Query Rewriting

0 likes · 14 min read

58 Tech

Jul 5, 2021 · Artificial Intelligence

Construction of a Virtual Category‑Tag System for 58 Local Services Using Machine Learning

This article describes the end‑to‑end design and implementation of a virtual category‑tag framework for 58 local services, detailing data preparation, tag selection via semantic similarity models, tag mounting, synonym normalization, experimental comparisons of CDSSM, MatchPyramid, BERT, RoBERTa and other techniques, and outlines future improvements.

BERT · Tagging · synonym normalization

0 likes · 16 min read

DataFunTalk

Jun 20, 2021 · Artificial Intelligence

Iterative Development and Applications of Meituan Takeaway Food Knowledge Graph

This article systematically introduces the architecture, iterative improvements, modeling techniques, and practical applications of Meituan's food knowledge graph, covering category taxonomy, standard dish names, basic and thematic attributes, health‑meal detection, dish entity alignment, and downstream recommendation and search use cases.

AI · BERT · Food Recommendation

0 likes · 18 min read

58 Tech

Jun 18, 2021 · Artificial Intelligence

Bidding Document Classification and Entity Extraction Using BERT-based Models

This article describes how 58.com built an end‑to‑end bidding service that crawls tender documents, classifies them into multiple categories with BERT‑based models (including softmax, sigmoid and ensemble approaches) and extracts key entities using BERT‑CRF and reading‑comprehension techniques, achieving over 90% overall accuracy and dramatically improving recall and precision.

BERT · NLP · document classification

0 likes · 15 min read

58 Tech

Jun 16, 2021 · Artificial Intelligence

Improving Text Matching Accuracy in Voice Assistants: Experiments with Siamese Networks, BERT Models, and Advanced Tricks

This article evaluates classic Siamese networks, various BERT‑based pretrained models, and several training tricks such as adversarial training, k‑fold cross‑validation, and model ensembling on both a public similarity‑sentence competition dataset and an internal voice‑assistant standard question matching dataset, ultimately raising accuracy from 97.23% to 99.5%.

BERT · Siamese Network · Voice Assistant

0 likes · 15 min read

DataFunTalk

Jun 6, 2021 · Artificial Intelligence

ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

ConSERT introduces a contrastive self‑supervised framework that enhances BERT‑derived sentence embeddings by applying efficient embedding‑level data augmentations, achieving significant improvements on semantic textual similarity tasks, especially in low‑resource settings, and outperforming previous state‑of‑the‑art methods.

BERT · contrastive learning · self-supervised

0 likes · 20 min read

DataFunSummit

Jun 5, 2021 · Artificial Intelligence

Compression Techniques for BERT: Analysis, Quantization, Pruning, Distillation, and Structure‑Preserving Methods

This article reviews BERT’s architecture, analyzes the storage and compute costs of each layer, and systematically presents compression methods—including quantization, pruning, knowledge distillation (Distilled BiLSTM and MobileBERT), and structure‑preserving techniques—aimed at enabling efficient deployment on resource‑constrained mobile devices.

BERT · Mobile Deployment · Quantization

0 likes · 15 min read

Meituan Technology Team

Jun 3, 2021 · Artificial Intelligence

ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

ConSERT is a contrastive self‑supervised framework that fine‑tunes BERT with augmented sentence views and NT‑Xent loss to overcome embedding collapse, delivering roughly 8% higher STS performance than prior methods, remaining robust in few‑shot and supervised scenarios, and now deployed in Meituan’s NLP pipelines.

BERT · NLP · contrastive learning

0 likes · 20 min read

DataFunTalk

Jun 3, 2021 · Artificial Intelligence

Compression Techniques for BERT: Analysis, Quantization, Pruning, Distillation, and Structure-Preserving Methods

This article examines the internal structure of BERT and systematically presents various model‑compression strategies—including quantization, pruning, knowledge distillation, and structure‑preserving techniques—highlighting their impact on storage, computational cost, and inference speed for deployment on resource‑constrained mobile devices.

BERT · Quantization · knowledge distillation

0 likes · 16 min read

Meituan Technology Team

May 27, 2021 · Artificial Intelligence

Iterative Development and Applications of Meituan Takeaway Food Knowledge Graph

The Meituan Takeaway Food Knowledge Graph iteratively builds a hierarchical tag taxonomy, standardizes dish names, extracts basic and theme attributes, aligns online‑offline entities using CNN‑CRF, BERT and hybrid models, and powers combo, interactive and search recommendations while planning scene‑specific tags and graph‑based personalization.

BERT · Food Recommendation · Knowledge Graph

0 likes · 19 min read

58 Tech

May 24, 2021 · Artificial Intelligence

Tag Extraction for 58 Yellow Pages Posts Using Sequence Labeling and Model Optimization

This article describes a complete solution for extracting and normalizing tags from 58 Yellow Pages service posts, covering candidate word acquisition, sequence‑labeling models such as CRF and BERT‑CRF, hierarchical softmax optimization for massive label spaces, and experimental results on both post content and user reviews.

BERT · CRF · hierarchical softmax

0 likes · 20 min read

TiPaiPai Technical Team

May 14, 2021 · Artificial Intelligence

Mastering Text Matching: From SentenceBERT to Contrastive Learning

This article explores the landscape of text matching in NLP, covering problem types, three model interaction levels, sentence embedding techniques, supervised and unsupervised approaches, and the role of contrastive learning with alignment and uniformity metrics.

BERT · NLP · contrastive learning

0 likes · 12 min read

Cyber Elephant Tech Team

Apr 28, 2021 · Artificial Intelligence

Understanding BERT: From Encoder-Decoder to Transformer and Attention

This article explains the BERT model by first reviewing the Encoder-Decoder framework, then detailing the attention mechanism—including self-attention and multi-head attention—before describing the full Transformer architecture and finally outlining BERT’s encoder-only design, training stages, and fine-tuning applications.

BERT · Encoder-Decoder · NLP

0 likes · 15 min read

NetEase Media Technology Team

Apr 13, 2021 · Artificial Intelligence

Applying BERT for News Timeliness Classification at NetEase

The article describes how NetEase adapts a pre‑trained BERT model to classify news articles into ultra‑short, short, or long timeliness categories by combining rule‑based strong and weak time cues, key‑sentence extraction, domain‑embedding fusion and multi‑layer semantic aggregation, achieving accurate and interpretable predictions for its platform.

BERT · Model Fusion · NLP

0 likes · 12 min read

58 Tech

Mar 29, 2021 · Artificial Intelligence

Deep Semantic Model Exploration and Application in 58 Search

This article presents a comprehensive overview of 58 Search's multi‑stage retrieval system, compares term‑match and semantic matching, details the design, training, and optimization of interactive, dual‑tower, and semi‑interactive BERT‑based semantic models, and discusses their practical deployment in ranking and recall stages.

AI · BERT · Information Retrieval

0 likes · 18 min read

Youku Technology

Mar 23, 2021 · Artificial Intelligence

Text-Video Alignment Algorithm for Automated Short Video Production at Youku

Youku’s new text‑video alignment system automatically generates short video summaries by extracting multimodal video and linguistic features, matching sentences to clips through embedding and tag‑level models, and enabling AI‑driven auto‑editing that cuts production time from days to minutes.

BERT · NLP · cross-modal matching

0 likes · 10 min read

360 Smart Cloud

Mar 4, 2021 · Artificial Intelligence

Optimizing BERT Online Service Deployment at 360 Search

This article describes the challenges of deploying a large BERT model as an online service for 360 Search and details engineering optimizations—including framework selection, model quantization, knowledge distillation, stream scheduling, caching, and dynamic sequence handling—that dramatically improve latency, throughput, and resource utilization.

BERT · FP16 quantization · GPU Optimization

0 likes · 12 min read

360 Tech Engineering

Mar 1, 2021 · Artificial Intelligence

Deploying BERT as an Online Service: Challenges and Optimizations at 360 Search

This article details the engineering challenges of serving a large BERT model in real‑time for 360 Search and describes a series of optimizations—including TensorRT‑based kernel fusion, model quantization, knowledge distillation, multi‑stream execution, caching, and dynamic sequence handling—that together achieve low latency, high throughput, and stable deployment on GPU clusters.

BERT · GPU · Model Optimization

0 likes · 10 min read

58 Tech

Mar 1, 2021 · Artificial Intelligence

Intelligent QABot for 58.com: Classification and Retrieval Model Exploration

This article describes how 58.com’s AI Lab built and continuously improved the QABot intelligent customer‑service system by designing classification and retrieval models, evaluating FastText, LSTM‑DSSM, BERT and a self‑developed SPTM framework, and finally fusing them to boost answer rates and user experience.

AI Chatbot · BERT · Model Fusion

0 likes · 9 min read

DataFunTalk

Feb 20, 2021 · Artificial Intelligence

Industrial-Scale Machine Translation at Bytedance: Applications, Demos, and Research Advances

This article presents Bytedance's industrial machine‑translation platform, describing its global deployment, diverse product demos, underlying sequence‑to‑sequence models, BERT‑enhanced training strategies, prune‑tune sparsity techniques, multilingual pre‑training, document translation, and a high‑performance inference engine.

BERT · Machine Translation · multilingual NLP

0 likes · 19 min read

Sohu Tech Products

Feb 17, 2021 · Artificial Intelligence

Improving BERT Pre‑training with RealFormer: Principles, Implementation, and Empirical Evaluation

This article analyzes the RealFormer modification to the Transformer architecture, details its implementation in BERT, and presents extensive experiments showing that while RealFormer can boost performance on low‑label‑count classification tasks, its benefits diminish or disappear as the number of classes grows.

BERT · RealFormer · Residual

0 likes · 12 min read

DataFunTalk

Feb 14, 2021 · Artificial Intelligence

TurboTransformers: An Efficient GPU Serving System for Transformer Models

TurboTransformers introduces a suite of GPU‑centric optimizations—including a high‑throughput batch reduction algorithm, a variable‑length‑aware memory allocator, and a dynamic‑programming‑based batch scheduling strategy—that together deliver significantly lower latency and higher throughput for Transformer‑based NLP services compared with existing frameworks such as PyTorch, TensorFlow, ONNX Runtime and TensorRT.

BERT · Dynamic Batching · GPU inference

0 likes · 13 min read

58 Tech

Jan 29, 2021 · Artificial Intelligence

Optimization Practices for Business Opportunity Slot Recognition in 58.com Intelligent Customer Service

This article details the background, challenges, architecture, model selection, and future directions of the business‑opportunity slot recognition module used in 58.com’s intelligent customer service, highlighting how regex‑model fusion and IDCNN‑CRF improve entity extraction for phone, WeChat, address, and time slots.

BERT · CRF · IDCNN

0 likes · 11 min read

DataFunTalk

Jan 15, 2021 · Artificial Intelligence

Zhihu Search Text Relevance Evolution and BERT Knowledge Distillation Practices

This talk by Zhihu search algorithm engineer Shen Zhan details the evolution of text relevance models from TF‑IDF/BM25 to deep semantic matching and BERT, explains the challenges of deploying BERT at scale, and describes practical knowledge‑distillation techniques that improve both online latency and offline storage while maintaining search quality.

BERT · Semantic Retrieval · knowledge distillation

0 likes · 14 min read

Amap Tech

Dec 30, 2020 · Artificial Intelligence

LRC-BERT: Contrastive Learning based Knowledge Distillation with COS‑NCE Loss for Efficient NLP Models

The Amap team introduced LRC‑BERT, a contrastive‑learning‑based knowledge‑distillation framework that employs a novel COS‑NCE loss, gradient‑perturbation, and a two‑stage training schedule, enabling a 4‑layer student model to retain about 97% of BERT‑Base accuracy while being 7.5× smaller and 9.6× faster, and it has already improved real‑world traffic‑event extraction performance.

BERT · COS-NCE loss · NLP

0 likes · 16 min read

DataFunTalk

Dec 28, 2020 · Artificial Intelligence

Intelligent Question Answering: Scenarios, Architecture, and Technical Implementations (QA, Knowledge‑Graph QA, NL2SQL)

This article introduces the typical applications of intelligent question answering, compares chat‑type, knowledge‑type and task‑type bots, and then details the end‑to‑end architecture, knowledge‑base construction, semantic‑equivalence modeling with BERT‑BIMPM, knowledge‑graph QA pipelines, and NL2SQL techniques, concluding with practical deployment insights.

AI · BERT · Dialogue Systems

0 likes · 15 min read

DataFunSummit

Dec 27, 2020 · Artificial Intelligence

Sequence Labeling in Natural Language Processing: Definitions, Tag Schemes, Model Choices, and Practical Implementation

This article provides a comprehensive overview of sequence labeling tasks in NLP, covering their definition, common tag schemes (BIO, BIEO, BIESO), comparisons with other NLP tasks, major modeling approaches such as HMM, CRF, RNN and BERT, real‑world applications like POS tagging, NER, event extraction and gene analysis, and a step‑by‑step PyTorch implementation with dataset preparation, training pipeline, and evaluation metrics.

BERT · CRF · HMM

0 likes · 27 min read

DataFunTalk

Dec 25, 2020 · Artificial Intelligence

Exploring Pretraining Model Optimization and Deployment Challenges in NLP

This article reviews the evolution of pretraining models in NLP, discusses the practical challenges of deploying large models such as inference latency, knowledge integration, and task adaptation, and presents Xiaomi’s optimization techniques including knowledge distillation, low‑precision inference, operator fusion, and multi‑granularity segmentation for dialogue systems.

BERT · Dialogue Systems · Inference Optimization

0 likes · 15 min read

Meituan Technology Team

Dec 3, 2020 · Artificial Intelligence

Meituan Knowledge Graph Group's Six Papers Accepted at CIKM 2020

Meituan’s search and NLP team announced that six knowledge‑graph papers—covering query‑aware tip generation, BERT‑based ranking, multi‑modal and sequential recommendation, conversational recommendation, and graph‑embedding for personalized product search—were accepted at CIKM 2020, resulting from university collaborations and already deployed to boost Meituan’s search, recommendation and product‑search services.

BERT · CIKM 2020 · Knowledge Graph

0 likes · 13 min read

JD Tech Talk

Dec 3, 2020 · Artificial Intelligence

Consumer Behavior Cause Extraction with BERT Fine‑tuning and a Novel Sequence‑Labeling Framework (ICDM 2020 Winning Solution)

At ICDM 2020, the JD Digits Silicon Valley team achieved top results in the Knowledge Graph Contest by fine‑tuning BERT and introducing a novel sequence‑labeling framework that jointly extracts consumer behavior types and their underlying reasons, leveraging CRF decoding and model ensemble for superior performance.

BERT · CRF · ICDM 2020

0 likes · 11 min read

Sohu Tech Products

Nov 4, 2020 · Artificial Intelligence

Understanding BERT: Architecture, Pre‑training, Fine‑tuning and Applications in Modern NLP

This article provides a comprehensive overview of BERT and related NLP advances, covering its historical context, model architecture, input‑output mechanisms, comparisons with CNNs, word‑embedding evolution, pre‑training strategies like MLM and next‑sentence prediction, and practical guidance for fine‑tuning and feature extraction.

BERT · NLP · Transformer

0 likes · 17 min read

DataFunTalk

Oct 24, 2020 · Artificial Intelligence

FinBERT 1.0: An Open‑Source Chinese Financial Domain Pre‑trained BERT Model and Its Evaluation

FinBERT 1.0 is an open‑source Chinese BERT model pre‑trained on large‑scale financial corpora that achieves 2‑5% F1 improvements across multiple downstream fintech tasks without additional tuning, demonstrating the value of domain‑specific pre‑training for natural language processing.

BERT · Chinese · Domain Adaptation

0 likes · 14 min read

Tencent Cloud Developer

Sep 23, 2020 · Artificial Intelligence

NLP Model Interpretability: White-box and Black-box Methods and Business Applications

The article reviews NLP interpretability techniques, contrasting white‑box approaches that probe model internals such as neuron analysis, diagnostic classifiers, and attention with black‑box strategies like rationales, adversarial testing, and local surrogates, and argues that black‑box methods are generally more practical for business deployment despite offering shallower insights.

Attention Mechanism · BERT · LIME

0 likes · 12 min read

DataFunTalk

Sep 23, 2020 · Artificial Intelligence

From Word Embedding to BERT: A Comprehensive Overview of Pre‑training Model Development in NLP

This article surveys the evolution of pre‑training models for natural language processing, detailing model architectures such as Encoder‑AE, Decoder‑AR, Encoder‑Decoder, Prefix LM, and PLM, analyzing why models like RoBERTa, T5, and GPT‑3 excel, and offering practical guidance for building strong pre‑training systems.

BERT · Language Models · NLP

0 likes · 47 min read

58 Tech

Sep 21, 2020 · Artificial Intelligence

58.com AI Algorithm Competition: Winning Teams and Their Technical Solutions

The 58.com AI Algorithm Competition focused on intelligent customer‑service technology: 158 teams competed on text classification and matching tasks, and the top five teams presented their BERT, ELECTRA, focal‑loss, and multi‑model fusion solutions in detail. The article also covers the award ceremony, video recordings, and PPT resources.

AI · BERT · ELECTRA

0 likes · 9 min read

Tencent Cloud Developer

Sep 3, 2020 · Artificial Intelligence

CTR Prediction Optimization for App Store Recommendation: Integrating DeepWalk, BERT, and Attention Mechanisms

The paper presents an optimized CTR prediction model for Tencent’s App Store that merges multi‑behavior shared embeddings, long‑term DeepWalk graph embeddings, BERT‑derived app description vectors, and attention‑based fusion, reducing parameters while improving bias, AUC, and recommendation performance for sparse, long‑tail data.

BERT · CTR Prediction · DeepWalk

0 likes · 9 min read

Meituan Technology Team

Aug 20, 2020 · Artificial Intelligence

DR-BERT: Enhancing BERT-based Document Ranking with Task-adaptive Training and OOV Matching

DR‑BERT boosts BERT‑based document ranking on the MS MARCO benchmark by applying domain‑adaptive pretraining, a two‑stage fine‑tuning pipeline (pointwise then listwise), and OOV‑aware mechanisms—including exact‑match features and word‑recovery of sub‑tokens—achieving the first MRR@10 above 0.4 and leading the leaderboard.

BERT · DR-BERT · MS MARCO

0 likes · 16 min read

Sohu Tech Products

Aug 19, 2020 · Artificial Intelligence

ASR Error Correction with BERT, ELECTRA and a Fuzzy‑Phoneme Generator: Techniques from Xiaomi AI

This article describes how Xiaomi's AI team tackles Automatic Speech Recognition (ASR) query errors by analyzing error patterns, employing BERT, ELECTRA and a soft‑masked BERT model, generating synthetic noisy data with a fuzzy‑phoneme generator, and presenting experimental results and future research directions.

ASR · BERT · ELECTRA

0 likes · 18 min read

DataFunTalk

Aug 6, 2020 · Artificial Intelligence

Practical Applications of Pretrained Language Models (BERT, GPT, ELMo) in NetEase Yanxuan NLP Tasks

The article reviews the principles of popular pretrained language models, compares their architectures, and details how NetEase Yanxuan applied BERT, GPT and ELMo to classification, matching, sequence labeling and generation tasks, presenting experimental results and deployment insights.

BERT · NLP · Text Classification

0 likes · 20 min read

iQIYI Technical Product Team

Jul 24, 2020 · Artificial Intelligence

Fine‑grained Character Sentiment Analysis in Scripts: Models, Challenges, and Future Directions

The article surveys fine‑grained character sentiment analysis for script evaluation, detailing traditional, target‑dependent and aspect‑level methods, describing iQIYI’s BERT‑TD‑LSTM and CNN architectures, addressing challenges such as character name recognition and long‑range context, and outlining future improvements after a Parasite case study.

BERT · LSTM · NLP

0 likes · 19 min read
