Tagged articles
149 articles
AI Engineer Programming
Apr 26, 2026 · Artificial Intelligence

From Bag‑of‑Words to Semantics: How Embeddings Turn Meaning into Numbers (Part 2)

The article explains how embedding techniques encode semantic information into numeric vectors, covering Word2Vec and GloVe fundamentals, BERT anisotropy, SimCSE contrastive learning, alignment and uniformity metrics, ANN index structures such as HNSW, IVF and PQ, Matryoshka representation learning, practical deployment challenges, and evaluation best practices.

ANN · BERT · Embedding
0 likes · 23 min read
Bighead's Algorithm Notes
Mar 14, 2026 · Artificial Intelligence

Quantitative Finance Paper Digest: AI‑Driven Market Prediction Studies (Mar 7‑13 2026)

This digest summarizes four recent research papers that apply advanced AI techniques—node‑transformer graphs with BERT sentiment analysis, a quantum‑classical LSTM‑Born machine hybrid, large‑language‑model benchmarking for portfolio optimization, and a conditional diffusion model—to improve stock market prediction, volatility forecasting, and investment decision making, providing detailed experimental results and statistical validation.

BERT · Quantum Computing · Transformer
0 likes · 10 min read
JD Tech
Dec 30, 2025 · Artificial Intelligence

How a Semi‑Supervised Unified Framework Boosts E‑commerce Query Intent Classification

The paper introduces a semi‑supervised, extensible unified framework (SSUF) that integrates knowledge, label, and structural enhancements to overcome data sparsity, label bias, and fragmented sub‑tasks in e‑commerce query intent prediction, achieving superior offline and online performance.

BERT · GCN · Semi-supervised Learning
0 likes · 14 min read
Instant Consumer Technology Team
Aug 15, 2025 · Artificial Intelligence

Master the iFLYTEK Prohibited Words Classification Challenge: Baselines & BERT

This article introduces the iFLYTEK AI Developer Competition on prohibited‑word classification, outlines the task, dataset, evaluation metric, and provides three baseline solutions—including a logistic‑regression model, a BERT fine‑tuning approach, and a large‑model prompt method—along with code snippets and performance notes.

BERT · NLP · competition
0 likes · 15 min read
Architects' Tech Alliance
Jun 11, 2025 · Artificial Intelligence

From Transformers to DeepSeek‑R1: The 2017‑2025 Evolution of Large Language Models

This article chronicles the rapid development of large language models from the 2017 Transformer breakthrough through the rise of BERT, GPT‑3, ChatGPT, multimodal systems like GPT‑4V/o, and the recent cost‑efficient DeepSeek‑R1, highlighting key architectural innovations, scaling trends, alignment techniques, and their transformative impact on AI research and industry.

AI Alignment · BERT · Cost‑Efficient Inference
0 likes · 26 min read
DataFunSummit
Jul 22, 2024 · Artificial Intelligence

From BERT to LLM: Language Model Applications in 360 Advertising Recommendation

This talk explores how 360's advertising recommendation system leverages language models—from BERT to large‑scale LLMs—to improve user interest modeling, feature extraction, and conversion‑rate prediction, detailing practical challenges, engineering solutions, experimental results, and future research directions.

Advertising · BERT · LLM
0 likes · 18 min read
Airbnb Technology Team
Jan 31, 2024 · Artificial Intelligence

Airbnb’s Listing Attribute Extraction Platform (LAEP): End-to-End Structured Information Extraction Using Machine Learning and NLP

Airbnb’s Listing Attribute Extraction Platform (LAEP) uses a custom NER model, word‑embedding mapping, and a BERT‑based scorer to automatically pull, normalize, and validate structured attributes from hosts’ unstructured text, boosting coverage for downstream tools and enhancing guest‑host matching at scale.

Airbnb · BERT · NER
0 likes · 11 min read
Rare Earth Juejin Tech Community
Dec 13, 2023 · Artificial Intelligence

Comprehensive Overview of BERT: Architecture, Pre‑training Tasks, and Applications

This article provides a detailed introduction to BERT, covering its bidirectional transformer encoder design, pre‑training objectives such as Masked Language Modeling and Next Sentence Prediction, model configurations, differences from GPT/ELMo, and a wide range of downstream NLP applications.

BERT · Masked Language Model · NLP
0 likes · 17 min read
Rare Earth Juejin Tech Community
Dec 4, 2023 · Artificial Intelligence

An Overview of BERT: Architecture, Pre‑training Tasks, Comparisons, and Applications

This article provides a comprehensive English overview of BERT, covering its original paper, model architecture, pre‑training objectives (Masked Language Model and Next Sentence Prediction), differences from ELMo, GPT and vanilla Transformers, parameter counts, main contributions, and a range of NLP application scenarios such as text classification, sentiment analysis, NER, and machine translation.

BERT · NLP · Next Sentence Prediction
0 likes · 16 min read
Baidu Geek Talk
Nov 2, 2023 · Artificial Intelligence

AI-Powered Code Defect Detection: Leveraging Code Knowledge Graphs and Large Language Models

The paper presents an AI‑driven static analysis framework that builds code knowledge graphs to extract relevant slices and leverages large language models for multilingual defect prediction, achieving up to 80% F1, detecting 662 defects across 1,100 C++ modules with a 26.9% recall gain over traditional rule‑based scanners.

BERT · Software quality · code defect detection
0 likes · 9 min read
Zhuanzhuan Tech
Oct 11, 2023 · Artificial Intelligence

Building a ChatGPT‑Based Intelligent Customer Service System with BERT Classification and Knowledge Filtering

This article describes how to construct an intelligent customer‑service assistant using ChatGPT for natural‑language understanding, BERT for user‑question classification, and Sentence‑BERT for knowledge‑selection, detailing system architecture, prompt design, model training, performance results, and practical cost reductions.

BERT · ChatGPT · Intelligent Customer Service
0 likes · 16 min read
Baidu Tech Salon
Sep 20, 2023 · Artificial Intelligence

Live Session: Introduction to NVIDIA Nsight Systems and Compute for AI Performance Analysis

In a live session, NVIDIA senior deep‑learning solutions architect Zhai Jian demonstrates how to use Nsight Systems and Nsight Compute to analyze a simple neural‑network training workload, accelerate BERT with mixed precision, and examine matrix‑transpose kernels, with registration via QR code and a detailed event schedule.

AI tools · BERT · GPU performance
0 likes · 2 min read
HelloTech
Sep 13, 2023 · Artificial Intelligence

AI Platform‑Powered Automated Ticket Routing: Modeling Workflow, Feature Engineering, and Intent Recognition

The Haro AI platform automates customer‑service ticket routing with a four‑step pipeline (feature processing, model training, evaluation, and deployment) that uses BERT/ALBERT‑based intent recognition, configurable feature storage, AutoML or expert modes, and FaaS‑style deployment. The Universal Ticket System case study demonstrates the resulting gains in routing accuracy and efficiency.

AI Platform · ALBERT · BERT
0 likes · 11 min read
Sohu Tech Products
Jul 26, 2023 · Artificial Intelligence

Attention Mechanism, Transformer Architecture, and BERT: An In-Depth Overview

This article provides a comprehensive overview of the attention mechanism, its mathematical foundations, the transformer model architecture—including encoder and decoder components—and the BERT pre‑training model, detailing their principles, implementations, and applications in natural language processing.

Attention Mechanism · BERT · Encoder-Decoder
0 likes · 13 min read
HomeTech
Jul 7, 2023 · Artificial Intelligence

Multi-Modal Video Understanding and AIGC Video Generation at Autohome

This article presents a comprehensive multi-modal video understanding system for AIGC video generation, detailing technical architecture, GCN-based semi-supervised learning, and practical applications across automotive content scenarios.

AIGC · BERT · NeXtVLAD
0 likes · 8 min read
Xianyu Technology
Feb 22, 2023 · Artificial Intelligence

Integrating Retrieval and Generation Tasks for Deep Semantic Matching in Xianyu Search

The paper introduces SimBert, a late‑fusion model that jointly trains a dual‑tower retrieval component and an auxiliary generation task on the item tower, using a two‑stage pre‑training and fine‑tuning pipeline, which yields a 3.6% relevance boost and reduces bad‑case rates in Xianyu search.

BERT · multi-task training · retrieval-generation
0 likes · 8 min read
DataFunSummit
Feb 3, 2023 · Artificial Intelligence

Interactive BERT for Relevance in Health E‑commerce Search

This article presents an in‑depth exploration of an interactive BERT‑based relevance model for health e‑commerce search, detailing the business context, query and product feature extraction, domain‑specific sample generation, model architecture enhancements, offline and online performance gains, and practical deployment through knowledge distillation.

AI · BERT · Semantic Modeling
0 likes · 14 min read
DataFunTalk
Jan 11, 2023 · Artificial Intelligence

Exploring Interactive BERT for Relevance in Health E‑commerce Search

This article presents a comprehensive overview of Alibaba Health's interactive BERT approach for improving relevance in health e‑commerce search, covering business background, model design, domain‑specific data construction, knowledge‑distilled twin‑tower deployment, experimental results, and a detailed Q&A session.

AI · BERT · Semantic Modeling
0 likes · 14 min read
21CTO
Dec 28, 2022 · Artificial Intelligence

Why Google Is Rerouting Its Teams to Compete with ChatGPT

Google’s CEO Sundar Pichai has ordered a rapid shift of resources toward AI, pulling staff from various projects to counter OpenAI’s ChatGPT, while senior engineers discuss the company’s own language models like LaMDA, BERT, and MUM and the future of search.

AI · BERT · ChatGPT
0 likes · 5 min read
Ctrip Technology
Nov 10, 2022 · Artificial Intelligence

Improving Search Intent Recognition and Term Weighting with Deep Learning and Model Distillation at Ctrip

This article describes how Ctrip's R&D team applied deep‑learning models, BERT‑based embeddings, knowledge distillation, and term‑weighting techniques to enhance e‑commerce search intent recognition and term importance estimation, achieving high accuracy while meeting sub‑10 ms latency requirements.

BERT · Deep Learning · Search
0 likes · 12 min read
ELab Team
Sep 23, 2022 · Artificial Intelligence

Fine‑Tune a Chinese BERT Model for Cloze Tasks in 30 Minutes

This tutorial walks you through NLP fundamentals, the evolution of BERT, the concept of pre‑trained models, and a step‑by‑step guide to fine‑tune a Chinese BERT on a cloze‑style task, complete with code snippets and verification results.

BERT · Chinese · Cloze Task
0 likes · 13 min read
Zuoyebang Tech Team
Sep 15, 2022 · Artificial Intelligence

How We Replaced BERT with a Lightweight TextCNN to Slash GPU Costs

This article describes the production challenges of using BERT for large‑scale text classification at Zuoyebang, explores lightweight alternatives such as knowledge distillation, pruning and quantization, and details a teacher‑student‑active‑learning pipeline that trains a TextCNN model to match BERT performance while dramatically reducing GPU consumption and improving throughput.

BERT · Model Deployment · NLP
0 likes · 13 min read
JD Cloud Developers
Aug 15, 2022 · Artificial Intelligence

How FCA Doubles BERT’s Inference Speed with Less Than 1% Accuracy Loss

This article explains how the Fine‑ and Coarse‑Granularity Hybrid Self‑Attention (FCA) mechanism reduces BERT’s computational cost by over 50% while keeping accuracy loss under 1%, detailing the method, experimental results, and its significance for efficient large‑scale language models.

BERT · Deep Learning · FCA
0 likes · 8 min read
DataFunSummit
Jul 7, 2022 · Artificial Intelligence

Discovering and Enhancing Robustness in Low‑Resource Information Extraction

This article examines the robustness challenges of information extraction tasks such as NER and relation extraction, introduces the Entity Coverage Ratio metric, analyzes why pretrained models like BERT may “take shortcuts,” and proposes evaluation tools and training strategies—including mutual‑information‑based methods, negative‑training, and flooding—to improve model robustness across diverse scenarios.

BERT · Evaluation Metrics · Robustness
0 likes · 12 min read
Meituan Technology Team
Jul 6, 2022 · Artificial Intelligence

Improving Search Relevance at Dianping

The article details Meituan‑Dianping's search relevance pipeline, describing how multi‑similarity matrix structures, multi‑stage domain‑adaptive training, POI field summarization, and online inference optimizations together improve a BERT‑based relevance model's offline metrics and reduce the BadCase rate in production.

BERT · Meituan · multi-similarity matrix
0 likes · 31 min read
Ctrip Technology
Jun 16, 2022 · Artificial Intelligence

Entity Linking System for Travel Knowledge Graph at Ctrip AI R&D

The article presents the Ctrip travel AI team's end‑to‑end entity linking solution built on a large‑scale tourism knowledge graph, detailing its background, technical architecture, core modules (mention detection, candidate generation, and disambiguation using BERT and prefix‑tree techniques), and real‑world applications such as search, intelligent customer service, and POI data maintenance.

BERT · Knowledge Graph · NLP
0 likes · 18 min read
Meituan Technology Team
May 26, 2022 · Artificial Intelligence

Span-Level Dialogue Summarization via Distant Supervision and Machine Reading Comprehension (DSMRC‑S)

The paper reviews classic summarization models, then proposes DSMRC‑S, a span-level extractive dialogue summarization method using distant supervision and a machine‑reading‑comprehension framework, with token‑level labeling and density‑based span selection, achieving state‑of‑the‑art BLEU and ROUGE improvements on a large Meituan dialogue dataset.

BERT · Dialogue Summarization · machine reading comprehension
0 likes · 33 min read
DataFunTalk
May 22, 2022 · Artificial Intelligence

Advances in Information‑Flow Recommendation: Pre‑trained Models and Multimodal User‑Interface Modeling

This article reviews Huawei Noah's Ark Lab's work on modern information‑flow recommendation, covering the evolution from collaborative filtering to deep learning, the application of BERT‑based pre‑training for news ranking, multimodal user‑interface modeling, practical deployment challenges, and future research directions.

AI · BERT · Huawei
0 likes · 19 min read
Liulishuo Tech Team
May 20, 2022 · Artificial Intelligence

Multi‑Scale BERT‑Based Automated Essay Scoring: Architecture, Loss Functions, and Experimental Evaluation

This article surveys automated essay scoring (AES), compares handcrafted, deep‑learning, and pre‑trained language‑model approaches, proposes a multi‑scale BERT architecture with document, token, and segment features, introduces three combined loss functions, and demonstrates superior performance on the ASAP dataset and internal tasks.

ASAP dataset · BERT · Loss Functions
0 likes · 13 min read
Code DAO
May 19, 2022 · Artificial Intelligence

Semi‑Supervised Training Methods for Transformers

This article explains an end‑to‑end semi‑supervised training pipeline for Transformer‑based NLP models, detailing the unsupervised language‑model pre‑training, supervised fine‑tuning, and the internal architecture of embeddings, encoder layers, and downstream tasks such as text classification and NER.

BERT · Fine-tuning · Masked Language Model
0 likes · 9 min read
DataFunTalk
May 7, 2022 · Artificial Intelligence

Intelligent Recommendation Selling Point Generation: Architecture, Core AI Techniques, Model Development, and Product Impact

This article explains how JD's intelligent recommendation selling point system leverages NLP, BERT, Transformer and pointer‑generator models to automatically create short, personalized product highlights, describing the technical background, system architecture, model training pipeline, online/offline monitoring, and the resulting business benefits.

BERT · NLP · Recommendation Systems
0 likes · 13 min read
TAL Education Technology
Apr 14, 2022 · Artificial Intelligence

Intelligent Call Recording Quality Inspection Using Dual‑Mode Detection

This article proposes a dual‑mode detection solution for call‑recording quality inspection that combines rule‑based semantic similarity matching with BERT‑based sentence segmentation and RoBERTa multi‑label classification to achieve high accuracy, fast task adaptation, and strong generalization for customer‑service scenarios.

BERT · NLP · RoBERTa
0 likes · 7 min read
ELab Team
Mar 16, 2022 · Artificial Intelligence

Reverse Dictionary Made Easy: Harness WantWords and Hugging Face for Quick NLP Model Integration

This article introduces the open‑source WantWords reverse‑dictionary project, explains its token‑based processing pipeline, walks through Python implementation and model invocation with Hugging Face’s Transformers, reviews NLP’s historical evolution, and shows how front‑end developers can quickly integrate NLP models into products.

BERT · Hugging Face · Model Deployment
0 likes · 13 min read
Tencent Cloud Developer
Mar 3, 2022 · Artificial Intelligence

Model Distillation for Query-Document Matching: Techniques and Optimizations

We applied knowledge distillation to a video query‑document BERT matcher, compressing the 12‑layer teacher into production‑ready 1‑layer ALBERT and tiny TextCNN students using combined soft, hard, and relevance losses plus AutoML‑tuned hyper‑parameters, achieving sub‑5 ms latency and up to 2.4% AUC improvement over the original model.

ALBERT · AutoML · BERT
0 likes · 12 min read
Baobao Algorithm Notes
Jan 20, 2022 · Interview Experience

How to Ace Algorithm Interviews: Insider Tips, Sample Questions, and Evaluation Criteria

The article shares an interviewer's perspective on algorithm hiring, outlining five assessment dimensions—fundamentals, knowledge depth, breadth, business understanding, and communication—providing concrete question examples, a coding challenge, and practical communication tips to help candidates succeed.

BERT · algorithm interview · communication skills
0 likes · 9 min read
Baobao Algorithm Notes
Jan 14, 2022 · Artificial Intelligence

Boosting BERT Text Classification with Label Embedding: How It Works

The paper proposes a simple yet effective method that fuses label embeddings into BERT, enhancing text‑classification performance without increasing computational cost, and validates the approach across six benchmark datasets, also exploring tf‑idf‑based label augmentation and the impact of using [SEP] versus no‑[SEP] inputs.

BERT · Deep Learning · NLP
0 likes · 8 min read
Baobao Algorithm Notes
Jan 14, 2022 · Artificial Intelligence

BERT Interview Q&A: Decoding CLS, Masks, Complexity, and More

An in‑depth Q&A breaks down core BERT concepts—from the purpose of the [CLS] token and masking strategies to self‑attention complexity, sparse attention tricks, subword handling of OOV words, warm‑up learning rates, GPT’s unidirectional nature, and ALBERT’s parameter sharing—providing concise explanations for each.

BERT · Masking · Self-Attention
0 likes · 7 min read
Beike Product & Technology
Jan 7, 2022 · Artificial Intelligence

Beike Real Estate NLP Team Wins First Place in CCIR Cup 2021 Intelligent Human‑Computer Interaction Track

The Beike Real Estate NLP team secured first place in the CCIR Cup 2021 Intelligent Human‑Computer Interaction track by applying semi‑supervised and transfer learning techniques to small‑sample intent recognition and slot filling, and also presented the large‑scale Mandarin dialect speech benchmark KeSpeech at NeurIPS 2021.

AI competition · BERT · NLP
0 likes · 5 min read
Baobao Algorithm Notes
Jan 7, 2022 · Interview Experience

Essential Transformer Interview Cheat Sheet: 11 Must‑Know Q&A

This concise guide presents eleven frequently asked Transformer interview questions with clear English explanations covering self‑attention formulas, scaling, alternative designs, LayerNorm vs. BatchNorm, positional embeddings, multi‑head mechanisms, and BPE tokenization, helping candidates deliver solid, theory‑backed answers.

BERT · Deep Learning · LayerNorm
0 likes · 6 min read
Ctrip Technology
Dec 30, 2021 · Artificial Intelligence

Semantic Matching Techniques for Intelligent Customer Service at Ctrip

This article presents Ctrip's intelligent customer service system, detailing the evolution of semantic matching methods from traditional lexical models to deep learning approaches such as BERT and ESIM, and describing multi‑stage retrieval, multilingual transfer learning, and KBQA techniques for improving query understanding and response accuracy.

BERT · NLP · customer-service
0 likes · 16 min read
Baobao Algorithm Notes
Dec 23, 2021 · Artificial Intelligence

How Pre‑Training Evolved: From word2vec to MAE Across NLP & Vision

This article traces the evolution of deep‑learning pre‑training techniques, starting with word2vec in NLP, moving through ELMo and BERT, then shifting to computer‑vision models such as iGPT, ViT, BEiT, and MAE, highlighting key innovations, challenges, and the convergence of NLP and CV paradigms.

BERT · MAE · NLP
0 likes · 21 min read
Baobao Algorithm Notes
Dec 15, 2021 · Artificial Intelligence

Why Can BERT’s Token, Segment, and Position Embeddings Be Added? A Deep Dive into Positional Encoding

This article revisits the long‑standing question of why BERT’s token, segment, and position embeddings are summed, critiques earlier explanations, and presents findings from the ICLR‑2021 paper “Rethinking Positional Encoding in Language Pre‑training” that show removing the token‑position cross term speeds convergence and improves downstream GLUE scores.

BERT · Embedding · Language Pretraining
0 likes · 6 min read
Meituan Technology Team
Dec 9, 2021 · Artificial Intelligence

Fine-Grained Aspect-Based Sentiment Analysis for Meituan's To‑Restaurant Business

To enhance decision‑making for users and quality monitoring for merchants, Meituan’s to‑restaurant platform implements fine‑grained aspect‑based sentiment analysis that extracts dish, attribute, opinion and polarity tuples from reviews, employing both a BERT‑CRF pipeline and a joint Dual‑MRC model which raise F1 scores from 0.61 to 0.68, and are deployed in dashboards and review‑management tools, with future work targeting efficiency and broader four‑tuple extraction.

ABSA · BERT · NLP
0 likes · 28 min read
Meituan Technology Team
Dec 2, 2021 · Artificial Intelligence

Pretraining Techniques for Search Advertising Relevance at Meituan

Meituan improves search‑ad relevance by applying pre‑trained BERT models enhanced with data‑augmented samples, multi‑task learning, keyword extraction and two‑stage knowledge distillation, producing a lightweight distilled model that, when fused with traditional relevance signals, boosts CTR, lowers Badcase@5 and raises NDCG while preserving revenue.

BERT · Search · advertising relevance
0 likes · 30 min read
58 Tech
Nov 25, 2021 · Artificial Intelligence

Technical Evolution of the “Guess You Want” Recommendation Module in 58 Local Services

This article describes the design, multi‑stage recall strategies, and successive ranking model upgrades—including BERT‑based intent prediction, vector‑based DSSM recall, tag expansion, and multi‑task DeepFM/MMoE/ESMM architectures—that together reduce no‑result rates and significantly improve user conversion for 58's local service platform.

BERT · DSSM · multi-task learning
0 likes · 16 min read
JD Retail Technology
Nov 16, 2021 · Artificial Intelligence

Intelligent Online Selling Point Extraction for E‑Commerce Recommendation (IOSPE) Wins AAAI 2022 Innovation Award

The IOSPE system, which uses BERT‑based scoring, transformer‑pointer generation, and personalized distribution to automatically extract and generate selling points for millions of e‑commerce products, earned the AAAI 2022 Artificial Intelligence Innovation Application Award and has boosted click‑through rates and user dwell time across JD.com platforms.

AI · BERT · Innovation Award
0 likes · 6 min read
DataFunTalk
Nov 16, 2021 · Artificial Intelligence

Hotel Search Relevance Modeling and Architecture at Fliggy (Alibaba)

This article presents a comprehensive overview of Fliggy's hotel search relevance system, covering the business background, multi‑scenario architecture, core factor estimation, entity recognition, text and spatial relevance modeling, multi‑scenario fusion, and future optimization directions.

AI · BERT · hotel search
0 likes · 17 min read
Meituan Technology Team
Oct 28, 2021 · Artificial Intelligence

Supply Standardization for Script‑Murder Business Using a Knowledge Graph

Meituan’s To‑Store Integrated data team built an end‑to‑end supply‑standardization pipeline for the rapidly growing script‑murder market by extending the GENE knowledge graph to mine merchant supply, construct a unified script library through rule‑based, semantic, and multimodal clustering, and link products and user‑generated content to standardized scripts, enabling a dedicated category, personalized recommendations, filter tags, and improved ranking.

BERT · Knowledge Graph · Multimodal Learning
0 likes · 23 min read
Meituan Technology Team
Sep 30, 2021 · Artificial Intelligence

Meituan's Intelligent Customer Service Technology and Practice

Meituan’s intelligent customer service platform, serving over 630 million users and 7.7 million merchants, integrates six core AI capabilities—including problem recommendation, understanding, dialogue management, answer supply, response recommendation, and session summarization—across pre‑sale, in‑sale, after‑sale and internal scenarios, leveraging multi‑turn dialogue, intent recognition, knowledge‑graph Q&A, and the Moses platform, while targeting future end‑to‑end and emotionally intelligent interactions.

BERT · Dialogue Systems · Intelligent Customer Service
0 likes · 23 min read
58 Tech
Aug 19, 2021 · Artificial Intelligence

Practical NER Techniques for Business Chatbots on the 58.com Service Platform

This article presents a comprehensive case study of applying named‑entity‑recognition (NER) techniques to the smart chat assistant of 58.com’s yellow‑page service, covering business background, model selection (BiLSTM‑CRF, IDCNN‑CRF, BERT), data‑augmentation, focal loss, fusion of rule‑based and neural methods, context modeling, online performance, and future research directions.

BERT · CRF · Dialogue Systems
0 likes · 16 min read
58 Tech
Aug 5, 2021 · Artificial Intelligence

Exploration and Practice of Text Representation Algorithms in the 58 Security Scenario

This article presents a comprehensive study of text representation techniques—from weighted word‑vector methods to supervised SimBert and unsupervised contrastive learning models—applied to large‑scale unstructured data in 58's information‑security workflows, evaluating their effectiveness for classification and content‑recall tasks.

BERT · SimCSE · contrastive learning
0 likes · 11 min read
Ctrip Technology
Jul 29, 2021 · Artificial Intelligence

NLP Techniques for Classifying Ctrip Ticket Customer Service Conversations

This article presents the background, problem analysis, data preprocessing, modeling approaches and optimization results of applying various NLP methods—including statistical models, word embeddings, attention mechanisms and pretrained language models such as BERT—to improve the accuracy of classifying Ctrip ticket customer service dialogues.

BERT · Deep Learning · NLP
0 likes · 13 min read
DataFunSummit
Jul 25, 2021 · Artificial Intelligence

Advances in Query Understanding and Semantic Retrieval at Zhihu Search

This article details Zhihu Search's engineering solutions for long‑tail query challenges, covering historical development, term weighting, synonym expansion, query rewriting with reinforcement learning, and semantic recall using BERT‑based models, while also outlining future research directions such as GAN‑based rewriting and lightweight pre‑training.

BERT · Embedding Retrieval · Query Rewriting
0 likes · 14 min read
58 Tech
Jul 5, 2021 · Artificial Intelligence

Construction of a Virtual Category‑Tag System for 58 Local Services Using Machine Learning

This article describes the end‑to‑end design and implementation of a virtual category‑tag framework for 58 local services, detailing data preparation, tag selection via semantic similarity models, tag mounting, synonym normalization, experimental comparisons of CDSSM, MatchPyramid, BERT, RoBERTa and other techniques, and outlines future improvements.

BERT · Tagging · synonym normalization
0 likes · 16 min read
DataFunTalk
Jun 20, 2021 · Artificial Intelligence

Iterative Development and Applications of Meituan Takeaway Food Knowledge Graph

This article systematically introduces the architecture, iterative improvements, modeling techniques, and practical applications of Meituan's food knowledge graph, covering category taxonomy, standard dish names, basic and thematic attributes, health‑meal detection, dish entity alignment, and downstream recommendation and search use cases.

AI · BERT · Food Recommendation
0 likes · 18 min read
58 Tech
Jun 18, 2021 · Artificial Intelligence

Bidding Document Classification and Entity Extraction Using BERT-based Models

This article describes how 58.com built an end‑to‑end bidding service that crawls tender documents, classifies them into multiple categories with BERT‑based models (including softmax, sigmoid and ensemble approaches) and extracts key entities using BERT‑CRF and reading‑comprehension techniques, achieving over 90% overall accuracy and dramatically improving recall and precision.

BERT · NLP · document classification
0 likes · 15 min read
58 Tech
Jun 16, 2021 · Artificial Intelligence

Improving Text Matching Accuracy in Voice Assistants: Experiments with Siamese Networks, BERT Models, and Advanced Tricks

This article evaluates classic Siamese networks, various BERT‑based pretrained models, and several training tricks such as adversarial training, k‑fold cross‑validation, and model ensembling on both a public similarity‑sentence competition dataset and an internal voice‑assistant standard question matching dataset, ultimately raising accuracy from 97.23 % to 99.5 %.

BERT · Siamese Network · Voice Assistant
0 likes · 15 min read
DataFunTalk
Jun 6, 2021 · Artificial Intelligence

ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

ConSERT introduces a contrastive self‑supervised framework that enhances BERT‑derived sentence embeddings by applying efficient embedding‑level data augmentations, achieving significant improvements on semantic textual similarity tasks, especially in low‑resource settings, and outperforming previous state‑of‑the‑art methods.

BERT · contrastive learning · self-supervised
0 likes · 20 min read
DataFunSummit
Jun 5, 2021 · Artificial Intelligence

Compression Techniques for BERT: Analysis, Quantization, Pruning, Distillation, and Structure‑Preserving Methods

This article reviews BERT’s architecture, analyzes the storage and compute costs of each layer, and systematically presents compression methods—including quantization, pruning, knowledge distillation (Distilled BiLSTM and MobileBERT), and structure‑preserving techniques—aimed at enabling efficient deployment on resource‑constrained mobile devices.

BERT · Mobile Deployment · knowledge distillation
0 likes · 15 min read
Meituan Technology Team
Jun 3, 2021 · Artificial Intelligence

ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

ConSERT is a contrastive self‑supervised framework that fine‑tunes BERT with augmented sentence views and NT‑Xent loss to overcome embedding collapse, delivering roughly 8% higher STS performance than prior methods, remaining robust in few‑shot and supervised scenarios, and now deployed in Meituan’s NLP pipelines.

BERT · NLP · contrastive learning
0 likes · 20 min read
DataFunTalk
Jun 3, 2021 · Artificial Intelligence

Compression Techniques for BERT: Analysis, Quantization, Pruning, Distillation, and Structure-Preserving Methods

This article examines the internal structure of BERT and systematically presents various model‑compression strategies—including quantization, pruning, knowledge distillation, and structure‑preserving techniques—highlighting their impact on storage, computational cost, and inference speed for deployment on resource‑constrained mobile devices.

BERT · Mobile AI · knowledge distillation
0 likes · 16 min read
Meituan Technology Team
May 27, 2021 · Artificial Intelligence

Iterative Development and Applications of Meituan Takeaway Food Knowledge Graph

The Meituan Takeaway Food Knowledge Graph iteratively builds a hierarchical tag taxonomy, standardizes dish names, extracts basic and theme attributes, aligns online‑offline entities using CNN‑CRF, BERT and hybrid models, and powers combo, interactive and search recommendations while planning scene‑specific tags and graph‑based personalization.

BERT · Food Recommendation · Knowledge Graph
0 likes · 19 min read
58 Tech
May 24, 2021 · Artificial Intelligence

Tag Extraction for 58 Yellow Pages Posts Using Sequence Labeling and Model Optimization

This article describes a complete solution for extracting and normalizing tags from 58 Yellow Pages service posts, covering candidate word acquisition, sequence‑labeling models such as CRF and BERT‑CRF, hierarchical softmax optimization for massive label spaces, and experimental results on both post content and user reviews.

BERT · CRF · hierarchical softmax
0 likes · 20 min read
Cyber Elephant Tech Team
Apr 28, 2021 · Artificial Intelligence

Understanding BERT: From Encoder-Decoder to Transformer and Attention

This article explains the BERT model by first reviewing the Encoder-Decoder framework, then detailing the attention mechanism—including self-attention and multi-head attention—before describing the full Transformer architecture and finally outlining BERT’s encoder-only design, training stages, and fine-tuning applications.

BERT · Encoder-Decoder · NLP
0 likes · 15 min read
NetEase Media Technology Team
Apr 13, 2021 · Artificial Intelligence

Applying BERT for News Timeliness Classification at NetEase

The article describes how NetEase adapts a pre‑trained BERT model to classify news articles into ultra‑short, short, or long timeliness categories by combining rule‑based strong and weak time cues, key‑sentence extraction, domain‑embedding fusion and multi‑layer semantic aggregation, achieving accurate and interpretable predictions for its platform.

BERT · Model Fusion · NLP
0 likes · 12 min read
58 Tech
Mar 29, 2021 · Artificial Intelligence

Deep Semantic Model Exploration and Application in 58 Search

This article presents a comprehensive overview of 58 Search's multi‑stage retrieval system, compares term‑match and semantic matching, details the design, training, and optimization of interactive, dual‑tower, and semi‑interactive BERT‑based semantic models, and discusses their practical deployment in ranking and recall stages.

AI · BERT · dual-tower
0 likes · 18 min read
Youku Technology
Mar 23, 2021 · Artificial Intelligence

Text-Video Alignment Algorithm for Automated Short Video Production at Youku

Youku’s new text‑video alignment system automatically generates short video summaries by extracting multimodal video and linguistic features, matching sentences to clips through embedding and tag‑level models, and enabling AI‑driven auto‑editing that cuts production time from days to minutes.

BERT · NLP · cross-modal matching
0 likes · 10 min read
360 Smart Cloud
Mar 4, 2021 · Artificial Intelligence

Optimizing BERT Online Service Deployment at 360 Search

This article describes the challenges of deploying a large BERT model as an online service for 360 Search and details engineering optimizations—including framework selection, model quantization, knowledge distillation, stream scheduling, caching, and dynamic sequence handling—that dramatically improve latency, throughput, and resource utilization.

BERT · FP16 quantization · GPU Optimization
0 likes · 12 min read
360 Tech Engineering
Mar 1, 2021 · Artificial Intelligence

Deploying BERT as an Online Service: Challenges and Optimizations at 360 Search

This article details the engineering challenges of serving a large BERT model in real‑time for 360 Search and describes a series of optimizations—including TensorRT‑based kernel fusion, model quantization, knowledge distillation, multi‑stream execution, caching, and dynamic sequence handling—that together achieve low latency, high throughput, and stable deployment on GPU clusters.

BERT · GPU · Model Optimization
0 likes · 10 min read
58 Tech
Mar 1, 2021 · Artificial Intelligence

Intelligent QABot for 58.com: Classification and Retrieval Model Exploration

This article describes how 58.com’s AI Lab built and continuously improved the QABot intelligent customer‑service system by designing classification and retrieval models, evaluating FastText, LSTM‑DSSM, BERT and a self‑developed SPTM framework, and finally fusing them to boost answer rates and user experience.

AI chatbot · BERT · Model Fusion
0 likes · 9 min read
DataFunTalk
Feb 20, 2021 · Artificial Intelligence

Industrial-Scale Machine Translation at Bytedance: Applications, Demos, and Research Advances

This article presents Bytedance's industrial machine‑translation platform, describing its global deployment, diverse product demos, underlying sequence‑to‑sequence models, BERT‑enhanced training strategies, prune‑tune sparsity techniques, multilingual pre‑training, document translation, and a high‑performance inference engine.

BERT · machine translation · multilingual NLP
0 likes · 19 min read
Sohu Tech Products
Feb 17, 2021 · Artificial Intelligence

Improving BERT Pre‑training with RealFormer: Principles, Implementation, and Empirical Evaluation

This article analyzes the RealFormer modification to the Transformer architecture, details its implementation in BERT, and presents extensive experiments showing that while RealFormer can boost performance on low‑label‑count classification tasks, its benefits diminish or disappear as the number of classes grows.

BERT · RealFormer · Residual
0 likes · 12 min read
DataFunTalk
Feb 14, 2021 · Artificial Intelligence

TurboTransformers: An Efficient GPU Serving System for Transformer Models

TurboTransformers introduces a suite of GPU‑centric optimizations—including a high‑throughput batch reduction algorithm, a variable‑length‑aware memory allocator, and a dynamic‑programming‑based batch scheduling strategy—that together deliver significantly lower latency and higher throughput for Transformer‑based NLP services compared with existing frameworks such as PyTorch, TensorFlow, ONNX Runtime and TensorRT.

BERT · Dynamic Batching · GPU inference
0 likes · 13 min read
58 Tech
Jan 29, 2021 · Artificial Intelligence

Optimization Practices for Business Opportunity Slot Recognition in 58.com Intelligent Customer Service

This article details the background, challenges, architecture, model selection, and future directions of the business‑opportunity slot recognition module used in 58.com’s intelligent customer service, highlighting how regex‑model fusion and IDCNN‑CRF improve entity extraction for phone, WeChat, address, and time slots.

BERT · CRF · IDCNN
0 likes · 11 min read
DataFunTalk
Jan 15, 2021 · Artificial Intelligence

Zhihu Search Text Relevance Evolution and BERT Knowledge Distillation Practices

This talk by Zhihu search algorithm engineer Shen Zhan details the evolution of text relevance models from TF‑IDF/BM25 to deep semantic matching and BERT, explains the challenges of deploying BERT at scale, and describes practical knowledge‑distillation techniques that improve both online latency and offline storage while maintaining search quality.

BERT · knowledge distillation · machine learning
0 likes · 14 min read
Amap Tech
Dec 30, 2020 · Artificial Intelligence

LRC-BERT: Contrastive Learning based Knowledge Distillation with COS‑NCE Loss for Efficient NLP Models

The Amap team introduced LRC‑BERT, a contrastive‑learning‑based knowledge‑distillation framework that employs a novel COS‑NCE loss, gradient‑perturbation, and a two‑stage training schedule, enabling a 4‑layer student model to retain about 97 % of BERT‑Base accuracy while being 7.5× smaller and 9.6× faster, and it has already improved real‑world traffic‑event extraction performance.

BERT · COS-NCE loss · NLP
0 likes · 16 min read
DataFunTalk
Dec 28, 2020 · Artificial Intelligence

Intelligent Question Answering: Scenarios, Architecture, and Technical Implementations (QA, Knowledge‑Graph QA, NL2SQL)

This article introduces the typical applications of intelligent question answering, compares chat‑type, knowledge‑type and task‑type bots, and then details the end‑to‑end architecture, knowledge‑base construction, semantic‑equivalence modeling with BERT‑BIMPM, knowledge‑graph QA pipelines, and NL2SQL techniques, concluding with practical deployment insights.

AI · BERT · Dialogue Systems
0 likes · 15 min read
DataFunSummit
Dec 27, 2020 · Artificial Intelligence

Sequence Labeling in Natural Language Processing: Definitions, Tag Schemes, Model Choices, and Practical Implementation

This article provides a comprehensive overview of sequence labeling tasks in NLP, covering their definition, common tag schemes (BIO, BIEO, BIESO), comparisons with other NLP tasks, major modeling approaches such as HMM, CRF, RNN and BERT, real‑world applications like POS tagging, NER, event extraction and gene analysis, and a step‑by‑step PyTorch implementation with dataset preparation, training pipeline, and evaluation metrics.

BERT · CRF · HMM
0 likes · 27 min read
DataFunTalk
Dec 25, 2020 · Artificial Intelligence

Exploring Pretraining Model Optimization and Deployment Challenges in NLP

This article reviews the evolution of pretraining models in NLP, discusses the practical challenges of deploying large models such as inference latency, knowledge integration, and task adaptation, and presents Xiaomi’s optimization techniques including knowledge distillation, low‑precision inference, operator fusion, and multi‑granularity segmentation for dialogue systems.

BERT · Dialogue Systems · Inference Optimization
0 likes · 15 min read
Meituan Technology Team
Dec 3, 2020 · Artificial Intelligence

Meituan Knowledge Graph Group's Six Papers Accepted at CIKM 2020

Meituan’s search and NLP team announced that six knowledge‑graph papers—covering query‑aware tip generation, BERT‑based ranking, multi‑modal and sequential recommendation, conversational recommendation, and graph‑embedding for personalized product search—were accepted at CIKM 2020, resulting from university collaborations and already deployed to boost Meituan’s search, recommendation and product‑search services.

BERT · CIKM 2020 · Knowledge Graph
0 likes · 13 min read
JD Tech Talk
Dec 3, 2020 · Artificial Intelligence

Consumer Behavior Cause Extraction with BERT Fine‑tuning and a Novel Sequence‑Labeling Framework (ICDM 2020 Winning Solution)

At ICDM 2020, the JD Digits Silicon Valley team achieved top results in the Knowledge Graph Contest by fine‑tuning BERT and introducing a novel sequence‑labeling framework that jointly extracts consumer behavior types and their underlying reasons, leveraging CRF decoding and model ensemble for superior performance.

BERT · CRF · ICDM 2020
0 likes · 11 min read
Sohu Tech Products
Nov 4, 2020 · Artificial Intelligence

Understanding BERT: Architecture, Pre‑training, Fine‑tuning and Applications in Modern NLP

This article provides a comprehensive overview of BERT and related NLP advances, covering its historical context, model architecture, input‑output mechanisms, comparisons with CNNs, word‑embedding evolution, pre‑training strategies like MLM and next‑sentence prediction, and practical guidance for fine‑tuning and feature extraction.

BERT · Fine-tuning · NLP
0 likes · 17 min read
Tencent Cloud Developer
Sep 23, 2020 · Artificial Intelligence

NLP Model Interpretability: White-box and Black-box Methods and Business Applications

The article reviews NLP interpretability techniques, contrasting white‑box approaches that probe model internals such as neuron analysis, diagnostic classifiers, and attention with black‑box strategies like rationales, adversarial testing, and local surrogates, and argues that black‑box methods are generally more practical for business deployment despite offering shallower insights.

Attention Mechanism · BERT · Deep Learning
0 likes · 12 min read
DataFunTalk
Sep 23, 2020 · Artificial Intelligence

From Word Embedding to BERT: A Comprehensive Overview of Pre‑training Model Development in NLP

This article surveys the evolution of pre‑training models for natural language processing, detailing model architectures such as Encoder‑AE, Decoder‑AR, Encoder‑Decoder, Prefix LM, and PLM, analyzing why models like RoBERTa, T5, and GPT‑3 excel, and offering practical guidance for building strong pre‑training systems.

BERT · NLP · Transformer
0 likes · 47 min read
58 Tech
Sep 21, 2020 · Artificial Intelligence

58.com AI Algorithm Competition: Winning Teams and Their Technical Solutions

The 58.com AI Algorithm Competition showcased intelligent customer‑service technology, with 158 teams competing on text classification and matching tasks, and the top five teams presenting detailed BERT, ELECTRA, focal‑loss and multi‑model fusion solutions along with award ceremonies, video recordings and PPT resources.

AI · BERT · ELECTRA
0 likes · 9 min read
Tencent Cloud Developer
Sep 3, 2020 · Artificial Intelligence

CTR Prediction Optimization for App Store Recommendation: Integrating DeepWalk, BERT, and Attention Mechanisms

The paper presents an optimized CTR prediction model for Tencent’s App Store that merges multi‑behavior shared embeddings, long‑term DeepWalk graph embeddings, BERT‑derived app description vectors, and attention‑based fusion, reducing parameters while improving bias, AUC, and recommendation performance for sparse, long‑tail data.

BERT · CTR prediction · DeepWalk
0 likes · 9 min read
Meituan Technology Team
Aug 20, 2020 · Artificial Intelligence

DR-BERT: Enhancing BERT-based Document Ranking with Task-adaptive Training and OOV Matching

DR‑BERT boosts BERT‑based document ranking on the MS MARCO benchmark by applying domain‑adaptive pretraining, a two‑stage fine‑tuning pipeline (pointwise then listwise), and OOV‑aware mechanisms—including exact‑match features and word‑recovery of sub‑tokens—achieving the first MRR@10 above 0.4 and leading the leaderboard.

BERT · DR-BERT · MS MARCO
0 likes · 16 min read
Sohu Tech Products
Aug 19, 2020 · Artificial Intelligence

ASR Error Correction with BERT, ELECTRA and a Fuzzy‑Phoneme Generator: Techniques from Xiaomi AI

This article describes how Xiaomi's AI team tackles Automatic Speech Recognition (ASR) query errors by analyzing error patterns, employing BERT, ELECTRA and a soft‑masked BERT model, generating synthetic noisy data with a fuzzy‑phoneme generator, and presenting experimental results and future research directions.

ASR · BERT · Deep Learning
0 likes · 18 min read
iQIYI Technical Product Team
Jul 24, 2020 · Artificial Intelligence

Fine‑grained Character Sentiment Analysis in Scripts: Models, Challenges, and Future Directions

The article surveys fine‑grained character sentiment analysis for script evaluation, detailing traditional, target‑dependent and aspect‑level methods, describing iQIYI’s BERT‑TD‑LSTM and CNN architectures, addressing challenges such as character name recognition and long‑range context, and outlining future improvements after a Parasite case study.

BERT · LSTM · NLP
0 likes · 19 min read