Tagged articles
149 articles
DataFunTalk
Jul 15, 2020 · Artificial Intelligence

ASR Error Correction with BERT, ELECTRA, and a Fuzzy‑Phoneme Generator: Methods, Experiments, and Future Directions

This article presents a comprehensive overview of automatic speech recognition (ASR) error correction techniques employed by Xiaomi's Xiao‑Ai team, detailing problem definition, related work on BERT and ELECTRA, a custom generator‑discriminator architecture with a fuzzy‑phoneme simulator, experimental results, and prospective research directions.

ASR, BERT, ELECTRA
19 min read
Meituan Technology Team
Jul 9, 2020 · Artificial Intelligence

Optimizing Meituan Search Ranking with BERT: Methods and Practices

The Meituan Search team boosted ranking relevance by training a domain‑specific BERT, applying data augmentation, brand‑sample optimization, knowledge‑graph fusion, multi‑task and pairwise fine‑tuning, joint end‑to‑end training with LambdaLoss ranking models, and compressing the model for low‑latency inference, delivering up to +925 BP offline accuracy gains and measurable CTR and NDCG improvements in production.

BERT, knowledge distillation, machine learning
34 min read
DataFunTalk
Jul 7, 2020 · Artificial Intelligence

Optimizing Pretrained Language Model Inference: Lessons from the NLPCC Small Model Competition and Deployment at Xiaomi

This article shares the Xiaomi AI Lab NLP team's experience in the NLPCC lightweight language model competition, discusses efficiency challenges of large pretrained models like BERT, and details practical inference optimizations—including model distillation, batching, FP16 quantization, and FasterTransformer integration—that dramatically reduce latency and hardware costs in production.

AI, BERT, Inference Optimization
15 min read
DataFunTalk
Jun 28, 2020 · Artificial Intelligence

Applying UDA Semi‑Supervised Learning to Financial Text Classification: Experiments and Insights

This article investigates the practical performance of Google’s 2019 Unsupervised Data Augmentation (UDA) framework on real‑world financial text classification tasks, detailing experiments with limited labeled data, out‑of‑domain samples, noisy labels, and comparisons between BERT and lightweight TextCNN models.

BERT, Semi-supervised Learning, TextCNN
21 min read
Tencent Advertising Technology
Jun 22, 2020 · Artificial Intelligence

Graph-based Evidence Aggregating and Reasoning (GEAR) Model for Fact Verification in NLP

The article explains how the GEAR model uses graph neural networks and BERT representations to aggregate multiple pieces of evidence for fact verification, improving accuracy on datasets like FEVER and offering applications in misinformation detection, knowledge‑graph completion, and advertising analytics.

BERT, GEAR model, NLP
8 min read
Alibaba Cloud Developer
Jun 2, 2020 · Artificial Intelligence

How FashionBERT Boosts E‑Commerce Image‑Text Matching with Patch Embeddings

This article introduces FashionBERT, a multimodal BERT‑based model that replaces ROI‑based image tokens with uniform image patches to overcome e‑commerce specific challenges, details its architecture, adaptive loss balancing, deployment in Alibaba search, and reports significant performance gains on public and internal datasets.

BERT, Deep Learning, e‑commerce
13 min read
Qunar Tech Salon
May 13, 2020 · Artificial Intelligence

Intelligent Hotel Post‑Sale QA System: Model Selection, Evaluation, and Engineering Optimization

This article describes the design, model selection, experimental evaluation, and engineering optimization of an AI‑driven post‑sale question‑answering system for hotel services, covering FAQ construction, intent detection, deep‑learning matching models such as DSSM, ESIM, BERT, and their performance and latency trade‑offs.

AI, BERT, DSSM
14 min read
Ctrip Technology
Apr 30, 2020 · Artificial Intelligence

Intelligent Generation of Search Engine Advertising Keywords: Methods, Frameworks, and Future Directions

This article presents a comprehensive overview of automated techniques for generating high‑quality search engine advertising keywords, covering background, traditional manual methods, intelligent keyword expansion using NLP, segmentation, POS tagging, BiLSTM‑CRF, BERT classification, semantic matching with DSSM, and additional approaches such as query suggestion and synonym rewriting.

BERT, BILSTM-CRF, NLP
15 min read
Meituan Technology Team
Mar 24, 2020 · Artificial Intelligence

Citation Intent Recognition: Meituan's Winning Solution in WSDM Cup 2020

Meituan’s Search & NLP team, together with two universities, won the WSDM Cup 2020 Citation Intent Recognition task by building a multimodal retrieval‑ranking pipeline that merges semantic, spatial and axiomatic recall models with pairwise BERT and LightGBM ranking, achieving the highest MAP@3 and now powering Meituan’s QA, FAQ and core search systems.

BERT, Citation Intent, LightGBM
14 min read
Qunar Tech Salon
Mar 5, 2020 · Artificial Intelligence

Content Tagging Technology for Short Videos at iQIYI: Challenges and Model Evolution

This article describes iQIYI's short‑video content tagging system, outlining the challenges of extracting type and abstract tags from multimodal data, detailing the evolution from text‑only models to image‑fusion, BERT‑enhanced, and video‑frame models, and discussing their applications and future directions.

BERT, Multimodal Learning, Transformer
11 min read
58 Tech
Mar 2, 2020 · Artificial Intelligence

Low-Quality Text Detection Using Unsupervised Language Model Perplexity

This article proposes a method to identify low-quality text in business data by training a large-scale unsupervised language model to compute sentence perplexity, converting the detection problem into a threshold decision, and details model design, challenges, optimizations, and online performance results.

BERT, Language Model, NLP
13 min read
iQIYI Technical Product Team
Feb 14, 2020 · Artificial Intelligence

Content Tagging Technology for Short Videos: Challenges and Multi‑Modal Model Evolution at iQIYI

iQIYI’s short‑video tagging system tackles multimodal fusion, open‑set and abstract tags by evolving from a text‑only model through cover‑image, BERT‑vector, and video‑frame fusion architectures, enabling automated labeling, personalized recommendation, and semantic search while planning to add OCR, audio, and knowledge‑graph enhancements.

BERT, Multimodal Learning, Transformer
13 min read
Xueersi Online School Tech Team
Jan 17, 2020 · Artificial Intelligence

Fine‑tuning BERT for Sentence Pair Similarity in an Online Education Platform

This article describes how a BERT‑based model is fine‑tuned to compute sentence‑pair similarity for improving recommendation accuracy in an online school, detailing the architecture, training mechanisms, code implementation, experimental results, and future extensions such as sentiment analysis.

BERT, Chinese NLP, Deep Learning
20 min read
DataFunTalk
Dec 27, 2019 · Artificial Intelligence

NLP Challenges and Tagging Solutions in Sina Weibo Feed

This article reviews the specific NLP difficulties encountered in Sina Weibo's feed—such as short text, informal language, and ambiguous user behavior—and details the multi‑stage tagging system, material library, multimodal modeling, multi‑task learning, and large‑scale pre‑training techniques used to address them.

BERT, NLP, Weibo
15 min read
Amap Tech
Dec 6, 2019 · Artificial Intelligence

Semantic Understanding of Merchant Signboards for Automatic POI Name Generation at Amap

Amap's POI naming automation uses a two-stage cascade model: Stage 1 extracts token and sentence features with POS tags and domain-adapted BERT‑POI; Stage 2 employs a Bi‑LSTM to model line relationships, achieving over 95% semantic accuracy and 3‑6% recall improvements, thereby enhancing automatic signboard‑based POI name generation.

BERT, LSTM, Multimodal AI
7 min read
DataFunTalk
Nov 15, 2019 · Artificial Intelligence

MT-BERT: Domain‑Adapted BERT Pre‑training and Fine‑tuning for Meituan‑Dianping NLP Tasks

This article describes the development of MT‑BERT, a BERT‑based language model pre‑trained on Meituan‑Dianping business data, its distributed mixed‑precision training pipeline, domain adaptation, knowledge‑graph integration, model compression techniques, and the wide range of downstream NLP applications achieved in the platform.

BERT, Knowledge Graph, Meituan
31 min read
JD Tech Talk
Nov 15, 2019 · Artificial Intelligence

Legal Case Similarity Competition at CCL 2019: Dataset, Task Transformation, and Model Solutions

The article reviews the CCL “Chinese Law Research Cup” similarity competition, describing the legal text dataset, converting the triple‑sample task to a binary similarity problem, outlining challenges such as long documents, and summarizing the BERT‑based Siamese, InferSent, and triplet‑loss models that achieved top‑10 results.

BERT, CCL 2019, Legal NLP
8 min read
Meituan Technology Team
Nov 14, 2019 · Artificial Intelligence

MT-BERT: Pre‑training and Fine‑tuning Practices at Meituan‑Dianping

MT‑BERT at Meituan‑Dianping combines mixed‑precision, domain‑adapted continual pre‑training, knowledge‑graph‑aware masking, and extensive compression techniques to produce fast, accurate BERT models that power fine‑grained sentiment analysis, intent classification, recommendation reasoning, and other NLP tasks across the platform.

BERT, Knowledge Graph, MT-BERT
33 min read
HomeTech
Nov 13, 2019 · Artificial Intelligence

Sequence Labeling for Entity Recognition in Automotive Search: Techniques and Applications

This article examines how sequence labeling methods such as pattern matching, CRF, and deep‑learning models like BiLSTM‑CRF and BERT are applied to automotive search tasks—including car‑series, model, and location/entity recognition—detailing their development, implementation challenges, and performance results.

BERT, CRF, NLP
11 min read
JD Tech Talk
Oct 30, 2019 · Artificial Intelligence

Solution Overview for the Scientific Paper Recommendation Matching Competition

This article presents a comprehensive solution to a competition that requires matching description paragraphs with the three most relevant papers from a 200,000‑paper corpus, detailing background, task definition, evaluation metrics, modeling strategy, and core algorithms such as SIF, InferSent, Bi‑LSTM, and BERT.

BERT, NLP, competition
9 min read
Qunar Tech Salon
Oct 10, 2019 · Artificial Intelligence

Intelligent Customer Service System for Airline Ticket Business: Architecture, Data Analysis, and AI Techniques

This article describes the design and implementation of an AI‑powered intelligent customer service system for airline ticket operations, covering data‑driven problem analysis, dialogue architecture, intent recognition using BERT and fastText, knowledge‑base QA, and future development plans.

AI, BERT, Intelligent Customer Service
11 min read
DataFunTalk
Sep 26, 2019 · Artificial Intelligence

NLP Algorithm Practices in Alibaba's Brand Advertising

This article presents a comprehensive overview of Alibaba's brand advertising business model, its technical architecture, and the practical application of NLP algorithms—specifically brand intent recognition and short‑text relevance—detailing model evolution, evaluation results, and future research directions.

BERT, Brand Advertising, NLP
14 min read
Tencent Cloud Developer
Sep 1, 2019 · Artificial Intelligence

Fundamentals and Practical Implementation of Knowledge Graphs and Attribute Extraction

The article surveys the evolution and core components of knowledge graphs—from early Linked Data concepts to modern semantic networks—detailing the end‑to‑end pipeline of data acquisition, cleaning, extraction, and fusion, and showcases Tencent Cloud’s Merak framework and encyclopedia KG, highlighting model choices, performance benchmarks, and real‑world applications such as recommendation and intelligent Q&A.

AI, BERT, Knowledge Graph
13 min read
DataFunTalk
Jul 23, 2019 · Artificial Intelligence

Technical Exploration of Intelligent Dialogue Robots in Didi Ride-Hailing Scenarios

The talk presents Didi AI Labs' research on intelligent dialogue robots for ride‑hailing, covering single‑turn QA, multi‑turn conversation, multi‑task learning architectures, model experiments, active learning pipelines, and the overall system design that integrates intent detection, slot extraction, dialogue management, and response generation.

AI, BERT, Dialogue Systems
10 min read
Tencent Cloud Developer
Jul 19, 2019 · Artificial Intelligence

Multi-turn Dialogue Intent Classification: Data Processing, Model Construction, and Operational Optimization

The article details a multi‑turn dialogue intent classification pipeline that extracts and expands labeled utterances, preprocesses text with custom tokenization, trains a two‑layer CNN‑Highway and a multi‑head self‑attention model, analyzes errors, and achieves up to 98.7% accuracy on a large, balanced dataset.

BERT, CNN, dialogue system
15 min read
Alibaba Cloud Developer
Jul 17, 2019 · Artificial Intelligence

How Alibaba Halved BERT Latency for Real‑Time Search

This article details Alibaba's technical challenges with BERT's high resource consumption in online search, analyzes memory and compute bottlenecks using TensorFlow profiling, and presents both TensorFlow‑based tweaks and a custom CUDA implementation that together double throughput and cut latency by about 50%.

Alibaba, BERT, GPU
9 min read
DataFunTalk
Jul 16, 2019 · Artificial Intelligence

Search Advertising and Ad Recall: Business Logic, Semantic Relevance, and Deep Learning Models at 360

This article explains the architecture of 360's search advertising system, detailing its ad recall, ranking, and display modules, illustrates exact‑match and semantic recall methods with a case study, and reviews the evolution from feature‑engineered GBDT models to deep learning approaches such as DSSM, ESIM, and BERT, including data preparation, training, and performance evaluation.

BERT, DSSM, ad recall
10 min read
Alibaba Cloud Developer
Jul 3, 2019 · Artificial Intelligence

Alibaba AI Sets New Record in MS MARCO Reading Comprehension, Surpassing Humans

Alibaba's AI model topped the MS MARCO reading comprehension challenge, achieving the highest scores in document ranking and open‑domain question answering, even surpassing human performance, thanks to its deep‑cascade BERT‑based architecture that mimics human reading and is already deployed in e‑commerce applications.

BERT, MS MARCO, Reading Comprehension
5 min read
DataFunTalk
Jun 23, 2019 · Artificial Intelligence

Understanding XLNet: Differences from BERT, Innovations, and Experimental Analysis

This article examines XLNet, contrasting it with BERT by detailing its novel permutation language modeling, two‑stream self‑attention, and larger pre‑training data, and analyzes experimental results that show XLNet’s superior performance on reading‑comprehension, GLUE, and other NLP tasks, especially for long documents.

BERT, NLP, Permutation Language Model
27 min read
DataFunTalk
Jun 10, 2019 · Artificial Intelligence

BERT Applications Across NLP Domains: Progress, Challenges, and Future Directions

This article surveys the rapid proliferation of BERT-based research over the past six months, analyzing its impact on various NLP tasks such as question answering, information retrieval, dialog systems, summarization, data augmentation, classification, and sequence labeling, while also discussing the model's strengths, limitations, and future research opportunities.

BERT, NLP, data augmentation
52 min read
Alibaba Cloud Developer
Jun 5, 2019 · Artificial Intelligence

Tracing the Evolution of Language Models: From N‑grams to GPT‑2

This article reviews the historical development of natural language processing language models, covering expert rule‑based systems, statistical n‑grams, smoothing techniques, neural network models such as NNLM, RNN, word2vec, GloVe, ELMo, and the transformer‑based breakthroughs of GPT, BERT and GPT‑2, and summarizes their impact on modern NLP tasks.

BERT, Deep Learning, GPT
25 min read
58 Tech
May 31, 2019 · Artificial Intelligence

Deep Learning Approaches for Chinese Word Segmentation: BiLSTM‑CRF and BERT

This article reviews modern deep‑learning methods for Chinese word segmentation, comparing traditional CRF‑based approaches with BiLSTM‑CRF and BERT models, describing their architectures, training procedures, experimental results, and practical considerations for deployment.

BERT, BiLSTM, CRF
17 min read
Alibaba Cloud Developer
May 27, 2019 · Artificial Intelligence

From Neurons to BERT: Tracing the Evolution of Deep Learning in NLP

This article walks through the development of deep learning for natural language processing, starting with basic neurons and shallow networks, then exploring CNNs, RNNs, LSTMs, TextCNN, ESIM, ELMo, and culminating with the Transformer‑based BERT model, its training objectives, fine‑tuning strategies, and performance comparisons.

BERT, CNN, Deep Learning
19 min read
Qunar Tech Salon
Apr 29, 2019 · Artificial Intelligence

Multi‑Level Deep Model Fusion for Fake News Detection Using BERT – Winning Solution of WSDM Cup 2019

The article details the Travel team's award‑winning solution for the WSDM Cup 2019 fake‑news detection task, describing data analysis, preprocessing, label‑propagation augmentation, a BERT‑based baseline, a three‑stage multi‑level model‑fusion framework, experimental results, and future directions.

BERT, Model Fusion, NLP
12 min read
Hulu Beijing
Apr 4, 2019 · Artificial Intelligence

How BERT, GPT, and ELMo Revolutionize Language Feature Representation

Natural language processing, a cornerstone of AI, relies on language models to capture linguistic features; this article reviews classic pre‑training models—ELMo, GPT, and BERT—explaining their architectures, training objectives, and how they boost downstream NLP tasks despite data‑scarcity challenges.

BERT, Deep Learning, ELMo
10 min read
Hulu Beijing
Apr 2, 2019 · Artificial Intelligence

From Object Detection to Language Models: A Deep Dive into AI Advances

This article surveys the evolution of object detection models—comparing one‑stage and two‑stage approaches, their performance trade‑offs, and recent state‑of‑the‑art methods—while also outlining key concepts and breakthroughs in natural language processing, highlighting the impact of deep‑learning models such as BERT.

AI research, BERT, Deep Learning
14 min read
DataFunTalk
Feb 25, 2019 · Artificial Intelligence

NLP Research and Practice at Hulu: From Historical Milestones to Product Development

This article recounts a Hulu NLP research engineer's experience, reviewing key milestones such as NNLM, Word2vec, Transformer and BERT, and then contrasting academic research with product development while illustrating real-world projects like news personalization and content embedding, and describing the supporting AI platform architecture.

AI, BERT, Hulu
23 min read
Meituan Technology Team
Feb 21, 2019 · Artificial Intelligence

Fake News Detection with Multi‑level BERT Fusion at WSDM Cup 2019

In the WSDM Cup 2019 fake-news detection challenge, the Meituan Travel team secured second place by combining extensive data analysis, Chinese-English BERT fine-tuning, label-propagation augmentation, and a three-level fusion framework—blending, stacking, and linear regression—that lifted weighted accuracy to 0.88156.

BERT, Model Fusion, NLP
16 min read
Meituan Technology Team
Jan 25, 2019 · Artificial Intelligence

Fine-grained User Review Sentiment Classification: AI Challenger 2018 Champion's Approach

Cheng Huige’s winning AI Challenger 2018 solution treated fine‑grained Chinese review sentiment as a 20‑aspect multi‑class task, combining a high‑capacity LSTM encoder with self‑attention, word‑and‑character embeddings, simplified ELMo pre‑training, diverse tokenizations and a weighted seven‑model ensemble (including BERT), which together delivered the competition’s top F1 performance.

BERT, Deep Learning, ELMo
14 min read
DataFunTalk
Nov 24, 2018 · Artificial Intelligence

Comprehensive Guide to Fine‑Tuning BERT on Chinese Datasets

This article provides a step‑by‑step guide for fine‑tuning Google’s open‑source BERT on Chinese datasets, covering model download, processor customization, code examples, training commands, and insights into the underlying TensorFlow estimator architecture and deployment considerations.

BERT, Chinese NLP, Fine-tuning
11 min read