
Active Learning and Model Enhancements for Semantic Tag Mining in 58.com Voice Data

This article presents a comprehensive study of extracting semantic tags from 58.com voice data. It details the use of active learning to address the cold-start problem, compares keyword matching, XGBoost, TextCNN, CRNN, and an improved Wide&Deep model, and demonstrates significant reductions in labeling effort along with superior F1 scores across multiple experiments.

58 Tech

The article opens by describing the importance of voice as a communication medium on 58.com, where large volumes of voice data from call-center, telephone, and micro-chat scenarios offer valuable opportunities for semantic tag mining.

Two main stages are described: converting voice to text using an in‑house ASR engine, followed by semantic tag extraction using natural language processing techniques such as keyword mining, text classification, and entity recognition.

To solve the cold‑start issue of limited labeled data, an active learning framework is applied. The iterative process involves training an initial model on a small labeled set, scoring unlabeled samples, selecting high‑value samples for annotation based on uncertainty or diversity queries, and retraining until the unlabeled pool is exhausted or performance stabilizes.
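The iterative loop described above can be sketched as follows. Here `train`, `score`, and `annotate` are hypothetical caller-supplied callables (not names from the original system) standing in for the in-house model trainer, the query strategy, and the human annotators:

```python
def active_learning_loop(labeled, unlabeled, train, score, annotate,
                         batch_size=100, max_rounds=10):
    """Iterative active-learning loop (a sketch of the process in the
    article, not the production implementation).

    Hypothetical callables supplied by the caller:
      - train(labeled)        -> model
      - score(model, samples) -> informativeness score per sample
      - annotate(samples)     -> list of (sample, label) pairs
    """
    model = train(labeled)
    for _ in range(max_rounds):
        if not unlabeled:          # pool exhausted -> stop
            break
        scores = score(model, unlabeled)
        # Rank samples by informativeness and pick the top batch
        # for human annotation.
        ranked = sorted(zip(scores, range(len(unlabeled))), reverse=True)
        picked_idx = {i for _, i in ranked[:batch_size]}
        picked = [unlabeled[i] for i in picked_idx]
        unlabeled = [s for i, s in enumerate(unlabeled)
                     if i not in picked_idx]
        labeled.extend(annotate(picked))
        model = train(labeled)     # retrain on the enlarged labeled set
    return model
```

In practice the stopping condition would also monitor validation performance ("performance stabilizes"), which is omitted here for brevity.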

Uncertainty-based query strategies (least confidence, margin sampling, entropy) and a diversity-based strategy (text similarity) are evaluated. Experiments on real estate semantic tags show that uncertainty sampling reduces required annotations by ~40% compared with random labeling, while diversity sampling saves ~60%.
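The three uncertainty scores can be sketched as below, where `probs` is the model's predicted class distribution for one sample and a higher score means a more informative sample:

```python
import math

def least_confidence(probs):
    # High when the model is unsure of even its top prediction.
    return 1.0 - max(probs)

def margin(probs):
    # A small gap between the top-2 classes marks a hard sample;
    # negate so that larger = more uncertain, like the other scores.
    top2 = sorted(probs, reverse=True)[:2]
    return -(top2[0] - top2[1])

def entropy(probs):
    # Maximal when the predicted distribution is uniform.
    return -sum(p * math.log(p) for p in probs if p > 0)
```

A diversity-based strategy instead scores candidates by their dissimilarity (e.g. low text similarity) to samples already labeled, so the annotated set covers the input space rather than clustering around the decision boundary.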

Various models are benchmarked: keyword matching (baseline, 98.75% accuracy, 37.44% recall, F1 54.3%), XGBoost (F1 59.21%), TextCNN (F1 64.97%), a fused model combining the three (F1 70.15%), CRNN (leveraging TextCNN + RNN for long‑range context, F1 76.08%), and an improved Wide&Deep model that replaces the Deep component with TextCNN (F1 80.67%).
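The keyword-matching baseline's profile (near-perfect accuracy, low recall) follows directly from its mechanism: a tag fires only when one of its hand-curated keywords appears verbatim in the text. A minimal sketch, with a hypothetical `tag_keywords` dictionary:

```python
def keyword_match_tags(text, tag_keywords):
    """Return every tag whose keyword list has a verbatim hit in `text`.

    Exact substring matching rarely fires wrongly (high accuracy) but
    misses paraphrases and variants (low recall) — which is why the
    article layers learned models on top of it.
    """
    return {tag for tag, keywords in tag_keywords.items()
            if any(kw in text for kw in keywords)}
```

A paraphrase such as "you can bring your dog" would match no `pets_allowed` keyword list containing only "pet friendly", illustrating the recall gap that the learned models close.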

Online A/B tests confirm that two rounds of active learning (first uncertainty, then diversity) raise recall from 52% to 80% while maintaining 98% accuracy, achieving a 28% absolute recall improvement.

The study concludes that active learning dramatically cuts annotation costs and that the improved Wide&Deep architecture delivers the best performance; future work will explore pre‑trained language models and adversarial training to further boost results.

machine learning, model comparison, text classification, Wide&Deep, active learning, CRNN, semantic tagging
Written by

58 Tech

Official tech channel of 58, a platform for tech innovation, sharing, and communication.
