How Amap Uses AI to Automate Millions of User Feedback Reports

This article describes how Amap (Gaode Map) uses machine-learning techniques (word2vec embeddings, LSTM networks, fine-tuning, model ensembles, and confidence thresholds) to automatically classify and verify large volumes of user-feedback intelligence. The approach streamlines the multi-step workflow from data collection to map updates and substantially improves efficiency.

Alibaba Cloud Developer

1. Background

Amap (Gaode Map), a leading mobility big-data company in China, receives massive amounts of user feedback (text, images, and video) that are crucial for improving its map services. The challenge is to process hundreds of thousands of reports every day efficiently.

Intelligence refers to any information (text, image, video) that helps solve specific navigation or map‑production problems. User feedback includes intelligence, suggestions, and complaints submitted via mobile or PC clients.

[Figure: Typical user feedback types]

2. Problem and Solution

Users report feedback through the Amap app or PC portal by selecting options (source, major type, sub-type, road name) and entering a free-text description. After submission, each report must be classified, located, and verified before the map data can be updated.

Intelligence recognition: tag the problem type by analyzing the selected options and the free-text description, and by reviewing any attached images.

Intelligence positioning: determine the exact coordinates by checking the tap point, the vehicle's location at the time of reporting, and the user's navigation trajectory logs.

Intelligence verification: confirm the tag and location using imagery, heat maps, and road-network data.

The manual rule‑based pipeline suffers from low accuracy, high skill requirements, and slow throughput.

3. Machine‑Learning Solution

3.1 Business Decomposition and Hierarchical Splitting

The workflow is broken into six layers: business levels 1, 2, and 3, followed by intelligence recognition, intelligence positioning, and intelligence verification. Only the last three layers need partial human intervention; the upper layers can be fully automated.

[Figure: Business hierarchy diagram]

3.2 Model Alignment

Feedback descriptions are the most valuable signals. They are categorized as valid (meaningful) or invalid (empty or nonsensical). Valid feedback undergoes multi‑level classification (data / product / forward), with further sub‑classification for data (road vs. topic). Invalid feedback follows a parallel path using separate models and ultimately relies on rules or manual handling.
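The routing described above can be pictured as a small cascade. The following is a minimal sketch under stated assumptions: `is_valid`, `route`, and the keyword-based `toy_level1` and `toy_data_sub` functions are hypothetical stand-ins for the trained models, which the article does not show.

```python
from typing import Callable, List

# Hypothetical stand-in for a trained model: maps a description to a label.
Classifier = Callable[[str], str]


def is_valid(description: str) -> bool:
    """Toy validity check: the text must contain at least one letter or digit."""
    return any(ch.isalnum() for ch in description)


def route(description: str, level1: Classifier, data_sub: Classifier) -> List[str]:
    """Return the path of labels a report takes through the cascade."""
    if not is_valid(description):
        # Invalid descriptions take the parallel path
        # (separate models, then rules or manual handling).
        return ["invalid"]
    path = ["valid"]
    top = level1(description)  # expected: "data", "product", or "forward"
    path.append(top)
    if top == "data":
        path.append(data_sub(description))  # expected: "road" or "topic"
    return path


# Keyword rules used only so the sketch runs without trained models.
def toy_level1(text: str) -> str:
    lowered = text.lower()
    if "road" in lowered or "map" in lowered:
        return "data"
    if "app" in lowered:
        return "product"
    return "forward"


def toy_data_sub(text: str) -> str:
    return "road" if "road" in text.lower() else "topic"


print(route("The road is closed", toy_level1, toy_data_sub))
print(route("???", toy_level1, toy_data_sub))
```

Keeping each level behind the same callable interface means a single level can be retrained or replaced without touching the rest of the cascade.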

[Figure: Model-business mapping]

3.3 Model Choice

Text is first vectorized. Traditional one-hot and TF-IDF representations suffer from sparsity and cannot express that two different words are related, so word2vec embeddings are used to capture semantic similarity. For classification, deep learning models outperform models built on handcrafted features. Recurrent neural networks (RNNs) handle sequence data, and Long Short-Term Memory (LSTM) networks mitigate the vanishing-gradient problem that plain RNNs face on long sequences.
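The difference between sparse and dense representations can be illustrated with cosine similarity. This is only a sketch: the three-dimensional vectors in `toy_embeddings` are written by hand for illustration and are not real word2vec output, and the vocabulary is an assumption.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def one_hot(word: str, vocab: List[str]) -> List[float]:
    return [1.0 if w == word else 0.0 for w in vocab]


vocab = ["blocked", "closed", "restaurant"]

# Hand-written vectors for illustration only; real word2vec embeddings are
# learned from a corpus and typically have a hundred or more dimensions.
toy_embeddings: Dict[str, List[float]] = {
    "blocked": [0.9, 0.1, 0.0],
    "closed": [0.8, 0.2, 0.1],
    "restaurant": [0.0, 0.1, 0.9],
}

# Distinct one-hot vectors are always orthogonal, so related words look unrelated.
one_hot_sim = cosine(one_hot("blocked", vocab), one_hot("closed", vocab))

# Dense vectors can place related words close together.
dense_related = cosine(toy_embeddings["blocked"], toy_embeddings["closed"])
dense_unrelated = cosine(toy_embeddings["blocked"], toy_embeddings["restaurant"])

print(one_hot_sim, round(dense_related, 3), round(dense_unrelated, 3))
```

With one-hot vectors every pair of different words has similarity zero, whereas dense embeddings let "blocked" and "closed" score close to 1 while an unrelated word scores close to 0.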

3.4 Model Architecture

Each feedback’s word‑vector sequence is fed into an LSTM. The final LSTM hidden state is concatenated with selected categorical features, passed through a fully connected layer, and classified with a softmax output.
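To make the data flow concrete, here is a minimal forward-pass sketch in plain Python. It is not the production model: `TinyLSTMClassifier` is a hypothetical name, the weights are random and untrained, biases are omitted, the dimensions are arbitrary, and a single fully connected layer produces the logits that feed the softmax.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(logits: Vector) -> Vector:
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLSTMClassifier:
    """Untrained, randomly initialized model that shows only the data flow."""

    def __init__(self, embed_dim: int, hidden: int, n_cat: int,
                 n_classes: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hidden = hidden
        # One weight matrix per gate (input, forget, cell candidate, output),
        # each applied to the concatenation [x_t, h_{t-1}].
        self.gates = {g: rand_matrix(hidden, embed_dim + hidden, rng) for g in "ifgo"}
        self.dense = rand_matrix(n_classes, hidden + n_cat, rng)

    def final_hidden(self, sequence: List[Vector]) -> Vector:
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        for x in sequence:
            z = x + h  # list concatenation: [x_t, h_{t-1}]
            i = [sigmoid(a) for a in matvec(self.gates["i"], z)]
            f = [sigmoid(a) for a in matvec(self.gates["f"], z)]
            g = [math.tanh(a) for a in matvec(self.gates["g"], z)]
            o = [sigmoid(a) for a in matvec(self.gates["o"], z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h

    def predict_proba(self, sequence: List[Vector], categorical: Vector) -> Vector:
        # Concatenate the final hidden state with the categorical features.
        features = self.final_hidden(sequence) + categorical
        return softmax(matvec(self.dense, features))


model = TinyLSTMClassifier(embed_dim=4, hidden=3, n_cat=2, n_classes=3)
word_vectors = [[0.1, 0.2, 0.0, -0.1], [0.3, -0.2, 0.1, 0.0]]  # two "words"
selected_options = [1.0, 0.0]  # e.g. a one-hot encoded option from the form
probs = model.predict_proba(word_vectors, selected_options)
print(probs)
```

In practice a deep-learning framework would supply the LSTM layer and the training loop; the sketch only shows where the categorical features join the text representation.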

[Figure: LSTM classification architecture]

4. Practical Experience

4.1 Fine‑tuning

Because labeled samples are scarce, a pre-trained model is fine-tuned on the intelligence-recognition dataset, which improves accuracy by about 3% across training sets of various sizes.

4.2 Hyper‑parameter Tuning

Initialize the weights with an SVD-based initialization.

Apply dropout to the LSTM inputs (especially with a bidirectional LSTM) to reduce over-fitting.

The Adam optimizer performed best, with RMSprop giving similar results.

A batch size of around 128 works well, although 64 sometimes gives better results.

Always shuffle the training data.
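Two of these tips, shuffling and batch size, are easy to show in isolation. The helper below is a hypothetical sketch (`shuffled_batches` is not from the original project) that shuffles a copy of the samples and slices it into mini-batches of 128.

```python
import random
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def shuffled_batches(samples: Sequence[T], batch_size: int = 128,
                     seed: int = 0) -> Iterator[List[T]]:
    """Yield mini-batches from a shuffled copy of `samples`."""
    order = list(samples)
    random.Random(seed).shuffle(order)  # the caller's data is left untouched
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


batches = list(shuffled_batches(range(300), batch_size=128))
print([len(b) for b in batches])
```

Passing a different `seed` for each epoch gives a fresh ordering every time, so the model does not see the samples in the same sequence twice.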

4.3 Ensemble

Voting among the five best-performing models, each trained with different hyper-parameters, improves overall accuracy by about 1.5%.
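The article does not say whether the vote is taken over hard labels or averaged probabilities. The sketch below assumes hard majority voting; `majority_vote` and `ensemble_predict` are hypothetical helper names, and the tie-breaking rule is an illustrative choice.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the most common label; ties go to the earliest-listed model."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable")


def ensemble_predict(per_model: List[List[str]]) -> List[str]:
    """`per_model[m][s]` is model m's label for sample s; vote per sample."""
    return [majority_vote(votes) for votes in zip(*per_model)]


# Five hypothetical models voting on a single report.
print(majority_vote(["road", "road", "topic", "road", "product"]))
```

Listing the models from strongest to weakest makes the tie-break favor the best single model.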

4.4 Confidence Thresholding

High‑confidence predictions are automated; low‑confidence ones are sent for manual review. A simple per‑class threshold strategy outperformed more complex confidence‑model approaches, and a top‑N recommendation list further reduces operator effort.
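A per-class threshold rule with a top-N fallback might look like the sketch below. The class names, threshold values, and the `route_prediction` helper are all hypothetical; in practice the thresholds would be tuned per class on held-out data.

```python
from typing import Dict, List, Tuple


def top_n(probs: Dict[str, float], n: int = 3) -> List[Tuple[str, float]]:
    """Return the n most probable (label, probability) pairs, best first."""
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:n]


def route_prediction(probs: Dict[str, float],
                     thresholds: Dict[str, float],
                     default_threshold: float = 0.9,
                     n: int = 3) -> dict:
    """Automate confident predictions; otherwise send a top-N list to an operator."""
    ranked = top_n(probs, n)
    label, confidence = ranked[0]
    if confidence >= thresholds.get(label, default_threshold):
        return {"action": "auto", "label": label}
    return {"action": "manual_review", "suggestions": [name for name, _ in ranked]}


# Hypothetical classes and thresholds, for illustration only.
thresholds = {"road_closed": 0.8, "wrong_speed_limit": 0.95}

confident = route_prediction(
    {"road_closed": 0.85, "wrong_speed_limit": 0.10, "other": 0.05}, thresholds)
uncertain = route_prediction(
    {"road_closed": 0.06, "wrong_speed_limit": 0.90, "other": 0.04}, thresholds, n=2)
print(confident)
print(uncertain)
```

A stricter threshold for a class trades automation rate for precision on that class, which is why separate thresholds per class can be tuned independently.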

[Figure: Confidence model formula]
[Figure: Adjusted confidence formula]

5. Results and Impact

5.1 Intelligence Classification

Product‑class accuracy > 96 %; data‑class recall ≈ 99 %.

Automation reduced manual workload by 80 % and cut per‑task cost to one‑fifth of the original.

5.2 Intelligence Recognition

Valid‑description accuracy > 96 % after applying confidence‑based routing, boosting operator efficiency by > 30 %.

6. Conclusion and Outlook

The project established a repeatable methodology for tackling complex business problems with NLP and deep learning, delivering substantial efficiency gains while maintaining high user satisfaction. Ongoing work focuses on further model refinement and extending the approach to other domains.
