Turning Maps into a Living Map: Amap’s G-Where Generative AI Recommendation
Amap upgrades its homepage recommendation by integrating large‑model capabilities—G‑Where, G‑Action, and G‑Plan—through semantic ID generation, item tokenization, and multi‑stage LLM training, achieving significant offline and online performance gains while illustrating a scalable generative recommendation framework.
Special Introduction
Amap, a leading map app serving one billion users, aims to combine two decades of accumulated data with modern large‑model capabilities, evolving from a "static map + passive planning" system into a "dynamic cognition + proactive decision" one.
With AI development, users demand a "living map" that can think.
Innovation Recommendation Team – Home‑Page Recommendation Group
G‑Where : large‑model based intelligent destination recommendation.
G‑Action : real‑time spatio‑temporal context to predict travel needs and provide diversified suggestions.
G‑Plan : a reasoning‑capable large model that integrates AI‑agent abilities to assemble fragmented needs into a complete schedule displayed on the home page.
Innovation Recommendation Team – Human‑Space‑Time Model Group
Pre‑training of human‑space‑time large models learns city rhythms from massive anonymized app usage data.
Post‑training quickly turns a general model into a domain expert, making each recommendation precisely match user habits.
Innovation Recommendation Team – Exploration Recommendation Group
Content‑topic recommendation aggregates world knowledge to discover surprising points of interest (e.g., family day trips, intimate cafés).
Next, we dive into how Amap uses generative large models to embed "understanding you" into every recommendation.
Home‑Page Recommendation
Travel recommendation differs from traditional content or product recommendation: it must capture precise user travel intent in real time, shifting the main metric from List AUC to TOP‑1 ACC while respecting user behavior, scenario, and spatio‑temporal cognition.
User travel exhibits clear time, space, and personalization differences:
Geographic dimension : stable behavior in resident areas, short‑term needs in non‑resident areas.
Temporal dimension : regular peaks, instant needs during off‑peak, leisure activities during holidays.
User dimension : coexistence of short‑term, periodic, and long‑term preferences.
Traditional retrieval‑ranking frameworks can recommend the most frequent destinations, but they suffer from information bubbles, sample selection bias (SSB), weak spatio‑temporal awareness, and an inability to model long‑term versus short‑term behavior.
To address these issues, the Innovation Recommendation Team integrates large‑model capabilities into three tasks: G‑Where (guess where you go), G‑Action (guess what you do), and G‑Plan (guess your plan), forming a unique G‑series generative recommendation paradigm based on human‑space‑time background.
G‑Where Architecture
The workflow consists of two stages:
Item Quantization : tokenizes all POIs using a multi‑modal alignment method that captures geographic relationships.
LLM Post‑Training : multi‑stage training (continued pre‑training, instruction fine‑tuning, preference alignment) that aligns the new POI tokens with the base model's semantics.
Amap ID: Semantic ID Generation for POIs
Generative recommendation models rely on next‑token prediction, which requires quantizing items to reduce total token count. Amap’s POI count exceeds one billion, so a semantic ID (SID) is needed to compress POIs while preserving multi‑modal information (text, images, collaborative‑filtering signals, spatial graph embeddings).
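To see the compression at work, consider hierarchical codes with illustrative sizes (the article does not state Amap's actual codebook configuration): a few thousand new tokens can index over a billion POIs.

```python
# Illustrative arithmetic: hierarchical semantic IDs vs. one token per POI.
# Sizes below are assumptions for the sketch, not Amap's real configuration.
levels, codebook_size = 3, 1024

new_tokens = levels * codebook_size   # tokens added to the LLM vocabulary
capacity = codebook_size ** levels    # distinct POIs representable

print(new_tokens, capacity)  # 3072 1073741824
```

With three 1024‑way codebooks, 3,072 vocabulary entries suffice to address roughly 1.07 billion POIs, which is why semantic IDs make next‑token prediction over a billion‑item catalog tractable.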
Traditional item tokenization methods (e.g., RQ‑VAE) struggle in LBS scenarios because they cannot effectively fuse textual, visual, spatial, and collaborative signals. To improve discriminability, a contrastive‑learning based tokenization pipeline is proposed:
Extract multi‑modal representations (image, text, CF signal, spatial graph).
Fuse them via an attention layer.
Apply residual quantization to obtain hierarchical codebooks.
Align codebooks with multi‑modal features using an InfoNCE loss.
After training, the codebooks exhibit clear hierarchical clustering, reflecting item similarity at multiple granularities.
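The quantization steps above can be sketched in miniature. This is a toy NumPy version under assumed shapes (8 POIs, 16‑dim fused embeddings, three 32‑way codebooks); the real pipeline learns codebooks end‑to‑end, whereas here they are random and fixed, so only the mechanics are shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_quantize(x, codebooks):
    """Assign each fused POI embedding a hierarchical semantic ID:
    one code per level, each level quantizing the previous residual."""
    residual = x.copy()
    ids, recon = [], np.zeros_like(x)
    for cb in codebooks:
        # squared distance from each residual to every codeword
        d = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        idx = d.argmin(axis=1)          # nearest codeword at this level
        ids.append(idx)
        recon += cb[idx]
        residual -= cb[idx]             # next level quantizes what is left
    return np.stack(ids, axis=1), recon

def info_nce(queries, keys, temperature=0.1):
    """InfoNCE alignment: the i-th quantized vector should match the
    i-th multi-modal feature (diagonal positives, in-batch negatives)."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = (q @ k.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

emb = rng.normal(size=(8, 16))                         # fused POI embeddings
books = [rng.normal(size=(32, 16)) for _ in range(3)]  # 3-level codebooks
sids, recon = residual_quantize(emb, books)            # (8, 3) semantic IDs
loss = info_nce(recon, emb)   # align quantized vectors with features
```

Each POI ends up with a short sequence of codes (here three), coarse to fine, which is what gives the codebooks their hierarchical clustering structure.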
LLM Post‑Training: Amap’s Domain‑Specific Recommendation LLM
Instead of training from scratch, a large base model is further trained to inject world and scenario knowledge into recommendation.
After tokenizing POIs, the new tokens are added to the model's tokenizer with randomly initialized embeddings; large‑scale post‑training then aligns these tokens with the model's semantics.
Continued Pre‑Training : teaches the model the "guess where" corpus and aligns new token semantics.
Instruction Fine‑Tuning : adapts the model to navigation‑specific instructions, matching online inference prompts.
Preference Alignment : learns spatio‑temporal preferences and ranking ability.
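A minimal sketch of the token‑injection step that precedes these stages. The `<sid_level_code>` naming and the tiny codebook sizes are assumptions for illustration, not Amap's actual format; a production system would extend a real LLM tokenizer and resize the embedding matrix accordingly.

```python
def sid_tokens(sid):
    """Render a hierarchical semantic ID as level-tagged tokens,
    e.g. (12, 3, 7) -> ['<sid_0_12>', '<sid_1_3>', '<sid_2_7>']."""
    return [f"<sid_{level}_{code}>" for level, code in enumerate(sid)]

def extend_vocab(vocab, levels=3, codebook_size=32):
    """Append every possible SID token to an existing vocabulary,
    mirroring tokenizer extension before continued pre-training."""
    for level in range(levels):
        for code in range(codebook_size):
            vocab.setdefault(f"<sid_{level}_{code}>", len(vocab))
    return vocab

vocab = extend_vocab({"<bos>": 0, "<eos>": 1})

# a continued-pre-training style sample: spatio-temporal context text
# followed by the semantic-ID tokens of the destination POI
sample = "friday 18:30, near office -> " + " ".join(sid_tokens((12, 3, 7)))
```

Because the SID embeddings start random, the continued pre‑training stage is what actually grounds them in the model's semantic space before instruction fine‑tuning and preference alignment.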
Evaluation Scheme and Gains
Offline Evaluation
Metrics include Acc@1 (top‑1 exposure accuracy), Time Consistency, Space Consistency, and Behavior Consistency.
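The headline metric is simple to state. A toy sketch (the consistency metrics, not shown, would additionally compare predicted versus actual time, place, and behavior distributions):

```python
def acc_at_1(predicted, actual):
    """Top-1 exposure accuracy: share of requests where the single
    recommended POI equals the destination the user actually chose."""
    assert len(predicted) == len(actual)
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

score = acc_at_1(["poi_a", "poi_b", "poi_c", "poi_d"],
                 ["poi_a", "poi_x", "poi_c", "poi_d"])
print(score)  # 0.75
```

Unlike List AUC, which rewards a good ordering over many candidates, Acc@1 only credits an exact hit in the single exposed slot, which is the constraint the home‑page card actually operates under.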
Scaling experiments from 0.5B to 7B parameters show consistent improvements across all metrics, confirming a scaling law.
Ablation studies reveal that removing the semantic ID (AmapID) causes the largest drop in Acc@1 (‑4.6pp), while removing the CPT module also significantly hurts all consistency metrics.
Online Gains
Key online improvements:
Overall UV‑CTR +4.64% and scenario lift rate +1.2%.
Commute card UV‑CTR +6.48%, UV exposure +14.9%, UV clicks +21.86%.
Generated recommendations show better temporal and spatial awareness, e.g., recommending night‑snack venues at 8 pm instead of generic coffee shops.
Online Latency
Latency tests on Qwen‑2.5 models (20 concurrent threads, 100 QPS, p99):
~30 ms at 0.5B parameters.
~50 ms at 1.5B parameters.
~60 ms at 3B parameters.
~140 ms at 7B parameters.
Conclusion
The "G‑Where" generative recommendation framework demonstrates that scaling large models in recommendation systems can break the limitations of traditional retrieval‑ranking pipelines, improving accuracy, temporal, spatial, and preference consistency. The approach is generalizable and lays a foundation for extending large‑model capabilities to more Amap scenarios, moving toward a truly "living map" that understands users.
Amap Tech
Official Amap technology account showcasing all of Amap's technical innovations.
