How UniMapGen Revolutionizes Large‑Scale Lane‑Level Map Generation with Generative AI

UniMapGen introduces a generative, multimodal framework that models lane lines as token sequences, employs an iterative state‑update mechanism for global consistency, and achieves state‑of‑the‑art performance on large‑scale satellite‑derived map construction, enabling seamless lane‑level navigation worldwide.

Amap Tech

Abstract

UniMapGen is a generative framework for constructing global lane‑level vector maps with high efficiency and precision. Traditional perception pipelines rely on expensive survey vehicles or on coarse satellite‑image segmentation; UniMapGen replaces them with a token‑based autoregressive generation process.

Introduction

High‑definition (HD) maps are essential for autonomous driving, but their production is limited by the cost of specialized mapping fleets and by the shortcomings of satellite imagery (occlusion, latency, annotation gaps). Existing pipelines treat map creation as a segmentation or detection problem, which yields fragmented road vectors and poor continuity across patch boundaries. UniMapGen reframes map construction as a generative task: lane lines are encoded as discrete token sequences and produced by a large‑scale autoregressive model that can ingest multiple modalities.

Method

UniMapGen consists of three tightly coupled modules.

Multimodal input architecture. The model accepts bird’s‑eye‑view (BEV) satellite images, low‑cost perspective‑view (PV) image sequences, optional natural‑language prompts, and contextual map data. Each modality can be omitted at training or inference, enabling operation with only satellite data when necessary.
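One common way to make every modality optional is to drop modalities at random during training, so the model learns to work with whatever inputs are present. The sketch below illustrates that idea only; it is not taken from the paper. The `MapInputs` container, its field names, the `drop_modalities` helper, and the choice to always keep the BEV image are all hypothetical.

```python
import random
from dataclasses import dataclass, fields, replace
from typing import List, Optional


@dataclass(frozen=True)
class MapInputs:
    """Hypothetical container: every modality is optional (None = absent)."""
    bev_image: Optional[str] = None          # path to a satellite BEV tile
    pv_frames: Optional[List[str]] = None    # paths to perspective-view frames
    text_prompt: Optional[str] = None        # natural-language hint
    map_context: Optional[List[str]] = None  # previously known map vectors


def present(inputs: MapInputs) -> List[str]:
    """Names of the modalities that are actually available."""
    return [f.name for f in fields(inputs) if getattr(inputs, f.name) is not None]


def drop_modalities(inputs, keep_prob, rng, always_keep=("bev_image",)):
    """Randomly blank out optional modalities (assumed training-time trick).

    Each present modality outside `always_keep` survives with probability
    `keep_prob`; dropped ones are set to None in a new MapInputs.
    """
    dropped = {
        name: None
        for name in present(inputs)
        if name not in always_keep and rng.random() >= keep_prob
    }
    return replace(inputs, **dropped)


full = MapInputs("sat.png", ["f0.jpg", "f1.jpg"], "two lanes each way", ["ctx"])
satellite_only = drop_modalities(full, keep_prob=0.0, rng=random.Random(0))
print(present(satellite_only))
```

With `keep_prob=0.0` the example falls back to the satellite-only case described above; intermediate values would expose the model to mixed combinations.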

Map vector tokenization. Road polylines are uniformly sampled to a fixed spatial resolution (e.g., 20 cm per point) and then reordered into a canonical sequence. Each sampled point is quantized into a discrete token from a learned vocabulary, turning the vector prediction problem into a sequence‑to‑sequence generation task.
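The tokenization steps can be illustrated with a small sketch. It makes several assumptions that the article does not confirm: a fixed grid quantizer stands in for the learned vocabulary, the canonical order is taken to be "endpoint with the smaller (y, x) first", and all function names and the `grid_size` parameter are hypothetical.

```python
import math


def canonicalize(points):
    """Orient a polyline so the endpoint with the smaller (y, x) comes first.
    This ordering rule is an assumption made for illustration."""
    first, last = points[0], points[-1]
    return list(points) if (first[1], first[0]) <= (last[1], last[0]) else list(points)[::-1]


def resample_polyline(points, spacing):
    """Resample a polyline so consecutive samples are `spacing` apart along
    its arc length; the original endpoint is always kept."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    if len(points) < 2:
        return list(points)
    out = [points[0]]
    dist_to_next = spacing  # arc length remaining until the next sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        pos = 0.0  # distance already consumed along this segment
        while seg_len - pos >= dist_to_next:
            pos += dist_to_next
            t = pos / seg_len
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            dist_to_next = spacing
        dist_to_next -= seg_len - pos
    if out[-1] != points[-1]:
        out.append(points[-1])
    return out


def point_to_token(x, y, cell, grid_size):
    """Quantize a point to a grid cell and flatten (row, col) to one token id."""
    col = min(max(round(x / cell), 0), grid_size - 1)
    row = min(max(round(y / cell), 0), grid_size - 1)
    return row * grid_size + col


def token_to_point(token, cell, grid_size):
    """Inverse of point_to_token, up to quantization error."""
    return ((token % grid_size) * cell, (token // grid_size) * cell)


def tokenize_polyline(points, spacing, cell, grid_size):
    ordered = canonicalize(points)
    return [point_to_token(x, y, cell, grid_size)
            for x, y in resample_polyline(ordered, spacing)]


# A 4 m lane line drawn right-to-left, sampled every 1 m on a 10x10 grid of 1 m cells.
print(tokenize_polyline([(4, 0), (0, 0)], spacing=1.0, cell=1.0, grid_size=10))
```

Because of the canonical ordering, the same lane line drawn in either direction maps to the same token sequence, which gives the autoregressive model a single target per line.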

State‑update mechanism. Instead of processing independent image patches, UniMapGen maintains a global state that records three semantic anchors for every vector: Start, End, and Cut. During inference the model iteratively expands a generated region (“growth” process), using the Cut anchors of previously generated patches as seeds for the next patch. This guarantees topological continuity and eliminates “broken road” artifacts across large areas.
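The growth process can be sketched in a heavily simplified form. The example below assumes a one-dimensional strip of patches visited left to right and horizontal roads given as `(x_start, x_end, y)`. The `toy_model` function is a hypothetical stand-in for the generator (the real model reads imagery and emits tokens), and `PATCH` and `grow` are likewise illustrative names.

```python
PATCH = 10.0  # hypothetical patch width in meters


def toy_model(col, seeds, roads):
    """Stand-in for the generator on patch `col`.

    Returns (vector_id, points, anchor) pieces, where anchor is "Cut" if the
    vector leaves the patch through its right edge and "End" otherwise.
    """
    left, right = col * PATCH, (col + 1) * PATCH
    pieces = []
    for vid, (x0, x1, y) in enumerate(roads):
        if vid in seeds:
            start_x = seeds[vid][0]   # continue from the previous Cut anchor
        elif left <= x0 < right:
            start_x = x0              # a new vector begins here: Start anchor
        else:
            continue                  # this road does not touch the patch
        anchor = "Cut" if x1 > right else "End"
        pieces.append((vid, [(start_x, y), (min(x1, right), y)], anchor))
    return pieces


def grow(roads, n_cols):
    """Expand the generated region patch by patch, seeding each patch with
    the Cut anchors recorded from the previous one."""
    vectors = {}  # vector id -> merged point list across patches
    anchors = {}  # vector id -> {"Start": point, "End": point}
    seeds = {}    # Cut anchors waiting to be continued in the next patch
    for col in range(n_cols):
        next_seeds = {}
        for vid, pts, anchor in toy_model(col, seeds, roads):
            if vid in vectors:
                vectors[vid].extend(pts[1:])  # skip the duplicated Cut point
            else:
                vectors[vid] = list(pts)
                anchors[vid] = {"Start": pts[0]}
            if anchor == "Cut":
                next_seeds[vid] = pts[-1]
            else:
                anchors[vid]["End"] = pts[-1]
        seeds = next_seeds
    return vectors, anchors


roads = [(2.0, 27.0, 1.0), (12.0, 18.0, 3.0)]
vectors, anchors = grow(roads, n_cols=3)
print(vectors[0])
```

The first road spans three patches but comes out as a single polyline, because each Cut anchor is handed to the next patch as a seed instead of being treated as a road end.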

Figure 2: Detailed UniMapGen architecture

Experiment

Evaluation was performed on the OpenSatMap dataset (20 cm resolution) covering extensive urban and suburban road networks. UniMapGen achieved state‑of‑the‑art completeness and continuity scores, outperforming detection‑based MapTR and segmentation‑based SegNeXt. Quantitative results show a significant increase in vector completeness and a reduction in fragmented edges, while qualitative visualizations demonstrate smooth lane curves and seamless stitching across patch boundaries.

Multimodal ablation confirms that adding PV sequences improves temporal freshness: real‑time PV frames correct outdated satellite BEV imagery, leading to more accurate current‑road representations.

Figure 3: Quantitative comparison with SOTA
Figure 4: Qualitative comparison with SOTA

Conclusion

UniMapGen demonstrates that large‑scale lane‑level map generation can be achieved by converting lane detection into discrete token generation and by enforcing a state‑update framework for global consistency. The flexible multimodal design (BEV, PV, text prompts) allows the system to adapt to varying data availability and to incorporate interactive updates. Future work will extend the token vocabulary to traffic signs, right‑of‑way rules, and other map elements, and will deepen text‑prompt interaction through synthetic data generation.

Paper homepage: https://amap-cvlab.github.io/UniMapGen/

Paper link: https://arxiv.org/abs/2509.22262

Tags: multimodal, Generative AI, autonomous driving, map generation, large-scale mapping, state update