Technical Evolution of Ground Marking Recognition for High‑Precision Maps
AMap’s ground‑marking recognition has progressed from simple threshold methods to advanced deep‑learning pipelines—including two‑stage R‑FCN, cascade detectors with local regression, corner‑point and segmentation hybrids, and LiDAR‑based 3‑D PointRCNN—achieving over 99 % recall and over 99 % positional accuracy within a 5 cm tolerance for high‑precision map production.
This article introduces the technical evolution of ground‑marking recognition in high‑precision maps at AMap. The methods described here have met the production‑line requirements of high‑precision map construction and provide a solid technical foundation for it.
Ground‑marking recognition refers to detecting various road‑surface elements such as arrows, text, numbers, speed‑bump markings, lane‑keeping lines, crosswalks, stop‑yield lines, etc. The recognized results become production data for the map‑building pipeline and are later used for autonomous driving, in‑vehicle navigation, and mobile navigation.
High‑precision maps require centimeter‑level accuracy for each map element, which makes the recognition task far more demanding than for ordinary maps. Two major challenges are the wide variety of marking types and sizes, and the wear, occlusion, and inconsistent clarity of markings in real‑world scenes.
1. Diversity of ground markings: markings differ in color (yellow, red, white), shape (arrows, characters, bars, patches, speed humps), and size (a standard arrow is 9 m long, but many markings are only 1–2 m or smaller).
2. Wear and occlusion: long‑term vehicle and pedestrian traffic causes wear and fading; traffic congestion, construction, and environmental conditions (rain, backlight) further degrade visibility.
Traditional extraction methods (threshold segmentation, skeletonization, connected‑component analysis) work well on clean data but struggle with worn, blurred, or low‑contrast markings. In practice they extract clear markings reliably, yet recall and positional accuracy drop sharply in challenging scenes.
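To make the classical pipeline concrete, here is a minimal sketch of thresholding followed by connected‑component analysis. The image, threshold, and minimum‑area values are illustrative assumptions, not AMap's production parameters; a real pipeline would also add skeletonization and shape filtering.

```python
# Minimal sketch: binarize a grayscale bird's-eye-view image, then find
# connected components (candidate markings) with a simple flood fill.
import numpy as np

def extract_markings(gray, thresh=200, min_area=4):
    """Return 4-connected components of pixels >= thresh, each a list of
    (row, col) coordinates, keeping only components with >= min_area pixels."""
    binary = gray >= thresh
    visited = np.zeros_like(binary, dtype=bool)
    components = []
    rows, cols = binary.shape
    for r in range(rows):
        for c in range(cols):
            if binary[r, c] and not visited[r, c]:
                stack, pixels = [(r, c)], []
                visited[r, c] = True
                while stack:                      # iterative flood fill
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and binary[ny, nx] and not visited[ny, nx]:
                            visited[ny, nx] = True
                            stack.append((ny, nx))
                if len(pixels) >= min_area:
                    components.append(pixels)
    return components

# A bright 3x3 patch on a dark background yields one 9-pixel component.
img = np.zeros((8, 8), dtype=np.uint8)
img[2:5, 2:5] = 255
comps = extract_markings(img)
```

The weakness the article describes is visible here: a worn marking whose pixels fall below the fixed threshold simply disappears, with no learned features to recover it.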
Deep‑learning era: since AlexNet (2012), convolutional neural networks (CNNs) have dramatically improved detection performance. The two main detection paradigms are two‑stage (e.g., R‑FCN, Faster R‑CNN) and one‑stage (SSD, YOLO). For high‑precision maps, the two‑stage approach is preferred for its higher accuracy.
R‑FCN detection: position‑sensitive score maps combined with position‑sensitive ROI pooling provide high localization precision. R‑FCN improves recall and generalization across diverse scenes.
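The core idea of R‑FCN's position‑sensitive ROI pooling can be sketched in a few lines: each of the k×k spatial bins of a region proposal reads only from its own dedicated score map, so the pooled grid votes on whether the parts of the object appear in the right relative positions. The shapes and the toy all‑ones input below are assumptions for illustration.

```python
# Sketch of position-sensitive ROI pooling for a single class (R-FCN style).
import numpy as np

def ps_roi_pool(score_maps, roi, k=3):
    """score_maps: (k*k, H, W) position-sensitive maps for one class.
    roi: (x0, y0, x1, y1) in pixels. Returns the k x k pooled grid;
    the class score is its mean (the 'vote' over object parts)."""
    x0, y0, x1, y1 = roi
    bin_w = (x1 - x0) / k
    bin_h = (y1 - y0) / k
    pooled = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ys = int(np.floor(y0 + i * bin_h))
            ye = max(ys + 1, int(np.ceil(y0 + (i + 1) * bin_h)))
            xs = int(np.floor(x0 + j * bin_w))
            xe = max(xs + 1, int(np.ceil(x0 + (j + 1) * bin_w)))
            # Bin (i, j) pools ONLY from its own map, which encodes
            # "does this location look like part (i, j) of the object?"
            pooled[i, j] = score_maps[i * k + j, ys:ye, xs:xe].mean()
    return pooled

maps = np.ones((9, 12, 12))                 # toy maps: uniform evidence
score = ps_roi_pool(maps, (0, 0, 12, 12)).mean()
```

Because the per‑part evidence is baked into the score maps, almost all computation is shared across ROIs, which is what makes R‑FCN both fast and precise.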
Cascade detector: by iteratively refining bounding‑box predictions (e.g., using Deformable‑Conv), the cascade approach reduces the gap between predicted and true positions, meeting the strict precision requirements of high‑precision maps.
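The iterative refinement can be sketched as repeatedly applying the standard box‑delta parameterisation, each stage starting from the previous stage's output. The toy regressor below, which closes a fixed fraction of the gap to a hypothetical ground truth, is an assumption standing in for the learned per‑stage heads.

```python
# Sketch of cascade bounding-box refinement with the usual delta encoding.
import numpy as np

def apply_delta(box, delta):
    """Apply (dx, dy, dw, dh) deltas to an (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    cx, cy = x0 + w / 2, y0 + h / 2
    cx += delta[0] * w
    cy += delta[1] * h
    w *= np.exp(delta[2])
    h *= np.exp(delta[3])
    return np.array([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])

def cascade_refine(box, stages):
    """Each stage refines the box produced by the stage before it."""
    for regress in stages:
        box = apply_delta(box, regress(box))
    return box

gt = np.array([10.0, 10.0, 20.0, 20.0])     # hypothetical ground truth

def make_stage(frac):
    def regress(box):
        # Toy regressor: emits the delta closing `frac` of the gap to gt.
        x0, y0, x1, y1 = box
        w, h = x1 - x0, y1 - y0
        dcx = ((gt[0] + gt[2]) / 2 - (x0 + x1) / 2) / w * frac
        dcy = ((gt[1] + gt[3]) / 2 - (y0 + y1) / 2) / h * frac
        dw = np.log((gt[2] - gt[0]) / w) * frac
        dh = np.log((gt[3] - gt[1]) / h) * frac
        return np.array([dcx, dcy, dw, dh])
    return regress

start = np.array([8.0, 8.0, 22.0, 22.0])
refined = cascade_refine(start, [make_stage(0.5)] * 3)
```

Even with an imperfect per‑stage regressor, three stages shrink the residual error geometrically, which is the effect the cascade relies on to meet centimeter‑level targets.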
Cascade + local regression: a local regression stage focuses on the marking region to further refine the position, yielding finer boundaries.
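One way to read "local regression" is that the final stage works in the coordinate frame of a crop around the coarse box rather than the whole image, so its predictions are at crop resolution. The context ratio and the stub regressor below are illustrative assumptions, not the article's actual head design.

```python
# Sketch: refine a coarse box by regressing in a local window's coordinates.
def local_refine(coarse_box, local_regressor, context=0.25):
    """Crop a window around the coarse box (padded by `context` on each
    side), run a regressor that predicts the box in window-normalised
    [0, 1] coordinates, and map the result back to global coordinates."""
    x0, y0, x1, y1 = coarse_box
    w, h = x1 - x0, y1 - y0
    wx0, wy0 = x0 - context * w, y0 - context * h        # window origin
    ww, wh = w * (1 + 2 * context), h * (1 + 2 * context)  # window size
    nx0, ny0, nx1, ny1 = local_regressor((wx0, wy0, ww, wh))
    return (wx0 + nx0 * ww, wy0 + ny0 * wh,
            wx0 + nx1 * ww, wy0 + ny1 * wh)

# Stub regressor: pretends the marking occupies the middle of the window.
refined = local_refine((10, 10, 20, 20), lambda win: (0.25, 0.25, 0.75, 0.75))
```

Because one normalised unit of regression error now corresponds to the window size rather than the image size, boundary localisation becomes proportionally finer.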
Corner‑point detection: predicts two heatmaps, for the top‑left and bottom‑right corners of each object, along with embedding vectors for grouping. This reduces reliance on dense anchors and improves bounding‑box tightness.
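The grouping step can be sketched as pairing corners whose embeddings are close and whose geometry is consistent. Real corner detectors learn the embeddings end to end; the 1‑D embeddings, threshold, and toy detections below are assumptions for illustration.

```python
# Sketch of corner grouping by embedding distance (CornerNet-style idea).
def group_corners(top_left, bottom_right, max_embed_dist=0.5):
    """top_left / bottom_right: lists of (x, y, embedding) corner detections.
    Pair corners whose embeddings are close AND whose geometry is valid
    (bottom-right strictly below and to the right of top-left)."""
    boxes = []
    for tx, ty, te in top_left:
        for bx, by, be in bottom_right:
            if abs(te - be) < max_embed_dist and bx > tx and by > ty:
                boxes.append((tx, ty, bx, by))
    return boxes

# Two objects: embeddings ~0.1 belong together, embeddings ~0.9 belong together.
tl = [(2, 2, 0.1), (30, 5, 0.9)]
br = [(10, 12, 0.12), (44, 20, 0.88)]
boxes = group_corners(tl, br)
```

Since the box is read directly off two detected keypoints, its tightness is limited only by heatmap resolution, not by how well an anchor happened to fit the marking.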
Cascade + segmentation refinement: a ResNet‑based semantic‑segmentation model (with adaptive receptive fields, multi‑scale fusion, coarse‑fine fusion, and ROI attention) provides pixel‑level masks. Detection supplies coarse locations, while segmentation refines the exact marking shape.
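The division of labour between the two models can be sketched as: the detector proposes a coarse box, and the mask's confident pixels define the final extent. The mask layout and probability threshold below are illustrative assumptions.

```python
# Sketch: tighten a detection box to the extent of a segmentation mask.
import numpy as np

def refine_box_with_mask(coarse_box, mask, prob_thresh=0.5):
    """mask: (H, W) per-pixel marking probabilities in image coordinates.
    Returns the bounding box of confident mask pixels, falling back to the
    detector's box when the mask has no confident pixels."""
    ys, xs = np.nonzero(mask >= prob_thresh)
    if len(xs) == 0:
        return coarse_box          # worn marking, mask empty: trust detector
    return (xs.min(), ys.min(), xs.max() + 1, ys.max() + 1)

mask = np.zeros((20, 20))
mask[5:9, 6:14] = 0.9              # the marking's true pixel extent
tight = refine_box_with_mask((4, 3, 16, 12), mask)
```

The detector keeps recall high on worn markings, while the mask recovers pixel‑accurate boundaries wherever the paint is still visible.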
PANet: a detection‑segmentation hybrid that fuses coarse and fine features both top‑down and bottom‑up, adds adaptive feature pooling, and includes a mask classification branch, resulting in higher position accuracy and robustness to wear.
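The bottom‑up path can be sketched as each coarser level summing its own FPN feature with a downsampled copy of the level below, so fine localisation detail reaches the coarse levels in a few hops. The average‑pool downsampling and toy all‑ones features below are simplifying assumptions (PANet uses strided convolutions).

```python
# Sketch of PANet's bottom-up path augmentation on a 3-level pyramid.
import numpy as np

def bottom_up_fuse(pyramid):
    """pyramid: feature maps from fine (large) to coarse (small), each half
    the previous resolution. Each output level sums the FPN feature with a
    2x-downsampled copy of the previous (finer) fused level."""
    fused = [pyramid[0]]
    for level in pyramid[1:]:
        prev = fused[-1]
        # 2x average-pool downsample (stand-in for a strided conv).
        down = prev.reshape(prev.shape[0] // 2, 2,
                            prev.shape[1] // 2, 2).mean(axis=(1, 3))
        fused.append(level + down)
    return fused

p2, p3, p4 = np.ones((8, 8)), np.ones((4, 4)), np.ones((2, 2))
n2, n3, n4 = bottom_up_fuse([p2, p3, p4])
```

For ground markings this matters because sharp boundary evidence lives in the fine levels, yet large markings (a 9 m arrow) are detected at the coarse ones.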
3‑D point‑cloud detection: leveraging LiDAR point clouds, the PointRCNN framework generates high‑quality 3‑D proposals, performs foreground‑background segmentation, and refines proposals with ROI pooling and feature fusion, achieving precise 3‑D localization of ground markings.
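The proposal step can be sketched as: a segmentation head scores each point as foreground, and a 3‑D box is generated around the foreground points. The axis‑aligned min/max box below is a deliberate simplification of PointRCNN's learned, oriented proposals, and the scores are stand‑ins for the network's output.

```python
# Sketch: a 3-D proposal from per-point foreground scores.
import numpy as np

def proposal_from_foreground(points, fg_scores, thresh=0.5):
    """points: (N, 3) LiDAR points (metres); fg_scores: per-point foreground
    probability from the segmentation head. Returns an axis-aligned 3-D box
    (min corner, max corner) around the foreground points, or None."""
    fg = points[fg_scores >= thresh]
    if len(fg) == 0:
        return None
    return fg.min(axis=0), fg.max(axis=0)

pts = np.array([[0.0, 0.0, 0.0],     # marking point
                [1.0, 1.0, 0.1],     # marking point
                [9.0, 9.0, 5.0]])    # background clutter
scores = np.array([0.9, 0.8, 0.1])
lo, hi = proposal_from_foreground(pts, scores)
```

Working directly in 3‑D sidesteps the projection errors of image‑only pipelines, which is what makes centimeter‑level 3‑D localization attainable.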
Results and benefits: with large‑scale data, the proposed pipelines achieve over 99 % recall and over 99 % positional accuracy within a 5 cm ground‑truth tolerance. The solutions are already deployed in production, dramatically improving the efficiency and quality of high‑precision map creation.
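A metric like "recall within a 5 cm tolerance" can be sketched as counting ground‑truth marking positions that have a prediction within 5 cm. The nearest‑neighbour matching and toy coordinates below are assumptions; production evaluation would use one‑to‑one matching and per‑class breakdowns.

```python
# Sketch: recall of marking positions at a 5 cm (0.05 m) tolerance.
import numpy as np

def recall_at_tolerance(pred_pts, gt_pts, tol=0.05):
    """Fraction of ground-truth positions (metres) with a prediction within
    `tol` metres. Nearest-match only, which suffices for a sketch."""
    if len(gt_pts) == 0:
        return 1.0
    hits = 0
    for g in gt_pts:
        d = np.linalg.norm(pred_pts - g, axis=1).min() if len(pred_pts) else np.inf
        hits += d <= tol
    return hits / len(gt_pts)

gt = np.array([[0.0, 0.0], [1.0, 1.0]])
pred = np.array([[0.03, 0.0],    # 3 cm off: within tolerance
                 [1.0, 1.2]])    # 20 cm off: a miss
r = recall_at_tolerance(pred, gt)
```

A 20 cm error that would be invisible in an ordinary map counts as an outright miss here, which is why the production bar of >99 % at 5 cm is so demanding.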
High‑precision maps serve as the “eyes” of autonomous‑driving systems, and therefore demand extremely high recall and positional accuracy. The continuous advancement from manual to semi‑automatic and fully automatic recognition, and from 2‑D to 3‑D and multi‑sensor fusion, is essential for scaling map production.
Amap Tech
Official Amap technology account showcasing all of Amap's technical innovations.