Multi-Code Scanning Framework and Optimization for Mobile Apps
This article documents how a mobile app’s scanner was re‑engineered from single‑code to multi‑code recognition: the logic pipeline was overhauled to process arrays of codes, the UI gained overlays anchored to each detected code, a rotation‑and‑scale transformation maps code boxes to screen coordinates, iOS Vision was integrated alongside the existing SDK, and confidence filtering, deduplication, edge‑intelligence prediction, and memory‑optimized caching were layered on top, ultimately raising recognition rates by over 30 percentage points and cutting missed detections.
Background & Challenge – Multiple barcodes or QR codes often appear together (e.g., clothing tags, beverage bottles). The original scanner only handled a single code, leaving users unaware of which code was detected. The goal was to enable multi‑code detection, anchor each code’s position, and retain compatibility with single‑code scenarios.
Logic Layer – The processing pipeline was changed from handling a single data object to iterating over an array of codes. The decoder now caches the current frame for later display.
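A minimal sketch of that pipeline change, assuming a hypothetical `DetectedCode` model and `ScanPipeline` class (names are illustrative, not the SDK’s actual API): the handler now accepts an array of results, caches the frame, and derives one overlay anchor per code.

```swift
import Foundation

// Hypothetical result model; field names are assumptions, not the SDK's API.
struct DetectedCode {
    let payload: String
    let box: CGRect
}

// Before: the pipeline handled exactly one result. After: it iterates over
// every code in the frame and caches the frame for the overlay layer.
final class ScanPipeline {
    private(set) var cachedFrame: Data?        // stand-in for the captured frame
    private(set) var overlayAnchors: [CGPoint] = []

    func handle(results: [DetectedCode], frame: Data) {
        cachedFrame = frame                    // keep the frame for later display
        overlayAnchors = results.map { CGPoint(x: $0.box.midX, y: $0.box.midY) }
    }
}
```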
View Layer – After detection, the UI overlays arrows anchored to each code’s center on the cached frame. Two display strategies were evaluated:
Stop the camera session and keep the last frame static.
Cache the current frame as an image overlay.
Both have trade‑offs; the final solution controls memory via resolution scaling and releases cached frames after user interaction.
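The resolution‑scaling idea can be sketched as a small helper that caps the cached frame’s longer side, so a full‑resolution sensor frame never lives in memory as an overlay image. The `maxDimension` value here is an assumed tuning knob, not a number from the article.

```swift
import Foundation

// Cap the cached frame's longer side at maxDimension, preserving aspect ratio.
// Frames already under the cap are returned unchanged.
func cappedSize(for size: CGSize, maxDimension: CGFloat = 1280) -> CGSize {
    let longest = max(size.width, size.height)
    guard longest > maxDimension else { return size }
    let scale = maxDimension / longest
    return CGSize(width: (size.width * scale).rounded(),
                  height: (size.height * scale).rounded())
}
```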
Code‑Box Transformation Algorithm
// Maps a code box from camera-frame coordinates to on-screen coordinates.
func transformedScreenRect(for box: CGRect, imageSize: CGSize, screenSize: CGSize) -> CGRect {
    // Rotate the box from the sensor's landscape orientation into portrait.
    let rotated = CGRect(x: imageSize.width - (box.origin.y + box.height), y: box.origin.x, width: box.height, height: box.width)
    // If the image is proportionally wider than the screen, fill by height; otherwise fill by width.
    let fillByHeight = imageSize.width / imageSize.height > screenSize.width / screenSize.height
    let scale = fillByHeight ? screenSize.height / imageSize.height : screenSize.width / imageSize.width
    let scaled = rotated.applying(CGAffineTransform(scaleX: scale, y: scale))
    // Subtract the overflow that the centered, aspect-filled image crops off-screen.
    return fillByHeight
        ? scaled.offsetBy(dx: -(imageSize.width * scale - screenSize.width) / 2, dy: 0)
        : scaled.offsetBy(dx: 0, dy: -(imageSize.height * scale - screenSize.height) / 2)
}
Reducing Miss‑Detection Rate – Single‑frame decoding success depends on blur, compression, and algorithmic limits. Multi‑code scenarios showed higher miss rates, prompting additional strategies.
Vision Decoding – iOS 13+ Vision API was integrated in parallel with the existing SDK. Both decoders take ~50‑100 ms, allowing serial processing without blocking the UI. Vision provides supplemental detections, especially for codes missed by the SDK.
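A sketch of such a supplemental Vision pass. One detail the integration must handle: Vision reports `boundingBox` in a normalized, bottom‑left‑origin space, while overlay drawing needs top‑left‑origin pixels, so a pure conversion helper is factored out below. The `visionDecode` wrapper is an assumed shape for the integration, not the article’s actual code.

```swift
import Foundation
#if canImport(Vision)
import Vision
#endif

// Convert Vision's normalized, bottom-left-origin rect to top-left pixel coordinates.
func pixelRect(fromNormalized r: CGRect, imageSize: CGSize) -> CGRect {
    CGRect(x: r.origin.x * imageSize.width,
           y: (1 - r.origin.y - r.height) * imageSize.height,
           width: r.width * imageSize.width,
           height: r.height * imageSize.height)
}

#if canImport(Vision)
// Supplemental decoding pass (iOS 13+ / macOS 10.15+); run serially after the SDK.
func visionDecode(cgImage: CGImage,
                  completion: @escaping ([(payload: String, confidence: Float, box: CGRect)]) -> Void) {
    let request = VNDetectBarcodesRequest { request, _ in
        let size = CGSize(width: cgImage.width, height: cgImage.height)
        let decoded = (request.results as? [VNBarcodeObservation] ?? []).compactMap { obs -> (String, Float, CGRect)? in
            guard let payload = obs.payloadStringValue else { return nil }
            return (payload, obs.confidence, pixelRect(fromNormalized: obs.boundingBox, imageSize: size))
        }
        completion(decoded)
    }
    try? VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
}
#endif
```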
Confidence Filtering & Deduplication – Results with confidence < 0.9 are discarded. Duplicate detections (same payload and overlapping position) are merged, keeping the entry with the highest confidence.
Model Mapping & Data Fusion – Vision outputs are mapped to the SDK’s data model (type, payload, confidence, bounding box) and then fused with SDK results using the same deduplication logic.
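The filter‑then‑fuse step can be sketched as follows, using an illustrative `CodeResult` model (field names are assumptions): drop detections below the confidence floor, then merge entries with the same payload whose boxes overlap, keeping the highest‑confidence one.

```swift
import Foundation

// Illustrative fused-result model shared by SDK and Vision outputs.
struct CodeResult {
    let payload: String
    let confidence: Float
    let box: CGRect
}

// Fuse SDK and Vision results: discard results below minConfidence, then merge
// duplicates (same payload, overlapping box), keeping the best-confidence entry.
func fuse(_ sdk: [CodeResult], _ vision: [CodeResult], minConfidence: Float = 0.9) -> [CodeResult] {
    var merged: [CodeResult] = []
    for candidate in (sdk + vision) where candidate.confidence >= minConfidence {
        if let i = merged.firstIndex(where: {
            $0.payload == candidate.payload && $0.box.intersects(candidate.box)
        }) {
            if candidate.confidence > merged[i].confidence { merged[i] = candidate }
        } else {
            merged.append(candidate)
        }
    }
    return merged
}
```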
Edge Intelligence – A lightweight on‑device code‑detection model predicts code locations before decoding, enabling targeted cropping and improving SDK success rates.
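The targeted-cropping idea reduces to simple rectangle math: expand the model’s predicted box by a margin (so a slightly inaccurate prediction still contains the whole code) and clamp it to the image bounds before handing the region to the decoder. The 20% margin below is an assumed default, not a value from the article.

```swift
import Foundation

// Expand the predicted code box by a relative margin and clamp to the image,
// so the decoder sees a small, code-centered region instead of the full frame.
func cropRegion(for predicted: CGRect, imageSize: CGSize, margin: CGFloat = 0.2) -> CGRect {
    let expanded = predicted.insetBy(dx: -predicted.width * margin,
                                     dy: -predicted.height * margin)
    return expanded.intersection(CGRect(origin: .zero, size: imageSize))
}
```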
Results – The combined approach raised multi‑code recognition rates significantly (over 30 percentage points in album mode) and reduced miss‑detections across various scenarios.
Conclusion & Outlook – Continuous improvements in SDK algorithms, iOS Vision, and edge‑intelligence are essential for sustaining high‑quality scanning experiences.
DaTaobao Tech
Official account of DaTaobao Technology