Building a Cloud‑Native AI Glass Traffic Enforcement Prototype with AgentRun and Serverless Functions
This article details a cloud‑native architecture that combines Meta Ray‑Ban AI glasses, a custom iOS app, and Alibaba Cloud Function Compute (FC) with AgentRun to perform OCR‑based traffic rule enforcement, showcasing a three‑layer "client‑brain‑tools" design, prompt‑driven logic, and cost‑effective serverless deployment.
Background: The author purchased Meta Ray‑Ban glasses, which lacked Chinese-language support and an open SDK. After Meta released the Device Access Toolkit, the open‑source project turbometa-rayban-ai provided a Chinese-language app with Bailian API integration, enabling features such as multimodal dialogue and calorie detection.
Problem: Inspired by traffic police using AI glasses to detect violations, the author wondered if the core logic could be reduced to OCR + database lookup + rule evaluation and implemented using cloud services.
"Client‑Brain‑Tools" Three‑Layer Architecture
Client (端): AI glasses + iOS app. Handles frame extraction and image transmission, acting as a lightweight relay.
Brain (脑): Alibaba Cloud Function Compute AgentRun. Performs reasoning (e.g., determines whether the date is odd or even) and decides which backend tools to invoke.
Tools (手): Additional FC functions that interact with RDS (MySQL) and SLS for database queries and logging.
Data Flow:
See: The glasses capture a license plate → Bluetooth/Wi‑Fi → iOS app.
Upload: The iOS app extracts a frame, encodes it, and POSTs it to an FC gateway.
Think: FC injects the date context; AgentRun evaluates the rules and decides whether to query the database.
Action: AgentRun triggers FC tools to read/write MySQL and log to SLS, then returns a JSON response.
Speak: (Planned) AgentRun generates a human‑readable reply, which the iOS app converts to speech and plays on the glasses.
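The Upload step can be illustrated with a minimal sketch of the request body. The real client is an iOS app; Python is used here only to show the shape of the data. The field names `device_id` and `image_base64` are assumptions, since the article does not specify the payload format.

```python
import base64
import json


def build_upload_body(frame_bytes: bytes, device_id: str) -> str:
    """Encode one captured frame as a JSON request body.

    The field names are hypothetical; the article does not document them.
    """
    return json.dumps({
        "device_id": device_id,
        "image_base64": base64.b64encode(frame_bytes).decode("ascii"),
    })


def parse_upload_body(body: str) -> bytes:
    """Gateway-side inverse: recover the raw frame bytes from the body."""
    payload = json.loads(body)
    return base64.b64decode(payload["image_base64"])
```

Base64 inflates the image by roughly a third, which is one reason the client is expected to send single extracted frames rather than a video stream.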
Server‑Side Components
Entry Point: Authenticates requests and adds the current date context (odd/even day).
AgentRun: Holds the system prompt and orchestrates the workflow. Example prompt snippet:
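You are an intelligent traffic-control agent. Current date information: {{current_date_info}} (injected by the gateway, e.g. "Today is the 1st, an odd day")
Processing flow:
1. Mandatory: first call `log_traffic_all` to record the traffic log entry.
2. Rule check: on odd days, only plates with an odd last digit may pass; on even days, only plates with an even last digit.
3. Violation handling: first call `check_whitelist`; if the plate is not registered, call `query_plate_history`, then generate a short reply.
FC Tools (Python runtime): Implements atomic functions such as check_whitelist and log_illegal_notice, each performing a single database or logging operation.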
Storage Layer: Alibaba Cloud Log Service (SLS) for logs, RDS MySQL for whitelist data, and OSS for optional image storage.
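For reference, the odd/even rule that the prompt describes in natural language can be written down deterministically. The sketch below is not part of the author's system, where the LLM applies the rule. It assumes that a plate is judged by its last numeric character, which is a convention that varies by city, and the function names are hypothetical.

```python
from typing import Optional


def last_digit(plate: str) -> Optional[int]:
    """Return the last ASCII digit in the plate, or None if it has none."""
    for ch in reversed(plate):
        if ch in "0123456789":
            return int(ch)
    return None


def is_allowed(plate: str, day_of_month: int) -> Optional[bool]:
    """Odd days admit odd tail digits; even days admit even tail digits.

    Returns None when the plate has no digit, so the caller can treat
    the result as undecidable instead of guessing.
    """
    digit = last_digit(plate)
    if digit is None:
        return None
    return digit % 2 == day_of_month % 2
```

A function like this is useful as a test oracle: the agent's decisions on sample plates can be compared against it before a prompt change goes live.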
Why Use Agent Architecture?
Traditional hard‑coded logic requires code changes and redeployment whenever traffic rules change (e.g., switching from odd/even to specific plate numbers). With AgentRun, business rules reside in the system prompt; updating a rule is as simple as editing the prompt, avoiding code changes.
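A minimal sketch of what "rules reside in the prompt" means in practice: the rule is plain text substituted into a template, so changing it is a data change. The `{{current_date_info}}` placeholder comes from the article's prompt; the `{{rule_text}}` placeholder and the `render_prompt` helper are hypothetical and do not describe how AgentRun itself performs substitution.

```python
SYSTEM_PROMPT_TEMPLATE = (
    "You are an intelligent traffic-control agent. "
    "Current date information: {{current_date_info}}\n"
    "Rule: {{rule_text}}"
)


def render_prompt(template: str, values: dict) -> str:
    """Replace each {{key}} placeholder with its value."""
    for key, value in values.items():
        template = template.replace("{{" + key + "}}", value)
    return template


# Switching policy means swapping this string, not redeploying code.
ODD_EVEN_RULE = (
    "On odd days only plates with an odd last digit may pass; "
    "on even days only plates with an even last digit."
)

prompt = render_prompt(SYSTEM_PROMPT_TEMPLATE, {
    "current_date_info": "Today is the 1st, an odd day",
    "rule_text": ODD_EVEN_RULE,
})
```

The trade-off is that a prompt edit is not covered by the usual compile and unit-test cycle, so rule changes still benefit from a small regression set of sample plates.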
AgentRun also enables dynamic decision‑making: it can skip unnecessary database queries when OCR output is unreadable, reducing wasted resources.
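In the article this decision is made by the agent itself. The same idea can be sketched as a deterministic guard that checks whether OCR output looks like a plate before any database call. The pattern below is a simplified assumption (one Chinese character, one letter, then five or six letters or digits) and does not cover every real plate format; the function name is hypothetical.

```python
import re

# Simplified plate shape: province character, letter, 5-6 alphanumerics.
PLATE_PATTERN = re.compile(r"[\u4e00-\u9fff][A-Z][A-Z0-9]{5,6}")


def should_query_database(ocr_text: str) -> bool:
    """Return True only if the OCR text plausibly contains a full plate."""
    # Drop whitespace and the separator dot that OCR often picks up.
    normalized = "".join(ocr_text.split()).replace("·", "").upper()
    return PLATE_PATTERN.fullmatch(normalized) is not None
```

For example, `should_query_database("京A·12345")` passes the guard, while an empty or partial read such as `"京A12"` does not, so no whitelist lookup would be issued for it.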
Why Serverless (FaaS)?
Cost Efficiency: Instances scale to zero when no vehicles are being detected, so idle periods incur no compute charges.
Elasticity: During traffic peaks, FC can quickly scale out to hundreds of instances to handle concurrent requests.
Developer Simplicity: No need to manage servers or connection pools; each function is an isolated, atomic unit.
Implementation Highlights
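# tools.py (deployed on FC)
import json

# `db` and `sls` are the author's client helpers for RDS MySQL and SLS; their setup is not shown.
def handler(event, context):
    payload = json.loads(event)
    tool_name = payload.get('function')
    plate = payload.get('plate')  # assumed field name; the original did not show where `plate` comes from
    if tool_name == 'check_whitelist':
        return db.query("SELECT count(*) FROM whitelist WHERE plate=%s", plate)
    elif tool_name == 'log_illegal_notice':
        image_base64 = payload.get('image_base64')  # assumed field name
        return sls.put_log(plate, image_base64, "violation")
    # ... other tools

# main.py (HTTP gateway)
import json
from datetime import datetime, timedelta, timezone

def handler(event, context):
    image_url = json.loads(event).get('image_url')  # assumed field name
    # Compute the day in China Standard Time (UTC+8); the function's clock may be set to UTC
    today = datetime.now(timezone(timedelta(hours=8)))
    is_odd = (today.day % 2 != 0)
    date_context = f"Today is an {'odd' if is_odd else 'even'} day"
    prompt = f"{date_context}. Please process the vehicle in this image: {image_url}"
    reply = call_agent_run(prompt)  # the author's helper that invokes AgentRun; not shown
    return {"voice_feedback": reply}

Future Improvements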
Client‑side frame quality assessment to drop blurry images and save bandwidth.
Deploy a lightweight OCR model on FC GPU to extract plate text before invoking the LLM, cutting token usage by ~90%.
Introduce streaming TTS for near‑real‑time voice feedback.
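The first improvement, dropping blurry frames on the client, is commonly approached with a sharpness score such as the variance of the Laplacian. The sketch below is one possible approach, not the author's plan: it works on a plain 2D list of grayscale values, the threshold of 100.0 is an arbitrary placeholder that would need tuning on real frames, and a production version would run natively on the phone.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a 2D list of grayscale intensities. Blurry images have weak
    edges, so their Laplacian responses cluster near zero (low variance).
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_sharp(gray, threshold=100.0):
    """Keep a frame only if its sharpness score reaches the threshold."""
    return laplacian_variance(gray) >= threshold
```

A uniformly gray frame scores exactly zero and would be dropped, while a high-contrast pattern scores far above the threshold and would be uploaded.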
Conclusion: By leveraging a "client‑brain‑tools" pattern, prompt‑driven AgentRun, and Alibaba Cloud serverless services, the author built a prototype that turns consumer‑grade AI glasses into a traffic‑enforcement assistant, demonstrating the flexibility and cost benefits of cloud‑native, agent‑centric design.
Alibaba Cloud Native
