Embodied AI Security Survey: A Multi‑Layer Framework for Risks, Attacks, and Defenses

This survey systematically reviews Embodied AI security, proposing a five‑layer taxonomy (perception, cognition, planning, action & interaction, agentic system) that organizes over 400 papers on attacks, defenses, and open challenges, and highlights overlooked vulnerabilities such as the fragility of multimodal perception fusion and planning instability under jailbreak attacks.


1 Introduction

Embodied AI integrates perception, cognition, planning, and interaction into agents that operate in open‑world, safety‑critical environments such as autonomous driving, medical robotics, and assistive robots. Safety failures can cause physical harm, making security a technical and societal challenge. This survey synthesizes over 400 papers and introduces a five‑layer taxonomy (perception, cognition, planning, action & interaction, agentic system) that unifies scattered work and connects embodied‑specific findings with advances in visual, language, and multimodal foundation models.

2 Perception

The perception layer supplies multimodal environmental understanding; attacks at the sensor boundary cascade upward.

2.1 Visual Perception

Tasks include classification, detection, tracking, segmentation, and video understanding, typically built on CLIP, SigLIP, or ViT backbones. Adversarial attacks are either digital pixel‑level perturbations or physical manipulations (e.g., stickers, lasers). White‑box digital attacks such as region‑constrained perturbations on iCub (Melis et al.) and adversarial patches for detectors (Thys et al.) target classification and detection, respectively. Physical attacks include DARTS and RP2 (sticker/printed patterns) and ShapeShifter (transformable patches). Black‑box attacks rely on transferability (e.g., AnyAttack generates self‑supervised perturbations on LAION‑400M) or query‑based optimization (e.g., AdvTraj for multi‑object tracking).

Defenses comprise robust training with adversarial examples or feature recovery (e.g., Kalin et al. using visible/infrared data), contrastive fine‑tuning (TeCoA) to improve zero‑shot robustness, and output auditing (SentiNet).

Backdoor attacks embed hidden triggers during training: TrojViT exploits RowHammer bit flips, and physical objects serve as triggers for detectors (Han et al.). Backdoor defenses remain limited; Doan et al. detect patch‑response anomalies, and CleanCLIP realigns multimodal representations via contrastive fine‑tuning.
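To make the pixel‑perturbation threat concrete, the sketch below shows a generic white‑box PGD‑style attack in PyTorch. It illustrates the iterative gradient‑ascent idea common to the digital attacks above; it is not the method of any specific cited paper, and the model handle, budget eps, and step size are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, image, label, eps=8/255, alpha=2/255, steps=10):
    """Minimal PGD sketch: iteratively nudge pixels within an L-inf
    ball of radius eps to maximize the classifier's loss."""
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), label)
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()               # ascend the loss
            adv = image + (adv - image).clamp(-eps, eps)  # project back into the eps-ball
            adv = adv.clamp(0, 1)                         # keep valid pixel range
    return adv.detach()
```

Robust training, in its simplest form, folds examples produced this way back into the training loss, which is why many of the defenses above start from an attack loop like this one.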

2.2 Auditory Perception

Speech recognition and speaker verification enable voice‑controlled interaction. White‑box physical attacks include Carlini's inaudible commands and Metamorph's background‑noise perturbations, both of which remain imperceptible to humans. Black‑box digital attacks include TSMAE, which modifies playback speed to fool recognizers, and Occam, which crafts adversarial examples against cloud APIs. Defenses audit MFCC features with CNN classifiers (Samizade et al.) and apply diffusion‑based audio purification. Backdoor research is nascent; TrojanModel demonstrates training‑time trigger insertion, and no dedicated defense has been reported.
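The MFCC‑auditing defense idea can be sketched as follows: extract cepstral features and pass them to a small classifier trained to separate clean from adversarial audio. This is an illustrative reconstruction of the approach, not Samizade et al.'s exact architecture; the network sizes and the synthetic waveform are assumptions, and the detector shown is untrained.

```python
import numpy as np
import librosa
import torch
import torch.nn as nn

def mfcc_features(waveform, sr=16000, n_mfcc=40):
    """Extract MFCCs; adversarial perturbations often leave statistical
    traces in the cepstral domain that a small classifier can flag."""
    mfcc = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=n_mfcc)
    return torch.tensor(mfcc, dtype=torch.float32)[None, None]  # (1, 1, n_mfcc, T)

# Tiny binary CNN ("clean" vs "adversarial") with illustrative layer sizes.
detector = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 2),
)

audio = np.random.randn(16000).astype(np.float32)  # stand-in for a captured command
verdict = detector(mfcc_features(audio)).argmax(dim=1)
```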

2.3 Spatial Perception

3D perception (point‑cloud classification, 3D detection, SLAM, neural scene representations) underpins navigation. Adversarial attacks manipulate point clouds, depth maps, or LiDAR signals. White‑box digital attacks such as FLAT perturb LiDAR points to disrupt motion compensation, and Poison‑Splat poisons training data for 3D Gaussian Splatting (3DGS). Physical attacks such as SpotAttack use genetic algorithms to place non‑reflective adversarial points. Defenses include robust training, certified robustness for point‑cloud classifiers (PointGuard), and diffusion‑based purification pipelines (LiDARPure). Backdoor attacks inject triggers via data poisoning for 3D detectors (Zhang et al.; BadLiDet), with no specific defenses reported.
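A much simpler purification baseline than the diffusion pipelines above is statistical outlier removal, which drops points whose neighborhoods look anomalous; injected adversarial points often fail this test. The sketch below is a generic filter under assumed parameters k and std_ratio, not the LiDARPure or PointGuard method.

```python
import numpy as np

def statistical_outlier_removal(points, k=16, std_ratio=2.0):
    """Drop points whose mean k-NN distance is anomalously large --
    a simple purification step against injected adversarial points."""
    # Pairwise distances (fine for small clouds; use a KD-tree at scale).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    knn_mean = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)  # skip self-distance
    thresh = knn_mean.mean() + std_ratio * knn_mean.std()
    return points[knn_mean < thresh]

cloud = np.random.rand(1024, 3).astype(np.float32)
clean = statistical_outlier_removal(cloud)
```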

2.4 Motion Perception

IMU, visual/LiDAR odometry, and GNSS provide pose and velocity estimates. Sensor‑level attacks exploit hardware vulnerabilities (e.g., acoustic injection on IMU, RF spoofing on radar, replay attacks on GNSS) to feed falsified motion data. Defenses use anomaly detection, cross‑sensor verification, robust state estimation, and anti‑spoofing/authentication mechanisms.
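Cross‑sensor verification can be illustrated with a crude residual test: dead‑reckon a velocity from IMU accelerations and flag GNSS readings that disagree beyond a threshold. This is a minimal sketch under assumed units and threshold, not a full robust state estimator such as a Kalman filter with innovation gating.

```python
import numpy as np

def spoof_check(imu_accel, dt, gnss_velocity, v0, thresh=1.5):
    """Cross-sensor verification sketch: integrate IMU acceleration into a
    velocity estimate and flag GNSS samples that disagree by more than
    `thresh` m/s -- a crude residual test, not a full estimator."""
    v_imu = v0 + np.cumsum(imu_accel * dt, axis=0)    # dead-reckoned velocity
    residual = np.linalg.norm(gnss_velocity - v_imu, axis=1)
    return residual > thresh                          # True where GNSS looks spoofed

accel = np.zeros((100, 3)); accel[:, 0] = 0.1         # gentle forward acceleration
gnss = np.zeros((100, 3)); gnss[50:, 0] = 30.0        # implausible jump mid-stream
flags = spoof_check(accel, dt=0.01, gnss_velocity=gnss, v0=np.zeros(3))
```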

2.5 Cross‑Modal Perception

Fusion of multiple sensors improves robustness but creates new attack surfaces. Adversarial attacks craft perturbations that transfer across modalities by exploiting fusion logic. Defenses apply multimodal contrastive fine‑tuning for feature‑level alignment and independently purify each modality before fusion.
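Architecturally, "purify each modality before fusion" just means inserting an independent cleaning stage per sensor stream so a perturbation in one modality cannot silently dominate the fused estimate. The sketch below shows that wiring; the identity purifiers and concatenation fusion are placeholders, not a proposed defense from the survey.

```python
import torch

def purify_then_fuse(rgb_feat, lidar_feat, purifiers, fuse):
    """Defense sketch: purify each modality independently before fusion so a
    perturbation in one sensor stream is contained at its own boundary."""
    rgb_clean = purifiers["rgb"](rgb_feat)
    lidar_clean = purifiers["lidar"](lidar_feat)
    return fuse(rgb_clean, lidar_clean)

# Illustrative stand-ins: identity "purifiers" and concatenation fusion.
purifiers = {"rgb": lambda x: x, "lidar": lambda x: x}
fused = purify_then_fuse(torch.randn(1, 256), torch.randn(1, 256),
                         purifiers, lambda a, b: torch.cat([a, b], dim=-1))
```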

3 Cognition

The cognition layer interprets sensory data, builds world models, and performs reasoning. Instruction‑understanding attacks use jailbreak prompts to bypass alignment constraints. World‑model attacks induce hallucinations, causing agents to act on nonexistent objects. Reasoning attacks hijack chain‑of‑thought prompts or inject flawed premises that corrupt logical inference. The surveyed literature provides few concrete defenses for these cognition‑level threats.
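One lightweight guard against hallucination‑driven actions is a grounding check: reject any planned step whose target object the perception layer has not actually detected. The sketch below is an illustrative hypothetical, not a defense reported in the surveyed literature, and the plan/detection schema is assumed.

```python
def grounded_plan(plan_steps, detected_objects):
    """Hallucination guard sketch: drop any planned step that references
    an object the perception layer has not actually detected."""
    safe, rejected = [], []
    for step in plan_steps:
        if step["target"] in detected_objects:
            safe.append(step)
        else:
            rejected.append(step)  # e.g. "pick up the knife" with no knife in view
    return safe, rejected

plan = [{"action": "pick", "target": "cup"}, {"action": "pick", "target": "knife"}]
safe, rejected = grounded_plan(plan, detected_objects={"cup", "table"})
```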

4 Planning

Planning converts perception and cognition outputs into safe action sequences.

4.1 Task Planning

Adversarial, jailbreak, and backdoor attacks target the decomposition of high‑level goals into sub‑tasks. Jailbreak prompts inject malicious instructions; backdoors embed hidden triggers via data poisoning. Robust training and secure inference are the primary defenses.
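A simple form of secure inference at this layer is screening decomposed sub‑tasks against an action whitelist before execution, so an injected instruction that maps to an unapproved skill is dropped rather than run. The sketch below is a minimal illustration; the skill set and plan schema are assumptions.

```python
ALLOWED_ACTIONS = {"navigate", "pick", "place", "inspect"}  # assumed skill set

def validate_subtasks(subtasks):
    """Secure-inference sketch: filter LLM-decomposed sub-tasks through an
    action whitelist so injected steps never reach the executor."""
    return [t for t in subtasks if t["action"] in ALLOWED_ACTIONS]

plan = [{"action": "navigate", "arg": "kitchen"},
        {"action": "disable_safety_stop", "arg": None}]  # injected step
print(validate_subtasks(plan))  # the injected step is filtered out
```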

4.2 Trajectory Planning

Adversarial perturbations can introduce spurious obstacles or manipulate trajectory generators, leading to collisions. Robust training and robust inference mitigate these risks.
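Robust inference here can be as simple as re‑checking a generated trajectory against the current obstacle map with a safety margin before execution. The sketch below shows that gate under an assumed 2D waypoint representation and margin; it is illustrative, not a method from the survey.

```python
import numpy as np

def trajectory_is_safe(waypoints, obstacles, margin=0.5):
    """Robust-inference sketch: reject a trajectory if any waypoint comes
    within `margin` meters of a known obstacle, so a perturbed planner
    output that clips an obstacle is caught before execution."""
    for p in waypoints:
        if np.linalg.norm(obstacles - p, axis=1).min() < margin:
            return False
    return True

traj = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
obs = np.array([[1.0, 0.2]])           # obstacle 0.2 m from the path
print(trajectory_is_safe(traj, obs))   # False: violates the 0.5 m margin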

4.3 Multi‑Agent Planning

Byzantine faults (malicious or faulty agents) and goal conflicts can destabilize coordination. Reputation‑based mechanisms and robust consensus protocols are suggested mitigations.
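A standard Byzantine‑robust aggregation primitive is the coordinate‑wise trimmed mean: with n agents and at most f faulty ones, discard the f largest and f smallest values per coordinate and average the rest, bounding the influence of any malicious report. The sketch below illustrates the primitive; the example values are assumptions.

```python
import numpy as np

def trimmed_mean_consensus(estimates, f):
    """Byzantine-robust aggregation sketch: drop the f extreme values per
    coordinate and average the remainder, so up to f malicious agents
    cannot drag the consensus arbitrarily far."""
    est = np.sort(np.asarray(estimates), axis=0)
    return est[f:len(est) - f].mean(axis=0)

reports = np.array([[1.0, 2.0],
                    [1.1, 2.1],
                    [0.9, 1.9],
                    [100.0, -50.0]])                  # last agent is malicious
print(trimmed_mean_consensus(reports, f=1))           # stays near the honest values
```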

4.4 Benchmarks

Existing benchmarks evaluate planning robustness under adversarial and jailbreak scenarios.

5 Action and Interaction

The action and interaction layer executes physical actions and mediates interaction with humans and other agents.

5.1 Robot Control

Adversarial attacks (white‑box and black‑box) target low‑level command execution; defenses include robust training and robust inference. Backdoor attacks embed triggers in control policies, but dedicated defenses remain scarce.
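At the command level, robust inference can take the form of envelope clamping plus rate limiting, so an adversarially induced spike degrades into a bounded, recoverable deviation instead of a violent actuation. The sketch below is a generic illustration with assumed limits, not a defense from a specific paper.

```python
import numpy as np

def sanitize_command(cmd, prev_cmd, limits=(-1.0, 1.0), max_delta=0.1):
    """Robust-inference sketch for low-level control: clamp commands to the
    actuator envelope and slew-limit changes between control ticks."""
    cmd = np.clip(cmd, *limits)                              # hard envelope
    delta = np.clip(cmd - prev_cmd, -max_delta, max_delta)   # slew limit
    return prev_cmd + delta

print(sanitize_command(np.array([5.0]), prev_cmd=np.array([0.0])))  # -> [0.1]
```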

5.2 Human‑Agent Interaction

Risks include trust manipulation and social engineering that can cause harmful behavior during hand‑over or collaborative tasks.

5.3 Multi‑Agent Collaboration

Compromised agents may collude or behave maliciously, threatening task safety; reputation‑based robust consensus is proposed.

6 Agentic System

Agentic systems add tool use, persistent memory, and self‑evolution.

6.1 Tool Use

Agents can invoke external APIs or executors; attacks inject malicious tools or manipulate tool outputs. Defenses involve security checks, whitelisting, and anomaly detection.
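Whitelisting can be combined with integrity pinning: only invoke a tool whose name is registered and whose specification hashes to a pinned digest, blocking both injected tools and tampered specs. The sketch below is illustrative; the registry, placeholder digests, and function names are assumptions.

```python
import hashlib

TOOL_WHITELIST = {                 # assumed registry: tool name -> SHA-256 of its spec
    "web_search": "a3f5...",       # placeholder digests for illustration
    "gripper_api": "9c1d...",
}

def safe_invoke(name, spec_bytes, call, *args):
    """Whitelisting sketch: refuse tools that are unregistered or whose
    spec no longer matches the pinned digest."""
    digest = hashlib.sha256(spec_bytes).hexdigest()
    if TOOL_WHITELIST.get(name) != digest:
        raise PermissionError(f"tool {name!r} not whitelisted or spec tampered")
    return call(*args)

try:
    safe_invoke("shell_exec", b"rm -rf /", print, "never runs")
except PermissionError as e:
    print(e)   # unregistered tool is refused before invocation
```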

6.2 Memory

Memory poisoning inserts malicious experiences; memory leakage exposes private data. Encryption and access control are proposed defenses.
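Access control for agent memory can be sketched as a two‑sided policy: writes require a trusted source tag (mitigating poisoning), and reads from external callers exclude private entries (mitigating leakage). The class below is an illustrative hypothetical, not an implementation from the survey.

```python
from dataclasses import dataclass, field

@dataclass
class GuardedMemory:
    """Memory-defense sketch: gate writes by source (anti-poisoning) and
    filter external reads by sensitivity (anti-leakage)."""
    trusted_sources: set = field(default_factory=lambda: {"operator", "self"})
    entries: list = field(default_factory=list)

    def write(self, text, source, private=False):
        if source not in self.trusted_sources:
            raise PermissionError(f"untrusted memory source: {source!r}")
        self.entries.append({"text": text, "private": private})

    def read(self, external_caller):
        return [e["text"] for e in self.entries
                if not (external_caller and e["private"])]

mem = GuardedMemory()
mem.write("owner's door code is 4821", source="operator", private=True)
print(mem.read(external_caller=True))   # private entry withheld
```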

6.3 Self‑Evolving

Self‑evolution enables agents to improve via experience, but can cause alignment drift.

6.4 Cascading Risks

Vulnerabilities propagate across layers; supply‑chain attacks embed backdoors during training or pipeline construction.
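A basic supply‑chain safeguard is integrity pinning of pipeline artifacts: refuse to load model weights whose hash does not match a digest published out of band. The sketch below illustrates the check; the pinned digest is a placeholder.

```python
import hashlib

PINNED_SHA256 = "<expected digest published by the model provider>"  # placeholder

def verify_weights(path):
    """Supply-chain sketch: hash the weight file in chunks and refuse to
    load it on mismatch, catching artifacts tampered with in transit
    or during pipeline construction."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    if h.hexdigest() != PINNED_SHA256:
        raise RuntimeError(f"checksum mismatch for {path}: possible tampering")
```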

7 Open Challenges

Key open problems include fragility of multimodal fusion, planning instability under jailbreak attacks, trustworthy human‑agent interaction in open settings, physical‑world verification of defenses, transferability and generalization of attacks/defenses, and defining the boundary between embodied and digital security.

8 Future Trends

Anticipated directions are safety‑aligned embodied foundation models, real‑time adaptive defenses, standardized embodied security benchmarks, and regulatory/compliance frameworks.

9 Conclusion

The survey provides a systematic review of Embodied AI security, a five‑layer taxonomy, and an aggregation of attack and defense categories across more than 400 papers, highlighting overlooked challenges and offering a roadmap for building capable, autonomous, yet safe and robust embodied agents.
