ICML 2026: Certifying VLM Robustness with Text‑Prompted Semantic Intervals
This paper introduces a semantic robustness certification framework for vision-language models (VLMs). The framework uses paired text prompts as semantic proxies to define a continuous transformation in the shared image-text embedding space, derives closed-form interval bounds within which predictions remain unchanged, and is validated on CLIP ViT-B/32 using both synthetic and real-world datasets.
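
To make the idea of a closed-form interval concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper. It assumes the transformation is a straight line z + alpha * d in embedding space, where d would be the difference between the embeddings of the two paired prompts, and that class scores are unnormalized dot products with class text embeddings, so every score is linear in alpha. The function name certified_interval and all inputs are hypothetical, and the paper's actual derivation (for example with cosine similarity) may differ.

```python
import math


def certified_interval(z, d, class_embs):
    """Return (pred, a_min, a_max) for the path z + alpha * d.

    Assumption (illustrative only): score_k(alpha) = <w_k, z + alpha * d>,
    which is linear in alpha, so the set of alpha values that keep the
    alpha = 0 prediction on top is an interval with closed-form endpoints.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    scores = [dot(w, z) for w in class_embs]   # scores at alpha = 0
    slopes = [dot(w, d) for w in class_embs]   # d(score_k) / d(alpha)
    pred = max(range(len(scores)), key=scores.__getitem__)

    a_min, a_max = -math.inf, math.inf
    for k in range(len(class_embs)):
        if k == pred:
            continue
        margin = scores[pred] - scores[k]      # non-negative at alpha = 0
        rate = slopes[pred] - slopes[k]        # margin(alpha) = margin + alpha * rate
        if rate > 0:
            # Margin shrinks as alpha decreases: gives a lower bound.
            a_min = max(a_min, -margin / rate)
        elif rate < 0:
            # Margin shrinks as alpha increases: gives an upper bound.
            a_max = min(a_max, -margin / rate)
        # rate == 0: class k never overtakes pred along this direction.
    return pred, a_min, a_max


# Toy 2-D example with two classes (hypothetical numbers).
pred, a_min, a_max = certified_interval(
    z=[1.0, 0.0],
    d=[-1.0, 1.0],
    class_embs=[[1.0, 0.0], [0.0, 1.0]],
)
print(pred, a_min, a_max)
```

In the toy example the image embedding starts on class 0 and moves toward class 1; the two scores tie at alpha = 0.5, so the prediction is guaranteed unchanged for every alpha below that value. With real CLIP embeddings the vectors would come from the image and text encoders, and normalization would make the scores nonlinear in alpha, which this sketch deliberately ignores.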
