Multimodal Large‑Model Cover Generation AI Agent for Taobao Video and Live Streams
Taobao’s new multimodal AI Agent automatically creates high‑quality static and dynamic video covers. It plans the task, consults a memory of quality criteria, selects frames using ReKV streaming and a dual‑stage evaluation, generates marketing copy with a fine‑tuned Qwen2.5‑7B model, and then refines the layout. The reported results are significantly higher click‑through rates, lower latency, and reduced manual effort.
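To make the plan, memory, and execution flow concrete, here is a minimal Python sketch of such a pipeline. It is an illustration under stated assumptions, not Taobao's implementation: every name in it (`Frame`, `CoverState`, `plan`, `TOOLS`, `run_agent`, the `min_sharpness` criterion, the layout labels) is hypothetical. The dual‑stage evaluation is assumed here to mean a cheap filter followed by a finer ranking, the copy step is a stub standing in for a call to the fine‑tuned Qwen2.5‑7B model, and ReKV streaming is not modeled at all.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Frame:
    index: int
    sharpness: float  # hypothetical cheap quality signal in [0, 1]
    relevance: float  # hypothetical model-scored signal in [0, 1]


@dataclass
class CoverState:
    frames: List[Frame]
    criteria: Dict[str, float] = field(default_factory=dict)
    chosen: Optional[Frame] = None
    copy: str = ""
    layout: str = ""
    log: List[str] = field(default_factory=list)


def plan() -> List[str]:
    # Planning: an ordered list of tool names (fixed here for simplicity).
    return ["load_criteria", "select_frame", "write_copy", "refine_layout"]


def load_criteria(state: CoverState) -> None:
    # Memory: quality criteria the later steps consult (assumed value).
    state.criteria = {"min_sharpness": 0.5}


def select_frame(state: CoverState) -> None:
    # Stage 1 (assumed): drop frames below the sharpness threshold.
    threshold = state.criteria["min_sharpness"]
    survivors = [f for f in state.frames if f.sharpness >= threshold]
    # Stage 2 (assumed): rank what is left by relevance; fall back to
    # all frames if the filter removed everything.
    state.chosen = max(survivors or state.frames, key=lambda f: f.relevance)


def write_copy(state: CoverState) -> None:
    # Stub for the language-model call that would write marketing copy.
    state.copy = f"Featured moment #{state.chosen.index}"


def refine_layout(state: CoverState) -> None:
    # Toy layout rule: short copy goes at the bottom, long copy at the top.
    state.layout = "title_bottom" if len(state.copy) <= 24 else "title_top"


TOOLS: Dict[str, Callable[[CoverState], None]] = {
    "load_criteria": load_criteria,
    "select_frame": select_frame,
    "write_copy": write_copy,
    "refine_layout": refine_layout,
}


def run_agent(frames: List[Frame]) -> CoverState:
    state = CoverState(frames=frames)
    for step in plan():
        TOOLS[step](state)      # execute the planned tool
        state.log.append(step)  # record what ran, in order
    return state


if __name__ == "__main__":
    demo = [Frame(0, 0.9, 0.2), Frame(1, 0.3, 0.95), Frame(2, 0.8, 0.7)]
    result = run_agent(demo)
    print(result.chosen.index, result.copy, result.layout)
```

In the demo, frame 1 has the highest relevance but fails the sharpness filter, so the ranking stage picks frame 2 instead. That is the point of running a cheap check before a more expensive one: the costlier scorer only sees candidates that already meet the baseline criteria.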