AniSora: An Integrated System for Anime Video Generation with Data Flywheel, Controllable Diffusion Models, and Evaluation Benchmark
AniSora combines three components: a dataset of 10 million anime text-video pairs; a controllable diffusion transformer whose temporal-mask conditioning supports text-to-video generation, frame interpolation, and region-guided animation; and a 948-video evaluation benchmark. It delivers industry-leading character and motion consistency and already powers low-cost dynamic-comic production for multiple IPs.
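To make the idea of temporal-mask conditioning concrete, here is a minimal sketch of how a single binary mask over frames could select between generation tasks. It is an illustration under assumptions, not AniSora's actual implementation: the function name `build_temporal_mask`, the 8-frame clip length, and the task presets are all hypothetical, and region-guided animation would additionally need a spatial mask that this sketch omits.

```python
from typing import Iterable, List


def build_temporal_mask(num_frames: int, known_frames: Iterable[int] = ()) -> List[int]:
    """Return a 0/1 list where 1 marks a frame supplied as conditioning.

    Hypothetical helper for illustration only; not part of any AniSora API.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    mask = [0] * num_frames
    for idx in known_frames:
        if not -num_frames <= idx < num_frames:
            raise IndexError(f"frame index {idx} out of range for {num_frames} frames")
        mask[idx] = 1  # negative indices count from the end, as with Python lists
    return mask


# Assumed task presets for an 8-frame clip:
text_to_video = build_temporal_mask(8)            # no frames given, text prompt only
image_to_video = build_temporal_mask(8, [0])      # first frame given
interpolation = build_temporal_mask(8, [0, -1])   # first and last frames given
```

In this framing the model always sees the same inputs (noisy frames, guide frames, and the mask); only the mask pattern changes between tasks, which is one plausible reason a single network can cover all of them.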