Midjourney Video Generator: First Look at the Upcoming V1 Model
The article reviews Midjourney's upcoming V1 video model, explains how users can access early samples through a subscription‑based rating party, evaluates its visual quality, aspect‑ratio support, and limitations, and compares it with established AI video tools such as Veo 3, Kling 2.1 and Runway Gen‑4.
Model Overview
Midjourney’s first video model, referred to as “V1 video model,” will be released after further optimizations. The model is not yet accessible through the Midjourney website or Discord; sample videos can only be viewed by joining a subscription‑based video rating party.
Rating party address: https://www.midjourney.com/account
Rating Party Process
Each round of the rating activity presents two videos. Participants select the one they prefer and that appears more accurate, or skip the round if unsure.
David Holz, Midjourney’s founder, states that these videos do not represent the final model. Collecting user labels for low‑quality (“garbage”) content is a crucial part of the development pipeline, and participants are asked to provide high‑quality data.
As of June 13, 2025, the second phase of the video rating activity has officially launched.
Sample Observations
Early examples show that the model retains Midjourney’s signature artistic style and produces physically accurate rendering, such as realistic reflections on ceilings and floors. The model supports multiple aspect ratios, including square videos and vertical formats suitable for TikTok, Reels, and Shorts.
The strongest samples demonstrate:
Clear text rendering on signs, book covers, and T‑shirts.
Dynamic shadow adjustments that follow moving subjects and adapt to lighting.
Natural lighting and detailed hair strands, with individual hair fibers visible.
Common failure modes include:
Absurd lighting (e.g., night scenes lit like noon).
Distorted facial proportions or unsettling expressions.
Comparison with Other Tools
The core question is whether Midjourney’s V1 video model can compete with Veo 3, Kling 2.1, or Runway Gen‑4.
Veo 3 is noted for native audio generation. Kling 2.1 and Runway Gen‑4 provide extensive customization and camera‑control features, making them preferred platforms for professional AI‑driven filmmaking. Midjourney’s offering remains experimental and lacks these capabilities.
Google’s video models are cited as having higher quality, attributed to massive TPU compute resources.
Related ranking of AI video tools (2025) can be found at: https://mp.weixin.qq.com/s?__biz=MzkzODI1NzQyNA==&mid=2247494410&idx=1&sn=6bc66f7500ade48586c9a47e2262d5e7&scene=21#wechat_redirect
Conclusion
Early clips indicate that Midjourney’s video quality lags behind Google’s models and the leading commercial tools. The move into video aligns with long‑standing user demand, and the model preserves Midjourney’s dreamy artistic aesthetic rather than adopting hyper‑realistic styles. Future updates may add audio, camera motion, or scene transitions, but final performance remains uncertain.
AI Algorithm Path
A public account focused on deep learning, computer vision, and autonomous driving perception algorithms, covering visual CV, neural networks, pattern recognition, related hardware and software configurations, and open-source projects.