How AI Powered a 40‑Second Brand Video for China’s “Good Employer” Awards

This case study details how the 58.com UX team used AI tools to create a 40‑second pre‑video for the China Good Employer awards, outlining the visual framework, prompt engineering, segmented generation workflow, and the balance between automation and designer insight.

58UXD

Project Goal

Produce a 40‑second pre‑video for the annual “China Good Employer” awards using AI‑driven video generation. The video must meet brand visual standards, be suitable for online and on‑site playback, and serve as a technical case study of AI video creation limits.

Visual Concept

The visual system is anchored on the theme "AI‑enabled industry prosperity" with the slogan "Data‑intelligence drives, hundreds of industries coexist." The narrative opens with a societal shift, moves on to defining what makes a "good employer," and closes with a call to action.

AI‑Centric Production Workflow

Because most commercial AI video generators output clips of only 5–10 seconds, the production was divided into short segments that were generated independently and later stitched together. Each segment required a detailed prompt covering:

Scene description (objects, colors, background)

Camera motion (pan, tilt, zoom, dolly)

Temporal rhythm (beat, pacing)

Perspective (eye‑level, aerial, close‑up)

Accurate prompts minimized ambiguity and ensured consistency with the pre‑defined visual system.
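As a minimal sketch of how such a four-part prompt might be assembled programmatically, the snippet below joins the fields listed above into one generation prompt. The function name, field names, and example values are illustrative assumptions, not the team's actual prompts.

```python
# Hypothetical sketch: composing a structured video-generation prompt
# from the four components described above. All names and example
# values are assumptions for illustration.

def build_prompt(scene: str, camera: str, rhythm: str, perspective: str) -> str:
    """Join the four prompt components into a single generation prompt."""
    return (
        f"Scene: {scene}. "
        f"Camera: {camera}. "
        f"Rhythm: {rhythm}. "
        f"Perspective: {perspective}."
    )

prompt = build_prompt(
    scene="glowing data streams over a city skyline, brand-blue palette",
    camera="slow dolly forward with a gentle upward tilt",
    rhythm="steady pacing, one beat every two seconds",
    perspective="aerial wide shot",
)
print(prompt)
```

Keeping each component in a named slot makes it easy to vary one dimension (say, camera motion) between iterations while holding the rest of the prompt fixed.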

Generation Modes

Three AI video generation pathways were employed:

Text‑to‑video (文生视频)

Image‑to‑video (图生视频)

Start‑and‑end‑frame video (首尾帧生视频)

For highly customized shots, image‑to‑video and start‑and‑end‑frame methods produced the most faithful results.

Keyframe Generation Pipelines

The following four complex shots illustrate the end‑to‑end process.

Shot 5 – Text‑to‑image → Image‑to‑video

Shot 6 – Text‑to‑image → Image‑to‑video

Shot 7 – Text‑to‑image → Text‑to‑image → Start‑and‑end‑frame video

Shot 8 – Text‑to‑image → Start‑and‑end‑frame video

Assembly and Post‑Processing

All generated clips were exported in a common codec (e.g., H.264 MP4) and concatenated with FFmpeg:

ffmpeg -f concat -safe 0 -i filelist.txt -c copy output.mp4

where filelist.txt contains the ordered list of segment files. Color grading and audio-track alignment were then applied uniformly to preserve the brand's visual language.
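A small sketch of how filelist.txt could be generated is shown below. The segment filenames are placeholders; the file format (one "file '<path>'" line per clip, in playback order) is what FFmpeg's concat demuxer expects.

```python
# Sketch: writing filelist.txt in FFmpeg's concat-demuxer format.
# Segment filenames here are placeholders, not the project's real files.
segments = ["shot05.mp4", "shot06.mp4", "shot07.mp4", "shot08.mp4"]

with open("filelist.txt", "w") as f:
    for name in segments:
        f.write(f"file '{name}'\n")  # one entry per clip, in playback order

# The command from the article then stitches the clips without re-encoding
# (-c copy), which is why all segments must share a common codec:
cmd = "ffmpeg -f concat -safe 0 -i filelist.txt -c copy output.mp4"
print(cmd)
```

Because -c copy performs a stream copy rather than a re-encode, concatenation is fast and lossless, but it only works when every segment was exported with the same codec and parameters, as the paragraph above notes.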

Outcomes and Technical Insights

A seamless 40‑second video was delivered, meeting the brand’s visual fidelity and timing requirements.

Segment‑by‑segment generation effectively bypassed the 5–10 second clip limitation of current AI video models.

Image‑to‑video and start‑and‑end‑frame pipelines offered finer control over composition and motion than pure text‑to‑video.

Prompt engineering—explicitly specifying camera moves, pacing, and perspective—proved critical for reducing iteration cycles.

Despite high‑quality AI output, core creative decisions (concept, style, emotional tone, shot sequencing) remained human‑driven.
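The segment-count arithmetic behind the workflow can be sketched in a few lines. The 8-second per-clip cap below is an assumption chosen from within the 5–10 second range the article cites; actual limits vary by model.

```python
import math

# Back-of-envelope segment plan for the 40-second video.
# MAX_CLIP_SECONDS is an assumed per-clip cap within the 5-10 s
# range cited above; real limits differ between generators.
TOTAL_SECONDS = 40
MAX_CLIP_SECONDS = 8

num_segments = math.ceil(TOTAL_SECONDS / MAX_CLIP_SECONDS)
print(num_segments)  # → 5
```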

Conclusion

The project demonstrates a reproducible workflow for turning a brand narrative into a fully AI‑generated video within existing technical constraints. By combining detailed prompt design, multiple generation modes, and conventional stitching tools, designers can achieve high‑quality, highly customized video content while retaining creative control.

Tags: Case Study, text-to-image, image-to-video, AI video generation, creative workflow, brand video
Written by 58UXD
58.com User Experience Design Center