One-Click Ad Video from Assets + Brief, plus Baidu’s 8B Text-to-Image – An AI Toolbox
The article introduces three open‑source AI tools: a video editor that turns raw footage and a brief into a finished ad, Baidu's 8‑billion‑parameter text‑to‑image model that runs on a single 24 GB GPU, and a weekly AI‑developer digest that produces Chinese‑language reports. For each tool it covers the workflow, benchmarks where available, usage commands, and target users.
01 | agentic-video-editor
Manually cutting raw footage into a 30‑second advertisement typically takes about half a day of selecting shots, pacing the cut, exporting, and reviewing.
The tool replaces this workflow with an AI Agent pipeline consisting of:
Original footage + creative brief
↓
[Pre‑processing] – scene detection, speech‑to‑text, shot indexing
↓
[Director Agent] – AI searches footage, selects shots, creates an edit plan
↓
[Refinement Agent] – fine‑tunes start/end of each shot
↓
[Edit Agent] – FFmpeg renders MP4
↓
[Review Agent] – scores relevance, rhythm, visual quality, viewing experience, overall (0‑1 each)
↓
If overall score < threshold → feedback to Director Agent (max 3 retries)
Running the editor requires a single command:
ave edit \
--footage-dir /path/to/your/footage \
--brief '{"product": "My Product", "audience": "Women 25-45", "tone": "authentic", "duration_seconds": 30}' \
--pipeline pipelines/ugc-ad.yaml \
--style styles/dtc-testimonial.yaml
The built‑in DTC template follows the hook → problem → solution → social proof → CTA structure; custom YAML pipelines can be authored to combine agents differently.
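The review loop at the end of the pipeline can be sketched in Python. This is a minimal illustration, not the project's actual code: `Review`, `run_with_review`, the `direct`/`render`/`review` callables, and the 0.7 threshold are hypothetical names and values chosen for the example, and the Refinement Agent step is left out for brevity. Only the five score dimensions, the 0-1 range, and the cap of three retries come from the pipeline description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Review:
    """Scores from the Review Agent, each in the range 0-1."""
    relevance: float
    rhythm: float
    visual_quality: float
    viewing_experience: float
    overall: float
    feedback: str = ""


def run_with_review(
    direct: Callable[[Optional[str]], dict],  # hypothetical Director Agent: feedback -> edit plan
    render: Callable[[dict], str],            # hypothetical Edit Agent: edit plan -> MP4 path
    review: Callable[[str], Review],          # hypothetical Review Agent: MP4 path -> scores
    threshold: float = 0.7,                   # assumed value; the article does not state it
    max_retries: int = 3,
) -> Tuple[str, Review, int]:
    """Return (video path, last review, attempts used)."""
    feedback: Optional[str] = None
    # One initial attempt plus up to max_retries retries
    # (an assumed reading of "max 3 retries").
    for attempt in range(1, max_retries + 2):
        plan = direct(feedback)
        video = render(plan)
        result = review(video)
        if result.overall >= threshold:
            break
        feedback = result.feedback  # handed back to the Director Agent on the next pass
    return video, result, attempt


# Demo with stub agents: the third cut is the first to clear the threshold.
_scores = iter([0.4, 0.6, 0.8])


def _fake_review(path: str) -> Review:
    s = next(_scores)
    return Review(s, s, s, s, s, feedback="tighten the hook")


video, result, attempts = run_with_review(
    direct=lambda fb: {"feedback": fb},
    render=lambda plan: "out.mp4",
    review=_fake_review,
)
```

Passing the agents in as callables keeps the loop testable without FFmpeg or a model backend; if no cut ever clears the threshold, the function returns the last attempt rather than raising.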
02 | ERNIE‑Image
ERNIE‑Image is Baidu’s open‑source diffusion‑transformer (DiT) model with 8 B parameters. On the benchmarks cited below it posts the highest GenEval score among the open‑weight text‑to‑image models compared.
Benchmark scores:
GenEval overall: 0.8856 (higher than Qwen‑Image at 0.8683 and FLUX.2‑klein‑9B at 0.8481)
LongTextBench (Chinese long‑text): 0.9733, close to Seedream 4.5 at 0.9882
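To make the size of these gaps concrete, the short Python snippet below computes the margins from the scores quoted above. The dictionary layout and the `margins` helper are illustrative choices, not part of any benchmark tooling.

```python
# Scores exactly as quoted in the article.
geneval = {"ERNIE-Image": 0.8856, "Qwen-Image": 0.8683, "FLUX.2-klein-9B": 0.8481}
long_text = {"ERNIE-Image": 0.9733, "Seedream 4.5": 0.9882}


def margins(scores, baseline="ERNIE-Image"):
    """Return the baseline score minus each other model's score, rounded to 4 decimals."""
    base = scores[baseline]
    return {
        name: round(base - score, 4)
        for name, score in scores.items()
        if name != baseline
    }


geneval_margins = margins(geneval)
long_text_margins = margins(long_text)
print(geneval_margins)
print(long_text_margins)
```

On these numbers ERNIE‑Image leads Qwen‑Image by 0.0173 and FLUX.2‑klein‑9B by 0.0375 on GenEval, and trails Seedream 4.5 by 0.0149 on LongTextBench.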
Key strengths identified in the source:
Text rendering – long paragraphs, dense typography, layout‑rich images (posters, infographics, UI mockups)
Complex instruction compliance – accurate handling of multi‑object, relational, knowledge‑intensive prompts
Structured generation – posters, comics, storyboards, multi‑panel graphics
Consumer‑grade deployment – runs on a single GPU with 24 GB VRAM
Two released variants:
ERNIE‑Image (SFT version) – 50 inference steps, guidance scale 4.0
ERNIE‑Image‑Turbo (DMD+RL accelerated) – 8 inference steps, guidance scale 1.0
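One way to keep the two variants' sampling settings from getting mixed up is to store them as presets. The sketch below is an assumption‑laden illustration: `SamplingPreset`, `PRESETS`, and `sampling_kwargs` are hypothetical names, and the Turbo repository id is a guess, since the article only names the variant. The step counts and guidance scales are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingPreset:
    model_id: str
    num_inference_steps: int
    guidance_scale: float


PRESETS = {
    # Repository id as used in the article's usage example.
    "sft": SamplingPreset("baidu/ERNIE-Image", 50, 4.0),
    # Assumed repository id; the article only gives the name "ERNIE-Image-Turbo".
    "turbo": SamplingPreset("baidu/ERNIE-Image-Turbo", 8, 1.0),
}


def sampling_kwargs(variant: str) -> dict:
    """Return step count and guidance scale for a variant as pipeline keyword arguments."""
    preset = PRESETS[variant]
    return {
        "num_inference_steps": preset.num_inference_steps,
        "guidance_scale": preset.guidance_scale,
    }


print(sampling_kwargs("turbo"))
```

The returned dictionary could be unpacked into a pipeline call with `**sampling_kwargs("turbo")`, so switching variants changes both settings together.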
Example usage via HuggingFace:
import torch
from diffusers import ErnieImagePipeline

pipe = ErnieImagePipeline.from_pretrained(
    "baidu/ERNIE-Image",
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    prompt="a black-and-white Chinese countryside dog",
    height=1024, width=1024,
    num_inference_steps=50,
    guidance_scale=4.0,
    use_pe=True,
).images[0]
03 | ai-influence-digest
The tool monitors the public activity of more than 65 AI developers, keeps the posts that are immediately useful to content creators, and produces a structured Chinese weekly briefing without relying on the X (Twitter) API.
Core features:
No X API dependency – described by the project as compliant and as avoiding account bans
Coverage of tools, workflows, tutorials, prompts across 65+ developers
Automatic rendering of Xiaohongshu‑style long‑image screenshots for easy sharing
Markdown‑formatted Chinese summary output
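As a rough illustration of the Markdown output, reviewed posts could be grouped by the four coverage categories and rendered as a report. Everything in this sketch is hypothetical: the `build_weekly_markdown` function, the post fields (`author`, `category`, `summary`, `url`), the placeholder entries, and the English headings (the real tool writes Chinese) are assumptions made for the example, not the project's actual schema.

```python
from collections import defaultdict

# Category order follows the coverage list above.
CATEGORIES = ["tools", "workflows", "tutorials", "prompts"]


def build_weekly_markdown(posts, week_ending):
    """Group reviewed posts by category and render them as a Markdown report."""
    grouped = defaultdict(list)
    for post in posts:
        grouped[post["category"]].append(post)

    lines = [f"# AI Influence Digest ({week_ending})", ""]
    for category in CATEGORIES:
        if not grouped[category]:
            continue  # skip empty sections
        lines.append(f"## {category.title()}")
        for post in grouped[category]:
            lines.append(
                f"- **{post['author']}**: {post['summary']} ([source]({post['url']}))"
            )
        lines.append("")
    return "\n".join(lines)


# Placeholder posts, not real data.
posts = [
    {"author": "dev_a", "category": "prompts",
     "summary": "A reusable prompt template", "url": "https://example.com/1"},
    {"author": "dev_b", "category": "tools",
     "summary": "A new CLI helper", "url": "https://example.com/2"},
]
report = build_weekly_markdown(posts, "2026-04-18")
print(report)
```

Fixing the category order keeps each week's report laid out the same way, which matters when the Markdown is later rendered to a fixed‑layout screenshot.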
Three‑step workflow:
# Step 1: Scan candidate posts
python3 scripts/scan_x_weekly.py \
--accounts references/accounts_65.txt \
--days 7 \
--outdir ./output/ai-influence-digest
# Step 2: Human review and assemble Markdown weekly report
# (filter criteria in references/filters.md)
# Step 3: Render Xiaohongshu‑style report screenshot
bash scripts/render_weekly_screenshots.sh \
./output/ai-influence-digest/weekly_report.md \
./output/ai-influence-digest/weekly_report.png \
"2026-04-18"
Summary
agentic-video-editor – automates raw footage editing into ads via an AI Agent pipeline with automatic review and up to three retry cycles.
ERNIE‑Image – 8 B diffusion‑transformer delivering state‑of‑the‑art text‑to‑image generation on a single 24 GB GPU; excels at Chinese text rendering and structured graphics.
ai-influence-digest – continuously tracks 65+ AI developers, filters high‑value updates, and produces a ready‑to‑share Chinese weekly briefing.
All projects are open source. Repository URLs: https://github.com/poseljacob/agentic-video-editor, https://github.com/baidu/ERNIE-Image, https://github.com/koffuxu/ai-influence-digest.
Geek Labs
Daily shares of interesting GitHub open-source projects. AI tools, automation gems, technical tutorials, open-source inspiration.
