AutoClip: One‑Click AI Video Highlight Extraction and Editing

AutoClip is an open‑source, locally run tool that uses Alibaba's Qwen large language model and OpenAI Whisper to automatically download, transcribe, analyze, and cut highlight segments from YouTube or Bilibili videos. It offers real‑time task monitoring, smart collections, in‑browser preview, and Docker deployment, with a roadmap of further AI‑driven features.

AI Explorer

Mar 8, 2026

What it does

Video creators often need to extract the most valuable segments from long recordings. AutoClip lets an AI watch the video, understand its content, score each segment, and automatically cut highlight clips. The user supplies only a video link or file.

Input methods

YouTube link – paste the URL; the tool downloads the video and extracts subtitles.

Bilibili link – supports BV numbers or full URLs.

Local upload – upload an existing file directly.
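The three input types could be told apart with a small helper like the one below. This is only a sketch, not AutoClip's actual code: the function name `classify_input` and the returned labels are hypothetical, and the pattern assumes a BV number is `BV` followed by ten alphanumeric characters.

```python
import re
from urllib.parse import urlparse

# Assumption: a BV number is "BV" followed by 10 alphanumeric characters.
BV_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def _host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def classify_input(value: str) -> tuple[str, str]:
    """Return (source, normalized_value) for a user-supplied input.

    Hypothetical helper; the labels are illustrative only.
    """
    value = value.strip()
    # A bare BV number is expanded to a full Bilibili video URL.
    if BV_PATTERN.fullmatch(value):
        return "bilibili", f"https://www.bilibili.com/video/{value}"

    host = urlparse(value).netloc.lower()
    if _host_matches(host, "bilibili.com"):
        return "bilibili", value
    if _host_matches(host, "youtube.com") or host == "youtu.be":
        return "youtube", value
    # Anything else is treated as a path to a local file.
    return "local", value
```

Normalizing a bare BV number to a full URL up front means the download step only ever has to deal with URLs or file paths.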

AI video‑understanding pipeline

The core pipeline consists of seven steps, powered by Alibaba's Qwen large language model and OpenAI Whisper:

Download + subtitle extraction – yt-dlp downloads the video; Whisper transcribes the audio.

AI‑generated outline – the LLM reads the transcript and produces a structured summary.

Topic timeline segmentation – identifies where key topics appear on the timeline.

Highlight scoring – evaluates each segment for information density and viewing value.

Automatic title generation – creates a catchy title for each clip.

Smart collection recommendation – clusters clips by thematic similarity and suggests groupings.

Video export – FFmpeg trims and assembles the final clips.

The entire process runs locally, so video content never leaves the user’s machine.
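The highlight scoring and export steps can be illustrated with a short sketch. It assumes each segment already carries a numeric score from the LLM; the `Segment` fields, the threshold, and both function names are hypothetical and not taken from the AutoClip code base. The FFmpeg options used (`-ss`, `-to`, `-i`, `-c copy`) are standard ones.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the video
    end: float
    score: float  # assumed to be assigned by the LLM; higher is better
    title: str


def pick_highlights(segments: list[Segment], min_score: float, limit: int) -> list[Segment]:
    """Keep the best-scoring segments, then return them in timeline order."""
    kept = [s for s in segments if s.score >= min_score]
    kept.sort(key=lambda s: s.score, reverse=True)
    return sorted(kept[:limit], key=lambda s: s.start)


def ffmpeg_trim_command(source: str, seg: Segment, output: str) -> list[str]:
    """Build an FFmpeg argument list that cuts one segment without re-encoding."""
    return [
        "ffmpeg",
        "-ss", f"{seg.start:.3f}",  # seek to the segment start
        "-to", f"{seg.end:.3f}",    # stop reading at the segment end
        "-i", source,
        "-c", "copy",               # stream copy: fast, but cuts snap to keyframes
        output,
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Stream copy keeps the export fast; re-encoding instead would give frame-accurate cuts at the cost of processing time.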

Real‑time task management

AutoClip uses a Celery asynchronous task queue together with WebSocket push notifications. Users can monitor processing progress, view status, and see the output of each task without manually refreshing the page. Multiple projects are handled in parallel.
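The publish/subscribe pattern behind such progress updates can be sketched with the standard library alone. The example below does not use Celery or WebSocket; it substitutes an in-process asyncio queue for both, and `ProgressHub`, `run_task`, and the message fields are hypothetical names chosen for illustration.

```python
import asyncio
import json


class ProgressHub:
    """In-process stand-in for a task queue plus a push channel (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: list = []

    def subscribe(self) -> asyncio.Queue:
        """Register a listener; in a real app this would be one browser connection."""
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.append(queue)
        return queue

    def publish(self, task_id: str, step: int, total: int, status: str) -> None:
        """Send one JSON progress message to every subscriber."""
        message = json.dumps(
            {"task_id": task_id, "step": step, "total": total, "status": status}
        )
        for queue in self._subscribers:
            queue.put_nowait(message)


async def run_task(hub: ProgressHub, task_id: str, steps: list[str]) -> None:
    """Pretend to run each pipeline step and report progress after it."""
    for index, name in enumerate(steps, start=1):
        await asyncio.sleep(0)  # placeholder for the real work
        hub.publish(task_id, index, len(steps), name)


async def demo() -> list[dict]:
    hub = ProgressHub()
    inbox = hub.subscribe()
    await run_task(hub, "task-1", ["download", "transcribe", "export"])
    # Drain everything the subscriber received.
    return [json.loads(inbox.get_nowait()) for _ in range(inbox.qsize())]
```

Running `asyncio.run(demo())` returns the three progress messages in order. In a deployment like the one described, the worker would publish from a Celery task and the subscriber side would forward each message over a WebSocket connection.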

Smart collections

Beyond single‑video highlights, the system can automatically group clips from different videos that share the same theme. For example, after processing ten interview videos, all segments discussing “entrepreneurial experience” can be aggregated into one collection. Users can then manually reorder or filter the clips.
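A simplified picture of cross‑video grouping: if each clip carries a topic label (assumed here to come from the LLM), a collection is the set of clips that share a label across at least two source videos. `Clip`, `topic`, `build_collections`, and the `min_videos` threshold are hypothetical, and exact label matching is only a rough stand‑in for clustering by thematic similarity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    video_id: str
    title: str
    topic: str  # assumed to be a label assigned during analysis


def build_collections(clips: list[Clip], min_videos: int = 2) -> dict[str, list[Clip]]:
    """Group clips by normalized topic, keeping topics that span several videos."""
    by_topic: dict[str, list[Clip]] = defaultdict(list)
    for clip in clips:
        by_topic[clip.topic.strip().lower()].append(clip)
    return {
        topic: group
        for topic, group in by_topic.items()
        if len({c.video_id for c in group}) >= min_videos
    }
```

Clips keep their input order inside each collection, which leaves room for the manual reordering and filtering described above.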

Preview and export

Every cut can be previewed directly in the browser before export. Once the user is satisfied, a single click renders the final video, with no need for external editing software.

Technology stack

Frontend: React 18, TypeScript, Ant Design, Vite, Zustand.

Backend: FastAPI, Celery, Redis, SQLite, yt-dlp, FFmpeg.

AI engine: Alibaba Qwen LLM for video understanding + OpenAI Whisper for speech‑to‑text.

One‑click deployment (≈3 minutes)

git clone https://github.com/zhouxiaoka/autoclip.git
cd autoclip
./docker-start.sh

After launch, the services are reachable at:

Frontend UI: localhost:3000

API docs: localhost:8000/docs

Task monitor: localhost:5555

System requirements: 4 GB+ RAM, 10 GB+ disk space, and macOS, Linux, or Windows (WSL).
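To confirm that the three services came up, a quick port check along these lines may help. It is a sketch rather than part of AutoClip: the port numbers are the ones listed above, while the `SERVICES` keys, the function names, and the assumption that everything listens on the local machine are illustrative.

```python
import socket

# Ports as listed above; the host is assumed to be the local machine.
SERVICES = {
    "frontend": 3000,
    "api": 8000,
    "task_monitor": 5555,
}


def is_port_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report() -> dict[str, bool]:
    """Check every known service and map its name to reachable or not."""
    return {name: is_port_open(port) for name, port in SERVICES.items()}
```

Calling `report()` shortly after `./docker-start.sh` finishes gives a per‑service view; a `False` entry only means the port did not accept a TCP connection, so the container logs remain the place to look for the cause.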

Planned features

Bilibili one‑click upload – cut and publish directly.

AI‑generated cover images – automatically select the best frame.

Multilingual subtitle translation – generate Chinese subtitles for English videos.

Visual subtitle editor – edit subtitles on the timeline.

Desktop client – currently recruiting beta testers.

Core value

AutoClip extracts multiple highlights from a single video through AI analysis and automatic editing. It runs locally for lightweight, efficient, privacy‑preserving processing, so creators can focus on creative work while the AI handles the repetitive tasks.

Project repository

GitHub: https://github.com/zhouxiaoka/autoclip (MIT license)
