Three Must‑Try Open‑Source AI Tools for Data Mining, PPT Creation, and Video Generation

This article highlights three recently popular open‑source projects: Spider_XHS for Xiaohongshu data collection and automated posting, PPTAgent for one‑click, multi‑scene PowerPoint generation, and Code2Video for code‑driven video creation. It covers each project's core features and GitHub link, plus deployment steps for Spider_XHS.

Old Meng AI Explorer

Amid the surge of AI utilities, three open‑source projects stand out for their practical impact on content creation, data operations, and multimedia production.

Spider_XHS – Xiaohongshu Full‑Scope Operations Tool

Core Feature Highlights

Multidimensional Data Collection : extracts note details (title, description, tags, likes, saves, comments, shares), user profiles (nickname, avatar, follower/following counts), and media resources (watermark‑free images, videos); note data can be exported to Excel, and media files can be saved locally.

Automated Content Publishing : integrates Xiaohongshu creator platform APIs, supports QR‑code or SMS login, enables batch upload of image sets or videos, and provides access to published posts and unread messages.

Targeted Crawling : can fetch all notes of a specific user, scrape homepage recommendations, or collect data from designated channels to suit diverse operational scenarios.
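
To make the shape of the collected note data concrete, here is a minimal sketch of a tabular export. It is not Spider_XHS code: the NoteRecord class and its field names are hypothetical, chosen to mirror the note details listed above, and it writes CSV with the Python standard library instead of the Excel export the tool itself provides.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class NoteRecord:
    """Hypothetical container for one collected note; field names are illustrative."""
    title: str
    description: str
    tags: str  # tags joined into one cell, e.g. "travel;outdoors"
    likes: int
    saves: int
    comments: int
    shares: int


def notes_to_csv(notes):
    """Serialize note records to CSV text: one header row, then one row per note."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(NoteRecord)])
    writer.writeheader()
    for note in notes:
        writer.writerow(asdict(note))
    return buffer.getvalue()


# Made-up sample data, only to show the resulting layout.
sample = [NoteRecord("Weekend hike", "Trail notes", "travel;outdoors", 120, 45, 8, 3)]
csv_text = notes_to_csv(sample)
print(csv_text)
```

A flat record like this keeps one note per row, which is the layout a spreadsheet export would also need.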

Quick Start Guide

Environment Preparation : install Python 3.7+ and Node.js 18+.

Project Deployment :

# Clone the repository
git clone https://github.com/cv-cat/Spider_XHS.git
cd Spider_XHS
# Install dependencies
pip install -r requirements.txt
npm install

Configure Cookie : log in to the Xiaohongshu web version, open developer tools (F12), copy the Cookie from the Network panel, and paste it into a new .env file at the project root.

Run the Tool : modify the call logic in main.py as needed, then execute python main.py. Collected data will be saved automatically in the designated folders.
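
As a rough illustration of the cookie step, the sketch below reads a cookie string from .env-style text and splits it into name/value pairs. The key name COOKIES, both helper functions, and the placeholder cookie values are assumptions made for this example; check the repository's README for the exact key the project expects.

```python
def read_env_value(env_text, key):
    """Return the value stored under `key` in .env-style text, or None if absent."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop optional surrounding single or double quotes.
            return value.strip().strip("'\"")
    return None


def cookie_string_to_dict(cookie_string):
    """Split a browser Cookie header such as "a=1; b=2" into a dict."""
    cookies = {}
    for pair in cookie_string.split(";"):
        if "=" in pair:
            name, _, value = pair.strip().partition("=")
            cookies[name] = value
    return cookies


# Placeholder content; a real cookie string is copied from the browser's Network panel.
# The key name COOKIES is an assumption for this sketch.
example_env = "COOKIES='name1=value1; name2=value2'\n"
cookies = cookie_string_to_dict(read_env_value(example_env, "COOKIES"))
print(sorted(cookies))
```

Because the cookie carries your login session, the .env file should stay out of version control.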

Open‑source address: https://github.com/cv-cat/Spider_XHS

PPTAgent – Intelligent PPT Generation

Core Feature Highlights

Multi‑Scenario Generation : generates slides either from a short text command (e.g., "Introduce Xiaomi SU7 design and price") or from a long document such as a paper, technical report, or financial statement; for long documents it uses Retrieval‑Augmented Generation (RAG) to extract the key points.

Smart Creation Workflow : employs a multi‑agent architecture that mimics human PPT design, handling topic decomposition, content organization, and layout automation; output styles adapt to business, academic, or other contexts.

High Compatibility Output : exports native .pptx files that can be freely edited, dragged, and beautified in PowerPoint without format issues.
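
The retrieval idea behind the long‑document mode can be sketched in a few lines. This is a toy stand‑in, not PPTAgent's implementation: it ranks paragraphs by plain word overlap with the request (a real RAG pipeline would typically use embeddings and a language model), and the function names and slide dictionary layout are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_key_passages(document, query, top_k=2):
    """Rank paragraphs by word overlap with the query and return the best top_k."""
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    query_words = tokenize(query)
    scored = [(len(tokenize(p) & query_words), i, p) for i, p in enumerate(paragraphs)]
    # Highest overlap first; ties keep document order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [p for score, _, p in scored[:top_k] if score > 0]


def build_outline(topic, passages):
    """Turn retrieved passages into a list of slide dicts (title plus body)."""
    slides = [{"title": topic, "body": ""}]
    for number, passage in enumerate(passages, start=1):
        slides.append({"title": f"Key point {number}", "body": passage})
    return slides


# Made-up report text, only to exercise the functions.
report = (
    "Revenue grew 12 percent year over year.\n\n"
    "The office moved to a new building.\n\n"
    "Operating costs fell, which lifted profit margins."
)
passages = retrieve_key_passages(report, "revenue and profit growth")
outline = build_outline("Annual results", passages)
for slide in outline:
    print(slide["title"], "|", slide["body"])
```

The point of the retrieval step is that only the passages relevant to the request reach the slide‑writing stage, which keeps long reports from overwhelming the deck.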

Application Scenarios

Business reporting: upload annual financial reports to auto‑generate data‑visualization slides.

Academic presentations: import research papers to produce concise defense slides covering background, methods, and conclusions.

Classroom teaching: input topics like "Decode the impact of legislative processes on international relations" to quickly create teaching decks.

Open‑source address: https://github.com/icip-cas/PPTAgent

Code2Video – Code‑Driven Video Generation

Core Feature Highlights

Unique Generation Logic : built on Manim, the animation engine originally created by 3Blue1Brown. Every frame is rendered from code, so the resulting videos are precise and can be adjusted down to individual frames.

AI‑Powered Assistance : an AI agent automatically writes the required Manim code from a natural‑language prompt, lowering the barrier for users without deep programming expertise.

HD Watermark‑Free Output : renders high‑definition videos without watermarks, suitable for educational courses, scientific outreach, and technical demos.
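
To show what "code‑driven" means in practice, the sketch below generates the source of a small Manim scene from a title and a few bullet points. It is only a template‑based stand‑in for the AI agent step, which in Code2Video is handled by a language model; build_scene_source and the example scene name are hypothetical, and the generated code assumes the Manim Community edition.

```python
import ast


def build_scene_source(class_name, title, points):
    """Return Manim Community source for a scene that writes a title, then each point."""
    lines = [
        "from manim import DOWN, UP, Scene, Text, Write",
        "",
        "",
        f"class {class_name}(Scene):",
        "    def construct(self):",
        f"        heading = Text({title!r}).to_edge(UP)",
        "        self.play(Write(heading))",
        "        previous = heading",
    ]
    for point in points:
        lines += [
            f"        item = Text({point!r}, font_size=32).next_to(previous, DOWN)",
            "        self.play(Write(item))",
            "        previous = item",
        ]
    lines.append("        self.wait(1)")
    return "\n".join(lines) + "\n"


source = build_scene_source(
    "BinarySearchIntro", "Binary Search", ["Sorted input", "Halve the range"]
)
ast.parse(source)  # raises SyntaxError if the generated code is malformed
print(source)
```

Saving the printed source as scene.py and running manim -pqh scene.py BinarySearchIntro should render it in high quality, assuming Manim Community is installed. Since the video is defined entirely by this text, changing a string or a position and re‑rendering is all it takes to revise it.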

Open‑source address: https://github.com/showlab/Code2Video

These three projects collectively address high‑frequency scenarios such as content operation, office productivity, and knowledge dissemination, offering easy deployment and strong practicality for both professional developers and newcomers.

Tags: AI tools, Video Generation, open-source, data-scraping, PPT automation