PPTAgent: An Open‑Source AI System for Automated Presentation Generation Using a Two‑Stage Editing Approach
PPTAgent is an open‑source AI tool jointly developed by the Chinese Academy of Sciences and Shanghai Jiexin Technology. It creates PowerPoint slides automatically by analyzing reference decks, extracting layout patterns, and iteratively editing content with a self‑correction mechanism, and it scores higher on content, design, and coherence than existing methods.
PPTAgent was created by researchers from the Chinese Academy of Sciences, the University of Chinese Academy of Sciences, and Shanghai Jiexin Technology. Given a set of high‑quality reference slides, it extracts their content patterns and layout structures, then edits slides step by step to meet the user's requirements, which removes much of the manual work of building a deck.
The core innovation is a two‑stage generation method modeled on how people build presentations. In the first stage, PPTAgent sorts the reference slides into structural slides and content slides, uses multimodal large models to identify the role each layout plays, and groups visually similar slides by applying hierarchical clustering to the slide images. The result is a clear structural reference for the generation stage.
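The grouping step can be pictured with a small, self-contained sketch. It is not PPTAgent's implementation: the feature vectors, the slide names, the average-linkage rule, and the 0.9 similarity threshold are all assumptions chosen for illustration, standing in for whatever image features and settings the real system uses.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_slides(features, threshold=0.9):
    """Average-linkage agglomerative clustering over slide features.

    features: dict mapping a slide id to a feature vector (hypothetical
    stand-ins for image embeddings). Merging stops once no pair of
    clusters has an average similarity of at least `threshold`.
    """
    clusters = [[slide_id] for slide_id in features]

    def linkage(c1, c2):
        sims = [cosine(features[a], features[b]) for a in c1 for b in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy features: two title-like slides, two bullet slides, one image slide.
slides = {
    "title": [1.0, 0.0, 0.1],
    "closing": [0.9, 0.1, 0.1],
    "bullets_a": [0.0, 1.0, 0.2],
    "bullets_b": [0.1, 0.9, 0.3],
    "image_left": [0.1, 0.2, 1.0],
}
groups = cluster_slides(slides)
print(groups)
```

Each resulting group plays the role of one reusable layout: a later stage can pick a representative slide from the group that best fits the content to be presented.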
During the second stage, PPTAgent adopts an edit‑based generation approach: it selects appropriate reference slides and incrementally edits them using a set of editing APIs (edit, delete, copy) that operate on HTML‑rendered slide elements, enabling intuitive manipulation and preserving the original design aesthetics.
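To make the edit‑based idea concrete, here is a minimal sketch of edit, delete, and copy operations on a slide whose elements are rendered as HTML. The `Slide` class, its method names, and the element ids are hypothetical and are not taken from the PPTAgent codebase; the point is only that generation becomes a short sequence of small edits to an existing reference slide.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Slide:
    """A toy slide: an ordered mapping of element id to text content."""
    elements: dict = field(default_factory=dict)

    def _require(self, element_id):
        if element_id not in self.elements:
            raise KeyError(f"no element with id {element_id!r}")

    def edit(self, element_id, new_text):
        """Replace the text of an existing element."""
        self._require(element_id)
        self.elements[element_id] = new_text

    def delete(self, element_id):
        """Remove an element from the slide."""
        self._require(element_id)
        del self.elements[element_id]

    def copy(self, element_id, new_id):
        """Duplicate an element under a new id, appended at the end."""
        self._require(element_id)
        if new_id in self.elements:
            raise ValueError(f"element id {new_id!r} already exists")
        self.elements[new_id] = self.elements[element_id]

    def to_html(self):
        """Render the elements as simple HTML paragraphs."""
        body = "".join(
            f'<p id="{escape(eid)}">{escape(text)}</p>'
            for eid, text in self.elements.items()
        )
        return f'<div class="slide">{body}</div>'


# Start from a reference slide and adapt it with a few edits.
slide = Slide({
    "title": "Quarterly Results",
    "bullet_1": "Revenue grew",
    "bullet_2": "Costs fell",
})
slide.edit("title", "PPTAgent Overview")
slide.edit("bullet_1", "Analyze reference decks")
slide.copy("bullet_1", "bullet_3")
slide.edit("bullet_3", "Edit slides step by step")
slide.delete("bullet_2")
html_output = slide.to_html()
print(html_output)
```

Because every operation targets an element that already exists in the reference slide, the layout and styling of the original are left alone and only the content changes.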
A self‑correction mechanism runs in a REPL environment; when an edit cannot be applied, the REPL returns feedback, allowing the model to adjust its operations and avoid inconsistent or erroneous slides, thus ensuring high‑quality, coherent output.
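The feedback loop described above can be sketched as follows. Everything here is an illustrative assumption: `EditError`, `execute`, `apply_with_self_correction`, the retry limit of three, and the `stub_model` function that stands in for the language model are hypothetical names, not PPTAgent's API.

```python
class EditError(Exception):
    """Raised when an edit command cannot be applied to the slide."""


def execute(slide, op, element_id, text=None):
    """Apply one edit command to a slide stored as {element id: text}."""
    if element_id not in slide:
        raise EditError(
            f"element {element_id!r} not found; available: {sorted(slide)}"
        )
    if op == "edit":
        slide[element_id] = text
    elif op == "delete":
        del slide[element_id]
    else:
        raise EditError(f"unknown operation {op!r}")


def apply_with_self_correction(slide, propose, max_attempts=3):
    """Run proposed commands, feeding failures back to the proposer.

    propose: callable taking the last feedback string (or None) and
    returning (op, element_id, text). It stands in for the model.
    Returns the number of attempts that were needed.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        op, element_id, text = propose(feedback)
        try:
            execute(slide, op, element_id, text)
            return attempt
        except EditError as err:
            # The error message is the feedback for the next proposal.
            feedback = str(err)
    raise RuntimeError(f"edit failed after {max_attempts} attempts: {feedback}")


def stub_model(feedback):
    """Pretend model: targets a missing element first, then corrects itself."""
    if feedback is None:
        return ("edit", "subtitle", "An edit-based approach")
    return ("edit", "title", "An edit-based approach")


slide = {"title": "Old title", "body": "Old body"}
attempts = apply_with_self_correction(slide, stub_model)
print(attempts, slide)
```

The design choice worth noting is that a failed command leaves the slide unchanged, so a bad edit never reaches the output; it only produces feedback for the next attempt.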
The evaluation used 50 reference decks from the Zenodo10K dataset and 50 input documents spanning 5 domains. Pairing the 10 documents with the 10 reference decks in each domain gives 5 × 10 × 10 = 500 generation tasks. On these tasks PPTAgent outperforms the rule‑based DocPres and the template‑based KCTV, improving content quality by 12.1%–28.6%, design by 13.2%–40.9%, and coherence by 25.5%–36.6%.
The source code is publicly available on GitHub: https://github.com/icip-cas/PPTAgent