Recreating NotebookLM’s PPT Generation with a Low‑Code Workflow

This guide shows how to use the open‑source BISHENG low‑code platform, ByteDance’s Seed‑1.6 and Seedream‑4.5 models, and a custom MCP server to build a workflow that uploads documents, performs RAG, generates structured PPT outlines with LLMs, creates page images via text‑to‑image models, and assembles a downloadable PDF, all while incorporating human‑in‑the‑loop controls.


In a previous article, the author recreated Google NotebookLM's video‑summary feature in pure code, an approach that proved complex and hard to maintain. This follow‑up demonstrates a low‑code alternative using the open‑source BISHENG visual workflow platform, ByteDance's Seed‑1.6 and gpt‑4o‑mini LLMs, the Seedream‑4.5 text‑to‑image model, and a self‑written MCP server.

Technology Stack

Low‑code platform: BISHENG (open‑source visual workflow engine)

LLM: ByteDance Seed‑1.6 reasoning model + gpt‑4o‑mini

Text‑to‑image: Seedream‑4.5

External tool: Custom MCP Server for image download, PDF assembly and OSS upload

Core Capabilities Replicated

Document upload → temporary vector knowledge base

Agentic RAG for factual Q&A

Automatic PPT generation from uploaded documents

The workflow is built entirely in BISHENG’s backend; a front‑end UI would be needed for a production‑ready product.

Getting Started with BISHENG

Install and launch BISHENG via Docker:

git clone https://github.com/dataelement/bisheng.git
cd bisheng/docker
docker compose -f docker-compose.yml -p bisheng up -d

After the containers start, open http://localhost:3001/, register an admin account, and configure the required LLM/Embedding models in the “Model” menu.

Workflow Overview (Part 01 – Preparation)

1. Upload documents – the “Input” node accepts multiple files, parses and chunks them, and builds a temporary vector store.

2. Intent recognition – an LLM node determines whether the user wants factual Q&A or PPT generation.

3. Conditional branching – routes to either a RAG answer node or the PPT pipeline.
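Steps 2 and 3 can be sketched as a prompt plus a small parser, as they might be wired into a BISHENG LLM node and a downstream code node. The prompt wording, function name, and branch labels below are illustrative, not BISHENG built-ins:

```python
# Hypothetical intent-recognition step: the LLM node is asked to reply
# with a single label, and a code node maps that reply onto the two
# branches of the conditional node.

INTENT_PROMPT = (
    "Classify the user's request into exactly one label:\n"
    "- FACTUAL_QA: the user asks a question about the uploaded documents\n"
    "- PPT_GENERATION: the user asks for a slide deck / PPT\n"
    "Reply with the label only.\n\n"
    "User request: {question}"
)

def parse_intent(llm_reply: str) -> str:
    """Map the LLM's free-form reply to a branch label for the condition node."""
    return "ppt_generation" if "ppt" in llm_reply.strip().lower() else "factual_qa"
```

The conditional-branch node then routes on the returned label, falling back to factual Q&A when the reply is ambiguous.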

PPT Generation Pipeline (Part 02 – Document → PPT)

1. LLM creates a structured outline for each slide (title, bullet points, description).

2. Visual prompt generation – the LLM produces page‑specific prompts for the image model.

3. Image creation – Seedream‑4.5 is called via a Python code node (or MCP tool node) to generate a PNG for each slide.

4. Looped page generation – BISHENG workflows support looping natively, so the flow emits each slide image in sequence.

5. PDF assembly – a custom MCP tool downloads all images, merges them into a PDF, uploads the file to Alibaba Cloud OSS, and returns a download link.

Human‑in‑the‑Loop (HITL) Controls (Part 03‑04)

Because image generation can be unstable, a HITL step is added after each page:

Show the generated image and ask the user to continue or regenerate.

If regeneration is chosen, the previous visual prompt is presented for editing before re‑invoking the image model.

This interaction is implemented with BISHENG’s input and output nodes, which can be placed anywhere in the flow, allowing mid‑process user feedback.
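The per-page control flow can be sketched as a small loop; here `generate` stands in for the Seedream call and `ask_user` for BISHENG's output/input node pair, both injected so that only the loop logic itself is claimed:

```python
# Hypothetical HITL loop for one slide: show the image, let the user
# either accept it or edit the visual prompt and regenerate.

def hitl_page_loop(initial_prompt, generate, ask_user, max_rounds=3):
    """Generate a page image, allowing prompt edits and retries."""
    prompt = initial_prompt
    image = generate(prompt)
    for _ in range(max_rounds):
        decision, edited_prompt = ask_user(image, prompt)
        if decision == "continue":
            break
        prompt = edited_prompt or prompt  # keep the old prompt if unedited
        image = generate(prompt)
    return image, prompt
```

Capping the number of rounds keeps an unstable image model from trapping the workflow in an endless regenerate cycle.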

Observability and Debugging

BISHENG provides a visual “Agent Debug IDE”: each node’s inputs/outputs, tool call parameters, branch decisions and front‑end mockups are viewable, making multi‑step LLM reasoning far easier to debug than pure code.

Limitations and Future Work

Generated PPTs are image‑only; text is not editable without post‑processing (e.g., converting PDF back to PPT).

Seedream‑4.5 struggles with dense textual slides compared to Nano‑Banana‑Pro.

BISHENG currently lacks global variables, workflow nesting, and a rich built‑in tool ecosystem, requiring custom code nodes or MCP extensions.

Conclusion

The prototype validates that a low‑code visual workflow can orchestrate LLM‑driven document processing, RAG, and image‑based PPT creation with interactive HITL steps. While not a production‑ready product, the demo showcases the rapid prototyping power of BISHENG, the extensibility of code nodes/MCP, and the evolving capabilities of low‑code platforms for AI‑centric applications.

Tags: LLM, workflow, low-code, PPT generation, BISHENG, HITL
Written by

AI Large Model Application Practice

Focused on deep research and development of large-model applications. Authors of "RAG Application Development and Optimization Based on Large Models" and "MCP Principles Unveiled and Development Guide". Primarily B2B, with B2C as a supplement.
