Recreating NotebookLM’s PPT Generation with a Low‑Code Workflow
This guide shows how to combine the open‑source BISHENG low‑code platform, ByteDance’s Seed‑1.6 and Seedream‑4.5 models, and a custom MCP server into a workflow that uploads documents, performs RAG, generates structured PPT outlines with an LLM, renders page images with a text‑to‑image model, and assembles a downloadable PDF, with human‑in‑the‑loop controls throughout.
In a previous article the author recreated Google NotebookLM’s video‑summary feature in pure code, which proved complex and hard to maintain. This follow‑up demonstrates a low‑code alternative built on the open‑source BISHENG visual workflow platform, ByteDance’s Seed‑1.6 reasoning model together with gpt‑4o‑mini, the Seedream‑4.5 text‑to‑image model, and a self‑written MCP server.
Technology Stack
Low‑code platform: BISHENG (open‑source visual workflow engine)
LLM: ByteDance Seed‑1.6 inference model + gpt‑4o‑mini
Text‑to‑image: Seedream‑4.5
External tool: Custom MCP Server for image download, PDF assembly and OSS upload
Core Capabilities Replicated
Document upload → temporary vector knowledge base
Agentic RAG for factual Q&A
Automatic PPT generation from uploaded documents
The workflow is built entirely in BISHENG’s backend; a front‑end UI would be needed for a production‑ready product.
Getting Started with BISHENG
Install and launch BISHENG via Docker:
git clone https://github.com/dataelement/bisheng.git
cd bisheng/docker
docker compose -f docker-compose.yml -p bisheng up -d
After the containers start, open http://localhost:3001/, register an admin account, and configure the required LLM/Embedding models in the “Model” menu.
Workflow Overview (Part 01 – Preparation)
1. Upload documents – the “Input” node accepts multiple files, parses them, slices them, and builds a temporary vector store.
2. Intent recognition – an LLM node determines whether the user wants factual Q&A or PPT generation.
3. Conditional branching – routes to either a RAG answer node or the PPT pipeline.
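The intent‑recognition and branching steps above can be sketched as a small routing helper. The prompt wording and the branch names are illustrative assumptions, not BISHENG’s actual node configuration:

```python
# Hypothetical sketch of the intent-recognition step (Part 01, step 2).
# The LLM node receives a routing prompt; its reply is mapped onto one of
# the two conditional branches (Part 01, step 3).

INTENT_PROMPT = """You are a router. Classify the user's request into exactly
one label: QA (a factual question about the uploaded documents) or
PPT (generate a presentation from the documents).
Reply with the label only.

User request: {query}"""


def build_intent_prompt(query: str) -> str:
    """Fill the routing prompt with the user's message."""
    return INTENT_PROMPT.format(query=query)


def parse_intent(llm_reply: str) -> str:
    """Map the LLM's raw reply onto a branch name; default to the RAG branch."""
    label = llm_reply.strip().upper()
    return "ppt_pipeline" if label.startswith("PPT") else "rag_answer"
```

In BISHENG this logic lives in an LLM node plus a condition node; the sketch only shows the prompt/parse contract between them.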
PPT Generation Pipeline (Part 02 – Document → PPT)
1. LLM creates a structured outline for each slide (title, bullet points, description).
2. Visual prompt generation – the LLM produces page‑specific prompts for the image model.
3. Image creation – Seedream‑4.5 is called via a Python code node (or MCP tool node) to generate a PNG for each slide.
4. Looped page generation – BISHENG workflows support native looping, so each slide image is produced and emitted sequentially.
5. PDF assembly – a custom MCP tool downloads all images, merges them into a PDF, uploads the file to Alibaba Cloud OSS, and returns a download link.
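Steps 1 and 2 of the pipeline hinge on a structured outline contract between the LLM and the image model. The field names and prompt template below are illustrative assumptions, not the article’s exact schema:

```python
# Sketch of the per-slide outline the LLM is asked to emit (step 1) and how a
# page-specific visual prompt could be derived from it (step 2).
import json
from dataclasses import dataclass


@dataclass
class SlideOutline:
    title: str
    bullets: list        # bullet points for the slide
    description: str     # short prose description of the slide's content


def parse_outline(llm_json: str) -> list:
    """Parse the LLM's structured outline, expected as a JSON array of slides."""
    return [SlideOutline(**page) for page in json.loads(llm_json)]


def visual_prompt(slide: SlideOutline, style: str = "clean corporate slide") -> str:
    """Compose the text-to-image prompt for one slide page."""
    points = "; ".join(slide.bullets)
    return (f"{style}, 16:9 aspect ratio, title '{slide.title}', "
            f"key points: {points}. {slide.description}")
```

Each generated prompt is then fed to Seedream‑4.5 inside the loop; the MCP tool later collects the resulting PNGs for PDF assembly.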
Human‑in‑the‑Loop (HITL) Controls (Part 03‑04)
Because image generation can be unstable, a HITL step is added after each page:
Show the generated image and ask the user to continue or regenerate.
If regeneration is chosen, the previous visual prompt is presented for editing before re‑invoking the image model.
This interaction is implemented with BISHENG’s input and output nodes, which can be placed anywhere in the flow, allowing mid‑process user feedback.
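The per‑page review cycle can be sketched as a small loop. Here generate_image and ask_user stand in for the Seedream call and BISHENG’s output/input nodes respectively; both names, and the retry cap, are assumptions for illustration:

```python
# Hypothetical sketch of the per-page HITL loop (Part 03-04): generate a page,
# show it, and either accept it or re-invoke the image model with an
# optionally edited prompt.

def review_page(prompt: str, generate_image, ask_user, max_retries: int = 3):
    """Return the first image the user accepts, regenerating on request."""
    image = None
    for _ in range(max_retries):
        image = generate_image(prompt)
        # ask_user shows the image and returns ("continue", None) or
        # ("regenerate", edited_prompt) -- the user may edit the visual prompt.
        decision, edited_prompt = ask_user(image, prompt)
        if decision == "continue":
            return image
        prompt = edited_prompt or prompt
    return image  # after max_retries, keep the last attempt
```

In the actual workflow this loop is expressed with BISHENG nodes rather than Python, but the control flow is the same.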
Observability and Debugging
BISHENG provides a visual “Agent Debug IDE”: each node’s inputs/outputs, tool call parameters, branch decisions and front‑end mockups are viewable, making multi‑step LLM reasoning far easier to debug than pure code.
Limitations and Future Work
Generated PPTs are image‑only; text is not editable without post‑processing (e.g., converting PDF back to PPT).
Seedream‑4.5 struggles with dense textual slides compared to Nano‑Banana‑Pro.
BISHENG currently lacks global variables, workflow nesting, and a rich built‑in tool ecosystem, requiring custom code nodes or MCP extensions.
Conclusion
The prototype validates that a low‑code visual workflow can orchestrate LLM‑driven document processing, RAG, and image‑based PPT creation with interactive HITL steps. While not a production‑ready product, the demo showcases the rapid prototyping power of BISHENG, the extensibility of code nodes/MCP, and the evolving capabilities of low‑code platforms for AI‑centric applications.
AI Large Model Application Practice
Focused on deep research and development of large-model applications. Authors of "RAG Application Development and Optimization Based on Large Models" and "MCP Principles Unveiled and Development Guide". Primarily B2B, with B2C as a supplement.