Building a Multimodal RAG Front‑End with Trae Solo: A Vibe‑Coding Guide
This article walks through a three‑step Vibe‑Coding workflow (structured prompt creation, prompt optimization with DeepSeek, and precise bug‑fix guidance) used to generate, refine, and extend a React + TypeScript front‑end for a multimodal RAG system with Trae Solo. It covers the overall architecture, streaming chat, and PDF citation tracing.
Vibe‑Coding Best‑Practice Guide
The author first outlines a three‑step "Vibe Coding" methodology for guiding large language models (LLMs) to generate reliable code:
Structured Prompt Construction – Define role (e.g., senior front‑end engineer), clear requirements (upload PDF, image, audio, chat), and coding conventions (TypeScript, React, Ant Design, Tailwind, Vite).
Prompt Optimization – Pass the raw prompt to a secondary LLM (DeepSeek) to improve logical flow and clarity before feeding it to Trae Solo.
Precise Issue Localization – When bugs appear, manually identify the problematic file or code fragment, provide the error message and relevant snippet, and ask the model for a targeted fix.
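The three parts of a structured prompt can be sketched as a small template builder. This is a hypothetical helper for illustration; the field names and output format are assumptions, not Trae Solo's or the article's actual prompt text:

```typescript
// Hypothetical prompt template mirroring the three-part structure above.
// All field names and the output wording are illustrative assumptions.
interface PromptSpec {
  role: string;           // e.g. "senior front-end engineer"
  requirements: string[]; // features: upload PDF/image/audio, chat, ...
  conventions: string[];  // standards: TypeScript, React, Ant Design, ...
}

function buildPrompt(spec: PromptSpec): string {
  return [
    `Role: You are a ${spec.role}.`,
    'Requirements:',
    ...spec.requirements.map((r, i) => `  ${i + 1}. ${r}`),
    'Conventions:',
    ...spec.conventions.map((c) => `  - ${c}`),
  ].join('\n');
}

const prompt = buildPrompt({
  role: 'senior front-end engineer',
  requirements: ['Upload and chat over PDF, image, and audio files'],
  conventions: ['TypeScript + React', 'Ant Design + Tailwind', 'Vite build'],
});
```

The resulting string would then be passed to DeepSeek for optimization (step 2) before being handed to Trae Solo.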
Trae Solo Front‑End Development
Using the optimized prompt, Trae Solo performs the following automated steps:
Environment Setup – Analyzes the FastAPI back‑end, creates a Vite project, installs dependencies, and configures the development environment.
Structured Code Generation – Generates a modular directory layout (src/pages, src/components, src/modules), adheres to the prescribed coding standards, and produces component files for each RAG feature (text, image, audio, PDF).
Project Structure Verification – Confirms that the generated file tree matches the design, with semantic naming and clear module separation.
Bug Fixing and Feature Extension
Even when generating thousands of lines of code, Trae Solo can introduce bugs. The workflow recommends:
Identify the error (e.g., "Encountered two children with the same key" in pdf.tsx).
Supply the error message and the suspect file to Trae Solo for analysis and correction.
Handle environment‑related runtime errors (e.g., "localStorage is not defined") by providing the model with the execution context.
Add small features such as chat history and clickable PDF references by describing them in natural language.
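For the localStorage runtime error above, one common fix is an environment guard that falls back gracefully when the API is unavailable (e.g., during server-side rendering or in a restricted browser mode). This is a minimal sketch with a hypothetical helper name, not the article's actual patch:

```typescript
// Hypothetical guard: read from localStorage only when it exists,
// and fall back on any access error (SSR, privacy mode, quota issues).
function safeGetItem(key: string, fallback: string): string {
  if (typeof localStorage === 'undefined') return fallback;
  try {
    return localStorage.getItem(key) ?? fallback;
  } catch {
    return fallback;
  }
}

// Example: restore chat history, defaulting to an empty list.
const history: unknown[] = JSON.parse(safeGetItem('chat-history', '[]'));
```

Describing this kind of fix precisely (error message plus the file that reads localStorage) is exactly the "precise issue localization" step from the workflow.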
Key Front‑End Code Insights
Modular Architecture & Routing
The project uses react-router-dom@7 with a Home.tsx layout that maps four core routes (smart Q&A, image analysis, audio transcription, PDF parsing) to separate pages, enabling lazy loading via React.lazy() and Suspense.
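The four routes can be represented as a plain data table that the Home.tsx layout maps to lazily loaded pages. The paths, titles, and file names below are illustrative assumptions, not the generated project's exact values:

```typescript
// Hypothetical route table for the four feature pages.
// In the real app each entry would be wired up roughly as:
//   const Page = React.lazy(() => import(entry.file));
//   <Route path={entry.path} element={<Suspense fallback={...}><Page /></Suspense>} />
interface FeatureRoute {
  path: string;  // URL path under the Home.tsx layout
  title: string; // entry-point label shown on the home page
  file: string;  // lazily imported page module
}

const featureRoutes: FeatureRoute[] = [
  { path: '/chat',  title: 'Smart Q&A',           file: './pages/chat' },
  { path: '/image', title: 'Image Analysis',      file: './pages/image' },
  { path: '/audio', title: 'Audio Transcription', file: './pages/audio' },
  { path: '/pdf',   title: 'PDF Parsing',         file: './pages/pdf' },
];
```

Keeping the routes as data makes the home page's four entry points and the router configuration derive from a single source.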
Component Reuse Strategy
All feature pages share a common ChatPage component. Props such as title, extraUploadComponent, and apiEndpoint customize behavior, avoiding code duplication while maintaining a consistent UI.
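The customization props can be sketched as an interface plus a per-page configuration function. The prop names follow the article's description, but the exact shape of the generated code may differ:

```typescript
// Hypothetical ChatPage props, based on the props named in the article.
interface ChatPageProps {
  title: string;                  // page heading, e.g. "PDF Parsing"
  apiEndpoint: string;            // back-end streaming endpoint for this feature
  extraUploadComponent?: unknown; // optional upload widget (a ReactNode in the real app)
}

// Each feature page reuses ChatPage by supplying only its own configuration;
// the endpoint path here is an assumption for illustration.
function pdfPageConfig(): ChatPageProps {
  return { title: 'PDF Parsing', apiEndpoint: '/api/pdf/chat' };
}
```

A feature page then renders ChatPage with its config, so all four pages share one chat implementation.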
Streaming Chat Implementation
In ChatPage/index.tsx the front‑end maintains a messages array via useState and stores history in localStorage (demo only). When a user sends a message, the client calls the back‑end streaming endpoint with fetch, reads the Response.body as a ReadableStream, and processes chunks:
Incremental Update – On receiving a content_delta chunk, the UI updates the current reply character‑by‑character.
Completion Handling – When a message_complete flag arrives, the full message is pushed to the messages state.
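The two chunk types can be handled by a small reducer over the parsed stream events. This is a dependency-free sketch; the event field names are assumptions inferred from the chunk types above, and the real code feeds it from the fetch ReadableStream loop:

```typescript
// Hypothetical stream-event shapes; the real back-end payload may differ.
type StreamEvent =
  | { type: 'content_delta'; delta: string }
  | { type: 'message_complete' };

interface ChatState {
  messages: string[]; // completed assistant messages
  current: string;    // reply currently being streamed in
}

// Pure reducer: append deltas to the in-flight reply, then commit it
// to the messages list when the completion flag arrives.
function applyChunk(state: ChatState, event: StreamEvent): ChatState {
  switch (event.type) {
    case 'content_delta':
      return { ...state, current: state.current + event.delta };
    case 'message_complete':
      return { messages: [...state.messages, state.current], current: '' };
  }
}
```

In the component, each reduced state would be pushed through useState's setter, so the UI re-renders as the reply grows.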
PDF Reference Tracing
After the streaming response finishes, the back‑end includes a references array containing source snippets and page numbers. The front‑end calls renderContentWithReferences to:
Match placeholder tags like [1] with entries in references.
Render the placeholders as clickable links or buttons.
Show the original excerpt in a modal or sidebar when the user clicks a reference, providing full traceability of the answer.
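The placeholder-matching step can be sketched as a pure function that splits the answer into text and reference segments. The function and field names here are assumptions; the article's renderContentWithReferences additionally handles the React rendering of links and modals:

```typescript
// Hypothetical shapes for the back-end's references array.
interface Reference { snippet: string; page: number; }

type Segment =
  | { kind: 'text'; value: string }
  | { kind: 'ref'; index: number; reference: Reference };

// Split "... [1] ..." into plain-text pieces and clickable reference pieces.
// Placeholders whose number has no matching entry are left as plain text.
function splitContentWithReferences(content: string, refs: Reference[]): Segment[] {
  const segments: Segment[] = [];
  let last = 0;
  for (const m of content.matchAll(/\[(\d+)\]/g)) {
    const index = Number(m[1]);
    const reference = refs[index - 1]; // [1] refers to the first reference
    if (!reference) continue;
    if (m.index! > last) segments.push({ kind: 'text', value: content.slice(last, m.index) });
    segments.push({ kind: 'ref', index, reference });
    last = m.index! + m[0].length;
  }
  if (last < content.length) segments.push({ kind: 'text', value: content.slice(last) });
  return segments;
}
```

The renderer then maps each 'ref' segment to a button that opens the snippet and page number in a modal or sidebar.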
Final Outcome
Running npm run dev launches the application, displaying a clean home page with four entry points. Each page supports real‑time streaming dialogue, file uploads (image, audio, PDF), and clickable PDF citations. The author provides the complete source via the "大模型真好玩" ("Fun with Large Models") WeChat account.
Conclusion
The guide demonstrates that, with a disciplined Vibe‑Coding workflow and Trae Solo’s code‑generation capabilities, developers can rapidly prototype a full‑stack multimodal RAG system, achieve modular and maintainable front‑end code, and handle complex features such as streaming responses and reference tracing.
Fun with Large Models
A master's graduate of Beijing Institute of Technology with four top‑journal publications, formerly a developer at ByteDance and Alibaba, now researching large models at a major state‑owned enterprise. Committed to sharing concise, practical experience in AI large‑model development, in the belief that large models will become as essential as the PC. Let's start experimenting now!
