Turn Static Markdown Docs into AI-Powered Q&A with ChatGPT and Embedbase
This guide walks you through building an intelligent documentation system that stores markdown files in Embedbase, creates contextual prompts, and uses ChatGPT to answer user queries, covering setup, code integration, and deployment steps for a full‑stack solution.
ChatGPT can add AI support to traditional systems, enhancing user experience. This article explains how to add ChatGPT Q&A to an online Markdown documentation system using OpenAI and Embedbase.
Overview
We will:
Store content in a database.
Allow users to input queries.
Search the database for the most similar results.
Create a context from the top 5 matches and query ChatGPT.
Implementation Details
Prerequisites:
Embedbase API key – a database that can return the most similar results.
OpenAI API key – for ChatGPT.
Nextra and Node.js installed.
In the .env file, add your keys:
OPENAI_API_KEY="<YOUR KEY>"
EMBEDBASE_API_KEY="<YOUR KEY>"We use the Nextra documentation framework (Next.js, Tailwind CSS, MDX) and Embedbase as the vector store.
Create Nextra Docs
Clone the official Nextra template from GitHub, then run:
# we won't use "pnpm" here, rather the traditional "npm"
rm pnpm-lock.yaml
npm i
npm run devVisit https://localhost:3000 and edit .mdx files.
Prepare and Store Files
Write a scripts/sync.js script to read all .mdx files, split them into 100‑line chunks, and upload them to Embedbase. Install [email protected] first.
const glob = require("glob");
const fs = require("fs");
const sync = async () => {
// 1. read all files under pages/* with .mdx extension
const documents = glob.sync("pages/**/*.mdx").map((path) => ({
id: path.replace("pages/", "/").replace("index.mdx", "").replace(".mdx", ""),
data: fs.readFileSync(path, "utf-8")
}));
// 2. split documents into chunks of 100 lines
const chunks = [];
documents.forEach((document) => {
const lines = document.data.split("
");
const chunkSize = 100;
for (let i = 0; i < lines.length; i += chunkSize) {
const chunk = lines.slice(i, i + chunkSize).join("
");
chunks.push({ data: chunk });
}
});
};
sync();Upload chunks to Embedbase:
const fetch = require("node-fetch");
const apiKey = process.env.EMBEDBASE_API_KEY;
const response = await fetch("https://embedbase-hosted-usx5gpslaq-uc.a.run.app/v1/documentation", {
method: "POST",
headers: {
"Authorization": "Bearer " + apiKey,
"Content-Type": "application/json"
},
body: JSON.stringify({ documents: chunks })
});
const data = await response.json();
console.log(data);Run the script:
EMBEDBASE_API_KEY="<YOUR API KEY>" node scripts/sync.jsGet User Queries
Replace the built‑in search bar with a ChatGPT‑enabled modal component in theme.config.tsx. Add a Modal component and a Search component that opens the modal, captures the question, builds a contextual prompt via /api/buildPrompt, and streams the answer from /api/qa.
// Modal component (simplified)
const Modal = ({ children, open, onClose }) => {
if (!open) return null;
return (
<div style={{position:'fixed',top:0,left:0,right:0,bottom:0,backgroundColor:'rgba(0,0,0,0.5)',zIndex:100}} onClick={onClose}>
<div style={{position:'absolute',top:'50%',left:'50%',transform:'translate(-50%,-50%)',backgroundColor:'#fff',padding:20,borderRadius:5,width:'80%',maxWidth:700,maxHeight:'80%',overflow:'auto'}} onClick={e=>e.stopPropagation()}>
{children}
</div>
</div>
);
};
// Search component (simplified)
const Search = () => {
const [open, setOpen] = useState(false);
const [question, setQuestion] = useState("");
const [answer, setAnswer] = useState("");
const answerQuestion = async (e) => {
e.preventDefault();
setAnswer("");
const promptRes = await fetch("/api/buildPrompt", {method:"POST",headers:{"Content-Type":"application/json"},body:JSON.stringify({prompt:question})});
const {prompt} = await promptRes.json();
const response = await fetch("/api/qa", {method:"POST",headers:{"Content-Type":"application/json"},body:JSON.stringify({prompt})});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let done = false;
while (!done) {
const {value, done: doneReading} = await reader.read();
done = doneReading;
const chunk = decoder.decode(value);
setAnswer(prev => prev + chunk);
}
};
return (
<>
<input placeholder="Ask a question" onClick={() => setOpen(true)} type="text" />
<Modal open={open} onClose={() => setOpen(false)}>
<form onSubmit={answerQuestion} className="nx-flex nx-gap-3">
<input placeholder="Ask a question" type="text" value={question} onChange={e=>setQuestion(e.target.value)} />
<button type="submit">Ask</button>
</form>
<p>{answer}</p>
</Modal>
</>
);
};Build Prompt
Create pages/api/buildPrompt.ts to fetch similar documents from Embedbase, assemble a context limited by token count using tiktoken, and return a prompt for ChatGPT.
import { get_encoding } from "@dqbd/tiktoken";
const enc = get_encoding('cl100k_base');
const apiKey = process.env.EMBEDBASE_API_KEY;
const search = async (query) => {
return fetch("https://embedbase-hosted-usx5gpslaq-uc.a.run.app/v1/documentation/search", {
method: "POST",
headers: {"Authorization": "Bearer " + apiKey, "Content-Type": "application/json"},
body: JSON.stringify({ query })
}).then(r => r.json());
};
const createContext = async (question, maxLen = 1800) => {
const resp = await search(question);
let curLen = 0;
const returns = [];
for (const sim of resp["similarities"]) {
const sentence = sim["data"];
const nTokens = enc.encode(sentence).length;
curLen += nTokens + 4;
if (curLen > maxLen) break;
returns.push(sentence);
}
return returns.join("
###
");
};
export default async function buildPrompt(req, res) {
const prompt = req.body.prompt;
const context = await createContext(prompt);
const newPrompt = `Answer the question based on the context below, and if the question can't be answered based on the context, say "I don't know"
Context: ${context}
---
Question: ${prompt}
Answer:`;
res.status(200).json({ prompt: newPrompt });
}Call ChatGPT
Implement utils/OpenAIStream.ts to stream responses from the OpenAI chat completion endpoint, then expose it via pages/api/qa.ts as an edge function.
// OpenAIStream.ts (simplified)
export interface OpenAIStreamPayload { model: string; messages: { role: string; content: string }[]; stream: boolean; }
export async function OpenAIStream(payload: OpenAIStreamPayload) {
const encoder = new TextEncoder();
const decoder = new TextDecoder();
const res = await fetch("https://api.openai.com/v1/chat/completions", {
headers: {"Content-Type":"application/json","Authorization":`Bearer ${process.env.OPENAI_API_KEY ?? ""}`},
method: "POST",
body: JSON.stringify(payload)
});
const parser = createParser(onParse);
// streaming logic omitted for brevity
return new ReadableStream({ start(controller) { /* ... */ } });
} // pages/api/qa.ts (simplified)
export const config = { runtime: "edge" };
export default async function handler(req) {
const { prompt } = await req.json();
if (!prompt) return new Response("No prompt in the request", { status: 400 });
const payload = { model: "gpt-3.5-turbo", messages: [{ role: "user", content: prompt }], stream: true };
const stream = await OpenAIStream(payload);
return new Response(stream);
}Conclusion
We created Nextra docs, stored and indexed them in Embedbase, built an API to retrieve relevant context, constructed prompts, streamed answers from ChatGPT, and integrated everything into a searchable UI.
Further Reading
Embedding converts data into semantic vectors, enabling semantic search, recommendation, classification, and generative search. While the technique is mature, recent cheap OpenAI embeddings make it widely accessible. Production considerations include storage infrastructure, cost optimization, user isolation, token limits, and integration with services like Supabase or Firebase.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
