Artificial Intelligence 17 min read

Building a ChatGPT‑Powered Markdown Documentation System with Embedbase and Nextra

This tutorial walks through creating an intelligent documentation site that stores markdown pages in Embedbase, retrieves semantically similar chunks for user queries, builds contextual prompts, and streams answers from ChatGPT using a custom Nextra theme and Node.js backend.

Architect's Guide

Nov 20, 2023

Building a ChatGPT‑Powered Markdown Documentation System with Embedbase and Nextra

In this guide we demonstrate how to turn a static markdown documentation site into an AI‑enhanced knowledge base that can answer user questions using ChatGPT. The solution combines OpenAI's ChatGPT, the Embedbase vector database, and the Nextra documentation framework built on Next.js.

Overview

We need to store document content in a database, accept user queries, search for the most similar passages, construct a context from the top‑5 results, and ask ChatGPT to answer based on that context.

Prerequisites

Embedbase API key

– provides semantic similarity search. OpenAI API key – for ChatGPT. Nextra and Node.js – the documentation framework.

Configure the keys in a .env file:

OPENAI_API_KEY="<YOUR KEY>"
EMBEDBASE_API_KEY="<YOUR KEY>"

Create Nextra Docs

Clone the official Nextra template from GitHub, install dependencies, and run the development server.

# we won't use "pnpm" here, rather the traditional "npm"
rm pnpm-lock.yaml
npm i
npm run dev

Prepare and Store Files

Write a scripts/sync.js script that reads all .mdx files, splits them into 100‑line chunks, and uploads the chunks to Embedbase.

const glob = require("glob");
const fs = require("fs");
const sync = async () => {
  // 1. read all files under pages/* with .mdx extension
  const documents = glob.sync("pages/**/*.mdx").map(path => ({
    id: path.replace("pages/", "/").replace("index.mdx", "").replace(".mdx", ""),
    data: fs.readFileSync(path, "utf-8")
  }));
  // 2. split documents into chunks of 100 lines
  const chunks = [];
  documents.forEach(document => {
    const lines = document.data.split("
");
    const chunkSize = 100;
    for (let i = 0; i < lines.length; i += chunkSize) {
      const chunk = lines.slice(i, i + chunkSize).join("
");
      chunks.push({ data: chunk });
    }
  });
};
sync();

Upload the chunks to Embedbase:

const fetch = require("node-fetch");
const apiKey = process.env.EMBEDBASE_API_KEY;
const response = await fetch("https://embedbase-hosted-usx5gpslaq-uc.a.run.app/v1/documentation", {
  method: "POST",
  headers: {
    "Authorization": "Bearer " + apiKey,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({ documents: chunks })
});
console.log(await response.json());

Build Contextual Prompt

Install tiktoken to count tokens and create a helper that searches Embedbase and assembles a prompt limited to 1800 tokens.

import { get_encoding } from "@dqbd/tiktoken";
const enc = get_encoding('cl100k_base');
const apiKey = process.env.EMBEDBASE_API_KEY;
const search = async (query) => {
  return fetch("https://embedbase-hosted-usx5gpslaq-uc.a.run.app/v1/documentation/search", {
    method: "POST",
    headers: { "Authorization": "Bearer " + apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ query })
  }).then(r => r.json());
};
export default async function buildPrompt(req, res) {
  const prompt = req.body.prompt;
  const context = await createContext(prompt);
  const newPrompt = `Answer the question based on the context below, and if the question can't be answered based on the context, say "I don't know"

Context: ${context}

---

Question: ${prompt}
Answer:`;
  res.status(200).json({ prompt: newPrompt });
}

Streaming ChatGPT Calls

Implement an OpenAI streaming helper ( utils/OpenAIStream.ts) using eventsource-parser and expose an edge function pages/api/qa.ts that forwards the built prompt to the ChatGPT completion endpoint.

export async function OpenAIStream(payload) { /* … streaming logic … */ }
// pages/api/qa.ts
export const config = { runtime: "edge" };
export default async function handler(req, res) {
  const { prompt } = await req.json();
  const payload = { model: "gpt-3.5-turbo", messages: [{ role: "user", content: prompt }], stream: true };
  const stream = await OpenAIStream(payload);
  return new Response(stream);
}

Connect UI

Replace the default Nextra search bar with a modal that collects a question, calls /api/buildPrompt to get a contextual prompt, then streams the answer from /api/qa back to the UI.

// theme.config.tsx – Search component
const Search = () => {
  const [open, setOpen] = useState(false);
  const [question, setQuestion] = useState("");
  const [answer, setAnswer] = useState("");
  const answerQuestion = async (e) => {
    e.preventDefault();
    const promptRes = await fetch("/api/buildPrompt", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ prompt: question }) });
    const { prompt } = await promptRes.json();
    const resp = await fetch("/api/qa", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ prompt }) });
    const reader = resp.body.getReader();
    const decoder = new TextDecoder();
    let done = false;
    while (!done) {
      const { value, done: doneReading } = await reader.read();
      done = doneReading;
      setAnswer(prev => prev + decoder.decode(value));
    }
  };
  return (<>/* UI omitted for brevity */</>);
};

Conclusion

We created a Nextra documentation site, stored its content in Embedbase, built a semantic search API, generated a context‑aware prompt, streamed ChatGPT responses, and wired everything into a custom search modal.

GitHub Action for Continuous Indexing

A simple workflow runs on every push to main, installs dependencies, and executes node scripts/sync.js to keep the Embedbase index up‑to‑date.

name: Index documentation
on:
  push:
    branches: [main]
jobs:
  index:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v2
        with:
          node-version: 14
      - run: npm install
      - run: node scripts/sync.js
        env:
          EMBEDBASE_API_KEY: ${{ secrets.EMBEDBASE_API_KEY }}

With these pieces in place, the documentation site becomes an interactive knowledge base powered by ChatGPT.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Node.js ChatGPT Embedding API Nextra

Written by

Architect's Guide

Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.

Overview

Prerequisites

Create Nextra Docs

Prepare and Store Files

Build Contextual Prompt

Streaming ChatGPT Calls

Connect UI

Conclusion

Further Reading

GitHub Action for Continuous Indexing

Architect's Guide

How this landed with the community

Was this worth your time?

0 Comments