Build a Private AI Knowledge Base with Ollama and FastGPT

This guide walks you through setting up a locally deployed AI system using Ollama and FastGPT, covering model selection, Docker deployment, configuration, knowledge‑base creation, and testing so your team can query internal documents securely and efficiently.

Why AI Matters Now

Our company has made AI a strategic focus for the second half of the year, and almost every tech team is racing to integrate large models into their workflows, from code review to bug fixing.

Beyond code generation, we want AI to truly understand business logic, answer internal questions, and even act as a knowledge‑base‑driven customer service agent.

Start Small: An Internal AI Q&A System

Goal: a private, locally deployed solution that keeps data in‑house, provides semantic search over documents, runs on a modest server or laptop, and optionally offers a conversational interface.

Local deployment, data never leaves the company.

Semantic retrieval from technical docs, SOPs, and historical solutions.

Low‑cost setup – a single machine is enough.

Optional integration with a large language model for natural‑language chat.

Toolchain – Ollama + FastGPT

Ollama lets you run open-source large models locally with a single command, handling model download and loading, and exposing a local API:

```bash
ollama run qwen3:1.7b
```

Key benefits: offline operation, full data control, and a lightweight footprint.

❝ Ollama is the "lazy‑person's" way to spin up a model with one command. ❞
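Beyond the interactive CLI, the local API that Ollama exposes (on port 11434 by default) can be called from code. A minimal sketch using only the Python standard library; the model name and prompt here are placeholders:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_payload(model: str, prompt: str) -> dict:
    # stream=False asks for one complete JSON response instead of chunks
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama daemon and return the generated text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires a running Ollama daemon, e.g.:
# print(ask("qwen3:1.7b", "Summarize our deployment SOP in one sentence."))
```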

FastGPT is an open‑source enterprise knowledge‑base Q&A system that provides a ready‑made front‑end, document upload, vector search, and model integration.

Installing Ollama (macOS example)

Homebrew (recommended):

```bash
brew install ollama
```

Alternatively, download the installer from the official website, or use Docker:

```bash
docker run -d -p 11434:11434 ollama/ollama
```

After installation, run the model:

```bash
ollama run qwen3:1.7b
```

The first run downloads the ~1.4 GB model, then starts an interactive chat session.

Deploying FastGPT with Docker Compose

FastGPT recommends the docker-compose-pgvector stack for quick local deployment. Create a working directory:

```bash
mkdir fastgpt && cd fastgpt
```

Then download the configuration files:

```bash
curl -O https://raw.githubusercontent.com/labring/FastGPT/main/projects/app/data/config.json
curl -o docker-compose.yml https://raw.githubusercontent.com/labring/FastGPT/main/deploy/docker/docker-compose-pgvector.yml
```

Adjust config.json to point the backend proxy to your host:

```json
{
  "mcpServerProxyEndpoint": "http://localhost:3005"
}
```

Start the stack:

```bash
docker compose up -d
```

The services include a web front-end, the MCP API server, PostgreSQL with pgvector, and a document-vector service.
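Once the stack is up, it helps to confirm each service is actually listening before opening the browser. A small sketch of a port check; the port numbers below are the stack's usual defaults and may differ in your docker-compose.yml:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Assumed defaults: FastGPT web UI on 3000, Ollama on 11434.
# for name, port in [("FastGPT", 3000), ("Ollama", 11434)]:
#     print(name, "up" if port_open("localhost", port) else "down")
```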

Configuring Models in FastGPT

Log in with the default admin account (user: root, password: 1234 or the value of DEFAULT_ROOT_PSW in docker-compose.yml).

Add a language model (e.g., qwen3:1.7b) and an embedding model (e.g., nomic-embed-text) under the Ollama provider.

Create a model channel that links the two models and set the proxy address to http://host.docker.internal:11434 (or your host IP on Linux).

❝ Use host.docker.internal because localhost inside the container refers to the container itself. ❞

Test the channel – a successful test changes the status to “passed”.
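If the channel test fails, you can check the proxy address outside FastGPT by hitting Ollama's /api/tags endpoint, which lists the models the daemon has pulled. A sketch, where the base URL is whichever address you entered for the channel:

```python
import json
import urllib.request

def tags_url(base: str) -> str:
    # /api/tags is Ollama's model-listing endpoint
    return base.rstrip("/") + "/api/tags"

def installed_models(base: str = "http://host.docker.internal:11434") -> list[str]:
    """Return the names of models the Ollama daemon at `base` has pulled."""
    with urllib.request.urlopen(tags_url(base)) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

# Requires a running Ollama daemon; expect names like "qwen3:1.7b":
# print(installed_models("http://localhost:11434"))
```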

Creating an Application

In the FastGPT dashboard go to Workbench → Team Apps and create a new “Simple App”. Choose the language model you added, give the app a name, and save.

Publish the app via a “no‑login window” link so teammates can use it without authentication.

Building a Knowledge Base

Create a new knowledge base, select the same language and embedding models, and upload documents (PDF, Word, txt, etc.). FastGPT will split the files into chunks, embed them, and build a vector index.

When the status shows “Ready”, the knowledge base is usable.
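FastGPT's chunking is configurable and more sophisticated than this, but the idea behind the splitting step can be sketched as a fixed-size window with overlap; the sizes here are illustrative, not FastGPT's defaults:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters, overlapping by `overlap`
    so sentences cut at a boundary still appear whole in one chunk."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks
```

Each chunk is then embedded separately, so the overlap keeps context that straddles a boundary retrievable from at least one chunk.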

Linking Knowledge Base to the App

In the app settings, bind the newly created knowledge base. Adjust parameters such as temperature, top‑p, and citation settings.

Now queries like “Who is the second male lead in the Six‑Demon Diagram?” return detailed answers with cited sources from the uploaded document.

Final Thoughts

With Ollama and FastGPT you can spin up a private, self‑hosted AI assistant that reads your internal documentation, answers questions, and can be integrated into Feishu, DingTalk, WeChat, or exposed via API. This demonstrates a practical, production‑ready AI workflow that goes far beyond simple code‑completion.

Tags: Docker, AI, RAG, Local Deployment, Ollama, FastGPT
Written by

Rare Earth Juejin Tech Community

Juejin, a tech community that helps developers grow.
