Building a Local AI Knowledge Base in 2 Months for 75k: My Development Journey
In two months and on a budget of 75,000 CNY, I built a secure on-premise AI knowledge base for a research institute using Spring Boot, Python, DeepSeek-v4, RAGFlow, and a custom GPU-rich server, and documented every step from hardware selection to Docker deployment.
I am programmer Xiao Meng. Recently I took an AI knowledge‑base project for a research institute, with a two‑month schedule and a budget of 75,000 CNY. The client required data confidentiality, so the large language model had to be deployed locally.
Technology Stack
The backend is built with Spring Boot. AI-related components are written in Python because of its extensive libraries. Comparable frameworks exist on both sides, such as SpringAI for Java and LangChain for Python. The knowledge base is assembled with RAGFlow, and the large model used is DeepSeek v4.
Hardware Configuration
Hardware was chosen to meet the high GPU demand and data‑security requirements. The server specifications are:
4U rack with 16 × 3.5‑inch bays
2 × Intel Xeon Gold 6338 (32 cores / 64 threads, 48 MB cache, 205 W)
4 × 32 GB DDR4 2933 MHz RAM (supports up to 32 × DDR4 3200 MHz LRDIMM/RDIMM)
Storage: 4 × 960 GB SSD 2.5″, 2 × 3.84 TB NVMe SSD, 3 × 4 TB 7200 RPM SATA
Front I/O: 2 × USB 3.0, 1 × VGA, 1 × RJ45 serial; Rear I/O: 1 × serial, 2 × USB 3.0, 1 × RJ45 management, 1 × OCP 3.0 NIC
RAID controller with 2 GB cache, supporting RAID 0/1/10/5/50/6/60 and cache‑supercapacitor protection
Network: dual‑port 1 GbE and dual‑port 10 GbE modules
GPU suite: 4 × RTX 5090 (512‑bit, 1792 GB/s bandwidth)
Power: 4 × 2200 W supplies
The hardware cost is roughly 300,000 CNY, mainly due to the GPUs.
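As a quick sanity check on the list above, the aggregate figures can be tallied in a few lines. This is only a sketch: the 32 GB of VRAM per RTX 5090 is not stated in the list and is taken from NVIDIA's published specification, so it is marked as an assumption in the code. Storage is raw capacity before any RAID overhead, and the power figure is the nameplate total, not the usable power under redundancy.

```python
# Tally aggregate capacity from the server specification listed above.

RAM_MODULES = (4, 32)            # (count, GB per module)
DRIVES_TB = [                    # (count, TB per drive)
    (4, 0.96),                   # 2.5" SSD
    (2, 3.84),                   # NVMe SSD
    (3, 4.0),                    # 7200 RPM SATA
]
GPU_COUNT = 4
GPU_VRAM_GB = 32                 # assumption: published RTX 5090 figure, not in the list
POWER_SUPPLIES = (4, 2200)       # (count, watts each)

total_ram_gb = RAM_MODULES[0] * RAM_MODULES[1]
raw_storage_tb = round(sum(count * size for count, size in DRIVES_TB), 2)
total_vram_gb = GPU_COUNT * GPU_VRAM_GB
nameplate_watts = POWER_SUPPLIES[0] * POWER_SUPPLIES[1]

print(f"System RAM:      {total_ram_gb} GB")
print(f"Raw storage:     {raw_storage_tb} TB")
print(f"Total GPU VRAM:  {total_vram_gb} GB (assumed {GPU_VRAM_GB} GB per card)")
print(f"PSU nameplate:   {nameplate_watts} W")
```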
Installing CUDA
Run nvidia-smi in a command window to check the installed driver version (its header also reports the highest CUDA version that driver supports), then download and install a matching CUDA Toolkit. This lets Ollama run the model on the GPU.
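The driver check can also be scripted, which is handy when several GPUs need to be confirmed at once. The sketch below assumes nvidia-smi is on the PATH and uses its CSV query mode; the function names parse_gpu_csv and list_gpus are made up for this example.

```python
import csv
import io
import subprocess

# nvidia-smi's query mode prints one CSV row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,memory.total",
    "--format=csv,noheader",
]


def parse_gpu_csv(text):
    """Turn the CSV rows printed by QUERY into a list of dictionaries."""
    gpus = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, driver, memory = (field.strip() for field in fields)
        gpus.append(
            {"name": name, "driver_version": driver, "memory_total": memory}
        )
    return gpus


def list_gpus():
    """Run nvidia-smi and return one dictionary per detected GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

Calling list_gpus() on the server should return four entries, one per card, each with the same driver version.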
Installing and Configuring Ollama
Download the Ollama binary, install it, and set the following environment variables:
OLLAMA_HOST=0.0.0.0
OLLAMA_MODELS=E:\backup\software\ds\ollama
OLLAMA_ORIGINS=*
Run the model with:
ollama run deepseek-r1:1.5b
Installing Docker
Install Docker Desktop (Windows or macOS) using the official installer. Verify the installation with:
docker --version
Then test image pulling with:
docker run hello-world
Configuring Docker Image Acceleration
To speed up image pulls in China, open Settings → Docker Engine and add a domestic registry mirror (e.g., the deepseek mirror) to the Docker daemon configuration.
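The Docker Engine settings are a JSON document, and mirrors go under its registry-mirrors key. A minimal sketch of adding one without disturbing the other settings might look like this. The URL https://mirror.example.com is a placeholder, not a real mirror, and add_registry_mirror is a hypothetical helper name.

```python
import json


def add_registry_mirror(engine_json, mirror_url):
    """Return Docker Engine JSON text with mirror_url listed under registry-mirrors.

    Existing settings are preserved and the mirror is not added twice.
    """
    config = json.loads(engine_json) if engine_json.strip() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    return json.dumps(config, indent=2)


# Placeholder URL: substitute the mirror you actually use.
current_settings = '{"experimental": false}'
print(add_registry_mirror(current_settings, "https://mirror.example.com"))
```

The printed JSON can be pasted back into the Docker Engine editor, after which Docker Desktop needs to apply the change and restart.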
Testing Docker
Open a CMD window and execute the version and hello‑world commands above; the expected output confirms Docker works correctly.
Deploying RAGFlow
https://github.com/infiniflow/ragflow
Clone the repository, then ensure the kernel parameter vm.max_map_count is at least 262144:
sysctl vm.max_map_count
If it is lower, raise it temporarily:
sudo sysctl -w vm.max_map_count=262144
For a permanent change, add vm.max_map_count=262144 to /etc/sysctl.conf.
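The same check can be automated before the stack is started. The sketch below assumes a Linux host, where the current value is exposed at /proc/sys/vm/max_map_count; the helper names are illustrative only.

```python
from pathlib import Path

REQUIRED_MAP_COUNT = 262144
PROC_PATH = Path("/proc/sys/vm/max_map_count")  # Linux only


def parse_map_count(text):
    """Convert the file content (a single integer plus newline) to an int."""
    return int(text.strip())


def is_sufficient(current, required=REQUIRED_MAP_COUNT):
    """True when the current limit meets the required minimum."""
    return current >= required


def check_host(path=PROC_PATH):
    """Read the live value and report whether it is high enough."""
    current = parse_map_count(path.read_text())
    return current, is_sufficient(current)
```

On a host where check_host() returns a False flag, the sysctl commands above still have to be run with root privileges; the script only reports the state.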
Start the stack:
docker compose -f docker/docker-compose.yml up -d
Check the server logs to confirm the service is running:
docker logs -f ragflow-server
Successful logs show the server listening on 0.0.0.0 and URLs such as http://127.0.0.1:9380 and http://x.x.x.x:9380.
Verification
Open a browser and navigate to the machine’s IP address (e.g., http://IP_OF_YOUR_MACHINE). The RAGFlow UI should load; if the server has not fully started, the browser may report a network error.
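Because the UI can take a while to come up, it may be easier to poll the address from a script than to refresh the browser. The following is a minimal sketch using only the standard library; http_ok and wait_until_ready are hypothetical helper names, and the retry count and delay are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=3.0):
    """Return True if url answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ready(probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Call probe() until it returns True.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

For example, `wait_until_ready(lambda: http_ok("http://IP_OF_YOUR_MACHINE"))` waits for the RAGFlow UI. The same probe can be pointed at Ollama, assuming it listens on its default port 11434, with a URL such as `http://IP_OF_YOUR_MACHINE:11434/api/tags`.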
Conclusion
The AI knowledge base is now fully operational on a local, secure server, providing the research institute with a confidential AI assistant. All steps, from hardware selection and CUDA installation to Ollama configuration, Docker setup, and RAGFlow deployment, are documented for reproducibility.
SpringMeng
Focused on software development, sharing source code and tutorials for various systems.