How to Build a Tencent IMA‑Style AI Knowledge Base for Under $3,000

This article describes a cost‑effective AI knowledge‑base project that replicates Tencent IMA functionality. It covers Dify’s open‑source platform, Chinese LLMs (Qwen, DeepSeek, GLM), a Java Spring Boot backend, a Vue frontend, multi‑agent orchestration, and hybrid on‑premise/cloud deployment, and it gives concrete cost and performance figures.

SpringMeng

Project Overview

A client who relied heavily on Tencent IMA wanted a private, personal AI knowledge base to replace it because of privacy concerns. The solution was delivered for roughly 30,000 CNY and covers the entire stack, from AI orchestration to deployment.

AI Platform and Model Selection

The open‑source Dify platform was chosen for private deployment because it offers visual workflow editing, native multi‑agent support, built‑in RAG capabilities, and easy integration with large language models.

Three domestically‑approved LLMs were selected:

Qwen‑Plus – primary model for standard Q&A, pricing ¥0.8 / M input tokens, ¥4.8 / M output tokens.

DeepSeek‑V3 – high‑accuracy fallback for conflict‑resolution scenarios, with a cited hallucination rate of 3.9%.

GLM‑4‑Flash – free tier for intent classification and simple queries.

Model routing follows a tiered strategy: GLM‑4‑Flash classifies intent, Qwen‑Plus handles routine cases, and DeepSeek‑V3 handles high‑risk transactions. Dify workflow nodes switch between the models dynamically.
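The routing described above might be sketched as follows. This is a minimal illustration, not the project's actual Dify workflow: the intent labels and the `high_risk` flag are hypothetical names, and only the model names come from the article.

```python
from dataclasses import dataclass

# Model names as given in the article.
INTENT_MODEL = "GLM-4-Flash"
DEFAULT_MODEL = "Qwen-Plus"
HIGH_ACCURACY_MODEL = "DeepSeek-V3"

# Hypothetical intent labels that count as "simple" queries.
SIMPLE_INTENTS = {"greeting", "faq", "chitchat"}


@dataclass
class Query:
    text: str
    intent: str              # label produced by the intent classifier
    high_risk: bool = False  # hypothetical flag set by an upstream check


def pick_model(query: Query) -> str:
    """Return the name of the model that should answer this query."""
    if query.high_risk:
        return HIGH_ACCURACY_MODEL
    if query.intent in SIMPLE_INTENTS:
        return INTENT_MODEL
    return DEFAULT_MODEL
```

Checking the high‑risk flag first means that a query flagged as risky never falls through to the free tier, even when its intent looks simple.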

Knowledge‑Base Pipeline

Dify’s RAG engine handles document ingestion, automatic and manual segmentation, embedding via Alibaba Cloud’s text‑embedding‑v4 (Qwen), and storage in a Weaviate or Qdrant vector database. Hybrid retrieval (vector + keyword) and a reranker model improve relevance.
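To illustrate how vector and keyword results can be merged into one list, here is a small sketch of reciprocal rank fusion (RRF). RRF is one common fusion technique, used here only as an example; it is not necessarily what Dify does internally, and the document IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is the constant commonly used for RRF.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the two retrievers.
vector_hits = ["doc3", "doc1", "doc7"]
keyword_hits = ["doc1", "doc9", "doc3"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents found by both retrievers (`doc1`, `doc3`) rise to the top of `fused`, ahead of documents found by only one. A reranker model would then rescore this shortlist.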

Backend Technology Stack

Java 21 + Spring Boot 3.x (virtual threads, high concurrency)

MySQL 8.0 for business data

Redis 7.x for session and cache

RabbitMQ for asynchronous processing

Nginx / Spring Cloud Gateway for API routing, rate‑limiting, and authentication

XXL‑JOB for scheduled tasks (e.g., knowledge‑base sync)

ELK stack for full‑stack logging

Frontend Technology Stack

Vue 3 + Element Plus for the admin console

Vue 3 + WebSocket for the real‑time customer‑service workbench

System Architecture

The system follows a layered design: channel adapters → API gateway → business services → AI capability layer (Dify workflow, multi‑agent orchestration, RAG) → data layer (MySQL, Redis, vector DB, Elasticsearch).
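The hand‑off from the business services to the AI capability layer is an HTTP call to the Dify app API. The sketch below builds a blocking request to Dify's `chat-messages` endpoint. The base URL and API key are placeholders, and Python is used only for brevity; the project's Java backend would issue the equivalent request.

```python
import json
import urllib.request

# Assumed values: replace with the address of the private Dify instance
# and an app API key generated in the Dify console.
DIFY_BASE_URL = "http://dify.internal/v1"
DIFY_API_KEY = "app-xxxxxxxx"


def build_chat_request(query: str, user_id: str,
                       conversation_id: str = "") -> urllib.request.Request:
    """Build (but do not send) a blocking chat request for a Dify app."""
    payload = {
        "inputs": {},
        "query": query,
        "response_mode": "blocking",
        "conversation_id": conversation_id,  # empty string starts a new one
        "user": user_id,
    }
    return urllib.request.Request(
        url=f"{DIFY_BASE_URL}/chat-messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {DIFY_API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(query: str, user_id: str) -> str:
    """Send the request and return the answer text from the JSON response."""
    request = build_chat_request(query, user_id)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.loads(resp.read())["answer"]
```

Keeping request construction separate from sending makes the payload easy to unit‑test without a running Dify instance.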

Multi‑agent roles include a Router Agent (intent classification with GLM‑4‑Flash), Document Agent (Qwen‑Plus for document parsing), Knowledge‑Base Agent (Qwen‑Plus for answering), Image Agent (DeepSeek‑V3 for image analysis), Note Agent (Qwen‑Plus), and Discovery Agent (Qwen‑Plus).
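The agent roster above can be written down as a simple lookup table. In this sketch the intent keys and the fallback to the Knowledge‑Base Agent are assumptions made for illustration; the agent names and models are those listed in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    model: str


# Agent names and models as listed in the article; the dictionary keys are
# hypothetical intent labels that the Router Agent might emit.
AGENTS = {
    "document": Agent("Document Agent", "Qwen-Plus"),
    "knowledge": Agent("Knowledge-Base Agent", "Qwen-Plus"),
    "image": Agent("Image Agent", "DeepSeek-V3"),
    "note": Agent("Note Agent", "Qwen-Plus"),
    "discovery": Agent("Discovery Agent", "Qwen-Plus"),
}


def dispatch(intent: str) -> Agent:
    """Return the agent for an intent label from the Router Agent.

    Unknown intents fall back to the Knowledge-Base Agent (an assumption).
    """
    return AGENTS.get(intent, AGENTS["knowledge"])
```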

Deployment Strategy

A hybrid deployment combines an on‑premise server (16 CPU cores, 64 GB RAM, 500 GB SSD, optional GPU) hosting Dify, the vector DB, Redis, and Nginx with a cloud server (4 vCPU, 8 GB RAM, RDS MySQL) running the Java business services, RabbitMQ, and ELK. The two zones communicate over a VPN or a dedicated line.

All components are containerized with Docker and orchestrated with Docker Compose. The compose file defines services such as dify‑api, dify‑web, dify‑worker, weaviate, redis, postgres, business‑backend, mysql, rabbitmq, and nginx, each exposing the ports it needs.

Cost Estimate

For an average of 100,000 daily interactions, the model mix yields a monthly cost of ¥200‑500, with GLM‑4‑Flash handling ~50 % of simple queries for free, Qwen‑Plus covering ~45 % of standard cases, and DeepSeek‑V3 processing the remaining ~5 % high‑accuracy requests.
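The paid part of such an estimate can be reproduced with a short calculation. The sketch below covers only Qwen‑Plus, using the per‑token prices quoted earlier; the article gives no DeepSeek‑V3 price, so that share is left out. The average token counts per interaction are not stated in the article and must be supplied from measurement.

```python
# Qwen-Plus prices from the article, in CNY per million tokens.
QWEN_PLUS_INPUT = 0.8
QWEN_PLUS_OUTPUT = 4.8


def monthly_qwen_cost(daily_interactions, qwen_share,
                      in_tokens, out_tokens, days=30):
    """Estimate the monthly Qwen-Plus bill in CNY.

    in_tokens and out_tokens are average tokens per interaction; they are
    inputs the caller must measure, not figures from the article.
    """
    calls = daily_interactions * qwen_share * days
    input_cost = calls * in_tokens / 1_000_000 * QWEN_PLUS_INPUT
    output_cost = calls * out_tokens / 1_000_000 * QWEN_PLUS_OUTPUT
    return input_cost + output_cost
```

For the scenario above, the call would be `monthly_qwen_cost(100_000, 0.45, in_tokens=..., out_tokens=...)`. The result scales linearly with both token averages, so long retrieved contexts in RAG prompts can dominate the bill.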

