How UltraRAG Turns RAG Deployment into a Zero‑Code, One‑Click Process

UltraRAG, an open‑source RAG framework co‑developed by Tsinghua and NEUIR, offers a zero‑code WebUI that streamlines data construction, model fine‑tuning, and multi‑dimensional evaluation, boosting retrieval accuracy by up to 30% and cutting deployment time by half across legal, medical, and research use cases.

Overview of UltraRAG

UltraRAG is an open‑source Retrieval‑Augmented Generation (RAG) framework (Apache‑2.0) jointly developed by Tsinghua THUNLP and NEUIR. It integrates knowledge‑base construction, data processing, model fine‑tuning, and multi‑dimensional evaluation into a single zero‑code Web UI, allowing users to build text‑only or multimodal RAG pipelines without writing code.

Core Technical Features

Zero‑code visual interface – All stages (knowledge‑base management, data slicing, annotation, retrieval, generation, evaluation) are operated through a Streamlit‑based UI reachable at http://localhost:8843.

Automated data construction – Built‑in methods such as KBAlign automatically split domain documents into passages, generate alignment annotations, and produce training data for retrieval and generation models.
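
The slicing step itself is conceptually simple. Below is a minimal sketch of fixed-size passage splitting with overlap, the kind of operation KBAlign automates from the UI; the function name, parameters, and file name are illustrative placeholders, not UltraRAG's actual API.

# Illustrative only: a fixed-size splitter with overlap, approximating the
# passage slicing that KBAlign automates from the UI.
def split_into_passages(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    passages, start = [], 0
    while start < len(text):
        end = min(start + size, len(text))
        passages.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap          # keep shared context across passage boundaries
    return passages

document = open("contract_law_textbook.txt", encoding="utf-8").read()  # placeholder file
print(f"{len(split_into_passages(document))} passages")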

One‑click fine‑tuning – The RAG‑DDR workflow combines retrieval‑oriented data augmentation with LoRA/SFT fine‑tuning, reducing manual effort by more than 90% compared with hand‑written pipeline scripts.
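
Under the hood, LoRA fine-tuning of the generator amounts to attaching low-rank adapters to a base model. The sketch below uses the Hugging Face peft library to show the idea; the base model name, target modules, and hyperparameters are placeholder assumptions, not necessarily what RAG-DDR configures.

# Illustrative only: LoRA adapter setup with Hugging Face peft.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")   # placeholder base model
lora_cfg = LoraConfig(
    r=8,                                   # low-rank dimension
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # placeholder target layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()         # only the adapter weights are trainable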

Multimodal support – The VisRAG module parses PDFs containing tables, figures, formulas, and raster images, extracts visual features, and merges them with textual tokens to form multimodal knowledge slices.

Robust multi‑dimensional evaluation – The framework's own RAGEval evaluation reports ROUGE‑L together with effective‑information recall and key‑knowledge coverage, providing a more reliable assessment than any single‑score metric.
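
To make the extra dimensions concrete, here is a toy version of key-knowledge coverage: the fraction of annotated key facts that actually surface in a generated answer. This is only the intuition behind the metric, not RAGEval's implementation.

# Illustrative only: a crude key-knowledge coverage check, not RAGEval's code.
def key_knowledge_coverage(answer: str, key_facts: list[str]) -> float:
    answer_lower = answer.lower()
    hits = sum(1 for fact in key_facts if fact.lower() in answer_lower)
    return hits / len(key_facts) if key_facts else 0.0

answer = "The contract is void if one party lacked capacity at signing."
key_facts = ["lacked capacity", "void", "written notice within 30 days"]
print(f"coverage = {key_knowledge_coverage(answer, key_facts):.2f}")  # 0.67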

Modular architecture – UltraRAG separates a module layer (retriever, encoder, generator), a workflow layer (pipeline orchestration), and a function layer (utility scripts). Users can replace any component (e.g., swap FAISS for Milvus, or replace the generator with a different LLM) without altering the rest of the system.
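
A minimal sketch of what such swappability looks like, assuming a shared retriever protocol: a FAISS-backed implementation could be replaced by a Milvus- or Elasticsearch-backed class with the same search() signature without touching the workflow layer. Neither class below is UltraRAG's actual code.

# Illustrative only: a common interface makes retrieval backends interchangeable.
from typing import Protocol
import numpy as np
import faiss

class Retriever(Protocol):
    def search(self, query_vec: np.ndarray, k: int) -> list[int]: ...

class FaissRetriever:
    def __init__(self, vectors: np.ndarray):
        self.index = faiss.IndexFlatIP(vectors.shape[1])   # inner-product index
        self.index.add(vectors.astype("float32"))

    def search(self, query_vec: np.ndarray, k: int) -> list[int]:
        _, ids = self.index.search(query_vec.astype("float32").reshape(1, -1), k)
        return ids[0].tolist()

# A Milvus- or Elasticsearch-backed class with the same search() signature
# could be dropped in here without changing the calling workflow.
vectors = np.random.rand(100, 384)
retriever: Retriever = FaissRetriever(vectors)
print(retriever.search(np.random.rand(384), k=5))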

Flexible deployment – Supports a single‑container Docker‑Compose deployment and a micro‑service mode where embedding models, LLMs, and vector stores run as independent services, enabling high‑concurrency scaling.

Representative Scenarios

1. Enterprise legal‑domain RAG

Deploy UltraRAG with docker-compose up --build -d and open the UI.

Import a legal textbook; KBAlign automatically slices the text into passages and creates alignment labels.

Select the “RAG‑DDR” strategy; the system runs retrieval indexing, data augmentation, and LoRA fine‑tuning in one step.

Evaluate with RAGEval: ROUGE‑L improves from 40.75% (baseline) to 53.14%, and key‑clause recall rises by roughly 30%.

Lawyers query the system and receive precise, compliant answers up to five times faster than manual lookup.
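
For readers who prefer to see the flow rather than the UI, the sketch below spells out the retrieve-then-generate pattern behind such a query. retrieve() and generate() are stand-ins for whatever retriever and fine-tuned LLM the pipeline is configured with; the passages and answer are invented for illustration.

# Illustrative only: the retrieve-then-generate flow behind a legal query.
def retrieve(query: str, k: int = 3) -> list[str]:
    # placeholder: return the top-k passages from the legal knowledge base
    return ["Art. 52: A contract is invalid if ...",
            "Art. 58: After a contract is found invalid ..."][:k]

def generate(prompt: str) -> str:
    # placeholder: call the fine-tuned LLM served by the pipeline
    return "Based on Art. 52, the clause is unenforceable because ..."

query = "Is a non-compete clause without compensation enforceable?"
context = "\n\n".join(retrieve(query))
prompt = f"Answer using only the passages below.\n\n{context}\n\nQuestion: {query}"
print(generate(prompt))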

2. Multimodal medical knowledge base

Upload a PDF containing clinical notes and CT‑scan screenshots via the “Knowledge Base Management” panel.

VisRAG extracts image regions (e.g., lesion locations, numeric indicators) and aligns them with surrounding text to create structured multimodal slices.

When building the retrieval chain, enable “multimodal retrieval” so that a symptom query returns both relevant text excerpts and matching imaging cases.

The generated answer combines textual explanation with visual evidence, delivering more comprehensive diagnostic suggestions than text‑only RAG.
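
As a rough picture of the raw material such a pipeline works with, the sketch below extracts page text and embedded images from a PDF with PyMuPDF and pairs each image with its surrounding page text. This is the simplest possible pairing heuristic, not VisRAG's actual alignment logic, and the file name is a placeholder.

# Illustrative only: page-level text/image extraction as crude multimodal slices.
import fitz  # PyMuPDF

doc = fitz.open("clinical_notes.pdf")              # placeholder file name
slices = []
for page in doc:
    text = page.get_text("text")
    for img in page.get_images(full=True):
        pix = fitz.Pixmap(doc, img[0])             # decode the embedded image
        if pix.n - pix.alpha >= 4:                 # convert CMYK-like images to RGB
            pix = fitz.Pixmap(fitz.csRGB, pix)
        slices.append({"page": page.number, "text": text, "image": pix.tobytes("png")})
print(f"built {len(slices)} multimodal slices")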

3. Research prototype validation

Replace the default retriever component with a custom algorithm by editing the module configuration file.

Run built‑in evaluation datasets or import a custom benchmark.

Launch a single‑click multi‑dimensional evaluation; RAGEval reports effective‑information recall and key‑knowledge coverage for both the baseline and the new method.

This workflow eliminates the need to re‑implement data preprocessing or evaluation code, cutting experimental setup time by roughly 50%.
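
The comparison itself can stay small. The sketch below uses BM25 (via the rank_bm25 package) as a stand-in "new method" and scores it with a toy recall@1; RAGEval's own metrics and UltraRAG's built-in datasets would replace both in practice.

# Illustrative only: a tiny retriever comparison, not RAGEval's evaluation code.
from rank_bm25 import BM25Okapi

corpus = ["retrieval augmented generation grounds answers in documents",
          "lora adapts large models with few trainable parameters",
          "vector databases store dense embeddings for similarity search"]
bm25 = BM25Okapi([doc.split() for doc in corpus])

query = "how does retrieval augmented generation work"
scores = bm25.get_scores(query.split())
ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)

relevant = {0}                                     # gold passage index for this query
recall_at_1 = len(relevant & set(ranked[:1])) / len(relevant)
print(f"recall@1 = {recall_at_1:.2f}")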

Quick‑Start Deployment

Method 1: Docker (recommended)

# Clone the repository (optional)
git clone https://github.com/OpenBMB/UltraRAG.git
cd UltraRAG
# Build and start containers
docker-compose up --build -d

After the containers are running, open http://localhost:8843 to access the UI.

Method 2: Conda environment (customizable)

# Create and activate a conda environment
conda create -n ultrarag python=3.10
conda activate ultrarag
# Install Python dependencies
pip install -r requirements.txt
# Download default model weights (resources/models)
python scripts/download_model.py
# Launch the Streamlit UI
streamlit run ultrarag/webui/webui.py --server.fileWatcherType none

Visit http://localhost:8843 to begin using UltraRAG.

System Requirements and Notes

Python ≥ 3.10 is required; GPU acceleration additionally requires CUDA ≥ 12.2.

The first launch downloads model checkpoints; a stable internet connection is recommended.

For enterprise‑scale deployments, enable the micro‑service mode and allocate separate resources for the embedding model, LLM, and vector database (e.g., Milvus or Elasticsearch).

Repository

Source code and releases are available at https://github.com/OpenBMB/UltraRAG
