How Hologres Powers Fast Vector & Full‑Text Search for AI‑Driven Customer Service
The Taobao‑Tmall customer operations team built an integrated vector‑plus‑full‑text retrieval solution on Hologres, achieving millisecond‑level recall over massive unstructured knowledge bases and boosting intelligent customer service, rule comparison, and sentiment analysis across multiple business scenarios.
In the era of large language models, Taobao‑Tmall (淘天) needs to retrieve relevant knowledge from hundreds of thousands to millions of unstructured text entries quickly and accurately for intelligent customer service, rule matching, and sentiment analysis. Traditional keyword matching with SQL LIKE or regex is slow (seconds) and imprecise at this scale.
Why Combine Vector and Full‑Text Search?
Full‑text search uses keyword matching and inverted indexes to return results in milliseconds, but it lacks semantic understanding (e.g., a query "What fruits are available?" cannot retrieve "apple" or "banana" without explicit keywords). Vector search embeds texts into high‑dimensional vectors (e.g., 128‑dim) and retrieves by semantic similarity, bridging the gap between concepts like "fruit" and "apple".
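The semantic gap described above can be made concrete with a toy example. The sketch below uses made‑up 4‑dimensional vectors (a real embedding model would produce something like the 128‑dimensional vectors mentioned here); the words and values are illustrative only.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Made-up low-dimensional embeddings standing in for a real model's output.
embeddings = {
    "fruit":   [0.9, 0.8, 0.1, 0.0],
    "apple":   [0.8, 0.9, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9, 0.8],
}

query = embeddings["fruit"]
# Keyword matching sees zero string overlap between "fruit" and "apple",
# but in embedding space the two are close while "invoice" is far away:
print(cosine(query, embeddings["apple"]))    # high similarity
print(cosine(query, embeddings["invoice"]))  # low similarity
```

This is why a vector index can answer "What fruits are available?" with entries about apples and bananas even though no keyword overlaps.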
The team therefore runs a two‑stage pipeline: first vector search to get semantically similar candidates, then full‑text search to supplement keyword matches, finally feeding the fused results into a Retrieval‑Augmented Generation (RAG) workflow.
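The fusion step of this two‑stage pipeline can be sketched as a weighted merge of the two candidate lists. The weights, document IDs, and scores below are hypothetical, not taken from the production system.

```python
# Toy fusion of the two retrieval paths: vector recall supplies
# semantically similar candidates with a similarity score, full-text
# recall supplies keyword hits with a match score; a weighted sum
# merges them and the top-K survivors go on to the RAG stage.

def fuse(vector_hits, fulltext_hits, w_vec=0.7, w_txt=0.3, k=3):
    scores = {}
    for doc_id, sim in vector_hits.items():
        scores[doc_id] = scores.get(doc_id, 0.0) + w_vec * sim
    for doc_id, match in fulltext_hits.items():
        scores[doc_id] = scores.get(doc_id, 0.0) + w_txt * match
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

vector_hits   = {"doc_a": 0.95, "doc_b": 0.80, "doc_c": 0.60}
fulltext_hits = {"doc_b": 1.00, "doc_d": 0.90}

# doc_b ranks first because it scores on both paths.
print(fuse(vector_hits, fulltext_hits))
```

As the article notes, the two result sets can also be used independently; the weighting is a per‑scenario tuning knob.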
Why Hologres?
Hologres offers real‑time data warehousing and OLAP capabilities, supporting both vector and full‑text indexes on the same table, along with scalar filters, multi‑field sorting, and complex JOINs. Since version 4.0 it provides a self‑developed HGraph vector index that reduces average latency from 4 seconds (Proxima) to ~30 ms on tens of millions of vectors, a roughly two‑order‑of‑magnitude improvement. It also includes built‑in Chinese tokenization, AND/OR query logic, and write‑and‑search support: data is queryable immediately after it is written.
Performance Comparison: HGraph vs Proxima
On 40 k 128‑dim vectors, both engines perform similarly (~30‑40 ms). On 9.5 M vectors, Proxima’s latency spikes to 4 s, while HGraph stays around 30 ms for pure vector recall; combined with downstream OLAP operations the total latency remains under a few hundred milliseconds.
Using Hologres for Vector Search
Define a vector column and index during table creation:
Define a float‑array vector column (e.g., knowledge_vectors) with an hgraph index during table creation, and specify a similarity metric such as cosine similarity. Verify that the index is actually used by running EXPLAIN ANALYZE and checking that a Vector Filter node appears in the query plan.
Using Hologres for Full‑Text Search
Create a column‑store table, then build a full‑text index on the text field. If the index exists before data insertion, writes automatically trigger compaction, enabling "write‑and‑search". Queries use the TEXT function with logical operators (OR by default, or AND) and can be verified with EXPLAIN ANALYZE by checking for a Fulltext Filter node in the plan. The built‑in jieba tokenizer provides Chinese segmentation, achieving ~30 ms responses on small datasets and ~200 ms on 700 M records with complex AND queries and sorting.
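The matching semantics behind such a full‑text index can be illustrated with a minimal inverted index. This is a conceptual sketch, not how Hologres is implemented internally, and whitespace splitting stands in for the jieba Chinese tokenizer.

```python
from collections import defaultdict

# Toy inverted index: each term maps to the set of documents containing
# it; a query is evaluated with OR (union, the default) or AND
# (intersection), mirroring the logical operators described above.

docs = {
    1: "7 day no reason return policy",
    2: "return shipping fee rules",
    3: "invoice reissue process",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(terms, mode="OR"):
    hits = [index.get(t, set()) for t in terms]
    if not hits:
        return set()
    result = set(hits[0])
    for h in hits[1:]:
        result = result & h if mode == "AND" else result | h
    return result

print(search(["return", "invoice"]))        # OR: union of postings
print(search(["return", "policy"], "AND"))  # AND: intersection
```

Because lookups touch only the postings for the query terms rather than scanning every row, this structure is what makes millisecond keyword recall possible at the scales quoted above.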
Overall Technical Solution
The workflow is divided into three stages:
Preparation : Clean raw texts (knowledge base, platform rules), optionally generate similar questions, embed them into vectors, and keep the original text for full‑text indexing.
Retrieval : User queries are sent simultaneously to an embedding model and a tokenizer, producing a vector and keywords. Both vector and full‑text searches run in parallel, their results can be weighted or used independently, and the top‑K documents are returned.
Application : Retrieved documents are fed as context to a large language model, combined with prompt engineering and tool calls (e.g., rule comparison, order lookup) to produce final answers or decision suggestions for intelligent agents.
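The application stage above can be sketched as simple prompt assembly. The template wording and document texts are hypothetical; any chat‑completion API would consume the resulting prompt.

```python
# Sketch of the RAG application stage: the top-K retrieved knowledge
# entries are stitched into the LLM prompt as numbered context.

def build_prompt(question, retrieved_docs):
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    return (
        "Answer the customer question using only the knowledge below.\n"
        f"Knowledge:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

docs = [
    "Returns are accepted within 7 days without reason.",
    "Return shipping is free if the item is defective.",
]
prompt = build_prompt("Can I return an item after 5 days?", docs)
print(prompt)
```

Tool calls such as rule comparison or order lookup would feed additional context into the same prompt before the model is invoked.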
Scenario 1 – Merchant Help Knowledge Recall
In the customer‑service front end, user queries are embedded and tokenized, then vector and full‑text searches retrieve 20‑40 relevant knowledge entries. These are passed to an LLM with carefully designed prompts for ranking, yielding accurate answers without a human hand‑off. Metrics such as recall, click‑through, and accuracy improved significantly over the previous LIKE/regex approach.
Scenario 2 – Competitor Rule Full‑Text Search
The team built a rule‑analysis system that crawls competitor policies, cleans them, stores them in Hologres, and creates a full‑text index. A user query like "show all platforms' 7‑day no‑reason return policies" returns relevant clauses within 500 ms, with a much higher recall than regex‑based matching.
Future Outlook
Planned extensions include applying the retrieval capability to sentiment analysis and similar‑case clustering, integrating image‑based chat screenshot extraction, and simplifying the stack by adding built‑in embedding functions, variable full‑text query parameters, and improved incremental compaction for serverless workloads.
Overall, Hologres’s unified vector and full‑text search, high performance, and stable operations have become the backbone of Taobao‑Tmall’s AI‑driven knowledge retrieval, and its role is expected to grow as more scenarios adopt RAG techniques.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.