Artificial Intelligence 12 min read

Boost LLM Evaluation with Semantic Enrichment and Vector Search

This article explains how semantic enrichment, vector and hybrid search, and clustering techniques can be applied to large language model logs to evaluate inputs and outputs, improve compliance auditing, and enhance model iteration across various business scenarios.

Alibaba Cloud Developer

Mar 24, 2025

Boost LLM Evaluation with Semantic Enrichment and Vector Search

Challenges of Large Model Content Evaluation

Unlike traditional code, where fixed inputs produce deterministic outputs that can be fully tested, large language models (LLMs) accept natural‑language inputs that vary wildly. Developers need to debug model I/O, assess production quality, and audit all interactions for compliance.

Semantic Analysis: Multi‑Angle Understanding of LLM I/O

Effective log management for LLMs requires natural‑language search, processing, and analysis capabilities, including:

Semantic enrichment: extracting structured information such as user intent, topic, sentiment, keywords, questions, and entities.

Vector retrieval: one‑stop embedding and vector‑index support, enabling intent‑based search beyond keyword matching.

Hybrid retrieval: combining exact keyword matches with approximate vector matches across multiple fields.

Clustering: grouping natural‑language records to identify hotspots and outliers.

(1) Semantic Enrichment

In Retrieval‑Augmented Generation (RAG) scenarios, documents are converted to structured Markdown, chunked, and indexed as vectors. Traditional pipelines lose information, so multi‑modal feature extraction builds a multidimensional semantic space covering:

User intent (e.g., translation, technical query, legal advice).

Topic (e.g., education, cloud computing, law).

Summary (concise description of the conversation).

Sentiment (positive, negative, neutral).

Keywords.

Questions derived from the conversation.

Entity extraction (countries, names, locations).

LLM evaluation and vector indexing extract these structures from prompts and responses, visualize results, and support compliance audits to mitigate legal risks.

Alibaba Cloud Log Service (SLS) provides semantic processing APIs that can call hosted Qwen models or custom LLM endpoints for enrichment.

LLM Evaluation Architecture

Generic HTTP function: SPL syntax for HTTP calls with URL, body, headers.

Qwen model invocation: wrapper AIGC function passing endpoint, access‑key, system and user prompts.

System/custom Prompt library: built‑in Evaluation System Prompt templates or user‑defined prompts.

| extend "__tag__:__sls_qwen_user_tpl__" = replace(replace(replace(replace(replace(replace(replace(replace("__tag__:__sls_qwen_user_tpl__", '<INPUT_TEMPLATE>', "output.value"), '\\', '\\'), '"', '\"'), chr(8), '\b'), chr(12), '\f'), chr(10), '
'), chr(13), '\r'), chr(9), '\t')
  | extend "__tag__:__sls_qwen_sys_tpl__" = replace(replace(replace(replace(replace(replace(replace("__tag__:__sls_qwen_sys_tpl__", '\\', '\\'), '"', '\"'), chr(8), '\b'), chr(12), '\f'), chr(10), '
'), chr(13), '\r'), chr(9), '\t')
  | extend request_body = replace(replace("__tag__:__sls_qwen_body_tpl__", '<SYSTEM_PROMPT>', "__tag__:__sls_qwen_sys_tpl__"), '<USER_PROMPT>', "__tag__:__sls_qwen_user_tpl__")
  | http-call -method='post' -headers='{"Authorization": "Bearer xxxxxx", "Content-Type": "application/json", "Host": "dashscope.aliyuncs.com", "User-Agent":"sls-etl-test"}' -timeout_millis=60000 -body='request_body' 'http://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation' as status, response_body
  | extend tmp_content = json_extract_scalar(response_body, '$.output.choices.0.message.content')
  | extend output_enrich = regexp_replace(regexp_replace(tmp_content, '^([^{]|\s)+{', '{'), '}([^}]|\s)+$', '}')
  | project-away "__tag__:__sls_qwen_sys_tpl__", "__tag__:__sls_qwen_user_tpl__", "__tag__:__sls_qwen_body_tpl__", trimed_input, tmp_content, request_body, response_body

(2) Vector Retrieval

Implementing vector retrieval requires embedding text, building indexes, and maintaining pipelines, which adds engineering complexity and cost. GPU resources are needed for embedding and indexing, and high‑dimensional vectors consume significant memory.

SLS offers a one‑stop vector retrieval service: after writing prompts/responses to SLS, it automatically embeds, indexes, and queries vectors. Users only need to provide text and a query.

Key syntax points:

Use similarity to express approximate similarity.

Specify the vector index key.

Provide the query string.

Set a distance threshold (0 = most similar, 1 = least similar).

similarity(Key,query) < distance

(3) Hybrid Retrieval

When both exact field matches and approximate text similarity are required, hybrid retrieval combines keyword inverted indexes with vector indexes using and conditions.

uid:123 and similarity(key,query) < distance

(4) Vector Clustering

Clustering transforms high‑dimensional vectors into groups to reveal hot topics and outliers. The SQL function clustering_centroids(samples, num_of_clusters) computes centroids, while t_sne(samples) reduces dimensions for visualization.

clustering_centroids(array(array(double)) samples, integer num_of_clusters)

t_sne(array(array(double))

Engineering Practice of LLM Prompt/Response Semantic Insight

After extracting semantic information, the following business goals are achieved:

Compliance and Auditing : Search for prohibited keywords using similarity with adjustable distance thresholds.

similarity("input_semantic.summary", "恶意关键词") < 0.4

Topic and Sentiment Filtering : Query by extracted topic or sentiment, e.g., input_semantic.topic : database.

Content Clustering : Visualize clustered conversations to see topic relationships; each color represents a distinct cluster.

Conclusion

Semantic enrichment and search enable deeper understanding of LLM inputs and outputs, facilitating more intelligent applications. The same techniques can be extended to vertical scenarios such as user‑portrait construction, model iteration optimization, and compliance risk management.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

AI LLM Vector Search evaluation Log Analysis semantic enrichment

Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.