How Human‑Centric Evaluation Transforms Conversational Recommender Systems

This article reviews Professor Jin Yucheng’s research on conversational recommender systems, detailing a human‑centric evaluation framework, the CRS‑Que assessment model, and ChatGPT‑based experiments that reveal how dialogue quality, user trust, and prompt design jointly shape system performance.

DataFunSummit

Sep 16, 2025

Introduction

Conversational Recommender Systems (CRSs) combine natural‑language interaction with intelligent recommendation and are widely used in e‑commerce, music, video, and other domains. This article summarizes recent research on human‑centric evaluation frameworks for CRSs.

Outline of the Talk

Conversational Recommender Systems

Human‑Centric Evaluation

CRS‑Que Evaluation Framework

ChatGPT‑Based Recommendation Evaluation

Q&A

1. Conversational Recommender Systems

CRSs integrate a chatbot interface with recommendation engines, allowing multi‑turn dialogue to elicit user preferences. Unlike traditional web‑based recommenders, they provide real‑time interaction, natural‑language feedback, and dynamic content updates.

Key characteristics (from a recent survey): (1) task‑oriented dialogue rather than chit‑chat; (2) multi‑turn interaction to achieve recommendation goals. Applications include music, travel, movies and e‑commerce customer service.

A typical early architecture (2017‑2018) consists of a dialogue manager, a user‑modeling module, and a reasoning/recommendation engine. Together they extract preferences from each user turn, update the user profile, match items against it, and return the results.
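The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the architecture of any specific system: the catalog, the keyword vocabulary, and the names `UserProfile`, `extract_preferences`, `recommend`, and `dialogue_turn` are all hypothetical, and keyword matching stands in for real language understanding.

```python
from dataclasses import dataclass, field

# Hypothetical toy catalog and attribute vocabulary (illustration only).
CATALOG = [
    {"title": "Blue in Green", "genre": "jazz", "mood": "calm"},
    {"title": "Giant Steps", "genre": "jazz", "mood": "energetic"},
    {"title": "Clair de Lune", "genre": "classical", "mood": "calm"},
]
VOCAB = {"genre": ["classical", "jazz"], "mood": ["calm", "energetic"]}


@dataclass
class UserProfile:
    """User-modeling module: preferences accumulated across turns."""
    preferences: dict[str, str] = field(default_factory=dict)


def extract_preferences(utterance: str) -> dict[str, str]:
    """Stand-in for language understanding: naive keyword spotting."""
    words = set(utterance.lower().replace(",", " ").replace(".", " ").split())
    found = {}
    for attribute, values in VOCAB.items():
        for value in values:
            if value in words:
                found[attribute] = value
    return found


def recommend(profile: UserProfile, catalog: list[dict]) -> list[str]:
    """Recommendation engine: keep items matching every known preference."""
    return [
        item["title"]
        for item in catalog
        if all(item.get(a) == v for a, v in profile.preferences.items())
    ]


def dialogue_turn(profile: UserProfile, utterance: str) -> list[str]:
    """Dialogue manager: extract, update the profile, match, return."""
    profile.preferences.update(extract_preferences(utterance))
    return recommend(profile, CATALOG)
```

Calling `dialogue_turn` twice on the same profile, first with "I like jazz" and then with "something calm please", narrows the result list from two titles to one, which is the multi‑turn refinement the architecture is meant to support.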

2. Human‑Centric Evaluation

Traditional recommender evaluation focuses on accuracy, neglecting user experience. Early work (CHI 2006) highlighted the need for subjective measures. A human‑centric framework includes four dimensions: perceived quality, user belief, subjective attitude, and behavioral intention, with additional factors such as context and personal traits.

For CRS, a “dialogue dimension” is added, covering language understanding, response quality, adaptability, focus, and engagement. The framework also assesses trust, confidence, satisfaction, and willingness to adopt.
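In practice, a framework of this kind is usually operationalized as a questionnaire whose items are grouped by dimension. The following sketch shows one way per‑dimension scores could be computed from a single participant's answers. The item names, their assignment to dimensions, and the 1 to 5 rating scale are assumptions made for illustration; they are not the actual questionnaire items.

```python
# Hypothetical questionnaire items mapped to the dimensions named above.
# The item-to-dimension assignment is illustrative only.
ITEM_TO_DIMENSION = {
    "accuracy": "perceived_quality",
    "novelty": "perceived_quality",
    "understanding": "conversation",
    "response_quality": "conversation",
    "transparency": "user_belief",
    "trust": "subjective_attitude",
    "satisfaction": "subjective_attitude",
    "intention_to_use": "behavioral_intention",
}


def score_dimensions(responses: dict[str, int]) -> dict[str, float]:
    """Average one participant's ratings (assumed 1-5 scale) per dimension."""
    grouped: dict[str, list[int]] = {}
    for item, rating in responses.items():
        if item not in ITEM_TO_DIMENSION:
            raise KeyError(f"unknown questionnaire item: {item!r}")
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range for {item!r}: {rating}")
        grouped.setdefault(ITEM_TO_DIMENSION[item], []).append(rating)
    return {dim: sum(r) / len(r) for dim, r in grouped.items()}
```

Reporting a score per dimension rather than one overall number makes it possible to see, for example, that a system's recommendations are rated well while its conversation quality is not.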

3. CRS‑Que Framework

Building on the ResQue model, CRS‑Que adds a dialogue module and defines four evaluation aspects to assess conversational recommenders. Two user studies were conducted: a music‑recommendation dialogue and a smartphone‑recommendation dialogue, varying feedback mechanisms, explanation provision, and system humanization.

4. ChatGPT‑Based Recommendation Evaluation

Experiments with CRSs powered by large language models (LLMs) show that guiding users in constructing their prompts improves perceived explainability, usability, transparency, and adaptability. How a prompt is framed also influences perceived novelty and user willingness across domains such as books and career advice.
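To make the idea of prompt construction and framing concrete, here is a small sketch of a template‑based prompt builder. It is not the procedure used in the experiments: the framing labels, the wording of the templates, and the function `build_prompt` are hypothetical, and no LLM is called.

```python
# Hypothetical framings; the wording is illustrative, not taken from the study.
FRAMINGS = {
    "neutral": "Recommend {n} {domain} for me.",
    "novelty": "Recommend {n} {domain} that I am unlikely to have come across before.",
}


def build_prompt(
    domain: str,
    preferences: list[str],
    framing: str = "neutral",
    n: int = 3,
    explain: bool = True,
) -> str:
    """Assemble a recommendation prompt from structured user input."""
    if framing not in FRAMINGS:
        raise ValueError(f"unknown framing: {framing!r}")
    parts = [FRAMINGS[framing].format(n=n, domain=domain)]
    if preferences:
        parts.append("My preferences: " + "; ".join(preferences) + ".")
    if explain:
        # Asking for reasons is one plausible way to support explainability.
        parts.append("Briefly explain why each suggestion fits.")
    return " ".join(parts)
```

Keeping the framing separate from the user's stated preferences makes it straightforward to present the same preferences under different framings and compare how users rate the results.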

Conclusion

Effective evaluation of conversational recommenders must be human‑centric, considering both recommendation quality and dialogue interaction. Factors such as intimacy, trust, and system transparency critically shape user experience, especially when LLMs are employed.

