How Human‑Centric Evaluation Transforms Conversational Recommender Systems

This article reviews Professor Jin Yucheng’s research on conversational recommender systems, detailing a human‑centric evaluation framework, the CRS‑Que assessment model, and ChatGPT‑based experiments that reveal how dialogue quality, user trust, and prompt design jointly shape system performance.

DataFunSummit

Introduction

Conversational Recommender Systems (CRSs) combine natural‑language interaction with intelligent recommendation and are widely used in e‑commerce, music, video, and other domains. This article summarizes recent research on human‑centric evaluation frameworks for CRSs.

Outline of the Talk

Conversational Recommender Systems

Human‑Centric Evaluation

CRS‑Que Evaluation Framework

ChatGPT‑Based Recommendation Evaluation

Q&A

1. Conversational Recommender Systems

CRSs integrate a chatbot interface with recommendation engines, allowing multi‑turn dialogue to elicit user preferences. Unlike traditional web‑based recommenders, they provide real‑time interaction, natural‑language feedback, and dynamic content updates.

Key characteristics (from a recent survey): (1) task‑oriented dialogue rather than chit‑chat; (2) multi‑turn interaction to achieve recommendation goals. Applications include music, travel, movies and e‑commerce customer service.

A typical early architecture (2017‑2018) consists of a dialogue manager, a user‑modeling module, and a reasoning/recommendation engine. Together, these components extract preferences from the user's utterances, update the user profile, match items against that profile, and return the results.
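The three components described above can be illustrated with a minimal sketch. It is not the architecture of any specific system from the talk: the catalog, the keyword vocabulary, and all class and function names are hypothetical, and keyword matching stands in for real language understanding.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy catalog; a real system would query an item database.
CATALOG = [
    {"title": "Blue Train", "genre": "jazz", "mood": "calm"},
    {"title": "Kind of Blue", "genre": "jazz", "mood": "calm"},
    {"title": "Master of Puppets", "genre": "metal", "mood": "energetic"},
]

# Hypothetical keyword vocabulary standing in for language understanding.
VOCAB = {"genre": ["jazz", "metal"], "mood": ["calm", "energetic"]}


@dataclass
class UserProfile:
    """User-modeling module: accumulates preferences across turns."""
    preferences: dict[str, str] = field(default_factory=dict)

    def update(self, extracted: dict[str, str]) -> None:
        self.preferences.update(extracted)


def extract_preferences(utterance: str) -> dict[str, str]:
    """Extract attribute/value pairs mentioned in the utterance."""
    tokens = set(re.findall(r"[a-z]+", utterance.lower()))
    found = {}
    for attribute, values in VOCAB.items():
        for value in values:
            if value in tokens:
                found[attribute] = value
    return found


def recommend(profile: UserProfile) -> list[str]:
    """Recommendation engine: return items matching every known preference."""
    return [
        item["title"]
        for item in CATALOG
        if all(item.get(k) == v for k, v in profile.preferences.items())
    ]


class DialogueManager:
    """Coordinates one dialogue turn: extract, update profile, match, respond."""

    def __init__(self) -> None:
        self.profile = UserProfile()

    def turn(self, utterance: str) -> str:
        self.profile.update(extract_preferences(utterance))
        if not self.profile.preferences:
            return "What kind of music are you in the mood for?"
        matches = recommend(self.profile)
        if not matches:
            return "I could not find a match. Could you relax one of your preferences?"
        return "You might like: " + ", ".join(matches)
```

Calling `turn` repeatedly on the same `DialogueManager` keeps the profile between turns, which is what makes the interaction multi‑turn rather than a series of independent queries.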

CRS overview diagram

2. Human‑Centric Evaluation

Traditional recommender evaluation focuses on accuracy, neglecting user experience. Early work (CHI 2006) highlighted the need for subjective measures. A human‑centric framework includes four dimensions: perceived quality, user belief, subjective attitude, and behavioral intention, with additional factors such as context and personal traits.

For CRS, a “dialogue dimension” is added, covering language understanding, response quality, adaptability, focus, and engagement. The framework also assesses trust, confidence, satisfaction, and willingness to adopt.
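As a rough illustration of how such dimensions might be scored from questionnaire data, the sketch below averages 5‑point Likert ratings per dimension. The dimension names come from the text; the questionnaire items, their assignment to dimensions, and the function name are assumptions made for the example, not the published instrument.

```python
from statistics import mean

# Dimension names follow the text; the items and their mapping are hypothetical.
ITEM_DIMENSION = {
    "recommendations_matched_my_taste": "perceived quality",
    "system_understood_what_i_said": "dialogue",
    "responses_were_adaptive": "dialogue",
    "system_was_easy_to_use": "user belief",
    "i_trust_the_system": "subjective attitude",
    "i_would_use_it_again": "behavioral intention",
}


def dimension_scores(responses: list[dict[str, int]]) -> dict[str, float]:
    """Average 1-5 Likert ratings per dimension across all participants."""
    ratings_by_dimension: dict[str, list[int]] = {}
    for response in responses:
        for item, rating in response.items():
            if not 1 <= rating <= 5:
                raise ValueError(f"rating out of range for {item!r}: {rating}")
            dimension = ITEM_DIMENSION[item]
            ratings_by_dimension.setdefault(dimension, []).append(rating)
    return {dim: mean(vals) for dim, vals in ratings_by_dimension.items()}
```

A simple per‑dimension mean is only a starting point; validated questionnaires are usually analyzed with reliability checks and structural models rather than raw averages.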

Human‑centric evaluation dimensions

3. CRS‑Que Framework

Building on the ResQue model, CRS‑Que adds a dialogue module and defines four evaluation aspects for assessing conversational recommenders. Two user studies were conducted, one on music recommendation and one on smartphone recommendation. The studies varied the feedback mechanism, the provision of explanations, and the degree of system humanization.

CRS‑Que framework

4. ChatGPT‑Based Recommendation Evaluation

Experiments with CRSs powered by large language models (LLMs) show that guiding users in constructing their own prompts improves perceived explainability, usability, transparency, and adaptability. Prompt framing also influences perceived novelty and user willingness across domains such as books and career advice.
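To make the idea of guided prompt construction concrete, here is a small sketch that assembles a recommendation prompt from user‑supplied parts. The template wording, the function name, and its parameters are hypothetical and are not the prompts used in the experiments; the resulting string would be sent to whichever LLM the system uses.

```python
def build_prompt(
    domain: str,
    preferences: list[str],
    ask_for_explanations: bool = True,
    n_items: int = 5,
) -> str:
    """Assemble a recommendation prompt from user-supplied parts (illustrative)."""
    if not preferences:
        raise ValueError("at least one preference is required")
    lines = [f"You are a recommender for {domain}.", "My preferences:"]
    lines += [f"- {preference}" for preference in preferences]
    lines.append(f"Recommend {n_items} items.")
    if ask_for_explanations:
        # Asking for reasons is one way a prompt could target explainability.
        lines.append("Explain briefly why each item fits my preferences.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("books", ["science fiction", "short novels"], n_items=3))
```

Exposing the domain, the preference list, and the explanation request as separate inputs lets a study vary one element of the prompt at a time while holding the rest fixed.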

ChatGPT recommendation experiment

Conclusion

Effective evaluation of conversational recommenders must be human‑centric, considering both recommendation quality and dialogue interaction. Factors such as intimacy, trust, and system transparency critically shape user experience, especially when LLMs are employed.
