
NLP-based Text Opinion Extraction and Sentiment Analysis for iQIYI Video Comments

iQIYI's NLP pipeline combines CRF-based word segmentation, bidirectional LSTM/GRU models with attention, and a CNN classifier to extract opinion targets, opinion words, and sentiment polarity from unstructured video comments. It aggregates these results across users to reveal collective attitudes toward actors, plot, and visual effects. Planned future work covers implicit opinions and broader sentiment domains.

iQIYI Technical Product Team

User-generated text is a crucial component of public opinion data. Natural Language Processing (NLP) techniques help extract useful information from that text and reveal users' viewpoints, emotions, and needs. This article introduces iQIYI's technical exploration and practice in text opinion mining and sentiment analysis, using comments on TV series as the running example.

Background

As a technology‑driven entertainment company, iQIYI aims to provide rich, high‑quality, intelligent services. Analyzing user opinions expressed after watching videos is essential for understanding user preferences. Comments may cover program content, actors, or product feedback. While opinion data can be multi‑modal (text, image, audio), this work focuses on textual comments and explores NLP‑based opinion mining and sentiment analysis.

Examples are drawn from user comments on the drama "Our Glamorous Time" (《你和我的倾城时光》) and illustrate the concrete analysis process.

Functionality

iQIYI possesses massive video resources, generating abundant bullet comments, episode remarks, and bubble‑chat comments. Each comment is treated as a basic unit for opinion analysis. Although comments are unstructured and informal, NLP pipelines transform them into structured information, extracting opinion targets, opinion words, and sentiment polarity.

For a single comment such as “颖宝的演技一直都有进步!期待你和我的倾城时光” ("Yingbao's acting keeps getting better! Looking forward to 《你和我的倾城时光》"), the system can derive:

Overall sentiment polarity: positive.

Opinion targets: “颖宝的演技” ("Yingbao's acting") and the drama title.

Opinion words: “有进步” ("has improved"), “期待” ("looking forward to").

Sentiment toward each target: positive.

Classification of targets into predefined categories (e.g., actor, overall evaluation).
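The structured result for the example comment can be pictured as a small record. The following is a minimal sketch, not iQIYI's actual schema: the class names `Opinion` and `CommentAnalysis`, their fields, and the pairing of each opinion word with a target are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Opinion:
    target: str    # opinion target, e.g. an actor's performance
    word: str      # opinion word that evaluates the target
    polarity: str  # "positive", "neutral", or "negative"
    category: str  # predefined category, e.g. "actor"


@dataclass
class CommentAnalysis:
    text: str
    polarity: str  # overall (sentence-level) polarity
    opinions: List[Opinion] = field(default_factory=list)


# Hypothetical record for the example comment; the target/word pairing is assumed.
result = CommentAnalysis(
    text="颖宝的演技一直都有进步!期待你和我的倾城时光",
    polarity="positive",
    opinions=[
        Opinion("颖宝的演技", "有进步", "positive", "actor"),
        Opinion("你和我的倾城时光", "期待", "positive", "overall evaluation"),
    ],
)

for opinion in result.opinions:
    print(opinion.target, opinion.word, opinion.polarity, opinion.category)
```

Keeping the overall polarity separate from the per-target opinions mirrors the distinction the article draws between sentence-level and target-level sentiment.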

Beyond single‑sentence analysis, the platform aggregates opinions across the user base to reveal collective attitudes toward specific aspects such as actors, plot, or visual effects. Figures illustrate daily sentiment distribution and overall viewpoint classification.
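A daily sentiment distribution of this kind can be computed by counting per-comment polarities for each day. The sketch below assumes each analyzed comment has already been reduced to a `(day, polarity)` pair; the function name and the `day-1`/`day-2` placeholder records are illustrative, not real data.

```python
from collections import Counter, defaultdict


def daily_sentiment_distribution(records):
    """Map each day to the share of comments per polarity."""
    counts = defaultdict(Counter)
    for day, polarity in records:
        counts[day][polarity] += 1

    distribution = {}
    for day, counter in counts.items():
        total = sum(counter.values())
        distribution[day] = {p: n / total for p, n in counter.items()}
    return distribution


# Placeholder records for illustration only.
records = [
    ("day-1", "positive"),
    ("day-1", "positive"),
    ("day-1", "negative"),
    ("day-2", "neutral"),
]
distribution = daily_sentiment_distribution(records)
print(distribution)
```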

Algorithm and Process

The workflow relies on lexical analysis, opinion extraction, relation extraction, sentiment analysis, and text classification. Lexical analysis, powered by a CRF‑based word segmentation service, provides the foundation for downstream tasks.
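CRF-based segmenters commonly label each character with a position tag and then group characters into words. The article does not state which tag scheme iQIYI's service uses, so the sketch below assumes the widespread B/M/E/S scheme (begin, middle, end, single) and uses hand-written tags rather than model output.

```python
def tags_to_words(chars, tags):
    """Group characters into words from per-character B/M/E/S tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag in ("B", "S") and current:
            words.append(current)  # flush an unfinished word defensively
            current = ""
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


chars = list("演技有进步")
tags = ["B", "E", "S", "B", "E"]  # hand-written tags, not model output
words = tags_to_words(chars, tags)
print(words)  # ['演技', '有', '进步']
```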

1) Opinion Extraction

Opinion targets (the entities being evaluated) and opinion words (the evaluative expressions) are extracted using sequence labeling. A bidirectional LSTM‑CRF model, trained on manually annotated data, achieves strong performance.
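In a BiLSTM-CRF tagger, the LSTM produces per-token scores for each label and the CRF layer adds label-to-label transition scores; the best label sequence is then found with Viterbi decoding. The sketch below shows only that decoding step plus the conversion of labels into spans. The BIO label names, the toy scores, and the function names are assumptions for illustration, not values from iQIYI's model.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag path.

    emissions[t][tag] and transitions[(prev, cur)] are additive scores;
    missing transitions default to 0.0.
    """
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def bio_spans(tokens, labels):
    """Collect (kind, text) spans from BIO labels."""
    spans, start, kind = [], None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel flushes the last span
        if start is not None and not label.startswith("I-"):
            spans.append((kind, "".join(tokens[start:i])))
            start = None
        if label.startswith("B-"):
            start, kind = i, label[2:]
    return spans


tokens = ["颖宝", "的", "演技"]
tags = ["O", "B-TGT", "I-TGT"]
# Toy emission scores; taken alone they would pick B-TGT, O, I-TGT.
emissions = [
    {"O": 0.1, "B-TGT": 2.0, "I-TGT": 0.5},
    {"O": 1.0, "B-TGT": 0.2, "I-TGT": 0.9},
    {"O": 0.3, "B-TGT": 1.0, "I-TGT": 1.2},
]
# Toy transition scores; O -> I-TGT is heavily penalized as an invalid move.
transitions = {("O", "I-TGT"): -1e4, ("B-TGT", "I-TGT"): 1.0, ("I-TGT", "I-TGT"): 0.5}

path = viterbi(emissions, transitions, tags)
spans = bio_spans(tokens, path)
print(path, spans)
```

With these toy numbers the transition scores pull the middle token into the target span, which is the kind of label consistency a CRF layer contributes on top of per-token predictions.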

Relation extraction determines which opinion word refers to which target. An attention-based bidirectional GRU classifier handles one-to-one as well as many-to-many relationships and improves robustness against noisy annotations.
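One common way to support many-to-many links is to treat every (target, opinion word) pair as an independent yes/no classification. The sketch below illustrates only that framing and is not the article's model: the comment, the `toy_scores` probabilities, and the 0.5 threshold are made up, and the lookup stands in for a trained classifier.

```python
from itertools import product


def extract_relations(targets, words, score, threshold=0.5):
    """Keep each (target, word) pair whose score reaches the threshold.

    Pairs are judged independently, so one word may link to several
    targets and one target to several words.
    """
    return [(t, w) for t, w in product(targets, words) if score(t, w) >= threshold]


# Made-up comment: "剧情和特效都很棒,演技一般"
# ("The plot and effects are both great, the acting is mediocre").
targets = ["剧情", "特效", "演技"]
words = ["很棒", "一般"]

# Placeholder probabilities standing in for a trained classifier's output.
toy_scores = {
    ("剧情", "很棒"): 0.92,
    ("特效", "很棒"): 0.88,
    ("演技", "一般"): 0.90,
}

relations = extract_relations(
    targets, words, score=lambda t, w: toy_scores.get((t, w), 0.0)
)
print(relations)
```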

2) Sentiment Analysis

Sentiment is categorized into positive, neutral, and negative. Both sentence‑level sentiment and fine‑grained sentiment toward specific targets are predicted using bidirectional LSTM models enhanced with attention or gating mechanisms.
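Attention in this setting typically means weighting the hidden state of each token by its relevance and summing the result into one vector for classification. The article does not specify the attention form, so the sketch below assumes simple dot-product attention; the hidden states and query vector are toy numbers, and for target-level sentiment the query would be derived from the target's representation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its dot product with the query, then sum."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


# Toy hidden states for three tokens (dimension 2) and a toy query vector.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
query = [0.0, 1.0]

pooled, weights = attention_pool(hidden_states, query)
print(weights, pooled)
```

The pooled vector would then feed a three-way classifier over positive, neutral, and negative.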

3) Opinion Aggregation

Aggregated viewpoints are obtained by feeding sentence‑level results into a CNN‑based classification model that summarizes opinions across predefined dimensions (e.g., actor, plot, visual effects).
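The core of a CNN text classifier is a filter that slides over consecutive token vectors, followed by max pooling and a linear layer over the pooled features. The sketch below shows that forward pass in plain Python under stated assumptions: the embeddings, filters, and class weights are toy values, and the article does not describe the real model's input representation or architecture details.

```python
def conv1d_max_pool(embeddings, kernel, bias=0.0):
    """Slide one filter over token windows, apply ReLU, keep the maximum."""
    width = len(kernel)
    if len(embeddings) < width:
        raise ValueError("input is shorter than the filter width")
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias + sum(
            w * x
            for k_row, e_row in zip(kernel, window)
            for w, x in zip(k_row, e_row)
        )
        activations.append(max(0.0, total))
    return max(activations)


def classify(embeddings, filters, class_weights):
    """Pool one feature per filter, then pick the class with the highest linear score."""
    features = [conv1d_max_pool(embeddings, k) for k in filters]
    scores = {
        label: sum(w * f for w, f in zip(weights, features))
        for label, weights in class_weights.items()
    }
    return max(scores, key=scores.get), features


# Toy inputs: three tokens with 2-dimensional embeddings, two filters of width 2.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
# Toy linear weights for three of the predefined dimensions.
class_weights = {
    "actor": [1.0, 0.0],
    "plot": [0.0, 1.0],
    "visual effects": [0.5, 0.5],
}

predicted, features = classify(embeddings, filters, class_weights)
print(predicted, features)
```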

Conclusion and Future Work

The case study demonstrates how deep‑learning‑driven NLP pipelines can extract and aggregate user opinions and emotions from large‑scale video comments. While the current system handles explicit expressions effectively, future efforts will focus on capturing implicit opinions, handling diverse linguistic styles, and extending the framework to product and artist sentiment analysis.
