
Yuewen Technology
Apr 1, 2022 · Artificial Intelligence

Detecting Emerging Terms in Web Novels: PMI, Entropy, and TF‑IDF Methods

This article explores how to automatically discover new words in Chinese web novels by combining n‑gram statistics, pointwise mutual information, information entropy, and TF‑IDF filtering, presenting a practical, unsupervised pipeline that improves tokenization and search recall without manual labeling.
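As a rough illustration of the two core statistics named in the summary, the sketch below scores character bigrams by pointwise mutual information (how strongly the two characters stick together) and by the smaller of their left and right neighbor entropies (how freely the pair combines with its surroundings). It is a minimal sketch under stated assumptions, not the article's pipeline: the names `entropy` and `score_bigrams` are hypothetical, candidates are limited to 2-character n-grams, probabilities are simple relative frequencies, and the TF‑IDF filtering step is left out.

```python
import math
from collections import Counter, defaultdict


def entropy(counter):
    """Shannon entropy (in bits) of a Counter of neighbor characters."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def score_bigrams(text):
    """Return {bigram: (pmi, min_neighbor_entropy)} for every character bigram.

    Hypothetical helper for illustration; assumes a single string of text
    with at least two characters.
    """
    n_chars = len(text)
    n_bigrams = max(n_chars - 1, 1)
    char_counts = Counter(text)
    bigram_counts = Counter(text[i:i + 2] for i in range(n_chars - 1))

    # Collect the characters seen immediately left and right of each bigram.
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for i in range(n_chars - 1):
        gram = text[i:i + 2]
        if i > 0:
            left[gram][text[i - 1]] += 1
        if i + 2 < n_chars:
            right[gram][text[i + 2]] += 1

    scores = {}
    for gram, count in bigram_counts.items():
        p_xy = count / n_bigrams
        p_x = char_counts[gram[0]] / n_chars
        p_y = char_counts[gram[1]] / n_chars
        pmi = math.log2(p_xy / (p_x * p_y))
        scores[gram] = (pmi, min(entropy(left[gram]), entropy(right[gram])))
    return scores


if __name__ == "__main__":
    for gram, (pmi, ent) in sorted(score_bigrams("xabyabzabw").items()):
        print(gram, round(pmi, 3), round(ent, 3))
```

A candidate that scores high on both measures is a plausible new word; in practice the thresholds would need tuning on the target corpus, and longer n-grams would be scored the same way.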

Chinese text mining · NLP · PMI