Yuewen Technology
Apr 1, 2022 · Artificial Intelligence
Detecting Emerging Terms in Web Novels: PMI, Entropy, and TF‑IDF Methods
This article describes how to automatically discover new words in Chinese web novels by combining n‑gram statistics, pointwise mutual information (PMI), information entropy, and TF‑IDF filtering. The resulting unsupervised pipeline improves tokenization and search recall without manual labeling.
Chinese text mining · NLP · PMI
14 min read