Clever Classical Ideas in Natural Language Processing Tasks

The article highlights several ingenious pre‑deep‑learning techniques for NLP, including the Distributional Hypothesis, Bag‑of‑Words, Latent Semantic Analysis, Probabilistic Topic Models, BMES/BIO tagging schemes, and TextRank, explaining their principles, advantages, and historical significance in text representation and processing.

DataFunTalk

In this article, Fudan University associate professor Qiu Xipeng shares a collection of clever ideas that were widely used in natural language processing (NLP) before the deep‑learning era.

Distributional Hypothesis: Linguistic items that occur in similar contexts tend to have similar meanings, so a word’s semantics can be represented by the contexts in which it appears.
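
As a rough illustration of this idea, the sketch below represents each word by counts of its neighbouring words and compares words by cosine similarity. The toy corpus, the window size, and the helper names context_vectors and cosine are assumptions made for this example, not something taken from the original article.

```python
from collections import Counter, defaultdict
import math

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy corpus: "cat" and "dog" appear in similar contexts.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases cars",
]
vecs = context_vectors(corpus)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])
sim_cat_milk = cosine(vecs["cat"], vecs["milk"])
print(sim_cat_dog, sim_cat_milk)
```

On this toy corpus "cat" and "dog" come out far more similar (about 0.75) than "cat" and "milk" (about 0.25), because the first pair shares most of its neighbouring words.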

Bag‑of‑Words (BoW): Treats a document as an unordered collection of words, discarding syntax and word order, which converts a variable‑length text sequence into a fixed‑length vector with one dimension per vocabulary word. Extensions such as n‑grams and TF‑IDF build on this basic representation.
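
A minimal sketch of the representation, assuming whitespace tokenization and a vocabulary built from the documents themselves. The function names are hypothetical, and the TF‑IDF weighting shown (raw count times log(N/df)) is only one of several common variants.

```python
from collections import Counter
import math

def build_vocab(docs):
    """Sorted list of every distinct token; its length fixes the vector size."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})

def bow_vector(doc, vocab):
    """Count vector over the vocabulary; word order is discarded."""
    counts = Counter(doc.lower().split())
    return [counts[w] for w in vocab]

def tfidf_vectors(docs, vocab):
    """Reweight counts by log(N / document frequency), an unsmoothed variant."""
    n = len(docs)
    bows = [bow_vector(d, vocab) for d in docs]
    df = [sum(1 for b in bows if b[i] > 0) for i in range(len(vocab))]
    idf = [math.log(n / d) for d in df]  # df >= 1 because vocab comes from docs
    return [[tf * w for tf, w in zip(b, idf)] for b in bows]

docs = ["the dog bit the man", "the man bit the dog", "a cat sat"]
vocab = build_vocab(docs)
v0, v1, v2 = (bow_vector(d, vocab) for d in docs)
weighted = tfidf_vectors(docs, vocab)
print(vocab)
print(v0, v1, v2)
```

The first two documents receive the same vector even though they mean different things, which shows both the simplicity of the representation and what is lost when word order is discarded.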

Latent Semantic Analysis (LSA): Constructs a term‑document matrix from the BoW representation and applies a truncated singular value decomposition (SVD) to obtain dense, low‑dimensional vectors for words and documents, revealing hidden semantic structure such as topics.
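
The following sketch shows the mechanics on a tiny corpus, assuming NumPy is available. The documents, the choice of k = 2, and the scaling of the vectors by the singular values are illustrative assumptions; practical systems usually weight the matrix (for example with TF‑IDF) before the decomposition.

```python
import numpy as np

# Hypothetical toy corpus with two obvious themes.
docs = [
    "cat dog pet",
    "dog pet animal",
    "stock market trade",
    "market trade price",
]
vocab = sorted({w for d in docs for w in d.split()})

# Term-document matrix: one row per term, one column per document.
A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

# Thin SVD, then keep only the k largest singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]     # one k-dimensional vector per term
doc_vecs = Vt[:k, :].T * s[:k]   # one k-dimensional vector per document

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

same_theme = cosine(doc_vecs[0], doc_vecs[1])
cross_theme = cosine(doc_vecs[0], doc_vecs[2])
cat_animal = cosine(term_vecs[vocab.index("cat")], term_vecs[vocab.index("animal")])
print(same_theme, cross_theme, cat_animal)
```

In this example "cat" and "animal" never appear in the same document, yet their reduced vectors point in nearly the same direction because both co-occur with "dog" and "pet". That is the kind of hidden structure the decomposition is meant to surface.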

Probabilistic Topic Models (PTM): Introduce a latent “topic” variable between documents and words, modeling the generation process as document → topic → word. The approach is mathematically elegant, inference is often performed via Gibbs sampling, and it remains a useful tool for NLP tasks.
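
The document → topic → word process can be sketched as below. The two topics and all probabilities are hand-set, hypothetical values chosen for illustration. In a real topic model these distributions are unknown and must be inferred from data (for example with Gibbs sampling), which this sketch does not attempt.

```python
import random

def sample(dist, rng):
    """Draw one key from a {outcome: probability} dict."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

# Hypothetical parameters: p(word | topic) and p(topic | document).
topic_word = {
    "sports": {"game": 0.5, "team": 0.4, "vote": 0.1},
    "politics": {"vote": 0.6, "party": 0.3, "team": 0.1},
}
doc_topic = {"sports": 0.8, "politics": 0.2}

def generate_document(doc_topic, topic_word, length, rng):
    """For each position: pick a topic for the document, then a word for the topic."""
    pairs = []
    for _ in range(length):
        topic = sample(doc_topic, rng)
        word = sample(topic_word[topic], rng)
        pairs.append((topic, word))
    return pairs

def word_probability(word, doc_topic, topic_word):
    """Marginalize out the latent topic: p(w | d) = sum_z p(z | d) * p(w | z)."""
    return sum(p_z * topic_word[z].get(word, 0.0) for z, p_z in doc_topic.items())

rng = random.Random(0)  # fixed seed so the run is repeatable
doc = generate_document(doc_topic, topic_word, 10, rng)
p_team = word_probability("team", doc_topic, topic_word)
print(doc)
print(p_team)
```

Summing over the latent topic gives the probability of a word in the document, here 0.8 × 0.4 + 0.2 × 0.1 = 0.34 for "team".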

BMES/BIO Tagging Schemes: BMES (Begin/Middle/End/Single) and BIO (Begin/Inside/Outside) assign a label to each character or token, which turns word segmentation, named‑entity recognition, and chunking into sequence‑labeling problems that can be tackled with models such as HMMs and CRFs.
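
To make the encoding concrete, the sketch below converts a segmented sentence to BMES tags and back, and reads entity spans out of BIO tags. The function names, the example sentences, and the PER/ORG label set are assumptions for illustration. The tagger that would predict these labels (an HMM or CRF) is not shown.

```python
def words_to_bmes(words):
    """One tag per character: S for single-character words, else B, M..., E."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(w) - 2) + ["E"])
    return tags

def bmes_to_words(chars, tags):
    """Recover the segmentation: a word ends at every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends mid-word
        words.append(current)
    return words

def bio_to_spans(tokens, tags):
    """Collect (label, text) spans from tags such as B-PER, I-PER, O."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # the sentinel flushes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.append((label, " ".join(tokens[start:i])))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans

segmented = ["自然语言", "处理", "难"]
bmes = words_to_bmes(segmented)
restored = bmes_to_words("".join(segmented), bmes)

tokens = ["Qiu", "Xipeng", "works", "at", "Fudan", "University"]
bio = ["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG"]
entities = bio_to_spans(tokens, bio)
print(bmes, restored, entities)
```

Because the tags can be decoded back into the original segmentation or entity spans, predicting one label per position is enough to solve the structured task.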

TextRank: Adapts the PageRank algorithm to a graph built from the words or sentences of a text and ranks them by score, enabling applications like keyword extraction and automatic summarization.
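
A minimal keyword-extraction sketch in this spirit: words become nodes, words that co-occur within a small window are linked, and a PageRank-style update is iterated over the graph. The unweighted edges, the window size, the damping factor of 0.85, the fixed iteration count, and the function name are assumptions for this example. Practical pipelines usually also filter by part of speech and remove stop words first.

```python
from collections import defaultdict

def textrank_keywords(tokens, window=2, damping=0.85, iterations=50):
    """Rank words with a PageRank-style update on an undirected co-occurrence graph."""
    neighbors = defaultdict(set)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != w:
                neighbors[w].add(tokens[j])
                neighbors[tokens[j]].add(w)

    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        # Each neighbor u passes on its score split evenly among its own neighbors.
        scores = {
            w: (1 - damping)
            + damping * sum(scores[u] / len(neighbors[u]) for u in neighbors[w])
            for w in neighbors
        }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

tokens = "graph ranking uses graph edges and graph nodes".split()
ranked = textrank_keywords(tokens, window=1)
print(ranked)
```

Here "graph" ranks first because it is linked to every other word. Sentence-level summarization follows the same pattern with sentences as nodes and a sentence-similarity measure as the edge weight.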

The article concludes with reference links to the distributional semantics Wikipedia page, the term‑document matrix page, and the original Zhihu answer.

