Tagged articles
20 articles
AI Engineer Programming
Apr 26, 2026 · Artificial Intelligence

From Bag‑of‑Words to Semantics: How Embeddings Turn Meaning into Numbers (Part 2)

The article explains how embedding techniques encode semantic information into numeric vectors, covering Word2Vec and GloVe fundamentals, BERT anisotropy, SimCSE contrastive learning, alignment and uniformity metrics, ANN index structures such as HNSW, IVF and PQ, Matryoshka representation learning, practical deployment challenges, and evaluation best practices.

ANN · BERT · Embedding
0 likes · 23 min read
AI Cyberspace
Jan 13, 2026 · Artificial Intelligence

From Symbolic AI to LLMs: A Complete NLP History and Model Guide

This article surveys natural language processing, tracing its evolution from early symbolic and statistical stages through deep‑learning breakthroughs, and covers sequence models, key NLP tasks, text representation methods, and the development of architectures such as RNN, LSTM, GRU, Transformer, and the GPT series.

Deep Learning · GPT · LSTM
0 likes · 60 min read
Rare Earth Juejin Tech Community
Oct 19, 2023 · Artificial Intelligence

NLP Basics: Word Embeddings, Word2Vec, and Hand‑crafted RNN Implementation in PyTorch

This article introduces word‑level representations—from one‑hot encoding to dense word embeddings via Word2Vec—explains cosine similarity, then walks through the structure, limitations, and PyTorch implementation of a vanilla RNN, including a custom forward function and verification against the library API.

Cosine Similarity · NLP · PyTorch
0 likes · 19 min read
Baobao Algorithm Notes
Jan 28, 2022 · Artificial Intelligence

How Pre‑Training Evolved: From word2vec to MAE Across NLP and CV

This article traces the history of deep‑learning pre‑training techniques, comparing the parallel developments in natural‑language processing and computer vision—from early word2vec and bag‑of‑words models through ELMo and BERT to recent transformer‑based vision models like iGPT, ViT, BEiT and MAE—highlighting key innovations, challenges, and the convergence of the two fields.

Deep Learning · MAE · NLP
0 likes · 20 min read
Baobao Algorithm Notes
Dec 23, 2021 · Artificial Intelligence

How Pre‑Training Evolved: From word2vec to MAE Across NLP & Vision

This article traces the evolution of deep‑learning pre‑training techniques, starting with word2vec in NLP, moving through ELMo and BERT, then shifting to computer‑vision models such as iGPT, ViT, BEiT, and MAE, highlighting key innovations, challenges, and the convergence of NLP and CV paradigms.

BERT · MAE · NLP
0 likes · 21 min read
Code DAO
Dec 12, 2021 · Artificial Intelligence

How to Boost Text Analysis Accuracy on a 2‑Billion‑Word Corpus

This article explains practical techniques for improving NLP model accuracy on massive corpora, covering the challenges of multi‑field text, word‑embedding choices, a fastText‑based regression demo on book‑review data, feature‑engineering tricks, and a comparison with TF‑IDF + LASSO.

NLP · Python · Word2Vec
0 likes · 13 min read
58 Tech
Apr 9, 2021 · Artificial Intelligence

Vectorized Recall and Dual‑Tower Model for Home Page Recommendation at 58.com

This article details how 58.com improved its home‑page recommendation system by introducing vectorized recall with Word2Vec, optimizing negative sampling, deploying FAISS for fast nearest‑neighbor search, and later adopting a dual‑tower deep learning model with user interest features, achieving higher click‑through and conversion rates.

FAISS · Word2Vec · dual-tower
0 likes · 19 min read
FunTester
Nov 11, 2020 · Artificial Intelligence

Unlocking NLP: From the Turing Test to Word Embeddings and Beyond

This article gives an overview of natural language processing, tracing the field from Turing's test through regular expressions and the importance of word order to word embeddings (Word2Vec, GloVe) and knowledge‑based and retrieval‑based chatbot methods.

GloVe · Knowledge Graphs · NLP
0 likes · 15 min read
Tencent Advertising Technology
Jul 30, 2020 · Artificial Intelligence

Winning Strategies for the Tencent Advertising Algorithm Competition: Text Classification with Word2Vec and BiLSTM

The article details the Tencent Advertising Algorithm competition final, explains the chizhu team's approach of converting ad IDs into word sequences for text classification using large‑scale word2vec embeddings and a dual BiLSTM architecture, presents custom loss functions, training tricks, and shares full Python model code, achieving an overall rank of 11.

Advertising · BiLSTM · Deep Learning
0 likes · 9 min read
58 Tech
Jul 10, 2020 · Artificial Intelligence

Tag Mining for Used‑Car Business: NLP, Word2Vec, and Retrieval Pipeline

This article details the end‑to‑end process of extracting and leveraging tags for used‑car listings, covering data collection, segmentation, NLP‑based tokenization, word‑vector generation, tag‑library construction, and online retrieval flow to improve personalized recall and CTR.

NLP · Tagging · Word2Vec
0 likes · 19 min read
Sohu Tech Products
May 27, 2020 · Artificial Intelligence

Overview of Embedding Methods: From Word2Vec to Item2Vec and Dual‑Tower Models in Recommendation Systems

This article provides a comprehensive overview of embedding techniques, explaining their role in deep learning recommendation systems, detailing Word2Vec and its Skip‑gram model with negative sampling and hierarchical softmax, and extending the discussion to Item2Vec and dual‑tower architectures for item representation.

Item2Vec · Word2Vec · negative sampling
0 likes · 15 min read
Alibaba Cloud Developer
Jun 18, 2019 · Artificial Intelligence

From Word2Vec to Quick-Thought: A Complete Guide to Modern Embeddings

This article reviews the evolution of word and sentence embeddings, covering foundational theories like vector semantics and distributional hypothesis, practical models such as Word2Vec, GloVe, fastText, Skip‑Thought, Quick‑Thought, and evaluation techniques, while offering implementation tips and real‑world use cases.

GloVe · NLP · Word2Vec
0 likes · 21 min read
Sohu Tech Products
Mar 6, 2019 · Artificial Intelligence

Applying Word2Vec Embeddings to Rental and News Recommendation: Model, Hyper‑parameters, and Optimization

This article explains the fundamentals of the Word2Vec SGNS model, details its hyper‑parameters and training tricks, and demonstrates how customized embeddings are built for rental‑listing and news‑article recommendation, covering data preparation, objective‑function redesign, evaluation, and deployment in both recall and ranking stages.

Embedding · SGNS · Word2Vec
0 likes · 14 min read
MaGe Linux Operations
Mar 22, 2018 · Artificial Intelligence

Mapping Character Relationships in 'Heavenly Sword and Dragon Slaying' with Jieba, Word2Vec & NetworkX

This article demonstrates how to combine Jieba segmentation, Word2Vec embeddings, and NetworkX graph visualization to extract and analyze character relationships from the Chinese novel "Heavenly Sword and Dragon Slaying," detailing data preparation, model training, entity matrix construction, and network graph generation.

Graph Visualization · NLP · Python
0 likes · 10 min read
Architecture Digest
Feb 22, 2018 · Artificial Intelligence

Deep Learning Applications in Recommendation Systems

This article explains why deep learning has become essential for modern recommendation systems, describing its advantages such as automatic feature extraction, noise robustness, sequential modeling with RNNs, and improved user‑item representation, and reviews major deep‑learning‑based recommendation models and techniques.

Deep Learning · Recommendation Systems · Word2Vec
0 likes · 17 min read
Qunar Tech Salon
Aug 18, 2016 · Artificial Intelligence

Automatic Ticket Classification Using SVM and word2vec at Qunar

At Qunar, the data center algorithm team developed an automatic ticket classification system that combines Support Vector Machine with word2vec embeddings to handle high‑dimensional, low‑sample text data, achieving 89% accuracy and 80% recall while outlining the full machine‑learning pipeline from feature extraction to deployment.

Qunar · Word2Vec · machine learning
0 likes · 7 min read
21CTO
Feb 11, 2016 · Artificial Intelligence

How ICBC Leverages Text Mining to Transform Customer Service

This article details how Industrial and Commercial Bank of China (ICBC) applies text mining and natural language processing to analyze both internal call‑center records and external online discussions, building ontologies and models that turn massive unstructured feedback into actionable insights for improving service quality and reducing costs.

Banking · Ontology · Word2Vec
0 likes · 21 min read