Tagged articles

word embeddings

13 articles · Page 1 of 1

Aug 10, 2023 · Artificial Intelligence

Understanding Word2Vec: Theory, Architecture, and Python Implementation

This article explains the Word2Vec algorithm, its CBOW and Skip‑Gram architectures, the mathematics of cosine similarity, and training with negative sampling, and provides a concise Python example using the gensim library.

.ai · Gensim · Python

0 likes · 8 min read

DevOps

Apr 7, 2023 · Artificial Intelligence

Understanding How ChatGPT Generates Answers: Probabilistic Language Modeling and Word Vectors

The article explains that ChatGPT produces responses by converting words into high‑dimensional vectors, feeding them through neural networks, and selecting tokens based on probability distributions, while also contrasting GPT with BERT and describing a related training event.

ChatGPT · GPT-4 · Language Models

0 likes · 7 min read

Code DAO

Jan 15, 2022 · Artificial Intelligence

Compressing Unsupervised fastText Models 300× Smaller with Near‑Identical NLP Performance

This article shows how the compress‑fasttext Python library can shrink a 7 GB fastText word‑embedding model to about 21 MB, a roughly 300‑fold reduction, while preserving almost the same accuracy on downstream NLP tasks. It also explains the underlying compression techniques and presents usage examples and evaluation results.

NLP · compress-fasttext · fastText

0 likes · 9 min read

Code DAO

Dec 12, 2021 · Artificial Intelligence

How to Boost Text Analysis Accuracy on a 2‑Billion‑Word Corpus

This article explains practical techniques for improving NLP model accuracy on massive corpora, covering the challenges of multi‑field text, word‑embedding choices, a fastText‑based regression demo with book‑review data, feature‑engineering tricks, and a comparison with TF‑IDF + LASSO.

NLP · Python · Regression

0 likes · 13 min read

FunTester

Nov 11, 2020 · Artificial Intelligence

Unlocking NLP: From the Turing Test to Word Embeddings and Beyond

This article provides a comprehensive overview of natural language processing, tracing its development from Turing's seminal test through regular expressions, the importance of word order, and word embeddings such as Word2Vec and GloVe, to knowledge‑based and retrieval‑based chatbot methods.

GloVe · NLP · Word2Vec

0 likes · 15 min read

Tencent Cloud Developer

Jul 8, 2020 · Artificial Intelligence

Graph-Based Chinese Word Embedding (AlphaEmbedding) for Improved Text Matching

AlphaEmbedding builds a weighted graph linking Chinese words, sub‑words, characters and pinyin, then uses random‑walk‑based node2vec training to produce embeddings that capture orthographic and phonetic similarity, markedly improving recall and ranking for homophones, typos and OOV terms in enterprise search.

Chinese NLP · graph computing · semantic similarity

0 likes · 17 min read

JD Retail Technology

Aug 8, 2019 · Artificial Intelligence

From Word Representations to Sentiment Analysis – Talk by Dr. Feng Ao

On August 6, Dr. Feng Ao presented a comprehensive overview of the evolution of word representations and sentiment analysis, illustrating the shift from traditional linguistic features to modern pretrained models such as BERT and XLNet, and sharing practical convolutional experiments relevant to industry applications.

Artificial Intelligence · NLP · Pretrained Models

0 likes · 4 min read

Alibaba Cloud Developer

Jun 18, 2019 · Artificial Intelligence

From Word2Vec to Quick-Thought: A Complete Guide to Modern Embeddings

This article reviews the evolution of word and sentence embeddings, covering foundational theories such as vector semantics and the distributional hypothesis, practical models such as Word2Vec, GloVe, fastText, Skip‑Thought, and Quick‑Thought, and evaluation techniques, while offering implementation tips and real‑world use cases.

GloVe · NLP · Word2Vec

0 likes · 21 min read

DataFunTalk

Mar 13, 2019 · Artificial Intelligence

A Comprehensive Overview of NLP Development and Deep Learning Models

This article reviews the history of natural language processing, explains key deep‑learning models such as NNLM, Word2Vec, CNN, RNN, attention mechanisms, and Transformers, and discusses their applications, future trends, and practical considerations in NLP tasks.

NLP · Transformer · attention

0 likes · 38 min read

Alibaba Cloud Developer

Apr 25, 2018 · Artificial Intelligence

How cw2vec Beats Word2Vec: Leveraging Chinese Stroke N‑grams for Superior Word Embeddings

This article introduces cw2vec, a novel Chinese word‑embedding algorithm that exploits stroke‑level subword information, outlines its theoretical foundations, compares it with word2vec, GloVe, CWE and other models on multiple benchmarks, and demonstrates its superior performance across word similarity, analogy, text classification and named‑entity recognition tasks.

Chinese NLP · Deep Learning · cw2vec

0 likes · 14 min read

Baobao Algorithm Notes

Feb 28, 2018 · Artificial Intelligence

Mastering Text Classification: From TF‑IDF to Word Embeddings and Deep Learning

This article provides a comprehensive guide to text classification, covering traditional pipelines, bag‑of‑words and TF‑IDF features, dimensionality‑reduction techniques, word‑embedding models such as GloVe, word2vec and fastText, and modern deep‑learning architectures like CNN, RCNN and HAN.

CNN · Deep Learning · NLP

0 likes · 9 min read

AntTech

Jan 18, 2018 · Artificial Intelligence

cw2vec: Learning Chinese Word Embeddings with Stroke n-grams

The cw2vec paper, presented at AAAI 2018, introduces a Chinese word embedding method that leverages stroke n‑grams to capture character semantics, proposes a novel loss function, demonstrates consistent improvements over existing models across similarity, analogy, classification and NER tasks, and discusses real‑world AI applications.

AAAI 2018 · AI research · Chinese NLP

0 likes · 7 min read

ITPUB

Dec 23, 2015 · Artificial Intelligence

How Computers Turn Words into Numbers: A Beginner’s Guide to Tokenization and Vector Similarity

This article explains how natural language processing stores word meanings as numeric vectors, builds token dictionaries, represents sentences as binary vectors, and uses dot‑product calculations to measure similarity, illustrating concepts with simple examples and highlighting current limitations and future directions.

Artificial Intelligence · NLP · Tokenization

0 likes · 7 min read
