Semantic Embedding with Large Language Models: A Comprehensive Survey
This survey reviews the evolution of semantic embedding, from Word2vec and GloVe through BERT and Sentence-BERT to recent contrastive methods. It then examines how large language models improve embeddings, both as generators of synthetic training data and as embedding backbones, detailing techniques such as contrastive prompting, in-context learning, and knowledge distillation, and closes by discussing resource, privacy, and interpretability challenges.