Infra Learning Club
Oct 31, 2024 · Artificial Intelligence
What Is a Token in Large Language Models?
The article explains that a token is the basic unit a large language model processes, describes three common tokenization methods—word-level, character-level, and sub-word-level—with English and Chinese examples, discusses their advantages and limitations, and shows how OpenAI's tokenizer varies across model versions.
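The three granularities the abstract names can be sketched in plain Python (a toy illustration, not the article's code; the tiny sub-word vocabulary here is invented for demonstration):

```python
sentence = "Large language models process tokens"

# Word-level: split on whitespace; each word is one token.
word_tokens = sentence.split()

# Character-level: every character (including spaces) is a token.
char_tokens = list(sentence)

# Sub-word level (toy): greedily match the longest piece in a tiny
# hypothetical vocabulary, falling back to single characters.
vocab = {"lang", "uage", "mod", "els", "proc", "ess", "token", "s"}

def subword_tokenize(word, vocab):
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try longest match first
            piece = word[i:j]
            if piece in vocab or j == i + 1:  # single-char fallback
                tokens.append(piece)
                i = j
                break
    return tokens

subword_tokens = [t for w in word_tokens
                  for t in subword_tokenize(w.lower(), vocab)]
```

Real sub-word tokenizers (BPE, WordPiece) learn their vocabulary from data rather than using a hand-written set, but the greedy longest-match idea above captures the basic mechanism.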
NLP · character-level · jieba
