What’s the Latest Landscape of Large Language Models? Params, Capabilities, and Licenses
This article surveys the evolution of large language models (LLMs): it compares foreign and Chinese offerings, explains what parameter counts mean, covers multimodal extensions, and lists which models are available for commercial use.
Overview of Large Language Model Development
Many developers are familiar with popular foreign LLMs such as ChatGPT, but Chinese models receive less attention. This article compiles a side‑by‑side comparison of major LLMs released up to 2024, covering their parameter counts, key characteristics, release years, and commercial licensing.
Evolution Timeline
The LLM era traces back to 2017, when Google introduced the Transformer architecture. Subsequent milestones include GPT‑2 (1.5 B parameters, 2019), Google T5 (11 B, 2019), GPT‑3 (175 B, 2020), Meta’s OPT (up to 175 B, 2022), GPT‑3.5 (2022, size not officially disclosed), the LLaMA series (7 B‑65 B, 2023), GPT‑4 (2023, reportedly around 1.8 T but unconfirmed by OpenAI), Falcon (2023), and Gemini Pro (late 2023), followed by Claude 3 and GPT‑4o in 2024.
Foreign vs. Domestic Models
Foreign models (selected): GPT‑2, Google T5, GPT‑3.5, Meta OPT, LLaMA, Vicuna‑13B, Falcon, Claude 1.3/2/3, PaLM 2, Mistral, Gemini, etc., mostly launched in 2023‑2024.
Domestic models (selected): Baichuan (7 B, 2023), Wenxin YiYan (reportedly 260 B, 2023), Tongyi Qianwen (7 B‑70 B, 2023), ChatGLM series (6 B‑130 B, 2023‑2024), Tencent HunYuan (over 100 B, 2023), MOSS, Aquila, PolyLM, and others, also released primarily in 2023‑2024.
Parameters and Model Capability
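A model’s parameter count measures its size and computational demand: more parameters generally require more storage and inference compute, and tend to deliver stronger language understanding and generation, although training data and methods matter as well. For example, GPT‑2 has 1.5 B parameters, while GPT‑4 is reported (but not confirmed by OpenAI) to reach about 1.8 T. Anthropic has not published a parameter count for Claude 3 Opus, which is commonly assumed to exceed 100 B.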
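The link between parameter count and storage can be made concrete with a rough calculation. The sketch below is an illustration, not a sizing tool: it assumes memory for the weights alone is roughly the parameter count times the bytes per parameter, and it ignores activations, KV cache, and framework overhead. The function name `weight_memory_gb` and the precision table are hypothetical names chosen for this example.

```python
# Bytes needed to store one parameter at common precisions (assumed values
# for illustration; real deployments may mix precisions).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params, dtype="fp16"):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Parameter counts taken from the figures quoted in this article.
for name, params in [("GPT-2", 1.5e9), ("LLaMA 65B", 65e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights in fp16")
```

Under these assumptions, GPT‑2 needs about 3 GB for fp16 weights and a 65 B model about 130 GB, which is why larger models usually call for quantization or multiple accelerators.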
What Is a “Base Model”?
A base model is a model pretrained on a massive corpus before it is fine‑tuned for specific tasks, and the language mix of that corpus shapes its strengths. LLaMA, for example, is trained on English‑dominant data, and GPT‑4 is generally assumed to be as well, whereas Chinese models like Wenxin YiYan are trained on predominantly Chinese corpora, giving them an advantage on Chinese tasks.
Multimodal Extensions
Beyond pure text generation, many recent LLMs support multimodal inputs and outputs, enabling image‑to‑text, text‑to‑image, and other cross‑modal capabilities. Examples include GPT‑4V, Gemini Pro, and Tencent HunYuan.
Google T5 -> GPT-3 -> GLM-130B -> LLaMA -> GPT-4 -> Falcon -> GPT-4V
Commercial Licensing
ChatGLM, ChatGLM2 – commercial use allowed
LLaMA – research use only, not commercial; LLaMA 2 – commercial use allowed under Meta’s license terms
BLOOM, Baichuan, Falcon, Qwen, Aquila, Mistral, Gemma – commercial allowed
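Claude, GPT‑4, PaLM 2, Gemini – proprietary; weights are not released, so they cannot be self‑hosted commercially and are offered only through the vendors’ own APIs and products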
BERT, RoBERTa, T5 – commercial allowed
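The list above can be encoded as a small lookup table for screening candidate models. This is a minimal sketch that simply mirrors the article's list. It is not a legal reference, license terms change over time, and the names `COMMERCIAL_USE` and `commercially_usable` are hypothetical.

```python
# Commercial-use status as stated in the list above (True = allowed).
# Illustrative only; always check the current license text before shipping.
COMMERCIAL_USE = {
    "ChatGLM": True, "ChatGLM2": True,
    "LLaMA": False, "LLaMA 2": True,
    "BLOOM": True, "Baichuan": True, "Falcon": True, "Qwen": True,
    "Aquila": True, "Mistral": True, "Gemma": True,
    "Claude": False, "GPT-4": False, "PaLM 2": False, "Gemini": False,
    "BERT": True, "RoBERTa": True, "T5": True,
}


def commercially_usable(candidates):
    """Return the candidates marked as allowed, preserving input order.

    Names missing from the table are excluded, i.e. treated as not cleared.
    """
    return [name for name in candidates if COMMERCIAL_USE.get(name, False)]


shortlist = commercially_usable(["LLaMA", "LLaMA 2", "Qwen", "GPT-4", "SomeUnlistedModel"])
print(shortlist)  # ['LLaMA 2', 'Qwen']
```

Treating unknown names as not cleared is a deliberately conservative choice: a model should enter the shortlist only after its license has been checked.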
Ecosystem and Resources
Hugging Face serves as the “GitHub of AI,” hosting many of the open‑source models listed above. Developers can download, fine‑tune, or deploy these models directly from the platform.
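As an illustration of the download step, the sketch below loads a model from the Hugging Face Hub and generates a short continuation. It assumes the `transformers` and `torch` packages are installed and that network access is available on first use. The helper `generate_text` is a hypothetical name, and `gpt2` is used only because it is small; any causal language model ID from the Hub could be substituted.

```python
def generate_text(prompt, model_id="gpt2", max_new_tokens=20):
    """Download a causal LM from the Hugging Face Hub and continue a prompt.

    The model and tokenizer are cached locally after the first download.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate new tokens.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Large language models are")` returns the prompt followed by up to 20 generated tokens. Before deploying any model this way, check its license against the commercial‑use list in this article.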
JavaEdge
First‑line development experience at multiple leading tech firms; now a software architect at a Shanghai state‑owned enterprise and founder of Programming Yanxuan. Nearly 300k followers online; expertise in distributed system design, AIGC application development, and quantitative finance investing.
