
What’s the Latest Landscape of Large Language Models? Params, Capabilities, and Licenses

This article gives an overview of how large language models have evolved. It compares foreign and Chinese offerings, explains what parameter counts signify, covers multimodal extensions, and details which models are available for commercial use.

JavaEdge

Jun 22, 2024

Overview of Large Language Model Development

Many developers are familiar with popular foreign LLMs such as ChatGPT, but Chinese models receive less attention. This article compiles a side‑by‑side comparison of major LLMs released up to 2024, covering their parameter counts, key characteristics, release years, and commercial licensing.

Evolution Timeline

The Transformer architecture that underpins modern LLMs was introduced by Google in 2017. Subsequent milestones include GPT‑2 (1.5 B parameters, 2019), Google T5 (11 B, 2019), GPT‑3 (175 B, 2020) and its successor GPT‑3.5 (2022), Meta’s OPT (175 B, 2022), the LLaMA series (7 B‑65 B, 2023), GPT‑4 (2023; unofficially estimated at about 1.8 T, since OpenAI has not disclosed the figure), Falcon (2023), Gemini Pro (late 2023), and Claude 3 and GPT‑4o (2024).

Foreign vs. Domestic Models

Foreign models (selected): GPT‑2, Google T5, GPT‑3.5, Meta OPT, LLaMA, Vicuna‑13B, Falcon, Claude 1.3/2/3, PaLM 2, Mistral, Gemini, etc., released between 2019 and 2024, with most appearing in 2023‑2024.

Domestic models (selected): Baichuan Intelligent’s Baichuan (7 B, 2023), Wenxin YiYan (reportedly 260 B, 2023), Tongyi Qianwen (7 B‑72 B, 2023), ChatGLM series (6 B‑130 B, 2023‑2024), Tencent HunYuan (over 100 B, 2023), MOSS, Aquila, PolyLM, and others, released primarily in 2023‑2024.

Parameters and Model Capability

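Parameter count is a rough measure of a model’s size and computational demand: more parameters require more storage and more inference compute, and larger models tend to deliver stronger language understanding and generation, although training data and training method matter as well. For example, GPT‑2 has 1.5 B parameters, while GPT‑4 is unofficially estimated at about 1.8 T. Claude 3’s Opus variant is believed to exceed 100 B, though Anthropic has not published a figure.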
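To make the link between parameter count and storage concrete, here is a minimal back‑of‑the‑envelope sketch. It assumes that weight memory is simply the parameter count multiplied by the bytes used per parameter, and it ignores activations, the KV cache, and framework overhead. The function name `weight_memory_gb` and the precision table are illustrative assumptions, not something from the original article; the parameter counts are the ones quoted above.

```python
# Bytes needed to store one parameter at common numeric precisions (assumed values).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate memory in GB (1 GB = 1e9 bytes) needed just to hold the weights."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Parameter counts as quoted in this article.
models = {"GPT-2": 1.5e9, "Google T5": 11e9, "Meta OPT": 175e9}

for name, num_params in models.items():
    fp16_gb = weight_memory_gb(num_params, "fp16")
    int8_gb = weight_memory_gb(num_params, "int8")
    print(f"{name}: ~{fp16_gb:.1f} GB in fp16, ~{int8_gb:.1f} GB in int8")
```

Under these assumptions, GPT‑2 needs about 3 GB for its weights in fp16, while a 175 B model needs about 350 GB, which is why the largest models cannot be served from a single consumer GPU without quantization or sharding.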

What Is a “Base Model”?

A base model is a model pretrained on a massive corpus (often predominantly English) before it is fine‑tuned for specific tasks. Models such as LLaMA and GPT‑4 are trained on English‑dominant data, whereas Chinese models like Wenxin YiYan are trained on predominantly Chinese corpora, which tends to give them an advantage on Chinese tasks.

Multimodal Extensions

Beyond pure text generation, many recent LLMs support multimodal inputs and outputs, enabling image‑to‑text, text‑to‑image, and other cross‑modal capabilities. Examples include GPT‑4V, Gemini Pro, and Tencent HunYuan.

Google T5 -> GPT-3 -> GLM-130B -> LLaMA -> GPT-4 -> Falcon -> GPT-4V

Commercial Licensing

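ChatGLM, ChatGLM2 – commercial use allowed

LLaMA – non‑commercial (research license only); LLaMA 2 – commercial use allowed

BLOOM, Baichuan, Falcon, Qwen, Aquila, Mistral, Gemma – commercial use allowed

Claude, GPT‑4, PaLM 2, Gemini – proprietary; the weights are not released, and the models are available only through their vendors’ paid APIs and products

BERT, RoBERTa, T5 – commercial use allowed

Several of these licenses attach conditions to commercial use, so check the license text of the specific model version before building a product on it.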

Ecosystem and Resources

Hugging Face serves as the “GitHub of AI,” hosting many of the open‑source models listed above. Developers can download, fine‑tune, or deploy these models directly from the platform.
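As a minimal sketch of the download‑and‑run path described above, the following uses the Hugging Face `transformers` library with a PyTorch backend. The function name `generate_text`, the default model ID `gpt2` (the Hub ID for GPT‑2, chosen here only because it is small), and the generation length are illustrative assumptions, not part of the original article. The first call downloads the weights and therefore needs network access.

```python
def generate_text(prompt: str, model_id: str = "gpt2", max_new_tokens: int = 30) -> str:
    """Download a causal language model from the Hugging Face Hub and continue a prompt."""
    # Imported inside the function so this module can be loaded
    # even on machines where transformers is not installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Both calls fetch files from the Hub on first use and cache them locally.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence back into text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Large language models are")` returns the prompt followed by the model’s continuation. Passing a different `model_id` loads another model from the Hub, subject to that model’s license and hardware requirements.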



