Explore the Ultimate Open-Source LLM Catalog: Models, Tools, and Resources
This article compiles a comprehensive, up-to-date inventory of open-source large language models from Chinese and international organizations, detailing each model's architecture, parameter count, multilingual capabilities, deployment requirements, and associated tooling. It is intended as a practical reference for AI researchers and developers.
Open-Source Large Language Models List
A Large Language Model (LLM) is a massive neural-network-based natural-language-processing model that learns grammar and semantics from large corpora and can generate human-readable text. LLMs contain billions to trillions of parameters and can handle tasks such as text generation, classification, summarization, translation, and speech recognition.
This article provides a comprehensive overview of open‑source LLMs released by companies, research institutions, and communities worldwide.
Open-Source Chinese LLMs
ChatGLM-6B – Bilingual Dialogue Model
ChatGLM-6B is an open‑source bilingual (Chinese‑English) dialogue model based on the General Language Model (GLM) architecture with 6.2 B parameters. Quantization (INT4) allows local deployment on consumer‑grade GPUs with as little as 6 GB VRAM.
The model was trained on ~1 T tokens of bilingual data and fine‑tuned with supervised learning, reinforcement learning from human feedback, and other techniques, achieving human‑like responses despite its smaller size.
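For reference, below is a minimal local-deployment sketch following the usage published in the ChatGLM-6B repository (the chat() convenience method and on-the-fly quantize(4) call are documented there; exact memory figures may vary by release):

```python
# Minimal ChatGLM-6B INT4 deployment sketch, per the project's published usage.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
# quantize(4) compresses weights to INT4 so the model fits in roughly 6 GB of VRAM
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True) \
    .quantize(4).half().cuda()
model = model.eval()

# chat() keeps the running dialogue in `history` for multi-turn conversations
response, history = model.chat(tokenizer, "Hello, please introduce yourself", history=[])
print(response)
```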
ChatGLM2-6B – Second Generation
ChatGLM2-6B upgrades the base model with longer context, more efficient inference, and an open‑source license.
VisualGLM-6B – Multimodal Dialogue Model
VisualGLM-6B supports image, Chinese, and English inputs. It builds on ChatGLM-6B (6.2 B parameters) and integrates a BLIP2‑Qformer visual encoder, resulting in a 7.8 B‑parameter multimodal model.
MOSS – Bilingual Dialogue Model
MOSS is a bilingual (Chinese‑English) dialogue model with 16 B parameters (moss‑moon series). It runs on a single A100/A800 or two 3090 GPUs in FP16, or on a single 3090 GPU in INT4/8.
The base model was pre-trained on ~700 B tokens of Chinese, English, and code data, then instruction-tuned, plugin-enhanced, and aligned with human preferences for multi-turn dialogue.
DB‑GPT – Database‑Centric LLM
DB‑GPT is an open‑source GPT‑style model focused on database interaction, offering 100 % privacy and security with local deployment.
It provides a complete private LLM solution for database‑driven scenarios, supporting isolated deployment per business module.
CPM‑Bee – Bilingual Large Model
CPM-Bee is a fully open-source, commercially usable 10 B-parameter Chinese-English base model built on a Transformer autoregressive architecture and trained on trillion-scale high-quality corpora.
Open-source and commercial: OpenBMB allows commercial use after enterprise verification.
Excellent bilingual performance: Strong results on bilingual benchmarks.
Massive high-quality data: Trained on a trillion-scale corpus with rigorous cleaning.
OpenBMB ecosystem support: Provides tools for pre-training, adaptation, compression, deployment, and more.
Powerful dialogue and tool usage: Fine-tuned instances exhibit strong conversational and tool-use abilities.
CPM‑Bee excels at semantic understanding, text generation, translation, QA, scoring, and multiple‑choice tasks.
LaWGPT – Legal Knowledge LLM
LaWGPT series are open‑source LLMs specialized in Chinese legal knowledge, built on base models such as Chinese‑LLaMA or ChatGLM, extended with legal vocabularies and large‑scale legal corpora, then instruction‑tuned on legal QA and judicial exam data.
Linly – Large‑Scale Chinese LLM
Linly offers several models: Linly‑Chinese‑LLaMA (7 B, 13 B, 33 B, with 65 B in training), Linly‑ChatFlow (7 B, 13 B), and a 4‑bit quantized ChatFlow version for CPU inference. The project emphasizes reproducibility, open data, and compatibility with CUDA and CPU.
Chinese‑Vicuna – LLaMA‑Based Chinese Model
Chinese-Vicuna is a low-resource LLaMA + LoRA adaptation for Chinese (a generic LoRA fine-tuning sketch follows the list below). The project provides:
Finetune model code
Inference code
CPU‑only C++ inference
Tools for downloading/converting/quantizing Facebook LLaMA checkpoints
Other applications
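To make the LLaMA + LoRA recipe concrete, here is a minimal sketch using Hugging Face PEFT; the checkpoint path and hyperparameters are illustrative assumptions, not the project's actual training script:

```python
# Generic LLaMA + LoRA adaptation sketch with Hugging Face PEFT.
# Checkpoint path and hyperparameters are illustrative only.
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model

base = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")  # example checkpoint

config = LoraConfig(
    r=8,                                  # low-rank adapter dimension
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attach adapters to attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of the 7B weights train
```

Because only the small adapter matrices receive gradients, fine-tuning fits on a single consumer GPU, which is exactly the low-resource setting the project targets.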
Chinese‑LLaMA‑Alpaca – Chinese LLaMA & Alpaca
This project provides Chinese‑LLaMA base models and instruction‑tuned Alpaca variants, extending LLaMA’s vocabulary with Chinese tokens and improving instruction following.
ChatYuan – Dialogue Model
ChatYuan is a bilingual (Chinese‑English) functional dialogue model. The large‑v2 version incorporates optimized fine‑tuning data, human‑feedback RL, and chain‑of‑thought reasoning, running on consumer GPUs, PCs, or phones (INT4 requires only 400 MB VRAM).
HuatuoGPT – Open‑Source Chinese Medical Model
HuatuoGPT combines distilled ChatGPT data with real doctor responses to create a medical assistant capable of accurate diagnosis and rich interaction.
BenTsao – Medical LLaMA Fine‑Tuned Model
BenTsao (formerly HuaTuo) is a LLaMA‑based model fine‑tuned on Chinese medical instruction data generated via knowledge graphs and GPT‑3.5, improving medical QA performance.
Pengcheng PanGu‑α – Chinese Pre‑Training Model
PanGu-α is the industry's first 200 B-parameter Chinese-centric pre-training model, released in standard and enhanced versions, supporting both NPU and GPU, and excelling in knowledge QA, retrieval, reasoning, and reading comprehension.
Core modules include:
Dataset: ~80 TB raw text, ~1.1 TB high‑quality Chinese corpus, plus 53 multilingual datasets (≈2 TB).
Base module: Provides pre‑trained models such as PanGu‑α and its enhanced variant.
Application layer: Supports multilingual translation, open‑domain dialogue, model compression, framework migration, and continual learning.
PanGu‑Dialog – Dialogue Generation Model
PanGu‑Dialog is a large‑scale open‑domain dialogue model that emphasizes logical reasoning, data calculation, association, and creation abilities, achieving SOTA performance among Chinese pure‑generation models.
Wudao – Bilingual Multimodal Model
Wudao is a 1.75 T‑parameter bilingual multimodal pre‑training model with seven open‑source variants.
Image‑Text Models
CogView: 4 B-parameter model that generates images from text, surpassing DALL·E on MS COCO.
BriVL: Chinese-centric vision-language model excelling in image-text retrieval.
Text Models
GLM: English-centric series achieving state-of-the-art results on understanding and generation tasks.
CPM: Chinese and bilingual models ranging from 2.6 B to 198 B parameters.
Transformer-XL: 2.9 B-parameter Chinese generation model for article writing, poetry, summarization, etc.
EVA: 2.8 B-parameter Chinese dialogue model trained on 1.4 B Chinese dialogue exchanges.
Lawformer: First Chinese legal long-text pre-training model (100 M parameters).
Protein Models
ProtTrans: Largest protein pre-training model (3 B parameters).
BBT‑2 – 120 B‑Parameter General LLM
BBT‑2 serves as a base for specialized models in code, finance, and text‑to‑image generation.
BBT‑2‑12B‑Text – Chinese base model
BBT‑2.5‑13B‑Text – Chinese‑English bilingual base model
BBT‑2‑12B‑TC‑001‑SFT – Code model fine‑tuned for dialogue
BBT‑2‑12B‑TF‑001 – Finance model
BBT‑2‑12B‑Fig – Text‑to‑image model
BBT‑2‑12B‑Science – Scientific paper model
BELLE – Open‑Source Chinese Dialogue Model
BELLE aims to promote the Chinese dialogue LLM community, built on open‑source bases such as BLOOM and fine‑tuned with ChatGPT‑generated instruction data.
TigerBot – Multilingual Multitask LLM
TigerBot is a multilingual, multitask LLM, reportedly achieving 96 % of comparable OpenAI models' performance in public benchmark evaluations.
YuLan‑Chat – Bilingual Dialogue Model
Developed by Renmin University’s AI Institute, YuLan‑Chat explores instruction‑tuning for Chinese‑English dialogue.
BayLing – English/Chinese Model with Enhanced Alignment
BayLing achieves ~90 % of ChatGPT’s performance on multiple benchmarks.
Open‑Source LLMs
Qwen‑7B – Transformer‑Based Model
Qwen-7B (7 B parameters) from Alibaba Cloud is pre-trained on more than 2.2 T tokens, supports an 8K context window, and offers plugin-calling capabilities.
Code Llama – Code Generation Model
Based on Llama 2, Code Llama offers three variants (base, Python‑optimized, and Instruct) for multi‑language code generation.
CodeFuse‑13B – Code LLM
CodeFuse-13B is trained on 1 000 B tokens of code, Chinese, and English data, covers 40+ programming languages, and achieves 37.1 % Pass@1 on HumanEval.
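For context, Pass@1 is usually computed with the unbiased pass@k estimator introduced in the Codex paper; a short sketch:

```python
# Unbiased pass@k estimator from the Codex paper:
# n = samples generated per problem, c = samples that pass the unit tests.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # too few failures for any size-k draw to miss every pass
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# e.g. 10 samples per problem, 4 passing: chance a single draw (k=1) passes
print(pass_at_k(10, 4, 1))  # 0.4
```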
MiLM‑6B – Xiaomi AI Model
MiLM‑6B (6.4 B parameters) achieves top performance on C‑Eval and CMMLU STEM subjects.
LLaMA – Meta LLM
LLaMA series ranges from 7 B to 65 B parameters, offering competitive performance with smaller models running on consumer hardware.
Stanford Alpaca – Instruction‑Tuned LLaMA
Alpaca fine‑tunes LLaMA 7B using 52 K instruction‑following samples generated by text‑davinci‑003.
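The 52 K samples follow a fixed prompt template; the variant below (for samples that include an input field) is taken from the project's repository:

```python
# Stanford Alpaca's instruction prompt template (input-bearing variant),
# as documented in the project's repository.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:"
)

print(PROMPT_WITH_INPUT.format(
    instruction="Translate the sentence to French.",
    input="Good morning",
))
```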
Lit-LLaMA – nanoGPT-Based LLaMA Implementation
Lit-LLaMA is an independent, single-file reimplementation of the LLaMA architecture built on nanoGPT, supporting quantization, LoRA fine-tuning, flash attention, and more.
GloVe – Word Vector Tool
GloVe provides pre‑trained word embeddings for various corpora (Wikipedia, Common Crawl, Twitter) in multiple dimensions.
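Because GloVe ships as plain-text files (one word and its vector per line), loading it takes only a few lines; a minimal sketch, assuming the published glove.6B.100d.txt download:

```python
# Load pre-trained GloVe vectors from the plain-text distribution format.
import numpy as np

embeddings = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip().split(" ")
        embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantically related words end up with high cosine similarity
print(cosine(embeddings["king"], embeddings["queen"]))
```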
Dolly – Low‑Cost LLM
Dolly adapts EleutherAI's 6 B-parameter GPT-J model with instruction-following capabilities at low training cost.
OPT‑175B – Meta Open‑Source LLM
OPT‑175B (175 B parameters) matches GPT‑3 in size and is freely available for non‑commercial research.
Cerebras‑GPT – NLP LLM
Cerebras‑GPT offers models from 111 M to 13 B parameters, fully open‑source.
BLOOM – Multilingual LLM
BLOOM (176 B parameters) supports 46 languages and 13 programming languages, freely downloadable from Hugging Face.
BLOOMChat – Multilingual Chat LLM
BLOOMChat is a 176 B-parameter open-source chat model fine-tuned on instruction data from OpenChatKit, Dolly 2.0, and OASST1.
GPT‑J – NLP Model
GPT-J (6 B parameters) was trained on an 800 GB dataset and delivers performance comparable to similarly sized GPT-3-class models.
GPT‑2 – Transformer Model
GPT-2 (1.5 B parameters) was trained on 8 M webpages and is capable of translation, QA, summarization, and text generation.
RWKV‑LM – Linear Transformer
RWKV combines RNN and Transformer ideas for fast, memory‑efficient long‑text modeling.
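To illustrate the idea, here is a simplified numpy sketch of RWKV's "WKV" time-mixing recurrence (v4-style, ignoring the numerical-stability rewriting the real kernels use): attention-like weights are maintained as a running exponential state, so inference needs O(1) memory per token, like an RNN:

```python
# Simplified RWKV "WKV" recurrence: w = per-channel decay, u = bonus for the
# current token; k, v are (T, C) key/value projections. Illustrative only.
import numpy as np

def wkv(k, v, w, u):
    T, C = k.shape
    out = np.empty((T, C))
    num = np.zeros(C)  # running exp-weighted sum of past values
    den = np.zeros(C)  # running sum of the same weights
    for t in range(T):
        e_now = np.exp(u + k[t])                      # current token gets the u bonus
        out[t] = (num + e_now * v[t]) / (den + e_now)
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]  # decay past, add current
        den = np.exp(-w) * den + np.exp(k[t])
    return out

rng = np.random.default_rng(0)
y = wkv(rng.standard_normal((8, 4)), rng.standard_normal((8, 4)),
        w=np.ones(4), u=np.zeros(4))
print(y.shape)  # (8, 4): one output per token from constant-size state
```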
Baize – LoRA-Trained LLM
Baize fine-tunes LLaMA with LoRA for English dialogue, offering 7 B, 13 B, and 30 B variants.
CodeGeeX – Multilingual Code Generation
CodeGeeX (13 B parameters) supports Python, C++, Java, JavaScript, Go, achieving 47‑60 % solve rate on HumanEval‑X.
Falcon LLM – Open‑Source Model
Falcon (up to 40 B parameters) from TII outperforms LLaMA on many benchmarks.
Vicuna – LLaMA‑Fine‑Tuned Model
Vicuna (7 B, 13 B) fine‑tuned by academic teams, reaching >90 % of ChatGPT quality on GPT‑4 evaluation.
RedPajama – 1.2 T Token Dataset
RedPajama replicates LLaMA’s training data (>1.2 T tokens) and provides pre‑training, base, and instruction‑tuning resources.
OpenAssistant – Dialogue LLM
OpenAssistant offers free AI chatbots trained on 600 K multi‑topic conversations, releasing LLaMA‑13B and 30 B instruction‑tuned models.
StableLM – Stability AI Model
StableLM‑alpha series includes 3 B and 7 B models, with larger 15 B and 30 B versions in development.
StarCoder – AI Programming Model
StarCoder (15 B parameters) targets code generation to compete with GitHub Copilot and Amazon CodeWhisperer.
SantaCoder – Lightweight Programming Model
SantaCoder (1.1 B parameters) supports Python, Java, and JavaScript code generation.
MLC LLM – Local LLM Solution
MLC LLM enables deployment of any LLM on various hardware back‑ends and local applications.
Web LLM – Browser‑Based LLM
Web LLM runs large models entirely in the browser using WebGPU, offering privacy‑first AI assistants.
WizardLM – Fine‑Tuned LLaMA
WizardLM (7 B) is fine‑tuned with Evol‑Instruct, a method that generates diverse difficulty instructions using LLMs.
YaLM 100B – 100 B‑Parameter Model
YaLM 100B is a GPT-style model trained on 1.7 TB of English and Russian text across 800 A100 GPUs.
OpenLLaMA – LLaMA Reimplementation
OpenLLaMA reproduces Meta’s LLaMA under a permissive license, providing 7 B weights and training scripts.
LLM‑Related Tools
OpenLLM – Open Platform for Operating LLMs
OpenLLM offers production‑grade fine‑tuning, serving, deployment, and monitoring for any open‑source LLM.
LangChain – Building LLM Applications
LangChain provides prompts, LLM wrappers, document loaders, utilities, chains, indexes, agents, memory, and chat interfaces to integrate LLMs with external tools.
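A minimal sketch of the prompt-plus-chain pattern, assuming LangChain's classic LLMChain API (0.0.x releases) and an OpenAI key in the environment:

```python
# Classic LangChain pattern: a prompt template wired to an LLM via a chain.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["product"],
    template="Suggest a good name for a company that makes {product}.",
)
chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=prompt)

# run() fills the template, calls the model, and returns the completion
print(chain.run("open-source LLM tooling"))
```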
JARVIS – Collaborative System for LLMs and AI Models
JARVIS uses an LLM as a controller to orchestrate multiple AI models from Hugging Face across four stages: task planning, model selection, execution, and response generation.
Semantic Kernel – SDK for Integrating LLMs
Semantic Kernel is a lightweight SDK that blends natural‑language semantics, traditional code, and embedding‑based memory for AI‑enhanced applications.
LMFlow – Scalable LLM Toolkit
LMFlow provides an open research platform for efficient LLM training, supporting low‑resource experiments and custom data utilization.
xturing – LLM Personalization Fine‑Tuning Tool
xturing enables fast, efficient LoRA‑based fine‑tuning of models such as LLaMA, GPT‑J, GPT‑2, OPT, and Cerebras‑GPT on single or multiple GPUs.
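A minimal sketch following the quickstart published in the xturing repository (the dataset path is an illustrative assumption):

```python
# xturing quickstart-style LoRA fine-tune; dataset path is illustrative.
from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

dataset = InstructionDataset("./alpaca_data")  # instruction/input/output records
model = BaseModel.create("llama_lora")         # LLaMA with LoRA adapters
model.finetune(dataset=dataset)                # single- or multi-GPU fine-tuning

output = model.generate(texts=["Why are open-source LLMs important?"])
print(output)
```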
Dify – LLMOps Platform
Dify offers visual workflow composition, API‑first services, data annotation, and supports GPT‑3, GPT‑3.5‑Turbo, and GPT‑4.
Flowise – Visual LLM App Builder
Flowise is an open‑source UI for constructing custom LLM pipelines using LangChainJS.
Jigsaw – Tool for Improving LLM Performance
Jigsaw (Microsoft) post‑processes LLM outputs with syntax/semantic analysis and user feedback, targeting code synthesis for Pandas APIs.
GPTCache – Semantic Cache for LLM Queries
GPTCache stores LLM responses semantically, reducing API costs by up to 10× and speeding up inference by 100×.
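A minimal sketch following GPTCache's published quickstart: the adapter wraps the OpenAI client, so repeated (or semantically similar) queries are served from the local cache instead of the paid API:

```python
# GPTCache quickstart-style usage: cache OpenAI chat completions locally.
from gptcache import cache
from gptcache.adapter import openai  # drop-in wrapper around the openai client

cache.init()            # defaults to exact-match caching; semantic matching is configurable
cache.set_openai_key()  # reads OPENAI_API_KEY from the environment

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is GPTCache?"}],
)
print(response["choices"][0]["message"]["content"])
```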
WenDa – LLM Invocation Platform
WenDa supports chatGLM‑6B, chatRWKV, and chatYuan with knowledge‑base search, parameter tuning, streaming output, and multi‑user deployment.
MindFormers – Full‑Process LLM Development Suite
MindFormers provides training, inference, and deployment pipelines for models such as BERT, GPT, OPT, T5, MAE, SimMIM, CLIP, FILIP, ViT, and Swin.
Code as Policies – Natural‑Language Code Generation
Code as Policies extends PaLM‑SayCan to generate full Python programs for robot tasks, outperforming direct language‑only approaches.
Colossal‑AI – Large‑Model Parallel Training System
Colossal‑AI integrates data, pipeline, tensor, and sequence parallelism to simplify distributed training of massive models.
BentoML – Unified Model Deployment Framework
BentoML streamlines the lifecycle of AI services, supporting PyTorch, TensorFlow, JAX, XGBoost, Hugging Face, and open‑source LLMs.
NSQL – Open‑Source SQL Generation Model
NSQL series (350 M, 2 B, 6 B) target SQL generation tasks with a unified serving API.
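A minimal prompting sketch, following the schema-plus-question format shown on the models' Hugging Face cards (table schema, the question as a SQL comment, and a trailing SELECT for the model to complete):

```python
# NSQL prompting sketch: the model completes the trailing SELECT.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NumbersStation/nsql-350M")
model = AutoModelForCausalLM.from_pretrained("NumbersStation/nsql-350M")

prompt = """CREATE TABLE stadium (id number, name text, capacity number)

-- Using valid SQLite, answer the following question for the table provided above.

-- What is the maximum stadium capacity?

SELECT"""

ids = tokenizer(prompt, return_tensors="pt").input_ids
out = model.generate(ids, max_length=200)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```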