Explore the Ultimate Open-Source LLM Catalog: Models, Tools, and Resources

This article compiles a comprehensive, up‑to‑date inventory of open‑source large language models from Chinese and international organizations, detailing each model’s architecture, parameter count, multilingual capabilities, deployment requirements, and associated tools, offering a valuable reference for AI researchers and developers.


Open-Source Large Language Models List

A large language model (LLM) is a massive neural-network-based natural-language-processing model that learns grammar and semantics from large corpora and can generate human-readable text. LLMs have billions to trillions of parameters and handle tasks such as text generation, classification, summarization, translation, and speech recognition.

This article provides a comprehensive overview of open‑source LLMs released by companies, research institutions, and communities worldwide.

Open-Source Chinese LLMs

ChatGLM-6B – Bilingual Dialogue Model

ChatGLM-6B is an open‑source bilingual (Chinese‑English) dialogue model based on the General Language Model (GLM) architecture with 6.2 B parameters. Quantization (INT4) allows local deployment on consumer‑grade GPUs with as little as 6 GB VRAM.

The model was trained on ~1 T tokens of bilingual data and fine‑tuned with supervised learning, reinforcement learning from human feedback, and other techniques, achieving human‑like responses despite its smaller size.
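Why INT4 makes a 6 GB card sufficient can be seen with back-of-envelope arithmetic over weight storage alone. This is a rough sketch that ignores activations and the KV cache, so actual usage is somewhat higher:

```python
def model_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate weight-storage footprint in GiB.

    Counts only the parameters themselves; activations and KV cache
    add further overhead at inference time.
    """
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / (1024 ** 3)

fp16 = model_memory_gb(6.2e9, 16)  # ~11.5 GiB: does not fit a 6 GB card
int4 = model_memory_gb(6.2e9, 4)   # ~2.9 GiB: leaves headroom on 6 GB
```

The factor-of-four reduction from 16-bit to 4-bit weights is what moves the model from datacenter GPUs into consumer-card territory.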

ChatGLM2-6B – Second Generation

ChatGLM2-6B upgrades the base model with longer context, more efficient inference, and an open‑source license.

VisualGLM-6B – Multimodal Dialogue Model

VisualGLM-6B supports image, Chinese, and English inputs. It builds on ChatGLM-6B (6.2 B parameters) and integrates a BLIP2‑Qformer visual encoder, resulting in a 7.8 B‑parameter multimodal model.

MOSS – Bilingual Dialogue Model

MOSS is a bilingual (Chinese‑English) dialogue model with 16 B parameters (moss‑moon series). It runs on a single A100/A800 or two 3090 GPUs in FP16, or on a single 3090 GPU in INT4/8.

The base model was pre‑trained on ~700 B tokens of Chinese, English, and code data, then instruction‑tuned, plugin‑enhanced, and aligned with human preferences for multi‑turn dialogue.

DB‑GPT – Database‑Centric LLM

DB‑GPT is an open‑source GPT‑style project focused on database interaction; it runs entirely on local infrastructure, so data never leaves the deployment environment.

It provides a complete private LLM solution for database‑driven scenarios, supporting isolated deployment per business module.

CPM‑Bee – Bilingual Large Model

CPM‑Bee is a fully open‑source, commercially usable 10 B‑parameter Chinese‑English base model built on a Transformer autoregressive architecture and trained on a trillion‑token‑scale high‑quality corpus.

Open‑source and commercial: OpenBMB allows commercial use after enterprise verification.

Excellent bilingual performance: Strong results on bilingual benchmarks.

Massive high‑quality data: Trained on a trillion‑scale corpus with rigorous cleaning.

OpenBMB ecosystem support: Provides tools for pre‑training, adaptation, compression, deployment, and more.

Powerful dialogue and tool usage: Fine‑tuned instances exhibit strong conversational and tool‑use abilities.

CPM‑Bee excels at semantic understanding, text generation, translation, QA, scoring, and multiple‑choice tasks.

LaWGPT – Legal Knowledge LLM

LaWGPT series are open‑source LLMs specialized in Chinese legal knowledge, built on base models such as Chinese‑LLaMA or ChatGLM, extended with legal vocabularies and large‑scale legal corpora, then instruction‑tuned on legal QA and judicial exam data.

Linly – Large‑Scale Chinese LLM

Linly offers several models: Linly‑Chinese‑LLaMA (7 B, 13 B, 33 B, with 65 B in training), Linly‑ChatFlow (7 B, 13 B), and a 4‑bit quantized ChatFlow version for CPU inference. The project emphasizes reproducibility, open data, and compatibility with CUDA and CPU.

Chinese‑Vicuna – LLaMA‑Based Chinese Model

Chinese‑Vicuna is a low‑resource LLaMA + LoRA adaptation for Chinese.

Finetune model code

Inference code

CPU‑only C++ inference

Tools for downloading/converting/quantizing Facebook LLaMA checkpoints

Other applications
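The LoRA technique behind this adaptation freezes the base weights and trains a low-rank update, which can be merged back after training. A minimal, dependency-free sketch of the merge step; the matrix sizes and values here are illustrative, not taken from the project:

```python
def matmul(a, b):
    """Plain nested-list matrix multiply (no external dependencies)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_merge(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A, the merged LoRA weight.

    W: d_out x d_in frozen weight; B: d_out x r and A: r x d_in are the
    trainable low-rank factors; alpha / r is the standard LoRA scaling.
    """
    scale = alpha / r
    delta = matmul(B, A)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Tiny example: rank-1 update on a 2x2 weight.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d_out x r
A = [[0.5, 0.5]]     # r x d_in
merged = lora_merge(W, A, B, alpha=1.0, r=1)
# merged == [[1.5, 0.5], [1.0, 2.0]]
```

Because only A and B (rank r, far smaller than the full weight) receive gradients, fine-tuning fits in much less GPU memory than full-parameter training.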

Chinese‑LLaMA‑Alpaca – Chinese LLaMA & Alpaca

This project provides Chinese‑LLaMA base models and instruction‑tuned Alpaca variants, extending LLaMA’s vocabulary with Chinese tokens and improving instruction following.

ChatYuan – Dialogue Model

ChatYuan is a bilingual (Chinese‑English) functional dialogue model. The large‑v2 version incorporates optimized fine‑tuning data, human‑feedback RL, and chain‑of‑thought reasoning, running on consumer GPUs, PCs, or phones (INT4 requires only 400 MB VRAM).

HuatuoGPT – Open‑Source Chinese Medical Model

HuatuoGPT combines distilled ChatGPT data with real doctor responses to create a medical assistant capable of accurate diagnosis and rich interaction.

BenTsao – Medical LLaMA Fine‑Tuned Model

BenTsao (formerly HuaTuo) is a LLaMA‑based model fine‑tuned on Chinese medical instruction data generated via knowledge graphs and GPT‑3.5, improving medical QA performance.

Pengcheng PanGu‑α – Chinese Pre‑Training Model

PanGu‑α is the industry’s first 200 B‑parameter Chinese‑centric pre‑training model, released in standard and enhanced versions, supporting NPU and GPU, excelling in knowledge QA, retrieval, reasoning, and reading‑comprehension.

Core modules include:

Dataset: ~80 TB raw text, ~1.1 TB high‑quality Chinese corpus, plus 53 multilingual datasets (≈2 TB).

Base module: Provides pre‑trained models such as PanGu‑α and its enhanced variant.

Application layer: Supports multilingual translation, open‑domain dialogue, model compression, framework migration, and continual learning.

PanGu‑Dialog – Dialogue Generation Model

PanGu‑Dialog is a large‑scale open‑domain dialogue model that emphasizes logical reasoning, data calculation, association, and creation abilities, achieving SOTA performance among Chinese pure‑generation models.

Wudao – Bilingual Multimodal Model

Wudao is a 1.75 T‑parameter bilingual multimodal pre‑training model with seven open‑source variants.

Image‑Text Models

CogView: 4 B-parameter model that generates images from text, surpassing DALL·E on MS COCO.

BriVL: Chinese-centric vision-language model excelling in image-text retrieval.

Text Models

GLM: English-centric series achieving state-of-the-art results on understanding and generation tasks.

CPM: Chinese and bilingual models ranging from 2.6 B to 198 B parameters.

Transformer-XL: 2.9 B-parameter Chinese generation model for article writing, poetry, summarization, etc.

EVA: 2.8 B-parameter Chinese dialogue model trained on 1.4 B Chinese dialogue samples.

Lawformer: First Chinese legal long-text pre-training model (100 M parameters).

Protein Models

ProtTrans: 3 B-parameter protein-sequence pre-training model, the largest released in China at the time.

BBT‑2 – 12 B‑Parameter General LLM

BBT‑2 serves as a base for specialized models in code, finance, and text‑to‑image generation.

BBT‑2‑12B‑Text – Chinese base model

BBT‑2.5‑13B‑Text – Chinese‑English bilingual base model

BBT‑2‑12B‑TC‑001‑SFT – Code model fine‑tuned for dialogue

BBT‑2‑12B‑TF‑001 – Finance model

BBT‑2‑12B‑Fig – Text‑to‑image model

BBT‑2‑12B‑Science – Scientific paper model

BELLE – Open‑Source Chinese Dialogue Model

BELLE aims to promote the Chinese dialogue LLM community, built on open‑source bases such as BLOOM and fine‑tuned with ChatGPT‑generated instruction data.

TigerBot – Multimodal LLM

TigerBot is a multilingual, multitask LLM that, by its developers' evaluation, reaches 96 % of the performance of comparable OpenAI models on public NLP benchmarks.

YuLan‑Chat – Bilingual Dialogue Model

Developed by Renmin University’s AI Institute, YuLan‑Chat explores instruction‑tuning for Chinese‑English dialogue.

BayLing – English/Chinese Model with Enhanced Alignment

BayLing achieves ~90 % of ChatGPT’s performance on multiple benchmarks.

Open‑Source LLMs

Qwen‑7B – Transformer‑Based Model

Qwen‑7B (7 B parameters) from Alibaba Cloud is pre‑trained on more than 2.2 T tokens, supporting 8K context and plugin calls.

Code Llama – Code Generation Model

Based on Llama 2, Code Llama offers three variants (base, Python‑optimized, and Instruct) for multi‑language code generation.

CodeFuse‑13B – Code LLM

Trained on ~1 T tokens of code, Chinese, and English data covering 40+ programming languages, achieving 37.1 % Pass@1 on HumanEval.
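Pass@1 figures like this are usually computed with the unbiased pass@k estimator from the HumanEval (Codex) paper; whether CodeFuse's harness uses exactly this formula is an assumption, but the estimator itself is the field's standard:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    where n generated samples for a problem contain c correct ones."""
    if n - c < k:
        return 1.0  # not enough wrong samples to fill k draws
    return 1.0 - comb(n - c, k) / comb(n, k)

p_single = pass_at_k(1, 1, 1)   # 1.0: the only sample was correct
p_est = pass_at_k(10, 3, 1)     # ≈ 0.3: expected success of one draw
```

Averaging this estimate over all benchmark problems yields the reported percentage.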

MiLM‑6B – Xiaomi AI Model

MiLM‑6B (6.4 B parameters) achieves top performance on C‑Eval and CMMLU STEM subjects.

LLaMA – Meta LLM

LLaMA series ranges from 7 B to 65 B parameters, offering competitive performance with smaller models running on consumer hardware.

Stanford Alpaca – Instruction‑Tuned LLaMA

Alpaca fine‑tunes LLaMA 7B using 52 K instruction‑following samples generated by text‑davinci‑003.
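Each of the 52 K samples is a JSON record with instruction, optional input, and output fields, rendered into a fixed prompt template at training time. A sketch of that flow; the record is invented and the template wording paraphrases Alpaca's:

```python
# One record in the style of Alpaca's instruction dataset (invented example).
record = {
    "instruction": "Classify the sentiment of the sentence.",
    "input": "The movie was a delightful surprise.",
    "output": "positive",
}

# Prompt template in the spirit of Alpaca's; wording here is illustrative.
TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

prompt = TEMPLATE.format(**record)  # the model learns to emit record["output"]
```

During fine-tuning, the loss is taken on the tokens after "### Response:", teaching the base model to complete the template with the target output.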

Lit‑LLaMA – nanoGPT‑Based Implementation

An independent, single‑file LLaMA implementation built on nanoGPT, supporting quantization, LoRA fine‑tuning, flash attention, and more.

GloVe – Word Vector Tool

GloVe provides pre‑trained word embeddings for various corpora (Wikipedia, Common Crawl, Twitter) in multiple dimensions.
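The downloadable files use a simple plain-text format: one word per line followed by its vector components. A small parser plus cosine similarity, using invented 3-dimensional vectors in place of a real glove.*.txt file:

```python
def parse_glove_lines(lines):
    """Parse GloVe's text format: `word v1 v2 ... vd` per line."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm

# Toy 3-dimensional vectors standing in for a downloaded embedding file;
# real GloVe vectors have 50-300 dimensions.
sample = ["king 0.5 0.7 0.1", "queen 0.5 0.6 0.2", "banana 0.9 0.1 0.8"]
vecs = parse_glove_lines(sample)
sim = cosine(vecs["king"], vecs["queen"])  # higher than king vs. banana
```

The same parser works on the real files; only the dimensionality and vocabulary size change.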

Dolly – Low‑Cost LLM

Dolly adapts EleutherAI’s 6 B‑parameter model with instruction‑following capabilities.

OPT‑175B – Meta Open‑Source LLM

OPT‑175B (175 B parameters) matches GPT‑3 in size and is freely available for non‑commercial research.

Cerebras‑GPT – NLP LLM

Cerebras‑GPT offers models from 111 M to 13 B parameters, fully open‑source.

BLOOM – Multilingual LLM

BLOOM (176 B parameters) supports 46 languages and 13 programming languages, freely downloadable from Hugging Face.

BLOOMChat – Multilingual Chat LLM

BLOOMChat is a 176 B open‑source chat model fine‑tuned on OpenChatKit, Dolly 2.0, and OASST1.

GPT‑J – NLP Model

GPT‑J (6 B parameters) trained on an 800 GB dataset, comparable to GPT‑3‑style performance.

GPT‑2 – Transformer Model

GPT‑2 (1.5 B parameters) trained on 8 M webpages, capable of translation, QA, summarization, and text generation.

RWKV‑LM – Linear Transformer

RWKV combines RNN and Transformer ideas for fast, memory‑efficient long‑text modeling.
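The core idea, keeping a fixed-size recurrent state instead of attending over the whole history, can be caricatured as a decayed, key-weighted running average of values. This toy recurrence deliberately simplifies RWKV's actual WKV computation, which adds a current-token bonus term and a numerically stable log-space formulation:

```python
import math

def wkv_toy(keys, values, decay):
    """Toy, simplified WKV-style recurrence: an exponentially decayed,
    key-weighted running average of values, carried in O(1) state.
    Not RWKV's real kernel, just the shape of the idea."""
    num = 0.0  # decayed running sum of e^{k_i} * v_i
    den = 0.0  # decayed running sum of e^{k_i}
    out = []
    for k, v in zip(keys, values):
        num = num * decay + math.exp(k) * v
        den = den * decay + math.exp(k)
        out.append(num / den)  # convex combination of values seen so far
    return out

out = wkv_toy([0.0, 1.0, -1.0], [1.0, 2.0, 3.0], decay=0.9)
```

Because the state is two scalars per channel rather than a growing attention cache, time and memory per token stay constant regardless of sequence length.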

Baize – LoRA‑Trained LLM

Baize fine‑tunes LLaMA with LoRA for English dialogue, offering 7 B, 13 B, and 30 B variants.

CodeGeeX – Multilingual Code Generation

CodeGeeX (13 B parameters) supports Python, C++, Java, JavaScript, Go, achieving 47‑60 % solve rate on HumanEval‑X.

Falcon LLM – Open‑Source Model

Falcon (up to 40 B parameters) from TII outperforms LLaMA on many benchmarks.

Vicuna – LLaMA‑Fine‑Tuned Model

Vicuna (7 B, 13 B) fine‑tuned by academic teams, reaching >90 % of ChatGPT quality on GPT‑4 evaluation.

RedPajama – 1.2 T Token Dataset

RedPajama replicates LLaMA’s training data (>1.2 T tokens) and provides pre‑training, base, and instruction‑tuning resources.

OpenAssistant – Dialogue LLM

OpenAssistant offers free AI chatbots trained on 600 K multi‑topic conversations, releasing LLaMA‑13B and 30 B instruction‑tuned models.

StableLM – Stability AI Model

StableLM‑alpha series includes 3 B and 7 B models, with larger 15 B and 30 B versions in development.

StarCoder – AI Programming Model

StarCoder (15 B parameters) targets code generation to compete with GitHub Copilot and Amazon CodeWhisperer.

SantaCoder – Lightweight Programming Model

SantaCoder (1.1 B parameters) supports Python, Java, and JavaScript code generation.

MLC LLM – Local LLM Solution

MLC LLM enables deployment of any LLM on various hardware back‑ends and local applications.

Web LLM – Browser‑Based LLM

Web LLM runs large models entirely in the browser using WebGPU, offering privacy‑first AI assistants.

WizardLM – Fine‑Tuned LLaMA

WizardLM (7 B) is fine‑tuned with Evol‑Instruct, a method that generates diverse difficulty instructions using LLMs.
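Evol-Instruct works by prompting a teacher LLM to rewrite seed instructions into harder ("in-depth") or different ("in-breadth") ones. A sketch of the prompt construction; the template wording is illustrative rather than WizardLM's exact prompts, and the teacher-model call itself is omitted:

```python
# Illustrative evolution templates (not WizardLM's verbatim prompts).
DEEPEN = (
    "Rewrite the following instruction into a more complex version that "
    "requires additional reasoning steps, while keeping it answerable:\n\n"
    "{instruction}"
)
BREADTH = (
    "Create a brand-new instruction in the same domain as the following "
    "one, but covering a different, rarer topic:\n\n{instruction}"
)

def evolve(instruction: str, mode: str) -> str:
    """Build an evolution prompt to send to a teacher LLM (call not shown)."""
    template = DEEPEN if mode == "deepen" else BREADTH
    return template.format(instruction=instruction)

prompt = evolve("List three uses of Python.", "deepen")
```

Iterating this loop over a seed set, then filtering failed evolutions, yields the difficulty-diverse instruction data used for fine-tuning.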

YaLM 100B – 100 B‑Parameter Model

YaLM 100B is a GPT‑style model trained on 1 TB of English and Russian data across 800 A100 GPUs.

OpenLLaMA – LLaMA Reimplementation

OpenLLaMA reproduces Meta’s LLaMA under a permissive license, providing 7 B weights and training scripts.

LLM‑Related Tools

OpenLLM – Open Platform for Operating LLMs

OpenLLM offers production‑grade fine‑tuning, serving, deployment, and monitoring for any open‑source LLM.

LangChain – Building LLM Applications

LangChain provides prompts, LLM wrappers, document loaders, utilities, chains, indexes, agents, memory, and chat interfaces to integrate LLMs with external tools.
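The "chain" abstraction is essentially composition: format a prompt, call a model, post-process the result. A concept sketch in plain Python; this mirrors the idea, not LangChain's actual API, and the model call is a stub:

```python
class PromptTemplate:
    """Minimal stand-in for a prompt template with named placeholders."""
    def __init__(self, template: str):
        self.template = template
    def format(self, **kwargs) -> str:
        return self.template.format(**kwargs)

def fake_llm(prompt: str) -> str:
    """Stub for a real model call (an API or a local open-source LLM)."""
    return f"[model answer to: {prompt}]"

class Chain:
    """Compose template -> model -> post-processing, chain-style."""
    def __init__(self, template, llm, parser=str.strip):
        self.template, self.llm, self.parser = template, llm, parser
    def run(self, **kwargs) -> str:
        return self.parser(self.llm(self.template.format(**kwargs)))

chain = Chain(PromptTemplate("Summarize in one line: {text}"), fake_llm)
result = chain.run(text="LangChain composes prompts, models, and tools.")
```

Real chains extend this pattern with retrievers, memory, and agents, but the template-model-parser pipeline is the backbone.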

JARVIS – Collaborative System for LLMs and AI Models

JARVIS uses an LLM as a controller to orchestrate multiple AI models from Hugging Face across four stages: task planning, model selection, execution, and response generation.
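The four stages can be sketched as a pipeline of stubbed functions; the model registry and inference results below are invented placeholders standing in for Hugging Face Hub lookups and real model calls:

```python
def plan_tasks(request):
    """Stage 1: task planning (stub; JARVIS asks the controller LLM)."""
    return [{"task": "image-classification", "input": request["image"]}]

def select_model(task):
    """Stage 2: model selection (stub registry in place of the Hub)."""
    registry = {"image-classification": "google/vit-base-patch16-224"}
    return registry[task["task"]]

def execute(task, model_id):
    """Stage 3: task execution (stubbed inference call)."""
    return {"model": model_id, "label": "cat"}

def respond(request, results):
    """Stage 4: response generation (stub; normally the LLM summarizes)."""
    return f"The image {request['image']} appears to show a {results[0]['label']}."

request = {"image": "photo.jpg"}
tasks = plan_tasks(request)
results = [execute(t, select_model(t)) for t in tasks]
answer = respond(request, results)
```

The controller LLM's real job is deciding, per request, which tasks exist, which specialist model fits each one, and how to weave the results into a single answer.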

Semantic Kernel – SDK for Integrating LLMs

Semantic Kernel is a lightweight SDK that blends natural‑language semantics, traditional code, and embedding‑based memory for AI‑enhanced applications.

LMFlow – Scalable LLM Toolkit

LMFlow provides an open research platform for efficient LLM training, supporting low‑resource experiments and custom data utilization.

xturing – LLM Personalization Fine‑Tuning Tool

xturing enables fast, efficient LoRA‑based fine‑tuning of models such as LLaMA, GPT‑J, GPT‑2, OPT, and Cerebras‑GPT on single or multiple GPUs.

Dify – LLMOps Platform

Dify offers visual workflow composition, API‑first services, data annotation, and supports GPT‑3, GPT‑3.5‑Turbo, and GPT‑4.

Flowise – Visual LLM App Builder

Flowise is an open‑source UI for constructing custom LLM pipelines using LangChainJS.

Jigsaw – Tool for Improving LLM Performance

Jigsaw (Microsoft) post‑processes LLM outputs with syntax/semantic analysis and user feedback, targeting code synthesis for Pandas APIs.

GPTCache – Semantic Cache for LLM Queries

GPTCache stores LLM responses semantically, reducing API costs by up to 10× and speeding up inference by 100×.
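The core mechanism is: embed each query, and on a new query return a stored answer whose embedding is close enough. A self-contained sketch with a toy hashed bag-of-words "embedding" standing in for the sentence encoder a real cache such as GPTCache would use:

```python
def embed(text):
    """Toy hashed bag-of-words embedding (64 buckets).
    A real semantic cache uses a proper sentence encoder instead."""
    vec = [0.0] * 64
    for w in text.lower().split():
        vec[hash(w) % 64] += 1.0
    return vec

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm if norm else 0.0

class SemanticCache:
    """Return a cached answer when a new query is similar enough."""
    def __init__(self, threshold=0.8):
        self.entries = []  # list of (embedding, answer) pairs
        self.threshold = threshold

    def get(self, query):
        q = embed(query)
        for e, answer in self.entries:
            if cosine(q, e) >= self.threshold:
                return answer
        return None  # cache miss: caller falls through to the LLM API

    def put(self, query, answer):
        self.entries.append((embed(query), answer))

cache = SemanticCache()
cache.put("what is the capital of France", "Paris")
hit = cache.get("What is the capital of France")  # similar query hits
```

Every hit avoids a paid API call and the model's full inference latency, which is where the cost and speed savings come from.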

WenDa – LLM Invocation Platform

WenDa supports ChatGLM‑6B, ChatRWKV, and ChatYuan with knowledge‑base search, parameter tuning, streaming output, and multi‑user deployment.

MindFormers – Full‑Process LLM Development Suite

MindFormers provides training, inference, and deployment pipelines for models such as BERT, GPT, OPT, T5, MAE, SimMIM, CLIP, FILIP, ViT, and Swin.

Code as Policies – Natural‑Language Code Generation

Code as Policies extends PaLM‑SayCan to generate full Python programs for robot tasks, outperforming direct language‑only approaches.

Colossal‑AI – Large‑Model Parallel Training System

Colossal‑AI integrates data, pipeline, tensor, and sequence parallelism to simplify distributed training of massive models.
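Tensor parallelism, for example, shards a weight matrix's columns across devices so each computes a slice of the output, which is then gathered. A pure-Python simulation of the idea, with lists standing in for device-resident tensors:

```python
def matmul(a, b):
    """Plain nested-list matrix multiply."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def split_columns(m, parts):
    """Partition a matrix's columns across `parts` simulated devices."""
    step = len(m[0]) // parts
    return [[row[p * step:(p + 1) * step] for row in m] for p in range(parts)]

# Full computation on one "device".
X = [[1.0, 2.0]]
W = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
full = matmul(X, W)

# Tensor-parallel version: each device holds a column shard of W and
# computes its slice of the output; concatenating the slices plays the
# role of an all-gather collective.
shards = split_columns(W, 2)
partial = [matmul(X, s) for s in shards]
gathered = [sum((p[0] for p in partial), [])]
assert gathered == full  # sharded result matches the single-device one
```

Because each device stores only 1/N of the weight matrix, models too large for any single accelerator's memory become trainable; pipeline and sequence parallelism split along the other axes (layers and tokens) in the same spirit.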

BentoML – Unified Model Deployment Framework

BentoML streamlines the lifecycle of AI services, supporting PyTorch, TensorFlow, JAX, XGBoost, Hugging Face, and open‑source LLMs.

NSQL – Open‑Source SQL Generation Model

NSQL series (350 M, 2 B, 6 B) target SQL generation tasks with a unified serving API.

Written by

Architect's Alchemy Furnace

A comprehensive platform that combines Java development and architecture design, guaranteeing 100% original content. We explore the essence and philosophy of architecture and provide professional technical articles for aspiring architects.
