Tagged articles

ChatGLM3

4 articles · Page 1 of 1

Jul 3, 2024 · Artificial Intelligence

Deploy ChatGLM3‑6B with FastGPT, One‑API, and M3E on Linux

This guide walks you through deploying the ChatGLM3‑6B large language model locally, adding the M3E vector embedding model, setting up One‑API and FastGPT with Docker, configuring environments, fine‑tuning with LoRA, and testing the integrated knowledge‑base Q&A system.

ChatGLM3 · Docker · FastGPT

15 min read

DaTaobao Tech

Dec 27, 2023 · Artificial Intelligence

Deploying a Private LLM Knowledge Base on a MacBook

This guide walks through installing and quantizing the open‑source ChatGLM3‑6B model and the m3e‑base embedder on a MacBook, wrapping them in an OpenAI‑compatible FastAPI service, and routing requests through a One‑API gateway. Metadata is stored in MongoDB and vectors in PostgreSQL with pgvector, with FastGPT deployed on top for RAG. It closes with data ingestion, a demo showing response times of 5 to 7 seconds, and an outline of future improvements.

ChatGLM3 · Deployment · FastAPI

23 min read

Rare Earth Juejin Tech Community

Nov 29, 2023 · Artificial Intelligence

Building a Private LLM‑Powered Knowledge Base with LangChain and ChatGLM3

This article explains how to migrate personal notes into a private knowledge base by combining a large language model with an external vector store, detailing the concepts of tokenization, embedding, vector databases, and step‑by‑step deployment using LangChain‑Chatchat and the open‑source ChatGLM3 model.

ChatGLM3 · Embedding · Knowledge Base

10 min read

Huawei Cloud Developer Alliance

Nov 16, 2023 · Artificial Intelligence

ChatGLM2 vs ChatGLM3: MQA, FlashAttention, and New Prompt Features

During the Saturday session, we reviewed ChatGLM2’s upgrades (Multi‑Query Attention and FlashAttention), demonstrated deployment on Ascend + ModelArts + MindSpore, and introduced ChatGLM3’s revamped prompt design along with its native tool‑calling and code‑interpreter capabilities. We also previewed the next lecture on text‑generation decoding.

ChatGLM2 · ChatGLM3 · FlashAttention

6 min read
