Tagged articles
3 articles
Huawei Cloud Developer Alliance
Nov 16, 2023 · Artificial Intelligence

ChatGLM2 vs ChatGLM3: MQA, FlashAttention, and New Prompt Features

During the Saturday session, we reviewed ChatGLM2’s upgrades (Multi‑Query Attention and FlashAttention), demonstrated deployment on Ascend + ModelArts + MindSpore, and introduced ChatGLM3’s revamped prompt design along with its native tool‑calling and code‑interpreter capabilities. The session closed with a preview of the next lecture on text‑generation decoding.

ChatGLM2 · ChatGLM3 · FlashAttention
0 likes · 6 min read
Tencent Cloud Developer
Aug 14, 2023 · Artificial Intelligence

Overview of Open‑Source Large Language Models: Llama 2, ChatGLM 2, Usage, Fine‑Tuning and Comparison

The article reviews the rapid evolution of open‑source large language models, focusing on Meta’s Llama 2 series and Tsinghua’s ChatGLM 2. It covers their enhancements (RLHF, larger context windows, safety‑usefulness trade‑offs, and performance gains), the download and fine‑tuning procedures, and how these models increasingly rival proprietary ones such as GPT‑4.

AI · ChatGLM2 · Llama2
0 likes · 10 min read
Baobao Algorithm Notes
Jul 5, 2023 · Artificial Intelligence

Session‑Level Sample Organization for Decoder‑Only LLM Fine‑Tuning

This article explains how to restructure multi‑turn dialogue data into single session‑level training samples for decoder‑only large language models, leveraging causal attention and simple position IDs, and provides a concrete implementation, performance results, and a gradient‑weight analysis.

ChatGLM2 · LLM fine-tuning · decoder-only
0 likes · 7 min read