Tagged articles
2 articles
DeWu Technology
Mar 13, 2024 · Artificial Intelligence

Extending Context Length in LLaMA Models: Structures, Challenges, and Techniques

The article reviews LLaMA’s Transformer and RoPE architecture and explains why its context windows (4K‑128K tokens) are limited. It evaluates extension techniques used in industry, including linear, NTK‑aware, and YaRN interpolation as well as LongLoRA sparse attention, addresses memory and quadratic‑cost challenges, and presents a KubeAI workflow for fine‑tuning and deployment.

LLaMA · LongLoRA · RoPE
17 min read
Baidu Geek Talk
Dec 6, 2023 · Industry Insights

From MLOps to LMOps: Challenges and Solutions for Large‑Model Operations

This article reviews the evolution from MLOps to LMOps, outlines its core concepts, challenges, and key technologies (large‑model inference optimization, prompt engineering, and context‑length extension), and offers a perspective on the future of AI operations.

AI Operations · LMOps · MLOps
23 min read