Tagged articles
Infra Learning Club
Feb 15, 2025 · Cloud Native

Advanced Guide: Real‑Time GPU Process Migration in Kubernetes with CRIU

This article explains how os‑criu provides transparent, OS‑level GPU checkpoint/restore, compares its performance with NVIDIA's cuda‑checkpoint, walks through building and installing the PhOS framework, demonstrates migration of a Llama2‑13b‑chat workload in Docker, and discusses current limitations and future Kubernetes integration plans.

CRIU · Checkpoint · Docker
9 min read
DataFunSummit
Feb 2, 2025 · Artificial Intelligence

BladeDISC++: A Dynamic‑Shape AI Compiler for Memory‑Peak Optimization in Deep Learning Training

The article introduces BladeDISC++, a dynamic‑shape AI compiler from Alibaba Cloud PAI. It explains the memory‑peak challenges of dynamic‑shape deep‑learning workloads, describes the compiler's symbolic‑shape graph and its joint compile‑time/runtime optimizations (operation fusion, scheduling, and just‑in‑time rematerialization), and presents Llama2 experiments showing significant GPU memory savings and throughput gains.

AI compiler · BladeDISC · Llama2
15 min read
Continuous Delivery 2.0
Jul 1, 2024 · Artificial Intelligence

How Meta Uses Llama2 to Accelerate Incident Response and Root‑Cause Analysis in AIOps

This article explains how Meta applies AI, specifically a fine‑tuned Llama2 model, to improve AIOps by automating incident monitoring, providing real‑time summaries, assisting responders with contextual information, and efficiently narrowing down root‑cause changes, ultimately reducing incident resolution time from hours to minutes.

AI · Llama2 · Meta
13 min read
Rare Earth Juejin Tech Community
Feb 18, 2024 · Artificial Intelligence

Llama 2: Open Foundation and Fine‑Tuned Chat Models – Overview and Technical Details

The article provides a comprehensive overview of Meta’s Llama 2 series, detailing model sizes, pre‑training data, architectural enhancements, supervised fine‑tuning, RLHF procedures, safety evaluations, reward‑model training, and iterative improvements, highlighting its open‑source release and comparative performance.

AI Safety · Fine-tuning · Llama2
27 min read
Huawei Cloud Developer Alliance
Dec 29, 2023 · Artificial Intelligence

Unlocking LLaMA2: Key Architecture Insights and Deployment Tricks

This recap of the MindSpore public course reviews LLaMA2 fundamentals, compares its structure with the original Transformer, details the upgrades from LLaMA1, explains core components such as RMSNorm, RoPE, KV‑Cache, Grouped‑Query Attention (GQA), and SwiGLU, outlines industry LLM optimization methods, and previews the upcoming lecture on the Pengcheng Brain 200B model.

Llama2 · MindSpore · Transformer
5 min read
IT Services Circle
Sep 16, 2023 · Artificial Intelligence

Porting Llama2 to Mojo: Massive Performance Boosts and Insights

Former Meta engineer Aydyn Tairov ported the Python implementation of Llama2 to the newly released Mojo language, reporting that Mojo’s SIMD primitives made the port up to 250 times faster than the Python version and about 20% faster than the original C implementation.

AI · Llama2 · Meta
2 min read
Tencent Cloud Developer
Aug 14, 2023 · Artificial Intelligence

Overview of Open‑Source Large Language Models: Llama 2, ChatGLM 2, Usage, Fine‑Tuning and Comparison

The article reviews the rapid evolution of open‑source large language models, focusing on Meta’s Llama 2 series and Tsinghua’s ChatGLM 2. It covers their enhancements (RLHF, larger context windows, safety‑helpfulness trade‑offs, and performance gains), describes download and fine‑tuning procedures, and discusses how these models increasingly rival proprietary models like GPT‑4.

AI · ChatGLM2 · Llama2
10 min read
Alibaba Cloud Big Data AI Platform
Jul 25, 2023 · Artificial Intelligence

Fine‑Tune and Deploy Llama 2 on Alibaba Cloud PAI in Minutes

This guide walks you through using Meta's open‑source Llama 2 models on Alibaba Cloud's PAI platform, covering low‑code LoRA fine‑tuning, full‑parameter fine‑tuning with PAI‑DSW, and rapid WebUI deployment via PAI‑EAS, complete with step‑by‑step instructions, code snippets, and resource requirements.

AI · Alibaba Cloud · Fine-tuning
16 min read