Author

Infra Learning Club

Infra Learning Club shares study notes, cutting-edge technology, and career discussions.


Latest from Infra Learning Club

72 recent articles

Infra Learning Club

Feb 12, 2025 · Fundamentals

Why Does Nvidia Report Less GPU Memory Than Specified?

The article investigates why Nvidia L40S and RTX A6000 GPUs report less memory in nvidia‑smi than their advertised 48 GB, shows that enabling ECC reserves a few gigabytes, and demonstrates the effect by toggling ECC on a Tesla T4 card.

ECC · GPU memory · L40S

0 likes · 4 min read


Infra Learning Club

Feb 11, 2025 · Artificial Intelligence

How to Run DeepSeek R1 Locally and Build a RAG System with Ollama and LangChain

This guide walks you through installing Ollama, pulling the open‑source DeepSeek R1 model, and using LangChain and Streamlit to create a locally hosted Retrieval‑Augmented Generation (RAG) system that can answer questions from uploaded PDFs without any cloud API.

DeepSeek · LLM · LangChain

0 likes · 6 min read


Infra Learning Club

Feb 8, 2025 · Artificial Intelligence

Why People Pay for DeepSeek Installation Packages (and How to Install It Yourself)

The article explains that DeepSeek is an open‑source LLM that many sellers monetize by offering paid installation packages, outlines the model lineup and size options, and provides a step‑by‑step guide to install and run DeepSeek locally with Ollama and Open WebUI.

AI models · DeepSeek · LLM

0 likes · 7 min read


Infra Learning Club

Feb 8, 2025 · Artificial Intelligence

Multi-Agent LLMs Explained: Benefits, Workflows, and Leading Frameworks

The article surveys the rise of multi‑agent LLM systems: it details how specialized agents collaborate on tasks such as travel planning, outlines their workflow, compares them with single‑agent models, lists prominent frameworks, and discusses current challenges with references to related research.

AI · Agent Collaboration · AutoGen

0 likes · 13 min read


Infra Learning Club

Feb 7, 2025 · Artificial Intelligence

Understanding LLM Agents: Architecture, Capabilities, and Key Challenges

This article explains what LLM agents are, describes their core components (brain, memory, planning, and tool use), illustrates how they handle complex queries through task decomposition, surveys notable frameworks, and discusses key challenges such as limited context, long‑term planning difficulties, output inconsistency, and prompt dependence.

AI architecture · LLM agents · Memory

0 likes · 15 min read


Infra Learning Club

Feb 6, 2025 · Artificial Intelligence

Getting Started with Huawei Ascend AI Accelerators

This guide walks through the fundamentals of Huawei Ascend NPU hardware, the CANN software stack, driver and firmware installation, Kubernetes integration via Docker runtime and device plugin, and a complete ResNet‑50 inference demo on Ascend 310P.

AI inference · CANN · Docker Runtime

0 likes · 12 min read


Infra Learning Club

Jan 31, 2025 · Fundamentals

Essential CUDA Learning Guide: Basics, Compilation, and Profiling

This article walks through a practical APOD workflow for CUDA development—assessing bottlenecks, parallelizing with cuBLAS/cuFFT/Thrust, optimizing iteratively, and deploying—while covering nvcc compilation flags, PTX virtual ISA, nvprof profiling, core terminology (SP, SM, warp, grid, block, thread), indexing patterns, and unified memory references.

CUDA · CUDA terminology · GPU programming

0 likes · 8 min read


Infra Learning Club

Jan 24, 2025 · Fundamentals

Inside NVCC: How CUDA Code Is Compiled and Linked

The article dissects NVCC’s compilation pipeline, showing how internal registration functions from host_runtime.h are injected into the host binary, how a simple CUDA demo is processed with --dryrun, and how the generated fatbin, PTX, and cubin files are linked and registered for GPU execution.

CUDA · Compilation · FatBinary

0 likes · 10 min read


Infra Learning Club

Jan 23, 2025 · Cloud Native

Getting Started with GPU Remote Invocation Using rCUDA

This article introduces GPU remote invocation, explains rCUDA's architecture, walks through installing the server and client, demonstrates running CUDA samples on a GPU‑less node, and shows how to deploy rCUDA on Kubernetes with example DaemonSet and Job manifests.

CUDA · Docker · GPU remote invocation

0 likes · 7 min read


Infra Learning Club

Jan 22, 2025 · Fundamentals

User‑Mode vs Kernel‑Mode GPU Virtualization: Architecture, Benefits, and Limits

The article compares user‑mode and kernel‑mode GPU virtualization, detailing their layered architectures, how they intercept APIs, the advantages such as openness, isolation, and unified memory, and the drawbacks including API complexity, kernel intrusion, legal risks, and cross‑process limitations.

API interception · CUDA · GPU virtualization

0 likes · 5 min read
