Infra Learning Club
Author

Infra Learning Club shares study notes, coverage of cutting-edge technology, and career discussions.

Recent Articles

Latest from Infra Learning Club

72 recent articles
Infra Learning Club
Mar 17, 2025 · Artificial Intelligence

Testing OpenManus with DeepSeek: A Hands‑On Evaluation

The author walks through installing OpenManus, configuring it to use DeepSeek (and an Ollama‑based vision model), runs a sample financial data query, and reports that the system is slow, sometimes inaccurate, and still requires further optimization.

AI agents · Conda · DeepSeek
0 likes · 5 min read
Infra Learning Club
Mar 9, 2025 · Cloud Native

How to Fix nvidia-smi Missing GPU Process Info Inside Containers

The article explains why nvidia-smi cannot display GPU processes when run inside a container, analyzes the underlying PID-namespace isolation and kernel-level restrictions, and presents three practical solutions (enabling hostPID, custom kernel interception modules, and the nvitop tool), plus a workaround for gpu-operator deployments.

GPU · Kernel Module · Kubernetes
0 likes · 8 min read
Infra Learning Club
Mar 7, 2025 · Artificial Intelligence

5 Open-Source AI Code Editors That Can Replace Cursor

This article reviews GitHub Copilot and Cursor, outlines their strengths and shortcomings, and then presents five open‑source AI‑powered code editors—Roo‑Code, Cline, OpenHands, Bolt, and Aider—detailing their key features, model support, and practical usage notes.

AI code editor · Aider · Bolt
0 likes · 7 min read
Infra Learning Club
Mar 6, 2025 · Fundamentals

How GPU DVFS Boosts Efficiency: Concepts, Modeling, and Future Directions

This article explains how GPU Dynamic Voltage and Frequency Scaling (DVFS) reduces power consumption while preserving performance, describes NVIDIA GPU Boost 4.0 features, outlines a hardware‑counter‑based GPGPU power‑estimation model built with a BP‑ANN, reports sub‑5% error on benchmarks, and discusses intelligent and multi‑GPU extensions.

BP-ANN · DVFS · GPGPU
0 likes · 5 min read
Infra Learning Club
Feb 26, 2025 · Cloud Native

How to Contribute to the HAMI Open‑Source Project: A Beginner’s Guide

This guide walks new contributors through the HAMI Kubernetes device‑management middleware, covering its core capabilities, repository structure, development environment setup, build steps, and testing procedures using kwok and fake‑gpu to simulate large‑scale GPU scheduling scenarios.

Device Plugin · GPU virtualization · HAMI
0 likes · 10 min read
Infra Learning Club
Feb 23, 2025 · Fundamentals

How to Dynamically Decompress CUDA Fatbin Files Compressed by NVCC

This article explains why enabling NVCC's --fatbin-options -compress-all breaks remote GPU calls, describes the fatbin file layout, shows how to extract and analyze the binary with objcopy, and provides a step‑by‑step implementation of a decompression routine for both ELF and PTX sections.

CUDA · GPU · binary format
0 likes · 9 min read
Infra Learning Club
Feb 22, 2025 · Fundamentals

Understanding NVCC Compilation: A Step‑by‑Step Technical Guide

This article walks through the NVCC compilation pipeline, explaining how CUDA source files are transformed into host and device binaries, detailing file extensions, compilation stages, command‑line options, intermediate artifacts, and the role of registration functions such as __nv_cudaEntityRegisterCallback and __sti____cudaRegisterAll.

CUDA · Compilation · GPU
0 likes · 12 min read
Infra Learning Club
Feb 21, 2025 · Artificial Intelligence

5 Must‑Try Open‑Source AI Projects You Can Start Using Today

This article introduces five open‑source AI tools—a PPT generator, an LLM app development platform, a cloud‑agnostic AI runner, a curated collection of LLM applications, and a one‑click HD video creator—detailing their key features, usage links, and sample configurations.

AI · Dify · LLM
0 likes · 8 min read
Infra Learning Club
Feb 16, 2025 · Operations

GPUprobe: Using eBPF to Monitor CUDA Memory Leaks

The article introduces GPUprobe, an eBPF-based tool that provides lightweight, continuous, application-level monitoring of CUDA memory allocations, leaks, and kernel launches, compares it with Nsight Systems and DCGM, and demonstrates near-zero-overhead integration with Prometheus and Grafana through detailed code examples and analysis of real-world output.

GPU monitoring · Grafana · Prometheus
0 likes · 13 min read
Infra Learning Club
Feb 15, 2025 · Cloud Native

Advanced Guide: Real‑Time GPU Process Migration in Kubernetes with CRIU

This article explains how os‑criu provides transparent, OS‑level GPU checkpoint/restore, compares its performance with NVIDIA's cuda‑checkpoint, walks through building and installing the PhOS framework, demonstrates migration of a Llama2‑13b‑chat workload in Docker, and discusses current limitations and future Kubernetes integration plans.

CRIU · Docker · GPU
0 likes · 9 min read