Tagged articles

nvcc

4 articles

Feb 23, 2025 · Fundamentals

How to Dynamically Decompress CUDA Fatbin Files Compressed by NVCC

This article explains why enabling NVCC's --fatbin-options -compress-all breaks remote GPU calls, describes the fatbin file layout, shows how to extract and analyze the binary with objcopy, and provides a step‑by‑step implementation of a decompression routine for both ELF and PTX sections.

Binary Format · CUDA · GPU

9 min read

Infra Learning Club

Feb 22, 2025 · Fundamentals

Understanding NVCC Compilation: A Step‑by‑Step Technical Guide

This article walks through the NVCC compilation pipeline, explaining how CUDA source files are transformed into host and device binaries, detailing file extensions, compilation stages, command‑line options, intermediate artifacts, and the role of registration functions such as __nv_cudaEntityRegisterCallback and __sti____cudaRegisterAll.

CUDA · Compilation · GPU

12 min read

Infra Learning Club

Jan 31, 2025 · Fundamentals

Essential CUDA Learning Guide: Basics, Compilation, and Profiling

This article walks through a practical APOD workflow for CUDA development—assessing bottlenecks, parallelizing with cuBLAS/cuFFT/Thrust, optimizing iteratively, and deploying—while covering nvcc compilation flags, PTX virtual ISA, nvprof profiling, core terminology (SP, SM, warp, grid, block, thread), indexing patterns, and unified memory references.

CUDA · CUDA terminology · GPU programming

8 min read

Infra Learning Club

Jan 24, 2025 · Fundamentals

Inside NVCC: How CUDA Code Is Compiled and Linked

The article dissects NVCC’s compilation pipeline, showing how internal registration functions from host_runtime.h are injected into the host binary, how a simple CUDA demo is processed with --dryrun, and how the generated fatbin, PTX, and cubin files are linked and registered for GPU execution.

CUDA · Compilation · FatBinary

10 min read
