Tagged articles

PyTorch

212 articles · Page 1 of 3

Jun 29, 2026 · Artificial Intelligence

OpenMythos: Open‑Source Reverse‑Engineering of Claude Mythos Architecture and the Controversy

OpenMythos is an open‑source, PyTorch‑based theoretical reconstruction of Anthropic's Claude Mythos that uses a Recurrent‑Depth Transformer, offering multiple model scales, sparking polarized community reactions, and raising security implications for AI‑driven vulnerability research.

AI security · Claude Mythos · OpenMythos

0 likes · 8 min read

MaGe Linux Operations

Jun 20, 2026 · Artificial Intelligence

Custom PyTorch Dataset & DataLoader: Multiprocessing Optimization Guide

This article walks through diagnosing a severe GPU under‑utilization bug in an 8‑A100 training job, explains why the default Dataset/DataLoader setup stalls, and presents a step‑by‑step redesign using MapDataset or IterableDataset, WebDataset tar shards, tuned DataLoader parameters, worker‑level seeding, GPU‑side prefetching, and distributed sampling to boost GPU utilization from 5‑12% to over 85% while cutting epoch time from 40 h to 9 h.

DataLoader · DistributedSampler · GPU prefetch

0 likes · 22 min read

Ubuntu

Jun 15, 2026 · Artificial Intelligence

Running AI/ML Models on WSL with CUDA Acceleration: A PyTorch Hands‑On Guide

This guide shows how to enable NVIDIA GPU passthrough in WSL 2, install the CUDA toolkit, set up a PyTorch GPU environment, verify GPU visibility, and run real‑world AI/ML workloads such as LLM inference, YOLO object detection, and Jupyter monitoring, while providing performance comparisons, optimization tips, and troubleshooting FAQs.

AI · CUDA · GPU

0 likes · 13 min read

IT Services Circle

Jun 13, 2026 · Artificial Intelligence

What Interviewers Expect: Understanding Transformers Beyond Codex and AI Code Generation

The article explains why modern interviewers ask about Transformer fundamentals, breaks down its core components such as self‑attention, multi‑head attention, feed‑forward networks, residual connections and positional encodings, and demonstrates a complete PyTorch toy model that predicts the sum‑mod‑10 of integer sequences while visualizing loss curves, attention heatmaps, embedding PCA and early‑stage gradient norms.

Gradient Analysis · Model Visualization · Multi-Head Attention

0 likes · 20 min read

DeepHub IMBA

Jun 7, 2026 · Artificial Intelligence

PyTorch GPU Memory Profiling: Checkpointing, Mixed Precision, Optimizer Choice

The article explains the seven sources of GPU memory usage during PyTorch training, shows how to measure them with built‑in profiling APIs and the memory‑viz tool, and evaluates three effective optimizations—gradient checkpointing, mixed‑precision training, and optimizer selection—detailing their memory savings and performance costs.

GPU memory · PyTorch · gradient checkpointing

0 likes · 8 min read

大转转FE

Jun 4, 2026 · Artificial Intelligence

Captcha Recognition in Practice: Front‑End Engineers Skip UI to Train Models

This article details how front‑end developers used a low‑code DDDD trainer and AI‑generated PyTorch CNN code to build high‑accuracy captcha recognizers, achieving up to 99% sequence accuracy while illustrating a workflow that lets developers shift from UI coding to model training with AI assistance.

AI · CNN · PyTorch

0 likes · 15 min read

DeepHub IMBA

May 21, 2026 · Artificial Intelligence

Add Step‑Level Diagnostics to PyTorch Training in Three Lines with TraceML

TraceML provides a lightweight, step‑level profiler for PyTorch training that requires only a few code changes—initializing the library and wrapping each training step—to generate real‑time diagnostics and a compact JSON summary, helping engineers quickly identify whether data loading, forward, backward, or optimizer phases dominate execution time.

ML infrastructure · PyTorch · TraceML

0 likes · 6 min read

SuanNi

May 6, 2026 · Artificial Intelligence

Deploy RecBole on a GPU Cloud to Learn Recommendation Algorithms

This guide explains how to launch the RecBole recommendation system image on the SumW GPU cloud, covering its key features, required setup steps, dependency installation tips, and a one‑line command to run a baseline model on an MLU accelerator.

GPU cloud · MLU · PyTorch

0 likes · 4 min read

PaperAgent

Apr 21, 2026 · Artificial Intelligence

OpenMythos: Rebuilding Claude Mythos with Recursive Transformers and MoE

OpenMythos is an open‑source PyTorch reimplementation of Anthropic's Claude Mythos built on a recurrent Transformer with mixture‑of‑experts (MoE) routing. It introduces Recursive Depth Transformers, Multi‑Latent Attention, and several stability mechanisms, and demonstrates parameter‑efficient scaling backed by empirical studies.

AI Architecture · Claude Mythos · MoE

0 likes · 6 min read

AI Explorer

Apr 4, 2026 · Artificial Intelligence

Google TimesFM: A GPT‑style Foundation Model Redefining Time‑Series Forecasting

Google's open‑source TimesFM model brings pre‑trained, GPT‑like capabilities to time‑series forecasting, offering few‑shot and zero‑shot predictions, extended context length, continuous quantile outputs, and easy integration via a simple PyTorch API for developers across domains.

Google · PyTorch · TimesFM

0 likes · 7 min read

AI Explorer

Apr 1, 2026 · Artificial Intelligence

Google Open‑Sources TimesFM: A Foundation Model for Plug‑and‑Play Time‑Series Forecasting

Google’s open‑source TimesFM is a decoder‑only Transformer foundation model that delivers plug‑and‑play time‑series forecasting with zero‑shot accuracy, larger context windows, quantile predictions, and a simple Hugging Face API, making it suitable for retail, energy, finance, monitoring, and IoT use cases.

Hugging Face · PyTorch · TimesFM

0 likes · 7 min read

Tech Musings

Mar 6, 2026 · Artificial Intelligence

How to Deploy Qwen3-8B on WSL2 with 4‑Bit Quantization and Resource Limits

This article details a step‑by‑step guide for setting up the Qwen3‑8B large language model on a Windows 11 system using WSL2, covering hardware specs, CUDA configuration, 4‑bit quantization with BitsAndBytes, SDPA attention optimization, CPU offload, and resource‑limiting tricks to achieve smooth inference performance.

4-bit quantization · CUDA optimization · PyTorch

0 likes · 10 min read

DeepHub IMBA

Mar 1, 2026 · Artificial Intelligence

Demystifying VAE: From Probabilistic Encoding to Latent Space Regularization

This article walks through the fundamentals of variational autoencoders, explaining why they are needed, detailing their three core components, loss formulation, PyTorch implementation, training loop, and multiple inference modes such as anomaly detection, data generation, conditional generation, latent space manipulation, and data imputation.

Anomaly Detection · Conditional VAE · PyTorch

0 likes · 15 min read

Data STUDIO

Feb 25, 2026 · Artificial Intelligence

Build a Large Language Model from Scratch with PyTorch—No Libraries, No Shortcuts

This guide walks you through building, training, and fine‑tuning a Transformer‑based large language model entirely from scratch using PyTorch, covering tokenization, self‑attention, multi‑head attention, positional encoding, model architecture, data preparation, training loops, and fine‑tuning on custom lyrics.

GPT · LLM · PyTorch

0 likes · 43 min read

AI Cyberspace

Feb 11, 2026 · Artificial Intelligence

From RNNs to LSTMs and GRUs: A Hands‑On Guide to Sequence Modeling in PyTorch

This tutorial explains the nature of sequential data, why traditional feed‑forward networks struggle with it, and how recurrent architectures such as RNN, LSTM, and GRU capture temporal dependencies, complete with mathematical foundations, training algorithms, and full PyTorch implementations for sentiment analysis, text generation, and encoder‑decoder models.

Encoder-Decoder · GRU · LSTM

0 likes · 57 min read

xkx's Tech General Store

Feb 8, 2026 · Artificial Intelligence

Mastering U‑Net: The Core Engine of Stable Diffusion – Theory to Practice

This article introduces the U‑Net architecture—originally designed for medical image segmentation—explains why its pixel‑wise processing makes it the core denoising engine in Stable Diffusion, details three key modifications for diffusion models, and walks through a ResNet‑50‑based implementation trained on the VOC2012 dataset, achieving 0.92 pixel accuracy and 0.64 mean IoU.

PyTorch · ResNet50 · Semantic Segmentation

0 likes · 11 min read

HyperAI Super Neural

Feb 4, 2026 · Artificial Intelligence

Practical Experience: Optimizing Elementwise Operators on HyperAI Cloud Compute Platform

The article walks through a step‑by‑step optimization of a simple elementwise addition kernel (C = A + B) on HyperAI's RTX 5090 cloud instance, covering FP32 baseline, vectorized FP32, several FP16 variants, benchmark methodology, performance results, and the reasoning behind thread‑block sizing.

CUDA · Elementwise · FP16

0 likes · 30 min read

Data Party THU

Feb 1, 2026 · Artificial Intelligence

How Tiny Perturbations Can Fool 95% Accurate Image Classifiers

Despite achieving over 95% accuracy on ImageNet, popular models like ResNet, VGG, and EfficientNet can be easily misled by carefully crafted adversarial examples using FGSM, revealing deep learning’s inherent vulnerability and prompting the need for robust defense strategies.

FGSM · PyTorch · adversarial examples

0 likes · 11 min read

JD Cloud Developers

Jan 30, 2026 · Artificial Intelligence

Scaling Generative Recommendation: Inside JD’s 9N-LLM Multi‑Framework Training Engine

This article details JD Retail’s 9N-LLM unified training engine, which integrates TensorFlow and PyTorch across GPU and NPU hardware to tackle the massive data, model size, and reinforcement‑learning complexities of generative recommendation, offering concrete components, performance benchmarks, and future directions.

GPU/NPU · PyTorch · TensorFlow

0 likes · 26 min read

xkx's Tech General Store

Jan 29, 2026 · Artificial Intelligence

Understanding CLIP: The Image‑Text Translator Behind Text‑to‑Image Models

This article explains CLIP’s dual‑encoder architecture, contrastive training, and zero‑shot inference, then demonstrates its use through image‑text matching and CIFAR‑10 classification experiments with code examples, highlighting strengths and limitations such as resolution mismatch.

CLIP · Image-Text Matching · PyTorch

0 likes · 11 min read

21CTO

Jan 26, 2026 · Artificial Intelligence

What’s New in PyTorch 2.10? Deep Dive into GPU and CUDA Enhancements

PyTorch 2.10 introduces extensive upgrades for AMD ROCm, Intel XPU, and NVIDIA CUDA, adds new Torch XPU APIs, expands Python 3.14 support, and brings performance‑focused improvements such as fused kernels and enhanced quantization, all available via the official GitHub release.

CUDA · GPU · PyTorch

0 likes · 4 min read

Ubuntu

Jan 24, 2026 · Artificial Intelligence

Deploy Alibaba’s Qwen3‑TTS on Ubuntu and Clone Your Voice in 3 Seconds

This guide walks through installing the open‑source Qwen3‑TTS model on Ubuntu, covering environment setup, GPU requirements, package installation, model variants, and hands‑on Python scripts for ultra‑low‑latency voice cloning and text‑driven voice design.

AI speech synthesis · PyTorch · Python

0 likes · 9 min read

AI Algorithm Path

Jan 21, 2026 · Artificial Intelligence

Understanding Vector Similarity in Machine Learning: A Plain‑Language Guide

The article explains key vector similarity measures—dot product, cosine similarity, and L1/L2 distances—illustrates their geometric meanings, compares their behavior with concrete examples and PyTorch/NumPy code, and discusses when to prefer each metric in machine‑learning tasks.

Cosine Similarity · L1 distance · L2 distance

0 likes · 8 min read

xkx's Tech General Store

Jan 19, 2026 · Artificial Intelligence

Beginner’s Guide to VAE: Theory, Training, and Full Implementation

This article walks readers through the fundamentals of Variational Autoencoders, compares five major generative model paradigms, explains VAE architecture, training and inference steps, provides PyTorch code, and analyzes experimental results on MNIST and Flowers datasets.

MNIST · PyTorch · VAE

0 likes · 16 min read

Data Party THU

Jan 18, 2026 · Artificial Intelligence

Unlocking 3D Scene Synthesis: A Deep Dive into Neural Radiance Fields (NeRF)

This article explains the core principles of Neural Radiance Fields, detailing how a fully‑connected network maps 5‑D coordinates to color and density, the role of positional encoding and hierarchical sampling, and provides a complete PyTorch implementation with training and rendering examples.

3D Scene Representation · Hierarchical Sampling · NeRF

0 likes · 18 min read

Fun with Large Models

Jan 12, 2026 · Artificial Intelligence

Why You Should Master Large‑Model Training: A Full‑Process Practical Guide

The article explains why mastering large‑model training is crucial for professionals, researchers, and enterprises, outlines the end‑to‑end pipeline—from data preparation and pre‑training to instruction fine‑tuning and RLHF alignment—compares training with RAG, and presents a structured learning roadmap.

AI Agents · Data Engineering · PyTorch

0 likes · 14 min read

xkx's Tech General Store

Jan 12, 2026 · Artificial Intelligence

How Traditional Programmers Can Thrive in the AI Era: Understanding YOLOv2 Architecture and Implementation

This article walks through YOLOv2’s eight core upgrades over YOLOv1, explains the design rationale behind each change, provides detailed PyTorch code for the backbone, neck, head and prediction layers, demonstrates training on COCO, and outlines further optimization directions for real‑world object detection.

PyTorch · ResNet · YOLOv2

0 likes · 16 min read

xkx's Tech General Store

Dec 30, 2025 · Artificial Intelligence

From Theory to Practice: Reproducing YOLOv1 – A Step‑by‑Step Guide for Traditional Programmers

This article provides a comprehensive, hands‑on walkthrough of YOLOv1—from its single‑stage detection principles and core architectural questions to a full PyTorch implementation, training pipeline, common pitfalls, and a live camera demo—targeted at developers transitioning into AI.

PyTorch · ResNet · SPP

0 likes · 10 min read

Data Party THU

Dec 20, 2025 · Artificial Intelligence

Master 20 Essential PyTorch Concepts: From Tensors to Model Deployment

This guide walks you through 20 fundamental PyTorch concepts—including tensor creation, operations, autograd, model building, data loading, GPU acceleration, and best‑practice tricks—providing clear code snippets and step‑by‑step explanations so you can quickly prototype, train, and deploy neural networks.

GPU Acceleration · Model Training · PyTorch

0 likes · 16 min read

Data STUDIO

Dec 9, 2025 · Artificial Intelligence

20 Core PyTorch Concepts to Accelerate Your AI Projects

This article walks through twenty essential PyTorch concepts—from basic Tensor creation and manipulation, through autograd and neural‑network construction, to data loading, GPU acceleration, model saving, and practical training tricks—providing concrete code examples and clear explanations for developers eager to build and deploy AI models.

Autograd · DataLoader · GPU

0 likes · 16 min read

Huawei Cloud Developer Alliance

Nov 24, 2025 · Artificial Intelligence

How to Supercharge Transformer AI Agents with Model Compression and Inference Acceleration

This article explains why Transformer models dominate modern AI agents, outlines the challenges of large parameter counts and latency, and presents a comprehensive guide to model compression (parameter sharing, knowledge distillation, quantization, pruning) and inference acceleration (parallel computing, optimized attention, TensorRT deployment), complete with PyTorch code examples and a real‑world case study showing speed‑up and storage savings.

AI Agent · PyTorch · Transformer

0 likes · 34 min read

Python Programming Learning Circle

Nov 18, 2025 · Artificial Intelligence

Top 10 Python Libraries Every Computer Vision Engineer Should Know

This article compiles the most commonly used Python libraries for computer vision, covering basic image handling with Pillow, high‑performance processing with OpenCV and Mahotas, advanced tools like Scikit‑Image, TensorFlow Image, PyTorch Vision, SimpleCV, Imageio, Albumentations, and the model zoo timm, each with concise descriptions and practical code snippets.

PyTorch · TensorFlow · computer-vision

0 likes · 11 min read

IT Services Circle

Nov 10, 2025 · Artificial Intelligence

Why PyTorch Co‑Founder Soumith Chintala Is Leaving Meta After 11 Years

Soumith Chintala, one of PyTorch’s original creators, announced his departure from Meta after eleven years, citing a desire to move beyond the framework, reflecting on his pivotal role in building PyTorch, its global impact, and his gratitude to the community while looking ahead to new challenges.

AI · Meta · PyTorch

0 likes · 12 min read

Instant Consumer Technology Team

Oct 21, 2025 · Artificial Intelligence

Boost LLM Originality: Master Temperature Scaling & Top‑K Sampling

This tutorial revisits a simple text‑generation function, explains how temperature scaling and top‑K sampling reshape token probability distributions, demonstrates their effects with PyTorch code and visualizations, and shows how to integrate both techniques into an improved generation routine for more diverse and human‑like outputs.

LLM · PyTorch · Text Generation

0 likes · 13 min read

AI2ML AI to Machine Learning

Oct 20, 2025 · Artificial Intelligence

nanochat Source Code Deep Dive: Data Prep, Model Design, Training & Evaluation

This article revisits nanochat's core components, detailing the preparation of diverse training datasets, the scaling calculations for tokens and parameters, the model's MQA and KV‑cache design, the full training pipeline with gradient accumulation and mixed‑precision, cost breakdown, inference optimizations, evaluation tasks, and identified limitations with suggested improvements.

Evaluation · KV cache · LLM

0 likes · 9 min read

AI Algorithm Path

Oct 20, 2025 · Artificial Intelligence

Building a Flow Matching Model from Scratch: Complete Code Walkthrough

This article walks through the full implementation of a flow‑matching generative model in PyTorch, covering dataset creation, a small MLP that learns a time‑dependent velocity field, the flow‑matching loss, training loop, ODE‑based sampling, visualisation of the learned vector field, and a discussion of the method's limitations and possible extensions.

MLP · PyTorch · flow matching

0 likes · 13 min read

BirdNest Tech Talk

Oct 15, 2025 · Artificial Intelligence

How DeepSeek‑V3.2‑Exp Achieves Fast Distributed LLM Inference with FP8 and MoE

This article walks through the DeepSeek‑V3.2‑Exp inference codebase, detailing its MoE architecture, Multi‑Head Latent Attention, FP8 quantization, custom CUDA kernels, and 8‑GPU NCCL‑based distributed execution from initialization through prefill and decode stages.

CUDA · Distributed Inference · FP8 quantization

0 likes · 9 min read

AI Algorithm Path

Oct 13, 2025 · Artificial Intelligence

Step-by-Step Explanation of Neural ODEs with Code Examples

This article introduces Neural Ordinary Differential Equations, explains their core idea of learning continuous dynamics via a neural derivative function, demonstrates Euler integration, compares naive unfolding with the adjoint method for training, provides a PyTorch implementation, and offers practical tips and extensions such as event handling and physics‑informed models.

Adjoint method · Continuous-time modeling · Euler method

0 likes · 11 min read

Data Party THU

Oct 4, 2025 · Artificial Intelligence

Unveiling Transformer Internals: From Theory to PyTorch Code

This article deeply explores the Transformer architecture by combining original paper principles with PyTorch source code, covering encoder‑decoder design, positional encoding assumptions, core parameters, residual connections, attention mechanisms, and detailed implementation snippets to help readers understand and reproduce the model.

Positional Encoding · PyTorch · Transformer

0 likes · 22 min read

Alimama Tech

Oct 1, 2025 · Artificial Intelligence

How RecIS Revolutionizes Large‑Scale Sparse‑Dense Recommendation Training

RecIS is an open‑source, PyTorch‑based unified framework designed for ultra‑large‑scale sparse‑dense computation in recommendation systems, offering a full solution for training models with massive samples, multimodal inputs, and large embeddings, and demonstrating significant performance gains over TensorFlow and TorchRec in production deployments.

PyTorch · Recommendation Systems · deep learning framework

0 likes · 24 min read

Data Party THU

Sep 25, 2025 · Artificial Intelligence

Mastering Triplet Loss in Sentence‑Transformers: A Step‑by‑Step Guide

This article explains the concept of triplet loss, its mathematical formulation, the different batch‑wise implementations in the sentence_transformers library, their advantages and drawbacks, and provides a complete Python example for training a text‑embedding model with Triplet Loss.

Embedding · PyTorch · Python

0 likes · 12 min read

IT Services Circle

Sep 16, 2025 · Artificial Intelligence

Why TensorFlow Is Dying and What the New AI Open‑Source Landscape Looks Like

An in‑depth analysis reveals TensorFlow’s rapid decline, the rise of PyTorch, and how Ant Group’s OpenRank‑driven “Large Model Open‑Source Ecosystem Panorama 2.0” maps shifting trends, from short‑term hype projects to performance‑focused AI infrastructure, highlighting the emerging US‑China dominance in AI open‑source development.

AI Ecosystem · AI open-source · OpenRank

0 likes · 15 min read

AI Algorithm Path

Aug 23, 2025 · Artificial Intelligence

Understanding QAT: Quantization‑Aware Training with PyTorch

This article explains the principles of model quantization, compares post‑training quantization (PTQ) and quantization‑aware training (QAT), details the QAT workflow in PyTorch—including fake quantization, gradient handling, and code examples—and offers practical tips for achieving high‑accuracy int8/int4 models.

Fake Quantization · PyTorch · QAT

0 likes · 15 min read

Network Intelligence Research Center (NIRC)

Jul 15, 2025 · Fundamentals

How to Write High‑Performance GPU Code with OpenAI Triton

This article introduces OpenAI's Triton language, compares its block‑wise programming model to traditional CUDA, walks through vector‑addition and fused‑softmax kernel implementations, and presents benchmark results that demonstrate significant speedups over native PyTorch operations.

CUDA · GPU programming · PyTorch

0 likes · 10 min read

AI Algorithm Path

Jul 15, 2025 · Artificial Intelligence

Day 8: Fine‑Tuning CLIP for Image‑Text Tasks – A Beginner’s Guide

This tutorial walks through fine‑tuning OpenAI's CLIP ViT‑B/32 on a small image‑text dataset in a Kaggle notebook, covering environment setup, model loading, data preprocessing with CLIPProcessor, training a linear head, and observing loss convergence to align visual and textual embeddings.

CLIP · HuggingFace · Kaggle

0 likes · 5 min read

Network Intelligence Research Center (NIRC)

Jul 13, 2025 · Artificial Intelligence

Getting Started with Hugging Face Transformers Trainer

This guide walks through the Hugging Face Transformers Trainer library, explaining its core features such as configurable training loops, mixed‑precision and gradient‑accumulation support, seamless distributed training via Accelerate and DeepSpeed, and provides a step‑by‑step example of converting a simple PyTorch CNN model to use Trainer.

Accelerate · DeepSpeed · Hugging Face

0 likes · 7 min read

IT Services Circle

Jul 6, 2025 · Artificial Intelligence

Why Transformers Train Like Any Neural Network: Backpropagation Explained

This article demystifies how Transformers are trained by showing that all their linear layers have learnable weights and biases, and that the attention mechanism—including softmax and dot‑product operations—is fully differentiable and updated via standard back‑propagation.

Backpropagation · PyTorch · Transformer

0 likes · 7 min read

AI Algorithm Path

Jul 5, 2025 · Artificial Intelligence

Beginner’s Guide to Vision‑Language Models Day 7: How CLIP Achieves Joint Visual‑Language Understanding

This article explains CLIP’s dual‑encoder architecture—using a Vision Transformer for images and a Transformer for text—how both encoders map inputs into a shared embedding space, the role of cosine similarity, and the InfoNCE contrastive loss that drives joint visual‑language learning.

CLIP · InfoNCE · Multi-modal Embedding

0 likes · 8 min read

Code Mala Tang

Jun 17, 2025 · Artificial Intelligence

Build a Handwritten Digit Classifier with PyTorch: Step‑by‑Step Guide

This tutorial walks you through building a digit classifier using PyTorch and the MNIST dataset, covering environment setup, data loading, model construction, training, evaluation, and model persistence while explaining core deep‑learning concepts.

MNIST · Neural Network · PyTorch

0 likes · 15 min read

MaGe Linux Operations

Jun 15, 2025 · Artificial Intelligence

Mastering Transformers: Key Extensions and Optimization Techniques Explained

This comprehensive guide walks you through the Transformer architecture—from its encoder‑decoder structure and self‑attention mechanism to multi‑head attention, positional embeddings, and practical PyTorch implementations—providing clear visualizations and code examples for deep learning practitioners.

PyTorch · Self-Attention · Transformer

0 likes · 22 min read

Alibaba Cloud Developer

May 29, 2025 · Artificial Intelligence

Build a Minimal Large Language Model from Scratch with Python and PyTorch

This tutorial walks through creating a simple bigram language model in pure Python, refactoring it into a PyTorch implementation, and explains core concepts such as tokenization, embedding layers, loss functions, gradient descent, training loops, and text generation, preparing you for building a full GPT model.

Bigram · LLM · LanguageModel

0 likes · 31 min read

php Courses

May 15, 2025 · Artificial Intelligence

Why Python Dominates Data Analysis and Machine Learning: Core Tools, Full‑Stack Solutions, and Learning Path

This article explains why Python has become the leading language for data analysis and machine learning, outlines the essential libraries and frameworks, provides practical code examples, describes typical application scenarios, suggests a staged learning roadmap, and forecasts future trends such as AutoML and federated learning.

AutoML · PyTorch · Python

0 likes · 6 min read

AI Algorithm Path

May 11, 2025 · Artificial Intelligence

How to Parallelize Ultra‑Large Model Training with PyTorch

The article explains the core concepts and trade‑offs of five parallelism techniques—data, tensor, context, pipeline, and expert parallelism—plus the ZeRO optimizer, showing when each method is appropriate for training ultra‑large PyTorch models and providing concrete code snippets and performance considerations.

Context Parallelism · Expert Parallelism · Large‑Scale Training

0 likes · 21 min read

Network Intelligence Research Center (NIRC)

Apr 28, 2025 · Artificial Intelligence

Export a PyTorch Model to ONNX and Extract Intermediate Layer Features

This article walks through exporting a PyTorch image‑classification model to ONNX, identifying a hidden layer node, adding it as an output, and using ONNX Runtime to retrieve both the final prediction and the intermediate feature tensor.

ONNX · ONNX Runtime · PyTorch

0 likes · 5 min read

Network Intelligence Research Center (NIRC)

Apr 23, 2025 · Artificial Intelligence

DeepQueueNet in Practice: Quickly Achieve High‑Precision Network Simulation

This article walks through using DeepQueueNet—a deep‑learning‑enhanced network performance estimator—to set up a device model, train the PyTorch version, configure a fattree16 topology, and run multi‑GPU simulations that deliver minute‑level, packet‑accurate results in as little as 1 minute 27 seconds.

DeepQueueNet · PyTorch · deep learning

0 likes · 6 min read

Tencent Technical Engineering

Apr 16, 2025 · Artificial Intelligence

Understanding Transformer Architecture for Chinese‑English Translation: A Practical Guide

This practical guide walks through the full Transformer architecture for Chinese‑to‑English translation, detailing encoder‑decoder structure, tokenization and embeddings, batch handling with padding and masks, positional encodings, parallel teacher‑forcing, self‑ and multi‑head attention, and the complete forward and back‑propagation training steps.

Machine Translation · Positional Encoding · PyTorch

0 likes · 26 min read

Alibaba Cloud Developer

Apr 7, 2025 · Artificial Intelligence

Why Does GPU Memory Keep Growing in DeepSeek‑R1 Inference? Uncovering PyTorch’s Cache

After the full‑precision DeepSeek‑R1 model was deployed on a 2×8‑GPU ACS cluster, repeated stress tests showed GPU memory usage rising continuously without being released; this article details the investigation, reproduces the behavior, examines vLLM logs and Prometheus metrics, and identifies PyTorch’s caching allocator as the root cause, offering mitigation tips.

DeepSeek · GPU memory · Memory Cache

0 likes · 21 min read

Python Programming Learning Circle

Apr 3, 2025 · Artificial Intelligence

Accelerating PyTorch Model Training: Techniques, Benchmarks, and Code

This article explains how to dramatically speed up PyTorch model training using code optimizations, mixed‑precision, torch.compile, distributed data parallelism, and DeepSpeed, presenting benchmark results that show up to 11.5× acceleration on multiple GPUs while maintaining high accuracy.

DeepSpeed · GPU · PyTorch

0 likes · 6 min read

Smart Era Software Development

Mar 31, 2025 · Artificial Intelligence

Common AI Model Formats Developers Use: GGUF, PyTorch, Safetensors, and ONNX

Developers face a variety of AI model formats—GGUF, PyTorch (.pt/.pth), Safetensors, and ONNX—each with distinct structures, advantages, drawbacks, and hardware support, and this article analyzes their metadata organization, quantization options, security considerations, and suitability for different deployment scenarios.

AI model formats · GGUF · ONNX

0 likes · 15 min read

Tencent Technical Engineering

Mar 31, 2025 · Artificial Intelligence

Step-by-Step Guide to Local Training of DeepSeek R1 on Multi‑GPU A100 Systems

This step‑by‑step tutorial shows how to set up CUDA 12.4, install required packages, prepare a JSON dataset and custom reward, troubleshoot out‑of‑memory errors, and launch DeepSeek R1 training on an 8‑GPU A100 cluster using Accelerate, DeepSpeed ZeRO‑3 and vLLM configurations.

A100 · CUDA · DeepSeek

0 likes · 9 min read

Sohu Tech Products

Mar 26, 2025 · Artificial Intelligence

How SpatialLM Turns 3D Point Clouds into Structured Scene Understanding

SpatialLM is a large language model designed for 3D spatial understanding that converts point‑cloud data from videos, RGB‑D images or LiDAR into structured scene descriptions, and this guide explains its architecture, model versions, repository links, and step‑by‑step deployment on Ubuntu with PyTorch.

3D point cloud · Large Language Model · Multimodal AI

0 likes · 7 min read

AI Algorithm Path

Mar 19, 2025 · Artificial Intelligence

Understanding Multimodal Large Language Models: Part 1

This article explains the fundamentals of multimodal large language models, covering their definition, typical applications, two main architectural approaches—unified embedding decoder and cross‑modal attention—along with detailed component breakdowns, a PyTorch implementation of image‑patch projection, and training considerations, ending with a discussion of trade‑offs between the methods.

Cross-Attention · Image Encoder · Linear Projection

0 likes · 14 min read

AI Algorithm Path

Mar 16, 2025 · Artificial Intelligence

Speed Up Your PyTorch Model Training: Practical Tips and Tricks

This article walks through concrete techniques to accelerate PyTorch training, covering mixed‑precision with torch.cuda.amp, profiling with torch.profiler, DataLoader tuning, torch.compile, distributed strategies like DataParallel and DDP, gradient accumulation, and advanced libraries such as Lightning, Apex, and DeepSpeed, plus model‑level optimizations and monitoring tips.

DataLoader · Profiling · PyTorch

0 likes · 12 min read

AI Algorithm Path

Mar 16, 2025 · Artificial Intelligence

How to Train PyTorch Models Using Far Less GPU Memory

This article walks through a suite of PyTorch techniques—including automatic mixed precision, BF16, gradient checkpointing, gradient accumulation, tensor sharding, efficient data loading, in‑place ops, lightweight optimizers, memory profiling, TorchScript, and kernel fusion—that together can cut peak GPU memory usage by up to twenty‑fold while preserving model accuracy.

GPU memory · PyTorch · data loading

0 likes · 13 min read

DataFunTalk

Mar 2, 2025 · Artificial Intelligence

Implementing GRPO from Scratch with Distributed Reinforcement Learning on Qwen2.5-1.5B-Instruct

This tutorial explains how to build a distributed reinforcement‑learning pipeline using the GRPO algorithm, covering data preparation, evaluation and reward functions, multi‑GPU DataParallel implementation, and full fine‑tuning of the Qwen2.5‑1.5B‑Instruct model with PyTorch, FlashAttention2 and Weights & Biases.

AI · GRPO · PyTorch

0 likes · 10 min read

Ubuntu

Mar 1, 2025 · Artificial Intelligence

Step-by-Step Ubuntu AI Setup: Build and Run Your First Model

This guide walks you through the full process of preparing an Ubuntu 24.04 system, installing Python, Git, and a virtual environment, adding TensorFlow, Keras, and PyTorch, and finally coding, training, and evaluating a simple MNIST classifier to demonstrate a working AI model.

MNIST · PyTorch · TensorFlow

0 likes · 7 min read

Cognitive Technology Team

Feb 24, 2025 · Artificial Intelligence

Fine-Tuning Large Language Models with LoRA: A Step-by-Step Guide and Code Example

This article demonstrates the before-and-after effects of fine‑tuning a large language model, explains the concept with analogies, details hardware setup, dataset preparation, LoRA configuration, training arguments, and provides complete Python code for a pure‑framework fine‑tuning workflow.

HuggingFace · LLM fine-tuning · LoRA

0 likes · 24 min read

JavaEdge

Feb 24, 2025 · Artificial Intelligence

Build a CIFAR‑10 Image Classifier with PyTorch – A Java Developer’s Guide

This tutorial walks Java developers through building, training, evaluating, and deploying a CIFAR‑10 image classifier using PyTorch, covering data loading, preprocessing, network definition, loss and optimizer setup, GPU acceleration, model saving, and per‑class accuracy analysis.

CIFAR-10 · GPU · PyTorch

0 likes · 18 min read

JavaEdge

Feb 23, 2025 · Artificial Intelligence

How Java Developers Can Build Neural Networks with PyTorch: A Step‑by‑Step Guide

This tutorial walks Java developers through the complete workflow of building, training, and evaluating a neural network in PyTorch, covering network definition, data iteration, forward and backward passes, loss calculation, and parameter updates with detailed code examples and Java‑centric analogies.

Backpropagation · Java · Neural Network

0 likes · 12 min read

AI Code to Success

Feb 19, 2025 · Artificial Intelligence

How to Build Traffic‑Sign Recognition and Sentiment Analysis with Keras – A Step‑by‑Step Guide

This article walks through practical Keras tutorials for image‑based traffic‑sign classification and text‑based sentiment analysis, covering data preparation, preprocessing, model construction, training, evaluation, deployment, and a concise comparison of Keras with TensorFlow and PyTorch.

Keras · PyTorch · Python

0 likes · 19 min read

Python Programming Learning Circle

Feb 18, 2025 · Artificial Intelligence

Getting Started with PyTorch: Installation, Core Operations, and Practical Deep Learning Projects

This article introduces PyTorch, covering installation on CPU/GPU, basic tensor operations, automatic differentiation, building and training neural networks, data loading with DataLoader, image classification on MNIST, model deployment, and useful tips for accelerating deep‑learning workflows.

GPU · PyTorch · deep learning

0 likes · 9 min read

Ops Development & AI Practice

Feb 16, 2025 · Artificial Intelligence

Why FlashAttention Supercharges Qwen Models: A Technical Deep Dive

This article explains the FlashAttention algorithm, its memory‑efficient tiling and recomputation techniques, and how enabling the flash_attn flag dramatically speeds up Qwen‑series large models, and outlines the hardware and software requirements and potential trade‑offs.

FlashAttention · GPU Optimization · Large Language Model

0 likes · 8 min read

Ops Development & AI Practice

Feb 14, 2025 · Artificial Intelligence

Large Model Format Showdown: Hugging Face, TensorFlow, ONNX, TorchScript, GGUF

This comprehensive guide examines the leading large‑model storage formats—including Hugging Face Transformers, TensorFlow SavedModel, ONNX, TorchScript, and GGUF—detailing their file structures, serialization methods, strengths, weaknesses, and typical use‑cases, helping developers and researchers select the optimal format for their specific AI workloads.

AI Deployment · GGUF · Model Formats

0 likes · 21 min read

AI Code to Success

Feb 14, 2025 · Artificial Intelligence

TensorFlow vs PyTorch: Which Deep Learning Framework Wins for Your Projects?

An in‑depth comparison of TensorFlow and PyTorch examines their computation graph models, deployment tools, API ergonomics, community ecosystems, and performance characteristics, helping developers decide which framework best fits industrial production or fast‑paced research scenarios.

PyTorch · TensorFlow · ai-development

0 likes · 8 min read

Architect

Feb 13, 2025 · Artificial Intelligence

How to Build a Mini ChatGPT on a Single GPU with MiniMind

This article provides a comprehensive, step‑by‑step guide to training and fine‑tuning a miniature large‑language model called MiniMind, covering lightweight model design, open‑source training pipelines, required datasets, tokenizer options, and deployment via a web UI, all using PyTorch on modest hardware.

AI · LLM · MiniMind

0 likes · 11 min read

AI Code to Success

Feb 13, 2025 · Artificial Intelligence

Why PyTorch Is the Go-To Framework for Modern AI Development

This article introduces PyTorch, explains its dynamic computation graph, Python‑centric design, and tensor operations, surveys its major applications in computer vision, natural language processing, and reinforcement learning, and provides a step‑by‑step tutorial for building and training a multilayer perceptron on the MNIST dataset.

Dynamic Computation Graph · MNIST · PyTorch

0 likes · 11 min read

Python Programming Learning Circle

Jan 9, 2025 · Artificial Intelligence

Choosing Between Keras and PyTorch: A Guide for Deep Learning Beginners

This article compares Keras and PyTorch for beginners, explaining their differences, showing simple digit‑recognition code examples, and offering practical advice on how to select and transition between the two deep‑learning frameworks.

Keras · PyTorch · deep learning

0 likes · 6 min read

Python Programming Learning Circle

Jan 3, 2025 · Artificial Intelligence

Visualizing Convolutional Neural Network Features with 40 Lines of Python Code

This article demonstrates how to visualize convolutional features of a VGG‑16 network using only about 40 lines of Python code, explains the underlying concepts, walks through generating patterns by maximizing filter activations, and provides a complete implementation with hooks, loss functions, and multi‑scale optimization.

CNN · Feature Visualization · Hooks

0 likes · 15 min read

Python Programming Learning Circle

Dec 19, 2024 · Artificial Intelligence

DeepPurpose: An AI Toolkit for Accelerating COVID‑19 Drug Discovery

DeepPurpose, a PyTorch‑based AI toolkit developed by Harvard researchers, provides COVID‑19 bioassay data and 56 cutting‑edge models that enable rapid drug‑target affinity prediction, virtual screening, and drug repurposing with just a few lines of code, dramatically shortening new‑drug development cycles.

AI · COVID-19 · DeepPurpose

0 likes · 7 min read

Python Programming Learning Circle

Dec 19, 2024 · Artificial Intelligence

Overview of Microsoft’s Open‑Source Computer Vision Recipes Library

The article introduces Microsoft’s open‑source Computer Vision Recipes library, describing its purpose, target audience, repository links, supported vision scenarios such as image classification, similarity, detection, key‑point, segmentation, action recognition, multi‑object tracking and crowd counting, and provides guidance on using PyTorch, Azure and GPU resources.

Azure · PyTorch · image classification

0 likes · 7 min read

Cognitive Technology Team

Nov 20, 2024 · Artificial Intelligence

Fundamentals and Implementation of Neural Networks and Transformers with PyTorch Examples

This article provides a comprehensive overview of neural network fundamentals, loss functions, activation functions, embedding techniques, attention mechanisms, multi‑head attention, residual networks, and the full Transformer encoder‑decoder architecture, illustrated with detailed PyTorch code and a practical MiniRBT fine‑tuning case for Chinese text classification.

AI · PyTorch · Transformer

0 likes · 49 min read

Baobao Algorithm Notes

Nov 19, 2024 · Artificial Intelligence

Demystifying OpenRLHF Loss Functions: From GPTLM to KTO and Beyond

This article walks through the various loss functions used in OpenRLHF—including GPTLMLoss, KDLoss, DPOLoss, KTOLoss, and reward model losses—explaining their mathematical foundations, implementation details, and practical considerations for RLHF training.

DPO · KTO · Loss Functions

0 likes · 23 min read

DaTaobao Tech

Nov 13, 2024 · Artificial Intelligence

Understanding Neural Networks and Transformers: Principles, Implementation, and Applications

The article surveys neural networks from basic neuron operations and loss functions through deep architectures to the Transformer model, detailing embeddings, positional encoding, self‑attention, multi‑head attention, residual links, and encoder‑decoder design, and includes PyTorch code examples for linear regression, translation, and fine‑tuning Hugging Face’s MiniRBT for text classification.

AI · Attention Mechanism · NLP

0 likes · 44 min read

Zhuanzhuan Tech

Oct 16, 2024 · Artificial Intelligence

Optimizing TorchServe Inference Service Architecture for High‑Performance AI Deployment

This article details the engineering practice of optimizing TorchServe‑based AI inference services, covering background challenges, framework selection, GPU‑accelerated Torch‑TRT integration, CPU‑side preprocessing improvements, and deployment on Kubernetes to achieve higher throughput and lower resource consumption.

GPUOptimization · ModelServing · PyTorch

0 likes · 17 min read

DataFunSummit

Oct 5, 2024 · Artificial Intelligence

Optimizing TorchRec for Large‑Scale Recommendation Systems on PyTorch

This article details the performance‑focused optimizations applied to TorchRec, PyTorch's large‑scale recommendation system library, including CUDA graph capture, multithreaded kernel launches, pinned memory copies, and input‑distribution refinements that together achieve a 2.25× speedup on MLPerf DLRM‑DCNv2 across 16 DGX H100 nodes.

CUDA Graph · GPU Optimization · PyTorch

0 likes · 11 min read

Huawei Cloud Developer Alliance

Sep 18, 2024 · Artificial Intelligence

How Distributed Training Powers Massive Language Models: Concepts, Strategies, and Code

This article explains why single‑machine resources are insufficient for training ever‑larger language models, introduces the fundamentals of distributed training systems, details various parallel strategies such as data, model, pipeline, and hybrid parallelism, and provides practical PyTorch code and memory‑optimization techniques to accelerate large‑scale model training.

GPU · PyTorch · deep learning

0 likes · 29 min read

Baobao Algorithm Notes

Sep 18, 2024 · Artificial Intelligence

Why Training on 1,000 GPUs Is Harder Than You Think—and How to Tame It

Training deep learning models on a thousand GPUs faces steep communication overhead, higher failure probability, and scaling inefficiencies, but by profiling each step, overlapping compute and communication, using gradient bucketing and accumulation, and employing elastic training techniques, practitioners can approach near‑linear performance while mitigating common pitfalls.

GPU scaling · Performance Optimization · PyTorch

0 likes · 13 min read

Open Source Tech Hub

Sep 3, 2024 · Artificial Intelligence

Run Vision Transformer in PHP with phpy: A Complete Step‑by‑Step Guide

This article explains how to implement and run a Vision Transformer (ViT) model in PHP using the phpy extension, covering ViT fundamentals, installation of Python dependencies, full PHP and Python code examples, and practical application scenarios for PHP developers.

AI · PHP · PyTorch

0 likes · 15 min read

Rare Earth Juejin Tech Community

Aug 22, 2024 · Artificial Intelligence

Understanding Faster R-CNN: Architecture, Training, and Experimental Results

This article provides an in‑depth overview of the Faster R‑CNN object detection framework, covering its background, key innovations such as the Region Proposal Network, detailed algorithmic principles, training procedures, experimental results on PASCAL VOC and MS COCO, and a reproducible PyTorch implementation.

Faster R-CNN · PyTorch · RPN

0 likes · 14 min read

Rare Earth Juejin Tech Community

Jun 30, 2024 · Artificial Intelligence

Spatial Attention Mechanism and Its PyTorch Implementation

This article explains the principle of spatial attention in convolutional neural networks, details the underlying algorithmic steps, and provides a complete PyTorch implementation including the attention module, full network architecture, and practical considerations for integrating spatial attention into deep learning models.

CNN · Neural Network · PyTorch

0 likes · 10 min read

Rare Earth Juejin Tech Community

Jun 16, 2024 · Artificial Intelligence

HRNet Source Code Walkthrough: Keypoint Dataset Construction, Online Data Augmentation, and Training Pipeline

This article provides a detailed, English-language walkthrough of the HRNet source code, covering how the COCO keypoint dataset is built, the online data‑augmentation techniques applied during training, and the end‑to‑end training and inference procedures for human pose estimation.

Data Augmentation · HRNet · PyTorch

0 likes · 36 min read

Practical DevOps Architecture

May 30, 2024 · Artificial Intelligence

Eight‑Week LLM and Large Model Training Course Outline

This article outlines an eight‑week curriculum covering LLM evolution, PyTorch fundamentals, CUDA training, large‑model fine‑tuning, LangChain application development, cloud‑based quantization, industry case studies, and a recruitment session, providing video resources for each topic.

AI · LLM · LangChain

0 likes · 5 min read

Python Programming Learning Circle

May 11, 2024 · Artificial Intelligence

A Comprehensive Overview of Popular Python Libraries for Artificial Intelligence and Data Science

This article introduces and demonstrates more than twenty widely used Python libraries for artificial intelligence, computer vision, natural language processing, and data analysis, providing concise explanations and runnable code snippets that illustrate each library's core functionality and typical use cases.

NumPy · PyTorch · Python

0 likes · 29 min read

OPPO Kernel Craftsman

Mar 29, 2024 · Artificial Intelligence

InternLM Model Research and XTuner Practical Guide (Part 1): DataLoader, Model Conversion, Merging, and Inference

The guide walks through fine‑tuning InternLM‑Chat‑7B with XTuner, showing how to build a DataLoader from a HuggingFace Dataset, convert a LoRA .pth checkpoint to HuggingFace format, merge the adapter into the base model, run inference, and adapt the process for custom datasets and 4‑bit quantization experiments.

DataLoader · FineTuning · InternLM

0 likes · 27 min read

Test Development Learning Exchange

Mar 27, 2024 · Artificial Intelligence

Introduction to PyTorch and Example CNN Training on CIFAR-10

This article introduces PyTorch as a leading open‑source deep‑learning framework, outlines its key components such as dynamic computation graphs, tensors, autograd, modules, optimizers, data loading, distributed training and TorchScript, and provides a complete Python example that defines a simple CNN and trains it on the CIFAR‑10 dataset.

CNN · PyTorch · Python

0 likes · 8 min read

Alibaba Cloud Big Data AI Platform

Feb 28, 2024 · Artificial Intelligence

How PAI‑TorchAcc Supercharges OLMo LLM Training with Up to 1.64× Speedup

PAI‑TorchAcc, Alibaba Cloud’s PyTorch accelerator, integrates the open‑source OLMo large language model and delivers up to 1.64× faster training on OLMo‑1B and 1.52× on OLMo‑7B by combining graph capture with distributed, compute, communication, and memory optimizations; the article includes detailed usage steps and performance analysis.

LLM training · OLMo · PAI‑TorchAcc

0 likes · 7 min read

Alibaba Cloud Big Data AI Platform

Feb 23, 2024 · Artificial Intelligence

How PAI‑TorchAcc Supercharges Large‑Model Training on Alibaba Cloud

PAI‑TorchAcc, an Alibaba Cloud AI platform accelerator, offers a seamless PyTorch interface that integrates HuggingFace models and employs LazyTensor‑based static graph conversion, multi‑strategy distributed training, and extensive GPU optimizations to dramatically boost throughput for 1B‑175B parameter models, surpassing PyTorch native and Megatron‑LM performance.

AI acceleration · Alibaba Cloud · GPU Optimization

0 likes · 13 min read

DataFunTalk

Feb 13, 2024 · Artificial Intelligence

An Overview of NVIDIA NeMo: Open‑Source Framework for Speech AI, ASR, TTS, NLP and Large Language Model Training

This article introduces NVIDIA’s open‑source NeMo framework, detailing its PyTorch‑based architecture for Speech AI, ASR and TTS training, NLP and LLM support, GPU‑optimized parallelism, pre‑trained model resources, fine‑tuning techniques, and the accompanying NeMo Aligner and Framework tools.

ASR · NVIDIA NeMo · PyTorch

0 likes · 18 min read

Baidu Geek Talk

Feb 5, 2024 · Artificial Intelligence

Why Static Graphs Outperform Dynamic Graphs in AutoDiff: A Deep Dive

This article explains the fundamental differences between static and dynamic computation graphs, compares their memory and performance characteristics, shows how automatic differentiation works in each paradigm, and provides a step‑by‑step implementation of a toy static‑graph AutoDiff engine with Python code examples.

AutoDiff · Dynamic Graph · PyTorch

0 likes · 18 min read

Open Source Tech Hub

Jan 28, 2024 · Artificial Intelligence

How to Fine‑Tune a Text Classification Model with ModelScope’s PyTorch Trainer

This guide explains how to use ModelScope’s trainer components to fine‑tune a pretrained backbone for text classification, covering dataset loading, configuration modification, trainer construction, training, evaluation, prediction, and checkpoint management with concrete code examples.

ModelScope · PyTorch · Text Classification

0 likes · 11 min read
