Author

AI Algorithm Path

A public account focused on deep learning, computer vision, and autonomous driving perception algorithms, covering visual CV, neural networks, pattern recognition, related hardware and software configurations, and open-source projects.

138

Articles

Likes

442

Views

Comments

Latest from AI Algorithm Path

100 recent articles max

AI Algorithm Path

Aug 16, 2025 · Artificial Intelligence

Qwen-Image: The Best Open‑Source AI Image Generation Model Unveiled

Qwen-Image, an open‑source multimodal diffusion model, introduces a three‑component architecture, dual‑stream encoding, and a novel MSRoPE positional scheme to achieve superior text‑aligned image generation, with extensive benchmark results, detailed data engineering, progressive training strategies, and publicly released weights for easy access.

AI image generationMSRoPEOpen Source

0 likes · 9 min read

Qwen-Image: The Best Open‑Source AI Image Generation Model Unveiled

AI Algorithm Path

Aug 9, 2025 · Artificial Intelligence

How LoRA Enables Multimodal Capabilities in Large Language Models

This article compares two ways to add vision to large language models—training a native multimodal model from scratch or attaching a visual module to a pretrained LLM—then details the VoRA approach that uses LoRA adapters to inject visual knowledge without extra inference cost.

ChameleonLLaVALoRA

0 likes · 7 min read

How LoRA Enables Multimodal Capabilities in Large Language Models

AI Algorithm Path

Aug 8, 2025 · Artificial Intelligence

GPT‑5 Is Here: In‑Depth Technical Walkthrough of Architecture, Features, and Benchmarks

OpenAI’s GPT‑5, released on August 7 2025, introduces a unified system with real‑time routing, up to 400 k token context windows, multiple model families, refined safety mechanisms, new API controls, and benchmark results that show it surpasses GPT‑4 across intelligence, coding, instruction following, function calling and multimodal tasks.

AI architectureAPIGPT-5

0 likes · 9 min read

GPT‑5 Is Here: In‑Depth Technical Walkthrough of Architecture, Features, and Benchmarks

AI Algorithm Path

Aug 3, 2025 · Artificial Intelligence

Inside Meta’s PerceptionLM: A Deep Dive into Open‑Source Vision‑Language Models

The article provides a detailed analysis of Meta’s PerceptionLM, an open‑source perception language model built on Llama 3, describing its vision encoder, projector, dynamic tiling, three‑stage training pipeline, model variants, and competitive performance on image and video benchmarks.

Dynamic TilingLlama3Multimodal Training

0 likes · 10 min read

Inside Meta’s PerceptionLM: A Deep Dive into Open‑Source Vision‑Language Models

AI Algorithm Path

Aug 2, 2025 · Artificial Intelligence

Deep Learning Optimizers Demystified: Momentum, AdaGrad, RMSProp & Adam Explained

This article breaks down the core deep‑learning optimizers—gradient descent, Momentum, AdaGrad, RMSProp and Adam—showing why vanilla gradient descent converges slowly, how each method uses exponential moving averages to accelerate training, and why Adam is generally the preferred choice.

AdaGradAdamDeep Learning

0 likes · 8 min read

Deep Learning Optimizers Demystified: Momentum, AdaGrad, RMSProp & Adam Explained

AI Algorithm Path

Aug 1, 2025 · Fundamentals

Intuitive Explanation of the Exponential Weighted Moving Average Algorithm

This article explains the exponential weighted moving average (EWMA) as a practical time‑series approximation method, detailing its motivation, recursive formula, weight decay behavior, typical beta values, and a bias‑correction technique that improves early‑stage estimates.

Beta ParameterBias CorrectionExponential Weighted Moving Average

0 likes · 7 min read

Intuitive Explanation of the Exponential Weighted Moving Average Algorithm

AI Algorithm Path

Jul 29, 2025 · Artificial Intelligence

Why GLM‑4.5 Sets a New Benchmark for Open‑Source Large Language Models

GLM‑4.5 and its lightweight Air variant, featuring a deep‑layered MoE design, grouped‑query attention, and dual inference modes, achieve third‑place overall on 12 hard‑core benchmarks, excel in web‑browsing and tool‑calling with a 90.6 % success rate, and introduce novel training tricks such as the Muon optimizer and Slime RL framework.

AIGLM-4.5Large Language Model

0 likes · 8 min read

Why GLM‑4.5 Sets a New Benchmark for Open‑Source Large Language Models

AI Algorithm Path

Jul 27, 2025 · Artificial Intelligence

Understanding RLHF: How Human Feedback Trains Modern LLMs

This article explains the RLHF (Reinforcement Learning from Human Feedback) pipeline that powers ChatGPT and other large language models, covering the limitations of traditional fine‑tuning, the creation of human‑feedback datasets, reward‑model training, loss design, and the final PPO‑based fine‑tuning step.

ChatGPTHuman FeedbackLarge Language Models

0 likes · 8 min read

Understanding RLHF: How Human Feedback Trains Modern LLMs

AI Algorithm Path

Jul 26, 2025 · Artificial Intelligence

Qwen3-Coder: Alibaba’s 480‑Billion‑Parameter Open‑Source Code Model Takes on Claude 4

Alibaba’s Qwen team has released Qwen3-Coder, a 480‑billion‑parameter open‑source LLM specialized for code, featuring a 1‑million‑token context via YaRN, extensive benchmark superiority over most open models, and performance that rivals Claude 4 Sonnet while remaining fully accessible.

APILarge Language ModelOpen-source LLM

0 likes · 12 min read

Qwen3-Coder: Alibaba’s 480‑Billion‑Parameter Open‑Source Code Model Takes on Claude 4

AI Algorithm Path

Jul 20, 2025 · Artificial Intelligence

How to Build an Open‑Set Object Detection Workflow: A Comprehensive Guide

This article presents a step‑by‑step agentic object detection pipeline that combines open‑vocabulary detectors such as Grounding‑DINO with visual language models (GPT‑4o, o1) for concept extraction, critique, refinement, and validation, complete with code snippets, design rationale, and real‑world examples.

Grounding DINOOpen-Vocabulary DetectionPipeline

0 likes · 33 min read

How to Build an Open‑Set Object Detection Workflow: A Comprehensive Guide