Author

AI Algorithm Path

A public account focused on deep learning, computer vision, and autonomous-driving perception algorithms, covering neural networks, pattern recognition, related hardware and software configuration, and open-source projects.

138 Articles · 442 Views

Latest from AI Algorithm Path

100 recent articles max

AI Algorithm Path

Jun 11, 2025 · Artificial Intelligence

OpenAI's O3‑Pro Model: Deep Reasoning, Pricing, Benchmarks, and Access Guide

OpenAI introduced the O3‑Pro multimodal deep‑reasoning model alongside an 80% price cut for O3. The article covers the model's training via large‑scale reinforcement learning, compares capabilities and costs across GPT‑4o, GPT‑4.1, and O3‑Pro, lists core specs, limitations, and access methods, and presents benchmark tests that highlight both strengths and weaknesses.

AI · O3-Pro · OpenAI

0 likes · 10 min read

AI Algorithm Path

Jun 8, 2025 · Artificial Intelligence

Autoregressive vs Diffusion Language Models: Principles, Trade‑offs, and Future Directions

The article compares autoregressive and diffusion language models, detailing their mathematical foundations, training and inference pipelines, performance trade‑offs such as speed, coherence and diversity, and explores hybrid approaches and emerging research directions for more efficient and controllable text generation.

AI research · Transformer · autoregressive

0 likes · 17 min read

AI Algorithm Path

Jun 4, 2025 · Artificial Intelligence

Why LLMs Hallucinate and How to Mitigate the Problem

The article explains that hallucinations in large language models stem mainly from the supervised fine‑tuning stage, illustrates the issue with concrete examples, and presents mitigation techniques such as knowledge‑probing data generation and web‑search tool integration using special tokens.

LLM · Meta · OpenAssistant

0 likes · 12 min read

AI Algorithm Path

Jun 3, 2025 · Artificial Intelligence

Inside Tencent’s HunyuanVideo-Avatar: How Open‑Source AI Generates Digital Human Videos

Tencent’s HunyuanVideo-Avatar converts a static portrait and an audio clip into a lip‑synced, expressive video using a multimodal diffusion Transformer, offering open‑source weights, detailed module designs, hardware requirements, code examples, and a candid assessment of its strengths and current limitations.

AI Video Generation · CUDA · HunyuanVideo-Avatar

0 likes · 8 min read

AI Algorithm Path

May 27, 2025 · Artificial Intelligence

Reinforcement Learning Tutorial 8: Building State Feature Representations for Objective Optimization

This tutorial explains how to construct state feature vectors for reinforcement‑learning value‑function approximation, covering linear, polynomial, Fourier, and radial‑basis representations, as well as state aggregation techniques such as coarse coding and tile coding, and discusses non‑parametric approaches like kernel methods.

feature engineering · fourier basis · function approximation

0 likes · 16 min read

AI Algorithm Path

May 25, 2025 · Artificial Intelligence

Reinforcement Learning Tutorial 7: Introducing Value Function Approximation Methods

This article explains why tabular reinforcement‑learning methods scale poorly, introduces supervised‑learning‑based value‑function approximation with a parameter vector w, discusses loss design, stochastic‑gradient updates, bootstrapping, semi‑gradient techniques, and linear function approximation, and summarizes practical implications.

gradient Monte Carlo · linear function approximation · reinforcement learning

0 likes · 13 min read

AI Algorithm Path

May 24, 2025 · Artificial Intelligence

Claude 4 Unveiled: What the New AI Model Means for Coding, Safety, and Pricing

Claude 4 introduces two upgraded models: Opus 4, touted as the world’s best coding model, and Sonnet 4 with stronger reasoning. The article also covers new tool‑use capabilities, benchmark wins, a controversial safety test showing opportunistic blackmail, and detailed pricing and availability in the Cursor IDE.

AI model · Anthropic · Claude 4

0 likes · 10 min read

AI Algorithm Path

May 24, 2025 · Artificial Intelligence

How N-step Temporal-Difference Methods Extend TD Learning in Reinforcement AI

This tutorial explains how n-step temporal‑difference (TD) algorithms generalize the one‑step TD and Monte‑Carlo methods, presents the n‑step return update rule, walks through a three‑step TD example, shows how Sarsa and Q‑learning can be extended, and discusses how to choose the optimal n value for a given problem.

Monte Carlo · Q-Learning · algorithm analysis

0 likes · 9 min read

AI Algorithm Path

May 23, 2025 · Artificial Intelligence

Understanding Temporal‑Difference Algorithms in Reinforcement Learning

This tutorial explains temporal‑difference (TD) learning, compares it with dynamic programming and Monte‑Carlo methods, walks through concrete soccer‑match examples, contrasts one‑step TD with constant‑α Monte‑Carlo updates, discusses convergence and bias, and introduces popular TD variants such as Sarsa, Q‑learning, Expected Sarsa, and double learning.

Monte Carlo · Q-Learning · TD learning

0 likes · 18 min read

AI Algorithm Path

May 22, 2025 · Artificial Intelligence

Monte Carlo Policy Improvement in RL: Epsilon‑Greedy, On‑Policy vs Off‑Policy, and Incremental Updates

This tutorial explains how Monte Carlo methods are enhanced in reinforcement learning through epsilon‑greedy and epsilon‑soft policies, Monte Carlo control, a Blackjack Q‑function example, the distinction between on‑policy and off‑policy learning, importance sampling, and efficient incremental update techniques.

Epsilon-Greedy · Importance Sampling · Incremental Update

0 likes · 14 min read
