Tagged articles

deep learning

1261 articles · Page 2 of 13
Bighead's Algorithm Notes
Oct 18, 2025 · Artificial Intelligence

Time Series Paper Digest (Oct 11‑17 2025): FIRE, CauchyNet, EvoRate, CoRA

This digest covers four time‑series forecasting papers from Oct 11‑17, 2025. FIRE introduces a frequency‑domain decomposition with independent amplitude‑phase modeling and adaptive weighting; CauchyNet leverages holomorphic activations for compact, data‑efficient learning; EvoRate quantifies learnability via mutual information; and CoRA adds covariate‑aware adaptation to foundation models. All four report significant accuracy gains and enhanced interpretability.

AI research · covariate-aware adaptation · deep learning
0 likes · 10 min read
Bighead's Algorithm Notes
Oct 11, 2025 · Artificial Intelligence

Recent Advances in Multivariate Time Series Forecasting: Paper Summaries (Sep 27 – Oct 10 2025)

This article summarizes eight newly released AI papers on multivariate time‑series forecasting and anomaly detection. It details each work's motivation, methodology, and key innovations, covering CRIB, TS‑JEPA, DSAT‑HD, DIMIGNN, ASTGI, IndexNet, TsLLM, Moon, TimeSeriesScientist, MLG‑4TS, and Augur, and reports their experimental validation on real‑world datasets.

Anomaly Detection · Large Language Model · Transformer
0 likes · 23 min read
Bighead's Algorithm Notes
Oct 10, 2025 · Artificial Intelligence

Quantitative Finance Paper Digest (Sep 27 – Oct 10 2025)

This digest summarizes recent arXiv papers that introduce new AI‑driven methods for portfolio similarity, Bayesian portfolio optimization, end‑to‑end deep‑learning portfolio construction, large‑language‑model‑based financial prediction, and multi‑agent crypto‑trading systems, highlighting their datasets, architectures, and empirical gains.

Bayesian Optimization · Multi-Agent Systems · asset allocation
0 likes · 18 min read
Data Party THU
Oct 5, 2025 · Artificial Intelligence

How ImageDDI Boosts Drug‑Drug Interaction Prediction with Motif Sequences and Molecular Images

The ImageDDI framework, introduced by a team from Hunan University, combines molecular motif sequences with 2D/3D molecular images using a Transformer encoder and adaptive feature fusion, achieving significantly higher accuracy and macro‑F1 scores than existing methods on multiple DDI datasets, while also providing interpretable visual explanations.

Drug Interaction · Image Fusion · Molecular Representation
0 likes · 10 min read
Data Party THU
Oct 4, 2025 · Artificial Intelligence

Unveiling Transformer Internals: From Theory to PyTorch Code

This article deeply explores the Transformer architecture by combining original paper principles with PyTorch source code, covering encoder‑decoder design, positional encoding assumptions, core parameters, residual connections, attention mechanisms, and detailed implementation snippets to help readers understand and reproduce the model.

Positional Encoding · PyTorch · Transformer
0 likes · 22 min read
Mashang Consumer UXC
Sep 29, 2025 · Artificial Intelligence

Open-Source AI 3D, Video & Audio Models: Tencent, Vidu, Audio2Face and More

This article reviews the latest open‑source AI models released by major tech firms—including Tencent's 3D‑Omni and 3D‑Part, Shengshu Tech's Vidu Q2 for facial video, Nvidia's Audio2Face for real‑time facial animation, plus updates from Figma, Google, Alibaba and Kuaishou—highlighting their capabilities and potential applications in gaming, AR/VR, design and content creation.

3D modeling · AI · computer graphics
0 likes · 8 min read
Bighead's Algorithm Notes
Sep 25, 2025 · Artificial Intelligence

How MARS Uses Risk‑Aware Multi‑Agent RL to Master Portfolio Management

This article reviews the MARS framework, a risk‑aware multi‑agent reinforcement‑learning system for automated portfolio management that tackles market non‑stationarity and proactive risk control, detailing its hierarchical architecture, formal MDP formulation, training process, and superior experimental results on DJIA and HSI benchmarks.

Portfolio Management · deep learning · financial markets
0 likes · 13 min read
Wu Shixiong's Large Model Academy
Sep 25, 2025 · Artificial Intelligence

Master Self-Attention & Multi-Head Attention for Large Model Interviews

This guide breaks down the core logic, computation steps, formulas, and common interview questions about Self‑Attention and Multi‑Head Attention in Transformers, offering concrete explanations, dimensional examples, and practical answering techniques to help candidates ace large‑model algorithm interviews.

Interview Tips · Multi-Head Attention · Self-Attention
0 likes · 8 min read
AIWalker
Sep 24, 2025 · Artificial Intelligence

Top 2025 Object Detection Research Paths: From Grounding DINO 1.5 to Open‑Set Breakthroughs

The article outlines four key innovation avenues—architecture redesign, task expansion, information fusion, and paradigm shift—highlighting recent works such as Mr. DETR, Grounding DINO 1.5, SM3Det, and RoboFusion, and offers a curated list of 176 cutting‑edge object‑detection papers with code and datasets for free.

deep learning · model architecture · object detection
0 likes · 8 min read
Data Party THU
Sep 24, 2025 · Artificial Intelligence

What’s New in Stanford’s CS231n 2025: Full Course Materials and Syllabus

Stanford’s CS231n Spring 2025 course, led by Fei‑Fei Li and a team of leading AI researchers, is now fully available online with video lectures, detailed syllabus, instructor bios, and prerequisite guidelines, offering a comprehensive deep‑learning curriculum for computer‑vision enthusiasts.

CS231n · Course · Stanford
0 likes · 5 min read
Data Party THU
Sep 20, 2025 · Artificial Intelligence

How Mamba-Adaptor Revives State‑Space Models for Vision Tasks

The Mamba-Adaptor introduces a dual‑module adapter that overcomes causal computation limits, long‑range memory decay, and spatial structure loss in state‑space models, delivering state‑of‑the‑art results on ImageNet, COCO, and various downstream visual tasks with minimal overhead.

Adapter · COCO · ImageNet
0 likes · 8 min read
AIWalker
Sep 17, 2025 · Artificial Intelligence

Cutting-Edge Attention Mechanism Innovations for 2025: Modal Fusion and Domain Adaptation

This article surveys 183 recent attention‑mechanism papers, classifies them into four innovation categories, and highlights representative works such as MILA, ARFFT, CNN‑Transformer for speech emotion, and LSTM‑attention epidemic forecasting, providing concrete methods, code links, and performance insights.

2025 · Attention Mechanism · Domain Adaptation
0 likes · 7 min read
DataFunTalk
Sep 14, 2025 · Artificial Intelligence

Why Modern LLMs Skip Thinking: Token Routing and Zero‑Compute Experts Explained

The article examines how large language models now use routing mechanisms and token‑level expert selection to reduce computation and cost, illustrating the trade‑offs with real‑world examples from OpenAI, LongCat, and DeepSeek while highlighting both the benefits and the pitfalls of this approach.

AI · deep learning · model routing
0 likes · 8 min read
Data Party THU
Sep 13, 2025 · Artificial Intelligence

How AI is Revolutionizing Quantum System Modeling: A Comprehensive Review

This review surveys how artificial intelligence—through machine learning, deep learning, and large language models—enables researchers to characterize, predict, and reconstruct complex quantum systems, outlines a unified learning framework, discusses current breakthroughs and challenges, and envisions a future "quantum GPT" that could transform quantum science.

AI · Quantum Physics · deep learning
0 likes · 10 min read
AI Frontier Lectures
Sep 9, 2025 · Artificial Intelligence

Can UniConvNet Expand Receptive Fields While Preserving Gaussian Distribution?

The paper introduces UniConvNet, a novel convolutional architecture that expands the effective receptive field (ERF) of ConvNets without breaking the asymptotically Gaussian distribution (AGD), achieving superior accuracy‑parameter and accuracy‑FLOPs trade‑offs across image classification, detection, and segmentation benchmarks.

Effective Receptive Field · UniConvNet · convolutional neural networks
0 likes · 9 min read
AI Frontier Lectures
Sep 7, 2025 · Artificial Intelligence

How Dynamic Snake and Pinwheel Convolutions Boost Small‑Target Segmentation Accuracy

This article reviews two recent AI papers—Dynamic Snake Convolution with topological constraints for tubular structure segmentation and Pinwheel‑shaped Convolution with scale‑based dynamic loss for infrared small‑target detection—detailing their methods, innovations, experimental gains, and future research directions.

deep learning · dynamic convolution · medical imaging
0 likes · 7 min read
Architects' Tech Alliance
Sep 7, 2025 · Artificial Intelligence

How Huawei’s Ascend 910D Stacks Up Against Global AI Chip Rivals

Huawei’s Ascend 910D AI chip boasts a revamped architecture, 320 TFLOPS half‑precision performance, liquid cooling with a power draw of only 350 W, and 4 TB/s inter‑chip bandwidth. The article compares these specifications with previous 910 models, domestic competitors, and leading foreign chips such as the Nvidia H100, highlighting performance, cost, and ecosystem benefits.

AI chip · Ascend 910D · Hardware
0 likes · 15 min read
Bighead's Algorithm Notes
Sep 3, 2025 · Artificial Intelligence

Decoding TINs: Reconstructing Classic Technical Analysis with Neural Networks

The paper introduces Technical Indicator Networks (TINs), a framework that maps traditional technical analysis formulas to neural‑network topologies, initializes weights to preserve indicator behavior, and uses reinforcement learning for dynamic optimization, achieving significantly higher Sharpe, Sortino, and cumulative returns on US30 component stocks than conventional MACD approaches.

Algorithmic Trading · Technical Indicator Networks · deep learning
0 likes · 9 min read
Network Intelligence Research Center (NIRC)
Sep 3, 2025 · Artificial Intelligence

Understanding AI Compilers: A TVM Example

The article explains how AI compilers transform high‑level models into efficient hardware code, using TVM to illustrate operator optimization, automated scheduling, and end‑to‑end compilation workflow with concrete code examples and performance considerations.

AI compiler · TVM · auto-scheduler
0 likes · 8 min read
Data Party THU
Sep 2, 2025 · Artificial Intelligence

Gradient-Based Multi-Objective Deep Learning: Theory, Algorithms, and LLM Applications

This tutorial provides a systematic overview of gradient‑based multi‑objective optimization for deep learning, covering core solution strategies, algorithmic details, convergence and generalization analyses, and demonstrates how these methods can be applied to fine‑tune and align large language models.

Gradient Methods · LLM fine-tuning · Pareto Front
0 likes · 3 min read
Data STUDIO
Sep 2, 2025 · Artificial Intelligence

Understanding NAS: Core Algorithms and Python Implementations

This article reviews Neural Architecture Search (NAS), explains its bi‑level optimization formulation, compares three major search strategies—reinforcement learning, evolutionary algorithms, and differentiable gradient‑based methods—provides complete Python code for each, and analyzes experimental results highlighting performance trade‑offs and remaining challenges.

Differentiable Architecture Search · Evolutionary Algorithms · NAS
0 likes · 25 min read
Bighead's Algorithm Notes
Aug 29, 2025 · Artificial Intelligence

Weekly Quantitative Finance Paper Digest (Aug 23‑29, 2025)

This digest summarizes nine recent arXiv papers covering quantum portfolio optimization, thematic investing with semantic stock representations, multi‑indicator reinforcement learning for trading, attention‑based asset pricing, ESG variable selection, deep neural networks for return distribution forecasting, a foundation model for financial time‑series, a multi‑agent trading system with self‑reflection, and dynamic weighting machine‑learning stock selection strategies.

ESG · deep learning · financial time series
0 likes · 17 min read
Data Party THU
Aug 29, 2025 · Artificial Intelligence

How AI Is Transforming Ceramic Artifact Classification and Market Valuation

A collaborative study by Universiti Putra Malaysia and UNSW Sydney presents an AI-driven framework that combines an enhanced YOLOv11 model with a random‑forest regressor to automatically classify ceramic artifacts and predict their auction prices, demonstrating significant performance gains over traditional methods.

AI · Ceramic Classification · YOLOv11
0 likes · 13 min read
21CTO
Aug 27, 2025 · Artificial Intelligence

Who Built Modern AI? Meet the Pioneers Behind the Revolution

This article chronicles the evolution of artificial intelligence over eight decades, spotlighting seminal figures such as Alan Turing, Allen Newell, Marvin Minsky, John McCarthy, Yoshua Bengio, Geoffrey Hinton, Andrew Ng and Yann LeCun, and explains how their groundbreaking work shaped modern AI.

AI history · Computer Science · artificial-intelligence
0 likes · 8 min read
AIWalker
Aug 19, 2025 · Artificial Intelligence

Easy Ways to Boost YOLO: Systematic Review of Versions and Use Cases

This article systematically reviews every YOLO version and classifies five major improvement directions: architecture enhancements, efficiency optimizations, multi‑task learning, temporal modeling, and domain‑specific customizations. It also provides concrete paper references, code links, and dataset resources to help researchers and engineers quickly locate and apply the most effective techniques.

YOLO · deep learning · model improvement
0 likes · 8 min read
Bilibili Tech
Aug 12, 2025 · Artificial Intelligence

How AI Recreates Original Voices in Multilingual Video Dubbing

This article explains the technical challenges and innovative AI solutions behind preserving speaker identity, emotion, and timing while translating video content into multiple languages, covering speech generation modeling, speaker segmentation, adversarial reinforcement learning, proper‑noun adaptation, and audio‑visual alignment techniques.

AI voice cloning · Speech synthesis · audio-visual alignment
0 likes · 22 min read
Architects' Tech Alliance
Aug 10, 2025 · Artificial Intelligence

From Volta to Blackwell: How NVIDIA GPUs Evolved for Deep Learning

This article traces the evolution of NVIDIA's GPU architectures—from Volta's pioneering Tensor Cores through Turing, Ampere, Hopper, and the latest Blackwell—highlighting key innovations such as mixed‑precision support, NVLink, and specialized Tensor Core designs that have dramatically boosted AI training and inference performance.

AI hardware · GPU architecture · NVLink
0 likes · 10 min read
Qborfy AI
Aug 8, 2025 · Artificial Intelligence

Why Transformers Revolutionized AI: A Deep Dive into Self‑Attention

This article explains how the Transformer model replaces sequential RNN processing with parallel self‑attention, detailing its core components, positional encoding, encoder‑decoder workflow, industry impact, and surprising facts such as training speed gains and energy efficiency.

AI · Self-Attention · Transformer
0 likes · 5 min read
Qborfy AI
Aug 7, 2025 · Artificial Intelligence

Understanding RNNs: From Memory Cells to Real‑World Applications

This article explains how recurrent neural networks (RNNs) add memory to neural models, details the gate mechanisms of LSTM and GRU, compares their structures and parameter counts, and illustrates their use in speech recognition, translation, stock prediction, and video generation, while highlighting practical insights and energy considerations.

AI · GRU · LSTM
0 likes · 5 min read
AIWalker
Aug 3, 2025 · Artificial Intelligence

Tree-Guided CNN Boosts Image Super-Resolution in Joint University Study

A collaborative team from five universities proposes a tree-structured convolutional neural network that leverages binary‑tree guidance, cosine cross‑domain extraction, and an adaptive Nesterov momentum optimizer to markedly improve image super‑resolution performance.

adaptive optimizer · computer vision · deep learning
0 likes · 5 min read
Baobao Algorithm Notes
Aug 1, 2025 · Artificial Intelligence

Unlocking Qwen3-Coder-30B: Features, Fast Start, and Agentic Coding Guide

The article introduces Qwen3‑Coder‑30B‑A3B‑Instruct (aka Qwen3‑Coder‑Flash), detailing its architecture, 256K‑to‑1M token context, agentic coding capabilities, installation steps with Transformers, sample code for tool use, optimal sampling parameters, and deployment tips across various runtimes.

AI coding assistant · Large Language Model · Qwen3
0 likes · 6 min read
Architecture Development Notes
Jul 21, 2025 · Artificial Intelligence

Why Rust’s Burn Framework Is Redefining Deep Learning Performance

Burn, a native Rust deep learning framework by Tracel AI, combines extreme flexibility, high computational efficiency, and cross‑platform portability through a modular backend abstraction, type‑safe tensor operations, asynchronous execution, and extensive tooling, offering performance‑competitive alternatives to Python‑based frameworks for both training and inference.

Burn · GPU · deep learning
0 likes · 23 min read
Tencent Technical Engineering
Jul 18, 2025 · Artificial Intelligence

From CPUs to GPUs: How Traditional Backend Skills Power Modern AI Infrastructure

This article explores the evolution of AI infrastructure, comparing it with traditional backend systems, and details how hardware shifts to GPU-centric designs, software adaptations like deep learning frameworks, and engineering challenges in model training and inference can be addressed using established backend methodologies.

AI Infrastructure · GPU computing · Inference Optimization
0 likes · 19 min read
Tencent Cloud Developer
Jul 17, 2025 · Artificial Intelligence

Why GPUs Are the New CPUs: Unpacking AI Infrastructure Challenges

This article explores how AI infrastructure has shifted from CPU‑centric designs to GPU‑driven architectures, detailing hardware evolution, software changes, and the engineering challenges of large‑model training and inference, while offering practical insights for traditional backend engineers transitioning to AI systems.

AI Infrastructure · GPU computing · Model Training
0 likes · 16 min read
Alibaba Cloud Developer
Jul 16, 2025 · Artificial Intelligence

What Are the Core Concepts Behind AI? From Data to Models Explained

This article walks readers through the fundamentals of artificial intelligence, covering AI, machine learning, deep learning, data types, linear regression, supervised and unsupervised learning, reinforcement learning, feature engineering, tokenization, vectorization, and embeddings. It also includes a practical Word2Vec code example.

AI · Embedding · data science
0 likes · 21 min read
Kuaishou Large Model
Jul 11, 2025 · Artificial Intelligence

How MODA’s Modular Duplex Attention Boosts Multimodal Emotion Understanding

The paper introduces MODA, a new multimodal model that tackles attention imbalance across modalities with a modular duplex attention mechanism, achieving significant performance gains on perception, cognition, and emotion tasks across 21 benchmarks and demonstrating strong potential for human‑machine interaction.

MODA model · Multimodal AI · attention mechanisms
0 likes · 13 min read
IT Services Circle
Jul 6, 2025 · Artificial Intelligence

Why Transformers Train Like Any Neural Network: Backpropagation Explained

This article demystifies how Transformers are trained by showing that all their linear layers have learnable weights and biases, and that the attention mechanism—including softmax and dot‑product operations—is fully differentiable and updated via standard back‑propagation.

Backpropagation · PyTorch · Transformer
0 likes · 7 min read
Qborfy AI
Jul 3, 2025 · Artificial Intelligence

Why Loss Functions Matter: From Theory to Real‑World AI Applications

This article explains what loss functions are, outlines their three essential components, categorizes them for regression, classification, and generation tasks, reviews five classic loss functions with their noise resistance and gradient traits, and offers practical guidelines for selecting the right loss for AI models.

AI Fundamentals · Regression · classification
0 likes · 4 min read
Qborfy AI
Jul 2, 2025 · Artificial Intelligence

Mastering Activation Functions: From Sigmoid to Swish and When to Use Them

This article explains the role of activation functions in neural networks, compares five classic functions with formulas, performance trade‑offs, and gradient behavior, and provides a Python visualization demo plus several practical insights and real‑world examples.

ReLU · Swish · activation functions
0 likes · 7 min read
JD Tech Talk
Jul 2, 2025 · Artificial Intelligence

How JoyGen Delivers High‑Quality Audio‑Driven 3D Talking‑Face Video Editing

JoyGen introduces a two‑stage framework that combines 3D facial reconstruction with audio‑driven motion generation to produce synchronized, high‑fidelity talking‑face videos, and validates its effectiveness on both the HDTF benchmark and a newly built high‑resolution Chinese speaking‑face dataset.

3DMM · AIGC · audio-driven
0 likes · 13 min read
JD Cloud Developers
Jul 2, 2025 · Artificial Intelligence

How JoyGen Achieves High‑Quality Audio‑Driven 3D Talking‑Face Video Editing

JoyGen introduces a two‑stage framework that combines 3D morphable model reconstruction with audio‑driven lip motion generation and depth‑aware visual synthesis, delivering precise audio‑lip synchronization and superior visual quality on both the HDTF benchmark and a newly built high‑resolution Chinese talking‑face dataset.

3DMM · AIGC · audio-driven video
0 likes · 12 min read
Tencent Architect
Jul 2, 2025 · Artificial Intelligence

How Tencent’s TEG Shannon Lab Dominated the NTIRE 2025 UGC Video Enhancement Challenge

Tencent TEG Shannon Lab won the NTIRE 2025 UGC Video Enhancement competition with a progressive training framework that combines adaptive color enhancement, high‑speed denoising, and temporal stability under bitrate constraints, achieving top subjective scores, significant inference speed‑ups, and successful INT8 quantization for real‑time deployment.

AI video codec · NTIRE2025 · Quantization
0 likes · 18 min read
Huolala Tech
Jul 2, 2025 · Artificial Intelligence

Can Diffusion Models Revolutionize Salient Object Detection?

This article introduces a diffusion‑based framework for salient object detection, discusses its background, challenges, and motivations, details the model architecture and training, presents extensive experiments and ablation studies, and outlines limitations and future research directions.

computer vision · deep learning · diffusion model
0 likes · 11 min read
Qborfy AI
Jul 1, 2025 · Artificial Intelligence

Why CNNs Outperform Fully Connected Networks: A Deep Dive into Architecture and Applications

This article explains the fundamentals of convolutional neural networks (CNNs), detailing their definition, advantages over fully connected networks, architectural components such as input, hidden, and output layers, key operations like convolution, pooling, and activation, and showcases practical applications and notable insights.

CNN · artificial-intelligence · computer vision
0 likes · 5 min read
JD Retail Technology
Jul 1, 2025 · Artificial Intelligence

JoyGen: Audio‑Driven 3D Depth‑Aware Talking‑Face Video Editing Explained

JoyGen introduces a two‑stage framework that generates high‑quality talking‑face videos by synchronizing lip movements with input audio using 3DMM‑based identity and expression coefficients, depth‑aware supervision, and a newly built high‑resolution Chinese speaking‑face dataset, achieving state‑of‑the‑art performance on multiple benchmarks.

3DMM · AIGC · audio-driven video
0 likes · 13 min read
Cognitive Technology Team
Jun 29, 2025 · Artificial Intelligence

Understanding Transformers: Core Mechanics Behind Modern AI Models

This article demystifies the Transformer architecture for beginners, explaining its relationship to large models, the self‑attention and multi‑head attention mechanisms, positional encoding, and the roles of Encoder and Decoder components, using clear analogies and visual diagrams to aid comprehension.

Encoder-Decoder · Multi-Head Attention · Positional Encoding
0 likes · 20 min read
AIWalker
Jun 24, 2025 · Artificial Intelligence

How Multimodal Fusion Accelerates Paper Publication: Key Insights and Resources

The article surveys 117 recent multimodal‑fusion papers, classifies them into improvement‑based and combination‑based approaches, highlights representative works such as TimeXL, OGP‑Net, MMR‑Mamba and FusionSight, and provides a free collection of papers, classic models and code repositories for researchers.

AI research · computer vision · deep learning
0 likes · 8 min read
DataFunSummit
Jun 21, 2025 · Artificial Intelligence

From Bias to Fairness: De‑biasing Techniques in Uplift Modeling

This article explores the fundamentals and challenges of uplift modeling, explains why unbiased random data are essential, and presents a comprehensive suite of bias‑correction methods—including reweighting, propensity‑score matching, and advanced deep‑learning architectures such as TarNet, CFRNet, and DragonNet—to improve causal effect estimation in marketing and finance applications.

Bias Correction · Uplift Modeling · causal inference
0 likes · 15 min read
Architects' Tech Alliance
Jun 19, 2025 · Fundamentals

Unlock the Secrets of GPUs: 100 Essential Fundamentals Explained

This comprehensive guide covers 100 essential GPU fundamentals, from basic definitions and architecture to core technologies, performance optimization, emerging trends, and industry developments, providing a complete technical foundation for graphics, AI, and high‑performance computing applications.

Computer Architecture · GPU · Graphics Processing Unit
0 likes · 19 min read
AI Algorithm Path
Jun 19, 2025 · Artificial Intelligence

Training Neural Networks with Minimal Labeled Data Using Active Learning

This article explains how active learning can dramatically reduce the amount of labeled data required for training deep neural networks by selecting the most informative and representative samples, and provides a complete Python implementation of a hybrid query strategy (DBAL) with ResNet‑18.

Active Learning · DBAL · Python
0 likes · 14 min read
AI Frontier Lectures
Jun 16, 2025 · Artificial Intelligence

What Do the CVPR 2025 Awards Reveal About the Future of Computer Vision?

The CVPR 2025 awards spotlight groundbreaking work—from the VGGT transformer that predicts full 3D scenes in a single feed‑forward pass to neural inverse rendering that reconstructs geometry from time‑resolved light—offering a comprehensive view of emerging trends, novel architectures, and performance breakthroughs across computer‑vision research.

3D reconstruction · CVPR 2025 · deep learning
0 likes · 11 min read
MaGe Linux Operations
Jun 15, 2025 · Artificial Intelligence

Mastering Transformers: Key Extensions and Optimization Techniques Explained

This comprehensive guide walks you through the Transformer architecture—from its encoder‑decoder structure and self‑attention mechanism to multi‑head attention, positional embeddings, and practical PyTorch implementations—providing clear visualizations and code examples for deep learning practitioners.

PyTorch · Self-Attention · Transformer
0 likes · 22 min read
Architects' Tech Alliance
Jun 15, 2025 · Fundamentals

Master GPU Fundamentals: Architecture, Performance, and Programming Insights

This comprehensive guide covers GPU definitions, evolution, core components, architectural designs, performance metrics, programming models, deep‑learning applications, comparisons with other processors, practical use cases, optimization techniques, and future trends, providing a solid foundation for anyone interested in modern graphics and compute acceleration.

Computer Architecture · GPU · Hardware
0 likes · 43 min read
Open Source Linux
Jun 12, 2025 · Artificial Intelligence

From Transformers to DeepSeek‑R1: The Evolution of Large Language Models (2017‑2025)

This article chronicles the rapid development of large language models from the 2017 Transformer breakthrough through the rise of BERT, GPT‑3, multimodal models, alignment techniques like RLHF, and finally the cost‑efficient DeepSeek‑R1 in 2025, highlighting key innovations, scaling trends, and real‑world impacts.

AI alignment · Model Scaling · Multimodal
0 likes · 26 min read
Zhihu Tech Column
Jun 11, 2025 · Artificial Intelligence

How Minute‑Level Time Decay Boosts User Retention Modeling in Recommendation Systems

This article presents a novel minute‑level future‑reward framework with dual‑delay incentives, activity‑based attribution, multi‑task delayed modeling, and sequential streaming training that dramatically improves user retention prediction accuracy and real‑time performance in large‑scale recommendation platforms.

User Retention · deep learning · multi‑task modeling
0 likes · 17 min read
Kuaishou Audio & Video Technology
Jun 11, 2025 · Artificial Intelligence

Kuaishou Showcases 12 Cutting-Edge CVPR 2025 Papers on Video Generation and AI

Kuaishou presented twelve peer‑reviewed papers at CVPR 2025 covering video quality assessment, large‑scale video datasets, dynamic 3D avatar reconstruction, 4D scene simulation, controllable video generation, scaling laws for diffusion transformers, multimodal foundations, and more, highlighting the company's leading research in computer vision and AI.

AI research · CVPR2025 · Multimodal
0 likes · 21 min read
AI Frontier Lectures
Jun 10, 2025 · Artificial Intelligence

Can One Model Master All Remote Sensing Tasks? Introducing the TSSUN Framework

This paper presents the Temporal‑Spectral‑Spatial Unified Network (TSSUN), a flexible deep‑learning architecture that simultaneously handles semantic segmentation, semantic change detection, and binary change detection across heterogeneous remote‑sensing inputs, achieving state‑of‑the‑art performance without task‑specific retraining.

Attention Mechanism · TSSUN · deep learning
0 likes · 15 min read
AIWalker
Jun 3, 2025 · Artificial Intelligence

DeepKD: Double‑Layer Decoupling and Adaptive Denoising Set New ImageNet SOTA

DeepKD introduces a double‑layer decoupling framework and a dynamic top‑K mask that adaptively denoises low‑confidence logits, addressing conflicts between target and non‑target knowledge flows; extensive experiments on CIFAR‑100, ImageNet‑1K, and MS‑COCO demonstrate consistent accuracy gains and state‑of‑the‑art performance.

GSNR · SOTA · deep learning
0 likes · 23 min read
AIWalker
Jun 2, 2025 · Artificial Intelligence

NTIRE 2025 UGC Video Enhancement Challenge: Methods and Results

The NTIRE 2025 challenge introduced a new benchmark for user‑generated content video enhancement, detailing a 150‑video dataset, a pairwise subjective evaluation using the Bradley‑Terry model, hardware specifications, and the diverse multi‑stage deep‑learning methods and results of participating teams.

NTIRE 2025 · UGC video · benchmark
0 likes · 22 min read
AIWalker
Jun 2, 2025 · Artificial Intelligence

Multi-University Team Proposes Tree-Guided CNN for Image Super-Resolution

The paper presents a tree‑guided convolutional neural network that leverages binary‑tree structures, cosine‑based cross‑domain feature extraction, and an adaptive Nesterov momentum optimizer to enhance key layer interactions, achieving superior image super‑resolution performance as demonstrated by extensive experiments.

adaptive Nesterov optimizer · cosine feature extraction · deep learning
0 likes · 5 min read
DaTaobao Tech
May 16, 2025 · Artificial Intelligence

JianYi: AI‑Powered Image Segmentation and Matting System for Taobao Home‑Decoration

The article introduces JianYi, a self‑developed image segmentation and matting system for Taobao's home‑decoration business that supports product, human, and panoramic segmentation with multi‑modal interaction, achieving high‑precision real‑time performance and powering AI tools such as "Jiazuo" and "Fang Wo Jia".

artificial-intelligence · computer vision · deep learning
0 likes · 11 min read
Bilibili Tech
May 16, 2025 · Artificial Intelligence

How FineVQ Sets New Standards for Fine‑Grained UGC Video Quality Assessment

The article introduces FineVD, the first large‑scale multi‑dimensional UGC video quality dataset, and presents FineVQ, a unified model that predicts quality scores, attributes, and distortion types across six dimensions, achieving state‑of‑the‑art performance on multiple benchmarks and cross‑dataset evaluations.

FineVQ · Multimodal · UGC
0 likes · 9 min read
Amap Tech
May 8, 2025 · Artificial Intelligence

FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

FantasyTalking generates high-fidelity, coherent talking portraits from a single static image by employing a two-stage audio-visual alignment—global segment-level motion and frame-level lip refinement—combined with face-centric cross-attention for identity preservation and a motion-intensity module that lets users control expression and body movement, achieving superior realism, synchronization, and performance over prior methods.

audio-visual alignment · deep learning · identity preservation
0 likes · 10 min read
IT Services Circle
May 2, 2025 · Artificial Intelligence

Understanding Gradient Vanishing in Deep Neural Networks and How to Mitigate It

The article explains why deep networks suffer from gradient vanishing—especially when using sigmoid or tanh activations—covers the underlying mathematics, compares activation functions, and presents practical techniques such as proper weight initialization, batch normalization, residual connections, and code examples to visualize the phenomenon.

Batch Normalization · ResNet · activation functions
0 likes · 7 min read
JD Tech
Apr 30, 2025 · Artificial Intelligence

TimeHF: A Billion‑Scale Time Series Forecasting Model Guided by Human Feedback

The JD Supply Chain algorithm team introduces TimeHF, a billion‑parameter time‑series large model that leverages RLHF to boost demand‑forecast accuracy by over 10%, detailing dataset construction, the PCTLM architecture, a custom RLHF framework (TPO), and experimental results that the team reports as state of the art.

Big Data · RLHF · deep learning
0 likes · 10 min read
AI Frontier Lectures
Apr 30, 2025 · Artificial Intelligence

How Dual‑Domain Strip Attention Revolutionizes Image Restoration

The paper introduces Dual‑Domain Strip Attention Network (DSANet), a lightweight architecture that combines spatial and frequency strip attention to boost multi‑scale representation learning, achieving state‑of‑the‑art performance on dehazing, desnowing, defocus deblurring, and denoising tasks with significantly lower computational cost.

deep learning · dual-domain attention · neural networks
0 likes · 10 min read
Network Intelligence Research Center (NIRC)
Apr 23, 2025 · Artificial Intelligence

DeepQueueNet in Practice: Quickly Achieve High‑Precision Network Simulation

This article walks through using DeepQueueNet—a deep‑learning‑enhanced network performance estimator—to set up a device model, train the PyTorch version, configure a fattree16 topology, and run multi‑GPU simulations that deliver minute‑level, packet‑accurate results in as little as 1 minute 27 seconds.

DeepQueueNet · PyTorch · deep learning
0 likes · 6 min read
Alibaba Cloud Big Data AI Platform
Apr 22, 2025 · Artificial Intelligence

How DistilQwen2.5-DS3-0324 Achieves Fast, Accurate Reasoning via Quick‑Think Distillation

This article introduces DistilQwen2.5-DS3-0324, a distilled language model series that balances rapid inference with strong reasoning by applying a fast‑thinking chain‑of‑thought strategy, details its two‑stage distillation framework, evaluation on diverse benchmarks, and provides code for downloading and using the models.

Chain-of-Thought · deep learning · fast inference
0 likes · 17 min read
Alibaba Cloud Developer
Apr 18, 2025 · Artificial Intelligence

How the New 14B End‑to‑End Video Model Generates Custom 720p Clips from Two Images

The open‑sourced 14‑billion‑parameter Tongyi Wanxiang video model can create high‑quality 720p videos that seamlessly connect user‑provided start and end images, offering controllable, personalized video generation with prompt‑driven camera motions and easy access via its website, GitHub, Hugging Face, and ModelScope.

AI model · computer vision · deep learning
0 likes · 5 min read
AIWalker
Apr 16, 2025 · Artificial Intelligence

Plug‑and‑Play Multi‑Scale Attention: A Seamless Boost for Model Performance

This article reviews recent multi‑scale attention breakthroughs—including EMA, MSDA, VWA, and related modules—showing how they improve accuracy, cut FLOPs by up to 70%, and can be inserted into existing models with minimal effort, backed by code and paper links.

Model Efficiency · Plug-and-Play · computer vision
0 likes · 10 min read
Cognitive Technology Team
Apr 12, 2025 · Artificial Intelligence

Analyzing a Trained Neural Network: Visualizing Hidden Layers and Understanding Its Limitations

This article walks through an interactive exploration of a simple two‑hidden‑layer neural network, showing how real‑time visualizations reveal its learned representations, accuracy limits, and why constrained training leads to over‑confident yet unintelligent predictions before introducing backpropagation.

Backpropagation · deep learning · hidden layers
0 likes · 10 min read
Cognitive Technology Team
Apr 9, 2025 · Artificial Intelligence

How Neural Networks Learn: Gradient Descent and Loss Functions

This article explains how neural networks learn by using labeled training data, describing the role of weights, biases, activation functions, and how gradient descent iteratively adjusts parameters to minimize loss, illustrated with the MNIST digit‑recognition example.

MNIST · deep learning · gradient descent
0 likes · 16 min read
JD Tech
Apr 8, 2025 · Artificial Intelligence

MaRCA: Multi‑Agent Reinforcement Learning Computation Allocation for Full‑Chain Advertising Systems

The article presents MaRCA, a multi‑agent reinforcement learning framework that models user value, compute consumption, and action reward to allocate limited computation resources across the entire advertising recommendation pipeline, achieving higher ad revenue while keeping system load stable under fluctuating traffic and diverse request values.

Load-Aware Scheduling · Multi-Agent Reinforcement Learning · advertising systems
0 likes · 16 min read
AI Frontier Lectures
Apr 8, 2025 · Artificial Intelligence

How HINT’s Hierarchical Multi‑Head Attention Boosts Image Restoration

The article introduces HINT, a Transformer‑based image restoration model that solves the redundancy of standard multi‑head attention by using Hierarchical Multi‑Head Attention and a Query‑Key Cache Updating module, and demonstrates superior PSNR/SSIM performance across multiple low‑level vision tasks while keeping model complexity low.

deep learning · query-key cache
0 likes · 10 min read
Cognitive Technology Team
Apr 8, 2025 · Artificial Intelligence

Understanding Neural Networks: Structure, Layers, and Activation

This article explains how a simple neural network can recognize handwritten digits by preprocessing images, organizing neurons into input, hidden, and output layers, using weighted sums, biases, sigmoid compression, and matrix multiplication to illustrate the fundamentals of deep learning.

Layers · Sigmoid · activation functions
0 likes · 16 min read
Data Thinking Notes
Apr 6, 2025 · Artificial Intelligence

Why Mixture of Experts (MoE) is Revolutionizing Large AI Models

Mixture of Experts (MoE) leverages dynamic conditional computation and specialized expert networks to overcome the parameter explosion and inefficiency of dense models, offering scalable capacity, multi‑task adaptability, and improved efficiency, while addressing challenges such as training stability, communication overhead, and load balancing.

Dynamic Routing · Mixture of Experts · Model Scaling
0 likes · 7 min read
Baidu Tech Salon
Apr 2, 2025 · Artificial Intelligence

PaddlePaddle Framework 3.0 Released: Five Core Innovations for Large Models and Scientific Computing

PaddlePaddle 3.0, launched on April 1, 2025, introduces five core innovations: dynamic‑static unified automatic parallelism, a training‑inference integrated PIR, high‑order automatic differentiation for scientific computing, a one‑stage CINN compiler, and heterogeneous multi‑chip adaptation. Together they dramatically reduce distributed‑training code, boost performance up to four‑fold, and extend the framework to aerospace, automotive, meteorology and life‑science applications while remaining fully compatible with the 2.0 API.

PaddlePaddle · Scientific Computing · automatic parallelism
0 likes · 21 min read
Cognitive Technology Team
Mar 31, 2025 · Artificial Intelligence

Recommendation Algorithms: Using Mathematical Methods for Efficient Information Matching

Recommendation algorithms, rooted in machine learning and deep learning, transform massive user‑generated data into mathematical models that filter and personalize content, covering traditional collaborative filtering, matrix factorization, cosine similarity, and modern deep models such as Wide & Deep and Two‑Tower retrieval, illustrating their evolution and practical applications.

Wide&Deep · collaborative filtering · deep learning
0 likes · 14 min read
Cognitive Technology Team
Mar 31, 2025 · Artificial Intelligence

Understanding Douyin's Recommendation Algorithm: From Behavior Prediction to Value Modeling

The article explains how Douyin's recommendation system uses machine‑learning and deep‑learning models to predict user actions, assign value weights, and dynamically adjust scores, highlighting both its efficiency in large‑scale content distribution and its inherent limitations compared to human understanding.

AI · deep learning · recommendation system
0 likes · 7 min read
JavaEdge
Mar 27, 2025 · Artificial Intelligence

Can a Single LLM Both See and Reason? Exploring Visual Reasoning Models (VRM)

This article examines the limitations of current vision‑language and reasoning models, proposes a visual reasoning model (VRM) that can process images and perform deep logical inference, and discusses architecture, training methods, reinforcement‑learning reward designs, and practical challenges.

LLM · Visual Reasoning · artificial-intelligence
0 likes · 8 min read
JD Retail Technology
Mar 18, 2025 · Artificial Intelligence

Multi‑Agent Reinforcement Learning Based Full‑Chain Computation Allocation (MaRCA) for Advertising Systems

MaRCA, a multi‑agent reinforcement‑learning framework, allocates compute across JD’s advertising playback chain by jointly estimating user value, resource consumption, and action outcomes while dynamically adjusting to real‑time load, achieving roughly 15% higher ad revenue without extra compute resources.

Advertising · Compute Scheduling · deep learning
0 likes · 18 min read
JavaEdge
Mar 15, 2025 · Artificial Intelligence

Boost NLP Model Performance with n-gram Feature Engineering

This article explains why feature engineering is crucial for NLP tasks, introduces n‑gram enhancements, provides Python implementations for generating bi‑gram and higher‑order features, demonstrates dynamic padding for text length standardization, and offers practical deployment tips such as feature dimension control and monitoring.

N-gram · NLP · Python
0 likes · 7 min read
AIWalker
Mar 14, 2025 · Artificial Intelligence

Dynamic Tanh Lets He Kaiming and LeCun Drop Transformer Normalization in 9 Lines

Researchers He Kaiming, Yann LeCun and colleagues propose a 9‑line Dynamic Tanh (DyT) layer that replaces LayerNorm/RMSNorm in Transformers, showing comparable or superior accuracy across vision, language, speech and DNA tasks while also reducing inference latency on modern GPUs.

AI research · Dynamic Tanh · Model Efficiency
0 likes · 18 min read
AI Frontier Lectures
Mar 14, 2025 · Artificial Intelligence

Do Vision Models Really Need Mamba? A Deep Dive into MambaOut

This article critically examines the MambaOut paper, analyzing whether state‑space‑based Mamba token mixers are necessary for vision tasks, presenting two hypotheses, describing the construction of MambaOut models without SSM, and reporting extensive ImageNet, COCO and ADE20K experiments that reveal when Mamba is beneficial.

Mamba · State Space Model · Token Mixer
0 likes · 17 min read
AIWalker
Mar 10, 2025 · Artificial Intelligence

HSR-Mamba Solves Mamba’s HSISR Issue with Dual Strategies, Beats Prior Methods

HSR-Mamba introduces a contextual spatial‑spectral state‑space model that tackles Mamba's limitations in hyperspectral image super‑resolution through a local partition mechanism and a global spectral rearrangement strategy, achieving significantly higher PSNR, SSIM and SAM scores than existing approaches while using fewer parameters and FLOPs.

Dual strategy · HSI super-resolution · Mamba
0 likes · 25 min read
AIWalker
Mar 8, 2025 · Artificial Intelligence

Trainable HVI Color Space Turns Dark Photos into Cinematic Images – CVPR 2025

The paper introduces the HVI color space, which it presents as the first trainable color space of its kind, together with a lightweight network, CIDNet. The two jointly model intensity and chrominance, eliminating color bias and brightness artifacts in low‑light image enhancement and achieving state‑of‑the‑art results on ten benchmark datasets.

CIDNet · CVPR 2025 · HVI color space
0 likes · 12 min read
Cognitive Technology Team
Mar 6, 2025 · Artificial Intelligence

From Traditional Machine Learning to Deep Learning: A Comprehensive Guide to Algorithms, Feature Engineering, and Model Training

This article provides a step‑by‑step tutorial that walks readers through the fundamentals of traditional machine‑learning algorithms, feature‑engineering techniques, model training pipelines, evaluation metrics, and then advances to deep‑learning concepts such as MLPs, activation functions, transformers, and modern recommendation‑system models.

Model Training · Python · Recommendation Systems
0 likes · 63 min read
Alibaba Cloud Developer
Mar 6, 2025 · Artificial Intelligence

From Linear Regression to Transformers: Mastering Machine Learning Foundations

This comprehensive guide walks readers through the evolution of machine learning, starting with basic linear models and feature engineering, progressing through logistic regression, decision trees, and deep learning architectures like MLPs, CNNs, RNNs, and transformers, and demonstrates practical implementations with code examples and evaluation metrics.

Evaluation Metrics · Recommendation Systems · deep learning
0 likes · 64 min read