Tagged articles
1235 articles
Bighead's Algorithm Notes
Aug 29, 2025 · Artificial Intelligence

Weekly Quantitative Finance Paper Digest (Aug 23‑29, 2025)

This digest summarizes nine recent arXiv papers covering quantum portfolio optimization, thematic investing with semantic stock representations, multi‑indicator reinforcement learning for trading, attention‑based asset pricing, ESG variable selection, deep neural networks for return distribution forecasting, a foundation model for financial time‑series, a multi‑agent trading system with self‑reflection, and dynamic weighting machine‑learning stock selection strategies.

Deep Learning · ESG · Quantitative Finance
17 min read
Data Party THU
Aug 29, 2025 · Artificial Intelligence

How AI Is Transforming Ceramic Artifact Classification and Market Valuation

A collaborative study by Universiti Putra Malaysia and UNSW Sydney presents an AI-driven framework that combines an enhanced YOLOv11 model with a random‑forest regressor to automatically classify ceramic artifacts and predict their auction prices, demonstrating significant performance gains over traditional methods.

AI · Ceramic Classification · Deep Learning
13 min read
21CTO
Aug 27, 2025 · Artificial Intelligence

Who Built Modern AI? Meet the Pioneers Behind the Revolution

This article chronicles the evolution of artificial intelligence over eight decades, spotlighting seminal figures such as Alan Turing, Allen Newell, Marvin Minsky, John McCarthy, Yoshua Bengio, Geoffrey Hinton, Andrew Ng and Yann LeCun, and explains how their groundbreaking work shaped modern AI.

AI history · Deep Learning · artificial intelligence
8 min read
AIWalker
Aug 19, 2025 · Artificial Intelligence

Easy Ways to Boost YOLO: Systematic Review of Versions and Use Cases

This article systematically reviews every YOLO version and classifies five major improvement directions: architecture enhancements, efficiency optimizations, multi‑task learning, temporal modeling, and domain‑specific customizations. It also provides concrete paper references, code links, and dataset resources to help researchers and engineers quickly locate and apply the most effective techniques.

Deep Learning · YOLO · model improvement
8 min read
Bilibili Tech
Aug 12, 2025 · Artificial Intelligence

How AI Recreates Original Voices in Multilingual Video Dubbing

This article explains the technical challenges and innovative AI solutions behind preserving speaker identity, emotion, and timing while translating video content into multiple languages, covering speech generation modeling, speaker segmentation, adversarial reinforcement learning, proper‑noun adaptation, and audio‑visual alignment techniques.

AI voice cloning · Deep Learning · Speech synthesis
22 min read
Architects' Tech Alliance
Aug 10, 2025 · Artificial Intelligence

From Volta to Blackwell: How NVIDIA GPUs Evolved for Deep Learning

This article traces the evolution of NVIDIA's GPU architectures—from Volta's pioneering Tensor Cores through Turing, Ampere, Hopper, and the latest Blackwell—highlighting key innovations such as mixed‑precision support, NVLink, and specialized Tensor Core designs that have dramatically boosted AI training and inference performance.

AI hardware · Deep Learning · GPU architecture
10 min read
Qborfy AI
Aug 8, 2025 · Artificial Intelligence

Why Transformers Revolutionized AI: A Deep Dive into Self‑Attention

This article explains how the Transformer model replaces sequential RNN processing with parallel self‑attention, detailing its core components, positional encoding, encoder‑decoder workflow, industry impact, and surprising facts such as training speed gains and energy efficiency.

AI · Deep Learning · Model architecture
5 min read
Qborfy AI
Aug 7, 2025 · Artificial Intelligence

Understanding RNNs: From Memory Cells to Real‑World Applications

This article explains how recurrent neural networks (RNNs) add memory to neural models, details the gate mechanisms of LSTM and GRU, compares their structures and parameter counts, and illustrates their use in speech recognition, translation, stock prediction, and video generation, while highlighting practical insights and energy considerations.

AI · Deep Learning · GRU
5 min read
AIWalker
Aug 3, 2025 · Artificial Intelligence

Tree-Guided CNN Boosts Image Super-Resolution in Joint University Study

A collaborative team from five universities proposes a tree-structured convolutional neural network that leverages binary‑tree guidance, cosine cross‑domain extraction, and an adaptive Nesterov momentum optimizer to markedly improve image super‑resolution performance.

Computer Vision · Deep Learning · adaptive optimizer
5 min read
Baobao Algorithm Notes
Aug 1, 2025 · Artificial Intelligence

Unlocking Qwen3-Coder-30B: Features, Fast Start, and Agentic Coding Guide

The article introduces Qwen3‑Coder‑30B‑A3B‑Instruct (aka Qwen3‑Coder‑Flash), detailing its architecture, 256K‑to‑1M token context, agentic coding capabilities, installation steps with Transformers, sample code for tool use, optimal sampling parameters, and deployment tips across various runtimes.

AI coding assistant · Agentic Coding · Deep Learning
6 min read
Architecture Development Notes
Jul 21, 2025 · Artificial Intelligence

Why Rust’s Burn Framework Is Redefining Deep Learning Performance

Burn, a native Rust deep learning framework by Tracel AI, combines extreme flexibility, high computational efficiency, and cross‑platform portability through a modular backend abstraction, type‑safe tensor operations, asynchronous execution, and extensive tooling, offering performance‑competitive alternatives to Python‑based frameworks for both training and inference.

Burn · Deep Learning · GPU
23 min read
Tencent Technical Engineering
Jul 18, 2025 · Artificial Intelligence

From CPUs to GPUs: How Traditional Backend Skills Power Modern AI Infrastructure

This article explores the evolution of AI infrastructure, comparing it with traditional backend systems, and details how hardware shifts to GPU-centric designs, software adaptations like deep learning frameworks, and engineering challenges in model training and inference can be addressed using established backend methodologies.

AI Infrastructure · Deep Learning · GPU computing
19 min read
Tencent Cloud Developer
Jul 17, 2025 · Artificial Intelligence

Why GPUs Are the New CPUs: Unpacking AI Infrastructure Challenges

This article explores how AI infrastructure has shifted from CPU‑centric designs to GPU‑driven architectures, detailing hardware evolution, software changes, and the engineering challenges of large‑model training and inference, while offering practical insights for traditional backend engineers transitioning to AI systems.

AI Infrastructure · Deep Learning · GPU computing
16 min read
Alibaba Cloud Developer
Jul 16, 2025 · Artificial Intelligence

What Are the Core Concepts Behind AI? From Data to Models Explained

This article walks readers through the fundamentals of artificial intelligence, covering AI, machine learning, deep learning, data types, linear regression, supervised and unsupervised learning, reinforcement learning, feature engineering, tokenization, vectorization, embeddings, and includes a practical Word2Vec code example.

AI · Data Science · Deep Learning
21 min read
Kuaishou Large Model
Jul 11, 2025 · Artificial Intelligence

How MODA’s Modular Duplex Attention Boosts Multimodal Emotion Understanding

The paper introduces MODA, a new multimodal model that tackles attention imbalance across modalities with a modular duplex attention mechanism, achieving significant performance gains on perception, cognition, and emotion tasks across 21 benchmarks and demonstrating strong potential for human‑machine interaction.

Deep Learning · MODA model · Multimodal AI
13 min read
IT Services Circle
Jul 6, 2025 · Artificial Intelligence

Why Transformers Train Like Any Neural Network: Backpropagation Explained

This article demystifies how Transformers are trained by showing that all their linear layers have learnable weights and biases, and that the attention mechanism—including softmax and dot‑product operations—is fully differentiable and updated via standard back‑propagation.

Backpropagation · Deep Learning · PyTorch
7 min read
Qborfy AI
Jul 3, 2025 · Artificial Intelligence

Why Loss Functions Matter: From Theory to Real‑World AI Applications

This article explains what loss functions are, outlines their three essential components, categorizes them for regression, classification, and generation tasks, reviews five classic loss functions with their noise resistance and gradient traits, and offers practical guidelines for selecting the right loss for AI models.

AI fundamentals · Deep Learning · classification
4 min read
Qborfy AI
Jul 2, 2025 · Artificial Intelligence

Mastering Activation Functions: From Sigmoid to Swish and When to Use Them

This article explains the role of activation functions in neural networks, compares five classic functions with formulas, performance trade‑offs, and gradient behavior, and provides a Python visualization demo plus several practical insights and real‑world examples.

Deep Learning · Neural Networks · ReLU
7 min read
JD Tech Talk
Jul 2, 2025 · Artificial Intelligence

How JoyGen Delivers High‑Quality Audio‑Driven 3D Talking‑Face Video Editing

JoyGen introduces a two‑stage framework that combines 3D facial reconstruction with audio‑driven motion generation to produce synchronized, high‑fidelity talking‑face videos, and validates its effectiveness on both the HDTF benchmark and a newly built high‑resolution Chinese speaking‑face dataset.

3DMM · AIGC · Deep Learning
13 min read
JD Cloud Developers
Jul 2, 2025 · Artificial Intelligence

How JoyGen Achieves High‑Quality Audio‑Driven 3D Talking‑Face Video Editing

JoyGen introduces a two‑stage framework that combines 3D morphable model reconstruction with audio‑driven lip motion generation and depth‑aware visual synthesis, delivering precise audio‑lip synchronization and superior visual quality on both the HDTF benchmark and a newly built high‑resolution Chinese talking‑face dataset.

3DMM · AIGC · Deep Learning
12 min read
Tencent Architect
Jul 2, 2025 · Artificial Intelligence

How Tencent’s TEG Shannon Lab Dominated the NTIRE 2025 UGC Video Enhancement Challenge

Tencent TEG Shannon Lab won the NTIRE 2025 UGC Video Enhancement competition with a progressive training framework that combines adaptive color enhancement, high‑speed denoising, and temporal stability under bitrate constraints, achieving top subjective scores, significant inference speed‑ups, and successful INT8 quantization for real‑time deployment.

AI video codec · Deep Learning · NTIRE2025
18 min read
Huolala Tech
Jul 2, 2025 · Artificial Intelligence

Can Diffusion Models Revolutionize Salient Object Detection?

This article introduces a diffusion‑based framework for salient object detection, discusses its background, challenges, and motivations, details the model architecture and training, presents extensive experiments and ablation studies, and outlines limitations and future research directions.

Computer Vision · Deep Learning · diffusion model
11 min read
Qborfy AI
Jul 1, 2025 · Artificial Intelligence

Why CNNs Outperform Fully Connected Networks: A Deep Dive into Architecture and Applications

This article explains the fundamentals of convolutional neural networks (CNNs), detailing their definition, advantages over fully connected networks, architectural components such as input, hidden, and output layers, key operations like convolution, pooling, and activation, and showcases practical applications and notable insights.

CNN · Computer Vision · Deep Learning
5 min read
JD Retail Technology
Jul 1, 2025 · Artificial Intelligence

JoyGen: Audio‑Driven 3D Depth‑Aware Talking‑Face Video Editing Explained

JoyGen introduces a two‑stage framework that generates high‑quality talking‑face videos by synchronizing lip movements with input audio using 3DMM‑based identity and expression coefficients, depth‑aware supervision, and a newly built high‑resolution Chinese speaking‑face dataset, achieving state‑of‑the‑art performance on multiple benchmarks.

3DMM · AIGC · Deep Learning
13 min read
Cognitive Technology Team
Jun 29, 2025 · Artificial Intelligence

Understanding Transformers: Core Mechanics Behind Modern AI Models

This article demystifies the Transformer architecture for beginners, explaining its relationship to large models, the self‑attention and multi‑head attention mechanisms, positional encoding, and the roles of Encoder and Decoder components, using clear analogies and visual diagrams to aid comprehension.

Deep Learning · Encoder-Decoder · Positional Encoding
20 min read
AIWalker
Jun 24, 2025 · Artificial Intelligence

How Multimodal Fusion Accelerates Paper Publication: Key Insights and Resources

The article surveys 117 recent multimodal‑fusion papers, classifies them into improvement‑based and combination‑based approaches, highlights representative works such as TimeXL, OGP‑Net, MMR‑Mamba and FusionSight, and provides a free collection of papers, classic models and code repositories for researchers.

AI research · Computer Vision · Deep Learning
8 min read
DataFunSummit
Jun 21, 2025 · Artificial Intelligence

From Bias to Fairness: De‑biasing Techniques in Uplift Modeling

This article explores the fundamentals and challenges of uplift modeling, explains why unbiased random data are essential, and presents a comprehensive suite of bias‑correction methods—including reweighting, propensity‑score matching, and advanced deep‑learning architectures such as TarNet, CFRNet, and DragonNet—to improve causal effect estimation in marketing and finance applications.

Bias Correction · Deep Learning · Uplift Modeling
15 min read
Architects' Tech Alliance
Jun 19, 2025 · Fundamentals

Unlock the Secrets of GPUs: 100 Essential Fundamentals Explained

This comprehensive guide covers 100 essential GPU fundamentals, from basic definitions and architecture to core technologies, performance optimization, emerging trends, and industry developments, providing a complete technical foundation for graphics, AI, and high‑performance computing applications.

Deep Learning · GPU · Graphics Processing Unit
19 min read
AI Algorithm Path
Jun 19, 2025 · Artificial Intelligence

Training Neural Networks with Minimal Labeled Data Using Active Learning

This article explains how active learning can dramatically reduce the amount of labeled data required for training deep neural networks by selecting the most informative and representative samples, and provides a complete Python implementation of a hybrid query strategy (DBAL) with ResNet‑18.

DBAL · Deep Learning · Python
14 min read
AI Frontier Lectures
Jun 16, 2025 · Artificial Intelligence

What Do the CVPR 2025 Awards Reveal About the Future of Computer Vision?

The CVPR 2025 awards spotlight groundbreaking work—from the VGGT transformer that predicts full 3D scenes in a single feed‑forward pass to neural inverse rendering that reconstructs geometry from time‑resolved light—offering a comprehensive view of emerging trends, novel architectures, and performance breakthroughs across computer‑vision research.

3D reconstruction · CVPR 2025 · Deep Learning
11 min read
MaGe Linux Operations
Jun 15, 2025 · Artificial Intelligence

Mastering Transformers: Key Extensions and Optimization Techniques Explained

This comprehensive guide walks you through the Transformer architecture—from its encoder‑decoder structure and self‑attention mechanism to multi‑head attention, positional embeddings, and practical PyTorch implementations—providing clear visualizations and code examples for deep learning practitioners.

Deep Learning · PyTorch · Self-Attention
22 min read
Architects' Tech Alliance
Jun 15, 2025 · Fundamentals

Master GPU Fundamentals: Architecture, Performance, and Programming Insights

This comprehensive guide covers GPU definitions, evolution, core components, architectural designs, performance metrics, programming models, deep‑learning applications, comparisons with other processors, practical use cases, optimization techniques, and future trends, providing a solid foundation for anyone interested in modern graphics and compute acceleration.

Deep Learning · GPU · Hardware
43 min read
Open Source Linux
Jun 12, 2025 · Artificial Intelligence

From Transformers to DeepSeek‑R1: The Evolution of Large Language Models (2017‑2025)

This article chronicles the rapid development of large language models from the 2017 Transformer breakthrough through the rise of BERT, GPT‑3, multimodal models, alignment techniques like RLHF, and finally the cost‑efficient DeepSeek‑R1 in 2025, highlighting key innovations, scaling trends, and real‑world impacts.

AI Alignment · Deep Learning · Model Scaling
26 min read
Zhihu Tech Column
Jun 11, 2025 · Artificial Intelligence

How Minute‑Level Time Decay Boosts User Retention Modeling in Recommendation Systems

This article presents a novel minute‑level future‑reward framework with dual‑delay incentives, activity‑based attribution, multi‑task delayed modeling, and sequential streaming training that dramatically improves user retention prediction accuracy and real‑time performance in large‑scale recommendation platforms.

Deep Learning · User Retention · multi‑task modeling
17 min read
Kuaishou Audio & Video Technology
Jun 11, 2025 · Artificial Intelligence

Kuaishou Showcases 12 Cutting-Edge CVPR 2025 Papers on Video Generation and AI

Kuaishou presented twelve peer‑reviewed papers at CVPR 2025 covering video quality assessment, large‑scale video datasets, dynamic 3D avatar reconstruction, 4D scene simulation, controllable video generation, scaling laws for diffusion transformers, multimodal foundations, and more, highlighting the company's leading research in computer vision and AI.

AI research · CVPR2025 · Deep Learning
21 min read
AI Frontier Lectures
Jun 10, 2025 · Artificial Intelligence

Can One Model Master All Remote Sensing Tasks? Introducing the TSSUN Framework

This paper presents the Temporal‑Spectral‑Spatial Unified Network (TSSUN), a flexible deep‑learning architecture that simultaneously handles semantic segmentation, semantic change detection, and binary change detection across heterogeneous remote‑sensing inputs, achieving state‑of‑the‑art performance without task‑specific retraining.

Attention Mechanism · Deep Learning · TSSUN
15 min read
AIWalker
Jun 3, 2025 · Artificial Intelligence

DeepKD: Double‑Layer Decoupling and Adaptive Denoising Set New ImageNet SOTA

DeepKD introduces a double‑layer decoupling framework and a dynamic top‑K mask that adaptively denoises low‑confidence logits, addressing conflicts between target and non‑target knowledge flows; extensive experiments on CIFAR‑100, ImageNet‑1K, and MS‑COCO demonstrate consistent accuracy gains and state‑of‑the‑art performance.

Deep Learning · GSNR · SOTA
23 min read
AIWalker
Jun 2, 2025 · Artificial Intelligence

NTIRE 2025 UGC Video Enhancement Challenge: Methods and Results

The NTIRE 2025 challenge introduced a new benchmark for user‑generated content video enhancement, detailing a 150‑video dataset, a pairwise subjective evaluation using the Bradley‑Terry model, hardware specifications, and the diverse multi‑stage deep‑learning methods and results of participating teams.

Benchmark · Deep Learning · NTIRE 2025
22 min read
AIWalker
Jun 2, 2025 · Artificial Intelligence

Multi-University Team Proposes Tree-Guided CNN for Image Super-Resolution

The paper presents a tree‑guided convolutional neural network that leverages binary‑tree structures, cosine‑based cross‑domain feature extraction, and an adaptive Nesterov momentum optimizer to enhance key layer interactions, achieving superior image super‑resolution performance as demonstrated by extensive experiments.

Deep Learning · adaptive Nesterov optimizer · cosine feature extraction
5 min read
DaTaobao Tech
May 16, 2025 · Artificial Intelligence

JianYi: AI‑Powered Image Segmentation and Matting System for Taobao Home‑Decoration

The article introduces JianYi, a self‑developed image segmentation and matting system for Taobao's home‑decoration business that supports product, human, and panoramic segmentation with multi‑modal interaction, achieving high‑precision real‑time performance and powering AI tools such as "Jiazuo" and "Fang Wo Jia".

Computer Vision · Deep Learning · artificial intelligence
11 min read
Bilibili Tech
May 16, 2025 · Artificial Intelligence

How FineVQ Sets New Standards for Fine‑Grained UGC Video Quality Assessment

The article introduces FineVD, the first large‑scale multi‑dimensional UGC video quality dataset, and presents FineVQ, a unified model that predicts quality scores, attributes, and distortion types across six dimensions, achieving state‑of‑the‑art performance on multiple benchmarks and cross‑dataset evaluations.

Computer Vision · Dataset · Deep Learning
9 min read
Amap Tech
May 8, 2025 · Artificial Intelligence

FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

FantasyTalking generates high-fidelity, coherent talking portraits from a single static image by employing a two-stage audio-visual alignment—global segment-level motion and frame-level lip refinement—combined with face-centric cross-attention for identity preservation and a motion-intensity module that lets users control expression and body movement, achieving superior realism, synchronization, and performance over prior methods.

Deep Learning · audio-visual alignment · identity preservation
10 min read
IT Services Circle
May 2, 2025 · Artificial Intelligence

Understanding Gradient Vanishing in Deep Neural Networks and How to Mitigate It

The article explains why deep networks suffer from gradient vanishing—especially when using sigmoid or tanh activations—covers the underlying mathematics, compares activation functions, and presents practical techniques such as proper weight initialization, batch normalization, residual connections, and code examples to visualize the phenomenon.

Batch Normalization · Deep Learning · Neural Networks
7 min read
JD Tech
Apr 30, 2025 · Artificial Intelligence

TimeHF: A Billion‑Scale Time Series Forecasting Model Guided by Human Feedback

The JD Supply Chain algorithm team introduces TimeHF, a billion‑parameter time‑series large model that leverages RLHF to boost demand‑forecast accuracy by over 10%, detailing dataset construction, the PCTLM architecture, a custom RLHF framework (TPO), and extensive SOTA experimental results.

Big Data · Deep Learning · RLHF
10 min read
AI Frontier Lectures
Apr 30, 2025 · Artificial Intelligence

How Dual‑Domain Strip Attention Revolutionizes Image Restoration

The paper introduces Dual‑Domain Strip Attention Network (DSANet), a lightweight architecture that combines spatial and frequency strip attention to boost multi‑scale representation learning, achieving state‑of‑the‑art performance on dehazing, desnowing, defocus deblurring, and denoising tasks with significantly lower computational cost.

Deep Learning · Neural Networks · dual-domain attention
10 min read
Network Intelligence Research Center (NIRC)
Apr 23, 2025 · Artificial Intelligence

DeepQueueNet in Practice: Quickly Achieve High‑Precision Network Simulation

This article walks through using DeepQueueNet—a deep‑learning‑enhanced network performance estimator—to set up a device model, train the PyTorch version, configure a fattree16 topology, and run multi‑GPU simulations that deliver minute‑level, packet‑accurate results in as little as 1 minute 27 seconds.

Deep Learning · DeepQueueNet · PyTorch
6 min read
Alibaba Cloud Big Data AI Platform
Apr 22, 2025 · Artificial Intelligence

How DistilQwen2.5-DS3-0324 Achieves Fast, Accurate Reasoning via Quick‑Think Distillation

This article introduces DistilQwen2.5-DS3-0324, a distilled language model series that balances rapid inference with strong reasoning by applying a fast‑thinking chain‑of‑thought strategy, details its two‑stage distillation framework, evaluation on diverse benchmarks, and provides code for downloading and using the models.

Deep Learning · chain-of-thought · fast inference
17 min read
Alibaba Cloud Developer
Apr 18, 2025 · Artificial Intelligence

How the New 14B End‑to‑End Video Model Generates Custom 720p Clips from Two Images

The open‑sourced 14‑billion‑parameter Tongyi Wanxiang video model can create high‑quality 720p videos that seamlessly connect user‑provided start and end images, offering controllable, personalized video generation with prompt‑driven camera motions and easy access via its website, GitHub, Hugging Face, and ModelScope.

AI model · Computer Vision · Deep Learning
5 min read
AIWalker
Apr 16, 2025 · Artificial Intelligence

Plug‑and‑Play Multi‑Scale Attention: A Seamless Boost for Model Performance

This article reviews recent multi‑scale attention breakthroughs—including EMA, MSDA, VWA, and related modules—showing how they improve accuracy, cut FLOPs by up to 70%, and can be inserted into existing models with minimal effort, backed by code and paper links.

Computer Vision · Deep Learning · Plug-and-Play
10 min read
Cognitive Technology Team
Apr 12, 2025 · Artificial Intelligence

Analyzing a Trained Neural Network: Visualizing Hidden Layers and Understanding Its Limitations

This article walks through an interactive exploration of a simple two‑hidden‑layer neural network, showing how real‑time visualizations reveal its learned representations, accuracy limits, and why constrained training leads to over‑confident yet unintelligent predictions before introducing backpropagation.

Backpropagation · Deep Learning · Neural Networks
10 min read
Cognitive Technology Team
Apr 9, 2025 · Artificial Intelligence

How Neural Networks Learn: Gradient Descent and Loss Functions

This article explains how neural networks learn by using labeled training data, describing the role of weights, biases, activation functions, and how gradient descent iteratively adjusts parameters to minimize loss, illustrated with the MNIST digit‑recognition example.

Deep Learning · MNIST · Neural Networks
16 min read
JD Tech
Apr 8, 2025 · Artificial Intelligence

MaRCA: Multi‑Agent Reinforcement Learning Computation Allocation for Full‑Chain Advertising Systems

The article presents MaRCA, a multi‑agent reinforcement learning framework that models user value, compute consumption, and action reward to allocate limited computation resources across the entire advertising recommendation pipeline, achieving higher ad revenue while keeping system load stable under fluctuating traffic and diverse request values.

Deep Learning · Load-Aware Scheduling · Resource Optimization
16 min read
AI Frontier Lectures
Apr 8, 2025 · Artificial Intelligence

How HINT’s Hierarchical Multi‑Head Attention Boosts Image Restoration

The article introduces HINT, a Transformer‑based image restoration model that solves the redundancy of standard multi‑head attention by using Hierarchical Multi‑Head Attention and a Query‑Key Cache Updating module, and demonstrates superior PSNR/SSIM performance across multiple low‑level vision tasks while keeping model complexity low.

Deep Learning · query-key cache
0 likes · 10 min read
Cognitive Technology Team
Apr 8, 2025 · Artificial Intelligence

Understanding Neural Networks: Structure, Layers, and Activation

This article explains how a simple neural network can recognize handwritten digits by preprocessing images, organizing neurons into input, hidden, and output layers, using weighted sums, biases, sigmoid compression, and matrix multiplication to illustrate the fundamentals of deep learning.

Deep Learning · Layers · Neural Networks
0 likes · 16 min read
Data Thinking Notes
Apr 6, 2025 · Artificial Intelligence

Why Mixture of Experts (MoE) is Revolutionizing Large AI Models

Mixture of Experts (MoE) leverages dynamic conditional computation and specialized expert networks to overcome the parameter explosion and inefficiency of dense models, offering scalable capacity, multi‑task adaptability, and improved efficiency, while addressing challenges such as training stability, communication overhead, and load balancing.

Deep Learning · Mixture of Experts · Model Scaling
0 likes · 7 min read
Baidu Tech Salon
Apr 2, 2025 · Artificial Intelligence

PaddlePaddle Framework 3.0 Released: Five Core Innovations for Large Models and Scientific Computing

PaddlePaddle 3.0, launched on April 1 2025, introduces five core innovations—including dynamic‑static unified automatic parallelism, a training‑inference integrated PIR, high‑order automatic differentiation for scientific computing, a one‑stage CINN compiler, and heterogeneous multi‑chip adaptation—that dramatically reduce distributed‑training code, boost performance up to four‑fold, and extend the framework to aerospace, automotive, meteorology and life‑science applications while remaining fully compatible with the 2.0 API.

Deep Learning · PaddlePaddle · automatic parallelism
0 likes · 21 min read
Cognitive Technology Team
Mar 31, 2025 · Artificial Intelligence

Recommendation Algorithms: Using Mathematical Methods for Efficient Information Matching

Recommendation algorithms, rooted in machine learning and deep learning, transform massive user‑generated data into mathematical models that filter and personalize content, covering traditional collaborative filtering, matrix factorization, cosine similarity, and modern deep models such as Wide & Deep and Two‑Tower retrieval, illustrating their evolution and practical applications.

Deep Learning · Wide&Deep · collaborative filtering
0 likes · 14 min read
Cognitive Technology Team
Mar 31, 2025 · Artificial Intelligence

Understanding Douyin's Recommendation Algorithm: From Behavior Prediction to Value Modeling

The article explains how Douyin's recommendation system uses machine‑learning and deep‑learning models to predict user actions, assign value weights, and dynamically adjust scores, highlighting both its efficiency in large‑scale content distribution and its inherent limitations compared to human understanding.

AI · Deep Learning · recommendation system
0 likes · 7 min read
JavaEdge
Mar 27, 2025 · Artificial Intelligence

Can a Single LLM Both See and Reason? Exploring Visual Reasoning Models (VRM)

This article examines the limitations of current vision‑language and reasoning models, proposes a visual reasoning model (VRM) that can process images and perform deep logical inference, and discusses architecture, training methods, reinforcement‑learning reward designs, and practical challenges.

Deep Learning · LLM · Vision-Language Model
0 likes · 8 min read
JD Retail Technology
Mar 18, 2025 · Artificial Intelligence

Multi‑Agent Reinforcement Learning Based Full‑Chain Computation Allocation (MaRCA) for Advertising Systems

MaRCA, a multi‑agent reinforcement‑learning framework, allocates compute across JD’s ad‑serving chain by jointly estimating user value, resource consumption, and action outcomes while dynamically adjusting to real‑time load, achieving roughly 15 % higher ad revenue without extra compute resources.

Advertising · Compute Scheduling · Deep Learning
0 likes · 18 min read
JavaEdge
Mar 15, 2025 · Artificial Intelligence

Boost NLP Model Performance with n-gram Feature Engineering

This article explains why feature engineering is crucial for NLP tasks, introduces n‑gram enhancements, provides Python implementations for generating bi‑gram and higher‑order features, demonstrates dynamic padding for text length standardization, and offers practical deployment tips such as feature dimension control and monitoring.

Deep Learning · N-gram · NLP
0 likes · 7 min read
AIWalker
Mar 14, 2025 · Artificial Intelligence

Dynamic Tanh Lets He Kaiming and LeCun Drop Transformer Normalization in 9 Lines

Researchers He Kaiming, Yann LeCun and colleagues propose a 9‑line Dynamic Tanh (DyT) layer that replaces LayerNorm/RMSNorm in Transformers, showing comparable or superior accuracy across vision, language, speech and DNA tasks while also reducing inference latency on modern GPUs.

AI research · Deep Learning · Dynamic Tanh
0 likes · 18 min read
AI Frontier Lectures
Mar 14, 2025 · Artificial Intelligence

Do Vision Models Really Need Mamba? A Deep Dive into MambaOut

This article critically examines the MambaOut paper, analyzing whether state‑space‑based Mamba token mixers are necessary for vision tasks, presenting two hypotheses, describing the construction of MambaOut models without SSM, and reporting extensive ImageNet, COCO and ADE20K experiments that reveal when Mamba is beneficial.

Deep Learning · Mamba · State Space Model
0 likes · 17 min read
AIWalker
Mar 10, 2025 · Artificial Intelligence

HSR-Mamba Solves Mamba’s HSISR Issue with Dual Strategies, Beats Prior Methods

HSR-Mamba introduces a contextual spatial‑spectral state‑space model that tackles Mamba's limitations in hyperspectral image super‑resolution through a local partition mechanism and a global spectral rearrangement strategy, achieving significantly higher PSNR, SSIM and SAM scores than existing approaches while using fewer parameters and FLOPs.

Deep Learning · Dual strategy · HSI super-resolution
0 likes · 25 min read
AIWalker
Mar 8, 2025 · Artificial Intelligence

Trainable HVI Color Space Turns Dark Photos into Cinematic Images – CVPR 2025

The paper introduces the first trainable HVI color space and a lightweight CIDNet network that jointly model intensity and chrominance, eliminating color bias and brightness artifacts in low‑light image enhancement and achieving state‑of‑the‑art results on ten benchmark datasets.

CIDNet · CVPR 2025 · Computer Vision
0 likes · 12 min read
Cognitive Technology Team
Mar 6, 2025 · Artificial Intelligence

From Traditional Machine Learning to Deep Learning: A Comprehensive Guide to Algorithms, Feature Engineering, and Model Training

This article provides a step‑by‑step tutorial that walks readers through the fundamentals of traditional machine‑learning algorithms, feature‑engineering techniques, model training pipelines, evaluation metrics, and then advances to deep‑learning concepts such as MLPs, activation functions, transformers, and modern recommendation‑system models.

Deep Learning · Model Training · Python
0 likes · 63 min read
Alibaba Cloud Developer
Mar 6, 2025 · Artificial Intelligence

From Linear Regression to Transformers: Mastering Machine Learning Foundations

This comprehensive guide walks readers through the evolution of machine learning, starting with basic linear models and feature engineering, progressing through logistic regression, decision trees, and deep learning architectures like MLPs, CNNs, RNNs, and transformers, and demonstrates practical implementations with code examples and evaluation metrics.

Deep Learning · Evaluation Metrics · Recommendation Systems
0 likes · 64 min read
AntTech
Mar 5, 2025 · Artificial Intelligence

Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting

Pyraformer introduces a pyramidal attention mechanism that captures long-range dependencies in time-series data with linear time and space complexity, achieving state-of-the-art forecasting accuracy on multiple real-world datasets while reducing computational cost, as demonstrated in extensive ICLR-2022 experiments.

Deep Learning · ICLR 2022 · Pyraformer
0 likes · 11 min read
AIWalker
Mar 1, 2025 · Artificial Intelligence

Lightweight Remote Sensing Backbone LSKNet and Strip R-CNN: Design, Benchmarks, and Open‑Source Release

The NK‑Remote repository introduces LSKNet and Strip R‑CNN, two lightweight yet powerful models for remote‑sensing object detection that dynamically adjust receptive fields and combine square‑and‑strip convolutions, achieving state‑of‑the‑art performance on benchmarks such as DOTA, FAIR1M, HRSC2016, and DIOR.

Benchmark · Deep Learning · JDet
0 likes · 9 min read
Xiaohongshu Tech REDtech
Feb 27, 2025 · Artificial Intelligence

SAFE: A Lightweight General AI Image Detection Method Achieving 96.7% Accuracy Across 33 Test Subsets

SAFE is a lightweight AI‑image detection framework using only 1.44 M parameters and 2.30 B FLOPs that preserves fine‑grained artifacts through crop‑based preprocessing, invariant augmentations, and high‑frequency wavelet features, achieving an average 96.7 % accuracy across 33 test subsets and strong generalization to unseen GAN and diffusion generators.

AI image detection · Computer Vision · Deep Learning
0 likes · 11 min read
DataFunTalk
Feb 26, 2025 · Artificial Intelligence

Alibaba Cloud's Wanxiang 2.1: Open‑Source Dual‑Version Visual Generation Model with Full‑Scale Capabilities

Wanxiang 2.1, an open‑source visual generation model released by Alibaba Cloud, offers a 14‑billion‑parameter professional version and a 1.3‑billion‑parameter consumer‑grade version, delivering SOTA performance across multiple benchmarks, supporting diverse video generation tasks, and employing an advanced DiT‑based architecture, a 3D VAE, and efficient distributed training strategies.

AI model · Deep Learning · visual generation
0 likes · 11 min read
AIWalker
Feb 25, 2025 · Artificial Intelligence

Sliding Tile Attention Speeds Up HunyuanVideo DiT Generation 3.5×

Sliding Tile Attention (STA) replaces costly full‑3D attention in video DiT models with a block‑wise sliding‑window scheme, achieving up to 10× attention speedup and a 3.53× end‑to‑end generation boost for HunyuanVideo without quality loss, as demonstrated by extensive benchmarks and kernel analyses.

Deep Learning · GPU Optimization · HunyuanVideo
0 likes · 16 min read
JavaEdge
Feb 24, 2025 · Artificial Intelligence

Build a CIFAR‑10 Image Classifier with PyTorch – A Java Developer’s Guide

This tutorial walks Java developers through building, training, evaluating, and deploying a CIFAR‑10 image classifier using PyTorch, covering data loading, preprocessing, network definition, loss and optimizer setup, GPU acceleration, model saving, and per‑class accuracy analysis.

CIFAR-10 · Deep Learning · GPU
0 likes · 18 min read
Xiaohongshu Tech REDtech
Feb 24, 2025 · Artificial Intelligence

AIDE: Hybrid Feature Detector for AI‑Generated Image Detection and the Chameleon Benchmark

The paper introduces AIDE, a hybrid AI‑generated image detector that fuses low‑level pixel statistics with high‑level semantic embeddings, and the manually curated Chameleon benchmark of ~26 000 diverse, high‑realism images, showing AIDE surpasses nine state‑of‑the‑art methods by up to 4.6 % while highlighting remaining challenges on this tougher dataset.

AI-generated image detection · Computer Vision · Deep Learning
0 likes · 14 min read
DaTaobao Tech
Feb 24, 2025 · Artificial Intelligence

AIGC Video Generation Techniques for E‑commerce: Lip‑Sync, Head/Body Driving, and Business Applications

The article surveys recent AIGC video generation advances for Taobao e‑commerce, detailing lip‑sync models like Wav2Lip and MuseTalk, head‑driven systems such as Hallo and EchoMimic, body‑driven pipelines including AnimateAnyone and Tango, and a four‑stage production workflow that boosts click‑through rates and enables virtual try‑on.

AIGC · Deep Learning · Multimodal AI
0 likes · 21 min read
DevOps
Feb 23, 2025 · Artificial Intelligence

Understanding Reinforcement Learning, RLHF, PPO and GRPO for AI Applications

This article explains how DeepSeek‑R1‑Zero uses group‑relative policy optimization (GRPO) to improve reasoning without labeled data, introduces reinforcement learning from human feedback (RLHF) and its components, and compares the PPO and GRPO algorithms, highlighting the engineering scenarios each suits and their practical implications for AI applications.

AI model training · Deep Learning · GRPO
0 likes · 15 min read
JavaEdge
Feb 23, 2025 · Artificial Intelligence

How Java Developers Can Build Neural Networks with PyTorch: A Step‑by‑Step Guide

This tutorial walks Java developers through the complete workflow of building, training, and evaluating a neural network in PyTorch, covering network definition, data iteration, forward and backward passes, loss calculation, and parameter updates with detailed code examples and Java‑centric analogies.

Backpropagation · Deep Learning · Java
0 likes · 12 min read
DeWu Technology
Feb 19, 2025 · Artificial Intelligence

Scenario-aware Multi-Scenario Recommendation Models: SACN, SAINet, and DSWIN

The paper presents a comprehensive multi‑scenario recommendation study introducing three models—SACN, SAINet, and DSWIN—that integrate scene‑aware attention, attribute‑level preferences, and contrastive disentanglement to capture distinct user interests, achieving consistent AUC gains and online CTR improvements across real‑world datasets.

CTR prediction · Deep Learning · contrastive learning
0 likes · 43 min read
DataFunTalk
Feb 19, 2025 · Artificial Intelligence

Large Models: Concepts, Principles, Classifications and Applications

This report provides a comprehensive overview of large-scale AI models, explaining their definition, massive parameter and data requirements, underlying transformer architecture, classification into language, vision and multimodal models, notable examples such as DeepSeek, and a survey of popular AIGC tools and practical use cases.

AIGC tools · Deep Learning · Multimodal AI
0 likes · 9 min read
AI Code to Success
Feb 19, 2025 · Artificial Intelligence

How to Build Traffic‑Sign Recognition and Sentiment Analysis with Keras – A Step‑by‑Step Guide

This article walks through practical Keras tutorials for image‑based traffic‑sign classification and text‑based sentiment analysis, covering data preparation, preprocessing, model construction, training, evaluation, deployment, and a concise comparison of Keras with TensorFlow and PyTorch.

Deep Learning · Image Classification · Keras
0 likes · 19 min read
Python Programming Learning Circle
Feb 18, 2025 · Artificial Intelligence

Getting Started with PyTorch: Installation, Core Operations, and Practical Deep Learning Projects

This article introduces PyTorch, covering installation on CPU/GPU, basic tensor operations, automatic differentiation, building and training neural networks, data loading with DataLoader, image classification on MNIST, model deployment, and useful tips for accelerating deep‑learning workflows.

Deep Learning · GPU · Neural Networks
0 likes · 9 min read
AI Code to Success
Feb 14, 2025 · Artificial Intelligence

TensorFlow vs PyTorch: Which Deep Learning Framework Wins for Your Projects?

An in‑depth comparison of TensorFlow and PyTorch examines their computation graph models, deployment tools, API ergonomics, community ecosystems, and performance characteristics, helping developers decide which framework best fits industrial production or fast‑paced research scenarios.

AI Development · Deep Learning · PyTorch
0 likes · 8 min read
AI Code to Success
Feb 13, 2025 · Artificial Intelligence

Why PyTorch Is the Go-To Framework for Modern AI Development

This article introduces PyTorch, explains its dynamic computation graph, Python‑centric design, and tensor operations, surveys its major applications in computer vision, natural language processing, and reinforcement learning, and provides a step‑by‑step tutorial for building and training a multilayer perceptron on the MNIST dataset.

Deep Learning · Dynamic Computation Graph · MNIST
0 likes · 11 min read
Cognitive Technology Team
Feb 12, 2025 · Artificial Intelligence

Introduction to Neural Networks by Professor Li Yongle

In this introductory session, renowned graduate exam instructor Professor Li Yongle provides a clear, beginner-friendly overview of neural networks, covering basic concepts and their relevance within artificial intelligence, including their structure, learning mechanisms, and typical applications in modern AI systems.

AI · Deep Learning · Neural Networks
0 likes · 1 min read
AI Code to Success
Feb 11, 2025 · Artificial Intelligence

Unlocking TensorFlow: From Basics to Building Your First Linear Regression Model

This article introduces TensorFlow's core concepts—tensors, computational graphs, variables, and sessions—covers its wide range of AI applications from traditional machine learning to deep learning in NLP and computer vision, and provides a step‑by‑step Python tutorial for implementing a simple linear regression model.

AI Tutorial · Deep Learning · Neural Networks
0 likes · 6 min read
IT Architects Alliance
Feb 10, 2025 · Artificial Intelligence

DeepSeek Distillation Technology: Principles, Innovations, Performance, and Future Outlook

The article explains DeepSeek's model distillation technique, covering its fundamental knowledge‑transfer principles, unique innovations such as data‑model fusion and task‑specific strategies, impressive benchmark results, practical applications in edge and online inference, existing challenges, and future research directions.

AI Optimization · Deep Learning · Edge Computing
0 likes · 15 min read
Cognitive Technology Team
Feb 9, 2025 · Artificial Intelligence

A Beginner’s Guide to the History and Key Concepts of Deep Learning

From the perceptron’s inception in 1958 to modern Transformer-based models like GPT, this article traces the evolution of deep learning, explaining foundational architectures such as DNNs, CNNs, RNNs, LSTMs, attention mechanisms, and recent innovations like DeepSeek’s MLA, highlighting their principles and impact.

Deep Learning · GPT · MLA
0 likes · 19 min read
AIWalker
Feb 9, 2025 · Artificial Intelligence

Douyin’s BDVQAGroup Secures Global Runner‑Up in DXOMARK Image Quality Challenge at CVPR 2024

At CVPR 2024 NTIRE, Douyin’s BDVQAGroup achieved second place worldwide in the DXOMARK portrait quality track using their SampleIQA model, which combines data‑re‑sampling, a Swin‑Transformer backbone, twin‑network ranking loss and content‑aware cropping to outperform existing IQA state‑of‑the‑art methods.

Computer Vision · DXOMARK · Deep Learning
0 likes · 10 min read
Cognitive Technology Team
Feb 7, 2025 · Artificial Intelligence

Knowledge Distillation: Concepts, Techniques, Applications, and Future Directions

This article explains knowledge distillation, a technique popularized by Geoffrey Hinton and colleagues that transfers knowledge from large teacher models to compact student models, covering its core concepts, loss functions, various distillation strategies, notable applications in edge computing, federated learning, and continual learning, and emerging research directions.

Deep Learning · Edge Computing · Federated Learning
0 likes · 7 min read
JavaEdge
Feb 6, 2025 · Artificial Intelligence

Why Training Transformers Faces an Impossible Triangle of Speed, Performance, and Cost

The article explains the “impossible triangle” in Transformer training, showing how speed, model performance, and computational cost cannot all be optimized simultaneously, and uses analogies and real‑world examples like GPT‑4 to illustrate the necessary trade‑offs.

Deep Learning · Model Training · Performance Tradeoff
0 likes · 7 min read
Architects' Tech Alliance
Feb 4, 2025 · Artificial Intelligence

Why AI Frameworks Are the Backbone of Modern AI – Spotlight on MindSpore

The article explains what AI frameworks are, why they act as the operating system of artificial intelligence, showcases real‑world uses in transportation and finance, and provides an in‑depth analysis of Huawei's MindSpore framework, highlighting its development experience, hardware optimization, deployment flexibility, and enterprise‑grade security features.

AI Framework · Deep Learning · Enterprise AI
0 likes · 7 min read