Tag: neural networks

0 views recorded for articles under this tag.

IT Services Circle
May 2, 2025 · Artificial Intelligence

Understanding Gradient Vanishing in Deep Neural Networks and How to Mitigate It

The article explains why deep networks suffer from gradient vanishing—especially when using sigmoid or tanh activations—covers the underlying mathematics, compares activation functions, and presents practical techniques such as proper weight initialization, batch normalization, residual connections, and code examples to visualize the phenomenon.
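The shrinking-gradient effect this summary describes can be reproduced in a few lines. A minimal sketch (not the article's own code), assuming unit weights and zero pre-activations so each sigmoid layer contributes its maximum derivative of 0.25:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# Backpropagated gradient magnitude through 10 sigmoid layers.
# sigma'(0) = 0.25 is the sigmoid derivative's maximum, so this is
# the best case: each layer still shrinks the gradient by 4x.
grad = 1.0
for layer in range(10):
    grad *= sigmoid_prime(0.0)

print(grad)  # 0.25**10, roughly 9.5e-7: the gradient has all but vanished
```

Substituting the ReLU derivative (1.0 for positive inputs) leaves the product at 1.0, which is one reason ReLU-family activations mitigate the problem.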

ResNet · activation functions · batch normalization
0 likes · 7 min read
Cognitive Technology Team
Apr 12, 2025 · Artificial Intelligence

Analyzing a Trained Neural Network: Visualizing Hidden Layers and Understanding Its Limitations

This article walks through an interactive exploration of a simple two‑hidden‑layer neural network, showing how real‑time visualizations reveal its learned representations and accuracy limits, and why constrained training leads to over‑confident yet shallow predictions, before introducing backpropagation.

backpropagation · deep learning · hidden layers
0 likes · 10 min read
Cognitive Technology Team
Apr 9, 2025 · Artificial Intelligence

How Neural Networks Learn: Gradient Descent and Loss Functions

This article explains how neural networks learn by using labeled training data, describing the role of weights, biases, activation functions, and how gradient descent iteratively adjusts parameters to minimize loss, illustrated with the MNIST digit‑recognition example.
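As a toy illustration of the loop this summary describes (a sketch, not the article's MNIST code): gradient descent on a one-parameter model, where each step moves the weight against the gradient of the loss:

```python
# One-parameter model y = w*x fitted to points on the line y = 2x with
# mean-squared-error loss; gradient descent should drive w toward 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
lr = 0.05  # learning rate

for step in range(200):
    # dL/dw for L = mean((w*x - y)^2) is mean(2 * (w*x - y) * x)
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # step against the gradient

print(round(w, 4))  # converges to ~2.0
```

Real networks repeat exactly this update, just over millions of parameters with gradients supplied by backpropagation.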

MNIST · deep learning · gradient descent
0 likes · 16 min read
Cognitive Technology Team
Apr 8, 2025 · Artificial Intelligence

Understanding Neural Networks: Structure, Layers, and Activation

This article explains how a simple neural network can recognize handwritten digits by preprocessing images, organizing neurons into input, hidden, and output layers, using weighted sums, biases, sigmoid compression, and matrix multiplication to illustrate the fundamentals of deep learning.
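The weighted-sum-plus-sigmoid computation mentioned above can be written out directly. A minimal sketch with made-up weights, not taken from the article:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One dense layer: sigmoid(W·x + b), with the matrix product
    written as explicit weighted sums."""
    return [sigmoid(sum(w * a for w, a in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Two inputs feeding three hidden neurons (weights chosen arbitrarily).
x = [0.5, -1.0]
W = [[0.2, 0.8], [-0.5, 0.3], [1.0, 1.0]]
b = [0.1, 0.0, -0.2]

hidden = layer(x, W, b)
print([round(h, 3) for h in hidden])  # three values squashed into (0, 1)
```

For a digit recognizer the input would be 784 pixel values and the output layer 10 neurons, but the per-neuron arithmetic is exactly this.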

activation functions · deep learning · layers
0 likes · 16 min read
Python Programming Learning Circle
Feb 18, 2025 · Artificial Intelligence

Getting Started with PyTorch: Installation, Core Operations, and Practical Deep Learning Projects

This article introduces PyTorch, covering installation on CPU/GPU, basic tensor operations, automatic differentiation, building and training neural networks, data loading with DataLoader, image classification on MNIST, model deployment, and useful tips for accelerating deep‑learning workflows.
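For readers who want a taste before opening the article, here is a minimal sketch of the first two topics, tensor operations and automatic differentiation, using standard PyTorch calls (values are illustrative):

```python
import torch

# Basic tensor operations.
a = torch.arange(6, dtype=torch.float32).reshape(2, 3)
b = torch.ones(3, 2)
c = a @ b  # matrix multiplication: (2, 3) @ (3, 2) -> (2, 2)

# Automatic differentiation: d(x^2 + 3x)/dx = 2x + 3, so 7 at x = 2.
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x
y.backward()  # autograd fills in x.grad

print(c.shape, x.grad)  # torch.Size([2, 2]) tensor(7.)
```

The same `backward()` call is what drives training: the loss tensor's gradient is propagated to every parameter with `requires_grad=True`.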

GPU · PyTorch · Python
0 likes · 9 min read
Cognitive Technology Team
Feb 12, 2025 · Artificial Intelligence

Introduction to Neural Networks by Professor Li Yongle

In this introductory session, renowned graduate exam instructor Professor Li Yongle provides a clear, beginner-friendly overview of neural networks, covering basic concepts and their relevance within artificial intelligence, including their structure, learning mechanisms, and typical applications in modern AI systems.

AI · deep learning · education
0 likes · 1 min read
Architect
Feb 10, 2025 · Artificial Intelligence

Evolution of DeepSeek Mixture‑of‑Experts (MoE) Architecture from V1 to V3

This article reviews the development of DeepSeek's Mixture-of-Experts (MoE) models, tracing their evolution from the original DeepSeekMoE V1 through V2 to V3, detailing architectural innovations such as fine‑grained expert segmentation, shared‑expert isolation, load‑balancing losses, device‑limited routing, and the shift from softmax to sigmoid gating.

DeepSeek · LLM · Load Balancing
0 likes · 21 min read
Cognitive Technology Team
Feb 9, 2025 · Artificial Intelligence

A Beginner’s Guide to the History and Key Concepts of Deep Learning

From the perceptron’s inception in 1958 to modern Transformer-based models like GPT, this article traces the evolution of deep learning, explaining foundational architectures such as DNNs, CNNs, RNNs, LSTMs, attention mechanisms, and recent innovations like DeepSeek’s MLA, highlighting their principles and impact.

GPT · History · MLA
0 likes · 19 min read
Model Perspective
Dec 26, 2024 · Fundamentals

What Makes a Mathematical Model Enduring? Lessons from AI and Ecology

The article explores the characteristics of long‑lasting mathematical models—continuous refinement, expanding applicability, elegant simplicity, extensibility, focus on essence, and philosophical depth—illustrated with examples such as neural networks and the Lotka‑Volterra predator‑prey system, and offers guidance on creating such vibrant models.

Interdisciplinary · Lotka-Volterra · mathematical models
0 likes · 6 min read
DevOps
Dec 5, 2024 · Artificial Intelligence

A Brief History of Artificial Intelligence: From McCulloch‑Pitts Neurons to GPT‑4

This article traces the evolution of artificial intelligence from the 1943 McCulloch‑Pitts neuron model through key milestones such as Turing's test, the Dartmouth conference, the rise of neural networks, deep learning breakthroughs, and recent large language models like GPT‑4, illustrating the field's rapid progress.

Artificial Intelligence · GPT · History
0 likes · 7 min read
Model Perspective
Dec 5, 2024 · Artificial Intelligence

Choosing the Right Activation Function: Pros, Cons, and Best Practices

Activation functions are crucial for neural networks, providing non‑linearity, normalization, and gradient flow; this article reviews common functions such as Sigmoid, Tanh, ReLU, Leaky ReLU, ELU, Noisy ReLU, Softmax, and Swish, comparing their characteristics, advantages, drawbacks, and guidance for selecting the appropriate one.
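A quick numeric sketch of the trade-off that usually drives the choice among these functions, saturation versus gradient flow (illustrative values, not from the article):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def leaky_relu(x, a=0.01):
    # unlike plain ReLU, passes a small gradient for negative inputs,
    # avoiding "dead" neurons
    return x if x > 0 else a * x

# Derivatives at x = 5: sigmoid saturates, ReLU does not.
ds = sigmoid(5) * (1 - sigmoid(5))  # ~0.0066: almost no gradient flows back
dr = 1.0                            # ReLU's derivative for any x > 0

print(round(ds, 4), dr)
```

This saturation is exactly the mechanism behind the vanishing-gradient problem that sigmoid and tanh suffer from in deep stacks.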

activation functions · deep learning · machine learning
0 likes · 10 min read
Cognitive Technology Team
Nov 20, 2024 · Artificial Intelligence

Fundamentals and Implementation of Neural Networks and Transformers with PyTorch Examples

This article provides a comprehensive overview of neural network fundamentals, loss functions, activation functions, embedding techniques, attention mechanisms, multi‑head attention, residual networks, and the full Transformer encoder‑decoder architecture, illustrated with detailed PyTorch code and a practical MiniRBT fine‑tuning case for Chinese text classification.

AI · PyTorch · Transformer
0 likes · 49 min read
DaTaobao Tech
Nov 13, 2024 · Artificial Intelligence

Understanding Neural Networks and Transformers: Principles, Implementation, and Applications

The article surveys neural networks from basic neuron operations and loss functions through deep architectures to the Transformer model, detailing embeddings, positional encoding, self‑attention, multi‑head attention, residual links, and encoder‑decoder design, and includes PyTorch code examples for linear regression, translation, and fine‑tuning Hugging Face’s MiniRBT for text classification.

AI · NLP · PyTorch
0 likes · 44 min read
Model Perspective
Oct 17, 2024 · Artificial Intelligence

Visualizing How Neural Networks Approximate Any Function

This article explains the universal approximation theorem, showing how even a simple neural network with one hidden layer can approximate any continuous function by adjusting weights and biases, and illustrates the process with visual examples of step and bump functions, linking theory to recent Nobel recognitions.
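The step-and-bump construction described above can be sketched numerically: steep sigmoids act as near-step functions, differences of steps form flat bumps, and a sum of bumps tiles the target function. The parameters below (steepness, number of bins) are illustrative choices, not the article's:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def step(x, at, steepness=500.0):
    # A steep sigmoid: ~0 to the left of `at`, ~1 to the right of it.
    return sigmoid(steepness * (x - at))

def bump(x, left, right, height):
    # Difference of two steps: a flat bump of given height on [left, right].
    return height * (step(x, left) - step(x, right))

def approx(x, n=50):
    """Piecewise-constant approximation of f(x) = x^2 on [0, 1] built
    entirely from one hidden layer's worth of sigmoid units."""
    total = 0.0
    for i in range(n):
        l, r = i / n, (i + 1) / n
        total += bump(x, l, r, ((l + r) / 2) ** 2)
    return total

print(round(approx(0.55), 4), 0.55 ** 2)  # the two values nearly match
```

Increasing the number of bumps (hidden units) shrinks the approximation error, which is the universal approximation theorem in miniature.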

AI · deep learning · function approximation
0 likes · 9 min read
DataFunSummit
Oct 9, 2024 · Artificial Intelligence

2024 Nobel Physics Prize Recognizes Hopfield and Hinton for Foundational AI Discoveries

The 2024 Nobel Prize in Physics was awarded to John J. Hopfield and Geoffrey E. Hinton for pioneering neural‑network research that transformed machine learning, underscoring artificial intelligence’s evolution from a technology into a scientific discipline.

Artificial Intelligence · Hinton · Hopfield
0 likes · 6 min read
JD Tech Talk
Jun 25, 2024 · Artificial Intelligence

Understanding Large Language Models: From Parameters to Transformer Architecture

This article explains the fundamental concepts behind large language models, including their two-file structure, training process, neural network basics, perceptron examples, weight and threshold calculations, the TensorFlow Playground, and a detailed walkthrough of the Transformer architecture with tokenization, positional encoding, self‑attention, normalization, and feed‑forward layers.

AI · Self‑Attention · Transformer
0 likes · 20 min read
Rare Earth Juejin Tech Community
Jun 12, 2024 · Artificial Intelligence

A Simple Introduction to the Transformer Model

This article provides a comprehensive, beginner-friendly explanation of the Transformer architecture, covering its encoder‑decoder structure, self‑attention, multi‑head attention, positional encoding, residual connections, decoding process, final linear and softmax layers, and training considerations, illustrated with numerous diagrams and code snippets.
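Of the components listed, self-attention is the one most easily shown in isolation. A minimal single-head scaled dot-product attention sketch (shapes and values are illustrative, not drawn from the article's diagrams):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Row-wise softmax (shifted by the row max for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted mix of value rows

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))  # 3 tokens, model dimension 4
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))

out = attention(Q, K, V)
print(out.shape)  # (3, 4): one mixed value vector per token
```

Multi-head attention simply runs several such maps in parallel on learned projections of Q, K, and V and concatenates the results.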

Self‑Attention · Transformer · deep learning
0 likes · 24 min read
Architects Research Society
May 21, 2024 · Artificial Intelligence

27 Essential AI Papers Recommended by Ilya Sutskever for John Carmack

Ilya Sutskever, former OpenAI chief scientist, shared a curated list of 27 seminal AI research papers—including the Annotated Transformer, Attention Is All You Need, and Deep Residual Learning—with links, claiming that mastering them covers roughly 90% of today’s essential artificial‑intelligence knowledge.

AI · Research Papers · deep learning
0 likes · 7 min read
Rare Earth Juejin Tech Community
May 5, 2024 · Artificial Intelligence

Comprehensive Guide to Neural Network Algorithms: Definitions, Structure, Implementation, and Training

This article provides an in‑depth tutorial on neural network algorithms, covering their biological inspiration, significance, advantages and drawbacks, detailed architecture, data preparation, one‑hot encoding, weight initialization, forward and backward propagation, cost functions, regularization, gradient checking, and complete Python code examples.
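As a taste of the data-preparation step mentioned above, one-hot encoding turns an integer class label into the target vector a classification network trains against (a sketch; the article's own code may differ):

```python
def one_hot(label, num_classes):
    """Encode an integer class label as a vector with a single 1.0."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

# Class 2 of 5 becomes a 5-dimensional target vector.
print(one_hot(2, 5))  # [0.0, 0.0, 1.0, 0.0, 0.0]
```

Paired with a softmax output layer, this encoding lets the cost function compare the network's predicted class probabilities against the true class directly.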

AI · Python · backpropagation
0 likes · 37 min read
DaTaobao Tech
Apr 22, 2024 · Artificial Intelligence

Neural Networks and Deep Learning: Principles and MNIST Example

The article reviews recent generative‑AI breakthroughs such as GPT‑5 and AI software engineers, explains that AI systems are deterministic rather than black boxes, and then teaches neural‑network fundamentals—including activation functions, back‑propagation, and a hands‑on MNIST digit‑recognition example with discussion of overfitting and regularization.

MNIST · activation functions · deep learning
0 likes · 17 min read