Tagged articles

hyperparameters

11 articles · Page 1 of 1

Apr 26, 2026 · Artificial Intelligence

Has Deep Learning Discovered Its Own “Newton’s Law”?

A new collaborative paper titled “There Will Be a Scientific Theory of Deep Learning” proposes a unified “Learning Mechanics” framework that connects solvable idealized models, tractable limits, empirical scaling laws, hyperparameter theory, and universal representation behavior, aiming to give deep learning a first‑principles scientific foundation.

Deep Learning · hyperparameters · learning mechanics

0 likes · 14 min read

Fun with Large Models

Apr 17, 2026 · Artificial Intelligence

Mastering Large Model Training: Practical Parameter Tuning from Beginner to Pro

This guide walks you through interpreting training logs and loss curves, diagnosing common issues such as oscillation, under‑fitting, and over‑fitting, and applying concrete adjustments to learning rate, LoRA settings, batch size, and epochs, with scenario‑specific strategies to turn a novice into a tuning expert.

AI training · LoRA · hyperparameters

0 likes · 23 min read

Fun with Large Models

Apr 1, 2026 · Artificial Intelligence

A Beginner's Deep Dive into Large‑Model Training Parameters with LLaMAFactory

This article walks readers through the three major training methods—full‑parameter, LoRA, and QLoRA—explaining their memory costs, data requirements, and trade‑offs, then provides a line‑by‑line breakdown of LLaMAFactory configuration files, hyper‑parameter tuning guidelines, and the process for merging LoRA adapters into a deployable model.

LLaMAFactory · LoRA · Model Merge

0 likes · 27 min read

Amazon Cloud Developers

Sep 12, 2025 · Artificial Intelligence

Fine‑Tune Amazon Nova Canvas in 12 Hours for Consistent, Cohesive AI Storyboards (Part 2)

This guide shows how to fine‑tune the Amazon Nova Canvas foundation model on Amazon Bedrock using a 12‑hour workflow that extracts character frames from video with Amazon Rekognition, prepares labeled data, configures hyper‑parameters, creates a custom model, deploys it with provisioned throughput, and tests the model to generate coherent storyboard images, while also covering cleanup steps to avoid ongoing costs.

AI storyboard · Amazon Bedrock · Amazon Nova Canvas

0 likes · 17 min read

Baobao Algorithm Notes

Mar 30, 2025 · Artificial Intelligence

Why Scaling, Data, and Infra Matter More Than Reward Design in R1 Replication

The article analyses two months of community attempts to reproduce DeepSeek R1, highlighting that model scaling, high‑quality data, robust training infrastructure, and careful hyper‑parameter tuning outweigh pure reward‑based tricks, and it outlines common pitfalls and future research directions.

DeepSeek · LLM · RLHF

0 likes · 13 min read

Baobao Algorithm Notes

Nov 11, 2024 · Artificial Intelligence

Sneaky Tricks to Inflate Deep Learning Model Scores (And Why They’re Misleading)

The article enumerates a series of dubious techniques—from inflating batch sizes and hidden compute to hyper‑parameter tricks and fabricated evaluation methods—designed to artificially boost deep‑learning model scores on benchmarks, exposing how easy it is to game performance metrics.

AI tricks · Deep Learning · benchmark cheating

0 likes · 9 min read

NewBeeNLP

Jul 31, 2024 · Artificial Intelligence

Training 7B–13B LLMs: Practical Tips, Hyperparameters, and Scaling Challenges

The article shares hands‑on experience training 7‑ and 13‑billion‑parameter language models, covering essential hyper‑parameters, hardware requirements, data quality considerations, open dataset resources, and the systemic difficulties that arise when scaling to trillion‑parameter models.

LLM training · Large Language Models · hyperparameters

0 likes · 8 min read

Rare Earth Juejin Tech Community

Jul 5, 2024 · Artificial Intelligence

Understanding and Tuning Hyperparameters for Large Language Models

This article explores the role of hyperparameters in large language models, explains each key hyperparameter, and guides readers through manual and automated tuning methods such as random search, grid search, and Bayesian optimization to achieve optimal model performance.

AI · LLM · Model tuning

0 likes · 18 min read

NewBeeNLP

Feb 22, 2024 · Artificial Intelligence

Practical Tips for CPT, SFT, and LoRA in Large Language Model Fine‑Tuning

This article shares hands‑on guidance on using continual pre‑training (CPT), supervised fine‑tuning (SFT), and LoRA adapters for large language models, covering dataset size requirements, learning‑rate scheduling, warm‑up ratios, epoch strategies, and practical routing choices based on real‑world experiments.

CPT · LLM fine-tuning · LoRA

0 likes · 12 min read

Architects' Tech Alliance

Sep 3, 2020 · Artificial Intelligence

Deep Learning Specialization Infographic Overview

This article presents a comprehensive English summary of the deep learning specialization infographics originally shared by Andrew Ng, covering fundamentals, logistic regression, shallow and deep neural networks, regularization, optimization, hyperparameters, convolutional and recurrent networks, and practical advice for model building and evaluation.

CNN · Deep Learning · Optimization

0 likes · 21 min read

Sohu Tech Products

Mar 6, 2019 · Artificial Intelligence

Applying Word2Vec Embeddings to Rental and News Recommendation: Model, Hyper‑parameters, and Optimization

This article explains the fundamentals of the Word2Vec SGNS model, details its hyper‑parameters and training tricks, and demonstrates how customized embeddings are built for rental‑listing and news‑article recommendation, covering data preparation, objective‑function redesign, evaluation, and deployment in both recall and ranking stages.

Embedding · SGNS · Word2Vec

0 likes · 14 min read
