Dynamic Learning Rate Adjustment in PyTorch: Optimizer Basics and Scheduler Usage
This article explains how to configure and use PyTorch optimizers, their attributes and methods, and demonstrates various learning‑rate scheduling techniques—including manual updates and built‑in schedulers such as LambdaLR, StepLR, MultiStepLR, ExponentialLR, CosineAnnealingLR, and ReduceLROnPlateau—through clear code examples.
Learning rate is crucial for training neural networks; starting with a larger rate speeds up learning, then gradually decreasing it helps find the optimum. In PyTorch, dynamic learning‑rate adjustment can be performed using optimizers and schedulers.
Optimizer Basics
Typical training steps are:
<code>loss.backward()<br/>optimizer.step()<br/>optimizer.zero_grad()<br/>...</code>loss.backward() computes gradients, optimizer.step() updates parameters, and optimizer.zero_grad() clears gradients for the next iteration. Common optimizers reside in torch.optim and are imported as:
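Put together, and assuming an illustrative model, loss, and data (none of these names come from the article), one full training loop looks like this:

```python
import torch
import torch.nn as nn

# Illustrative setup: a linear model, random data, and MSE loss
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
x = torch.randn(32, 10)
y = torch.randn(32, 2)

for epoch in range(5):
    optimizer.zero_grad()          # clear gradients from the previous iteration
    loss = criterion(model(x), y)  # forward pass
    loss.backward()                # compute gradients
    optimizer.step()               # update parameters
```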
<code>from torch.optim import Adam<br/>from torch.optim import SGD</code>A simple network example:
<code>import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer = nn.Linear(10, 2)
        self.layer2 = nn.Linear(2, 10)

    def forward(self, input):
        return self.layer2(self.layer(input))  # pass through both layers so each receives gradients</code>Optimizer Core Attributes
lr : learning rate
eps : small constant added to the denominator for numerical stability
weight_decay : L2 regularization coefficient
betas : coefficients used for the running averages of the gradient and its square (Adam)
amsgrad : bool, whether to use the AMSGrad variant of Adam
Each optimizer maintains a param_groups list that stores parameters and their specific settings.
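For example (using a hypothetical one-layer model), each entry of param_groups is a dict holding a 'params' list plus that group's hyperparameters:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # hypothetical model
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

group = optimizer.param_groups[0]   # one dict per parameter group
print(group['lr'])                  # hyperparameters live alongside the 'params' list
print(len(group['params']))         # weight and bias of the linear layer
```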
Optimizer Core Methods
add_param_group(param_group) : add a new parameter group (useful for fine‑tuning)
load_state_dict(state_dict) : load saved optimizer state
state_dict() : return a dict containing state and param_groups
step(closure) : perform a parameter update
zero_grad() : clear gradients of all parameters
Creating an optimizer is straightforward:
<code>model = Net()
optimizer_Adam = torch.optim.Adam(model.parameters(), lr=0.1)</code>model.parameters() returns all model parameters, which are passed to the optimizer with a specified learning rate.
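A brief sketch of two of the methods listed above: add_param_group (handy when unfreezing layers during fine-tuning) and the state_dict/load_state_dict pair. The two modules here are illustrative:

```python
import torch
import torch.nn as nn

backbone = nn.Linear(10, 2)   # illustrative modules
head = nn.Linear(2, 10)

# Start by optimizing only the backbone...
optimizer = torch.optim.Adam(backbone.parameters(), lr=0.1)

# ...then add the head as a second group with its own learning rate
optimizer.add_param_group({'params': head.parameters(), 'lr': 0.01})

# state_dict() captures every group's settings; load_state_dict() restores them
saved = optimizer.state_dict()
optimizer.load_state_dict(saved)
print(len(optimizer.param_groups), optimizer.param_groups[1]['lr'])
```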
Training Only Part of a Model
<code>model = Net()
optimizer_Adam = torch.optim.Adam(model.layer.parameters(), lr=0.1) # only updates layer</code>Setting Different Learning Rates for Different Parts
<code>params_dict = [
{'params': model.layer.parameters(), 'lr': 0.1},
{'params': model.layer2.parameters(), 'lr': 0.2}
]
optimizer_Adam = torch.optim.Adam(params_dict)</code>Manual learning‑rate modification during training can be done by iterating over optimizer.param_groups and adjusting the 'lr' entry:
<code>import matplotlib.pyplot as plt

lr_list = []
for epoch in range(100):
    if epoch % 5 == 0:
        for params in optimizer_Adam.param_groups:
            params['lr'] *= 0.9
    lr_list.append(optimizer_Adam.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list, color='r')
plt.show()</code>Learning‑Rate Schedulers (torch.optim.lr_scheduler)
The package provides several scheduler classes:
LambdaLR
StepLR
MultiStepLR
ExponentialLR
CosineAnnealingLR
ReduceLROnPlateau
Note: since PyTorch 1.1.0, scheduler.step() should be called after optimizer.step(); calling it the other way round skips the first value of the schedule. The typical order is loss.backward() → optimizer.step() → scheduler.step().
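As a concrete sketch of that ordering (the model, data, and StepLR settings are illustrative, not from the article):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)
criterion = nn.MSELoss()
x, y = torch.randn(8, 10), torch.randn(8, 2)

for epoch in range(10):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()    # update parameters first...
    scheduler.step()    # ...then advance the schedule

# after 10 epochs with step_size=5 the lr has been halved twice: 0.1 * 0.5**2
print(optimizer.param_groups[0]['lr'])
```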
LambdaLR
<code>torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda, last_epoch=-1)</code>lr_lambda is a function (or list of functions) that receives the epoch index and returns a scaling factor α; the new learning rate is initial_lr * α .
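The article gives no code for LambdaLR, so here is a sketch in the same style as the scheduler examples below, halving the rate every 20 epochs (the lambda and epoch count are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer_Adam = torch.optim.Adam(model.parameters(), lr=0.1)

# alpha(epoch) = 0.5 ** (epoch // 20), so the effective lr is 0.1 * alpha
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer_Adam, lr_lambda=lambda epoch: 0.5 ** (epoch // 20))

lr_list = []
for epoch in range(100):
    scheduler.step()
    lr_list.append(optimizer_Adam.state_dict()['param_groups'][0]['lr'])

print(lr_list[0], lr_list[-1])  # starts at 0.1, ends at 0.1 * 0.5**5
```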
StepLR
<code>lr_list_1 = []
scheduler = torch.optim.lr_scheduler.StepLR(optimizer_Adam, step_size=5, gamma=0.5, last_epoch=-1)
for epoch in range(100):
    scheduler.step()
    lr_list_1.append(optimizer_Adam.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list_1, color='r', label='lr')
plt.legend()
plt.show()</code>MultiStepLR
<code>lr_list_1 = []
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer_Adam,
    milestones=[20, 40, 60, 80],
    gamma=0.5,
    last_epoch=-1)
for epoch in range(100):
    scheduler.step()
    lr_list_1.append(optimizer_Adam.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list_1, color='r', label='lr')
plt.legend()
plt.show()</code>ExponentialLR
<code>lr_list_1 = []
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer_Adam, gamma=0.9, last_epoch=-1)
for epoch in range(100):
    scheduler.step()
    lr_list_1.append(optimizer_Adam.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list_1, color='r', label='lr')
plt.legend()
plt.show()</code>CosineAnnealingLR
<code>lr_list_1 = []
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer_Adam, T_max=25, eta_min=0, last_epoch=-1)
for epoch in range(100):
    scheduler.step()
    lr_list_1.append(optimizer_Adam.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list_1, color='r', label='lr')
plt.legend()
plt.show()</code>ReduceLROnPlateau
<code>scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=10, verbose=False,
    threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)
for epoch in range(10):
    train(...)
    val_loss = validate(...)
    scheduler.step(val_loss)  # pass the monitored metric; lr drops after `patience` epochs without improvement</code>These schedulers enable flexible, automated learning‑rate adjustments based on epoch count or validation metrics, facilitating faster convergence and better model performance.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.