GitHub - asvcode/1_cycle: A disciplined approach to neural network parameters - Reviewing the approach for setting Hyper parameters by Leslie Smith

A Disciplined Approach to Neural Network Hyper-parameters: Part 1 - Learning Rate, Batch Size, Momentum and Weight Decay

A review of Leslie Smith's approach to setting hyper-parameters.
'Setting the hyper-parameters remains a black art that requires years of experience to acquire' - Leslie Smith

You can read the paper here: https://arxiv.org/abs/1803.09820

The 1 cycle policy uses a single cycle made of 2 steps of equal length: in Step 1 the learning rate increases linearly from the minimum to the maximum, and in Step 2 it decreases linearly back to the minimum.

The high learning rate at the peak in the middle of the cycle (at 100 iterations) acts as a form of regularization that helps prevent overfitting.
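The two-step cycle described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's code: the function name `one_cycle_lr`, the learning rate bounds `0.01` and `0.1`, and the cycle length of 200 iterations are assumptions (the length is inferred from the peak sitting at 100 iterations with two equal steps).

```python
def one_cycle_lr(iteration, cycle_len, lr_min, lr_max):
    """Learning rate at `iteration` for a two-step linear cycle.

    Step 1 (first half): rise linearly from lr_min to lr_max.
    Step 2 (second half): fall linearly from lr_max back to lr_min.
    `iteration` runs from 0 to `cycle_len` inclusive.
    """
    if not 0 <= iteration <= cycle_len:
        raise ValueError("iteration must lie within the cycle")
    half = cycle_len / 2
    if iteration <= half:
        # Step 1: fraction of the way up the ramp
        return lr_min + (lr_max - lr_min) * (iteration / half)
    # Step 2: fraction of the way down the ramp
    return lr_max - (lr_max - lr_min) * ((iteration - half) / half)


# Hypothetical settings: 200 iterations, so the peak falls at iteration 100.
schedule = [one_cycle_lr(i, 200, lr_min=0.01, lr_max=0.1) for i in range(201)]
peak_iteration = max(range(201), key=lambda i: schedule[i])
print(peak_iteration)
```

In a training loop, the value returned for the current iteration would be assigned to the optimizer's learning rate before each update.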

Batch Size and Learning Rate Analysis

A low batch size (BS) with a high learning rate (LR), as well as a high batch size with a high learning rate, produce the highest accuracy.

Learning and Validation Loss Analysis based on Weight Decay

Pictorial explanation of the tradeoff between underfitting and overfitting

The graphs, from left to right, portray the Training (orange) and Validation (blue) loss plots for a weight decay (wds) of 1e-5, 1e-4, 1e-3 and 1e-2.
The Training loss is above the Validation loss when wds is 1e-5 and 1e-4, but the two losses intersect when wds is 1e-3 and 1e-2.
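A comparison of this kind can be reproduced in miniature by sweeping the weight decay and recording both losses for each value. The sketch below is only a toy, assuming the four values are the negative powers of ten 1e-5 to 1e-2: it fits a one-weight linear model with full-batch gradient descent rather than the network used in the notebook, and every name in it (`make_data`, `train`, `sweep`) is hypothetical.

```python
import random


def make_data(n, rng, noise=0.3):
    """Toy 1-D regression data: y = 2x + Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def mse(w, xs, ys):
    """Mean squared error of the one-weight model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, wd, lr=0.1, epochs=500):
    """Full-batch gradient descent with L2 weight decay `wd`."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * (grad + wd * w)  # the wd * w term pulls w toward zero
    return w


def sweep(wds, seed=0):
    """Return {wd: (training_loss, validation_loss)} for each weight decay."""
    rng = random.Random(seed)
    train_x, train_y = make_data(20, rng)
    valid_x, valid_y = make_data(200, rng)
    results = {}
    for wd in wds:
        w = train(train_x, train_y, wd)
        results[wd] = (mse(w, train_x, train_y), mse(w, valid_x, valid_y))
    return results


results = sweep([1e-5, 1e-4, 1e-3, 1e-2])
for wd, (train_loss, valid_loss) in results.items():
    print(f"wd={wd:g}  train={train_loss:.4f}  valid={valid_loss:.4f}")
```

On a model this small the differences between the settings are slight; the point is the shape of the experiment, where only the weight decay changes between runs and the training and validation losses are compared side by side.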

