
Conversation

@Adhithya-Laxman

Description

This PR implements the Adagrad (Adaptive Gradient) optimizer using pure NumPy as part of the effort to add neural network optimizers to the repository.

This PR addresses part of issue #13662 - Add neural network optimizers module to enhance training capabilities

What does this PR do?

  • Implements Adagrad optimizer that adapts the learning rate for each parameter individually based on historical gradient information
  • Accumulates squared gradients and scales learning rate inversely with the square root of this accumulation
  • Particularly effective for sparse data and features with varying frequencies
  • Provides a clean, educational implementation without external deep learning frameworks

Implementation Details

  • Algorithm: Adagrad (Adaptive Gradient)
  • Update rule (a minimal sketch follows this list):
    accumulated_grad += gradient^2
    adjusted_lr = learning_rate / (sqrt(accumulated_grad) + epsilon)
    param = param - adjusted_lr * gradient
    
  • Key Features:
    • Parameter-specific adaptive learning rates
    • Accumulation of squared gradients over time
    • Epsilon term for numerical stability
  • Pure NumPy: No PyTorch, TensorFlow, or other frameworks required
  • Educational focus: Clear variable names, detailed docstrings, and comments
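
The update rule above, expressed as a minimal NumPy sketch. This is illustrative only; the class name Adagrad and the update signature are assumptions, and the actual code in neural_network/optimizers/adagrad.py may differ:

    import numpy as np


    class Adagrad:
        """Sketch of Adagrad: per-parameter adaptive learning rates."""

        def __init__(self, learning_rate: float = 0.01, epsilon: float = 1e-8) -> None:
            self.learning_rate = learning_rate
            self.epsilon = epsilon
            self.accumulated_grad: np.ndarray | None = None

        def update(self, params: np.ndarray, gradients: np.ndarray) -> np.ndarray:
            # Lazily create the squared-gradient accumulator with the parameter shape.
            if self.accumulated_grad is None:
                self.accumulated_grad = np.zeros_like(params)
            # Accumulate squared gradients over all updates seen so far.
            self.accumulated_grad += gradients**2
            # Scale the learning rate inversely with the root of the accumulation.
            adjusted_lr = self.learning_rate / (np.sqrt(self.accumulated_grad) + self.epsilon)
            return params - adjusted_lr * gradients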

Features

✅ Complete docstrings with parameter descriptions
✅ Type hints for all function parameters and return values
✅ Doctests for correctness validation (an illustrative example follows this list)
✅ Usage example demonstrating optimizer on quadratic function minimization
✅ PEP8 compliant code formatting
✅ Accumulated gradient tracking per parameter
✅ Numerical stability with epsilon parameter
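
For illustration, a doctest in the module's docstring could look roughly like the one below; it reuses the hypothetical Adagrad class from the sketch above, not necessarily the exact API in this PR:

    >>> import numpy as np
    >>> optimizer = Adagrad(learning_rate=0.1)  # hypothetical class from the sketch above
    >>> params = np.array([1.0, 2.0])
    >>> gradients = np.array([0.5, -0.5])
    >>> np.round(optimizer.update(params, gradients), 4)
    array([0.9, 2.1])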

Testing

All doctests pass:

python -m doctest neural_network/optimizers/adagrad.py -v

Linting passes:

ruff check neural_network/optimizers/adagrad.py

Example output demonstrates proper convergence behavior, with learning rates automatically adapting for each parameter.
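
For reference, the quadratic-minimization usage example could look roughly like this sketch, which again assumes the hypothetical Adagrad class shown earlier rather than the PR's exact code:

    import numpy as np

    # Minimize f(x) = sum((x - target) ** 2); its gradient is 2 * (x - target).
    target = np.array([3.0, -2.0])
    params = np.zeros(2)
    optimizer = Adagrad(learning_rate=0.5)  # hypothetical class from the sketch above

    for _ in range(200):
        gradients = 2 * (params - target)
        params = optimizer.update(params, gradients)

    print(params)  # converges toward [3.0, -2.0]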

References

  • Wikipedia: AdaGrad (in the article on stochastic gradient descent): https://en.wikipedia.org/wiki/Stochastic_gradient_descent#AdaGrad
  • Duchi, Hazan, and Singer (2011), "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization", JMLR 12

Relation to Issue #13662

This PR is part of the planned optimizer sequence outlined in #13662; the remaining optimizers in that sequence will follow in separate PRs (see Next Steps below).

Why Adagrad?

Adagrad is particularly useful for:

  • Training on sparse data (e.g., NLP tasks)
  • Handling features that appear with different frequencies
  • Automatic learning rate adaptation without manual tuning
  • Gradually damping updates to frequently updated parameters, since their accumulated squared gradients shrink the effective learning rate (see the sketch below)
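
The last point can be seen numerically with the update rule alone: a parameter that receives a gradient on every step accumulates a large squared-gradient sum, so its effective learning rate shrinks much faster than that of a rarely updated (sparse) parameter. A small sketch, using only the accumulation formula:

    import numpy as np

    learning_rate, epsilon = 0.1, 1e-8
    accumulated = np.zeros(2)

    for step in range(1, 101):
        # Parameter 0 gets a gradient on every step (frequent feature);
        # parameter 1 only on every tenth step (sparse feature).
        gradient = np.array([1.0, 1.0 if step % 10 == 0 else 0.0])
        accumulated += gradient**2

    effective_lr = learning_rate / (np.sqrt(accumulated) + epsilon)
    print(effective_lr)  # roughly [0.01, 0.0316]: the frequent parameter's rate has decayed far more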

Checklist

  • I have read CONTRIBUTING.md
  • This pull request is all my own work -- I have not plagiarized
  • I know that pull requests will not be merged if they fail the automated tests
  • This PR only changes one algorithm file
  • All new Python files are placed inside an existing directory
  • All filenames are in all lowercase characters with no spaces or dashes
  • All functions and variable names follow Python naming conventions
  • All function parameters and return values are annotated with Python type hints
  • All functions have doctests that pass the automated testing
  • All new algorithms include at least one URL that points to Wikipedia or another similar explanation

Next Steps

Additional optimizers (NAG, Adam, Muon) will be submitted in follow-up PRs to maintain focused, reviewable contributions as outlined in issue #13662.


Related: Part of #13662

- Implements Adagrad (Adaptive Gradient) using pure NumPy
- Adapts learning rate individually for each parameter
- Includes comprehensive docstrings and type hints
- Adds doctests for validation
- Provides usage example demonstrating convergence
- Follows PEP8 coding standards
- Part of issue TheAlgorithms#13662