Question about SGD update #6

codlife · 2016-11-19T02:37:11Z

Hi @zhangyuc,
In your code, the weight update is written as follows:
while (k < n) {
  values(k) *= -actual_stepsize / math.sqrt(s(k) + 1e-16f)
  k += 1
}
What puzzles me is why the update takes this form, which does not look like the standard one, e.g. w = w - stepsize * gradient.
Thank you!

zhangyuc · 2016-11-19T03:30:02Z

It implements the AdaGrad algorithm (http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf), which uses a different step size for each coordinate. The step size of coordinate k is scaled by 1 / sqrt(s(k)), where s(k) is the accumulated sum of squared gradients for that coordinate.
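
For illustration, here is a minimal standalone sketch of a per-coordinate AdaGrad step. It is not the repository's code: the names `AdaGradSketch`, `adagradStep`, `w`, `g` and `stepsize` are assumptions made for this example, and only the `1e-16f` constant and the `s(k)` accumulator come from the snippet in the question. The snippet there appears to scale the gradient stored in `values` in place rather than updating the weights directly, but that reading is an assumption too.

```scala
object AdaGradSketch {
  // One in-place AdaGrad step (illustrative names, not from the repository).
  //   w        : weights to update
  //   g        : gradient for the current step
  //   s        : running sum of squared gradients, one entry per coordinate
  //   stepsize : base step size shared by all coordinates
  def adagradStep(w: Array[Float], g: Array[Float], s: Array[Float], stepsize: Float): Unit = {
    val n = w.length
    var k = 0
    while (k < n) {
      // Accumulate the squared gradient of coordinate k.
      s(k) += g(k) * g(k)
      // Per-coordinate step size; the small constant avoids division by zero.
      val scaled = stepsize / math.sqrt(s(k) + 1e-16f).toFloat
      // Same shape as w = w - stepsize * gradient, with a coordinate-specific step size.
      w(k) -= scaled * g(k)
      k += 1
    }
  }
}
```

With `w = Array(1f, 1f)`, `g = Array(2f, 0.5f)`, a zero-initialized `s` and `stepsize = 0.1f`, the first step moves both coordinates by roughly 0.1, even though their gradients differ by a factor of four. Coordinates with large accumulated gradients get smaller effective step sizes.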

codlife · 2016-11-19T05:02:46Z

Got it, thank you!
