Update LoggingHandler to support logging per interval #16922

liuzh47 · 2019-11-27T08:08:31Z

Description

Add support for logging per interval to the default LoggingHandler. Interval logging acts as a middle ground between logging per batch and logging per epoch. Please refer to the description in the linked issue.

Fixes #16918

eric-haibin-lin

Shall we add a unit test for this?

python/mxnet/gluon/contrib/estimator/event_handler.py

liuzh47 · 2019-12-02T03:17:45Z

> Shall we add a unit test for this?

What assertion shall we make in this case?

leezu · 2019-12-02T04:00:40Z

You could assert the number of lines logged. For example, if you specify training for 10 batches and log every 5 batches, you may test for two lines being logged.
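The line-count assertion suggested here could look roughly like the sketch below. It uses only the Python standard library. `run_batches` and `ListHandler` are hypothetical stand-ins invented for illustration; they are not the MXNet `LoggingHandler` API, and the real test would drive the estimator instead.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted log messages in memory so a test can count them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


def run_batches(num_batches, log_interval, logger):
    """Hypothetical stand-in for a training loop that logs every `log_interval` batches."""
    for batch_index in range(1, num_batches + 1):
        if batch_index % log_interval == 0:
            logger.info("batch %d/%d", batch_index, num_batches)


logger = logging.getLogger("interval_logging_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the sketch from writing to the root logger
handler = ListHandler()
logger.addHandler(handler)

# Train for 10 batches, log every 5 batches: two lines are expected.
run_batches(num_batches=10, log_interval=5, logger=logger)
assert len(handler.lines) == 2
```

Capturing records through a custom `logging.Handler` keeps the check independent of where the log output would normally be written.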

roywei

Thanks for the fix. Shall we remove LOG_PER_BATCH, since it can be replaced by LOG_PER_INTERVAL?
Since this is in contrib, I think it's fine, and fewer options will be easier for users.

liuzh47 · 2019-12-03T02:17:19Z

> You could assert the number of lines logged. For example, if you specify training for 10 batches and log every 5 batches, you may test for two lines being logged.

OK, I'll give it a try.

liuzh47 · 2019-12-03T02:29:43Z

> Thanks for the fix. Shall we remove LOG_PER_BATCH, since it can be replaced by LOG_PER_INTERVAL?
> Since this is in contrib, I think it's fine, and fewer options will be easier for users.

Thanks for the suggestion.

I considered that option before. Batch logging and interval logging mostly overlap, but during batch logging we only log the training metrics:

            for metric in self.train_metrics:
                # only log current training loss & metric after each batch
                name, value = metric.get()
                msg += '%s: %.4f, ' % (name, value)

During interval logging, however, I think it is better to log both training and validation metrics, as in the snippet below. In that case, is LOG_PER_BATCH still necessary?

            for monitor in self.train_metrics + self.val_metrics:
                name, value = monitor.get()
                msg += '%s: %.4f, ' % (name, value)

Alternatively, we can merge LOG_PER_BATCH and LOG_PER_INTERVAL and use self.log_interval to decide whether to log the validation metrics.

leezu · 2019-12-03T02:59:07Z

I'm actually not sure if logging both self.train_metrics + self.val_metrics during LOG_PER_INTERVAL is good.
Currently the notion of val_metrics and train_metrics is not clearly decoupled. Thinking about how to tackle #16959 may also help to clarify the relation of eval_metrics and train_metrics and how their values should be updated and logged.

liuzh47 · 2019-12-03T03:18:37Z

> I'm actually not sure if logging both self.train_metrics + self.val_metrics during LOG_PER_INTERVAL is good.
> Currently the notion of val_metrics and train_metrics is not clearly decoupled. Thinking about how to tackle #16959 may also help to clarify the relation of eval_metrics and train_metrics and how their values should be updated and logged.

It makes sense to me. I'll merge LOG_PER_BATCH and LOG_PER_INTERVAL and leave out the val_metrics.
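As a rough illustration of that merged behaviour, the sketch below formats training metrics only and treats per-batch logging as the special case of an interval of 1. `FakeMetric`, `format_train_metrics`, and `batches_to_log` are hypothetical names invented for this sketch; only the `get()` call returning `(name, value)` and the `'%s: %.4f, '` format come from the snippets quoted earlier in the thread. This is not the code that was merged.

```python
class FakeMetric:
    """Hypothetical stand-in with the get() -> (name, value) shape used above."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def get(self):
        return self.name, self.value


def format_train_metrics(train_metrics):
    """Build the log message from training metrics only (no validation metrics)."""
    msg = ''
    for metric in train_metrics:
        name, value = metric.get()
        msg += '%s: %.4f, ' % (name, value)
    # Drop the trailing separator for readability.
    return msg.rstrip(', ')


def batches_to_log(num_batches, log_interval):
    """Return the 1-based batch indices at which a message would be emitted."""
    return [i for i in range(1, num_batches + 1) if i % log_interval == 0]


train_metrics = [FakeMetric('accuracy', 0.9), FakeMetric('loss', 0.25)]
print(format_train_metrics(train_metrics))
print(batches_to_log(4, 1))    # an interval of 1 logs after every batch
print(batches_to_log(10, 5))   # an interval of 5 logs twice in 10 batches
```

With a single interval setting, a separate per-batch mode adds nothing that an interval of 1 does not already cover, which is the simplification proposed in the review.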

…r batch interval (apache#16922)

* Update LoggingHandler to support logging per interval
* Fix the constant variable issue in the logging handler
* Remove the constant variable hack in the logging handler.
* 1) replace LOG_PER_BATCH with LOG_PER_INTERVAL 2) add test case
* Improve the test script for LoggingHandler
* small fix on the test script
* logging handler test case bug fix
* remove parameter verbose from LoggingHandler
* move log_interval to the first argument
* resolve unittest mistakes

* Fix ndarray indexing bug (#16895)
  * Fix indexing bug
  * More test cases
  * Add test from 16647
* [Gluon] Update contrib.Estimator LoggingHandler to support logging per batch interval (#16922)
  * Update LoggingHandler to support logging per interval
  * Fix the constant variable issue in the logging handler
  * Remove the constant variable hack in the logging handler.
  * 1) replace LOG_PER_BATCH with LOG_PER_INTERVAL 2) add test case
  * Improve the test script for LoggingHandler
  * small fix on the test script
  * logging handler test case bug fix
  * remove parameter verbose from LoggingHandler
  * move log_interval to the first argument
  * resolve unittest mistakes
* Add micro averaging strategy to pearsonr metric (#16878)
  Strategy to be used for aggregating across mini-batches. "macro": average the pearsonr scores for each batch. "micro": compute a single pearsonr score across all batches.
* [Bugfix] [Numpy] Add `kAddTo` and kNullOp to Transpose (#16979)
  * update Check for repeated axes enable addto to transpose fix fix fix fix remove unused ndim Update pseudo2DTranspose_op-inl.cuh Update pseudo2DTranspose_op-inl.cuh Update pseudo2DTranspose_op-inl.cuh fix Update pseudo2DTranspose_op-inl.cuh try to fix Update pseudo2DTranspose_op-inl.cuh Update pseudo2DTranspose_op-inl.cuh Update pseudo2DTranspose_op-inl.cuh fix Update np_matrix_op.cc Update test_numpy_op.py update test case fix implementation fix bug update fix bug Update pseudo2DTranspose_op-inl.cuh fix fix Update test_numpy_op.py
  * Fix bug
  * fix docstring
  * try to address comment
  * no need to change this line
  * Fix bug
  * address comments
  * address comment
* introduce gradient update handler to the base estimator (#16900)
  * introduce gradient update handler to the base estimator
  * Modify the gradient update handler to include the batch size
  * Remove unrelated gradient update handler.
  * Modify gradient update handler to take the current batch size.
  * Remove white space to avoid the sanity check failure
  * add small tweak to the handler code
  * Modify the documentation of priority parameter of relevant handlers.
  * small modification on the documentation.
  * Add small modification on the documentation.
  * Remove unnecessary list check

Update LoggingHandler to support logging per interval

09d1db8

liuzh47 requested a review from szha as a code owner November 27, 2019 08:08

leezu requested a review from roywei November 28, 2019 02:50

leezu added the R1.6.0 label Nov 28, 2019

eric-haibin-lin reviewed Nov 29, 2019

python/mxnet/gluon/contrib/estimator/event_handler.py (outdated)

Fix the constant variable issue in the logging handler

fea0bc6

Remove the constant variable hack in the logging handler.

1d588df

roywei reviewed Dec 2, 2019

1) replace LOG_PER_BATCH with LOG_PER_INTERVAL 2) add test case

20e44a0

leezu reviewed Dec 4, 2019

python/mxnet/gluon/contrib/estimator/event_handler.py (outdated)

liuzh47 added 4 commits December 5, 2019 04:26

Improve the test script for LoggingHandler

05018ed

small fix on the test script

7820ea9

logging handler test case bug fix

0a3532b

remove parameter verbose from LoggingHandler

458d22b

leezu reviewed Dec 5, 2019

python/mxnet/gluon/contrib/estimator/event_handler.py

liuzh47 added 2 commits December 5, 2019 07:57

move log_interval to the first argument

59d7915

resolve unittest mistakes

777addd

roywei approved these changes Dec 6, 2019

leezu added the API change label Dec 7, 2019

leezu merged commit f06fdf6 into apache:master Dec 7, 2019

ptrendx mentioned this pull request Dec 9, 2019

Backport #16895, #16922, #16878, #16979 and #16900 to 1.6 #17029

Merged

leezu mentioned this pull request Dec 10, 2019

Include eval_net the validation model in the estimator api #16957

Merged
