
Model CI #9002

Closed · wants to merge 1 commit into from

Conversation

@Superjomn (Contributor) commented Mar 12, 2018

fix #8903

@tonyyang-svail tonyyang-svail changed the title init Model CI Mar 12, 2018
@wangkuiyi (Collaborator) left a comment:

Thanks to @Superjomn for considering a model testing solution. Is this a work in progress?

@@ -0,0 +1,47 @@
# Model CI

A simple Continuous Integration for Models, tracking the overall effect and performance.
A Collaborator commented on this line:

What is the plan to run this CI? Are we going to bridge this CI with TeamCity, or set it up as a new configuration on Travis-CI?

@Superjomn (Contributor, Author) replied:

In the beginning, this is just a bunch of scripts, with no relation to TeamCity or other CI platforms.

It might be integrated with TeamCity later, but currently the plan is just a while-loop process that keeps testing the last merged code and tracking its performance and precision.
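Such a while-loop driver could be sketched as below (a minimal illustration, assuming a local clone and a placeholder `run_model_tests`; none of these names come from the actual scripts):

```python
import subprocess
import time

def latest_commit(repo_dir):
    """Return the HEAD commit hash of the local clone."""
    out = subprocess.run(["git", "-C", repo_dir, "rev-parse", "HEAD"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def run_model_tests(repo_dir):
    """Placeholder: run the model scripts and collect the tracked metrics."""
    # e.g. subprocess.run(["python", "train.py"], cwd=repo_dir, check=True)
    return {"train_cost": 0.0, "duration_per_batch": 0.0}

def ci_loop(repo_dir, poll_seconds=300):
    tested = None
    while True:                      # keep testing the last merged code
        subprocess.run(["git", "-C", repo_dir, "pull"], check=True)
        head = latest_commit(repo_dir)
        if head != tested:           # only re-test newly merged commits
            print(head[:8], run_model_tests(repo_dir))
            tested = head
        time.sleep(poll_seconds)
```

A real version would also persist the metrics per commit so regressions can be traced back to the merge that introduced them.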

@Superjomn (Contributor, Author) replied:

I am working on it; it seems like just a few days of work. It will be kept simple in the beginning, and more factors that need tracking can be added later.

But it needs some computation resources, such as a free GPU machine, to test several classical models in both CPU and GPU mode (single card).
@wangkuiyi

@panyx0718 panyx0718 self-requested a review March 13, 2018 02:33
@Superjomn (Contributor, Author) commented:

WIP, will reopen later.

@Superjomn Superjomn closed this Mar 13, 2018
@@ -0,0 +1,47 @@
# Model CI

A simple Continuous Integration for Models, tracking the overall effect and performance.

A Contributor commented on this line:

In my mind, the model integration job needs to collect indicators in three aspects: model evaluation (i.e. loss, accuracy), speed, and memory cost.

For our regression test, we only need to run several batches (say 100) once the training process is stable, then collect the speed and memory cost data. We can also compare the first several batch losses to validate that the model evaluation is correct.
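The collection step described here could be sketched as follows (a hedged sketch: `train_one_batch` is a stand-in for the real training step, and peak memory is read via POSIX `resource`, which only approximates host memory, not GPU memory):

```python
import resource
import time

def run_regression_batches(train_one_batch, n_batches=100):
    """Run n_batches once training is stable; collect loss, speed, memory."""
    losses = []
    start = time.time()
    for _ in range(n_batches):
        losses.append(train_one_batch())
    elapsed = time.time() - start
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return {
        "first_losses": losses[:10],           # to validate model evaluation
        "sec_per_batch": elapsed / n_batches,  # speed indicator
        "peak_memory_kb": peak_kb,             # memory-cost indicator
    }
```

The speed and memory numbers are only comparable between runs on identical hardware, which is exactly the concern raised in point 1 below.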

There are some problems that need to be discussed.

  1. The GPU/CUDA version difference
    Should we consider different GPU/CUDA machines in the regression test? Given a model and a fixed dataset, some metrics stay the same when you change the training machine and some do not. For instance, the loss and accuracy regression curves are fixed, but the speed and memory cost will change if you use a different version of CUDA or a different GPU. Our numbers on Pascal-architecture GPUs make no sense on other GPU generations.

  2. The mini-batch size difference
    For online learning jobs, or to save training resources, users need a small batch size when training models; but for training speed, they may need a big batch size. The convergence curves differ, and the training speeds cannot be compared directly. Will we consider different batch sizes in the regression test?

  3. The training/inference difference
    Currently, most users care more about inference performance, because online services need to guarantee it. Inference is totally different from the training phase; will we consider it in the regression test?

@Superjomn (Contributor, Author) replied:

Reference: https://github.com/Superjomn/Paddle/blob/254fee8d86bc7872f4b392f5d0dc46c2a5bcc0cf/contrib/modelci/README.md#make-factor-tracking-extensible

The factors to track can be extended. In my understanding, the initial implementation will just include some general factors such as train cost, validation cost, and the duration of each batch; more factors can be added by other people later.

The log format should look like this:

for `train_cost` and `valid_cost`, each line is
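The actual line format is not spelled out in the thread; purely as an illustration, an extensible factor registry with a hypothetical `name=value` line layout could look like this (every name here is an assumption, not the author's format):

```python
# Hypothetical factor registry: each factor is a named callable that
# extracts one value from a finished batch; new factors register themselves.
FACTORS = {}

def track_factor(name):
    def register(fn):
        FACTORS[name] = fn
        return fn
    return register

@track_factor("train_cost")
def train_cost(batch_result):
    return batch_result["loss"]

@track_factor("batch_duration")
def batch_duration(batch_result):
    return batch_result["end_time"] - batch_result["start_time"]

def log_line(batch_id, batch_result):
    # one line per batch: batch id followed by "name=value" pairs
    values = " ".join(
        f"{name}={fn(batch_result):.4f}" for name, fn in FACTORS.items())
    return f"{batch_id} {values}"
```

New factors would then be a one-decorator addition, which matches the "make factor tracking extensible" goal in the linked README.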
A Contributor commented:

Can we use VisualDL to make a baseline, then have every regression test just compare the results of the first several mini-batches?
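As a stand-in for the VisualDL idea (this sketch uses plain JSON, not VisualDL's API, since the exact mechanism is left open in the thread), recording and checking a baseline of first mini-batch losses might look like:

```python
import json

def save_baseline(path, first_losses):
    """Record the first mini-batch losses of a known-good run."""
    with open(path, "w") as f:
        json.dump({"first_batch_losses": first_losses}, f)

def check_against_baseline(path, first_losses, rel_tol=0.05):
    """True if each new loss is within rel_tol of the stored baseline."""
    with open(path) as f:
        base = json.load(f)["first_batch_losses"]
    return all(abs(a - b) <= rel_tol * max(abs(b), 1e-8)
               for a, b in zip(first_losses, base))
```

A known-good run would call `save_baseline` once; each regression run then calls `check_against_baseline` on its own first batches.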

Development

Successfully merging this pull request may close these issues.

Need a Model CI
3 participants