
Model CI #9002

Closed · wants to merge 1 commit into from

Conversation

@Superjomn (Contributor) commented Mar 12, 2018

fix #8903

@tonyyang-svail tonyyang-svail changed the title init Model CI Mar 12, 2018
@wangkuiyi (Collaborator) left a comment:

Thanks to @Superjomn for considering a model testing solution. Is this a work in progress?

@@ -0,0 +1,47 @@
# Model CI

A simple Continuous Integration for Models, tracking the overall effect and performance.
A Collaborator commented on this line:

What is the plan to run this CI? Are we going to bridge this CI with TeamCity, or set it up as a new configuration on Travis-CI?

@Superjomn (Contributor, Author) replied:

In the beginning, this is just a bunch of scripts, with no relation to TeamCity or other CI platforms.

It might be integrated with TeamCity later, but currently the plan is just a while-loop process that keeps testing the last merged code and tracking its performance and precision.
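Such a while-loop driver could be sketched as below (a minimal illustration, assuming a local clone and a placeholder `run_model_tests`; none of these names come from the actual scripts):

```python
import subprocess
import time

def latest_commit(repo_dir):
    """Return the HEAD commit hash of the local clone."""
    out = subprocess.run(["git", "-C", repo_dir, "rev-parse", "HEAD"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def run_model_tests(repo_dir):
    """Placeholder: run the model scripts and collect the tracked metrics."""
    # e.g. subprocess.run(["python", "train.py"], cwd=repo_dir, check=True)
    return {"train_cost": 0.0, "duration_per_batch": 0.0}

def ci_loop(repo_dir, poll_seconds=300):
    tested = None
    while True:                      # keep testing the last merged code
        subprocess.run(["git", "-C", repo_dir, "pull"], check=True)
        head = latest_commit(repo_dir)
        if head != tested:           # only re-test newly merged commits
            print(head[:8], run_model_tests(repo_dir))
            tested = head
        time.sleep(poll_seconds)
```

A real version would also persist the metrics per commit so regressions can be traced back to the merge that introduced them.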

@Superjomn (Contributor, Author) replied:

I am working on it; it seems like just a few days of work. It will be kept simple in the beginning, and more factors that need tracking can be added later.

But it needs some computation resources, such as a free GPU machine, to test several classical models in both CPU and GPU mode (single card).
@wangkuiyi

@panyx0718 panyx0718 self-requested a review March 13, 2018 02:33
@Superjomn (Contributor, Author) commented:

WIP, will reopen later.

@Superjomn Superjomn closed this Mar 13, 2018
@@ -0,0 +1,47 @@
# Model CI

A simple Continuous Integration for Models, tracking the overall effect and performance.

A Contributor commented on this line:

In my mind, the model integration job needs to collect indicators in three aspects: model evaluation (i.e. loss, accuracy), speed, and memory cost.

For our regression test, we only need to run several batches (say 100) once the training process is stable, then collect the speed and memory cost data. We can also compare the first several batch losses to validate that the model evaluation is correct.
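The collection step described here could be sketched as follows (a hedged sketch: `train_one_batch` is a stand-in for the real training step, and peak memory is read via POSIX `resource`, which only approximates host memory, not GPU memory):

```python
import resource
import time

def run_regression_batches(train_one_batch, n_batches=100):
    """Run n_batches once training is stable; collect loss, speed, memory."""
    losses = []
    start = time.time()
    for _ in range(n_batches):
        losses.append(train_one_batch())
    elapsed = time.time() - start
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return {
        "first_losses": losses[:10],           # to validate model evaluation
        "sec_per_batch": elapsed / n_batches,  # speed indicator
        "peak_memory_kb": peak_kb,             # memory-cost indicator
    }
```

The speed and memory numbers are only comparable between runs on identical hardware, which is exactly the concern raised in point 1 below.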

There are some problems that need to be discussed.

  1. The GPU/CUDA version difference
    Should we consider different GPU/CUDA machines in the regression test? Given a model and a fixed dataset, some metrics stay the same when you change the training machine and some do not. For instance, the loss and accuracy regression curves are fixed, but the speed and memory cost will change if you use a different version of CUDA or a different GPU. Our numbers on Pascal-architecture GPUs make no sense on other GPU generations.

  2. The mini-batch size difference
    For online learning jobs, or to save training resources, users need a small batch size when training models; but for training speed, they may need a big batch size. The convergence curves differ, and the training speeds cannot be compared directly. Will we consider different batch sizes in the regression test?

  3. The training/inference difference
    Currently, most users care more about inference performance, because online services need to guarantee it. Inference is totally different from the training phase; will we consider it in the regression test?

@Superjomn (Contributor, Author) replied:

Reference: https://github.com/Superjomn/Paddle/blob/254fee8d86bc7872f4b392f5d0dc46c2a5bcc0cf/contrib/modelci/README.md#make-factor-tracking-extensible

The factors to track can be extended. In my understanding, the initial implementation will just include some general factors such as train cost, validation cost, and the duration of each batch; more factors can be added by other people later.

The log format should look like this:

for `train_cost` and `valid_cost`, each line is
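The actual line format is not spelled out in the thread; purely as an illustration, an extensible factor registry with a hypothetical `name=value` line layout could look like this (every name here is an assumption, not the author's format):

```python
# Hypothetical factor registry: each factor is a named callable that
# extracts one value from a finished batch; new factors register themselves.
FACTORS = {}

def track_factor(name):
    def register(fn):
        FACTORS[name] = fn
        return fn
    return register

@track_factor("train_cost")
def train_cost(batch_result):
    return batch_result["loss"]

@track_factor("batch_duration")
def batch_duration(batch_result):
    return batch_result["end_time"] - batch_result["start_time"]

def log_line(batch_id, batch_result):
    # one line per batch: batch id followed by "name=value" pairs
    values = " ".join(
        f"{name}={fn(batch_result):.4f}" for name, fn in FACTORS.items())
    return f"{batch_id} {values}"
```

New factors would then be a one-decorator addition, which matches the "make factor tracking extensible" goal in the linked README.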
A Contributor commented:

Can we use VisualDL to make a baseline, then have every regression test just compare the results of the first several mini-batches?
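As a stand-in for the VisualDL idea (this sketch uses plain JSON, not VisualDL's API, since the exact mechanism is left open in the thread), recording and checking a baseline of first mini-batch losses might look like:

```python
import json

def save_baseline(path, first_losses):
    """Record the first mini-batch losses of a known-good run."""
    with open(path, "w") as f:
        json.dump({"first_batch_losses": first_losses}, f)

def check_against_baseline(path, first_losses, rel_tol=0.05):
    """True if each new loss is within rel_tol of the stored baseline."""
    with open(path) as f:
        base = json.load(f)["first_batch_losses"]
    return all(abs(a - b) <= rel_tol * max(abs(b), 1e-8)
               for a, b in zip(first_losses, base))
```

A known-good run would call `save_baseline` once; each regression run then calls `check_against_baseline` on its own first batches.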

Development

Successfully merging this pull request may close these issues.

Need a Model CI
3 participants