dvc-cml container working with Gitlab and Github #12
.github/workflows/deploy.yaml
Outdated
- name: Publish to dockerhub
  uses: elgohr/Publish-Docker-Github-Action@master
  with:
    name: davidgortega/dvc-cml
should we use iterative? where does this name go?
Good catch! This is actually an error: I had it right in the original branch, but I had to redo it and overlooked it.
[your own runners](https://help.github.com/en/actions/hosting-your-own-runners)
with special capabilities like GPUs.
tool for ML experimentation. This repo offers the possibility of using DVC to
establish your ML pipeline to be run by Github Actions runners or Gitlab
Github Action runners
The GitHub product is named GitHub Actions; maybe it should be double-quoted?
It was about the typo: Actions -> Action. In English you don't put two plurals one after another.
README.md
Outdated
or [your own Gitlab runners](https://docs.gitlab.com/runner/) with special
capabilities like GPUs...

Major beneficts of using DVC-CML in your ML projects includes:
not clear so far what CML stands for to be honest
Continuous Machine Learning. I didn't choose the name, but I really like it.
README.md
Outdated
- Reproducibility: DVC is always in charge of maintain your experiment tracking
all the dependencies, so you don't have to. Additionally your experiment is
always running under the same software constrains so you dont have to worry
about replicating the same enviroment again.
always running under the same constrains so you dont have to worry about
don't
do you run some editor with spell checking, by chance? )
I hope this is not going to be the real README! As far as I understand, it is going to be redone by someone else.
README.md
Outdated
- Releases: DVC-action tags every experiment that runs with repro. Aside of that
DVC-action is just a job inside your workflow that could generate your model
releases or deployment according to your bussiness requirements.
experiments run through the DVC Report offeered as checks in Github or
same here ... please use some tools to fix simple language typos, stylistic mistakes, etc
README.md
Outdated
experiments run through the DVC Report offeered as checks in Github or
Releases in Gitlab.
- Releases: DVC-action tags every experiment that runs with repro generating the
report. Aside of that DVC-cml is just a step in your
DVC-cml or DVC-CML - use one style
README.md
Outdated
or [your own Gitlab runners](https://docs.gitlab.com/runner/) with special
capabilities like GPUs...

Major beneficts of using DVC-CML in your ML projects includes:
We mention benefits, but none of them are about the simple things that are essential to CI (running tests/training independently to make sure that the build is "green"), and we don't mention another big one: running the infra for you to train something.
Not sure about the last benefit, actually. I would expect ML users to be running their own GPU runners, either locally or in a cloud like AWS, Azure, or any other GPU vendor.
IMHO the benefits of CI for ML are Releases and Reproducibility, and having everything containerised helps with that a lot, since the environments are going to be consistent.
I mean, it still manages the workflow for you. As a data scientist I don't care if there is an AWS machine or something: I just push and wait. I don't provision, dockerize, SSH, copy data, etc. It's one of the major benefits of this whole thing, unless I'm missing something. cc @dmpetrov
OK, I see. Yep, that's a benefit. It is actually shared by teaming and reproducibility.
It's very easy to get a model and results just by branching and pushing new changes, without having to set up the environment.
And it's reproducible, since everyone is working under the same software/hardware constraints...
README.md
Outdated
Example of a simple DVC-cml workflow in Gitlab:

> :eyes: Some needed variables like remote credentials and GITLAB_TOKEN are
needed -> required
variables -> environment variables
are setted -> are set (or even come up with a better term)
as CI/CD ... -> as Gitlab Runners ... in Gitlab settings ... or what is the right term here?
CI/CD environment variables are the way to go in GitLab.
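To make the "required environment variables" point concrete, here is a minimal sketch of how a job could fail fast when one of them is not set in the CI/CD settings. `requireEnv` is a hypothetical helper written for illustration, not something this PR contains; only `GITLAB_TOKEN` is taken from the README, and the token value below is a placeholder.

```javascript
// Hypothetical helper (an assumption, not code from this PR): return the
// requested variables, or throw an error listing the ones that are missing.
function requireEnv(names, env = process.env) {
  // A variable counts as missing when it is undefined or an empty string.
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(', ')}`
    );
  }
  return Object.fromEntries(names.map((name) => [name, env[name]]));
}

// Usage with an explicit object standing in for process.env.
const vars = requireEnv(['GITLAB_TOKEN'], { GITLAB_TOKEN: 'example-token' });
console.log(Object.keys(vars));
```

Passing the environment as a parameter keeps the check easy to exercise outside a real runner.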
README.md
Outdated
dvc:
  stage: dvc_action_run
  image: davidgortega/dvc-cml:dev
image name?
dvcorg/dvc-cml:latest
README.md
Outdated
</details>

This workflow will run everytime that you push code or do a Pull/Merge Request.
When triggered DVC-cml will setup the runner and DVC will run the pipelines
specified by repro_targets. Two scenarios may happen:
repro_targets -> `repro_targets`
README.md
Outdated
| metrics_diff_targets | string | no | | Comma delimited array of metrics files. If not specified will use all the metric files |
| rev | string | no | origin/master | Revision to be compared with current experiment. I.E. HEAD~1. |

> :warning: In Gitlab is needed that you generate the GITLAB_TOKEN that is
it is required
README.md
Outdated
> :warning: In Gitlab is needed that you generate the GITLAB_TOKEN that is
> analogous to GITHUB_TOKEN. See
> [Tensorflow Mnist in Gitlab](#tensorflow-mnist-in-gitlab) example For a
For -> for
- actions/setup-python

Example of a simple DVC-action workflow:
> :eyes: Note the use of the container
explain why I need to note this
So that people don't forget it in the job definition. It might be a pitfall: people adding DVC-CML inside an existing job and not adding the container section.
so, put that explanation in the note itself?
action.yml
Outdated
@@ -1,29 +1,6 @@
name: 'DVC-action'
is name, description different now?
Good catch! It's not actually doing anything, and it's only useful if we publish the repo as a GitHub Action.
docker/Dockerfile
Outdated
@@ -0,0 +1,27 @@
FROM ubuntu:18.04

LABEL Iterative Inc
minor: Iterative, Inc
src/gitlab.js
Outdated
CI_PROJECT_URL,
CI_COMMIT_REF_NAME,
CI_COMMIT_SHA,
// CI_COMMIT_BEFORE_SHA,
if we don't need it - remove it
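For illustration, a minimal sketch of what the cleaned-up destructuring could look like once the commented-out variable is dropped. The function name `readGitlabContext` and the returned field names are assumptions made for this example; only the three `CI_*` variable names come from the snippet above.

```javascript
// Hypothetical sketch (not the PR's actual code): destructure only the
// GitLab CI variables that are used, with no commented-out leftovers.
function readGitlabContext(env = process.env) {
  const { CI_PROJECT_URL, CI_COMMIT_REF_NAME, CI_COMMIT_SHA } = env;
  return {
    projectUrl: CI_PROJECT_URL,
    ref: CI_COMMIT_REF_NAME,
    sha: CI_COMMIT_SHA,
  };
}

// Usage with a stand-in environment; the values are placeholders.
const context = readGitlabContext({
  CI_PROJECT_URL: 'https://gitlab.example.com/group/project',
  CI_COMMIT_REF_NAME: 'my-branch',
  CI_COMMIT_SHA: 'abc123',
});
console.log(context.ref);
```

If the "before" SHA is needed later, it can be added back to the destructuring at that point rather than kept as a comment.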
👍 some comments are inline
.github/workflows/deploy.yaml
Outdated
- name: Publish to dockerhub
  uses: elgohr/Publish-Docker-Github-Action@master
  with:
    name: davidgortega/dvc-cml
should we use iterative/dvc-cml?
It's dvcorg/dvc-cml on Docker Hub 🙂
or [your own Gitlab runners](https://docs.gitlab.com/runner/) with special
capabilities like GPUs...

Major benefits of using DVC-CML in your ML projects includes:
still don't understand this list of benefits ... can you summarize them w/o this official language - like A,B,C - the way you understand them?
OK