-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Example with collecting timestamp of the metrics #970
Example with collecting timestamp of the metrics #970
Conversation
…mp-metric-example
…mp-metric-example
spec: | ||
containers: | ||
- name: {{.Trial}} | ||
image: docker.io/andreyvelichkevich/timestamp-metric |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had added you into kubeflowkatib, you can now push kubeflowkatib/timestamp-metric now
BTW, I think you can rename the image name to kubeflowkatib/mxnet-mnist-example
to make all example has timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to rename folder and file name also to metrics-with-timestamp
to mxnet-mnist-example
. There is no need to highlight the termwith-timestamp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a mxnet-mnist-example
.
I took code of pytorch-mnist from here: https://github.com/kubeflow/katib/blob/master/examples/v1alpha3/file-metrics-collector/mnist.py.
Should I change example to Mxnet in that case? Mxnet example is that one: https://github.com/apache/incubator-mxnet/blob/master/example/image-classification/train_mnist.py.
Or we can name this example kubeflowkatib/pytorch-mnist-example
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed it. IIRC @richardsliu created this image to show mxnet support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnugeorge So what can we do with this example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richardsliu Can you point out at the Dockerfile used?
@@ -0,0 +1,6 @@ | |||
FROM pytorch/pytorch:1.0-cuda10.0-cudnn7-runtime | |||
|
|||
WORKDIR /var |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe /var is appropriate location for the script. If you check FSH (https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) /var
should be used for changing files.
I would suggest using /opt
and ideally a subdir in there - e.g. /opt/mnist/
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. What do you think about /app
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason you do not want to go with existing directory structure in /
? I don't think it is a good practice in linux systems to put code in "random" dirs in /
, but feel free to convince me otherwise:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean in /examples/v1alpha3/metrics-with-timestamp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean in
/examples/v1alpha3/metrics-with-timestamp
?
Is this answer to @johnugeorge'S question?:-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason you do not want to go with existing directory structure in
/
? I don't think it is a good practice in linux systems to put code in "random" dirs in/
, but feel free to convince me otherwise:)
To this question :)
Which WORKDIR
you think will be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry:) as I suggested in my first comment - I would go for /opt/mnist/mnist.py
or to make it extremely obvious what it is (although that is already captured in the image name, but you never know..) /opt/metrics-with-timestamp/mnist.py
or something along those lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnugeorge @hougangliu What do you think about WORKDIR
for the example as /opt/metrics-with-timestamp/mnist.py
?
I changed example from pytorch to mxnet. I named it /cc @johnugeorge @vpavlin |
@andreyvelich: GitHub didn't allow me to request PR reviews from the following users: vpavlin. Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I changed I renamed images in according yaml files, please take a look. |
…mp-metric-example
I changed CI cluster |
@@ -33,6 +33,9 @@ echo "Creating GPU cluster" | |||
gcloud --project ${PROJECT} beta container clusters create ${CLUSTER_NAME} \ | |||
--zone ${ZONE} \ | |||
--machine-type=n1-standard-8 \ | |||
--num-nodes=8 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be changed since overall resources are fixed
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnugeorge The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Fixes: #944.
Hold until I change docker hub repository.
/hold
This change is