[Pollux, Reproducibility, Inquiry] Are dataset-fetching mechanisms broken? #110

Open
stet-stet opened this issue Jan 17, 2022 · 3 comments

stet-stet commented Jan 17, 2022

Hi, I am trying to run the Pollux benchmark with a custom workload on a different cluster (one that is not on AWS), to evaluate how Pollux performs in a variety of situations. However, I cannot pull from your Docker registry at registry.petuum.com, which is needed to assemble the containers for each of the six models. (See this directory, for example.)

Below is part of the kubectl describe pods output for the dataset pod, after I successfully launched the three kinds of sched pods.

Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  2m2s                default-scheduler  Successfully assigned default/datasets-jxz86 to elsa-05
  Normal   Pulling    53s (x3 over 2m)    kubelet            Pulling image "registry.petuum.com/dev/esper-datasets:latest"
  Warning  Failed     38s (x3 over 104s)  kubelet            Failed to pull image "registry.petuum.com/dev/esper-datasets:latest": rpc error: code = Unknown desc = Error response from daemon: Get https://registry.petuum.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     38s (x3 over 104s)  kubelet            Error: ErrImagePull
  Normal   BackOff    9s (x4 over 104s)   kubelet            Back-off pulling image "registry.petuum.com/dev/esper-datasets:latest"
  Warning  Failed     9s (x4 over 104s)   kubelet            Error: ImagePullBackOff

I also tried pulling the image directly, and got the output below. I am starting to think that some undocumented procedure (e.g., registration) may be required to access registry.petuum.com.

> ping registry.petuum.com
PING ec2-54-245-165-47.us-west-2.compute.amazonaws.com (54.245.165.47) 56(84) bytes of data.

^C
> sudo docker pull registry.petuum.com/dev/esper-datasets:latest

Error response from daemon: Get https://registry.petuum.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
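
The output above shows that the hostname resolves, but the connection never completes. A minimal sketch for telling those cases apart from the node itself, assuming Python 3 is available there; `probe_registry` is a hypothetical helper written for illustration, not part of Pollux or AdaptDL, and port 443 is assumed from the https:// URL in the error message:

```python
import socket


def probe_registry(host, port=443, timeout=5.0):
    """Classify connectivity to host:port.

    Returns one of: 'dns_failure', 'timeout', 'refused',
    'unreachable', 'reachable'.
    """
    # Step 1: name resolution. A failure here means the problem is DNS,
    # not the registry server.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: TCP connect to the first resolved address with a timeout.
    family, socktype, proto, _, addr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as s:
            s.settimeout(timeout)
            s.connect(addr)
    except socket.timeout:
        # No answer at all: typical of a firewall dropping packets
        # or a host that is down.
        return "timeout"
    except ConnectionRefusedError:
        # Host answered, but nothing is listening on that port.
        return "refused"
    except OSError:
        # Any other network-level failure (e.g. no route to host).
        return "unreachable"
    return "reachable"
```

Calling `probe_registry("registry.petuum.com")` returns one of the five labels. A "timeout" result would be consistent with the Docker error above and would point to the server being unreachable from the public internet rather than to a missing login or registration step.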

I googled a bit, and tested some of the more common solutions:

Regrettably, the former did not work, and it turns out the latter is not an option given my circumstances.

How can I proceed if I want to pull images from your server, and/or download the datasets you used in the evaluations in the paper?

Thank you in advance!

@gudiandian

sudo docker pull registry.petuum.com/dev/esper-datasets:latest

Hi @stet-stet, I am encountering a similar problem. Have you solved it?

@stet-stet
Author

No, unfortunately...

@aurickq
Contributor

aurickq commented May 25, 2022

Hi, unfortunately we're not able to host the datasets for public access due to cost reasons and (for certain datasets like ImageNet) license reasons. However, all the datasets we used are public ones with citations provided in the Pollux paper. You should be able to obtain the datasets to reproduce the experiments.
