Getting started tutorial not working #192

Closed
Ademord opened this issue May 6, 2019 · 11 comments

Comments

@Ademord

Ademord commented May 6, 2019

Hello

I followed your Getting Started tutorial and, after a whole day of fixing dependencies and build problems, I have decided to give up because it's just not feasible to run this.

After building the image and running it, the dependencies in the first (test) notebook are all broken: modules are imported from a parent directory without the "modules.x" prefix, which makes the Jupyter notebook fail with error after error.
I created a notebook in the root of dense-correspondence and fixed all the dependencies in my local sandbox. Finally, the first (test) notebook you point to works.

Then I went into the training_notebook and ran into this issue: __init__() got an unexpected keyword argument 'fully_conv'. That had me clean up your dependency install file (pytorch outdated, pip outdated; reminder: there are only 7 months of support left for Python 2.7), but I have not been able to fix the error, which is why I am here asking for help. Do you have any idea about this problem?

Appendix

Example: the dependencies in your notebooks don't work out of the box (the modules.x prefix is missing):
image

Second error I haven't been able to fix:
image

@manuelli
Collaborator

manuelli commented May 6, 2019

Hi,

It seems like maybe some of the environment variables didn't get properly sourced. Try running

use_pytorch_dense_correspondence ## this sets necessary environment variables

before launching the notebook and see if that helps. I think this is detailed in our tutorial but if not let us know and we can fix it.
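
As a rough way to confirm that the variables actually reached the notebook process, something like the sketch below could be run in the first cell. The variable names are hypothetical placeholders (the thread does not list which variables use_pytorch_dense_correspondence sets), so substitute the real ones.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# Hypothetical variable names, used only for illustration.
required = ['EXAMPLE_SOURCE_DIR', 'EXAMPLE_DATA_DIR']

# A fake environment keeps the example deterministic; pass nothing to
# check the real os.environ of the running notebook kernel instead.
fake_environ = {'EXAMPLE_SOURCE_DIR': '/home/user/code'}
print(missing_env_vars(required, fake_environ))  # ['EXAMPLE_DATA_DIR']
```

Keep in mind that a Jupyter server only inherits the environment of the shell that started it, so variables set in a different terminal, or after the server was launched, will not show up.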

@manuelli
Collaborator

manuelli commented May 6, 2019

Your second error is related to the fact that we use a custom fork of pytorch/vision which does support the fully_conv argument. You are probably loading the standard pytorch/vision since environment variables maybe haven't been set correctly due to the aforementioned issue.

FYI the custom fork we use is here https://github.com/warmspringwinds/vision/tree/5e0a760fc847d55a4c1699410a14003452fa4581
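
One way to tell the two variants apart without reading tracebacks is to inspect the signature of the callable that raises the error. The sketch below uses Python 3's inspect.signature (on Python 2.7, inspect.getargspec would be the rough equivalent); stock_model and forked_model are made-up stand-ins, not real torchvision functions.

```python
import inspect


def accepts_keyword(func, keyword):
    """Return True if func can be called with the given keyword argument."""
    parameters = inspect.signature(func).parameters
    named = parameters.get(keyword)
    keyword_kinds = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                     inspect.Parameter.KEYWORD_ONLY)
    if named is not None and named.kind in keyword_kinds:
        return True
    # A **kwargs catch-all accepts any keyword name.
    return any(p.kind == inspect.Parameter.VAR_KEYWORD
               for p in parameters.values())


# Hypothetical stand-ins for the stock and forked model constructors.
def stock_model(pretrained=False):
    return 'stock'


def forked_model(pretrained=False, fully_conv=False):
    return 'fork'


print(accepts_keyword(stock_model, 'fully_conv'))   # False
print(accepts_keyword(forked_model, 'fully_conv'))  # True
```

A factory function that simply forwards **kwargs to a class will report True here even when the class itself rejects the argument, so the check is most informative when applied to the class constructor named in the error message.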

@peteflorence
Collaborator

Hi @Ademord, yes, I think you are just missing the part of Step 5 that Lucas pointed out. That will probably fix all of your problems, but let us know if it doesn't.

@peteflorence
Collaborator

Also feel free to try to change all the versions of all the dependencies, but it should just work if you keep to the ones we provided. The reason we provide it in a docker format is so that all these dependencies can be well tracked, and the libraries are contained to this project only.

Also, we have a version of this working with pytorch 1.0 on a private branch. It was a bit of work to get there, but we will release it eventually.

@Ademord
Author

Ademord commented May 7, 2019

Hello,
I tried running the command and then also installing from source from the fork repo you mentioned and it still doesn't work.
Is there another way to test this module is properly installed?

The env vars are set but the fully_conv problem persists...

@manuelli
Collaborator

manuelli commented May 7, 2019

Hi,

There is no need to do any additional installation: if you follow the tutorial, everything you need is there. Can you confirm that (as detailed in the tutorial) you have run

git submodule update --init --recursive

before building the docker image? My guess is that you missed this command and so you don't have the aforementioned fork of torchvision in your source tree. In your Jupyter notebook, please check where your torchvision is. It should be in the source tree, as in the code snippet below.

>>> import dense_correspondence.network.dense_correspondence_network
/usr/local/lib/python2.7/dist-packages/requests/__init__.py:83: RequestsDependencyWarning: Old version of cryptography ([1, 2, 3]) may cause slowdown.
  warnings.warn(warning, RequestsDependencyWarning)
>>> import torchvision
>>> torchvision.__file__
'/home/manuelli/code/pytorch-segmentation-detection/vision/torchvision/__init__.pyc'

You shouldn't be importing the system-installed version, which is what would happen if you didn't run the git submodule update --init --recursive command:

>>> torchvision.__file__
'/usr/local/lib/python2.7/dist-packages/torchvision/__init__.pyc'

In general, it seems like you may have missed one or two steps in the tutorial/setup, so it may be worth going through it again from scratch and making sure you have done everything correctly.

@Ademord
Author

Ademord commented May 7, 2019

Hi,
I removed all my files 😂 and cloned again, followed the steps carefully. I am on the first notebook, and torchvision is on the proper path as you showed 👍🏻

Now, I get the following error: AttributeError: 'module' object has no attribute 'float32'.

It seems like this is because of a newer torchtext version that's not compatible with our older torch version... I tried pip install torchtext==0.2.3 and it's still not working 😕. Any ideas?

@Ademord
Author

Ademord commented May 7, 2019

Okay, I think I got it. GEEZ.

  1. I removed the (freakin') torchvision from your install_pytorch.sh and built
  2. I assume your use_pytorch_dense_correspondence setup doesn't work at all, because torchvision was not being found in the submodule that is imported into the folder from the repo.
  3. I looked at what the guys at pytorch-segmentation-detection did, and they have this code in their notebooks (and mention explicitly that you should add it):
import sys
sys.path.insert(0, '/home/ribr/code/pytorch-segmentation-detection/vision/')
  4. Now the whole first notebook runs. I am going to move on to the other ones.

@peteflorence
Collaborator

Glad you also figured it out.

This is addressed in our code by the following line:

sys.path.insert(0, os.path.join(dc_source_dir, 'pytorch-segmentation-detection', 'vision'))

This is why add_dense_correspondence_to_python_path() is at the start of every notebook.
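
To illustrate why the position in sys.path matters, here is a small self-contained sketch. It is not the project's code: demo_vision_pkg is a made-up module name, and the two temporary directories merely stand in for the system dist-packages and the submodule checkout.

```python
import importlib
import os
import sys
import tempfile


def write_module(directory, name, source):
    """Write a one-file module called `name` into `directory`."""
    with open(os.path.join(directory, name + '.py'), 'w') as handle:
        handle.write(source)


# Two directories that both provide a module with the same (made-up) name.
system_dir = tempfile.mkdtemp()
fork_dir = tempfile.mkdtemp()
write_module(system_dir, 'demo_vision_pkg', "ORIGIN = 'system'\n")
write_module(fork_dir, 'demo_vision_pkg', "ORIGIN = 'fork'\n")

sys.path.append(system_dir)   # stands in for the system-wide install
sys.path.insert(0, fork_dir)  # stands in for the submodule checkout

# The first matching entry in sys.path wins, so the fork copy is loaded.
demo = importlib.import_module('demo_vision_pkg')
print(demo.ORIGIN)  # prints: fork
```

The insert only helps if it happens before the first import of the module: once a module is cached in sys.modules, later changes to sys.path do not affect it until the kernel is restarted.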

@Ademord
Author

Ademord commented May 7, 2019

Thanks! Interestingly it didn't run on mine (new, clean download).

I think it would be good to have an assert at the beginning, in case the fork is not the one being imported (i.e. verify that torchvision.__file__ is at the expected location).
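
A minimal sketch of such a check, assuming nothing about the project's own helpers: module_is_under is a hypothetical function name, and the example uses the standard-library json module so that it runs anywhere.

```python
import importlib
import os


def module_is_under(module_name, expected_root):
    """Return True if the imported module's file lives under expected_root."""
    module = importlib.import_module(module_name)
    module_file = os.path.realpath(module.__file__)
    root = os.path.realpath(expected_root)
    # The trailing separator stops '/a/vision' from matching '/a/vision_old'.
    return module_file.startswith(os.path.join(root, ''))


# Example with a standard-library module, so the sketch runs without torchvision.
stdlib_root = os.path.dirname(os.__file__)
assert module_is_under('json', stdlib_root), 'json was imported from an unexpected location'
```

In a notebook, the same call with 'torchvision' and the path of the vision submodule inside the source tree would fail loudly when the system-installed copy is picked up instead.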

Problem 1:
Now, in the training tutorial notebook I get this error:
ValueError: scene_name = 2018-04-10-16-02-59 doesn't exist

Problem 2:
And the qualitative_evaluation_tutorial gives me this error:
IOError: [Errno 2] No such file or directory: '/home/blabla/code/data_volume/pdc/trained_models/tutorials/caterpillar_3/003500.pth'

The files in the path under caterpillar_3 are

blabla@blabla:~/blabla/pytorch-dense-correspondence/data_volume/pdc/trained_models/tutorials/caterpillar_3$ ls
000000.pth      000201.pth.opt           dataset.yaml                tensorboard
000000.pth.opt  000201_log_history.yaml  descriptor_statistics.yaml  training.yaml
000201.pth      analysis                 loss.yaml

@Ademord
Author

Ademord commented May 8, 2019

I have found that a fix for problem 1 is given in issue #169:

python config/download_pdc_data.py config/dense_correspondence/dataset/composite/caterpillar_only.yaml

But problem 2 still persists.

Update: I figured out that problem 2 comes from the evaluation notebook expecting the checkpoint 003500.pth, i.e. 3500 iterations seem to be hardcoded there, while the number of iterations in the training notebook is variable.
I trained for the 3500 iterations and it's all good.
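
For anyone who trained with a different iteration count, a small helper along the following lines could pick whichever checkpoint exists instead of relying on a fixed number. This is only a sketch and not part of the project; latest_checkpoint is a hypothetical name, and the file names are taken from the directory listing above.

```python
import re

# Matches names like '000201.pth' but not '000201.pth.opt'.
CHECKPOINT_PATTERN = re.compile(r'^(\d+)\.pth$')


def latest_checkpoint(filenames):
    """Return the '<iteration>.pth' name with the highest iteration, or None."""
    best_name, best_iteration = None, -1
    for name in filenames:
        match = CHECKPOINT_PATTERN.match(name)
        if match is None:
            continue  # skip optimizer state, YAML logs, directories, etc.
        iteration = int(match.group(1))
        if iteration > best_iteration:
            best_name, best_iteration = name, iteration
    return best_name


listing = ['000000.pth', '000000.pth.opt', '000201.pth', '000201.pth.opt',
           '000201_log_history.yaml', 'dataset.yaml', 'training.yaml']
print(latest_checkpoint(listing))  # prints: 000201.pth
```

With a real directory, os.listdir on the model folder would supply the list of file names.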

In these images, are we seeing the cross-object loss? Could you please help me understand this concept a bit more?
image

And also, how can I better understand the descriptor images?
image

My ultimate goal is to use the Dense Object Nets for robot manipulation as well. I am looking also at this. I'm going to re-read your paper in the meantime. Cheers!
