Difference between the settings for demo and those in your paper #13
Comments
Hi @whiteking64, thanks for your interest in LSeg!
Hope this helps. Best wishes for your research! |
Thank you for your reply.
I changed the Am I correct? Also, I ran an experiment with batch size 1 on a single GPU, and one epoch took about 17 hours. With 8 GPUs, I assume that drops to roughly 2.1 hours per epoch, but 240 epochs would still take days to complete. Was that the case in your experiments? |
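For reference, the estimate above can be checked with a quick calculation. This is only a rough sketch: it assumes perfectly linear scaling across GPUs and ignores communication overhead, data loading, and any change in batch size. The function name is hypothetical and not part of the LSeg code.

```python
def estimate_training_time(hours_per_epoch_single_gpu, num_gpus, num_epochs):
    """Estimate wall-clock training time under ideal linear GPU scaling.

    Assumption (not from the LSeg repo): epoch time divides evenly by the
    number of GPUs. Real multi-GPU speedups are usually somewhat lower.
    """
    hours_per_epoch = hours_per_epoch_single_gpu / num_gpus
    total_hours = hours_per_epoch * num_epochs
    total_days = total_hours / 24
    return hours_per_epoch, total_hours, total_days


# Figures quoted in the comment: 17 h per epoch on 1 GPU, 8 GPUs, 240 epochs.
per_epoch, total_hours, total_days = estimate_training_time(17, 8, 240)
print(f"{per_epoch:.3f} h/epoch, {total_hours:.0f} h total, {total_days:.2f} days")
```

Under these assumptions, the full 240-epoch run comes out at about three weeks, so "days" here means several weeks rather than a few days.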
Hi @whiteking64, in this case, when you are using a smaller backbone, you could increase the Hope this helps! |
Yes! |
Thank you again!! |
Hi,
I have a question about the difference in settings between the demo in your README and the experiments in your paper.
In the README, you published the pre-trained weights for the demo.
It says that, during training, the backbones for both image and text are ViT-L/16. Section 5.1 of your paper says
To reproduce your results in Section 5.1, do I need to train from scratch with a ViT-B/32 backbone for the images?
Also, are there any other differences, such as batch size? More specifically, how do I change the arguments in train.sh?
Finally, would it be possible to share with us (or me) the weights used for your results?
Thank you in advance.