Benchmarks #10
Is there any code or options showing how to train any of these models (topdown, etc.) with the self-critical algorithm? @ruotianluo
It's in another repository of mine.
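(For reference, in that repository — self-critical.pytorch, linked further down in this thread — self-critical training is typically resumed from a cross-entropy checkpoint via train.py, roughly along the lines of python train.py --caption_model topdown --start_from log_topdown --checkpoint_path log_topdown_rl --self_critical_after 25. The flag names and values shown here are illustrative and may not match the current version of train.py; check that repo's README for the exact command.)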
Did you fine-tune the CNN when training the model with cross-entropy loss?
No.
Wow, that's unbelievable. I can't achieve that high a score without fine-tuning when training my own captioning model under cross-entropy loss. Most papers I have read fine-tune the CNN when training with cross-entropy loss. Are there any tips for training with cross entropy?
Fine-tuning is actually worse. It's about how the features are extracted; check the Self-critical Sequence Training paper.
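(As an aside for readers, here is a minimal sketch of the kind of fixed-CNN feature extraction being discussed: a pooled "fc" feature and a 14x14 spatial "att" feature from a frozen ResNet-101. The image path, input size, and layer choices are illustrative assumptions, not the repo's preprocessing script verbatim.)

import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image

# Frozen ResNet-101 backbone: keep everything up to (but not including) avgpool/fc.
resnet = models.resnet101(pretrained=True)
resnet.eval()
backbone = torch.nn.Sequential(*list(resnet.children())[:-2])

preprocess = transforms.Compose([
    transforms.Resize((448, 448)),          # 448/32 = 14, giving a 14x14 spatial map
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open('example.jpg').convert('RGB')).unsqueeze(0)
with torch.no_grad():
    att_feat = backbone(img)                # (1, 2048, 14, 14) spatial "att" feature
    fc_feat = att_feat.mean(dim=(2, 3))     # (1, 2048) pooled "fc" feature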
I think they mean they did not do fine-tuning when training the model under the RL loss, but they did not mention whether they fine-tuned the CNN when training under cross-entropy loss.
I fine-tuned the CNN under cross-entropy loss as in neuraltalk2 (Lua version) and got a CIDEr of 0.91 on the validation set without beam search. Then I trained the self-critical model without fine-tuning, starting from the best pretrained model, and finally got a CIDEr result close to that of the self-critical paper.
They didn't fine-tune in either phase. And fine-tuning may not work as well with attention-based models.
I have not trained the attention-based model, but I will try. Thank you for your code. I will start learning PyTorch with it.
Dear @ruotianluo,
Can you try downloading the pretrained model and evaluating it on your test images? That would help me narrow down the problem.
Yes, I can download the pre-trained models and use them. The results from the pre-trained models were appropriate and nice; however, the results from my own trained models were the same for all of the images. It seems something was wrong with the parameters I used for training, and the trained model produced the same caption for every given image.
You should be able to reproduce my results by following my instructions; it's really weird.
Thank you very much for your help. The problem has been solved. In fact, I had trained your code on another synthetic data set, and that is where the error occurred. When I used your code on the MS-COCO data set, the training process had no problem.
@ahkarami Is the previous problem related to my code?
Dear @ruotianluo,
Hi @ruotianluo,
Actually, no. I didn't spend much time on that model.
Thanks for your reply.
It's good; I just couldn't get it to work well.
Could you clarify which features are used for the results above? resnet152? And does
@dmitriy-serdyuk It's using res101, and FC stands for the FC model in the Self-critical Sequence Training paper, which can be regarded as a variant of show-and-tell.
Thank you for your fantastic code. I am a beginner, and it helped me a lot.
The input gate is different.
OK, I got it. But why did you make this change? Is there any paper or research about this?
Self-critical Sequence Training for Image Captioning
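(For context, a minimal sketch of that modification as described in the SCST paper: the attention-derived context vector is fed only into the cell-input computation of the LSTM step, not into the other gates. The class, argument names, and sizes below are illustrative, not the repository's actual att2in implementation.)

import torch
import torch.nn as nn

class Att2inCellSketch(nn.Module):
    # One LSTM step where the attention context 'ctx' only enters the
    # cell-input term g, while the i/f/o gates see only the word and hidden state.
    def __init__(self, input_size, att_size, rnn_size):
        super(Att2inCellSketch, self).__init__()
        self.i2h = nn.Linear(input_size, 4 * rnn_size)
        self.h2h = nn.Linear(rnn_size, 4 * rnn_size)
        self.a2c = nn.Linear(att_size, rnn_size)   # attention -> cell input only

    def forward(self, xt, ctx, state):
        h_prev, c_prev = state
        i, f, o, g = (self.i2h(xt) + self.h2h(h_prev)).chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g + self.a2c(ctx))          # the only place attention enters
        c = f * c_prev + i * g
        h = o * torch.tanh(c)
        return h, (h, c)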
Thank you very much!
@YuanEZhou Which features did you use, the default resnet101 features or the bottom-up features?
The bottom-up features.
@YuanEZhou May I ask how you used these features? Did you modify the code to incorporate bounding-box information, or just use the default options?
@jamiechoi1995 I used the default options.
Adaptive Attention model: {'CIDEr': 1.0295328576254532, 'Bleu_4': 0.32367107232015596, 'Bleu_3': 0.4308636494026319, 'Bleu_2': 0.5710839754137301, 'Bleu_1': 0.7375622419883233, 'ROUGE_L': 0.5415854013591195, 'METEOR': 0.2603669044858015, 'SPICE': 0.193603187345227}
@YuanEZhou Can you please share the results.json file you got from the coco-caption code, which includes all the image IDs with their predictions for the validation images? I urgently need it. Your help is highly appreciated.
Hi @fawazsammani, I am sorry, but I have lost the file.
@2033329616 Maybe the mistake is in your images. Yesterday, I ran the att2in2 model on the COCO Karpathy-split validation images; you can run them through coco-caption and see the results, which are identical to the ones posted. (I've already pre-processed the file to include the image IDs for evaluation purposes, so you can just run the coco-caption code on it directly.)
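(For anyone else wanting to score such a results file themselves, a minimal sketch using the standard coco-caption / pycocoevalcap API; the annotation and results paths below are placeholders.)

from pycocotools.coco import COCO
from pycocoevalcap.eval import COCOEvalCap

ann_file = 'annotations/captions_val2014.json'   # ground-truth captions (placeholder path)
res_file = 'att2in2_results.json'                # [{"image_id": ..., "caption": "..."}, ...]

coco = COCO(ann_file)
coco_res = coco.loadRes(res_file)
coco_eval = COCOEvalCap(coco, coco_res)
coco_eval.params['image_id'] = coco_res.getImgIds()   # only score images present in the results
coco_eval.evaluate()

for metric, score in coco_eval.eval.items():
    print(metric, score)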
@2033329616 You need to download the pretrained ResNet model from the link in this project.
@fawazsammani @YuanEZhou, thanks for your reply. I downloaded "att2in2_results.zip" and ran the coco metrics code, and it gives a good result. I have also used the pretrained att2in2 model from this project and tested it on the Karpathy-split COCO test set, but I can't get the correct result; I notice the output sentences are the same no matter how I change the image or the fc and att features. I have no idea how to solve this problem.
Is there a pretrained model in which self-attention was used?
I met the same problem. Have you solved it yet?
Hi @2033329616 and @kakazl. I'm not sure exactly what the problem is in your case. Maybe you used different settings? This is the command I run:
Sorry, when I run python eval.py --model 'self_cirtical/att2in2/model-best.pth' --infos_path 'self_cirtical/att2in2/infos_a2i2-best.pkl' --image_folder 'data/coco/images/val2014/' --num_images 10, I get an error.
@sssilence Are you using Python 2 or 3? I just ran it again and it works. According to your error, your fc_feats is an integer. Are you sure you extracted the features correctly and didn't modify something in the code?
Yeah, I used Python 2. I didn't modify anything in the code, and I used resnet101 to extract the features. Then I modified some code in eval_utils.py: tmp = [torch.from_numpy(_).cuda() if _ is not None else _ for _ in tmp], and I can run python eval.py, but I can't run python train.py successfully.
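(To spell out what that one-line change does, since others hit the same spot: it converts the numpy arrays in tmp to CUDA tensors while leaving None entries alone, because torch.from_numpy(None) raises a TypeError. A small self-contained illustration, assuming a CUDA device and made-up feature shapes:)

import numpy as np
import torch

fc_feats = np.zeros((1, 2048), dtype=np.float32)
att_feats = np.zeros((1, 196, 2048), dtype=np.float32)
att_masks = None                     # can legitimately be None for fixed-size att features

tmp = [fc_feats, att_feats, att_masks]
# Convert arrays to CUDA tensors but skip None placeholders.
tmp = [torch.from_numpy(x).cuda() if x is not None else x for x in tmp]
fc_feats, att_feats, att_masks = tmp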
Dear @ruotianluo,
Hi everyone. Thanks and kudos for this great repository. I am just a newbie, and this repo has helped me a lot. I want to mimic the results of ShowAndTell and ShowAttendAndTell. I have provided the path to the model as I changed the name of
@Willowlululu I guess you are using Python 3? This repo only supports Python 2. Try the self-critical.pytorch repo.
Hi @ruotianluo, thank you for the great repo! I was wondering, is there a pretrained transformer model in the drive link?
There is; check out the self-critical.pytorch repo's model zoo.
@ruotianluo Thank you for the quick response! To check my understanding, fc_nsc, fc_rl, and att2in2 are from the self-critical paper, and updown is from the Anderson et al. paper. Apologies if I am missing anything here.
Hi, I also want to use Adaptive Attention. What was your training command at that time? Waiting for your answer.
Cross-entropy loss (CIDEr score on the validation set without beam search; 25 epochs):
fc 0.92
att2in 0.95
att2in2 0.99
topdown 1.01

(Self-critical training is in https://github.com/ruotianluo/self-critical.pytorch)

Self-critical training (self-critical after 25 epochs; suggestion: don't start self-critical too late):
att2in 1.12
topdown 1.12

Test split (beam size 5):
cross entropy:
topdown: 1.07
self-critical:
topdown: Bleu_1: 0.779 Bleu_2: 0.615 Bleu_3: 0.467 Bleu_4: 0.347 METEOR: 0.269 ROUGE_L: 0.561 CIDEr: 1.143
att2in2: Bleu_1: 0.777 Bleu_2: 0.613 Bleu_3: 0.465 Bleu_4: 0.347 METEOR: 0.267 ROUGE_L: 0.560 CIDEr: 1.156
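(For reference, a test-split evaluation with beam search like the ones above is typically run with something along the lines of python eval.py --dump_images 0 --num_images 5000 --model model-best.pth --infos_path infos-best.pkl --language_eval 1 --beam_size 5; the model and infos paths here are placeholders, and the exact flag set depends on the version of eval.py checked out.)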