
Clarification on the provided training log file #11

Open
gu6225ha-s opened this issue Feb 23, 2024 · 4 comments

@gu6225ha-s

Hi @MCC-WH. First, thanks for making your training code publicly available. I'm trying to reproduce your training results and have some questions about the provided log file. The README states that it is from a 200-epoch schedule, but it only seems to contain 100 epochs?

Also, I'm wondering whether the log is from the rSfM120k or the AugrSfM120k experiment. With a batch size of 256 there should be 91642 / 256 ≈ 358 batches per epoch for rSfM120k and 274926 / 256 ≈ 1074 for AugrSfM120k. However, the log file indicates that training was run on 765 batches per epoch. Was a different batch size used?

@MCC-WH
Owner

MCC-WH commented Feb 23, 2024

In fact, we use gradient accumulation during training, so the equivalent batch size is larger than the one inside the script. In our experimental experience, the performance is stable when the equivalent batch size is larger than 500. You can try adjusting the number of gradient accumulation steps to get different equivalent batch sizes.
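
For reference, a minimal sketch of the gradient-accumulation pattern in PyTorch (the toy model, loader, and update_every value below are assumptions for illustration, not the repository's actual training loop):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy setup just to illustrate the accumulation pattern.
model = nn.Linear(8, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loader = DataLoader(TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,))), batch_size=8)

update_every = 2  # accumulation steps; equivalent batch size = 8 * 2 = 16

optimizer.zero_grad()
for i, (x, y) in enumerate(loader):
    loss = criterion(model(x), y)
    (loss / update_every).backward()  # scale so accumulated gradients match one large batch
    if (i + 1) % update_every == 0:
        optimizer.step()
        optimizer.zero_grad()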

@gu6225ha-s
Author

gu6225ha-s commented Feb 23, 2024

Okay, so you're saying I could increase the --update_every parameter, which would result in a larger effective batch size? I'll try that if I don't get good results when running experiment_rSfm120k.sh as it is.
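
For what it's worth, my assumption (based on how gradient accumulation usually works, not on anything stated in the repository) is that the effective batch size is batch_size * update_every. With the script's batch size of 256 that would give:

update_every = 2  ->  effective batch size = 256 * 2 = 512   (> 500)
update_every = 4  ->  effective batch size = 256 * 4 = 1024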

@gu6225ha-s
Author

Unfortunately the performance was not as expected when training a model with the experiment_rSfm120k.sh script. Here is the output from test.sh:

>> Test Dataset: roxford5k *** fist-stage >>
>> gl18-tl-resnet101-gem-w: mAP Eeay: 84.42, Medium: 67.31, Hard: 44.26
>> gl18-tl-resnet101-gem-w: mP@k[1, 5, 10] Easy: [97.06 91.76 87.04], Medium: [95.71 90.29 84.57], Hard: [87.14 70.29 59.57]

>> Test Dataset: roxford5k *** rerank-top1024 >>
>> gl18-tl-resnet101-gem-w: mAP Eeay: 84.36, Medium: 67.12, Hard: 41.85
>> gl18-tl-resnet101-gem-w: mP@k[1, 5, 10] Easy: [91.18 89.93 86.63], Medium: [90.   85.05 80.6 ], Hard: [77.14 69.5  57.68]

>> Test Dataset: rparis6k *** fist-stage >>
>> gl18-tl-resnet101-gem-w: mAP Eeay: 92.83, Medium: 80.5, Hard: 61.36
>> gl18-tl-resnet101-gem-w: mP@k[1, 5, 10] Easy: [98.57 96.   95.29], Medium: [100.    98.    96.86], Hard: [97.14 93.43 90.43]

>> Test Dataset: rparis6k *** rerank-top1024 >>
>> gl18-tl-resnet101-gem-w: mAP Eeay: 94.1, Medium: 84.25, Hard: 67.98
>> gl18-tl-resnet101-gem-w: mP@k[1, 5, 10] Easy: [95.71 96.29 95.43], Medium: [95.71 98.   97.  ], Hard: [94.29 94.29 90.71]

I trained on a single GPU, though. Should I adjust the batch size or learning rate in any way to compensate?
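
In case it's relevant: if the usual linear scaling rule applies here (an assumption on my part, not something stated in the repository), the adjustment would be

lr_new = lr_base * (effective_batch_new / effective_batch_base)

i.e. halving the effective batch size would suggest halving the learning rate.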

@gu6225ha-s
Author

Sorry to bother you again, @MCC-WH, but I also have a question about Table 5 in your paper. What values of K and L did you use to compute the mAP for the Affinity Feature (second row)? Also, do you L2-normalize the affinity features before re-ranking?
