
Unexpectedly high (over 10%) search performance #18

Closed
nobuyuki-nc opened this issue Sep 10, 2021 · 8 comments

@nobuyuki-nc

I'm trying to reproduce the results from your paper.

With my resources, I can only use TR_BATCH_SZ=120.
Your paper reports a top-1 exact match rate of 55.9% and a near match rate of 62.3% at query length 1s in Table 3.
But I get a 67.95% exact match rate and a 73.00% near match rate (see below).
Is this a plausible result?

$ time -p python run.py evaluate test 100
cli: Configuration from ./config/default.yaml
Load 29,500 items from ./logs/emb/test/100/query.mm.
Load 29,500 items from ./logs/emb/test/100/db.mm.
Load 54,336,000 items from ./logs/emb/test/100/dummy_db.mm.
Creating index: ivfpq
Copy index to GPU.
Training index using 18.40 % of data...
Elapsed time: 31.04 seconds.
54336000 items from dummy DB
29500 items from reference DB
Added total 54365500 items to DB. 104.26 sec.
Created fake_recon_index, total 54365500 items. 0.11 sec.
test_id: icassp,  n_test: 2000
========= Top1 hit rate (%) of segment-level search =========
               ---------------- Query length ----------------
   segments      1        3       5       9       11      19     
   seconds      (1s)     (2s)    (3s)    (5s)    (6s)   (10s)    

  Top1 exact   67.95    88.90   94.30   97.35   98.10   99.15
  Top1 near    73.00    90.30   94.70   97.40   98.15   99.20
  Top3 exact   76.65    92.05   95.65   98.25   98.85   99.55
 Top10 exact   80.00    92.65   96.05   98.45   99.00   99.60
=============================================================
average search + evaluation time 18.77 ms/query
Saved test_ids and raw score to ./logs/emb/test/100/.
real 364.41
user 544.26
sys 115.40

I use dataset-full v1.1 from ieee-dataport.org.

I changed the configuration to TEST_DUMMY_DB=100k_full_icassp, then ran train & generate, followed by evaluate.
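For anyone unfamiliar with how the exact vs. near hit rates in the table are defined, here is a minimal numpy sketch of a brute-force version of the metric. This is a hypothetical illustration, not the repo's evaluation code (the log above shows the real pipeline uses a faiss IVF-PQ index rather than exhaustive search); the function name and "near" tolerance are assumptions:

```python
import numpy as np

def top1_hit_rates(query, db, gt_ids, near=1):
    """Brute-force top-1 search over L2-normalized fingerprint embeddings.

    query : (n_query, d) query segment embeddings
    db    : (n_db, d) reference + dummy segment embeddings
    gt_ids: (n_query,) ground-truth index of each query segment in `db`
    near  : tolerance in segments for counting a "near" match
    """
    # Normalize so that the inner product equals cosine similarity.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    d = db / np.linalg.norm(db, axis=1, keepdims=True)
    top1 = np.argmax(q @ d.T, axis=1)  # exhaustive nearest-neighbor search
    exact = np.mean(top1 == gt_ids)                  # same segment retrieved
    near_hit = np.mean(np.abs(top1 - gt_ids) <= near)  # within +/- `near` segments
    return 100.0 * exact, 100.0 * near_hit
```

By construction the near rate is always greater than or equal to the exact rate, which matches the tables in this thread.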

@nobuyuki-nc nobuyuki-nc changed the title Unexpectedly high () search performance Unexpectedly high (over 10%) search performance Sep 10, 2021
@nobuyuki-nc
Author

I'm aware of #4.

One problem (?) is that after the dataset correction, performance improved by almost 5 to 6 percentage points for the 1-second query.

But I already use v1.1, so there may be an additional gain of about 5% on top of that.

@mimbres mimbres self-assigned this Sep 11, 2021
@mimbres
Owner

mimbres commented Sep 11, 2021

@nobuyuki-nc Hi, thanks for sharing your experiment.
Yes, it is an unexpected result, but to some extent it can be explained.
Here is my best result from this repo, with bsz=640:

========= Top1 hit rate (%) of segment-level search =========
               ---------------- Query length ----------------
   segments      1        3       5       9       11      19     
   seconds      (1s)     (2s)    (3s)    (5s)    (6s)   (10s)    

  Top1 exact   70.30    88.50   93.65   97.00   97.75   98.95
  Top1 near    72.15    89.15   93.65   97.00   97.75   98.95
  Top3 exact   75.90    90.70   94.60   97.75   98.25   99.50
 Top10 exact   78.60    91.45   95.40   98.10   98.40   99.55
=============================================================
average search + evaluation time 19.83 ms/query
Saved test_ids and raw score to ./logs/emb/exp_mini640_tau005/100/.
  • In comparison with the paper, there was about an 8 pp improvement (62.2% → 70.3%).
  • Comparing your result (bsz=120) with my best one (bsz=640), the exact hit rate at 1s is still better with the larger batch size.
  • Interestingly, many parts of your result are better than bsz=640. This makes our assumption (larger bsz is better) a bit questionable. I will retry bsz=120 on my machine soon.
  • Currently, I reckon the performance gain mainly came from a different implementation of the mel-feature extractor.
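One well-known way mel-feature extractors diverge between libraries is the mel-scale convention itself: the HTK formula is purely logarithmic, while the Slaney convention (the default in some libraries) is linear below 1 kHz and logarithmic above. The sketch below illustrates only this one possible source of difference; it is not the extractor used in this repo:

```python
import numpy as np

def hz_to_mel_htk(f):
    # HTK convention: purely logarithmic mapping over the whole range.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def hz_to_mel_slaney(f):
    # Slaney convention: linear below 1 kHz, logarithmic above.
    f = np.asarray(f, dtype=float)
    linear = f / (200.0 / 3.0)                      # linear region (f < 1 kHz)
    logpart = 15.0 + np.log(f / 1000.0) / np.log(6.4) * 27.0
    return np.where(f >= 1000.0, logpart, linear)
```

Filterbanks built on these two scales place their triangular filters at different center frequencies, so two "mel spectrogram" implementations can produce noticeably different features from the same audio.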

@nobuyuki-nc
Author

Thank you for your reply.

I agree with your comments.
If you get any further results, I would like to know.

Thanks

@mimbres
Owner

mimbres commented Sep 16, 2021

@nobuyuki-nc Here is my result for bsz=120 with Adam. I see very similar results to yours.

========= Top1 hit rate (%) of segment-level search =========
               ---------------- Query length ----------------
   segments      1        3       5       9       11      19     
   seconds      (1s)     (2s)    (3s)    (5s)    (6s)   (10s)    

  Top1 exact   68.60    88.85   93.90   97.60   98.05   99.30
  Top1 near    72.75    89.65   94.20   97.60   98.05   99.30
  Top3 exact   75.30    91.50   95.70   98.45   98.80   99.70
 Top10 exact   79.55    92.05   96.30   98.50   98.90   99.70
=============================================================
average search + evaluation time 31.04 ms/query
Saved test_ids and raw score to ./logs/emb/120_adam/100/.

@mimbres mimbres reopened this Sep 16, 2021
@mimbres mimbres closed this as completed Sep 16, 2021
@nobuyuki-nc
Author

@nobuyuki-nc Here is my result of bsz=120 with Adam. I could see very similar result with yours.

Thank you, it's an interesting result!

@Novicei

Novicei commented May 7, 2022

@nobuyuki-nc Here is my result of bsz=120 with Adam. I could see very similar result with yours.

========= Top1 hit rate (%) of segment-level search =========
               ---------------- Query length ----------------
   segments      1        3       5       9       11      19     
   seconds      (1s)     (2s)    (3s)    (5s)    (6s)   (10s)    

  Top1 exact   68.60    88.85   93.90   97.60   98.05   99.30
  Top1 near    72.75    89.65   94.20   97.60   98.05   99.30
  Top3 exact   75.30    91.50   95.70   98.45   98.80   99.70
 Top10 exact   79.55    92.05   96.30   98.50   98.90   99.70
=============================================================
average search + evaluation time 31.04 ms/query
Saved test_ids and raw score to ./logs/emb/120_adam/100/.

How would you interpret the results of this experiment? The result of bsz=120 is better than that of bsz=640, which is very rare in comparative studies. Is it because the optimizer is different?

@Novicei

Novicei commented May 7, 2022

One of the improvements I made to your baseline was to make the results more consistent across batch sizes, but based on the results you posted, this is a bit confusing to me.

@mimbres
Owner

mimbres commented May 7, 2022

@Novicei

How would you interpret the results of this experiment? The result of bsz=120 is better than that of bsz=640, which is very rare in comparative studies. Is it because the optimizer is different?

Here, the main point is that the overall performance is higher than reported in the paper.

On the effect of batch size, the new results with Adam are 70.30% (1s exact, bsz=640) and 68.60% (1s exact, bsz=120). Again, bsz=640 is better than bsz=120, so there is no result that contradicts the argument made in the paper.

You may observe that bsz=120 sometimes performs better at other query lengths. But this can happen because our model is only trained to maximize 1s-segment similarity, not sequence similarity.
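To illustrate that last point: for queries longer than one segment, the ranking comes from aggregating independent per-segment similarities over candidate start offsets, so sequence-level behavior is an emergent property of the aggregation rather than something the training objective optimizes. A minimal sketch of such an aggregation (hypothetical, not the repo's search code, which runs on a faiss index):

```python
import numpy as np

def sequence_search(query_segs, db_segs):
    """Rank DB start offsets by summed per-segment cosine similarity.

    query_segs: (L, d) consecutive query segment embeddings
    db_segs   : (N, d) consecutive DB segment embeddings
    Returns the best start offset into db_segs.
    """
    q = query_segs / np.linalg.norm(query_segs, axis=1, keepdims=True)
    d = db_segs / np.linalg.norm(db_segs, axis=1, keepdims=True)
    sims = q @ d.T                    # (L, N) segment-wise cosine similarities
    L, N = sims.shape
    # Score candidate offset t by summing sims[i, t + i] over the L segments,
    # i.e. aligning the query sequence against the DB starting at t.
    scores = np.array([sims[np.arange(L), t + np.arange(L)].sum()
                       for t in range(N - L + 1)])
    return int(np.argmax(scores))
```

Because each segment was trained in isolation, a model that is slightly worse at 1s can still aggregate into equal or better sequence-level scores, which is consistent with the mixed per-length results in this thread.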
