
Unexpectedly high (over 10%) search performance #18

Closed
nobuyuki-nc opened this issue Sep 10, 2021 · 8 comments

@nobuyuki-nc

I'm trying to reproduce the results from your paper.

With my resources, I can only use TR_BATCH_SZ=120.
Your paper reports a top-1 exact match rate of 55.9% and a near match rate of 62.3% at query length 1s in Table 3.
But I get a 67.95% exact match rate and a 73.00% near match rate (see below).
Is this a plausible result?

$ time -p python run.py evaluate test 100
cli: Configuration from ./config/default.yaml
Load 29,500 items from ./logs/emb/test/100/query.mm.
Load 29,500 items from ./logs/emb/test/100/db.mm.
Load 54,336,000 items from ./logs/emb/test/100/dummy_db.mm.
Creating index: ivfpq
Copy index to GPU.
Training index using 18.40 % of data...
Elapsed time: 31.04 seconds.
54336000 items from dummy DB
29500 items from reference DB
Added total 54365500 items to DB. 104.26 sec.
Created fake_recon_index, total 54365500 items. 0.11 sec.
test_id: icassp,  n_test: 2000
========= Top1 hit rate (%) of segment-level search =========
               ---------------- Query length ----------------
   segments      1        3       5       9       11      19     
   seconds      (1s)     (2s)    (3s)    (5s)    (6s)   (10s)    

  Top1 exact   67.95    88.90   94.30   97.35   98.10   99.15
  Top1 near    73.00    90.30   94.70   97.40   98.15   99.20
  Top3 exact   76.65    92.05   95.65   98.25   98.85   99.55
 Top10 exact   80.00    92.65   96.05   98.45   99.00   99.60
=============================================================
average search + evaluation time 18.77 ms/query
Saved test_ids and raw score to ./logs/emb/test/100/.
real 364.41
user 544.26
sys 115.40

I use dataset-full v1.1 from ieee-dataport.org.

I changed the configuration to TEST_DUMMY_DB=100k_full_icassp, then ran train & generate, followed by evaluate.
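For anyone unfamiliar with how the exact vs. near hit rates in the table are defined, here is a minimal numpy sketch of a brute-force version of the metric. This is a hypothetical illustration, not the repo's evaluation code (the log above shows the real pipeline uses a faiss IVF-PQ index rather than exhaustive search); the function name and "near" tolerance are assumptions:

```python
import numpy as np

def top1_hit_rates(query, db, gt_ids, near=1):
    """Brute-force top-1 search over L2-normalized fingerprint embeddings.

    query : (n_query, d) query segment embeddings
    db    : (n_db, d) reference + dummy segment embeddings
    gt_ids: (n_query,) ground-truth index of each query segment in `db`
    near  : tolerance in segments for counting a "near" match
    """
    # Normalize so that the inner product equals cosine similarity.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    d = db / np.linalg.norm(db, axis=1, keepdims=True)
    top1 = np.argmax(q @ d.T, axis=1)  # exhaustive nearest-neighbor search
    exact = np.mean(top1 == gt_ids)                  # same segment retrieved
    near_hit = np.mean(np.abs(top1 - gt_ids) <= near)  # within +/- `near` segments
    return 100.0 * exact, 100.0 * near_hit
```

By construction the near rate is always greater than or equal to the exact rate, which matches the tables in this thread.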

@nobuyuki-nc nobuyuki-nc changed the title Unexpectedly high () search performance Unexpectedly high (over 10%) search performance Sep 10, 2021
@nobuyuki-nc
Author

I'm aware of #4.

One problem (?) is that after the dataset correction, performance improved by almost 5 to 6 percentage points for the 1-second query.

But I already use v1.1, so there may be an additional gain of about 5% on top of that.

@mimbres mimbres self-assigned this Sep 11, 2021
@mimbres
Owner

mimbres commented Sep 11, 2021

@nobuyuki-nc Hi, thanks for sharing your experiment.
Yes, it is an unexpected result, but to some extent it can be explained.
Here is my best result from this repo, with bsz=640:

========= Top1 hit rate (%) of segment-level search =========
               ---------------- Query length ----------------
   segments      1        3       5       9       11      19     
   seconds      (1s)     (2s)    (3s)    (5s)    (6s)   (10s)    

  Top1 exact   70.30    88.50   93.65   97.00   97.75   98.95
  Top1 near    72.15    89.15   93.65   97.00   97.75   98.95
  Top3 exact   75.90    90.70   94.60   97.75   98.25   99.50
 Top10 exact   78.60    91.45   95.40   98.10   98.40   99.55
=============================================================
average search + evaluation time 19.83 ms/query
Saved test_ids and raw score to ./logs/emb/exp_mini640_tau005/100/.
  • In comparison with the paper, there was about an 8 pp improvement (62.2% → 70.3%).
  • Comparing your result (bsz=120) with my best one (bsz=640), the exact hit rate at 1s is still better with the larger batch size.
  • Interestingly, many parts of your result are better than bsz=640. This makes our assumption (larger bsz is better) a bit questionable. I will retry bsz=120 on my machine soon.
  • Currently, I reckon the performance gain mainly came from a different implementation of the mel-feature extractor.
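One well-known way mel-feature extractors diverge between libraries is the mel-scale convention itself: the HTK formula is purely logarithmic, while the Slaney convention (the default in some libraries) is linear below 1 kHz and logarithmic above. The sketch below illustrates only this one possible source of difference; it is not the extractor used in this repo:

```python
import numpy as np

def hz_to_mel_htk(f):
    # HTK convention: purely logarithmic mapping over the whole range.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def hz_to_mel_slaney(f):
    # Slaney convention: linear below 1 kHz, logarithmic above.
    f = np.asarray(f, dtype=float)
    linear = f / (200.0 / 3.0)                      # linear region (f < 1 kHz)
    logpart = 15.0 + np.log(f / 1000.0) / np.log(6.4) * 27.0
    return np.where(f >= 1000.0, logpart, linear)
```

Filterbanks built on these two scales place their triangular filters at different center frequencies, so two "mel spectrogram" implementations can produce noticeably different features from the same audio.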

@nobuyuki-nc
Author

Thank you for your reply.

I agree with your comments.
If you get any further results, I would like to know.

Thanks

@mimbres
Owner

mimbres commented Sep 16, 2021

@nobuyuki-nc Here is my result for bsz=120 with Adam. I see very similar results to yours.

========= Top1 hit rate (%) of segment-level search =========
               ---------------- Query length ----------------
   segments      1        3       5       9       11      19     
   seconds      (1s)     (2s)    (3s)    (5s)    (6s)   (10s)    

  Top1 exact   68.60    88.85   93.90   97.60   98.05   99.30
  Top1 near    72.75    89.65   94.20   97.60   98.05   99.30
  Top3 exact   75.30    91.50   95.70   98.45   98.80   99.70
 Top10 exact   79.55    92.05   96.30   98.50   98.90   99.70
=============================================================
average search + evaluation time 31.04 ms/query
Saved test_ids and raw score to ./logs/emb/120_adam/100/.

@mimbres mimbres reopened this Sep 16, 2021
@mimbres mimbres closed this as completed Sep 16, 2021
@nobuyuki-nc
Author

@nobuyuki-nc Here is my result of bsz=120 with Adam. I could see very similar result with yours.

Thank you, it's an interesting result!

@Novicei

Novicei commented May 7, 2022

@nobuyuki-nc Here is my result of bsz=120 with Adam. I could see very similar result with yours.

========= Top1 hit rate (%) of segment-level search =========
               ---------------- Query length ----------------
   segments      1        3       5       9       11      19     
   seconds      (1s)     (2s)    (3s)    (5s)    (6s)   (10s)    

  Top1 exact   68.60    88.85   93.90   97.60   98.05   99.30
  Top1 near    72.75    89.65   94.20   97.60   98.05   99.30
  Top3 exact   75.30    91.50   95.70   98.45   98.80   99.70
 Top10 exact   79.55    92.05   96.30   98.50   98.90   99.70
=============================================================
average search + evaluation time 31.04 ms/query
Saved test_ids and raw score to ./logs/emb/120_adam/100/.

How would you interpret the results of this experiment? The result of bsz=120 is better than that of bsz=640, which is very rare in comparative studies. Is it because the optimizer is different?

@Novicei

Novicei commented May 7, 2022

One of the improvements I made to your baseline was to make the results more consistent across batch sizes, but based on the results you posted, this is a bit confusing to me.

@mimbres
Owner

mimbres commented May 7, 2022

@Novicei

How would you interpret the results of this experiment? The result of bsz=120 is better than that of bsz=640, which is very rare in comparative studies. Is it because the optimizer is different?

Here, the main point is that the overall performance is higher than reported in the paper.

On the effect of batch size, the new results with Adam are 70.30% (1s exact, bsz=640) and 68.60% (1s exact, bsz=120). Again, bsz=640 is better than bsz=120, so there is no result that contradicts the argument made in the paper.

You may observe that bsz=120 sometimes performs better at other query lengths. But this can happen because our model is only trained to maximize 1s-segment similarity, not sequence similarity.
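To illustrate that last point: for queries longer than one segment, the ranking comes from aggregating independent per-segment similarities over candidate start offsets, so sequence-level behavior is an emergent property of the aggregation rather than something the training objective optimizes. A minimal sketch of such an aggregation (hypothetical, not the repo's search code, which runs on a faiss index):

```python
import numpy as np

def sequence_search(query_segs, db_segs):
    """Rank DB start offsets by summed per-segment cosine similarity.

    query_segs: (L, d) consecutive query segment embeddings
    db_segs   : (N, d) consecutive DB segment embeddings
    Returns the best start offset into db_segs.
    """
    q = query_segs / np.linalg.norm(query_segs, axis=1, keepdims=True)
    d = db_segs / np.linalg.norm(db_segs, axis=1, keepdims=True)
    sims = q @ d.T                    # (L, N) segment-wise cosine similarities
    L, N = sims.shape
    # Score candidate offset t by summing sims[i, t + i] over the L segments,
    # i.e. aligning the query sequence against the DB starting at t.
    scores = np.array([sims[np.arange(L), t + np.arange(L)].sum()
                       for t in range(N - L + 1)])
    return int(np.argmax(scores))
```

Because each segment was trained in isolation, a model that is slightly worse at 1s can still aggregate into equal or better sequence-level scores, which is consistent with the mixed per-length results in this thread.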
