Questions Regarding Custom Data Testing in Audio Fragment Identification #41

Closed
devkya opened this issue Dec 26, 2023 · 1 comment
devkya commented Dec 26, 2023

Hello, I found this paper quite interesting and encountered some issues during the testing process with my custom data. I apologize if my questions seem naive, as I lack some knowledge in this area.

In my case, I don't need an algorithm that identifies a specific song among many using audio fragments. I have a single audio file (2-3 hours) and need to find where in this file a given audio fragment (3-5 seconds) begins (I understand from checking past issues that I need to customize the process to obtain the start timestamp).

  1. Is this code suitable for such a scenario?

  2. I trained using the provided mini dataset. Then I ran python run.py generate --source CUSTOM_SOURCE_ROOT_DIR --output FP_OUTPUT_DIR --skip_dummy to generate fingerprints for my custom data, which is a single audio file. Afterward, I wanted to evaluate a short audio fragment (a 3-5 second wav file) but wasn't sure how to proceed. Also, is this a meaningful process?

  3. Should my custom audio data also be included in the training?

Thank you.

mimbres commented Jan 17, 2024

@devkya Sorry for the late reply😅

  1. Yes, if:

    • your 3-hour-long audio has moderately unique segments,
    • or you want to search for all the segments similar to the query.
  2. As described in Fingerprint Generation, you should prepare a source directory that contains subfolders named test_query and test_db. You may put the 3-hour audio in test_db, and some sliced queries in test_query. In fact, you won't need to slice them yourself; fingerprints are sliced into segments anyway. Then follow the description under Search & Evaluation.

  • EDIT: If you don't assume any specific noise in the queries, don't slice them; just reuse the 3-hour audio as the query. This way, the frame indices of the queries and the db will match exactly.
  3. No, unless you expect a very different type of data from music. If you expect speech-only data, for example, I recommend re-training on some speech-only data. And if you expect a specific type of acoustic noise added to your queries, you should use a similar type of noise for augmentation during training. The point is that neural-FP mainly learns which kinds of sound sources to discard in similarity search.
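Since the original question is about recovering where a fragment begins, here is a minimal sketch of turning a matched db segment index into a start timestamp. The 0.5 s hop length and the function name are assumptions for illustration; substitute the segment hop from your own fingerprint configuration.

```python
# Hypothetical helper: map the index of the best-matching database
# segment back to a start time in the long recording. The 0.5 s hop
# is an assumption; use the hop length from your fingerprint config.
def segment_index_to_timestamp(seg_index: int, hop_sec: float = 0.5) -> float:
    """Return the start time (in seconds) of a fingerprint segment."""
    return seg_index * hop_sec

# Example: if the top-1 match for the query is db segment 7200,
# the fragment starts about 7200 * 0.5 = 3600 s into the file.
print(segment_index_to_timestamp(7200))  # 3600.0
```

Note that if you reuse the full 3-hour audio as the query (per the EDIT above), the matched query and db segment indices should line up one-to-one, which makes this conversion a useful sanity check.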

Hope this helps, good luck!

@mimbres mimbres added the question Further information is requested label Jan 17, 2024
@mimbres mimbres self-assigned this Jan 17, 2024
@mimbres mimbres closed this as completed Jan 24, 2024