
How can I make a prediction without using a manifest file? #2248

Answered by nithinraok
yogso asked this question in Q&A


Hi @yogso,
I see what you would like to do. The answer is fairly simple, and it should be applicable to any of the ASR collections in NeMo. In general, we generate the PyTorch dataset from the input manifest. If instead you would like to run inference on audio directly, the input audio has to be read and passed through a collate function, which depends on the collection (ASR / speech commands / speaker recognition / VAD).
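For a single audio file you can skip the manifest entirely by reading the audio yourself and calling the model's `forward` with the raw signal and its length (no batching collate is needed for one clip). Below is a minimal sketch for a CTC ASR model; the pretrained checkpoint name and audio path are placeholders, and the `forward` return values should be checked against the NeMo version you are using.

```python
import soundfile as sf
import torch
import nemo.collections.asr as nemo_asr

# Example pretrained checkpoint; replace with the model you actually use.
asr_model = nemo_asr.models.EncDecCTCModel.from_pretrained("QuartzNet15x5Base-En")
asr_model.eval()

# Read the audio directly instead of listing it in a manifest.
audio, sample_rate = sf.read("utterance.wav")  # placeholder path
signal = torch.tensor(audio, dtype=torch.float32).unsqueeze(0)     # shape [1, T]
signal_length = torch.tensor([signal.shape[1]], dtype=torch.long)  # shape [1]

with torch.no_grad():
    # For EncDecCTCModel, forward returns log-probs, encoded lengths and greedy predictions.
    log_probs, encoded_len, greedy_preds = asr_model.forward(
        input_signal=signal, input_signal_length=signal_length
    )
```

For batched inference you would mimic the collection's collate function (padding or fixing the signal lengths) before stacking the tensors.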

Now coming to the speaker verification collection,
the collate function used is _fixed_seq_collate_fn. In _fixed_seq_collate_fn we limit the input audio signal to a maximum time_length (which can be found in the config), along with other basic processing, but if only a si…
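A similar sketch for the speaker verification collection, roughly mirroring the fixed-length handling of _fixed_seq_collate_fn by capping the signal at time_length seconds before calling `forward`. The pretrained name, audio path, and time_length value are assumptions; check them against your config and NeMo version.

```python
import soundfile as sf
import torch
from nemo.collections.asr.models import EncDecSpeakerLabelModel

# Example pretrained speaker verification checkpoint; replace as needed.
spk_model = EncDecSpeakerLabelModel.from_pretrained("speakerverification_speakernet")
spk_model.eval()

time_length = 8  # seconds; should match the time_length value in your config
audio, sample_rate = sf.read("speaker.wav")  # placeholder path

# Roughly mirror _fixed_seq_collate_fn: cap the signal at time_length seconds.
max_samples = int(time_length * sample_rate)
audio = audio[:max_samples]

signal = torch.tensor(audio, dtype=torch.float32).unsqueeze(0)
signal_length = torch.tensor([signal.shape[1]], dtype=torch.long)

with torch.no_grad():
    # For EncDecSpeakerLabelModel, forward returns class logits and the speaker embedding.
    logits, embedding = spk_model.forward(
        input_signal=signal, input_signal_length=signal_length
    )
```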
