How to change pytorch DataLoader/Dataset for nemo? #7769

Closed Answered by EndOfTheGlory
EndOfTheGlory asked this question in Q&A
Well, for anyone looking at this in the future, I have two suggestions for solving the problem.

The first is shown in a NeMo tutorial (https://github.com/NVIDIA/NeMo/blob/main/tutorials/01_NeMo_Models.ipynb), which creates a custom Dataset and model class.

The second is to create a custom model class by redefining its parent classes, as a colleague and I did. That means finding the function that processes the data you care about and redefining it to process that data your way. In my case it was the class ASRAudioText: I changed how it processes text via my_text = re.sub('[a-zA-Zа-яА-Я]', '-', item['text']) for different tokens.

class ASRAudioText(collections.AudioText):
    """`AudioText`…
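
For readers who want to see the override pattern in isolation, here is a minimal sketch that does not use NeMo at all. BaseTextCollection, MaskedTextCollection and process_text are hypothetical names made up for illustration; they are not NeMo classes or methods, and the real ASRAudioText may be structured differently. Only the regular expression and the item['text'] access come from the answer above.

```python
import re

# Pattern copied from the answer: Latin and Cyrillic letters.
LETTERS = re.compile('[a-zA-Zа-яА-Я]')


class BaseTextCollection:
    """Hypothetical stand-in for a library parent class (not NeMo code)."""

    def __init__(self, items):
        # Each item is a dict with a 'text' field, as in item['text'] above.
        self.texts = [self.process_text(item) for item in items]

    def process_text(self, item):
        # Default behaviour: keep the transcript unchanged.
        return item['text']


class MaskedTextCollection(BaseTextCollection):
    """Subclass that overrides only the text-processing step."""

    def process_text(self, item):
        # Replace every Latin or Cyrillic letter with '-'.
        return LETTERS.sub('-', item['text'])


items = [{'text': 'hello, мир 42'}]
masked = MaskedTextCollection(items).texts
print(masked)  # ['-----, --- 42']
```

Only one method is overridden, so everything else in the parent class keeps working as before. Note that the range а-яА-Я does not cover ё or Ё, so those letters would pass through unchanged.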
