Chapter-10 Infinite iterator #126

Open
1 of 11 tasks
Roland-Szucs opened this issue Nov 2, 2023 · 0 comments

Information

The problem arises in chapter:

  • Introduction
  • Text Classification
  • Transformer Anatomy
  • Multilingual Named Entity Recognition
  • Text Generation
  • Summarization
  • Question Answering
  • Making Transformers Efficient in Production
  • Dealing with Few to No Labels
  • Training Transformers from Scratch (checked)
  • Future Directions

Describe the bug

The class ConstantLengthDataset(IterableDataset) iterates forever. The cause is the exception handler that runs when the underlying dataset is exhausted and the StopIteration exception is caught. The code in the notebook restarts the iterator at that point:

try:
    m=f"Fill buffer: {buffer_len}<{self.input_characters:.0f}"
    print(m)
    buffer.append(next(iterator)["content"])
    buffer_len += len(buffer[-1])
except StopIteration:
    iterator = iter(self.dataset)

To Reproduce

No separate reproduction is needed: HF has already written this code correctly and just forgot to update this notebook. In this youtube video, the presented code is good. When the StopIteration is caught, the following good code is shown:

try:
    m=f"Fill buffer: {buffer_len}<{self.input_characters:.0f}"
    print(m)
    buffer.append(next(iterator)["content"])
    buffer_len += len(buffer[-1])
except StopIteration:
    more_examples = False
    break

Expected behavior

Do not restart the iterator once the underlying dataset is exhausted; otherwise the iteration never ends.
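
The difference between the two handlers can be seen in a small, self-contained sketch. This is not the notebook's class: `fill_chunks`, the `restart_on_exhaustion` flag, and the three-item toy dataset are hypothetical names made up for illustration, and tokenization and tensor creation are left out so that only the buffer-filling loop remains.

```python
from itertools import islice


def fill_chunks(dataset, input_characters, restart_on_exhaustion=False):
    """Yield lists of strings whose combined length reaches input_characters.

    Toy stand-in for the buffer-filling loop discussed in this issue;
    tokenization and tensor creation are omitted.
    """
    iterator = iter(dataset)
    more_examples = True
    while more_examples:
        buffer, buffer_len = [], 0
        while True:
            if buffer_len >= input_characters:
                break
            try:
                buffer.append(next(iterator)["content"])
                buffer_len += len(buffer[-1])
            except StopIteration:
                if restart_on_exhaustion:
                    # Handler reported in this issue: start over from the
                    # beginning, so more_examples never becomes False.
                    iterator = iter(dataset)
                else:
                    # Handler requested in this issue: flush what is
                    # buffered, then let the outer loop end.
                    more_examples = False
                    break
        if buffer:
            yield buffer


dataset = [{"content": "abcd"}, {"content": "efgh"}, {"content": "ij"}]

# The non-restarting variant terminates after one pass over the data.
fixed = list(fill_chunks(dataset, input_characters=6))

# The restarting variant has to be cut off by hand, here after 5 chunks.
looping = list(islice(fill_chunks(dataset, 6, restart_on_exhaustion=True), 5))
```

With the restarting handler, `list(...)` without `islice` would never return, which matches the behaviour described above.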
