
fix test_generated_length_assisted_generation #34935

Open · wants to merge 1 commit into main

Conversation

@keyboardAnt (Contributor)

What does this PR do?

This PR fixes and expands the test_generated_length_assisted_generation test in test_utils.py.


@Rocketknight1 (Member)

@zucchini-nlp can you take a look at this?

@zucchini-nlp (Member) left a comment

Hey! Can you explain why we need a test for max_new_tokens=7, and what the reason is for removing the check on length <= 20?

Is the test failing for you on any of the PRs?

@keyboardAnt (Contributor, Author) commented Nov 28, 2024

@zucchini-nlp
Hey Raushan!

I noticed that this test fails locally for me. Are you able to run it successfully on your end?

While reviewing the test, I saw that it calls generate twice and validates the results. In the second call, max_new_tokens isn’t explicitly provided, which leads to the test failing because the number of generated tokens exceeds 20. Since max_new_tokens isn’t set in this case, I believe the assertion should be adjusted to reflect the behavior when no explicit limit is applied. After making this change, the test passes locally for me.

Additionally, I added a third call to generate to verify that using max_new_tokens without specifying min_new_tokens works as expected. For this, I used an arbitrary value of 7 tokens for max_new_tokens. I believe this is a legitimate choice, given that the test already uses the arbitrary value of 20 tokens in the first two calls without explanation. The aim is to ensure that the behavior remains consistent across different configurations and arbitrary limits.
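For concreteness, here is a rough sketch of the three calls described above. This is not the actual test code; the checkpoint, prompt, and exact values are illustrative assumptions:

```python
# Illustrative sketch only; the real test is test_generated_length_assisted_generation
# in test_utils.py. The tiny checkpoint below is an assumed stand-in.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "hf-internal-testing/tiny-random-gpt2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)
assistant = AutoModelForCausalLM.from_pretrained(checkpoint)

input_ids = tokenizer("Hello world", return_tensors="pt").input_ids

# Call 1: both bounds set explicitly -> exactly 20 new tokens are generated.
out = model.generate(
    input_ids, assistant_model=assistant, min_new_tokens=20, max_new_tokens=20
)
assert out.shape[-1] == input_ids.shape[-1] + 20

# Call 2: no explicit max_new_tokens -> the library default applies, so a
# hard "length <= 20" assertion fails once the default caps new tokens
# rather than total length.
out = model.generate(input_ids, assistant_model=assistant)

# Call 3 (added in this PR): max_new_tokens without min_new_tokens, using an
# arbitrary limit of 7 to mirror the arbitrary 20 above.
out = model.generate(input_ids, assistant_model=assistant, max_new_tokens=7)
assert out.shape[-1] <= input_ids.shape[-1] + 7
```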

Let me know if this makes sense or if you have any concerns. Would appreciate your approval if it looks alright to you.

@zucchini-nlp (Member) left a comment

Ah, makes sense now. Indeed the test is failing, and it's weird that it hasn't been causing us trouble before (cc @ydshieh maybe). The issue stems from #34377 and the preceding PR, where the default value was changed from max=20 to max_new=20.
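Assuming "max" and "max_new" are shorthand for the max_length and max_new_tokens defaults in GenerationConfig (an assumption based on the shorthand above, not confirmed here), a minimal sketch of the difference:

```python
from transformers import GenerationConfig

# Old default: max_length caps the TOTAL length, prompt tokens included.
old_default = GenerationConfig(max_length=20)

# New default (per the comment above): max_new_tokens caps only the newly
# generated tokens, so the total length can reach prompt_length + 20.
new_default = GenerationConfig(max_new_tokens=20)
```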

So I think it is better if you modify the code to out.shape[-1] <= 20 + input_ids.shape[1] and leave a small comment explaining where we got 20 :)
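A minimal sketch of the suggested check, assuming the test uses a plain assert (the comment wording is illustrative):

```python
# 20 is the default number of new tokens discussed above (changed in #34377),
# so the output can be at most 20 tokens longer than the prompt.
assert out.shape[-1] <= 20 + input_ids.shape[1]
```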

@ydshieh (Collaborator) commented Dec 2, 2024

@zucchini-nlp

> weird it has not been causing us troubles

see #34807. TL;DR: a test fetcher issue means some tests are not fetched (including this one).

Thank you @keyboardAnt for helping and for this PR.

For this length issue, @gante on our team has opened #34814, but we are still discussing what the final call will be. Therefore I think it would be good not to merge this PR yet and to wait for the decision in #34814.

@keyboardAnt (Contributor, Author)

@ydshieh, thank you for responding! I'm a bit puzzled, though: since #34814 doesn't fix test_generated_length_assisted_generation and this PR does, wouldn't it make sense to merge this PR now to resolve the bug immediately? #34814 can still refine or build on it later as needed. Why leave the bug open when there's already a fix ready to merge?

@zucchini-nlp (Member) commented Dec 3, 2024

@keyboardAnt it should fix it by removing the extra length added in utils.py, but the discussion is still ongoing about whether to keep that extra length. So yeah, let's wait.

@ydshieh (Collaborator) commented Dec 3, 2024

Hi @keyboardAnt. IIRC, the issue comes from #34377, which follows from #34026 (and #34617 is involved later). This is more of a design decision we have to make, and it is still under discussion. Fixing a test while we still need to agree on a choice isn't a super great idea :-).

(If it's about fixing something that affects many users, that's another story.)

@ydshieh (Collaborator) commented Dec 3, 2024

But if our assumption about the cause of the failure is wrong, please correct us 🙏
