[DOCS] Added docstring example for EpsilonLogitsWarper #24783 #25378
Conversation
In addition to the comments in the lines below, see also this comment: let's try to build shorter code for the example :)
>>> set_seed(100)
>>> # We can see that the model generates `J. Trump` as the next token
>>> outputs = model.generate(inputs["input_ids"], max_new_tokens=4)
I'd set do_sample=True here, for a direct comparison with the example below :)
You may also want to illustrate a case where this line does not generate the most common case (i.e. does NOT generate J. Trump Jr), where applying epsilon_cutoff would result in generating the common case.
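For illustration, a minimal sketch of that tweak, reusing the model, inputs, and set_seed from the quoted example above (placeholder values, not the wording that ended up in the docstring):
>>> set_seed(100)
>>> # Sample instead of decoding greedily, so this call is directly comparable
>>> # to the epsilon_cutoff call discussed later in the thread
>>> outputs = model.generate(inputs["input_ids"], do_sample=True, max_new_tokens=4)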
Hi @gante: Can you share some explanation for illustrating the second case you mentioned? I am trying to understand: how does a case where ordinary multinomial sampling doesn't generate the most common continuation, while applying epsilon_cutoff does, show that epsilon_cutoff is better at sampling from a variety of tokens? Thanks.
epsilon_cutoff is a variation of top_p/top_k, sharing a common point: it drops some low-probability options according to its algorithm. As such, we should be able to show an example where applying it to a sequence that has a low-probability token results in a modified sequence :)
But hold that thought: the example is a bit too long, and we need to think about a more concise way to showcase this processor.
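As a rough illustration of that filtering behaviour, here is a toy sketch that applies the warper directly to a hand-made distribution instead of going through generate (it assumes EpsilonLogitsWarper is importable from the top-level transformers namespace):
>>> import torch
>>> from transformers import EpsilonLogitsWarper

>>> # Toy scores over a 5-token vocabulary: two likely tokens and three unlikely ones
>>> probs = torch.tensor([[0.55, 0.40, 0.03, 0.015, 0.005]])
>>> scores = torch.log(probs)

>>> # With epsilon=0.05, only tokens whose probability is >= 0.05 keep their scores;
>>> # the rest are set to -inf and can no longer be sampled
>>> warper = EpsilonLogitsWarper(epsilon=0.05)
>>> filtered = warper(torch.tensor([[0]]), scores)
>>> print((filtered == float("-inf")).sum().item())
3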
Force-pushed from a6fb4e3 to 01efbb0
Hi @gante, addressed your comments.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
The example has 40 lines at the moment, which is too long. Our docs should be concise, otherwise users won't bother reading them :)
Let's get a case where we can showcase the processor with 2 generate calls (one with and another without epsilon_cutoff). Note that set_seed needs to be called before each generate call, otherwise we can't show that epsilon_cutoff had an impact (i.e. otherwise the difference could be attributed to sampling randomness, and not to epsilon_cutoff).
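A rough sketch of that structure, with a hypothetical checkpoint, prompt, seed, and epsilon value rather than the ones the final docstring settled on:
>>> from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> inputs = tokenizer("A number sequence: 1, 2", return_tensors="pt")

>>> # Without epsilon_cutoff: plain multinomial sampling
>>> set_seed(1)
>>> outputs = model.generate(**inputs, do_sample=True, max_new_tokens=4)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])

>>> # Same seed, with epsilon_cutoff: tokens below the probability threshold are dropped
>>> # before sampling, so any change in the output is due to the cutoff, not the seed
>>> set_seed(1)
>>> outputs = model.generate(**inputs, do_sample=True, max_new_tokens=4, epsilon_cutoff=3e-4)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])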
Force-pushed from 01efbb0 to 28400c9
Hi @gante, made the suggested changes.
Thank you for iterating 👍
Left a very small nit! Thanks for working on this! 🤗
>>> # The use of the `epsilon_cutoff` parameter (best performing values between 3e-4 and 9e-4 from the paper mentioned above) generates tokens
>>> # by sampling from a variety of tokens with probabilities greater than or equal to epsilon value. The disadvantage of this sampling is that
>>> # if there are many possible tokens to sample from, the epsilon value has to be very small for sampling to occur from all the possible tokens.
These lines are a bit too long; they should be less than 119 characters.
@ArthurZucker: made the suggested change.
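To illustrate the caveat from the quoted docstring comment above (that epsilon has to be very small when many tokens are plausible), here is a small hypothetical check applying the warper to a broad, noisy distribution; the vocabulary size and epsilon value are just example numbers:
>>> import torch
>>> from transformers import EpsilonLogitsWarper

>>> # Noisy scores over a GPT-2-sized vocabulary: probability mass is spread so thinly
>>> # that almost every token ends up below the 3e-4 threshold
>>> torch.manual_seed(0)
>>> scores = torch.randn(1, 50257)

>>> warper = EpsilonLogitsWarper(epsilon=3e-4)
>>> filtered = warper(torch.tensor([[0]]), scores)
>>> # Only the handful of tokens whose probability reaches 3e-4 survive; the vast
>>> # majority of the vocabulary is masked to -inf
>>> print((~torch.isinf(filtered)).sum().item())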
Force-pushed from 28400c9 to 386801b
@sanjeevk-os Thank you for your contribution 💛
(huggingface#25378)
* [DOCS] Added docstring example for EpsilonLogitsWarper huggingface#24783
* minor code changes based on review comments
* set seed for both generate calls, reduced the example length
* fixed line length under 120 chars
What does this PR do?
See #24783
Added docstring example for EpsilonLogitsWarper
Who can review?
@gante