Fix doctest more (for docs/source/en) #30247
Changes from 1 commit
@@ -57,9 +57,10 @@ When you load a model explicitly, you can inspect the generation configuration t

 >>> model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
 >>> model.generation_config
 GenerationConfig {
-  "bos_token_id": 50256,
-  "eos_token_id": 50256,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256
 }
+<BLANKLINE>
 ```

 Printing out the `model.generation_config` reveals only the values that are different from the default generation configuration.
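As context for reviewers, the behavior this doctest pins down can be reproduced with a short standalone snippet (a sketch using the checkpoint named above; the `max_new_tokens` override is an illustrative assumption, not part of this diff):

```python
from transformers import AutoModelForCausalLM

# The generation config is loaded together with the model
# (from generation_config.json when the checkpoint provides one).
model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")

# Printing shows only values that differ from the GenerationConfig defaults,
# hence the short two-field output asserted in the doctest above.
print(model.generation_config)

# Overriding a default makes the new value appear in the printout as well.
model.generation_config.max_new_tokens = 32
print(model.generation_config)
```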
@@ -244,8 +245,7 @@ To enable multinomial sampling set `do_sample=True` and `num_beams=1`.

 >>> outputs = model.generate(**inputs, do_sample=True, num_beams=1, max_new_tokens=100)
 >>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
-['Today was an amazing day because when you go to the World Cup and you don\'t, or when you don\'t get invited,
-that\'s a terrible feeling."']
+["Today was an amazing day because we received these wonderful items by the way of a gift shop. The box arrived on a Thursday and I opened it on Monday afternoon to receive the gifts. Both bags featured pieces from all the previous years!\n\nThe box had lots of surprises in it, including some sweet little mini chocolate chips! I don't think I'd eat all of these. This was definitely one of the most expensive presents I have ever got, I actually got most of them for free!\n\nThe first package came"]
 ```

Reviewer comment: I have to check if it respects the …

Reviewer follow-up: OK, it does.
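For reference, the snippet this hunk edits corresponds roughly to the following self-contained sketch (the checkpoint, prompt, and `set_seed` call are assumptions drawn from the surrounding doc page; sampled text can change across library versions, which is what this doctest update reflects):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

# Sampling is stochastic: without a fixed seed, the doctest output
# could never be matched exactly.
set_seed(0)

checkpoint = "openai-community/gpt2"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Today was an amazing day because", return_tensors="pt")

# do_sample=True with num_beams=1 selects multinomial sampling.
outputs = model.generate(**inputs, do_sample=True, num_beams=1, max_new_tokens=100)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```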
 ### Beam-search decoding

@@ -393,7 +393,7 @@ just like in multinomial sampling. However, in assisted decoding, reducing the t

 >>> assistant_model = AutoModelForCausalLM.from_pretrained(assistant_checkpoint)
 >>> outputs = model.generate(**inputs, assistant_model=assistant_model, do_sample=True, temperature=0.5)
 >>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
-['Alice and Bob are going to the same party. It is a small party, in a small']
+['Alice and Bob, a couple of friends of mine, who are both in the same office as']
 ```

 Alternatively, you can also set the `prompt_lookup_num_tokens` to trigger n-gram based assisted decoding, as opposed to model-based assisted decoding.
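A minimal sketch of that n-gram variant, assuming an arbitrary causal LM checkpoint and prompt (neither appears in this hunk):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "openai-community/gpt2"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Alice and Bob", return_tensors="pt")

# Prompt lookup decoding drafts candidate tokens from n-grams already
# present in the prompt instead of querying a separate assistant model.
outputs = model.generate(**inputs, prompt_lookup_num_tokens=3, max_new_tokens=20)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```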
(second file)
@@ -309,7 +309,7 @@ The predicted tokens will then be placed between the sentinel tokens.

 >>> sequence_ids = model.generate(input_ids)
 >>> sequences = tokenizer.batch_decode(sequence_ids)
 >>> sequences
-['<pad><extra_id_0> park offers<extra_id_1> the<extra_id_2> park.</s>']
+['<pad> <extra_id_0> park offers <extra_id_1> the <extra_id_2> park.</s>']
 ```

Reviewer comment: @ArthurZucker Would be nice if you can confirm this is the expected format 🙏
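For context, the snippet under review matches the usual T5 span-corruption example; a self-contained version might look like this (the `t5-small` checkpoint and the input sentence are assumptions, not taken from this hunk):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")  # assumed checkpoint
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Sentinel tokens <extra_id_0>, <extra_id_1>, ... mark masked spans;
# the model's predictions come back delimited by the same sentinels.
input_ids = tokenizer(
    "The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt"
).input_ids

sequence_ids = model.generate(input_ids)
sequences = tokenizer.batch_decode(sequence_ids)
print(sequences)
```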
 ## Performance
Reviewer comment: This is the way to make the doctest pass. See https://wiki.python.org/moin/MultiLineStringsInDocTest
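To illustrate the directive that comment refers to, here is a minimal standalone doctest (a sketch, not transformers code): an empty line in expected doctest output must be spelled `<BLANKLINE>`, because a literal blank line would terminate the expected-output block.

```python
def show_config():
    """Print a config-like block followed by an empty line.

    >>> show_config()
    GenerationConfig {
      "bos_token_id": 50256
    }
    <BLANKLINE>
    """
    print('GenerationConfig {\n  "bos_token_id": 50256\n}\n')


if __name__ == "__main__":
    import doctest

    doctest.testmod()  # passes: <BLANKLINE> matches the trailing empty line
```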