
added the max_matching_ngram_size to GenerationConfig #29131

Merged
merged 7 commits into from
Mar 6, 2024

Conversation

mosheber (Contributor)

What does this PR do?

  • Added the max_matching_ngram_size parameter to GenerationConfig, for use by PromptLookupCandidateGenerator.
  • Passed max_matching_ngram_size to the __init__ of PromptLookupCandidateGenerator in _get_candidate_generator when it is specified.
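For context, prompt lookup decoding searches the prompt for an earlier occurrence of the sequence's trailing n-gram (up to max_matching_ngram_size tokens) and proposes the tokens that followed it as draft candidates. A minimal pure-Python sketch of that matching step (names and structure are illustrative, not the actual PromptLookupCandidateGenerator code):

```python
def find_candidate_tokens(input_ids, max_matching_ngram_size=2, num_output_tokens=10):
    """Sketch of prompt lookup decoding's matching step (illustrative names).

    Try the largest n-gram size first and fall back to smaller sizes: find an
    earlier occurrence of the sequence's trailing n-gram and propose the
    tokens that followed it as draft candidates.
    """
    seq_len = len(input_ids)
    for ngram_size in range(min(max_matching_ngram_size, seq_len - 1), 0, -1):
        ngram = input_ids[-ngram_size:]  # trailing n-gram to look up
        for start in range(seq_len - ngram_size):  # scan earlier positions only
            if input_ids[start:start + ngram_size] == ngram:
                follow = start + ngram_size  # tokens that followed the match
                candidates = input_ids[follow:follow + num_output_tokens]
                if candidates:
                    return candidates
    return []  # no match found: caller falls back to regular decoding
```

Raising max_matching_ngram_size makes the lookup try longer, more specific matches first, which is what this PR exposes as a configurable flag.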

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@gante , would appreciate it if you could give this PR a glance, and thank you in advance.

@gante (Member) commented Feb 26, 2024

Hi @mosheber 👋

I'd be happy to merge the PR, conditional on the answer to the following question being yes (preferably backed with data): have you found significant benefits of changing the flag you added?

On the original issue, the author's experiments showed little benefit from changing this option. As such, we don't want to add new flags unless they result in clear benefits :)

@danielkorat (Contributor) commented Feb 26, 2024

Hi @gante @mosheber 👋

The results below show a 3 ms latency speedup with a 7B target model when comparing the default max_matching_ngram_size=2 with max_matching_ngram_size=4.
The bottom graph shows n_matches vs. max_matching_ngram_size.
As the target model size increases further, the speedup will be greater.
Note that these results use an optimized routine for PLD subsequence matching (~70x faster on average; uses numba):

get_candidates:       0.3467 ms
get_candidates_opt:   0.0047 ms
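The benchmark above used a numba-compiled routine; as an illustration of the same idea (an assumption about the approach, not the code that produced these numbers), the token-by-token scan can be vectorized, here with NumPy's sliding_window_view:

```python
import numpy as np

def get_candidates_vectorized(input_ids, ngram_size=2, num_output_tokens=10):
    """Vectorized n-gram lookup sketch: compare all prompt windows against
    the trailing n-gram at once instead of scanning token by token."""
    ids = np.asarray(input_ids)
    if len(ids) <= ngram_size:
        return []
    windows = np.lib.stride_tricks.sliding_window_view(ids, ngram_size)
    # exclude the final window: it is the trailing n-gram itself
    matches = np.all(windows[:-1] == ids[-ngram_size:], axis=1)
    hits = np.flatnonzero(matches)
    if hits.size == 0:
        return []
    follow = hits[0] + ngram_size  # take the first match for simplicity
    return ids[follow:follow + num_output_tokens].tolist()
```

Replacing the Python-level inner loop with one array comparison is where speedups of this magnitude typically come from, whether via NumPy or numba.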

[Screenshots: latency comparison and n_matches vs. max_matching_ngram_size graphs]

@gante (Member) commented Feb 26, 2024

@danielkorat convinced by your numbers 👍

Let's add this PR!

@@ -698,9 +698,12 @@ def _get_candidate_generator(
         Returns the candidate generator to be used in `assisted_generation`
         """
         if generation_config.prompt_lookup_num_tokens is not None:
-            candidate_generator = PromptLookupCandidateGenerator(
+            candidate_generator_params = dict(
Member commented on the diff:

Instead of creating a dict here, let's:
a) pass keyword arguments to PromptLookupCandidateGenerator (as before the PR)
b) default max_matching_ngram_size in PromptLookupCandidateGenerator to None, and set it to the original default value in __init__ if it is None.

This pushes complexity away from generate and into PromptLookupCandidateGenerator :)
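The pattern being suggested can be sketched as follows (a simplified illustration, not the exact transformers code; the default constant name is made up here, though 2 is the default value mentioned above):

```python
class PromptLookupCandidateGenerator:
    """Sketch of the suggested pattern: the caller passes keyword arguments
    straight through, and the constructor resolves None to the default."""

    DEFAULT_MAX_MATCHING_NGRAM_SIZE = 2  # hypothetical name for the default

    def __init__(self, num_output_tokens=10, max_matching_ngram_size=None):
        self.num_output_tokens = num_output_tokens
        # complexity lives here, not in `generate`: None means "use default"
        self.max_matching_ngram_size = (
            max_matching_ngram_size
            if max_matching_ngram_size is not None
            else self.DEFAULT_MAX_MATCHING_NGRAM_SIZE
        )

# the call site in _get_candidate_generator stays a plain keyword call:
gen = PromptLookupCandidateGenerator(
    num_output_tokens=10,
    max_matching_ngram_size=None,  # i.e. not specified in GenerationConfig
)
```

The call site never needs to know the default value, so adding future flags to the generator does not complicate generate.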

mosheber (Contributor, Author) replied:

Thanks for the comment! I switched back to keyword arguments.

@gante (Member) commented Feb 26, 2024

@mosheber after applying the fix, please run make fixup from the transformers folder and commit the result. It will fix the CI issues you're seeing :)

@mosheber (Contributor, Author)

> @mosheber after applying the fix, please run make fixup from the transformers folder and commit the result. It will fix the CI issues you're seeing :)

I ran the make fixup as well.

@gante (Member) left a comment:
The changes look good to me, thank you for iterating and making transformers better 💛

@gante (Member) left a comment:

@mosheber

Actually, there is a tiny missing thing: max_matching_ngram_size should have an entry in the docstring of GenerationConfig

Our CI is still complaining about code formatting; make sure you have the latest version installed when running make fixup again :)

@mosheber (Contributor, Author)

> @mosheber
>
> Actually, there is a tiny missing thing: max_matching_ngram_size should have an entry in the docstring of GenerationConfig
>
> Our CI is still complaining about code formatting; make sure you have the latest version installed when running make fixup again :)

Great idea! I added the docstring to the GenerationConfig class.
Regarding the tests, I fixed the formatting issue; the remaining CI failures seem to be timing-related and unrelated to this PR.

@gante gante requested a review from ArthurZucker February 28, 2024 12:25
@gante (Member) commented Feb 28, 2024

> Regarding the tests, I fixed the formatting issue, it seems that the CI fails due to timing reasons, which are not related to this PR.

Yeah don't worry about it :)

@ArthurZucker (Collaborator) left a comment:

LGTM! just one nit

(Review comment on src/transformers/generation/configuration_utils.py: outdated, resolved)
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante merged commit 19fb1e2 into huggingface:main Mar 6, 2024
21 checks passed