
refactor: change default block_size when block_size > max position embeddings #26069

Merged: 2 commits into huggingface:main on Sep 18, 2023
Conversation

@pphuc25 (Contributor) commented on Sep 9, 2023

Hi,
In the original code, everything appears to function correctly when the default block_size is set to 1024. However, I believe this setting may hinder training performance, so I have adjusted the default to max_position_embeddings for the case where the two values don't match.
I would like to cc @sanchit-gandhi to review my PR, thank you so much!

@pphuc25 (Contributor, Author) commented on Sep 12, 2023

Thank you for your helpful information.

@pphuc25 pphuc25 closed this Sep 12, 2023
@pphuc25 pphuc25 deleted the flax_token branch September 12, 2023 18:25
@sanchit-gandhi (Contributor) commented

No worries @pphuc25! Would you like to open a PR to set the block size to min(1024, config.max_position_embeddings)? This will prevent the error when the block size is set greater than max_position_embeddings.
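For concreteness, a minimal sketch of the suggested default is below. The helper `resolve_block_size` and its parameter names are hypothetical illustrations; the actual change lives inline in the Flax language-modeling example script.

```python
def resolve_block_size(requested_block_size, tokenizer_max_length, max_position_embeddings):
    """Hypothetical helper illustrating the suggested default; not the exact
    code merged into the Flax example script."""
    if requested_block_size is None:
        block_size = tokenizer_max_length
        if block_size > max_position_embeddings:
            # Cap the default so it never exceeds the model's
            # position-embedding limit, keeping 1024 as the usual ceiling.
            block_size = min(1024, max_position_embeddings)
        return block_size
    # An explicitly requested block_size is still clamped to the tokenizer's limit.
    return min(requested_block_size, tokenizer_max_length)


# With a model limited to 512 positions, the default becomes 512 rather
# than 1024, avoiding the block_size > max_position_embeddings error:
assert resolve_block_size(None, tokenizer_max_length=2048, max_position_embeddings=512) == 512
assert resolve_block_size(None, tokenizer_max_length=2048, max_position_embeddings=4096) == 2048
```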

@pphuc25 (Contributor, Author) commented on Sep 14, 2023

That's a cool idea, I will do it, thank you @sanchit-gandhi.

@pphuc25 pphuc25 restored the flax_token branch September 14, 2023 15:59
@pphuc25 pphuc25 reopened this Sep 14, 2023
@pphuc25 pphuc25 changed the title from "refactor: change default block_size when not initialize and not match case" to "refactor: change default block_size when block_size > max position embeddings" on Sep 14, 2023
@sanchit-gandhi (Contributor) left a comment

Lovely! Thanks @pphuc25 for the clean PR :)

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@ArthurZucker (Collaborator) left a comment

Thanks for the catch! 🤗

@sanchit-gandhi sanchit-gandhi merged commit 8b5da9f into huggingface:main Sep 18, 2023
@pphuc25 pphuc25 deleted the flax_token branch September 18, 2023 17:07
This pull request was later referenced by commits in forks:

- MKhalusova pushed a commit to MKhalusova/transformers on Sep 19, 2023
- parambharat pushed a commit to parambharat/transformers on Sep 26, 2023
- blbadger pushed a commit to blbadger/transformers on Nov 8, 2023
- EduardoPach pushed a commit to EduardoPach/transformers on Nov 18, 2023

Each carries the same two commits (huggingface#26069):

* refactor: change default block_size when not initialize
* reformat: add the min of block size
Labels: none yet
Projects: none yet
4 participants