Fix BytePair special tokens tokenization #1447

Closed
wants to merge 70 commits into from

Add special_tokens_in_strings to byte_pair_tokenizer

d0ff826

Google CLA / cla/google failed Apr 2, 2024 in 9s

❌ Missing CLA from one or more contributors

We couldn't find a Contributor License Agreement (CLA) for some of the contributors shown below. All contributors listed must be covered under a CLA for this pull request to be merged.

📝 If you are not currently covered under a CLA, please visit https://cla.developers.google.com/. Once you've signed, follow the "New Contributors" link at the bottom of that page to update this check.


Help! I've signed the CLA, but it's still showing me as unsigned.

Individual signers
Corporate signers
  • Your company has a Point of Contact who decides which employees are authorized to participate. Ask your POC to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the Google project maintainer to go/cla#troubleshoot (Public version).
  • The email used to register you as an authorized contributor must be the same email used in your Git commits. Check your existing CLA data and verify that your email is set on your git commits.
  • The email used to register you as an authorized contributor must also be attached to your GitHub account.
  • You may have Keep my email address private enabled. Without a visible email address, the CLA cannot be checked. Uncheck it and re-create the offending commit, or have your CLA point of contact add your @users.noreply.github.com address to the CLA group.

ℹ️ Googlers: Go here to view more details and manage scans for this pull request.

🔁 New Contributors: Update this check after signing the CLA by clicking here.

Details

The following contributors were found for this pull request:

d0ff826 Author: @abuelnasr0 <abu******70​@gmail.com>, <64566340+abuelnasr0​@users.noreply.github.com>
d95c271 Author: @mattdangerw <1389937+mattdangerw​@users.noreply.github.com>, <mat******rw​@gmail.com>
9da7400 Co-Author: <fran********llet​@gmail.com>
6e946e2 Author: @grasskin <43894452+grasskin​@users.noreply.github.com>
db855bc Author: @qlzh727 <sc****hu​@google.com>
4a0adf2 Author: @nkovela1 <60985914+nkovela1​@users.noreply.github.com>
2be333c Author: @SamanehSaadat <ss****t​@google.com>
9da7400 Co-Author: @haifeng-jin <5476582+haifeng-jin​@users.noreply.github.com>
035a776 Author: @sampathweb <1437573+sampathweb​@users.noreply.github.com>
9da7400 Co-Author: <r**n​@ryanmullins.org>
dcebc7c Author: @tirthasheshpatel <tirt********atel​@gmail.com>
414b4f4 Author: @cpsauer <cpsauer​@users.noreply.github.com>
134f8b7 Author: @TheCrazyT <TheCrazyT​@users.noreply.github.com>
f92d4f8 Author: @shmishra99 <124146945+shmishra99​@users.noreply.github.com>
5944635 Author: @sachinprasadhs <sac******sad​@google.com>
29873a9 Co-Author: @dependabot[bot] <49699333+dependabot[bot]​@users.noreply.github.com>
898329f Author: @mykolaskrynnyk <45297092+mykolaskrynnyk​@users.noreply.github.com>
298e15c Author: @RyanMullins <rya******ns​@google.com>
6a8166e Author: @pranavvp16 <94780581+pranavvp16​@users.noreply.github.com>
6ea1e63 Author: @Wauplin <lu****p​@gmail.com>
e5b2833 Author: @asmith26 <asmith26​@users.noreply.github.com>
91aa654 Author: @briango28 <72905199+briango28​@users.noreply.github.com>

(Only the first commit for a unique contributor is listed.)