fix: tokenizers normalizers sequence api changed #195
`tokenizers::normalizers::Sequence::get_normalizers` has been replaced by an `AsRef<[NormalizerWrapper]>` implementation on `Sequence`
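For readers unfamiliar with the pattern, here is a rough sketch of what moving from an inherent accessor to an `AsRef` impl looks like from the caller's side. The `Sequence` and `NormalizerWrapper` types below are simplified stand-ins defined locally, not the real `tokenizers` types, and `count_normalizers` is a hypothetical helper made up for illustration.

```rust
// Stand-in for the real enum in the `tokenizers` crate (assumption: the
// real type has many more variants, each carrying data).
#[derive(Debug, Clone, PartialEq)]
enum NormalizerWrapper {
    Lowercase,
    Strip,
}

// Stand-in for `tokenizers::normalizers::Sequence`.
struct Sequence {
    normalizers: Vec<NormalizerWrapper>,
}

// Shape of the accessor after the change: a trait impl rather than an
// inherent `get_normalizers` method.
impl AsRef<[NormalizerWrapper]> for Sequence {
    fn as_ref(&self) -> &[NormalizerWrapper] {
        &self.normalizers
    }
}

// Hypothetical caller. Code that used to call `seq.get_normalizers()`
// would go through `as_ref()` instead.
fn count_normalizers(seq: &Sequence) -> usize {
    let inner: &[NormalizerWrapper] = seq.as_ref();
    inner.len()
}

fn main() {
    let seq = Sequence {
        normalizers: vec![NormalizerWrapper::Lowercase, NormalizerWrapper::Strip],
    };
    println!("normalizers in sequence: {}", count_normalizers(&seq));
}
```

The explicit `&[NormalizerWrapper]` annotation is optional here, but it keeps type inference unambiguous if more `AsRef` impls are ever added to the type.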
|
I didn't include this in the PR but if huggingface is going to introduce api breaking changes with patches maybe the version constraint should be |
|
The main reason I had this very flexible dep is so that the tokenizer lib can be easily updated in mistral.rs (I think that's the only pure rust user of llg). But I think you're right - if they break it like this, we should probably pin it... |
|
Maybe make it "0.21.2" and hope they follow semver guidelines from now on? |
|
This would be really helpful if we could progress it. |
@ammar-elsabe did you mean to set it to |
|
I'm with @hudson-ai on this one - please set 0.21.2 as minimum and let's see! |
|
@hudson-ai I actually set it to "0.21.2" on purpose. Strictly speaking, SemVer treats any version change before 1.0.0 as potentially incompatible.
Cargo is a little more flexible: before 1.0.0 it treats minor version bumps as incompatible, but patch bumps are allowed.
I thought |
hudson-ai
left a comment
Ah, I wasn't aware of cargo's flexibility on patch versions for pre-1.0.0. This means 0.21.2 := >=0.21.2, <0.22.0? I would agree that this is a reasonable middle-ground. @mmoskal what do you think?
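To make the range reading above concrete, here is a minimal sketch of how a bare requirement such as `0.21.2` (Cargo's default caret form) maps to a half-open version range. It is a hand-rolled illustration, not Cargo's actual resolver: `Version`, `caret_upper_bound`, and `caret_matches` are names invented for this example, and pre-release and build metadata are ignored.

```rust
/// A bare `major.minor.patch` version. Field order matters: the derived
/// `Ord` compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

impl Version {
    /// Parses exactly three dot-separated numeric components.
    fn parse(s: &str) -> Option<Version> {
        let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
        let version = Version {
            major: parts.next()??,
            minor: parts.next()??,
            patch: parts.next()??,
        };
        if parts.next().is_some() {
            return None; // more than three components
        }
        Some(version)
    }
}

/// Exclusive upper bound of a caret requirement: bump the left-most
/// non-zero component and zero everything to its right.
fn caret_upper_bound(req: Version) -> Version {
    if req.major > 0 {
        Version { major: req.major + 1, minor: 0, patch: 0 }
    } else if req.minor > 0 {
        Version { major: 0, minor: req.minor + 1, patch: 0 }
    } else {
        Version { major: 0, minor: 0, patch: req.patch + 1 }
    }
}

/// Returns whether `candidate` satisfies the caret requirement `req`,
/// or `None` if either string fails to parse.
fn caret_matches(req: &str, candidate: &str) -> Option<bool> {
    let req = Version::parse(req)?;
    let candidate = Version::parse(candidate)?;
    Some(candidate >= req && candidate < caret_upper_bound(req))
}

fn main() {
    for candidate in ["0.21.1", "0.21.2", "0.21.9", "0.22.0"] {
        println!(
            "requirement 0.21.2 accepts {}: {:?}",
            candidate,
            caret_matches("0.21.2", candidate)
        );
    }
}
```

Under this reading, `0.21.2` as a requirement admits later `0.21.x` patches but not `0.22.0`, which matches the `>=0.21.2, <0.22.0` interpretation.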
4 days ago huggingface released `tokenizers@v0.21.2`, which changed the `get_normalizers` method to be an `impl AsRef<[NormalizerWrapper]> for Sequence`. Due to the version requirement in `toktrie_hf_tokenizers` (`version = ">=0.20.0, <1.0.0"`), in the absence of a lockfile cargo fetches v0.21.2, which will not compile.
This was caught in EricLBuehler/mistral.rs#1523