Closed as not planned
Description
After PR #252, all base models need to be converted again.
For me, this is a big breaking change. The LoRA and/or Alpaca fine-tuned models are no longer compatible.
Reconverting is not possible.
I see from the PR that the tokenizer scores are now written into the model file.
Would it make sense to write the tokenizer scores into a separate file instead, to stay compatible with the (old) models?
The question then arises whether
- on loading the model, the loader checks whether the scores file exists and, if it does, uses the sentencepiece tokenizer, or
- the user can decide which tokenizer to use.
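To make the two options concrete, here is a minimal Python sketch of how the choice could work. It is only an illustration of the idea, not how the loader is implemented: the `.scores` suffix, the mode names `"sentencepiece"` and `"legacy"`, and the functions `scores_path_for` and `choose_tokenizer` are all hypothetical.

```python
from pathlib import Path
from typing import Optional

# Hypothetical suffix for a sidecar scores file (an assumption, not an existing convention).
SCORES_SUFFIX = ".scores"

# Hypothetical mode names used only in this sketch.
VALID_MODES = ("sentencepiece", "legacy")


def scores_path_for(model_path: str) -> Path:
    """Return the path where the sidecar scores file would be expected."""
    return Path(model_path + SCORES_SUFFIX)


def choose_tokenizer(model_path: str, user_choice: Optional[str] = None) -> str:
    """Pick a tokenizer mode for a model.

    Option 2: if the user names a mode explicitly, honour it.
    Option 1: otherwise, use the score-based tokenizer only when the
    sidecar scores file exists, and fall back to the old behaviour if not.
    """
    has_scores = scores_path_for(model_path).is_file()

    if user_choice is not None:
        if user_choice not in VALID_MODES:
            raise ValueError(f"unknown tokenizer mode: {user_choice!r}")
        if user_choice == "sentencepiece" and not has_scores:
            raise FileNotFoundError(
                f"scores file not found: {scores_path_for(model_path)}"
            )
        return user_choice

    return "sentencepiece" if has_scores else "legacy"
```

With this layout, an old model without a scores file would keep loading with the old tokenizer, and a scores file could be generated separately without reconverting the model itself.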
What do you think?