Breaking change of models since PR #252

After PR #252, all base models need to be converted again.

For me, this is a big breaking change. The LoRA and/or Alpaca fine-tuned models are no longer compatible.
Reconverting is not possible.

From the PR, I see that the tokenizer scores are written into the model.
Would it make sense to write the tokenizer scores into a separate file instead, to stay compatible with the (old) models?
The question then arises whether
1. on model load, the score file is checked for existence and, if it is present, the sentencepiece tokenizer is used, or
2. the user decides which tokenizer to use.
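
To make the two options concrete, here is a rough sketch of how the selection could work. It is written in Python for brevity and is not taken from the project's code: the `.tokenizer_scores` suffix, the function name `choose_tokenizer`, and the labels `"scored"` and `"legacy"` are all made up for illustration.

```python
import os
from typing import Optional

# Hypothetical suffix for the separate score file (an assumption, not an
# existing convention in the project).
SCORES_SUFFIX = ".tokenizer_scores"


def choose_tokenizer(model_path: str, user_choice: Optional[str] = None) -> str:
    """Return "scored" (sentencepiece-style, uses scores) or "legacy".

    Option 1: with no user choice, pick "scored" only if the score file
    exists next to the model.
    Option 2: an explicit user choice overrides the automatic detection.
    """
    scores_path = model_path + SCORES_SUFFIX
    has_scores = os.path.isfile(scores_path)

    if user_choice is None:
        # Option 1: decide from the presence of the score file.
        return "scored" if has_scores else "legacy"

    if user_choice not in ("scored", "legacy"):
        raise ValueError(f"unknown tokenizer choice: {user_choice!r}")
    if user_choice == "scored" and not has_scores:
        # The scored tokenizer cannot work without its score file.
        raise FileNotFoundError(scores_path)
    return user_choice
```

With something like this, old model files without a score file would keep loading with the old tokenizer, and both options could coexist: automatic detection by default, with a flag to override it.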

What do you think?
