wrongly annotated configuration when saving a model that has a custom pipeline #28907
To understand more about this, try checking these two checkpoints:

```python
from transformers import AutoModelForImageClassification

# checkpoint that doesn't have a custom pipeline
model_without_custom_pipeline = AutoModelForImageClassification.from_pretrained(
    "not-lain/MyRepo", trust_remote_code=True, revision="dba8d15072d743b6cb4a707246f801699897fb72"
)
model_without_custom_pipeline.push_to_hub("model_without_custom_pipeline_repo")

# checkpoint that has a custom pipeline
model_with_custom_pipeline = AutoModelForImageClassification.from_pretrained(
    "not-lain/MyRepo", trust_remote_code=True, revision="4b57ca965d5070af9975666e2a0f584241991597"
)
model_with_custom_pipeline.push_to_hub("model_with_custom_pipeline_repo")
```

and try checking the difference in configuration between them.
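To see concretely what differs between the two pushed configs, the `impl` strings can be compared directly. A minimal sketch (the values are hard-coded from this issue rather than fetched from the Hub):

```python
# `impl` values from the two configs discussed in this issue:
# the source checkpoint annotates the custom pipeline with its repo id,
# while the freshly pushed copy loses that prefix.
source_impl = "not-lain/MyRepo--MyPipe.MnistPipe"
pushed_impl = "MyPipe.MnistPipe"

# The "--" separator splits the repo id from the "module.Class" reference.
repo_id, _, class_ref = source_impl.rpartition("--")
print(repo_id)                   # not-lain/MyRepo
print(class_ref)                 # MyPipe.MnistPipe
print(pushed_impl == class_ref)  # True: only the repo prefix was dropped
```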
cc @Rocketknight1 if you can have a look!
@Rocketknight1 I'm going to give you a helping hand. You can trace
Hey! I've been reviewing this issue and the associated fix PR.

**Current situation:**

**Proposed fixes:**

- This PR: Fix the
- #29172: Add a

**Stuff I don't understand yet:**

Right now, we don't really have a way to initialize a custom pipeline independently of a model. According to the docs, the only way to get a remote code pipeline is to initialize it with an associated model:

```python
from transformers import pipeline

classifier = pipeline(model="{your_username}/test-dynamic-pipeline", trust_remote_code=True)
```

Therefore, I'm a bit confused about some of these PRs:

**The bit where I call for help because I'm not sure what to do:**

cc @Narsil - can you clarify what the intention was with the custom pipeline code? Should it be possible for users to upload custom pipelines separately from models?

cc the core maintainers @amyeroberts and @ArthurZucker as well - even though they're not that big, I think these PRs touch very fundamental parts of the conceptual model in
As for this issue, I propose a fix, which can be found in #29004. The config should look like this:

```json
{
  (...),
  "auto_map": {
    "AutoConfig": "not-lain/MyRepo--MyConfig.MnistConfig",
    "AutoModelForImageClassification": "not-lain/MyRepo--MyModel.MnistModel"
  },
  (...),
  "custom_pipelines": {
    "image-classification": {
      "impl": "not-lain/MyRepo--MyPipe.MnistPipe",
      "pt": [
        "AutoModelForImageClassification"
      ],
      "tf": [],
      "type": "image"
    }
  },
  (...)
}
```

instead of this:

```json
{
  (...),
  "auto_map": {
    "AutoConfig": "not-lain/MyRepo--MyConfig.MnistConfig",
    "AutoModelForImageClassification": "not-lain/MyRepo--MyModel.MnistModel"
  },
  (...),
  "custom_pipelines": {
    "image-classification": {
      "impl": "MyPipe.MnistPipe",
      "pt": [
        "AutoModelForImageClassification"
      ],
      "tf": [],
      "type": "image"
    }
  },
  (...)
}
```

Also, this can be considered a separate issue from #29172, since the latter needs one minor extra check to test whether we are pushing the newly registered pipeline to the same repo that the model came from (I know, a pretty specific condition). A rough estimation of the latter's solution is in #29172 (comment).
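The proposed annotation could be sketched as a small helper that adds the same `repo--` prefix that `auto_map` entries carry. This is a hypothetical illustration only; `annotate_custom_pipelines` is not the actual code in #29004:

```python
def annotate_custom_pipelines(config: dict, repo_id: str) -> dict:
    """Prefix each custom pipeline's `impl` with the source repo id,
    mirroring the `repo--module.Class` form used in `auto_map`."""
    for spec in config.get("custom_pipelines", {}).values():
        impl = spec.get("impl", "")
        if impl and "--" not in impl:  # not yet annotated with a repo id
            spec["impl"] = f"{repo_id}--{impl}"
    return config

config = {
    "custom_pipelines": {
        "image-classification": {"impl": "MyPipe.MnistPipe"}
    }
}
fixed = annotate_custom_pipelines(config, "not-lain/MyRepo")
print(fixed["custom_pipelines"]["image-classification"]["impl"])
# not-lain/MyRepo--MyPipe.MnistPipe
```

The `"--" not in impl` guard keeps already-annotated entries unchanged, so running the helper twice is harmless.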
```python
from transformers import pipeline

pipe = pipeline("image-segmentation", model="briaai/RMBG-1.4", revision="refs/pr/9", trust_remote_code=True)
numpy_mask = pipe("img_path")  # outputs numpy mask
```
Nah, not really. The library as a whole is not consistent enough when working with custom architectures, which is why I reported these issues. Maybe we can work through every issue independently, one step at a time, stabilizing it.
@Rocketknight1 Yes, I think this poses a more fundamental question about the behaviour we want from our pipelines.

The first question I have is: where should a pipeline live if we push it to the hub? Would it be under a model space? If that's the case, can a pipeline and a model exist under the same space? If so, do we have the same mapping behaviour that we do with e.g. tokenizers and configs? If the model and pipeline are under the same space, should that pipeline be associated with that model by default?

If a pipeline is uploaded by itself, I do think it should still have an associated default model, but I don't see why that couldn't be any model, i.e. its default could be a remote model, or a model in the same model repo.

Tbh, I think the current behaviour of coupling a model and a pipeline together is sensible: although pipelines within transformers are meant to be able to use any task-compatible model, I don't think we can enforce the same guarantees with custom pipelines.
@amyeroberts can you read this one to understand more? If you still have more questions, my DMs are always open, or you can check back with @Rocketknight1, since I have explained most of this to him in the DMs.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
### System Info

`transformers` version: 4.35.2

### Who can help?

@Narsil

### Information

### Tasks

- `examples` folder (such as GLUE/SQuAD, ...)

### Reproduction

`impl` is wrongly annotated

### Expected behavior

output configuration is