-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow to use a custom class in TasksManager & use canonical tasks names #967
Allow to use a custom class in TasksManager & use canonical tasks names #967
Conversation
The documentation is not available anymore as the PR was closed or merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great PR to have more consistency between libraries 🔥
I left a couple of comments, I think the doc still contains some occurrences of sequence-classification
while it should now be text-classification
if I'm not mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great PR 🚀
# Set of model topologies we support associated to the tasks supported by each topology and the factory | ||
_SUPPORTED_MODEL_TYPE = { | ||
"audio-spectrogram-transformer": supported_tasks_mapping( | ||
"default", | ||
"feature-extraction", | ||
"audio-classification", | ||
onnx="ASTOnnxConfig", | ||
), | ||
"albert": supported_tasks_mapping( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make it so that supported_tasks_mapping
can accept synonyms as well. But not primordial either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, that'd be interesting if external users are writing their own mappings. Is that something that happens frequently?
According to the nomenclature used in the Hub https://huggingface.co/tasks, and following huggingface/optimum#967 which was addressed after a Slack discussion.
Some architectures do not map to an auto class in transformers (e.g. as pix2struct and VisualBert). Thus we introduce the
_CUSTOM_CLASSES
in TasksManager to handle those.Moreover, when possible, we use
model_info.pipeline_tag
instead ofmodel_info.transformersInfo.pipeline_tag
to infer the task, as the later may be wrong. For reference: https://huggingface.slack.com/archives/C02EMARJ65P/p1681215560263539For example, now
task = TasksManager.infer_task_from_model("google/pix2struct-base")
rightfully returnsimage-to-text
, while it was returningtext2text-generation
before (due to the issue explained on slack).Moreover,
model = TasksManager.get_model_from_task("image-to-text", "google/pix2struct-base")
successfully loads the model asPix2StructForConditionalGeneration
, although it is not registered inAutoModelForVision2Seq
.Moreover, we align with the pipelines/tasks names from https://huggingface.co/datasets/huggingface/transformers-metadata/blob/main/pipeline_tags.json & https://docs.google.com/spreadsheets/d/1zvGwDwo-efnAhqepIleYLsALO70dn6dxRAF2ih8mCH8/edit#gid=0 instead of defining our own. For reference: https://huggingface.slack.com/archives/C03AHFPCWHM/p1680025770248079?thread_ts=1680020763.420539&cid=C03AHFPCWHM Backward compatibility is kept in the CLI.
Note: using
-with-past
to denote the case where past key values are used should probably not be in the task name.Backward compatibility can be tested with
RUN_SLOW=1 pytest tests/exporters/onnx/test_exporters_onnx_cli.py -k "test_legacy_tasks_backward_compatibility" -s
Edit: remaining issues: