-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MAINTENANCE] Refactor and clean up. #4008
Conversation
ludwig/encoders/image/base.py
Outdated
gradient_checkpointing=gradient_checkpointing, | ||
) | ||
if output_attentions: | ||
config = ViTConfig( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just create a dictionary mapping, then optionally add "attn_implementation" if output_attention and pass the dictionary as **kwargs into the Config? Will reduce boilerplate :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arnavgarg1 Done -- sorry about missing the obvious! Thanks!
"gradient_checkpointing": gradient_checkpointing, | ||
} | ||
config_dict["attn_implementation"] = "eager" | ||
config = ViTConfig(**config_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we no longer need to do the if else so we can pass these custom args in both cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arnavgarg1 We do -- I updated the code. If we do not do it, the tests still pass, because locally nothing changes (but runs slower). This is because of the breaking change on the HF side, where they default to SDPA for speed (but it is lazy execution, leaving attention tensors as None
; with "eager" in either case, things still work, but less efficiently). This is fixed now -- good catch -- thanks!
Scope
ViTEncoder
to ensure that thetransformers.ViTModel
returns theoutput_attentions
more elegant (and cuts on the amount of code).Code Pull Requests
Please provide the following:
Documentation Pull Requests
Note that the documentation HTML files are in
docs/
while the Markdown sources are inmkdocs/docs
.If you are proposing a modification to the documentation you should change only the Markdown files.
api.md
is automatically generated from the docstrings in the code, so if you want to change something in that file, first modifyludwig/api.py
docstring, then runmkdocs/code_docs_autogen.py
, which will createmkdocs/docs/api.md
.