-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consistent type checks for prepend and append tags. #1824
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1824
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 4c77153 with merge base 7744608 (): This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1824 +/- ##
==========================================
+ Coverage 67.05% 69.09% +2.03%
==========================================
Files 305 304 -1
Lines 15937 16014 +77
==========================================
+ Hits 10687 11065 +378
+ Misses 5250 4949 -301 ☔ View full report in Codecov by Sentry. |
@RdoubleA Require some review |
Pretty minor, so waiting for merge. |
torchtune/data/_prompt_templates.py
Outdated
+ message.content | ||
+ [{"type": "text", "content": append_tag}] | ||
) | ||
if isinstance(prepend_tag, str) and isinstance(append_tag, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should structure this slightly different to avoid the cascade of ifs:
content = message.content
if message.role in self.template:
prepend_tag = self.template[message.role][0]
append_tag = self.template[message.role][1]
if prepend_tag is not None and len(prepend_tag) > 0:
# add prepend tag
if append_tag is not None and len(append_tag) > 0:
# add append tag
formatted_dialogue.append(Message(...))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on if can something "bad" except None can come there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see now how do you want to do it. Ok, let me fix it like this. But I would like to save isinstance there
torchtune/data/_prompt_templates.py
Outdated
+ message.content | ||
+ [{"type": "text", "content": append_tag}] | ||
) | ||
if isinstance(prepend_tag, str) and isinstance(append_tag, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar comment above
torchtune/data/_prompt_templates.py
Outdated
+ message.content | ||
+ [{"type": "text", "content": append_tag}] | ||
) | ||
content = {message.content} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the {} should be removed here, otherwise we cannot concatenate lists below. otherwise the rest looks good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in both cases
@RdoubleA Probably now all issues are fixed(CI might be reruned) We will save isinstance + len > 0 to handle anything that is not good as prepend/append in this case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work, will wait for CI
torchtune/data/_prompt_templates.py
Outdated
content = [{"type": "text", "content": prepend_tag}] + content | ||
|
||
if isinstance(append_tag, str) and len(append_tag) > 0: | ||
content = content + [{"type": "text", "content": append_tag}] | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you could move content = message.content
above the if statement and remove this else entirely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
torchtune/data/_prompt_templates.py
Outdated
{"type": "text", "content": prepend_tag} | ||
] + message.content | ||
else: | ||
content = message.content |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Context
What is the purpose of this PR? Is it to
Please link to any issues this PR addresses.
#1565
Changelog
What are the changes made in this PR?
Test plan
Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.
pre-commit install
)pytest tests
pytest tests -m integration_test
UX
If your function changed a public API, please add a dummy example of what the user experience will look like when calling it.
Here is a docstring example
and a tutorial example
Add checks that prepend and append tags are actually valid. Did not make None <=> empty string. Pretty minor fix(written in perfomance most efficient way)