[Optimization] Advance parser concurrently with model forward pass #1065
Merged

Commits (22)
All commits by hudson-ai:

- `820ed87` Simplify Engine.__call__ to remove second call to get_next_token
- `9f434ad` simplify parser loop since we know mask will be non-none on every ite…
- `e5ee621` simplify model.__call__ loop again
- `1b49dd3` prototype concurrent parser
- `f9d38fd` generator cleanup
- `facafb7` move Mock temperature hook from get_next_token to sample_with_tempera…
- `28053c1` wrong assert
- `baa1d6b` silence cleanup exceptions in garbage collection
- `b1545ed` Simplify ByteParser
- `d97d72a` Allow non-concurrent path with get_next_token
- `f59f1fb` wrong assert
- `8aa882c` Merge branch 'main' into parallel_parser
- `f8779e7` test associativity on get_logits rather than get_next_token
- `fe33742` fix associativity test to get the args of the FIRST call
- `e0d8e69` use has_pending_stop to prevent unnecessary forward pass
- `1b2c5e2` comment
- `e718137` prevent parser cleanup from raising exceptions at system exit
- `27b3d19` bump llg
- `a22a40c` move LLInterpreterResponse validation into thread with mid_process to…
- `93cff1f` add some comments
- `90e5b6e` fix exception
- `42a5e5e` Merge branch 'main' into parallel_parser
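The core idea in the PR title, advancing the grammar parser concurrently with the model's forward pass, can be sketched roughly as follows. This is a minimal illustration, not the real guidance/llguidance implementation: `ToyParser`, `ToyModel`, `compute_mask`, `forward`, and `sample` are all made-up stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyParser:
    """Stand-in grammar parser: allows any of 5 toy tokens, done after 3."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps

    def compute_mask(self, tokens):
        # Returns None only when parsing is complete, mirroring the PR's
        # invariant that a None mask means "stop", not "accept anything".
        if len(tokens) >= self.max_steps:
            return None
        return [True] * 5


class ToyModel:
    """Stand-in LM: deterministic logits favoring token index len(tokens)."""

    def forward(self, tokens):
        return [1.0 if i == len(tokens) % 5 else 0.0 for i in range(5)]


def sample(logits, mask):
    # Greedy sampling restricted to tokens the mask allows.
    scored = [(l if m else float("-inf"), i)
              for i, (l, m) in enumerate(zip(logits, mask))]
    return max(scored)[1]


def generate(parser, model, prompt_tokens=()):
    tokens = list(prompt_tokens)
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start computing the first mask right away.
        mask_future = pool.submit(parser.compute_mask, tuple(tokens))
        while True:
            # The forward pass runs while the parser advances in the
            # background thread; we only block when the mask is needed.
            logits = model.forward(tokens)
            mask = mask_future.result()
            if mask is None:  # parser finished: stop generating
                break
            tokens.append(sample(logits, mask))
            mask_future = pool.submit(parser.compute_mask, tuple(tokens))
    return tokens


print(generate(ToyParser(), ToyModel()))  # → [0, 1, 2]
```

The point of the structure is that `compute_mask` for step n runs on the worker thread while `forward` for step n runs on the main thread, so neither waits on the other until sampling time.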
Conversations
If mask is None, isn't the parser in accepting mode? Shouldn't any token be accepted?
The mask should never be `None` unless the parser is actually done (i.e. we should not be accepting ANY tokens, because the loop should be stopping). This condition should be equivalent to `ll_response.stop` if we were to parse the string in the second slot of the future above.
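The invariant described in this reply (the mask is `None` exactly when the response signals stop) could be checked with a small assertion like the one below. The function name and the `ll_response_stop` parameter are illustrative, not part of the actual guidance API:

```python
def check_mask_stop_invariant(mask, ll_response_stop):
    """Assert the PR's invariant: a None mask means the parser stopped.

    A None mask mid-generation would silently drop constraints rather
    than mean "accept any token", so it must coincide with the stop flag.
    """
    assert (mask is None) == ll_response_stop, (
        "mask must be None exactly when the parser signals stop"
    )
```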
Note the .cleanup code, which is currently responsible for sending the final None token to get the generator loop to break. Let me know if you have any better ideas on how to structure it!
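The cleanup-sends-the-final-`None` structure mentioned here could be sketched as follows. `TokenParser`, `advance`, and the yielded values are illustrative, not the real guidance classes; the sketch only shows the pattern of a generator loop that a `cleanup()` method terminates by sending a `None` sentinel instead of relying on `GeneratorExit` at garbage collection:

```python
class TokenParser:
    """Generator-backed parser whose cleanup() sends a None sentinel."""

    def __init__(self):
        self.tokens = []
        self._gen = self._loop()
        next(self._gen)  # prime the generator to its first yield

    def _loop(self):
        while True:
            token = yield len(self.tokens)  # yields a stand-in "mask"
            if token is None:  # sentinel sent by cleanup(): break cleanly
                return
            self.tokens.append(token)

    def advance(self, token):
        # Feed one sampled token in; get the next stand-in mask back.
        return self._gen.send(token)

    def cleanup(self):
        # Send the final None so the loop exits via its own break path
        # rather than raising inside __del__ at system exit.
        try:
            self._gen.send(None)
        except StopIteration:
            pass


p = TokenParser()
p.advance(7)   # → 1 (one token consumed)
p.advance(8)   # → 2
p.cleanup()    # generator exits cleanly
```

One alternative to a sentinel is calling `self._gen.close()` in `cleanup()`, which raises `GeneratorExit` at the yield point; the sentinel approach keeps shutdown on the loop's normal control path, which matches the "prevent parser cleanup from raising exceptions at system exit" commits above.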