feat: prefill chunking #2600
base: main
@@ -531,7 +546,7 @@ mod tests {
             request: ValidGenerateRequest {
                 inputs: vec![],
                 input_ids: Some(Arc::new(vec![])),
-                input_length: 0,
+                input_length: 1,
Really? Isn't this supposed to be 0?
Are you changing it for other reasons?
I don't know why it was 0. 0 is not a possible value in the router as it is considered invalid.
@@ -784,7 +799,7 @@ mod tests {

     #[tokio::test]
     async fn test_queue_next_batch_dropped_receiver() {
-        let queue = Queue::new(false, 1, false, None, 0, 16);
+        let queue = Queue::new(false, 1, false, None, 0, 16, false);
Should we add a few tests with chunking?
    cache_length <= prompt_length
), f"Prefix {cache_length} vs input {prompt_length}"
if cache_length == prompt_length:
    assert False, "unreachable"
I'd put a better error message here in case it actually happens.
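For illustration, here is one way the suggestion could look. This is only a sketch, not the PR's code. The helper name `check_cache_length` is hypothetical, and the explanation in the second message (that nothing would be left to prefill) is an assumption about why the branch is meant to be unreachable. It raises explicit exceptions rather than using `assert`, because Python strips `assert` statements when run with `-O`.

```python
def check_cache_length(cache_length: int, prompt_length: int) -> None:
    """Hypothetical helper: check the cached prefix against the prompt length."""
    # Same condition as the existing assertion: the prefix cannot exceed the prompt.
    if cache_length > prompt_length:
        raise ValueError(f"Prefix {cache_length} vs input {prompt_length}")
    # Replaces `assert False, "unreachable"` with a message that carries the values.
    # Assumption: equality means no prompt tokens would remain to prefill.
    if cache_length == prompt_length:
        raise RuntimeError(
            f"Unexpected state: cache_length ({cache_length}) equals "
            f"prompt_length ({prompt_length}); no tokens would be left to prefill"
        )
```

If the branch is ever hit, the log then shows both lengths instead of a bare "unreachable".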