
[WIP] Support logprob calculation for loglikelihood approach #69

Closed
wants to merge 10 commits

Conversation

@vvchernov commented Nov 17, 2023

The loglikelihood approach is needed to check a model's accuracy on popular datasets and tasks such as MMLU and BigBench. In particular, the HuggingFace leaderboard is based exclusively on such tasks.

Notes:

  1. This branch is based on the working branch of Enable Logprobs in MLC Batch Serving #82; that PR should be merged before this one.
  2. Loglikelihood requires the full sequence of logits from a prefill-like inference pass, but in mlc-llm the last set of logits is split off from the rest on the model topology side. I have added all logits to the output tuple on the llama topology side for the multi-batch implementation (a sketch of why the full sequence is needed follows this list).
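Below is a minimal sketch, not code from this PR, of how the full prefill logits would be consumed for a loglikelihood score; the function name and tensor shapes are illustrative assumptions, written in PyTorch only for clarity.

```python
import torch
import torch.nn.functional as F

# Hypothetical helper: turns the full (seq_len, vocab) logits from a prefill pass
# into the summed logprob of the prompt tokens. The last-position logits alone,
# as currently returned by the model, would not be enough for this.
def prompt_loglikelihood(logits: torch.Tensor, token_ids: torch.Tensor) -> float:
    """logits: (seq_len, vocab_size); token_ids: (seq_len,) prompt token ids."""
    log_probs = F.log_softmax(logits[:-1], dim=-1)      # predictions for positions 1..seq_len-1
    targets = token_ids[1:].unsqueeze(-1)               # the tokens actually observed next
    token_log_probs = log_probs.gather(-1, targets).squeeze(-1)
    return token_log_probs.sum().item()                 # sum of log P(token[t] | tokens[<t])
```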

@masahi (Member) commented Nov 17, 2023

Why do we need this for batch-serving? For batched inference, we use PyTorch for sampling. So adding logprob support is easy.
https://github.com/octoml/mlc-llm/blob/batch-serving/serve/mlc_serve/model/paged_cache_model.py#L246-L319
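For context, a rough sketch of what this describes, assuming a PyTorch-based sampler like the one linked above; the function and variable names here are illustrative and not taken from paged_cache_model.py:

```python
import torch
import torch.nn.functional as F

def sample_with_logprobs(logits: torch.Tensor, temperature: float = 1.0):
    """logits: (batch, vocab) for the next-token position of each sequence."""
    probs = F.softmax(logits / temperature, dim=-1)
    next_tokens = torch.multinomial(probs, num_samples=1)     # (batch, 1) sampled token ids
    log_probs = F.log_softmax(logits, dim=-1)
    token_logprobs = log_probs.gather(-1, next_tokens)        # logprob of each sampled token
    return next_tokens.squeeze(-1), token_logprobs.squeeze(-1)
```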

@vvchernov (Author) commented:
Hello @masahi! Thank you for the reference. The main idea is to support accuracy benchmarking of octoml endpoints on tasks (like MMLU, HellaSwag) with the loglikelihood approach. Unfortunately I'm not familiar with the serve implementation and made the mistake of implementing this on the "old" part of mlc-llm. I plan to use this PR to do it on the serve side. Possibly the logprobs calculation has already been done, or can easily be done, but it also needs some high-level API for the request and response of logprobs.

@vvchernov force-pushed the vc/serve-logprob branch 5 times, most recently from 7dbdbc4 to 8b20bb9 on December 25, 2023 17:18
@vvchernov force-pushed the vc/serve-logprob branch 2 times, most recently from 86d63fa to 26692df on January 9, 2024 11:26
@sunggg (Member) commented Jan 11, 2024

Since we have #82, do we still need this PR?

@vvchernov (Author) commented Jan 11, 2024

Hello @sunggg! Yes, of course. The functionality from #82 allows getting logprob info for newly generated tokens. This PR allows getting logprobs for all tokens of the input prompt (the prefill step), which are used for the loglikelihood calculation. In particular, I modify the Relax model here because it cuts off only the last set of logits, while I need all of them.
I'm helping with #82 because I want it to be merged first. My branch is based on the branch from that PR and contains its code, but I will rebase once that PR is merged.
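To illustrate the difference, here is a rough sketch of how prompt-token logprobs feed a loglikelihood benchmark such as MMLU; it uses a generic HuggingFace-style causal LM as a stand-in, and all names are illustrative rather than taken from this PR or mlc-serve:

```python
import torch
import torch.nn.functional as F

def score_choice(model, tokenizer, question: str, choice: str) -> float:
    """Sum the logprobs of the candidate answer's tokens given the question."""
    prompt_ids = tokenizer(question, return_tensors="pt").input_ids
    full_ids = tokenizer(question + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits[0]               # (seq_len, vocab) over the whole sequence
    log_probs = F.log_softmax(logits[:-1], dim=-1)
    targets = full_ids[0, 1:].unsqueeze(-1)
    token_lp = log_probs.gather(-1, targets).squeeze(-1)
    n_prompt = prompt_ids.shape[1]
    return token_lp[n_prompt - 1:].sum().item()          # only the answer tokens contribute

# The highest-scoring candidate is taken as the model's answer:
# best = max(choices, key=lambda c: score_choice(model, tokenizer, question, c))
```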

@vvchernov changed the title from "Support logprob calculation for loglikelihood approach" to "[WIP] Support logprob calculation for loglikelihood approach" on Jan 15, 2024
@vvchernov marked this pull request as ready for review on January 15, 2024 07:07
@vvchernov force-pushed the vc/serve-logprob branch 3 times, most recently from e3bff68 to 5a4bde6 on January 15, 2024 16:51
@vvchernov (Author) commented:
Closing due to the transfer to octoml/mlc-serve/pull/56.

@vvchernov closed this on Feb 28, 2024
@vvchernov deleted the vc/serve-logprob branch on February 28, 2024 09:47