Closed
Description
Hi, wonderful work!
I want to know if there is an easy way to obtain the logits, since sometimes I only need to calculate the perplexity / language-modeling loss of a specific sequence.
I saw the code here: https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/llama.py#L211-L235
So I want to know: if I directly use the logits produced by lm_head, do I still benefit from the paged-attention framework? Thanks very much!
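For reference, once per-position logits are available (however they are extracted from the model), perplexity follows from a standard cross-entropy computation. Below is a minimal NumPy sketch, independent of vLLM's API; the helper name `sequence_perplexity` and its argument layout are my own assumptions, not part of vLLM:

```python
import numpy as np

def sequence_perplexity(logits, token_ids):
    """Perplexity of a token sequence given next-token logits.

    logits:    (seq_len, vocab_size) array; logits[i] predicts token_ids[i + 1]
    token_ids: sequence of seq_len + 1 token ids

    (Hypothetical helper for illustration, not a vLLM API.)
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable log-softmax over the vocabulary dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Log-probability the model assigned to each actual next token.
    targets = np.asarray(token_ids[1:])
    token_log_probs = log_probs[np.arange(len(targets)), targets]
    # Perplexity = exp(mean negative log-likelihood).
    return float(np.exp(-token_log_probs.mean()))

# Sanity check: uniform logits over a vocab of size 5 give perplexity 5.
uniform = sequence_perplexity(np.zeros((3, 5)), [0, 1, 2, 3])
```

With uniform logits the model is maximally uncertain, so the perplexity equals the vocabulary size, which makes this an easy correctness check.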