
Eval harness batch decoding attention mask #1237

Closed

iankur opened this issue Jul 28, 2024 · 4 comments
Comments

@iankur

iankur commented Jul 28, 2024

Should we be returning a proper attention mask here, which will be required for batch decoding?

return x, torch.ones_like(x)  # return 'mask' b/c it's expected by the harness
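For reference, a padding-aware mask (rather than all ones) could be built directly from the token ids. This is only a sketch; `build_padding_mask` and `pad_id` are hypothetical names for illustration, not part of torchtune or the harness:

```python
import torch

def build_padding_mask(x: torch.Tensor, pad_id: int) -> torch.Tensor:
    """Return a boolean attention mask that is False on (left-)padded positions.

    Hypothetical helper for illustration: `x` is a [batch, seq_len] tensor of
    token ids and `pad_id` is the tokenizer's padding id.
    """
    return x != pad_id

# Example: two sequences left-padded to the same length with pad_id=0.
x = torch.tensor([[0, 0, 5, 6, 7],
                  [3, 4, 5, 6, 7]])
mask = build_padding_mask(x, pad_id=0)
# tensor([[False, False,  True,  True,  True],
#         [ True,  True,  True,  True,  True]])
```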
joecummings added the discussion label Jul 29, 2024
joecummings self-assigned this Jul 29, 2024
@joecummings
Contributor

Hi @iankur - good question! We integrate with Eleuther for our evaluation, so we have to fit certain API contracts. Eleuther has a pretty strong integration with Hugging Face transformers, and therefore most of their APIs are fit to the parameters that their models expect.

In this case, Hugging Face models, and therefore Eleuther, expect an attention mask. However, since we actually handle calling the model forward ourselves, we don't need this mask. It would definitely be best practice, though :)

@iankur
Author

iankur commented Jul 30, 2024

I could not find where we mask the left padding (done here) in the case of batch decoding. If we are masking, could you point to where it is being done? If we are indeed not masking, then shouldn't we? Also, I see lm-eval numbers improve by 0.5-1% on the old Open LLM Leaderboard tasks if I change the batch size from 4 to 1.
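To make the concern concrete, here is a generic illustration of left padding a batch (not the actual code at the linked location); with `torch.ones_like(x)` as the mask, the pad tokens in the shorter sequence are treated as real tokens:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Illustration of left-padding a batch of variable-length prompts
# (hypothetical pad_id; not the harness's actual padding code).
prompts = [torch.tensor([3, 4]), torch.tensor([5, 6, 7, 8])]
pad_id = 0
# pad_sequence right-pads, so flip each prompt, pad, then flip back to left-pad.
x = pad_sequence([p.flip(0) for p in prompts],
                 batch_first=True, padding_value=pad_id).flip(1)
# tensor([[0, 0, 3, 4],
#         [5, 6, 7, 8]])
# With torch.ones_like(x) as the mask, the two pad tokens in row 0 are
# attended to like real tokens, which is the behaviour being questioned here.
```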

@joecummings
Contributor

joecummings commented Aug 2, 2024

@iankur On a closer look at our causal generation mask, I see that it is indeed broken for bsz > 1. I will work on fixing this, but as it's not a straightforward change, I've called this out in our Eleuther script and filed issue #1250.
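For anyone following along, a fix would need to combine the causal mask with a per-sequence padding mask. The sketch below only illustrates the shape of such a mask, with hypothetical names; it is not the change tracked in #1250:

```python
import torch

def causal_padding_mask(x: torch.Tensor, pad_id: int) -> torch.Tensor:
    """Combine a causal mask with a (left-)padding mask for a batch.

    Illustrative sketch only: `x` is [bsz, seq_len]; the result is a boolean
    [bsz, seq_len, seq_len] mask where True means "may attend".
    """
    bsz, seq_len = x.shape
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device))
    padding = x != pad_id  # [bsz, seq_len], False on pad tokens
    # A query position may attend to a key position only if the pair is
    # causal-valid and the key position is not padding.
    return causal.unsqueeze(0) & padding.unsqueeze(1)

# Example: bsz=2, seq_len=4, first sequence left-padded with pad_id=0.
x = torch.tensor([[0, 0, 3, 4],
                  [5, 6, 7, 8]])
mask = causal_padding_mask(x, pad_id=0)
print(mask.shape)  # torch.Size([2, 4, 4])
```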

@joecummings
Contributor

Closing this to track everything on #1250, but feel free to re-open if you feel this doesn't adequately address your concerns.
