
Fix for possibly incorrect mean pooling #102

Merged: 1 commit, Oct 19, 2024
Conversation

ir2718 (Contributor) commented on Oct 17, 2024

Hi,

This is the fix for issue #101 (in case the current behavior truly is incorrect). To the best of my knowledge, the current code sums over the whole attention mask tensor, instead of keeping dimension zero (the batch dimension) intact and summing only over dimension one (the sequence dimension).
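For illustration, here is a minimal sketch of masked mean pooling that contrasts the two behaviors described above. The helper name `mean_pooling` and its signature are hypothetical, not the repository's actual code; the sketch only assumes standard Hugging Face model outputs (`last_hidden_state` of shape `(batch, seq_len, hidden)` and an `attention_mask` of shape `(batch, seq_len)`).

```python
import torch

def mean_pooling(last_hidden_state: torch.Tensor,
                 attention_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical masked mean pooling over the sequence dimension.

    last_hidden_state: (batch, seq_len, hidden)
    attention_mask:    (batch, seq_len)
    """
    # Expand the mask so padded positions contribute zero to the numerator.
    mask = attention_mask.unsqueeze(-1).float()            # (batch, seq_len, 1)
    summed = (last_hidden_state * mask).sum(dim=1)         # (batch, hidden)

    # Buggy variant described in the issue: attention_mask.sum() collapses the
    # whole tensor to a single scalar, mixing token counts across the batch.
    # Correct variant: sum only over the sequence dimension (dim=1), keeping the
    # batch dimension intact; keepdim=True makes broadcasting work per example.
    counts = attention_mask.sum(dim=1, keepdim=True).clamp(min=1e-9)  # (batch, 1)
    return summed / counts
```

With the scalar denominator, every example in the batch is divided by the total token count of the whole batch; with the per-example denominator, each sentence embedding is averaged over its own non-padding tokens only.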

SeanLee97 (Owner) commented on Oct 19, 2024

@ir2718 Thank you very much for your PR!

After reviewing the implementation, I can confirm you are absolutely correct!
I overlooked summing the attention_mask along the sequence dimension with keepdim=True, so a global sum was used instead.

SeanLee97 merged commit 9acf28f into SeanLee97:main on Oct 19, 2024