Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[minor] typo & comments #441

Merged
merged 3 commits into from
Nov 18, 2024
Merged

[minor] typo & comments #441

merged 3 commits into from
Nov 18, 2024

Conversation

casinca
Copy link
Contributor

@casinca casinca commented Nov 17, 2024

  • put line of code batch_size, seq_len = in_idx.shape as comment since we don't need these dims anymore (for absolute pos embs) now that we use RoPE

  • Added a # NEW in GQA __init__ for the added line assert num_heads % num_kv_groups == 0

  • I wanted to PR the GPT-to-all-Llama main picture in ch05/07_gpt_to_llama with my edit below, which include weights tying for Llama 3.2 but pictures are coming from your own website, so i thought it would be best you're the one taking care of it (having the same source)

68747470733a2f2f73656261737469616e72617363686b612e636f6d2f696d616765732f4c4c4d732d66726f6d2d736372617463682d696d616765732f626f6e75732f6770742d746f2d6c6c616d612f6c6c616d6133312d746f2d6c6c616d6133322e776562703f31

- safe -> save
- commenting code: batch_size, seq_len = in_idx.shape
- adding # NEW for assert num_heads % num_kv_groups == 0
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Owner

@rasbt rasbt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, thanks for the fix. I also re-uploaded the updated figure (it may take a while until it's reflected because GitHub buffers images for a couple of hours)

@rasbt rasbt merged commit bb31de8 into rasbt:main Nov 18, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants