[minor] typo & comments #441

casinca · 2024-11-17T20:26:01Z

Turned the line batch_size, seq_len = in_idx.shape into a comment, since we no longer need these dimensions (they were only used for the absolute positional embeddings) now that we use RoPE.
Added a # NEW marker in the GQA __init__ for the added line assert num_heads % num_kv_groups == 0.
I wanted to PR the main GPT-to-all-Llama picture in ch05/07_gpt_to_llama with my edit below, which includes weight tying for Llama 3.2, but the pictures come from your own website, so I thought it would be best if you took care of it (to keep a single source).

- safe -> save
- commenting code: batch_size, seq_len = in_idx.shape

- adding # NEW for assert num_heads % num_kv_groups == 0
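
For context on why the assertion matters: in grouped-query attention, several query heads share one key/value group, so the number of heads has to divide evenly into the groups. Below is a minimal sketch of that check in plain Python. The helper names kv_group_size and kv_group_for_head are hypothetical and do not appear in the notebook, and the head-to-group mapping assumes consecutive heads share a group.

```python
def kv_group_size(num_heads: int, num_kv_groups: int) -> int:
    """Return how many query heads share one key/value group.

    Hypothetical helper for illustration; not part of the notebook.
    """
    # Same divisibility check as the line marked "# NEW" in this PR
    assert num_heads % num_kv_groups == 0, (
        "num_heads must be divisible by num_kv_groups"
    )
    return num_heads // num_kv_groups


def kv_group_for_head(head_idx: int, num_heads: int, num_kv_groups: int) -> int:
    """Map a query head index to its key/value group.

    Assumption: consecutive query heads share a group.
    """
    return head_idx // kv_group_size(num_heads, num_kv_groups)


if __name__ == "__main__":
    # 32 query heads over 8 key/value groups: 4 heads per group
    print(kv_group_size(32, 8))
    # Query head 5 falls into key/value group 1 under the assumption above
    print(kv_group_for_head(5, 32, 8))
```

Without the assertion, a configuration such as 32 heads and 5 groups would silently floor the group size and leave some heads without a matching key/value group.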

review-notebook-app · 2024-11-17T20:26:05Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

rasbt

This looks good, thanks for the fix. I also re-uploaded the updated figure (it may take a while until it's reflected because GitHub caches images for a couple of hours).

casinca added 2 commits November 17, 2024 19:57

typo & comment

4925790

- safe -> save
- commenting code: batch_size, seq_len = in_idx.shape

comment

7292733

- adding # NEW for assert num_heads % num_kv_groups == 0

update memory wording

5860057

rasbt approved these changes Nov 18, 2024

rasbt merged commit bb31de8 into rasbt:main Nov 18, 2024
8 checks passed