{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":783833344,"defaultBranch":"master","name":"llm.c","ownerLogin":"karpathy","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2024-04-08T16:58:11.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/241138?v=4","public":true,"private":false,"isOrgOwned":false},"refInfo":{"name":"","listCacheKey":"v0:1726256347.0","currentOid":""},"activityList":{"items":[{"before":"508c474bf9646f0929d78ed357adec104400b610","after":"685617f1646c60ea851619167eae1eb756ec22ac","ref":"refs/heads/llama3","pushedAt":"2024-09-17T21:31:21.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"karpathy","name":"Andrej","path":"/karpathy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/241138?s=80&v=4"},"commit":{"message":"make fp32 path in .py code work correctly","shortMessageHtmlLink":"make fp32 path in .py code work correctly"}},{"before":"234de31fdf8306bf8cfcb1f550b4587e16fa4218","after":"508c474bf9646f0929d78ed357adec104400b610","ref":"refs/heads/llama3","pushedAt":"2024-09-17T21:20:00.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"karpathy","name":"Andrej","path":"/karpathy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/241138?s=80&v=4"},"commit":{"message":"move debugging into fp32, so python has to write the fp32 version, and then we are focusing on the non-cudnn path at first. we're currently right after the first rmsnorm. the encoding right before this matched EXACTLY. but right now, after the first rmsnorm there is already an error of 1e-3 or so, which is highly suspicious so we are looking into it.","shortMessageHtmlLink":"move debugging into fp32, so python has to write the fp32 version, an…"}},{"before":"72e6f1ab0b83ab252639949b4880568460a673fa","after":"234de31fdf8306bf8cfcb1f550b4587e16fa4218","ref":"refs/heads/llama3","pushedAt":"2024-09-16T21:43:18.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"karpathy","name":"Andrej","path":"/karpathy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/241138?s=80&v=4"},"commit":{"message":"introduce rmsnorm, unfused, forward","shortMessageHtmlLink":"introduce rmsnorm, unfused, forward"}},{"before":"77e1d7afda1aaba823c10da65a5034644e5971b8","after":"72e6f1ab0b83ab252639949b4880568460a673fa","ref":"refs/heads/llama3","pushedAt":"2024-09-16T21:03:05.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"karpathy","name":"Andrej","path":"/karpathy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/241138?s=80&v=4"},"commit":{"message":"add new Encoder that does not use positional embeddings, like in llama 3. The activations match after encoding. 
onwards","shortMessageHtmlLink":"add new Encoder that does not use positional embeddings, like in llam…"}},{"before":"88663086fb23fe0bef5b773ec3c2ba8ac9031bf8","after":"77e1d7afda1aaba823c10da65a5034644e5971b8","ref":"refs/heads/llama3","pushedAt":"2024-09-16T19:44:36.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"karpathy","name":"Andrej","path":"/karpathy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/241138?s=80&v=4"},"commit":{"message":"add support for dataloader to serve uint32_t tokens, as necessary in Llama 3","shortMessageHtmlLink":"add support for dataloader to serve uint32_t tokens, as necessary in …"}},{"before":"b883560d264a173f6853d8c4097d512be3d51165","after":"88663086fb23fe0bef5b773ec3c2ba8ac9031bf8","ref":"refs/heads/llama3","pushedAt":"2024-09-16T17:45:44.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"karpathy","name":"Andrej","path":"/karpathy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/241138?s=80&v=4"},"commit":{"message":"adapt the sizes of all the parameter tensors and load them from file. so now we are loading all the Llama 3 weights. I verified that the sizes of all the tensors agree with python, and the total number of parameters","shortMessageHtmlLink":"adapt the sizes of all the parameter tensors and load them from file.…"}},{"before":"01bc4c685a78c52b6aa5c5c12b5482d26a9d1e51","after":"b883560d264a173f6853d8c4097d512be3d51165","ref":"refs/heads/llama3","pushedAt":"2024-09-13T22:47:32.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"karpathy","name":"Andrej","path":"/karpathy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/241138?s=80&v=4"},"commit":{"message":"change the export code of Llama 3 to be very GPT-2 friendly, using a combination of 3 hacks. this will make it so that we have to change very little code on the C side","shortMessageHtmlLink":"change the export code of Llama 3 to be very GPT-2 friendly, using a …"}},{"before":"09b47a747d16cd8cb59f75758daa8c5643c93a04","after":"01bc4c685a78c52b6aa5c5c12b5482d26a9d1e51","ref":"refs/heads/llama3","pushedAt":"2024-09-13T20:44:14.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"karpathy","name":"Andrej","path":"/karpathy","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/241138?s=80&v=4"},"commit":{"message":"first set of changes to match up the .py and the .cu version. default hyperparameters, introduce int+float section of header, read the header and EXIT for now","shortMessageHtmlLink":"first set of changes to match up the .py and the .cu version. 
- 2024-09-13 · created branch llama3: "llama3 starting point is at gpt-2 exact copy paste for both train/test files"
- 2024-09-13 · push to master: "change default params: use tinyshakespeare and decrease LR"
- 2024-08-26 · merged PR #724 from GaoYusong/master into master (2 commits): "add llm.cpp (a port of this project featuring a tinytorch.hpp library) link to notable forks in readme"
- 2024-08-26 · merged PR #733 from zhangpiu/feature/llm.cpp into master (4 commits): "Add llm.cpp (a port of this project using Eigen library, supporting CPU/CUDA), link to notable forks in readme"
- 2024-08-26 · merged PR #735 from gordicaleksa/minor_refactor3 into master (5 commits): "Minor LLaMA 3 refactor"
- 2024-08-16 · merged PR #744 from dengl11/pr into master (2 commits): "fix a typo"
- 2024-08-16 · merged PR #745 from karpathy/feature/managed2 into master (8 commits)
- 2024-08-16 · push to feature/managed2: "i misspelled reduced"
- 2024-08-16 · push to feature/managed2: "reduce across GPUs nicer"
- 2024-08-16 · created branch feature/managed2: "fallback to memory allocation of m,v,master_weights on host automatically in case of OOM. will run slower but won't OOM"
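The feature/managed2 entries above describe falling back to host memory for the optimizer state (m, v, master weights) when device allocation fails. The branch name suggests CUDA managed memory is involved; the sketch below shows one plausible way to implement such a fallback with the CUDA runtime API (try cudaMalloc, fall back to cudaMallocManaged with a host preference). The function name and policy details are assumptions, not the branch's actual code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Sketch: allocate on the device if possible; on out-of-memory, fall back to
// managed (unified) memory preferred to reside on the host. Slower, but avoids OOM.
void* malloc_device_with_host_fallback(size_t num_bytes, int* on_device) {
    void* ptr = NULL;
    cudaError_t err = cudaMalloc(&ptr, num_bytes);
    if (err == cudaSuccess) {
        *on_device = 1;
        return ptr;
    }
    if (err == cudaErrorMemoryAllocation) {
        cudaGetLastError();  // clear the recorded error before retrying
        printf("WARNING: device OOM for %zu bytes, falling back to host-resident memory\n", num_bytes);
        if (cudaMallocManaged(&ptr, num_bytes, cudaMemAttachGlobal) == cudaSuccess) {
            // hint the driver to keep these pages on the CPU
            cudaMemAdvise(ptr, num_bytes, cudaMemAdviseSetPreferredLocation, cudaCpuDeviceId);
            *on_device = 0;
            return ptr;
        }
    }
    fprintf(stderr, "allocation of %zu bytes failed: %s\n", num_bytes, cudaGetErrorString(err));
    exit(EXIT_FAILURE);
}
```

The commit message itself states the trade-off: training "will run slower but won't OOM", since the optimizer state then has to be paged in from host memory when kernels touch it.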
- 2024-08-13 · merged PR #740 from karpathy/gordicaleksa-fix_dataloader2 into master (14 commits): "Gordicaleksa fix dataloader2"
- 2024-08-13 · push to gordicaleksa-fix_dataloader2: "fix tokenizer omg"
- 2024-08-13 · created branch gordicaleksa-fix_dataloader2: "attempt to fix PR"
- 2024-08-12 · merged PR #738 from ademeure/faster_compile into master (3 commits): "Improve compile time (simple makefile changes)"
- 2024-08-08 · merged PR #725 from gordicaleksa/llama into master (37 commits): "Add LLaMA 3 Python support"
- 2024-08-04 · created branch feature/finetune_llama31py: "still wip just putting things up for comment"
- 2024-07-30 · merged PR #705 from gordicaleksa/refactor_c into master (2 commits): "Refactor C code"
- 2024-07-30 · merged PR #717 from ngc92/nvml into master (7 commits): "Nvidia management library for more detailed GPU state printing"
- 2024-07-30 · push to master (5 commits): "Merge branch 'master' of github.com:karpathy/llm.c"
- 2024-07-30 · merged PR #715 from karpathy/feature/restore_from_master into master (11 commits): "Feature/restore from master"
- 2024-07-30 · push to feature/restore_from_master: "remove confusing comment"
- 2024-07-29 · push to feature/restore_from_master: "bring back state allocation into build_from_checkpoint"
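PR #717 above brings in NVML (the NVIDIA Management Library) for more detailed GPU state printing. As a standalone illustration of the kind of query NVML enables (not the PR's actual code), here is a minimal program that reads temperature, power draw, and memory use for GPU 0; link with -lnvidia-ml.

```c
#include <stdio.h>
#include <nvml.h>

// Minimal NVML usage: query temperature, power draw, and memory use of GPU 0.
// Build (assumed paths): gcc nvml_demo.c -lnvidia-ml
int main(void) {
    if (nvmlInit_v2() != NVML_SUCCESS) { fprintf(stderr, "nvmlInit failed\n"); return 1; }
    nvmlDevice_t device;
    nvmlDeviceGetHandleByIndex_v2(0, &device);

    unsigned int temp = 0, power_mw = 0;
    nvmlMemory_t mem;
    nvmlDeviceGetTemperature(device, NVML_TEMPERATURE_GPU, &temp);
    nvmlDeviceGetPowerUsage(device, &power_mw);  // reported in milliwatts
    nvmlDeviceGetMemoryInfo(device, &mem);       // reported in bytes

    printf("temp: %u C | power: %.1f W | mem used: %.1f GiB\n",
           temp, power_mw / 1000.0, mem.used / (1024.0 * 1024.0 * 1024.0));
    nvmlShutdown();
    return 0;
}
```

Readouts like these sit naturally next to the usual loss and throughput logging and help spot thermal throttling or power-capped GPUs during long training runs.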