From 3d78d3acc98478476a3fecf6cb2f6da357dda380 Mon Sep 17 00:00:00 2001
From: Zhongdong Yang
Date: Thu, 7 Sep 2023 17:37:16 +0800
Subject: [PATCH] Add: zh/codellama.md (#1465)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* update soc3-zn
* Update _blog.yml
  Try to resolve conflicts
* Update: proofreading zh/ethics-soc-3.md
* add how-to-generate cn version
  Signed-off-by: Yao, Matrix
* unity game in hf space translation completed
* Update: punctuations of how-to-generate.md
* hf-bitsandbytes-integration cn done
  Signed-off-by: Yao, Matrix
* Proofread hf-bitsandbytes-integration.md
* Proofread: red-teaming.md
* Update: add red-teaming to zh/_blog.yml
* Update _blog.yml
* Update: add red-teaming to zh/_blog.yml
  Fix: red-teaming title in zh/_blog.yml
* Fix: red-teaming PPLM translation
* deep-learning-with-proteins cn done
  Signed-off-by: Yao, Matrix
* Add: stackllama.md
* if blog translation completed
* Update unity-in-spaces.md
  Add a link for AI game
* Update if.md
  Fix “普罗大众” to “普惠大众”
* deep-learning-with-proteins cn done
  Signed-off-by: Yao, Matrix
* add starcoder cn
  Signed-off-by: Yao, Matrix
  Update: formatting and punctuations of starcoder.md
* add starcoder cn
  Signed-off-by: Yao, Matrix
* Update: proofreading zh/unity-in-spaces.md
* fix(annotated-diffusion.md): fix image shape desc in PIL and Tensor (#1080)
  Modify the comment after ToTensor with the correct image shape CHW
* Add text-to-video blog (#1058)
  Adds an overview of text-to-video generative models, task-specific challenges, datasets, and more.
  Co-authored-by: Omar Sanseviero
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fix broken link in text-to-video.md (#1083)
* Update: proofreading zh/unity-in-spaces.md
  Fix: incorrect _blog.yml format
* Update: proofreading zh/deep-learning-with-proteins.md
* update ethics-diffusers-cn (#6)
  * update ethics-diffusers
  * update ethics-diffusers
  ---------
  Co-authored-by: Zhongdong Yang
* Update: proofreading zh/ethics-diffusers.md
* 1. introducing-csearch done (#11)
  2. text-to-video done
  Signed-off-by: Yao, Matrix
* Update: proofread zh/text-to-video.md
* Update: proofreading zh/introducing-csearch.md
* generative-ai-models-on-intel-cpu cn done (#13)
  Signed-off-by: Yao, Matrix
  Update: proofread zh/generative-ai-models-on-intel-cpu.md
  Signed-off-by: Yang, Zhongdong
* add starchat-alpha zh translation (#10)
* Preparing blogpost announcing `safetensors` security audit + official support. (#1096)
  * Preparing blogpost announcing `safetensors` security audit + official support.
  * Taking into account comments + Grammarly.
  * Update safetensors-official.md
  * Apply suggestions from code review
    Co-authored-by: Omar Sanseviero
  * Update safetensors-official.md
  * Apply suggestions from code review
    Co-authored-by: Luc Georges
  * Apply suggestions from code review
    Co-authored-by: Luc Georges
  * Apply suggestions from code review
  * Update safetensors-official.md
    Co-authored-by: Luc Georges
  * Apply suggestions from code review
  * Adding thumbnail.
  * Include changes from Stella.
  * Update safetensors-official.md
  * Update with Stella's comments.
  * Remove problematic sentence.
  * Rename + some rephrasing.
  * Apply suggestions from code review
    Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>
  * Apply suggestions from code review
    Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>
  * Update safetensors-security-audit.md
    Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>
  * Last fixes.
  * Apply suggestions from code review
    Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>
  ---------
  Co-authored-by: Omar Sanseviero
  Co-authored-by: Luc Georges
  Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>
* Hotfixing safetensors. (#1131)
* Removing the checklist, formatting is busted. (#1132)
* Update safetensors-security-audit.md (#1134)
* [time series transformers] update dataloader API (#1135)
  * update dataloader API
  * revert comment
  * add back Cached transform
* New post: Hugging Face and IBM (#1130)
  * Initial version
  * Minor fixes
  * Update huggingface-and-ibm.md
    Co-authored-by: Pedro Cuenca
  * Update huggingface-and-ibm.md
    Co-authored-by: Pedro Cuenca
  * Resize image
  * Update blog index
  ---------
  Co-authored-by: Julien Simon
  Co-authored-by: Pedro Cuenca
* Show authors of safetensors blog post (#1137)
* Update: proofread zh/starchat-alpha.md
* add megatron-training & assisted-generation (#8)
  * add megatron-training
  * add megatron-training
  * add megatron-training
  * add megatron-training
  * add assisted-generation
  * add assisted-generation
  * add assisted-generation
* Update: proofreading zh/assisted-generation
* Update: proofread zh/megatron-training.md
* rwkv model blog translation completed (#12)
  * rwkv model blog translation completed
  * add 3 additional parts in the blog tail
* Update: proofread zh/rwkv.md
* Fix: missing subtitle/notes for image references.
* encoder-decoder cn done (#14)
  Signed-off-by: Yao, Matrix
  Co-authored-by: Zhongdong Yang
* Update: proofread zh/encoder-decoder.md
* constrained-beam-search cn done (#15)
  Signed-off-by: Yao, Matrix
  Update: proofread zh/constrained-beam-search.md
* Update: zh/unity-api.md + zh/unity-asr.md
  * unity ai speech recognition blog translation completed
  * add (GameObject) to attach its Chinese translation
  * finish unity-api translation
  * add unity series entry to zh/_blog.yml
* Update: proofread zh/unity-{api,asr}.md
* Update zh/falcon.md
  Signed-off-by: Yao, Matrix
  Update: zh/falcon.md
* instruction-tuning-sd cn done (#21)
  Signed-off-by: Yao, Matrix
* Update: zh/instruction-tuning-sd.md
* fine-tune-whisper cn done (#23)
  Signed-off-by: Yao, Matrix
* Update: zh/fine-tune-whisper.md
* add mms_adapters and policy (#22)
  Update: zh/policy-ntia-rfc.md
* Update: refine zh/mms_adapters.md
  Update: remove incomplete file
* Update: zh/llm-leaderboard.md, zh/autoformer.md
  * add llm-leaderboard CN translation
  * add CN translation for autoformer
* Update: proofreading zh/autoformer.md
* BridgeTower blog post (#1118)
* Update BridgeTower blog post (#1277)
* LLM Eval: minor typos and nits (#1263)
* Fix anchor link to custom pipeline section. (#485)
* Update: zh/llm-leaderboard.md, zh/autoformer.md
  * add llm-leaderboard CN translation
  * add CN translation for autoformer
  Update: proofreading zh/autoformer.md
  Update: proofreading zh/llm-leaderboard.md
* Update: proofreading zh/ethics-soc-4.md
* Update "How to deploy LLM" blog post to use `huggingface_hub` in example (#1290)
  * Use InferenceClient from huggingface_hub
  * Update inference-endpoints-llm.md
    Co-authored-by: Pedro Cuenca
  ---------
  Co-authored-by: Pedro Cuenca
* Update BridgeTower blog post (#1295)
* Removed duplicate numbering (#1171)
* Update: zh/evaluating-mmlu-leaderboard.md
  Signed-off-by: Yao, Matrix
  Co-authored-by: Zhongdong Yang
  Update: proofreading zh/evaluating-mmlu-leaderboard.md
* Translate train-optimize-sd-intel.md to zh (#16)
  * Translate "stackllama" into Chinese
  * Create train-optimize-sd-intel.md
    Add new
  Update: zh/train-optimize-sd-intel.md
* Update: zh/dedup.md & zh/stable-diffusion-finetuning-intel.md
  * dedup cn done
    Signed-off-by: Yao, Matrix
  * stable-diffusion-finetuning-intel cn done
    Signed-off-by: Yao, Matrix
  ---------
  Signed-off-by: Yao, Matrix
  Update: proofread zh/stable-diffusion-finetuning-intel.md
* Update: proofread zh/dedup.md
* Update: zh/inference-endpoints-llm.md
  Co-authored-by: Zhongdong Yang
  Update: proofread zh/inference-endpoints-llm.md
* Update: zh/llama2.md
  Signed-off-by: Yao, Matrix
  Proofread: zh/llama2.md
* Update: zh/diffusers-turns-1.md
  Proofread: zh/diffusers-turns-1.md
* Fix: zh/diffusers-turns-1.md wrong metadata format
* Policy blog: Open ML Considerations in the EU AI Act (#1342)
  * Create .gitignore
  * Add files via upload
  * Create eu-ai-act-oss.md
  * Delete .gitignore
  * Update eu-ai-act-oss.md
  * Update eu-ai-act-oss.md
  * Update eu-ai-act-oss.md
  * Update _blog.yml
  * Update eu-ai-act-oss.md
* Update: zh/game-jam-first-edition-results.md
  Update: zh/game-jam-first-edition-results.md
* Add: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md
* 3 Gaudi posts cn done:
  - bridgetower.md
  - getting-started-habana.md
  - habana-gaudi-2-benchmark.md
  Signed-off-by: Yao, Matrix
  ---------
  Signed-off-by: Yao, Matrix
  Co-authored-by: Zhongdong Yang
* Update: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md
* Add: zh/transformers-design-philosophy.md
  Signed-off-by: Yao, Matrix
  Update: proofread zh/transformers-design-philosophy.md
* Add: zh/os-llms.md
  * Translate os-llms.md
  * Update _blog.yml
  Update: proofread zh/os-llms.md
* Add: zh/dpo-trl.md
* dpo-trl cn done
  Signed-off-by: Yao, Matrix
  ---------
  Signed-off-by: Yao, Matrix
  Signed-off-by: Yang, Zhongdong
  Co-authored-by: innovation64
  Co-authored-by: Zhongdong Yang
  Co-authored-by: SuSung-boy <872414318@qq.com>
  Co-authored-by: Zhongdong Yang
  Co-authored-by: Luke Cheng <2258420+chenglu@users.noreply.github.com>
  Co-authored-by: yaoqih <40328311+yaoqih@users.noreply.github.com>
  Co-authored-by: Shiliang Chen <36809537+csl122@users.noreply.github.com>
  Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
  Co-authored-by: Omar Sanseviero
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
  Co-authored-by: 李洋 <45715979+innovation64@users.noreply.github.com>
  Co-authored-by: Hoi2022 <120370631+Hoi2022@users.noreply.github.com>
  Co-authored-by: Nicolas Patry
  Co-authored-by: Luc Georges
  Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>
  Co-authored-by: Victor Muštar
  Co-authored-by: Kashif Rasul
  Co-authored-by: Julien Simon <3436143+juliensimon@users.noreply.github.com>
  Co-authored-by: Julien Simon
  Co-authored-by: Pedro Cuenca
  Co-authored-by: gxy-gxy <57594446+gxy-gxy@users.noreply.github.com>
  Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
  Co-authored-by: Lucain
  Co-authored-by: Eswar Divi <76403422+EswarDivi@users.noreply.github.com>
  Co-authored-by: Qi Zhang <82949744+Vermillion-de@users.noreply.github.com>
* Update: proofread zh/dpo-trl.md
* Add: zh/optimizing-bark.md
  Signed-off-by: Yao, Matrix
  Update: zh/optimizing-bark.md
* Add: zh/sd_distillation.md
  * update sd_distillation cn
  * update sd_distillation cn
  * update sd_distillation cn
  * update sd_distillation cn
  * Update zh/sd_distillation.md
  * Update zh/sd_distillation.md
  ---------
  Co-authored-by: Zhongdong Yang
  Co-authored-by: Zhongdong Yang
  Update: zh/sd_distillation.md
* Add zh for deploy-deepfloydif-using-bentoml.md (#42)
* Add: zh/codellama.md & Update: zh/llama2.md
bridgetower.md - getting-started-habana.md - habana-gaudi-2-benchmark.md Signed-off-by: Yao, Matrix --------- Signed-off-by: Yao, Matrix Co-authored-by: Zhongdong Yang * Update: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * Add: zh/transformers-design-philosophy.md Signed-off-by: Yao, Matrix Update: proofread zh/transformers-design-philosophy.md * dpo-trl cn done Signed-off-by: Yao, Matrix --------- Signed-off-by: Yao, Matrix Signed-off-by: Yang, Zhongdong Co-authored-by: innovation64 Co-authored-by: Zhongdong Yang Co-authored-by: SuSung-boy <872414318@qq.com> Co-authored-by: Zhongdong Yang Co-authored-by: Luke Cheng <2258420+chenglu@users.noreply.github.com> Co-authored-by: yaoqih <40328311+yaoqih@users.noreply.github.com> Co-authored-by: Shiliang Chen <36809537+csl122@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Omar Sanseviero Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: 李洋 <45715979+innovation64@users.noreply.github.com> Co-authored-by: Hoi2022 <120370631+Hoi2022@users.noreply.github.com> Co-authored-by: Nicolas Patry Co-authored-by: Luc Georges Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> Co-authored-by: Victor Muštar Co-authored-by: Kashif Rasul Co-authored-by: Julien Simon <3436143+juliensimon@users.noreply.github.com> Co-authored-by: Julien Simon Co-authored-by: Pedro Cuenca Co-authored-by: gxy-gxy <57594446+gxy-gxy@users.noreply.github.com> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Lucain Co-authored-by: Eswar Divi <76403422+EswarDivi@users.noreply.github.com> Co-authored-by: Qi Zhang <82949744+Vermillion-de@users.noreply.github.com> * Update: proofread zh/dpo-trl.md * codellama cn done, and add llama2 added section Signed-off-by: YAO Matrix * fix issue Signed-off-by: N * refine Signed-off-by: N --------- Signed-off-by: Yao, 
Matrix Signed-off-by: Yang, Zhongdong Signed-off-by: YAO Matrix Signed-off-by: N Co-authored-by: innovation64 Co-authored-by: Zhongdong Yang Co-authored-by: SuSung-boy <872414318@qq.com> Co-authored-by: Zhongdong Yang Co-authored-by: Luke Cheng <2258420+chenglu@users.noreply.github.com> Co-authored-by: yaoqih <40328311+yaoqih@users.noreply.github.com> Co-authored-by: Shiliang Chen <36809537+csl122@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Omar Sanseviero Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: 李洋 <45715979+innovation64@users.noreply.github.com> Co-authored-by: Hoi2022 <120370631+Hoi2022@users.noreply.github.com> Co-authored-by: Nicolas Patry Co-authored-by: Luc Georges Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> Co-authored-by: Victor Muštar Co-authored-by: Kashif Rasul Co-authored-by: Julien Simon <3436143+juliensimon@users.noreply.github.com> Co-authored-by: Julien Simon Co-authored-by: Pedro Cuenca Co-authored-by: gxy-gxy <57594446+gxy-gxy@users.noreply.github.com> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Lucain Co-authored-by: Eswar Divi <76403422+EswarDivi@users.noreply.github.com> Co-authored-by: Qi Zhang <82949744+Vermillion-de@users.noreply.github.com> * Update: zh/codellama.md * Fix: add missing tags in zh/_blog.yml --------- Signed-off-by: Yao, Matrix Signed-off-by: Yang, Zhongdong Signed-off-by: YAO Matrix Signed-off-by: N Co-authored-by: innovation64 Co-authored-by: Yao, Matrix Co-authored-by: SuSung-boy <872414318@qq.com> Co-authored-by: Luke Cheng <2258420+chenglu@users.noreply.github.com> Co-authored-by: yaoqih <40328311+yaoqih@users.noreply.github.com> Co-authored-by: Shiliang Chen <36809537+csl122@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Omar Sanseviero Co-authored-by: 
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: 李洋 <45715979+innovation64@users.noreply.github.com> Co-authored-by: Yao Matrix Co-authored-by: Hoi2022 <120370631+Hoi2022@users.noreply.github.com> Co-authored-by: Nicolas Patry Co-authored-by: Luc Georges Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> Co-authored-by: Victor Muštar Co-authored-by: Kashif Rasul Co-authored-by: Julien Simon <3436143+juliensimon@users.noreply.github.com> Co-authored-by: Julien Simon Co-authored-by: Pedro Cuenca Co-authored-by: gxy-gxy <57594446+gxy-gxy@users.noreply.github.com> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Lucain Co-authored-by: Eswar Divi <76403422+EswarDivi@users.noreply.github.com> Co-authored-by: Qi Zhang <82949744+Vermillion-de@users.noreply.github.com> --- zh/_blog.yml | 11 ++ zh/codellama.md | 350 ++++++++++++++++++++++++++++++++++++++++++++++++ zh/llama2.md | 126 ++++++++++++----- 3 files changed, 451 insertions(+), 36 deletions(-) create mode 100644 zh/codellama.md diff --git a/zh/_blog.yml b/zh/_blog.yml index ea410e743c..08f53eeb4b 100644 --- a/zh/_blog.yml +++ b/zh/_blog.yml @@ -884,6 +884,17 @@ - announcement - security +- local: codellama + title: "Code Llama:Llama 2 学会写代码了!" + author: philschmid + thumbnail: /blog/assets/160_codellama/thumbnail.jpg + date: August 25, 2023 + tags: + - nlp + - community + - research + - LLM + - local: falcon-180b title: "Falcon 180B 现已登陆 Hugging Face Hub" author: philschmid diff --git a/zh/codellama.md b/zh/codellama.md new file mode 100644 index 0000000000..c67ebdd8d4 --- /dev/null +++ b/zh/codellama.md @@ -0,0 +1,350 @@ +--- +title: "Code Llama:Llama 2 学会写代码了!" 
+thumbnail: /blog/assets/160_codellama/thumbnail.jpg +authors: +- user: philschmid +- user: osanseviero +- user: pcuenq +- user: lewtun +- user: lvwerra +- user: loubnabnl +- user: ArthurZ +- user: joaogante +translators: +- user: MatrixYao +- user: zhongdongy + proofreader: true +--- + +# Code Llama: Llama 2 学会写代码了! + + + + +## 引言 + +Code Llama 是为代码类任务而生的一组最先进的、开放的 [Llama 2](https://huggingface.co/blog/zh/llama2) 模型,我们很高兴能将其集成入 Hugging Face 生态系统!Code Llama 使用与 Llama 2 相同的社区许可证,且可商用。 + +今天,我们很高兴能发布 Hugging Face 对 Code Llama 的全面支持,包括: + +- Hub 上的模型支持,包括模型卡及许可证 +- Transformers 已集成 Code Llama +- TGI 已集成 Code Llama,以支持对其进行快速高效的产品级推理 +- 推理终端 (Inference Endpoints) 已集成 Code Llama +- 对 Code Llama 的代码基准测试结果已发布 + +代码大语言模型的发展对于软件工程师来说无疑是振奋人心的,因为这意味着他们可以通过 IDE 中的代码补全功能来提高生产力,并利用其来处理重复或烦人的任务,例如为代码编写文档字符串或创建单元测试。 + +## 目录 + +- [引言](#引言) +- [目录](#目录) +- [Code Llama 简介](#code-llama-简介) +- [如何使用 Code Llama?](#如何使用-code-llama) + - [演示](#演示) + - [Transformers](#transformers) + - [代码补全](#代码补全) + - [代码填充](#代码填充) + - [对话式指令](#对话式指令) + - [4 比特加载](#4-比特加载) + - [使用 TGI 和推理终端](#使用-tgi-和推理终端) +- [评估](#评估) +- [其他资源](#其他资源) + +## Code Llama 简介 + +Code Llama 包含 3 个不同参数量的版本,分别为: 70 亿参数版、130 亿参数版以及 340 亿参数版。在训练基础模型时,先用同等参数量的 Llama 2 模型初始化权重,然后在 5000 亿词元的代码数据集上训练。 Meta 还对训得的基础模型进行了两种不同风格的微调,分别为: Python 专家版 (再加 1000 亿个额外词元) ; 以及指令微调版,其可以理解自然语言指令。 + +这些模型在 Python、C++、Java、PHP、C#、TypeScript 和 Bash 中都展现出最先进的性能。7B 和 13B 基础版和指令版支持完形填空,因此非常适合用作代码助手。 + +Code Llama 基于 16k 上下文窗口训练。此外,这三个尺寸的模型还进行了额外的长上下文微调,使其上下文窗口最多可扩展至 10 万词元。 + +受益于 RoPE 扩展方面的最新进展,将 Llama 2 的 4k 上下文窗口增加到 Code Llama 的 16k (甚至可以外插至 100k) 成为可能。社区发现可以对 Llama 的位置嵌入进行线性插值或频域插值,这使得通过微调让基础模型轻松扩展到更大的上下文窗口成为可能。在 Code Llama 中,他们把频域缩放和松弛技术二者结合起来: 微调长度是缩放后的预训练长度的一小部分。这个做法赋予了模型强大的外推能力。 + +![训练过程](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/160_codellama/training-process.jpg "训练过程") + +第一步是在 5000 亿词元的公开代码数据集上训练出一个模型。该数据集中除了有代码数据集外,还包含一些自然语言数据集,例如有关代码和代码片段的讨论,且最终数据集使用近似去重法进行了去重。不幸的是,Meta 没有披露有关该数据集的更多信息。 +
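前文提到,线性插值是把位置索引按扩展倍数整体压缩,使长序列的旋转角落回预训练时见过的范围。下面用几行 numpy 做一个极简示意(仅为演示线性插值的思路,并非 `transformers` 的实际实现):

```python
import numpy as np

def rope_angles(positions, dim=64, base=10000.0, scale=1.0):
    # RoPE:位置 p 在第 i 对维度上的旋转角为 p * base^(-2i/dim)
    # 线性插值(scale > 1)把位置索引整体除以扩展倍数
    inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)
    return np.outer(np.asarray(positions) / scale, inv_freq)

# 假设预训练上下文为 4k、目标上下文为 16k,则 scale = 4:
# 位置 8192 压缩后的旋转角与预训练范围内的位置 2048 完全一致
assert np.allclose(rope_angles([8192], scale=4.0), rope_angles([2048]))
```

经过这样的缩放后,只需再用少量长文本微调,模型就能在远超预训练长度的上下文上工作。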
+在对模型进行指令微调时,使用了两个数据集: 为 Llama 2 Chat 收集的指令微调数据集和自指令数据集。自指令数据集收集了 Llama 2 编制出的编程面试问题,然后使用 Code Llama 生成单元测试和解答,最后通过执行测试来评估解答。 + +## 如何使用 Code Llama? + +`Transformers` 从 4.33 版开始支持 Code Llama。在此之前,需要从主分支进行源代码安装才行。 + +### 演示 + +我们准备了 **[这个 Space](https://huggingface.co/spaces/codellama/codellama-playground)** 或下面的 Playground 以供大家尝试 Code Llama 模型 (130 亿参数!): + + + + +这个演示背后使用了 Hugging Face [TGI](https://github.com/huggingface/text-generation-inference),[HuggingChat](https://huggingface.co/chat) 也用了相同的技术,具体内容见下文。 + +你还可以玩玩 [这个聊天机器人](https://huggingface.co/spaces/codellama/codellama-13b-chat),或者复制一份到自己的账号下以供你使用 – 它是自含的,因此你可以随心所欲地修改代码! + +### Transformers + +从最新发布的 `transformers` 4.33 开始,你可以在 Code Llama 上应用 HF 生态系统中的所有工具,例如: + +- 训练和推理脚本和示例 +- 安全的文件格式 (`safetensors` ) +- 与 `bitsandbytes` (4 比特量化) 和 PEFT 等工具结合使用 +- 运行模型生成所需的工具及辅助代码 +- 导出模型以进行部署的机制 + +在 `transformers` 4.33 发布之前,用户需要从主分支源码安装 `transformers` 。 + +```bash +!pip install git+https://github.com/huggingface/transformers.git@main accelerate +``` + +#### 代码补全 + +我们可以使用 7B 和 13B 模型进行文本/代码补全或填充。下述代码演示了如何使用 `pipeline` 接口来进行文本补全。运行时,只需选择 GPU 即可在 Colab 的免费 GPU 上运行。 + +```python +from transformers import AutoTokenizer +import transformers +import torch + +tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf") +pipeline = transformers.pipeline( + "text-generation", + model="codellama/CodeLlama-7b-hf", + torch_dtype=torch.float16, + device_map="auto", +) + +sequences = pipeline( + 'def fibonacci(', + do_sample=True, + temperature=0.2, + top_p=0.9, + num_return_sequences=1, + eos_token_id=tokenizer.eos_token_id, + max_length=100, +) +for seq in sequences: + print(f"Result: {seq['generated_text']}") +``` + +其输出如下: + +```python +Result: def fibonacci(n): + if n == 0: + return 0 + elif n == 1: + return 1 + else: + return fibonacci(n-1) + fibonacci(n-2) + +def fibonacci_memo(n, memo={}): + if n == 0: + return 0 + elif n == 1: + return +``` + +Code Llama 虽然专精于代码理解,但其仍是一个语言模型。你仍然可以使用相同的生成策略来自动完成注释或自然语言文本。 + +#### 
代码填充 + +这是代码模型才能完成的专门任务。该模型经过训练后,可以生成与给定上下文最匹配的代码 (包括注释)。这是代码助理的典型使用场景: 要求它们根据上下文填充当前光标处的代码。 + +此任务需要使用 7B 和 13B 的 **基础** 或 **指令** 模型;34B 版及 Python 专精版均不支持此任务。 + +填充类任务需要在生成时使用与训练时相同格式的输入文本,因为训练时会使用特殊的分隔符来区分提示的不同部分。幸运的是, `transformers` 的 `CodeLlamaTokenizer` 已经帮你把这事做了,如下所示: + +```python +from transformers import AutoTokenizer, AutoModelForCausalLM +import transformers +import torch + +model_id = "codellama/CodeLlama-7b-hf" +tokenizer = AutoTokenizer.from_pretrained(model_id) +model = AutoModelForCausalLM.from_pretrained( + model_id, + torch_dtype=torch.float16 +).to("cuda") + +prompt = '''def remove_non_ascii(s: str) -> str: + """ <FILL_ME> + return result +''' + +input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to("cuda") +output = model.generate( + input_ids, + max_new_tokens=200, +) +output = output[0].to("cpu") + +filling = tokenizer.decode(output[input_ids.shape[1]:], skip_special_tokens=True) +print(prompt.replace("<FILL_ME>", filling)) +``` + +输出如下: + +```Python +def remove_non_ascii(s: str) -> str: + """ Remove non-ASCII characters from a string. + + Args: + s: The string to remove non-ASCII characters from. + + Returns: + The string with non-ASCII characters removed.
+ """ + result = "" + for c in s: + if ord(c) < 128: + result += c + return result +``` + +在底层,分词器会 [自动按 `<FILL_ME>` 分割](https://huggingface.co/docs/transformers/main/model_doc/code_llama#transformers.CodeLlamaTokenizer.fill_token) 并生成一个格式化的输入字符串,其格式与 [训练时的格式](https://github.com/facebookresearch/codellama/blob/cb51c14ec761370ba2e2bc351374a79265d0465e/llama/generation.py#L402) 相同。这样做既避免了用户自己格式化的很多麻烦,也避免了一些很难调试的陷阱,例如词元粘合 (token glueing)。 + +#### 对话式指令 + +如上所述,基础模型可用于补全和填充。Code Llama 还包含一个适用于对话场景的指令微调模型。 + +为此类任务准备输入时,我们需要一个提示模板。一个例子是我们在 [Llama 2 博文](https://huggingface.co/blog/zh/llama2#如何提示-Llama-2) 中描述的模板,如下: + +``` +<s>[INST] <<SYS>> +{{ system_prompt }} +<</SYS>> + +{{ user_msg_1 }} [/INST] {{ model_answer_1 }} </s><s>[INST] {{ user_msg_2 }} [/INST] +``` + +请注意,系统提示 ( `system prompt` ) 是可选的 - 没有它模型也能工作,但你可以用它来进一步指定模型的行为或风格。例如,如果你希望获得 JavaScript 的答案,即可在此声明。在系统提示之后,你需要提供对话交互历史: 用户问了什么以及模型回答了什么。与填充场景一样,你需要注意分隔符的使用。输入的最后必须是新的用户指令,这对模型而言是让其提供答案的信号。 + +以下代码片段演示了如何在实际工作中使用该模板。 + +1. **首次用户输入,无系统提示** + +```python +user = 'In Bash, how do I list all text files in the current directory (excluding subdirectories) that have been modified in the last month?' + +prompt = f"<s>[INST] {user.strip()} [/INST]" +inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda") +``` + +1. **首次用户查询,有系统提示** + +```python +system = "Provide answers in JavaScript" +user = "Write a function that computes the set of sums of all contiguous sublists of a given list." + +prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]" +inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda") +``` + +1. 
**含对话历史的多轮对话** + +该过程与 [Llama 2](https://huggingface.co/blog/zh/llama2#如何提示-Llama-2) 中的过程相同。为了尽量清晰,我们没有使用循环,也没有对此示例代码做泛化: + +```python +system = "System prompt" +user_1 = "user_prompt_1" +answer_1 = "answer_1" +user_2 = "user_prompt_2" +answer_2 = "answer_2" +user_3 = "user_prompt_3" + +prompt = f"<<SYS>>\n{system}\n<</SYS>>\n\n{user_1}" +prompt = f"<s>[INST] {prompt.strip()} [/INST] {answer_1.strip()} </s>" +prompt += f"<s>[INST] {user_2.strip()} [/INST] {answer_2.strip()} </s>" +prompt += f"<s>[INST] {user_3.strip()} [/INST]" + +inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda") +``` + +#### 4 比特加载 + +将 Code Llama 集成到 Transformers 中意味着我们可以立即获得 4 比特加载等高级功能的支持。这使得用户可以在英伟达 3090 卡等消费类 GPU 上运行大型的 34B 参数量模型! + +以下是在 4 比特模式下运行推理的方法: + +```Python +from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig +import torch + +model_id = "codellama/CodeLlama-34b-hf" +quantization_config = BitsAndBytesConfig( + load_in_4bit=True, + bnb_4bit_compute_dtype=torch.float16 +) + +tokenizer = AutoTokenizer.from_pretrained(model_id) +model = AutoModelForCausalLM.from_pretrained( + model_id, + quantization_config=quantization_config, + device_map="auto", +) + +prompt = 'def remove_non_ascii(s: str) -> str:\n """ ' +inputs = tokenizer(prompt, return_tensors="pt").to("cuda") + +output = model.generate( + inputs["input_ids"], + max_new_tokens=200, + do_sample=True, + top_p=0.9, + temperature=0.1, +) +output = output[0].to("cpu") +print(tokenizer.decode(output)) +``` + +### 使用 TGI 和推理终端 + +[TGI](https://github.com/huggingface/text-generation-inference) 是 Hugging Face 开发的生产级推理容器,可用于轻松部署大语言模型。它包含连续批处理、流式输出、基于张量并行的多 GPU 快速推理以及生产级的日志记录和跟踪等功能。 + +你可以在自己的基础设施上使用 TGI,也可以使用 Hugging Face 的 [推理终端](https://huggingface.co/inference-endpoints)。要部署 Code Llama 模型,请登陆其 [模型页面](https://huggingface.co/codellama),然后单击 [Deploy -> Inference Endpoints](https://huggingface.co/codellama/CodeLlama-7b-hf) 按钮。 + +- 推理 7B 模型,我们建议选择“GPU [medium] - 1x Nvidia A10G”。 +- 推理 13B 模型,我们建议选择“GPU 
[xlarge] - 1x Nvidia A100”。 +- 推理 34B 模型,我们建议启用 `bitsandbytes` 量化并选择“GPU [1xlarge] - 1x Nvidia A100”或“GPU [2xlarge] - 2x Nvidia A100” + +_注意: 你可能需要发邮件给 **[api-enterprise@huggingface.co](mailto:api-enterprise@huggingface.co)** 申请配额升级才能访问 A100_ + +你可以在我们的博文中详细了解如何 [使用 Hugging Face 推理终端部署 LLM](https://huggingface.co/blog/zh/inference-endpoints-llm),该 [博文](https://huggingface.co/blog/zh/inference-endpoints-llm) 还包含了有关其支持的超参以及如何使用 Python 和 Javascript API 流式生成文本的相关知识。 + +## 评估 + +代码语言模型通常在 HumanEval 等数据集上进行基准测试,其包含了一系列编程题,我们将函数签名和文档字符串输入给模型,模型需要完成函数体代码的编写。接着是运行一组预定义的单元测试来验证所提出的解答。最后是报告通过率,即有多少解答通过了所有测试。pass@1 度量了模型一次生成即通过的频率,而 pass@10 描述了模型生成 10 个候选解答其中至少有一个解答通过的频率。 + +虽然 HumanEval 是一个 Python 基准测试,但社区付出了巨大努力将其转成更多编程语言,从而实现更全面的评估。其中一种方法是 [MultiPL-E](https://github.com/nuprl/MultiPL-E),它将 HumanEval 翻译成十多种编程语言。我们正在基于其制作一个 [多语言代码排行榜](https://huggingface.co/spaces/bigcode/multilingual-code-evals),这样社区就可以用它来比较不同模型在各种编程语言上的表现,以评估哪个模型最适合他们的需求。 + +| 模型 | 许可证 | 训练数据集是否已知 | 是否可商用 | 预训练词元数 | Python | JavaScript | Leaderboard Avg Score | +| ---------------------- | ------------------ | ------------- | --------------- | --------------------------- | ------ | ---------- | --------------------- | +| CodeLlaMa-34B | Llama 2 license | ❌ | ✅ | 2,500B | 45.11 | 41.66 | 33.89 | +| CodeLlaMa-13B | Llama 2 license | ❌ | ✅ | 2,500B | 35.07 | 38.26 | 28.35 | +| CodeLlaMa-7B | Llama 2 license | ❌ | ✅ | 2,500B | 29.98 | 31.8 | 24.36 | +| CodeLlaMa-34B-Python | Llama 2 license | ❌ | ✅ | 2,620B | 53.29 | 44.72 | 33.87 | +| CodeLlaMa-13B-Python | Llama 2 license | ❌ | ✅ | 2,620B | 42.89 | 40.66 | 28.67 | +| CodeLlaMa-7B-Python | Llama 2 license | ❌ | ✅ | 2,620B | 40.48 | 36.34 | 23.5 | +| CodeLlaMa-34B-Instruct | Llama 2 license | ❌ | ✅ | 2,620B | 50.79 | 45.85 | 35.09 | +| CodeLlaMa-13B-Instruct | Llama 2 license | ❌ | ✅ | 2,620B | 50.6 | 40.91 | 31.29 | +| CodeLlaMa-7B-Instruct | Llama 2 license | ❌ | ✅ | 2,620B | 45.65 | 33.11 | 26.45 | +| StarCoder-15B | BigCode-OpenRail-M | ✅ | ✅ | 1,035B | 
33.57 | 30.79 | 22.74 | +| StarCoderBase-15B | BigCode-OpenRail-M | ✅ | ✅ | 1,000B | 30.35 | 31.7 | 22.4 | +| WizardCoder-15B | BigCode-OpenRail-M | ❌ | ✅ | 1,035B | 58.12 | 41.91 | 32.07 | +| OctoCoder-15B | BigCode-OpenRail-M | ✅ | ✅ | 1,000B | 45.3 | 32.8 | 24.01 | +| CodeGeeX-2-6B | CodeGeeX License | ❌ | ❌ | 2,000B | 33.49 | 29.9 | 21.23 | +| CodeGen-2.5-7B-Mono | Apache-2.0 | ✅ | ✅ | 1400B | 45.65 | 23.22 | 12.1 | +| CodeGen-2.5-7B-Multi | Apache-2.0 | ✅ | ✅ | 1400B | 28.7 | 26.27 | 20.04 | + +**注意:** 上表中的分数来自我们的代码排行榜,所有模型均使用相同的设置。欲了解更多详情,请参阅 [排行榜](https://huggingface.co/spaces/bigcode/multilingual-code-evals)。 + +## 其他资源 + +- [Hub 上的模型](https://huggingface.co/codellama) +- [论文](https://huggingface.co/papers/2308.12950) +- [Meta 官宣博文](https://ai.meta.com/blog/code-llama-large-language-model-coding/) +- [负责任使用指南](https://ai.meta.com/llama/responsible-use-guide/) +- [演示 (代码补全,流式生成)](https://huggingface.co/spaces/codellama/codellama-playground) +- [演示 (指令微调、自含、可复制到自己的空间并修改)](https://huggingface.co/spaces/codellama/codellama-13b-chat) diff --git a/zh/llama2.md b/zh/llama2.md index 5a880a97e0..5518813230 100644 --- a/zh/llama2.md +++ b/zh/llama2.md @@ -21,34 +21,35 @@ translators: 今天,Meta 发布了 Llama 2,其包含了一系列最先进的开放大语言模型,我们很高兴能够将其全面集成入 Hugging Face,并全力支持其发布。 Llama 2 的社区许可证相当宽松,且可商用。其代码、预训练模型和微调模型均于今天发布了🔥。 -通过与 Meta 合作,我们已经顺利地完成了对 Llama 2 的集成,你可以在 Hub 上找到 12 个开放模型 (3 个基础模型以及 3 个微调模型,每个模型都有 2 种 checkpoint: 一个是 Meta 的原始 checkpoint,一个是 `transformers` 格式的 checkpoint)。以下列出了 Hugging Face 支持 Llama 2 的主要工作: +通过与 Meta 合作,我们已经顺利地完成了对 Llama 2 的集成,你可以在 Hub 上找到 12 个开放模型(3 个基础模型以及 3 个微调模型,每个模型都有 2 种 checkpoint:一个是 Meta 的原始 checkpoint,一个是 `transformers` 格式的 checkpoint)。以下列出了 Hugging Face 支持 Llama 2 的主要工作: -- [Llama 2 已入驻 Hub](https://huggingface.co/meta-llama): 包括模型卡及相应的许可证。 +- [Llama 2 已入驻 Hub](https://huggingface.co/meta-llama):包括模型卡及相应的许可证。 - [支持 Llama 2 的 transformers 库](https://github.com/huggingface/transformers/releases/tag/v4.31.0) - 使用单 GPU 微调 Llama 2 小模型的示例 -- [Text 
Generation Inference (TGI) ](https://github.com/huggingface/text-generation-inference) 已集成 Llama 2,以实现快速高效的生产化推理 -- 推理终端 (Inference Endpoints) 已集成 Llama 2 +- [Text Generation Inference(TGI)](https://github.com/huggingface/text-generation-inference) 已集成 Llama 2,以实现快速高效的生产化推理 +- 推理终端(Inference Endpoints)已集成 Llama 2 ## 目录 -- [何以 Llama 2?](# 何以 -llama-2) -- [演示](# 演示) -- [推理](# 推理) - - [用 transformers](# 用 -transformers) - - [用 TGI 和推理终端](# 用 -TGI- 和推理终端) -- [用 -PEFT- 微调](# 用 -PEFT- 微调) -- [其他资源](# 其他资源) -- [总结](# 总结) +- [何以 Llama 2?](#何以-llama-2) +- [演示](#演示) +- [推理](#推理) + - [使用 transformers](#使用-transformers) + - [使用 TGI 和推理终端](#使用-tgi-和推理终端) +- [使用 PEFT 微调](#使用-PEFT-微调) +- [如何提示 Llama 2](#如何提示-Llama-2) +- [其他资源](#其他资源) +- [总结](#总结) ## 何以 Llama 2? -Llama 2 引入了一系列预训练和微调 LLM,参数量范围从 7B 到 70B (7B、13B、70B)。其预训练模型比 Llama 1 模型有了显著改进,包括训练数据的总词元数增加了 40%、上下文长度更长 (4k 词元🤯),以及利用了分组查询注意力机制来加速 70B 模型的推理🔥! +Llama 2 引入了一系列预训练和微调 LLM,参数量范围从 7B 到 70B(7B、13B、70B)。其预训练模型比 Llama 1 模型有了显著改进,包括训练数据的总词元数增加了 40%、上下文长度更长(4k 词元🤯),以及利用了分组查询注意力机制来加速 70B 模型的推理🔥! 
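上一段提到的分组查询注意力(GQA)的基本思想是:多个查询头共享同一组 K/V 头,从而大幅减小推理时的 KV 缓存。下面是一个极简的 numpy 示意(仅为演示原理,并非 Llama 2 的实际实现):

```python
import numpy as np

def grouped_query_attention(q, k, v):
    # q: (n_q_heads, seq, d);k、v: (n_kv_heads, seq, d),n_kv_heads 整除 n_q_heads
    group = q.shape[0] // k.shape[0]
    k = np.repeat(k, group, axis=0)  # 每组查询头复用同一份 K/V
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)  # softmax
    return w @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 5, 16))  # 8 个查询头
k = rng.normal(size=(2, 5, 16))  # 只保留 2 个 K/V 头,KV 缓存缩小 4 倍
v = rng.normal(size=(2, 5, 16))
out = grouped_query_attention(q, k, v)
assert out.shape == (8, 5, 16)
```

K/V 头越少,推理时需要缓存的张量就越小,这正是 70B 模型推理提速的来源之一。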
-但最令人兴奋的还是其发布的微调模型 (Llama 2-Chat),该模型已使用 [基于人类反馈的强化学习 (Reinforcement Learning from Human Feedback,RLHF) ](https://huggingface.co/blog/rlhf) 技术针对对话场景进行了优化。在相当广泛的有用性和安全性测试基准中,Llama 2-Chat 模型的表现优于大多数开放模型,且其在人类评估中表现出与 ChatGPT 相当的性能。更多详情,可参阅其 [论文](https://huggingface.co/papers/2307.09288)。 +但最令人兴奋的还是其发布的微调模型(Llama 2-Chat),该模型已使用[基于人类反馈的强化学习(Reinforcement Learning from Human Feedback,RLHF)](https://huggingface.co/blog/rlhf)技术针对对话场景进行了优化。在相当广泛的有用性和安全性测试基准中,Llama 2-Chat 模型的表现优于大多数开放模型,且其在人类评估中表现出与 ChatGPT 相当的性能。更多详情,可参阅其[论文](https://huggingface.co/papers/2307.09288)。 ![模型训练与微调工作流](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/llama-rlhf.png) -_图来自 [Llama 2: Open Foundation and Fine-Tuned Chat Models](https://scontent-fra3-2.xx.fbcdn.net/v/t39.2365-6/10000000_6495670187160042_4742060979571156424_n.pdf?_nc_cat=104&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=GK8Rh1tm_4IAX8b5yo4&_nc_ht=scontent-fra3-2.xx&oh=00_AfDtg_PRrV6tpy9UmiikeMRuQgk6Rej7bCPOkXZQVmUKAg&oe=64BBD830) 一文_ +*图来自 [Llama 2: Open Foundation and Fine-Tuned Chat Models](https://scontent-fra3-2.xx.fbcdn.net/v/t39.2365-6/10000000_6495670187160042_4742060979571156424_n.pdf?_nc_cat=104&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=GK8Rh1tm_4IAX8b5yo4&_nc_ht=scontent-fra3-2.xx&oh=00_AfDtg_PRrV6tpy9UmiikeMRuQgk6Rej7bCPOkXZQVmUKAg&oe=64BBD830) 一文* 如果你一直在等一个闭源聊天机器人的开源替代,那你算是等着了!Llama 2-Chat 将是你的最佳选择! 
@@ -66,30 +67,30 @@ _图来自 [Llama 2: Open Foundation and Fine-Tuned Chat Models](https://sconten | [Llama-2-70B](https://huggingface.co/meta-llama/Llama-2-70b-hf) | Llama 2 许可证 | ✅ | 2,000B | * | | [Llama-2-70B-chat](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf)* | Llama 2 许可证 | ✅ | 2,000B | 66.8 | -*目前,我们正在对 Llama 2 70B (非聊天版) 进行评测。评测结果后续将更新至此表。 +*目前,我们正在对 Llama 2 70B(非聊天版)进行评测。评测结果后续将更新至此表。 ## 演示 -你可以通过 [这个空间](https://huggingface.co/spaces/ysharma/Explore_llamav2_with_TGI) 或下面的应用轻松试用 Llama 2 大模型 (700 亿参数!): +你可以通过[这个空间](https://huggingface.co/spaces/ysharma/Explore_llamav2_with_TGI)或下面的应用轻松试用 Llama 2 大模型(700 亿参数!): -它们背后都是基于 Hugging Face 的 [TGI](https://github.com/huggingface/text-generation-inference) 框架,该框架也支撑了 [HuggingChat](https://huggingface.co/chat/),我们会在下文分享更多相关内容。 +它们背后都是基于 Hugging Face 的 [TGI](https://github.com/huggingface/text-generation-inference) 框架,该框架也支撑了 [HuggingChat](https://huggingface.co/chat/) ,我们会在下文分享更多相关内容。 ## 推理 本节,我们主要介绍可用于对 Llama 2 模型进行推理的两种不同方法。在使用这些模型之前,请确保你已在 [Meta Llama 2](https://huggingface.co/meta-llama) 存储库页面申请了模型访问权限。 -**注意: 请务必按照页面上的指示填写 Meta 官方表格。填完两个表格数小时后,用户就可以访问模型存储库。 +**注意:请务必按照页面上的指示填写 Meta 官方表格。填完两个表格数小时后,用户就可以访问模型存储库。 ### 使用 transformers -从 transformers [4.31](https://github.com/huggingface/transformers/releases/tag/v4.31.0) 版本开始,HF 生态中的所有工具和机制都可以适用于 Llama 2,如: +从 transformers [4.31](https://github.com/huggingface/transformers/releases/tag/v4.31.0) 版本开始,HF 生态中的所有工具和机制都可以适用于 Llama 2,如: - 训练、推理脚本及其示例 -- 安全文件格式 (`safetensors` ) -- 与 bitsandbytes (4 比特量化) 和 PEFT 等工具 +- 安全文件格式(`safetensors`) +- 与 bitsandbytes(4 比特量化)和 PEFT 等工具 - 帮助模型进行文本生成的辅助工具 - 导出模型以进行部署的机制 @@ -100,7 +101,7 @@ pip install transformers huggingface-cli login ``` -下面是如何使用 `transformers` 进行推理的代码片段: +下面是如何使用 `transformers` 进行推理的代码片段: ```python from transformers import AutoTokenizer @@ -138,11 +139,11 @@ Of course! If you enjoyed "Breaking Bad" and "Band of Brothers," here are some o 3. 
"Mad Men" - Set in the 1960s, this AMC series follows the lives of advertising executives on Madison Avenue, expl ``` -另外,尽管模型本身的上下文长度 _仅_ 4k 词元,但你可以使用 `transformers` 支持的技术,如旋转位置嵌入缩放 (rotary position embedding scaling) ([推特](https://twitter.com/joao_gante/status/1679775399172251648)),进一步把它变长! +另外,尽管模型本身的上下文长度*仅* 4k 词元,但你可以使用 `transformers` 支持的技术,如旋转位置嵌入缩放(rotary position embedding scaling)([推特](https://twitter.com/joao_gante/status/1679775399172251648)),进一步把它变长! ### 使用 TGI 和推理终端 -**[Text Generation Inference (TGI) ](https://github.com/huggingface/text-generation-inference)** 是 Hugging Face 开发的生产级推理容器,可用于轻松部署大语言模型。它支持流式组批、流式输出、基于张量并行的多 GPU 快速推理,并支持生产级的日志记录和跟踪等功能。 +**[Text Generation Inference(TGI)](https://github.com/huggingface/text-generation-inference)** 是 Hugging Face 开发的生产级推理容器,可用于轻松部署大语言模型。它支持流式组批、流式输出、基于张量并行的多 GPU 快速推理,并支持生产级的日志记录和跟踪等功能。 你可以在自己的基础设施上部署并尝试 TGI,也可以直接使用 Hugging Face 的 **[推理终端](https://huggingface.co/inference-endpoints)**。如果要用推理终端部署 Llama 2 模型,请登陆 **[模型页面](https://huggingface.co/meta-llama/Llama-2-7b-hf)** 并单击 **[Deploy -> Inference Endpoints](https://ui.endpoints.huggingface.co/new?repository=meta-llama/Llama-2-7b-hf)** 菜单。 @@ -150,27 +151,25 @@ Of course! 
If you enjoyed "Breaking Bad" and "Band of Brothers," here are some o - 要推理 13B 模型,我们建议你选择 “GPU [xlarge] - 1x Nvidia A100”。 - 要推理 70B 模型,我们建议你选择 “GPU [xxxlarge] - 8x Nvidia A100”。 -_注意: 如果你配额不够,请发送邮件至 **[api-enterprise@huggingface.co](mailto:api-enterprise@huggingface.co)** 申请升级配额,通过后你就可以访问 A100 了。_ +*注意:如果你配额不够,请发送邮件至 **[api-enterprise@huggingface.co](mailto:api-enterprise@huggingface.co)** 申请升级配额,通过后你就可以访问 A100 了。* -你还可以从我们的另一篇博文中了解更多有关 [如何使用 Hugging Face 推理终端部署 LLM](https://huggingface.co/blog/zh/inference-endpoints-llm) 的知识 , 文中包含了推理终端支持的超参以及如何使用其 Python 和 Javascript API 实现流式输出等信息。 +你还可以从我们的另一篇博文中了解更多有关[如何使用 Hugging Face 推理终端部署 LLM](https://huggingface.co/blog/zh/inference-endpoints-llm) 的知识, 文中包含了推理终端支持的超参以及如何使用其 Python 和 Javascript API 实现流式输出等信息。 -## 用 PEFT 微调 +## 使用 PEFT 微调 -训练 LLM 在技术和计算上都有一定的挑战。本节,我们将介绍 Hugging Face 生态中有哪些工具可以帮助开发者在简单的硬件上高效训练 Llama 2,我们还将展示如何在单张 NVIDIA T4 (16GB - Google Colab) 上微调 Llama 2 7B 模型。你可以通过 [让 LLM 更可得](https://huggingface.co/blog/4bit-transformers-bitsandbytes) 这篇博文了解更多信息。 +训练 LLM 在技术和计算上都有一定的挑战。本节,我们将介绍 Hugging Face 生态中有哪些工具可以帮助开发者在简单的硬件上高效训练 Llama 2,我们还将展示如何在单张 NVIDIA T4(16GB - Google Colab)上微调 Llama 2 7B 模型。你可以通过[让 LLM 更可得](https://huggingface.co/blog/4bit-transformers-bitsandbytes)这篇博文了解更多信息。 -我们构建了一个 [脚本](https://github.com/lvwerra/trl/blob/main/examples/scripts/sft_trainer.py),其中使用了 QLoRA 和 [`trl`](https://github.com/lvwerra/trl) 中的 [`SFTTrainer`]((https://huggingface.co/docs/trl/v0.4.7/en/sft_trainer)) 来对 Llama 2 进行指令微调。 +我们构建了一个[脚本](https://github.com/lvwerra/trl/blob/main/examples/scripts/sft_trainer.py),其中使用了 QLoRA 和 [`trl`](https://github.com/lvwerra/trl) 中的 [`SFTTrainer`]((https://huggingface.co/docs/trl/v0.4.7/en/sft_trainer)) 来对 Llama 2 进行指令微调。 下面的命令给出了在 `timdettmers/openassistant-guanaco` 数据集上微调 Llama 2 7B 的一个示例。该脚本可以通过 `merge_and_push` 参数将 LoRA 权重合并到模型权重中,并将其保存为 `safetensor` 格式。这样,我们就能使用 TGI 和推理终端部署微调后的模型。 -首先安装 `trl` 包并下载脚本: - +首先安装 `trl` 包并下载脚本: ```bash pip install trl git clone https://github.com/lvwerra/trl 
``` -然后,你就可以运行脚本了: - +然后,你就可以运行脚本了: ```bash python trl/examples/scripts/sft_trainer.py \ --model_name meta-llama/Llama-2-7b-hf \ @@ -180,14 +179,69 @@ python trl/examples/scripts/sft_trainer.py \ --model_name meta-llama/Llama-2-7b-hf \ --dataset_name timdettmers/openassistant-guanaco \ --batch_size 4 \ --gradient_accumulation_steps 2 ``` +## 如何提示 Llama 2 -## 其他资源 +开放模型的一个常被忽视的优势是你可以完全控制聊天应用程序中的`系统`提示。这对于指定聊天助手的行为至关重要,甚至能赋予它一些个性,这是仅提供 API 调用的模型无法实现的。 + +在 Llama 2 首发几天后,我们决定加上这一部分,因为社区向我们提出了许多关于如何提示模型以及如何更改系统提示的问题。希望这部分能帮得上忙! + +第一轮的提示模板如下: + +``` +<s>[INST] <<SYS>> +{{ system_prompt }} +<</SYS>> + +{{ user_message }} [/INST] +``` + +此模板与模型训练时使用的模板一致,具体可见 [Llama 2 论文](https://huggingface.co/papers/2307.09288)。我们可以使用任何我们想要的 `system_prompt`,但格式须与训练时使用的格式一致。 + +再说明白一点,以下是用户在使用[我们的 13B 模型聊天演示](https://huggingface.co/spaces/huggingface-projects/llama-2-13b-chat) 聊天且输入 `There's a llama in my garden 😱 What should I do?` 时,我们真正发送给语言模型的内容: + +```b +<s>[INST] <<SYS>> +You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. + +If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information. +<</SYS>> + +There's a llama in my garden 😱 What should I do? 
[/INST] +``` + +如你所见,`<<SYS>>` 与 `<</SYS>>` 标记之间的指令为模型提供了上下文,即告诉模型我们期望它如何响应。这很有用,因为训练过程中使用了完全相同的格式,并且针对不同任务用各种各样的系统提示训练过模型。 + +随着对话的进行,我们会把人类和“机器人”之间的交互历史附加到之前的提示中,并包含在 `[INST]` 分隔符之间。多轮对话期间使用的模板遵循以下结构(🎩 感谢 [Arthur Zucker](https://huggingface.co/ArthurZ) 的解释): + +```b +<s>[INST] <<SYS>> +{{ system_prompt }} +<</SYS>> + +{{ user_msg_1 }} [/INST] {{ model_answer_1 }} </s><s>[INST] {{ user_msg_2 }} [/INST] +``` + +模型本身是无状态的,不会“记住”之前的对话片段,我们必须始终为其提供所有上下文,以便对话可以继续。这就是为什么我们一直强调模型的**上下文长度**非常重要且越大越好,因为只有这样才能支持更长的对话和更多的信息。 + +### 忽略之前的指令 + +在使用仅提供 API 调用的模型时,人们会采用一些技巧来尝试覆盖系统提示并更改模型的默认行为。尽管这些解决方案富有想象力,但开放模型完全不必如此:任何人都可以使用不同的提示,只要它遵循上述格式即可。我们相信,这将成为研究人员研究提示对模型的期望行为与非期望行为之影响的重要工具。例如,当人们[对谨慎到荒谬的生成文本感到惊讶](https://twitter.com/lauraruis/status/1681612002718887936)时,你可以探索是否[不同的提示能帮得上忙](https://twitter.com/overlordayn/status/1681631554672513025)。(🎩 感谢 [Clémentine Fourrier](https://huggingface.co/clefourrier) 提供这个例子的链接)。 + +在我们的 [`13B`](https://huggingface.co/spaces/huggingface-projects/llama-2-13b-chat) 和 [`7B`](https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat) 演示中,你可以在 UI 上点开“高级选项”并简单编写你自己的指令,从而轻松探索此功能。你还可以复制这些演示并用于你个人的娱乐或研究! + +## 其他资源 - [论文](https://huggingface.co/papers/2307.09288) - [Hub 上的模型](https://huggingface.co/meta-llama) - [Open LLM 排行榜](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) - [Meta 提供的 Llama 2 模型使用大全](https://github.com/facebookresearch/llama-recipes/tree/main) +- [聊天演示 (7B)](https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat) +- [聊天演示(13B)](https://huggingface.co/spaces/huggingface-projects/llama-2-13b-chat) +- [基于 TGI 的聊天演示 (70B)](https://huggingface.co/spaces/ysharma/Explore_llamav2_with_TGI) ## 总结 -Llama 2 的推出让我们非常兴奋!后面我们会围绕它陆陆续续推出更多内容,包括如何微调一个自己的模型,如何在设备侧运行 Llama 2 小模型等,敬请期待! \ No newline at end of file +Llama 2 的推出让我们非常兴奋!后面我们会围绕它陆陆续续推出更多内容,包括如何微调一个自己的模型,如何在设备侧运行 Llama 2 小模型等,敬请期待! 
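补充一点:上文「如何提示 Llama 2」中描述的多轮模板,可以用一个简单的辅助函数来拼接。以下为示意实现(函数名为假设,并非官方 API):

```python
def build_llama2_prompt(system, turns):
    """turns 为 (user, answer) 列表;最后一轮的 answer 可为 None,表示等待模型作答。"""
    # 系统提示包在第一条用户消息的 <<SYS>> 标记里
    first_user = f"<<SYS>>\n{system}\n<</SYS>>\n\n{turns[0][0]}"
    prompt = f"<s>[INST] {first_user.strip()} [/INST]"
    if turns[0][1] is not None:
        prompt += f" {turns[0][1].strip()} </s>"
    for user, answer in turns[1:]:
        prompt += f"<s>[INST] {user.strip()} [/INST]"
        if answer is not None:
            prompt += f" {answer.strip()} </s>"
    return prompt

p = build_llama2_prompt("You are helpful.", [("Hi", "Hello!"), ("Bye", None)])
assert p.startswith("<s>[INST] <<SYS>>") and p.endswith("[/INST]")
```

由于模型是无状态的,每一轮都需要像这样重新拼接完整的对话历史再发送给模型。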
+ +> 英文原文: https://huggingface.co/blog/llama2 +> 原文作者:Philipp Schmid,Omar Sanseviero,Pedro Cuenca,Lewis Tunstall +> 译者: Matrix Yao (姚伟峰),英特尔深度学习工程师,工作方向为 transformer-family 模型在各模态数据上的应用及大规模模型的训练推理。