From be8e96cbaca1b89a49db59b766986b021a4517a5 Mon Sep 17 00:00:00 2001
From: Zhongdong Yang
Date: Tue, 30 May 2023 18:10:12 +0800
Subject: [PATCH] Update: fix missing references of figures. (#1155)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* update soc3-zn

* Update _blog.yml

Try to resolve conflicts

* Update: proofreading zh/ethics-soc-3.md

* add how-to-generate cn version

Signed-off-by: Yao, Matrix

* unity game in hf space translation completed

* Update: punctuations of how-to-generate.md

* hf-bitsandbytes-integration cn done

Signed-off-by: Yao, Matrix

* Proofread hf-bitsandbytes-integration.md

* Proofread: red-teaming.md

* Update: add red-teaming to zh/_blog.yml

* Update _blog.yml

* Update: add red-teaming to zh/_blog.yml

Fix: red-teaming title in zh/_blog.yml

* Fix: red-teaming PPLM translation

* deep-learning-with-proteins cn done

Signed-off-by: Yao, Matrix

* Add: stackllama.md

* if blog translation completed

* Update unity-in-spaces.md

Add a link for AI game

* Update if.md

Fix “普罗大众” to “普惠大众”

* deep-learning-with-proteins cn done

Signed-off-by: Yao, Matrix

* add starcoder cn

Signed-off-by: Yao, Matrix

Update: formatting and punctuations of starcoder.md

* add starcoder cn

Signed-off-by: Yao, Matrix

* Update: proofreading zh/unity-in-spaces.md

* fix(annotated-diffusion.md): fix image shape desc in PIL and Tensor (#1080)

modifiy the comment after ToTensor with the correct image shape CHW

* Add text-to-video blog (#1058)

Adds an overview of text-to-video generative models, task specific challenges, datasets, and more.
Co-authored-by: Omar Sanseviero
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Fix broken link in text-to-video.md (#1083)

* Update: proofreading zh/unity-in-spaces.md

Fix: incorrect _blog.yml format

* Update: proofreading zh/deep-learning-with-proteins.md

* update ethics-diffusers-cn (#6)

* update ethics-diffusers

* update ethics-diffusers

---------

Co-authored-by: Zhongdong Yang

* Update: proofreading zh/ethics-diffusers.md

* 1. introducing-csearch done (#11)

2. text-to-video done

Signed-off-by: Yao, Matrix

* Update: proofread zh/text-to-video.md

* Update: proofreading zh/introducing-csearch.md

* generative-ai-models-on-intel-cpu cn done (#13)

Signed-off-by: Yao, Matrix

Update: proofread zh/generative-ai-models-on-intel-cpu.md

Signed-off-by: Yang, Zhongdong

* add starchat-alpha zh translation (#10)

* Preparing blogpost annoucing `safetensors` security audit + official support. (#1096)

* Preparing blogpost annoucing `safetensors` security audit + official support.

* Taking into account comments + Grammarly.

* Update safetensors-official.md

* Apply suggestions from code review

Co-authored-by: Omar Sanseviero

* Update safetensors-official.md

* Apply suggestions from code review

Co-authored-by: Luc Georges

* Apply suggestions from code review

Co-authored-by: Luc Georges

* Apply suggestions from code review

* Update safetensors-official.md

Co-authored-by: Luc Georges

* Apply suggestions from code review

* Adding thumbnail.

* Include changes from Stella.

* Update safetensors-official.md

* Update with Stella's comments.

* Remove problematic sentence.

* Rename + some rephrasing.

* Apply suggestions from code review

Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>

* Update safetensors-security-audit.md

Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>

* Last fixes.
* Apply suggestions from code review

Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>

---------

Co-authored-by: Omar Sanseviero
Co-authored-by: Luc Georges
Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>

* Hotfixing safetensors. (#1131)

* Removing the checklist formatting is busted. (#1132)

* Update safetensors-security-audit.md (#1134)

* [time series transformers] update dataloader API (#1135)

* update dataloader API

* revert comment

* add back Cached transform

* New post: Hugging Face and IBM (#1130)

* Initial version

* Minor fixes

* Update huggingface-and-ibm.md

Co-authored-by: Pedro Cuenca

* Update huggingface-and-ibm.md

Co-authored-by: Pedro Cuenca

* Resize image

* Update blog index

---------

Co-authored-by: Julien Simon
Co-authored-by: Pedro Cuenca

* Show authors of safetensors blog post (#1137)

Update: proofread zh/starchat-alpha.md

* add megatron-training & assisted-generation (#8)

* add megatron-training

* add megatron-training

* add megatron-training

* add megatron-training

* add assisted-generation

* add assisted-generation

* add assisted-generation

* Update: proofreading zh/assisted-generation

* Update: proofread zh/megatron-training.md

* rwkv model blog translation completed (#12)

* rwkv model blog translation completed

* add 3 additional parts in the blog tail

* Update: proofread zh/rwkv.md

* Fix: missing subtitle/notes for image references.
---------

Signed-off-by: Yao, Matrix
Signed-off-by: Yang, Zhongdong
Co-authored-by: innovation64
Co-authored-by: Yao, Matrix
Co-authored-by: SuSung-boy <872414318@qq.com>
Co-authored-by: Luke Cheng <2258420+chenglu@users.noreply.github.com>
Co-authored-by: yaoqih <40328311+yaoqih@users.noreply.github.com>
Co-authored-by: Shiliang Chen <36809537+csl122@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Omar Sanseviero
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: 李洋 <45715979+innovation64@users.noreply.github.com>
Co-authored-by: Yao Matrix
Co-authored-by: Hoi2022 <120370631+Hoi2022@users.noreply.github.com>
Co-authored-by: Nicolas Patry
Co-authored-by: Luc Georges
Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com>
Co-authored-by: Victor Muštar
Co-authored-by: Kashif Rasul
Co-authored-by: Julien Simon <3436143+juliensimon@users.noreply.github.com>
Co-authored-by: Julien Simon
Co-authored-by: Pedro Cuenca
Co-authored-by: gxy-gxy <57594446+gxy-gxy@users.noreply.github.com>
---
 zh/rwkv.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/zh/rwkv.md b/zh/rwkv.md
index 196d73b0c1..08f0ede8b9 100644
--- a/zh/rwkv.md
+++ b/zh/rwkv.md
@@ -34,7 +34,7 @@ RNN 架构是最早广泛用于处理序列数据的神经网络架构之一。

 | ![rnn_diagram](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/142_rwkv/RNN-scheme.png) |
 | :-: |
-| |
+| RNN 在不同场景下 RNN 的网络配置简图。图片来源:Andrej Karpathy 的博文 |

 由于 RNN 在计算每一时刻的预测值时使用的都是同一组网络权重,因此 RNN 很难解决长距离序列信息的记忆问题,这一定程度上也是训练过程中梯度消失导致的。为解决这个问题,相继有新的网络架构被提出,如 LSTM 或者 GRU,其中 transformer 是已被证实最有效的架构。

@@ -42,11 +42,11 @@ RNN 架构是最早广泛用于处理序列数据的神经网络架构之一。

 | ![transformer_diagram](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/142_rwkv/transformer-scheme.png) |
 | :-: |
-| |
+| transformer 模型中的注意力分数计算公式。图片来源:Jay Alammar 的博文 |

 | ![rwkv_attention_formula](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/142_rwkv/RWKV-formula.png) |
 | :-: |
-| |
+| RWKV 模型中的注意力分数计算公式。来源:RWKV 博文 |

 在训练过程中,Transformer 架构相比于传统的 RNN 和 CNN 有多个优势,最突出的优势是它能够学到上下文特征表达。不同于每次仅处理输入序列中一个 token 的 RNN 和 CNN,transformer 可以单次处理整个输入序列,这种特性也使得 transformer 可以很好地应对长距离序列 token 依赖问题,因此 transformer 在语言翻译和问答等多种任务中表现非常亮眼。

@@ -68,7 +68,7 @@ RNN 本身支持非常长的上下文长度。即使在训练时接收的上下

 | ![rwkv_loss](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/142_rwkv/RWKV-loss.png) |
 | :-: |
-| |
+| LM Loss 在不同上下文长度和模型大小的曲线。图片来源:RWKV 原始仓库 |

 1. 传统的 RNN 模型无法并行训练,而 RWKV 更像一个 “线性 GPT”,因此比 GPT 训练得更快。

@@ -88,7 +88,7 @@ RWKV 模型架构与经典的 transformer 模型架构非常相似 (例如也包

 | ![rwkv_loss](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/142_rwkv/RWKV-eval.png) |
 | :-: |
-| |
+| RWKV-4 与其他常见架构的性能对比。图片来源:Johan Wind 的博文 |

 #### 指令微调/Chat 版: RWKV-4 Raven