Merge pull request #505 from iCell/shawn/review-05
docs(zh-cn): Reviewed 05_transformer-models-encoders.srt
xianbaoqian authored Feb 27, 2023
2 parents df3e0cf + f877f03 commit e1593ee
Showing 1 changed file with 56 additions and 56 deletions.
112 changes: 56 additions & 56 deletions subtitles/zh-CN/05_transformer-models-encoders.srt
@@ -1,6 +1,6 @@
1
00:00:00,253 --> 00:00:03,003
介绍引人注目
引人注目的介绍
(intro striking)

2
@@ -10,12 +10,12 @@

3
00:00:07,830 --> 00:00:11,070
一个流行的仅编码器架构的例子是 BERT
一个流行的仅使用编码器架构的例子是 BERT
An example of a popular encoder only architecture is BERT

4
00:00:11,070 --> 00:00:13,323
这是同类产品中最受欢迎的型号
这是同类产品中最受欢迎的模型
which is the most popular model of its kind.

5
@@ -25,37 +25,37 @@ Let's first start by understanding how it works.

6
00:00:18,360 --> 00:00:20,910
我们将使用一个使用三个词的小例子
我们将使用一个三个单词的小例子
We'll use a small example using three words.

7
00:00:20,910 --> 00:00:23,823
我们使用这些作为输入并将它们传递给编码器
我们使用这些作为输入传递给编码器
We use these as inputs and pass them through the encoder.

8
00:00:25,290 --> 00:00:28,173
我们检索每个单词的数字表示
得到了每个单词的数值表示
We retrieve a numerical representation of each word.

9
00:00:29,970 --> 00:00:32,700
例如,在这里,编码器转换这三个词
例如,在这里,编码器将这三个词
Here, for example, the encoder converts those three words,

10
00:00:32,700 --> 00:00:37,350
欢迎来到纽约,在这三个数字序列中
welcome to NYC,转换为这三个数字序列
welcome to NYC, in these three sequences of numbers.

11
00:00:37,350 --> 00:00:40,350
编码器只输出一个数字序列
编码器对于每个输入单词
The encoder outputs exactly one sequence of numbers

12
00:00:40,350 --> 00:00:41,493
每个输入词
精确输出一个数字序列
per input word.

13
@@ -65,7 +65,7 @@ This numerical representation can also be called

14
00:00:44,880 --> 00:00:47,163
特征向量或特征张量
特征向量(feature vector)或特征张量(feature tensor)
a feature vector, or a feature tensor.

15
@@ -85,22 +85,22 @@ that was passed through the encoder.

18
00:00:56,130 --> 00:00:58,620
这些向量中的每一个都是一个数字表示
每个向量都是
Each of these vector is a numerical representation

19
00:00:58,620 --> 00:01:00,033
有问题的词
该词的数字表示
of the word in question.

20
00:01:01,080 --> 00:01:03,300
该向量的维度被定义
该向量的维度由
The dimension of that vector is defined

21
00:01:03,300 --> 00:01:05,520
通过模型的架构
模型的架构所决定
by the architecture of the model.

22
@@ -115,17 +115,17 @@ These representations contain the value of a word,

24
00:01:13,230 --> 00:01:15,240
但语境化
但包含上下文化的处理
but contextualized.

25
00:01:15,240 --> 00:01:18,570
例如,归因于单词 “to” 的向量
例如,与单词 "to" 相关联的向量
For example, the vector attributed to the word "to"

26
00:01:18,570 --> 00:01:22,290
不只是 “to” 这个词的代表
不只是 “to” 这个词的表示
isn't the representation of only the "to" word.

27
@@ -150,7 +150,7 @@ the words on the left of the one we're studying,

31
00:01:32,970 --> 00:01:34,980
这里是 “欢迎” 这个词,
这里是 “Welcome” 这个词,
here the word "Welcome",

32
@@ -180,22 +180,22 @@ holds the meaning of the word within the text.

37
00:01:53,310 --> 00:01:56,073
由于自我注意机制,它做到了这一点。
由于自注意力机制,它做到了这一点。
It does this thanks to the self-attention mechanism.

38
00:01:57,240 --> 00:02:00,630
自注意力机制涉及到不同的位置,
自注意力机制指的是与单个序列中的不同位置
The self-attention mechanism relates to different positions,

39
00:02:00,630 --> 00:02:02,850
或单个序列中的不同单词
或不同单词相关联
or different words in a single sequence

40
00:02:02,850 --> 00:02:06,003
为了计算该序列的表示
以计算该序列的表示形式
in order to compute a representation of that sequence.

41
@@ -220,17 +220,17 @@ We won't dive into the specifics here

45
00:02:18,030 --> 00:02:19,680
这将提供一些进一步的阅读
我们会提供一些进一步的阅读资料
which will offer some further readings

46
00:02:19,680 --> 00:02:21,330
如果你想获得更好的理解
如果您想对底层发生了什么
if you want to get a better understanding

47
00:02:21,330 --> 00:02:22,953
在引擎盖下发生的事情
有更好的理解
at what happens under the hood.

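The subtitles above describe the core encoder behaviour: one contextualized vector per input token, with a size fixed by the architecture. A minimal sketch of that behaviour with the transformers library, assuming the bert-base-uncased checkpoint purely for illustration, could look like this:

```python
# Minimal sketch (not part of the commit): retrieve per-token encoder features.
# The checkpoint name is an assumed example; any BERT-like encoder behaves similarly.
from transformers import AutoModel, AutoTokenizer

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

inputs = tokenizer("Welcome to NYC", return_tensors="pt")
outputs = model(**inputs)

# One vector per token (special tokens included), each of dimension 768 for the
# base architecture; the values are contextualized by self-attention.
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 5, 768])
```

Here "Welcome to NYC" tokenizes into five tokens (including the special [CLS] and [SEP] tokens), so the output holds five 768-dimensional feature vectors, one per token.
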
48
@@ -250,17 +250,17 @@ in a wide variety of tasks.

51
00:02:32,100 --> 00:02:33,360
例如,伯特
例如,BERT
For example, BERT,

52
00:02:33,360 --> 00:02:35,670
可以说是最著名的变压器模型
可以说是最著名的 transformer 模型
arguably the most famous transformer model,

53
00:02:35,670 --> 00:02:37,590
是一个独立的编码器模型
它是一个独立的编码器模型
is a standalone encoder model,

54
@@ -270,12 +270,12 @@ and at the time of release,

55
00:02:38,820 --> 00:02:40,440
这将是最先进的
它是许多
it'd be the state of the art

56
00:02:40,440 --> 00:02:42,780
在许多序列分类任务中,
序列分类任务
in many sequence classification tasks,

57
@@ -285,32 +285,32 @@ question answering tasks,

58
00:02:44,190 --> 00:02:46,743
掩码语言建模仅举几例
和掩码语言建模等任务中的最先进技术
and mask language modeling to only cite of few.

59
00:02:48,150 --> 00:02:50,460
这个想法是编码器非常强大
编码器非常擅长
The idea is that encoders are very powerful

60
00:02:50,460 --> 00:02:52,470
在提取携带载体
提取包含有意义信息的
at extracting vectors that carry

61
00:02:52,470 --> 00:02:55,350
关于序列的有意义的信息
关于序列的向量
meaningful information about a sequence.

62
00:02:55,350 --> 00:02:57,870
然后可以在路上处理这个向量
这个向量可以被传递给后续的神经元来进一步处理
This vector can then be handled down the road

63
00:02:57,870 --> 00:03:00,070
通过额外的神经元来理解它们
以便理解其中包含的信息
by additional neurons to make sense of them.

64
@@ -330,22 +330,22 @@ First of all, Masked Language Modeling, or MLM.

67
00:03:09,900 --> 00:03:11,970
这是预测隐藏词的任务
这是在一个单词序列中
It's the task of predicting a hidden word

68
00:03:11,970 --> 00:03:13,590
在一个单词序列中
预测隐藏词的任务
in a sequence of word.

69
00:03:13,590 --> 00:03:15,630
在这里,例如,我们隐藏了这个词
在这里,例如,我们在 “My” 和 “is” 之间
Here, for example, we have hidden the word

70
00:03:15,630 --> 00:03:17,247
在 “我的” 和 “是” 之间
隐藏了这个词
between "My" and "is".

71
@@ -360,7 +360,7 @@ It was trained to predict hidden words in a sequence.

73
00:03:25,230 --> 00:03:27,930
编码器尤其在这种情况下大放异彩
编码器在这种情况下尤其大放异彩
Encoders shine in this scenario in particular

74
@@ -375,32 +375,32 @@ If we didn't have the words on the right,

76
00:03:32,947 --> 00:03:34,650
”、“Sylvain” 和 “.”,
is”、“Sylvain” 和 “.”,
"is", "Sylvain" and the ".",

77
00:03:34,650 --> 00:03:35,940
那么机会就很小
那么BERT 将能够识别名称的
then there is very little chance

78
00:03:35,940 --> 00:03:38,580
BERT 将能够识别名称
作为正确的词
that BERT would have been able to identify name

79
00:03:38,580 --> 00:03:40,500
作为正确的词
的机会就很小
as the correct word.

80
00:03:40,500 --> 00:03:42,270
编码器需要有很好的理解
为了预测一个掩码单词
The encoder needs to have a good understanding

81
00:03:42,270 --> 00:03:45,360
序列以预测掩码词
编码器需要对序列有很好的理解
of the sequence in order to predict a masked word

82
@@ -410,12 +410,12 @@ as even if the text is grammatically correct,

83
00:03:48,840 --> 00:03:50,610
它不一定有意义
但不一定符合
it does not necessarily make sense

84
00:03:50,610 --> 00:03:52,413
在序列的上下文中
序列的上下文
in the context of the sequence.

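A minimal sketch of the masked language modeling task described above, assuming the fill-mask pipeline and the bert-base-uncased checkpoint purely for illustration:

```python
# Minimal sketch (not part of the commit): predict the hidden word between
# "My" and "is", as in the subtitles' example. The checkpoint is an assumption.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
for prediction in unmasker("My [MASK] is Sylvain."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Each prediction carries a candidate word for the masked slot and a confidence score; an encoder trained this way should rank "name" highly because it has read the whole sentence, left and right context included.
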
85
@@ -440,22 +440,22 @@ The model's aim is to identify the sentiment of a sequence.

89
00:04:09,540 --> 00:04:11,280
它的范围可以从给出一个序列
它可以从给出的一个序列
It can range from giving a sequence,

90
00:04:11,280 --> 00:04:12,960
从一颗星到五颗星的评级
做出一颗星到五颗星的评级
a rating from one to five stars

91
00:04:12,960 --> 00:04:15,900
如果进行评论分析以给予肯定,
如果进行评论分析
if doing review analysis to giving a positive,

92
00:04:15,900 --> 00:04:17,820
或对序列的负面评价
来对一个序列进行积极或消极的评级
or negative rating to a sequence

93
@@ -495,7 +495,7 @@ containing the same words,

100
00:04:35,220 --> 00:04:37,170
意义完全不同
意义却完全不同
the meaning is entirely different,

101
@@ -505,6 +505,6 @@ and the encoder model is able to grasp that difference.

102
00:04:41,404 --> 00:04:44,154
结尾引人注目
引人注目的结尾
(outro striking)

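As a closing illustration of the sequence classification use case discussed in the subtitles, here is a minimal sentiment analysis sketch, assuming an off-the-shelf encoder-based checkpoint (distilbert-base-uncased-finetuned-sst-2-english) purely as an example:

```python
# Minimal sketch (not part of the commit): sentiment analysis with an
# encoder-based checkpoint; the model name is an assumed example.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("I loved this movie, would watch it again!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

The pipeline returns a POSITIVE or NEGATIVE label together with a score, which matches the "positive or negative rating to a sequence" described in the subtitles.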