Merged
54 changes: 27 additions & 27 deletions subtitles/zh-CN/21_preprocessing-sentence-pairs-(pytorch).srt
@@ -30,7 +30,7 @@ If this code look unfamiliar to you,

7
00:00:18,330 --> 00:00:20,030
请务必再次检查该视频
请务必再次查看该视频
be sure to check that video again.

8
@@ -40,12 +40,12 @@ Here will focus on tasks that classify pair of sentences.

9
00:00:25,620 --> 00:00:28,470
例如,我们可能想要对两个文本进行分类
例如,我们可能想要对两个文本是否被释义
For instance, we may want to classify whether two texts

10
00:00:28,470 --> 00:00:30,360
是否被释义
进行分类
are paraphrased or not.

11
@@ -90,8 +90,8 @@ a problem called natural language inference or NLI.

19
00:00:53,970 --> 00:00:57,000
在这个例子中,取自 MultiNLI 数据集
In this example, taken from the MultiNLI dataset,
在这个取自 MultiNLI 数据集的例子中
In this example, taken from the MultiNLI data set,

20
00:00:57,000 --> 00:00:59,880
@@ -100,7 +100,7 @@ we have a pair of sentences for each possible label.

21
00:00:59,880 --> 00:01:02,490
矛盾,自然的或必然的
矛盾,中性或蕴涵
Contradiction, neutral or entailment,

22
@@ -115,12 +115,12 @@ implies the second.

24
00:01:06,930 --> 00:01:08,820
所以分类成对的句子是一个问题
所以分类成对的句子是一个
So classifying pairs of sentences is a problem

25
00:01:08,820 --> 00:01:10,260
值得被研究
值得研究的问题
worth studying.

26
@@ -165,7 +165,7 @@ they often have an objective related to sentence pairs.

34
00:01:31,230 --> 00:01:34,320
例如,在预训练期间 BERT 显示
例如,在预训练期间 BERT 见到
For instance, during pretraining BERT is shown

35
@@ -175,12 +175,12 @@ pairs of sentences and must predict both

36
00:01:36,810 --> 00:01:39,930
随机屏蔽 token 的价值,以及是否第二个
随机掩蔽的标记值,以及第二个是否
the value of randomly masked tokens, and whether the second

37
00:01:39,930 --> 00:01:41,830
句子从第一个开始, 或反之
句子是否接着第一个句子
sentence follow from the first or not.

38
@@ -205,27 +205,27 @@ to the tokenizer.

42
00:01:53,430 --> 00:01:55,470
在输入 ID 和注意力掩码之上
在我们已经研究过的输入 ID
On top of the input IDs and the attention mask

43
00:01:55,470 --> 00:01:56,970
我们已经研究过
和注意掩码之上
we studied already,

44
00:01:56,970 --> 00:01:59,910
它返回一个名为 token 类型 ID 的新字段,
它返回一个名为标记类型 ID 的新字段,
it returns a new field called token type IDs,

45
00:01:59,910 --> 00:02:01,790
它告诉模型哪些 token 属于
它告诉模型哪些标记属于
which tells the model which tokens belong

46
00:02:01,790 --> 00:02:03,630
对于第一句话
第一句话
to the first sentence,

47
@@ -245,12 +245,12 @@ aligned with the tokens they correspond to,

50
00:02:12,180 --> 00:02:15,213
它们各自的 token 类型 ID 和注意掩码。
它们各自的标记类型 ID 和注意掩码。
their respective token type ID and attention mask.

51
00:02:16,080 --> 00:02:19,260
我们可以看到 tokenizer 还添加了特殊 token
我们可以看到分词器还添加了特殊标记
We can see the tokenizer also added special tokens.

52
@@ -260,12 +260,12 @@ So we have a CLS token, the tokens from the first sentence,

53
00:02:22,620 --> 00:02:25,770
一个 SEP token ,第二句话中的 token ,
一个 SEP 标记,第二句话中的标记,
a SEP token, the tokens from the second sentence,

54
00:02:25,770 --> 00:02:27,003
和最终的 SEP token
和最终的 SEP 标记
and a final SEP token.

55
@@ -275,12 +275,12 @@ If we have several pairs of sentences,

56
00:02:30,570 --> 00:02:32,840
我们可以通过传递列表将它们标记在一起
我们可以通过第一句话的传递列表
we can tokenize them together by passing the list

57
00:02:32,840 --> 00:02:36,630
第一句话,然后是第二句话的列表
将它们标记在一起,然后是第二句话的列表
of first sentences, then the list of second sentences

58
@@ -290,7 +290,7 @@ and all the keyword arguments we studied already

59
00:02:39,300 --> 00:02:40,353
padding=True
例如 padding=True。
like padding=True.
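
When several pairs are tokenized together with padding=True, the cues above say the shorter encodings are padded to the longest one in the batch, the attention mask zeroes out the padding, and the token type IDs stay correctly aligned. A minimal sketch of that padding step (the `pad_batch` helper and pad ID 0 are illustrative assumptions, not the library's actual implementation):

```python
# Sketch of batch padding for already-encoded sentence pairs.
def pad_batch(batch_ids, batch_type_ids, pad_id=0):
    max_len = max(len(ids) for ids in batch_ids)
    input_ids, token_type_ids, attention_mask = [], [], []
    for ids, types in zip(batch_ids, batch_type_ids):
        pad = max_len - len(ids)
        input_ids.append(ids + [pad_id] * pad)             # pad the IDs
        token_type_ids.append(types + [0] * pad)           # padding counts as type 0
        attention_mask.append([1] * len(ids) + [0] * pad)  # mask out padding
    return input_ids, token_type_ids, attention_mask

ids, types, mask = pad_batch([[101, 7, 102, 8, 102], [101, 7, 9, 102, 8, 9, 102]],
                             [[0, 0, 0, 1, 1], [0, 0, 0, 0, 1, 1, 1]])
print(mask[0])  # [1, 1, 1, 1, 1, 0, 0]
```

The zeros in the attention mask tell the model to ignore the padded positions, so both outputs in the batch end up the same length without changing what the model sees.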

60
@@ -300,17 +300,17 @@ Zooming in at the result,

61
00:02:43,140 --> 00:02:45,030
我们还可以看到标记化添加的填充
we can see also tokenize added padding
我们可以看到分词器如何添加填充
we can see how the tokenizer added padding

62
00:02:45,030 --> 00:02:48,090
到第二对句子来制作两个输出
到第二对句子使得两个输出的
to the second pair sentences to make the two outputs

63
00:02:48,090 --> 00:02:51,360
相同的长度,并正确处理 token 类型 ID
长度相同,并正确处理标记类型 ID
the same length, and properly dealt with token type IDs

64