diff --git a/subtitles/zh-CN/03_what-is-transfer-learning.srt b/subtitles/zh-CN/03_what-is-transfer-learning.srt
index 8a8a0b509..4ddaf9c5d 100644
--- a/subtitles/zh-CN/03_what-is-transfer-learning.srt
+++ b/subtitles/zh-CN/03_what-is-transfer-learning.srt
@@ -5,22 +5,22 @@

 2
 00:00:05,550 --> 00:00:07,293
-- 什么是转移学习?
+- 什么是迁移学习?
 - What is transfer learning?

 3
 00:00:09,480 --> 00:00:10,920
-转移学习的思想
+迁移学习的思想
 The idea of transfer learning

 4
 00:00:10,920 --> 00:00:12,570
-是利用所获得的知识
+是利用在另一项任务上使用大量数据训练的模型
 is to leverage the knowledge acquired

 5
 00:00:12,570 --> 00:00:15,543
-通过在另一项任务上使用大量数据训练的模型。
+所获得的知识。
 by a model trained with lots of data on another task.

 6
@@ -30,12 +30,12 @@ The model A will be trained specifically for task A.

 7
 00:00:20,130 --> 00:00:22,200
-现在假设你想训练模型 B
+现在假设您想为了另一个任务
 Now let's say you want to train a model B

 8
 00:00:22,200 --> 00:00:23,970
-为了不同的任务。
+训练模型 B。
 for a different task.

 9
@@ -45,17 +45,17 @@ One option would be to train the model from scratch.

 10
 00:00:27,330 --> 00:00:30,633
-这可能需要大量的计算、时间和数据。
+但这可能需要大量的计算、时间和数据。
 This could take lots of computation, time and data.

 11
 00:00:31,470 --> 00:00:34,260
-相反,我们可以初始化模型 B
+我们还有另一种选择:初始化模型 B
 Instead, we could initialize model B

 12
 00:00:34,260 --> 00:00:36,570
-与模型 A 具有相同的权重,
+使其与模型 A 具有相同的权重,
 with the same weights as model A,

 13
@@ -75,37 +75,37 @@ all the model's weight are initialized randomly.

 16
 00:00:45,870 --> 00:00:48,870
-在这个例子中,我们正在训练一个 BERT 模型
+在这个例子中,我们正在基于识别任务
 In this example, we are training a BERT model

 17
 00:00:48,870 --> 00:00:50,220
-在识别任务上
+训练一个 BERT 模型
 on the task of recognizing

 18
 00:00:50,220 --> 00:00:52,203
-两个句子是否相似。
+来判断两个句子是否相似。
 if two sentences are similar or not.

 19
 00:00:54,116 --> 00:00:56,730
-在左边,它是从头开始训练的,
+左边的例子是从头开始训练的,
 On the left, it's trained from scratch,

 20
 00:00:56,730 --> 00:01:00,000
-在右侧,它正在微调预训练模型。
+右边的例子则是在微调预训练模型。
 and on the right it's fine-tuning a pretrained model.

 21
 00:01:00,000 --> 00:01:02,220
-正如我们所见,使用转移学习
+正如我们所见,使用迁移学习
 As we can see, using transfer learning

 22
 00:01:02,220 --> 00:01:05,160
-并且预训练模型产生了更好的结果。
+和预训练模型产生了更好的结果。
 and the pretrained model yields better results.

 23
@@ -120,7 +120,7 @@ The training from scratch is capped around 70% accuracy

 25
 00:01:10,620 --> 00:01:13,293
-而预训练模型轻松击败了 86%。
+而预训练模型轻松达到了 86%。
 while the pretrained model beats the 86% easily.

 26
@@ -130,17 +130,17 @@ This is because pretrained models

 27
 00:01:16,140 --> 00:01:18,420
-通常接受大量数据的训练
+通常基于大量数据进行训练
 are usually trained on large amounts of data

 28
 00:01:18,420 --> 00:01:21,000
-为模型提供统计理解
+这些数据为模型提供了
 that provide the model with a statistical understanding

 29
 00:01:21,000 --> 00:01:23,413
-预训练期间使用的语言。
+对预训练期间所用语言的统计理解。
 of the language used during pretraining.

 30
@@ -150,7 +150,7 @@ In computer vision,

 31
 00:01:25,950 --> 00:01:28,080
-转移学习已成功应用
+迁移学习已成功应用
 transfer learning has been applied successfully

 32
@@ -165,7 +165,7 @@ Models are frequently pretrained on ImageNet,

 34
 00:01:32,850 --> 00:01:36,153
-包含 120 万张照片图像的数据集。
+它是一个包含 120 万张图像的数据集。
 a dataset containing 1.2 millions of photo images.

 35
@@ -190,12 +190,12 @@ In Natural Language Processing,

 39
 00:01:49,140 --> 00:01:51,870
-转移学习是最近才出现的。
+迁移学习是最近才出现的。
 transfer learning is a bit more recent.
 40
 00:01:51,870 --> 00:01:54,480
-与 ImageNet 的一个关键区别是预训练
+它与 ImageNet 的一个关键区别是预训练
 A key difference with ImageNet is that the pretraining

 41
@@ -205,12 +205,12 @@ is usually self-supervised,

 42
 00:01:56,460 --> 00:01:58,770
-这意味着它不需要人工注释
+这意味着它不需要人工对标签
 which means it doesn't require humans annotations

 43
 00:01:58,770 --> 00:01:59,673
-对于标签。
+进行注释。
 for the labels.

 44
@@ -225,7 +225,7 @@ is to guess the next word in a sentence.

 46
 00:02:05,310 --> 00:02:07,710
-这只需要大量的文本。
+它只需要大量的文本。
 Which only requires lots and lots of text.

 47
@@ -235,12 +235,12 @@ GPT-2 for instance, was pretrained this way

 48
 00:02:10,710 --> 00:02:12,900
-使用 4500 万个链接的内容
+它使用了用户在 Reddit 上发布的
 using the content of 45 millions links

 49
 00:02:12,900 --> 00:02:14,673
-用户在 Reddit 上发布。
+4500 万个链接的内容。
 posted by users on Reddit.

 50
@@ -260,42 +260,42 @@ Which is similar to fill-in-the-blank tests

 53
 00:02:24,540 --> 00:02:26,760
-你可能在学校做过。
+您可能在学校做过。
 you may have done in school.

 54
 00:02:26,760 --> 00:02:29,880
-BERT 是使用英文维基百科以这种方式进行预训练的
+BERT 是以这种方式,使用英文维基百科
 BERT was pretrained this way using the English Wikipedia

 55
 00:02:29,880 --> 00:02:31,893
-和 11,000 本未出版的书籍。
+和 11,000 本未出版的书籍预训练的。
 and 11,000 unpublished books.

 56
 00:02:33,120 --> 00:02:36,450
-在实践中,转移学习应用于给定模型
+在实践中,迁移学习是通过抛弃原模型的头部
 In practice, transfer learning is applied on a given model

 57
 00:02:36,450 --> 00:02:39,090
-通过扔掉它的头,也就是说,
+即其针对预训练目标的最后几层,
 by throwing away its head, that is,

 58
 00:02:39,090 --> 00:02:42,150
-它的最后一层专注于预训练目标,
+并用一个新的、随机初始化的头部
 its last layers focused on the pretraining objective,

 59
 00:02:42,150 --> 00:02:45,360
-并用一个新的、随机初始化的头替换它
+替换它来应用的。
 and replacing it with a new, randomly initialized head

 60
 00:02:45,360 --> 00:02:46,860
-适合手头的任务。
+这个新的头部适用于当前的任务。
 suitable for the task at hand.

 61
@@ -320,37 +320,37 @@ Since our task had two labels.

 65
 00:02:59,700 --> 00:03:02,490
-为了尽可能高效,使用预训练模型
+为了尽可能高效,
 To be as efficient as possible, the pretrained model used

 66
 00:03:02,490 --> 00:03:03,770
-应该尽可能相似
+所使用的预训练模型
 should be as similar as possible

 67
 00:03:03,770 --> 00:03:06,270
-对其进行微调的任务。
+应尽可能与其微调的任务相似。
 to the task it's fine-tuned on.

 68
 00:03:06,270 --> 00:03:08,190
-例如,如果问题
+例如,如果当前需要
 For instance, if the problem

 69
 00:03:08,190 --> 00:03:10,860
-是对德语句子进行分类,
+对德语句子进行分类,
 is to classify German sentences,

 70
 00:03:10,860 --> 00:03:13,053
-最好使用德国预训练模型。
+最好使用德语预训练模型。
 it's best to use a German pretrained model.

 71
 00:03:14,370 --> 00:03:16,649
-但好事也有坏事。
+但好处也伴随着坏处。
 But with the good comes the bad.

 72
@@ -360,87 +360,87 @@ The pretrained model does not only transfer its knowledge,

 73
 00:03:19,380 --> 00:03:21,693
-以及它可能包含的任何偏见。
+同时也转移了它可能包含的任何偏见。
 but also any bias it may contain.

 74
 00:03:22,530 --> 00:03:24,300
-ImageNet 主要包含图像
+ImageNet 主要包含来自美国和西欧
 ImageNet mostly contains images

 75
 00:03:24,300 --> 00:03:26,850
-来自美国和西欧。
+的图像。
 coming from the United States and Western Europe.

 76
 00:03:26,850 --> 00:03:28,020
-所以模型用它微调
+所以基于它进行微调的模型
 So models fine-tuned with it

 77
 00:03:28,020 --> 00:03:31,710
-通常会在来自这些国家 / 地区的图像上表现更好。
+通常会在来自这些国家或地区的图像上表现更好。
 usually will perform better on images from these countries.

 78
 00:03:31,710 --> 00:03:33,690
-OpenAI 还研究了偏差
+OpenAI 还研究了
 OpenAI also studied the bias

 79
 00:03:33,690 --> 00:03:36,120
-在其 GPT-3 模型的预测中
+其使用“猜测下一个单词”目标
 in the predictions of its GPT-3 model

 80
 00:03:36,120 --> 00:03:36,953
-这是预训练的
+预训练的 GPT-3 模型中
 which was pretrained

 81
 00:03:36,953 --> 00:03:38,750
-使用猜测下一个单词目标。
+预测的偏差。
 using the guess the next word objective.
 82
 00:03:39,720 --> 00:03:41,040
-更改提示的性别
+将提示的性别
 Changing the gender of the prompt

 83
 00:03:41,040 --> 00:03:44,250
-从他非常到她非常
+从“他”改为“她”
 from he was very to she was very

 84
 00:03:44,250 --> 00:03:47,550
-改变了大多数中性形容词的预测
+会使预测从主要是中性形容词
 changed the predictions from mostly neutral adjectives

 85
 00:03:47,550 --> 00:03:49,233
-几乎只有物理的。
+变为几乎只有描述外貌的形容词。
 to almost only physical ones.

 86
 00:03:50,400 --> 00:03:52,367
-在他们的 GPT-2 模型的模型卡中,
+在他们的 GPT-2 模型卡中,
 In their model card of the GPT-2 model,

 87
 00:03:52,367 --> 00:03:54,990
-OpenAI 也承认它的偏见
+OpenAI 也承认了它的偏见
 OpenAI also acknowledges its bias

 88
 00:03:54,990 --> 00:03:56,730
-并且不鼓励使用它
+并且不鼓励在与人类交互的系统中
 and discourages its use

 89
 00:03:56,730 --> 00:03:58,803
-在与人类交互的系统中。
+使用它。
 in systems that interact with humans.

 90
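For readers who want to see the head replacement described in cues 56-60 (and the two-label head mentioned around cue 61) in code: this is what loading a pretrained checkpoint for a new task looks like in practice. A minimal sketch, assuming the Hugging Face transformers library; the checkpoint name and the two-label setup are illustrative assumptions, not something the subtitles prescribe:

```python
# Sketch of the head replacement described in cues 56-60, assuming the
# Hugging Face transformers library. "bert-base-cased" and num_labels=2
# are illustrative choices (e.g. for a two-label sentence-pair task).
from transformers import AutoModelForSequenceClassification

# The pretrained body keeps its learned weights, while the pretraining head
# is discarded and a new two-label classification head is randomly
# initialized -- transformers logs a warning that this new head still needs
# to be fine-tuned on the task at hand.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased",
    num_labels=2,
)
```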