Merge pull request #505 from iCell/shawn/review-05
docs(zh-cn): Reviewed 05_transformer-models-encoders.srt
xianbaoqian authored Feb 27, 2023
2 parents df3e0cf + f877f03 commit e1593ee
Showing 1 changed file with 56 additions and 56 deletions.
112 changes: 56 additions & 56 deletions subtitles/zh-CN/05_transformer-models-encoders.srt
@@ -1,6 +1,6 @@
1
00:00:00,253 --> 00:00:03,003
介绍引人注目
引人注目的介绍
(intro striking)

2
@@ -10,12 +10,12 @@

3
00:00:07,830 --> 00:00:11,070
一个流行的仅编码器架构的例子是 BERT
一个流行的仅使用编码器架构的例子是 BERT
An example of a popular encoder only architecture is BERT

4
00:00:11,070 --> 00:00:13,323
这是同类产品中最受欢迎的型号
这是同类产品中最受欢迎的模型
which is the most popular model of its kind.

5
@@ -25,37 +25,37 @@ Let's first start by understanding how it works.

6
00:00:18,360 --> 00:00:20,910
我们将使用一个使用三个词的小例子
我们将使用一个三个单词的小例子
We'll use a small example using three words.

7
00:00:20,910 --> 00:00:23,823
我们使用这些作为输入并将它们传递给编码器
我们使用这些作为输入传递给编码器
We use these as inputs and pass them through the encoder.

8
00:00:25,290 --> 00:00:28,173
我们检索每个单词的数字表示
得到了每个单词的数值表示
We retrieve a numerical representation of each word.

9
00:00:29,970 --> 00:00:32,700
例如,在这里,编码器转换这三个词
例如,在这里,编码器将这三个词
Here, for example, the encoder converts those three words,

10
00:00:32,700 --> 00:00:37,350
欢迎来到纽约,在这三个数字序列中
welcome to NYC,转换为这三个数字序列
welcome to NYC, in these three sequences of numbers.

11
00:00:37,350 --> 00:00:40,350
编码器只输出一个数字序列
编码器对于每个输入单词
The encoder outputs exactly one sequence of numbers

12
00:00:40,350 --> 00:00:41,493
每个输入词
精确输出一个数字序列
per input word.

13
@@ -65,7 +65,7 @@ This numerical representation can also be called

14
00:00:44,880 --> 00:00:47,163
特征向量或特征张量
特征向量(feature vector)或特征张量(feature tensor)
a feature vector, or a feature tensor.

15
@@ -85,22 +85,22 @@ that was passed through the encoder.

18
00:00:56,130 --> 00:00:58,620
这些向量中的每一个都是一个数字表示
每个向量都是
Each of these vector is a numerical representation

19
00:00:58,620 --> 00:01:00,033
有问题的词
该词的数字表示
of the word in question.

20
00:01:01,080 --> 00:01:03,300
该向量的维度被定义
该向量的维度由
The dimension of that vector is defined

21
00:01:03,300 --> 00:01:05,520
通过模型的架构
模型的架构所决定
by the architecture of the model.

22
@@ -115,17 +115,17 @@ These representations contain the value of a word,

24
00:01:13,230 --> 00:01:15,240
但语境化
但包含上下文化的处理
but contextualized.

25
00:01:15,240 --> 00:01:18,570
例如,归因于单词 “to” 的向量
例如,与单词 "to" 相关联的向量
For example, the vector attributed to the word "to"

26
00:01:18,570 --> 00:01:22,290
不只是 “to” 这个词的代表
不只是 “to” 这个词的表示
isn't the representation of only the "to" word.

27
@@ -150,7 +150,7 @@ the words on the left of the one we're studying,

31
00:01:32,970 --> 00:01:34,980
这里是 “欢迎” 这个词,
这里是 “Welcome” 这个词,
here the word "Welcome",

32
@@ -180,22 +180,22 @@ holds the meaning of the word within the text.

37
00:01:53,310 --> 00:01:56,073
由于自我注意机制,它做到了这一点。
由于自注意力机制,它做到了这一点。
It does this thanks to the self-attention mechanism.

38
00:01:57,240 --> 00:02:00,630
自注意力机制涉及到不同的位置,
自注意力机制指的是与单个序列中的不同位置
The self-attention mechanism relates to different positions,

39
00:02:00,630 --> 00:02:02,850
或单个序列中的不同单词
或不同单词相关联
or different words in a single sequence

40
00:02:02,850 --> 00:02:06,003
为了计算该序列的表示
以计算该序列的表示形式
in order to compute a representation of that sequence.

41
@@ -220,17 +220,17 @@ We won't dive into the specifics here

45
00:02:18,030 --> 00:02:19,680
这将提供一些进一步的阅读
我们会提供一些进一步的阅读资料
which will offer some further readings

46
00:02:19,680 --> 00:02:21,330
如果你想获得更好的理解
如果您想对底层发生了什么
if you want to get a better understanding

47
00:02:21,330 --> 00:02:22,953
在引擎盖下发生的事情
有更好的理解
at what happens under the hood.

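The subtitles above describe the core encoder behaviour: one contextualized vector per input token, with a size fixed by the architecture. A minimal sketch of that behaviour with the transformers library, assuming the bert-base-uncased checkpoint purely for illustration, could look like this:

```python
# Minimal sketch (not part of the commit): retrieve per-token encoder features.
# The checkpoint name is an assumed example; any BERT-like encoder behaves similarly.
from transformers import AutoModel, AutoTokenizer

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

inputs = tokenizer("Welcome to NYC", return_tensors="pt")
outputs = model(**inputs)

# One vector per token (special tokens included), each of dimension 768 for the
# base architecture; the values are contextualized by self-attention.
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 5, 768])
```

Here "Welcome to NYC" tokenizes into five tokens (including the special [CLS] and [SEP] tokens), so the output holds five 768-dimensional feature vectors, one per token.
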
48
@@ -250,17 +250,17 @@ in a wide variety of tasks.

51
00:02:32,100 --> 00:02:33,360
例如,伯特
例如,BERT
For example, BERT,

52
00:02:33,360 --> 00:02:35,670
可以说是最著名的变压器模型
可以说是最著名的 transformer 模型
arguably the most famous transformer model,

53
00:02:35,670 --> 00:02:37,590
是一个独立的编码器模型
它是一个独立的编码器模型
is a standalone encoder model,

54
@@ -270,12 +270,12 @@ and at the time of release,

55
00:02:38,820 --> 00:02:40,440
这将是最先进的
它是许多
it'd be the state of the art

56
00:02:40,440 --> 00:02:42,780
在许多序列分类任务中,
序列分类任务
in many sequence classification tasks,

57
@@ -285,32 +285,32 @@ question answering tasks,

58
00:02:44,190 --> 00:02:46,743
掩码语言建模仅举几例
和掩码语言建模等任务中的最先进技术
and mask language modeling to only cite of few.

59
00:02:48,150 --> 00:02:50,460
这个想法是编码器非常强大
编码器非常擅长
The idea is that encoders are very powerful

60
00:02:50,460 --> 00:02:52,470
在提取携带载体
提取包含有意义信息的
at extracting vectors that carry

61
00:02:52,470 --> 00:02:55,350
关于序列的有意义的信息
关于序列的向量
meaningful information about a sequence.

62
00:02:55,350 --> 00:02:57,870
然后可以在路上处理这个向量
这个向量可以被传递给后续的神经元来进一步处理
This vector can then be handled down the road

63
00:02:57,870 --> 00:03:00,070
通过额外的神经元来理解它们
以便理解其中包含的信息
by additional neurons to make sense of them.

64
@@ -330,22 +330,22 @@ First of all, Masked Language Modeling, or MLM.

67
00:03:09,900 --> 00:03:11,970
这是预测隐藏词的任务
这是在一个单词序列中
It's the task of predicting a hidden word

68
00:03:11,970 --> 00:03:13,590
在一个单词序列中
预测隐藏词的任务
in a sequence of word.

69
00:03:13,590 --> 00:03:15,630
在这里,例如,我们隐藏了这个词
在这里,例如,我们在 “My” 和 “is” 之间
Here, for example, we have hidden the word

70
00:03:15,630 --> 00:03:17,247
在 “我的” 和 “是” 之间
隐藏了这个词
between "My" and "is".

71
@@ -360,7 +360,7 @@ It was trained to predict hidden words in a sequence.

73
00:03:25,230 --> 00:03:27,930
编码器尤其在这种情况下大放异彩
编码器在这种情况下尤其大放异彩
Encoders shine in this scenario in particular

74
@@ -375,32 +375,32 @@ If we didn't have the words on the right,

76
00:03:32,947 --> 00:03:34,650
”、“Sylvain” 和 “.”,
is”、“Sylvain” 和 “.”,
"is", "Sylvain" and the ".",

77
00:03:34,650 --> 00:03:35,940
那么机会就很小
那么BERT 将能够识别名称的
then there is very little chance

78
00:03:35,940 --> 00:03:38,580
BERT 将能够识别名称
作为正确的词
that BERT would have been able to identify name

79
00:03:38,580 --> 00:03:40,500
作为正确的词
的机会就很小
as the correct word.

80
00:03:40,500 --> 00:03:42,270
编码器需要有很好的理解
为了预测一个掩码单词
The encoder needs to have a good understanding

81
00:03:42,270 --> 00:03:45,360
序列以预测掩码词
编码器需要对序列有很好的理解
of the sequence in order to predict a masked word

82
@@ -410,12 +410,12 @@ as even if the text is grammatically correct,

83
00:03:48,840 --> 00:03:50,610
它不一定有意义
但不一定符合
it does not necessarily make sense

84
00:03:50,610 --> 00:03:52,413
在序列的上下文中
序列的上下文
in the context of the sequence.

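A minimal sketch of the masked language modeling task described above, assuming the fill-mask pipeline and the bert-base-uncased checkpoint purely for illustration:

```python
# Minimal sketch (not part of the commit): predict the hidden word between
# "My" and "is", as in the subtitles' example. The checkpoint is an assumption.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
for prediction in unmasker("My [MASK] is Sylvain."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Each prediction carries a candidate word for the masked slot and a confidence score; an encoder trained this way should rank "name" highly because it has read the whole sentence, left and right context included.
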
85
@@ -440,22 +440,22 @@ The model's aim is to identify the sentiment of a sequence.

89
00:04:09,540 --> 00:04:11,280
它的范围可以从给出一个序列
它可以从给出的一个序列
It can range from giving a sequence,

90
00:04:11,280 --> 00:04:12,960
从一颗星到五颗星的评级
做出一颗星到五颗星的评级
a rating from one to five stars

91
00:04:12,960 --> 00:04:15,900
如果进行评论分析以给予肯定,
如果进行评论分析
if doing review analysis to giving a positive,

92
00:04:15,900 --> 00:04:17,820
或对序列的负面评价
来对一个序列进行积极或消极的评级
or negative rating to a sequence

93
@@ -495,7 +495,7 @@ containing the same words,

100
00:04:35,220 --> 00:04:37,170
意义完全不同
意义却完全不同
the meaning is entirely different,

101
@@ -505,6 +505,6 @@ and the encoder model is able to grasp that difference.

102
00:04:41,404 --> 00:04:44,154
结尾引人注目
引人注目的结尾
(outro striking)

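As a closing illustration of the sequence classification use case discussed in the subtitles, here is a minimal sentiment analysis sketch, assuming an off-the-shelf encoder-based checkpoint (distilbert-base-uncased-finetuned-sst-2-english) purely as an example:

```python
# Minimal sketch (not part of the commit): sentiment analysis with an
# encoder-based checkpoint; the model name is an assumed example.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("I loved this movie, would watch it again!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

The pipeline returns a POSITIVE or NEGATIVE label together with a score, which matches the "positive or negative rating to a sequence" described in the subtitles.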