diff --git a/subtitles/zh-CN/60_what-is-the-bleu-metric.srt b/subtitles/zh-CN/60_what-is-the-bleu-metric.srt
index e3fce657e..7de1b2163 100644
--- a/subtitles/zh-CN/60_what-is-the-bleu-metric.srt
+++ b/subtitles/zh-CN/60_what-is-the-bleu-metric.srt
@@ -20,52 +20,52 @@
 5
 00:00:07,650 --> 00:00:10,170
-对于许多 NLP 任务,我们可以使用通用指标
+对于许多 NLP 任务,我们可以使用常见指标
 For many NLP tasks we can use common metrics

 6
 00:00:10,170 --> 00:00:12,810
-比如准确性或 F1 分数,但你会做什么
+比如准确率或 F1 分数,
 like accuracy or F1 score, but what do you do

 7
 00:00:12,810 --> 00:00:14,340
-当你想衡量文本的质量时
+但是当你想衡量模型所翻译的文本的质量时
 when you wanna measure the quality of text

 8
 00:00:14,340 --> 00:00:16,560
-那是从模型翻译过来的?
+该如何评估呢?
 that's been translated from a model?

 9
 00:00:16,560 --> 00:00:18,750
-在本视频中,我们将了解一个广泛使用的指标
+在本视频中,我们将为大家介绍一个
 In this video, we'll take a look at a widely used metric

 10
 00:00:18,750 --> 00:00:20,613
-用于称为 BLEU 的机器翻译。
+广泛用于机器翻译的指标,叫做 BLEU。
 for machine translation called BLEU.

 11
 00:00:22,290 --> 00:00:23,940
-BLEU 背后的基本思想是分配
+BLEU 背后的基本逻辑是
 The basic idea behind BLEU is to assign

 12
 00:00:23,940 --> 00:00:26,250
-翻译的单一数字分数
+为每个翻译分配一个单一的数字评分
 a single numerical score to a translation

 13
 00:00:26,250 --> 00:00:27,450
-这告诉我们它有多好
+用于评估
 that tells us how good it is

 14
 00:00:27,450 --> 00:00:30,199
-与一个或多个参考翻译相比。
+它与一个或者多个参考翻译相比,其质量的优劣。
 compared to one or more reference translations.

 15
@@ -80,12 +80,12 @@ that has been translated into English by some model.
 17
 00:00:35,340 --> 00:00:37,170
-如果我们比较生成的翻译
+如果我们将生成的翻译
 If we compare the generated translation

 18
 00:00:37,170 --> 00:00:39,150
-一些参考人工翻译,
+与一些用于参考的人工翻译进行比较,
 to some reference human translations,

 19
@@ -100,47 +100,47 @@ but has made a common error.
 21
 00:00:43,260 --> 00:00:46,050
-西班牙语单词 tengo 在英语中的意思是,
+西班牙语单词 tengo 在英语中的意思是 have,
 The Spanish word tengo means have in English,

 22
 00:00:46,050 --> 00:00:48,700
-这种一对一的翻译不太自然。
+这种一一对应的直译不太自然。
 and this one-to-one translation is not quite natural.

 23
 00:00:49,890 --> 00:00:51,270
-那么我们如何衡量质量
+那么对于使用某种自动的方法生成的翻译
 So how can we measure the quality

 24
 00:00:51,270 --> 00:00:54,270
-以某种自动方式生成的翻译?
+我们如何来评估它的质量呢?
 of a generated translation in some automatic way?

 25
 00:00:54,270 --> 00:00:56,730
-BLEU 采用的方法是比较 n-gram
+BLEU 采用的方法是
 The approach that BLEU takes is to compare the n-grams

 26
 00:00:56,730 --> 00:00:58,550
-生成的 n-gram 翻译
+将所生成翻译的 n-gram 和参考翻译的 n-gram
 of the generated translation to the n-grams

 27
 00:00:58,550 --> 00:01:00,390
-在参考资料中。
+进行比较。
 in the references.

 28
 00:01:00,390 --> 00:01:02,400
-现在,n-gram 只是一种奇特的说法
+现在,n-gram 只是一种奇特的说法,
 Now, an n-gram is just a fancy way of saying

 29
 00:01:02,400 --> 00:01:03,960
-一大块 n 个单词。
+指的就是由 n 个单词组成的语块。
 a chunk of n words.

 30
@@ -150,32 +150,32 @@ So let's start with unigrams,
 31
 00:01:05,220 --> 00:01:08,020
-对应于句子中的单个单词。
+它对应于句子中的单个单词。
 which corresponds to the individual words in a sentence.

 32
 00:01:08,880 --> 00:01:11,250
-在此示例中,你可以看到其中四个单词
+在此示例中,你可以看到所生成的翻译中有四个单词
 In this example, you can see that four of the words

 33
 00:01:11,250 --> 00:01:13,140
-在生成的翻译中也发现
+在其中一条参考翻译中
 in the generated translation are also found

 34
 00:01:13,140 --> 00:01:14,990
-在其中一个参考翻译中。
+也出现了。
 in one of the reference translations.

 35
 00:01:16,350 --> 00:01:18,240
-一旦我们找到了我们的比赛,
+一旦我们找到了匹配项,
 And once we've found our matches,

 36
 00:01:18,240 --> 00:01:20,130
-一种给译文打分的方法
+给译文打分的一种方法
 one way to assign a score to the translation

 37
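
A minimal Python sketch of the unigram precision described in the cues above
(the sentences are the video's running example; whitespace tokenization is an
illustrative assumption, not the course's actual code):

    # Unigram precision: matching words divided by words in the generation.
    generation = "I have thirty six years".split()
    reference = "I am thirty six years old".split()

    # Count generated words that also appear in the reference.
    matches = sum(1 for word in generation if word in reference)
    precision = matches / len(generation)
    print(precision)  # 4 of 5 words match -> 0.8
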
@@ -185,22 +185,22 @@ is to compute the precision of the unigrams.
 38
 00:01:23,070 --> 00:01:25,200
-这意味着我们只计算匹配词的数量
+这意味着我们在生成的和参考的翻译中
 This means we just count the number of matching words

 39
 00:01:25,200 --> 00:01:27,360
-在生成的和参考的翻译中
+只计算匹配词的数量
 in the generated and reference translations

 40
 00:01:27,360 --> 00:01:29,660
-并通过除以单词数来归一化计数
+并且通过除以生成结果的单词数
 and normalize the count by dividing by the number of words

 41
 00:01:29,660 --> 00:01:30,753
-在这一代。
+来归一化计数值。
 in the generation.

 42
@@ -210,7 +210,7 @@ In this example, we found four matching words
 43
 00:01:34,080 --> 00:01:36,033
-而我们这一代人有五个字。
+而我们的生成结果中有五个单词。
 and our generation has five words.

 44
@@ -225,7 +225,7 @@ and higher precision scores mean a better translation.
 46
 00:01:44,160 --> 00:01:45,570
-但这并不是故事的全部
+但是到这里还没有结束
 But this isn't really the whole story

 47
@@ -235,17 +235,17 @@ because one problem with unigram precision
 48
 00:01:47,310 --> 00:01:49,140
-翻译模型有时会卡住吗
+翻译模型有时会陷入重复的模式中
 is that translation models sometimes get stuck

 49
 00:01:49,140 --> 00:01:51,330
-以重复的方式重复同一个词
+一遍又一遍地重复
 in repetitive patterns and just repeat the same word

 50
 00:01:51,330 --> 00:01:52,293
-几次。
+同一个单词。
 several times.

 51
@@ -260,12 +260,12 @@ we can get really high precision scores
 53
 00:01:56,370 --> 00:01:57,840
-虽然翻译很烂
+即使从人类的角度来看
 even though the translation is terrible

 54
 00:01:57,840 --> 00:01:59,090
-从人的角度来看!
+这个翻译很糟糕!
 from a human perspective!

 55
@@ -280,32 +280,32 @@ we get a perfect unigram precision score.
 57
 00:02:06,960 --> 00:02:09,930
-所以为了处理这个问题,BLEU 使用了修改后的精度
+所以为了解决这个问题,BLEU 使用了修改后的精度
 So to handle this, BLEU uses a modified precision

 58
 00:02:09,930 --> 00:02:12,210
-剪掉计算一个单词的次数,
+它会根据一个单词在参考翻译中
 that clips the number of times to count a word,

 59
 00:02:12,210 --> 00:02:13,680
-基于最大次数
+出现的最大次数
 based on the maximum number of times

 60
 00:02:13,680 --> 00:02:16,399
-它出现在参考翻译中。
+来限制该单词的计数次数。
 it appears in the reference translation.

 61
 00:02:16,399 --> 00:02:18,630
-在这个例子中,单词 six 只出现了一次
+在这个例子中,单词 six 只在参考翻译中出现了一次
 In this example, the word six only appears once

 62
 00:02:18,630 --> 00:02:21,360
-在参考中,所以我们把分子剪成一
+所以我们把分子限制为 1
 in the reference, so we clip the numerator to one

 63
@@ -335,27 +335,27 @@ the order in which the words appear in the translations.
 68
 00:02:33,900 --> 00:02:35,700
-例如,假设我们有 Yoda
+例如,假设我们有 Yoda 为我们
 For example, suppose we had Yoda

 69
 00:02:35,700 --> 00:02:37,410
-翻译我们的西班牙语句子,
+翻译西班牙语句子,
 translate our Spanish sentence,

 70
 00:02:37,410 --> 00:02:39,457
-那么我们可能会得到一些倒退的东西,比如,
+那么我们可能会得到一些语序颠倒的结果,
 then we might get something backwards like,

 71
 00:02:39,457 --> 00:02:42,450
-“我已经六十岁了。”
+比如,“Years sixty thirty have I.”
 "Years sixty thirty have I."

 72
 00:02:42,450 --> 00:02:44,670
-在这种情况下,修改后的 unigram 精度
+在这种情况下,修改后的 unigram 精度值
 In this case, the modified unigram precision

 73
@@ -370,12 +370,12 @@ So to deal with word ordering problems,
 75
 00:02:50,460 --> 00:02:52,020
-BLEU 实际计算精度
+BLEU 实际上计算几个不同的 n-gram 精度值,
 BLEU actually computes the precision

 76
 00:02:52,020 --> 00:02:55,410
-对于几个不同的 n-gram,然后对结果进行平均。
+然后对结果计算平均值。
 for several different n-grams and then averages the result.

 77
@@ -385,22 +385,22 @@ For example, if we compare 4-grams,
 78
 00:02:57,300 --> 00:02:58,830
-我们可以看到没有匹配的块
+我们可以看到翻译中
 we can see that there are no matching chunks

 79
 00:02:58,830 --> 00:03:01,020
-翻译中的四个词,
+没有匹配的四词语块,
 of four words in the translations,

 80
 00:03:01,020 --> 00:03:02,913
-所以 4 克精度为 0。
+所以 4-gram 精度为 0。
 and so the 4-gram precision is 0.

 81
 00:03:05,460 --> 00:03:07,560
-现在,计算数据集库中的 BLEU 分数
+现在,使用 Datasets 库计算 BLEU 分数
 Now, to compute BLEU scores in Datasets library

 82
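
A minimal sketch of the clipped (modified) unigram precision from the cues
above, under the same illustrative assumptions (whitespace tokenization, the
video's example sentences):

    from collections import Counter

    # Each word's count in the generation is clipped to the maximum
    # number of times it appears in the reference.
    generation = "six six six six six".split()
    reference = "I am thirty six years old".split()

    gen_counts = Counter(generation)
    ref_counts = Counter(reference)
    clipped = sum(min(n, ref_counts[word]) for word, n in gen_counts.items())
    precision = clipped / len(generation)
    print(precision)  # numerator clipped to 1 -> 1/5 = 0.2
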
@@ -420,12 +420,12 @@ provide your model's predictions with their references
 85
 00:03:13,290 --> 00:03:14,390
-你很高兴去!
+然后就一切就绪!
 and you're good to go!

 86
 00:03:16,470 --> 00:03:19,200
-输出将包含几个感兴趣的字段。
+输出将包含几个值得关注的字段。
 The output will contain several fields of interest.

 87
@@ -435,27 +435,27 @@ The precisions field contains
 88
 00:03:20,490 --> 00:03:23,133
-每个 n-gram 的所有单独精度分数。
+每个 n-gram 各自的精度分数。
 all the individual precision scores for each n-gram.

 89
 00:03:25,050 --> 00:03:26,940
-然后计算 BLEU 分数本身
+然后 BLEU 分数本身
 The BLEU score itself is then calculated

 90
 00:03:26,940 --> 00:03:30,090
-通过取精度分数的几何平均值。
+通过取精度分数的几何平均值进行计算。
 by taking the geometric mean of the precision scores.

 91
 00:03:30,090 --> 00:03:32,790
-默认情况下,所有四个 n-gram 精度的平均值
+默认情况下,所有四个 n-gram 精度的平均值都会输出,
 And by default, the mean of all four n-gram precisions

 92
 00:03:32,790 --> 00:03:35,793
-据报道,该指标有时也称为 BLEU-4。
+该指标有时也称为 BLEU-4。
 is reported, a metric that is sometimes also called BLEU-4.

 93
@@ -465,7 +465,7 @@ In this example, we can see the BLEU score is zero
 94
 00:03:38,880 --> 00:03:40,780
-因为 4 克精度为零。
+因为 4-gram 精度为零。
 because the 4-gram precision was zero.

 95
@@ -475,7 +475,7 @@ Now, the BLEU metric has some nice properties,
 96
 00:03:45,390 --> 00:03:47,520
-但这远非一个完美的指标。
+但它远非一个完美的评估指标。
 but it is far from a perfect metric.

 97
@@ -490,12 +490,12 @@ and it's widely used in research
 99
 00:03:50,970 --> 00:03:52,620
-这样你就可以将你的模型与其他模型进行比较
+这样你就可以在通用基准上将你的模型
 so you can compare your model against others

 100
 00:03:52,620 --> 00:03:54,630
-在共同的基准上。
+与其他模型进行比较。
 on common benchmarks.

 101
@@ -505,12 +505,12 @@ On the other hand, there are several big problems with BLEU,
 102
 00:03:56,670 --> 00:03:58,830
-包括它不包含语义的事实
+包括它并不考虑语义,
 including the fact it doesn't incorporate semantics

 103
 00:03:58,830 --> 00:04:01,920
-它在非英语语言上很挣扎。
+而且它在非英语语言上表现不佳。
 and it struggles a lot on non-English languages.

 104
@@ -525,17 +525,17 @@ is that it assumes the human translations
 106
 00:04:04,620 --> 00:04:05,820
-已经被代币化
+已经被词元化
 have already been tokenized

 107
 00:04:05,820 --> 00:04:07,320
-这使得比较模型变得困难
+这使得比较使用不同分词器的模型
 and this makes it hard to compare models

 108
 00:04:07,320 --> 00:04:08,820
-使用不同的分词器。
+变得困难。
 that use different tokenizers.

 109
@@ -560,7 +560,7 @@ is to use the SacreBLEU metric,
 113
 00:04:19,440 --> 00:04:22,830
-它解决了 BLEU 的标记化限制。
+它解决了 BLEU 的词元化限制。
 which addresses the tokenization limitations of BLEU.

 114
@@ -570,12 +570,12 @@ As you can see in this example,
 115
 00:04:24,360 --> 00:04:26,580
-计算 SacreBLEU 分数几乎相同
+计算 SacreBLEU 分数的方式与计算 BLEU 的方式
 computing the SacreBLEU score is almost identical

 116
 00:04:26,580 --> 00:04:28,020
-到 BLEU 一个。
+几乎完全相同。
 to the BLEU one.

 117
@@ -590,7 +590,7 @@ instead of a list of words to the translations,
 119
 00:04:32,640 --> 00:04:35,640
-SacreBLEU 负责底层的代币化。
+SacreBLEU 负责底层的词元化。
 and SacreBLEU takes care of the tokenization under the hood.

 120
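
For reference, a sketch of the metric call the cues above describe, using the
load_metric API that the Datasets library offered when this video was made
(newer code loads "sacrebleu" through the evaluate library instead); the
example strings are illustrative:

    from datasets import load_metric

    # SacreBLEU handles tokenization under the hood, so predictions
    # and references are plain strings.
    sacrebleu = load_metric("sacrebleu")
    predictions = ["I have thirty six years"]     # model outputs
    references = [["I am thirty-six years old"]]  # one list of references per prediction

    results = sacrebleu.compute(predictions=predictions, references=references)
    print(results["score"])  # corpus-level SacreBLEU score
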