huggingface · lewtun · May 16, 2022 · May 7, 2022
diff --git a/chapters/en/chapter6/5.mdx b/chapters/en/chapter6/5.mdx
@@ -113,7 +113,7 @@ First we need a corpus, so let's create a simple one with a few sentences:
 
 ```python
 corpus = [
-    "This is the Hugging Face course.",
+    "This is the Hugging Face Course.",
     "This chapter is about tokenization.",
     "This section shows several tokenizer algorithms.",
     "Hopefully, you will be able to understand how they are trained and generate tokens.",

diff --git a/chapters/en/chapter6/6.mdx b/chapters/en/chapter6/6.mdx
@@ -106,7 +106,7 @@ We will use the same corpus as in the BPE example:
 
 ```python
 corpus = [
-    "This is the Hugging Face course.",
+    "This is the Hugging Face Course.",
     "This chapter is about tokenization.",
     "This section shows several tokenizer algorithms.",
     "Hopefully, you will be able to understand how they are trained and generate tokens.",
@@ -307,7 +307,7 @@ print(vocab)
 ```python out
 ['[PAD]', '[UNK]', '[CLS]', '[SEP]', '[MASK]', '##a', '##b', '##c', '##d', '##e', '##f', '##g', '##h', '##i', '##k',
  '##l', '##m', '##n', '##o', '##p', '##r', '##s', '##t', '##u', '##v', '##w', '##y', '##z', ',', '.', 'C', 'F', 'H',
- 'T', 'a', 'b', 'c', 'g', 'h', 'i', 's', 't', 'u', 'w', 'y', '##fu', 'Fa', 'Fac', '##ct', '##ful', '##full', '##fully',
+ 'T', 'a', 'b', 'c', 'g', 'h', 'i', 's', 't', 'u', 'w', 'y', 'ab', '##fu', 'Fa', 'Fac', '##ct', '##ful', '##full', '##fully',
  'Th', 'ch', '##hm', 'cha', 'chap', 'chapt', '##thm', 'Hu', 'Hug', 'Hugg', 'sh', 'th', 'is', '##thms', '##za', '##zat',
  '##ut']
 ```
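The context lines of the vocabulary output above already list `'C'`. That is consistent with the capitalized "Course" in the corpus, since no other visible sentence has a word starting with an uppercase C. A minimal sketch to sanity-check this is below. It assumes the corpus is exactly the four sentences visible in the hunks (the diff context is truncated, so there may be more). It also uses a simple regex split as a stand-in for the pre-tokenizer the chapter uses, and `wordpiece_alphabet` is a hypothetical helper, not code from the course.

```python
import re

# Sentences shared by both versions, copied from the hunk context above.
# Assumption: the diff context is truncated, so the real corpus may contain more.
SHARED = [
    "This chapter is about tokenization.",
    "This section shows several tokenizer algorithms.",
    "Hopefully, you will be able to understand how they are trained and generate tokens.",
]
old_corpus = ["This is the Hugging Face course."] + SHARED
new_corpus = ["This is the Hugging Face Course."] + SHARED


def wordpiece_alphabet(corpus):
    """Collect the WordPiece-style starting alphabet: the first character of
    each word as-is, and every later character with a '##' prefix."""
    alphabet = set()
    for sentence in corpus:
        # Assumption: a plain regex split (runs of word characters, or single
        # punctuation marks) stands in for the real pre-tokenizer.
        for word in re.findall(r"\w+|[^\w\s]", sentence):
            alphabet.add(word[0])
            alphabet.update("##" + ch for ch in word[1:])
    return alphabet


old_alphabet = wordpiece_alphabet(old_corpus)
new_alphabet = wordpiece_alphabet(new_corpus)

added = new_alphabet - old_alphabet
removed = old_alphabet - new_alphabet
print("added:", sorted(added))      # added: ['C']
print("removed:", sorted(removed))  # removed: []
```

Under these assumptions the only alphabet change is the new `'C'` entry: `'c'` stays because "chapter" still starts with it. The sketch says nothing about learned merges, so regenerating the printed vocabulary still requires rerunning the chapter's training code.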

diff --git a/chapters/en/chapter6/7.mdx b/chapters/en/chapter6/7.mdx
@@ -157,7 +157,7 @@ We will use the same corpus as before as an example:
 
 ```python
 corpus = [
-    "This is the Hugging Face course.",
+    "This is the Hugging Face Course.",
     "This chapter is about tokenization.",
     "This section shows several tokenizer algorithms.",
     "Hopefully, you will be able to understand how they are trained and generate tokens.",