
Commit

Merge branch 'huggingface:main' into main
haruki-N authored Dec 2, 2022
2 parents e2cadb3 + d0a93b9 commit bd00a2d
Showing 407 changed files with 65,894 additions and 3,387 deletions.
5 changes: 1 addition & 4 deletions .github/workflows/build_pr_documentation.yml
@@ -9,7 +9,7 @@ concurrency:

jobs:
  build:
-    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@use_hf_hub
+    uses: huggingface/doc-builder/.github/workflows/build_pr_documentation.yml@main
    with:
      commit_sha: ${{ github.event.pull_request.head.sha }}
      pr_number: ${{ github.event.number }}
@@ -18,6 +18,3 @@ jobs:
      additional_args: --not_python_module
      languages: ar bn de en es fa fr gj he hi id it ja ko pt ru th tr vi zh-CN zh-TW
      hub_base_path: https://moon-ci-docs.huggingface.co
-    secrets:
-      token: ${{ secrets.HF_DOC_PUSH }}
-      comment_bot_token: ${{ secrets.HUGGINGFACE_PUSH }}
7 changes: 2 additions & 5 deletions .github/workflows/delete_doc_comment.yml
@@ -7,10 +7,7 @@ on:

jobs:
  delete:
-    uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@use_hf_hub
+    uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
    with:
      pr_number: ${{ github.event.number }}
-      package: course
-    secrets:
-      token: ${{ secrets.HF_DOC_PUSH }}
-      comment_bot_token: ${{ secrets.HUGGINGFACE_PUSH }}
+      package: course
2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@ This repo contains the content that's used to create the **[Hugging Face course]
| [Bahasa Indonesia](https://huggingface.co/course/id/chapter1/1) (WIP) | [`chapters/id`](https://github.com/huggingface/course/tree/main/chapters/id) | [@gstdl](https://github.com/gstdl) |
| [Italian](https://huggingface.co/course/it/chapter1/1) (WIP) | [`chapters/it`](https://github.com/huggingface/course/tree/main/chapters/it) | [@CaterinaBi](https://github.com/CaterinaBi), [@ClonedOne](https://github.com/ClonedOne), [@Nolanogenn](https://github.com/Nolanogenn), [@EdAbati](https://github.com/EdAbati), [@gdacciaro](https://github.com/gdacciaro) |
| [Japanese](https://huggingface.co/course/ja/chapter1/1) (WIP) | [`chapters/ja`](https://github.com/huggingface/course/tree/main/chapters/ja) | [@hiromu166](https://github.com/@hiromu166), [@younesbelkada](https://github.com/@younesbelkada), [@HiromuHota](https://github.com/@HiromuHota) |
-| [Korean](https://huggingface.co/course/ko/chapter1/1) (WIP) | [`chapters/ko`](https://github.com/huggingface/course/tree/main/chapters/ko) | [@Doohae](https://github.com/Doohae) |
+| [Korean](https://huggingface.co/course/ko/chapter1/1) (WIP) | [`chapters/ko`](https://github.com/huggingface/course/tree/main/chapters/ko) | [@Doohae](https://github.com/Doohae), [@wonhyeongseo](https://github.com/wonhyeongseo) |
| [Portuguese](https://huggingface.co/course/pt/chapter1/1) (WIP) | [`chapters/pt`](https://github.com/huggingface/course/tree/main/chapters/pt) | [@johnnv1](https://github.com/johnnv1), [@victorescosta](https://github.com/victorescosta), [@LincolnVS](https://github.com/LincolnVS) |
| [Russian](https://huggingface.co/course/ru/chapter1/1) (WIP) | [`chapters/ru`](https://github.com/huggingface/course/tree/main/chapters/ru) | [@pdumin](https://github.com/pdumin), [@svv73](https://github.com/svv73) |
| [Thai](https://huggingface.co/course/th/chapter1/1) (WIP) | [`chapters/th`](https://github.com/huggingface/course/tree/main/chapters/th) | [@peeraponw](https://github.com/peeraponw), [@a-krirk](https://github.com/a-krirk), [@jomariya23156](https://github.com/jomariya23156), [@ckingkan](https://github.com/ckingkan) |
8 changes: 4 additions & 4 deletions chapters/en/chapter0/1.mdx
@@ -1,4 +1,4 @@
-# Introduction
+# Introduction[[introduction]]

Welcome to the Hugging Face course! This introduction will guide you through setting up a working environment. If you're just starting the course, we recommend you first take a look at [Chapter 1](/course/chapter1), then come back and set up your environment so you can try the code yourself.

@@ -10,7 +10,7 @@ Note that we will not be covering the Windows system. If you're running on Windo

Most of the course relies on you having a Hugging Face account. We recommend creating one now: [create an account](https://huggingface.co/join).

-## Using a Google Colab notebook
+## Using a Google Colab notebook[[using-a-google-colab-notebook]]

Using a Colab notebook is the simplest possible setup; boot up a notebook in your browser and get straight to coding!

@@ -46,7 +46,7 @@ This installs a very light version of 🤗 Transformers. In particular, no speci

This will take a bit of time, but then you'll be ready to go for the rest of the course!

-## Using a Python virtual environment
+## Using a Python virtual environment[[using-a-python-virtual-environment]]

If you prefer to use a Python virtual environment, the first step is to install Python on your system. We recommend following [this guide](https://realpython.com/installing-python/) to get started.

@@ -99,7 +99,7 @@ which python
/home/<user>/transformers-course/.env/bin/python
```

-### Installing dependencies
+### Installing dependencies[[installing-dependencies]]

As in the previous section on using Google Colab instances, you'll now need to install the packages required to continue. Again, you can install the development version of 🤗 Transformers using the `pip` package manager:
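
For reference, the install command collapsed here amounts to something like the following (a sketch; the `sentencepiece` extra mirrors what the Colab section of this chapter installs):

```bash
pip install "transformers[sentencepiece]"
```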

10 changes: 5 additions & 5 deletions chapters/en/chapter1/1.mdx
@@ -1,18 +1,18 @@
-# Introduction
+# Introduction[[introduction]]

<CourseFloatingBanner
chapter={1}
classNames="absolute z-10 right-0 top-0"
/>

-## Welcome to the 🤗 Course!
+## Welcome to the 🤗 Course![[welcome-to-the-course]]

<Youtube id="00GKzGyWFEs" />

This course will teach you about natural language processing (NLP) using libraries from the [Hugging Face](https://huggingface.co/) ecosystem — [🤗 Transformers](https://github.com/huggingface/transformers), [🤗 Datasets](https://github.com/huggingface/datasets), [🤗 Tokenizers](https://github.com/huggingface/tokenizers), and [🤗 Accelerate](https://github.com/huggingface/accelerate) — as well as the [Hugging Face Hub](https://huggingface.co/models). It's completely free and without ads.


-## What to expect?
+## What to expect?[[what-to-expect]]

Here is a brief overview of the course:

@@ -33,7 +33,7 @@ This course:

After you've completed this course, we recommend checking out DeepLearning.AI's [Natural Language Processing Specialization](https://www.coursera.org/specializations/natural-language-processing?utm_source=deeplearning-ai&utm_medium=institutions&utm_campaign=20211011-nlp-2-hugging_face-page-nlp-refresh), which covers a wide range of traditional NLP models like naive Bayes and LSTMs that are well worth knowing about!

-## Who are we?
+## Who are we?[[who-are-we]]

About the authors:

@@ -55,7 +55,7 @@ About the authors:

**Leandro von Werra** is a machine learning engineer in the open-source team at Hugging Face and also a co-author of the O’Reilly book [Natural Language Processing with Transformers](https://www.oreilly.com/library/view/natural-language-processing/9781098136789/). He has several years of industry experience bringing NLP projects to production by working across the whole machine learning stack.

-## FAQ
+## FAQ[[faq]]

Here are some answers to frequently asked questions:

5 changes: 2 additions & 3 deletions chapters/en/chapter1/10.mdx
@@ -1,6 +1,6 @@
<!-- DISABLE-FRONTMATTER-SECTIONS -->

-# End-of-chapter quiz
+# End-of-chapter quiz[[end-of-chapter-quiz]]

<CourseFloatingBanner
chapter={1}
@@ -140,7 +140,6 @@ result = classifier("This is a course about the Transformers library")

### 6. True or false? A language model usually does not need labels for its pretraining.

-
<Question
choices={[
{
@@ -155,7 +154,7 @@ result = classifier("This is a course about the Transformers library")
]}
/>

-### 7. Select the sentence that best describes the terms "model," "architecture," and "weights."
+### 7. Select the sentence that best describes the terms "model", "architecture", and "weights".

<Question
choices={[
6 changes: 3 additions & 3 deletions chapters/en/chapter1/2.mdx
@@ -1,4 +1,4 @@
-# Natural Language Processing
+# Natural Language Processing[[natural-language-processing]]

<CourseFloatingBanner
chapter={1}
@@ -7,7 +7,7 @@

Before jumping into Transformer models, let's do a quick overview of what natural language processing is and why we care about it.

-## What is NLP?
+## What is NLP?[[what-is-nlp]]

NLP is a field of linguistics and machine learning focused on understanding everything related to human language. The aim of NLP tasks is not only to understand single words individually, but to be able to understand the context of those words.

@@ -21,6 +21,6 @@ The following is a list of common NLP tasks, with some examples of each:

NLP isn't limited to written text though. It also tackles complex challenges in speech recognition and computer vision, such as generating a transcript of an audio sample or a description of an image.

-## Why is it challenging?
+## Why is it challenging?[[why-is-it-challenging]]

Computers don't process information in the same way as humans. For example, when we read the sentence "I am hungry," we can easily understand its meaning. Similarly, given two sentences such as "I am hungry" and "I am sad," we're able to easily determine how similar they are. For machine learning (ML) models, such tasks are more difficult. The text needs to be processed in a way that enables the model to learn from it. And because language is complex, we need to think carefully about how this processing must be done. There has been a lot of research done on how to represent text, and we will look at some methods in the next chapter.
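
As a rough illustration of what that processing looks like in practice, here is a sketch of a tokenizer turning raw text into numbers a model can learn from (the `bert-base-uncased` checkpoint is an arbitrary choice for illustration; tokenization is covered properly later in the course):

```python
from transformers import AutoTokenizer

# Convert raw text into the integer IDs a model actually trains on
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("I am hungry")
print(inputs["input_ids"])  # a short list of token IDs, bracketed by special tokens
```
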
24 changes: 12 additions & 12 deletions chapters/en/chapter1/3.mdx
@@ -1,4 +1,4 @@
-# Transformers, what can they do?
+# Transformers, what can they do?[[transformers-what-can-they-do]]

<CourseFloatingBanner chapter={1}
classNames="absolute z-10 right-0 top-0"
@@ -15,7 +15,7 @@ In this section, we will look at what Transformer models can do and use our firs
If you want to run the examples locally, we recommend taking a look at the <a href="/course/chapter0">setup</a>.
</Tip>

-## Transformers are everywhere!
+## Transformers are everywhere![[transformers-are-everywhere]]

Transformer models are used to solve all kinds of NLP tasks, like the ones mentioned in the previous section. Here are some of the companies and organizations using Hugging Face and Transformer models, who also contribute back to the community by sharing their models:

@@ -29,7 +29,7 @@ The [🤗 Transformers library](https://github.com/huggingface/transformers) pro

Before diving into how Transformer models work under the hood, let's look at a few examples of how they can be used to solve some interesting NLP problems.

-## Working with pipelines
+## Working with pipelines[[working-with-pipelines]]

<Youtube id="tiZFewofSLM" />
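
As a minimal sketch of the idea, a pipeline bundles preprocessing, the model, and postprocessing behind one call (the default sentiment-analysis checkpoint is chosen for you and downloaded on first use):

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")
# e.g. [{'label': 'POSITIVE', 'score': 0.96}] (exact score may vary)
```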

@@ -82,7 +82,7 @@ Some of the currently [available pipelines](https://huggingface.co/transformers/

Let's have a look at a few of these!

-## Zero-shot classification
+## Zero-shot classification[[zero-shot-classification]]

We'll start by tackling a more challenging task where we need to classify texts that haven't been labelled. This is a common scenario in real-world projects because annotating text is usually time-consuming and requires domain expertise. For this use case, the `zero-shot-classification` pipeline is very powerful: it allows you to specify which labels to use for the classification, so you don't have to rely on the labels of the pretrained model. You've already seen how the model can classify a sentence as positive or negative using those two labels — but it can also classify the text using any other set of labels you like.
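
A minimal sketch of such a call (the pipeline picks a default checkpoint; scores will vary):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)
# Returns the sequence plus the candidate labels ranked by probability
```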

@@ -111,7 +111,7 @@ This pipeline is called _zero-shot_ because you don't need to fine-tune the mode
</Tip>


-## Text generation
+## Text generation[[text-generation]]

Now let's see how to use a pipeline to generate some text. The main idea here is that you provide a prompt and the model will auto-complete it by generating the remaining text. This is similar to the predictive text feature that is found on many phones. Text generation involves randomness, so it's normal if you don't get the same results as shown below.
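
A minimal sketch (default checkpoint; the continuation is sampled, so outputs differ between runs):

```python
from transformers import pipeline

generator = pipeline("text-generation")
generator("In this course, we will teach you how to")
# Each call can produce a different auto-completed text
```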

@@ -139,7 +139,7 @@ You can control how many different sequences are generated with the argument `nu
</Tip>


-## Using any model from the Hub in a pipeline
+## Using any model from the Hub in a pipeline[[using-any-model-from-the-hub-in-a-pipeline]]

The previous examples used the default model for the task at hand, but you can also choose a particular model from the Hub to use in a pipeline for a specific task — say, text generation. Go to the [Model Hub](https://huggingface.co/models) and click on the corresponding tag on the left to display only the supported models for that task. You should get to a page like [this one](https://huggingface.co/models?pipeline_tag=text-generation).
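
For instance, picking the `distilgpt2` checkpoint from that page would look roughly like this (arguments shown for illustration):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course, we will teach you how to",
    max_length=30,  # cap the total length of each generated text
    num_return_sequences=2,  # ask for two different continuations
)
```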

@@ -174,13 +174,13 @@ Once you select a model by clicking on it, you'll see that there is a widget ena

</Tip>

-### The Inference API
+### The Inference API[[the-inference-api]]

All the models can be tested directly through your browser using the Inference API, which is available on the Hugging Face [website](https://huggingface.co/). You can play with the model directly on this page by inputting custom text and watching the model process the input data.

The Inference API that powers the widget is also available as a paid product, which comes in handy if you need it for your workflows. See the [pricing page](https://huggingface.co/pricing) for more details.

-## Mask filling
+## Mask filling[[mask-filling]]

The next pipeline you'll try is `fill-mask`. The idea of this task is to fill in the blanks in a given text:
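
A minimal sketch (the default checkpoint uses `<mask>` as its mask token; other models may use a different one):

```python
from transformers import pipeline

unmasker = pipeline("fill-mask")
unmasker("This course will teach you all about <mask> models.", top_k=2)
# Returns the two most likely fills, each with a score and the completed sentence
```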

@@ -210,7 +210,7 @@ The `top_k` argument controls how many possibilities you want to be displayed. N

</Tip>

-## Named entity recognition
+## Named entity recognition[[named-entity-recognition]]

Named entity recognition (NER) is a task where the model has to find which parts of the input text correspond to entities such as persons, locations, or organizations. Let's look at an example:
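
A minimal sketch (default checkpoint; `grouped_entities=True` merges sub-tokens belonging to the same entity):

```python
from transformers import pipeline

ner = pipeline("ner", grouped_entities=True)
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")
# e.g. one PER, one ORG, and one LOC entity, each with a score and character span
```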

@@ -238,7 +238,7 @@ We pass the option `grouped_entities=True` in the pipeline creation function to

</Tip>

-## Question answering
+## Question answering[[question-answering]]

The `question-answering` pipeline answers questions using information from a given context:
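
A minimal sketch (default checkpoint; the answer is a span copied out of the context):

```python
from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)
# e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'Hugging Face'}
```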

@@ -258,7 +258,7 @@ question_answerer(

Note that this pipeline works by extracting information from the provided context; it does not generate the answer.

-## Summarization
+## Summarization[[summarization]]

Summarization is the task of reducing a text into a shorter text while keeping all (or most) of the important aspects referenced in the text. Here's an example:
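
A minimal sketch (default checkpoint; any sufficiently long input works, and the text below is truncated for brevity):

```python
from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    "America has changed dramatically during recent years. Not only has the "
    "number of graduates in traditional engineering disciplines declined, but "
    "in most of the premier American universities engineering curricula now "
    "concentrate on and encourage largely the study of engineering science. ..."
)
# Returns a shortened version of the input as 'summary_text'
```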

@@ -303,7 +303,7 @@ summarizer(
Like with text generation, you can specify a `max_length` or a `min_length` for the result.


-## Translation
+## Translation[[translation]]

For translation, you can use a default model if you provide a language pair in the task name (such as `"translation_en_to_fr"`), but the easiest way is to pick the model you want to use on the [Model Hub](https://huggingface.co/models). Here we'll try translating from French to English:
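
A minimal sketch (the model name is one French-to-English checkpoint from the Hub; outputs may vary slightly):

```python
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")
# e.g. [{'translation_text': 'This course is produced by Hugging Face.'}]
```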
