Add captions for tasks videos (#464)

* Add captions for tasks videos
* Fix script
huggingface · Jan 5, 2023 · f71cf6c
1 parent d29a927
commit f71cf6c
Showing 17 changed files with 1,151 additions and 11 deletions.
diff --git a/subtitles/README.md b/subtitles/README.md
@@ -37,16 +37,16 @@ For example, in the `zh-CN` subtitles, each block has the following format:
 ```
 1
 00:00:05,850 --> 00:00:07,713
-- 欢迎来到 Hugging Face 课程。
-- Welcome to the Hugging Face Course.
+欢迎来到 Hugging Face 课程。
+Welcome to the Hugging Face Course.
 ```
 
 To upload the SRT file to YouTube, we need the subtitle in monolingual format, i.e. the above block should read:
 
 ```
 1
 00:00:05,850 --> 00:00:07,713
-- 欢迎来到 Hugging Face 课程。
+欢迎来到 Hugging Face 课程。
 ```
 
 To handle this, we provide a script that converts the bilingual SRT files to monolingual ones. To perform the conversion, run:

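The conversion script itself is not part of this hunk. As a rough illustration of the bilingual-to-monolingual step the README describes, here is a minimal sketch. It assumes every block follows the layout shown above (index, timing line, target-language line, English line) and simply keeps the first text line; the function name `to_monolingual` is hypothetical and is not the repository's script.

```python
import re


def to_monolingual(srt_text: str) -> str:
    """Keep only the first text line of each SRT block.

    Assumption: each bilingual block is laid out as
    index / timing / target-language line / English line.
    """
    # Blocks in an SRT file are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    converted = []
    for block in blocks:
        lines = block.splitlines()
        if len(lines) < 3:
            # Not a complete subtitle block; pass it through unchanged.
            converted.append(block)
            continue
        index, timing, first_text = lines[0], lines[1], lines[2]
        converted.append("\n".join([index, timing, first_text]))
    return "\n\n".join(converted) + "\n"


if __name__ == "__main__":
    bilingual = (
        "1\n"
        "00:00:05,850 --> 00:00:07,713\n"
        "欢迎来到 Hugging Face 课程。\n"
        "Welcome to the Hugging Face Course.\n"
    )
    print(to_monolingual(bilingual))
```

A block whose target-language text spans more than one line would need a different rule than "keep the first text line", so treat this as a starting point only.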
diff --git a/subtitles/en/metadata_tasks.csv b/subtitles/en/metadata_tasks.csv
@@ -0,0 +1,7 @@
+id,title,link,srt_filename
+wVHdVlPScxA,🤗 Tasks: Token Classification,https://www.youtube.com/watch?v=wVHdVlPScxA&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=1,subtitles/en/tasks_00_🤗-tasks-token-classification.srt
+ajPx5LwJD-I,🤗 Tasks: Question Answering,https://www.youtube.com/watch?v=ajPx5LwJD-I&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=2,subtitles/en/tasks_01_🤗-tasks-question-answering.srt
+Vpjb1lu0MDk,🤗 Tasks: Causal Language Modeling,https://www.youtube.com/watch?v=Vpjb1lu0MDk&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=3,subtitles/en/tasks_02_🤗-tasks-causal-language-modeling.srt
+mqElG5QJWUg,🤗 Tasks: Masked Language Modeling,https://www.youtube.com/watch?v=mqElG5QJWUg&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=4,subtitles/en/tasks_03_🤗-tasks-masked-language-modeling.srt
+yHnr5Dk2zCI,🤗 Tasks: Summarization,https://www.youtube.com/watch?v=yHnr5Dk2zCI&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=5,subtitles/en/tasks_04_🤗-tasks-summarization.srt
+1JvfrvZgi6c,🤗 Tasks: Translation,https://www.youtube.com/watch?v=1JvfrvZgi6c&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=6,subtitles/en/tasks_05_🤗-tasks-translation.srt
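
Each row of the metadata file repeats the video id inside the `v=` query parameter of its link. A small consistency check along these lines can catch copy-paste mistakes when rows are added. This is a sketch under the assumption that `id` and `v=` are meant to match; `mismatched_ids` is a hypothetical helper, not part of the repository.

```python
import csv
import io
from urllib.parse import parse_qs, urlparse

# Two rows copied from metadata_tasks.csv, embedded so the sketch is self-contained.
SAMPLE = """id,title,link,srt_filename
wVHdVlPScxA,🤗 Tasks: Token Classification,https://www.youtube.com/watch?v=wVHdVlPScxA&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=1,subtitles/en/tasks_00_🤗-tasks-token-classification.srt
ajPx5LwJD-I,🤗 Tasks: Question Answering,https://www.youtube.com/watch?v=ajPx5LwJD-I&list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf&index=2,subtitles/en/tasks_01_🤗-tasks-question-answering.srt
"""


def mismatched_ids(csv_text: str) -> list[str]:
    """Return the ids whose link does not carry the same id in its `v=` parameter."""
    bad = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        query = parse_qs(urlparse(row["link"]).query)
        video_id = query.get("v", [""])[0]
        if video_id != row["id"]:
            bad.append(row["id"])
    return bad


if __name__ == "__main__":
    print(mismatched_ids(SAMPLE))
```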
diff --git a/subtitles/en/raw/tasks.md b/subtitles/en/raw/tasks.md
@@ -0,0 +1,77 @@
+Note: the following transcripts are associated with Merve Noyan's videos in the Hugging Face Tasks playlist: https://www.youtube.com/playlist?list=PLo2EIpI_JMQtyEr-sLJSy5_SnLCb4vtQf
+
+Token Classification video
+
+Welcome to the Hugging Face tasks series! In this video we’ll take a look at the token classification task.
+Token classification is the task of assigning a label to each token in a sentence. There are various token classification tasks and the most common are Named Entity Recognition and Part-of-Speech Tagging.
+Let’s take a quick look at the Named Entity Recognition task. The goal of this task is to find the entities in a piece of text, such as person, location, or organization. This task is formulated as labelling each token with one class for each entity, and another class for tokens that have no entity.
+Another token classification task is part-of-speech tagging. The goal of this task is to label each word with its part of speech, such as noun, pronoun, adjective, verb and so on. This task is formulated as labelling each token with its part of speech.
+Token classification models are evaluated on Accuracy, Recall, Precision and F1-Score. The metrics are calculated for each of the classes. We count true positives, false positives and false negatives to calculate precision and recall, and take their harmonic mean to get the F1-Score. Then we calculate it for every class and take the overall average to evaluate our model.
+An example dataset used for this task is CoNLL-2003. Here, each token belongs to a certain named entity class, denoted by its index in the list of labels.
+You can extract important information from invoices using named entity recognition models, such as date, organization name or address.
+For more information about the Token classification task, check out the Hugging Face course.
+
+
+Question Answering video
+
+Welcome to the Hugging Face tasks series. In this video, we will take a look at the Question Answering task.
+Question answering is the task of extracting an answer from a given document.
+Question answering models take a context, which is the document you want to search in, and a question and return an answer. Note that the answer is not generated, but extracted from the context. This type of question answering is called extractive.
+The task is evaluated on two metrics, exact match and F1-Score.
+As the name implies, exact match looks for an exact match between the predicted answer and the correct answer.
+A common metric used is the F1-Score, which is calculated over tokens that are predicted correctly and incorrectly. It is the harmonic mean of precision and recall, two metrics that are widely used in classification problems.
+An example dataset used for this task is called SQuAD. This dataset contains contexts, questions and the answers that are obtained from English Wikipedia articles.
+You can use question answering models to automatically answer the questions asked by your customers. You simply need a document containing information about your business and query through that document with the questions asked by your customers.
+For more information about the Question Answering task, check out the Hugging Face course.
+
+
+Causal Language Modeling video
+
+Welcome to the Hugging Face tasks series! In this video we’ll take a look at Causal Language Modeling.
+Causal language modeling is the task of predicting the next 
+word in a sentence, given all the previous words. This task is very similar to the autocomplete function that you might have on your phone. 
+These models take a sequence to be completed and output the completed sequence.
+Classification metrics can’t be used as there’s no single correct answer for completion. Instead, we evaluate the distribution of the text completed by the model.
+A common metric to do so is the cross-entropy loss. Perplexity is also a widely used metric and it is calculated as the exponential of the cross-entropy loss.
+You can use any dataset with plain text and tokenize the text to prepare the data. 
+Causal language models can be used to generate code.
+For more information about the Causal Language Modeling task, check out the Hugging Face course.
+
+
+Masked Language Modeling video
+
+Welcome to the Hugging Face tasks series! In this video we’ll take a look at Masked Language Modeling.
+Masked language modeling is the task of predicting which words should fill in the blanks of a sentence.
+These models take a masked text as the input and output the possible values for that mask.
+Masked language modeling is handy before fine-tuning your model for your task. For example, if you need to use a model in a specific domain, say, biomedical documents, models like BERT will treat your domain-specific words as rare tokens. If you train a masked language model on your biomedical corpus and then fine-tune it on a downstream task, you will get better performance.
+Classification metrics can’t be used as there’s no single correct answer for the mask values. Instead, we evaluate the distribution of the mask values.
+A common metric to do so is the cross-entropy loss. Perplexity is also a widely used metric and it is calculated as the exponential of the cross-entropy loss.
+You can use any dataset with plain text and tokenize the text to mask the data.
+For more information about the Masked Language Modeling task, check out the Hugging Face course.
+
+
+Summarization video
+
+Welcome to the Hugging Face tasks series. In this video, we will take a look at the Text Summarization task.
+Summarization is the task of producing a shorter version of a document while preserving the relevant and important information in the document.
+Summarization models take a document to be summarized and output the summarized text.
+This task is evaluated on the ROUGE score. It’s based on the overlap between the produced sequence and the correct sequence.
+You might see this as ROUGE-1, which is the overlap of single tokens, and ROUGE-2, the overlap of consecutive token pairs. ROUGE-N refers to the overlap of n consecutive tokens. Here we see an example of how overlaps take place.
+An example dataset used for this task is called Extreme Summarization (XSum). This dataset contains texts and their summarized versions.
+You can use summarization models to summarize research papers, which would enable researchers to easily pick papers for their reading list.
+For more information about the Summarization task, check out the Hugging Face course.
+
+
+Translation video
+
+Welcome to the Hugging Face tasks series. In this video, we will take a look at the Translation task.
+Translation is the task of translating text from one language to another.
+These models take a text in the source language and output the translation of that text in the target language.
+The task is evaluated on the BLEU score.
+The score ranges from 0 to 1, where 1 means the translation matches the reference perfectly and 0 means it does not match at all.
+BLEU is calculated over sequences of consecutive tokens called n-grams. A unigram is a single token, a bigram is a pair of tokens, and an n-gram is n consecutive tokens. 
+Machine translation datasets contain pairs of a text in one language and its translation in another language.
+These models can help you build conversational agents across different languages.
+One option is to translate the training data used for the chatbot and train a separate chatbot.
+Alternatively, you can put a translation model between the user and the chatbot: translate the user’s input into the language your chatbot was trained on, do intent classification, then translate the chatbot’s output back into the user’s language.
+For more information about the Translation task, check out the Hugging Face course.
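
The summarization transcript describes ROUGE-N as the overlap of n consecutive tokens between the produced and the correct sequence. A minimal sketch of the recall-oriented form follows. It assumes lowercased whitespace tokenization, and the helper names `ngrams` and `rouge_n_recall` are illustrative rather than taken from any library. BLEU counts n-grams in the same way but is precision-based and adds a brevity penalty, which this sketch does not implement.

```python
from collections import Counter


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: str, reference: str, n: int) -> float:
    """Fraction of the reference's n-grams that also appear in the candidate."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection clips each n-gram count to the smaller of the two sides.
    overlap = sum((cand & ref).values())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(rouge_n_recall(candidate, reference, 1))  # 5 of 6 unigrams overlap
    print(rouge_n_recall(candidate, reference, 2))  # 3 of 5 bigrams overlap
```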
diff --git a/subtitles/en/tasks_00_🤗-tasks-token-classification.srt b/subtitles/en/tasks_00_🤗-tasks-token-classification.srt
@@ -0,0 +1,116 @@
+1
+00:00:04,520 --> 00:00:07,400
+Welcome to the Hugging Face tasks series!
+
+2
+00:00:07,400 --> 00:00:11,870
+In this video we’ll take a look at the token
+classification task.
+
+3
+00:00:11,870 --> 00:00:17,900
+Token classification is the task of assigning
+a label to each token in a sentence.
+
+4
+00:00:17,900 --> 00:00:23,310
+There are various token classification tasks
+and the most common are Named Entity Recognition
+
+5
+00:00:23,310 --> 00:00:26,430
+and Part-of-Speech Tagging.
+
+6
+00:00:26,430 --> 00:00:31,640
+Let’s take a quick look at the Named Entity
+Recognition task.
+
+7
+00:00:31,640 --> 00:00:38,400
+The goal of this task is to find the entities
+in a piece of text, such as person, location,
+
+8
+00:00:38,400 --> 00:00:40,210
+or organization.
+
+9
+00:00:40,210 --> 00:00:45,250
+This task is formulated as labelling each
+token with one class for each entity, and
+
+10
+00:00:45,250 --> 00:00:51,719
+another class for tokens that have no entity.
+
+11
+00:00:51,719 --> 00:00:55,670
+Another token classification task is part-of-speech
+tagging.
+
+12
+00:00:55,670 --> 00:01:01,399
+The goal of this task is to label each word
+with its part of speech, such as
+
+13
+00:01:01,399 --> 00:01:05,900
+noun, pronoun, adjective, verb and so on.
+
+14
+00:01:05,900 --> 00:01:11,270
+This task is formulated as labelling each
+token with parts of speech.
+
+15
+00:01:11,270 --> 00:01:19,659
+Token classification models are evaluated
+on Accuracy, Recall, Precision and F1-Score.
+
+16
+00:01:19,659 --> 00:01:22,950
+The metrics are calculated for each of the
+classes.
+
+17
+00:01:22,950 --> 00:01:28,040
+We count true positives, false positives
+and false negatives to calculate precision
+
+18
+00:01:28,040 --> 00:01:31,829
+and recall, and take their harmonic mean to
+get F1-Score.
+
+19
+00:01:31,829 --> 00:01:42,329
+Then we calculate it for every class and take
+the overall average to evaluate our model.
+
+20
+00:01:42,329 --> 00:01:45,680
+An example dataset used for this task is CoNLL-2003.
+
+21
+00:01:45,680 --> 00:01:51,750
+Here, each token belongs to a certain named
+entity class, denoted as the indices of the
+
+22
+00:01:51,750 --> 00:01:55,380
+list containing the labels.
+
+23
+00:01:55,380 --> 00:02:00,720
+You can extract important information from
+invoices using named entity recognition models,
+
+24
+00:02:00,720 --> 00:02:07,070
+such as date, organization name or address.
+
+25
+00:02:07,070 --> 00:02:16,840
+For more information about the Token classification
+task, check out the Hugging Face course.
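
The evaluation described in these captions (per-class precision and recall, their harmonic mean, then an average over classes) can be sketched as follows. This is a token-level simplification with hypothetical helper names. Named entity recognition is often scored at the entity level instead, which this sketch does not attempt.

```python
from collections import Counter


def per_class_f1(gold: list[str], pred: list[str]) -> dict[str, float]:
    """Token-level F1 for each label, from true/false positive and false negative counts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it does not belong
            fn[g] += 1  # missed the gold label g
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        # F1 is the harmonic mean of precision and recall.
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


def macro_f1(gold: list[str], pred: list[str]) -> float:
    """Unweighted average of the per-class F1 scores."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    gold = ["PER", "O", "LOC", "O"]
    pred = ["PER", "O", "O", "O"]
    print(per_class_f1(gold, pred))
    print(macro_f1(gold, pred))
```

True negatives do not appear in either precision or recall, which is why only the other three counts are tracked.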
diff --git a/subtitles/en/tasks_01_🤗-tasks-question-answering.srt b/subtitles/en/tasks_01_🤗-tasks-question-answering.srt
@@ -0,0 +1,87 @@
+1
+00:00:04,400 --> 00:00:06,480
+Welcome to the Hugging Face tasks series.  
+
+2
+00:00:07,200 --> 00:00:10,080
+In this video, we will take a look 
+at the Question Answering task. 
+
+3
+00:00:13,120 --> 00:00:17,200
+Question answering is the task of 
+extracting an answer from a given document. 
+
+4
+00:00:21,120 --> 00:00:25,600
+Question answering models take a context, 
+which is the document you want to search in,  
+
+5
+00:00:26,240 --> 00:00:31,440
+and a question and return an answer. 
+Note that the answer is not generated,  
+
+6
+00:00:31,440 --> 00:00:37,600
+but extracted from the context. This type 
+of question answering is called extractive. 
+
+7
+00:00:42,320 --> 00:00:46,960
+The task is evaluated on two 
+metrics, exact match and F1-Score. 
+
+8
+00:00:49,680 --> 00:00:52,320
+As the name implies, exact match looks for an  
+
+9
+00:00:52,320 --> 00:00:57,840
+exact match between the predicted 
+answer and the correct answer. 
+
+10
+00:01:00,080 --> 00:01:05,520
+A common metric used is the F1-Score, which 
+is calculated over tokens that are predicted  
+
+11
+00:01:05,520 --> 00:01:10,960
+correctly and incorrectly. It is the harmonic 
+mean of two metrics called  
+
+12
+00:01:10,960 --> 00:01:16,560
+precision and recall which are metrics that 
+are used widely in classification problems. 
+
+13
+00:01:20,880 --> 00:01:28,240
+An example dataset used for this task is called 
+SQuAD. This dataset contains contexts, questions  
+
+14
+00:01:28,240 --> 00:01:32,080
+and the answers that are obtained 
+from English Wikipedia articles. 
+
+15
+00:01:35,440 --> 00:01:39,520
+You can use question answering models to 
+automatically answer the questions asked  
+
+16
+00:01:39,520 --> 00:01:46,480
+by your customers. You simply need a document 
+containing information about your business  
+
+17
+00:01:47,200 --> 00:01:53,840
+and query through that document with 
+the questions asked by your customers. 
+
+18
+00:01:55,680 --> 00:02:06,160
+For more information about the Question Answering 
+task, check out the Hugging Face course.
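
The two metrics named in these captions can be sketched as below. The normalization here (lowercasing and whitespace splitting) is a simplifying assumption. Evaluation scripts for extractive question answering typically also strip punctuation and articles, and the function names are illustrative only.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection: each shared token counts at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("the Eiffel Tower", "Eiffel Tower"))
    print(token_f1("the Eiffel Tower", "Eiffel Tower"))
```

In the example, the prediction is not an exact match, but it still earns partial credit from the token-level F1 because two of its three tokens appear in the reference.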
diff --git a/subtitles/en/tasks_02_🤗-tasks-causal-language-modeling.srt b/subtitles/en/tasks_02_🤗-tasks-causal-language-modeling.srt
@@ -0,0 +1,63 @@
+1
+00:00:04,560 --> 00:00:06,640
+Welcome to the Hugging Face tasks series!  
+
+2
+00:00:07,200 --> 00:00:10,400
+In this video we’ll take a look 
+at Causal Language Modeling. 
+
+3
+00:00:13,600 --> 00:00:16,880
+Causal language modeling is 
+the task of predicting the next 
+
+4
+00:00:16,880 --> 00:00:21,920
+word in a sentence, given all the 
+previous words. This task is very  
+
+5
+00:00:21,920 --> 00:00:29,920
+similar to the autocomplete function 
+that you might have on your phone. 
+
+6
+00:00:29,920 --> 00:00:34,720
+These models take a sequence to be 
+completed and output the completed sequence. 
+
+7
+00:00:38,640 --> 00:00:44,160
+Classification metrics can’t be used as there’s 
+no single correct answer for completion.  
+
+8
+00:00:44,960 --> 00:00:49,280
+Instead, we evaluate the distribution 
+of the text completed by the model. 
+
+9
+00:00:50,800 --> 00:00:55,440
+A common metric to do so is the 
+cross-entropy loss. Perplexity is  
+
+10
+00:00:55,440 --> 00:01:01,280
+also a widely used metric and it is calculated 
+as the exponential of the cross-entropy loss. 
+
+11
+00:01:05,200 --> 00:01:11,840
+You can use any dataset with plain text 
+and tokenize the text to prepare the data. 
+
+12
+00:01:15,040 --> 00:01:18,240
+Causal language models can 
+be used to generate code. 
+
+13
+00:01:22,480 --> 00:01:33,200
+For more information about the Causal Language 
+Modeling task, check out the Hugging Face course.
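
The relationship stated in these captions, perplexity as the exponential of the cross-entropy loss, can be sketched as follows. The input is assumed to be the probability the model assigned to each correct next token. The function names are illustrative, and the natural logarithm is used so that `exp` inverts it.

```python
import math


def cross_entropy(token_probs: list[float]) -> float:
    """Average negative log-probability of the correct tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs: list[float]) -> float:
    """Exponential of the cross-entropy loss."""
    return math.exp(cross_entropy(token_probs))


if __name__ == "__main__":
    # A model that always gives the correct token probability 0.25 is as
    # uncertain as a uniform choice among four tokens.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

A lower perplexity means the model assigned higher probability to the text that actually occurred.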