
Add Trainer to quicktour #18723

Merged · 6 commits merged into huggingface:main on Sep 2, 2022
Conversation

@stevhliu (Member):

This PR edits the pipeline section to focus less on enumerating all of the tasks (and their definitions) it is capable of. Users probably only need a general, representative idea of what it can do, and then they're more interested in diving into how to use the pipeline.

I also added a brief section on the Trainer covering the basic parameters it accepts, plus a short explanation of how to customize the training loop behavior, to keep the quick tour concise. I think the Trainer is important to include since many users rely on it for training, and we also use it in our finetune guides.

@HuggingFaceDocBuilderDev commented Aug 22, 2022:

The documentation is not available anymore as the PR was closed or merged.

@LysandreJik (Member) left a comment:

That's a cool refactor! I left some comments. Thanks for working on this important topic.

Comment on lines 28 to 53
[`pipeline`] is the easiest way to use a pretrained model for a given task.

<Youtube id="tiZFewofSLM"/>

The [`pipeline`] supports many common tasks out-of-the-box:

**Text**:
* Sentiment analysis: classify the polarity of a given text.
* Text generation (in English): generate text from a given input.
* Named entity recognition (NER): label each word with the entity it represents (person, date, location, etc.).
* Question answering: extract the answer from the context, given some context and a question.
* Fill-mask: fill in the blank given a text with masked words.
* Summarization: generate a summary of a long sequence of text or document.
* Translation: translate text into another language.
* Feature extraction: create a tensor representation of the text.

**Image**:
* Image classification: classify an image.
* Image segmentation: classify every pixel in an image.
* Object detection: detect objects within an image.

**Audio**:
* Audio classification: assign a label to a given segment of audio.
* Automatic speech recognition (ASR): transcribe audio data into text.

<Tip>
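To make the list above concrete, here is a minimal sketch of calling a [`pipeline`] for one of those tasks (the example sentence is a placeholder, and with no model specified, a default checkpoint for the task is downloaded on first use):

```python
from transformers import pipeline

# With only a task name, the pipeline loads a default pretrained checkpoint.
classifier = pipeline("sentiment-analysis")

# The pipeline handles tokenization, inference, and post-processing in one call.
result = classifier("We are very happy to show you this library.")
print(result)
```

The same pattern applies to the other tasks listed: pass the task string, then call the returned object on your text, image, or audio input.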
Member:

I think this introduction to the different tasks was quite useful! As a user, I feel like I wouldn't necessarily come to the quicktour to learn how to use the library deeply, but rather with a need and a task I'd want to solve. In that case, showcasing what is supported straight away would be helpful.

stevhliu (Member, Author):

Ok, I think Sylvain also advocated for showcasing everything that's supported, so I'll keep it. Going to experiment a bit with presenting the tasks in a table :)

Comment on lines 427 to 431
<Tip>

For tasks that use a sequence-to-sequence model like translation or summarization, use the [`Seq2SeqTrainer`] and [`Seq2SeqTrainingArguments`] classes instead.

</Tip>
Member:

Here I'd also show code samples.

stevhliu (Member, Author):

Since the code sample is so similar to the Seq2Seq classes, what do you think about just clarifying the tip to say that you can copy the above code and remove Seq2Seq? This way, we can avoid being too repetitive.

Comment on lines +433 to +435
You can customize the training loop behavior by subclassing the methods inside [`Trainer`]. This allows you to customize features such as the loss function, optimizer, and scheduler. Take a look at the [`Trainer`] reference for which methods can be subclassed.

The other way to customize the training loop is by using [Callbacks](./main_classes/callbacks). You can use callbacks to integrate with other libraries and inspect the training loop to report on progress or stop the training early. Callbacks do not modify anything in the training loop itself. To customize something like the loss function, you need to subclass the [`Trainer`] instead.
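As a rough sketch of both approaches (the cross-entropy loss below is only an illustrative placeholder, and the exact `compute_loss` signature can differ between library versions):

```python
import torch
from transformers import Trainer, TrainerCallback

class CustomTrainer(Trainer):
    # Subclassing: override compute_loss to change how the loss is calculated.
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        # Placeholder loss: plain cross-entropy over the model's logits.
        loss = torch.nn.functional.cross_entropy(
            outputs.logits.view(-1, model.config.num_labels), labels.view(-1)
        )
        return (loss, outputs) if return_outputs else loss

class LoggingCallback(TrainerCallback):
    # Callbacks: inspect the loop without modifying it.
    def on_epoch_end(self, args, state, control, **kwargs):
        print(f"finished epoch {state.epoch}")
```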
Member:

That's more for the Trainer page, but I'd showcase an example of subclassing each method. I think there's a single one shown for compute_loss, but subclassing requires understanding which inputs/outputs will work, and that's not necessarily straightforward for a beginner.

@sgugger (Collaborator) left a comment:

Thanks for working on this! You should complete the training section with the same training using Keras.fit for TensorFlow models.
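A hedged sketch of what that TensorFlow counterpart could look like (the checkpoint name is a placeholder, and the tokenized `tf.data.Dataset` is assumed to already exist, so the `fit` call is left commented out):

```python
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification

# Placeholder checkpoint; any TensorFlow sequence classification model works.
model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

# Transformers TF models can compute their loss internally when labels are
# present in the batch, so compile only needs an optimizer.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5))

# tf_dataset: a tf.data.Dataset of tokenized batches (assumed to exist).
# model.fit(tf_dataset, epochs=3)
```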

@stevhliu stevhliu changed the title [WIP] Add Trainer to quicktour Add Trainer to quicktour Sep 2, 2022
@stevhliu stevhliu marked this pull request as ready for review September 2, 2022 20:05
@stevhliu stevhliu merged commit 65fb71b into huggingface:main Sep 2, 2022
@stevhliu stevhliu deleted the trainer-quicktour branch September 2, 2022 20:05
4 participants