Added video processing section (Unit 7 - Transformers based models) #351

mreraser · 2024-10-03T08:33:57Z

Co-authored-by: seoulsky-field
seoulsky.field02@gmail.com

What does this PR do?

Adds a Transformer-based models chapter to the video processing section. The document gives an overview of how Transformer models are applied to video processing, focusing on the Vision Transformer (ViT), its video-specific variant, the Video Vision Transformer (ViViT), and the TimeSformer model.

Thank you in advance for your review.

Part of Proposed Outline Revision for Unit 7. Video & Video Processing / discussions #348

Who can review?

@jungnerd @cjfghk5697 @1kmmk1 and anyone who wants to review!

Who can review (Final)

jungnerd

Fixed minor typos and suggested the name of the anchor.
Other than that, everything looks good to me 👍🏻

chapters/en/unit7/video-processing/transformers-based-models.mdx

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

johko

Thanks for the contribution @mreraser !
I think the main basis for most of my comments is that we need to keep in mind that the course should also be easy to read for beginners. Sometimes you assume a bit too much prior knowledge; adding some more background info here and there would already be really great.

But apart from that, it is a great piece of education. I already learned quite a few things just from reading through it once. Thank you so much for the effort 🤗

johko · 2024-10-23T18:45:17Z

chapters/en/unit7/video-processing/transformers-based-models.mdx

+<div class="flex justify-center">
+    <img src="https://huggingface.co/datasets/hf-vision/course-assets/resolve/main/transformer_based_video_model/unit7_1_vit_architecture.png" alt="Vision transformer architecture"></img>
+</div>
+<small>ViT architecture. Taken from the <a href= "https://arxiv.org/abs/2010.11929"> original paper</a>.</small>


I'm not exactly sure how these blocks will look in HF markdown (once again), as currently the preview is missing because of the persisting token error. So I will just assume it is alright for now, but once we can see how it really looks in the docs, you might need to change it.

mreraser · Oct 28, 2024

Yes, I understand, and I’ll make adjustments if any issues arise in the future.

johko · 2024-10-23T18:51:16Z

chapters/en/unit7/video-processing/transformers-based-models.mdx

+<small>ViViT architecture. Taken from the <a href = "https://arxiv.org/abs/2103.15691">original paper</a>.</small>
+
+### Embedding video clips[[embedding-video-clips]]
+


Could you add a few words here about what embeddings are and why they are important (just a short note for beginners)? Please also say why you explain Uniform Frame Sampling and Tubelet Embeddings. Right now I feel this part is missing some context.
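For readers following this thread, the two embedding schemes named in the comment can be illustrated by counting the tokens each one produces. The sketch below is not code from the PR. It assumes the definitions in the ViViT paper (non-overlapping tubelets of size t x h x w, giving a token grid of n_t x n_h x n_w), and the function names and the example clip shape are hypothetical.

```python
def tubelet_grid(T, H, W, t, h, w):
    """Token grid (n_t, n_h, n_w) for non-overlapping t x h x w tubelets.

    T, H, W are the clip's frame count, height, and width;
    t, h, w are the tubelet's extent along the same axes.
    """
    if any(d <= 0 for d in (T, H, W, t, h, w)):
        raise ValueError("all sizes must be positive")
    # Integer division: frames or pixels that do not fill a whole
    # tubelet are dropped in this sketch.
    return T // t, H // h, W // w


def uniform_frame_grid(n_sampled, H, W, h, w):
    """Token grid when n_sampled frames are each split into 2D h x w patches.

    This matches tubelets with a temporal extent of 1 over the sampled frames.
    """
    return tubelet_grid(n_sampled, H, W, 1, h, w)


def num_tokens(grid):
    """Total number of tokens fed to the Transformer for a given grid."""
    n_t, n_h, n_w = grid
    return n_t * n_h * n_w


if __name__ == "__main__":
    # Hypothetical clip: 32 frames of 224 x 224 pixels.
    tubelets = tubelet_grid(32, 224, 224, t=2, h=16, w=16)
    frames = uniform_frame_grid(8, 224, 224, h=16, w=16)
    print(tubelets, num_tokens(tubelets))
    print(frames, num_tokens(frames))
```

The sketch only counts tokens. In the actual models each patch or tubelet is also linearly projected to an embedding vector and given a positional embedding, which is the part the chapter text explains.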

chapters/en/unit7/video-processing/transformers-based-models.mdx

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

mreraser · 2024-10-24T06:40:08Z

@johko Thank you very much for your review! 😄 I will carefully read through the details you provided and make the necessary revisions accordingly.

mreraser · 2024-11-13T16:55:56Z

Hello @johko! 😃

I have carefully reviewed your feedback and addressed the points you mentioned as follows:

- Removed all anchor points
- Fixed minor typos
- Added an explanation of embeddings and their importance
- Clarified the reason for discussing Uniform Frame Sampling and Tubelet Embeddings
- Provided a definition of spatio-temporal tokens
- Explained the term "contextualize"
- Defined n_w, n_h, and n_t earlier in the text

Thank you for your guidance, and please let me know if there’s anything else you’d like me to improve!

johko

Thank you for the changes 🙂
LGTM 👍

mreraser · 2024-11-13T23:10:38Z

> Thank you for the changes 🙂 LGTM 👍

Thank you @johko 👍 I also resolved some toctree conflicts. Have a good one!

chapters/en/unit7/video-processing/transformers-based-models.mdx

ATaylorAerospace

Great additions. LGTM!

mreraser added 3 commits October 3, 2024 17:19

docs: unit7/video-processing/transformers-based-models.mdx

791cbfd

_toctree.yml modification

e8b8c65

name added to welcome.mdx

1670cf4

mreraser requested review from merveenoyan and johko as code owners October 3, 2024 08:33

mreraser added 2 commits October 5, 2024 17:07

Co-authored-by: seoulsky-field <seoulsky.field02@gmail.com>

5373b2a

Merge branch 'stage' into docs-unit7/transformer_based_models

5cc06bd

jungnerd reviewed Oct 8, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser and others added 2 commits October 8, 2024 15:16

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

7f7e811

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

c313920

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

jungnerd reviewed Oct 8, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

jungnerd reviewed Oct 8, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

jungnerd reviewed Oct 8, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser and others added 3 commits October 8, 2024 15:54

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

8faa705

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

48f7543

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

60ca8ed

Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>

johko reviewed Oct 23, 2024

mreraser and others added 2 commits October 24, 2024 15:31

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

6f6e127

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

86de76c

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

mreraser and others added 10 commits October 24, 2024 15:40

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

1e4229f

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

eb9beb6

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

06b67c0

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

782be93

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

6a34b5d

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

ace8cfe

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

0e53c5f

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

42764ab

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

6dfc0a5

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

0e147ec

Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>

mreraser added 3 commits October 28, 2024 20:23

Merge branch 'johko:stage' into docs-unit7/transformer_based_models

d3433c7

Add explanations about embedding, why that matters, and why we should…

9c9e219

… learn about uniform frame sampling, tubelet embedding

Add explanations about 'spatio-temporal token', 'contextualize', and …

6bac166

…explain the meaning of n_w, n_h, n_t earlier.

johko approved these changes Nov 13, 2024

Merge branch 'stage' into docs-unit7/transformer_based_models

844097e

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser commented Nov 14, 2024

chapters/en/unit7/video-processing/transformers-based-models.mdx

mreraser added 10 commits November 14, 2024 17:35

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

c0c6617

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

fff1703

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

59ed64d

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

bcb5e8f

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

38e2da2

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

8bc4722

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

2fbe7f9

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

c6d8a3e

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

dab2328

Update chapters/en/unit7/video-processing/transformers-based-models.mdx

487a442

ATaylorAerospace self-requested a review November 14, 2024 10:47

ATaylorAerospace approved these changes Nov 14, 2024

ATaylorAerospace merged commit 206b0be into johko:stage Nov 14, 2024
2 checks passed
