Added video processing section (Unit 7 - Transformers based models) #351
Fixed minor typos and suggested the name of the anchor.
Other than that, everything looks good to me 👍🏻
chapters/en/unit7/video-processing/transformers-based-models.mdx
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Thanks for the contribution, @mreraser!
The main point behind most of my comments is that the course should also be easy to read for beginners. Sometimes you assume a bit too much prior knowledge; adding some more background info here and there would already be really great.
But apart from that, it is a great piece of educational material. I already learned quite a few things just from reading through it once. Thank you so much for the effort 🤗
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/hf-vision/course-assets/resolve/main/transformer_based_video_model/unit7_1_vit_architecture.png" alt="Vision transformer architecture"></img>
</div>
<small>ViT architecture. Taken from the <a href= "https://arxiv.org/abs/2010.11929"> original paper</a>.</small>
I'm not exactly sure how these blocks will look in HF markdown (once again), as currently the preview is missing because of the persisting token error. So I will just assume it is alright for now, but once we can see how it really looks in the docs, you might need to change it.
Yes I understand, and I’ll make adjustments if any issues arise in the future.
<small>ViViT architecture. Taken from the <a href = "https://arxiv.org/abs/2103.15691">original paper</a>.</small>

### Embedding video clips[[embedding-video-clips]]
Could you add a few words here about what embeddings are and why they are important (just a short note for beginners)? Please also say why you are going to explain Uniform Frame Sampling and Tubelet Embeddings. Right now I feel like this part is missing some context.
Co-authored-by: Johannes Kolbe <2843485+johko@users.noreply.github.com>
@johko Thank you very much for your review! 😄 I will carefully read through the details you provided and make the necessary revisions accordingly.
… learn about uniform frame sampling, tubelet embedding
…explain the meaning of n_w, n_h, n_t earlier.
Hello @johko! 😃 I have carefully reviewed your feedback and addressed the points you mentioned as follows:
Thank you for your guidance, and please let me know if there’s anything else you’d like me to improve!
Thank you for the changes 🙂
LGTM 👍
Thank you @johko 👍 I also resolved some toctree conflicts. Have a good one!
Great additions. LGTM!
Co-authored-by: seoulsky-field <seoulsky.field02@gmail.com>
What does this PR do?
Added the "Transformers based models" chapter to the video processing section. This document provides an overview of how Transformer models are applied in video processing, focusing on the Vision Transformer (ViT), its video-specific variant, the Video Vision Transformer (ViViT), and the TimeSFormer model. Thank you in advance for your review.
Part of Proposed Outline Revision for Unit 7. Video & Video Processing / discussions #348
Who can review?
@jungnerd @cjfghk5697 @1kmmk1 and anyone who wants to review!
Who can review (Final)