-
Notifications
You must be signed in to change notification settings - Fork 27.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split docs on modality #18205
Split docs on modality #18205
Conversation
The documentation is not available anymore as the PR was closed or merged. |
Not really convinced by this as it's now very unclear that all the task tutorials are task-specific tutorials. I liked it better when they were grouped altogether under task. It also backfires from the intent as this shows we don't really support vision (one entry) or speech (two entries) compared to NLP (9 entries). This revamp also puts advanced guides on the top when they should be way lower in the table of contents (such as Benchmarks or Migrating from other packages). |
Thanks for the feedback! 🤗
Hmm, do you mean it’s more unclear now because each task tutorial is separated by modality? This seems clearer to me since the section headers are more scannable.
Good perspective, and I totally see what you mean! I think another way to look at it is by creating these new sections, we’re signaling that we plan to create more content for audio/CV. This gives these sections more prominence, which shows we want to focus on audio/CV. So even though it looks pretty bare right now, I think that’s ok since these sections will grow.
I don’t think we should put guides lower because they are more advanced. Instead, it may be better to prioritize guides that users are more likely to find useful. For example, I think the Migration/Train with a script guides are pretty useful. This may be a symptom of how I grouped all these guides under General Usage, in which case, we can try breaking up the section and reordering these guides by their utility. |
They were all under a "Task" section, which is not the case anymore in your proposal. In NLP, you go from fast tokenizers and multilingual to a task-specific tutorial with no warning to the user.
Benchmarks or migrating are both advanced and not useful (benchmarks have 0 issues and we are even questioning whether they should stay in the library and pytorch-pretrained-bert ceased to exist a while ago). You should check the analytics to be certain, but I'm pretty sure they are very far from the most-visited pages and they are definitely very low on the list of pages we want to nudge the users on. |
Ok I see now! You're worried users won't know the task-specific guides are guides about fine-tuning a model for a task if it is just thrown into the NLP section. I think there are some things we can do to help make this clearer to users (in order of preference):
For sure! Benchmark, Migration, and Troubleshoot are bottom-3 in page views in the General Usage section. I can bump these out and move them closer to the bottom. |
Option 2 or 3 are good compromise (my preference goes to 3 if nested-ness is not an issue). I'd leave Troubleshoot in the General Usage section (hopefully we can make it better so it gets more views), but yeah, the other two are out of place there IMO. Let's see what other people think as well, @LysandreJik @patrickvonplaten to name a few :-) |
I also think that option 3) sounds like the best approach. I don't have a problem with adding a nesting level. |
I nested the NLP section but it looks a little off since the content inside isn't aligned on the same level (I pinged @mishig25 on this). I didn't add a nested level for the audio and image sections since there's no content in those sections yet, and it might look a little strange. |
Hi team, just wanted to circle back on this and see if there are any more comments or feedback about how the docs are split. Otherwise, I think we're ready to merge! 🙂 |
Option 3.) Also looks like the right one to me :-) However I'm not a big fan of "Image" as a title. Could we maybe try to align those sections a bit with how we call the modalities on the Hub: https://huggingface.co/tasks -> so maybe replace "Image" with "Computer Vision"? |
83205b1
to
a6f45f5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM barring the quotes to remove everywhere. Thanks for iterating on this PR!
* update * 🖍 add missing files * 📝 add nested sections * 🖍 align titles with tasks * oops * remove quotes from titles
Currently, audio and computer vision lack content or the existing content is mixed with NLP. This PR splits the
toctree
on modality to make it easier to discover content for audio/computer vision while also allowing us to also scale to any additional modalities we want to support. As we create additional content, these new sections make it easier to collect specific content in one place. For example, the upcominggenerate
docs can be placed in the NLP section.This structure can also help us identify gaps in the docs between each modality to ensure documentation is complete. For example, NLP has a page about tokenizers, and we can create a similar page for the other modalities using feature extractors and processors.
Other sections include:
After we split the docs, the next step would be to start planning and creating additional content to make the audio/computer vision sections more complete.
Looking forward to hearing what you think :)