Translate to Italian #45

Open
41 of 69 tasks
lewtun opened this issue Mar 29, 2022 · 64 comments

Comments

@lewtun
Member

lewtun commented Mar 29, 2022

Hi there 👋

Let's translate the course to Italian so that the whole community can benefit from this resource 🌎!

Below are the chapters and files that need translating - let us know here if you'd like to translate any and we'll add your name to the list. Once you're finished, open a pull request and tag this issue by including #issue-number in the description, where issue-number is the number of this issue.

🙋 If you'd like others to help you with the translation, you can also post in our forums or tag @_lewtun on Twitter to gain some visibility.

Chapters

0 - Setup

1 - Transformer models

2 - Using 🤗 Transformers

3 - Fine-tuning a pretrained model

4 - Sharing models and tokenizers

5 - The 🤗 Datasets library

6 - The 🤗 Tokenizers library

7 - Main NLP tasks

8 - How to ask for help

Events

@CaterinaBi
Contributor

Hello Lewis, I'd love to contribute! I'm a postdoctoral researcher in theoretical linguistics at the University of Cambridge, UK. Italian is my native language. I'd love to translate modules 1 and 2 for a start. The only thing that scares me a bit is that I'm new to GitHub, so I might end up needing some help...

@lewtun
Member Author

lewtun commented Mar 29, 2022

Hey @CaterinaBi thank you - we'd love to have your help with the translation! Feel free to create a post and tag me with @lewtun on our forums (https://discuss.huggingface.co/c/course/20) if you need some help on the GitHub side 🤗

@CaterinaBi
Contributor

Hey @lewtun, amazing! I'll start straight away. I'll translate the 'Transformer models' and 'Using HF Transformers' then. Do you want me to take care of the Setup instructions, too?

@lewtun
Member Author

lewtun commented Mar 29, 2022

> Hey @lewtun, amazing! I'll start straight away. I'll translate the 'Transformer models' and 'Using HF Transformers' then. Do you want me to take care of the Setup instructions, too?

Awesome! Sorry I tagged you for the Setup section by accident 😬 . On the other hand, that might be an easy way to get familiar with GitHub and pull requests, so maybe you'd like to start there?

@CaterinaBi
Contributor

Yes, I'll start from there and tag you in the forum if I get lost (get ready to hear from me soon!).

@sharkovsky
Contributor

Hi @lewtun, I'd also like to contribute. I have a PhD in computational neuroscience from École Polytechnique Fédérale de Lausanne. Italian is my native language as well.
How about I start with module 3, and see how it goes? Is that acceptable?

@lewtun, @CaterinaBi, should we think of a way to "standardize" our translations (a shared glossary/vocabulary or something similar)? To make sure we all translate common things such as "train a neural network" in the same way.

Thank you!

@lewtun
Member Author

lewtun commented Mar 29, 2022

Hey @sharkovsky, thanks for helping out and good idea about a shared glossary! Feel free to create a comment here which lists the core terms. I'll also add your name to module 3 :)

@ClonedOne
Contributor

Hi! I would also be happy to help. I'm a PhD student at Northeastern University and Italian is my native language.
I can take chapter 4 if no one is working on that.

@CaterinaBi
Contributor

Hi @sharkovsky, having a shared glossary is a terrific idea. How about we take a few days to go through the materials, then have a quick chat and publish the standardised translations here?

@lewtun
Member Author

lewtun commented Mar 30, 2022

Hi @ClonedOne thank you for offering to help! I've added your name to Chapter 4 🚀 !

@Nolanogenn
Contributor

Hi! I would like to help. I am a PhD student at University of Napoli "L'Orientale", and Italian is my native language. I could work on Chapter 5 if nobody's working on it!

@lewtun
Member Author

lewtun commented Mar 30, 2022

Thank you @Nolanogenn for offering to help! I've added your name to the list 🙏

@sharkovsky
Contributor

sharkovsky commented Mar 30, 2022

@lewtun, @CaterinaBi, @ClonedOne, @Nolanogenn maybe we can come up with a strategy for how to translate common terms, for example "machine learning". Here are some options:

  1. always leave it in English
  2. always translate it into Italian (in this case, the official translation is "apprendimento automatico")
  3. always translate it into Italian, but give the English term alongside it the first time it appears.

The third option looks something like "apprendimento automatico (machine learning in inglese)"

Wikipedia seems to favour option 3, and I would also vote in favour of that. I know that the Italian term always sounds a bit "weird", but I feel that since we're making the effort to do a translation anyway, it's nice to try to use as many Italian words as possible.

But I'm open to discussion, what is your opinion?

@lewtun
Member Author

lewtun commented Mar 30, 2022

Thanks for the insight and suggestions on how we can handle the machine learning jargon @sharkovsky ! I really like the analogy with Wikipedia, so I would also favour option (3). I'm putting together a general TRANSLATING.md guide, so I will add this suggestion if the other Italian speakers agree it makes sense :)

@sharkovsky
Contributor

Ah, another issue that comes up in Italian, and may appear in other languages, is how you want to address the reader.
In English you say: "But what if you want to ...?"
In Italian you should choose between:

  1. (informal singular you) "Ma cosa fare se vuoi ...?"
  2. (informal plural you) "Ma cosa fare se volete ...?"
  3. (formal singular) "Ma cosa fare se vuole ...?"
  4. (impersonal) "Ma cosa fare se si vuole ...?"

Option 4 is equivalent to the English "But what if one wants to ...?"

I would vote for option 4, except in those rare cases where it sounds really clunky and weird, where I would fall back on option 2. But as before, I am open to discussing other ideas!

You'll probably have the same issue in other languages (French and Spanish at least, I assume), so if you want to enforce a "centralized" approach through your TRANSLATING.md, I'll be happy to follow that as well.

@sharkovsky
Contributor

@CaterinaBi, @ClonedOne, @Nolanogenn you can find my first attempt at a translation of one file in my fork. I'm happy to receive some feedback if you think some things can be improved/better expressed... I'd rather discuss as much as possible now that we're still in a preliminary phase 😄

In a provisional manner, I also created a first glossary of terms that I think could be useful. But again, I'm happy to discuss both the translations and the format of the glossary! For example, now that I think of it, putting a file in my fork is probably not the best way to share a glossary.... @lewtun do you have any suggestions for something that we could all see and edit?

@davidemastricci

Hi @lewtun, I'm a data scientist and chatbot developer, and I'd like to help with chapter 6.
I'm taking the 🤗 HF course and was about to start that chapter, so it would be great to translate it while learning.

Italian is my main language!

@ClonedOne
Contributor

@sharkovsky totally agree with both your points. I really like the glossary idea! I ended up with mostly the same translations :) except for a couple of things I'd like to suggest. Maybe we should move the discussion about the glossary to a forum post, so that it's easier to access it and suggest edits?

@davidemastricci

> @lewtun, @CaterinaBi, @ClonedOne, @Nolanogenn maybe we can come up with a strategy for how to translate common terms, for example "machine learning". Here are some options:
>
>   1. always leave it in English
>   2. always translate it into Italian (in this case, the official translation is "apprendimento automatico")
>   3. always translate it into Italian, but give the English term alongside it the first time it appears.
>
> The third option looks something like "apprendimento automatico (machine learning in inglese)"
>
> Wikipedia seems to favour option 3, and I would also vote in favour of that. I know that the Italian term always sounds a bit "weird", but I feel that since we're making the effort to do a translation anyway, it's nice to try to use as many Italian words as possible.
>
> But I'm open to discussion, what is your opinion?

@sharkovsky Since there is already a small barrier to approaching the Hugging Face library, in that you need to be familiar with terms like Machine Learning and Deep Learning, adding translations that sound weird in Italian (e.g. "apprendimento automatico" or "apprendimento profondo") could make the text less fluent to read.

@lewtun
Member Author

lewtun commented Mar 31, 2022

Thanks for this great discussion @sharkovsky - it definitely exposes some subtleties with translation projects :)

For the glossary, I suppose the simplest thing right now would be to share a Google / Notion doc that others can make suggestions to. Notion is probably easier since it supports Markdown and will make it simple to copy back to this repo.

As for how we distribute the glossary, I see two possibilities:

  1. Include it as a standalone file to help guide translators
  2. Add it as a new chapter (e.g. at the very end of the course) in an MDX file and render that on the website.

If you think a glossary would be helpful for course readers, then I would favour option 2.

@CaterinaBi
Contributor

> Ah, another issue that comes up in Italian, and may appear in other languages, is how you want to address the reader.
> In English you say: "But what if you want to ...?"
> In Italian you should choose between:
>
>   1. (informal singular you) "Ma cosa fare se vuoi ...?"
>   2. (informal plural you) "Ma cosa fare se volete ...?"
>   3. (formal singular) "Ma cosa fare se vuole ...?"
>   4. (impersonal) "Ma cosa fare se si vuole ...?"
>
> Option 4 is equivalent to the English "But what if one wants to ...?"
>
> I would vote for option 4, except in those rare cases where it sounds really clunky and weird, where I would fall back on option 2. But as before, I am open to discussing other ideas!
>
> You'll probably have the same issue in other languages (French and Spanish at least, I assume), so if you want to enforce a "centralized" approach through your TRANSLATING.md, I'll be happy to follow that as well.

Hi guys,

sorry for the late reply but I took a day off yesterday.

I agree with the need to standardise our translations, although I am quite torn on the question of whether or not we want to translate the technical terms. I believe that if we want a clean Italian version we should use the proposed form 'apprendimento automatico (machine learning)', but at the same time it's almost a pity to do so when literally everyone in Italy says 'machine learning' (I had to google the translation myself, I wasn't even aware that 'apprendimento automatico' was a thing). So what do we do? @davidemastricci, you had a good point there.

As for the way we address the reader that @sharkovsky mentioned ('But what if you want to ...?') I believe the best translation in Italian would be with an infinitive: 'Ma cosa/come fare per...'. None of the ones that were suggested sound natural to me.

What about the glossary, are we going to add a .mdx file here?

@CaterinaBi
Contributor

> @CaterinaBi, @ClonedOne, @Nolanogenn you can find my first attempt at a translation of one file in my fork. I'm happy to receive some feedback if you think some things can be improved/better expressed... I'd rather discuss as much as possible now that we're still in a preliminary phase 😄
>
> In a provisional manner, I also created a first glossary of terms that I think could be useful. But again, I'm happy to discuss both the translations and the format of the glossary! For example, now that I think of it, putting a file in my fork is probably not the best way to share a glossary.... @lewtun do you have any suggestions for something that we could all see and edit?

Hi @sharkovsky , I've checked out your fork and your first translation seems fine to me ;)

@sharkovsky
Contributor

sharkovsky commented Apr 1, 2022

Hi everyone,
following @ClonedOne's sensible suggestion, I converted this discussion into a forum post.

@CaterinaBi, @lewtun, @davidemastricci I tried to interpret your votes, but please feel free to correct any mistakes I made.

Everyone, please let's use the forum post to discuss from now on since it will be much clearer. I will try to monitor it closely and add any words that you suggest to the vocabulary as quickly as possible.

@davidemastricci

@sharkovsky the forum link does not work anymore.

@michimichiamo

Hello everyone, I am Michele and recently graduated in Artificial Intelligence at the University of Bologna. Italian is my native language and I would be glad to join the translation! Since Chapter 7 is still to be assigned, I propose to help with that.

@lewtun
Member Author

lewtun commented Apr 1, 2022

Hey @sharkovsky thanks for creating the forum post! I've asked one of the admins to unblock it and hopefully that happens soon. Edit: it's fixed!

@michimichiamo I've added your name to the list - welcome!

@lewtun
Member Author

lewtun commented Apr 1, 2022

By the way, I realised from the other translations that we need the first section of Chapter 1 to be translated in order for the course to render on the website. My suggestions would be:

  • @CaterinaBi would you like to open a pull request with the first section translated?
  • Iterate on the glossary in the forum, and then add it as a new "chapter" to the course.

For the second point, we can then have a section in the _toctree.yml file with something like:

- title: Glossario
  sections:
  - local: glossary/1
    title: Glossario 

I think this way Italian readers can benefit from the great work you're doing to handle the various bits of ML jargon!

@EdAbati
Contributor

EdAbati commented Jul 6, 2022

Hi everyone, I hope you are all good!
Not sure if you have seen it, but I opened PR #272 with Chapter 8. It is still WIP, but I hope to finish soon. In the meantime, if you want to proofread it and leave suggestions, feel free :)

@CaterinaBi
Contributor

CaterinaBi commented Jul 6, 2022 via email

@sharkovsky
Contributor

Hi all, I've also opened PR #283 for chapter 3. Everything has been translated, but I would like to wait about a week to give everyone time to review it.

@gdacciaro
Contributor

Hi, I am an Italian student of AI. Can I help you with some chapters?

@lewtun
Member Author

lewtun commented Aug 22, 2022

> Hi, I am an Italian student of AI. Can I help you with some chapters?

Welcome to the group @gdacciaro ! Yes, you are more than welcome to translate some chapters 🚀

One option would be to see if @CaterinaBi is still working on Chapter 2 - if not, perhaps that would be a good place to start?

@CaterinaBi
Contributor

CaterinaBi commented Aug 22, 2022 via email

@gdacciaro
Contributor

Fine, I will translate chapter 2 :)

@EdAbati
Contributor

EdAbati commented Sep 13, 2022

Hi @lewtun, I think I can still help a bit. Are there any other sections that need to be translated?

@lewtun
Member Author

lewtun commented Sep 20, 2022

> Hi @lewtun, I think I can still help a bit. Are there any other sections that need to be translated?

Nice, thanks for offering to help @EdAbati ! If @gdacciaro or @sharkovsky aren't working on Chapters 2 or 3, I think those would be great to have translated :)

@gdacciaro
Contributor

I'm actually working on it

@EdAbati
Contributor

EdAbati commented Sep 20, 2022

Hi @lewtun :) I think that Chapter 3 has been translated already in #283

@CaterinaBi
Contributor

CaterinaBi commented Sep 20, 2022 via email

@lewtun
Member Author

lewtun commented Sep 20, 2022

> Hi @lewtun :) I think that Chapter 3 has been translated already in #283

Oh you're right! I've just updated the issue :)

In that case, Chapter 9 is a safe bet since it was added after this issue was created :)

@CaterinaBi
Contributor

CaterinaBi commented Oct 11, 2022 via email

@CaterinaBi
Contributor

CaterinaBi commented Oct 11, 2022 via email

@sharkovsky
Contributor

Hi all, I see that there are only a few chapters left to go... Can I help anyone?

@EdAbati
Contributor

EdAbati commented Jan 9, 2023

Hey @sharkovsky , do you want to help me with Chapter 9? I only did the first 3 sections, we can split the remaining 6 if you want.
Do you want to do the last 3 sections: "intro to Blocks", "Gradio check" and "end of chapter quiz"? :)

@nickprock

Hi @gdacciaro , in "Using 🤗 Transformers", "Behind the pipeline" I found a small error.

"Now if we look at the shape of our outputs, the dimensionality will be much lower" is translated as "Ora, se osserviamo la forma dei nostri input, la dimensionalità sarà molto più bassa".

Thanks for your work guys 👍

@sharkovsky
Contributor

Hi @EdAbati , yes I can do those last three sections!
Do you mind sharing the link to your fork/branch, so I can take a quick look at what you've already done and maybe start from there? (Only if you don't mind! It's not a requirement.)
Thank you!

@gdacciaro
Contributor

> Hi @gdacciaro , in "Using 🤗 Transformers", "Behind the pipeline" I find a little error.
>
> "Now if we look at the shape of our outputs, the dimensionality will be much lower" is translated as "Ora, se osserviamo la forma dei nostri input, la dimensionalità sarà molto più bassa".
>
> Thanks for your work guys 👍

Thanks!

@EdAbati
Contributor

EdAbati commented Jan 30, 2023

@sharkovsky , I have already merged the branch with the first 3 sections in #373. I will start sections 4-6 in the next few days :)

@nickprock

Hi @CaterinaBi @michimichiamo can I help you with chapter 7?
I've already worked on the Italian translation of the transformers library documentation.

@AlessandroMiola

Hi everyone:)
May I help with chapter 6?

@gdacciaro
Contributor

Can I propose a Telegram group to organize the work?

Write me an email (acciarogennaro@gmail.com) and I will send you the link.
