[add] translatepy v2.0 #15
Just cloned your fork, I'll check how it works! |
@ZhymabekRoman What do you think of my commit? |
Also, I thought about adding a better shell version (with something like |
I really like that we can very easily add more translators, and even open this up to some sort of plugins, because the user can add whatever Also, I think returning the detected language rather than "auto" is needed. Without it, the user has to make another request when, for example, they want to translate something but show the translation only when its language differs from the original language. |
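To illustrate the point about returning the detected language, here is a minimal sketch. Everything in it is hypothetical: `Result`, `fake_translate` and `translation_to_show` are illustrative names, and the "detection" is a toy rule standing in for a real service call. It only shows that one request is enough when the result carries the detected language.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """Hypothetical result shape, not translatepy's actual model."""
    text: str
    detected_language: str  # what the service detected, never "auto"


def fake_translate(text: str, destination: str) -> Result:
    """Stand-in for a real service call; 'detects' the language with a toy rule."""
    detected = "fr" if text.lower().startswith("bonjour") else "en"
    translated = "Hello" if detected == "fr" and destination == "en" else text
    return Result(text=translated, detected_language=detected)


def translation_to_show(text: str, destination: str) -> Optional[str]:
    """Return the translation only when the source language differs."""
    result = fake_translate(text, destination)  # a single request
    if result.detected_language == destination:
        return None  # same language, nothing worth showing
    return result.text
```

If the result only said "auto", the caller would need a separate language detection request before deciding whether to show anything.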
Thank you, great work.
I think we can use |
What's cool about |
Do you think it would be a good idea to use CSV to store ISO 639 values? Wouldn't it be better to use JSON instead of CSV? The new CSV table solves a few problems:
I also added values to the list that are not in the official ISO 639 standard. For example, Yandex Translator can translate text into emoji, and only Yandex supports translating Latin Kazakh and Cyrillic Uzbek. |
I guess we could keep the data in CSV and convert it to a Python dict, so that it is a native Python object and there is no I/O time when launching. I never worked concretely with CSV, but if you are familiar with it, why not. Also, we should add the translation to the other languages |
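A rough sketch of that "CSV as the editable source, Python as the runtime format" idea follows. The column names (`name`, `alpha2`, `alpha3`), the sample rows, and the function name `csv_to_python_source` are assumptions for illustration; the real table and export script may look different.

```python
import csv
import io

# Hypothetical column layout; the real iso639.csv may use different headers.
SAMPLE_CSV = """name,alpha2,alpha3
English,en,eng
French,fr,fra
Kazakh,kk,kaz
"""


def csv_to_python_source(csv_text: str, type_name: str = "LanguageRow") -> str:
    """Render CSV rows as Python source defining a tuple of named tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = list(reader.fieldnames or [])
    lines = [
        "from collections import namedtuple",
        "",
        f"{type_name} = namedtuple({type_name!r}, {fields!r})",
        "",
        "ROWS = (",
    ]
    for row in reader:
        args = ", ".join(f"{key}={row[key]!r}" for key in fields)
        lines.append(f"    {type_name}({args}),")
    lines.append(")")
    return "\n".join(lines) + "\n"


# In practice the generated text would be written to a .py file once;
# here it is executed in memory to show that it is valid, importable code.
generated = csv_to_python_source(SAMPLE_CSV)
namespace = {}
exec(generated, namespace)

# Build a lookup dict so no CSV parsing happens at import time.
BY_ALPHA2 = {row.alpha2: row for row in namespace["ROWS"]}
```

The CSV stays the file people edit by hand, while the generated module is what gets imported, so startup does not pay for parsing.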
@ZhymabekRoman I just added the interactive interface to |
Nice! I fixed some bugs |
So, I've reworked the ISO 639 data storage mechanism a little bit. All the information is now stored in a CSV table, which is very easy to edit and keeps the data structured. All of the ISO 639 data and the languages supported by the services were compiled from scratch from public sources. By the way, GitHub can display CSV in the browser: https://github.com/ZhymabekRoman/translate/blob/main/playground/iso639.csv

I also wrote a script that converts the CSV to Python code (a named tuple). The script is at translate/playground/export_csv_iso639_table.py, and it generates the file iso639_table.py, which should be put into the translate/translatepy/utils/ folder. Also, I accidentally deleted the playground folder and all the scripts inside it; if any of them were needed, restore them from the Git history.

Here are a couple of examples of the changes. Previously, the Language class did not provide information about languages that are not on the ISO 639 list but are supported by services. Now you can get full information about a language if you know the language code used by the translation service. For example, take a language supported by the Bing service: Chinese Simplified. The code Bing uses for it is zh-Hans. Let's get information about the language:
Great, we have full information about the language. Let's also try to get information about emoji, which only Yandex supports and is not on the official ISO 639 list:
This also lets you get information about the language of the text. Previously, the API returned the language code
Now it returns a Language object
|
Lmao, I just remembered that it is possible to use a named tuple instead of creating separate result model classes. For example:
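A minimal sketch of what a named tuple result could look like, assuming made-up field names (`source`, `result`, `source_language`, `destination_language`) that are not taken from the actual codebase:

```python
from typing import NamedTuple


class TranslationResult(NamedTuple):
    """Hypothetical result record; field names are illustrative only."""
    source: str
    result: str
    source_language: str
    destination_language: str


res = TranslationResult("Bonjour", "Hello", "fr", "en")

# Fields are reachable by name, and the record still unpacks like a tuple.
text, translated, src, dest = res
```

The trade-off against a full class is that a named tuple is immutable and has no room for custom behaviour beyond simple methods, which may or may not matter for result models.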
|
I mean, using classes isn't that bad either lol Also, I looked at the script creating the Python version of the CSV: did everything work while doing so many translations with Yandex? (Also, if you replaced all of the translations with Yandex's, we could merge them with the previous data, generated with Google Translate, to get more data while checking the similarity to improve the accuracy.) |
@ZhymabekRoman Do you think that we should keep the Like we could just leave them as normal functions, raise an exception by default so that we don't need to add it and raise an exception on each translator class. I think though that we should keep the |
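The "raise an exception by default" idea could be sketched roughly like this. The names `UnsupportedMethod`, `BaseTranslator`, `text_to_speech` and `EchoTranslator` are hypothetical and only serve the illustration; they are not claimed to match the PR's code.

```python
class UnsupportedMethod(Exception):
    """Hypothetical exception for features a service does not offer."""


class BaseTranslator:
    """Base class whose methods raise by default (illustrative names)."""

    def translate(self, text, destination_language):
        raise UnsupportedMethod(f"{type(self).__name__} does not support translate")

    def text_to_speech(self, text):
        raise UnsupportedMethod(f"{type(self).__name__} does not support text_to_speech")


class EchoTranslator(BaseTranslator):
    """Toy service that only overrides what it actually supports."""

    def translate(self, text, destination_language):
        return text  # placeholder for a real API call


service = EchoTranslator()
try:
    service.text_to_speech("hi")
    tts_error = None
except UnsupportedMethod as err:
    tts_error = str(err)
```

With this layout a translator class only overrides what it supports, and anything else fails with one consistent exception instead of each class repeating the same `raise`.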
Yes, I tried making more than 100,000 requests and everything works fine. And I think the PR is ready. Idk why the tests don't work, but everything works fine in the Python interactive shell |
Hmmm, yeah, I think that's a great idea |
Yea, I think that I'll merge it and we'll continue the small changes on the main branch |
New features:
- Exception raising
- Proxy support (partly needs to be refined - WIP)
- Better class management, with base classes
- Full code refactoring
- New Bing Translate implementation
And more .....
WIP*:
- Fully implement text to speech function
- Convert ISO 639 to CSV
- Implement supported_languages method
*WIP - work in progress