import takes ~30 seconds #280
I have no idea at the moment. If you have the time, could you check two things? Does it also happen with just the single import line `import emoji`? And maybe test some older versions of the module, to see if a recent change introduced this problem, for example a few earlier releases.
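A minimal sketch of such a timing check, assuming the emoji package (for example an older pinned release) is installed:

```python
# Sketch: time just the bare import, nothing else.
import time

start = time.perf_counter()
import emoji  # noqa: E402  (package under test, assumed installed)
print(f"import emoji took {time.perf_counter() - start:.2f} s")
```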
If you are interested in trying it, I could create a test version for you that only has the English language (or whatever languages you need), as suggested in the other issue. That would presumably reduce memory usage overall and reduce start-up time.
Yes. Regarding emoji versions: 2.9.0 => 34 sec

But I just found out something really strange... Like I already mentioned in the beginning, this only happens on Python 3.12. With Python 3.11 this does not happen: 2.9.0 => ~0.06 sec

I still think that it has something to do with VS Code. Just to complete the list, with Python 3.11 via VS Code: 2.9.0 => 0.03 sec

From the command line:
Wow, thanks for the details! I will look into it.
I can reproduce it with Python 3.12 on Windows 10 when VS Code runs such a file in the debugger. I have created an issue at debugpy, maybe they know why this happens: microsoft/debugpy#1496
The problem seems to be the large number of lines in the data file. The dictionary in the file can be compressed into a single line, resulting in a file with just 46 lines. With the compressed file, debugging runs as fast as in Python 3.11.
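A rough sketch of how such a one-line file could be generated, assuming emoji.EMOJI_DATA is importable; this is an illustration, not the actual script behind the change:

```python
# Sketch: write the big dictionary as a single repr() line so the
# generated module only has a handful of lines.
import emoji  # assumes the emoji package is installed

with open("data_dict_oneline.py", "w", encoding="utf-8") as f:
    f.write("EMOJI_DATA = ")
    f.write(repr(emoji.EMOJI_DATA))  # note: repr() inlines plain values
    f.write("\n")
```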
@cvzi maybe it's time to think about separating the languages into different files, like it was before. What do you think?
oh wow! Thank you very much for looking into it! :)
I agree. Not sure it will help enough regarding this problem though, because the dictionary would still be huge. It takes 4 minutes on my computer at the moment. Even if it cuts the time to 10%, it would still take about 25 seconds, far too long.

Putting the dictionary into a single line is obviously really ugly, but it would be a quick fix. I guess using a different file format, not Python code, could solve this problem with the debugging, for example storing the dictionary in a JSON file and then loading the JSON file when the module is imported.
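A minimal sketch of that JSON approach, assuming emoji.EMOJI_DATA is available; an illustration, not the library's actual code:

```python
import json
import emoji

# One-time conversion: dump the data to a JSON file.
with open("emoji_data.json", "w", encoding="utf-8") as f:
    json.dump(emoji.EMOJI_DATA, f, ensure_ascii=False)

# At import time the module would then read the JSON file instead of
# executing a huge Python literal:
with open("emoji_data.json", encoding="utf-8") as f:
    EMOJI_DATA = json.load(f)
```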
I guess right now there is no work-around when using Python 3.12?
... to prevent debugging overhead in Python 3.12 carpedm20#280
I guess so. I am not really happy with putting the dict into a single line, but there seems to be no other quick workaround. And VS Code is one of the most used editors at the moment, and there already seem to be about 2,000 downloads per day of this library from Python 3.12 (according to PyPI stats).

I have deployed that commit on my own apps, and it seems to work, i.e. in a release environment, not debugging. @lsmith77 any chance you could test whether it actually solves the problem with VS Code for you? It does solve it on my computer. You can install from my branch.
BTW for reference, this is the format with each entry on a single line:

```python
EMOJI_DATA = {
    '\U0001F947': {'en': ':1st_place_medal:','status': fully_qualified,'E': 3,'de': ':goldmedaille:','es': ':medalla_de_oro:','fr': ':médaille_d’or:','ja': ':金メダル:','ko': ':금메달:','pt': ':medalha_de_ouro:','it': ':medaglia_d’oro:','fa': ':مدال_طلا:','id': ':medali_emas:','zh': ':金牌:','ru': ':золотая_медаль:','tr': ':birincilik_madalyası:','ar': ':ميدالية_مركز_أول:'},
    '\U0001F948': {'en': ':2nd_place_medal:','status': fully_qualified,'E': 3,'de': ':silbermedaille:','es': ':medalla_de_plata:','fr': ':médaille_d’argent:','ja': ':銀メダル:','ko': ':은메달:','pt': ':medalha_de_prata:','it': ':medaglia_d’argento:','fa': ':مدال_نقره:','id': ':medali_perak:','zh': ':银牌:','ru': ':серебряная_медаль:','tr': ':ikincilik_madalyası:','ar': ':ميدالية_مركز_ثان:'},
    '\U0001F949': {'en': ':3rd_place_medal:','status': fully_qualified,'E': 3,'de': ':bronzemedaille:','es': ':medalla_de_bronce:','fr': ':médaille_de_bronze:','ja': ':銅メダル:','ko': ':동메달:','pt': ':medalha_de_bronze:','it': ':medaglia_di_bronzo:','fa': ':مدال_برنز:','id': ':medali_perunggu:','zh': ':铜牌:','ru': ':бронзовая_медаль:','tr': ':üçüncülük_madalyası:','ar': ':ميدالية_مركز_ثالث:'},
    # ...
}
```

That reduces the import time (as expected), but it still takes too long, about 15 seconds on my computer.
Sorry, I didn't get to it today. Will try to do it tomorrow morning.
@lsmith77 Thanks for checking!
Thank you for this package and for caring about reports such as this one!
I did some performance tests to check the feasibility of JSON compared to the Python dictionary literal. The command to test this uses 10 repeats (this should be much higher for good average results).
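One possible way to run such a test, spawning a fresh interpreter for every repeat; a sketch, not necessarily the exact command used for the numbers above:

```python
# Sketch: average the wall-clock time of "import emoji" over several
# fresh interpreter runs.
import subprocess
import sys
import time

REPEATS = 10  # should be much higher for good average results

start = time.perf_counter()
for _ in range(REPEATS):
    subprocess.run([sys.executable, "-c", "import emoji"], check=True)
print(f"average: {(time.perf_counter() - start) / REPEATS:.3f} s per import")
```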
I am going to continue in this thread with this JSON idea, please unsubscribe if you're not interested. Any feedback or suggestions are appreciated though :)

I am thinking about making a main JSON file that has the metadata and English/aliases, and a file for each language. Main file:

```json
{
    "🗺️": {
        "E": 0.7,
        "en": ":world_map:",
        "status": 2,
        "variant": true
    },
    "🗻": {
        "E": 0.6,
        "en": ":mount_fuji:",
        "status": 2
    },
    "🗼": {
        "E": 0.6,
        "alias": [
            ":tokyo_tower:"
        ],
        "en": ":Tokyo_tower:",
        "status": 2
    },
    ...
}
```

A language file would look similar, e.g. one for Spanish.
The main file would be loaded when importing the module. A language file would only be loaded when that language is actually used, as sketched after this comment. It does mean that the first use of a language loads an additional file at that point. This reduces memory usage by roughly half if only one language is used.

Advantages:
Disadvantages:
So this would be a breaking change, but I don't think this would affect many people. I searched on GitHub and I couldn't find a public repository that directly accesses the other languages in EMOJI_DATA like that.
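A minimal sketch of the proposed on-demand loading, with hypothetical file names (emoji_main.json, emoji_es.json and so on), not the library's actual layout:

```python
import json

EMOJI_DATA = {}           # filled from the main file at import time
_LOADED_LANGUAGES = set()

def _load_main():
    # Main file: metadata plus English names and aliases (hypothetical name).
    with open("emoji_main.json", encoding="utf-8") as f:
        EMOJI_DATA.update(json.load(f))

def _load_language(lang):
    # Load a per-language file on first use and merge it into EMOJI_DATA.
    if lang == "en" or lang in _LOADED_LANGUAGES:
        return
    with open(f"emoji_{lang}.json", encoding="utf-8") as f:  # hypothetical name
        for emj, name in json.load(f).items():
            EMOJI_DATA.setdefault(emj, {})[lang] = name
    _LOADED_LANGUAGES.add(lang)

_load_main()  # runs once, when the module is imported

def emojize(text, language="en"):
    _load_language(language)
    ...  # the actual replacement logic would follow here
```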
I think this makes sense, and I see no other option: there are simply a lot of languages and most applications need exactly one. Loading them on demand seems the right decision.

EDIT: A question about your performance test methodology: does your command not also include the start-up of the whole Python interpreter? This would only be relevant for someone who uses this lib standalone; for most projects it will just be one of many dependencies. I did a quick test with 2.11.0:

```python
def load():
    import emoji
    emoji.emojize(':lion:')

if __name__ == '__main__':
    import timeit
    res = timeit.timeit("load()",
                        setup="from __main__ import load",
                        number=1)
    print(res)
```

This gives me a load time for just the lib. As the import statement is only executed once, even on repeats, raising the number of repetitions does not yield interesting data.
Yes, my times include the loading of the Python interpreter. It doesn't really matter, because I am only interested in the relative changes. It is possible to load the module multiple times in Python, but it is a bit hacky:

```python
import sys

def load():
    import emoji
    emoji.emojize(':lion:')
    # remove the emoji modules from the loaded modules,
    # so the next call imports them from scratch again
    for name in [name for name in sys.modules if "emoji" in name]:
        del sys.modules[name]

if __name__ == '__main__':
    import timeit
    res = timeit.timeit("load()",
                        setup="from __main__ import load",
                        number=100)
    print(res)
```
FYI, compressing the dict into a single line has caused another issue.
Hi!
I just upgraded my application to Python 3.12.
Doing `import emoji` there takes around 30 seconds. Doing the same on Python 3.11 takes less than a second. Any clue?

It is reproducible with a simple two-line script like the sketch below.
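For illustration, a hypothetical two-line script of that shape (the :lion: shortcode is just an example taken from later in the thread):

```python
import emoji
print(emoji.emojize(':lion:'))
```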
Using Windows at the moment, with emoji 2.9.0.
Something similar was already reported here: #274