Hi there,
As I am pretty new to programming and Python, I am currently wrestling with encodings in order to get the German umlauts right in a WordCloud.
Description
If I feed wordcloud an example text like text = "Wir mögen Möglichkeiten." the "ö" characters are shown correctly in the word cloud.
I have a SQLite database (UTF-8) with the text of 20,000 articles. When I read all the articles and save them in one text file with UTF-8 encoding, I can print the correct text at the prompt and open the file correctly in Notepad++ or Word with UTF-8 encoding.
When I use the same text file for the word cloud, all German umlauts are lost, and every word that contains an umlaut shows a blank in its place.
Expected Results
Actual Results
Versions
I am using Python 3.8.10 and tested the behaviour on Linux Mint, macOS and Windows.
I guess there will be an easy explanation, but unfortunately I am totally lost.
Thank you very much for any hint in the right direction!
Marc
I suspect that you either have not specified a font file or are using one that lacks the characters you need. Font files usually have the extension ".ttf" or ".ttc". If you do not specify one, the default font is used. However, the default font may not contain the German umlauts you need, in which case the program encounters characters it cannot render and outputs nothing for them. My suggestion: find a font file that contains the umlauts and can display your text correctly. You can find such ".ttf" or ".ttc" files with a search engine. If you are on Windows, you can find font files in the "C:\Windows\Fonts" directory.
So, how do you use a specific font in Python? First, place the font file in your project folder. Then, when you create the WordCloud object, pass the path to the font file as an argument, for example: wordcloud.WordCloud(font_path=font_path), where font_path is the path to the font file you want to use.
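Here is a minimal sketch of that idea, not a tested fix for your data. The helper names find_font and build_wordcloud, the default output file name and the image size are my own assumptions for illustration. Only WordCloud(font_path=...), generate() and to_file() come from the wordcloud package itself.

```python
from pathlib import Path


def find_font(candidates):
    """Return the first path in `candidates` that is an existing file, else None.

    Hypothetical helper: it only checks that the file exists, not that the
    font actually contains the umlaut glyphs.
    """
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return str(path)
    return None


def build_wordcloud(text_file, font_path, output_file="wordcloud.png"):
    """Render a word cloud from a UTF-8 text file with an explicit font."""
    # Imported here so that find_font() can be used without wordcloud installed.
    from wordcloud import WordCloud

    # Read the file with an explicit encoding instead of the platform default.
    text = Path(text_file).read_text(encoding="utf-8")

    # width and height are arbitrary example values.
    cloud = WordCloud(font_path=font_path, width=800, height=400)
    cloud.generate(text)
    cloud.to_file(output_file)
    return output_file
```

You could call it with something like font = find_font(["DejaVuSans.ttf", r"C:\Windows\Fonts\arial.ttf"]) followed by build_wordcloud("articles.txt", font). Those font paths and the file name articles.txt are only placeholders, so replace them with whatever exists on your system.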
That's my suggestion. If you have already solved this problem, congratulations!