generate_from_frequencies may use negative font size in some cases #686

Open
BLKSerene opened this issue Aug 4, 2022 · 2 comments

BLKSerene commented Aug 4, 2022

Description

I use generate_from_frequencies to generate word clouds from non-frequency data (test statistics, Bayes factors, etc.), which can be negative. In most cases it works, but in some cases it raises an error.

The problem seems to be that generate_from_frequencies tries to use a negative font size in some cases.

Steps/Code to Reproduce

Example:

import wordcloud
word_cloud = wordcloud.WordCloud()

word_cloud.generate_from_frequencies({'a': 3, 'b': -5, 'c': -1}) # OK!
word_cloud.generate_from_frequencies({'a': 3, 'b': -5}) # Error!

Expected Results

No error.

Actual Results

Traceback (most recent call last):
  File "<pyshell#25>", line 1, in <module>
    word_cloud.generate_from_frequencies({'a': 3, 'b': -5})
  File "D:\Python\lib\site-packages\wordcloud\wordcloud.py", line 453, in generate_from_frequencies
    self.generate_from_frequencies(dict(frequencies[:2]),
  File "D:\Python\lib\site-packages\wordcloud\wordcloud.py", line 509, in generate_from_frequencies
    # find possible places using integral image:
  File "D:\Python\lib\site-packages\PIL\ImageDraw.py", line 607, in textsize
    return font.getsize(
  File "D:\Python\lib\site-packages\PIL\ImageFont.py", line 859, in getsize
    w, h = self.font.getsize(text)
  File "D:\Python\lib\site-packages\PIL\ImageFont.py", line 483, in getsize
    size, offset = self.font.getsize(text, "L", direction, features, language)
OSError: invalid argument

Versions

Windows-10-10.0.19044-SP0
Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]
NumPy 1.22.3
matplotlib 3.5.1
wordcloud 1.8.2.2

BLKSerene closed this as not planned on Jul 21, 2023
amueller reopened this on Aug 1, 2023

amueller (Owner) commented Aug 1, 2023

Indeed, not planned, but I'm happy to keep it open in case someone wants to plan it - it's unlikely I'll look at it, but I'd be happy to accept a fix.

BLKSerene (Author) commented Aug 4, 2023

@amueller I would be happy to open a PR for this. But should I modify the font size calculation itself, or just transform the user's raw data? An easy fix would be to shift the data whenever it contains negative numbers: add the absolute value of the smallest (most negative) value, plus a tiny constant such as 1e-15 so that the minimum does not end up exactly zero, to every data point. For example,
{'a': -2, 'b': -2, 'c': -1, 'd': 5}
would be changed to
{'a': 1e-15, 'b': 1e-15, 'c': 1 + 1e-15, 'd': 7 + 1e-15}
after which everything works just as it would with normal frequency data.
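
A minimal sketch of that shift, written as a standalone helper so it can be tried without touching wordcloud itself. The name `shift_to_positive` and the `epsilon` parameter are hypothetical and not part of the library; this only illustrates the proposal, not how a final patch would have to look.

```python
def shift_to_positive(data, epsilon=1e-15):
    """Return a copy of `data` in which every value is strictly positive.

    Hypothetical helper (not part of wordcloud). If the smallest value is
    negative, every value is shifted up by its absolute value plus a tiny
    epsilon. Data without negative values is returned unchanged.
    """
    if not data:
        return {}
    lowest = min(data.values())
    if lowest >= 0:
        # Nothing to fix: ordinary frequency data passes through untouched.
        return dict(data)
    # Subtract the minimum first, then add epsilon, so the smallest value
    # becomes exactly epsilon instead of being lost to rounding.
    return {word: (value - lowest) + epsilon for word, value in data.items()}


shifted = shift_to_positive({'a': -2, 'b': -2, 'c': -1, 'd': 5})
print(shifted)
```

The shifted dictionary could then be passed to generate_from_frequencies in place of the raw data. The shift preserves the ordering of the values but not their ratios, so relative word sizes will differ from what the raw numbers would suggest.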

Another issue is that frequency data should generally contain only non-negative integers. So should I patch the existing generate_from_frequencies function to accept negative data as well, or add a new function, such as generate_from_data, that includes the patch?
