input data with many points saturate output image #22
Comments
The natural solution here is a normalization pass of the input data first, then plot the relative density of that dataset -- with a dial to increase or decrease the overall thresholds as desired. This is a well-studied problem - it's effectively a 2D histogram. Some links, including a numpy implementation:
I think the simplest approach is the 2D histogram, calculated via the binning method. My preference is to avoid the numpy dependency and implement the algorithm in heatmap.c. The bones:
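A minimal sketch of such a binning pass in C, assuming the points arrive as parallel float arrays with known bounds; the function name, signature and bin layout are illustrative, not existing heatmap.c code:

```c
#include <stdlib.h>

/* Illustrative only: count points per bin over an nx-by-ny grid
 * spanning [xmin, xmax) x [ymin, ymax). Caller frees the result. */
int *histogram2d(const float *xs, const float *ys, int npoints,
                 float xmin, float xmax, float ymin, float ymax,
                 int nx, int ny)
{
    int *bins = calloc((size_t)nx * ny, sizeof *bins);
    if (!bins)
        return NULL;

    float xscale = nx / (xmax - xmin);
    float yscale = ny / (ymax - ymin);

    for (int i = 0; i < npoints; i++) {
        int bx = (int)((xs[i] - xmin) * xscale);
        int by = (int)((ys[i] - ymin) * yscale);
        if (bx < 0 || by < 0)
            continue;                  /* ignore out-of-range input */
        if (bx >= nx) bx = nx - 1;     /* clamp points on the upper edge */
        if (by >= ny) by = ny - 1;
        bins[by * nx + bx]++;
    }
    return bins;
}
```

Dividing each count by the maximum count then gives the relative density to plot, and the "dial" mentioned above becomes a scale or clamp applied to that relative value.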
The only tricky thing here is how to select the number of bins. This is another well-studied problem. Some links:
...but the general approaches described need to be adapted to the heatmap output format. When deciding the number of bins, we'll have to consider dotsize and output resolution -- perhaps to the exclusion of the "traditional" selection criteria. My starting point would be to have enough bins that in the output image each dot overlaps its neighbors by 80%. For example, given a dotsize of 150px and a resolution of 1024px, use (1024/150)*5 = 34 bins. (shrug) It'll take some tuning and study.
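As a rough illustration of that rule of thumb (the 80% overlap figure and the bin count derived from it are just the starting point above, not settled values):

```c
/* Illustrative only: derive a bin count from the output width and the
 * dot size so that adjacent dots overlap by roughly `overlap` (e.g. 0.8).
 * With width 1024, dotsize 150 and overlap 0.8 this gives 34 bins. */
int suggest_bins(int width_px, int dotsize_px, float overlap)
{
    float spacing = dotsize_px * (1.0f - overlap);  /* px between bin centres */
    if (spacing < 1.0f)
        spacing = 1.0f;
    return (int)(width_px / spacing);
}
```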
A naive approach (my original thought for a hack) would be to store the density array as an array of floats and use a simple additive combination function. After that, normalise and map into the 0-255 range before the final colorisation, as before. It would almost double the memory consumption of this stage, though, from pixels_1B + pixels_4B to pixels_4B + pixels_4B. I'd be interested to see if the floating point arithmetic took much longer, as it is only basic addition. There is an extra normalisation step at the end, though a normalisation step currently occurs for every pixel around a point: pixels[ndx] = (pixels[ndx] * pixVal) / 255. There may even be a performance increase for this solution compared to the current one when pixels < (dotsize^2*numpoints). Look forward to seeing what you come up with :)
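A minimal sketch of that additive-float idea, assuming a pre-zeroed float density buffer and a simple linear-decay dot stamp standing in for whatever intensity function heatmap.c actually uses; the names and signature are placeholders, not the real implementation:

```c
#include <math.h>

/* Illustrative only: accumulate dots additively into a float buffer,
 * then do a single normalisation pass into 0-255 at the end. */
void accumulate_and_normalise(float *density, unsigned char *out,
                              int width, int height,
                              const int *px, const int *py, int npoints,
                              int dotsize)
{
    int r = dotsize / 2;
    if (r < 1)
        r = 1;

    /* 1. simple additive combination per point */
    for (int i = 0; i < npoints; i++) {
        for (int dy = -r; dy <= r; dy++) {
            for (int dx = -r; dx <= r; dx++) {
                int x = px[i] + dx, y = py[i] + dy;
                if (x < 0 || x >= width || y < 0 || y >= height)
                    continue;
                float d = sqrtf((float)(dx * dx + dy * dy));
                if (d > r)
                    continue;
                density[y * width + x] += 1.0f - d / r;  /* linear decay stamp */
            }
        }
    }

    /* 2. one normalisation pass, mapping [0, max] to 0-255 */
    float max = 0.0f;
    for (int i = 0; i < width * height; i++)
        if (density[i] > max)
            max = density[i];
    for (int i = 0; i < width * height; i++)
        out[i] = max > 0.0f
               ? (unsigned char)(density[i] / max * 255.0f)
               : 0;
}
```

The maximum found in step 2 is also the value that could be reported back for a legend, as mentioned in the following comment.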
@jjguy I've implemented my naive approach; since it skips the normalisation at each step, it doesn't seem to impact performance as far as time goes. Added customisable weights for the intensity decay, and with the new approach they are less black magic to get looking right. Looking to return the max and min values from the normalisation so they can be used to create a legend. Let me know what you think. It will break backwards compatibility to some extent, as images will look different, though the same API is still used.
In the current implementation, each input point is directly translated into an image of DOTSIZE pixels and blended in. If there are a large number of points relative to the chosen dotsize and image resolution, the output image is completely saturated.
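For context on why the saturation happens, here is a tiny standalone illustration based on the per-pixel blend quoted in the earlier comment, pixels[ndx] = (pixels[ndx] * pixVal) / 255, and on the assumption (not confirmed above) that pixels start fully light at 255: each overlapping dot can only darken a pixel, so values decrease monotonically toward 0 (full intensity) as dots pile up. The per-dot value below is just an example number:

```c
#include <stdio.h>

/* Illustrative only: repeated multiplicative blending drives a pixel
 * to 0 once enough dots overlap it. */
int main(void)
{
    int pixel = 255;    /* assumed starting value: no intensity yet */
    int pixVal = 200;   /* example per-dot value; 255 would leave the pixel unchanged */

    for (int n = 1; n <= 30; n++) {
        pixel = (pixel * pixVal) / 255;
        if (n % 10 == 0)
            printf("after %2d overlapping dots: %d\n", n, pixel);
    }
    /* with these numbers the pixel reaches 0 after about 19 overlaps
     * and stays there, i.e. the region is fully saturated */
    return 0;
}
```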