kde.evaluate for density plot #159

Open
chum1ngo opened this issue Nov 14, 2023 · 1 comment
Labels
question Further information is requested

Comments

@chum1ngo

Hello, I'm trying to display several distributions in the same plot. For that I need to estimate the KDE of each distribution and then evaluate all of them on the same set of points. I illustrated what I intend to do for one of those distributions in the first code section below, using scipy.

import numpy as np
import scipy.stats as st
# Fit a Gaussian KDE to the data, then evaluate its density at arbitrary points
kde = st.gaussian_kde(np.linspace(-10, 10, num=10000))
kde.pdf([1,2,3,4,5,6,7,8,9,10])

Then I tried to do the same with KDEpy, but I get an error: Every data point must be inside of the grid.

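import numpy as np
from KDEpy import FFTKDE
kde = FFTKDE(bw='silverman', kernel='gaussian').fit(np.linspace(-10, 10, num=10000))
# This call raises the error: the points [1, ..., 10] are used as the grid, which does not cover the data on [-10, 10]
kde([1,2,3,4,5,6,7,8,9,10])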

I'm not sure whether this is a bug, because the error doesn't make a lot of sense to me, or whether I just misunderstood how to use the methods. Is it possible to evaluate the KDE at points like that?

Regards

@chum1ngo chum1ngo changed the title kde.evaluate for desity plot kde.evaluate for density plot Nov 14, 2023
@tommyod
Owner

tommyod commented Nov 15, 2023

Your grid needs to be wider than your data points. If your data is in the range [-10, 10], and you use a kernel with some width (e.g. a standard normal), then you need to define a grid on e.g. [-15, 15]. You can always chop the grid after evaluating the KDE. But think about what you really want - would you use a histogram on [1, 10] to evaluate data on [-10, 10]?

There are two reasons for the error "Every data point must be inside of the grid.":

  • It's how the convolution/FFT based computation is set up
  • It's a good sanity check
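To make the first reason concrete, here is a simplified NumPy-only sketch of a binned, convolution-based KDE. It is not KDEpy's actual implementation; the helper names linear_binning and binned_kde, the fixed bandwidth, and the kernel truncation at 4 bandwidths are assumptions made for illustration. The point is that each data point's weight is split between its two neighbouring grid points, so a data point outside the grid has nowhere to go:

```python
import numpy as np

def linear_binning(data, grid):
    """Split each data point's weight between its two neighbouring grid points."""
    if data.min() <= grid[0] or data.max() >= grid[-1]:
        raise ValueError("Every data point must be inside of the grid.")
    dx = grid[1] - grid[0]
    pos = (data - grid[0]) / dx        # fractional grid index of each data point
    left = np.floor(pos).astype(int)   # index of the grid point to the left
    frac = pos - left                  # share of the weight going to the right
    weights = np.zeros(len(grid))
    np.add.at(weights, left, 1.0 - frac)
    np.add.at(weights, left + 1, frac)
    return weights / len(data)

def binned_kde(data, grid, bw):
    """Gaussian KDE on an equidistant grid: bin the data, then convolve."""
    dx = grid[1] - grid[0]
    weights = linear_binning(data, grid)
    # Gaussian kernel sampled at the grid spacing, truncated at 4 bandwidths
    half = int(np.ceil(4 * bw / dx))
    offsets = np.arange(-half, half + 1) * dx
    kernel = np.exp(-0.5 * (offsets / bw) ** 2) / (bw * np.sqrt(2 * np.pi))
    return np.convolve(weights, kernel, mode="same")

data = np.linspace(-10, 10, num=10_000)
grid = np.linspace(-15, 15, num=2**10)
density = binned_kde(data, grid, bw=1.0)
```

Calling binned_kde with a grid such as np.linspace(1, 10, num=64) fails in the binning step, which mirrors the error in the question. The margin around the data also leaves room for the kernel tails, so no probability mass is cut off at the grid edges.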

@tommyod tommyod added the question Further information is requested label Jan 15, 2024