Expected Behavior
An example use case can be constructed that I can demonstrate and justify for the book.
Actual Behavior
The predicted number of clusters is neither accurate nor close to accurate; the model seems to always favor an extremely high number of clusters. I have played around with different settings for about two hours and cannot find one where GMM does appreciably better than k-means and where the predicted number of clusters is even close to the true number (it is usually 8 or 9).
Example Code
Three near-perfectly separated Gaussians (the density of overlap between the different Gaussians is ~0), and I cannot seem to get AutoGMM to give me a good clustering where KMeans does appreciably worse, and where AutoGMM gives me something within the ballpark of the true number of clusters. Maybe that's fine?
Step 1 generates the latent positions...
import numpy as np
from graspologic.simulations import rdpg

np.random.seed(1234)  # seed before any sampling so the example is reproducible

# mixing proportions of the three communities
pi = np.array([0.33, 0.33, 0.34])
zs = np.random.choice([0, 1, 2], replace=True, p=pi, size=200)
# the means (one column per community)
mus = np.array([[-.7, .7, 0], [.3, .3, .8]])
# the covariances (one 2x2 matrix per community, stacked along the last axis)
covars = np.stack(([[.005, .05], [.05, .8]], [[.005, -.05], [-.05, .8]], [[0.002, 0], [0, 0.002]]), axis=2)
Xtrue = np.array([np.random.multivariate_normal(mus[:, z], covars[:, :, z]) for z in zs])
# edge-probability matrix and a sampled adjacency matrix
P_rdpg = Xtrue @ Xtrue.T
A = rdpg(Xtrue)
and plot it...
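A minimal plotting sketch for this step (matplotlib is an assumption; the stand-in `Xtrue` and `zs` below are placeholders for the variables generated in Step 1, not the exact data from this issue):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt

# stand-in latent positions; in the issue these come from Step 1
rng = np.random.default_rng(1234)
zs = rng.integers(0, 3, size=200)
means = np.array([[0.2, 0.7], [0.7, 0.2], [0.5, 0.5]])
Xtrue = means[zs] + rng.normal(scale=0.03, size=(200, 2))

# scatter the latent positions, colored by true community
fig, ax = plt.subplots(figsize=(5, 5))
pts = ax.scatter(Xtrue[:, 0], Xtrue[:, 1], c=zs, cmap="viridis", s=10)
ax.set_xlabel("latent dimension 1")
ax.set_ylabel("latent dimension 2")
ax.set_title("Latent positions colored by true community")
fig.colorbar(pts, ax=ax, label="community")
fig.savefig("latent_positions.png", dpi=150)
```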
Step 2 spectrally embeds...
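A bare-numpy sketch of what this embedding step does (graspologic's AdjacencySpectralEmbed is the real tool; this simplified stand-in embeds via the top scaled eigenvectors of the adjacency matrix, and the latent positions here are synthetic, not Step 1's exact ones):

```python
import numpy as np

rng = np.random.default_rng(1234)
# three communities with distinct latent positions (all dot products in [0, 1])
means = np.array([[0.2, 0.7], [0.7, 0.2], [0.5, 0.5]])
zs = rng.integers(0, 3, size=200)
X = means[zs]

# sample an undirected adjacency matrix from P = X X^T
P = X @ X.T
upper = rng.random((200, 200)) < P
A = np.triu(upper, 1)
A = (A + A.T).astype(float)

# adjacency spectral embedding: top-d eigenvectors scaled by sqrt of |eigenvalue|
d = 2
vals, vecs = np.linalg.eigh(A)
order = np.argsort(np.abs(vals))[::-1][:d]
Xhat = vecs[:, order] * np.sqrt(np.abs(vals[order]))
print(Xhat.shape)  # (200, 2)
```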
Step 3 performs the clustering...
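The clustering comparison can be approximated with scikit-learn: a BIC sweep over GaussianMixture component counts (a simplified version of the model selection AutoGMM performs) against KMeans, scored by adjusted Rand index. The blob data below is a stand-in for the embedding, not the graph from this issue:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1234)
# three well-separated Gaussian blobs standing in for the embedded points
zs = rng.integers(0, 3, size=300)
centers = np.array([[0, 0], [4, 0], [2, 4]])
Xhat = centers[zs] + rng.normal(scale=0.3, size=(300, 2))

# BIC sweep over the number of components, in the spirit of AutoGMM's selection
bics = []
for k in range(1, 11):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(Xhat)
    bics.append(gmm.bic(Xhat))
best_k = int(np.argmin(bics)) + 1
print("BIC-selected number of clusters:", best_k)

# compare GMM at the selected k against KMeans at the true k
gmm_labels = GaussianMixture(n_components=best_k, random_state=0).fit_predict(Xhat)
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xhat)
print("GMM ARI:", adjusted_rand_score(zs, gmm_labels))
print("KMeans ARI:", adjusted_rand_score(zs, km_labels))
```

On data this cleanly separated, the BIC sweep should recover the true number of components; the issue is that it fails to do so on the embedded graph data above.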
Your Environment
Additional Details
Any other contextual information you might feel is important.