Alternative approaches to clustering and dimension reduction (UMAP in particular) #1166
At the end of my day here, but a few points real quick:

UMAP does not directly apply to the polis vote matrix interpreted as a graph, since that graph in its purest form is a (as you point out, signed) multigraph. As far as I know, UMAP only operates on unsigned/unweighted _uni_graphs (if we can call them that). So the trick is figuring out how to make this transformation.

We have to be really careful here too, because the naive approach of just counting up the number of matching votes (even normalizing by the number of comments both participants responded to) could very easily go awry. Say that, between participants A, B and C, A and B agreed on several comments which were more or less consensus statements (overwhelming support across participants), while A and C agreed on several comments for which there's much less consensus. A naive count treats those two pairs as equally similar, even though the A and C agreements tell us far more.

One nice thing about UMAP, though, which touches on something you point out, is that where you don't have information about two participants (they didn't vote on any of the same comments), you have the option of letting the network topology pull those two participants together or apart depending on who they might have had overlapping votes with (and those participants in turn, and so on; friend of a friend, etc.).

Regarding your "clique" based clustering idea, this sounds somewhat similar to the MCL clustering method; are you familiar with it? It doesn't directly apply to our situation, but again, it could potentially be adapted.

Thanks
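To make the A/B/C concern concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy vote data, the helper names (`co_voted`, `naive_agreement`, `divisiveness`, `weighted_agreement`), and in particular the weighting scheme, which is just one possible way to discount consensus comments and not something polis currently does.

```python
# Toy vote matrix (hypothetical data): +1 agree, -1 disagree, None = no vote.
# Comments 0-2 are consensus statements; comments 3-5 are divisive.
VOTES = {
    "A": [1, 1, 1, 1, -1, 1],
    "B": [1, 1, 1, None, None, None],
    "C": [None, None, None, 1, -1, 1],
    "D": [1, 1, 1, -1, 1, -1],
    "E": [1, 1, 1, -1, 1, 1],
    "F": [1, 1, 1, 1, -1, -1],
}


def co_voted(a, b):
    """Indices of comments that both participants voted on."""
    return [
        j
        for j, (x, y) in enumerate(zip(VOTES[a], VOTES[b]))
        if x is not None and y is not None
    ]


def naive_agreement(a, b):
    """Fraction of co-voted comments with matching votes (None if no overlap)."""
    shared = co_voted(a, b)
    if not shared:
        return None
    matches = sum(VOTES[a][j] == VOTES[b][j] for j in shared)
    return matches / len(shared)


def divisiveness(j):
    """Assumed weight: 0 for a unanimous comment, close to 1 for an even split."""
    cast = [row[j] for row in VOTES.values() if row[j] is not None]
    return 1.0 - abs(sum(cast) / len(cast))


def weighted_agreement(a, b):
    """Like naive_agreement, but each match counts by the comment's divisiveness."""
    shared = co_voted(a, b)
    if not shared:
        return None
    score = sum(divisiveness(j) for j in shared if VOTES[a][j] == VOTES[b][j])
    return score / len(shared)


if __name__ == "__main__":
    for pair in [("A", "B"), ("A", "C"), ("B", "C")]:
        print(pair, naive_agreement(*pair), weighted_agreement(*pair))
```

In this toy data the naive score cannot tell the A/B pair from the A/C pair, while the weighted score separates them. B and C share no comments, so both functions return `None` for that pair; that is the case where the graph topology would have to supply the missing information.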
Over in #883, @JacobCWBillings mentioned:
This didn't strictly fall under the topic there (fair K-means), so I'm moving it over here.