Recipes
Eiso Kant edited this page Jul 5, 2017 · 4 revisions
This is a list of useful code snippets that use wmd-relax.
Adding the recipe created by @wbecker in https://github.com/src-d/wmd-relax/issues/14
I've been playing with this code and have it working with fetch_20newsgroups from sklearn, just to test it out. Along the way I've come up with a few code snippets that might be useful to other people who are using it.
For example, I have enabled the logs to see what is going on inside by adding this:
import logging
import sys
logger = logging.getLogger("WMD")
logger.addHandler(logging.StreamHandler(sys.stdout))
(I found this useful because I pass max_time=1 as a parameter to nearest_neighbors, so I can see when it jumps out early.)
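If the logging snippet is run more than once (for instance by re-executing a notebook cell), each run attaches another handler and every message is printed several times. A minimal sketch of one way around that is below. The helper name `enable_wmd_logging` and the `_wmd_recipe` marker attribute are hypothetical and not part of wmd-relax; only the logger name "WMD" comes from the recipe above. Setting the level here is also an assumption, since the library may adjust the logger level itself when a WMD object is created.

```python
import logging
import sys


def enable_wmd_logging(stream=sys.stdout, level=logging.INFO):
    """Attach exactly one stream handler to the "WMD" logger (hypothetical helper)."""
    logger = logging.getLogger("WMD")
    # Remove handlers installed by earlier calls of this helper, so that
    # re-running the cell does not print every message twice.
    for handler in list(logger.handlers):
        if getattr(handler, "_wmd_recipe", False):
            logger.removeHandler(handler)
    handler = logging.StreamHandler(stream)
    # Timestamps make it easier to see how long each step takes.
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
    handler._wmd_recipe = True  # marker used only by this helper
    logger.addHandler(handler)
    logger.setLevel(level)  # assumption: INFO is verbose enough
    return logger
```

Calling `enable_wmd_logging()` once before building the WMD object should then be enough to see the log output on stdout.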
I've also hacked this to work with your code, using those embeddings (this replaces step [7]):
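import numpy as np
from sklearn.preprocessing import normalize
from wmd import WMD

# vect, docs_train, docs_test, W_common and trainSize are assumed to be
# defined in the earlier steps.
X_train = normalize(vect.fit_transform(docs_train), norm='l1', copy=False)
X_test = vect.transform(docs_test)
embeddings = np.array(W_common, dtype=np.float32)
# nbow maps a document key to (name, word indices, word weights).
nbow = {}
for i, el in enumerate(X_train[0:trainSize]):
    name = "#" + str(i)
    # .toarray() replaces the deprecated .A shorthand on sparse matrices.
    nbow[name] = (name, el.indices, np.array(X_train[i, el.indices].toarray().ravel(), dtype=np.float32))
calc = WMD(embeddings, nbow, vocabulary_min=2, main_loop_log_interval=1)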
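To make the shape of the nbow dictionary easier to see without sklearn, here is a small self-contained sketch that builds the same kind of structure from plain lists of word ids. The helper `build_nbow` and the toy `docs` are hypothetical and only for illustration. The sketch assumes each word id is a row index into the embeddings matrix, and that weights are L1-normalised counts, as in the recipe above.

```python
import numpy as np


def build_nbow(docs_token_ids):
    """Build an nbow dict of the form key -> (name, indices, weights).

    Hypothetical helper: each document is a list of integer word ids,
    possibly with repeats.
    """
    nbow = {}
    for i, token_ids in enumerate(docs_token_ids):
        # Sorted unique word ids and how often each occurs in the document.
        indices, counts = np.unique(
            np.asarray(token_ids, dtype=np.int64), return_counts=True)
        if indices.size == 0:
            continue  # skip empty documents: they have no weights
        # L1-normalise the counts so the weights of a document sum to 1.
        weights = (counts / counts.sum()).astype(np.float32)
        name = "#" + str(i)
        nbow[name] = (name, indices, weights)
    return nbow


# Toy input: two documents over a vocabulary of five words.
docs = [[0, 2, 2, 3], [1, 1, 4]]
nbow = build_nbow(docs)
```

A dictionary built this way has the same layout as the one passed to WMD in the recipe, so it could be used in its place together with a float32 embeddings matrix that has at least as many rows as the largest word id plus one.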