
2 issues: the range of documents when computing cross-document attention, and the size of the SentenceTransformer embedding u_k/v_k and sentential encoding e #6

Open
xxr5566833 opened this issue May 23, 2021 · 0 comments


xxr5566833 commented May 23, 2021

  1. preprocess.py

When using the SentenceTransformer pretrained model to encode a document (title + abstract), the document collection is determined by the "files_path" variable in preprocess.py.

Why did you comment out "data/keyphrase/json/kp20k/kp20k_train.json" (i.e. add # at the beginning of that line)?

I think the documents in kp20k_train.json should be included when computing the cross-document attention, as your paper describes.
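For reference, this is a minimal sketch of how I understand the files_path list in preprocess.py; the second entry is a placeholder I made up, only the kp20k_train.json path comes from the actual file:

```python
# Sketch of the files_path list in preprocess.py as I understand it; only the
# kp20k_train.json path is from the repository, the other entry is a placeholder.
files_path = [
    # "data/keyphrase/json/kp20k/kp20k_train.json",  # this line is commented out in the repo
    "data/keyphrase/json/<other_split>.json",         # placeholder for the remaining entries
]

# Including kp20k_train.json again would mean uncommenting that line, so its
# documents are also encoded and available for cross-document attention.
```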

  2. the size of e and u_k/v_k

I changed the SentenceTransformer model, so u_k/v_k now have a different size. Should word_vec_size be set to the SentenceTransformer model's embedding size?
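If they do need to match, this is the kind of check I have in mind (a minimal sketch using the standard sentence-transformers API; the model name is only an example, not necessarily the one used in this repository):

```python
from sentence_transformers import SentenceTransformer

# Query the embedding size of whichever SentenceTransformer model is plugged in,
# so word_vec_size (and therefore the size of u_k/v_k and e) could be set to match.
model = SentenceTransformer("paraphrase-distilroberta-base-v1")  # example model only
embedding_dim = model.get_sentence_embedding_dimension()         # e.g. 768 for this model
print(f"word_vec_size would need to be {embedding_dim} to match u_k/v_k and e")
```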
