What is an input vector file? And workflow diagram confusion. #20

Rolands-Laucis · 2021-04-05T10:52:24Z

Hello, I am using your paper for my own thesis research. Looking at the workflow.jpg diagram, the arrows are a bit confusing. I am trying to use the Skip-Gram ngram-ngram method. From what I understand, I have to go through the steps corpus2vocab -> corpus2pairs -> pairs2sgns. But pairs2sgns requires an "--input_vector_file" argument. I don't know what that is, and none of the earlier steps generated one. I assume it is a file containing the resulting word embedding vectors, but if I already had that, I wouldn't need the tool. Do I have to run the original word2vec SG method, save a .vec model, and use it here? I read the research paper and didn't find an answer to this either. I also tried pairs2vocab, but it doesn't generate the input vector file either.

A separate issue is with corpus2pairs: it generates 4 different .txt files (pairs.txt_0, pairs.txt_1, pairs.txt_2, pairs.txt_3) when I give the argument "--pairs_file ./pairs.txt". Do I then have to run pairs2sgns for each of the pairs files? Do I generate a different output vector file for each? Do the vector files get overwritten or appended to?
