Hello, I am using your paper for my own thesis research. Looking at the workflow.jpg diagram, the arrows are a bit confusing. I am trying to use the Skip-Gram ngram-ngram method. From what I understand, I have to go through the steps corpus2vocab -> corpus2pairs -> pairs2sgns. But pairs2sgns requires an "--input_vector_file" argument. I don't know what that is, and the previous steps didn't generate one. I assume it is a file of trained word embedding vectors, but if I already had that, I wouldn't need the tool. Do I have to run the original word2vec SG method, save a .vec model, and use it here? I read the research paper and didn't find an answer there either. I also tried pairs2vocab, but it doesn't generate the input vector file either.
A separate issue is with corpus2pairs: when I give the argument "--pairs_file ./pairs.txt", it generates 4 different .txt files (pairs.txt_0, pairs.txt_1, pairs.txt_2, pairs.txt_3). Do I then have to run pairs2sgns on each of the pairs files? Do I generate a different output vector file for each? Do the vector files get overwritten or appended to?