Extract path-contexts iteratively #97

Closed
celsofranssa opened this issue Nov 17, 2020 · 6 comments

@celsofranssa

I am working with a Java dataset composed of pairs (code, comment), as shown below:

id | code | comment
-- | -- | --
321 | \tpublic int getPushesLowerbound() {\n\t\tretu... | returns the pushes lowerbound of this board po...
323 | \tpublic void setPushesLowerbound(int pushesLo... | sets the pushes lowerbound of this board position
324 | \t\tpublic void play() {\n\t\t\t\n\t\t\t// If ... | play a sound
343 | \tpublic int getInfluenceValue(int boxNo1, int... | returns the influence value between the positi...
351 | \tpublic void setPositions(int[] positions){\n... | sets the box positions and the player position

Is there a way to extract the path contexts of each Java method, so that I end up with new pairs (path_context, comment)?
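As a possible first step for a dataset shaped like the table above, each method could be written to its own `.java` file whose name carries the row id, so that whatever an extractor produces can later be joined back to the comment. This is only a minimal sketch: the helper names, the `Snippet<id>` class wrapper, and the assumption that the code column stores literal `\n` and `\t` escapes are all hypothetical, and the wrapper class is purely defensive (a Java extractor may accept bare methods).

```python
import tempfile
from pathlib import Path


def decode_snippet(escaped: str) -> str:
    """Turn literal backslash-n / backslash-t escapes into real whitespace.

    Assumption: the dataset stores code with escaped whitespace. This naive
    replacement would also rewrite escapes inside Java string literals.
    """
    return escaped.replace("\\n", "\n").replace("\\t", "\t")


def write_methods(rows, out_dir):
    """Write one .java file per (id, escaped_code, comment) row.

    Returns a dict mapping each row id to its comment, so extractor output
    can be paired with comments afterwards.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    comments = {}
    for row_id, escaped_code, comment in rows:
        class_name = f"Snippet{row_id}"  # hypothetical naming scheme
        source = (
            f"public class {class_name} {{\n"
            f"{decode_snippet(escaped_code)}\n"
            f"}}\n"
        )
        (out_dir / f"{class_name}.java").write_text(source, encoding="utf-8")
        comments[row_id] = comment
    return comments


if __name__ == "__main__":
    # Made-up row for illustration; not taken from the dataset above.
    demo_rows = [(1, "\\tpublic int one() {\\n\\t\\treturn 1;\\n\\t}", "returns one")]
    with tempfile.TemporaryDirectory() as tmp:
        print(write_methods(demo_rows, tmp))
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

The resulting directory could then be handed to an extractor, with the id in each file name serving as the join key back to the comment column.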

@urialon
Collaborator

urialon commented Nov 19, 2020

Hi @ceceu ,
Thanks again for your interest in code2vec!
I think that code2seq would be more appropriate for this task than code2vec.

Please see these issues:
tech-srl/code2seq#41
tech-srl/code2seq#45

Best,
Uri

@celsofranssa
Author

Couldn't the following script

python3 code2vec.py \
    --load models/java14_model/saved_model_iter8.release \
    --test codes.txt \
    --export_code_vectors

be used to extract the vectors from the code snippets?

  • codes.txt (one code snippet per line):
...
public int getPushesLowerbound() {\n\t\tretu... 
public void setPushesLowerbound(int pushesLo... 
public void play() {\n\t\t\t\n\t\t\t// If ... 
public int getInfluenceValue(int boxNo1, int... 
public void setPositions(int[] positions){\n...
...

@urialon
Collaborator

urialon commented Nov 26, 2020

Hmmm, not exactly, the codes.txt file needs to be a file that was preprocessed by JavaExtractor.
The --test flag expects a preprocessed file (where every row is a list of paths), rather than raw Java text.
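To make "every row is a list of paths" concrete, here is a small sketch of reading one preprocessed row. It assumes the row layout described in the code2vec README: space-delimited fields, where the first field is the target label and each remaining field is a context of three comma-separated components. The `PathContext` and `parse_row` names are hypothetical, and the sample row is made up for illustration rather than real extractor output.

```python
from typing import List, NamedTuple, Tuple


class PathContext(NamedTuple):
    left: str   # source token
    path: str   # path between the two tokens (a hash or a path string)
    right: str  # target token


def parse_row(row: str) -> Tuple[str, List[PathContext]]:
    """Split one preprocessed row into its target label and path contexts."""
    fields = row.split()  # also drops any trailing space padding
    if not fields:
        raise ValueError("empty row")
    target, raw_contexts = fields[0], fields[1:]
    contexts = []
    for raw in raw_contexts:
        parts = raw.split(",")
        if len(parts) != 3:
            raise ValueError(f"malformed context: {raw!r}")
        contexts.append(PathContext(*parts))
    return target, contexts


if __name__ == "__main__":
    # Invented example row; the numbers stand in for hashed paths.
    sample = "get|pushes|lowerbound this,1234,pushes int,-567,lowerbound"
    target, contexts = parse_row(sample)
    print(target)
    for ctx in contexts:
        print(ctx)
```

A raw Java line such as the ones in codes.txt would not parse under this layout, since its fields are not `token,path,token` triples.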

@faysalhossain2007

If we want to build C/C++ code vectors using code2vec, which of these should we use?

  1. JavaExtractor,
  2. CSharp Extractor,
  3. I need to build my own extractor?

@urialon
Collaborator

urialon commented Nov 29, 2020

Hi @faysalhossain2007 ,
Thank you for your interest in code2vec!

You'll need to build your own extractor. Fortunately, there are some existing extractors for C/C++, see:
https://github.com/tech-srl/code2vec#extending-to-other-languages
and:
https://github.com/tech-srl/code2seq/#extending-to-other-languages

If you have any further questions, feel free to open a new issue, as these issues are unrelated.

Best,
Uri

@celsofranssa
Author

> Hmmm, not exactly, the codes.txt file needs to be a file that was preprocessed by JavaExtractor.
> The --test flag expects a preprocessed file (where every row is a list of paths), rather than raw Java text.

@urialon,
thank you.
