Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does it need question id convert after computing cv_link in BERT mode? #4

Closed
AoZhang opened this issue Jul 14, 2020 · 1 comment
Closed

Comments

@AoZhang
Copy link

AoZhang commented Jul 14, 2020

It's a nice idea to encode question&schema using RAT, and thanks for your sharing of the well designed codes.

I have some doubt about cell-value linking in BERT mode when reading your code.

In ratsql/models/spider/spider_enc.py SpiderEncoderBertPreproc.preprocess_item (from line 670 to 683), I realized that you compute sc link, and then cv link.

In the procedure of sc linking you call the function Bertokenize.bert_schema_linking, which uses normalized_piece to compute schema linking, and converts question (token) id back to pieces id using idx_map.

But in the procedure of cv linking you only linking question_bert_tokens.normalized_pieces with item.schema without converting id back.

Is that a bug? I'm hoping for you response.

@berlino
Copy link
Collaborator

berlino commented Jul 14, 2020

Thanks for reporting this!

I think it's indeed a bug. For cv_linking, the token ids should also be convert it back to pieces ids. I'll submit a PR to fix this.

@alexpolozov alexpolozov linked a pull request Jul 22, 2020 that will close this issue
@alexpolozov alexpolozov removed a link to a pull request Jul 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants