Incomplete SQL prediction with PICARD #67
Comments
Hi @DreamerDeo, Edit: it looks like your first example is from Edit2: How many incomplete SQL predictions are we talking about? How many is "many"?
I know what the problem is.
Thanks for your quick reply! A1: The db_id of the above three examples are A2: For T5-large, there are 61 incomplete SQL predictions in dev (1034 in total). I am using the
And when I use the same input/output format as yours, there are still some incomplete SQL predictions.
Some cases could be explained by
Thanks, can you provide the continuations that the model predicted but that were rejected for these examples?
Thank you, this is very useful. I can turn these into test cases.
Yes, I would love to do that job, but what do you mean by
More like (1), but just for the last step. I want to see which token proposals were rejected when prediction ended. Thanks!
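For readers following along: the kind of rejection log being requested here can be produced by wrapping a parser-checked token filter. The sketch below is a hypothetical stand-in, not PICARD's actual API; the `is_valid_prefix` predicate and `filter_with_log` helper are invented for illustration.

```python
def filter_with_log(beam_prefix, proposals, is_valid_prefix, log):
    """Keep only token proposals whose extended prefix the parser accepts.

    Hypothetical sketch of constrained decoding; every rejected proposal
    is recorded so the failing continuations can be inspected afterwards.
    """
    kept = []
    for token in proposals:
        candidate = beam_prefix + [token]
        if is_valid_prefix(candidate):
            kept.append(token)
        else:
            log.append((tuple(beam_prefix), token))
    return kept

# Toy parser: only prefixes of "select * from t" are valid.
target = "select * from t".split()
is_valid = lambda p: p == target[:len(p)]
rejections = []
print(filter_with_log(["select", "*"], ["from", "where"], is_valid, rejections))
# -> ['from']; rejections now holds the refused 'where' proposal
```

Dumping `rejections` at the final decoding step would show exactly which proposals were turned down when a beam ended incomplete.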
Since the log file is a little long, I pasted the log in a Google doc. You can edit it as you want. Thanks! :)
I took care of the issues with the parser. I believe that what remains are problems with the model(s) or the data.
@tscholak Thanks! But I find that the latest Docker images
If I roll back to
@DreamerDeo I pushed a fix. Please check :) |
@tscholak It works now. But prediction becomes much slower; is that normal?
I did not observe a slowdown, but I also didn't complete a full evaluation. |
@tscholak FYI, for the first 10% of the eval data, it's okay (the eval time looks like ~4 minutes, as before). But after that, the prediction gets stuck and the display says one hour remains. The eval is not finished right now, so I can't tell you the accuracy, but the incomplete-SQL problem is solved now :) For comparison, the former version of
Here is the
Before I try to reproduce, can you please confirm that you are seeing the stalling for the original Spider eval set and not your altered one with added table names?
@tscholak It works now after cleaning the Docker cache. Thank you very much for your help with this issue :) You really helped me a lot!
I reopened the issue. We need to find out which examples take a lot of time to generate. |
Glad to hear it, can you confirm though that it is consistently good now? |
@tscholak I observe that (1) after rebooting the server and cleaning the Docker cache, it predicts at the normal speed under your setting; (2) as for my output format (TABLE.COLUMN), the accuracy is largely improved, but there are still some incomplete cases. I think that's because the model was trained with DeepSpeed (the LR scheduler and optimizer are different); if the model is trained using your script, it works well. And the prediction speed is still very slow for these (TABLE.COLUMN) cases. I attempted to predict the dev set in 10 pieces to find
PS: T5-large
And what if I want to change the Haskell code to build my own PICARD server? (I want to do some debugging to push my project forward faster.)
Is that correct?
Thank you, this will help a lot.
Interesting, and a bit surprising. Will try to reproduce.
You can do that, or you can use VS Code and start a dev container. You can then make changes to both the Haskell and the Python code, recompile Picard, run the tests, and even run evaluation. The Haskell code is built with
Thanks for this interesting work!
I trained a new T5 model from scratch using your script and predicted with PICARD, but encountered a problem.
Modification: replacing `COLUMN` with `TABLE.COLUMN` in SQL, as follows.
Problem: PICARD (T5-large) generates many incomplete SQL predictions, which could be generated correctly without PICARD.
Here are some examples.
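For illustration, the `TABLE.COLUMN` rewrite described above can be sketched as a naive post-processing step over the target SQL. The helper and schema map below are hypothetical, not part of the PICARD codebase; real SQL rewriting would need a proper parser rather than word-level matching.

```python
import re

def qualify_columns(sql: str, column_to_table: dict) -> str:
    """Replace each bare COLUMN with TABLE.COLUMN via a schema lookup.

    Hypothetical helper that only illustrates the target-format change
    discussed in this issue; identifiers not found in the map are left
    untouched, so keywords and table names pass through unchanged.
    """
    def repl(match):
        name = match.group(0)
        table = column_to_table.get(name.lower())
        return f"{table}.{name}" if table else name
    return re.sub(r"\b[A-Za-z_][A-Za-z_0-9]*\b", repl, sql)

schema = {"name": "singer", "age": "singer"}
print(qualify_columns("SELECT name, age FROM singer", schema))
# -> SELECT singer.name, singer.age FROM singer
```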
I tried to increase `picard_max_tokens_to_check` from `2` to `3`, and then the above SQL was generated correctly. However, many incomplete SQL predictions still exist, even with `num_beams=8` and `picard_max_tokens_to_check=6`. Like this example:
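The effect of this parameter can be illustrated with a simplified toy model of the check: per decoding step, only the top-k highest-scoring tokens are tested against the parser, and everything else is masked. The function below is an assumption-laden Python sketch (the names `constrain_logits` and `is_valid_prefix` are invented), not PICARD's Haskell implementation.

```python
import math

def constrain_logits(logits, prefix_ids, is_valid_prefix, max_tokens_to_check):
    """Mask all tokens except top-ranked candidates the parser accepts.

    Hypothetical sketch: only the `max_tokens_to_check` highest-scoring
    tokens are checked. If a parser-valid token ranks below that cutoff,
    every candidate gets masked and the beam can dead-end, which matches
    the incomplete-SQL symptom when the budget is too small.
    """
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    allowed = [t for t in ranked[:max_tokens_to_check]
               if is_valid_prefix(prefix_ids + [t])]
    return [l if i in allowed else -math.inf for i, l in enumerate(logits)]

logits = [0.1, 2.0, 1.5, 0.9]          # token 3 is the only parser-valid one
is_valid = lambda p: p[-1] == 3
print(constrain_logits(logits, [], is_valid, 2))  # all -inf: the step stalls
print(constrain_logits(logits, [], is_valid, 3))  # token 3 survives the mask
```

Under this toy model, raising the budget from 2 to 3 is exactly what rescues a valid token ranked third, consistent with the observation above that `picard_max_tokens_to_check=3` fixed some examples.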
Here is the debug log from printing the `input_ids`: https://github.com/ElementAI/picard/blob/5ff827fa65c719ff975a37bd1d6940214731f3f5/seq2seq/utils/picard_model_wrapper.py#L369
Could you give me some suggestions on this?
Thanks in advance!