Tables aren't redefined for re-runs of UDF apply #536
Comments
Hi @robbieculkin, thanks for your question. I think this problem is related to the Snorkel package, which assumes that every labeling function applies to at least one sample in the dataset; in other words, a labeling function cannot always return ABSTAIN. In Fonduer, we save labeling function outputs in a sparse format: the labeling function name is stored as a key based on your definition, but if the function always returns ABSTAIN, Fonduer won't save any results for it. We then send the labeling function names and outputs to Snorkel to calculate the weak labels, which causes your error when some labeling function always returns ABSTAIN. FYI: Fonduer updates labeling function outputs incrementally, which means a re-run only updates existing results or adds new ones (it won't clear existing results by default). If you want to clear all existing results you can call the Thanks,
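The sparse-storage behavior described in the comment above can be illustrated with a minimal, library-free sketch. It does not use Fonduer's or Snorkel's actual APIs: the `to_sparse` helper, the labeling function names, and the `ABSTAIN = -1` value are all assumptions made for illustration. It only shows how a labeling function that always abstains ends up with no stored key, so the list of defined names and the stored outputs no longer line up.

```python
# Hypothetical illustration only; not Fonduer's real storage code.
ABSTAIN = -1  # assumed sentinel value for "no label"


def to_sparse(lf_names, label_rows):
    """Keep only non-ABSTAIN outputs, keyed by labeling function name."""
    sparse = {}
    for row_idx, row in enumerate(label_rows):
        for name, label in zip(lf_names, row):
            if label != ABSTAIN:
                # A key is created only once the function emits a real label.
                sparse.setdefault(name, {})[row_idx] = label
    return sparse


# Three hypothetical labeling functions; the last one never fires.
lf_names = ["lf_a", "lf_b", "lf_always_abstain"]
label_rows = [
    [1, ABSTAIN, ABSTAIN],
    [ABSTAIN, 0, ABSTAIN],
    [1, 1, ABSTAIN],
]

sparse = to_sparse(lf_names, label_rows)
stored_names = sorted(sparse)
missing = [name for name in lf_names if name not in sparse]

print(stored_names)  # ['lf_a', 'lf_b']
print(missing)       # ['lf_always_abstain']
```

Any downstream step that expects one column of outputs per defined labeling function would see three names but only two stored keys, which is the kind of mismatch the comment describes.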
Hi Sen, thanks for the information. Maybe it's an edge case, but I can imagine scenarios (like mine) with small training sets or very specific labeling functions that might result in only ABSTAIN answers. Thanks,
Hi @robbieculkin, I am not sure whether Snorkel can handle that or not. Let us think of a way to solve this issue on our side. Sen
Thanks @senwu, I really appreciate your team's support.
@robbieculkin Sorry for the late response. We will fix this ASAP.
Description of the bug
As part of iterative development in a Jupyter environment, `apply` may be re-run several times. The developer might need to update candidates or create a new labeling function, for example.
When this happens, the corresponding Postgres table is cleared but not dropped. This means that the definition of the table cannot change to accommodate the updated parameters for `apply`.
To Reproduce
Steps to reproduce the behavior:
`stg_temp_lfs` list.
Upon calling `LFAnalysis`, the following exception is thrown:
Expected behavior
Underlying tables for a re-run of a UDF `apply` method should not only be cleared, but dropped.
Error Logs/Screenshots
Full stack trace:
Environment (please complete the following information)
Additional context
#263 (comment) advises restarting Python, but this does not appear to solve the problem.
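The difference between clearing and dropping a table, which is the core of this report, can be sketched with Python's built-in `sqlite3` module standing in for Postgres. The table name `label_demo` and its columns are hypothetical and are not Fonduer's actual schema; the sketch only shows that deleting rows leaves the old column definition in place, while dropping and recreating the table allows a new definition.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()


def columns(table):
    """Return the column names currently defined for a table."""
    return [row[1] for row in cur.execute(f"PRAGMA table_info({table})")]


# First run: the table is defined for a single labeling function.
cur.execute("CREATE TABLE label_demo (candidate_id INTEGER, lf_a INTEGER)")
cur.execute("INSERT INTO label_demo VALUES (1, 1)")

# Clearing removes the rows but keeps the old definition.
cur.execute("DELETE FROM label_demo")
cols_after_clear = columns("label_demo")

# Dropping and recreating lets the definition include a new column.
cur.execute("DROP TABLE label_demo")
cur.execute(
    "CREATE TABLE label_demo (candidate_id INTEGER, lf_a INTEGER, lf_b INTEGER)"
)
cols_after_drop = columns("label_demo")

print(cols_after_clear)  # ['candidate_id', 'lf_a']
print(cols_after_drop)   # ['candidate_id', 'lf_a', 'lf_b']

conn.close()
```

After the clear, a second labeling function has nowhere to go; only the drop-and-recreate path picks up the new column.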