fix: retry connection on failed schema cache load #1685
When the "Failed to load the schema cache" error happens, the connection is not retried.
Ideally we'd have "recovery tests" that could detect these errors, but they're a bit complicated: we'd need to stop/start pg, recover with SIGUSR1 or NOTIFY, check that pgrest is alive, etc. Maybe we can look at how to do them after #1682 is merged. So I think this one should be good to merge as it is.
In #1632 (comment) I assumed that there is no error message when one of the db schema scripts fails. I just tested with this branch by introducing a syntax error in the pfkSourceColumns query - the error is reported. It doesn't look like this changed here, so I am not sure what happened back when I did not get any errors. But this also leads to flooding my console:
and so on. I'm not sure we actually want to retry like this? This seems like an error that can't be recovered from. I think all SQL errors in dbStructure are errors that should not be there at all and are not caused by the user or the environment. They are bugs in PostgREST, so they are unlikely to go away after a restart. Are there any errors that can be recovered from? I think connection errors are caught before even trying the query, right? I feel like we should exit on an error like this. Or at least debounce the retry (we do this for the database connection, right?).
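To make the "debounce the retry" idea concrete, here is a minimal sketch of a capped exponential backoff. It is not PostgREST's actual retry code: `backoffDelays` and `retryWithBackoff` are hypothetical names, and the one-second base delay is an assumption chosen for illustration.

```haskell
import Control.Concurrent (threadDelay)

-- | Delays in seconds: 1, 2, 4, ... capped at maxDelay (hypothetical helper).
backoffDelays :: Int -> [Int]
backoffDelays maxDelay = map (min maxDelay) (iterate (* 2) 1)

-- | Hypothetical retry loop: run the action until it returns Right,
-- sleeping between attempts according to the capped backoff schedule.
retryWithBackoff :: Int -> IO (Either e a) -> IO a
retryWithBackoff maxDelay action = go (backoffDelays maxDelay)
  where
    go (d : ds) = do
      result <- action
      case result of
        Right a -> pure a
        Left _  -> threadDelay (d * 1000000) >> go ds
    -- Unreachable in practice because the schedule is infinite.
    go [] = go (backoffDelays maxDelay)
```

With a cap of 32 seconds, a permanently failing query would be retried at most about twice a minute instead of flooding the console.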
@wolfgangwalther I think a check for a syntax error can be added here: postgrest/src/PostgREST/Error.hs, lines 214 to 220 in 11d62a8.
I agree it'd be useful. It reminds me of an error that happened some time ago with FDWs: #1352 (not a syntax error though). I'll check it out.
We should be careful in quitting on retrying the schema cache query.
So for now I've only added the strict case of dying on a syntax error. @wolfgangwalther WDYT?
failureMessage = fromMaybe "" e
failureMessage = fromMaybe mempty e
-- Check for a syntax error (42601 is the pg code). This would mean the error is on our part somehow, so we treat it as fatal.
checkIsFatal (PgError _ (P.SessionError (H.QueryError _ _ (H.ResultError (H.ServerError "42601" e _ _))))) = Just $ toS e
Cool. I'm not sure if the json cast error mentioned here has the same number, but this will allow me to easily extend it to cover that error as well.
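As a rough illustration of how the check could be extended to more codes, here is a sketch that keeps the fatal SQLSTATE codes in a list. `LoadError` is a simplified, hypothetical stand-in for the nested hasql error matched in the diff above, and `fatalCodes` is an assumed name. Only 42601 (syntax_error) is listed because that is the only code treated as fatal in this PR.

```haskell
-- | Simplified, hypothetical stand-in for the nested hasql error type.
data LoadError
  = ServerError String String -- SQLSTATE code and server message
  | ConnectionLost String     -- any non-server failure
  deriving (Show, Eq)

-- | SQLSTATE codes treated as fatal. 42601 is syntax_error.
fatalCodes :: [String]
fatalCodes = ["42601"]

-- | Return the server message when the error is considered fatal.
checkIsFatal :: LoadError -> Maybe String
checkIsFatal (ServerError code msg)
  | code `elem` fatalCodes = Just msg
checkIsFatal _ = Nothing
```

Covering another error would then be a one-line change to `fatalCodes`, once its SQLSTATE code is confirmed.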
main/Main.hs
putErr = hPutStrLn stderr . toS . errorPayload $ err
case checkIsFatal err of
Just _ -> do
hPutStrLn stderr ("A fatal error occurred when loading the schema cache" :: Text) >> putErr
What do you think about adding wording along the lines of "this is probably a bug in PostgREST. Please report it at "?
Sure, that sounds nice for the user.
"This is probably a bug in PostgREST. Please report it at https://github.com/PostgREST/postgrest/issues"?
Is that good?
Yes!
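Putting the agreed wording together with the die-or-retry branch, a minimal sketch could look like the following. `fatalReport` and `handleLoadFailure` are hypothetical names, and the error payload is passed in as a plain `String` rather than PostgREST's own error type.

```haskell
import System.Exit (exitFailure)
import System.IO (hPutStrLn, stderr)

-- | Lines printed to stderr before dying on a fatal schema cache error.
fatalReport :: String -> [String]
fatalReport payload =
  [ "A fatal error occurred when loading the schema cache"
  , payload
  , "This is probably a bug in PostgREST. Please report it at https://github.com/PostgREST/postgrest/issues"
  ]

-- | Hypothetical handler: exit on a fatal error, otherwise log the
-- payload and run the supplied retry action.
handleLoadFailure :: Maybe String -> String -> IO () -> IO ()
handleLoadFailure fatal payload retry =
  case fatal of
    Just _  -> mapM_ (hPutStrLn stderr) (fatalReport payload) >> exitFailure
    Nothing -> hPutStrLn stderr payload >> retry
```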
You can be sure that I will then report my syntax errors during development to the issue tracker :D
Of course - if there is any chance to recover from the error without user interaction, then we should retry. If the user needs to create a function to make this work, they could also trigger a restart of PostgREST? This seems to be a case that's somewhere in the middle. Not sure about that. However, both your examples seem to be about FDWs - it seems reasonable to retry for FDW stuff, because we can't tell why the FDW failed. It could very well be just a connection loss or whatever, so easily retryable. Dying has the advantage that the user needs to take action and it's more likely that they will report a bug to us - so for those cases where it's highly likely that we have a bug, this should be the best thing to do. Starting with the syntax error seems reasonable, even if it's just for convenience while developing (with still hypothetical [...]). If in doubt, retrying should be the safe choice.
True. Now that you mention it, I think it would fit even for FDWs. As it is, the infinite loop can discourage the user from trying to make PostgREST work.
Yeah. When trying to get a syntax error I got a "42703" (undefined column), so we can refine it progressively.
The code coverage job is taking some time, but I believe this should be good to merge.
Retry the connection when the "Failed to load the schema cache" error happens. Also die if the schema cache query has a syntax error.
When the "Failed to load the schema cache" error happens, the connection is not retried. This can be reproduced on master...steve-chavez:failed-to-load-schema.
This error is actually a regression that happened on #1542 (check this deleted comment).