Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Retry-After hint when in recovery mode #1878

Closed
steve-chavez opened this issue Jun 19, 2021 · 13 comments
Closed

Add Retry-After hint when in recovery mode #1878

steve-chavez opened this issue Jun 19, 2021 · 13 comments
Labels
difficulty: beginner Pure Haskell task http http compliance

Comments

@steve-chavez
Copy link
Member

If PostgREST loses the postgresql connection, it retries it with exponential backoff. If it's retrying the connection after 32 seconds, it wouldn't make sense for the client to retry a request in 1 or 2 seconds.

Retry-After would be a nice hint for this case, clients can use its value for how much should they wait before retrying.

@steve-chavez steve-chavez added difficulty: beginner Pure Haskell task http http compliance labels Jun 19, 2021
@gautam1168
Copy link
Contributor

gautam1168 commented Aug 15, 2021

Hello, I'm completely new to Haskell. So, I want to make sure I'm moving in the right direction to make a PR for this issue.

Am I right in the following:

  1. The exponential retry is implemented in Workers.hs in the connectionStatus function.
  2. This is called through the postgrest function in the file App.hs
  3. The postgrest function responds to the http requests. It checks if the statuscode is 503 and if it is, then it starts the connection worker in 1.

In this PR we should look at the line in postgrest function that starts/restarts the connection worker. Modify that line to obtain the worker's current state (as in how long until it will retry next). Once we have this information inside the postgrest function, we have to add a header to the response before responding to the client that gives the Retry-After hint.

So, I have to:

  1. Figure out how long till next retry (it looks like I have to figure out how to get this from RetryStatus). And I can probably modify the shouldRetry function to get this information.
  2. Add hint to response

@gautam1168
Copy link
Contributor

Im thinking I can add something in the AppState to keep track of how long till the next retry and update it from the line in postgrest function that starts/restarts the connWorker. Then I can get use that value to show the hint. Please tell me if I'm thinking non-sense here. It feels like I'm talking about creating a side effect and haskell is supposed to be free.

Basically I'm saying, Ill make a state in AppState which will be updated by retrying. Frankly I don't know if this is even possible.

@monacoremo
Copy link
Member

Your intuition is exactly right! 💯

The issue here is that the AppState is not modeled very well, making this much more difficult than it should be.

Currently we track separate values for statePgVersion, stateDbStructure, stateIsWorkerOn etc., when it would be much clearer to model it somewhat like this:

data AppStateConnection
  = NotConnected
  | Connecting { retryDelay :: Int }
  | ConnectionSucceeded PgVersion
  | ...

With this in place, it would be pretty straight forward to set the header. It's a pretty fundamental refactor, however, and it will be tricky to not change the behavior.

If you'd like to dive into this together, very happy to! Fully understand if you don't, though - In that case, I'd take a stab at remodeling the state and you could take care of the header afterwards!

@gautam1168
Copy link
Contributor

gautam1168 commented Aug 15, 2021

Yes I'd like to make a PR for this. However, I was trying to modify the AppState by adding just a single Int like this (the last line):

data AppState = AppState
  { statePool         :: P.Pool -- | Connection pool, either a 'Connection' or a 'ConnectionError'
  , statePgVersion    :: IORef PgVersion
  -- | No schema cache at the start. Will be filled in by the connectionWorker
  , stateDbStructure  :: IORef (Maybe DbStructure)
  -- | Cached DbStructure in json
  , stateJsonDbS      :: IORef ByteString
  -- | Helper ref to make sure just one connectionWorker can run at a time
  , stateIsWorkerOn   :: IORef Bool
  -- | Binary semaphore used to sync the listener(NOTIFY reload) with the connectionWorker.
  , stateListener     :: MVar ()
  -- | Config that can change at runtime
  , stateConf         :: IORef AppConfig
  -- | Time used for verifying JWT expiration
  , stateGetTime      :: IO UTCTime
  -- | Time with time zone used for worker logs
  , stateGetZTime     :: IO ZonedTime
  -- | Used for killing the main thread in case a subthread fails
  , stateMainThreadId :: ThreadId
  -- | Keeps track of when the next retry for connecting to database is scheduled
  , stateRetryNextIn  :: IORef Int
  }

And I was going to just set this in Workers.hs inside shouldRetry just before it returned itShould. But, your AppStateConnection is much nicer. Would you say I should try to add my version above first? As it seems to be a fairly small change and then another PR can be made for this refactor with AppStateConnection data type? Alternatively I can make an attempt to make the refactor you have suggested first.

@monacoremo
Copy link
Member

Would you say I should try to add my version above first?

Go for it! We’ll sort the rest out later :-)

@wolfgangwalther
Copy link
Member

I'd take a stab at remodeling the state and you could take care of the header afterwards!

I'm sure I will find time to review... :D

@monacoremo
Copy link
Member

I'm sure I will find time to review... :D

Yay, welcome back! 🥳

@monacoremo
Copy link
Member

monacoremo commented Aug 15, 2021

@gautam1168 if you want to discuss anything or need help, let us know!

Highly recommend the Nix development environment by the way, see https://github.com/PostgREST/postgrest/blob/main/nix/README.md

Otherwise Stack is also fine, but the environment in nix-shell is next level 👌

@gautam1168
Copy link
Contributor

I have setup nix on my system and got the unit tests running. Ill try to make a PR today, but Im new to haskell so it might take me a few days to actually get my changes to compile 😛. Ill keep you guys posted.

@gautam1168
Copy link
Contributor

I have gotten the code to compile 😄. Although I have not actually run them yet.

But I see that pull requests have to come from forks and I just made my changes in the clone. Ill transfer my changes to a fork, run some tests and open a PR, maybe tomorrow.

@gautam1168
Copy link
Contributor

gautam1168 commented Aug 17, 2021

@steve-chavez Hey so I have opened the PR and I have tested that

  • when I shut down the postgres database server running in docker then the postgrest server returns the Retry-After header with a numerical value along with the 503 status.

  • If the response doesn't have a 503 status then this header is not sent.

@monacoremo postgrest-style-check has failed for some reason. But the error message is not telling me what the problem is. Any pointers on how I can understand the problem there?

Also let me know if you want me to squash my commits.

@gautam1168
Copy link
Contributor

@steve-chavez how do I go about getting this PR reviewed/merged?

@steve-chavez
Copy link
Member Author

@gautam1168 Check the comments I've left you on #1916.

postgrest-style-check has failed for some reason

Btw, that job didn't fail on our CI. Looks ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
difficulty: beginner Pure Haskell task http http compliance
Development

No branches or pull requests

4 participants