Implement retry mechanism when using requests #466

jonathan-s · 2018-08-20T21:25:00Z

This PR implements a retry mechanism if any request would fail when using requests. It closes #464.

More detailed documentation of Retry can be found in the urllib3 documentation

Total: The amount of total retries
backoff_factor: it will sleep for: {backoff factor} * (2 ^ ({number of total retries} - 1)) between retries
status: How many times to retry on bad status codes. The ones on status_forcelist
status_forcelist: Bad status codes which it will retry on.

jonathan-s · 2018-08-21T07:46:19Z

@brandur-stripe Making a stab at retry issue that I created. Let me know if there is anything else you'd like to see here.

brandur-stripe · 2018-08-21T17:46:50Z

@jonathan-s Nice — thank you for the contribution! I'm going to have someone else on the team take an initial review pass first, then we'll see what we can do to get this merged in.

jonathan-s · 2018-08-21T18:40:38Z

Thanks a lot!

On Tue 21. Aug 2018 at 19:46, Brandur ***@***.***> wrote: @jonathan-s <https://github.com/jonathan-s> Nice — thank you for the contribution! I'm going to have someone else on the team take an initial review pass first, then we'll see what we can do to get this merged in. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#466 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ACBsEtQTX47K0ZlA9Xu1wFDDdS84UP5qks5uTEeQgaJpZM4WEvAo> .

-- Sent from Gmail Mobile

mickjermsurawong-stripe · 2018-08-21T21:09:14Z

Hi Jonathan! Thank you for your time and this PR :)

To protect against creating unintended duplicate requests given automatic retries, the client should create idempotency key, if not given by the caller, following our docs[0].

The implementation on retry seems great with the standard Retry. It has flexibility to handle the connection/read failures in connect=Integer, read=Integer and special error status code status_forcelist=[] similar to behavior of our implementations in other languages. However, it lacks one detail on adding randomness to the delay to avoid simultaneously backing-off across multiple clients (step-wise jamming), following our implementations in Ruby[1] or Go[2].

My concern of going ahead with using Retry specific to requests client is that it can't support our other implementations in pycurl, urlfetch, urllib.request. Because the retry is tied in request execution method, we might have to implement an equivalent for each of other client at the request level. Ideally, we can move this retry logic up at the parent class. Concretely, we can possibly have request_with_retry at HTTPClient parent class, surrounding request method implemented by each library with our custom retry logic. (We still have to implement retry condition for exceptions thrown from each kind of client)

-cc @brandur-stripe for more thoughts if we should go ahead with Retry given that request's popularity, but we incur more implementation work in other libs (if we want to support retry there).

[0] Idempotency
https://stripe.com/docs/api/ruby#idempotent_requests

[1] Ruby sleep implementation
https://github.com/stripe/stripe-ruby/blob/ec66c3f0f44274f885de8d13de5dce2657932121/lib/stripe/stripe_client.rb#L80

[2] Go sleep implementation
https://github.com/stripe/stripe-go/blob/442690c3856640e4913163f506f0070561f3368d/stripe.go#L511

jonathan-s · 2018-08-22T18:11:15Z

@mickjermsurawong-stripe Thanks for the reply. I hear you regarding the idempotency key. Right now it looks like it's an optional requirement which you can supply for each api method. I wonder, is there a case when you would want to NOT send an idempotency key?

If there is no such case, perhaps it would be worth centralizing that at one place and let the client take care of generating an idempotency key?

With regards to avoiding step-wise locking. I think this could be worth to raise an issue upstream in urllib3.

jonathan-s · 2018-08-22T18:15:04Z

Also out of curiosity. What % usage is there of the other clients other than requests. Since you have the user-agent it would be possible to find out :)

mickjermsurawong-stripe · 2018-08-23T18:44:04Z

Hi @johnathan-s,

Idempotency key is not needed for GET and DELETE; by definition the http methods are idempotent. So we shouldn't add it unnecessarily, but doing so will make no difference to results. The recommended way for generation is V4 UUIDs.

If I understand your suggestion about centralization of idempotency setting correctly, yup! we can supply the idempotency key in the header request_headers in file api_requestor where other Stripe-specific headers are dealt with; each underlying http-client doesn't have to take care of it. If the caller passes the key in supplied_header it will override our automatically-generated key. At this layer, we also has access to http method as well.

Hmm I don't think the specific underlying client is exposed outside the HTTPClient class, and therefore it isn't recorded in user-agent. Maybe there are other ways that I'm not aware of. I'll follow up with the team on this, if we should go ahead with requests specific implementation. And I do agree that if we use this client, the added randomness in delay is less of a concern.

Meanwhile, it would be great to have tests on the added features, idempotency key and the retries. One needs to install https://github.com/stripe/stripe-mock to run our pytests.

mickjermsurawong-stripe · 2018-08-25T00:21:13Z

Hi @jonathan-s
I checked our request logs and the underlying http-client information indeed isn't in the user-agent header in our python implementation. Though I do see http_user_agent fields, but that seems to be from user who don't use our bindings. With that, I can't seem to argue using Retry for requests based on number of users here.

I had a discussion with others, and it seems like the randomness is something we would like to have still. And making PR to urllib3 as you suggest would still run into the same problem of having the change only on people who use requests.

With that, I have a more concrete follow-up on my earlier proposal on how to implement this at the abstract class in the http_client.py. Worth noting here is the should_retry_on_error. The error that client throws up now is wrapped up into Stripe-custom error. What we have to do is to go down to each client level, and assess their error and categorize them into something if we should_retry. This is so that the abstract method above gets that signal, and know how to execute the example logic. requests has a good starting point for us to conceptualize this client-level error from how it tolerates number of connect and read errors.

    # example code, not tested..
    def request_with_retry(self, method, url, headers, post_data=None):

        num_retries = 0
        response = None

        while num_retries <= max_retries:
            try:
                num_retries += 1
                response = self.request(method, url, headers, post_data) # method implemented by each client
                _, status_code, _ = response
                if not self.should_retry_on_codes(status_code):
                    break
            except error:
                if not self.should_retry_on_error(error): # signal from each client's own assessment of the error 
                    raise error
            self.sleep() # custom exponential back-off with randomness

        if response is not None:
            return response
        else:
            msg = "# descriptive errors when retry exceeded "
            raise error.APIConnectionError(msg)

I'm happy to take on this PR as well, if what I communicate isn't very clear or you are not convinced that the effort here is worthwhile. Or please let me know what arrangement works best for you to move this forward!

mickjermsurawong-stripe · 2018-08-28T03:58:50Z

Hi @jonathan-s ,
I found out that we indeed have the stats, and 99% of our users are using requests -- my apologies. Given so, we do want to roll-out our own retry logic for consistency with our libs in other languages. One simplification is that we can treat error from non-request libs as non-retryable; we maintain the behavior of non-request client. But we still want to put retry logic outside the underlying http-client, and trigger retry logic for requests error. Please let me know if we want to pursue this PR, otherwise I believe it will also be addressed as a part of the refactoring work on this issue #432

mickjermsurawong-stripe · 2018-08-28T19:42:43Z

[Referring to above to retroactively register response to issue]

jonathan-s · 2018-08-28T20:27:14Z

Hi, Currently I'm on holiday for the coming week or so. So I won't be able to pursue this PR right now, so if somebody wants to finish this as soon as possible feel free to start! If nobody has started by the time I'm back again I'll take a look at it. Cheers *Jonathan Sundqvist* Twitter <http://www.twitter.com/argparse> | Linkedin <http://se.linkedin.com/in/jonathansundqvist>

…

On Tue, Aug 28, 2018 at 9:42 PM, mickjermsurawong-stripe < ***@***.***> wrote: [Referring to above to retroactively register response to issue] — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#466 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ACBsEvCyl4WbkKnvQ3mLJctbQTaY01mgks5uVZ00gaJpZM4WEvAo> .

mickjermsurawong-stripe · 2018-08-29T15:33:07Z

Thanks for the note! Will find some time to get to this :)

mickjermsurawong-stripe · 2018-09-07T21:08:57Z

@jonathan-s, I'll close this now due to the #470
Thank you for initiating this improvement feature :)

jonathan-s · 2018-09-08T06:25:59Z

Thanks!

On Fri 7. Sep 2018 at 23:09, mickjermsurawong-stripe < ***@***.***> wrote: Closed #466 <#466>. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#466 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ACBsErHRTDzyyNq4DT8HpWfS6alSnlamks5uYuBqgaJpZM4WEvAo> .

-- Sent from Gmail Mobile

Implement retry mechanism when using requests

be32735

mickjermsurawong-stripe mentioned this pull request Sep 7, 2018

Python retry mechanism #470

Merged

mickjermsurawong-stripe closed this Sep 7, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement retry mechanism when using requests #466

Implement retry mechanism when using requests #466

jonathan-s commented Aug 20, 2018 •

edited

Loading

jonathan-s commented Aug 21, 2018

brandur-stripe commented Aug 21, 2018

jonathan-s commented Aug 21, 2018 via email

mickjermsurawong-stripe commented Aug 21, 2018

jonathan-s commented Aug 22, 2018 •

edited

Loading

jonathan-s commented Aug 22, 2018

mickjermsurawong-stripe commented Aug 23, 2018

mickjermsurawong-stripe commented Aug 25, 2018

mickjermsurawong-stripe commented Aug 28, 2018

mickjermsurawong-stripe commented Aug 28, 2018

jonathan-s commented Aug 28, 2018 via email

mickjermsurawong-stripe commented Aug 29, 2018

mickjermsurawong-stripe commented Sep 7, 2018

jonathan-s commented Sep 8, 2018 via email

Implement retry mechanism when using requests #466

Implement retry mechanism when using requests #466

Conversation

jonathan-s commented Aug 20, 2018 • edited Loading

jonathan-s commented Aug 21, 2018

brandur-stripe commented Aug 21, 2018

jonathan-s commented Aug 21, 2018 via email

mickjermsurawong-stripe commented Aug 21, 2018

jonathan-s commented Aug 22, 2018 • edited Loading

jonathan-s commented Aug 22, 2018

mickjermsurawong-stripe commented Aug 23, 2018

mickjermsurawong-stripe commented Aug 25, 2018

mickjermsurawong-stripe commented Aug 28, 2018

mickjermsurawong-stripe commented Aug 28, 2018

jonathan-s commented Aug 28, 2018 via email

mickjermsurawong-stripe commented Aug 29, 2018

mickjermsurawong-stripe commented Sep 7, 2018

jonathan-s commented Sep 8, 2018 via email

jonathan-s commented Aug 20, 2018 •

edited

Loading

jonathan-s commented Aug 22, 2018 •

edited

Loading