
We are getting an HTTP2::Error::StreamLimitExceeded exception on connection.push_async(push) #29

Closed
amitsaxena opened this issue Sep 30, 2016 · 14 comments

@amitsaxena

We were using the async method to send notifications, and while sending them to around 40K devices on a single connection we got the HTTP2::Error::StreamLimitExceeded exception.

Can anyone explain what this error means? Is there a limit to the amount of data I can send on an HTTP/2 connection? Some of the notifications did go out, and the rest failed.

We were on version 0.10.0.

@ostinelli
Owner

As the error states, you are sending too many streams: you have exceeded the maximum number of concurrent streams specified by the peer. Think of it as the HTTP/2 equivalent of an HTTP/1.1 429 Too Many Requests response.

Also, I highly recommend upgrading to the stable 1.0.1 version.

@amitsaxena
Author

@ostinelli we have a single connection and were sending pushes over it using push_async (as described in the docs). Should we add some kind of pause in between (say, every 1000 pushes)? What's a good approach for sending notifications to a large number of devices?

Thanks for the help, and the gem!

@ostinelli
Owner

ostinelli commented Sep 30, 2016

Try to understand when you hit a limit. This will give you better clarity on your strategy. I've seen a limit hit around 40k concurrent streams on a single connection, but I haven't looked for official documentation from Apple regarding it.

You can either send pushes in smaller batches and pause between them (as you suggest), or increase the connection pool that you use so that your pushes are spread across more connections, and each connection is less likely to hit the limit.
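
A minimal sketch of the batch-and-pause approach, assuming the async API shown in the Apnotic README (prepare_push, push_async, join); device_tokens and handle_response are hypothetical placeholders, and the batch size and sleep interval are illustrative, not recommendations:

```ruby
require "apnotic"

connection = Apnotic::Connection.new(cert_path: "apns_certificate.pem")

device_tokens.each_slice(1_000) do |batch|   # device_tokens is hypothetical
  batch.each do |token|
    notification       = Apnotic::Notification.new(token)
    notification.alert = "Hello"

    push = connection.prepare_push(notification)
    push.on(:response) { |response| handle_response(response) } # your own callback
    connection.push_async(push)
  end

  connection.join # drain in-flight streams before starting the next batch
  sleep 2         # pause between batches
end

connection.close
```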

@amitsaxena
Author

We saw this error again today in production while sending to just 1000 devices. We chunk our devices into sets of 1K each and sleep for 2 seconds after each chunk, and we still got this error for the first 1K devices.

Maybe having some kind of fail-safe in Apnotic (reconnect on this exception) would be a good idea; otherwise async push is almost unusable if it fails for such a small number of device tokens. As a temporary workaround we now send to 100 devices at a time (with a 2 s sleep), but this has resulted in very long-running Sidekiq jobs.

Does anyone else use async push in production? Are sync pushes the recommended way?

@ostinelli ostinelli reopened this Nov 18, 2016
@ostinelli
Owner

ostinelli commented Nov 18, 2016

Maybe having some kind of fail-safe in Apnotic (reconnect on this exception) would be a good idea

This raises other questions though.

Would retries be at:

  • Apnotic level, i.e. only for certain responses (like the one you're suggesting here).
  • NetHttp2 level, i.e. also covering the case where the connection fails.

If the latter, what should we do with requests that may (or may not) have sent data but have not received a response back? Should we consider them incomplete and retry? Wouldn't this, under certain conditions, perform the same request twice, even for non-idempotent requests?

@benubois
Collaborator

This is somewhat related to #28. I don't understand HTTP/2 well enough to know if this makes sense.

As I understand it, a stream is supposed to be "complete" at a certain point, maybe when both the request and the response have been sent along with an END_STREAM flag. At that point the stream should be closed, reducing the count of open streams. I'm not sure if this happens automatically or if the client needs to send RST_STREAM.

If net-http2 kept track of MAX_CONCURRENT_STREAMS and, once that number was hit, blocked until a stream closed or the server sent a new MAX_CONCURRENT_STREAMS value, it would never exceed the limit.

It sounds like this is what golang is considering (or perhaps already does): open a new connection when MAX_CONCURRENT_STREAMS is exceeded.
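
A sketch of the client-side cap described above: block before opening a new stream once a fixed number are in flight. MAX_IN_FLIGHT is a guessed stand-in for the server's MAX_CONCURRENT_STREAMS (ideally net-http2 would read the real value from the server's SETTINGS frame), and device_tokens is a hypothetical placeholder:

```ruby
require "apnotic"

MAX_IN_FLIGHT = 500 # assumption, not Apple's documented value

mutex     = Mutex.new
slot_free = ConditionVariable.new
in_flight = 0

connection = Apnotic::Connection.new(cert_path: "apns_certificate.pem")

device_tokens.each do |token|
  mutex.synchronize do
    slot_free.wait(mutex) while in_flight >= MAX_IN_FLIGHT
    in_flight += 1
  end

  push = connection.prepare_push(Apnotic::Notification.new(token))
  push.on(:response) do |_response|
    # Stream is done: free a slot and wake one waiting sender.
    mutex.synchronize do
      in_flight -= 1
      slot_free.signal
    end
  end
  connection.push_async(push)
end

connection.join
connection.close
```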

@idyll

idyll commented Nov 21, 2016

I am not sure that HTTP2::Error::StreamLimitExceeded is a bad thing. The http-2 gem keeps track of MAX_CONCURRENT_STREAMS and throws that exception to ensure Apnotic always stays within the max.

We deliver 30-40K pushes every hour (most on the hour) and we just allow the exception to trigger the Sidekiq retry logic. I don't think I have seen more than a dozen retries per batch. Using Sidekiq in that way keeps the notification "queue" or "buffer" outside of Apnotic. I think this is a good thing, because you'd need complicated logic to track notification state in any internal buffer so that you'd know whether a notification was actually sent in the event of a crash or error.
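
A sketch of that flow; DeviceToken, pending_for, and mark_delivered! are hypothetical bookkeeping, and the point is that a retry only re-sends tokens not yet marked delivered:

```ruby
require "sidekiq"
require "apnotic"

class PushBatchWorker
  include Sidekiq::Worker
  sidekiq_options retry: 10

  def perform(batch_id)
    connection = Apnotic::Connection.new(cert_path: "apns_certificate.pem")

    # Hypothetical: returns only tokens not yet marked as delivered, so a
    # Sidekiq retry resumes where the error occurred.
    DeviceToken.pending_for(batch_id).each do |token|
      push = connection.prepare_push(Apnotic::Notification.new(token.value))
      push.on(:response) { |_response| token.mark_delivered! }
      # If StreamLimitExceeded is raised here, it bubbles up and Sidekiq
      # retries the whole job against the remaining pending tokens.
      connection.push_async(push)
    end

    connection.join
  ensure
    connection&.close
  end
end
```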

Opening a new connection when you hit MAX_CONCURRENT_STREAMS will be an issue for JWT-style push, because MAX_CONCURRENT_STREAMS starts out at 1 and only increases after the first push. I know this isn't an immediate problem, but it needs to be in the back of our minds.

@amitsaxena
Author

amitsaxena commented Nov 21, 2016

We deliver 30-40K pushes every hour (most on the hour) and we just allow the exception to trigger the Sidekiq retry logic

@idyll the problem with that is multiple notifications being sent to the same device. We use async push: open a connection, send a push to all registered devices, and then close the connection. So at the point when the exception is encountered, the push has already been sent to quite a few devices.

@ostinelli yes, the delta (pushes that were sent before the exception but whose responses haven't been received yet) is tricky. I believe it will be the same at the NetHttp2 level, unless there is a way to wait for all the responses before raising the HTTP2::Error::StreamLimitExceeded exception, with no new requests sent in the meantime.

@idyll

idyll commented Nov 21, 2016

@amitsaxena your setup sounds different from mine. We track things in a way that lets us know which messages were already delivered via the async push, so resumption starts where the error occurred.

@ostinelli
Owner

That's how we do it too.

@amitsaxena
Author

amitsaxena commented Nov 22, 2016

@idyll would you mind sharing your setup? We cannot use a pool of persistent connections (as described in the docs) because we need to send notifications to different apps (hundreds of them); that's why we open a connection, send the notifications, and then close it.

If nothing else works, we might use sync pushes. They might be slow, but I believe they will be reliable, given that they send one notification at a time, so we would not hit this error.
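
A minimal sketch of what that synchronous path would look like; device_tokens and log_failure are hypothetical, and push blocks until APNs responds, so only one stream is ever open at a time:

```ruby
require "apnotic"

connection = Apnotic::Connection.new(cert_path: "apns_certificate.pem")

device_tokens.each do |token|
  notification       = Apnotic::Notification.new(token)
  notification.alert = "Hello"

  response = connection.push(notification) # blocks; may be nil on timeout
  log_failure(token, response) unless response&.ok?
end

connection.close
```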

@idyll

idyll commented Nov 22, 2016

We cannot use a pool of persistent connections (as described in the docs) because we need to send notifications to different apps (hundreds of them)

I do this with pools. I create a hash of pools where the key is the "topic" (by which I mean the topic of your push certificate). When a notification is created, I store the topic, the push token, and the content of the notification in Redis. Sidekiq workers grab the notifications and deliver them.
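
A sketch of that layout, assuming the Apnotic::ConnectionPool API from the README; CERTS and deliver are hypothetical, and pools are created lazily so apps that never push never open a connection:

```ruby
require "apnotic"

CERTS = { "com.example.app1" => "certs/app1.pem" } # hypothetical topic => cert map

# Lazily build one small pool per topic. Hash.new with a block is not
# thread-safe; guard pool creation with a mutex if workers run concurrently.
POOLS = Hash.new do |pools, topic|
  pools[topic] = Apnotic::ConnectionPool.new(
    { cert_path: CERTS.fetch(topic) },
    size: 2
  )
end

def deliver(topic, token, alert)
  POOLS[topic].with do |connection|
    notification       = Apnotic::Notification.new(token)
    notification.topic = topic
    notification.alert = alert
    connection.push(notification)
  end
end
```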

It's super important to keep these workers away from your DB if you want to achieve any sort of volume, so push all the data you need to send the notification into Redis, and never hit the DB again from the workers.

If a notification isn't deliverable there's a separate queue to handle archiving the device.

@amitsaxena
Author

amitsaxena commented Nov 28, 2016

Interesting approach! It's just not practical for us to keep persistent connections, given that the total number of apps is high and the majority of them rarely use the push notification service.

@ostinelli
Owner

Closing due to inactivity. If additional input is needed, we can re-open.
