
apns push performance #68

Closed
silver0511 opened this issue Dec 28, 2016 · 11 comments

@silver0511

Hi, when I execute this code (res, err := client.Push(notification)), it takes about 1–3 seconds. But I want to push more (probably 1000 per second). How can I improve the performance? Thanks

@sideshow
Owner

sideshow commented Jan 3, 2017

@silver0511

  • For best performance, you should only have one instance of the push client and pass it around. The TLS connection itself can take a second or so to connect, and if you are by chance setting it up or tearing it down on every push then this could be affecting performance. (Apple suggests keeping the connection open all the time.)

  • Speed is greatly affected by the location of your server. I.e., if you're just testing locally, behind a proxy, or if your server is outside the US, then you're not going to get great performance. With a decent server located on the west coast of the US (close to Apple) you should be able to get up to 10k pushes per second.

  • To get high throughput you should be pushing in an async manner. I've just added a simple example to the repo of one way you could do this using channels. See https://github.com/sideshow/apns2/blob/master/_example/channel/main.go (a rough sketch of the same pattern follows after this list).
    Ideally you should use channels rather than just spawning goroutines, because they give you better control over back pressure and will help you avoid timeout issues such as this: Client.Timeout exceeded while awaiting headers #24
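
A rough sketch of that channel-based pattern with a single shared client (this is not the exact code from _example/channel/main.go; the certificate path, topic, worker count, and tokens are placeholders):

```go
package main

import (
	"log"
	"sync"

	"github.com/sideshow/apns2"
	"github.com/sideshow/apns2/certificate"
)

func main() {
	// Create ONE client and reuse it for every push so the TLS/HTTP2
	// connection stays open.
	cert, err := certificate.FromP12File("cert.p12", "")
	if err != nil {
		log.Fatal("cert error:", err)
	}
	client := apns2.NewClient(cert).Production()

	// The buffered channel provides back pressure: producers block once
	// the workers fall behind, instead of spawning unbounded goroutines.
	notifications := make(chan *apns2.Notification, 1000)

	const workers = 100 // tune this for your server/location
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range notifications {
				res, err := client.Push(n)
				if err != nil {
					log.Println("push error:", err)
					continue
				}
				if !res.Sent() {
					log.Println("not sent:", res.Reason)
				}
			}
		}()
	}

	// Producer: queue notifications as fast as they arrive.
	for _, token := range deviceTokens() {
		notifications <- &apns2.Notification{
			DeviceToken: token,
			Topic:       "com.example.app", // placeholder bundle ID
			Payload:     []byte(`{"aps":{"alert":"Hello"}}`),
		}
	}
	close(notifications)
	wg.Wait()
}

// deviceTokens stands in for wherever your device tokens come from.
func deviceTokens() []string {
	return []string{"00fc13adff785122b4ad28809a3420982341241421348097878e577c991de8f0"}
}
```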

Let me know how you get on, and I will look to close this in a few days.

@serg472

serg472 commented Jan 14, 2017

Would anyone be able to share what sort of performance they are getting in terms of req/sec/core?

I have an existing PHP solution that uses the legacy socket connection API and is able to send 1000 notifications/sec/core. I'm currently getting 10x less with Go on the same hardware. I feel the biggest issue is that the client has to wait for the server response with the new API, where before it would just open a socket and stream the data nonstop while listening to the feedback endpoint for errors in parallel.

  • Is there a way to check that the HTTP/2 connection remains open? Can I add a listener somewhere that would show when a connection is closed? Seeing that a token is now part of the API URL, does the transport really reuse the connection between different URLs?
  • Is the certificate being transferred with every request? Apple's docs mention using temporary tokens instead of certificates; is it worth it? Can I add a listener to see what is being sent over the wire for each push?
  • I'm sure the answer is no, but is there any way to send pushes without waiting for the server's response to the previous push? Any sort of batch processing? Google at least lets you send multiple tokens with the same message.
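
For the first question, one generic way to observe whether the underlying connection is being reused on each push is Go's net/http/httptrace package. A rough sketch, assuming the apns2 client exposes its underlying http.Client via an exported HTTPClient field and that the HTTP/2 transport honors the GotConn hook:

```go
package apnstrace

import (
	"log"
	"net/http"
	"net/http/httptrace"
)

// traceTransport wraps whatever transport the push client configured and
// logs whether each request reused an existing connection.
type traceTransport struct {
	base http.RoundTripper
}

func (t *traceTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			log.Printf("apns conn: reused=%v idle=%v", info.Reused, info.WasIdle)
		},
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	return t.base.RoundTrip(req)
}

// Hypothetical wiring, assuming client is an *apns2.Client:
//   client.HTTPClient.Transport = &traceTransport{base: client.HTTPClient.Transport}
```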

Thanks.

@nathany

nathany commented Jan 16, 2017

@serg472 If you create a worker pool to send the notifications using the existing API, you can send many more notifications. On a small test I used 20 workers with the Buford APNS library to send 400 notifications/sec. You should be able to use a similar pattern with @sideshow's apns2.

From discussions with Apple, it should be possible to have 4000 open streams with the new API (therefore 4000 workers), but the number is dynamic. Too many streams and Apple starts giving errors, and I don't yet know a good way to adjust it on the fly.

If you pick a fairly safe number, such as 50-100 workers, you should have little trouble besting the legacy PHP solution, plus the new API has the benefit of returning error responses immediately instead of needing the feedback service.

As to the mention of cores, it's up to the Go scheduler to distribute your workers (goroutines) across cores, scaling to however many cores you have. Thus the more relevant number is how many workers = goroutines = number of HTTP/2 streams.

@serg472

serg472 commented Jan 18, 2017

Thanks. I increased the number of goroutines to well into the hundreds and it seemed to help (still 2x slower than the socket API currently, but it's manageable; I don't think individual HTTP calls could ever match one-way socket streaming speed). The sweet spot for me seems to be at around 400 threads (32 cores).

@nathany

nathany commented Jan 19, 2017

Thanks for the update. @serg472 I agree, the socket streaming has less overhead so it should be possible to go faster. The socket streaming comes with its own set of downsides though.

As far as 400 threads go, it so happens that Apple's servers inform the client of how many streams can be used. In the newer JWT API they start with 1 stream and then ramp it up. Currently the HTTP/2 package for Go doesn't expose this information, and I'm not exactly sure how to proceed there. If we had that information, we might be able to ramp up the threads dynamically based on what the Apple servers tell us. Here is the issue I opened for Go: golang/go#17265

@serg472

serg472 commented Jan 19, 2017

Hm, interesting. I spent a lot of time testing in the past few days and performance seems almost random; at one point 400 looks like a clear winner, and an hour later 100 performs the best and 400 is 3x slower. If Apple dictates the number of connections, that makes sense now, I guess.

@nathany Can you please elaborate a bit more on this? So when the HTTP/2 client is created and Apple, say, allows 100 streams at the moment, what happens when Apple changes this number? Will the client follow, or will it stay at 100 until closed? If so, would it make sense to recreate clients from time to time to reset this counter?

@nathany

nathany commented Jan 19, 2017

Apple's servers send a settings update whenever they change how many streams they are allowing. See SettingMaxConcurrentStreams at https://github.com/golang/net/blob/master/http2/transport.go#L1812. You can add a log statement there if you want to see what's coming from Apple.
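
For reference, this is a fragment inside the vendored x/net/http2 package rather than standalone code; in that era of transport.go the settings-handling switch looked roughly like the following, with the log line being the only addition (exact variable names may differ in your checkout, and the package would need a log import):

```go
// Inside the SETTINGS-frame handling loop of x/net/http2/transport.go
// (the line linked above), log the value Apple sends:
case SettingMaxConcurrentStreams:
	cc.maxConcurrentStreams = s.Val
	log.Printf("http2: server set MaxConcurrentStreams to %d", s.Val)
```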

How to adjust for it, I'm not entirely sure yet. Maybe switching on/off some of the workers? For now I'm just curious if there is a good way for x/net/http2 to expose those SettingMaxConcurrentStreams events. There may also be some difference between the # of streams Apple allows and the optimal numbers you were finding... so perhaps that number isn't as useful as I think.
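
As an illustration of the "switching on/off some of the workers" idea, here is a rough, hypothetical sketch of a pool whose worker count can be adjusted at runtime. The Pool type and Resize method are not part of apns2; the adjustment would have to be driven by whatever signal x/net/http2 ends up exposing:

```go
package pushpool

import (
	"log"
	"sync"

	"github.com/sideshow/apns2"
)

// Pool drains a notification channel with a variable number of workers.
type Pool struct {
	Client        *apns2.Client
	Notifications chan *apns2.Notification

	mu    sync.Mutex
	quits []chan struct{} // one quit channel per running worker
}

// Resize grows or shrinks the worker count to n. If the server's
// MaxConcurrentStreams setting were ever exposed by x/net/http2, its value
// could be fed straight into this method.
func (p *Pool) Resize(n int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for len(p.quits) < n { // start workers
		quit := make(chan struct{})
		p.quits = append(p.quits, quit)
		go p.worker(quit)
	}
	for len(p.quits) > n { // stop workers
		last := len(p.quits) - 1
		close(p.quits[last])
		p.quits = p.quits[:last]
	}
}

func (p *Pool) worker(quit chan struct{}) {
	for {
		select {
		case <-quit:
			return
		case n, ok := <-p.Notifications:
			if !ok {
				return
			}
			if res, err := p.Client.Push(n); err != nil {
				log.Println("push error:", err)
			} else if !res.Sent() {
				log.Println("not sent:", res.Reason)
			}
		}
	}
}
```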

@runner66

About the demo using channels: is it safe to use one socket across multiple workers?

@sideshow
Owner

@runner66 Yes, the channel is safe for concurrent access.

@sideshow
Owner

sideshow commented May 22, 2017

Hi @silver0511, I added a page to the wiki around push speed and what you could expect to get in terms of throughput. Interested to hear how you got on with this. I was able to get more like 4,000 p/sec per core. See https://github.com/sideshow/apns2/wiki/APNS-HTTP-2-Push-Speed
I will look to close this issue off in a few days. Thanks

@bulletblp4

Is the performance different for the dev endpoint vs. prod?
