Add BufferedProtocol support #2033

Closed
wants to merge 9 commits into from

Conversation

ahopkins
Member

Resolves #1873

This makes use of the asyncio.BufferedProtocol.
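
For context, here is a minimal sketch (not Sanic's implementation) of the BufferedProtocol interface this builds on: instead of allocating a new bytes object and calling data_received(), the event loop asks the protocol for a writable buffer via get_buffer(), fills it with recv_into(), and reports the byte count via buffer_updated().

import asyncio

class EchoBuffered(asyncio.BufferedProtocol):
    def __init__(self) -> None:
        self._buffer = bytearray(65536)  # reused for every read

    def connection_made(self, transport: asyncio.Transport) -> None:
        self.transport = transport

    def get_buffer(self, sizehint: int) -> bytearray:
        # sizehint is only a recommendation; any non-empty writable buffer is valid
        return self._buffer

    def buffer_updated(self, nbytes: int) -> None:
        # nbytes of fresh data now sit at the start of self._buffer
        self.transport.write(bytes(self._buffer[:nbytes]))

    def eof_received(self) -> bool:
        return False  # let the transport close the connection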

@codecov

codecov bot commented Feb 17, 2021

Codecov Report

Merging #2033 (12d15dc) into master (b461cc3) will decrease coverage by 0.363%.
The diff coverage is 78.261%.

Impacted file tree graph

@@              Coverage Diff              @@
##            master     #2033       +/-   ##
=============================================
- Coverage   92.070%   91.707%   -0.363%     
=============================================
  Files           33        33               
  Lines         3102      3123       +21     
  Branches       542       544        +2     
=============================================
+ Hits          2856      2864        +8     
- Misses         166       180       +14     
+ Partials        80        79        -1     
Impacted Files Coverage Δ
sanic/app.py 91.593% <60.000%> (-0.371%) ⬇️
sanic/server.py 86.096% <83.333%> (-1.859%) ⬇️
sanic/websocket.py 78.022% <0.000%> (-4.396%) ⬇️
sanic/handlers.py 97.938% <0.000%> (+2.062%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b461cc3...12d15dc. Read the comment docs.

@ahopkins
Member Author

for reference: https://bugs.python.org/issue32251

sanic/server.py Outdated
self._time = current_time()
self.recv_buffer += data

if len(self.recv_buffer) > self.app.config.REQUEST_BUFFER_SIZE:
Member

After playing with this new code, I wonder if this line should be

if len(self.recv_buffer) >= self.app.config.REQUEST_BUFFER_SIZE

Here's one thing I observed:

  • Set REQUEST_BUFFER_SIZE to 128 bytes.
  • BufferedProtocol chooses self._buffer with size 128 (self.recv_buffer is unsized/unlimited).
  • Send a request to the app with a 384-byte request size
  • buffer_updated is called with nbytes 128
  • 128 bytes are written to recv_buffer
  • if nbytes > REQUEST_BUFFER_SIZE is False, because 128 is not greater than 128
  • data_received is set, and http headers are read
  • http protocol calls _receive_more(), tries to unpause the transport, but it's not paused
  • buffer_updated() is called again with nbytes 128
  • a new 128 bytes of data is written to self.recv_buffer
  • len(self.recv_buffer) is now 256, so self.transport.pause_reading() is now executed.
  • data_received is set, and further http headers are read
  • http protocol calls _receive_more() again which unpauses the transport.
  • repeats until all of the request is read and parsed by the HTTP protocol

So I'm wondering, should the transport be paused every time the recv_buffer size matches REQUEST_BUFFER_SIZE, or only when it's greater than that?
What effect does pausing the transport even have?

Member Author

The point of pausing would be to allow for it to drain. It is more of an issue with streaming than with sending complete responses. It is a protection against exhausting memory limits.
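
A minimal sketch of that flow control, assuming the attribute names used in this PR (recv_buffer, transport, app.config.REQUEST_BUFFER_SIZE); this is illustrative, not the actual Sanic code:

class FlowControl:
    def _check_backpressure(self) -> None:
        # Stop reading from the socket once the buffer reaches the limit, so a
        # slow consumer (e.g. a streaming handler) cannot grow it without bound.
        if len(self.recv_buffer) >= self.app.config.REQUEST_BUFFER_SIZE:
            self.transport.pause_reading()

    def _receive_more(self) -> None:
        # Called once the HTTP layer has consumed data and wants the next chunk.
        self.transport.resume_reading()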

Member Author

With that said, you are probably correct that it should be >=.

sanic/app.py Outdated
@@ -3,7 +3,14 @@
import os
import re

from asyncio import CancelledError, Protocol, ensure_future, get_event_loop

try:
Member

Maybe we can have this in a compat.py module, so we don't need this try/except blob in multiple places.
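
A possible shape for that, sketched with a hypothetical alias name (ProtocolBase); asyncio.BufferedProtocol is available from Python 3.7, so the fallback only matters on older interpreters:

# hypothetical sanic/compat.py content
try:
    from asyncio import BufferedProtocol as ProtocolBase  # Python 3.7+
except ImportError:  # pragma: no cover
    from asyncio import Protocol as ProtocolBase

Call sites would then import ProtocolBase from sanic.compat instead of repeating the try/except.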

# Python 3.7+
# -------------------------------------------- #

def get_buffer(self, sizehint=-1):
Member

What is sizehint used for here?

Member Author

sizehint is the recommended minimum size for the returned buffer. It is acceptable to return smaller or larger buffers than what sizehint suggests. When set to -1, the buffer size can be arbitrary. It is an error to return a buffer with a zero size.

Source

Member Author

We are ignoring it here.

@ashleysommer
Member

I'm going to do some further testing on this, to determine whether or not to recommend including it in 21.3.

@ahopkins
Member Author

ahopkins commented Mar 4, 2021

Awesome 🤘

@ashleysommer
Member

OK, like last time I tested it, the results are again inconclusive.

Speeds are basically the same whether using BufferedProtocol or Protocol. They are so similar that I thought there was something wrong with my testing methodology, but everything seems to be fine.

I'm testing a basic hello-world Sanic application, with one server worker (no multiprocessing).
I tried different scenarios too, specifically with uvloop and without uvloop, and with an artificial receive-buffer restriction (to trigger stream pause and resume) and without a restriction.
Using wrk with 8 connections across 8 threads to hammer the server for 30 seconds per test.

My matrix was (BufferedProtocol, Protocol) × (with_uvloop, without_uvloop) × (with_restriction, without_restriction), giving a total of 8 separate tests.

Every test gave 4400-4500 requests per second. Even running with or without uvloop made no difference in my testing on my laptop with Python 3.8.

I repeated all 8 tests above while restricting wrk to 1 connection at a time on 1 thread. Again, the results were all within the margin of error, at 2200-2300 requests per second.

So I don't know. I like the BufferedProtocol implementation; it's clean and correct and works well. It should be faster, but in the Sanic server it doesn't provide any speed advantage.

@ahopkins
Member Author

ahopkins commented Mar 5, 2021

Did you try with a large payload? What happens if the response is a large blob? Maybe throw in a couple of tests at different sizes?

However, these results pretty much confirm my own. And I agree with your conclusion: it seems like it should be better. But I think @Tronic's streaming solution already doubles up on this effort, so we are already seeing that performance increase. So the BufferedProtocol is doing the same thing, except with two methods.

@ahopkins ahopkins marked this pull request as draft March 15, 2021 10:05
@ahopkins ahopkins added this to the v21.6 milestone Mar 15, 2021
Base automatically changed from master to main March 23, 2021 00:47
@ahopkins ahopkins modified the milestones: v21.6, v21.9 Jun 20, 2021
@codeclimate

codeclimate bot commented Jul 8, 2021

Code Climate has analyzed commit 9e8520a and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 93.7% (86% is the threshold).

This pull request will bring the total coverage in the repository to 94.3% (0.0% change).

View more on Code Climate.

@ahopkins
Member Author

ahopkins commented Jul 8, 2021

Continuing the conversation from #2183.

BufferedProtocol seems to work for no body, single body, and chunked body requests when sent individually or pipelined.

One thing that probably needs to be reexamined, if we are going in this direction, is whether we want to continue writing to recv_buffer the same way, or use a different sync primitive for control.

@Tronic Thoughts?

self.recv_buffer = bytearray()
self.recv_buffer = self.protocol.recv_buffer = bytearray()
Member

@Tronic Tronic Jul 8, 2021


This is still incorrect, and in general you should not recreate the buffer. Use del self.recv_buffer[:2] or so to keep the existing buffer but consume the data that you want removed (note: as discussed earlier, removing all or the first two bytes here might be invalid anyway).
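
A standalone illustration of the difference (not the PR's code): deleting a slice consumes bytes in place and keeps every existing reference to the buffer valid, whereas rebinding the attribute creates a new object that other holders of the old buffer never see.

buf = bytearray(b"\r\nGET / HTTP/1.1\r\n")
alias = buf                    # e.g. the protocol holding the same buffer object

del buf[:2]                    # consume the leading CRLF in place
assert alias is buf
assert bytes(buf).startswith(b"GET")

buf = bytearray()              # rebinding creates a brand-new object
assert alias is not buf        # the old holder still sees the stale data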

Member Author

I am still not sure I see how. If we do not, then even if we account for the HTTP loop, you end up with \r\n as the first two bytes of the next request.

Comment on lines +331 to +335
def buffer_updated(self, nbytes: int) -> None:
data = self._buffer[:nbytes]

self._time = current_time()
self.recv_buffer += data
Member

Shouldn't the get_buffer function return a memoryview into recv_buffer directly, rather than a separate _buffer, to avoid copying the data here?

Member

However, since we are concurrently receiving more data (so we cannot alter the buffer) and handling a request (which needs to consume bytes from the buffer), this cannot easily work. Which begs the question: why use BufferedProtocol in the first place?

Member Author

@ahopkins ahopkins Jul 8, 2021


Yes, this was just my proof of concept to see if the BufferedProtocol would play more nicely with streaming requests. We probably need to consume the data somehow from the object returned by get_buffer.

I am thinking that perhaps what we want to do is somehow sync on buffer_updated by grabbing the nbytes value, and then use that instead of keeping the separate recv_buffer object, which is somewhat duplicative at this point. That is a larger refactor, though.

@Tronic
Member

Tronic commented Jul 8, 2021

This could offer performance benefits if a single buffer were used. In that case get_buffer needs to return a memoryview of a location just past the end of the space already used, and recv_buffer should be another memoryview from the currently unconsumed position (zero if no bytes currently in the buffer have been consumed yet) to the end of the data received (i.e. the same position where the memoryview of get_buffer begins).

The tricky part is that at some point the consumed bytes need to actually be removed from the buffer and both memoryviews recreated accordingly, so that everything appears intact. I suppose that this update could be done within the get_buffer function.
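
A rough sketch of that scheme (illustrative names, not the PR's code): get_buffer compacts away the consumed prefix and returns the free tail as a writable view, while the unconsumed bytes are exposed as a separate view. As noted above, any views handed out earlier become stale once compaction happens, which is the tricky part.

class SingleBuffer:
    def __init__(self, capacity: int = 65536) -> None:
        self._data = bytearray(capacity)
        self._consumed = 0  # bytes the HTTP layer has already processed
        self._filled = 0    # total bytes received into the buffer

    def get_buffer(self, sizehint: int = -1) -> memoryview:
        # Compact: shift unconsumed bytes to the front so the tail is reusable.
        # Equal-length slice assignment does not resize the bytearray, so it is
        # allowed even while views exist, but older views now see stale data.
        if self._consumed:
            unread = self._filled - self._consumed
            self._data[:unread] = self._data[self._consumed:self._filled]
            self._filled, self._consumed = unread, 0
        # A real implementation must never return a zero-size buffer; the
        # full-buffer case would need to pause reading instead.
        return memoryview(self._data)[self._filled:]

    def buffer_updated(self, nbytes: int) -> None:
        self._filled += nbytes

    @property
    def recv_view(self) -> memoryview:
        # Unconsumed data, from the consume position to the end of received data.
        return memoryview(self._data)[self._consumed:self._filled]

    def consume(self, nbytes: int) -> None:
        self._consumed += nbytes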

Comment on lines +306 to +308
self._buffer = memoryview(
bytearray(self.app.config.REQUEST_BUFFER_SIZE)
)
Member

This is a waste of RAM. The buffer should be kept as small as possible, not always the maximum permitted size. This makes a difference when handling a lot of parallel requests.

Member Author

I am not sure where this value is coming from, but it is the size limit that get_buffer seems to want 🤔

>> get_buffer sizehint=65536

Member

The sizehint is hardcoded in libuv (and probably in a similar manner in plain asyncio). We can safely ignore the hint because libuv will then simply call recv_into with a smaller buffer. For the sake of simplicity, it might be best to keep our buffer hardcoded to 64 KiB as well, at least until it can be evaluated whether BufferedProtocol offers those performance benefits that we want.

@ahopkins ahopkins modified the milestones: v21.9, Future Release Aug 8, 2021
@ahopkins ahopkins closed this Sep 29, 2021
@ahopkins ahopkins deleted the bufferedprotocol branch February 23, 2022 08:03
Successfully merging this pull request may close these issues.

The AsyncIO Server should use Python's 3.7 BufferedProtocol when possible ?
4 participants