Improve upload_fileobj performance
Changing the file reader buffer from `bytes` to `bytearray` significantly reduces CPU usage. Using `bytes` is inefficient because it's immutable: you hit the classic string-building problem, where every append to an immutable sequence copies the entire buffer, so building a payload of n bytes chunk by chunk takes O(n^2) total work.

From [the python docs](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations):
> if concatenating bytes objects, you can similarly use bytes.join() or io.BytesIO, or you can do in-place concatenation with a bytearray object. bytearray objects are mutable and have an efficient overallocation mechanism

I tried `io.BytesIO` too, but `bytearray` has slightly better performance in my testing.
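The difference is easy to see in isolation. Below is a minimal sketch (not part of this commit; the function names are illustrative) comparing the three approaches the docs mention, each building the same payload from fixed-size chunks the way the multipart reader does:

```python
import io
import timeit

CHUNK = b"x" * 1024  # stand-in for one read from the source file


def append_bytes(n: int) -> bytes:
    # Immutable bytes: each += must copy the whole accumulated buffer,
    # so n appends cost O(n^2) total work.
    buf = b""
    for _ in range(n):
        buf += CHUNK
    return buf


def append_bytearray(n: int) -> bytes:
    # Mutable bytearray: += extends in place using overallocation,
    # giving amortized O(1) per append.
    buf = bytearray()
    for _ in range(n):
        buf += CHUNK
    return bytes(buf)


def append_bytesio(n: int) -> bytes:
    # io.BytesIO: also linear overall, but with per-call overhead
    # from the file-like API.
    buf = io.BytesIO()
    for _ in range(n):
        buf.write(CHUNK)
    return buf.getvalue()


if __name__ == "__main__":
    for fn in (append_bytes, append_bytearray, append_bytesio):
        t = timeit.timeit(lambda: fn(2000), number=3)
        print(f"{fn.__name__}: {t:.3f}s")
```

All three produce identical output; only the accumulation cost differs, which is why the patch below is a two-line change.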
JohnHBrock authored and terrycain committed May 10, 2023
1 parent 6932bef commit 1c9bd60
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions aioboto3/s3/inject.py

@@ -265,7 +265,7 @@ async def file_reader() -> None:
         eof = False
         while not eof:
             part += 1
-            multipart_payload = b''
+            multipart_payload = bytearray()
             loop_counter = 0
             while len(multipart_payload) < multipart_chunksize:
                 try:
@@ -284,7 +284,7 @@ async def file_reader() -> None:

                     # shortcircuit upload logic
                     eof = True
-                    multipart_payload = b''
+                    multipart_payload = bytearray()
                     break

             if data == b'' and loop_counter > 0:  # End of file, handles uploading empty files
