bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (pythonGH-25058)

`shutil.unpack_archive()` tries to read the whole file into memory instead of copying it through a smaller buffer. The process crashes on really large files, e.g. archive: ~1.7G, unpacked: ~10G. Before the crash it can easily consume all available RAM on smaller systems. Had to pull the code from `zipfile.ZipFile.extractall()` to fix this.

Automerge-Triggered-By: GH:gpshead
(cherry picked from commit f32c795)

Co-authored-by: Igor Bolshakov <ibolsch@gmail.com>
2 people authored and mgorny committed Mar 19, 2024
1 parent 1bd6e4c commit 5cbacf7
Showing 2 changed files with 8 additions and 10 deletions.
16 changes: 6 additions & 10 deletions Lib/shutil.py
@@ -1144,20 +1144,16 @@ def _unpack_zipfile(filename, extract_dir):
             if name.startswith('/') or '..' in name:
                 continue

-            target = os.path.join(extract_dir, *name.split('/'))
-            if not target:
+            targetpath = os.path.join(extract_dir, *name.split('/'))
+            if not targetpath:
                 continue

-            _ensure_directory(target)
+            _ensure_directory(targetpath)
             if not name.endswith('/'):
                 # file
-                data = zip.read(info.filename)
-                f = open(target, 'wb')
-                try:
-                    f.write(data)
-                finally:
-                    f.close()
-                    del data
+                with zip.open(name, 'r') as source, \
+                        open(targetpath, 'wb') as target:
+                    copyfileobj(source, target)
     finally:
         zip.close()
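
For readers who want to see the pattern in isolation, here is a minimal standalone sketch of the streaming copy this change adopts. It is not the code from `Lib/shutil.py`: the helper names `extract_member_streaming` and `demo`, the `chunk_size` parameter, and the file names are hypothetical and chosen only for illustration. Only `zipfile.ZipFile.open()` and `shutil.copyfileobj()` come from the patch itself.

```python
import os
import shutil
import tempfile
import zipfile


def extract_member_streaming(archive_path, member, dest_path,
                             chunk_size=64 * 1024):
    """Copy one zip member to dest_path in fixed-size chunks.

    Hypothetical helper: the member is decompressed and written
    chunk by chunk, so it is never held in memory in full.
    """
    with zipfile.ZipFile(archive_path) as zf:
        with zf.open(member, 'r') as source, \
                open(dest_path, 'wb') as target:
            shutil.copyfileobj(source, target, chunk_size)


def demo():
    """Build a small archive, extract one member, return both sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, 'demo.zip')
        payload = b'x' * (1024 * 1024)  # 1 MiB stand-in for a large file
        with zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED) as zf:
            zf.writestr('big.bin', payload)

        out = os.path.join(tmp, 'big.bin')
        extract_member_streaming(archive, 'big.bin', out)
        return os.path.getsize(out), len(payload)
```

By contrast, the removed `zip.read(info.filename)` call returns the entire decompressed member as a single `bytes` object, which is what exhausts memory on multi-gigabyte members.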

@@ -0,0 +1,2 @@
+Fix :exc:`MemoryError` in :func:`shutil.unpack_archive` which fails inside
+:func:`shutil._unpack_zipfile` on large files. Patch by Igor Bolshakov.
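
As a rough way to exercise the public entry point named in this entry, the sketch below round-trips a zip archive through `shutil.unpack_archive()`. The function name `roundtrip_unpack`, the payload size, and the member path are assumptions made for the example. It checks only that the content survives extraction and does not measure memory use.

```python
import os
import shutil
import tempfile
import zipfile


def roundtrip_unpack(payload_size=2 * 1024 * 1024):
    """Pack one member into a zip, unpack it, and compare the bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, 'sample.zip')
        payload = b'\0' * payload_size
        with zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED) as zf:
            zf.writestr('data/blob.bin', payload)

        # The format is inferred from the ".zip" extension.
        extract_dir = os.path.join(tmp, 'out')
        shutil.unpack_archive(archive, extract_dir)

        extracted = os.path.join(extract_dir, 'data', 'blob.bin')
        with open(extracted, 'rb') as f:
            return f.read() == payload
```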
