Protect against malicious nested archives (aka zip bombs) 

We should check the extracted file size during chunk parsing to avoid filling up the disk when extracting malicious nested archives.

Samples can be found here: https://www.bamsoftware.com/hacks/zipbomb/

For zip bombs, a check can be implemented in `is_valid`:

```python
def is_valid(self, file: io.BufferedIOBase) -> bool:
        has_encrypted_files = False
        try:
            with zipfile.ZipFile(file) as zip:  # type: ignore
                for zipinfo in zip.infolist():
                    if zipinfo.flag_bits & ENCRYPTED_FLAG:
                        has_encrypted_files = True
            if has_encrypted_files:
                logger.warning("There are encrypted files in the ZIP")
            return True
        except (zipfile.BadZipFile, UnicodeDecodeError, ValueError):
            return False
```

Something similar to this could work:

```python
with zipfile.ZipFile(file) as zip:  # type: ignore
    extracted_size = sum(e.file_size for e in zip.infolist())
    if extracted_size > SOME_CONSTANT:
        # bail out
```

I'll check if similar behavior (ie. "let's fill the whole disk") can be triggered with other formats.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Protect against malicious nested archives (aka zip bombs) #210

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Protect against malicious nested archives (aka zip bombs) #210

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions