Merged
8 changes: 7 additions & 1 deletion service/handler/api/v1/files.py
@@ -1,6 +1,7 @@
import hashlib
import logging
from io import BytesIO
from urllib.parse import quote
from uuid import UUID

from fastapi import APIRouter, Body, Depends, File, Form, HTTPException, UploadFile, status
@@ -271,11 +272,16 @@ async def download_file(
await storage.download_file(file_record.storage_key, file_stream)
file_stream.seek(0)

# Encode filename for Content-Disposition header (RFC 5987)
# Support both ASCII and UTF-8 filenames for better browser compatibility
ascii_filename = file_record.original_filename.encode("ascii", "ignore").decode("ascii")

suggestion: Dropping all non-ASCII characters may yield an empty or misleading filename value.

Because .encode("ascii", "ignore") drops non-ASCII characters, a fully non-ASCII original_filename will produce an empty ascii_filename, which some older user agents may display instead of filename*. It would be safer to map non-ASCII characters to a placeholder (e.g. _ or ?) and/or fall back to a generic filename if the ASCII result ends up empty.
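
To make the failure mode concrete, here is a minimal sketch that applies the same expression as the diff to a few sample names. The helper name `ascii_ignore` and the sample filenames are assumptions chosen only for illustration; they do not come from the PR.

```python
def ascii_ignore(name: str) -> str:
    """Mirror the expression used in the diff: drop every non-ASCII character."""
    return name.encode("ascii", "ignore").decode("ascii")


# Hypothetical sample names, not taken from the PR.
for name in ["report.pdf", "报告.pdf", "文件", "r\u00e9sum\u00e9.txt"]:
    print(repr(name), "->", repr(ascii_ignore(name)))
```

A fully non-ASCII name such as "文件" comes back as an empty string, and "报告.pdf" collapses to ".pdf", which is the misleading value described above.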

Suggested implementation:

from io import BytesIO
import os
from urllib.parse import quote
from uuid import UUID
        # Encode filename for Content-Disposition header (RFC 5987)
        # Support both ASCII and UTF-8 filenames for better browser compatibility
        original_filename = file_record.original_filename or "download"
        # Replace non-ASCII, control, and quoted-string special characters with a placeholder
        ascii_filename = "".join(
            ch if ch.isascii() and ch.isprintable() and ch not in '"\\' else "_"
            for ch in original_filename
        ).strip()
        # Fall back to a generic ASCII name if only placeholders remain before the extension
        stem, ext = os.path.splitext(ascii_filename)
        if not stem.strip("_ "):
            ascii_filename = f"download{ext}"
        # Percent-encode the name that was actually used, so a missing filename cannot raise
        utf8_filename = quote(original_filename, safe="")
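
The suggestion can also be tried outside the handler. The following is a self-contained sketch under stated assumptions: the function name `content_disposition` and its `default` parameter are hypothetical, and the fallback rule (use a generic name when the part before the extension contains only placeholders) is one possible reading of the suggestion rather than the project's actual behavior.

```python
import os
from urllib.parse import quote


def content_disposition(original_filename, default="download"):
    """Build an attachment header value with an ASCII fallback (filename=)
    and a percent-encoded UTF-8 name (filename*=)."""
    name = original_filename or default
    # Keep printable ASCII only; also replace the characters that would
    # need escaping inside a quoted-string.
    ascii_name = "".join(
        ch if ch.isascii() and ch.isprintable() and ch not in '"\\' else "_"
        for ch in name
    ).strip()
    # If nothing but placeholders is left before the extension, use the default.
    stem, ext = os.path.splitext(ascii_name)
    if not stem.strip("_ "):
        ascii_name = f"{default}{ext}"
    # safe="" also percent-encodes "/", which quote() leaves alone by default.
    utf8_name = quote(name, safe="")
    return f"attachment; filename=\"{ascii_name}\"; filename*=UTF-8''{utf8_name}"


print(content_disposition("report.pdf"))
print(content_disposition("报告.pdf"))
print(content_disposition(None))
```

With this sketch a plain ASCII name passes through unchanged, while "报告.pdf" yields `filename="download.pdf"` alongside the percent-encoded `filename*` value.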

utf8_filename = quote(file_record.original_filename.encode("utf-8"))

return StreamingResponse(
file_stream,
media_type=file_record.content_type,
headers={
- "Content-Disposition": f'attachment; filename="{file_record.original_filename}"',
+ "Content-Disposition": f"attachment; filename=\"{ascii_filename}\"; filename*=UTF-8''{utf8_filename}",
"Content-Length": str(file_record.file_size),
},
)
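
For reference, the `filename*` parameter emitted above has the form charset, optional language, then a percent-encoded value, separated by single quotes. A client-side decoder could look roughly like the sketch below; `decode_ext_value` is a hypothetical helper, and real user agents do more validation than this.

```python
from urllib.parse import quote, unquote


def decode_ext_value(ext_value: str) -> str:
    """Decode a value shaped like UTF-8''%E6%8A%A5.pdf (illustrative only)."""
    # Split into charset, (usually empty) language tag, and the encoded name.
    charset, _language, encoded = ext_value.split("'", 2)
    return unquote(encoded, encoding=charset)


# Round trip with a hypothetical filename.
encoded = "UTF-8''" + quote("报告.pdf", safe="")
print(encoded)
print(decode_ext_value(encoded))
```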