Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restrict file types #370

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from
Open

Restrict file types #370

wants to merge 6 commits into from

Conversation

harminius
Copy link
Contributor

@harminius harminius commented Feb 6, 2025

Resolves https://github.com/MerginMaps/server-private/issues/2732

Problem:
We allow users to upload any file, even the (potentially) malicious ones 🏴‍☠️

See tickets for more details and the Pentest report

Solution:
We are more confident with using blacklists. 🛡️

Validate extension during project push - aka push_start. Abort the upload if it includes files with blacklisted extensions.
Do not rely solely on the extension check. Determine the file type from its header and block blacklisted types - once we have the file on our filesystem (in push_finish).

Zipping the unsupported makes it possible to upload it.

E.g. block .exe

Screenshot from 2025-02-06 08-08-00

Extension renaming of the unsupported file doesn't help ⛔

Screenshot from 2025-02-06 08-40-48

Potential whitelists are in the comment

Do not allow to sync files with extensions unless whitelisted

Whitelist approach is suggested in this PR as it is considered to be safer. However, using blacklists is also a way, see this comment for suggested blacklists.

@harminius harminius requested a review from MarcelGeo February 6, 2025 08:00
server/mergin/sync/storages/disk.py Outdated Show resolved Hide resolved
server/mergin/sync/utils.py Show resolved Hide resolved
server/mergin/sync/public_api_controller.py Outdated Show resolved Hide resolved
validate_filename(filename)


def validate_file(head, filename):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why we do not have one function for validating mime types and extensions?

server/mergin/sync/utils.py Outdated Show resolved Hide resolved
@coveralls
Copy link

coveralls commented Feb 6, 2025

Pull Request Test Coverage Report for Build 13240989483

Details

  • 50 of 50 (100.0%) changed or added relevant lines in 3 files are covered.
  • 49 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.9%) to 93.275%

Files with Coverage Reduction New Missed Lines %
server/mergin/sync/utils.py 1 84.13%
server/mergin/commands.py 12 31.71%
server/mergin/app.py 36 74.52%
Totals Coverage Status
Change from base Build 13240347753: 0.9%
Covered Lines: 6755
Relevant Lines: 7242

💛 - Coveralls

@harminius
Copy link
Contributor Author

harminius commented Feb 7, 2025

Current approach is (in web security recommended) whitelist.
If we are afraid to be too restrictive we could use blackilists.
Blocking logic would stay the same.

Here's the suggestion:

FORBIDDEN_EXTENSIONS = {
    ".ade", ".adp", ".app", ".appcontent-ms", ".application", ".appref-ms",  
    ".asp", ".aspx", ".asx", ".bas", ".bat", ".bgi", ".cab", ".cdxml", ".cer",  
    ".chm", ".cmd", ".cnt", ".com", ".cpl", ".crt", ".csh", ".der", ".diagcab",  
    ".dll", ".drv", ".exe", ".fxp", ".gadget", ".grp", ".hlp", ".hpj", ".hta",  
    ".htc", ".htaccess", ".htpasswd", ".inf", ".ins", ".iso", ".isp",  
    ".its", ".jar", ".jnlp", ".js", ".jse", ".jsp", ".ksh", ".lnk", ".mad",  
    ".maf", ".mag", ".mam", ".maq", ".mar", ".mas", ".mat", ".mau", ".mav",  
    ".maw", ".mcf", ".mda", ".mdb", ".mde", ".mdt", ".mdw", ".mdz", ".msc",  
    ".mht", ".mhtml", ".msh", ".msh1", ".msh2", ".mshxml", ".msh1xml", ".msh2xml",  
    ".msi", ".msp", ".mst", ".msu", ".ops", ".osd", ".pcd", ".pif", ".pl",  
    ".plg", ".prf", ".prg", ".printerexport", ".ps1", ".ps1xml", ".ps2",  
    ".ps2xml", ".psc1", ".psc2", ".psd1", ".psdm1", ".pssc", ".pst", ".py",  
    ".pyc", ".pyo", ".pyw", ".pyz", ".pyzw", ".reg", ".scf", ".scr", ".sct",  
    ".settingcontent-ms", ".sh", ".shb", ".shs", ".sys", ".theme", ".tmp",  
    ".torrent", ".url", ".vb", ".vbe", ".vbp", ".vbs", ".vhd", ".vhdx",  
    ".vsmacros", ".vsw", ".webpnp", ".website", ".ws", ".wsb", ".wsc",  
    ".wsf", ".wsh", ".xbap", ".xll", ".xnk"
}
FORBIDDEN_MIME_TYPES = {
    "application/x-msdownload",
    "application/x-sh",
    "application/x-bat",
    "application/x-msdos-program",
    "application/x-dosexec",
    "application/x-csh",
    "application/x-perl",
    "application/javascript",
    "application/x-python-code",
    "application/x-ruby",
    "application/java-archive",
    "application/vnd.ms-cab-compressed",
    "application/x-ms-shortcut",
    "application/vnd.microsoft.portable-executable",
    "application/x-ms-installer",
    "application/x-ms-application",
    "application/x-ms-wim",
    "text/x-shellscript",
}

@MarcelGeo
Copy link
Contributor

Current approach is (in web security recommended) whitelist. If we are afraid to be too restrictive we could use blackilists. Blocking logic would stay the same.

Here's the suggestion:

FORBIDDEN_EXTENSIONS = {
    ".ade", ".adp", ".app", ".appcontent-ms", ".application", ".appref-ms",  
    ".asp", ".aspx", ".asx", ".bas", ".bat", ".bgi", ".cab", ".cdxml", ".cer",  
    ".chm", ".cmd", ".cnt", ".com", ".cpl", ".crt", ".csh", ".der", ".diagcab",  
    ".dll", ".drv", ".exe", ".fxp", ".gadget", ".grp", ".hlp", ".hpj", ".hta",  
    ".htc", ".htaccess", ".htpasswd", ".img", ".inf", ".ins", ".iso", ".isp",  
    ".its", ".jar", ".jnlp", ".js", ".jse", ".jsp", ".ksh", ".lnk", ".mad",  
    ".maf", ".mag", ".mam", ".maq", ".mar", ".mas", ".mat", ".mau", ".mav",  
    ".maw", ".mcf", ".mda", ".mdb", ".mde", ".mdt", ".mdw", ".mdz", ".msc",  
    ".mht", ".mhtml", ".msh", ".msh1", ".msh2", ".mshxml", ".msh1xml", ".msh2xml",  
    ".msi", ".msp", ".mst", ".msu", ".ops", ".osd", ".pcd", ".pif", ".pl",  
    ".plg", ".prf", ".prg", ".printerexport", ".ps1", ".ps1xml", ".ps2",  
    ".ps2xml", ".psc1", ".psc2", ".psd1", ".psdm1", ".pssc", ".pst", ".py",  
    ".pyc", ".pyo", ".pyw", ".pyz", ".pyzw", ".reg", ".scf", ".scr", ".sct",  
    ".settingcontent-ms", ".sh", ".shb", ".shs", ".sys", ".theme", ".tmp",  
    ".torrent", ".url", ".vb", ".vbe", ".vbp", ".vbs", ".vhd", ".vhdx",  
    ".vsmacros", ".vsw", ".webpnp", ".website", ".ws", ".wsb", ".wsc",  
    ".wsf", ".wsh", ".xbap", ".xll", ".xnk"
}
FORBIDDEN_MIME_TYPES = {
    "application/x-msdownload",
    "application/x-sh",
    "application/x-bat",
    "application/x-msdos-program",
    "application/x-dosexec",
    "application/x-csh",
    "application/x-perl",
    "application/javascript",
    "application/x-python-code",
    "application/x-ruby",
    "application/java-archive",
    "application/vnd.ms-cab-compressed",
    "application/octet-stream",
    "application/x-ms-shortcut",
    "application/vnd.microsoft.portable-executable",
    "application/x-ms-installer",
    "application/x-ms-application",
    "application/x-ms-wim",
    "text/x-shellscript",
}

This looks more sensible for me now.

@harminius harminius changed the base branch from develop to path_validation_in_upload February 7, 2025 14:47
@harminius harminius changed the title Restrict file types and filepaths Restrict file types Feb 10, 2025
@harminius
Copy link
Contributor Author

whitelist:
ALLOWED_EXTENSIONS = {
# Geospatial
# Shapefile components
".shp",
".shx",
".dbf",
".prj",
".cpg",
".qix",
".sbn",
".sbx",
# Vector data formats
".geojson",
".kml",
".kmz",
".gpx",
".dxf",
".svg",
".gpkg",
".json",
# Raster data formats
".tif",
".tiff",
".geotiff",
".asc",
".vrt",
".grd",
".img",
".adf",
# Point cloud data formats
".las",
".laz",
".ply",
".xyz",
".e57",
".pcd",
# Database and container formats
".mbtiles",
".sqlite",
".gpkg",
# Geospatial metadata and styles
".sld",
".qml",
".lyr",
".qgz",
".qgs",
# Other specialized formats
".hdf",
".hdf5",
".netcdf",
".nc",
".dem",
".dt2",
".dt0",
".map",
".tab",
".mif",
".mid",
# Images
".jpg",
".jpeg",
".png",
".bmp",
".gif",
".heic",
".webp",
".tif",
".tiff",
# Text documents
".pdf",
".doc",
".docx",
".odt",
".rtf",
".txt",
# Others
".zip",
}

ALLOWED_MIME_TYPES = {
"application/x-shapefile",
"application/x-dbf",
"text/plain",
"application/octet-stream",
"application/geo+json",
"application/vnd.google-earth.kml+xml",
"application/vnd.google-earth.kmz",
"application/gpx+xml",
"application/geopackage+sqlite3",
"application/vnd.sqlite3",
"application/json",
"text/xml",
"text/html",
"application/vnd.mapbox-vector-tile",
"application/x-sqlite3",
"application/vnd.ogc.sld+xml",
"application/xml",
"application/x-qgis",
"application/x-hdf",
"application/x-hdf5",
"application/x-netcdf",
"application/pdf",
"application/msword",
"application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"application/vnd.oasis.opendocument.text",
"application/rtf",
"text/plain",
"application/zip",
}

server/mergin/sync/utils.py Outdated Show resolved Hide resolved
Base automatically changed from path_validation_in_upload to develop February 10, 2025 11:56
@harminius harminius force-pushed the file_upload_restriction branch from edbca50 to 67ac1ee Compare February 10, 2025 12:19
def is_supported_extension(filepath) -> bool:
"""Check whether file's extension is supported."""
ext = os.path.splitext(filepath)[1].lower()
return ext and ext not in FORBIDDEN_EXTENSIONS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not we have some input specific file that would not give extension? Just to make sure we do not break something with forbidden files with no ext.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there could be empty extension.

>>> import os
>>> os.path.splitext("Makefile")
('Makefile', '')

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants