Be more responsible about network requests #84

Open
ssokolow opened this issue Jul 31, 2022 · 2 comments

@ssokolow

I entered an invalid language code to check whether there's a Python exception I need to handle when the language code selected in my existing Enchant-based infrastructure isn't supported by nlprule. I got this very surprising error message:

ValueError: HTTP status client error (404 Not Found) for url (https://github.com/bminixhofer/nlprule/releases/download/0.6.4/ef_tokenizer.bin.gz)
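For reference, a minimal sketch of handling that exception on the calling side. The only thing taken from the error above is that a `ValueError` is raised; `load_or_none` and `fake_loader` are hypothetical names, and the loader is passed in as a parameter so the sketch runs without nlprule or a network connection. In real use the loader would be whichever nlprule call raised the error.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def load_or_none(loader: Callable[[str], T], lang_code: str) -> Optional[T]:
    """Call loader(lang_code) and return None if it raises ValueError."""
    try:
        return loader(lang_code)
    except ValueError as exc:
        # The message is the only detail available, so report it as-is.
        print(f"language {lang_code!r} unavailable: {exc}")
        return None


def fake_loader(lang_code: str) -> str:
    """Hypothetical stand-in for the real loader; the supported set is made up."""
    if lang_code not in {"en", "de"}:
        raise ValueError(f"HTTP status client error (404 Not Found) for {lang_code}")
    return f"tokenizer<{lang_code}>"


if __name__ == "__main__":
    load_or_none(fake_loader, "en")
    load_or_none(fake_loader, "ef")
```

Note that catching `ValueError` cannot tell "unsupported language" apart from "no network", which is part of the problem described here.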

Personally, I consider it very irresponsible neither to warn people that a dependency will perform network requests under some circumstances nor to provide an obvious way to handle things offline.

I highly recommend you change this. For my own use, since I tend to incorporate PyO3-based stuff into my PyQt apps anyway, I'll probably switch to writing my own nlprule wrapper. That way I can trust that, if no network libraries show up in the Cargo.lock and the author isn't being actively malicious, what I build will work on an airgapped machine or in a networkless sandbox.

(Seriously. Sandboxes like Flatpak are becoming more and more common. Just assuming applications will have network access is not cool.)

@bminixhofer
Owner

Good point. I don't quite agree on the magnitude of the issue though. I was taking inspiration from Huggingface's transformers .from_pretrained API which does basically the same thing. I do not see a big issue with network requests to a trusted URL.

You are very welcome to open a PR to check for availability offline. Otherwise I will leave this open and might get around to it at some point, but I am currently not actively developing this library so it will take time.
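A rough sketch of what an offline availability check could look like, written in Python for illustration only. Everything here is an assumption rather than nlprule's actual behavior: the cache directory, the `<lang>_tokenizer.bin` file name (guessed from the URL in the error message), the `allow_network` flag, and the names `resolve_data_file` and `ModelNotAvailableOffline`.

```python
from pathlib import Path


class ModelNotAvailableOffline(FileNotFoundError):
    """Raised when a data file is not cached and network use is not allowed."""


def resolve_data_file(
    lang_code: str, cache_dir: Path, allow_network: bool = False
) -> Path:
    """Return the locally cached tokenizer file for lang_code.

    Hypothetical layout: <cache_dir>/<lang_code>_tokenizer.bin
    """
    candidate = cache_dir / f"{lang_code}_tokenizer.bin"
    if candidate.is_file():
        return candidate
    if not allow_network:
        # Fail with an actionable message instead of a surprise HTTP error.
        raise ModelNotAvailableOffline(
            f"{candidate} not found and network access is disabled; "
            "place the file there manually or pass allow_network=True"
        )
    # The download itself is deliberately left out of this sketch.
    raise NotImplementedError("download step omitted from this sketch")
```

Defaulting `allow_network` to `False` is a design choice for the sketch: the caller has to opt in to network access, so an airgapped or sandboxed build fails early with a clear message.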

@ssokolow
Author

ssokolow commented Aug 1, 2022

> I do not see a big issue with network requests to a trusted URL.

I don't see it as a security thing for the program so much as a point of frustration when it comes time to build your distributables. Either you find that they're incomplete (e.g. you think you're saving a complete .msi or .exe installer, only to discover that files are missing once you're on deployment and military policy prevents you from just going online to grab them), or something like flatpak-builder or the Debian/Fedora/Gentoo/etc. packaging environment errors out because, for security and reproducibility reasons, it operates as follows:

  1. Download all dependencies without executing third-party code, based on a manifest written in something like YAML, JSON, TOML, or whatever predecessor format APT and RPM use. (That is, like cargo fetch. Here's an example Flatpak manifest that I wrote for a feature request to package lgogdownloader on Flathub.)
  2. Run the build process inside a sandbox with no network access. (i.e. cargo build --offline in a sandbox)
  3. Install the package in a location that the application won't have write access to when run as an ordinary user. In the case of Flatpak or Snaps, it is also sandboxed such that the application won't have network access unless the manifest asked for it. (That's the --share=network under finish-args in the manifest I linked, which then shows up in an imposing "this program will be granted these permissions. Continue installing?"-style prompt, similar to how browser extensions work, providing passive pressure to ask for fewer permissions.)

They have a pretty "no exceptions" attitude toward this separation of concerns, since they want to do the build themselves on their own server farm, and, for Flathub, you are the maintainer, so you can't just let someone else figure out how to work around such a road bump.
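To illustrate the fetch-then-build split described in the steps above, here is a minimal sketch of the check an offline build phase implies: every input is declared up front with a checksum, and the build only proceeds if all of them are already present locally. The manifest shape (`sources`, `filename`, `sha256`) and the name `check_sources` are hypothetical, loosely modeled on how packaging manifests list sources and not identical to any real format.

```python
import hashlib
from pathlib import Path
from typing import List


def check_sources(manifest: dict, source_dir: Path) -> List[str]:
    """Return a list of problems; an empty list means every input is present.

    Only reads local files, so it can run inside a sandbox with no network.
    """
    problems = []
    for source in manifest["sources"]:
        path = source_dir / source["filename"]
        if not path.is_file():
            # The fetch phase never downloaded this input.
            problems.append(f"missing: {source['filename']}")
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest != source["sha256"]:
            # The file exists but is not the one the manifest declared.
            problems.append(f"checksum mismatch: {source['filename']}")
    return problems
```

A library that downloads data files at run time has nowhere to plug into this model: its inputs are not in the manifest, so the offline phase cannot know they are missing until the code fails.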
