Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support installing smart_open without AWS dependencies #534

Merged
merged 18 commits into from
Sep 3, 2020

Conversation

justindujardin
Copy link
Contributor

@justindujardin justindujardin commented Aug 24, 2020

Title

Support installing smart_open without AWS dependencies

Motivation

Using smart_open with GCS should not require AWS packages to be installed.

Changes

  • show friendly errors when using smart_open.open with an unsupported scheme because of missing dependencies
  • show deprecation warning once when using s3_iter_bucket from the top-level module
  • update tox.ini to run dependency tests without any extras packages installed
  • don't install AWS deps as part of the base package

Checklist

Before you create the PR, please make sure you have:

  • Picked a concise, informative and complete title
  • Clearly explained the motivation behind the PR
  • Linked to any existing issues that your PR will be solving
  • Included tests for any new functionality
  • Checked that all unit tests pass

 - don't export s3_iter_bucket at the top-level. We could probably still make this work, but it would be ugly. This is the only breaking change I can see
 - add suggestive install commands when importing a module that you don't have the deps for
- the tests fail locally without it installed
@justindujardin
Copy link
Contributor Author

@mpenkov @piskvorky could you provide review/feedback on this?

 - wrap it in a lazy-loaded warning about being deprecated in favor of importing iter_bucket from s3
 - add some more tests
 - this isn't the prettiest, but it's still better than a generic error that something might be wrong with a link to the readme?
 - duplicate the schemes list inside of transport.py where the modules are registered. This way if you try to open a URI with that scheme we can show the nice "pip install smart_open[type]" error message.
 - add tests for transport registration and error propagation
@justindujardin
Copy link
Contributor Author

I added code to throw the helpful import errors when you call smart_open.open("scheme://bar") with a scheme that is unsupported because it's missing imports. The code isn't quite as nice as I'd like it to be (I had to duplicate the schemes outside of the transport file) but it yields a nice result when you use the open function:

(.env) smart_open > python
>>> from smart_open import open
>>> open("gs://foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "smart_open/smart_open/smart_open_lib.py", line 223, in open
    binary = _open_binary_stream(uri, binary_mode, transport_params)
  File "smart_open/smart_open/smart_open_lib.py", line 401, in _open_binary_stream
    submodule = transport.get_transport(scheme)
  File "smart_open/smart_open/transport.py", line 86, in get_transport
    raise ImportError(_ERRORS[scheme])
ImportError: You are trying to use the GCS functionality of smart_open
but you do not have the correct GCS dependencies installed. Try:

    pip install smart_open[gcs]


>>> open("s3://foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "smart_open/smart_open/smart_open_lib.py", line 223, in open
    binary = _open_binary_stream(uri, binary_mode, transport_params)
  File "smart_open/smart_open/smart_open_lib.py", line 401, in _open_binary_stream
    submodule = transport.get_transport(scheme)
  File "smart_open/smart_open/transport.py", line 86, in get_transport
    raise ImportError(_ERRORS[scheme])
ImportError: You are trying to use the S3 functionality of smart_open
but you do not have the correct AWS dependencies installed. Try:

    pip install smart_open[aws]


>>> open("azure://foo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "smart_open/smart_open/smart_open_lib.py", line 223, in open
    binary = _open_binary_stream(uri, binary_mode, transport_params)
  File "smart_open/smart_open/smart_open_lib.py", line 401, in _open_binary_stream
    submodule = transport.get_transport(scheme)
  File "smart_open/smart_open/transport.py", line 86, in get_transport
    raise ImportError(_ERRORS[scheme])
ImportError: You are trying to use the Azure functionality of smart_open
but you do not have the correct Azure dependencies installed. Try:

    pip install smart_open[azure]

Copy link
Collaborator

@mpenkov mpenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patience. Left you a review, please have a look.

setup.py Outdated Show resolved Hide resolved
smart_open/__init__.py Outdated Show resolved Hide resolved
smart_open/__init__.py Outdated Show resolved Hide resolved
smart_open/__init__.py Outdated Show resolved Hide resolved
smart_open/azure.py Outdated Show resolved Hide resolved
smart_open/transport.py Outdated Show resolved Hide resolved
justindujardin and others added 4 commits August 31, 2020 08:45
Co-authored-by: Michael Penkov <m@penkov.dev>
 - DRY up the missing import errors by putting the logic in transport.py
 - add MISSING_DEPS to transport modules that can be set to true when handling import errors. This allows the submodule to load properly (so we can inspect the scheme/s) even if it's missing required pacakges.
 - remove double-specified transport schemes (because of MISSING_DEPS)
@justindujardin justindujardin requested a review from mpenkov August 31, 2020 17:53
smart_open/transport.py Outdated Show resolved Hide resolved
@justindujardin
Copy link
Contributor Author

Thanks for the thorough review! I think that I've addressed all your comments now. Let me know if you want any more changes.

Copy link
Collaborator

@mpenkov mpenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you for your contribution @justindujardin, and congrats on your first PR to smart_open 🥇

@mpenkov mpenkov merged commit efe1e9d into piskvorky:develop Sep 3, 2020
@justindujardin justindujardin deleted the fix/install-without-aws branch September 3, 2020 01:03
@justindujardin
Copy link
Contributor Author

@mpenkov any ballpark for when this might ship in a new pypi version? 🙏

@mpenkov
Copy link
Collaborator

mpenkov commented Sep 22, 2020

I need to fix #538 first, after that nothing is blocking the release.

@mpenkov
Copy link
Collaborator

mpenkov commented Sep 26, 2020

https://pypi.org/project/smart-open/2.2.0/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Installing without AWS dependencies? Make the submodule importing system robust to missing extras
3 participants