-
-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for pathlib.Path
. Fix #170
#175
Conversation
smart_open/smart_open_lib.py
Outdated
@@ -163,6 +164,14 @@ def smart_open(uri, mode="rb", **kw): | |||
if not isinstance(mode, six.string_types): | |||
raise TypeError('mode should be a string') | |||
|
|||
# Support opening ``pathlib.Path`` objects by casting them to strings. | |||
try: | |||
from pathlib import Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, pathlib
available only for python>=3.4
, that's sad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of checking for python>=3.4
and assuming pathlib
support, I opted for the try-except pattern since there is a backport of pathlib
that users may wish to use for python<3.4
:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add it to setup.py
, wdyt @mpenkov? https://github.com/RaRe-Technologies/smart_open/blob/ecb24c23f4ad58904d371d88eeb1f83389cae16e/setup.py#L19-L23
smart_open/tests/test_smart_open.py
Outdated
fpath = Path(os.path.join(CURR_DIR, 'test_data/cp852.tsv.txt')) | ||
with smart_open.smart_open(fpath, encoding='cp852') as fin: | ||
fin.read() | ||
except ImportError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a bad practice to silently pass the test, better to add a condition about python version for test https://docs.python.org/2/library/unittest.html#skipping-tests-and-expected-failures
b295f04
to
dacbe67
Compare
Hi @menshikh-iv ready for another review. Interestingly |
smart_open/tests/test_smart_open.py
Outdated
from pathlib import Path | ||
fpath = Path(os.path.join(CURR_DIR, 'test_data/cp852.tsv.txt')) | ||
with smart_open.smart_open(fpath, encoding='cp852') as fin: | ||
fin.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you want to test for equality with the expected file content here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @mpenkov this commit is outdated. There is a more recent commit addressing review from @menshikh-iv.
Reviewed outdated commit. Newer commit addresses problem.
smart_open/tests/test_smart_open.py
Outdated
@@ -191,6 +191,24 @@ def test_open_with_keywords_explicit_r(self): | |||
actual = fin.read() | |||
self.assertEqual(expected, actual) | |||
|
|||
@unittest.skipIf( | |||
sys.version_info < (3, 4), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to check that pathlib
can be imported or not (not python version, because I can have installed pathlib
from backport.
@mpenkov suspicious dependency conflict for @clintval please merge the latest master into PR. |
d0df9b1
to
9c57e82
Compare
Ok @mpenkov and @menshikh-iv, I now skip the unit test if the module I rebased onto Did you want me to add |
smart_open/tests/test_smart_open.py
Outdated
@@ -191,6 +192,24 @@ def test_open_with_keywords_explicit_r(self): | |||
actual = fin.read() | |||
self.assertEqual(expected, actual) | |||
|
|||
@unittest.skipIf( | |||
pkgutil.find_loader('pathlib') is None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, I didn't know about pkgutil
, @clintval 👍
@clintval yes, add it (and change skipping condition, because of |
btw, really suspicious fail on Travis for |
5cbc331
to
d9f60b5
Compare
@menshikh-iv, opening The tests reflect this support. See these test logs for how this PR acts without And these more recent logs for how this PR acts with The only unsupported condition is if you have both >>> isinstance(pathlib.Path(), pathlib2.Path)
False If you want concurrent |
@clintval I'm +1 for using built-in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at the commits. They look good, but I suggest a few minor changes to remove duplication of effort and simplify expression.
smart_open/tests/test_smart_open.py
Outdated
# Unit test will skip if either module is unavailable so it's safe | ||
# to assume we can import _at least_ one working ``pathlib``. | ||
except ImportError: | ||
pass | ||
|
||
# builtin open() supports pathlib.Path in python>=3.6 only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to bother with this is in the tests. From the point of view of the test, it's an unnecessary detail. In this particular case, we just want to read the expected data in the quickest and simplest possible way. So in this case, I think it's better to just do:
with open(fpath) as fin:
expected = fin.read().decode('cp852')
smart_open/tests/test_smart_open.py
Outdated
@@ -193,16 +194,28 @@ def test_open_with_keywords_explicit_r(self): | |||
self.assertEqual(expected, actual) | |||
|
|||
@unittest.skipIf( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a simpler way of writing this is:
@unittest.skipIf(smart_open_lib.PATHLIB_SUPPORT is False, "your reason here")
smart_open/tests/test_smart_open.py
Outdated
"""If ``pathlib.Path`` is available we should be able to open and read.""" | ||
fpath = os.path.join(CURR_DIR, 'test_data/cp852.tsv.txt') | ||
|
||
# Import ``pathlib`` if the builtin ``pathlib`` or the backport |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're duplicating unnecessary work here. We've already gone through the pain of importing the necessary pathlib module in smart_open_lib. So, we can just get rid of all this import stuff, and use:
from smart_open.smart_open_lib import pathlib
path = pathlib.Path('/foo/bar')
whenever you need access to the actual pathlib module.
7483fb9
to
452b194
Compare
Thanks @mpenkov! Great review. I completely agree and pushed another commit slimming the test down. Here are the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like my previous comment wasn't clear enough, so let me clarify here.
smart_open/tests/test_smart_open.py
Outdated
actual = fin.read() | ||
self.assertEqual(expected, actual) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still want to check that the contents are equal. We just don't want to read the expected file via open/pathlib, because that won't work for all Python versions, as you've correctly pointed out.
So the correct thing to do here is (untested pseudo-code):
expected = open(fpath).read().decode('cp852')
self.assertEqual(expected, actual)
You can use a with statement if you like, I guess that's more Pythonic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add that back in, I misunderstood your previous comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mpenkov thanks for your patience. I amended the commit.
@menshikh-iv LGTM, please merge when you're ready. |
pathlib.Path
. Fix #170
Thanks @clintval 👍, good work! |
This closes #170. First time contributing so please let me know if this needs fixing! Thanks!