Add http/https schema correctly #242
Hello @gilv, thanks for the PR.
Please also add a test for it.
smart_open/smart_open_lib.py
Outdated
@@ -615,3 +615,8 @@ def _encoding_wrapper(fileobj, mode, encoding=None, errors=DEFAULT_ERRORS):
    else:
        decoder = codecs.getwriter(encoding)
    return decoder(fileobj, errors=errors)


def _add_sheme_to_host(host):
    if (host.startswith('http')):
Unnecessary outer parentheses, please remove them.
Also, if the host is httpsomething.com, what will happen? You probably need to check for "http://" instead.
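A minimal sketch of the stricter check suggested here. The body of `_add_sheme_to_host` is not shown in this conversation, so the function below is a hypothetical stand-in (including its name and the `default_scheme` parameter), not the PR's actual code. It matches on the full `http://` or `https://` prefix so that a host such as `httpsomething.com` still gets a scheme prepended.

```python
def add_scheme_to_host(host, default_scheme="http"):
    """Hypothetical helper: prepend a scheme only when none is present.

    Matching on the full "http://" / "https://" prefix (rather than just
    "http") avoids treating a host like "httpsomething.com" as if it
    already carried a scheme.
    """
    # str.startswith accepts a tuple of prefixes to test.
    if host.startswith(("http://", "https://")):
        return host
    return default_scheme + "://" + host
```

With this check, `add_scheme_to_host('httpsomething.com')` is prefixed with `http://`, while `add_scheme_to_host('https://a.com/b')` is returned unchanged.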
@menshikh-iv added tests and addressed the comments. Thanks.
integration-tests/test_general.py
Outdated
class Test(unittest.TestCase):

    def test_host_name(self):
This is not a suitable place for the test because it isn't an integration test. Please add your test to the existing files in smart_open/tests/.
Looks good, needs some minor changes.
smart_open/smart_open_lib.py
Outdated
@@ -615,3 +615,8 @@ def _encoding_wrapper(fileobj, mode, encoding=None, errors=DEFAULT_ERRORS):
    else:
        decoder = codecs.getwriter(encoding)
    return decoder(fileobj, errors=errors)


def _add_sheme_to_host(host):
s/sheme/scheme
        host = 'http://a.com/b'
        expected = 'http://a.com/b'
        self.assertTrue(expected == smart_open_lib._add_sheme_to_host(host))
        host = 'a.com/b'
Each of these should be a separate test case. Putting multiple checks into a single test case is bad practice, because one failure can mask subsequent ones.
smart_open/tests/test_smart_open.py
Outdated
    def test_host_name(self):
        host = 'http://a.com/b'
        expected = 'http://a.com/b'
        self.assertTrue(expected == smart_open_lib._add_sheme_to_host(host))
It's better to store the result of the function in a separate variable and use .assertEqual(expected, actual).
That way, if the assert fails, the test runner will show you both the expected and actual values, and how they differ. That will save you time debugging.
If you just use .assertTrue, the test runner will just show you when that value is False, and you'll have to debug your code to work out how exactly the comparison failed.
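The two suggestions above (one check per test case, and `assertEqual` on separate `expected` and `actual` variables) could look roughly like the sketch below. `add_scheme_to_host` is a hypothetical stand-in defined here only to keep the example self-contained, and the expected value for the bare host assumes the default scheme is `http://`, as the PR description implies.

```python
import unittest


def add_scheme_to_host(host):
    # Hypothetical stand-in for the function under review; its real body
    # is not shown in this conversation.
    if host.startswith(("http://", "https://")):
        return host
    return "http://" + host


class AddSchemeToHostTest(unittest.TestCase):
    # Each check lives in its own test method, so a failure in one
    # cannot mask a failure in another.

    def test_host_with_scheme_is_unchanged(self):
        expected = "http://a.com/b"
        actual = add_scheme_to_host("http://a.com/b")
        # On failure, assertEqual reports both values and how they differ.
        self.assertEqual(expected, actual)

    def test_bare_host_gets_scheme(self):
        expected = "http://a.com/b"  # assumes "http://" is the default
        actual = add_scheme_to_host("a.com/b")
        self.assertEqual(expected, actual)
```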
@menshikh-iv I guess we can merge after the minor issues are fixed. I think it may be simpler to deal with the kwargs merge first, because the two will (may?) conflict.
@menshikh-iv Done
@mpenkov @menshikh-iv why isn't it merged yet?
Sometimes the host provided to the method read_csv() may already start with http or https.
However, the existing code always prepends "http://". If http or https was already part of the host URL, this produces an invalid host such as http://http:// or http://https://