Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hotfixing urllib2 + bwb API #2779

Merged
merged 1 commit into from
Dec 27, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 4 additions & 5 deletions openlibrary/core/vendors.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
import web
import urllib2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed and the exception handling for the referenced urllib2.HTTPError reviewed and updated as necessary.

import simplejson
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm suspicious of the simplejson reference. Does the API really return data XML encoded and errors JSON encoded?

If the data is actually returned as XML, why are we "parsing" it with regexs instead of an XML parser?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mekarpeles I've been finding urllib2 references cause problems and can be replaced completely by requests (as it looks like you have done below.

The simplejson import is generally required with urllib2 to parse the response, but that is not needed with requests as it has a built in json() -- I think replacing urllib2 + simplejson with requests everywhere in the code is something we should aim for, it helps with Python3 and just makes network requests so much easier. Not sure it warrants a single push effort, but I think making it known that it's a common goal may help refactoring + moving forward with adding new features.

The two imports on this file and refactor should be removed.

import requests
from amazon.api import SearchException
from infogami import config
from infogami.utils.view import public
Expand All @@ -13,7 +14,7 @@
from openlibrary import accounts


BETTERWORLDBOOKS_API_URL = 'http://products.betterworldbooks.com/service.aspx?ItemId='
BETTERWORLDBOOKS_API_URL = 'https://products.betterworldbooks.com/service.aspx?ItemId='
AMAZON_FULL_DATE_RE = re.compile('\d{4}-\d\d-\d\d')
ISBD_UNIT_PUNCT = ' : ' # ISBD cataloging title-unit separator punctuation

Expand Down Expand Up @@ -291,10 +292,8 @@ def _get_betterworldbooks_metadata(isbn):

url = BETTERWORLDBOOKS_API_URL + isbn
try:
req = urllib2.Request(url)
f = urllib2.urlopen(req)
response = f.read()
f.close()
r = requests.get(url)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Requests doesn't throw an exception like urllib2 does, so it needs explicit status checking.

I've taken a crack at repairing it in #2894, but this was flagged as an issue over 3 weeks ago, so I would have hoped to not have to take it on myself (as part of the Python 3 work).

response = r.content
product_url = re.findall("<DetailURLPage>\$(.+)</DetailURLPage>", response)
new_qty = re.findall("<TotalNew>([0-9]+)</TotalNew>", response)
new_price = re.findall("<LowestNewPrice>\$([0-9.]+)</LowestNewPrice>", response)
Expand Down