Error trying to run update.yml to update link fields (only happens with one particular CSV) #360

Closed
dara2 opened this issue Nov 24, 2021 · 4 comments
Labels
limitation of dependency: A known limitation resulting from another library/application

dara2 commented Nov 24, 2021

We are trying to run an update task (update.yml) to update data in some custom Link fields on a Drupal 8 site (v. 8.9.20-dev). I have the most up-to-date versions of Workbench and its Python libraries, plus the most recent Workbench Integration module installed. We ran this successfully for another batch, updating the same Link fields for a different group of objects. For some reason, this batch produces the following error every time (the --check run completes fine):

(awesome) Born-Digitals-MacBook-Air:islandora_workbench Dara$ ./workbench --config FAP/digiupdate.yml
OK, connection to Drupal at https://francoamericandigitalarchives.org verified.
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1347, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 331, in begin
    self.headers = self.msg = parse_headers(self.fp)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 221, in parse_headers
    raise HTTPException("got more than %d headers" % _MAXHEADERS)
http.client.HTTPException: got more than 100 headers

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/util/retry.py", line 531, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/packages/six.py", line 734, in reraise
    raise value.with_traceback(tb)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3-1.26.2-py3.8.egg/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1347, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 331, in begin
    self.headers = self.msg = parse_headers(self.fp)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 221, in parse_headers
    raise HTTPException("got more than %d headers" % _MAXHEADERS)
urllib3.exceptions.ProtocolError: ('Connection aborted.', HTTPException('got more than 100 headers'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./workbench", line 1092, in <module>
    update()
  File "./workbench", line 429, in update
    if not ping_node(config, row['node_id']):
  File "/Users/Dara/islandora_workbench/workbench_utils.py", line 520, in ping_node
    response = issue_request(config, 'HEAD', url)
  File "/Users/Dara/islandora_workbench/workbench_utils.py", line 330, in issue_request
    response = requests.head(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/api.py", line 104, in head
    return request('head', url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/sessions.py", line 677, in send
    history = [resp for resp in gen]
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/sessions.py", line 677, in <listcomp>
    history = [resp for resp in gen]
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/sessions.py", line 237, in resolve_redirects
    resp = self.send(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/requests-2.25.1-py3.8.egg/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', HTTPException('got more than 100 headers'))
(awesome) Born-Digitals-MacBook-Air:islandora_workbench Dara$

Have you seen this before?

The update.yml file looks like:

task: update
validate_title_length: false
host: "https://francoamericandigitalarchives.org"
username: [my username]
password: [my password]
input_dir: FAP
input_csv: FADA_Siena_updateX.csv
allow_adding_terms: true

CSV is attached. (Note: This is a multilingual site, if that matters.)
FADA_Siena_updateX.csv

@dara2 dara2 changed the title from "Error trying to run update.yml to update linked fields (only happens with one particular CSV)" to "Error trying to run update.yml to update link fields (only happens with one particular CSV)" Nov 24, 2021
Owner

mjordan commented Nov 24, 2021

Looks like you're experiencing the problem described in Islandora/documentation#1764. The only thing we can do on the Workbench end is to try to raise the maximum number of response headers (the default looks like 100) that the HTTP client underneath the Requests library will accept. I'll investigate further tonight.
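For readers hitting the same error, here is a minimal sketch of the kind of workaround described above. It assumes the limit in question is the module-level `_MAXHEADERS` constant in Python's standard `http.client`, which is the name that appears in the traceback. The replacement value of 10000 and the helper names `build_raw_headers` and `can_parse` are illustrative assumptions, and this is not necessarily what the issue-360 branch does.

```python
import http.client
import io


def build_raw_headers(count):
    """Return raw bytes for `count` header lines plus the terminating blank line."""
    lines = [f"X-Demo-{i}: value\r\n" for i in range(count)]
    return "".join(lines).encode("iso-8859-1") + b"\r\n"


def can_parse(count):
    """Report whether http.client accepts a header block with `count` headers."""
    try:
        # parse_headers is the same helper that raises in the traceback.
        http.client.parse_headers(io.BytesIO(build_raw_headers(count)))
        return True
    except http.client.HTTPException:
        return False


# With the default limit, a 150-header response is rejected.
rejected_before = not can_parse(150)

# Raise the limit. _MAXHEADERS is a private constant, so this is a
# workaround that may stop working in a future Python release.
# 10000 is an assumed value chosen for illustration.
http.client._MAXHEADERS = 10000

# The same 150-header response now parses.
accepted_after = can_parse(150)
```

Because Requests and urllib3 read responses through `http.client`, setting the constant once before any request is issued should affect calls such as `requests.head()` in the same process. The underlying cause, a server sending that many headers, is still better fixed on the Drupal side where possible.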

mjordan added a commit that referenced this issue Nov 25, 2021
Owner

mjordan commented Nov 25, 2021

@dara2 can you check out the issue-360 branch to see if it works around this problem?

@mjordan mjordan added the "limitation of dependency" label (a known limitation resulting from another library/application) Nov 25, 2021
Author

dara2 commented Nov 29, 2021

I just ran the update in the issue-360 branch and it was successful. Yay!

Owner

mjordan commented Nov 29, 2021

Excellent, thanks for testing.

@mjordan mjordan mentioned this issue Nov 29, 2021
@mjordan mjordan closed this as completed Nov 29, 2021