Distributed Crawling...
Distributed Parsing...
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\PYTHON\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "D:\sublime\mofan爬虫教程\4-1-distributed-scraping.py", line 18, in parse
    page_urls = set([urljoin(base_url, url['href']) for url in urls]) # remove duplication
  File "D:\sublime\mofan爬虫教程\4-1-distributed-scraping.py", line 18, in <listcomp>
    page_urls = set([urljoin(base_url, url['href']) for url in urls]) # remove duplication
NameError: name 'base_url' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "4-1-distributed-scraping.py", line 49, in <module>
    results = [j.get() for j in parse_jobs] # parse html
  File "4-1-distributed-scraping.py", line 49, in <listcomp>
    results = [j.get() for j in parse_jobs] # parse html
  File "C:\PYTHON\lib\multiprocessing\pool.py", line 771, in get
    raise self._value
NameError: name 'base_url' is not defined
Repl Closed
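
A possible explanation, offered as a guess since the script itself is not shown here: on Windows, `multiprocessing` starts workers with the spawn method, so each worker re-imports the script and only sees names defined at module level. If `base_url` is assigned inside the `if __name__ == '__main__':` block, `parse` cannot find it in the worker. The sketch below avoids the global by passing `base_url` to `parse` as an argument. The regex-based `parse`, the helper `parse_all`, the sample pages and the `https://example.com/` URL are all hypothetical stand-ins, not the tutorial's code.

```python
import multiprocessing as mp
import re
from urllib.parse import urljoin


def parse(html, base_url):
    """Hypothetical stand-in for the script's parse(): collect absolute links.

    base_url is a parameter, so the worker process does not depend on a
    global that may only exist in the parent process.
    """
    hrefs = re.findall(r'href="([^"]+)"', html)
    return set(urljoin(base_url, href) for href in hrefs)  # remove duplication


def parse_all(pages, base_url, processes=2):
    """Parse every page in a worker pool, handing base_url to each job."""
    with mp.Pool(processes) as pool:
        parse_jobs = [pool.apply_async(parse, args=(html, base_url)) for html in pages]
        return [j.get() for j in parse_jobs]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    base_url = "https://example.com/"
    pages = [
        '<a href="/a/">A</a><a href="/a/">A again</a>',
        '<a href="/b/">B</a>',
    ]
    print(parse_all(pages, base_url))
```

Moving the `base_url = ...` assignment to the top of the script, outside the `__main__` guard, should also work under the same assumption, because spawned workers execute module-level code when they import the script.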