Bug/Crash #22
Comments
Subsequent run(s) (that actually worked) got the following error... command:
-----> Worker: 1 - Delay: 1 seconds
-----> Attempt: [1/1] Snapshot [3880/670904] - Worker: 1
Actually, upon inspection, it appears that both runs ended at the same spot; the files are identical. |
Hm, ssl.SSLZeroReturnError doesn't look like a problem within the code... Can you get the exact snapshot URL that causes this error, so I can investigate a bit? Also, maybe try updating to 1.5.0 via pip. |
Ok, looked to see if I can find the URL but it's not in the csv nor the window. Also why does it download the files twice? In the CSV it shows that each file is downloaded twice with a status of 200 for OK.... And will try the newest version that you just put out. |
I also just noticed that the delay function isn't applied to the failed 404/301 URLs; it appears to work only for the 200-status ones.
Not Working Delay:
-----> Attempt: [1/1] Snapshot [801/670904] - Worker: 1
-----> Attempt: [1/1] Snapshot [802/670904] - Worker: 1
Working Delay:
-----> Attempt: [1/1] Snapshot [1137/670904] - Worker: 1
-----> Worker: 1 - Delay: 1 seconds
-----> Attempt: [1/1] Snapshot [1138/670904] - Worker: 1
-----> Worker: 1 - Delay: 1 seconds |
to the delay. Currently the logic is that there is a 15-second timeout before a retry anyway; that's why I left the delay only for successful downloads. Do you think it would be better to apply it to any status? For the duplicate downloads: check the CDX response manually: so for timestamp 19980123002752 there are 2 digests (the archive thinks both are not the same); for timestamp 19970101083806, however, the digests are the same. So this seems to be a problem with the CDX response. Funnily, the param |
So I added a filter: if snapshots share the same TIMESTAMP & URL, the duplicates are removed. However, I don't know why the CDX server responds with duplicates... |
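For readers following along, a filter like the one described could look roughly like the sketch below. This is not the actual pywaybackup code: the function name, the dict keys (`timestamp`, `url`, `digest`) and the digest values are assumptions made up for illustration.

```python
def dedupe_snapshots(snapshots):
    """Keep the first snapshot for each (timestamp, url) pair, preserving order."""
    seen = set()
    unique = []
    for snap in snapshots:
        key = (snap["timestamp"], snap["url"])  # hypothetical field names
        if key in seen:
            continue  # same TIMESTAMP & URL already kept, drop the duplicate
        seen.add(key)
        unique.append(snap)
    return unique


# Placeholder rows; the digests are invented, only the timestamps come from the thread.
rows = [
    {"timestamp": "19970101083806", "url": "http://wuarchive.wustl.edu/pub/", "digest": "AAA"},
    {"timestamp": "19970101083806", "url": "http://wuarchive.wustl.edu/pub/", "digest": "AAA"},
    {"timestamp": "19980123002752", "url": "http://wuarchive.wustl.edu/pub/", "digest": "BBB"},
]
unique_rows = dedupe_snapshots(rows)
```

Note that keying on timestamp and URL alone also drops a second row whose digest differs, which matches the filter as described above.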
When I was watching it to get the URL that was failing for you earlier, it wasn't pausing the 15 seconds as there wasn't a timeout on those 301/404 codes, they were immediate responses and not timeouts. I didn't really mean to inform you about the duplicates, that just happened. Honestly didn't even know I had pasted that because I was showing the delay function. :) Thanks, will try the latest commit now. |
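To make the point about immediate 301/404 responses concrete, here is a minimal sketch of a loop that pauses after every response, whatever the status. It is only an illustration under assumptions: `download_with_delay`, `fetch` and the injectable `sleep` are hypothetical names, not part of waybackup.

```python
import time


def download_with_delay(urls, fetch, delay=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL and pause after every response."""
    results = []
    for url in urls:
        status = fetch(url)            # e.g. 200, 301 or 404
        results.append((url, status))
        sleep(delay)                   # applied to any status, not only 200
    return results


# Fake fetcher and recorded sleeps, so the example runs without network access.
fake_statuses = {"a": 200, "b": 404, "c": 301}
slept = []
results = download_with_delay(
    ["a", "b", "c"], fake_statuses.__getitem__, delay=1, sleep=slept.append
)
```

With this shape, a run of quick 404s is throttled the same way as a run of successful downloads.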
Also, added --debug to the command but it still didn't give the URL of the original problem noted initially about the SSL/TLS error EOF... Going to see if the latest commit happens to have corrected it. |
The |
Thanks, I also just updated the install however it isn't putting the waybackup.exe in the scripts folder like it used to. Is that something in your installer or something else? |
Sorry, I'm not on Windows, but when I was debugging on Windows I just created a virtual env and installed it inside that via pip. |
That's what I did/do, but for some reason it's not creating the waybackup.exe this time. Oh... I just found out it moved it from the AppData folder to the Program Files Scripts folder... strange |
Okay, running 1.5.1 I get this now:
PS C:\users\shawn\appdata\roaming\python\Python312\Scripts> ./waybackup.exe --csv -u http://wuarchive.wustl.edu/pub/ -o
No CSV-file or content found to load skipable URLs
Querying snapshots...
Updated python from 3.12.4 to 3.12.5 and it started working correctly so far. I think my internet may have been going slow as well for that error above. Will keep you updated when this last test finishes. |
Okay, here are the results: same TLS/SSL issue, but attached are the data files.
-----> Attempt: [1/1] Snapshot [1936/335537] - Worker: 1
-----> Worker: 1 - Delay: 1 seconds
-----> Attempt: [1/1] Snapshot [1937/335537] - Worker: 1
And the reconnect does not work? The exception http.client.IncompleteRead should be a subclass of the already caught http.client.HTTPException. I tried in a Windows VM and had no issues so far; downloading without any problems. |
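The subclass relationship mentioned here can be checked directly with the standard library. The retry wrapper below is only a sketch of how such exceptions might be caught and retried; `with_retry` and its parameters are hypothetical and do not reflect how waybackup actually reconnects.

```python
import http.client
import ssl
import time

# Both facts come from the standard library itself:
assert issubclass(http.client.IncompleteRead, http.client.HTTPException)
assert issubclass(ssl.SSLZeroReturnError, ssl.SSLError)


def with_retry(action, retries=3, wait=0.0, sleep=time.sleep):
    """Run action(); retry on HTTP, SSL and connection errors, then re-raise."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return action()
        except (http.client.HTTPException, ssl.SSLError, ConnectionError) as err:
            last_error = err
            if attempt < retries:
                sleep(wait)  # back off before the next attempt
    raise last_error


# Simulated flaky download: fails twice with IncompleteRead, then succeeds.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise http.client.IncompleteRead(b"")
    return "ok"


outcome = with_retry(flaky_download, retries=3)
```

If an exception still escapes a handler like this, it is raised somewhere outside the wrapped call, which would be worth checking first.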
I just restarted it on the test16 folder with the CSV; it started at file 1937, tried to download it, and it's doing the same thing. I disabled all AV just in case that was causing a connection issue. Not really sure what's going on. |
Strange. But okay, give me some time. I decided to redesign the whole however retry is not working as intended since I implemented the queue |
I patched dev. You could build it from dev and check whether your exception gets caught and retried properly. Still BETA, of course :) |
This issue is marked as stale because there was no activity for 30 days. |
This issue has been closed because there has been no activity for 14 days while it was marked as stale. |
Windows 11 OS
Just tried this and received the following error. Empty output directory.
./waybackup -d --csv -u http://wuarchive.wustl.edu/pub/ -o .\test12 -f --workers 1 --skip --delay 1
No CSV-file or content found to load skipable URLs
Querying snapshots...
---> wuarchive.wustl.edu/pub/*
!-- Exception: UNCAUGHT EXCEPTION
!-- File: ..............\Program Files\Python312\Lib\json\decoder.py
!-- Function: raw_decode
!-- Line: 355
!-- Segment: raise JSONDecodeError("Expecting value", s, err.value) from None
!-- Description: Expecting value: line 1 column 1 (char 0)
Exception log: .\test12\waybackup_error.log
Full traceback:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "waybackup.exe\__main__.py", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "..\site-packages\pywaybackup\main.py", line 22, in main
    archive.query_list(config.range, config.start, config.end, config.explicit, config.mode, config.cdxbackup, config.cdxinject)
  File "..\site-packages\pywaybackup\archive.py", line 158, in query_list
    cdxResult = json.loads(cdxResult)
                ^^^^^^^^^^^^^^^^^^^^^
  File "..............\Program Files\Python312\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "..............\Program Files\Python312\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "..............\Program Files\Python312\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
waybackup_error.log
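The message "Expecting value: line 1 column 1 (char 0)" is what `json.loads` raises when the input is empty or does not start with JSON at all (for example an HTML error page). A defensive parse might look like the sketch below; `parse_cdx_response` is a hypothetical helper, not the function in `archive.py`, and returning an empty list is just one possible way to handle the failure.

```python
import json


def parse_cdx_response(body):
    """Return the decoded CDX rows, or an empty list if the body is not JSON."""
    if not body or not body.strip():
        return []  # empty response, nothing to decode
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # Body was not JSON (e.g. an HTML error page); the caller can log and retry.
        return []


empty_rows = parse_cdx_response("")
html_rows = parse_cdx_response("<html>Service Unavailable</html>")
good_rows = parse_cdx_response('[["timestamp", "original"], ["19970101083806", "http://wuarchive.wustl.edu/pub/"]]')
```

A real fix would probably also report the HTTP status and the raw body, so that a slow or failing CDX server is distinguishable from a bug.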