Not able to scrape anything earlier than 2024 #1135
Comments
Thank you for this issue @maxkostas. Please try to use the [...]. I am running it with this compose file:
services:
  docudigger:
    container_name: Docudigger
    networks:
      - homelab
    environment:
      - TZ=Europe/Berlin
      - AMAZON_USERNAME=
      - AMAZON_PASSWORD=
      - AMAZON_TLD=de
      - AMAZON_YEAR_FILTER=2024
      - AMAZON_PAGE_FILTER=1
      - ONLY_NEW=true
    labels:
      - net.unraid.docker.managed=dockerman
    volumes:
      - /mnt/user/appdata/paperless-ngx/consume/:/home/node/docudigger/data:rw
      - /mnt/user/appdata/docudigger/logs:/home/node/docudigger/logs:rw
    image: "ghcr.io/disane87/docudigger:dev"
networks:
  homelab:
    external: true
    name: br3.11
If that isn't working correctly, please use one of the
Hello! Thank you very much for the quick response, I have just tried it with the following commands:
However this is the output:
Do you have
Hello! I just tried it and deleted the process.json. It seems that it disables OnlyNew?
and in the end this is the output:
Thank you again for the quick help! I will donate to you!
I have also tried on a different machine. I think a change to the year selector on Amazon is causing the issue?
Yes, it seems so. I get that error too. I will investigate it.
Great to hear, thank you very much! If there is any way I can assist, please let me know!
I believe I’ve identified the issue. It appears to be a race condition combined with some rate limiting on Amazon's end. I now wait for the [...]. Because the gathering is debounced, fetching the invoices as PDFs from Amazon can be a bit slow, but that should only affect full runs. I will create a new dev version, which you should check to see whether it fixes your problems.
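For readers unfamiliar with the idea, here is a minimal TypeScript sketch of the general technique of avoiding parallel requests by processing items strictly one after another with a pause in between. It is an illustration under stated assumptions, not the actual docudigger code: `sleep`, `processSequentially`, and `delayMs` are hypothetical names, and the real fix may work differently.

```typescript
// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: runs `task` for each item strictly in order,
// pausing `delayMs` between items so that no two tasks overlap.
async function processSequentially<T, R>(
  items: readonly T[],
  task: (item: T) => Promise<R>,
  delayMs: number,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i++) {
    // Await each task before starting the next one.
    results.push(await task(items[i]));
    // Pause between items, but not after the last one.
    if (i < items.length - 1) {
      await sleep(delayMs);
    }
  }
  return results;
}
```

The trade-off is the one mentioned above: each pause adds to the total run time, so a full run over many orders gets noticeably slower.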
Brilliant! Once the new DEV version is out I will test it right away and give further feedback! Thank you again for your quick reactions to this issue!
Please check out the latest dev version. Since you run this on Windows, it should work; my tests ran properly. On Linux servers the tooling throws some random exceptions that I need to investigate. I guess I have to update Puppeteer to the latest version, but that will break some other stuff. So hopefully it's working on your end by now, so I can focus on getting Linux servers ready.
Hello! Great work! I managed to run it and was able to download everything I needed! I am running all of this on Windows. I have just tried DEV8, but I get this issue when running docker compose up:
I am not an expert, but would it be possible to include the Chrome files in the image itself? This is the docker file that I am using:
Glad the NPM version works for you. It seems something broke the Docker image. It's pretty strange, because the debug Docker image works flawlessly. But I'm working on it. I'm going to close this, as the main issue here seems to be resolved.
Hello!
First of all great application!
I managed to get it running and was able to successfully download my invoices for orders from 2024.
However I am not able to download anything older than that.
I am getting this error:
I also tried other years, and the outcome is the same for every year except the current one.
What I would like to do is download ALL invoices, no matter the year. I tried disabling the year value, but that did not work. Is there maybe another option to have docudigger just download everything?
This is what I am using:
Thank you in advance and all the best!