Script to fetch receipt images #35
New dataset with companies information
'errors': list(),
'skipped': list()
}
for receipt in self.receipts:
I think downloading the receipts doesn't belong in the constructor of the class. Maybe it's a good idea to move it into a run method, which fetches the receipts, plus a print_report method, which prints the error messages if there are any.
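A minimal sketch of that split, under stated assumptions: the class name `ReceiptFetcher`, the injected `download` callable, and the `count` key are hypothetical and only illustrate the idea. The `errors` and `skipped` lists mirror the snippet under review; the rest of the class in this PR is not shown here and may look different.

```python
class ReceiptFetcher:
    """Hypothetical class: the constructor stores state, run() does the work."""

    def __init__(self, receipts, download):
        # No network or disk access here, only bookkeeping.
        self.receipts = list(receipts)
        # Assumed callable: takes a receipt, returns True if it was
        # downloaded and False if it was skipped; may raise on failure.
        self.download = download
        self.progress = {
            'count': 0,
            'errors': list(),
            'skipped': list()
        }

    def run(self):
        """Fetch every receipt, recording successes, skips and errors."""
        for receipt in self.receipts:
            try:
                if self.download(receipt):
                    self.progress['count'] += 1
                else:
                    self.progress['skipped'].append(receipt)
            except Exception as error:
                self.progress['errors'].append((receipt, str(error)))
        return self.progress

    def print_report(self):
        """Print a summary, followed by the error messages if there are any."""
        print('{} downloaded, {} skipped, {} errors'.format(
            self.progress['count'],
            len(self.progress['skipped']),
            len(self.progress['errors'])
        ))
        for receipt, message in self.progress['errors']:
            print('Error fetching {}: {}'.format(receipt, message))
```

With this shape, creating the object has no side effects, so it can be built and inspected in tests, and the caller decides when to call `run()` and whether to call `print_report()` afterwards.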
That sounds like a good idea, thanks!
Do you wanna step in and send a PR to this branch? These days my workload is focused on Issue #34. Otherwise I'll catch up here later.
Update README.md
Update README-en.md
Added crowdfunding at money support session
Update README.md
Done, @mtrovo! Way better now. Thanks for the advice.
Signed-off-by: Patrick José Pereira <patrickelectric@gmail.com>
Add missing apostrophe
Update CONTRIBUTING.md
Solve typos
Translation table
Since many files are generated based on previous timestamps, leaving these lines here generates filenames with multiple timestamps, which is not wanted.
Following convention proposed in CONTRIBUTING.
Prevent unnecessary downloads/uploads of datasets and documentation
fix typos
Change setup script to Python 3 shebang
…-de-amor into ec-fetch-receipts
This is a program to download receipts to a local directory (fix #33).
It will download the files to a different target than data/ because it tends to use a huge amount of disk space (maybe more than 1TB). That said, people might want to point it to an external volume (using the target positional argument) and/or to limit the number of receipts to be downloaded (using the --limit optional argument).
It uses a new external library (added to conda_requirements.txt) called humanize. You can manually install it with pip install humanize if you want.
Finally, I recommend that everyone who has a spare TB at home save the receipts there until we can raise money to afford a virtual drive for the project with that amount of free space.
Using it to download everything to /Volumes/Narnia/Serenata:
$ python src/fetch_receipts.py /Volumes/Narnia/Serenata
Using it to download 10,000 receipts to /Volumes/Narnia/Serenata:
$ python src/fetch_receipts.py --limit 10000 /Volumes/Narnia/Serenata
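For reference, a command-line interface like the one described above (a target positional argument and an optional --limit) could be parsed roughly as follows. This is only a sketch using the standard argparse module. The function name `parse_args`, the help texts, and the default of no limit are assumptions, not necessarily what src/fetch_receipts.py does.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments described in this PR (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description='Download receipts to a local directory.'
    )
    # Positional: where the receipts are saved, e.g. an external volume.
    parser.add_argument('target', help='directory to save the receipts in')
    # Optional: cap on how many receipts to download (assumed default: no cap).
    parser.add_argument('--limit', type=int, default=None,
                        help='maximum number of receipts to download')
    return parser.parse_args(argv)


# Same arguments as the second usage example above.
args = parse_args(['--limit', '10000', '/Volumes/Narnia/Serenata'])
print(args.target, args.limit)  # prints: /Volumes/Narnia/Serenata 10000
```

Passing `argv` explicitly keeps the parser easy to exercise without touching `sys.argv`; calling `parse_args()` with no arguments reads the real command line.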