Display all metadata in debug log level #155
Comments
Happy to.
Yes
python-scraperlib is a library, so there is no such thing as a command-line flag. We could add an argument, e.g. to Creator, but logging only the first 100 bytes makes little sense to me: it has little value. It might only be used to check the MIME type, in which case I would rather log only the illustration's MIME type (python-scraperlib already has everything needed to detect it). Logging nothing is not a big win, and I don't expect scrapers to want that alternative (one extra small log line usually does not matter). And logging everything, if optional, is better done directly in the scraper than in python-scraperlib. To sum up: I propose to log all raw metadata, except for the illustration, where we log only its MIME type, and to add no new argument to any function. WDYT?
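To make the proposal above concrete, here is a minimal sketch using only the standard library. It is not python-scraperlib code: `guess_mime`, `describe_metadata`, and `log_metadata` are hypothetical names, the magic-number table is deliberately incomplete, and treating any metadata whose name starts with `Illustration_` as the illustration is an assumption for this example. The real library would presumably reuse its own MIME detection.

```python
import logging

logger = logging.getLogger("metadata-sketch")  # hypothetical logger name

# Magic-number prefixes for a few common image formats (not exhaustive).
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def guess_mime(data: bytes) -> str:
    """Guess a MIME type from leading bytes; fall back to a generic type."""
    for prefix, mime in _SIGNATURES.items():
        if data.startswith(prefix):
            return mime
    return "application/octet-stream"


def describe_metadata(name: str, value) -> str:
    """Build the log line: raw value, or only the MIME type for illustrations."""
    # Assumption: illustration metadata names start with "Illustration_".
    if name.startswith("Illustration_") and isinstance(value, (bytes, bytearray)):
        return f"Metadata: {name} is a {guess_mime(bytes(value))} illustration"
    return f"Metadata: {name} = {value!r}"


def log_metadata(metadata: dict) -> None:
    """Log every metadata entry at DEBUG level, with no extra argument."""
    for name, value in metadata.items():
        logger.debug(describe_metadata(name, value))
```

With this shape, the illustration costs one short log line regardless of its size, and scrapers that want the full base64 value can still log it themselves.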
As discussed in openzim/warc2zim#123, we would benefit from logging the metadata that is used, at least all text values.
Regarding the illustration, do we want to log the base64 value? It might be useful for debugging as well, but its size in the log is not always negligible.
I recommend doing it right at the beginning of the `start` method, before the check for mandatory metadata and before any validation, so that it is always logged.

@rgaudin @kelson42 WDYT?
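The ordering suggested here (log first, validate afterwards) could look roughly like the sketch below. `SketchCreator` is a hypothetical stand-in and not the real `Creator` class; the `MANDATORY` tuple and the error message are assumptions made only to illustrate that the metadata is logged even when validation then fails.

```python
import logging

logger = logging.getLogger("creator-sketch")  # hypothetical logger name

# Hypothetical subset of mandatory metadata names, for illustration only.
MANDATORY = ("Title", "Language")


class SketchCreator:
    """Hypothetical stand-in showing only the order of operations in start()."""

    def __init__(self, **metadata):
        self.metadata = metadata
        self.started = False

    def start(self):
        # 1. Log first, so the values are visible even if a check below fails.
        for name, value in self.metadata.items():
            if isinstance(value, str):
                logger.debug("Metadata: %s = %r", name, value)
            else:
                logger.debug(
                    "Metadata: %s is non-text (%s)", name, type(value).__name__
                )

        # 2. Only then check that mandatory metadata is present.
        missing = [name for name in MANDATORY if name not in self.metadata]
        if missing:
            raise ValueError(f"Missing mandatory metadata: {', '.join(missing)}")

        self.started = True
        return self
```

A call such as `SketchCreator(Title="x").start()` raises `ValueError` for the missing `Language`, but the `Title` value has already been written to the debug log by then.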
@richterdavid can you confirm you want to implement this issue? Please wait a little for the discussion to settle here before rushing into any implementation; we need to confirm everyone is aligned.