Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UI does not limit the buttons displayed to only requested formats #159

Closed
benoit74 opened this issue Jan 18, 2023 · 5 comments · Fixed by #163
Closed

UI does not limit the buttons displayed to only requested formats #159

benoit74 opened this issue Jan 18, 2023 · 5 comments · Fixed by #163

Comments

@benoit74
Copy link
Collaborator

benoit74 commented Jan 18, 2023

When we request at the CLI a list of formats (e.g epub + html) which is less than the list of formats of a given book (e.g. epub + html + pdf), only appropriate formats are exported in the ZIM (epub + html) but the UI shows a link to all existing formats (e.g. epub + html + pdf) the book supports (in the DB). There is hence broken links since we do not requested this additional format. Code seems to ignore this possibility.

Here's a sample command to use:

python gutenberg2zim -z gut_fr_test.zim -b 18812 -l fr -f epub,html

Looks like the issue is present in many places: cover article (gutenbergtozim/templates/cover_article.html), HTML version of a book (gutenbergtozim/templates/book_infobox.html) and bookshelves (2 times in gutenbergtozim/templates/js/tools.js).

@benoit74
Copy link
Collaborator Author

@rgaudin @kelson42
It looks like this is indeed a regression induced by my last PR. I will look into it today.

@benoit74 benoit74 self-assigned this Jan 19, 2023
@benoit74
Copy link
Collaborator Author

I confirmed (again) that the issue was already present before my PR in v1.1.9
As far as I can tell for now, the bug is not present anymore if I comment few strange lines forcing the PDF format for all books.

@rgaudin
Copy link
Member

rgaudin commented Jan 19, 2023

Can you add a condition there to only do it if pdf was requested?

@benoit74
Copy link
Collaborator Author

benoit74 commented Jan 19, 2023

I just confirmed this bug has been made more visible by the lines / commit mentioned in my former comment. This change is forcing the presence of a PDF format for all books. It is mentioned in the commit / code this was done because some RDFs where not mentioning the PDF as an available format even if the file was available.

Adding a condition to only do it if pdf was requested is not really an option because it means that when pdf is requested, all books which do not have a pdf will be have broken links in the ZIM.

Rather than adding such condition, I would prefer to remove completely these lines. I can confirm this is feasible while working on #97.

As suggested in this issue 97, I prefer to ask @eshellman / PG to fix the issue upstream (if any is still there) rather than maintaining oddities like this which in the end degrade the overall scraper quality / makes it more complex to understand / maintain. I can definitely understand this is easier to say and to do today than years ago.

There is nevertheless still a "UI" bug because all formats present in DB are displayed on the various pages, no matter the restriction of format specified on the CLI. If you request only HTML for instance, you still get a EPUB button on all books with a potential EPUB format.

Finally, one last question: is there really a reason to force the presence of HTML format (see here)? Especially once #95 has been solved

@benoit74
Copy link
Collaborator Author

I created two separate issues for clarity, let's focus only the UI bug on this one (my bad for having caused some confusion maybe):

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants