Should we apply head and tail from page content? #10

Open
benoit74 opened this issue Oct 1, 2024 · 2 comments

Contributor

benoit74 commented Oct 1, 2024

In addition to MathJax #9, we have other "things" in head and tail attributes of page content.

Do we need / want to include these?

benoit74 added the question label (Further information is requested) Oct 1, 2024
benoit74 added this to the 0.1 milestone Oct 1, 2024
benoit74 self-assigned this Oct 1, 2024
Member

rgaudin commented Oct 2, 2024

I haven't looked in detail, but I think that if these are attributes of some node and you have to ask the question, then there is no clear usage and thus they should not be included.
I understand this scraper is a standard scraper, i.e. we build up our ZIM with data we scraped, as opposed to zimit, where we pass data from the website to the ZIM with some transformation.

Contributor Author

benoit74 commented Oct 3, 2024

The intention so far is indeed to better understand what is in these attributes and whether there is a use case for them.
