Fragment / Document Question #144
Hi!
We're discussing this in #132. It's still debatable, but you can upvote the feature there and add your scenario as an argument.
As far as I remember, this is not documented explicitly. Regarding fragment parsing, if you're parsing without a context element, Following tags will always be implicitly added by full document parsing mode:
I hope I didn't forget anything. |
@inikulin perfect, thank you for the quick and thorough response! Just pitched in at the linked issue. Would be happy to help out as well if someone is willing to hand-hold me a bit at the beginning, just bc this is a large and unfamiliar code base. Also about what you were saying with the |
Yeah, you can pass
I need some time to figure out how it will actually be done (most likely it will be a separate package on top of parse5, but we need to expose some API first). Unfortunately, I'm extremely busy right now and have stepped away from parse5 development for a while. I hope to be back in late August, but meanwhile maybe @RReverser could help? |
Hi, just following up quickly, @RReverser would you be able to help a little? This issue is time sensitive for me, but I am willing to put time into helping 😁 |
@jescalan I'll try to release a new parse5 version on Thursday that includes some great updates to our SAXParser made by @RReverser, and we will try to build a basic solution on top of it. |
@inikulin would be amazing. even if it's just a patch for now that's ok 😁 I'm looking through the code now and there's really quite a lot to navigate. I'll keep trying though! |
Wait I'm messing with the |
@jescalan Yeah, the new release will not bring any API changes, but it contains some important fixes for the SAXParser. Anyway, you can already start prototyping. The idea is quite simple: maintain your own open element stack, on |
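The open element stack idea can be sketched without touching parse5 at all. The snippet below is only an illustration under stated assumptions: it assumes a SAX-style parser hands over `startTag`, `endTag`, and `text` events, and that the rule is "push on start tag, pop on the matching end tag" (the comment above is cut off, so that rule is a guess). The event shape, `buildTree`, and the node fields are hypothetical names, not parse5's API.

```javascript
// Hypothetical event shape: { type: 'startTag' | 'endTag' | 'text', name, attrs, text }.
// These names are assumptions for illustration, not parse5's SAXParser API.
function buildTree(events) {
  const root = { name: '#root', attrs: {}, children: [] };
  const stack = [root]; // open element stack; the last entry is the current parent

  for (const ev of events) {
    const parent = stack[stack.length - 1];
    if (ev.type === 'startTag') {
      const node = { name: ev.name, attrs: ev.attrs || {}, children: [] };
      parent.children.push(node);
      stack.push(node); // the element stays open until its end tag arrives
    } else if (ev.type === 'endTag') {
      // Pop back to the nearest open element with this name; ignore stray end tags.
      let i = stack.length - 1;
      while (i > 0 && stack[i].name !== ev.name) i--;
      if (i > 0) stack.length = i;
    } else if (ev.type === 'text') {
      parent.children.push({ name: '#text', text: ev.text });
    }
  }
  return root;
}

// Events for: <head><title>Hi</title></head>
const tree = buildTree([
  { type: 'startTag', name: 'head' },
  { type: 'startTag', name: 'title' },
  { type: 'text', text: 'Hi' },
  { type: 'endTag', name: 'title' },
  { type: 'endTag', name: 'head' },
]);
```

Nothing here adds or strips `html`, `head`, or `body`: the tree contains exactly the tags that appeared in the input. Elements that never receive an end tag would stay open in this sketch, so a real version has to decide which start tags should not be pushed.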
@inikulin Great, I have this mostly built out and it's working pretty well 🎉 Will post here when it's entirely finished. I'm running into an issue with self-closing tags though. It seems like it will only detect them if using the closing slash like EDIT: It's only doing this when I don't have a doctype set in the same fragment. This is still an issue for me though, as it's possible that I'll need to parse a fragment which doesn't explicitly contain a doctype. Is there a way to set the doctype manually? I don't see one in the docs... |
@jescalan It shouldn't be related to doctype. Regarding self-closing tags: https://github.com/inikulin/parse5/wiki/Documentation#q-im-parsing-img-srcfoo-with-the-saxparser-and-i-expect-the-selfclosing-flag-to-be-true-for-the-img-tag-but-its-not-is-there-something-wrong-with-the-parser - you need to check against the list of void elements, as I mentioned in the comment above. |
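For anyone landing here later, the void element check can be a very small helper. This is a sketch, assuming the start tag callback gives you the tag name and the parser's `selfClosing` flag; the helper name `leavesElementOpen` is made up for this example, and the list follows the HTML standard's void elements (extend it if you need legacy names such as `param`).

```javascript
// Void elements per the HTML standard: they never have an end tag or children.
const VOID_ELEMENTS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Hypothetical helper: should this start tag be pushed onto the open element stack?
function leavesElementOpen(tagName, selfClosing) {
  if (selfClosing) return false; // explicit trailing slash, e.g. <img src="foo"/>
  return !VOID_ELEMENTS.has(tagName.toLowerCase());
}
```

Honoring the `selfClosing` flag for every tag is a design choice in this sketch. In HTML the trailing slash has no effect on non-void elements, so drop the first check if you want `<div/>` to behave like `<div>`.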
@inikulin ah, i didn't really get what you meant with the void elements at first, now it makes a lot more sense. thanks for clearing that up! |
@inikulin Ok working well with the void tags 😄 One more question -- I have a test in here to see if it will parse plain text that's not inside any tags, and I'm not getting anything back from the parser on this one. Does the SAXParser parse plain text, or does it need to be contained inside a tag? |
Hmm, it should parse plain text as is: https://tonicdev.com/57a10ab6594ef21300a7a1ad/57a11834d2ab3913009ee831 |
Working now with your method of pushing a string into the stream. I was using a different way that was not, for some reason 👍 |
@inikulin will SAXParser be on par with the regular parser in terms of "back-checking" DOM corrections?
<html>
<body>
<div><body class="addition"></body></div>
</body>
</html> |
@stevenvachon It will not perform any tree structure correction, that's the point of this whole thread. |
Just as a wrap-up, did end up getting this working in the end, thanks to the brilliant @inikulin's help. Result can be seen here: https://github.com/reshape/parser 🎉 |
@jescalan have you considered link rel=import instead of a custom include element? http://webcomponents.org/articles/introduction-to-html-imports/ |
@thisconnect absolutely, but you need to be using http/2 (preferably with server push) in order for that to make sense, and not everyone has fully made that transition yet. As soon as http/2 with push becomes more standard, the include element will probably be used much less often, if ever. |
@thisconnect @jescalan Guys, I have a feeling that this discussion doesn't belong in parse5. Could you please choose another medium to continue your conversation, so as not to spam those who are watching this repo? |
Sure sorry |
Hi there! Thanks so much for making this fantastic library first of all 💖
So I have a use case where I am wrapping it around a library that essentially allows partials/includes, so you could do something like this:
And let's say for example that head.html was:
I'm trying to figure out how I can get parse5 to be able to handle this situation. Using the html fragment parse appears to remove head, body, and html tags, but using a normal document parse adds in a bunch of extra tags (doctype, head, body) that are not really necessary for this situation (although I do understand why they are added).
Is it possible to use parse5 for a task like this? Is there some type of parse mode that won't alter the tags, or a way for me to get the fragment parse not to strip tags? Also is there documentation anywhere on which tags are stripped by fragment parse mode, and/or added by full document mode?