tail nodes not covered #34

Closed
ariejdl opened this issue Nov 20, 2018 · 2 comments

Comments


ariejdl commented Nov 20, 2018

When parsing a node, the 'tail' text is lost, e.g. 'text 2' in <a><span>text</span>text 2</a>. It would be nice if the library used the .tail attribute of lxml nodes. By the way, because this feature is missing, I moved to the xmltodict library.
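
To make the report concrete, here is a minimal sketch of where the tail text lives for the example above. It uses the standard library's xml.etree.ElementTree rather than lxml (an assumption made only so the snippet runs without extra dependencies); both expose .text and .tail on elements in the same way.

```python
import xml.etree.ElementTree as ET

# The markup from the report: "text 2" follows the closing </span> tag.
root = ET.fromstring("<a><span>text</span>text 2</a>")
span = root[0]

# Text inside <span> is stored on span.text.
span_text = span.text

# Text after </span> but still inside <a> is stored on span.tail,
# not on the parent element.
span_tail = span.tail

# <a> has no text before its first child, so root.text is None.
root_text = root.text

print(repr(span_text), repr(span_tail), repr(root_text))
```

A converter that reads only .text from each element never visits span.tail, which is how 'text 2' ends up dropped.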

Owner

sanand0 commented Nov 21, 2018

@ariejdl -- you're right. Tail nodes are ignored.

This library focuses on converting XML to JSON (and vice versa) in line with standard conventions. I definitely see how preserving the tail (or head) would help. But there may be many ways of doing it, and I'd rather latch on to some standard that people have already defined.

Since you're exploring this space, have you seen others convert XML (or HTML) data of this kind into a JSON-like structure preserving the head or tail? How do they do it, please?
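
As one illustration of the "many ways" mentioned above, the sketch below keeps head and tail text by storing each element's content as an ordered list of strings and nested dicts. This is a purely hypothetical convention, not one taken from this library, from xmltodict, or from any published standard; the function name to_mixed and the "tag"/"content" keys are made up for the example.

```python
import xml.etree.ElementTree as ET


def to_mixed(elem):
    """Hypothetical convention: content is an ordered list mixing
    plain strings (head/tail text) and dicts (child elements)."""
    content = []
    if elem.text:
        # Head text: appears before the first child element.
        content.append(elem.text)
    for child in elem:
        content.append(to_mixed(child))
        if child.tail:
            # Tail text: appears right after this child's closing tag.
            content.append(child.tail)
    return {"tag": elem.tag, "content": content}


result = to_mixed(ET.fromstring("<a><span>text</span>text 2</a>"))
print(result)
```

An ordered list preserves document order, at the cost of a more verbose structure than the key-per-child layouts that the common XML-to-JSON conventions use.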

Owner

sanand0 commented Nov 22, 2018

I just realized that this is a duplicate of #14 -- closing this thread. We can continue the conversation on #14 @ariejdl

sanand0 closed this as completed Nov 22, 2018