tail nodes not covered #34

ariejdl · 2018-11-20T19:55:38Z

When parsing a node, the 'tail' text is lost, e.g. 'text 2' in <a><span>text</span>text 2</a>. It would be nice if the library used the .tail attribute of lxml nodes. By the way, because this feature is missing, I moved to the xmltodict library.
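To illustrate what "tail" means here, the following is a minimal sketch using the standard library's `xml.etree.ElementTree`, which exposes the same `.text` and `.tail` attributes as lxml elements. The helper name `iter_text` is hypothetical and is not part of this library or of lxml; it only shows where the lost string is stored.

```python
import xml.etree.ElementTree as ET


def iter_text(elem):
    """Yield (owner_tag, kind, text) for each text and tail string under elem.

    Hypothetical helper for illustration only.
    """
    # Text that appears directly after the element's opening tag
    if elem.text:
        yield (elem.tag, "text", elem.text)
    for child in elem:
        yield from iter_text(child)
        # Text that follows the child's closing tag is stored on the child as .tail
        if child.tail:
            yield (child.tag, "tail", child.tail)


root = ET.fromstring("<a><span>text</span>text 2</a>")
segments = list(iter_text(root))
for owner, kind, value in segments:
    print(owner, kind, repr(value))
```

Note that 'text 2' is attached to the `span` element as its tail, not to `a`, so a converter that reads only `.text` never sees it.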


sanand0 · 2018-11-21T17:21:59Z

@ariejdl -- you're right. Tail nodes are ignored.

This library focuses on converting XML to JSON (and vice versa) in line with standard conventions. I definitely see how preserving the tail (or head) would help. But there may be many ways of doing it, and I'd rather latch on to some standard that people have already defined.

Since you're exploring this space, have you seen others convert XML (or HTML) data of this kind into a JSON-like structure preserving the head or tail? How do they do it, please?

sanand0 · 2018-11-22T15:13:05Z

I just realized that this is a duplicate of #14 -- closing this thread. We can continue the conversation on #14 @ariejdl

sanand0 closed this as completed Nov 22, 2018
