Would it be possible to add position information (line and column) to text nodes, or at least make this information available to the tree builder? I implemented a very minimal proof of concept that attaches the position to each token and passes it along to the dom tree builder. It produces the following result:
import html5lib

html = '<div>&<p>b<span>c</span></p> cab</div>'
parser = html5lib.HTMLParser(tree=html5lib.getTreeBuilder("dom"))
doc = parser.parse(html)

def parse(n):
    for c in n.childNodes:
        # 'sourcepos' is the attribute added by the proof of concept
        if hasattr(c, 'sourcepos'):
            print(c.sourcepos, c)
        parse(c)

parse(doc)
None <DOM Element: head at 0x10bbed0d0>
None <DOM Element: body at 0x10bbed1f0>
(1, 5) <DOM Element: div at 0x10bbfb790>
(1, 10) <DOM Text node "'&'">
(1, 13) <DOM Element: p at 0x10bbfb820>
(1, 14) <DOM Text node "'b'">
(1, 20) <DOM Element: span at 0x10bbfb8b0>
(1, 21) <DOM Text node "'c'">
(1, 33) <DOM Text node "' '">
(1, 36) <DOM Text node "'cab'">
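The bookkeeping behind such positions is small. As a rough sketch only (this is not the proof of concept and not html5lib code; `line_starts` and `offset_to_pos` are hypothetical helper names, and the 1-based column convention is an assumption), a character offset in the input can be mapped to a (line, column) pair like this:

```python
from bisect import bisect_right


def line_starts(text):
    """Return the offsets at which each line of `text` begins."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def offset_to_pos(starts, offset):
    """Map a 0-based character offset to a 1-based (line, column) pair."""
    # Number of line starts at or before `offset` is the 1-based line number.
    line = bisect_right(starts, offset)
    column = offset - starts[line - 1] + 1
    return line, column


html = "<div>\n  <p>b</p>\n</div>"
starts = line_starts(html)
print(offset_to_pos(starts, html.index("b")))  # (2, 6)
```

A tokenizer that already tracks the current offset could call something like this when it emits a token, so the tree builder only has to copy the pair onto the node.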
I would be willing to implement it.