problem with html2creole AttributeError: 'NoneType' object has no attribute 'parent' #6
Comments
Sorry for the late response. You have to cut out the body content and pass that to html2creole(). Something like this:
import re
from creole import html2creole

body_re = re.compile(r'<body[^>]*>(.*?)</body>', re.S | re.I)
with open("/tmp/test.html", "r") as f:
    html = f.read()
# findall() would return a list, but html2creole() expects one string
match = body_re.search(html)
content = match.group(1) if match else html
creole = html2creole(content)
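The body-extraction step can also be tried on its own, without python-creole installed. The following is a minimal sketch using only the standard library `re` module; `extract_body` and `sample` are hypothetical names introduced here for illustration, not part of python-creole. It returns the inner HTML of the first body element as a single string and falls back to the unchanged input when no body tag is found.

```python
import re

# Same pattern as in the comment above: DOTALL so the body may span lines,
# IGNORECASE so <BODY> and <body> both match.
BODY_RE = re.compile(r'<body[^>]*>(.*?)</body>', re.S | re.I)


def extract_body(html):
    """Return the inner HTML of the first <body> element, or html unchanged."""
    match = BODY_RE.search(html)
    return match.group(1) if match else html


# Hypothetical sample input, not the HTML from the original report.
sample = (
    "<html><head><title>t</title></head>"
    "<BODY class='x'>\n<p>Hello</p>\n</BODY></html>"
)
print(extract_body(sample).strip())
```

The result of `extract_body()` is a plain string, so it can be handed to `html2creole()` directly.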
I found a bug related to "AttributeError: 'NoneType' object has no attribute 'parent'" and fixed it in 9e5b5dd. I created a new release, v1.0.2.
Dude, it works beautifully now. You are the man! Here is a full sample that works with the HTML from my first post, no need for the regex.
jedie added a commit to jedie/python-creole-old that referenced this issue on Apr 5, 2015
I'm getting an error when running the following code
In [53]: html2creole(unicode(fr,errors='ignore'))
AttributeError Traceback (most recent call last)
/tmp/ in ()
/usr/local/lib/python2.7/dist-packages/creole/__init__.pyc in html2creole(html_string, debug, parser_kwargs, emitter_kwargs, unknown_emit)
110 warnings.warn("parser_kwargs argument in html2creole would be removed in the future!", PendingDeprecationWarning)
111
--> 112 document_tree = parse_html(html_string, debug=debug)
113
114 emitter_kwargs2 = {
/usr/local/lib/python2.7/dist-packages/creole/__init__.pyc in parse_html(html_string, debug)
91
92 h2c = HtmlParser(debug=debug)
---> 93 document_tree = h2c.feed(html_string)
94 if debug:
95 h2c.debug()
/usr/local/lib/python2.7/dist-packages/creole/html_parser/parser.pyc in feed(self, raw_data)
157 # print("-"*79)
--> 159 HTMLParser2.feed(self, data)
160
161 return self.root
/usr/lib/python2.7/HTMLParser.pyc in feed(self, data)
107 """
108 self.rawdata = self.rawdata + data
--> 109 self.goahead(0)
110
111 def close(self):
/usr/lib/python2.7/HTMLParser.pyc in goahead(self, end)
151 k = self.parse_starttag(i)
152 elif startswith("</", i):
--> 153 k = self.parse_endtag(i)
154 elif startswith("<!--", i):
155 k = self.parse_comment(i)
/usr/local/lib/python2.7/dist-packages/creole/shared/html_parser.pyc in parse_endtag(self, i)
98 return j
99 # --- changed end -----------------------------------------------------
--> 100 self.handle_endtag(tag.lower())
101 self.clear_cdata_mode()
102 return j
/usr/local/lib/python2.7/dist-packages/creole/html_parser/parser.pyc in handle_endtag(self, tag)
255 self._go_up()
256 else:
--> 257 self.cur = self.cur.parent
258
259 #-------------------------------------------------------------------------
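The last frame fails on `self.cur = self.cur.parent`, which means `self.cur` was already None when an end tag was handled. One plausible way to get into that state is HTML with more closing tags than opening tags; this is an assumption, since the thread does not state the root cause. The sketch below reproduces the same error message with a toy tree builder on the Python 3 standard library. `Node`, `TreeParser` and `try_parse` are hypothetical names and are not python-creole's actual classes.

```python
from html.parser import HTMLParser


class Node:
    """Minimal tree node; the root has parent None."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent


class TreeParser(HTMLParser):
    """Toy tree builder for illustration only, NOT python-creole's parser."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.cur = self.root

    def handle_starttag(self, tag, attrs):
        # Descend into a new child node.
        self.cur = Node(tag, parent=self.cur)

    def handle_endtag(self, tag):
        # Same shape as the failing line in the traceback: no None check.
        self.cur = self.cur.parent


def try_parse(html):
    """Return the AttributeError message, or None if parsing succeeded."""
    parser = TreeParser()
    try:
        parser.feed(html)
        parser.close()
    except AttributeError as err:
        return str(err)
    return None


print(try_parse("<p>a</p>"))              # balanced input
print(try_parse("<p>a</p></div></div>"))  # two stray closing tags
```

In this toy version the first stray `</div>` moves `cur` from the root to None, and the second one raises the AttributeError. A guard such as checking `self.cur.parent is not None` before moving up would avoid it here; whether commit 9e5b5dd does something similar is not stated in the thread.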
Here's the actual html code (I don't know if I can attach files)