Hierarchy request error #328

Closed
TrevTwells opened this issue Oct 7, 2011 · 7 comments
Comments

@TrevTwells

Hello, every time I try to access http://www.cracked.com using:

// Insert fetching the html body using the request module here.
request(options, function(error, response, body){
    if(!error && response.statusCode == 200){
        try{
            var window = jsdom.jsdom(body).createWindow();
            window.document.getElementsByTagName('Body')[0].innerHTML+="PENIS";
            var temp= window.document.innerHTML;
            console.log(temp);
        } catch(err) {
            console.log("ERROR "+err)
        }
    }
});

I get the error:

Error: Hierarchy request error

What is the cause of this?

@demian85

demian85 commented Oct 9, 2011

I have the same issue with other sites. I don't understand why there isn't any way to register an error event to prevent the exception from being thrown.

@TrevTwells
Author

I managed to fix it using the HTML5 node parser module in jsdom instead of the default.

var HTML5 = require('html5');
var window = jsdom.jsdom(body, null,{parser: HTML5}).createWindow();

@demian85

I'm sorry, but the html5 parser still does not work for me. I need something bulletproof, because I need to parse some very nasty HTML! At the very least I should be able to catch the error so my script does not die!

/home/demian/www/projects/test/node_modules/html5/lib/html5/tokenizer.js:62
                throw(e);
    ^
Error: Invalid character: Invalid character in tag name: =
    at Object.createElement (/usr/local/lib/node_modules/jsdom/lib/jsdom/level1/core.js:1188:13)
    at TreeBuilder.createElement (/home/demian/www/projects/test/node_modules/html5/lib/html5/treebuilder.js:29:25)
    at TreeBuilder.insert_element_normal (/home/demian/www/projects/test/node_modules/html5/lib/html5/treebuilder.js:61:21)
    at TreeBuilder.insert_element (/home/demian/www/projects/test/node_modules/html5/lib/html5/treebuilder.js:52:15)
    at Object.startTagOther (/home/demian/www/projects/test/node_modules/html5/lib/html5/parser/in_body_phase.js:483:12)
    at Object.processStartTag (/home/demian/www/projects/test/node_modules/html5/lib/html5/parser/phase.js:43:44)
    at EventEmitter.do_token (/home/demian/www/projects/test/node_modules/html5/lib/html5/parser.js:104:20)
    at EventEmitter.<anonymous> (/home/demian/www/projects/test/node_modules/html5/lib/html5/parser.js:125:30)
    at EventEmitter.emit (events.js:64:17)
    at EventEmitter.emitToken (/home/demian/www/projects/test/node_modules/html5/lib/html5/tokenizer.js:87:8)
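
For illustration, a minimal sketch of what "catching the error" could look like, assuming the parse call runs synchronously. The helper `tryParse` and the stand-in `fakeParse` are hypothetical names invented for this example and are not part of jsdom or html5. An exception thrown later from an asynchronous callback would not be caught by a wrapper like this.

```javascript
// Hypothetical helper: runs a synchronous parse function and reports
// failure as a value instead of letting the exception kill the script.
function tryParse(parseFn, html) {
    try {
        return { ok: true, result: parseFn(html), error: null };
    } catch (err) {
        return { ok: false, result: null, error: err };
    }
}

// Stand-in parser for illustration only; it is NOT jsdom or html5.
// It throws when a tag name contains "=", mimicking the message above.
function fakeParse(html) {
    if (/<[^>\s]*=/.test(html)) {
        throw new Error("Invalid character: Invalid character in tag name: =");
    }
    return { html: html };
}

var good = tryParse(fakeParse, "<p>hello</p>");
var bad = tryParse(fakeParse, "<a=b>broken</a=b>");

console.log(good.ok);            // the well-formed input parsed
console.log(bad.ok);             // the malformed input was reported, not thrown
console.log(bad.error.message);
```

In a real script, `fakeParse` would be replaced by a function that calls the actual parser, and the caller would check `ok` before touching `result`.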

@TrevTwells
Author

Oh, I forgot to mention that I stopped jsdom from executing scripts via options, if that helps.

@demian85

Using this... should work!

var doc = jsdom.jsdom(body, null,{
    parser: HTML5,
    FetchExternalResources   : [],
    ProcessExternalResources : false,
    MutationEvents           : false,
    QuerySelector            : true
});

@ghost

ghost commented Oct 12, 2011

I reported this issue; see #290. Inspired by this post, I also tried the html5 parser. Code snippet:

request(url, function (error, response, body) {
    if (error) {
        onError(message, error, socket);
    } else {
        var window = jsdom.jsdom(null, null, { parser : html5 }).createWindow();
        var parser = new html5.Parser({ document: window.document });
        parser.parse(body);
        jsdom.jQueryify(window, jQueryLib, function(window, jquery) {
            var $ = window.$;
            // Do work...
        });
    }
});

This pattern can be found on the html5 GitHub site. While I was able to pass in 'features' options (as detailed above), the jsdom.jQueryify function appears to overwrite the feature settings. See jsdom.js (lines 122-124):

122 window.document.implementation.addFeature('FetchExternalResources', ['script']);
123 window.document.implementation.addFeature('ProcessExternalResources', ['script']);
124 window.document.implementation.addFeature('MutationEvents', ["1.0"]);

Naturally, I wanted to set these to false to avoid script processing. My only option was to edit the code, setting all 3 features to false. It would be nice if the global defaultDocumentFeatures function worked as expected, or if the jQueryify function signature accepted options/features.
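
A rough sketch of the behavior being asked for, where caller-supplied features win over the hard-coded ones. Only `addFeature(name, value)` and the three feature names and values come from the lines quoted above. `mergeFeatures`, `applyFeatures`, and `fakeImplementation` are hypothetical names for this example, and the stand-in object is used in place of a real `window.document.implementation`.

```javascript
// Defaults copied from the three jsdom.js lines quoted above.
var JQUERYIFY_DEFAULTS = {
    FetchExternalResources: ['script'],
    ProcessExternalResources: ['script'],
    MutationEvents: ['1.0']
};

// Hypothetical helper: copies the defaults, then lets overrides replace them.
function mergeFeatures(defaults, overrides) {
    var merged = {};
    Object.keys(defaults).forEach(function (name) {
        merged[name] = defaults[name];
    });
    Object.keys(overrides || {}).forEach(function (name) {
        merged[name] = overrides[name];
    });
    return merged;
}

// Hypothetical helper: applies each feature through addFeature(name, value).
function applyFeatures(implementation, features) {
    Object.keys(features).forEach(function (name) {
        implementation.addFeature(name, features[name]);
    });
}

// Stand-in for window.document.implementation; it only records the calls.
var fakeImplementation = {
    features: {},
    addFeature: function (name, value) {
        this.features[name] = value;
    }
};

var features = mergeFeatures(JQUERYIFY_DEFAULTS, {
    FetchExternalResources: false,
    ProcessExternalResources: false
});
applyFeatures(fakeImplementation, features);
console.log(fakeImplementation.features);
```

With something along these lines, a caller could switch off script fetching and processing without editing the library source, while any feature they do not mention keeps its default.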

@jprichardson

I have this problem:

Error: Invalid character: Invalid character in tag name: =.

This happens whether or not I use the HTML5 parser.
