-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
jsdom vs. PhantomJS
Some people wonder what the differences are between jsdom and PhantomJS, and when you would use one over the other. Here we attempt to explain some of the differences, and why we find jsdom to be a pleasure to use for testing and scraping use cases.
PhantomJS is a complete browser (although it uses a very old and rare rendering engine). It even performs layout and rendering, allowing you to query element positions or take a screenshot. jsdom is not a full browser: it does not perform layout or rendering, and it does not support navigation between pages. It does support the DOM, HTML, canvas, many other web platform APIs, and running scripts.
So you could use jsdom to fetch the HTML of your web application (while also executing the JavaScript code within that HTML). And then you could examine and modify the resulting DOM tree. Or you could trigger event listeners to test how the web application reacts. You could also use jsdom to build up your own DOM tree from scratch, and then serialize it to a HTML string.
You need an executable to run PhantomJS. It is written in native code, and has to be compiled for each platform. jsdom is pure JavaScript, and runs wherever Node.js runs. It even has experimental support for running within browsers, giving you the ability to create a whole DOM Document inside a web worker.
One of the reasons jsdom is used a lot for testing is that creating a new document instance has very little overhead in jsdom. Opening a new page in PhantomJS takes a lot of time, so running a lot of small tests in fresh documents could take minutes in PhantomJS, but only seconds in jsdom.
In general, the same reasons that make jsdom pleasant for testing also make it pleasant for web scraping. In both cases, the extra power of a full browser is not as important as getting things done easily and quickly.
Another important benefit jsdom has for testing is a bit more complicated: it is easy to suffer race conditions using an external process like PhantomJS (or Selenium). For example if you create a script to test something using PhantomJS, that script will live in a different process than the web application. If you perform multiple steps in your test that are dependent on each other (for example, step 1: find the element; step 2: click on the element), the application might change the DOM during those steps (step 1.5: the page's JavaScript removes the element).
This is not an issue in jsdom, since your tests live in exactly the same thread and event loop as the web application, so if your test is executing JavaScript code, the web application cannot run its code until your test releases control of the event loop.