This repository was archived by the owner on May 30, 2023. It is now read-only.

PhantomJS does not follow some redirects #10389

Closed
ariya opened this issue Feb 19, 2012 · 44 comments

@ariya
Owner

ariya commented Feb 19, 2012

ctvo...@gmail.com commented:

Version of PhantomJS: Version 1.3.0 "Water Lily".

What steps will reproduce the problem?

  1. I run from console:

DISPLAY=:0 timeout 25 /root/phantomjs/bin/phantomjs --ignore-ssl-errors=yes /root/phantomjs/rasterize202x152.js www.chow-chow.hu /var/thumbnails/thumbfile.jpg
or:
DISPLAY=:0 timeout 25 /root/phantomjs/bin/phantomjs --ignore-ssl-errors=yes /root/phantomjs/rasterize202x152.js http://hfv-speyer.de /var/thumbnails/thumbfile.jpg

  2. /root/phantomjs/rasterize202x152.js listing:

var page = new WebPage(),
    address, output, size, rc = 0;
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/7.0';

if (phantom.args.length < 2 || phantom.args.length > 3) {
    console.log('Usage: rasterize.js URL filename');
    phantom.exit();
} else {
    address = phantom.args[0];
    output = phantom.args[1];
    page.viewportSize = { width: 1024, height: 768 };
    page.open(address, function (status) {
        if (status == 'fail') {
            console.log('Unable to load the address! ' + address);
            phantom.exit();
        } else {
            // Give the page a short time to settle before rendering.
            window.setTimeout(function () {
                page.clipRect = { top: 0, left: 0, width: 990, height: 745 };
                page.render(output);
                console.log(status);
                phantom.exit();
            }, 200);
        }
    });
}

What is the expected output? What do you see instead?
I expected to get a JPEG thumbnail in /var/thumbnails/thumbfile.jpg, but no file is created. For chow-chow.hu I get a "success" message and no error is shown. For hfv-speyer.de I get the "Unable to load the address!" error message.

Which operating system are you using?
Debian GNU/Linux 2.6.32-5-686

Did you use binary PhantomJS or did you compile it from source?
Compiled from the source.

Please provide any additional information below:

It seems that PhantomJS can't follow redirects like the ones on these websites: http://www.chow-chow.hu (redirects to http://www.chow-chow.hu/start.php) and http://hfv-speyer.de (redirects to http://www.hfv-speyer.de/js-start.htm).
Many other websites render fine the same way on my server.

Many thanks to the developers for such an awesome project! I really appreciate your work.

Disclaimer:
This issue was migrated on 2013-03-15 from the project's former issue tracker on Google Code, Issue #389.
🌟   13 people had starred this issue at the time of migration.

@ariya
Owner Author

ariya commented Feb 20, 2012

ctvo...@gmail.com commented:

One more example of this issue is: http://snowplay.com which redirects to http://snowplay.com/cms/

@ethicalhack3r

ryandewh...@gmail.com commented:

Hi,

I'm having the same issue with version 1.6.1.

Thanks,
Ryan

@jason-fonseca

jfons...@ontech.com.au commented:

I am having the same issue with v 1.6.1.

I'm trying http://osc3.ezimobile.biz which redirects to http://osc3.ezimobile.biz/catalog

Here is the index.html file from that server:

Does anyone have any workarounds for this?

@ariya
Owner Author

ariya commented Jan 11, 2013

pe...@spotfront.com commented:

All of the examples provided here contain redirects in HTML (something like ) or in Javascript (window.self.location.replace('index.php' );). (I've yet to see PhantomJS fail to follow an HTTP redirect via the Location response header.) As much as I find this frustrating in my own projects, I think that PhantomJS is working as it ought. That said, having some sort of optional timeout that waits a configurable amount of time for any location changes before firing the page.open callback would save a lot of repetitive userland code.

@ariya
Owner Author

ariya commented Feb 26, 2013

ankitjai...@gmail.com commented:

I am facing a similar issue where the redirect is done using location.replace. Though it is kinda strange that it generates the image with phantomjs1.5.1 whereas it does not with v1.8.1

@standpat

theobe...@gmail.com commented:

I am facing a similar issue: lots of sites redirect (301), but PhantomJS fails to notice.
Re #6: why is this an intended feature? It seems to significantly limit the use of PhantomJS as a GUI-less browser.

@standpat

theobe...@gmail.com commented:

Just a small addendum:
pjs2 netsniff.js http://www.forbes.com
FAIL to load the address

This is not an expected output, of course.

@standpat

theobe...@gmail.com commented:

Perhaps a fix would be to detect whether the first accessed document results in a redirection (302/301) and then just assume this is the main document we wanted to access from the start and proceed?

@amenzhinsky

Simple solution for handling redirects

function renderPage(url) {
  var page = require('webpage').create();
  var redirectURL = null;

  page.onResourceReceived = function(resource) {
    if (url == resource.url && resource.redirectURL) {
      redirectURL = resource.redirectURL;
    }
  };

  page.open(url, function(status) {
    if (redirectURL) {
      renderPage(redirectURL);
    } else if (status == 'success') {
      // ...
    } else {
      // ...
    }
  });
}

https://gist.github.com/amenzhinsky/5245915

@JamesMGreene
Collaborator

Note for devs:
I found a page on Nokia's wiki discussing how to setup the QNAM to follow redirects:
http://www.developer.nokia.com/Community/Wiki/Handling_an_HTTP_redirect_with_QNetworkAccessManager

This is probably what we should be doing.

@77yards

77yards commented May 24, 2013

We are facing the same issues. We would donate if anybody fixes it :)

@ghost

ghost commented Jun 19, 2013

Adding a +1 for the fix, we're facing this too. Thanks!

@sniemetz

My issue was that the target page has a third-party service installed (Optimizely) that runs in the head and, a few milliseconds later, assigns location.href.

Using the approach from here
http://newspaint.wordpress.com/2013/04/25/getting-to-the-bottom-of-why-a-phantomjs-page-load-fails/

I noticed that this redirect registers as a navigation request from the page's main frame - at that point processing stops in phantom.

Here's what seems to work for me:

When I notice such a navigation request, I close the current page, and re-run the process on the new URL. I now can't seem NOT to get the screen shot of the final page.

var page;
var myurl = "your.targeturl.com";

var renderPage = function (url) {
    page = require('webpage').create();

    page.onNavigationRequested = function (url, type, willNavigate, main) {
        if (main && url != myurl) {
            myurl = url;
            console.log("redirect caught");
            page.close();
            renderPage(url);
        }
    };

    page.open(url, function (status) {
        if (status === "success") {
            console.log("success");
            page.render('yourscreenshot.png');
            phantom.exit(0);
        } else {
            console.log("failed");
            phantom.exit(1);
        }
    });
};

renderPage(myurl);

@kurojishi

Same issue here, but using PhantomJS through Selenium.

@jekel

jekel commented Nov 11, 2013

When the WebPage handles a 301 redirect, the final response reports statusCode 301, not 200 as it should,
and the other headers come from the first request (the one that returned the redirect).
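
One way to check which status the main document finally ended on is to record each hop from the onResourceReceived events. The following is only a sketch: createRedirectTracker is a hypothetical helper, not part of PhantomJS, and it assumes that resource.redirectURL is an absolute URL that exactly matches the URL of the next request (trailing slashes included).

```javascript
// Hypothetical helper (not a PhantomJS API): follows the main document
// through the redirects reported via page.onResourceReceived.
// Assumption: resource.redirectURL is absolute and equals the URL of the
// request that follows it.
function createRedirectTracker(startUrl) {
  var currentUrl = startUrl; // URL of the main document we are waiting for
  var hops = [];             // one entry per observed redirect
  var finalStatus = null;    // status of the last non-redirect response

  return {
    // Feed every onResourceReceived event into this function.
    onResourceReceived: function (resource) {
      if (resource.url !== currentUrl) {
        return; // a sub-resource (image, script, ...) or an already handled hop
      }
      if (resource.redirectURL) {
        hops.push({
          from: resource.url,
          status: resource.status,
          to: resource.redirectURL
        });
        currentUrl = resource.redirectURL;
      } else {
        finalStatus = resource.status;
      }
    },
    // Snapshot of what has been seen so far.
    result: function () {
      return { url: currentUrl, status: finalStatus, hops: hops.slice() };
    }
  };
}
```

In a PhantomJS script this could be wired up with `page.onResourceReceived = tracker.onResourceReceived;` before calling page.open, and `tracker.result()` read inside the open callback. If result().status is still null at that point, the final document was never reported, which is itself a useful hint when filing a bug.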

@AndreasRo

+1 for fixing this bug, can't use PhantomJS because of this.

@glifchits

+1 for me as well, this is problematic for my selenium testing

@icezzzz

icezzzz commented Dec 16, 2013

+1 need fix asap

@JamesMGreene
Collaborator

@icezzzz: "need fix ASAP" == Fix it yourself or hire someone. The links I gave earlier in this thread should provide enough info (or close to it) to fix this.

@sniemetz

Hey James et al,

I'd like to help but sadly never programmed anything like this.
Are there other todos? testing etc?

and - is development still active? I know all of you have day jobs and real
lives to support :)

Cheers -
sven


@JamesMGreene
Collaborator

Development switched gears from fixing bugs and adding features to upgrading the underlying WebKit engine and Qt framework versions themselves as we believe this will actually solve 50% or more of the existing bugs. That effort is close to Technical Preview stage (working on Windows, anyway) and will likely be pushed back into this primary repo again in the near future.

When that happens, we're definitely going to need some folks to give it some much needed testing... Both to ensure it still meets your personal needs as well as before and to go through many of the open bugs to check if they are fixed by this upgrade.

@sniemetz

ah cool thanks for the shout.
Looking forward to the next rev!


@yogeshunavane

The script which @sniemetz has made, gives us the redirected url i.e

var page;
var myurl = "http://osc3.ezimobile.biz";

var renderPage = function (url) {
    page = require('webpage').create();

    page.onNavigationRequested = function (url, type, willNavigate, main) {
        if (main && url != myurl) {
            myurl = url;
            console.log("redirect caught");
            page.close();
            renderPage(url);
        }
    };

    page.open(url, function (status) {
        if (status === "success") {
            console.log(myurl);
            console.log("success");
            page.render('yourscreenshot.png');
            phantom.exit(0);
        } else {
            console.log("failed");
            phantom.exit(1);
        }
    });
};

renderPage(myurl);

OUTPUTS -
http://osc3.ezimobile.biz/
http://osc3.ezimobile.biz/catalog
We now only need to find a way to request that redirected URL.
Looking forward.

@driket

driket commented Jan 9, 2014

I've just met this issue (no output at all / js redirections in target). I tried setting the timer to 2000ms instead of 200ms and it worked.

(looks like the generated page with multiple redirections didn't have time to render).

@aevernon

aevernon commented Jan 9, 2014

@driket, my experience with this bug also makes me think it is related to time to render. Would you please give an example of how to set the timer to 2000 ms?

@yogeshunavane

@driket, I tried that, but it didn't work for http://osc3.ezimobile.biz/

@driket

driket commented Jan 10, 2014

@aevernon, you can do this by changing the window.setTimeout parameter (in the js file).

@yogeshunavane, I've tested version 1.9.2 on OS X Mavericks and Ubuntu 12 LTS, and it seems to work (the output file is a full HTML file with relevant data).

Here is the output file I got for http://osc3.ezimobile.biz/ + the script I'm using :
https://gist.github.com/driket/8348520
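
For readers who want the delay to be adjustable rather than hard-coded, here is a minimal sketch of the change described above. Both parseDelay and renderAfterDelay are hypothetical helper names, and reading the delay from a third script argument is an assumption (the rasterize script in the original report already accepts up to three arguments).

```javascript
// Hypothetical helper: turn an optional argument into a delay in
// milliseconds, falling back to a default when it is missing or invalid.
function parseDelay(value, fallbackMs) {
  var n = parseInt(value, 10);
  return (isNaN(n) || n < 0) ? fallbackMs : n;
}

// Hypothetical helper: render `page` to `output` only after `delayMs`,
// so that JavaScript redirects get a chance to run first.
// `page` only needs a render(filename) method, as PhantomJS pages have.
function renderAfterDelay(page, output, delayMs, done) {
  setTimeout(function () {
    page.render(output);
    done();
  }, delayMs);
}
```

In the rasterize script this would replace the window.setTimeout block, for example `renderAfterDelay(page, output, parseDelay(phantom.args[2], 2000), function () { phantom.exit(); });`. A longer delay only makes it more likely that redirects have finished; as later comments show, it does not help in every case.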

@aevernon

For those seeking a workaround to get the final redirected URL, you might be interested in Watir WebDriver plus Headless, although this solution uses Ruby instead of JavaScript.

To install dependencies on Ubuntu/Debian:

sudo apt-get install rubygems xvfb firefox # Use iceweasel instead of firefox on Debian.
sudo gem install headless watir-webdriver

test-watir.rb

#!/usr/bin/ruby

require 'headless'
require 'watir-webdriver'

Headless.ly do
  browser = Watir::Browser.new

  browser.goto 'http://osc3.ezimobile.biz/'
  puts browser.url
  # Output is http://osc3.ezimobile.biz/catalog/

  browser.goto 'https://silkflowers.affiliatetechnology.com/redirect.php?nt_id=1&URL=http://www.silkflowers.com'
  puts browser.url
  # Output is http://www.silkflowers.com/?utm_source=affiliatetraction&utm_medium=CommissionJunction

  browser.goto 'http://snowplay.com/'
  puts browser.url
  # Output is http://snowplay.com/cms/
end

@wesleylancel

I'm running 1.9.7, which still seems to have trouble following redirects. I'm loading certain images that are being 301'd to other locations, but all those images seem to 404 in Phantom. It doesn't seem to be a timing thing, as setting a 60s delay before rendering didn't help.

@ofrzeta

ofrzeta commented Jul 9, 2014

@wesleylancel same here. I was using CasperJS with PhantomJS 1.9.7 and got a 404 error on a redirect to HTTPS that worked in a regular browser. It did work with a self-compiled PhantomJS from the Github repo, though.

(compiled on Ubuntu 12.04.04 x86_64 with libqt4 4.8.1-0ubuntu4.8)

@mattreines

Wondering if anyone has come across a redirect when submitting a form not working. From observation of the events I see something along the lines of (See below):

The gist of it seems to be - the page posted to .../auth/UI/Login returns a 302 which causes the 5/Operation Cancelled error. However, the HTTP status and Location header aren't populated? On FF or Chrome this all works just fine Status/Location is populated and browser redirects appropriately.

My best guess would be some type of timing or race condition in the phantomjs binary (hope I'm wrong)

Has anyone seen anything like this and/or have any ideas about workarounds? Nothing I can try or think of works: I can't get hold of the status or location in any of the PhantomJS 'javascript' context.

*********** onNavigationRequested ************
Trying to navigate to: http://XXXXXXXX:8080/auth/UI/Login
Caused by: FormSubmitted
Will actually navigate: true
Sent from the page's main frame: true
*********** onResourceRequested ************
Request (#40): {"headers":[{"name":"Origin","value":"http://XXXXXXXX:8080"},{"name":"User-Agent","value":"Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.7 Safari/534.34"},{"name":"Content-Type","value":"application/x-www-form-urlencoded"},{"name":"Accept","value":"text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8"},{"name":"Referer","value":"http://XXXXXXXXXX:8080/auth/UI/Login"},{"name":"Content-Length","value":"741"}],"id":40,"method":"POST","time":"2014-07-29T17:23:34.480Z","url":"http://XXXXXXXXXX:8080/auth/UI/Login"}
Request (#40): {"objectName":""}
*********** onResourceError ************
Unable to load resource (#40 URL:http://XXXXXXXXXXX:8080/auth/UI/Login)
Error code: 5. Description: Operation canceled
*********** onResourceReceived ************
Response (#40, stage "end"): {"contentType":null,"headers":[],"id":40,"redirectURL":null,"stage":"end","status":null,"statusText":null,"time":"2014-07-29T17:23:34.625Z","url":"http://XXXXXXXXXXXX:8080/auth/UI/Login"}

@luksch

luksch commented Oct 16, 2014

Any news regarding this? I also face this problem. It is an HTTPS link redirecting with a 302, and when I call getPageSource() the result is:

<html>
 <head></head>
 <body></body>
</html>

Strange...

In Chrome or Firefox the redirect works without problems. I am using PhantomJS 1.9.7 together with Selenium WebDriver (https://github.com/detro/ghostdriver).

@gam-phon

Same problem with 1.9.8, used together with Selenium.

@obihill

obihill commented Feb 7, 2015

I was able to sort out the redirect issue, but now I have a problem with Page.evaluate not working.

I have the following JavaScript code saved in a file named ph_test.js.

var page;
var args = require('system').args;
var url_str = 'http://'+args[1];

var renderPage = function(){

    page = require('webpage').create();
    var myArgs = Array.prototype.slice.call(arguments),
        url_str = myArgs[0];

    // Set the viewport size
    page.viewportSize = {
        width: 320,
        height: 480
    };

    // Sets the User Agent
    page.settings.userAgent = 'Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_0 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8A293 Safari/6531.22.7';

    /**
     * From PhantomJS documentation:
     * This callback is invoked when there is a JavaScript console. The callback may accept up to three arguments:
     * the string for the message, the line number, and the source identifier.
     */
    page.onConsoleMessage = function (msg, line, source) {
        console.log('console> ' + msg);
    };

    /**
     * From PhantomJS documentation:
     * This callback is invoked when there is a JavaScript alert. The only argument passed to the callback is the string for the message.
     */
    page.onAlert = function (msg) {
        console.log('alert!!> ' + msg);
    };

    /**
     * Handle Redirection
     */
    page.onNavigationRequested = function(url_sub_str, type, willNavigate, main) {
        if (main && url_sub_str != url_str)
        {
            url_str = url_sub_str;
            console.log("redirect caught");
            page.close();
            renderPage(url_str);
        }
    };

    /**
     * Open the web page and run RRunner
     */
    page.open(url_str, function(status) {
        if (status === 'success') {
            page.injectJs('https://code.jquery.com/jquery-1.11.2.min.js');

            // Our "event loop"
            if(!phantom.state)
            {
                phFunction(url_str);
            }
            else {
                phantom.state();
            }
        }
        else
        {
            console.log('failed');
        }

        page.close();
        setTimeout(function(){
            phantom.exit();
        }, 1000);

        function phFunction()
        {
            var myArgs = Array.prototype.slice.call(arguments),
                url_str = myArgs[0]
                ;

            page.evaluate(function (url_str) {
                console.log('evaluate');
            }, url_str);

            page.render('screenshots/screenshot_full.png');
        }
    });
};

renderPage(url_str);

When I run the following command [pointing to my website]:

phantomjs ph_test.js www.restive.io

Everything works ok with no issues. However, when I run it for another website with mobile redirection [in this case taobao]

 phantomjs ph_test.js www.taobao.com

Page.evaluate doesn't run as I do not see the message 'evaluate' in my console.

I'd appreciate some help in resolving this.

@jorgelbg

I'm posting a solution here that has worked for me. What I've done is basically intercept the request in the onResourceRequested function and extract the correct URL. I was trying to render a site that uses a Location header to redirect to another site, after a previous redirection from HTTP to HTTPS. In this case, in the onResourceRequested event, the last URL (after all the redirections were followed) looks like http://site.com/news.php,%20news.php. The problem is that when this URL is fetched, the second portion (,%20news.php) causes a 404, even though the redirection is being followed correctly. So my solution was to use onResourceRequested to remove the additional URL portion causing the problem. The code looks like:

  page.onResourceRequested = function(requestData, networkRequest) {
    var reqUrl = requestData.url;
    var newUrl = requestData.url.split(',%20')[0];

    if (newUrl != reqUrl) {
      networkRequest.changeUrl(newUrl);
    }
  };

  page.open(url, function(status) {
    if (status == 'success') {
      page.render(path);
      response.write('Success: Screenshot saved to ' + path + "\n");
    } else {
      response.write('Error: Url returned status ' + status + "\n");
    }
    page.release();
  });

This feels a little "hacky", but it has solved my problem. I haven't tested it with other redirection techniques, but I hope these 2 cents help someone. For more information, the response from the site without HTTPS looked like this (info taken using curl --head):

HTTP/1.1 301 Moved Permanently
Server: nginx
Date: Tue, 17 Feb 2015 21:37:15 GMT
Content-Type: text/html
Content-Length: 178
Connection: keep-alive
Location: https://site.com/
Strict-Transport-Security: max-age=15768000

The same to the HTTPS site looked like:

HTTP/1.1 302 Moved Temporarily
Server: nginx
Content-Type: text/html
Connection: keep-alive
X-Powered-By: PHP/5.3.3
Set-Cookie: phpv2Q389C_visited=yes; expires=Tue, 17-Feb-2015 21:58:29 GMT; path=/
Set-Cookie: phpv2Q389C_lastvisit=1424205509; expires=Tue, 17-Feb-2015 22:38:29 GMT; path=/; domain=site.com; httponly
Location: news.php
Location: news.php

From what I've seen, the Location: news.php header was causing the 404 that I was experiencing. The above code seems to work, although more testing is required :).
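
The URL http://site.com/news.php,%20news.php looks like what you would get if the two Location: news.php headers shown above were combined into a single comma-separated value (HTTP allows repeated header fields to be combined that way) and then resolved as one relative URL. That reading is an assumption, not something confirmed in this thread. Under that assumption, a slightly more defensive variant of the split could strip the suffix only when it really repeats the end of the URL. stripDuplicatedLocation is a hypothetical helper name.

```javascript
// Hypothetical helper: undo a duplicated relative Location value that
// ended up in the URL as "<target>,%20<target>".
// The URL is returned unchanged unless the part after ",%20" repeats
// the end of the part before it.
function stripDuplicatedLocation(url) {
  var marker = ',%20';
  var idx = url.indexOf(marker);
  if (idx === -1) {
    return url; // nothing that looks like a joined header value
  }
  var head = url.slice(0, idx);
  var tail = url.slice(idx + marker.length);
  var repeats = tail.length > 0 && head.slice(-tail.length) === tail;
  return repeats ? head : url;
}
```

Inside the onResourceRequested handler above, the result could be compared with requestData.url and passed to networkRequest.changeUrl only when it differs. Unlike a plain split, this leaves alone URLs that legitimately contain ",%20".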

@vbauer
Contributor

vbauer commented Feb 25, 2015

👍

@xvik

xvik commented Feb 28, 2015

👍

@ankneo

ankneo commented May 19, 2015

Ok so a small change in the script makes it load the websites fine.

var page;
var myurl = "your.targeturl.com";

var renderPage = function (url) {
    page = require('webpage').create();

    page.onNavigationRequested = function (url, type, willNavigate, main) {
        if (main && url != myurl) {
            myurl = url;
            console.log("redirect caught");
            page.close();
            // Note the setTimeout here: the new page is opened on a later
            // tick instead of from inside the navigation callback.
            setTimeout(function () { renderPage(myurl); }, 1);
        }
    };

    page.open(url, function (status) {
        if (status === "success") {
            console.log("success");
            page.render('yourscreenshot.png');
            phantom.exit(0);
        } else {
            console.log("failed");
            phantom.exit(1);
        }
    });
};

renderPage(myurl);

@stdex

stdex commented Nov 18, 2015

Hello.
Ubuntu 14.04 x86
phantomjs 2.0.0

I also have a problem with redirects.

Task:
Need to handle URLs like:

http://click.ticketswap.nl/track/click/30039336/www.ticketswap.nl?p=eyJzIjoiY0x6N3NXYThpZ0VGTGVsNVJzRC16R2hGVGFBIiwidiI6MSwicCI6IntcInVcIjozMDAzOTMzNixcInZcIjoxLFwidXJsXCI6XCJodHRwczpcXFwvXFxcL3d3dy50aWNrZXRzd2FwLm5sXFxcL2Rvd25sb2FkXFxcLzM2MTUyOFxcXC9jMTA5YmJjOWI4OGYzYTEyNTBjZDk3MTQyMmE2YWVkYVxcXC83NjQyNzFcIixcImlkXCI6XCIxNmE4NWI4Yzc5NmE0Y2UwOTk0Njc0M2RmM2MzODZkZlwiLFwidXJsX2lkc1wiOltcImQ4M2U3YmJmOTU3MTFkNDcyM2U4NjJlNTA1MWNjMWVhNTU5MDZlZjlcIl19In0

And:

  1. Login to facebook.

  2. Handle redirects (3 redirects)

  3. Get URL of last page with 200 status code to download file.
    Something like:

https://ticketswap.s3.amazonaws.com/pdf-services/201508/74df1712-6cd3-4ce1-87bd-28a928762087/93ebfa4c-46ba-4b8d-9077-b5ecc34f4af0.page.pdf?response-content-disposition=attachment%3B%20filename%3Dticketswap-breakfast-club-mini-fest-ticket-764271.pdf&response-content-type=application%2Fpdf&AWSAccessKeyId=AKIAJA2AW7EYEF5JWHGQ&Expires=1447875603&Signature=hqbIX3GnTgLQnqqVyue4xGcBlF0%3D#_=_

I can't use PhantomJS for this task, because current_url returns something like:

https://www.facebook.com/login.php?skip_api_login=1&api_key=384197868327751&signed_next=1&next=https%3A%2F%2Fwww.facebook.com%2Fv2.0%2Fdialog%2Foauth%3Fredirect_uri%3Dhttps%253A%252F%252Fwww.ticketswap.nl%252Flogin%252Fcheck-facebook%26display%3Dpopup%26scope%3Demail%26response_type%3Dcode%26client_id%3D384197868327751%26ret%3Dlogin&cancel_url=https%3A%2F%2Fwww.ticketswap.nl%2Flogin%2Fcheck-facebook%3Ferror%3Daccess_denied%26error_code%3D200%26error_description%3DPermissions%2Berror%26error_reason%3Duser_denied%23_%3D_&display=popup

If I use Firefox selenium webdriver, I can control the flow and save files (https://github.com/stdex/web_crawlers/blob/master/ticketswap/ticketswap.py)

How to handle redirects correctly?

@lorenooliveira

Just leaving my +1 to keep updated about this issue.

I'm using 1.9.8 version. When running in a Ubuntu64 box, the redirects are handled as expected. But I'm having problems when running in a RedHat64 box.

@zackw
Contributor

zackw commented Nov 23, 2015

We know this bug has been hanging around for a very long time, and we apologize. Here are some concrete things that you can do:

  • If you're on 1.9.x, upgrade to 2.0. 2.0 uses much newer versions of Qt and Webkit, and these may have addressed some of the problems with following redirects. (Qt provides the HTTP implementation.) 1.x is no longer being developed -- future bugfixes will only appear in the 2.0 series.
  • There is a long-standing bug in the Qt/Webkit interface which causes page.url not to accurately reflect redirects. If this is your problem, please try my encode-all-the-urls branch; it replaces that bug with a different bug, in which page.url is confused by <base href=>. (We're still looking for a way to fix both bugs properly. Concrete, self-contained, minimal test cases in which either stock 2.0 gets page.url wrong, or my branch gets this wrong, would be very helpful.)
  • The onLoadFinished callback fires at approximately the same time the onload event fires in the page. (They are not guaranteed to occur in any particular order relative to each other.) This is by design. If your problem is that JavaScript adjustments to window.location, <meta refresh> tags, or other such things do not get a chance to happen before the onLoadFinished callback fires, you need to wait a bit after that callback (using setTimeout or equivalent) before declaring the page "done." Unfortunately, we have no good way of determining that a page is done running all of its JavaScript and stuff (indeed, it might never be done).
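
The last point can be sketched as a small wrapper that treats the page as "done" only after no load has finished for a quiet period. This is a minimal sketch under stated assumptions: openAndSettle and quietMs are hypothetical names, and the only PhantomJS features relied on are page.open and the onLoadStarted and onLoadFinished callbacks. It has no overall deadline, so a load that starts and never finishes would leave it waiting.

```javascript
// Hypothetical helper: open `url` and call done(status) once no load has
// finished for `quietMs` milliseconds. A load that starts in the
// meantime (for example a JavaScript or meta-refresh redirect) cancels
// the pending completion and the wait starts over when it finishes.
function openAndSettle(page, url, quietMs, done) {
  var timer = null;
  var lastStatus = null;

  function cancelPending() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
  }

  page.onLoadStarted = function () {
    cancelPending(); // another navigation began, so the page is not done yet
  };

  page.onLoadFinished = function (status) {
    lastStatus = status;
    cancelPending();
    timer = setTimeout(function () {
      timer = null;
      done(lastStatus); // status of the most recent load
    }, quietMs);
  };

  page.open(url);
}
```

A call such as `openAndSettle(page, address, 2000, function (status) { /* render, then phantom.exit() */ });` would then replace the usual page.open callback. The quiet period is a trade-off: too short and late redirects are missed, too long and every capture is slowed down by that amount.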

Because this bug is very old and lumps together a number of related issues with different causes, it is not useful as a bug report, and therefore I am going to close it. If you still have a problem, and my advice above does not fix it, please submit a new bug report, providing a concrete, minimal, and self-contained test case. If you don't know how to write a concrete, minimal, self-contained test case, or you need help following my advice, please ask for help on the phantomjs-users mailing list.

@zackw zackw closed this as completed Nov 23, 2015
Repository owner locked and limited conversation to collaborators Nov 23, 2015