PhantomJS does not follow some redirects #10389
Comments
ctvo...@gmail.com commented:
|
ryandewh...@gmail.com commented:
|
jfons...@ontech.com.au commented:
|
pe...@spotfront.com commented:
|
ankitjai...@gmail.com commented:
|
theobe...@gmail.com commented:
|
theobe...@gmail.com commented:
|
theobe...@gmail.com commented:
|
Simple solution for handling redirects:
function renderPage(url) {
  var page = require('webpage').create();
  var redirectURL = null;
  // Remember the redirect target reported for the URL we asked for.
  page.onResourceReceived = function(resource) {
    if (url == resource.url && resource.redirectURL) {
      redirectURL = resource.redirectURL;
    }
  };
  page.open(url, function(status) {
    if (redirectURL) {
      // Start over on the redirect target.
      renderPage(redirectURL);
    } else if (status == 'success') {
      // ...
    } else {
      // ...
    }
  });
}
|
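The recursive `renderPage` above has no stop condition, so a redirect loop on the server would keep it running. A minimal sketch of a guard, assuming a small hand-written helper: `createRedirectGuard` and `maxRedirects` are hypothetical names for illustration, not part of the PhantomJS API.

```javascript
// Sketch only: createRedirectGuard and maxRedirects are hypothetical names,
// not part of the PhantomJS API.
function createRedirectGuard(maxRedirects) {
  var seen = Object.create(null); // URLs already followed
  var count = 0;                  // redirects followed so far
  return {
    // Returns true if it looks safe to follow a redirect to `url`.
    shouldFollow: function (url) {
      if (seen[url]) {
        return false; // already visited: redirect loop
      }
      if (count >= maxRedirects) {
        return false; // too many hops
      }
      seen[url] = true;
      count += 1;
      return true;
    }
  };
}

var guard = createRedirectGuard(2);
console.log(guard.shouldFollow('http://example.com/a')); // true
console.log(guard.shouldFollow('http://example.com/a')); // false (loop)
console.log(guard.shouldFollow('http://example.com/b')); // true
console.log(guard.shouldFollow('http://example.com/c')); // false (limit reached)
```

Inside the `page.open` callback, the recursion could then become `if (redirectURL && guard.shouldFollow(redirectURL)) { renderPage(redirectURL); }`, with the guard created once outside `renderPage`.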
Note for devs: This is probably what we should be doing. |
We are facing the same issues. We would donate if anybody fixes it :) |
Adding a +1 for the fix, we're facing this too. Thanks! |
My issue was that the target page has a third-party service installed (Optimizely) that runs in the head and, a few milliseconds later, sets location.href. Using the approach from here, I noticed that this redirect registers as a navigation request from the page's main frame, and at that point processing stops in PhantomJS. Here's what seems to work for me: when I notice such a navigation request, I close the current page and re-run the process on the new URL. Now I can't seem NOT to get the screenshot of the final page.
var page;
var myurl = "your.targeturl.com";
var renderPage = function (url) {
  page = require('webpage').create();
  // A main-frame navigation to a different URL is treated as a redirect.
  page.onNavigationRequested = function(url, type, willNavigate, main) {
    if (main && url != myurl) {
      myurl = url;
      console.log("redirect caught");
      page.close();
      renderPage(url);
    }
  };
  page.open(url, function(status) {
    if (status === "success") {
      console.log("success");
      page.render('yourscreenshot.png');
      phantom.exit(0);
    } else {
      console.log("failed");
      phantom.exit(1);
    }
  });
};
renderPage(myurl);
|
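One thing to watch in the script above: `url != myurl` is an exact string comparison. If the engine reports the navigation URL in a normalized form (for example with a trailing slash or a lowercased host), the very first navigation may be mistaken for a redirect and trigger one extra reload. A rough sketch of a looser comparison follows. `normalizeUrl` and `sameTarget` are hypothetical helpers, written in ES5 without the `URL` class in case the PhantomJS build does not provide it, and they only cover scheme case, host case, and an empty path.

```javascript
// Hypothetical helpers, not part of PhantomJS.
// Lowercases the scheme and host and turns an empty path into "/".
function normalizeUrl(url) {
  var match = /^([a-z][a-z0-9+.\-]*:\/\/)([^\/?#]*)(.*)$/i.exec(url);
  if (!match) {
    return url; // not an absolute URL; leave it untouched
  }
  var authority = match[2];
  var at = authority.lastIndexOf('@');
  var userinfo = authority.slice(0, at + 1);        // "" when there is no "@"
  var host = authority.slice(at + 1).toLowerCase(); // host (and port)
  var rest = match[3];
  if (rest === '' || rest.charAt(0) !== '/') {
    rest = '/' + rest; // empty path becomes "/"
  }
  return match[1].toLowerCase() + userinfo + host + rest;
}

// True when both URLs point at the same target after normalization.
function sameTarget(a, b) {
  return normalizeUrl(a) === normalizeUrl(b);
}

console.log(sameTarget('http://example.com', 'http://EXAMPLE.com/'));    // true
console.log(sameTarget('http://example.com/a', 'http://example.com/b')); // false
```

With this in place, the condition in `onNavigationRequested` could read `main && !sameTarget(url, myurl)`.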
Same issue here, but using PhantomJS through Selenium. |
When the web page handles a 301 redirect, the final response reports status code 301, not 200 as it should. |
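If the status code you read belongs to the first hop, one possible workaround is to record every response and walk the redirect chain yourself. The sketch below assumes each recorded entry has `url`, `status`, and `redirectURL` fields, like the `resource` objects passed to `page.onResourceReceived`. `resolveFinalResponse` is a hypothetical helper, and the sample data is made up.

```javascript
// Hypothetical helper: follow redirectURL links through recorded responses
// and return the last entry in the chain (or null if the start is unknown).
function resolveFinalResponse(responses, startUrl) {
  var byUrl = Object.create(null);
  for (var i = 0; i < responses.length; i++) {
    byUrl[responses[i].url] = responses[i]; // later entries win
  }
  var current = byUrl[startUrl] || null;
  var hops = 0;
  // The hop limit keeps a redirect loop from spinning forever.
  while (current && current.redirectURL && hops < responses.length) {
    var next = byUrl[current.redirectURL];
    if (!next) {
      break; // target was never received
    }
    current = next;
    hops += 1;
  }
  return current;
}

// Made-up sample data for illustration.
var recorded = [
  { url: 'http://example.com/old', status: 301, redirectURL: 'http://example.com/new' },
  { url: 'http://example.com/new', status: 200, redirectURL: null }
];
console.log(resolveFinalResponse(recorded, 'http://example.com/old').status); // 200
```

In a PhantomJS script the `recorded` array could be filled with `page.onResourceReceived = function (resource) { recorded.push(resource); };`.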
+1 for fixing this bug, can't use PhantomJS because of this. |
+1 for me as well, this is problematic for my selenium testing |
+1 need fix asap |
@icezzzz: "need fix ASAP" == Fix it yourself or hire someone. The links I gave earlier in this thread should provide enough info (or close to it) to fix this. |
Hey James et al., I'd like to help, but sadly I have never programmed anything like this. And is development still active? I know all of you have day jobs and real ... Cheers, Sven Niemetz |
Development switched gears from fixing bugs and adding features to upgrading the underlying WebKit engine and Qt framework versions, as we believe this will actually solve 50% or more of the existing bugs. That effort is close to Technical Preview stage (working on Windows, anyway) and will likely be pushed back into this primary repo in the near future. When that happens, we're definitely going to need some folks to give it some much-needed testing, both to ensure it still meets your personal needs as well as it did before and to go through many of the open bugs to check whether this upgrade fixes them. |
Ah cool, thanks for the shout. Sven Niemetz |
The script that @sniemetz wrote gives us the redirected URL, i.e.
var page; var renderPage = function (url) {
} renderPage(myurl);
OUTPUTS - |
I've just run into this issue (no output at all; JS redirects in the target page). I tried setting the timer to 2000 ms instead of 200 ms and it worked. It looks like the generated page, with its multiple redirects, didn't have time to render. |
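A fixed delay has to be tuned per site. One variation on the timer idea is to restart the wait each time the page reports activity and render only once things have been quiet for the chosen period. The sketch below is an assumption-laden illustration: `createSettleTimer` is a hypothetical helper, not a PhantomJS API, and the optional `timers` argument exists only so the logic can be exercised without real clocks.

```javascript
// Hypothetical helper: call `onSettled` once no activity has been reported
// for `delayMs` milliseconds. `timers` is optional and only for testing.
function createSettleTimer(delayMs, onSettled, timers) {
  timers = timers || {
    set: function (fn, ms) { return setTimeout(fn, ms); },
    clear: function (handle) { clearTimeout(handle); }
  };
  var handle = null;
  return {
    // Call on every load or navigation event to restart the wait.
    touch: function () {
      if (handle !== null) {
        timers.clear(handle);
      }
      handle = timers.set(function () {
        handle = null;
        onSettled();
      }, delayMs);
    }
  };
}

// Possible wiring in a PhantomJS script (not executed here):
//   var settle = createSettleTimer(2000, function () {
//     page.render('out.png');
//     phantom.exit(0);
//   });
//   page.onLoadFinished = function () { settle.touch(); };
```

The 2000 ms value is the one reported to work in the comment above. Whether it suits another site is something to check case by case.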
@driket, my experience with this bug also makes me think it is related to time to render. Would you please give an example of how to set the timer to 2000 ms? |
@driket, I tried that, but it didn't work for http://osc3.ezimobile.biz/ |
@aevernon, you can do this by changing the window.setTimeout parameter (in the js file). @yogeshunavane, I've tested version 1.9.2 on OS X Mavericks and Ubuntu 12 LTS, and it seems to work (the output file is a full HTML file with relevant data). Here is the output file I got for http://osc3.ezimobile.biz/ + the script I'm using : |
For those seeking a workaround to get the final redirected URL, you might be interested in Watir WebDriver plus Headless, although this solution uses Ruby instead of JavaScript. To install dependencies on Ubuntu/Debian:
sudo apt-get install rubygems xvfb firefox # Use iceweasel instead of firefox on Debian.
sudo gem install headless watir-webdriver
test-watir.rb:
#!/usr/bin/ruby
require 'headless'
require 'watir-webdriver'
Headless.ly do
  browser = Watir::Browser.new
  browser.goto 'http://osc3.ezimobile.biz/'
  puts browser.url
  # Output is http://osc3.ezimobile.biz/catalog/
  browser.goto 'https://silkflowers.affiliatetechnology.com/redirect.php?nt_id=1&URL=http://www.silkflowers.com'
  puts browser.url
  # Output is http://www.silkflowers.com/?utm_source=affiliatetraction&utm_medium=CommissionJunction
  browser.goto 'http://snowplay.com/'
  puts browser.url
  # Output is http://snowplay.com/cms/
end
|
I'm running 1.9.7, which still seems to have trouble following redirects. I'm loading certain images that are being 301'd to other locations, but all those images seem to 404 in Phantom. It doesn't seem to be a timing thing, as setting a 60s delay before rendering didn't help. |
@wesleylancel same here. I was using CasperJS with PhantomJS 1.9.7 and got a 404 error on a redirect to HTTPS that worked in a regular browser. It did work with a self-compiled PhantomJS from the Github repo, though. (compiled on Ubuntu 12.04.04 x86_64 with libqt4 4.8.1-0ubuntu4.8) |
Wondering if anyone has come across a redirect not working when submitting a form. From observing the events, I see something along the lines of the output below. The gist of it seems to be: the page posted to .../auth/UI/Login returns a 302, which causes the 5/Operation Cancelled error. However, the HTTP status and Location header aren't populated. In Firefox or Chrome this all works just fine: Status/Location is populated and the browser redirects appropriately. My best guess would be some type of timing or race condition in the phantomjs binary (hope I'm wrong). Has anyone seen anything like this and/or have any ideas about workarounds? Nothing I can try or think of works; I can't get hold of the status or location in any of the PhantomJS 'javascript' context. *********** onNavigationRequested ************ |
Any news regarding this? I also face this problem. It is an HTTPS link redirecting with a 302, and when I call getPageSource() it is:
Strange... In Chrome or Firefox the redirect works without problems. I am using PhantomJS 1.9.7 together with Selenium WebDriver (https://github.com/detro/ghostdriver) |
Same problem with 1.9.8 when used with Selenium. |
I was able to sort out the redirect issue, but now I have a problem with I have the following JavaScript code saved in a file named ph_test.js.
When I run the following command [pointing to my website]:
Everything works ok with no issues. However, when I run it for another website with mobile redirection [in this case taobao]
I'd appreciate some help in resolving this. |
I'm posting a solution here that has worked for me. What I've done is basically intercept the request in page.onResourceRequested:
// `url`, `path`, and `response` come from the surrounding script (not shown).
page.onResourceRequested = function(requestData, networkRequest) {
  var reqUrl = requestData.url;
  // Keep only the part of the URL before an encoded ", ".
  var newUrl = requestData.url.split(',%20')[0];
  if (newUrl != reqUrl) {
    networkRequest.changeUrl(newUrl);
  }
};
page.open(url, function(status) {
  if (status == 'success') {
    page.render(path);
    response.write('Success: Screenshot saved to ' + path + "\n");
  } else {
    response.write('Error: Url returned status ' + status + "\n");
  }
  page.close(); // the original used page.release(), the older deprecated name
});
This feels a little "hacky", but it has solved my problem. I haven't tested it with other redirection techniques, but I hope these 2 cents help someone. For more information, a curl to the site without HTTPS looked like this (info taken using curl --head):
The same request to the HTTPS site looked like:
From what I've seen, the location: news.php was causing the 404 that I was experiencing. The above code seems to work, although more testing is required :). |
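To make the effect of the `split(',%20')` step concrete, here is a tiny worked example. It assumes the malformed request URL has the form `<good URL>,%20<something>`, which is what the split implies. The sample URLs are made up, and `stripAfterEncodedComma` is a hypothetical name for illustration.

```javascript
// Hypothetical wrapper around the split used in the comment above:
// keep everything before the first encoded ", " (",%20").
function stripAfterEncodedComma(url) {
  return url.split(',%20')[0];
}

// Made-up URLs for illustration.
console.log(stripAfterEncodedComma('https://example.com/news.php,%20news.php'));
// prints "https://example.com/news.php"
console.log(stripAfterEncodedComma('https://example.com/index.php'));
// prints "https://example.com/index.php"
```

A legitimate URL that happens to contain `,%20` would be truncated as well, so this is probably best limited to the site that needs it.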
👍 |
👍 |
OK, so a small change in the script makes it load the websites fine.
var page;
var myurl = "your.targeturl.com";
var renderPage = function (url) {
  page = require('webpage').create();
  page.onNavigationRequested = function(url, type, willNavigate, main) {
    if (main && url != myurl) {
      myurl = url;
      console.log("redirect caught");
      page.close();
      // Note the setTimeout here: the reload is deferred instead of
      // being started from inside the navigation callback.
      setTimeout(function () { renderPage(myurl); }, 1);
    }
  };
  page.open(url, function(status) {
    if (status === "success") {
      console.log("success");
      page.render('yourscreenshot.png');
      phantom.exit(0);
    } else {
      console.log("failed");
      phantom.exit(1);
    }
  });
};
renderPage(myurl);
|
Hello. I also have a problem with redirects. Task:
And:
I can't use PhantomJS for this task, because current_url returns something like:
If I use the Firefox Selenium WebDriver, I can control the flow and save files (https://github.com/stdex/web_crawlers/blob/master/ticketswap/ticketswap.py). How do I handle redirects correctly? |
Just leaving my +1 to keep updated about this issue. I'm using version 1.9.8. When running on an Ubuntu64 box, the redirects are handled as expected, but I'm having problems when running on a RedHat64 box. |
We know this bug has been hanging around for a very long time, and we apologize. Here are some concrete things that you can do:
Because this bug is very old and lumps together a number of related issues with different causes, it is not useful as a bug report, and therefore I am going to close it. If you still have a problem, and my advice above does not fix it, please submit a new bug report, providing a concrete, minimal, and self-contained test case. If you don't know how to write a concrete, minimal, self-contained test case, or you need help following my advice, please ask for help on the phantomjs-users mailing list. |
ctvo...@gmail.com commented:
Disclaimer:
This issue was migrated on 2013-03-15 from the project's former issue tracker on Google Code, Issue #389.
🌟 13 people had starred this issue at the time of migration.