Proxy for page #678
Comments
You can use request interception to forward requests from each page to the correct proxy. |
@JoelEinbinder could you show an example how can I forward request through SOCKS proxy using request interception? |
@ivibe unfortunately, this is not possible for SOCKS proxy, you'll have to launch a separate browser instance for this case. Out of curiosity, why would you need this? |
It's a pity. My use case is web scraping. Web servers can block IPs, or the proxy server can become inactive, so I need to change proxies relatively often. Of course, I would like to avoid the performance hit of launching many instances of Chromium. Is there any chance that such functionality (i.e. dynamically changing the proxy) will be implemented in future Chromium releases? |
+1 This feature would be useful for me too, as I'm currently forced to launch multiple Chromium instances if I need to access multiple URLs via different proxies. To add to the use cases @ivibe suggested, this could also be useful if you need to access resources behind firewalls with no common proxy that can pass through both. It would also be useful if you wanted to test or screenshot your web application from multiple sources, e.g. if page content changes based on the geolocation of the visitor's IP. If there is a way to work around this as suggested by @JoelEinbinder, perhaps the SOCKS requirement could be alleviated by setting up a proxy in the middle that exposes an HTTP proxy interface to the SOCKS connection (e.g. https://superuser.com/questions/423563/convert-http-requests-to-socks5). |
Which proxy types are supported in this case? |
+1 |
I think you can intercept every request and send it through an HTTP(S) proxy! |
A SOCKS proxy affects the whole browser (all tabs), so the only option is to run a separate browser instance (with a different userDataDir) for each proxy. |
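For illustration, one browser instance per SOCKS proxy could be configured roughly as follows. This is only a sketch: `--proxy-server` is a Chromium command-line switch and `userDataDir` is a Puppeteer launch option, but the helper name `launchOptionsForProxy`, the profile naming scheme, and the example addresses are assumptions made for the example.

```javascript
const os = require('os');
const path = require('path');

// Hypothetical helper: builds launch options that bind one browser
// instance to one proxy and give it its own profile directory.
function launchOptionsForProxy(proxyUrl, profileName) {
  return {
    // Chromium switch that routes all traffic of this instance through proxyUrl.
    args: [`--proxy-server=${proxyUrl}`],
    // Separate profile directory so instances do not share state.
    userDataDir: path.join(os.tmpdir(), `puppeteer-profile-${profileName}`),
  };
}

// Usage (requires the puppeteer package; addresses are placeholders):
// const puppeteer = require('puppeteer');
// const browserA = await puppeteer.launch(launchOptionsForProxy('socks5://127.0.0.1:1080', 'a'));
// const browserB = await puppeteer.launch(launchOptionsForProxy('socks5://127.0.0.1:1081', 'b'));
```

Each instance pays the full browser start-up cost, which is the performance concern raised earlier in the thread.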
One more reason to get this feature is the absence of proxy pac file
support in headless mode:
https://bugs.chromium.org/p/chromium/issues/detail?id=765245
|
@fhmd4k even if we consider only a regular HTTP(S) proxy, it would be nice to see an example of using it by capturing requests |
Hi, I'm working on a workaround for this issue and I'm already able to make it work with HTTP websites. For HTTPS websites I'm still facing some issues. It may sound a bit hacky and complex... hmm... that's because it really is! But hey, it works. The idea is to create a local Downstream Proxy that parses the address of the Upstream Proxy from the headers of the page's requests.
You can use something like this per page: page.setExtraHTTPHeaders({proxy_addr: "200.11.11.11", proxy_port: "999"});
// 200.11.11.11:999 is the address of the final proxy you want to use (the Upstream Proxy). You should start Chrome with the local Downstream Proxy set as its proxy server. Then, your custom Downstream Proxy should extract those proxy headers and forward the request to the proper Upstream Proxy. For HTTPS requests, the issue I'm facing is intercepting the CONNECT requests sent when the secure communication tunnel is being created. In this case the proxy headers are not sent by Chrome, and I'm figuring out another way of transmitting the proxy information to the Downstream Proxy without needing to hack Chrome (or Chromium) itself. The Downstream Proxy should be a very lightweight process running in your operating system. For reference, the proxy I've built consumes about 20MB of system memory. I won't share the proxy code for now because it currently exposes some security risks for my application. |
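Since the proxy code itself was not shared, here is a minimal sketch of what such a Downstream Proxy might look like for plain HTTP requests, using only Node's built-in `http` module. The header names `proxy_addr` and `proxy_port` come from the comment above; the function names and error responses are assumptions, and HTTPS (CONNECT) is deliberately not handled because, as described, the headers are not available there.

```javascript
const http = require('http');

// Pull the upstream proxy address out of the request headers and
// return the remaining headers, which are safe to forward.
function extractUpstream(headers) {
  const { proxy_addr: host, proxy_port: port, ...forwardHeaders } = headers;
  if (!host || !port) return null;
  return { host, port: Number(port), forwardHeaders };
}

// Hypothetical downstream proxy for plain HTTP only.
function createDownstreamProxy() {
  return http.createServer((req, res) => {
    const upstream = extractUpstream(req.headers);
    if (!upstream) {
      res.writeHead(400);
      res.end('missing proxy_addr/proxy_port headers');
      return;
    }
    const upstreamReq = http.request({
      host: upstream.host,
      port: upstream.port,
      method: req.method,
      // A browser talking to an HTTP proxy sends the absolute URL,
      // which is also what the upstream proxy expects.
      path: req.url,
      headers: upstream.forwardHeaders,
    }, (upstreamRes) => {
      res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
      upstreamRes.pipe(res);
    });
    upstreamReq.on('error', () => {
      if (!res.headersSent) res.writeHead(502);
      res.end('upstream proxy unreachable');
    });
    req.pipe(upstreamReq);
  });
}

// Usage: createDownstreamProxy().listen(8000) and start Chrome with
// --proxy-server=127.0.0.1:8000 (port chosen arbitrarily here).
```

Stripping the two custom headers before forwarding keeps the upstream address from leaking to the target site.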
I could be wrong, but I believe SOCKS5 is already supported: http://www.chromium.org/developers/design-documents/network-stack/socks-proxy
|
@tzellman that sets a single proxy for chrome and not for each page (tab) of chrome. |
@barbolo I believe this workaround applies to most headless browsers. We have set up a similar stack with PhantomJS: same story, it works great for HTTP resources but fails to route HTTPS, as there is no access to additional headers, the query string, nothing... We are even considering SSL termination, but that's just soooo much hacking to achieve such a simple thing :/ Did you have any luck working around HTTPS requests? |
@gwaramadze Yes, I've found some ways of making this scheme work with HTTPS and I'll share how I'm currently doing it. Like I've said in the previous comment, the custom headers with the proxy information were ignored by Chrome when communicating with the downstream proxy server. However the
The first approach I tried was to encapsulate the proxy information in a JSON string sent as the
var userAgent = JSON.stringify({
  "user-agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36",
  "proxy-addr" : "111.111.111.111",
  "proxy-port" : "9999",
});
page.setUserAgent(userAgent);
That way I can intercept the
The problem is that the
So what I ended up doing was to create a list with thousands of user agent strings and for each new tab:
That's how I'm doing it now. Steps 2 and 4 require reprogramming the downstream proxy. Another approach that should work is to change the source of Chromium's network stack to allow other headers to be transmitted. But that would be more maintenance work in the long term. |
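On the downstream proxy side, the JSON-packed user agent from the first approach could be unpacked along these lines. This is a sketch only: the keys `user-agent`, `proxy-addr`, and `proxy-port` are taken from the snippet above, while the function name `unpackUserAgent` and the fallback behaviour for ordinary user agent strings are assumptions.

```javascript
// Hypothetical helper for the downstream proxy: splits a JSON-packed
// User-Agent value into the real user agent and the upstream proxy address.
function unpackUserAgent(rawUserAgent) {
  try {
    const parsed = JSON.parse(rawUserAgent);
    if (parsed && typeof parsed === 'object' && parsed['proxy-addr'] && parsed['proxy-port']) {
      return {
        // The real user agent to restore before forwarding the request.
        userAgent: parsed['user-agent'],
        upstream: { host: parsed['proxy-addr'], port: Number(parsed['proxy-port']) },
      };
    }
  } catch (err) {
    // Not JSON: treat the value as an ordinary user agent string.
  }
  return { userAgent: rawUserAgent, upstream: null };
}
```

The proxy would then replace the User-Agent header with the unpacked `userAgent` value before forwarding, so the target site never sees the JSON envelope.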
@barbolo Thanks, this is quite an interesting hack. I wouldn't want to meddle with user agents too much, as they might be checked by anti-scraping algorithms. |
@gwaramadze yes. That's why I'm using the other approach. For instance, you have thousands of real chrome user agents available for recent versions of the chrome browser. |
Is this feature in active development? Got the same issue and I guess the Use-Case is widely spread. |
+1 |
+1 |
+1 I have a similar use case too. Waiting for a solution that allows setting a proxy per page. |
I don't think Puppeteer has anything to do with this issue. The problem is with Chrome, which doesn't provide any API to configure proxy. You can either use a workaround like I've suggested above or you can build Chromium with a modified Network Stack, which I don't see as a good option. |
I'm using request interception to forward requests:
Please note that in my case I only forward document and xhr requests, ignore the baseUrl request option, and use request-promise-native instead of request. You can replace the proxy settings in the fetch function. |
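The original snippet did not survive in this thread, so the following is only a sketch of the idea, not the commenter's code. It uses Puppeteer's request interception methods (`setRequestInterception`, `resourceType`, `continue`, `respond`, `abort`) and forwards only `document` and `xhr` requests, as described above. The `fetchViaProxy` callback is hypothetical: it stands in for whatever HTTP client performs the request through the chosen proxy and is assumed to resolve to `{ status, headers, body }`.

```javascript
// Resource types that are forwarded through the proxy, per the comment above.
const FORWARDED_TYPES = new Set(['document', 'xhr']);

// `page` is a Puppeteer Page; `fetchViaProxy` is a hypothetical callback
// that performs the request through a proxy and resolves to
// { status, headers, body }.
async function forwardRequestsThroughProxy(page, fetchViaProxy) {
  await page.setRequestInterception(true);
  page.on('request', async (request) => {
    if (!FORWARDED_TYPES.has(request.resourceType())) {
      // Everything else goes out through the browser's own network stack.
      await request.continue();
      return;
    }
    let response;
    try {
      response = await fetchViaProxy({
        url: request.url(),
        method: request.method(),
        headers: request.headers(),
        body: request.postData(),
      });
    } catch (err) {
      // The proxy request failed, so fail the page request as well.
      await request.abort();
      return;
    }
    // Hand the proxied response back to the page.
    await request.respond({
      status: response.status,
      headers: response.headers,
      body: response.body,
    });
  });
}
```

Because the request is re-issued outside the browser, cookies and redirects are handled by the HTTP client rather than by Chrome, which may explain differences in behaviour compared with a proxy set in the launch options.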
@flyxl I used your code in my project to forward all requests to a proxy, but it produced some 502 errors from the server. Adding the proxy config directly in the launch options works fine, of course. |
What Node.js version? |
v8.12.0
…On Fri, Jun 5, 2020 at 8:48 PM Gajus Kuizinas ***@***.***> wrote:
Anybody having issues with Puppeteer-page-proxy?
I'm getting the following error:
dist/source/create.js:155
yield item;
^^^^^
SyntaxError: Unexpected strict mode reserved word
at createScript (vm.js:80:10)
at Object.runInThisContext (vm.js:139:10)
at Module._compile (module.js:617:28)
What Node.js version?
|
Node needed to be updated. It's working fine |
Impressive! When will there be a Python version? |
Is this library still working for you? |
…ontext Issue: puppeteer#678 Example: const browser = await puppeteer.launch(); const context = await browser.createIncognitoBrowserContext('myproxy.com:3128'); const page = await context.newPage() await page.authenticate({username: 'foo', password: 'bar' }); await page.goto('https://google.com'); await browser.close();
…ontext Issue: puppeteer#678 Example: (async () => { const browser = await puppeteer.launch(); const context = await browser.createIncognitoBrowserContext('myproxy.com:3128'); const page = await context.newPage() await page.authenticate({username: 'foo', password: 'bar' }); await page.goto('https://google.com'); await browser.close(); })();
…ontext (#7516) Example: (async () => { const browser = await puppeteer.launch(); const context = await browser.createIncognitoBrowserContext('myproxy.com:3128'); const page = await context.newPage() await page.authenticate({username: 'foo', password: 'bar' }); await page.goto('https://google.com'); await browser.close(); })(); Issue: #678
@Nisthar |
You can use a proxy per browser context, which in the end is pretty similar
It is likely that we will never support proxy per page given that it is possible to set it per browser context which is cheap to create for each page (unless it gets supported in the browsers for other reasons). Therefore, closing this issue as not planned. |
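A per-context proxy could be wrapped roughly like this. This is a sketch under assumptions: the helper name `newPageWithProxy` is made up, and it relies on the context-creation method accepting an options object with a `proxyServer` field. The commit messages above use `createIncognitoBrowserContext`; as far as I know, newer Puppeteer releases name the same method `createBrowserContext`, so the helper tries both.

```javascript
// Hypothetical helper: opens a page in its own browser context so that
// the page gets its own proxy. `browser` is a Puppeteer Browser.
async function newPageWithProxy(browser, proxyServer, credentials) {
  // Prefer the newer method name if it exists, else fall back to the older one.
  const createContext = typeof browser.createBrowserContext === 'function'
    ? browser.createBrowserContext.bind(browser)
    : browser.createIncognitoBrowserContext.bind(browser);
  const context = await createContext({ proxyServer });
  const page = await context.newPage();
  if (credentials) {
    // Credentials for proxies that require authentication.
    await page.authenticate(credentials);
  }
  return { context, page };
}

// Usage (requires the puppeteer package; addresses are placeholders):
// const browser = await puppeteer.launch();
// const a = await newPageWithProxy(browser, 'myproxy.com:3128', { username: 'foo', password: 'bar' });
// const b = await newPageWithProxy(browser, 'otherproxy.com:3128');
// await a.page.goto('https://google.com');
// await a.context.close();
```

Closing the context when the page is done releases its cookies and cache, which keeps one long-lived browser usable across many proxies.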
Hi!
Could someone tell me, whether there's a possibility to set proxy not only for a chromium instance, but also for a page?
So the current solution is:
const browser = await puppeteer.launch({ args: [ '--proxy-server=127.0.0.1:9876' ] });
Desired solution in my case is something like this:
const page = await browser.newPage({ args: [ '--proxy-server=127.0.0.1:9876' ] });
With proxy per page there's a possibility to run a single chrome instance, but use different proxies depending on page.
Thanks in advance!