Create proxy handlers for China. #87

Closed
pauldotknopf opened this issue Jan 10, 2016 · 37 comments

@pauldotknopf

I don't know where to start this discussion so I will start it here.

China has blocked reCAPTCHA. The service is awesome, so I'd like to attempt to work around this.

My idea would be this.

  1. All network requests are proxied through a server handler. "http://mydomain.com/recaptcha/http://google.com/recaptcha/image.jpeg" and "http://mydomain.com/recaptcha/http://google.com/recaptcha/script.js" would be proxied and served as if the content came from "mydomain.com".
  2. There may need to be some "live" modification of the scripts and styles returned from the proxy so that all URLs and domains have "http://mydomain.com/recaptcha" prepended to them.

What do you guys think? I haven't tried it yet, but does anyone know if there would be any hangups with this approach?
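
A minimal sketch of the two steps above, written in Python and assuming the URL layout from the example URLs. `PROXY_PREFIX`, `upstream_url` and `rewrite_body` are hypothetical names used only for illustration; fetching the upstream content is left out.

```python
import re

# Hypothetical prefix taken from the example URLs above.
PROXY_PREFIX = "http://mydomain.com/recaptcha/"


def upstream_url(request_url):
    """Step 1: strip the proxy prefix to recover the URL to fetch upstream."""
    if not request_url.startswith(PROXY_PREFIX):
        raise ValueError("not a proxied URL: " + request_url)
    target = request_url[len(PROXY_PREFIX):]
    if not target.startswith(("http://", "https://")):
        raise ValueError("unsupported upstream URL: " + target)
    return target


def rewrite_body(body, upstream_hosts):
    """Step 2: prepend the proxy prefix to absolute URLs on the given hosts."""
    hosts = "|".join(re.escape(h) for h in upstream_hosts)
    # The negative lookahead keeps "google.com" from matching inside
    # a longer hostname such as "google.com.example".
    pattern = re.compile(r"https?://(?:" + hosts + r")(?![\w.-])")
    return pattern.sub(lambda m: PROXY_PREFIX + m.group(0), body)
```

This only catches URLs that appear as plain text in the response; URLs assembled at runtime by the script would not be rewritten.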

@paragonie-scott

#93 should help here.

@rowan-m rowan-m added the php label Aug 4, 2016
@zypA13510

zypA13510 commented Jan 31, 2017

Hello, for anyone who is still interested, I have made an Apache configuration that sets up a reverse proxy for reCAPTCHA on your own server under your own domain, yourdomain.com.

yourdomain.com/recaptcha -> www.google.com/recaptcha
static.yourdomain.com -> www.gstatic.com

Edit: moved to gist for easier maintenance
https://gist.github.com/zypA13510/fc3669a4c6957f3593c6ebed76d1d433
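
The host mapping above can be sketched as a small routing function. This is only an illustration in Python, not the Apache configuration from the gist; `upstream_for` and its `site` parameter are hypothetical names.

```python
def upstream_for(host, path, site="yourdomain.com"):
    """Return the upstream URL for a request, or None if it is not proxied.

    static.<site>/*          -> https://www.gstatic.com/*
    <site>/recaptcha[/...]   -> https://www.google.com/recaptcha[/...]
    """
    if host == "static." + site:
        return "https://www.gstatic.com" + path
    if host == site and (path == "/recaptcha" or path.startswith("/recaptcha/")):
        return "https://www.google.com" + path
    return None
```

Keeping the static host on its own subdomain means the path is passed through unchanged in both cases; only the hostname differs.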

@tabjy

tabjy commented Feb 16, 2017

@zypA13510

What did you do to https://www.google.com/recaptcha/api.js and https://www.gstatic.com/recaptcha/api2/r20170206171236/recaptcha__en_gb.js?

I tried to modify the hostnames in those files, but the browser still ends up sending requests to https://www.google.com/recaptcha/api2/userverify every time...

Thanks in advance

@zypA13510

zypA13510 commented Feb 16, 2017

@tabjy
Short answer: AddOutputFilterByType and Substitute.
For details, refer to the Apache documentation.
Make sure you have enabled the related modules in your Apache configuration file.
If the setup is successful, you shouldn't see any requests to www.google.com or www.gstatic.com in your network requests.
Feel free to ask if you need more help.

@barene

barene commented Feb 17, 2017 via email

@tabjy

tabjy commented Feb 18, 2017

@zypA13510
I'm not using Apache here but Node.js. I did something similar to substitute all of Google's domains with mine. It turned out I had forgotten to proxy /recaptcha/api2/anchor.
However, after fixing this issue, I got the following error from the browser:

Uncaught DOMException: Failed to construct 'Worker': Script at 'https://www.google.com/recaptcha/api2/webworker.js?hl=en&v=r20170213115309' cannot be accessed from origin 'http://localhost:3000'.

I'm absolutely sure that I substituted every domain in /recaptcha/api2/anchor. So I did some research and found out that "Google implemented a whole VM in JavaScript with a specific bytecode language" according to neuroradiology. Maybe the domain is embedded in that bytecode?
Since this is way beyond my knowledge, I'll have to give up on this.
Anyway, thanks for your help.

@zypA13510

@tabjy

  1. Sorry, there were some minor issues in the original code I provided. I have updated the code now (tested and working). Please update your code accordingly and test again.
  2. The URL in the bytecode stream is not critical, according to my research. reCAPTCHA still seems to work even if one of the requests (to www.google.com) fails.
  3. Make sure the recaptcha__<language_code>.js being served is properly filtered. Look at the response in your browser: all occurrences of www.google.com should have been replaced with yourdomain.com, and all occurrences of www.gstatic.com with static.yourdomain.com. Searching for those two strings should yield no results.
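
The check in point 3 can also be scripted instead of searching by hand. A minimal Python sketch, assuming you have already saved or fetched the filtered response body as text; `leftover_hosts` is a hypothetical helper name.

```python
# Hosts that should no longer appear in a correctly filtered response.
UPSTREAM_HOSTS = ("www.google.com", "www.gstatic.com")


def leftover_hosts(body, hosts=UPSTREAM_HOSTS):
    """Return {host: count} for every upstream host still present in body.

    The comparison is case-insensitive, since hostnames are.
    An empty dict means the filter replaced every plain-text occurrence.
    """
    lowered = body.lower()
    return {h: lowered.count(h) for h in hosts if h in lowered}
```

An empty result only says that no plain-text occurrence is left; hostnames built at runtime by the script would not show up in such a search.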

@javier-reguillo

Hello, How do I do this in IIS?
Thanks

@DarwinSilva

Hi, I am still trying to get this working on my Apache server, but the reCAPTCHA is not working.
Can you point to an example where this is running?
Thanks

@zypA13510

@DarwinSilva
I use this on my WordPress server with a reCAPTCHA plugin, and it seems to work in China without being blocked. (One request still points to google.com and fails, of course, but reCAPTCHA works nevertheless.)

@joinso

joinso commented Mar 20, 2017

Hi @zypA13510 !

I followed your steps but it doesn't work.
The first request to the page where Google reCAPTCHA is present applies the "SUBSTITUTE" without problems.
So

<script src="https://www.google.com/recaptcha/api.js?hl=es" async="async" defer="defer"></script>

changes to

<script src="https://mydomain.com/recaptcha/api.js?hl=es" async="async" defer="defer"></script>

That's ok.

However, in the response to "https://mydomain.com/recaptcha/api.js?hl=es", the SUBSTITUTE doesn't work.
I am still seeing "www.gstatic.com" inside.

So it seems that the SUBSTITUTE doesn't work on ProxyPass ....

Any idea?

Regards,
JOINSO

@joinso

joinso commented Mar 31, 2017

Hi!
Solved: misconfiguration on Apache.

@Augustin-FL

Augustin-FL commented Apr 4, 2017

Hello,

I tried to apply the idea of @zypA13510; however, I don't have full control over the virtual hosts. But I do have mod_rewrite enabled on my Apache, so I wrote a small PHP wrapper and a .htaccess:

.htaccess file

RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /recaptcha/index.php [L] 
#change to RewriteRule . /index.php [L] for static.yourwebsite.com

index.php file

<?php
/*
	simple reverse proxy for google reCaptcha in PHP :
	- if visitor is on "website.com/recaptcha/xxxxx" , then this script get "google.com/recaptcha/xxxx"
	- if visitor is on "static.website.com/xxxx", then this script get "gstatic.com/xxxx"
	
*/

$proxy="";//if the server you are on, need any proxy to go to the internet


// -- STEP 1 : decide if the visitor is on static.website.com or website.com/recaptcha. Also, get the domain name and the uri entered.

$uri=$_SERVER['REQUEST_URI'];
$host=$_SERVER["HTTP_HOST"];

if(strpos($host,'static.')===0) 
{
	$host=substr($host,strlen('static.'));
	$domain='https://www.gstatic.com/';
}
else
{
	$domain='https://www.google.com/';
}




// ---- Step 2 : We make the request (with curl)

$curl=curl_init(rtrim($domain,'/').$uri);// REQUEST_URI already starts with "/", so avoid a double slash
curl_setopt($curl, CURLOPT_PROXY, $proxy);// IMPORTANT : we need to enter proxy
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($curl, CURLOPT_TIMEOUT, 10); 
if(!empty($_SERVER['HTTP_COOKIE'])) curl_setopt($curl, CURLOPT_HTTPHEADER, array("Cookie: ".$_SERVER['HTTP_COOKIE']));
curl_setopt($curl, CURLOPT_HEADERFUNCTION, "curlResponseHeaderCallback");
	
	

if(!empty($_POST))// if we received any POST data, we send it with the request
{	
	// http_build_query() URL-encodes keys and values (plain concatenation would not)
	$post = http_build_query($_POST);

	curl_setopt($curl, CURLOPT_POST, 1);
	curl_setopt($curl, CURLOPT_POSTFIELDS,$post);
}
	


$response = curl_exec($curl);


// Step 3 : we get the request and replace all references to google.com by website.com
// and all references to gstatic.com by static.website.com

if($response!=false)
{	
	$response = str_replace("www.google.com", $host,$response);
	$response = str_replace("www.gstatic.com", "static.".$host,$response);
	$response = str_replace("https://", "http://",$response);
	
	echo $response;
}
else
{
	echo "/*error : could not get recaptcha*/";
}

//step 3 (bis) : we display the content type returned by the request
//and we also replace the cookies
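function curlResponseHeaderCallback($curl, $headerLine)
{	
	global $host; // $host is set at file scope and is not visible inside the function otherwise
	if (stripos($headerLine,'Set-Cookie:') === 0) // header names are case-insensitive
	{
		$headerLine = str_replace("www.google.com", $host,$headerLine);
		$headerLine = str_replace("www.gstatic.com", "static.".$host,$headerLine);
		header($headerLine, false); // false: keep earlier Set-Cookie headers instead of replacing them
	}
	else if (stripos($headerLine,'Content-Type:') === 0)
	{
		header($headerLine);
	}
	
	return strlen($headerLine); // Needed by curl
}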


?>

Here is how to use it :

(Yes, you need to copy each file to two locations. You will have 4 files at the end.)

Just to be clear: this is NOT the best way to proxy reCAPTCHA (creating a reverse proxy at the Apache level is clearly better); however, sometimes there is no other choice than doing this.

@stath715

@zypA13510

It seems that a lot of Google domains are also accessed: www.google.com, www.gstatic.com, support.google.com, developers.google.com, fonts.gstatic.com

I am stuck with https://www.google.com/js/bg/d--b7FVIhvCFHkmSrkgO9rhjbdCimjBfDEqJIwYWYPc.js, which is initiated by recaptcha__en.js, in which I can't see any "www.google.com" reference.

It's as if that URL was built by a JS function.

Do you have any update on your reverse proxy solution?

Thanks

@zypA13510

zypA13510 commented Dec 26, 2017

@stath715

Yes, it is as you said. Requests built from sources other than plain text (e.g. from a binary stream) cannot be detected by SUBSTITUTE. But I tested my solution on a client computer that had never visited Google or used any VPN. Although a few requests failed, it worked nevertheless and I was able to click the right images to get past reCAPTCHA (at least at the time of my previous post). However, I have not tested it recently, so I'm afraid I can't help you further. Sorry.

One thing about the Chinese Great Firewall is, it is ever-changing and evolving. And of course, its behavior is not, and will never be documented. In other words, it will never be easy, trying to grant access to a website that is not meant to be accessible. 😉

If you find a better solution, you can share it to help more (or I don't mind updating my comment). Good luck.

@joinso

joinso commented Dec 26, 2017

Hi!

Here is my solution that works.

Create two conf files in Apache and replace WWW.YOURDOMAIN.COM and YOURDOMAIN.COM with your domain.

  1. /etc/httpd/conf.d/mysite.conf:

<VirtualHost *:80>
ServerName WWW.YOURDOMAIN.COM:80
DocumentRoot /var/www/html
ProxyRequests Off
SSLProxyEngine On
SSLProxyVerify none
SSLProxyCheckPeerCN off
SSLProxyCheckPeerName off
SSLProxyCheckPeerExpire off
ProxyVia On
ProxyPreserveHost Off
GeoIPEnable On
GeoIPScanProxyHeaders On
GeoIPDBFile /usr/share/GeoIP/GeoIP.dat
GeoIPDBFile /usr/share/GeoIP/GeoIPCity.dat
GeoIPDBFile /usr/share/GeoIP/GeoIPASNum.dat
<Proxy *>
Order deny,allow
Allow from all
</Proxy>

  ProxyPass "/recaptcha" "https://www.google.com/recaptcha"
  ProxyPassReverse "/recaptcha" "https://www.google.com/recaptcha"
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    <If "%{ENV:GEOIP_COUNTRY_CODE} in { 'CN' }">
      RequestHeader unset Accept-Encoding
      FilterChain CUSTOMFILTER
      Substitute "s/www.google.com/WWW.YOURDOMAIN.COM/ni"
      Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.COM/ni"
   	</If>
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "WWW.YOURDOMAIN.COM"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.COM"
</VirtualHost>

  2. /etc/httpd/conf.d/gstatic.conf:

    <VirtualHost *:80>
    ServerName gstatic.YOURDOMAIN.COM:80
    SSLProxyEngine On
    ProxyVia On
    ProxyRequests Off
    ProxyPreserveHost Off
    <Proxy *>
    Order deny,allow
    Allow from all
    </Proxy>

    ProxyPass "/" "https://www.gstatic.com/"
    ProxyPassReverse "/" "https://www.gstatic.com/"
    FilterDeclare CUSTOMFILTER
    FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
    FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
    FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
    FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"

    RequestHeader unset Accept-Encoding
    FilterChain CUSTOMFILTER
    Substitute "s/www.google.com/www.YOURDOMAIN.COM/ni"
    Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.COM/ni"

    ProxyPassReverseCookieDomain "www.google.com" "WWW.YOURDOMAIN.COM"
    ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.COM"
    </VirtualHost>

@ykuz

ykuz commented Jan 18, 2018

Hello,

Does anyone have a solution for nginx?

@rehfeldchris

@zypA13510

I had trouble with the config you posted on Jan 30 until I took a clue from @joinso: it's important to add RequestHeader unset Accept-Encoding, otherwise the Substitute ... did not work for me. I was using Apache 2.4.29, in case the Substitute behavior has changed at some point. My guess is that the substitution didn't work because the response was gzipped, and stripping the header avoids that scenario.
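
The gzip guess can be illustrated outside Apache. The Python sketch below is only a demonstration of the general effect, not of mod_substitute itself: a plain string replacement finds nothing in the compressed bytes, but works once the body is uncompressed. The sample body and the `substitute` helper are made up for the example.

```python
import gzip

# Made-up response body; repeated so that gzip actually compresses it.
body = b'var s = "https://www.gstatic.com/recaptcha/api2/";\n' * 20
compressed = gzip.compress(body)


def substitute(data):
    """Plain byte-level replacement, similar in spirit to a text filter."""
    return data.replace(b"www.gstatic.com", b"static.yourdomain.com")


# The hostname is not present as literal bytes in the compressed stream,
# so the replacement leaves the compressed response untouched.
filtered_compressed = substitute(compressed)

# On the uncompressed body the same replacement works as intended.
filtered_plain = substitute(gzip.decompress(compressed))
```

Removing Accept-Encoding from the request asks the upstream server for an uncompressed response, which corresponds to the second case.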

@rehfeldchris

So, now that I implemented this clever hack, I'd like to warn others that it doesn't work very well.

  1. If I click the button to request an audio version of the captcha, it refuses, telling me "Your computer or network may be sending automated queries. To protect our users, we can't process your request right now. For more details visit our help page"
  2. When I solve the captcha the normal way (clicking images with street signs etc...) it makes me click much more than usual. It feels like about 15-25 image clicks before it is satisfied.
  3. The links for "Privacy" and "Terms" get rewritten to your domain, but the ProxyPass "/recaptcha" config ensures only urls that start with /recaptcha are actually proxied to google.com, and so these links fail. This could be fixed easily, but the previous 2 problems are probably very difficult or just not solvable, making it a moot point.
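
For the third point, one possible fix (an assumption, not something tested in this thread) is to rewrite only those occurrences of www.google.com that are followed by a /recaptcha path, so that other links keep pointing at Google. A Python sketch of that narrower substitution, with `rewrite_recaptcha_links` and `proxy_host` as hypothetical names:

```python
import re

# Match the host only when the path that follows starts with /recaptcha.
RECAPTCHA_HOST = re.compile(r"www\.google\.com(?=/recaptcha\b)")


def rewrite_recaptcha_links(body, proxy_host="yourdomain.com"):
    """Rewrite www.google.com/recaptcha... to the proxy host, leave other links alone."""
    return RECAPTCHA_HOST.sub(proxy_host, body)
```

The trade-off is that the untouched links (for example "Privacy" and "Terms") still point at google.com, so they remain unreachable for users who cannot reach Google; the alternative is to proxy those paths as well. URLs written with escaped slashes inside scripts would need a separate pattern.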

@Equim-chan

Equim-chan commented Feb 24, 2018

You may try recaptcha.net. It's official and accessible from China. Just change

https://www.google.com/recaptcha/api.js?render=explicit

to

https://recaptcha.net/recaptcha/api.js?render=explicit

in front end, and

https://www.google.com/recaptcha/api/siteverify

to

https://recaptcha.net/recaptcha/api/siteverify

in back end, and it should work as expected.
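
For the back-end half, a minimal Python sketch of building the verification request against the recaptcha.net endpoint named above. `build_siteverify_request` is a hypothetical helper; the `secret`, `response` and `remoteip` form fields are the ones the siteverify API accepts. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

SITEVERIFY_URL = "https://recaptcha.net/recaptcha/api/siteverify"


def build_siteverify_request(secret, token, remote_ip=None, url=SITEVERIFY_URL):
    """Build (but do not send) the POST request that verifies a user's token."""
    fields = {"secret": secret, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional field
    data = urlencode(fields).encode("ascii")
    return Request(url, data=data, method="POST")
```

Passing the request to `urllib.request.urlopen` and parsing the JSON body would complete the check; keeping the URL as a parameter makes it easy to switch between the google.com and recaptcha.net hosts.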

@rehfeldchris

rehfeldchris commented Mar 8, 2018

So, I tried using recaptcha.net, but I see it still loads 1 asset from google.com

eg, the url that starts with: https://www.google.com/recaptcha/api2/anchor...

I tried putting this url into the "test url" tab of https://en.greatfire.org/analyzer but it said it failed to load

Is there any official comment from google on this?

@sdemjanenko

sdemjanenko commented Mar 8, 2018

I was able to get this to work.

In the page I added:

 <script>
  window['__recaptcha_api'] = "https://my.recaptcha.proxy.com/recaptcha_proxy/google_com/";
 </script>
 <script type="text/javascript" src="https://my.recaptcha.proxy.com/recaptcha_proxy/google_com/api.js"></script>

In Nginx I added (note: this is ERB which compiles to Nginx config):

location /recaptcha_proxy/#{proxy_path}/api.js {
 proxy_redirect off;
 proxy_set_header X-Real-IP $remote_addr;
 proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 proxy_set_header Host "www.#{domain}";
 proxy_set_header User-Agent $http_user_agent;
 proxy_set_header Referer $http_referer;
 proxy_cookie_domain "#{domain}" $host;
 proxy_set_header Accept-Encoding "";
 sub_filter_once off;
 sub_filter_types text/css text/html text/javascript;
 sub_filter "www.gstatic.com/recaptcha" "$http_host/gstatic_proxy/recaptcha";
 sub_filter "fonts.gstatic.com/" "$http_host/gstatic_fonts_proxy/";
 sub_filter "recaptcha.anchor.Main.init(" "window.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nrecaptcha.anchor.Main.init(";
 sub_filter "recaptcha.anchor.ErrorMain.init(" "window.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nrecaptcha.anchor.ErrorMain.init(";
 sub_filter "recaptcha.frame.Main.init(" "window.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nrecaptcha.frame.Main.init(";
 sub_filter "recaptcha.frame.ErrorMain.init(" "window.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nrecaptcha.frame.ErrorMain.init(";
 sub_filter "importScripts(" "this.__recaptcha_api = '$http_host/recaptcha_proxy/#{proxy_path}/';\nimportScripts(";
 proxy_pass https://www.#{domain}/recaptcha/api.js$is_args$args;
}

location ~* ^/recaptcha_proxy/#{proxy_path}/api2/(.+)$ {
 ... same as above
 proxy_pass https://www.#{domain}/recaptcha/api2/$1$is_args$args;
}

location ~* ^/gstatic_proxy/recaptcha/(.+)$ {
 proxy_pass https://www.gstatic.com/recaptcha/$1$is_args$args;
 proxy_set_header X-Real-IP $remote_addr;
 proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 proxy_set_header User-Agent $http_user_agent;
 proxy_set_header Referer $http_referer;
 proxy_set_header Accept-Encoding "";
 sub_filter_once off;
 sub_filter_types text/css text/html text/javascript;
 sub_filter "www.gstatic.com/recaptcha" "$http_host/gstatic_proxy/recaptcha";
 sub_filter "fonts.gstatic.com/" "$http_host/gstatic_fonts_proxy/";
 sub_filter "www.gstatic.c..?" "$http_host\\/gstatic_proxy";
 sub_filter "/recaptcha/api2/" "";
}

location ~* ^/gstatic_fonts_proxy/(.+)$ {
 proxy_pass https://fonts.gstatic.com/$1$is_args$args;
 proxy_set_header X-Real-IP $remote_addr;
 proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 proxy_set_header User-Agent $http_user_agent;
 proxy_set_header Referer $http_referer;
}

@msegzda

msegzda commented Apr 19, 2018

Is this workaround officially "supported"?

@Augustin-FL

@msegzda "officially"? There is no such concept in China.

You need to understand something: when the Chinese government decides to ban websites, officials just tell the Chinese ISPs "please ban Google", without any clear explanation of what "Google" is.

Because of this, Chinese ISPs ban websites according to their own interpretation, which is sometimes subjective. The two main ISPs in China (China Unicom and China Telecom) don't use the same ban list and/or ban depending on the area you are in... so basically, a website can be sometimes accessible and sometimes not, in a chaos which is so representative of China.

@msegzda

msegzda commented Apr 19, 2018

@Augustin-FL thanks for the reply. Let me clarify. I don't care about China specifically, and this is probably the wrong place, but what I'm asking is: what is Google's official stance on the community using these reverse-proxy workarounds? As you can see, there are a number of complex substitutions happening in Nginx or Apache filters in order to remove any Google servers and replace them with one's own domains and endpoints. If the reCAPTCHA sources change in those parts, all of these implementations blow up! So what I want to know is: does Google support or back the community doing this workaround, and are they and the community careful about changing code in ways that could break all the zillions of websites in China using reCAPTCHA?

@SimonVillage

@sdemjanenko would you mind updating your nginx snippet? It seems like api2 is no longer available?

@rowan-m
Contributor

rowan-m commented Aug 1, 2018

I'm updating the client on the v1.2 branch to allow you to set an arbitrary URL for the siteverify call which may help in testing environments or other situations.

@rowan-m rowan-m added question and removed php labels Aug 1, 2018
@rowan-m rowan-m closed this as completed Aug 1, 2018
@rowan-m
Contributor

rowan-m commented Aug 1, 2018

Reliability of results and user experience if you're going through a proxy is really out of scope for this repo. I've updated the code to allow for setting of an explicit URL and I'm happy to take PRs that add RequestMethods for better working within a proxy.

@eyeinsky

@zypA13510 do you do the yourdomain.com/static.yourdomain.com separation simply to avoid filename clashes? I.e if one is feeling lucky and there are no name clashes then one could serve a bunch of domains from within yourdomain.com/recaptcha?

@zypA13510

@eyeinsky

do you do the yourdomain.com/static.yourdomain.com separation simply to avoid filename clashes?

Another reason is that rewriting the path (the part after the hostname) in a reverse proxy is very troublesome and tends to have undesired results.

if one is feeling lucky and there are no name clashes then one could serve a bunch of domains from within yourdomain.com/recaptcha?

good luck with that. But personally, I don't think it's the right way to go (unless you are really limited to one domain only and have no other choice)

@rwat090

rwat090 commented Sep 1, 2018

Hi Everyone,

I have written a reverse proxy solution for our China customers, but the solution is unstable. For example, we have the following issue:

It takes a user about 8 to 9 verify requests before the user's response is accepted within the reCAPTCHA interface.

I receive the following response for the POST to "/recaptcha/api2/userverify?k=xxxx"

["uvresp","xxxxx",,0,null,null,null,null,["rresp","03xxxxx",null,120,["pmeta",["/m/01bjv",null,3,3,3,null,"bus",[] ] ,null,[1,3000] ]"dynamic",null,["bgdata","xxxx"]]

If I use the Google domain directly, I receive a successful reCAPTCHA response within 3 attempts

["uvresp","03xxxxx",1,120]

Basic Curl POST to Google reCAPTCHA Domain (Bypassing Proxy)

["uvresp",null,null,null,1]

I'm trying to understand the issue with the response. I'm wondering if it's related to the session. Does anyone have any tips or advice on how to decode the response to confirm the issue?

@batou-mtcapthca

One can also consider using another captcha service as a fallback when reCAPTCHA fails to load. Here is an example by MTCaptcha: https://www.mtcaptcha.com/faq-recaptcha-fallback-mtcaptcha. MTCaptcha is not free, though it does have a relatively cheap plan if you only need to support traffic in China.

Full transparency: I work for MTCaptcha, and it's an awesome service :-)

@joinso

joinso commented Oct 22, 2019

Hi!

I am posting a new version of my solution.
It does not work, but perhaps someone like @rwat090 can help us.

In /etc/httpd/conf.d/yourdomain.conf:

  <VirtualHost *:80>      	
  ServerName www.YOURDOMAIN.com:80 	
  DocumentRoot /var/www/html
  ProxyRequests Off      
  SSLProxyEngine On
  SSLProxyVerify none 
  SSLProxyCheckPeerCN off
  SSLProxyCheckPeerName off
  SSLProxyCheckPeerExpire off      
  ProxyVia On
  ProxyPreserveHost Off      
  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>
  ProxyPass /s3fs-css/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPassReverse /s3fs-css/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPass /s3fs-js/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPassReverse /s3fs-js/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPass /s3fs-images/ https://static.YOURDOMAIN.com/s3fs-public/
  ProxyPassReverse /s3fs-images/ https://static.YOURDOMAIN.com/s3fs-public/
          
  ProxyPass "/recaptcha" "https://www.google.com/recaptcha"
  ProxyPassReverse "/recaptcha" "https://www.google.com/recaptcha"
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    <If "%{ENV:COUNTRY_CODE} in { 'CN','HK' }">
      RequestHeader unset Accept-Encoding
      FilterChain CUSTOMFILTER
      Substitute "s/www.google.com/www.YOURDOMAIN.com/ni"
      Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.com/ni"
   	</If>
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "www.YOURDOMAIN.com"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.com"
  </VirtualHost>

In ssl.conf:

  <VirtualHost *:443>
  ServerName www.YOURDOMAIN.com
  ProxyRequests Off      
  SSLProxyEngine On
  SSLProxyVerify none 
  SSLProxyCheckPeerCN off
  SSLProxyCheckPeerName off
  SSLProxyCheckPeerExpire off      
  ProxyVia On
  ProxyPreserveHost Off
  MaxMindDBEnable On   

  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>
                
  ProxyPass "/recaptcha" "https://www.google.com/recaptcha"
  ProxyPassReverse "/recaptcha" "https://www.google.com/recaptcha"
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    <If "%{ENV:COUNTRY_CODE} in { 'CN','HK' }">
      RequestHeader unset Accept-Encoding
      FilterChain CUSTOMFILTER
      Substitute "s/www.google.com/www.YOURDOMAIN.com/ni"
      Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.com/ni"
   	</If>
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "www.YOURDOMAIN.com"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.com"              
  </VirtualHost>

  <VirtualHost *:443>
  ServerName gstatic.YOURDOMAIN.com
  ErrorLog logs/gstatic.com.error_log
  TransferLog logs/gstatic.com.access_log
  SSLProxyEngine On
  ProxyVia On
  ProxyRequests Off
  ProxyPreserveHost Off
  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>      
  ProxyPass "/" "https://www.gstatic.com/"
  ProxyPassReverse "/" "https://www.gstatic.com/"      
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    RequestHeader unset Accept-Encoding
    FilterChain CUSTOMFILTER
    Substitute "s/www.google.com/www.YOURDOMAIN.com/ni"
    Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.com/ni"
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "www.YOURDOMAIN.com"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.com"
  </VirtualHost>

In /etc/httpd/conf.d/maxmind_geolite2.conf:

  LoadModule maxminddb_module /usr/lib64/httpd/modules/mod_maxminddb.so    
  <IfModule mod_maxminddb.c>
    MaxMindDBEnable On
    MaxMindDBFile ASN_DB /usr/share/GeoIP/GeoLite2-ASN.mmdb
    MaxMindDBEnv MM_ASN ASN_DB/autonomous_system_number
    MaxMindDBEnv MM_ASORG ASN_DB/autonomous_system_organization
    
    MaxMindDBFile CITY_DB /usr/share/GeoIP/GeoLite2-City.mmdb
    MaxMindDBEnv MM_COUNTRY_CODE CITY_DB/country/iso_code
    MaxMindDBEnv MM_COUNTRY_NAME CITY_DB/country/names/en
    MaxMindDBEnv MM_CITY_NAME CITY_DB/city/names/en
    MaxMindDBEnv MM_LONGITUDE CITY_DB/location/longitude
    MaxMindDBEnv MM_LATITUDE CITY_DB/location/latitude    
    MaxMindDBEnv MM_REGION_CODE  CITY_DB/subdivisions/0/iso_code        
    
    MaxMindDBFile COUNTRY_DB /usr/share/GeoIP/GeoLite2-Country.mmdb
    MaxMindDBEnv COUNTRY_CODE COUNTRY_DB/country/iso_code  
  </IfModule>

In /etc/httpd/conf.d/gstatic.conf:

  <VirtualHost *:80>
  ServerName gstatic.YOURDOMAIN.com:80  
  SSLProxyEngine On
  ProxyVia On
  ProxyRequests Off
  ProxyPreserveHost Off
  <Proxy *>
    Order deny,allow
    Allow from all       
  </Proxy>      
  ProxyPass "/" "https://www.gstatic.com/"
  ProxyPassReverse "/" "https://www.gstatic.com/"      
  FilterDeclare CUSTOMFILTER
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/css|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/javascript|"
  FilterProvider CUSTOMFILTER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^application/javascript|"
  <Location />
    RequestHeader unset Accept-Encoding
    FilterChain CUSTOMFILTER
    Substitute "s/www.google.com/www.YOURDOMAIN.com/ni"
    Substitute "s/www.gstatic.com/gstatic.YOURDOMAIN.com/ni"
  </Location>
  ProxyPassReverseCookieDomain "www.google.com" "www.YOURDOMAIN.com"
  ProxyPassReverseCookieDomain "www.gstatic.com" "gstatic.YOURDOMAIN.com"
  </VirtualHost>

If you are behind a proxy: In /etc/httpd/conf.d/remoteip.conf:

  LoadModule remoteip_module modules/mod_remoteip.so
  RemoteIPHeader X-Forwarded-For

As @rwat090 says, reCAPTCHA works and loads all the images, but it takes about 10 tests to pass it.
However, on submit, it says that the reCAPTCHA is wrong.

Regards,
JOINSO

@blinkybill

blinkybill commented Nov 8, 2019

@joinso. Maybe it's working fine, and by the tenth pass it's simply failing due to a timeout? (From memory, the reCAPTCHA client response code needs to be validated server-side within 2 minutes or something.)

@rehfeldchris the original post is a couple of years old, but I'd expect that Google switches the dependent JS files automatically, out of the box, depending on where the request comes from. Otherwise the "recaptcha.net" domain they've offered as an alternative for "International" use cases would be pointless. E.g. for supporting someone in China, I don't think Google engineers would be silly enough to ask us on their official site to load the first script via "recaptcha.net" and simply have all subsequent dependent files still load from "google.com".

@somireddysathiDB

Hi,

We are using an ALB as the client-facing endpoint and have an Apache reverse proxy between the ALB and the application server. We implemented the same solution in the Apache reverse proxy without any virtual host. I get the JS file if I access it from the browser as "http:///recaptcha/api.js", but I get a 404 error when I try "https:///recaptcha/api.js". I checked the Apache reverse proxy logs: the connection to Google is established, but somehow Google returns a 404 error. Can you please share your thoughts on what is going wrong?

@somireddysathiDB

Hi,

Can someone shed light on the above issue? I'd appreciate your input.

@swetalina-orangescrum

Hello, for anyone who is still interested, I have made an apache configuration that will setup a reverse proxy for Recaptcha using your own server under your domain yourdomain.com.

yourdomain.com/recaptcha -> www.google.com/recaptcha static.yourdomain.com -> www.gstatic.com

Edit: moved to gist for easier maintenance https://gist.github.com/zypA13510/fc3669a4c6957f3593c6ebed76d1d433

https://www.gstatic.com
Could anyone explain why this URL does not open after setting the proxy URL on the server?
