Selective ("smart") proxying #887

Open
fortuna opened this issue Nov 11, 2020 · 27 comments
Labels
feature request New feature, we are tracking it

Comments

@fortuna
Collaborator

fortuna commented Nov 11, 2020

This issue is to track selective proxying (also known as "smart routing") on the Outline clients and gather feedback and input from the community.

Why?

There's large demand for not proxying what is not blocked. Some reasons:

  • Performance, especially to access domestic websites. Proxying domestic traffic crosses the national border twice in each direction, drastically reducing performance.
  • Client Cost: Some countries charge more for international traffic. Proxying also breaks the zero-rating done by some ISPs.
  • Provider Cost: Cost is a bottleneck for service providers, and they often have to limit the number of people they help to limit their expenses. Reducing costs can enable providers to help more people.
  • Blocking resistance: all traffic going to a single server is suspicious. Also, there is a report that the GFW may correlate up and downstream traffic when a user uses a proxy to access domestic servers.
  • Geo restrictions: many payment platforms require your IP to be in the country you claim to be in.

Approaches

Bypass Domestic traffic

Intercept all DNS traffic:

  • If the requested domain ends with a ccTLD of the user's country, use the system resolver. Otherwise, use the Outline server resolver.
    • The assumption is that the domestic domains are unblocked, since they are under control of the government.

On TCP or UDP connection:

  • Map the IP address to country. If it matches the user's country, connect directly. Otherwise, connect via the Outline server.
    • The assumption is that domestic IPs are unblocked, since they are under control of the government.
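
The two decisions above could be sketched roughly as follows. This is only an illustration, not Outline code: the country code, the function names, and the CIDR ranges (RFC 5737 documentation ranges standing in for a real IP -> country table) are all assumptions.

```python
import ipaddress

USER_COUNTRY_CCTLD = "cn"  # assumption: the ccTLD of the user's country

# Assumption: a tiny stand-in for an IP -> country table that keeps only the
# user's country. These are documentation ranges, not real allocations.
DOMESTIC_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def pick_resolver(domain: str) -> str:
    """Return "system" for domains under the user's ccTLD, else "proxy"."""
    labels = domain.rstrip(".").lower().split(".")
    return "system" if labels[-1] == USER_COUNTRY_CCTLD else "proxy"


def pick_route(ip: str) -> str:
    """Return "direct" for domestic destination IPs, else "proxy"."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in DOMESTIC_NETWORKS):
        return "direct"
    return "proxy"
```

Under these assumptions, `pick_resolver("example.cn")` picks the system resolver, while a domestic service hosted on a `.com` domain would still be resolved through the proxy.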

Difficulties:

  • With DNS resolution done by the proxy, you may get a foreign IP when a domestic one is available.
    • This is an issue with global services like WeChat, which will give you a foreign IP if resolved abroad (203.205.235.205 instead of 59.37.97.124).
  • We will need to load an IP -> country table on the client, which is extremely memory-constrained on mobile.
    • Mitigation: keep the data for the user's country only.
  • The table will need to be kept updated.
  • An IP may be mapped to the wrong country.
    • Open question: Is that an issue in practice? Do we need an escape hatch?

Examples

Domain or IP list

The customer could specify domains or IPs to force or bypass proxying.
To handle domains, we need to convert them to IPs. To do so, the client can intercept all DNS requests. Then depending on the approach:

  • If the domain is in a force list:
    • Forward the DNS request to the proxy
    • Add the answer IPs to the force proxying list of IPs, with a TTL
  • If the domain is in a bypass list:
    • Send DNS request via the local network
    • Add the answer IPs to the bypass proxying list of IPs, with a TTL

Note that IPs may affect more than one domain.
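
A minimal sketch of the bookkeeping described above: DNS answers for listed domains feed an IP table whose entries expire with the DNS TTL. The list contents, class name, and function names are hypothetical, and the injectable clock exists only to make the expiry logic easy to exercise.

```python
import time

FORCE_DOMAINS = {"blocked.example"}     # hypothetical force-proxy list
BYPASS_DOMAINS = {"domestic.example"}   # hypothetical bypass list


def classify(domain):
    """Return "proxy", "direct", or None if the domain is on neither list."""
    domain = domain.rstrip(".").lower()

    def listed(rules):
        # Match the listed domain itself and any of its subdomains.
        return any(domain == d or domain.endswith("." + d) for d in rules)

    if listed(FORCE_DOMAINS):
        return "proxy"
    if listed(BYPASS_DOMAINS):
        return "direct"
    return None


class IpRuleTable:
    """Maps answer IPs to an action until the DNS TTL expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._rules = {}  # ip -> (action, expiry time)

    def add(self, ip, action, ttl):
        self._rules[ip] = (action, self._clock() + ttl)

    def lookup(self, ip):
        """Return the action for ip, or None if there is no live rule."""
        entry = self._rules.get(ip)
        if entry is None:
            return None
        action, expiry = entry
        if self._clock() >= expiry:
            del self._rules[ip]
            return None
        return action


def on_dns_answer(table, domain, answers):
    """Record answer IPs; `answers` is a list of (ip, ttl_seconds) pairs."""
    action = classify(domain)
    if action is None:
        return
    for ip, ttl in answers:
        table.add(ip, action, ttl)
```

Because the table is keyed by IP only, a later answer for a different domain on the same IP overwrites the earlier rule, which is one concrete form of the shared-IP caveat.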

Difficulties:

  • In practice services use multiple different domains, so having users enter domains is not practical for large services.
    • We could rely on carefully crafted lists. There are lists like the GFW list. However that's country-specific, and we need an approach that works globally.
    • Perhaps we can show the list of recent DNS queries and let users pick what to bypass/force proxying.
  • Outline works as a VPN, not a proxy. It sees IPs, not domain names.
    • We could monitor the resolution of the listed domains and use the answer IPs. This will affect other domains running on the same IP (e.g. it will proxy google.com if youtube.com is in the proxy list).
    • Intercept DNS and assign local IPs to listed domains ("fake DNS"), running port forwarders via the proxy on those IPs (Lantern does that). This is very similar to the RFC 6535 standard, Dual-Stack Hosts Using "Bump-in-the-Host" (BIH), for mapping IPv6 to IPv4, but we would map domains instead.
    • Do DPI. Look at the HTTP GETs and SNIs in order to decide how to connect to the destination. This forces the client to buffer the connection until the decision can be made, adding delay and memory use (shadowsocks-libev seems to use that). It's fragile and doesn't work with ECH and other protocols.
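
To make the "fake DNS" option more concrete, here is a rough sketch of the mapping it relies on. It is not how Lantern or any other client implements it: the class name and the choice of 198.18.0.0/15 (the range reserved for benchmarking) as the placeholder pool are assumptions.

```python
import ipaddress


class FakeDns:
    """Hands out placeholder IPs for listed domains and remembers the mapping."""

    def __init__(self, pool="198.18.0.0/15"):
        # Assumption: placeholder addresses come from a reserved range that
        # should never appear as a real destination.
        self._hosts = ipaddress.ip_network(pool).hosts()
        self._by_domain = {}
        self._by_ip = {}

    def resolve(self, domain):
        """Return the placeholder IP for a domain, allocating one if needed."""
        domain = domain.rstrip(".").lower()
        if domain not in self._by_domain:
            ip = str(next(self._hosts))  # StopIteration once the pool runs out
            self._by_domain[domain] = ip
            self._by_ip[ip] = domain
        return self._by_domain[domain]

    def domain_for(self, ip):
        """Return the domain behind a placeholder IP, or None for real IPs."""
        return self._by_ip.get(ip)
```

When a connection later arrives for a placeholder IP, `domain_for` recovers the original name, so two domains that share a real IP can still be routed differently.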

Examples

Application list

We can provide a way for the user to force proxying or force bypass for specific applications on a device (#933).

That's doable on Android (we do it for Intra), and it seems we can use network namespaces (see also ip-netns) on Linux.

Difficulties:

  • It's not possible on iOS. It's unclear to me what we can do on macOS and Windows.

Detect what's blocked

Outline could connect directly by default, and only proxy in case of blocking.

Difficulties:

  • It can be hard to detect blocking. There are country-specific tricks, but it's hard to come up with something that works globally:
    • In China, blocked domains always return foreign IPs. Domestic IPs indicate it's not blocked.
    • In South Korea, blocked domains return the IP for warning.or.kr, which is a domestic IP.
    • In Iran, you get a local network IP (10.10.x.x)
  • It can make the client vulnerable to fingerprinting.
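
The country-specific tricks listed above could be encoded along these lines. This is a sketch only: the function name and parameters are made up, the caller is assumed to supply both the domestic/foreign verdict and the warning-page IPs, and the three-valued result reflects that most of these signals are one-directional.

```python
import ipaddress

IRAN_BLOCK_NET = ipaddress.ip_network("10.10.0.0/16")  # the 10.10.x.x answers


def looks_blocked(country, answer_ip, is_domestic, warning_ips=frozenset()):
    """Guess from a DNS answer whether a domain is blocked.

    Returns True (likely blocked), False (likely not blocked), or None
    (inconclusive). `is_domestic` says whether answer_ip maps to the user's
    country; `warning_ips` holds known block-page IPs, supplied by the caller.
    """
    addr = ipaddress.ip_address(answer_ip)
    if country == "IR":
        return True if addr in IRAN_BLOCK_NET else None
    if country == "KR":
        return True if answer_ip in warning_ips else None
    if country == "CN":
        # A domestic answer indicates "not blocked". A foreign answer is
        # inconclusive, since unblocked foreign sites also resolve abroad.
        return False if is_domestic else None
    return None  # no known heuristic for this country
```

The `None` branches show why this is hard to generalize: outside the few known patterns, a DNS answer alone says little about blocking.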

Examples:

Thanks @alalamav for doing a lot of this research.

@fortuna added the "feature request" label on Nov 11, 2020
@fortuna changed the title from "Selective proxying" to "Selective ("smart") proxying" on Nov 11, 2020
@felixding

Glad to see this. A few notes:

With DNS resolution done by the proxy, you may get a foreign IP when a domestic one is available. Is this a big issue in practice?

It is. For example, we almost instantly got user complaints when we accidentally got foreign IPs for WeChat domains in our VPN client.

Bad IP geolocation. Is that an issue in practice? Do we need an escape hatch?

Is it a different issue from above?

Domain or IP list

v2ray solved this problem by introducing an internal DNS and a GeoSite database: https://www.v2ray.com/en/configuration/dns.html . Does Outline use tun2socks? If so Outline should be able to see domains, no?

we need an approach that works globally

Do people from other places have the same feature request? I think you can add it country by country.

Application list

Some apps (e.g. Shadowsocks for Android) used to have this feature, but AFAIK it's not popular anymore.

@fortuna
Collaborator Author

fortuna commented Nov 12, 2020

@felixding Would this heuristic work?

  • If domain ends with the user's country ccTLD, use system resolver. Else, use proxy resolver

Do the wechat domains end with the country code TLD?

@fortuna
Collaborator Author

fortuna commented Nov 12, 2020

On bad geolocation, sometimes the IP -> country map may be wrong, so you may end up proxying something you don't need to, or not proxying something you need to. But perhaps it's good enough in most countries.

SOCKS proxies see domain names, but Outline sees IP packets. Outline uses tun2socks, but that gives us connections with IP:port, not domains. Does v2ray intercept all system traffic or does it work as a system proxy?

@database64128

database64128 commented Nov 12, 2020

Does v2ray intercept all system traffic or does it work as a system proxy?

Both are supported and widely used. When intercepting all traffic, a sniffing mechanism can be used to detect the target domain name from HTTP/HTTPS requests. A new FakeDNS implementation is coming soon (v2fly/v2ray-core#406).

@fortuna
Collaborator Author

fortuna commented Nov 12, 2020

There are corner cases for the domestic bypass. For example, you may be in a network in Canada that blocks psiphon.ca, even though the government doesn't block it. In practice I don't think that will be a big issue, and we are really mainly focused on state-level blocking anyway. The user still has the option to fall back to proxying all traffic.

@Loyalsoldier

Proxy all traffic by default, except domains and IPs in the list.

BTW, V2Ray (now maintained by the V2Fly team) gathers popular domains, classified by organization/company, in this repo: https://github.com/v2fly/domain-list-community

@felixding

@fortuna

If domain ends with the user's country ccTLD, use system resolver. Else, use proxy resolver

Interesting idea. It may work for some websites but certainly not all, as most popular websites in China use .com.

Do the wechat domains end with the country code TLD?

No. WeChat probably has some .cn domains, but they are not using them publicly.

On bad geolocation, sometimes the IP -> country map may be wrong...But perhaps it's good enough in most countries.

Should be fine I think.

Does v2ray intercept all system traffic or does it work as a system proxy?

v2ray is a proxy just like Shadowsocks. It doesn't have tun so it doesn't intercept all system traffic.

Unless I missed something, tun2socks should give you domains. Our implementation is like:

  • badvpn-tun2socks -> v2ray's socks server (e.g. localhost:1081)
  • all the DNS and routing are done by v2ray

@fortuna
Collaborator Author

fortuna commented Nov 13, 2020

@felixding

Unless I missed something but tun2socks should give you domains. Our implementation is like:

badvpn-tun2socks -> v2ray's socks server (e.g. localhost:1081)
all the DNS and routing are done by v2ray

V2Fly doesn't get the domains from tun2socks. They use a fake DNS to map fake IPs to the original domains.

@felixding

No. V2ray does not have Fake DNS yet. There is a PR, but it is not merged: v2fly/v2ray-core#406

The way v2ray gets domains is sniffing which "extracts domain names from TLS and HTTP traffic" (https://guide.v2fly.org/en_US/app/transparent_proxy.html#notes).
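
For readers unfamiliar with sniffing, this is roughly what extracting a name from the first bytes of a connection involves. It is a simplified sketch, not v2ray's implementation: the function names are made up, it assumes the whole ClientHello or request head arrives in one buffer, and it cannot see names hidden by ECH.

```python
import struct


def sniff_sni(data: bytes):
    """Return the SNI host name from a TLS ClientHello, or None."""
    try:
        # 0x16 = handshake record, 0x01 = ClientHello handshake message.
        if data[0] != 0x16 or data[5] != 0x01:
            return None
        pos = 5 + 4 + 2 + 32          # record hdr, handshake hdr, version, random
        pos += 1 + data[pos]          # session ID
        pos += 2 + struct.unpack_from("!H", data, pos)[0]  # cipher suites
        pos += 1 + data[pos]          # compression methods
        ext_end = pos + 2 + struct.unpack_from("!H", data, pos)[0]
        pos += 2
        while pos + 4 <= ext_end:
            ext_type, ext_len = struct.unpack_from("!HH", data, pos)
            pos += 4
            if ext_type == 0:         # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = data[pos + 2]
                name_len = struct.unpack_from("!H", data, pos + 3)[0]
                name = data[pos + 5 : pos + 5 + name_len]
                if name_type != 0 or len(name) != name_len:
                    return None
                return name.decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None


def sniff_http_host(data: bytes):
    """Return the Host header value of a plain HTTP request head, or None."""
    head = data.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:   # skip the request line
        key, sep, value = line.partition(b":")
        if sep and key.strip().lower() == b"host":
            return value.strip().decode("ascii", "replace")
    return None
```

Both functions need the first payload bytes before a routing decision can be made, so the client has to buffer the start of each connection.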

@hadifarnoud

Regarding the DNS question, I think we can assume that all domestic IPs don't need to be proxied. Also, the Alexa Top Sites list for each country can be useful: if a site is on the country's top sites list, it is presumably not blocked, so there is no reason to proxy it.

@crucifyer

It is necessary to be able to set up a list of IPs that require a proxy, or a country that does not require a proxy.
I can't set anything right now.

@fanyinghao

fanyinghao commented Jun 7, 2021

I don't think the client needs to work out by itself whether an IP, domain, or location is blocked in a given country. Just provide an interface to turn the proxy on or off, and let rules be edited manually or retrieved remotely. Developers in each country will know best which rules they need. This makes the client more extensible. @fortuna

@crucifyer

It would be nice to be able to manually edit the allow/deny IP list.

@JamasChuang94

Domain names or IP addresses on the whitelist should go through the proxy, and everything else should connect directly. Users who don't need PAC can select the global mode, which proxies all traffic.

@hadifarnoud

What Clash Proxy did with Enhanced Mode is probably the best. It uses a VPN mode for the entire system, but internally doesn't send traffic through the proxy if it is whitelisted. PAC/proxy mode might not be supported by all apps.

@fortuna
Collaborator Author

fortuna commented Nov 11, 2022

There's a report that, when a user in China uses a proxy to access a target in China, the GFW may be able to correlate the incoming and outgoing traffic in the proxy, and determine it's a proxy: net4people/bbs#129 (comment)

Bypassing domestic traffic would prevent that.

@kayx23

kayx23 commented Mar 7, 2023

@fortuna as a user, I would very much appreciate it if a split tunnel option could be implemented soon. Switching the VPN on and off all day is inconvenient, as I need certain traffic to go through Chinese network routing (e.g. video calls) while googling stuff in my browser. I would hate to have to move away from Outline due to the lack of a split tunnelling option.

@kayx23

kayx23 commented Mar 28, 2023

Please factor in the interest in this issue too: #602

My reply there:

Can we at least understand why this proposal hasn't been looked at for such a long time? Any technical difficulties? Time commitment issues? Not enough interest? Maybe some of us can collaborate and contribute? Let us know.

Taking a step back, will the implementation involve changes to BOTH the outline client and server?

@fortuna
Collaborator Author

fortuna commented Mar 28, 2023

@kayx23 we have not been able to prioritize this due to technical challenges and lack of headcount.

More recently we've been focused on working around blocking in Iran and China. Since then we introduced dynamic keys and prefix camouflaging. We are now looking into making our network stack independent of the protocol, so we can use different protocols and compose them in different ways, giving us more agility in our strategies. We also want to release an SDK to let people build tools more easily, since we don't usually have the capacity to build the many tools people ask for.

We do have some exploratory code: https://github.com/Jigsaw-Code/outline-client/tree/bemasc-split-tunnel. But we won't be able to get back to it anytime soon.

@kayx23

kayx23 commented Mar 29, 2023

I see. Thank you for the information : )

@meecosha

meecosha commented Jul 22, 2023

Hi! Is it hard to implement a feature that only certain websites are proxied? So I can make a silly txt file in the root folder with a website on each row so that Outline only reroutes when accessing those websites?
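
For illustration, the file-handling half of this idea is small. The sketch below assumes one domain per line with optional `#` comments; the function names are hypothetical. The harder half is that a VPN-style client still has to learn the host name behind each connection, for example by intercepting DNS.

```python
def load_domain_list(text):
    """Parse one domain per line; blank lines and '#' comments are skipped."""
    domains = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip().lower().rstrip(".")
        if entry:
            domains.add(entry)
    return domains


def should_proxy(host, domains):
    """True if host is a listed domain or a subdomain of one."""
    parts = host.rstrip(".").lower().split(".")
    # Check "www.youtube.com", then "youtube.com", then "com".
    return any(".".join(parts[i:]) in domains for i in range(len(parts)))
```

A list containing `youtube.com` would then match `www.youtube.com` but not `notyoutube.com`.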

@daniellacosse
Contributor

daniellacosse commented Jul 24, 2023

Yeah, unfortunately each platform is different, so supporting all five at once (iOS, Android, Linux, Windows, macOS) is challenging. We are working towards it, however!

@pieterclaassen

Smart proxying is a really good idea, and the comments in this thread outline creative ways to do it. This feature will increase general uptake of Outline by the man in the street, as the current server doesn't handle non-sensitive traffic like video in a cost-effective manner. This blocks users who cannot afford to implement privacy for non-sensitive data.

What is the status of activity now (@daniellacosse @fortuna)? I see that the referenced split tunnelling branch has been deleted: https://github.com/Jigsaw-Code/outline-client/tree/bemasc-split-tunnel

Maybe start with a bazaar approach and just do a text file URL whitelist with very basic DNS translation. This could cover 80% of the YouTube/TikTok/Reddit use cases and put Outline within reach of people who cannot use it now due to pricing issues.

This whitelist can be created locally on the client and/or pushed from the service for organisational policies. I have some time to help with such a branch if it could help.

@catinapoke

@daniellacosse Are there any updates? This topic has existed since 2020.

@Korb

Korb commented Oct 13, 2024

Some reasons:

In my country, websites related to government agencies in most cases block all connections not from the IP of my country. These include tax services, traffic police, utility bill payment services, medical institution websites, etc. For all of them, VPN currently has to be disabled.

@buzzkirill

This functionality would be extremely useful if implemented.

@Miv2nir

Miv2nir commented Oct 25, 2024

Bump this, I've been having issues with comfortable use of nekoray for accessing the outside world, so a better alternative would be very welcome
