
[Client][Enhancement] Better support for servers with backup ports #706

Closed
sstpe opened this issue Dec 1, 2021 · 17 comments

sstpe commented Dec 1, 2021

As per the latest blocking mitigation advice from GFW Report, many server owners, myself included, may have changed their configurations to support multiple ports as a backup.

After configuring sslocal to know about each of these "servers" (in reality the same machine on different ports), by default sslocal will send a connectivity check through each configured server every ~10 seconds.

In this scenario, the connectivity checks are unnecessary (because it's the same server machine, so latency won't differ and load balancing is not needed). It's also potentially fingerprintable behavior.

If we agree this is something to fix, we could (just one potential solution of many) modify PingBalancer's checker_task method:

async fn checker_task(self: Arc<Self>) {
    assert!(!self.servers.is_empty(), "check PingBalancer without any servers");
    if !self.probing_required() {
        self.checker_task_dummy().await
    } else {
        self.checker_task_real().await
    }
}

It could include a third option such as checker_task_fallback_only, or we could just modify check_once.
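
For illustration, a rough sketch of what that third branch could look like (checker_task_fallback_only and fallback_only are hypothetical names I'm making up, not existing code):

async fn checker_task(self: Arc<Self>) {
    assert!(!self.servers.is_empty(), "check PingBalancer without any servers");
    if !self.probing_required() {
        self.checker_task_dummy().await
    } else if self.fallback_only() {
        // hypothetical: probe only the active server, and only on demand
        self.checker_task_fallback_only().await
    } else {
        self.checker_task_real().await
    }
}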

My suggested fix doesn't actually remove the connectivity checks altogether (which I think would be much better -- just fall back to another port on connection failure), but it's the approach I can see working without too much heavy lifting.

Perhaps the "best" fix would involve making a new way to configure this (same server, multiple ports) in the config file, then adding fallback functionality for the client outside of the loadbalancing module, but that's over my head in terms of my familiarity with this project.


sstpe commented Dec 21, 2021

Still interested in this. Shadowsocks isn't by default a "chatty" protocol, but the current PingBalancer makes this implementation unnecessarily chatty. I think this makes detection (of both client and server) more likely.

zonyitoo (Collaborator) commented

If I understand correctly, you want servers in local to be grouped by IP?


sstpe commented Dec 21, 2021

That's my original issue, though perhaps the best solution would be to provide a different mode of choosing which server to use. For example: use the first configured server unless there is a failure; on failure, switch to the second configured server, and so on.
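
In pseudocode terms, something like this purely illustrative sketch:

// Purely illustrative: pick the first configured server that hasn't failed.
fn pick_server(server_count: usize, failed: &[bool]) -> Option<usize> {
    (0..server_count).find(|&i| !failed[i])
}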

zonyitoo (Collaborator) commented

What you are proposing is a passive balancer, in which servers' weights are reported by each connection.

zonyitoo (Collaborator) commented

There may be a solution:

  1. Create a new balancer, called passive_weight maybe
  2. Lower a server's weight when a connect fails

But currently, when we see a connect failure, we can't tell whether the client couldn't connect to the shadowsocks server or the server couldn't connect to the remote target.

Err(err) => {
    let reply = match err.kind() {
        ErrorKind::ConnectionRefused => Reply::ConnectionRefused,
        ErrorKind::ConnectionAborted => Reply::HostUnreachable,
        _ => Reply::NetworkUnreachable,
    };
    let dummy_address = SocketAddr::new(Ipv4Addr::UNSPECIFIED.into(), 0);
    let header = TcpResponseHeader::new(reply, Address::SocketAddress(dummy_address));
    header.write_to(&mut stream).await?;
    return Err(err);
}
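
As a rough sketch of the passive_weight idea (the names and structure here are only assumptions, not actual project code):

use std::sync::atomic::{AtomicU32, Ordering};

struct PassiveWeightBalancer {
    // one weight per configured server, adjusted by reported connection results
    weights: Vec<AtomicU32>,
}

impl PassiveWeightBalancer {
    // called by the relay when a proxied connection through server `idx` fails
    fn report_failure(&self, idx: usize) {
        let w = &self.weights[idx];
        // halve the weight, but keep it at least 1 so the server can recover
        let cur = w.load(Ordering::Relaxed);
        w.store((cur / 2).max(1), Ordering::Relaxed);
    }

    // called on success, slowly restoring the weight toward its original value
    fn report_success(&self, idx: usize) {
        self.weights[idx].fetch_add(1, Ordering::Relaxed);
    }
}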


sstpe commented Dec 22, 2021

Yes, a passive balancer. That's a great way to describe it. I agree with your passive_weight solution, if we can find a way to make it possible.

Hmm, I see the problem you're mentioning. Right now, there seems to be no way to pass the result of tcprelay's handle_tcp_connect to any load balancer. The project structure currently requires any potential load balancing module to perform its own connections to determine weights.

I guess that is the key to my issue. Could we update the way loadbalancing modules interface with the local server, so that we can create a passive balancer?

I'm not an expert at Rust, but as I look at the code, I think the Err(err) on line 170 in the block you provided is not actually used anywhere. It's not used in handle_tcp_connect (defined in tcprelay.rs), nor in handle_socks5_client (also defined in tcprelay.rs), nor in handle_tcp_client (defined in server/mod.rs).

I noticed the failure to connect is actually not displayed in any trace or debug message either. For example, with my internet connection turned off:

curl --socks5-hostname localhost:11111 https://google.com
TRACE [25836:140195636676160] [shadowsocks_service::local::socks::server::socks5::tcprelay] socks5 HandshakeRequest { methods: [0, 1] }
TRACE [25836:140195636676160] [shadowsocks_service::local::socks::server::socks5::tcprelay] reply handshake HandshakeResponse { chosen_method: 0 }
TRACE [25836:140195723675200] [shadowsocks_service::local::socks::server::socks5::tcprelay] socks5 TcpRequestHeader { command: TcpConnect, address: google.com:443 } peer: 127.0.0.1:56718
DEBUG [25836:140195723675200] [shadowsocks_service::local::socks::server::socks5::tcprelay] CONNECT google.com:443
curl: (97) Can't complete SOCKS5 connection to google.com. (3)

Perhaps the failure to connect should be logged, too. That's a separate issue, though.

zonyitoo (Collaborator) commented

Since PingBalancer is currently the only balancer in this project, the struct PingBalancer is used everywhere. We should make a new enum Balancer:

enum Balancer {
    Ping(PingBalancer),
    PassiveWeight(PassiveWeightBalancer),
}

providing unified APIs to other modules.
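
For example, each method would delegate to the underlying variant (best_tcp_server and the ServerIdent return type are assumptions about what the unified API could expose):

impl Balancer {
    // dispatch to whichever balancer implementation is configured
    fn best_tcp_server(&self) -> Arc<ServerIdent> {
        match self {
            Balancer::Ping(b) => b.best_tcp_server(),
            Balancer::PassiveWeight(b) => b.best_tcp_server(),
        }
    }
}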

For servers with TFO enabled, the connect may not fail, because it actually does nothing but store the remote address. The connect failure will happen on the first write (the target address with other headers).


sstpe commented Dec 23, 2021

> Since PingBalancer is currently the only balancer in this project, the struct PingBalancer is used everywhere. We should make a new enum Balancer:
>
>     enum Balancer {
>         Ping(PingBalancer),
>         PassiveWeight(PassiveWeightBalancer),
>     }
>
> providing unified APIs to other modules.

Yes, I think that sounds right.

> For servers with TFO enabled, the connect may not fail, because it actually does nothing but store the remote address. The connect failure will happen on the first write (the target address with other headers).

Good point.

Sounds like we need a way to report both connect and write errors back to the balancer.
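
Maybe something along these lines (a sketch only; connect_and_greet is a made-up name, and it reuses the hypothetical PassiveWeightBalancer from above):

use std::net::SocketAddr;
use tokio::{io::AsyncWriteExt, net::TcpStream};

// Treat both the connect and the first write as the health signal for the
// chosen server, reporting either kind of failure back to the balancer.
async fn connect_and_greet(
    balancer: &PassiveWeightBalancer,
    idx: usize,
    addr: SocketAddr,
    header: &[u8],
) -> std::io::Result<TcpStream> {
    let mut stream = match TcpStream::connect(addr).await {
        Ok(s) => s,
        Err(e) => {
            balancer.report_failure(idx); // connect failed outright
            return Err(e);
        }
    };
    // with TFO, the error may only surface here, on the first write
    if let Err(e) = stream.write_all(header).await {
        balancer.report_failure(idx);
        return Err(e);
    }
    balancer.report_success(idx);
    Ok(stream)
}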

zonyitoo (Collaborator) commented

On the other hand, how about UDP associations? There is no way to "passively" detect whether the server is still there without active UDP detection.


sstpe commented Dec 27, 2021

Hmm, you are right. While many UDP datagrams may expect responses of some sort eventually, we can't count on that.

I'm not against a "Passive" balancer that actually still needs to perform its own connectivity checks in some situations. "Quiet" balancer could be another name for it.

My main goal in this issue is reducing the chances that shadowsocks-rust clients expose themselves because of identifiable behavior, or expose any/all their configured servers, if the client is discovered.

The way I'm hoping to arrive at this goal is a new balancer that reduces the PingBalancer behaviors that I think could expose clients/servers.

For UDP, I'm thinking the best we could do is send minimal, and/or minimally noticeable, PingBalancer-like connectivity checks.

One potential way this could work (a rough sketch follows the list):

  1. Send connectivity checks sparingly, and only while the client is actively being asked to send UDP traffic.
  2. The connectivity check would be sent first, then either: we continue to send traffic, assuming it will go through, unless a timeout for the connectivity check expires; or we wait for the connectivity check to succeed before sending out any data (higher latency for initial traffic, but higher probability of successful delivery). I probably favor the first approach, given that IMO it is more aligned with UDP's fault-handling model.
  3. The connectivity check is sent only to the active/primary/first-in-config server; if that check fails, the next server is tried.
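
Here's a sketch of the first approach from point 2 (all the helper names are made up):

use std::{io, time::Duration};

// Hypothetical stand-ins for the real relay plumbing.
async fn send_check(_idx: usize) -> io::Result<()> { Ok(()) }
async fn forward_datagram(_idx: usize, _payload: &[u8]) -> io::Result<()> { Ok(()) }

// Returns Ok(true) if the active server still looks healthy, Ok(false) if the
// caller should fail over to the next configured server.
async fn send_with_lazy_check(idx: usize, payload: &[u8]) -> io::Result<bool> {
    // run the check and the real traffic concurrently, so the check never
    // delays the datagram (the "optimistic" option above)
    let (check_res, fwd_res) = tokio::join!(
        tokio::time::timeout(Duration::from_secs(3), send_check(idx)),
        forward_datagram(idx, payload),
    );
    fwd_res?;
    Ok(matches!(check_res, Ok(Ok(()))))
}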

Honestly, we could handle TCP similarly, too. It's not perfect, but it's much quieter than PingBalancer's current behavior. This approach would fix the issue you mentioned in your earlier response:

> But currently, when we see a connect failure, we can't tell whether the client couldn't connect to the shadowsocks server or the server couldn't connect to the remote target.

In this case, we'd always check against a reliably-up service, so any connect failures would clearly be from the connection to the shadowsocks-server.

What do you think? Do you think this approach is better than a purely passive one for TCP?

zonyitoo (Collaborator) commented

So what you are proposing is a balancer that only "pings" one of the servers until it has connection errors?


sstpe commented Jan 3, 2022

It would "ping" only the active server, and only when the client is trying to send data. I'm open to other ideas as well.


sstpe commented Jan 14, 2022

Thank you for those commits. To achieve what I'm asking for, it sounds like what I'll have to do is set a very long check_interval and a shorter check_best_interval. Do you agree?

zonyitoo (Collaborator) commented

Yes.


sstpe commented Jan 14, 2022

Excellent. The only difference between that and what I suggested is that this will still run the check_best task even when sslocal is not being used.

I was hoping to change the interval during use: While sslocal is in use, check_best would be a relatively low value, and when sslocal is idle, check_best would be a high value.

That said, I'm very grateful for your work on this, and I'm happy with this solution for now. You've done more than I could have expected, and I appreciate it.

zonyitoo (Collaborator) commented

Great. Then I will close this issue now.


sstpe commented Jan 14, 2022

Sounds good. Thanks again.
