Remove the behavior of silent dropping replay packets #556
CC @madeye Feel free to test. |
Hmm, but why removing it? |
Because Bloom filter itself is a feature that can be utilized by GFW. See the post I linked above. EDIT: Yes. I am saying that having a Bloom filter is even worse than not doing anything, and I am actively testing this right now. |
The report you linked as a reference contradicts your point. Shadowsocks servers need protection against replay attacks. The current bloomfilter-based implementation might not be good enough (limited capacity, no persistency, etc). But removing the protection is not the solution. We need protocol changes to achieve full protection. Since you have consistently given downvotes/objections to proposals of any change (shadowsocks/shadowsocks-org#177, shadowsocks/shadowsocks-org#178, shadowsocks/shadowsocks-org#183 (comments)), I'm giving your PR a 👎. |
Since there is no better solution for now, doing nothing should be better than removing it completely. net4people/bbs#22 actually indicates that the main detection method is replay. |
Let me clarify by giving an explicit attack to distinguish/detect shadowsocks traffic. Most protocols that look random allow replay of the first packet. The only counterexample I can think of is TLS 1.3 0-RTT, but even there, the behavior is immediately different. shadowsocks-libev and shadowsocks-rust disallow replay within a certain period (until the server restarts or the ping-pong bloom filter resets). This by itself is a feature that an active probing adversary can utilize to distinguish shadowsocks-rust from TLS by doing replay. To perform this distinguishing attack, you need:
Given that there is enough traffic to the server, shadowsocks-libev and shadowsocks-rust eventually pass this test with some noticeable probability. Compare this to TLS (not 0-RTT), which will never pass the first test since TLS always accepts replay of the ClientHello message. Even for TLS 0-RTT, the server returns HTTP 425 when it detects a replay, instead of silently dropping the connection, and therefore also fails to pass the first test. Therefore, a firewall can perform this test randomly, and block the server whenever all three tests pass. (The distribution of the delay can be chosen so that the amortized space complexity to carry out the attack is low.) In fact, if the distinguisher/censor keeps a counter of all the new connections made to the server, it can in principle know exactly when to replay the packet so that it will pass the verification, since funnily enough, the maximum number of entries for the Bloom filter is hardcoded in shadowsocks-libev/shadowsocks-rust: shadowsocks-rust/crates/shadowsocks/src/context.rs Lines 10 to 13 in 6583640
Given the data from net4people/bbs#22, it is likely that this attack is already deployed. There are a few alternative ways to protect against this attack, but this PR is the only way without changing the protocol, and I am testing its effectiveness on a popular VPS. By removing the Bloom filter, shadowsocks-rust now fails the first test and passes the other two, which is consistent with the behavior of TLS. This backs up my claim that "doing nothing is better than having a Bloom filter," as per #556 (comment). |
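The counter-based observation above can be illustrated with a small model. This is only a sketch under stated assumptions: the `PingPongFilter` type below is a hypothetical stand-in that uses exact hash sets instead of real Bloom filters (so it has no false positives), and its rotation rule is an assumption for illustration, not a copy of the shadowsocks-rust or shadowsocks-libev code. IVs are modeled as `u64` values.

```rust
use std::collections::HashSet;

/// Hypothetical exact-set stand-in for a ping-pong Bloom filter.
/// `capacity` is the per-filter entry limit (an assumed rotation rule).
struct PingPongFilter {
    current: HashSet<u64>,
    previous: HashSet<u64>,
    capacity: usize,
}

impl PingPongFilter {
    fn new(capacity: usize) -> Self {
        Self { current: HashSet::new(), previous: HashSet::new(), capacity }
    }

    /// True if the IV is still held in either half of the filter.
    fn is_remembered(&self, iv: u64) -> bool {
        self.current.contains(&iv) || self.previous.contains(&iv)
    }

    /// Returns true for a detected replay; otherwise records the IV.
    fn check_and_insert(&mut self, iv: u64) -> bool {
        if self.is_remembered(iv) {
            return true;
        }
        if self.current.len() >= self.capacity {
            // Rotate: everything in the older half is forgotten.
            self.previous = std::mem::take(&mut self.current);
        }
        self.current.insert(iv);
        false
    }
}

/// Counts how many fresh connections must follow a recorded IV
/// before a replay of that IV is no longer detected.
fn connections_until_replay_accepted(capacity: usize) -> u64 {
    let mut filter = PingPongFilter::new(capacity);
    let recorded_iv = 0u64;
    filter.check_and_insert(recorded_iv);
    let mut fresh = 0u64;
    while filter.is_remembered(recorded_iv) {
        fresh += 1;
        // Fresh IVs are 1, 2, 3, ... so they never equal the recorded one.
        filter.check_and_insert(fresh);
    }
    fresh
}
```

In this model the answer is deterministic (twice the capacity), which is the point being made: a censor that counts new connections and knows the hardcoded capacity can predict when a replay starts being accepted.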
Well, sounds reasonable. What do you think @madeye ? |
This also fixes shadowsocks/shadowsocks-org#184.
@database64128 You mean I have consistently given downvotes/objections to proposals of any useless or even harmful change*. You are welcome. *Also I have supported the change of adding AAD to the protocol, as per shadowsocks/shadowsocks-org#183 (comment). However, since it is a breaking change, it is not worth the upgrade for now. Also since you mentioned it, you can see that I have already proposed some similar ideas in shadowsocks/shadowsocks-org#183 (comment). I think this Bloom filter thing is a good example of how various proposals of "protecting against replay attacks" or "forward secrecy" or [insert your other unnecessary demands on Shadowsocks and questionable protocol "upgrades"] can actually harm the main goal of Shadowsocks being an access tool. |
This makes no sense at all. Even a simple DPI system can tell that Shadowsocks AEAD traffic is not TLS. Not mimicking TLS doesn't make Shadowsocks unique. But a half-baked "fake" TLS will almost certainly stand out. |
@zonyitoo instead of removing the bloomfilter, I think we can disable its behavior "drop replay connection" by default. Let's still report the replay in the log, which should help to understand the replay probe in the future. |
Well, that should be a good idea. ss-go2 also has a Bloom filter for detecting replay connections, what do you think? @riobard I think it can also be a configuration feature, simply: {
// A switch for enabling silent drop of replay connections
// It is "true" by default, for backward compatibility.
"silent_drop_replay": false
} |
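As a rough sketch of how such a switch might be consumed on the server side: the `ReplayAction` enum and `action_for_handshake` function below are hypothetical names for illustration, not part of shadowsocks-rust. The only behavior taken from the thread is that a replay is always logged and that the silent drop is optional.

```rust
/// Hypothetical outcome of checking one incoming handshake.
#[derive(Debug, PartialEq, Eq)]
enum ReplayAction {
    /// Not a replay: handle the connection normally.
    Proceed,
    /// Replay detected: write a log entry but treat the connection normally.
    LogOnly,
    /// Replay detected: write a log entry and silently drop the connection.
    LogAndSilentDrop,
}

/// Maps the replay check result and the `silent_drop_replay` setting to an action.
fn action_for_handshake(is_replay: bool, silent_drop_replay: bool) -> ReplayAction {
    match (is_replay, silent_drop_replay) {
        (false, _) => ReplayAction::Proceed,
        (true, false) => ReplayAction::LogOnly,
        (true, true) => ReplayAction::LogAndSilentDrop,
    }
}
```

With `"silent_drop_replay": false`, a detected replay is still reported to the admin, but nothing observable from outside changes.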
https://github.com/Jigsaw-Code/outline-ss-server/blob/master/service/PROBES.md This is how outline-ss-server handles replay connections. |
Note that once the replay filter is removed, shadowsocks/shadowsocks-org#183 opens up more ways to attack. |
shadowsocks/shadowsocks-org#178 (comment)
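For example: the replay against clients mentioned in shadowsocks/shadowsocks-org#183, feeding a client's request back to the client, feeding a server's response back to the server, adding some byte-by-byte probing techniques, crossing UDP DNS over to TCP to compare response lengths, and so on... The permutations are hard to enumerate. All of these become attacks that the encryption layer can neither identify nor defend against, and the specific response patterns of Shadowsocks under each attack need to be measured one by one. |
Could it detect that it is running in plugin mode and switch automatically? |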
Ah. That’s a good point. |
@zonyitoo What exactly is "silent_drop_replay" in this case? Currently go-ss2 just keeps reading from the connection after detecting replay without actively dropping it. Does ss-rust behave the same? |
Yes, they work the same. Here Mygod proposed a change to remove it completely, and madeye suggested to change the behavior to detect only. What do you think? |
@zonyitoo It's definitely better to keep the replay detection at least to inform server admins that they're being probed. I'm undecided as to whether or not disable silent drop. If the end goal is to make it more troublesome to probe, maybe a chaotic approach would be better? e.g. after detecting replay (or any invalid attempts to connect), we could respond randomly, i.e. sometimes drop the connection after a random timeout, sometimes keep the connection but reply with bogus data, sometimes just keep draining as if nothing is wrong. Hopefully it would confuse probes. The probabilities of each response would ideally be different per server deployment in order to avoid leaking statistical patterns. |
If the server responds randomly, the attacker could replay the same data stream multiple times and check whether the responses vary. If we can persist the IV filter for a longer time, the current issue should be avoidable. |
Do you want to use a database holding all the past IV, and make your server super slow, and have users complaining that their disk is full? |
Storing the past bloom filters should be enough. And we don't need to store all filters. |
No, and the reason is left as an exercise. |
Updated the patch to keep the error logging as per #556 (comment). P.S. The answer to the exercise above is that your running time still scales linearly, and even worse your false positive rate goes up also linearly. |
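The linear scaling claimed above can be written out as follows. This is a sketch assuming the server keeps k past filters, each with the same independent false positive rate p, and checks every one of them on each new connection (k, p and the lookup time T are symbols introduced here for illustration).

```latex
P_{\mathrm{fp}}(k) = 1 - (1 - p)^{k} \approx k\,p \quad (k\,p \ll 1),
\qquad
T_{\mathrm{lookup}}(k) = k \cdot T_{\mathrm{lookup}}(1)
```

Both the chance of rejecting a legitimate fresh IV and the per-connection lookup cost grow roughly in proportion to the number of stored filters.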
This is easy to counter: choose one of many responses based on a hash of the IV, so that we give the same response for one particular stream. |
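A minimal sketch of that idea, assuming three response kinds taken from the earlier suggestion (drop after a timeout, reply with bogus data, keep draining). The names `ProbeResponse`, `response_for_iv` and the `server_secret` parameter are hypothetical; the secret is added here so that the mapping can differ per deployment. `DefaultHasher` is only a stand-in: it is not a cryptographic keyed hash, and its algorithm is not guaranteed to stay the same across Rust releases, so a real implementation would want a proper keyed MAC.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical ways a server could react to a replayed or invalid stream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProbeResponse {
    DropAfterTimeout,
    ReplyWithBogusData,
    KeepDraining,
}

/// Picks a response deterministically from the IV, so replaying the same
/// stream always triggers the same behavior on this server.
fn response_for_iv(iv: &[u8], server_secret: u64) -> ProbeResponse {
    // Stand-in hash; not suitable as a real keyed PRF.
    let mut hasher = DefaultHasher::new();
    server_secret.hash(&mut hasher);
    iv.hash(&mut hasher);
    match hasher.finish() % 3 {
        0 => ProbeResponse::DropAfterTimeout,
        1 => ProbeResponse::ReplyWithBogusData,
        _ => ProbeResponse::KeepDraining,
    }
}
```

Because the choice depends only on the IV and the per-server secret, repeating one stream many times no longer reveals that the server is answering at random.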
Great suggestions, @riobard ! We have been taking actions since you suggested. |
I want to apologize for my previous statements against Mygod on replay protection. After reevaluating the situation with @zonyitoo, I now agree that the current implementation is inherently broken, and should probably be completely removed. I have retracted my downvotes and changed some of my previous statements to be hidden. |
What's the latest development? |
It has been completely removed in 90f05a5. |
I saw it. I'm wondering what changed your mind. |
To conclude, the current implementation of replay attack protection with ping-pong bloom filters is actually broken: it cannot provide the protection it was designed for, and it may also reject legitimate clients, causing unexpected connection failures. |
We first observed a high false positive rate on TFO-enabled servers. This was probably partially caused by duplicate SYNs with data. It has led us to do a little math on the real world effectiveness of the replay protection implementation. Both shadowsocks-rust and Outline use 10000 as the IV cache capacity by default. Turns out you can easily exhaust it in as little as a few minutes even on your personal server. Just watch a YouTube video via QUIC (It's the default transport on both desktop and mobile). Half way through the video, the cached IVs have probably already been rotated. Even if you give up on filtering UDP traffic (like Outline does), new TCP connections from normal household Internet usage can exhaust your IV cache capacity in less than a few hours. Let's also not forget about the fact that most of us share servers with family, friends or in a community. Clients also can't guarantee the uniqueness of the generated salt due to the limited capacity, leading to higher false positive rate at server side. With 250 servers in ss-rust's ping balancer, it takes exactly 5 minutes to fill up the salt cache. |
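The fill-time arithmetic can be checked with a one-line helper. This is a sketch: `minutes_to_fill` is a hypothetical function, and the rate of 8 new salts per server per minute is not stated in the thread; it is simply what the quoted figures (10000 capacity, 250 servers, 5 minutes) imply when divided out.

```rust
/// Minutes until a salt/IV cache of `capacity` entries is full,
/// given a steady rate of new salts per minute.
fn minutes_to_fill(capacity: u32, new_salts_per_minute: f64) -> f64 {
    f64::from(capacity) / new_salts_per_minute
}

/// The ping balancer figures from the comment above: 250 servers and an
/// implied (not confirmed) 8 new salts per server per minute.
fn ping_balancer_example() -> f64 {
    minutes_to_fill(10_000, 250.0 * 8.0)
}
```

Any other source of new connections, such as QUIC flows or shared use of the server, adds to the rate and shortens the time further.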
UDP is an afterthought anyway, so it's fine to ignore those. 10k capacity is clearly insufficient. IIRC libev port defaults to 1m? You should raise the capacity on busy servers. I'm wondering if there are any behavior leaks. The previous discussion seems to have resulted in no clear conclusion. |
Correct: shadowsocks-rust has the same capacity as shadowsocks-libev. But even 10k capacity is not enough for the QUIC case. Also, the bloom filter only works for a one-client-to-one-server configuration. But in reality, users have mobile phones, PCs, smart routers, relays, ... The current implementation requires the salts/IVs generated by all clients to be globally unique, which is impossible in most cases. And we also found that TFO will significantly reduce latency in shadowsocks connections, but the "duplicate SYNs" behavior will cause shadowsocks servers to reject clients, causing connection problems, or print lots of
Please don't complicate this issue by messing up several related issues. I'll address each of @zonyitoo's previous 5 points below:
Don't get me wrong. I'm not defending replay protection per se, but let's separate configuration issues and fundamental brokenness. |
What are you even thinking? Which part of UUID have you forgotten about? :D |
I'm sorry but this is not the constructive way to reason about it. Probabilistic approaches like Bloom filters will never give you 100% reliable results. The idea is to find a very low probability such that it won't matter in practice. If all you're concerned about is the false positive rate, just increasing the capacity at the cost of cheap RAM will solve the problem. There are no ultimate solutions. Just tradeoffs. UDP traffic has completely different behaviors. Remember we need replay protection for fingerprinting. Now convince me if QUIC can be fingerprinted by replaying? If not, it's fine to leave it alone. Once again, tell me what's the probability of two honest clients repeatedly generating 256-bit randomness and ending up colliding? It's not a capacity issue coz it's a statistics issue. Use your math! 😂 Even if TFO is not broken, how do you deal with the fact that almost no other TCP traffic adopts TFO? What does replay protection protect us from? It's completely backwards. Again, I'm not saying replay protection is alright. I'm just not seeing the correct arguments against it. |
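For the collision question, a union (birthday) bound gives a concrete number. This is a sketch assuming n salts drawn independently and uniformly from 256 bits; the value n = 2^40 (about a trillion connections) is an arbitrary illustration, not a figure from the thread.

```latex
\Pr[\text{collision}] \le \binom{n}{2}\, 2^{-256} < \frac{n^{2}}{2^{257}},
\qquad
n = 2^{40} \;\Rightarrow\; \Pr[\text{collision}] < 2^{80 - 257} = 2^{-177}
```

So honest clients colliding on truly random 256-bit salts is negligible; false positives would have to come from the filter itself or from duplicated packets rather than from salt collisions.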
Or we should keep it simple:
Just that simple. Ineffective, False Positive, Flawed. We can further engineer a new solution against replay attacks. |
Bloom filters obviously are not perfect, but a) there's no perfect solution; and b) SS itself is not a perfect solution; and c) I'm not saying we should not use an imperfect solution. C'est la vie!
You're jumping right into the conclusion without even arguing for it. How does it not distinguish replayed traffic? Sure it cannot protect us from every attack but for the intended protection against short-term replay it's very effective. @Mygod's experiment above is very valuable but it's only a single data point. So far there's no more useful data points to (dis)prove the effectiveness of replay protection at all. @gfw-report agreed to do more experiments but it might take more time to reach conclusions. My objection to your arguments so far is that they're mostly nitpicking about configuration defects (which can be simply corrected by changing some parameters), and some of them are factually wrong (e.g. 256-bit randomness collision) or conflicting objectives (e.g. TFO vs hidden-in-plain-sight). |
I don't think 90f05a5 is necessary. As proposed by @madeye, detecting the replays and printing a warning (without acting on the packet differently in a way observable by the external environment) is still useful. @riobard Certainly 10000 capacity is not enough. However, considering that all it takes for an attacker to distinguish is a single replay success, you would have to remember virtually every single past IV to protect against this attack. On the other hand, the attack only needs to randomly pick a constant number of IVs to remember and eventually succeed at detecting. Removing the bloom filter invalidates this attack completely without the server remembering a single IV. As I have stressed multiple times in this thread, Bloom filters are so shit that it is better not having it to begin with. Good bye. |
I am not trying to convince you, because I agree with your opinion. You are saying that this is a rather effective solution, but we are saying that we would rather remove it completely than keep a solution that has several obvious flaws. You may notice that we are talking about the same thing from different points of view. Mygod is right: if you actually want to protect against replay attacks, you will have to remember all IVs, not just defend against "short-term replay". If it can only protect in the short term, then from my point of view it is ineffective. |
Now we're on the right path 😄 So at the core of the issue is about whether replay protection is desirable at all. The answer to that question really depends on analysis of other popular anonymous traffic and see what kind of replay protection (if any) they employ. I don't have a clear idea and I'm hoping @gfw-report could continue researching on it. If it's deemed necessary to have perfect replay protection, I agree with both of you that the current solution is very problematic. Specifically, @Mygod said that
which I'm not 100% sure. Yes it's definitely a weakness, but do other less-than-perfectly-replay-protected protocols exhibit similar weakness? In other words, how reliable can a probe fingerprint SS servers knowing this behavior quirk, and at what storage/scalability cost? Also
which I completely agree, but then we're vulnerable to short-term replay. Are we now more vulnerable to fingerprinting or not? I guess it also depends on whether other popular anonymous traffic are susceptible to replay, right? Once again, I'm not pro or against replay protection. I'm just saying that we don't know enough to have a clear-cut conclusion yet. |
By the way, it's relatively easy to gain long-term replay protection while keeping the simplistic Bloom filters and without changing the underlying protocol. Just mix-in a coarse-grained timestamp (e.g. one tick every 15 minutes) into the salt randomization process. It would require both clients and servers to adapt for proper enforcement, but servers could choose to fallback to non-timestamped as before for backwards compatibility. I'm hoping we're not throwing the baby out with bathwater. 😂 |
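One possible reading of this timestamp idea, as a sketch only: the thread does not say how the timestamp would be mixed in, so everything below is an assumption. Here the last 8 bytes of a 32-byte salt are a tag derived from a shared key, the random part of the salt, and the current 15-minute tick; the server accepts a salt only if the tag matches the current or previous tick. All names (`tick_tag`, `make_salt`, `salt_is_fresh`) are hypothetical, and `DefaultHasher` is a non-cryptographic stand-in for a real keyed MAC such as HMAC.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const TICK_SECONDS: u64 = 15 * 60; // one tick every 15 minutes
const SALT_LEN: usize = 32;
const TAG_LEN: usize = 8;

/// Stand-in for a keyed MAC over (random part, tick). Not cryptographic.
fn tick_tag(key: &[u8], random_part: &[u8], tick: u64) -> [u8; TAG_LEN] {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    random_part.hash(&mut hasher);
    tick.hash(&mut hasher);
    hasher.finish().to_be_bytes()
}

/// Client side: builds a salt whose trailing bytes commit to the current tick.
/// The caller supplies the random bytes (from a CSPRNG in practice).
fn make_salt(key: &[u8], random_part: [u8; SALT_LEN - TAG_LEN], unix_seconds: u64) -> [u8; SALT_LEN] {
    let tag = tick_tag(key, &random_part, unix_seconds / TICK_SECONDS);
    let mut salt = [0u8; SALT_LEN];
    salt[..SALT_LEN - TAG_LEN].copy_from_slice(&random_part);
    salt[SALT_LEN - TAG_LEN..].copy_from_slice(&tag);
    salt
}

/// Server side: accepts only salts tagged with the current or previous tick,
/// so a replay filter only has to remember about two ticks of salts.
fn salt_is_fresh(key: &[u8], salt: &[u8; SALT_LEN], unix_seconds: u64) -> bool {
    let (random_part, tag) = salt.split_at(SALT_LEN - TAG_LEN);
    let now = unix_seconds / TICK_SECONDS;
    [now, now.saturating_sub(1)]
        .iter()
        .any(|&tick| &tick_tag(key, random_part, tick)[..] == tag)
}
```

Under these assumptions, a salt captured more than two ticks ago is rejected without the server having stored it, while a replay inside the window still needs the existing filter. Falling back to untagged salts for old clients, as suggested above, would give up that property for those connections.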
You are right, I am also thinking about this kind of solution. |
Great! And if there's new solution to address it, we should probably first discuss what compatibility cost we're gonna swallow 🤢 |
We could consider using TOTP to generate the salt.
I know one of the requirements of replay attack mitigation is to not modify the original protocol. But if we swap the salt used by the client and the server, we could trivially make replay attacks impossible. We could do the following. The client initiates a connection and sends a random salt to the server. The server responds with another random salt to the client. After the encryption handshake, the server uses the salt designated by the client's first request to encrypt traffic sent to the client, and the client should use the salt provided by the server to encrypt traffic sent to the server. The wire format will stay the same. Replay attacks would be impossible, because the server will send a different salt to different clients. |
But no forward security. BTW, that will require the server to respond instantly after receiving the first packet. The current shadowsocks protocol doesn't require that behavior. |
I spent quite some time Googling this term (and forward secrecy), but could not quite understand it in the context of Shadowsocks. Do you have an example scenario?
That is true. Not replying immediately makes the shadow server look more like an HTTP server. |
This reverts commit 32a1406 and partially reverts 19ce248. For discussions see shadowsocks/shadowsocks-rust#556.
This removes a carefully engineered feature (proposed in shadowsocks/shadowsocks-org#44 (comment)) to detect Shadowsocks as per net4people/bbs#22. Testing in progress.