
SplitHTTP client: Add minUploadInterval #3592

Merged 5 commits into XTLS:main on Jul 27, 2024

Conversation

@mmmray (Collaborator) commented Jul 25, 2024

Add a new config option that allows throttling the rate of upload requests. The new default behavior is to initiate a new upload request at most every 10 milliseconds; before, there was no limit.

The old behavior can be restored by setting a lower value, or by setting -1. Due to quirks in config parsing, 0 means the same as 10.

On my machine, 10 ms improves stability on splithttp+h2. It does not affect QUIC at all. More testing is needed.

Since some CDNs also bill by number of requests (on top of bytes transferred), the combination of maxConcurrentUploads and maxUploadInterval should also provide some more control over CDN cost. (In theory; I don't have that use case.)

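For context, a minimal client-side config sketch showing where such an option sits (values are illustrative only; per a later comment in this thread, the field as shipped is named minUploadIntervalMS, and the option was still called maxUploadInterval when this PR was opened):

```json
{
  "streamSettings": {
    "network": "splithttp",
    "splithttpSettings": {
      "path": "/upload",
      "maxConcurrentUploads": 10,
      "minUploadIntervalMS": 30
    }
  }
}
```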
@RPRX (Member) commented Jul 26, 2024

If I understand correctly, this should be called minUploadInterval rather than max; in other words, it implements what I proposed in #3560 (comment). It will certainly be more stable.

But with the maxConcurrentUploads limit still in place, I think that option no longer serves any purpose and should be removed. When bandwidth is plentiful, it only limits uploads based on RTT, which has nothing to do with the server's capacity (for example, only ten uploads per 300 ms, which is pointless), and when bandwidth is tight, the lower layers schedule transmission automatically, so the application layer doesn't need to worry about it.

Also, both minUploadInterval and maxUploadSize need to support a range, randomized each time, to mitigate fingerprinting: #3560 (comment)

@mmmray (Collaborator, Author) commented Jul 26, 2024

Fixed the naming. Yep, this is what you proposed. 🙇‍♂️ I had it implemented for a while but didn't see it making a difference on QUIC; then today I noticed it works well for h2. Maybe it's just my machine.

Regarding maxConcurrentUploads: I think minUploadInterval and maxUploadSize cannot be used to control the number of in-flight requests, and that limit still matters in these cases:

  • browser dialer: the browser has low limits on the number of concurrent/in-flight requests.
  • certain "CDNs" have overzealous DDoS protection.
  • the xray server's upload_queue should have a memory limit, and it is derived from maxConcurrentUploads.

In conclusion: yes, it's meaningless for saturating bandwidth, but there are so many brittle components along the path that they need to be protected from high concurrency. Maybe the maxConcurrentUploads default should be raised. Let me know what you think.

Range options: yes, they should be implemented. I will try to make them similar to the fragment options.

@Fangliding (Member) commented Jul 26, 2024

I've noticed significant discrepancies in upload speeds among users. Some experience rates as low as a few hundred Kbps, while others achieve tens of Mbps.
Some have mentioned that lowering maxUploadSize and maxConcurrentUploads could improve performance; the default values might need tuning.
It's possible that packet reordering under complex network conditions is contributing to the issue.
Imagine a scenario where seq1 contains 500 KB of data and seq2 contains 100 KB. It's likely that seq2 will arrive first, but it has to wait for seq1 to finish transmitting before it can be reassembled and processed. Unlike Ethernet packets, in an HTTP transport the chance of this data arriving out of order is greater.

@RPRX (Member) commented Jul 26, 2024

@mmmray I think everything you listed can be achieved by tuning minUploadInterval, rather than by keeping maxConcurrentUploads around and adding uncertainty.

Take my example of a 300 ms RTT: with this PR's current code, in each 300 ms window the client basically uploads once every 10 ms for the first 100 ms and then blocks for the remaining 200 ms. That is far less efficient and less stable than uploading once every 30 ms, which could accumulate more data per request and keep latency and rate more even.

So the defaults should be minUploadInterval = 30 ms and maxConcurrentUploads = 100, so that the latter has no practical effect in normal operation and serves only as a safety net.

@mmmray (Collaborator, Author) commented Jul 26, 2024

Oh, I finally understood. You're saying that if I first measure RTT, or assume a certain value for it, then I can use minUploadInterval alone to replace maxConcurrentUploads. This is true, but for my use cases it is a bit unstable, as RTT changes all the time (and it does not allow sharing of configs). I think maxConcurrentUploads is simpler. But I also recognize that this is a bit of an edge case, since 99% of users use Cloudflare without the browser dialer, so I think bumping the option to 100 is OK, as long as the setting does not get removed entirely.

@RPRX (Member) commented Jul 26, 2024

@mmmray I'm not asking you to measure RTT; the example with RTT above 100 ms was just to make the problem concrete. I don't want to set "how many uploads per RTT". I want to set minUploadInterval purely according to what the CDN and server can accept, without considering RTT, and I think 30 ms is more appropriate than 10 ms.

@RPRX (Member) commented Jul 26, 2024

@mmmray I think the factors that determine my choice of minUploadInterval are almost exclusively how high a request rate the CDN and server can accept; RTT has nothing to do with it.

@mmmray (Collaborator, Author) commented Jul 26, 2024

I have adjusted minUploadInterval; I wasn't trying to argue about the right value for it.

I just realized that increasing maxConcurrentUploads is trickier, as the settings are coupled between server and client: the server enforces maxConcurrentUploads and drops packets if the buffer gets too big. Therefore, simply increasing the default means that a client on 1.8.22 will not be compatible with a server on 1.8.21. I could split the setting into maxConcurrentUploadsServer and maxConcurrentUploadsClient and increase the server limit for now, and then the client limit can be increased in the next version. What do you think?

I understand that setting maxConcurrentUploads=10 causes issues with uneven packet sizes. I'm saying there needs to be some sort of limit on the number of in-flight requests, or the number of open connections, for the sake of things like the browser dialer. If maxConcurrentUploads is removed (it doesn't seem you proposed that anyway), that can't be done anymore.

Maybe it is better handled using a MaxConns/MaxStreams setting somewhere lower in the HTTP stack, because at that point the "http packet" has already been created. However, maxConcurrentUploads=100 combined with MaxStreams=10 could cause very ugly latency issues...

@RPRX (Member) commented Jul 26, 2024

I can split up the settings into maxConcurrentUploadsServer and maxConcurrentUploadsClient and increase the server limit for now, and then in the next version the client limit can be increased, what do you think?

No need to split it into two options. Just change the server-side default to 100 first, and change the client-side default to 100 in the version after next.

minUploadInterval could also take effect on the server side, i.e. the server only processes a new upload request for a given UUID after that many milliseconds, to prevent clients from setting the value too low.

In short, all of these values should be determined by the server, and the client can only obey them, just like serverNames on a REALITY server.

@RPRX (Member) commented Jul 26, 2024

the server enforces maxConcurrentUploads and drops packets if the buffer gets too big

Can it only drop them, rather than delay processing?

@RPRX (Member) commented Jul 26, 2024

I think you can handle maxConcurrentUploads in a separate PR:

  1. Change the server-side default to 100.
  2. Change the ok returned by the server to ook.
  3. When the client detects ook, it raises 10 to 100.

This allows a smooth upgrade; the version after next can then change the client-side default to 100 as well.

@mmmray changed the title from "SplitHTTP: add maxUploadInterval" to "SplitHTTP: add minUploadInterval" on Jul 26, 2024
@mmmray (Collaborator, Author) commented Jul 26, 2024

I added the rand range options. Two caveats:

  • I think that if it is randomized by default, it will cause unstable packet sizes again, but feel free to change it. Right now it is set to 30-30 and 1MB-1MB.
  • maxUploadSize is not randomized within a single connection; the random value is determined once per connection. It's a bit difficult to resize the buffers randomly (unless the implementation is rewritten without the MultiBuffer pipe). However, I think this is OK when combined with mux.

What else needs to be done for this PR? I think it is fine as-is; I only wanted to solve the upload performance concern.

minUploadInterval could also take effect on the server side, i.e. the server only processes a new upload request for a given UUID after that many milliseconds, to prevent clients from setting the value too low.

Is it necessary? I don't see a way to enforce this on the server without measuring more things, or adding a constant 30 ms sleep on POST.

As for the maxConcurrentUploads changes, I will make another PR for them.

@mmmray (Collaborator, Author) commented Jul 26, 2024

Can it only drop them, rather than delay processing?

I think I need to clarify "drop": if packets arrive out of order and the buffer fills up beyond maxConcurrentUploads, the server tears down the connection entirely.

It could delay instead, but I think that would cause more issues if HTTP requests get "lost", as it makes the connection hang completely. In HTTP/1.1 pipelining, the response is currently not read, so there is no ACK on any of the packets (and no retry mechanism).

Anyway, I don't think it is necessary to reconsider this at the moment; it hasn't caused issues so far, and you have already come up with a solution to raise maxConcurrentUploads in a safe way.

@RPRX changed the title from "SplitHTTP: add minUploadInterval" to "SplitHTTP client: Add minUploadInterval" on Jul 27, 2024
@RPRX (Member) commented Jul 27, 2024

The maxUploadSize changes should be removed from this PR and given their own PR. Once the former is done you can merge this PR, then wait for me to merge the latter.

@RPRX (Member) commented Jul 27, 2024

When the client detects ook, it raises 10 to 100.

Change this logic to: when the client detects anything other than ok, it raises 10 to 100.

@RPRX (Member) commented Jul 27, 2024

I think you can handle maxConcurrentUploads in a separate PR

By the way, I'm planning to add a parameter to path in the next version. I took a quick look at the code: old clients won't strip the parameter, so that is incompatible as well. Maybe we should just accept the incompatibility.

@mmmray (Collaborator, Author) commented Jul 27, 2024

I'm going to merge this PR in a second. Nevermind.

About the ook vs ok feature detection: does it not mean that a 1.8.22 server will not be compatible with a 1.8.16 client? I think this breakage is happening too soon; 1.8.16 is still in some very important iOS clients. Maybe wait a bit, or could ok vs oook also become another range config option?

@RPRX RPRX merged commit 8a4217f into XTLS:main Jul 27, 2024
36 checks passed
@mmmray (Collaborator, Author) commented Jul 27, 2024

By the way, I'm planning to add parameters to the path in the next version. I took a quick look at the code, and the old client won't remove the parameters, so it's also incompatible. Should we just make it incompatible directly?

It's not clear to me how to evolve the protocol in general right now, and there are too many "special-purpose" hacks. I think client and server may need to start sending a version number; they can do it over query parameters:

  1. client sends ?xver=1 to indicate "client version 1"
  2. server responds with a version number in response body (instead of ok) if xver is set at all, for example 1ok.
  3. now, client and server know each other's version and can adjust behavior based on it.

of course, the protocol becomes very difficult to port to other cores if this sort of thing is done...

@mmmray (Collaborator, Author) commented Jul 27, 2024

Actually, this will not work that well: the client may send POST sooner, and if it needs to wait for the version to come back from the GET to determine behavior, that just adds RTT.

@RPRX (Member) commented Jul 27, 2024

about the ook vs ok feature detection, does it not mean that 1.8.22 server will not be compatible with 1.8.16 client? i think this breakage is happening too soon, 1.8.16 is still in some very important iOS clients. Maybe wait a bit, or ok vs oook also becomes another range config option?

Let's just change the client-side default to 100 as well; the server side is easier to upgrade and remains compatible with older clients.

@RPRX (Member) commented Jul 27, 2024

Regarding the maxUploadSize range, the server should take the larger value; also add a range to maxConcurrentUploads, where the server likewise takes the larger value.

@mmmray (Collaborator, Author) commented Jul 27, 2024

about maxUploadSize range, the server should take a larger value

ok

maxConcurrentUploads Also add a range

ok

and the server will also take a larger value.

In a future PR. maxConcurrentUploads is already 100 on the client; you want to go even higher on the server?

@RPRX (Member) commented Jul 29, 2024

Only after reading the documentation did I notice that what was added is minUploadIntervalMS rather than minUploadInterval; the commit title is missing the MS. Not a big deal, though.

@RPRX (Member) commented Jul 29, 2024

take a larger value

I want to change this a to the; this translation is painful to read.

@RPRX (Member) commented Jul 29, 2024

about maxUploadSize range, the server should take a larger value

ok

maxConcurrentUploads Also add a range

ok

and the server will also take a larger value.

In a future PR. maxConcurrentUploads is already 100 on the client; you want to go even higher on the server?

Damn, I was wondering why you set the server-side default to 200. The translation's fault: I meant take the largest value within the range, not take a larger value. Fine, 200 it is. I'm going to cut a new release and see whether the server OOMs: #3603 (comment)

@mmmray (Collaborator, Author) commented Jul 29, 2024

I want to change this a to the; this translation is painful to read.

I just realized that I may have misinterpreted that sentence, then. Now the server defaults to 200 and the client to 100, but I think you said the server should just take the upper end of the range, not have a different default. Anyway, it doesn't seem to be a big difference.

Unfortunately, Google Translate in the browser is more convenient than GPT-4o.

EDIT: ...yes

@mmmray (Collaborator, Author) commented Jul 29, 2024

@RPRX please do not release for now, I think I found some issues.

@RPRX (Member) commented Jul 29, 2024

@RPRX please do not release for now, I think I found some issues.

WTF

@mmmray (Collaborator, Author) commented Jul 29, 2024

#3610

leninalive pushed a commit to amnezia-vpn/amnezia-xray-core that referenced this pull request Oct 29, 2024