-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zstd (zstandard) decompression stream heap allocates #18937
Comments
This could be a straightforward change to make in exchange for several MiB of stack space as the Zstandard decompressor already uses O(1) memory. The required space for internal buffers will be a couple hundred kilobytes plus the maximum supported window size, which currently defaults to 8 MiB (at least 8 MiB is recommended by the spec for improved interoperability). |
Given std.http was used as the example where zstd causes a problem: I must point out that the decompressors are stored by value in Response, which is stored by value in Request, which is returned by value from If zstd needs 8 MiB of space, putting it on the stack is untenable, a single copy will use all 16 MiB of stack space allocated on linux; however I see no reason why the decompressors (or zstd specifically) can't just ask for a buffer of some defined constant length from the user. If this is untenable, then std.http will need to change to allocate the |
And this is too much stack space to use. The best fix as far as Zstandard is concerned would be to change the |
Yes, this I think this is the right approach to take, but |
Give me an hour to wrap up some local changes and then it might be worth re-evaluating this :-) |
I'm not actually familiar with |
Are there any other valid values for window size? Do I understand correctly that any buffer size can be used for the window, but it will cause a failure to decompress if the particular stream required a bigger window size than was allocated to the decompression stream? |
From the zstd spec:
It seems 8 MiB is the recommended size if we want it to almost always work. |
That is correct. My impression is that If a compressed stream used a larger window than the decompressor supports it is not possible to do the decompression. It looks like the reference implementation uses a limit of 8MiB in the |
Following up: I'm super close to having this work done but I gotta go, so this is coming tomorrow rather than this evening. |
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
I originally removed these in 402f967. I allowed them to be added back in #15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see #18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
Has this been addressed in #18955? |
Yes however zstd support has not been re-enabled in Edit: on second thought, I'll go ahead and close this because I have either one big HTTP follow-up issue to file, or many small ones. |
I originally removed these in 402f967. I allowed them to be added back in ziglang#15299 because they were smuggled in alongside a bug fix, however, I wasn't kidding when I said that I wanted to take the design of std.http in a different direction than using this data structure. Instead, some headers are provided via explicit field names populated while parsing the HTTP request/response, and some are provided via new fields that support passing extra, arbitrary headers. This resulted in simplification of logic in many places, as well as elimination of the possibility of failure in many places. There is less deinitialization code happening now. Furthermore, it made it no longer necessary to clone the headers data structure in order to handle redirects. http_proxy and https_proxy fields are now pointers since it is common for them to be unpopulated. loadDefaultProxies is changed into initDefaultProxies to communicate that it does not actually load anything from disk or from the network. The function now is leaky; the API user must pass an already instantiated arena allocator. Removes the need to deinitialize proxies. Before, proxies stored arbitrary sets of headers. Now they only store the authorization value. Removed the duplicated code between https_proxy and http_proxy. Finally, parsing failures of the environment variables result in errors being emitted rather than silently ignoring the proxy. error.CompressionNotSupported is renamed to error.CompressionUnsupported, matching the naming convention from all the other errors in the same set. Removed documentation comments that were redundant with field and type names. Disabling zstd decompression in the server for now; see ziglang#18937. I found some apparently dead code in src/Package/Fetch/git.zig. I want to check with Ian about this. I discovered that test/standalone/http.zig is dead code, it is only being compiled but not being run. Furthermore it hangs at the end if you run it manually. The previous commits in this branch were written under the assumption that this test was being run with `zig build test-standalone`.
Decompression should use O(1) memory.
#18923 removed the requirement of an allocator from decompressing deflate and gzip streams. This is great, but zstd is left behind, leaving this sad situation:
and
zstd is holding us back.
Here's a diff to get started on this issue:
The text was updated successfully, but these errors were encountered: