Add async ZipFile APIs #1541
Comments
This needs someone to write up a formal API proposal then we can review and if approved, offer up for implementation. Anyone? Docs on the API review process and see Jon's API request for a great example of a strong proposal. The more concrete the proposal is (e.g. has examples of usage, real-world scenarios, fleshed out API), the more discussion it will start and the better the chances of us being able to push the addition through the review process. |
Forked the repo, and working on an api proposal. That probably means this item is done, right? |
Not quite, @abdulbeard. This particular issue is around adding ZipFile async APIs. For example, currently we have |
Gotcha @ianhays . I did a terrible job reading the title 😛 |
Summary
Currently, System.IO.Compression.ZipFile doesn't have async methods for
Proposed API
namespace System.IO.Compression {
public partial class ZipFile {
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, string destinationArchiveFileName, CancellationToken cancellationToken = default(CancellationToken));
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, string destinationArchiveFileName, CompressionLevel compressionLevel, bool includeBaseDirectory, CancellationToken cancellationToken = default(CancellationToken));
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, string destinationArchiveFileName, CompressionLevel compressionLevel, bool includeBaseDirectory, Encoding entryNameEncoding, CancellationToken cancellationToken = default(CancellationToken));
public static Task ExtractToDirectoryAsync(string sourceArchiveFileName, string destinationDirectoryName, CancellationToken cancellationToken = default(CancellationToken));
public static Task ExtractToDirectoryAsync(string sourceArchiveFileName, string destinationDirectoryName, Encoding entryNameEncoding, CancellationToken cancellationToken = default(CancellationToken));
public static Task<ZipArchive> OpenAsync(string archiveFileName, ZipArchiveMode mode, CancellationToken cancellationToken = default(CancellationToken));
public static Task<ZipArchive> OpenAsync(string archiveFileName, ZipArchiveMode mode, Encoding entryNameEncoding, CancellationToken cancellationToken = default(CancellationToken));
public static Task<ZipArchive> OpenReadAsync(string archiveFileName, CancellationToken cancellationToken = default(CancellationToken));
}
}
Expected Use and Benefits
These new functions are asynchronous versions of their synchronous counterparts. This lets the .NET runtime free the calling thread to do other work while the asynchronous operation completes (as mandated by the TPL and async/await scheduling).
|
Here's a proposal ^. Looking forward to constructive feedback and insight. |
Should
using (var archive = await ZipFile.OpenAsync("archive.zip", ZipArchiveMode.Update))
{
archive.CreateEntry("empty.txt");
}
using (var archive = await ZipFile.OpenReadAsync("archive.zip"))
{
foreach (var entry in archive.Entries)
{
Console.WriteLine(entry.FullName);
}
}
As far as I can tell from skimming the source code, even though the above code uses the proposed |
Here's a proposal for ZipArchive class:
namespace System.IO.Compression {
public class ZipArchive {
public Task<ReadOnlyCollection<ZipArchiveEntry>> GetEntriesAsync();
public Task<ZipArchiveEntry> GetEntryAsync(string entryName, CancellationToken cancellationToken = default(CancellationToken));
}
} |
@abdulbeard How does that help? As far as I can tell, |
@svick you're right. Async versions of CreateEntry don't do much, since their underlying functionality is not blocking.
public Task<ReadOnlyCollection<ZipArchiveEntry>> GetEntriesAsync();
public Task<ZipArchiveEntry> GetEntryAsync(string entryName, CancellationToken cancellationToken = default(CancellationToken));
Amended the proposal above. |
@JeremyKuhne could you give feedback on this proposal? Perhaps we can move it forward |
Yeah, there isn't a great approach until dotnet/roslyn#114 is added. But in the meantime, I think there are ways to make it work. The approaches I considered are:
Looking at the two examples above, I think adding |
@svick |
@danmosemsft @JeremyKuhne any suggestions for changes to ^, or shall we move forward? |
Anything I can do to help move this along? |
crickets.......... |
@abdulbeard Sorry, the actual owner here moved on to another project and this got lost in the shuffle. 😦 Thanks for poking on it. @stephentoub can you give some direction/guidance on the disposal questions? @ericstj do you think this proposal provides what you were looking for? Presumably |
@ahsonkhan, @ViktorHofer can one of you take this and drive it forward? |
We have added |
It depends. Is ZipArchive lazily reading from a stream, e.g. when you go to read the next entry, might that perform a read on the stream? If yes, then having it return an IAsyncEnumerable could make sense. If no, then it should probably return a synchronous enumerable. |
Just curious is there any update on this issue? We have a .NET Core project that creates zip files in Azure blobs, and we are forced to use the sync version of ZipArchive apis because the async ones will cause unit tests to hang. |
I have some code that is generating and streaming a zip archive on the fly, which fails if AllowSynchronousIO is false, which is the default from core 3.0.0-preview3 onwards: System.InvalidOperationException: Synchronous operations are disallowed. Call WriteAsync or set AllowSynchronousIO to true instead. |
Like @kccsf, I have a similar issue with the Kestrel server configured to disallow synchronous IO. With the current implementation I cannot use ZipArchive directly with Kestrel's response stream, because disposal is synchronous. |
Just hit this today as well. :\ After writing the entries, Dispose() throws the exception. Worked around by allowing sync I/O at that point for now, but would be nice to have some async API that I could call to remove the need for sync I/O. |
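For readers hitting the same exception, here is a minimal sketch of a different workaround from the one described above: build the archive in a MemoryStream, where the synchronous writes done by ZipArchive.Dispose() are harmless, then copy the finished bytes to the real destination with CopyToAsync. The class name BufferedZipWriter and the files dictionary are hypothetical, made up for illustration. The trade-off is that the whole archive sits in memory and nothing reaches the client until it is complete, so this does not give true streaming.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of System.IO.Compression.
public static class BufferedZipWriter
{
    public static async Task WriteZipAsync(
        Stream destination,
        IReadOnlyDictionary<string, string> files,
        CancellationToken cancellationToken = default)
    {
        using var buffer = new MemoryStream();

        // leaveOpen: true keeps the MemoryStream usable after the archive is disposed.
        using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (var pair in files)
            {
                ZipArchiveEntry entry = archive.CreateEntry(pair.Key);
                using Stream entryStream = entry.Open();
                byte[] bytes = Encoding.UTF8.GetBytes(pair.Value);
                entryStream.Write(bytes, 0, bytes.Length);
            }
        } // Dispose() writes the central directory synchronously, but only into memory.

        // The only I/O against the real destination is asynchronous.
        buffer.Position = 0;
        await buffer.CopyToAsync(destination, 81920, cancellationToken);
    }
}
```

In an ASP.NET Core handler the destination would be the response body stream; that wiring is left out here.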
I have found a workaround for now ...
async Task DoZipFileWorkAsync(string path)
{
// ZipArchive is a Synchronous Block, so yield asynchronously
await Task.Yield();
// Open archive (path is the location of the zip file to read)
using ZipArchive zipArchive = new(File.OpenRead(path));
// ... do work
}
All work is now done on a |
This is not really a work around, since all work with the ZipArchive is still done synchronously. |
I do agree to some extent, however testing done works as expected in a worker thread, not the |
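To make the distinction in this exchange concrete, here is a small sketch that offloads the synchronous ZipArchive work explicitly with Task.Run instead of relying on Task.Yield. The class and method names (ZipOffload, ListEntriesAsync) are hypothetical. As pointed out above, this only moves the blocking calls onto a thread-pool thread; the I/O itself is still synchronous, so it keeps a caller such as a UI thread responsive but does not reduce thread usage on a server.

```csharp
using System.Collections.Generic;
using System.IO.Compression;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of System.IO.Compression.
public static class ZipOffload
{
    public static Task<List<string>> ListEntriesAsync(
        string path,
        CancellationToken cancellationToken = default)
    {
        // Task.Run schedules the delegate on the thread pool; the caller can await it
        // without blocking, but a pool thread is blocked while the archive is read.
        return Task.Run(() =>
        {
            using ZipArchive archive = ZipFile.OpenRead(path);
            var names = new List<string>();
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                cancellationToken.ThrowIfCancellationRequested();
                names.Add(entry.FullName);
            }
            return names;
        }, cancellationToken);
    }
}
```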
I wanted to try adding async and ended up copy-pasting the dotnet source code of I even added a progress report. The downside is that I had to use English exception messages, since the methods to get translated messages from resources are internal.
|
All the ZipFile and ZipArchive async APIs should be implemented together. For that reason, I closed #1560 in favor of this issue. There was a good discussion there we should use for reference. There are some problems with the way ZipArchive is implemented that would make it complex to implement the async methods:
|
This issue is now 8 years old... Are there any plans to target this in the near future? 😕 |
Just wanted to mention that this is blocking me from implementing a feature the most efficient way. I have an ASP.NET Core service that calls an API multiple times and consolidates its results. The API returns zip files, and my service needs to unzip the files, consolidate them, then produce a single zip of the results. I want to do that in streaming fashion, so that the consumer of the service can start receiving data as soon as the first API call returns a zip. But because of #1560, I can't send the zipped data to the consumer in streaming fashion; instead, I have to wait until all the API calls are finished (there can be several hundred of them), unzip all their zips to a temporary directory, create a new zip that consolidates them all, and only once that is finished will the consumer receive the first byte of data. That's lag time I could get rid of if I could just stream a ZipArchive to the HTTP response body, but #1560 blocks me from doing that. |
@rmunn is there anything stopping you from using my "workaround"? |
@synek317 You are missing the implementation of the |
Update on my previous comment: #1541 (comment)
The API proposals shared by folks above are mostly complete. I am adding my amended proposal at the top description. |
- public partial class ZipArchive : IDisposable
+ public partial class ZipArchive : IAsyncDisposable, IDisposable
public partial class ZipArchive
{
public ValueTask DisposeAsync();
protected virtual ValueTask DisposeAsyncCore();
public static Task<ZipArchive> CreateAsync(Stream stream, ZipArchiveMode mode, bool leaveOpen = false, Encoding? entryNameEncoding = null, CancellationToken cancellationToken = default);
}
public partial class ZipArchiveEntry
{
public Task<Stream> OpenAsync(CancellationToken cancellationToken = default);
}
public static partial class ZipFile
{
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, Stream destination, CancellationToken cancellationToken = default);
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, Stream destination, CompressionLevel compressionLevel, bool includeBaseDirectory, CancellationToken cancellationToken = default);
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, Stream destination, CompressionLevel compressionLevel, bool includeBaseDirectory, Encoding? entryNameEncoding, CancellationToken cancellationToken = default);
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, string destinationArchiveFileName, CancellationToken cancellationToken = default);
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, string destinationArchiveFileName, CompressionLevel compressionLevel, bool includeBaseDirectory, CancellationToken cancellationToken = default);
public static Task CreateFromDirectoryAsync(string sourceDirectoryName, string destinationArchiveFileName, CompressionLevel compressionLevel, bool includeBaseDirectory, Encoding? entryNameEncoding, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(Stream source, string destinationDirectoryName, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(Stream source, string destinationDirectoryName, bool overwriteFiles, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(Stream source, string destinationDirectoryName, Encoding? entryNameEncoding, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(Stream source, string destinationDirectoryName, Encoding? entryNameEncoding, bool overwriteFiles, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(string sourceArchiveFileName, string destinationDirectoryName, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(string sourceArchiveFileName, string destinationDirectoryName, bool overwriteFiles, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(string sourceArchiveFileName, string destinationDirectoryName, Encoding? entryNameEncoding, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(string sourceArchiveFileName, string destinationDirectoryName, Encoding? entryNameEncoding, bool overwriteFiles, CancellationToken cancellationToken = default);
public static Task<ZipArchive> OpenAsync(string archiveFileName, ZipArchiveMode mode, CancellationToken cancellationToken = default);
public static Task<ZipArchive> OpenAsync(string archiveFileName, ZipArchiveMode mode, Encoding? entryNameEncoding, CancellationToken cancellationToken = default);
public static Task<ZipArchive> OpenReadAsync(string archiveFileName, CancellationToken cancellationToken = default);
}
public static partial class ZipFileExtensions
{
public static Task<ZipArchiveEntry> CreateEntryFromFileAsync(this ZipArchive destination, string sourceFileName, string entryName, CancellationToken cancellationToken = default);
public static Task<ZipArchiveEntry> CreateEntryFromFileAsync(this ZipArchive destination, string sourceFileName, string entryName, CompressionLevel compressionLevel, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(this ZipArchive source, string destinationDirectoryName, CancellationToken cancellationToken = default);
public static Task ExtractToDirectoryAsync(this ZipArchive source, string destinationDirectoryName, bool overwriteFiles, CancellationToken cancellationToken = default);
public static Task ExtractToFileAsync(this ZipArchiveEntry source, string destinationFileName, CancellationToken cancellationToken = default);
public static Task ExtractToFileAsync(this ZipArchiveEntry source, string destinationFileName, bool overwrite, CancellationToken cancellationToken = default);
} |
All ZipFile APIs are currently synchronous. This means manipulations to zip files will always block a thread. We should investigate using an async file and calling async IO APIs (ReadAsync/WriteAsync) to free up the thread while IO is happening. dotnet/corefx#5680
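As a rough illustration of what "an async file plus ReadAsync/WriteAsync" means in this description, the sketch below opens a FileStream with useAsync: true and reads it with ReadAsync, so the calling thread is released while each read is pending. It does no zip processing at all; the class name AsyncFileRead and the buffer sizes are assumptions chosen for the example.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper that only demonstrates the async file I/O pattern.
public static class AsyncFileRead
{
    public static async Task<long> CountBytesAsync(
        string path,
        CancellationToken cancellationToken = default)
    {
        // useAsync: true asks the OS for asynchronous (overlapped) file I/O where supported.
        using var file = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read,
            bufferSize: 4096, useAsync: true);

        var buffer = new byte[81920];
        long total = 0;
        int read;

        // Each await yields the thread until the read completes; 0 means end of file.
        while ((read = await file.ReadAsync(buffer, 0, buffer.Length, cancellationToken)) > 0)
        {
            total += read;
        }
        return total;
    }
}
```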
Edit by @carlossanlop:
System.IO.Compression
Note there is no CreateEntryAsync above. That is because I didn't find any code inside CreateEntry (or inside DoCreateEntry) that can be made async, aside from the cancellationToken.ThrowIfCancellationRequested() call that would have to be added at the top.
But here's the method in case we want to add it regardless. We can get it approved here, and if in the end it is not needed, we don't have to ship it:
...
public ZipArchiveEntry CreateEntry(string entryName);
+ public Task<ZipArchiveEntry> CreateEntryAsync(string entryName, CancellationToken cancellationToken = default);
}
Also:
Usage examples:
System.IO.Compression.ZipFile
Usage examples
Plan
ZipFile.CreateFromDirectory and ZipFile.ExtractToDirectory performance #4764