Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.Security.Cryptography.Pkcs.ContentInfo loads all data into memory #47410

Open
Tracked by #64488
scott-xu opened this issue Jan 25, 2021 · 6 comments
Open
Tracked by #64488
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Security
Milestone

Comments

@scott-xu
Copy link
Contributor

Background and Motivation

Currently, if SignedCMS is detached, it has to load all content into memory before verify signature.

var content = File.ReadAllBytes(@"Path\To\BigSizeFile");
var cms = new SignedCms(new ContentInfo(content),true);
cms.Decode(p7sData);
cms.CheckSignature(true);

Proposed API

name System.Security.Cryptography.Pkcs
{
    public class ContentInfo
   {
   + public ContentInfo (Stream content, bool detached)
   }
}

Usage Examples

Alternative Designs

Risks

@scott-xu scott-xu added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jan 25, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-Meta untriaged New issue has not been triaged by the area owner labels Jan 25, 2021
@ghost
Copy link

ghost commented Jan 25, 2021

Tagging subscribers to this area: @bartonjs, @vcsjones, @krwq
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and Motivation

Currently, if SignedCMS is detached, it has to load all content into memory before verify signature.

var content = File.ReadAllBytes(@"Path\To\BigSizeFile");
var cms = new SignedCms(new ContentInfo(content),true);
cms.Decode(p7sData);
cms.CheckSignature(true);

Proposed API

name System.Security.Cryptography.Pkcs
{
    public class ContentInfo
   {
   + public ContentInfo (Stream content, bool detached)
   }
}

Usage Examples

Alternative Designs

Risks

Author: scott-xu
Assignees: -
Labels:

api-suggestion, area-Meta, area-System.Security, untriaged

Milestone: -

@jkotas jkotas removed the area-Meta label Jan 25, 2021
@bartonjs
Copy link
Member

Intriguing. I was trying to come up with a clever way of allowing the caller to just specify the hash, as a more generalized solution, but it doesn't work well with the data model since the digest algorithm is specified in a different place than the content (it'd be a late exception if the hash-only content was a different length than the DigestAlgorithm property specified).

What are your expectations for the byte[] Content { get; } property when this constructor was used? What should happen when multiple signatures are applied (especially if they're changing the digest algorithm)?

ContentInto streamContent = ...;
SignedCms cms = new SignedCms(streamContent, true);
cms.DigestAlgorithm = s_sha256Oid;
cms.Sign(s_sha256Signer);
cms.DigestAlgorithm = s_sha1Oid;
// This needs to re-read the stream.
// What if it wasn't seekable?
// What if it was, but didn't start at position 0?
// What if it doesn't produce the same data both times?
cms.Sign(s_sha1Signer);

For verification we could solve the multiple-read problem by using the document's "what digest algorithms will I be using on this?" value to compute all needed digests in parallel up front... but that'd require a fair bit of restructuring (though, FWIW, I believe that's what Windows CMS does).

We could also try to change signing to compute digests for all known algorithms during the first signature, but that'd be on average wasteful... or just say something like non-seekable streams can only be signed once (exception on second signer) and that seekable streams will be set to the Position value they were at when signing started. But these sorts of questions are things we need to think about before adding this sort of new paradigm to an existing API.

@scott-xu
Copy link
Contributor Author

I switched to BouncyCastle and it works perfectly.

@bartonjs bartonjs removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2021
@bartonjs bartonjs modified the milestones: Future, 7.0.0 Jul 6, 2021
@bartonjs
Copy link
Member

bartonjs commented Jul 6, 2021

Marked this for 7.0. I figure if we don't do it by then we can just close it.

Latent thoughts:

namespace System.Security.Cryptography.Pkcs
{
    partial class ContentInfo
    {
        public ContentInfo (Stream content, bool detached) {}
        public Stream? ContentStream { get; }
   }
}

It's probably a breaking change to make the existing Content property be declared nullable, so we could make the Stream ctor set it to Array.Empty.

Draining the stream is where things become complicated.

  • If the first call to Sign leaves the stream drained, and there is a second call to Sign, the stream will now produce the empty payload.
    • For an unseekable stream we could:
      • Set a "dead object" bit and throw if we're about to read and it was set.
      • Throw in the ctor.
      • Compute hashes with all known hash algorithms, and save those somewhere.
    • For a seekable stream
      • The ctor could capture the Position value, and we could always seek to the captured Position before reading.
      • "Unseekable" (1) or (3) (dead object / compute all algorithms).
  • For CheckSignature/CheckHash with multiple signers
    • The CMS document is supposed to list all hash algorithms that got used, so we could use that to precompute hashes (a la Windows CMS).
      • Don't know what Windows does if a hash algorithm gets used that wasn't in the all-used-algorithms list.
    • Seekable streams can, of course, do the captured postion thing.

In a sense, just parallel computing the MD5/SHA1/SHA-2-256/SHA-2-384/SHA-2-512 hashes once during "content capture" sounds nice. But there are a lot of corner mutability cases that are present in SignedCms for compat reasons, and they make that a little weird.

EnvelopedCms consumes the same ContentInfo type, but it should only need to read the data once... and always has to fully load the encrypted (or decrypted) contents into memory, so there are fewer issues there.

@vcsjones
Copy link
Member

In a sense, just parallel computing the MD5/SHA1/SHA-2-256/SHA-2-384/SHA-2-512 hashes once during "content capture" sounds nice.

My 2c: Hm. It's unnecessary computational resources, and it locks us in to extending that pattern for SHA-3-256..SHA-3-512 if/when SHA3 is added and the relevant CMS specs are updated, and whatever might be around in 20 years.

Set a "dead object" bit and throw if we're about to read and it was set.

I think this aligns closest with what I was thinking. A single signer / encryptor is, in my experience, the most common. I would propose though that we make it if (streamTouched && !stream.CanSeek) throw;. If the stream supports seeking, then we can support restoring the position during the call to sign.

@bartonjs bartonjs modified the milestones: 7.0.0, Future Aug 2, 2022
@ulrichb
Copy link

ulrichb commented Dec 9, 2024

As this was once planned for .NET 7 in #64488: Any chance that we get this in .NET 10?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Security
Projects
None yet
Development

No branches or pull requests

5 participants