Another attempt at "Provide default sign method, add external signer and webcrypto example" #230

Draft
vbuch wants to merge 4 commits into develop from signer-external

vbuch
Owner

@vbuch vbuch commented Feb 23, 2024

The external dependencies are not mocked in the tests, which may cause all kinds of problems on different platforms. E.g. tests pass on my Mac but fail on my Windows machine and in GitHub Actions.
To make sure develop is still workable, I've reverted #220 and re-added its commits on top of the reverted version.

We need mocks so that these commits can be merged in.

@coveralls

coveralls commented Feb 23, 2024

Coverage Status

coverage: 99.638% (+0.002%) from 99.636%
when pulling c11b647 on signer-external
into 13b071e on develop.

@dcbr
Collaborator

dcbr commented Feb 23, 2024

I don't really know what's going wrong, so I'll leave it up to you. Or do you need my help with this?

@vbuch
Owner Author

vbuch commented Feb 23, 2024

I'll give it a go when I have the time.

  • It fails in Actions because a couple of lines are not covered by unit tests and there is a 100% coverage threshold.
  • It fails on my Windows machine because pkijs produces an error whose wording differs from the one the Mac produces.
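
One way to make such tests independent of the platform, sketched here, is to replace the external dependency with a stub and to assert on an error type owned by the library rather than on the dependency's message. All names in this sketch (`SigningError`, `makeFailingEngine`, `signWith`) are hypothetical and are not part of node-signpdf or pkijs.

```javascript
// Hypothetical error type owned by the library, so tests need not match
// the wording produced by the underlying crypto dependency.
class SigningError extends Error {
  constructor(message, cause) {
    super(message);
    this.name = 'SigningError';
    this.cause = cause;
  }
}

// Stub standing in for the real engine; the message is deliberately arbitrary.
function makeFailingEngine(message) {
  return {
    sign: async () => {
      throw new Error(message);
    },
  };
}

// Wraps whatever the engine throws in the library-owned error type.
async function signWith(engine, data) {
  try {
    return await engine.sign(data);
  } catch (err) {
    throw new SigningError('Signing failed', err);
  }
}
```

A test can then check `err instanceof SigningError` and ignore the platform-specific text carried in `err.cause`.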

packages/signer/src/Signer.js (outdated, resolved)
@dcbr dcbr mentioned this pull request Mar 25, 2024
Will be reintroduced with future work.

Co-authored-by: dcbr <15089458+dcbr@users.noreply.github.com>
@dcbr
Collaborator

dcbr commented Mar 29, 2024

I cannot modify this PR branch, so I cloned it to my own repository and rebased onto the latest development branch. You can see the result here.
All tests pass (after adding an extra one to improve coverage), and apparently there are also no issues on the Windows build? @vbuch: Could you verify that this works correctly on your Windows environment and then go ahead with this PR?

@TimKieu

TimKieu commented Apr 1, 2024

@vbuch Let me bother you with a question about the runtime compatibility of signpdf:
can this toolkit run on an edge runtime such as Vercel or Cloudflare?
From skimming some of the examples, it appears to target the Node.js runtime.
Thanks very much.

@dcbr dcbr mentioned this pull request Apr 1, 2024
Collaborator

@dhensby dhensby left a comment

I've left some comments that are (hopefully) self-explanatory.

I also want to reiterate that I think this approach is too opinionated and too tightly bound to the pki.js lib. I don't think we should be so tightly bound to that library that we bring their typings into ours.

We should have an abstraction layer that sits between our interfaces and whichever crypto engine is being used. That way we aren't going to be opinionated about what crypto libraries are used, it'll make the implementation of external signers much simpler, and it will allow us to move the default engine away from pki.js without breaking our APIs or requiring downstream refactors.
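
A minimal sketch of what such an abstraction layer could look like, assuming a contract that consists of a single `sign` function. The `SignatureEngine` typedef and `createNodeEngine` adapter are hypothetical names; the adapter uses Node's built-in crypto module purely as an example backend.

```javascript
const nodeCrypto = require('node:crypto');

/**
 * Hypothetical engine-agnostic contract: anything that can sign bytes.
 * @typedef {object} SignatureEngine
 * @property {(data: Uint8Array) => Promise<Uint8Array>} sign
 */

/**
 * Adapter that satisfies SignatureEngine using Node's built-in crypto.
 * A pkijs- or WebCrypto-backed adapter could expose the same shape.
 * @param {import('node:crypto').KeyObject} privateKey
 * @returns {SignatureEngine}
 */
function createNodeEngine(privateKey) {
  return {
    // SHA-256 is hard-coded here only to keep the sketch short.
    sign: async (data) => new Uint8Array(nodeCrypto.sign('sha256', data, privateKey)),
  };
}

// Usage: callers depend only on the `sign` shape, not on the library behind it.
const { privateKey, publicKey } = nodeCrypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});
const engine = createNodeEngine(privateKey);
```

With a contract like this, no library-specific type would need to appear in the public typings.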

Comment on lines +9 to +28
/** Signature algorithm used for PDF signing
* @type {string}
*/
signAlgorithm: string;
/** Hash algorithm used for PDF signing
* @type {string}
*/
hashAlgorithm: string;
/**
* Method to retrieve the signature algorithm used for PDF signing.
* To be implemented by subclasses or set in the `signAlgorithm` attribute.
* @returns {Promise<string>}
*/
getSignAlgorithm(): Promise<string>;
/**
* Method to retrieve the hashing algorithm used for PDF signing.
* To be implemented by subclasses or set in the `hashAlgorithm` attribute.
* @returns {Promise<string>}
*/
getHashAlgorithm(): Promise<string>;
Collaborator

I don't think we need the redundant functions here. A "getter" function can be used if a static prop is not desirable (or we should just always use a function). I don't see a benefit to increasing the API surface here.

For example (if you need a dynamic prop or run some logic):

class MySigner extends ISigner {
  get signAlgorithm() {
    // some logic here
    return alg;
  }
}

Collaborator

I like the "getter" approach. To be honest, I didn't know JavaScript had this functionality.
The original idea for having both a static attribute and the method is that subclasses that don't need custom logic (e.g. a request to an external service) can just use the "shorthand" notation:

class MySigner extends Signer {
    signAlgorithm = '...';
}

But it's probably less confusing if there is just a single method/getter.
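
A short sketch of how both styles can coexist behind a single property read. The class names are hypothetical, and the algorithm strings are only placeholders.

```javascript
// Hypothetical base class: the algorithm is always read as a property.
class BaseSigner {
  get signAlgorithm() {
    throw new Error('signAlgorithm must be provided by a subclass');
  }

  describe() {
    return `signing with ${this.signAlgorithm}`;
  }
}

// Shorthand: a class field defines an own property that shadows the base getter.
class StaticSigner extends BaseSigner {
  signAlgorithm = 'RSASSA-PKCS1-v1_5';
}

// Getter: runs logic on every read.
class DynamicSigner extends BaseSigner {
  constructor(useEc) {
    super();
    this.useEc = useEc;
  }

  get signAlgorithm() {
    return this.useEc ? 'ECDSA' : 'RSASSA-PKCS1-v1_5';
  }
}
```

The shorthand relies on standard class-field "define" semantics; a transpiler configured to assign fields instead would hit the base getter, which has no setter.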

* Get a "crypto" extension.
* @returns {pkijs.ICryptoEngine}
*/
getCrypto(): pkijs.ICryptoEngine;
Collaborator

Our abstract class should not be tightly bound to pkijs types. If you have a signer and you don't want to be using PKIJS, then this typing is overly restrictive. We should have our own interface for what a crypto engine looks like (it should be really simple IMO, just really a "sign" interface, I'd have thought), and that's it.

What is the "crypto engine" needed for?

Collaborator

This should be a "private" method that does not need to be further modified/extended by subclasses (see my comment below).

Comment on lines +48 to +67
/**
* Obtain the certificates used for signing (first one) and verification (whole list).
* @returns {pkijs.Certificate[]}
*/
obtainCertificates(): pkijs.Certificate[];
/**
* Obtain the private key used for signing.
* @returns {CryptoKey}
*/
obtainKey(): CryptoKey;
/**
* Obtain the signed attributes, which are the actual content that is signed in detached mode.
* @returns {pkijs.Attribute[]}
*/
obtainSignedAttributes(signingTime: any, data: any, signCert: any): pkijs.Attribute[];
/**
* Obtain the unsigned attributes.
* @returns {pkijs.Attribute[]}
*/
obtainUnsignedAttributes(signature: any): pkijs.Attribute[];
Collaborator

There seems to be mixing of sync/async methods. Why would these not be async whilst others are? If we want to allow the use of external signers, it's feasible that obtaining certificates may well be an async task.

Collaborator

These should all be async, but the generated typings do not show this. How can I fix this? I also now notice that the return type has to be changed to Promise as well.

* Obtain the signed attributes, which are the actual content that is signed in detached mode.
* @returns {pkijs.Attribute[]}
*/
async obtainSignedAttributes(signingTime, data, signCert) {
Collaborator

This is async, but the typings don't match (neither in the interface nor in the JSDoc).

Collaborator

RE #230 (comment) - if you update line 124 to be @returns {Promise<pkijs.Attribute[]>} then the typings will follow.
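
For illustration, a sketch of an async method whose JSDoc matches its runtime behaviour, assuming the declaration files are generated from JSDoc as the comment above implies. `ExampleSigner` and the `Attribute` typedef are hypothetical stand-ins used instead of the pkijs types.

```javascript
/**
 * Hypothetical attribute shape, used here instead of a pkijs type.
 * @typedef {{type: string, values: string[]}} Attribute
 */

class ExampleSigner {
  /**
   * Obtain the unsigned attributes.
   * An `async` method always returns a Promise, so the documented
   * return type is wrapped in Promise<...> as well.
   * @param {Uint8Array} signature
   * @returns {Promise<Attribute[]>}
   */
  async obtainUnsignedAttributes(signature) {
    return [{ type: 'example', values: [String(signature.length)] }];
  }
}
```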

Collaborator

Ok perfect, thanks!

* Obtain the unsigned attributes.
* @returns {pkijs.Attribute[]}
*/
async obtainUnsignedAttributes(signature) {
Collaborator

Incorrect typings: the method is async, so the documented return type should be a Promise.

this.signAlgorithm = await this.getSignAlgorithm();
this.hashAlgorithm = await this.getHashAlgorithm();
// Get a crypto extension
this.crypto = this.getCrypto();
Collaborator

This is a bad technique here. We're assigning the crypto prop here, and then other methods rely on it later, but if they were called out of order, or if this sign method were to be overridden in a child class, then implementors need to know this prop is meant to be set as part of the sign method.

It would be much better if this was either set in the constructor, or the crypto engine was passed as a parameter to the functions (a bit like how cmsSignedData.sign() requires it as an argument).
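
Both alternatives can be sketched as follows. `InjectedSigner` and the single-method engine shape are hypothetical and only illustrate removing the ordering dependency; they are not the PR's implementation.

```javascript
// Hypothetical signer that receives its engine up front.
class InjectedSigner {
  constructor(engine) {
    if (!engine || typeof engine.sign !== 'function') {
      throw new TypeError('An engine with a sign() method is required');
    }
    this.engine = engine;
  }

  // Option 1: the engine was set once in the constructor, so this method
  // does not depend on any other method having been called first.
  async sign(data) {
    return this.signWith(this.engine, data);
  }

  // Option 2: the engine is passed explicitly as a parameter.
  async signWith(engine, data) {
    return engine.sign(data);
  }
}
```

Either way, a subclass that overrides `sign` no longer has to know that a property must be assigned as a side effect.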

Collaborator

Yes I agree, this should be moved to the constructor instead. I have to check why this was added here exactly.

Comment on lines +256 to +272
/**
* Verify whether the signature generated by the sign function is correct.
* @param {Buffer} cmsSignedBuffer
* @param {Buffer} pdfBuffer
* @returns {boolean}
*/
async verify(cmsSignedBuffer, pdfBuffer) {
// Based on cmsSignedComplexExample from PKI.js
const cmsContentSimpl = pkijs.ContentInfo.fromBER(cmsSignedBuffer);
const cmsSignedSimpl = new pkijs.SignedData({schema: cmsContentSimpl.content});

return cmsSignedSimpl.verify({
signer: 0,
trustedCerts: [],
data: pdfBuffer,
}, this.getCrypto());
}
Collaborator

Is there any real-world use for this function in our library or is it just a testing helper? I'd suggest there's no need to provide a verify method as part of PDF signing, unless we are also offering PDF verification too.

Collaborator

No, it's for testing purposes only.

@dhensby
Collaborator

dhensby commented Apr 1, 2024

This may be better discussed in a dedicated issue, but I don't really like this approach as it's mixing the responsibilities of signature generation (the actual signing), with the CMS structure generation.

At the moment the signing interface is really simple and allows for any external signer to be used to generate signatures - all you need to do is effectively provide a single function that signs the provided data. see https://github.com/vbuch/node-signpdf/blob/v3.2.4/packages/utils/src/Signer.js

What this change is really doing is taking the CMS structure generation in-house (which is all good) but it's mixed-in with the signing logic. IMO we need to separate these concerns and provide a way to generate the CMS structure, and then a way to do the signing.

We can think of it a bit like a pipeline. First the PDF needs to be created (or loaded), then a CMS structure is created, then the CMS structure is signed, then the CMS is embedded in the PDF. The signing part of this pipeline shouldn't have to be concerned with anything other than a buffer/uint8array that it receives and then returns a buffer/uint8array that represents the signature.
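
The pipeline can be sketched roughly as below. `buildToBeSigned` and `embed` are hypothetical placeholders: a real implementation would DER-encode CMS signed attributes and write the CMS into the PDF signature placeholder, which this sketch does not attempt.

```javascript
// Step 2 (placeholder): produce the bytes that need to be signed.
// A real implementation would build and encode CMS signed attributes here.
function buildToBeSigned(pdfBytes) {
  return Uint8Array.from(pdfBytes);
}

// Step 4 (placeholder): stands in for embedding; it only concatenates.
function embed(pdfBytes, signature) {
  const out = new Uint8Array(pdfBytes.length + signature.length);
  out.set(pdfBytes, 0);
  out.set(signature, pdfBytes.length);
  return out;
}

// The signer (step 3) sees bytes in and returns bytes out, nothing else.
async function signPipeline(pdfBytes, signer) {
  const toBeSigned = buildToBeSigned(pdfBytes);
  const signature = await signer(toBeSigned);
  if (!(signature instanceof Uint8Array)) {
    throw new TypeError('signer must resolve to a Uint8Array');
  }
  return embed(pdfBytes, signature);
}
```

Because a Node `Buffer` is a `Uint8Array`, a signer returning either satisfies the check.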

Comment on lines +32 to +47
/**
* Get a "crypto" extension and override the function used by SignedData.sign to support
* external signing.
* @returns {pkijs.ICryptoEngine}
*/
getCrypto() {
const crypto = super.getCrypto();
crypto.sign = async (_algo, _key, data) => {
// Calculate hash
const hash = await crypto.digest({name: this.hashAlgorithm}, data);
// And pass it to the external signature provider
const signature = await this.getSignature(Buffer.from(hash), Buffer.from(data));
return signature;
};
return crypto;
}
Collaborator

This is the reason for having this getCrypto method. But it should be considered a "private" method that is not needed for most Signer implementations.
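
As a side note, the snippet above assigns `sign` on the object returned by `super.getCrypto()`; whether that object is shared between callers is not visible in this thread. A variant that leaves the original engine untouched could look like this sketch, where `baseEngine` and `withExternalSign` are hypothetical and the hash is computed with Node's crypto module instead of pkijs.

```javascript
const nodeCrypto = require('node:crypto');

// Hypothetical base engine with the same digest/sign call shape as above.
const baseEngine = {
  digest: async ({ name }, data) => nodeCrypto.createHash(name).update(data).digest(),
  sign: async () => {
    throw new Error('no local key available');
  },
};

// Returns a new object that delegates to `engine` through its prototype
// and routes only sign() to the external signature provider.
function withExternalSign(engine, hashAlgorithm, getSignature) {
  const wrapped = Object.create(engine);
  wrapped.sign = async (_algo, _key, data) => {
    // Calculate the hash, then hand both hash and data to the provider.
    const hash = await engine.digest({ name: hashAlgorithm }, data);
    return getSignature(Buffer.from(hash), Buffer.from(data));
  };
  return wrapped;
}
```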

Collaborator

@dcbr dcbr left a comment

Thanks for your review and feedback. I agree that pki.js typings shouldn't be exposed here. Although for the "internal"/"private" methods of the Signer implementation there is no way around this I believe? If so, please tell me how I can rewrite it :)

@dcbr
Collaborator

dcbr commented Apr 2, 2024

Regarding your last point, I'm not sure I fully understand what you mean.

> I don't really like this approach as it's mixing the responsibilities of signature generation (the actual signing), with the CMS structure generation.
>
> At the moment the signing interface is really simple and allows for any external signer to be used to generate signatures - all you need to do is effectively provide a single function that signs the provided data. see https://github.com/vbuch/node-signpdf/blob/v3.2.4/packages/utils/src/Signer.js
>
> What this change is really doing is taking the CMS structure generation in-house (which is all good) but it's mixed-in with the signing logic. IMO we need to separate these concerns and provide a way to generate the CMS structure, and then a way to do the signing.

I would argue that the current situation is even worse/more coupled and that the implementation here is a first step towards decoupling it. For example, when subclassing the current Signer, you indeed have to generate the CMS structure and sign it yourself (or let this be done by another service/library). In this implementation, when subclassing Signer or ExternalSigner you should only take care of the cryptographic signing step and the CMS structure is automatically generated for you already.

> We can think of it a bit like a pipeline. First the PDF needs to be created (or loaded), then a CMS structure is created, then the CMS structure is signed, then the CMS is embedded in the PDF. The signing part of this pipeline shouldn't have to be concerned with anything other than a buffer/uint8array that it receives and then returns a buffer/uint8array that represents the signature.

IMO that is exactly how the signing part is implemented in the new Signer and ExternalSigner classes (either by passing a uint8array signing key in the getKey method, or by signing a received uint8array in the getSignature method).
Further separating it to obtain this pipeline procedure, would be even better of course. But I think the current PR is already an improvement and first step in that direction?

@dhensby
Collaborator

dhensby commented Apr 2, 2024

I agree that this is an improvement on what we've currently got (which, as you rightly point out, is either a very rigid implementation from node-forge or a completely BYO CMS). My current implementation is the BYO CMS approach: I build and sign the entire signature and have that embedded, as the node-forge implementation is not "fit for purpose", so I'm totally on-side with improving this.

It's really the architectural approach I don't agree with here. If we have "private" methods that we are implementing for the crypto aspect, then I'd argue they should not even be in the class at all.

If we want this "all-in-one" signer class, I feel it should probably work something like:

// Use the JWA spec for algs as a fairly well shared standard
type SigningAlg = 'RS256' | 'RS384' | 'RS512' | 'ES256' | 'ES384' | 'ES512' | 'PS256' | 'PS384' | 'PS512';

// we may want to add `crls` or `ocsp` info here?! or a way to fetch that info??
interface SignerOptions {
  certificate: string | Uint8Array; // the certificate being used for signing
  signer: (alg: SigningAlg, payload: Uint8Array) => Promise<Uint8Array>;
  certificateChain?: (string | Uint8Array)[]; // the certificate chain
  digestAlgorithm?: SigningAlg;
  authenticatedAttributes?: Record<string, string>; // provide a way to override / add authenticated attrs
  unauthenticatedAttributes?: Record<string, string>; // provide a way to override / add unauthenticated attrs
}

class Signer {
  private opts: SignerOptions;

  constructor(opts: SignerOptions) {
    // just pseudo code, this would need to be validated, and stored against props properly
    this.opts = opts;
  }

  async sign(pdfBuffer: Uint8Array): Promise<Uint8Array> {
    // construct CMS encoded attrs for signing (pseudo code: yields `alg` and `signedAttrs`)
    const sig = await this.opts.signer(alg, signedAttrs);
    // finish constructing CMS data (pseudo code: `cmsData` embeds `sig`)
    return cmsData;
  }
}

This is obviously really simplified, it doesn't provide any way to support a .pfx/P12 certificate. It also makes a really hard assumption that there will only be one signer for the CMS, but there's obviously room in the spec for the CMS sig to have many signers...


After writing this out, I still don't like this approach. I think that we really need our own CMS interface that is in charge of creating the CMS structure, and then separately a Signer interface that can be passed to the CMS generator. The CMS structure isn't trivial: whilst it's easy to create a naive CMS signing interface, once you start drilling down into the actual spec it becomes quite complicated and there's a lot to be implemented (which I suspect is why this has been kept outside the remit of the library).

I will try to find some time to put something together...

@dcbr
Collaborator

dcbr commented Apr 3, 2024

Ok, thanks for the further clarifications.
This PR was my first attempt at providing some extra functionality that is currently not supported, but I'm definitely open to reconsidering the architecture for the better. My intent was to simplify the generation of PAdES-compliant signatures, even up to the LTV level (which indeed requires some OCSP or CRL data to be embedded in the CMS or the PDF DSS). By extending the proposed Signature class of this PR I was able to produce such signatures. It was still in a prototype phase and involved too many changes to include in this PR (which is already quite large on its own), but in my opinion it fit nicely into this class structure. The idea was then that a user could pass a "signature configuration" structure when constructing a Signer instance (with some predefined defaults for e.g. "pades-b-b", "pades-b-t", "pades-b-ltv" type signatures). Unfortunately I haven't had much time since to complete it (and I won't have in the coming months).

So if in the meantime you can come up with an alternative implementation, I'm all for it and very curious to see the result :)

Edit:
Just found the "signature configuration" interface I was using (that can then be passed to the Signer constructor):

interface SignConfiguration {
    // Signature dictionary
    subFilter: string; // Subfilter of the signature dictionary
    // Signed attributes
    signingTimeAttr: boolean; // Flag indicating whether the `signing-time` signed attribute should be included in the signature
    signingCertAttr: boolean; // Flag indicating whether the `signing-certificate-V2` signed attribute should be included in the signature
    revocationInfoAttr: boolean; // Flag indicating whether the `adbe-revocationInfoArchival` signed attribute should be included in the signature
    // Unsigned attributes
    signatureTimestamp: boolean; // Flag indicating whether the signature should be timestamped (using the `signature-time-stamp` unsigned attribute)
    // Other
    dss: boolean; // Flag indicating whether the signature certificates and revocation information should be added to the DSS
    documentTimestamp: boolean; // Flag indicating whether the document should be timestamped
};
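
A sketch of how predefined defaults could be combined with user overrides for such a configuration. `PRESETS` and `resolveSignConfiguration` are hypothetical names, only two presets are shown, and the flag values are an assumption based on a general reading of the PAdES baseline profiles rather than anything confirmed in this thread.

```javascript
// Assumed baseline values (not confirmed here): CAdES-detached subfilter,
// signing-certificate-V2 attribute included, everything else switched off.
const BASELINE = Object.freeze({
  subFilter: 'ETSI.CAdES.detached',
  signingTimeAttr: false,
  signingCertAttr: true,
  revocationInfoAttr: false,
  signatureTimestamp: false,
  dss: false,
  documentTimestamp: false,
});

// Illustrative presets; an LTV-style preset is left out on purpose.
const PRESETS = Object.freeze({
  'pades-b-b': BASELINE,
  'pades-b-t': Object.freeze({ ...BASELINE, signatureTimestamp: true }),
});

// Returns a fresh configuration: preset values first, overrides on top.
function resolveSignConfiguration(preset, overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(PRESETS, preset)) {
    throw new RangeError(`Unknown preset: ${preset}`);
  }
  return { ...PRESETS[preset], ...overrides };
}
```

For example, `resolveSignConfiguration('pades-b-t', { dss: true })` would start from the timestamped preset and additionally enable the DSS flag.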

@vbuch
Owner Author

vbuch commented Apr 4, 2024

Sorry. I'm really detached from the communication here. Just sharing an opinion I think I've shared before that is pretty shallow and maybe doesn't help: I prefer to have a minimal scope to maintain and maximum extensibility + modularity. So I imagine if you figure out a way to keep all the pkijs stuff in a separate package it would be the most awesome thing. Again, I didn't read the whole discussion you two have here and I don't think I will in the coming couple of months.

5 participants