Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate media type of manifest and its descendants #10

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion cmd/oci-image-validate/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,7 @@ func (v *validateCmd) validatePath(name string) error {

switch typ {
case image.TypeManifest:
return schema.MediaTypeManifest.Validate(f)
return image.ValidateManifestMediaType(f)
case image.TypeManifestList:
return schema.MediaTypeManifestList.Validate(f)
case image.TypeConfig:
Expand Down
32 changes: 31 additions & 1 deletion image/manifest.go
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ func findManifest(w walker, d *descriptor) (*manifest, error) {
return errors.Wrapf(err, "%s: error reading manifest", path)
}

if err := schema.MediaTypeManifest.Validate(bytes.NewReader(buf)); err != nil {
if err := ValidateManifestMediaType(bytes.NewReader(buf)); err != nil {
return errors.Wrapf(err, "%s: manifest validation failed", path)
}

Expand Down Expand Up @@ -240,3 +240,33 @@ loop:
}
return nil
}

// ValidateManifestMediaType validate the manifest schema media-type
// and its descendants fields media type, such as config and layers,
// to make sure they match to spec definition, or returns an error if
// the validation failed.
func ValidateManifestMediaType(r io.Reader) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a reason to expose media-type validation as a separate thing. There's just “validation”, which includes checking media types against spec requirements as well as checking other parameter values against spec requirements. That probably means we want a comparison with the appropriate JSON Schema (to catch things like schemaVersion is unset) and additional checks for cases not covered by JSON Schema (e.g. schemaVersion != 2). But I don't think a caller will care about picking and choosing particular subsets of validation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, validate command will call schema.MediaTypeManifestList.Validate() directly when validating signal manifest file. I think it is an efficient method to wrap descendants and schema checking in one function(ValidateManifestMediaType).

But I don't think a caller will care about picking and choosing particular subsets of validation.

The callers are image layout validation and manifest file validation. According to current calling subsequence, they are checking schema. I present the new function to add descendants checking, make it efficient, and keep current checked subsets un-lost.
Am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the call to schema.MediaTypeManifest.Validate at the end of this function, the name should probably just be ValidateManifest (since it covers descriptor media types locally and the JSON Schema via schema.MediaTypeManifest.Validate.

header := v1.Manifest{}

buf, err := ioutil.ReadAll(r)
if err != nil {
return errors.Wrapf(err, "error reading the io stream")
}

err = json.Unmarshal(buf, &header)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still no reason to read r into a buffer and then unpack it to JSON. You've previously used:

buf, err := ioutil.ReadAll(r)
…
err = json.NewDecoder(bytes.NewReader(buf)).Decode(&header)

And we all agreed that the NewReader call wasn't a good idea. But instead of DecodeUnmarshal, I still think you just want to use:

func ValidateManifestMediaType(reader io.Reader) error {
  manifest := v1.Manifest{}
  err = json.NewDecoder(reader).Decode(&manifest)
  …

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if that we will meet the problem [1] that reader(pointer) is seek to end, and the next using reader is failed of EOF. How to avoid this?
So far I not find a example to reuse Reader in context.

[1] #10 (comment)

if err != nil {
return errors.Wrap(err, "manifest format mismatch")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want an error-checked call to r.Close() here. No need to keep the reader open after you've read it while you walk the descendants.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, with 739f3ada574b2f you dropped the early defer r.Close(). I think you want to keep that so the reader is closed on early errors.

And there's no reason to read the file in and then create a bytes.NewReader around it. Just call json.NewDecoder(r)….


if header.Config.MediaType != string(v1.MediaTypeImageConfig) {
return fmt.Errorf("illegal config mediaType: %s", header.Config.MediaType)
}

for _, layer := range header.Layers {
if layer.MediaType != string(v1.MediaTypeImageLayer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably way to strict. I'm not sure these media types are "illegal".

return fmt.Errorf("illegal layer mediaType: %s", layer.MediaType)
}
}

return schema.MediaTypeManifest.Validate(bytes.NewReader(buf))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above function should be covered by this Validate call. What is going on here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seems to mean schema.MediaTypeManifest.Validate() have checked descendants as config and layer media types inside manifest? Actually it doesn't. Currently manifest validation is passed to:

{
    "annotations": null,
    "config": {
        "digest": "sha256:2b8fd9751c4c0f5dd266fcae00707e67a2545ef34f9a29354585f93dac906749",
        "mediaType": "foo",
        "size": 1459
    },  
    "layers": [
        {   
            "digest": "sha256:8ddc19f16526912237dd8af81971d5e4dd0587907234be2b83e249518d5b673f",
            "mediaType": "bar",
            "size": 667590
        }   
    ],  
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "schemaVersion": 2
}

schema.MediaTypeManifest.Validate() just check manifest own media type and schema patten, not involving descendants media type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this code is checking descendants. It decodes the manifest, then checks the mediatypes of the layers.

It looks like there is a bug in the validation. Go fix that rather than just adding more code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevvooe
This has been discussed [1]. The validation depends on JSON scheme, involving JSON patten files.
@wking and I think that JSON schema can not present all constraints and bug fix had better not to tie to JSON Schema too tightly.
According to current schema file, which shows common patten as below, it is some hard to add this constraints.

{
    "mediaType": {
      "id": "https://opencontainers.org/schema/image/mediaType",
      "type": "string",
      "pattern": "^[a-z]+/[0-9a-zA-Z.+]+$"
    }  
}

JSON schema is third party to image tools, and we just want to fix bug for implementor tool. So we select to fix it here.

[1] opencontainers/image-spec#273 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If json schema doesn't meet our validation needs, we need to drop it. Adding a new function is not the right way to fix it. It is just adding extra code. The Type.Validate(...) function should do everything that is needed. There shouldn't be two functions to validate the same thing.

}