Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

catch common resource schema issues in cfn validate #675

Merged
merged 10 commits into from
Feb 5, 2021

Conversation

PatMyron
Copy link
Contributor

@PatMyron PatMyron commented Jan 28, 2021

continuing #663, #668
fixes #395, #459, #414


how to run new validations on all existing resource provider schemas


curl -s -N --compressed https://d1uauaxba7bl26.cloudfront.net/latest/gzip/CloudFormationResourceSpecification.json | grep -E '"[a-g][i-z]'
        "error": {
        "code": {
CloudformationSchemas $ grep 'pattern.*arn:aws:' *
aws-appflow-flow.json:      "pattern" : "arn:aws:appflow:.*:[0-9]+:.*",
aws-appflow-flow.json:      "pattern" : "arn:aws:kms:.*:[0-9]+:.*",
aws-globalaccelerator-endpointgroup.json:      "pattern" : "^arn:aws:(\\w+)::(\\d{12}):(accelerator)/([0-9a-f-]+)/listener/([0-9a-f-]+)$"
aws-ivs-channel.json:      "pattern" : "^arn:aws:ivs:[a-z0-9-]+:[0-9]+:channel/[a-zA-Z0-9-]+$",
aws-ivs-playbackkeypair.json:      "pattern" : "^arn:aws:ivs:[a-z0-9-]+:[0-9]+:playback-key/[a-zA-Z0-9-]+$",
aws-ivs-streamkey.json:      "pattern" : "^arn:aws:ivs:[a-z0-9-]+:[0-9]+:stream-key/[a-zA-Z0-9-]+$",
aws-ivs-streamkey.json:      "pattern" : "^arn:aws:ivs:[a-z0-9-]+:[0-9]+:channel/[a-zA-Z0-9-]+$"
aws-mwaa-environment.json:      "pattern" : "^(((arn:aws(-[a-z]+)?:kms:[a-z]{2}-[a-z]+-\\d:\\d+:)?key\\/)?[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}|(arn:aws:kms:[a-z]{2}-[a-z]+-\\d:\\d+:)?alias/.+)$"
aws-opsworkscm-server.json:      "pattern" : "arn:aws:iam::[0-9]{12}:role/.*",
aws-opsworkscm-server.json:      "pattern" : "arn:aws:iam::[0-9]{12}:instance-profile/.*",
aws-sso-assignment.json:      "pattern" : "arn:aws:sso:::instance/(sso)?ins-[a-zA-Z0-9-.]{16}",
aws-sso-assignment.json:      "pattern" : "arn:aws:sso:::permissionSet/(sso)?ins-[a-zA-Z0-9-.]{16}/ps-[a-zA-Z0-9-./]{16}",
aws-sso-instanceaccesscontrolattributeconfiguration.json:      "pattern" : "arn:aws:sso:::instance/(sso)?ins-[a-zA-Z0-9-.]{16}",
aws-sso-permissionset.json:      "pattern" : "arn:aws:sso:::permissionSet/(sso)?ins-[a-zA-Z0-9-.]{16}/ps-[a-zA-Z0-9-./]{16}",
aws-sso-permissionset.json:      "pattern" : "arn:aws:sso:::instance/(sso)?ins-[a-zA-Z0-9-.]{16}",

@@ -1,6 +1,7 @@
import json
import logging
import os
import re
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this type of testing. Get rid of some of the java patterns would be helpful. My question is the python re the one we want to standardize on? For instance ^[\d\w-_.+]*$ is technically valid but in python it needs to be ^[\d\w\-_.+]*$.

Copy link
Contributor Author

@PatMyron PatMyron Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, resource schema patterns should be valid and equivalent in both Python and Java (and more)

JSON schema itself recommends sticking to a minimal subset of regular expression syntax. Let's encourage the same

Just emitting warnings for patterns not valid in Python seems like a good start in that direction:

  23 Could not validate regular expression: ^([\p{L}\p{Z}\p{N}_.:/=+\-@]*)$
   6 Could not validate regular expression: ^[a-zA-Z0-9_\-\+\./\(\)\$\p{Zs}]+$
   4 Could not validate regular expression: ^[a-zA-Z0-9-]+{1,255}$
   3 Could not validate regular expression: ^([\p{L}\p{Z}\p{N}_.:=+\/\-@%]*)$
   3 Could not validate regular expression: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF  ]*
   2 Could not validate regular expression: ^[\/]+([^~]*(~[01])*)*{1,512}$
   2 Could not validate regular expression: ^((?!aws:)[\p{L}\p{Z}\p{N}_.:/=+\-@]*)$
   2 Could not validate regular expression: [\p{L}\p{M}\p{Z}\p{S}\p{N}\p{P}]*
   1 Could not validate regular expression: ^[a-zA-Z0-9_\-\+\./\(\)\p{Zs}]*$
   1 Could not validate regular expression: ^[a-zA-Z0-9\s-_()\[\]]+$
   1 Could not validate regular expression: ^[a-zA-Z-0-9-:\/]*{1,1000}$
   1 Could not validate regular expression: ^[a-zA-Z-0-9-:\/.]*{1,1000}$
   1 Could not validate regular expression: ^[\p{L}\p{Z}\p{N}_.:\/=+\-@%]*$
   1 Could not validate regular expression: ^[\p{L}\p{Z}\p{N}_.:/=+\-@]*$|resource-groups:Name
   1 Could not validate regular expression: ^[A-Za-z0-9._\-:\/#\+]+{1,255}$
   1 Could not validate regular expression: ^(?!aws:.*)([\p{L}\p{Z}\p{N}_.:=+\/\-@%]*)$
   1 Could not validate regular expression: ^(?!aws:)[\p{L}\p{Z}\p{N}_.:\/=+\-@%]*$
   1 Could not validate regular expression: [^_\p{Z}][\p{L}\p{M}\p{S}\p{N}\p{P}][^_\p{Z}]+
   1 Could not validate regular expression: [\p{L}\p{Z}\p{N}_.:\/=+\-@]+
   1 Could not validate regular expression: [\p{L}\p{Z}\p{N}_.:\/=+\-@\[\]\{\}\$\\"]*

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON schema itself recommends sticking to a minimal subset of regular expression syntax. Let's encourage the same

Agreed completely here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Golang has some interesting regex limitations even within that minimal subset:
hashicorp/terraform-provider-awscc#88
golang/go#7252

@PatMyron PatMyron marked this pull request as ready for review February 3, 2021 06:45
@PatMyron PatMyron linked an issue Feb 3, 2021 that may be closed by this pull request
@PatMyron PatMyron added the schema Related to the provider meta schema label Feb 3, 2021
Copy link
Contributor

@johnttompkins johnttompkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would advise trying to use the schema flattener to check any nested object's properties. The flattener will place all nested schemas in the schema_map for use. iterating over these as well as the top level definitions and properties should allow you to definitively check all objects

src/rpdk/core/data_loaders.py Outdated Show resolved Hide resolved
src/rpdk/core/data_loaders.py Outdated Show resolved Hide resolved
Copy link
Contributor

@omkhegde omkhegde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I suggest adding a test case with a large schema file with a lot of nested properties.

@PatMyron PatMyron merged commit e874a82 into master Feb 5, 2021
@PatMyron PatMyron deleted the catch-schema-issues branch February 5, 2021 00:15
@PatMyron
Copy link
Contributor Author

PatMyron commented Feb 5, 2021

LGTM, I suggest adding a test case with a large schema file with a lot of nested properties.

ran cfn validate on all the existing resource schemas, which should have quite a few nested properties

kddejong pushed a commit to kddejong/cloudformation-cli that referenced this pull request Oct 24, 2022
…on#675)

* catching non-ASCII characters

* hardcoded patterns/enums

* lowercase properties

* incorrect min/max constraints

* invalid patterns
kddejong pushed a commit to kddejong/cloudformation-cli that referenced this pull request Oct 24, 2022
…on#675)

* catching non-ASCII characters

* hardcoded patterns/enums

* lowercase properties

* incorrect min/max constraints

* invalid patterns
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
schema processing schema Related to the provider meta schema
Projects
None yet
4 participants