Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

appmesh - added many tests - fixed big appmesh bug - extended CC to allow appmesh model to fit #9290

Merged
merged 43 commits into from
Mar 21, 2024

Conversation

Johnlon
Copy link
Contributor

@Johnlon Johnlon commented Feb 11, 2024

Update: my tests and class doco are sorted now

Moved some of my doco into query.py

Good to go.

#9307

Johnlon and others added 27 commits November 7, 2023 15:04
…on value not just a field name.

... to allow ....
- Changed appmesh virtual gateway "arn" attribute to be a path to the necessary attribute

Also ...
- Removed my override of get_resources because it resulted in the models in pull and in event modes to be different meaning policies would not be portable.
- Resolved some failing tests.
added missing tag data
added verification of all API calls
HOWEVER There are issues !! I CAOT MAKE TESTS FAIL BY SETTING FIELDS TO JUNK VALUES - see appmesh.py where I've included "!!!!" in some values.
… inside cloud cust that causes occasiional reordering)
… dict inside cloud cust that causes occasional reordering)

HACK changed a bunch of extension properties to be junk values with !!!! in the name to see if/where they are used.
…!!! in the name to see if/where they are used.
# Conflicts:
#	tests/test_appmesh.py
@Johnlon Johnlon requested a review from kapilt as a code owner February 11, 2024 05:36
Copy link
Member

@ajkerrigan ajkerrigan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some good questions in the comments here - although any helpful/clarifying bits might make more sense on the parent TypeInfo docstring rather than defined for appmesh specifically.

I dig the idea of trying to make tests more useful, though for some of the points we're using SDK info / shape details to do best effort validation rather than making everything a functional test. As far as I know that's a conscious design decision.

Comment on lines 34 to 38
# TODO: IS THIS EVEN USED AT RUNTIME?
# Junk value doesn't seem to affect the functionality of the extension
# but does fail the test PolicyMetaLint.test_valid_arn_type
# but that isn't a functional test.
arn_type = "mesh"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arn_type helps custodian build an ARN when the list/describe response doesn't provide one directly. Since we get this resource back in the arn property of a list response and you've already defined arn = "arn", you can remove arn_type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack - I've added a comment.
The other fields in QueryResourceManager.generate_arn are probably the same , apart maybe from ID.

Comment on lines 42 to 52
# TODO: IS THIS EVEN USED AT RUNTIME?
# Junk value doesn't seem to affect the functionality of the extension
# but does fail the test PolicyMetaLint.test_cfn_resource_validity
# but that isn't a functional test.
cfn_type = 'AWS::AppMesh::Mesh'

# TODO: IS THIS EVEN USED AT RUNTIME?
# Junk value doesn't seem to affect the functionality of the extension
# but does fail the test PolicyMetaLint.test_cfn_resource_validity
# but that isn't a functional test.
config_type = 'AWS::AppMesh::Mesh'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These types are used in a few places: (config compliance filters, config execution modes, the awscc provider...)

If there is Config/CloudFormation support for a resource type, these values are usually the same (which is why you often see patterns like cfn_type = config_type = 'AWS::AppMesh::Mesh'. There are exceptions like ACM certs, so it's still worth having separate types.

Custodian relies on shape data from the SDK and periodically updated value lists to validate these type properties, rather than adding functional tests for every resource type.

Copy link
Contributor Author

@Johnlon Johnlon Feb 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the PolicyMetaLint does check that the config_type value is know too AWS.

Documentation error in the parent class...
The existing doc comments say these are both optional but actually 'config_type' is mandatory

If cfn_type is optional then under what circumstances must it be provided?

Thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If CloudFormation supports a resource, we should record that in cfn_type. If that resource has explicit Config support, we should record that in config_type.

Custodian's Config filters/modes look for a config_type first, and can fall back to cfn_type.

cfn_type pops up in a couple extra spots though. A non-exhaustive list there would be the awscc provider and iam-analyzer filter.

I wouldn't expect the TypeInfo comments to always contains a complete list of where those properties are used, though examples are always neat. It's pretty common that Config or CloudFormation add support well after Custodian supports a resource type, so adding config_type or cfn_type down the road is fine too.`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood.

c7n/resources/appmesh.py Outdated Show resolved Hide resolved
Comment on lines 243 to 244
# TODO: WHERE IS THIS EVEN USED AT RUNTIME?
arn = "metadata.arn!!!!!!!"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_arns() pops up in a few places (like building a set of resource ARNs for bulk tagging operations).

That method tries to get an ARN for a resource type somehow, which means:

  • Do we already have an arn from the list/describe response? Cool, use that
  • Do we have an arn_type defined? Cool, we can use that to build an ARN from the parts we have

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for AppmeshVirtualGateway this wont work anyway as "arn" cannot be a path (no dots) and the ARN isn't top level in the datastruture.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overriding get_arns means the metadata for arn isn't used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - got that one thanks

c7n/resources/appmesh.py Outdated Show resolved Hide resolved
@Johnlon
Copy link
Contributor Author

Johnlon commented Feb 11, 2024

There are some good questions in the comments here - although any helpful/clarifying bits might make more sense on the parent TypeInfo docstring rather than defined for appmesh specifically.

I dig the idea of trying to make tests more useful, though for some of the points we're using SDK info / shape details to do best effort validation rather than making everything a functional test. As far as I know that's a conscious design decision.

Agree the comments should be moved - I'm just aggregating them here to sort them out.

DONE - most of the comments are not on in the parent class.

added further appmesh tests that user the Formatter class to verify that my extension definitions would pull the right fields out during reporting and found a bug by doing this!!
…d more tests

Added doco to Formatter
Small improvement style of get_arns()
…and model out of shape. The limitation of CC being fixed is that arn/id/data ought to allow a path not just a simple field name. Such path accesses have been optimised so the jmes search is only used if the name is actually a path.
@Johnlon
Copy link
Contributor Author

Johnlon commented Feb 28, 2024

…ble[[{}, str], None] see if this cures a runtime error on Python 3.8

def expect(self, calls: list[str], on_error: Callable[[dict, str], None] = _default_on_err):
  TypeError: 'type' object is not subscriptable
@Johnlon Johnlon changed the title appmesh - added many tests - but not enough - need help finding observable side effects of the config appmesh - added many tests - fixed big appmesh bug - extended CC to allow appmesh model to fit Feb 29, 2024
Copy link
Member

@ajkerrigan ajkerrigan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @Johnlon . Apologies for the delays reviewing, going to have to just chip in with partial comments here because there are a lot of different things going on here.

(Also I know you've asked multiple times to keep things focused on appmesh specifically, but it's hard to do that given the proposed changes.)

Comment on lines +974 to +975
jmespath_search is expensive and it's rarely the case that
there is a path in the id field, therefore this wrapper is an optimisation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a convenient side effect of @thisisshi 's work in #8533 that we call custodian's own jmespath_search() rather than directly using jmespath.search(). So if we're going to optimize away unnecessary calls, we handle that inside jmespath_search() which is already the most common pattern for extracting things that might be JMESPath expressions or simple top-level field keys.

(That could definitely be handled in a separate PR if we want to keep this one more focused, but this one is already operating at a pretty high/broad level in many of its changes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi I only did this because I a previous commit I was told that the Search was expensive. Welcome better alternative that allows a path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah thanks. Re the request for specificity. The earlier commit didnt include the core changes I made in the final commit. So there's a leap there. Apols

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajkerrigan should switch to jmespath_search() and optimise it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajkerrigan - hi do you want me to do something with jmespath_search as you suggested?
Thanks John

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, thanks

Copy link
Contributor Author

@Johnlon Johnlon Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at trying what you suggested however I am doubtful about the approach an my ability to make it float.

I thought this naive approach would work ....

def jmespath_search(*args, **kwargs):
    # DROP OUT IF ITS A FIELD 
    path = args[0]
    resource = args[1]
    if '.' not in path:
        return resource[path]

   # Otherwise it's a path...
   return jmespath.search(
           *args,
           **kwargs,
           options=jmespath.Options(custom_functions=C7NJmespathFunctions())
    )

But, the 3rd options passed to the jmespath.search mean my shortcut might interfer with existing usage if people are already passing simple fields to it.

Also I cam across cases where there were more than 2 args being passed to the wrapper "jmespath_search" meaning again I am uncertain whether my shortcut interfers with existing usage.

Any ideas? (BTW I am not a python programmer and not even sure how *args and *kwargs actually work or why one would need them)

Comment on lines +981 to +982
if '.' in path:
return jmespath_search(path, resource)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dig the idea of avoiding unnecessary searches, even if I'm not sure how much time we actually spend there (probably worth some profiling before we just assume savings).

That said, it's probably safer to use jmespath_compile() (caching the results where we can), and use a direct resource.get(key) if the resulting expression has type == 'field'. For example:

jmespath_compile("ResourceId")
{'type': 'field', 'children': [], 'value': 'ResourceId'}

jmespath_compile("&ResourceId")
{'type': 'expref', 'children': [{'type': 'field', 'children': [], 'value': 'ResourceId'}]}

jmespath_compile("ResourceIds[0]")
{'type': 'index_expression', 'children': [{'type': 'field', 'children': [], 'value': 'ResourceIds'}, {'type': 'index', 'value': 0, 'children': []}]}

jmespath_compile("split(',', Resources)")
{'type': 'function_expression', 'children': [{'type': 'literal', 'value': ',', 'children': []}, {'type': 'field', 'children': [], 'value': 'Resources'}], 'value': 'split'}

jmespath_compile("Resource.Id")
{'type': 'subexpression', 'children': [{'type': 'field', 'children': [], 'value': 'Resource'}, {'type': 'field', 'children': [], 'value': 'Id'}]}

And again, it seems helpful to generally treat this case as a "maybe-JMESPath-expression" and handle that at a higher level than this ARN key lookup specifically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajkerrigan sorry wasn't sure if you want me to make this optimisation or not?
RSVP thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH Thiis is beyond my python skills :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be permissible to have the original code I suggested since it doesn't raise as many questions as I don't think I'm up to the reorg you are suggesting.

Comment on lines +849 to +853
:param arn: Defines a field in the resource definition that contains the ARN value, when
the resource has an ARM..

This value is accessed used by the 'get_arns(..)' fn on the super-class
QueryResourceManager. This value must be a simple field name and cannot be a path.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in more depth later, but it seems totally reasonable to have this represent a JMESPath expression, and then optimize away unnecessary jmespath.search() calls as we like within custodian's derived jmespath_search() method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - that helps

Comment on lines +862 to +870
If you aren't going to define the "arn" field and can't rely on the "id" to be an
ARN then you might get lucky that "generate_arn" works for your resource type.
However, failing that then you should override "get_arns" function entirely and
implement your own logic.

Testing: Whatever approach you use (above) you REALLY SHOULD (!!!) include a unit
test that verifies that "get_arns" yields the right shape of ARNs for your resources.
This test should be implemented as an additional assertion within the unit tests
you'll be already planning to write.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of a judgement call here, but some of this feels like it goes beyond explaining the parameters and into "recommendations for creating new resources". This extra level of detail feels more appropriate in something like the Adding a New Resource documentation that is focused on helping contributors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Intention is to verify the API calls that my config and code is making cc perform, and. Not testing boto. Adding this allows.me to be sure that my apps behave as Intended in a way that placebo does not. For example placebo alone is incapable of doing verification of the values passed to the APIs or even the number of calls made really. This is simply a better mock of the boto API to confirm behaviour

Copy link
Contributor Author

@Johnlon Johnlon Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajkerrigan I understand your point about moving the comment to the "new resource" content . I was just concerned about it getting separated from the related comments in the preceding paragraph.

Thoughts?

from botocore.history import get_global_history_recorder


class ApiCallCaptor:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a cool use of botocore's history recording 👍 . I do want to be a bit careful though, since tests have both a value and a cost, and I don't want to get too hung up on testing botocore's underlying activity. It seems very easy to end up with a glut of tests at a finer grained level of detail than we want to maintain for the value we get out of them.

Copy link
Contributor Author

@Johnlon Johnlon Mar 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment re this being a test of my code and cc config rather than the APIs

I found this really useful and found some unexpected issues. I think it's therefore a useful approach that authors can choose to adopt or ignore. I can move the ApiCallCaptor back into my appmesh test but I can some other services n the pipeline and I'll want to use it there too.

Comment on lines +88 to +112
# The "placebo" testing library doesn't allow us to make assertions
# linking specific api's calls to the specific mock response file
# that will serve that request. So we will compensate here by
# making an assertion about all the api calls and the order
# of calls that must be made.
self.assertEqual(
[
{'operation': 'ListMeshes', 'params': {}, 'service': 'appmesh'},
{'operation': 'DescribeMesh', 'params': {'meshName': 'm1'}, 'service': 'appmesh'},
{'operation': 'DescribeMesh', 'params': {'meshName': 'm2'}, 'service': 'appmesh'},
{'operation': 'DescribeMesh', 'params': {'meshName': 'm3'}, 'service': 'appmesh'},
{
'operation': 'GetResources',
'params': {
'ResourceARNList': [
'arn:aws:appmesh:eu-west-2:123456789012:mesh/m1',
'arn:aws:appmesh:eu-west-2:123456789012:mesh/m2',
'arn:aws:appmesh:eu-west-2:123456789012:mesh/m3',
]
},
'service': 'resourcegroupstaggingapi',
},
],
captor.calls,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this level of detail from an observer standpoint, but I'm not sure what it functionally buys us to assert this during every test run. if we've already asserted that the shape/content of the policy results match expectations, do we need to also tick off each botocore call that got us there?

(This is a question not just for you, but for other maintainers. I can honestly see it from both sides, it's just a tradeoff.)

Copy link
Contributor Author

@Johnlon Johnlon Mar 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It found bugs in my code even though I had other tests. Defo a best practice imho.of course it's up to the committer I guess to.decide.the.level.of. Confidencd they want about their implementations behaviour

--

You can tell that I am big on testing. Those who know me will know my mantra that if there isn't a failable test then it's best to assume that the test subject is either broken or doesn't even exist. I can't work any other way and feel responsible :)

@Johnlon
Copy link
Contributor Author

Johnlon commented Mar 18, 2024

Hi @ajkerrigan any news on the review please.
Appreciated John

Copy link
Member

@ajkerrigan ajkerrigan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates 👍

@ajkerrigan ajkerrigan merged commit 1f854c9 into cloud-custodian:main Mar 21, 2024
22 checks passed
@Johnlon
Copy link
Contributor Author

Johnlon commented Mar 22, 2024

@ajkerrigan thanks for the help.

Please also take a look at this ..
https://github.com/Johnlon/cloud-custodian/blob/main/docs/johns-unofficial-dummys-guide.md

This is for my own benefit but also anyone else.
I'm not sure it's appropriate to commit it to CC as I also don't want to change it away from the current format which suits my purposes right now. But let me know your thougths.
I had it in a diffferent project up until now.

John

@ajkerrigan
Copy link
Member

@ajkerrigan thanks for the help.

Please also take a look at this .. https://github.com/Johnlon/cloud-custodian/blob/main/docs/johns-unofficial-dummys-guide.md

This is for my own benefit but also anyone else. I'm not sure it's appropriate to commit it to CC as I also don't want to change it away from the current format which suits my purposes right now. But let me know your thougths. I had it in a diffferent project up until now.

John

Thanks for the write-up. It's helpful to have an opinionated and detailed run through a bunch of contribution steps 👍 . I'm not sure we'd want to embed that whole thing in the docs, but perhaps could link to it as an external reference if you had it in a post somewhere?

I think the ApiCallCaptor will be very handy in certain spots. I'm looking forward to playing with it locally myself next time I'm deep in the botocore weeds, to see how it feels in practice compared to other patterns of inspecting call patterns. But I also want to be cautious about overusing that. As a concrete example, your example voice connector resource needs to list / get / fetch tags on the way to building a full Custodian resource. Rather than verifying all of the botocore calling patterns on the way, I would rather see folks assert that the expected properties exist on the final resource. If they don't, then by all means dig into the calls to find out why. It's neat to have a few tests showcasing that level of tracking, and validating that abstractions like universal tagging generally behave as expected. But doing it by default for every new resource feels like it carries more noise than signal.

@Johnlon
Copy link
Contributor Author

Johnlon commented Mar 23, 2024

@Johnlon
Copy link
Contributor Author

Johnlon commented Mar 23, 2024

Hi thanks. I will find a home for it.
Re the tagging test , the way I look at it is if there isn't a test then either it's broke or doesn't exist. A psychology stemming from early years of being optimistic and regretting it. I find that strong standardized patterns lead to easier to navigate , understand, trust , maintain code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants