RFC: Data classes for common AWS event triggers #164

Closed
10 tasks done
michaelbrewer opened this issue Sep 10, 2020 · 4 comments

@michaelbrewer
Contributor

michaelbrewer commented Sep 10, 2020

Key information

Summary

NOTE: This solution should not have any additional dependencies and be as lightweight as possible.

Create a set of data classes for all of the common AWS event triggers. These data classes will include:

  1. Docstrings for each of the fields
  2. Type hinting and code completion
  3. Later support for validation via the JSON Schema PR
  4. Helper functions to handle some of the decoding (e.g. CloudWatchLogsEvent decoding)
  5. Define enums where necessary
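
As a rough illustration of the list above, a dependency-free data class could be a thin wrapper around the original event dict, with typed, documented properties and an enum for fields with a fixed set of values. The class names `DictWrapper`, `DynamoDBRecordSketch` and `DynamoDBRecordEventName` are hypothetical and only sketch the idea; they are not taken from the PR.

```python
from enum import Enum
from typing import Any, Dict, Optional


class DictWrapper:
    """Hypothetical base class that keeps a reference to the original event dict."""

    def __init__(self, data: Dict[str, Any]):
        self._data = data

    def __getitem__(self, key: str) -> Any:
        # Raw dict-style access still works for fields without a property
        return self._data[key]


class DynamoDBRecordEventName(Enum):
    """Type of data modification performed on the DynamoDB table."""

    INSERT = "INSERT"
    MODIFY = "MODIFY"
    REMOVE = "REMOVE"


class DynamoDBRecordSketch(DictWrapper):
    @property
    def event_id(self) -> Optional[str]:
        """Identifier of the event recorded in this stream record."""
        return self._data.get("eventID")

    @property
    def event_name(self) -> Optional[DynamoDBRecordEventName]:
        """The eventName field converted to an enum member, or None if absent."""
        name = self._data.get("eventName")
        return None if name is None else DynamoDBRecordEventName(name)


record = DynamoDBRecordSketch({"eventID": "1", "eventName": "INSERT", "awsRegion": "us-east-1"})
print(record.event_id, record.event_name, record["awsRegion"])
```

Because the wrapper only stores a reference to the dict, it adds no dependencies and very little overhead, which matches the lightweight goal stated in the note above.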

Motivation

When defining a new Lambda function, you start with a handler that takes an event of type Dict[str, Any] and a context of type LambdaContext. You then have to find the corresponding documentation for the event structure, field types, and so on.

Proposal

For each Lambda trigger event there is a corresponding typed data class with docstrings extracted from the AWS documentation, plus helper methods for updating values and decoding any embedded data. The test suite also includes a series of example trigger events.

CloudWatchLogsEvent example usage:

from typing import Any, Dict

from aws_lambda_powertools.utilities.trigger import CloudWatchLogsEvent
from aws_lambda_powertools.utilities.typing import LambdaContext

def handler(event: Dict[str, Any], context: LambdaContext):
    # CloudWatchLogsEvent provides a typed data class with a method to decode gzipped log messages
    decoded_data = CloudWatchLogsEvent(event).decode_cloud_watch_logs_data()
    # CloudWatchLogsDecodedData is another typed data class with docstrings and getters
    for log_event in decoded_data.log_events:
        # handle_log_message stands in for your own processing function
        handle_log_message(log_event.message)
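
For context, the decoding that such a helper hides is roughly the following: CloudWatch Logs delivers the payload under `awslogs.data` as base64-encoded, gzip-compressed JSON. This is a minimal standard-library sketch of that step, not the PR's implementation; the function name `decode_cloud_watch_logs_payload` is hypothetical and the sample event is built by hand purely for demonstration.

```python
import base64
import gzip
import json
from typing import Any, Dict


def decode_cloud_watch_logs_payload(event: Dict[str, Any]) -> Dict[str, Any]:
    """Base64-decode, gunzip and JSON-parse the awslogs.data field."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed).decode("utf-8"))


# Build a hand-made sample event by reversing the steps (for demonstration only)
sample = {"messageType": "DATA_MESSAGE", "logEvents": [{"id": "1", "timestamp": 0, "message": "hello"}]}
encoded = base64.b64encode(gzip.compress(json.dumps(sample).encode("utf-8"))).decode("ascii")
sample_event = {"awslogs": {"data": encoded}}

for log_event in decode_cloud_watch_logs_payload(sample_event)["logEvents"]:
    print(log_event["message"])
```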

PreTokenGenerationTriggerEvent example usage:

def handler(event: Dict[str, Any], context: LambdaContext) -> Dict[str, Any]:
    # PreTokenGenerationTriggerEvent data class includes docstrings and helper methods
    pre_token_event = PreTokenGenerationTriggerEvent(event)

    claims_override_details = pre_token_event.response.claims_override_details
    if pre_token_event.user_name == "SpecialUser":
        # ClaimsOverrideDetails has convenient setters
        claims_override_details.claims_to_suppress = ["Email"]
        claims_override_details.claims_to_add_or_override = {"attribute": "value"}

    if "SpecialGroup" in pre_token_event.request.group_configuration.groups_to_override:
        # A setter method for only part of groupOverrideDetails
        claims_override_details.set_group_configuration_preferred_role("preferred_value")

    # Updates are made to the original dict
    return event
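
The reason the handler can simply return `event` is that the setters write through to the original dict. A minimal sketch of that write-through behaviour follows, assuming the Cognito pre token generation response shape (`response.claimsOverrideDetails`); the class name `ClaimsOverrideDetailsSketch` is hypothetical and the real class in the PR may differ.

```python
from typing import Any, Dict, List, Optional


class ClaimsOverrideDetailsSketch:
    """Hypothetical wrapper whose setters mutate the original event dict."""

    def __init__(self, event: Dict[str, Any]):
        response = event.setdefault("response", {})
        # Assumption: claimsOverrideDetails may be missing or None in the incoming event
        if response.get("claimsOverrideDetails") is None:
            response["claimsOverrideDetails"] = {}
        self._data = response["claimsOverrideDetails"]

    @property
    def claims_to_suppress(self) -> Optional[List[str]]:
        return self._data.get("claimsToSuppress")

    @claims_to_suppress.setter
    def claims_to_suppress(self, value: List[str]) -> None:
        self._data["claimsToSuppress"] = value

    def set_group_configuration_preferred_role(self, value: str) -> None:
        # Only touches preferredRole, leaving any other group override keys intact
        group = self._data.get("groupOverrideDetails") or {}
        group["preferredRole"] = value
        self._data["groupOverrideDetails"] = group


event = {"userName": "SpecialUser", "response": {"claimsOverrideDetails": None}}
details = ClaimsOverrideDetailsSketch(event)
details.claims_to_suppress = ["Email"]
details.set_group_configuration_preferred_role("preferred_value")
print(event["response"])
```

Holding a reference to the nested dict rather than a copy is what keeps the wrapper and the returned event in sync.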

Example screenshot of the docs:

[Image: Screen Shot 2020-09-10 at 12 15 59 PM]

List of initial trigger events to support:

  • API Gateway Proxy V1 and V2 events
  • CloudWatch log event with decoding of the gzipped logs
  • Cognito user pool events
  • DynamoDB stream events
  • EventBridge events
  • Kinesis stream events
  • S3 event notifications
  • SES events
  • SNS events
  • SQS events

Drawbacks

Why should we not do this?

This will add a lot of extra code to maintain, and we will have to keep it in sync when new event triggers are created. However, without it every new Lambda user has to do the same work themselves.

Do we need additional dependencies? Impact performance/package size?

No extra dependencies are needed.

Rationale and alternatives

  • What other designs have been considered? Why not them?
    There are alternatives that pull in a large set of dependencies and would still need a similar amount of work (like tracking down example events and documentation).

Unresolved questions

Mapping for keys like type, id, from, etc. to a corresponding getter method like get_id.
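
One way to read this question: `from` is a reserved Python keyword, and `id` and `type` are builtins, so none of them makes a comfortable property name. The sketch below shows one possible naming rule, using a hypothetical helper `getter_name` that falls back to a `get_` prefix only when the key collides. It is an illustration of the trade-off, not a decision made in this RFC, and it ignores camelCase-to-snake_case conversion.

```python
import builtins
import keyword


def getter_name(key: str) -> str:
    """Return a safe accessor name for an event key (hypothetical naming rule)."""
    # Keywords such as `from` cannot be used as attribute names at all;
    # builtins such as `id` and `type` can, but shadowing them is confusing.
    if keyword.iskeyword(key) or hasattr(builtins, key):
        return f"get_{key}"
    return key


for key in ("type", "id", "from", "body"):
    print(key, "->", getter_name(key))
```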

Optional, stash area for topics that need further development e.g. TBD
Add integration for validation code from other PRs

@michaelbrewer michaelbrewer added RFC triage Pending triage from maintainers labels Sep 10, 2020
@michaelbrewer
Contributor Author

@heitorlessa I put in an RFC for the existing PR on adding data classes for common Lambda triggers.

@heitorlessa
Contributor

Adding a reminder to myself to think this through next Friday as I blocked my morning for this.

I really like the idea of having a lightweight solution to this over Pydantic @michaelbrewer.

That said, these are initial questions I'd like us to discuss in this RFC:

  • Should we make this a separate package from the get go? Or start here and evolve towards that?

  • How could we make these classes reusable for the advanced parser PR on Pydantic (feat: Advanced parser utility (pydantic) #118)? Or should we?

  • How do we decide when to add a getter vs not to? i.e. popular properties vs properties that hold the payload data and potentially need transformation

cc @igorlg @nmoutschen @cakepietoast for a review and thoughts too

@michaelbrewer
Contributor Author

> Adding a reminder to myself to think this through next Friday as I blocked my morning for this.
>
> I really like the idea of having a lightweight solution to this over Pydantic @michaelbrewer.
>
> That said, these are initial questions I'd like us to discuss in this RFC:
>
>   • Should we make this a separate package from the get go? Or start here and evolve towards that?

We could use some of these helpers internally as part of the Powertools utilities. Either way it is up to you 👍. Ideally some of this could be generated from a message descriptor and ported to other languages (much like how the CDK generates its classes from the CloudFormation spec).

>   • How could we make these classes reusable for the advanced parser PR on Pydantic (feat: Advanced parser utility (pydantic) #118)? Or should we?

I think we could, by passing a validator callable to the constructor, e.g.:

# Where `event` is the original Dict from the handler and `validator_method` is the Callable that does validation
gateway_event = APIGatewayProxyEventV2(event, validator_method)

>   • How do we decide when to add a getter vs not to? i.e. popular properties vs properties that hold the payload data and potentially need transformation

I think we should have @property mappings for all of the possible fields, with docstrings, plus some helper methods that can help with transformations or decoding.

> cc @igorlg @nmoutschen @cakepietoast for a review and thoughts too
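
To make the validator idea in this reply a bit more concrete, here is a minimal sketch, assuming the validator is any callable that takes the raw event dict and raises on invalid input. `APIGatewayProxyEventV2Sketch` and `require_route_key` are hypothetical names for illustration; the actual constructor signature was not settled in this thread.

```python
from typing import Any, Callable, Dict, Optional

# Assumption: a validator receives the raw event and raises if it is invalid
Validator = Callable[[Dict[str, Any]], None]


class APIGatewayProxyEventV2Sketch:
    def __init__(self, data: Dict[str, Any], validator: Optional[Validator] = None):
        # Validation is opt-in, so the class itself stays dependency-free
        if validator is not None:
            validator(data)
        self._data = data

    @property
    def raw_path(self) -> Optional[str]:
        return self._data.get("rawPath")


def require_route_key(data: Dict[str, Any]) -> None:
    """Toy validator; a JSON Schema or Pydantic check could be plugged in the same way."""
    if "routeKey" not in data:
        raise ValueError("event is missing 'routeKey'")


gateway_event = APIGatewayProxyEventV2Sketch({"routeKey": "$default", "rawPath": "/my/path"}, require_route_key)
print(gateway_event.raw_path)
```

Keeping the validator as a plain callable means heavier validation libraries stay optional and live outside the data classes.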

@heitorlessa heitorlessa added area/utilities pending-release Fix or implementation already in dev waiting to be released and removed triage Pending triage from maintainers labels Sep 19, 2020
@to-mc to-mc mentioned this issue Sep 21, 2020
6 tasks
@to-mc
Contributor

to-mc commented Sep 22, 2020

Thanks for all the work on this @michaelbrewer! Closing now since we released this functionality in 1.6.0.

@to-mc to-mc closed this as completed Sep 22, 2020
@heitorlessa heitorlessa removed the pending-release Fix or implementation already in dev waiting to be released label Oct 26, 2020
3 participants