Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(parser): add KafkaMskEventModel and KafkaSelfManagedEventModel #1499

Merged
merged 27 commits into from
Sep 9, 2022

Conversation

ran-isenberg
Copy link
Contributor

@ran-isenberg ran-isenberg commented Sep 4, 2022

Issue number: #1498

Summary

parser support for kafka swelf managed events to lambda.
TBH still debating if the envelope is really useful, i think in most cases people should extend kafkamodel class and use it instead.

Changes

added envelope, models, tests docs.

Please provide a summary of what's being changed

User experience

see #1498

Checklist

If your change doesn't seem to apply, please leave them unchecked.

Is this a breaking change?

RFC issue number:

Checklist:

  • Migration process documented
  • Implement warnings (if it can live side by side)

Acknowledgment

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.


View rendered CHANGELOG.md
View rendered docs/core/event_handler/api_gateway.md
View rendered docs/utilities/parser.md

@ran-isenberg ran-isenberg requested a review from a team as a code owner September 4, 2022 14:10
@ran-isenberg ran-isenberg requested review from heitorlessa and removed request for a team September 4, 2022 14:10
@boring-cyborg boring-cyborg bot added documentation Improvements or additions to documentation tests labels Sep 4, 2022
@pull-request-size pull-request-size bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Sep 4, 2022
@pull-request-size pull-request-size bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 4, 2022
@ran-isenberg
Copy link
Contributor Author

i think there's something with spaces/ \n in 2 files for some reason

@ran-isenberg ran-isenberg changed the title feature: Kafka Parser support feature(parser): Kafka Parser support Sep 4, 2022
@leandrodamascena
Copy link
Contributor

Hi @ran-isenberg. Thank you very much for submitting this PR to improve the parser features. We were working on #1485 to make sure we had consistency in the fields and nothing was missing.

At first glance, I think we need to change things up in this PR.

1 - I can't see the eventSourceArn field. This is mandatory when the event source comes from MSK.
2 - Are you considering events coming only from Self-Managed Kafka? I think we should consider both: MSK and Self-Managed Kafka.

Feel free to ping me on Discord if you want.

Leandro

@ran-isenberg
Copy link
Contributor Author

Hi @ran-isenberg. Thank you very much for submitting this PR to improve the parser features. We were working on #1485 to make sure we had consistency in the fields and nothing was missing.

At first glance, I think we need to change things up in this PR.

1 - I can't see the eventSourceArn field. This is mandatory when the event source comes from MSK. 2 - Are you considering events coming only from Self-Managed Kafka? I think we should consider both: MSK and Self-Managed Kafka.

Feel free to ping me on Discord if you want.

Leandro

@leandrodamascena
I can add the second one, no worries.
However, you didnt answer my question about the envelope - do you think it's required in this case? i dont think it makes sense to get the value param without the extra data. maybe we should remove the envelope and stay with the model only like in other events?

@ran-isenberg
Copy link
Contributor Author

@leandrodamascena pushed the fixes

@ran-isenberg
Copy link
Contributor Author

@heitorlessa @leandrodamascena added some data class kafka docs fixes

@leandrodamascena
Copy link
Contributor

Hi @ran-isenberg! Answering your question: yes, we think it makes sense to keep the Kafka envelope to extract the logs from the record. We believe it makes the code cleaner and we follow the same one we used in Kinesis, for example.

Could you please rebase the repository locally and just send the code related to this PR? For some reason this PR came with previous changes to github actions and this makes it a little difficult to read the PR and understand everything that is being changed.

I will make considerations that we could consider changing before merging this PR. But as I see, this is a nice feature to parser events from Kafka.

Thanks 🚀

Ran Isenberg added 2 commits September 6, 2022 21:07
@ran-isenberg
Copy link
Contributor Author

Hi @ran-isenberg! Answering your question: yes, we think it makes sense to keep the Kafka envelope to extract the logs from the record. We believe it makes the code cleaner and we follow the same one we used in Kinesis, for example.

Could you please rebase the repository locally and just send the code related to this PR? For some reason this PR came with previous changes to github actions and this makes it a little difficult to read the PR and understand everything that is being changed.

I will make considerations that we could consider changing before merging this PR. But as I see, this is a nice feature to parser events from Kafka.

Thanks 🚀

take a look now ;)

@ran-isenberg
Copy link
Contributor Author

you might notice that there's a clear distinction between the two kafka events, unlike data classes, which is what i'd expect from a parser.

@leandrodamascena
Copy link
Contributor

you might notice that there's a clear distinction between the two kafka events, unlike data classes, which is what i'd expect from a parser.

Yes, I've seen that and it makes it clearer events from selfManaged and MSK. I'll open a issue to split the data classes into 2 events, just like the parser. 🏅

@codecov-commenter
Copy link

codecov-commenter commented Sep 7, 2022

Codecov Report

Merging #1499 (f41c82c) into develop (d3a0007) will decrease coverage by 0.06%.
The diff coverage is 94.36%.

@@             Coverage Diff             @@
##           develop    #1499      +/-   ##
===========================================
- Coverage    99.48%   99.42%   -0.07%     
===========================================
  Files          126      128       +2     
  Lines         5672     5743      +71     
  Branches       661      670       +9     
===========================================
+ Hits          5643     5710      +67     
- Misses          14       18       +4     
  Partials        15       15              
Impacted Files Coverage Δ
...a_powertools/utilities/data_classes/kafka_event.py 92.75% <ø> (ø)
aws_lambda_powertools/shared/functions.py 85.18% <73.33%> (-14.82%) ⬇️
..._powertools/utilities/parser/envelopes/__init__.py 100.00% <100.00%> (ø)
...bda_powertools/utilities/parser/envelopes/kafka.py 100.00% <100.00%> (ø)
...bda_powertools/utilities/parser/models/__init__.py 100.00% <100.00%> (ø)
...lambda_powertools/utilities/parser/models/kafka.py 100.00% <100.00%> (ø)
aws_lambda_powertools/tracing/tracer.py 100.00% <0.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.



class KafkaBaseEventModel(BaseModel):
bootstrapServers: Optional[List[str]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field is not optional. So, we must remove the Optional. Even the AWS documentation here doesn't mention the bootstrapServers field, I created an MSK cluster and added it as an event source of a Lambda and I was able to confirm that the bootstrapServers field is also present there.

image

I've notified the documentation team to include it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np, will fix it

tests/functional/parser/test_kafka.py Outdated Show resolved Hide resolved
tests/functional/parser/test_kafka.py Outdated Show resolved Hide resolved
Comment on lines 16 to 28
def _base64_decode(value: str) -> bytes:
try:
logger.debug("Decoding base64 Kafka record item before parsing")
return base64.b64decode(value)
except (BinAsciiError, TypeError):
raise ValueError("base64 decode failed")


def _bytes_to_string(value: bytes) -> str:
try:
return value.decode("utf-8")
except (BinAsciiError, TypeError):
raise ValueError("base64 UTF-8 decode failed")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we are adding new features and always improving our code, what do you think about moving these 2 functions to the shared functions file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, np

tests/functional/parser/test_kafka.py Outdated Show resolved Hide resolved
ran-isenberg and others added 4 commits September 7, 2022 13:08
Co-authored-by: Leandro Damascena <leandro.damascena@gmail.com>
Co-authored-by: Leandro Damascena <leandro.damascena@gmail.com>
Co-authored-by: Leandro Damascena <leandro.damascena@gmail.com>
@ran-isenberg
Copy link
Contributor Author

@leandrodamascena all fixed

@leandrodamascena
Copy link
Contributor

leandrodamascena commented Sep 7, 2022

@leandrodamascena all fixed

Thanks for addressing all changes very quickly! I made some minor changes to change the models name and make it clear to users.

Hi @heitorlessa, could you please take a look here? I would like to have a double check on this.

@leandrodamascena leandrodamascena self-assigned this Sep 7, 2022
@heitorlessa heitorlessa changed the title feature(parser): Kafka Parser support feat(parser): add KafkaMskEventModel and KafkaSelfManagedEventModel Sep 8, 2022
@github-actions github-actions bot added the feature New feature or functionality label Sep 8, 2022
@leandrodamascena
Copy link
Contributor

Hi @ran-isenberg! If you have no other considerations, it's okay to merge this PR.

Thanks again for helping to improve the project. 🚀

@ran-isenberg
Copy link
Contributor Author

ran-isenberg commented Sep 9, 2022

sure, thx for the changes.
Let's merge :)

@leandrodamascena leandrodamascena merged commit 0e7819b into aws-powertools:develop Sep 9, 2022
rubenfonseca pushed a commit that referenced this pull request Sep 13, 2022
…1499)

Co-authored-by: Leandro Damascena <leandro.damascena@gmail.com>
Co-authored-by: Heitor Lessa <lessa@amazon.com>
Co-authored-by: heitorlessa <lessa@amazon.co.uk>
Co-authored-by: Release bot <aws-devax-open-source@amazon.com>
Co-authored-by: Peter Schutt <peter@schutt.io>
Co-authored-by: Ran Isenberg <ran.isenberg@ranthebuilder.cloud>
Co-authored-by: Ran Isenberg <ran.isenberg@cyberark.com>
@ran-isenberg ran-isenberg deleted the kafka2 branch September 16, 2022 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation feature New feature or functionality size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants