Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sentry message improvement #279

Merged
merged 8 commits into from
Mar 29, 2021

Conversation

wRAR
Copy link
Member

@wRAR wRAR commented Dec 30, 2020

As discussed with Renne before, this is a small batch of improvements for SendSentryMessage copied from internal SH projects and already used there for some time. I've split them into several commits:

  1. Add the spider name to tags.
  2. Add the failed monitor names to tags.
  3. Calculate a hash of the failed monitor names and add it to tags.

I omitted another thing: setting the fingerprint to [spider_name, reason_hash] (so that the reports are only grouped into one if the list of failed monitors matches) as that's backward incompatible and probably needs either a setting or an easy way to do this in a subclass.

@codecov
Copy link

codecov bot commented Dec 30, 2020

Codecov Report

Merging #279 (4cb5102) into master (ef6d632) will decrease coverage by 0.64%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #279      +/-   ##
==========================================
- Coverage   73.66%   73.02%   -0.65%     
==========================================
  Files          68       68              
  Lines        2954     2980      +26     
  Branches      319      323       +4     
==========================================
  Hits         2176     2176              
- Misses        717      743      +26     
  Partials       61       61              
Impacted Files Coverage Δ
spidermon/contrib/actions/sentry/__init__.py 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ef6d632...bc73406. Read the comment docs.

Copy link
Collaborator

@rennerocha rennerocha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few comments and questions on the code. Please let me know if you have any questions.

suggestion: Include documentation explaining what tags are added to the notification.

for mon in group:
try:
k = mon[1].lower().replace(" ", "_")
# 32 chars undocumented tag key length
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Is this a limit from Sentry (not allowing tags with more than 32 chars)? If so, maybe it is worthy to include that information in the documentation of the action.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it's now documented:

Tag keys are limited to 32 characters.
Tag values are limited to 200 characters.

for key, group in groupby(failed_monitors, key=lambda x: x[0]):
for mon in group:
try:
k = mon[1].lower().replace(" ", "_")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: When failed method name has special characters (like á, *, etc), the tag is not applied correctly and Sentry adds an error in the notification.

invalid_chars

suggestion: It would be great if we handle these specials characters to avoid this problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch. I wonder how should we escape them (dropping them looks bad).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Maybe we may consider using a library like python-slugify to generate proper slug for this string, removing/replacing unwanted characters. This library is already a spidermon dependency (for validation stats), so you can use it with no need to include an extra requirement to the project.

try:
k = mon[1].lower().replace(" ", "_")
# 32 chars undocumented tag key length
tags.update({k[:32]: 1})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: What if I have two methods with the same name (at least in the first 32 characters)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there will be only one tag then. Note that this uses human-readable names if available so there is probably more variety. I also wonder if should strip leading "test_" to save some space.

Copy link
Collaborator

@rennerocha rennerocha Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: If you decorate your monitor class and/or your monitor methods with @monitors.name, the test_ part will not be added, so it won't avoid duplication. I think it is worthwhile to add a warning about this limitation in Sentry action documentation and the user can take care of not using long names for its monitor.

suggestion: Given that the format of the failed_monitors will always be <ClassName>/<MethodName> (even if the names are defined by the user) I think get_tags could be simplified to:

    def get_tags(self, message):
        tags = {
            "spider_name": message.get("spider_name", ""),
            "project_name": self.project_name,
        }

        for failed_monitor in message.get("failed_monitors", []):
            # 32 chars limited by Sentry
            key = slugify(failed_monitor.split("/")[-1])[:32]
            tags[key] = 1

        return tags

@@ -100,15 +102,47 @@ def get_message(self):

return message

@staticmethod
def get_tags(message):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: We don't have any tests for this action. It would be good to have at least some tests for get_tags method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have tests for anything in this module, what do you suggest?

@rennerocha rennerocha added this to the 1.15.0 milestone Jan 19, 2021
Copy link
Collaborator

@rennerocha rennerocha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost good to merge, just minor changes to make it work in a more general way):

  • Add slugify to generate better names for the tags (excluding unicode invalid values)
  • Document get_tags method informing which tags are included to the notification and adding an warning about the 32 chars limits for the tag name.

@rennerocha rennerocha self-requested a review March 29, 2021 19:41
@rennerocha rennerocha merged commit 456344f into scrapinghub:master Mar 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants