[crmsh-4.6] Fix: report.utils: Fix the performance issue (bsc#1232821) #1605

Merged

Collaborator

@liangxin1300 liangxin1300 commented Nov 5, 2024

The function extract_critical_log can be very slow when processing large log files. The main issue is the inefficiency of the regular expression, which combines multiple wildcards (.*) with alternation (|), leading to excessive backtracking. Additionally, the function reads the entire file into memory, which is not optimal for large files.

To improve performance, it is better to use grep with the -F option to search for fixed strings.
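
As a rough illustration of the `grep -F` approach described above, the following sketch shells out to `grep` with one `-e` option per fixed string. The helper names `build_grep_command` and `grep_fixed_strings` are hypothetical and are not taken from crmsh; only the pattern string comes from the diff in this PR.

```python
import subprocess

# Patterns taken from the new LOG_PATTERNS value shown in this PR's diff.
LOG_PATTERNS = "CRIT: ERROR: WARNING: crit: error: warning:"


def build_grep_command(patterns, log_file):
    """Build a `grep -F` argv with one -e option per fixed string.

    Hypothetical helper for illustration; not crmsh's actual code.
    """
    cmd = ["grep", "-F"]
    for pattern in patterns:
        cmd += ["-e", pattern]
    # "--" ends option parsing, so a file name starting with "-" is safe.
    cmd += ["--", log_file]
    return cmd


def grep_fixed_strings(patterns, log_file):
    """Return the lines of log_file that contain any of the fixed strings."""
    proc = subprocess.run(
        build_grep_command(patterns, log_file),
        capture_output=True,
        text=True,
        errors="replace",  # tolerate undecodable bytes in log files
    )
    # grep exits 0 on a match, 1 on no match, and >1 on a real error.
    if proc.returncode > 1:
        raise RuntimeError(proc.stderr.strip())
    return proc.stdout.splitlines()
```

With `-F`, each pattern is matched as a literal substring, so no regular-expression backtracking is involved, and `grep` streams the file instead of Python loading it into memory.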

For a sequence of archived log files, check each file's modification time. If the from time is later than a file's modification time, there is no need to check the remaining files.
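
The modification-time check could look roughly like the sketch below. It assumes the archived files are passed in ordered from newest to oldest and that the from time is a Unix timestamp in seconds; the function name `select_logs_in_range` is hypothetical and is not crmsh's API.

```python
import os


def select_logs_in_range(log_files, from_time):
    """Return the log files that may contain entries at or after from_time.

    Assumption: log_files is ordered newest first (for example the live log,
    then its rotated archives), and from_time is a Unix timestamp in seconds.
    """
    selected = []
    for path in log_files:
        # A file last modified before from_time cannot contain newer entries,
        # and every file after it in the sequence is older still, so stop.
        if from_time > os.path.getmtime(path):
            break
        selected.append(path)
    return selected
```

Stopping at the first file that is too old avoids opening or decompressing the older archives at all.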

codecov bot commented Nov 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.04%. Comparing base (6bff117) to head (11795f2).

Additional details and impacted files
Flag Coverage Δ
integration 52.69% <ø> (+<0.01%) ⬆️
unit 49.63% <ø> (ø)

see 1 file with indirect coverage changes


@liangxin1300 liangxin1300 force-pushed the 20241105_bsc_1232821 branch 2 times, most recently from 368d969 to 13b4f9e Compare November 6, 2024 14:47
@liangxin1300 liangxin1300 changed the title Fix: report.collect: Make sure the log is not None before using it (bsc#1232821) Fix: report.utils: Fix the performance issue (bsc#1232821) Nov 6, 2024
@liangxin1300 liangxin1300 changed the title Fix: report.utils: Fix the performance issue (bsc#1232821) [crmsh-4.6] Fix: report.utils: Fix the performance issue (bsc#1232821) Nov 6, 2024
@liangxin1300 liangxin1300 marked this pull request as ready for review November 7, 2024 07:11
@@ -3,7 +3,7 @@
 BIN_CRM = "/usr/sbin/crm"
 BIN_COLLECTOR = f"{BIN_CRM} report __collector"
 COMPRESS_DATA_FLAG = "COMPRESS CRM_REPORT DATA:::"
-LOG_PATTERNS = "CRIT: ERROR: error: warning: crit:"
+LOG_PATTERNS = "CRIT: ERROR: WARNING: crit: error: warning:"
Collaborator

Instead of defining these patterns as a space-separated string, we can define them directly as a list of strings.
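
The suggestion might look like the sketch below. The name `LOG_PATTERN_LIST` is hypothetical, and splitting the string on whitespace is an assumption about how the current value is consumed; the PR does not show that code.

```python
# Form used in the diff: one space-separated string, split where it is used.
LOG_PATTERNS = "CRIT: ERROR: WARNING: crit: error: warning:"
patterns_from_string = LOG_PATTERNS.split()

# Suggested form: a list of strings, usable directly without splitting.
# The name LOG_PATTERN_LIST is hypothetical, chosen only for this example.
LOG_PATTERN_LIST = ["CRIT:", "ERROR:", "WARNING:", "crit:", "error:", "warning:"]

# Both forms describe the same six patterns.
print(patterns_from_string == LOG_PATTERN_LIST)
```

A list also allows a pattern that itself contains a space, which the space-separated form cannot express.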

The function `extract_critical_log` can be very slow when processing
large log files. The main issue is the inefficiency of the regular
expression, which combines multiple wildcards (.*) with alternation (|),
leading to excessive backtracking. Additionally, the function reads the
entire file into memory, which is not optimal for large files.

To improve performance, it is better to use `grep` with the -F option to
search for fixed strings.
…dify time

No need to check the rest of the files if the from time is greater than
the modify time of the file.
@liangxin1300 liangxin1300 merged commit b49bfd5 into ClusterLabs:crmsh-4.6 Nov 8, 2024
30 checks passed