Skip to content

Conversation

@simonpasquier
Copy link
Contributor

@simonpasquier simonpasquier commented Nov 7, 2025

What this PR does / why we need it:

Add a new script and skill to parse and analyze Prometheus alerts contained in must-gathers.

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.

Summary by CodeRabbit

  • New Features

    • Added Prometheus alert analysis to must-gather: CLI-invokable, supports global or namespace-scoped runs, lists active alerts in a readable table, and shows totals for pending/firing alerts with input validation and clear error messages.
  • Documentation

    • Updated must-gather analysis docs and skill guidance to include Monitoring/Prometheus analysis, usage examples, integration into the full analysis flow, and expanded interpret-and-report steps.

Add a new script and skill to parse and analyze Prometheus alerts
contained in must-gathers.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 7, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 7, 2025

Note

.coderabbit.yaml has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)
Validation error: Unrecognized key(s) in object: 'tools'
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Walkthrough

Adds Prometheus alerts analysis to the must-gather analyzer: a new analyze_prometheus.py script that parses Prometheus rules/alerts from a must-gather, filters/sorts active alerts (optional namespace scope), and prints an alerts table and summary; documentation and CLI integration updated; README.md shows duplicated usage content.

Changes

Cohort / File(s) Summary
Documentation & CLI integration
plugins/must-gather/README.md, plugins/must-gather/commands/analyze.md, plugins/must-gather/skills/must-gather-analyzer/SKILL.md
Add documentation and CLI mapping for Prometheus/alerts/monitoring analysis: usage examples, keywords mapping to the new analyzer, inclusion of a Monitoring section in full analysis output, and updated interpret/report guidance. Note: README.md contains duplicated identical usage insertion.
Prometheus Analysis Implementation
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py
New Python script that loads rules.json from a must-gather, parses JSON with stderr warnings, validates structure, collects active alerts (type 'alerting' and state != 'inactive'), supports optional namespace filtering, sorts alerts (namespace, name, severity), prints a formatted alerts table and a concise summary, and exposes a CLI main() entrypoint.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant Analyzer
    participant RulesJSON
    participant Output

    User->>CLI: run analyze_prometheus.py <must_gather_path> [--namespace ns]
    CLI->>Analyzer: main() -> analyze_prometheus(path, namespace)
    Analyzer->>RulesJSON: parse_json_file(.../monitoring/prometheus/rules.json)
    RulesJSON-->>Analyzer: JSON data or parse error
    Analyzer->>Analyzer: validate structure
    Analyzer->>Analyzer: collect alerts where type=='alerting' and state!='inactive'
    alt namespace provided
        Analyzer->>Analyzer: filter alerts by namespace
    end
    Analyzer->>Analyzer: sort alerts (namespace, name, severity)
    Analyzer->>Output: print_alerts_table(alerts)
    Analyzer->>Output: print_summary(total, pending, firing)
    Output-->>User: formatted alerts table + summary (or "No alerts found." / warnings)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Focus areas:
    • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py: JSON parsing/error handling, alert filtering (type/state), namespace filtering, sorting, output formatting, and CLI validation.
    • Documentation consistency across commands/analyze.md, SKILL.md, and README.md, and removal of duplicated README content.

Pre-merge checks and finishing touches

✅ Passed checks (7 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and accurately summarizes the main change: adding Prometheus analysis functionality to the must-gather tool.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
No Real People Names In Style References ✅ Passed Review of all modified files found no real people names used in plugin commands, skill documentation, example prompts, instructions, or style references.
No Assumed Git Remote Names ✅ Passed No hardcoded git remote names like 'origin' or 'upstream' found in any modified files.
Git Push Safety Rules ✅ Passed No git push commands, force push operations, or automated push workflows detected in the PR changes.
No Untrusted Mcp Servers ✅ Passed Changes introduce analyze_prometheus.py with JSON parsing and Prometheus data analysis using only standard Python libraries, with no untrusted MCP server dependencies.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 7, 2025
@openshift-ci
Copy link

openshift-ci bot commented Nov 7, 2025

Hi @simonpasquier. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)

11-11: Remove unused import.

The attrgetter import is not used anywhere in the code.

Apply this diff:

-from operator import attrgetter

15-24: Consider more specific exception handling.

The bare Exception catch is broad. Consider catching specific exceptions for better error handling.

Apply this diff:

 def parse_json_file(file_path: Path) -> Optional[Dict[str, Any]]:
     """Parse a JSON file."""
     try:
         with open(file_path, 'r') as f:
             doc = json.load(f)
             return doc
-    except Exception as e:
+    except (FileNotFoundError, json.JSONDecodeError, PermissionError) as e:
         print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
     return None

78-82: Remove unnecessary f-string.

Line 79 uses an f-string without any placeholders.

Apply this diff:

-    print(f"\n{'='*80}")
+    print("\n" + "="*80)
     print(f"SUMMARY")
     print(f"Active alerts: {total_alerts} total ({pending} pending, {firing} firing)")
-    print(f"{'='*80}")
+    print("="*80)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 01eb538 and d76d384.

📒 Files selected for processing (4)
  • plugins/must-gather/README.md (1 hunks)
  • plugins/must-gather/commands/analyze.md (4 hunks)
  • plugins/must-gather/skills/must-gather-analyzer/SKILL.md (1 hunks)
  • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

1-1: Shebang is present but file is not executable

(EXE001)


21-21: Do not catch blind exception: Exception

(BLE001)


79-79: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (6)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

26-44: LGTM with minor note.

The function correctly formats and prints the alerts table. The use of pop() to extract and display specific labels while showing remaining labels is a good approach.


86-111: LGTM!

The main function properly validates input and handles command-line arguments correctly.

plugins/must-gather/skills/must-gather-analyzer/SKILL.md (2)

179-191: LGTM!

The new Monitoring Analysis section is well-documented and consistent with other analysis sections in the file.


192-200: LGTM!

The numbered steps improve the structure and readability of the interpretation guidance.

plugins/must-gather/commands/analyze.md (1)

28-28: LGTM!

All additions properly integrate Prometheus analysis into the command documentation. The keyword mappings, workflow steps, and output structure are consistent with existing patterns.

Also applies to: 111-111, 124-124, 181-182

plugins/must-gather/README.md (1)

229-252: LGTM—no duplicates found.

The verification confirms that analyze_prometheus.py documentation appears only once in the file. The section is complete, follows the established pattern, and requires no changes.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1)

57-57: Critical: Fix string interpolation bug.

Line 57 uses ${status} which will print the literal string "${status}" instead of the variable value. This should be {status}.

Apply this diff:

-        print(f"{rules_file}: unexpected status ${status}")
+        print(f"{rules_file}: unexpected status {status}", file=sys.stderr)
🧹 Nitpick comments (2)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

64-72: Add defensive error handling for malformed data structures.

Lines 66-68 use direct key access (group["rules"], rule["type"], rule["state"], rule["alerts"]) which could raise KeyError if the JSON structure is malformed, even after the validation at lines 60-62.

Apply this diff to add defensive handling:

     alerts = []
     for group in rules["data"]["groups"]:
-        for rule in group["rules"]:
-            if rule["type"] == 'alerting' and rule["state"] != 'inactive':
-                for alert in rule["alerts"]:
+        if not isinstance(group, dict) or "rules" not in group:
+            continue
+        for rule in group.get("rules", []):
+            if not isinstance(rule, dict):
+                continue
+            if rule.get("type") == 'alerting' and rule.get("state") != 'inactive':
+                for alert in rule.get("alerts", []):
                     if namespace is None or namespace == '':
                         alerts.append(alert)
                     elif alert.get('labels', {}).get('namespace', '') == namespace:

86-86: Remove unnecessary f-string prefix.

Line 86 uses an f-string without any placeholders. The f prefix can be removed.

Apply this diff:

-    print(f"{'='*80}")
+    print('='*80)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between d76d384 and 40491c2.

📒 Files selected for processing (1)
  • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

21-21: Do not catch blind exception: Exception

(BLE001)


86-86: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (5)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (5)

15-23: LGTM!

The function correctly handles JSON parsing errors and returns None on failure. The broad exception catch is appropriate here for defensive error handling in a utility script.


26-43: LGTM!

The table formatting is clean and the use of .pop() on labels (lines 39-41) intentionally removes namespace, alertname, and severity from the labels dictionary before printing remaining labels in the last column. This is good design.


93-113: LGTM!

The argument parsing and directory validation are well-implemented. The error handling appropriately prints to stderr and returns non-zero exit codes.


116-117: LGTM!

The entry point follows best practices for Python CLI scripts.


1-5: No action required—file already has execute permissions.

The file is already executable, so no permission changes are needed. The shebang is present and properly configured.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)

1-1: Shebang present — ensure file is executable

Make the script executable so tools like Ruff don’t flag EXE001.

chmod +x plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

49-56: Fix f-string bug for status and prefer Path joins; log to stderr

The print uses ${status} which renders literally; also use Path for portability and send errors to stderr. This was flagged previously.

-    rules_file = f"{base_path}/monitoring/prometheus/rules.json"
+    rules_file = base_path / "monitoring" / "prometheus" / "rules.json"
@@
-    if status != "success":
-        print(f"{rules_file}: unexpected status ${status}")
+    if status != "success":
+        print(f"{rules_file}: unexpected status {status}", file=sys.stderr)
         return 1

59-72: Guard nested keys when traversing rules/groups/alerts

Direct indexing risks KeyError on unexpected JSON. Use .get chains and safe defaults. Previously reported.

-    alerts = []
-    for group in rules["data"]["groups"]:
-        for rule in group["rules"]:
-            if rule["type"] == 'alerting' and rule["state"] != 'inactive':
-                for alert in rule["alerts"]:
-                    if namespace is None or namespace == '':
-                        alerts.append(alert)
-                    elif alert.get('labels', {}).get('namespace', '') == namespace:
-                        alerts.append(alert)
+    alerts = []
+    for group in rules.get("data", {}).get("groups", []):
+        for rule in group.get("rules", []):
+            if rule.get("type") == "alerting" and rule.get("state") != "inactive":
+                for alert in rule.get("alerts", []):
+                    if not namespace or (alert.get('labels', {}) or {}).get('namespace', '') == namespace:
+                        alerts.append(alert)
🧹 Nitpick comments (4)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (4)

3-5: Docstring claims status/targets but script only prints alerts

Either implement status/targets output or adjust the docstring to reflect current behavior.

-Analyze Prometheus data from must-gather data.
-Shows Prometheus status, targets, and active alerts.
+Analyze Prometheus alerts from a must-gather.
+Shows active alerts and a brief summary.

25-43: Don’t mutate alert labels; make “since” formatting safe

Using pop mutates the input alerts; also appending 'Z' on empty timestamps prints a stray 'Z'.

 def print_alerts_table(alerts):
@@
-    print(f"{'STATE':<10} {'NAMESPACE':<50} {'NAME':<50} {'SEVERITY':<10} {'SINCE':<20} LABELS")
+    print(f"{'STATE':<10} {'NAMESPACE':<50} {'NAME':<50} {'SEVERITY':<10} {'SINCE':<20} LABELS")
@@
-        state = alert.get('state', '')
-        since = alert.get('activeAt', '')[:19] + 'Z' # timestamps are always UTC.
-        labels = alert.get('labels', {})
-        namespace = labels.pop('namespace', '')[:50]
-        name = labels.pop('alertname', '')[:50]
-        severity = labels.pop('severity', '')[:10]
+        state = alert.get('state', '')
+        since_raw = alert.get('activeAt')
+        # Keep second precision when 'Z' is present; avoid adding 'Z' on empty values.
+        since = since_raw[:19] + 'Z' if since_raw and since_raw.endswith('Z') else (since_raw or '')
+        labels = dict(alert.get('labels') or {})
+        namespace = labels.get('namespace', '')[:50]
+        name = labels.get('alertname', '')[:50]
+        severity = labels.get('severity', '')[:10]
+        other_labels = {k: v for k, v in labels.items() if k not in ('namespace', 'alertname', 'severity')}
@@
-        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {labels}")
+        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {other_labels}")

12-12: Drop unused typing import

List isn’t used; trim the import.

-from typing import List, Dict, Any, Optional
+from typing import Dict, Any, Optional

108-111: Minor: prefer Path.is_dir for consistency

Not required, but since you already use Path elsewhere, Path.is_dir keeps style consistent.

-    if not os.path.isdir(args.must_gather_path):
+    if not Path(args.must_gather_path).is_dir():
         print(f"Error: Directory not found: {args.must_gather_path}", file=sys.stderr)
         return 1
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 40491c2 and 08418fb.

📒 Files selected for processing (1)
  • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

20-20: Do not catch blind exception: Exception

(BLE001)


85-85: f-string without any placeholders

Remove extraneous f prefix

(F541)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)

12-22: Tighten JSON parsing: explicit encoding + specific exceptions + wider path type.

Avoid blind except, set UTF‑8, and accept both str and Path. Keeps failures clear and detectable.

-from typing import List, Dict, Any, Optional
+from typing import List, Dict, Any, Optional, Union
@@
-def parse_json_file(file_path: Path) -> Optional[Dict[str, Any]]:
+def parse_json_file(file_path: Union[str, Path]) -> Optional[Dict[str, Any]]:
     """Parse a JSON file."""
     try:
-        with open(file_path, 'r') as f:
+        with open(file_path, 'r', encoding='utf-8') as f:
             doc = json.load(f)
             return doc
-    except Exception as e:
-        print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
+    except (FileNotFoundError, json.JSONDecodeError, OSError) as e:
+        print(f"Error: Failed to parse {file_path}: {e}", file=sys.stderr)
     return None

As per static analysis hints.


63-71: Guard nested JSON access to avoid KeyError on unexpected structures.

Indexing group["rules"], rule["type"], rule["state"], and rule["alerts"] can throw; use .get() with safe defaults.

-    for group in rules["data"]["groups"]:
-        for rule in group["rules"]:
-            if rule["type"] == 'alerting' and rule["state"] != 'inactive':
-                for alert in rule["alerts"]:
+    for group in rules["data"].get("groups", []):
+        for rule in group.get("rules", []) or []:
+            if rule.get("type") == "alerting" and rule.get("state") != "inactive":
+                for alert in rule.get("alerts", []) or []:
                     if namespace is None or namespace == '':
                         alerts.append(alert)
                     elif alert.get('labels', {}).get('namespace', '') == namespace:
                         alerts.append(alert)

84-87: Remove unnecessary f-strings and keep only the interpolated one.

print(f"SUMMARY") triggers F541; also simplify the separators.

-    print(f"\n{'='*80}")
-    print(f"SUMMARY")
+    print("\n" + "=" * 80)
+    print("SUMMARY")
     print(f"Active alerts: {total_alerts} total ({pending} pending, {firing} firing)")
-    print(f"{'='*80}")
+    print("=" * 80)

As per static analysis hints.

🧹 Nitpick comments (2)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

34-42: Don’t mutate input alerts when printing; handle missing/invalid timestamps.

Using pop() alters alert["labels"]. Derive display labels without mutation and only add 'Z' when a timestamp exists.

-        state = alert.get('state', '')
-        since = alert.get('activeAt', '')[:19] + 'Z' # timestamps are always UTC.
-        labels = alert.get('labels', {})
-        namespace = labels.pop('namespace', '')[:50]
-        name = labels.pop('alertname', '')[:50]
-        severity = labels.pop('severity', '')[:10]
+        state = alert.get('state', '')
+        active_at = alert.get('activeAt')
+        since = (active_at[:19] + 'Z') if isinstance(active_at, str) and active_at else ''
+        labels = alert.get('labels', {}) or {}
+        namespace = labels.get('namespace', '')[:50]
+        name = labels.get('alertname', '')[:50]
+        severity = labels.get('severity', '')[:10]
+        labels_to_print = {k: v for k, v in labels.items() if k not in {'namespace', 'alertname', 'severity'}}
@@
-        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {labels}")
+        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {labels_to_print}")

3-5: Align docstring with current behavior (alerts only).

Docstring mentions status/targets, but the script only analyzes active alerts from rules.json. Update or implement the missing parts.

-Analyze Prometheus data from must-gather data.
-Shows Prometheus status, targets, and active alerts.
+Analyze Prometheus alerts from must-gather data.
+Parses monitoring/prometheus/rules.json and reports active alerts (optionally filtered by namespace).
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 08418fb and dbfbe7e.

📒 Files selected for processing (1)
  • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

20-20: Do not catch blind exception: Exception

(BLE001)


85-85: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (2)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

73-75: Sorting key now uses alertname correctly. LGTM.


79-83: Good defensive .get() usage in summary counters.

simonpasquier and others added 3 commits November 10, 2025 11:00
…e_prometheus.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

87-88: Remove extraneous f-string prefix.

Line 88 has an unnecessary f prefix on a string without placeholders. This was flagged in a previous review but remains unfixed.

Apply this diff:

     print(f"\n{'='*80}")
-    print(f"SUMMARY")
+    print("SUMMARY")
     print(f"Active alerts: {total_alerts} total ({pending} pending, {firing} firing)")

14-22: Critical: Fix exception handling and add encoding parameter.

Despite past review comments marking this as addressed, the code still has the original issues:

  • Line 17 is missing the encoding='utf-8' parameter
  • Line 20 catches broad Exception instead of specific exceptions (triggers BLE001)
  • Line 21 says "Warning" but should say "Error" for consistency with other error messages

Apply this diff:

 def parse_json_file(file_path: Path) -> Optional[Dict[str, Any]]:
     """Parse a JSON file."""
     try:
-        with open(file_path, 'r') as f:
+        with open(file_path, 'r', encoding='utf-8') as f:
             doc = json.load(f)
             return doc
-    except Exception as e:
-        print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
+    except (FileNotFoundError, json.JSONDecodeError, OSError) as e:
+        print(f"Error: Failed to parse {file_path}: {e}", file=sys.stderr)
     return None
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between dbfbe7e and 9aeb04a.

📒 Files selected for processing (1)
  • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

50-50: unindent does not match any outer indentation level

(invalid-syntax)


51-51: Unexpected indentation

(invalid-syntax)


53-53: Unexpected indentation

(invalid-syntax)


56-56: Expected a statement

(invalid-syntax)


56-56: Expected an identifier, but found a keyword 'if' that cannot be used here

(invalid-syntax)


56-56: Simple statements must be separated by newlines or semicolons

(invalid-syntax)


56-57: Expected an expression

(invalid-syntax)


57-57: Unexpected indentation

(invalid-syntax)


59-59: Unexpected indentation

(invalid-syntax)


61-61: Expected a statement

(invalid-syntax)


61-61: Expected an identifier, but found a keyword 'if' that cannot be used here

(invalid-syntax)


61-61: Simple statements must be separated by newlines or semicolons

(invalid-syntax)


61-62: Expected an expression

(invalid-syntax)


62-62: Unexpected indentation

(invalid-syntax)


64-64: Unexpected indentation

(invalid-syntax)


66-66: unindent does not match any outer indentation level

(invalid-syntax)


67-67: Unexpected indentation

(invalid-syntax)


95-95: Expected a statement

(invalid-syntax)

🔇 Additional comments (2)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

25-42: LGTM!

The table formatting logic is clean and handles edge cases appropriately. Using pop() to extract specific labels before printing remaining ones is a good approach.


95-119: LGTM!

The CLI setup is well-structured with clear help text, examples, and proper error handling for invalid paths.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1)

49-63: CRITICAL: Fix indentation and remove duplicate lines.

The code has severe syntax errors preventing execution:

  1. Lines 49-50: Indented with 3 spaces instead of 4
  2. Lines 55-56: Duplicate if status != "success": statement (line 55 marked with - should be removed)
  3. Line 57: Indented with 7 spaces instead of 8
  4. Lines 60-61: Duplicate if "data" not in rules... statement (line 60 marked with - should be removed)
  5. Line 62: Indented with 7 spaces instead of 8

These are merge artifacts where old lines weren't fully removed and new lines have incorrect indentation.

Apply this diff to fix all indentation and remove duplicates:

     base_path = Path(must_gather_path)
 
     # Retrieve active alerts.
-   rules_path = base_path / "monitoring" / "prometheus" / "rules.json"
-   rules = parse_json_file(rules_path)
+    rules_path = base_path / "monitoring" / "prometheus" / "rules.json"
+    rules = parse_json_file(rules_path)
 
     if rules is None:
         return 1
     status = rules.get("status", "")
-   if status != "success":
-   if status != "success":
-       print(f"{rules_path}: unexpected status {status}", file=sys.stderr)
+    if status != "success":
+        print(f"{rules_path}: unexpected status {status}", file=sys.stderr)
         return 1
 
-   if "data" not in rules or "groups" not in rules["data"]:
-   if "data" not in rules or "groups" not in rules["data"]:
-       print(f"Error: Unexpected JSON structure in {rules_path}", file=sys.stderr)
+    if "data" not in rules or "groups" not in rules["data"]:
+        print(f"Error: Unexpected JSON structure in {rules_path}", file=sys.stderr)
         return 1
🧹 Nitpick comments (2)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

33-41: Consider defensive coding and avoid mutating input data.

Two improvements:

  1. Line 35: The slice [:19] will fail if activeAt is unexpectedly short or missing. Add a length check or use a more robust approach.
  2. Lines 37-39: Using pop() mutates the labels dictionary, which may surprise callers who reuse the alerts data. Use .get() instead to preserve the original structure.

Apply this diff:

     for alert in alerts:
         state = alert.get('state', '')
-        since = alert.get('activeAt', '')[:19] + 'Z' # timestamps are always UTC.
+        activeAt = alert.get('activeAt', '')
+        since = (activeAt[:19] + 'Z') if len(activeAt) >= 19 else activeAt
         labels = alert.get('labels', {})
-        namespace = labels.pop('namespace', '')[:50]
-        name = labels.pop('alertname', '')[:50]
-        severity = labels.pop('severity', '')[:10]
+        namespace = labels.get('namespace', '')[:50]
+        name = labels.get('alertname', '')[:50]
+        severity = labels.get('severity', '')[:10]
+        # Create a copy of labels without the keys already displayed
+        remaining_labels = {k: v for k, v in labels.items() if k not in ('namespace', 'alertname', 'severity')}
 
-        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {labels}")
+        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {remaining_labels}")

87-87: Remove unnecessary f-prefix.

The string has no placeholders, so the f prefix is unnecessary.

Apply this diff:

-    print(f"SUMMARY")
+    print("SUMMARY")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 9aeb04a and 10a05cb.

📒 Files selected for processing (1)
  • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

49-49: unindent does not match any outer indentation level

(invalid-syntax)


50-50: Unexpected indentation

(invalid-syntax)


52-52: Unexpected indentation

(invalid-syntax)


55-55: Expected a statement

(invalid-syntax)


55-55: Expected an identifier, but found a keyword 'if' that cannot be used here

(invalid-syntax)


55-55: Simple statements must be separated by newlines or semicolons

(invalid-syntax)


55-56: Expected an expression

(invalid-syntax)


56-56: Unexpected indentation

(invalid-syntax)


58-58: Unexpected indentation

(invalid-syntax)


60-60: Expected a statement

(invalid-syntax)


60-60: Expected an identifier, but found a keyword 'if' that cannot be used here

(invalid-syntax)


60-60: Simple statements must be separated by newlines or semicolons

(invalid-syntax)


60-61: Expected an expression

(invalid-syntax)


61-61: Unexpected indentation

(invalid-syntax)


63-63: Unexpected indentation

(invalid-syntax)


65-65: unindent does not match any outer indentation level

(invalid-syntax)


66-66: Unexpected indentation

(invalid-syntax)


94-94: Expected a statement

(invalid-syntax)

🔇 Additional comments (3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)

14-22: LGTM!

Error handling is specific and appropriate, with UTF-8 encoding explicitly set and errors logged to stderr.


94-118: LGTM!

The CLI implementation is clean with proper argument parsing, input validation, and error handling.


1-1: File is already executable; no action needed.

The file has correct execute permissions set (-rwxr-xr-x), satisfying the shebang requirement. No changes are needed.

@simonpasquier simonpasquier marked this pull request as ready for review November 10, 2025 10:07
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 10, 2025
@openshift-ci openshift-ci bot requested review from dgoodwin and stleerh November 10, 2025 10:07
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)

35-35: Consider more robust timestamp handling.

If activeAt is missing or empty, the result will be just 'Z', which may look odd in the table. Consider adding a check or using a default placeholder like 'N/A'.

Example:

-        since = alert.get('activeAt', '')[:19] + 'Z' # timestamps are always UTC.
+        active_at = alert.get('activeAt', '')
+        since = (active_at[:19] + 'Z') if len(active_at) >= 19 else 'N/A'

66-66: Use defensive dict access for rule fields.

Direct key access to rule["type"] and rule["state"] will raise KeyError if a rule is missing these fields. While the outer structure is validated, individual rules should be accessed defensively.

Apply this diff:

-            if rule["type"] == 'alerting' and rule["state"] != 'inactive':
+            if rule.get("type") == 'alerting' and rule.get("state") != 'inactive':

84-87: Remove extraneous f-string prefix.

Line 85 has an f-string with no placeholders or expressions. Lines 84 and 87 could also be simplified.

Apply this diff:

-    print(f"\n{'='*80}")
-    print(f"SUMMARY")
+    print("\n" + '='*80)
+    print("SUMMARY")
     print(f"Active alerts: {total_alerts} total ({pending} pending, {firing} firing)")
-    print(f"{'='*80}")
+    print('='*80)

As per static analysis hints.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 10a05cb and b7193e4.

📒 Files selected for processing (1)
  • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

85-85: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (2)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

14-22: LGTM!

The error handling is well-structured with specific exception types, explicit UTF-8 encoding, and appropriate error messaging to stderr.


92-116: LGTM!

The CLI interface is well-structured with clear argument parsing, input validation, helpful examples, and proper error handling with stderr output and exit codes.

@simonpasquier
Copy link
Contributor Author

/ok-to-test

@openshift-ci
Copy link

openshift-ci bot commented Nov 10, 2025

@simonpasquier: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/ok-to-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@LiangquanLi930
Copy link
Contributor

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 10, 2025
@Prashanth684
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 10, 2025
@stbenjam
Copy link
Member

/lgtm

@openshift-ci
Copy link

openshift-ci bot commented Nov 12, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Prashanth684, simonpasquier, stbenjam

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 775bb12 into openshift-eng:main Nov 12, 2025
4 checks passed
@simonpasquier simonpasquier deleted the add-prometheus-must-gather branch November 12, 2025 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants