feat(must-gather): add Prometheus analysis #136

simonpasquier · 2025-11-07T13:41:50Z

What this PR does / why we need it:

Add a new script and skill to parse and analyze Prometheus alerts contained in must-gathers.

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

Subject and description added to both, commit and PR.
Relevant issues have been referenced.
This change includes docs.

Summary by CodeRabbit

New Features
- Added Prometheus alert analysis to must-gather: CLI-invokable, supports global or namespace-scoped runs, lists active alerts in a readable table, and shows totals for pending/firing alerts with input validation and clear error messages.
Documentation
- Updated must-gather analysis docs and skill guidance to include Monitoring/Prometheus analysis, usage examples, integration into the full analysis flow, and expanded interpret-and-report steps.

Add a new script and skill to parse and analyze Prometheus alerts contained in must-gathers. Signed-off-by: Simon Pasquier <spasquie@redhat.com>

coderabbitai · 2025-11-07T13:42:01Z

Note

`.coderabbit.yaml` has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)

Validation error: Unrecognized key(s) in object: 'tools'

⚙️ Configuration instructions

Please see the configuration documentation for more information.
You can also validate your configuration using the online YAML validator.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Walkthrough

Adds Prometheus alerts analysis to the must-gather analyzer: a new analyze_prometheus.py script that parses Prometheus rules/alerts from a must-gather, filters/sorts active alerts (optional namespace scope), and prints an alerts table and summary; documentation and CLI integration updated; README.md shows duplicated usage content.

Changes

Cohort / File(s)	Summary
Documentation & CLI integration `plugins/must-gather/README.md`, `plugins/must-gather/commands/analyze.md`, `plugins/must-gather/skills/must-gather-analyzer/SKILL.md`	Add documentation and CLI mapping for Prometheus/alerts/monitoring analysis: usage examples, keywords mapping to the new analyzer, inclusion of a Monitoring section in full analysis output, and updated interpret/report guidance. Note: `README.md` contains duplicated identical usage insertion.
Prometheus Analysis Implementation `plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py`	New Python script that loads `rules.json` from a must-gather, parses JSON with stderr warnings, validates structure, collects active alerts (type `'alerting'` and `state != 'inactive'`), supports optional namespace filtering, sorts alerts (namespace, name, severity), prints a formatted alerts table and a concise summary, and exposes a CLI `main()` entrypoint.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant Analyzer
    participant RulesJSON
    participant Output

    User->>CLI: run analyze_prometheus.py <must_gather_path> [--namespace ns]
    CLI->>Analyzer: main() -> analyze_prometheus(path, namespace)
    Analyzer->>RulesJSON: parse_json_file(.../monitoring/prometheus/rules.json)
    RulesJSON-->>Analyzer: JSON data or parse error
    Analyzer->>Analyzer: validate structure
    Analyzer->>Analyzer: collect alerts where type=='alerting' and state!='inactive'
    alt namespace provided
        Analyzer->>Analyzer: filter alerts by namespace
    end
    Analyzer->>Analyzer: sort alerts (namespace, name, severity)
    Analyzer->>Output: print_alerts_table(alerts)
    Analyzer->>Output: print_summary(total, pending, firing)
    Output-->>User: formatted alerts table + summary (or "No alerts found." / warnings)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

Focus areas:
- plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py: JSON parsing/error handling, alert filtering (type/state), namespace filtering, sorting, output formatting, and CLI validation.
- Documentation consistency across commands/analyze.md, SKILL.md, and README.md, and removal of duplicated README content.

Pre-merge checks and finishing touches

✅ Passed checks (7 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and accurately summarizes the main change: adding Prometheus analysis functionality to the must-gather tool.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
No Real People Names In Style References	✅ Passed	Review of all modified files found no real people names used in plugin commands, skill documentation, example prompts, instructions, or style references.
No Assumed Git Remote Names	✅ Passed	No hardcoded git remote names like 'origin' or 'upstream' found in any modified files.
Git Push Safety Rules	✅ Passed	No git push commands, force push operations, or automated push workflows detected in the PR changes.
No Untrusted Mcp Servers	✅ Passed	Changes introduce analyze_prometheus.py with JSON parsing and Prometheus data analysis using only standard Python libraries, with no untrusted MCP server dependencies.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

openshift-ci · 2025-11-07T13:42:02Z

Hi @simonpasquier. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)
11-11: Remove unused import.

The attrgetter import is not used anywhere in the code.

Apply this diff:
-from operator import attrgetter
15-24: Consider more specific exception handling.

The bare Exception catch is broad. Consider catching specific exceptions for better error handling.

Apply this diff:
 def parse_json_file(file_path: Path) -> Optional[Dict[str, Any]]:
     """Parse a JSON file."""
     try:
         with open(file_path, 'r') as f:
             doc = json.load(f)
             return doc
-    except Exception as e:
+    except (FileNotFoundError, json.JSONDecodeError, PermissionError) as e:
         print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
     return None
78-82: Remove unnecessary f-string.

Line 79 uses an f-string without any placeholders.

Apply this diff:
-    print(f"\n{'='*80}")
+    print("\n" + "="*80)
     print(f"SUMMARY")
     print(f"Active alerts: {total_alerts} total ({pending} pending, {firing} firing)")
-    print(f"{'='*80}")
+    print("="*80)

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 01eb538 and d76d384.

📒 Files selected for processing (4)

plugins/must-gather/README.md (1 hunks)
plugins/must-gather/commands/analyze.md (4 hunks)
plugins/must-gather/skills/must-gather-analyzer/SKILL.md (1 hunks)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

1-1: Shebang is present but file is not executable

(EXE001)

21-21: Do not catch blind exception: Exception

(BLE001)

79-79: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (6)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

26-44: LGTM with minor note.

The function correctly formats and prints the alerts table. The use of pop() to extract and display specific labels while showing remaining labels is a good approach.

86-111: LGTM!

The main function properly validates input and handles command-line arguments correctly.

plugins/must-gather/skills/must-gather-analyzer/SKILL.md (2)

179-191: LGTM!

The new Monitoring Analysis section is well-documented and consistent with other analysis sections in the file.

192-200: LGTM!

The numbered steps improve the structure and readability of the interpretation guidance.

plugins/must-gather/commands/analyze.md (1)

28-28: LGTM!

All additions properly integrate Prometheus analysis into the command documentation. The keyword mappings, workflow steps, and output structure are consistent with existing patterns.

Also applies to: 111-111, 124-124, 181-182

plugins/must-gather/README.md (1)

229-252: LGTM—no duplicates found.

The verification confirms that analyze_prometheus.py documentation appears only once in the file. The section is complete, follows the established pattern, and requires no changes.

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1)
57-57: Critical: Fix string interpolation bug.

Line 57 uses ${status} which will print the literal string "${status}" instead of the variable value. This should be {status}.

Apply this diff:
-        print(f"{rules_file}: unexpected status ${status}")
+        print(f"{rules_file}: unexpected status {status}", file=sys.stderr)

🧹 Nitpick comments (2)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)
64-72: Add defensive error handling for malformed data structures.

Lines 66-68 use direct key access (group["rules"], rule["type"], rule["state"], rule["alerts"]) which could raise KeyError if the JSON structure is malformed, even after the validation at lines 60-62.

Apply this diff to add defensive handling:
     alerts = []
     for group in rules["data"]["groups"]:
-        for rule in group["rules"]:
-            if rule["type"] == 'alerting' and rule["state"] != 'inactive':
-                for alert in rule["alerts"]:
+        if not isinstance(group, dict) or "rules" not in group:
+            continue
+        for rule in group.get("rules", []):
+            if not isinstance(rule, dict):
+                continue
+            if rule.get("type") == 'alerting' and rule.get("state") != 'inactive':
+                for alert in rule.get("alerts", []):
                     if namespace is None or namespace == '':
                         alerts.append(alert)
                     elif alert.get('labels', {}).get('namespace', '') == namespace:
86-86: Remove unnecessary f-string prefix.

Line 86 uses an f-string without any placeholders. The f prefix can be removed.

Apply this diff:
-    print(f"{'='*80}")
+    print('='*80)

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between d76d384 and 40491c2.

📒 Files selected for processing (1)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

21-21: Do not catch blind exception: Exception

(BLE001)

86-86: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (5)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (5)

15-23: LGTM!

The function correctly handles JSON parsing errors and returns None on failure. The broad exception catch is appropriate here for defensive error handling in a utility script.

26-43: LGTM!

The table formatting is clean and the use of .pop() on labels (lines 39-41) intentionally removes namespace, alertname, and severity from the labels dictionary before printing remaining labels in the last column. This is good design.

93-113: LGTM!

The argument parsing and directory validation are well-implemented. The error handling appropriately prints to stderr and returns non-zero exit codes.

116-117: LGTM!

The entry point follows best practices for Python CLI scripts.

1-5: No action required—file already has execute permissions.

The file is already executable, so no permission changes are needed. The shebang is present and properly configured.

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)

1-1: Shebang present — ensure file is executable

Make the script executable so tools like Ruff don’t flag EXE001.

chmod +x plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

49-56: Fix f-string bug for status and prefer Path joins; log to stderr

The print uses ${status} which renders literally; also use Path for portability and send errors to stderr. This was flagged previously.

-    rules_file = f"{base_path}/monitoring/prometheus/rules.json"
+    rules_file = base_path / "monitoring" / "prometheus" / "rules.json"
@@
-    if status != "success":
-        print(f"{rules_file}: unexpected status ${status}")
+    if status != "success":
+        print(f"{rules_file}: unexpected status {status}", file=sys.stderr)
         return 1

59-72: Guard nested keys when traversing rules/groups/alerts

Direct indexing risks KeyError on unexpected JSON. Use .get chains and safe defaults. Previously reported.

-    alerts = []
-    for group in rules["data"]["groups"]:
-        for rule in group["rules"]:
-            if rule["type"] == 'alerting' and rule["state"] != 'inactive':
-                for alert in rule["alerts"]:
-                    if namespace is None or namespace == '':
-                        alerts.append(alert)
-                    elif alert.get('labels', {}).get('namespace', '') == namespace:
-                        alerts.append(alert)
+    alerts = []
+    for group in rules.get("data", {}).get("groups", []):
+        for rule in group.get("rules", []):
+            if rule.get("type") == "alerting" and rule.get("state") != "inactive":
+                for alert in rule.get("alerts", []):
+                    if not namespace or (alert.get('labels', {}) or {}).get('namespace', '') == namespace:
+                        alerts.append(alert)

🧹 Nitpick comments (4)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (4)

3-5: Docstring claims status/targets but script only prints alerts

Either implement status/targets output or adjust the docstring to reflect current behavior.

-Analyze Prometheus data from must-gather data.
-Shows Prometheus status, targets, and active alerts.
+Analyze Prometheus alerts from a must-gather.
+Shows active alerts and a brief summary.

25-43: Don’t mutate alert labels; make “since” formatting safe

Using pop mutates the input alerts; also appending 'Z' on empty timestamps prints a stray 'Z'.

 def print_alerts_table(alerts):
@@
-    print(f"{'STATE':<10} {'NAMESPACE':<50} {'NAME':<50} {'SEVERITY':<10} {'SINCE':<20} LABELS")
+    print(f"{'STATE':<10} {'NAMESPACE':<50} {'NAME':<50} {'SEVERITY':<10} {'SINCE':<20} LABELS")
@@
-        state = alert.get('state', '')
-        since = alert.get('activeAt', '')[:19] + 'Z' # timestamps are always UTC.
-        labels = alert.get('labels', {})
-        namespace = labels.pop('namespace', '')[:50]
-        name = labels.pop('alertname', '')[:50]
-        severity = labels.pop('severity', '')[:10]
+        state = alert.get('state', '')
+        since_raw = alert.get('activeAt')
+        # Keep second precision when 'Z' is present; avoid adding 'Z' on empty values.
+        since = since_raw[:19] + 'Z' if since_raw and since_raw.endswith('Z') else (since_raw or '')
+        labels = dict(alert.get('labels') or {})
+        namespace = labels.get('namespace', '')[:50]
+        name = labels.get('alertname', '')[:50]
+        severity = labels.get('severity', '')[:10]
+        other_labels = {k: v for k, v in labels.items() if k not in ('namespace', 'alertname', 'severity')}
@@
-        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {labels}")
+        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {other_labels}")

12-12: Drop unused typing import

List isn’t used; trim the import.

-from typing import List, Dict, Any, Optional
+from typing import Dict, Any, Optional

108-111: Minor: prefer Path.is_dir for consistency

Not required, but since you already use Path elsewhere, Path.is_dir keeps style consistent.

-    if not os.path.isdir(args.must_gather_path):
+    if not Path(args.must_gather_path).is_dir():
         print(f"Error: Directory not found: {args.must_gather_path}", file=sys.stderr)
         return 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 40491c2 and 08418fb.

📒 Files selected for processing (1)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

20-20: Do not catch blind exception: Exception

(BLE001)

85-85: f-string without any placeholders

Remove extraneous f prefix

(F541)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)

12-22: Tighten JSON parsing: explicit encoding + specific exceptions + wider path type.

Avoid blind except, set UTF‑8, and accept both str and Path. Keeps failures clear and detectable.

-from typing import List, Dict, Any, Optional
+from typing import List, Dict, Any, Optional, Union
@@
-def parse_json_file(file_path: Path) -> Optional[Dict[str, Any]]:
+def parse_json_file(file_path: Union[str, Path]) -> Optional[Dict[str, Any]]:
     """Parse a JSON file."""
     try:
-        with open(file_path, 'r') as f:
+        with open(file_path, 'r', encoding='utf-8') as f:
             doc = json.load(f)
             return doc
-    except Exception as e:
-        print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
+    except (FileNotFoundError, json.JSONDecodeError, OSError) as e:
+        print(f"Error: Failed to parse {file_path}: {e}", file=sys.stderr)
     return None

As per static analysis hints.

63-71: Guard nested JSON access to avoid KeyError on unexpected structures.

Indexing group["rules"], rule["type"], rule["state"], and rule["alerts"] can throw; use .get() with safe defaults.

-    for group in rules["data"]["groups"]:
-        for rule in group["rules"]:
-            if rule["type"] == 'alerting' and rule["state"] != 'inactive':
-                for alert in rule["alerts"]:
+    for group in rules["data"].get("groups", []):
+        for rule in group.get("rules", []) or []:
+            if rule.get("type") == "alerting" and rule.get("state") != "inactive":
+                for alert in rule.get("alerts", []) or []:
                     if namespace is None or namespace == '':
                         alerts.append(alert)
                     elif alert.get('labels', {}).get('namespace', '') == namespace:
                         alerts.append(alert)

84-87: Remove unnecessary f-strings and keep only the interpolated one.

print(f"SUMMARY") triggers F541; also simplify the separators.

-    print(f"\n{'='*80}")
-    print(f"SUMMARY")
+    print("\n" + "=" * 80)
+    print("SUMMARY")
     print(f"Active alerts: {total_alerts} total ({pending} pending, {firing} firing)")
-    print(f"{'='*80}")
+    print("=" * 80)

As per static analysis hints.

🧹 Nitpick comments (2)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

34-42: Don’t mutate input alerts when printing; handle missing/invalid timestamps.

Using pop() alters alert["labels"]. Derive display labels without mutation and only add 'Z' when a timestamp exists.

-        state = alert.get('state', '')
-        since = alert.get('activeAt', '')[:19] + 'Z' # timestamps are always UTC.
-        labels = alert.get('labels', {})
-        namespace = labels.pop('namespace', '')[:50]
-        name = labels.pop('alertname', '')[:50]
-        severity = labels.pop('severity', '')[:10]
+        state = alert.get('state', '')
+        active_at = alert.get('activeAt')
+        since = (active_at[:19] + 'Z') if isinstance(active_at, str) and active_at else ''
+        labels = alert.get('labels', {}) or {}
+        namespace = labels.get('namespace', '')[:50]
+        name = labels.get('alertname', '')[:50]
+        severity = labels.get('severity', '')[:10]
+        labels_to_print = {k: v for k, v in labels.items() if k not in {'namespace', 'alertname', 'severity'}}
@@
-        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {labels}")
+        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {labels_to_print}")

3-5: Align docstring with current behavior (alerts only).

Docstring mentions status/targets, but the script only analyzes active alerts from rules.json. Update or implement the missing parts.

-Analyze Prometheus data from must-gather data.
-Shows Prometheus status, targets, and active alerts.
+Analyze Prometheus alerts from must-gather data.
+Parses monitoring/prometheus/rules.json and reports active alerts (optionally filtered by namespace).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 08418fb and dbfbe7e.

📒 Files selected for processing (1)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

20-20: Do not catch blind exception: Exception

(BLE001)

85-85: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (2)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

73-75: Sorting key now uses alertname correctly. LGTM.

79-83: Good defensive .get() usage in summary counters.

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

…e_prometheus.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)
87-88: Remove extraneous f-string prefix.

Line 88 has an unnecessary f prefix on a string without placeholders. This was flagged in a previous review but remains unfixed.

Apply this diff:
     print(f"\n{'='*80}")
-    print(f"SUMMARY")
+    print("SUMMARY")
     print(f"Active alerts: {total_alerts} total ({pending} pending, {firing} firing)")
14-22: Critical: Fix exception handling and add encoding parameter.

Despite past review comments marking this as addressed, the code still has the original issues:

Line 17 is missing the encoding='utf-8' parameter

Line 20 catches broad Exception instead of specific exceptions (triggers BLE001)

Line 21 says "Warning" but should say "Error" for consistency with other error messages

Apply this diff:
 def parse_json_file(file_path: Path) -> Optional[Dict[str, Any]]:
     """Parse a JSON file."""
     try:
-        with open(file_path, 'r') as f:
+        with open(file_path, 'r', encoding='utf-8') as f:
             doc = json.load(f)
             return doc
-    except Exception as e:
-        print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
+    except (FileNotFoundError, json.JSONDecodeError, OSError) as e:
+        print(f"Error: Failed to parse {file_path}: {e}", file=sys.stderr)
     return None

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between dbfbe7e and 9aeb04a.

📒 Files selected for processing (1)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

50-50: unindent does not match any outer indentation level

(invalid-syntax)

51-51: Unexpected indentation

(invalid-syntax)

53-53: Unexpected indentation

(invalid-syntax)

56-56: Expected a statement

(invalid-syntax)

56-56: Expected an identifier, but found a keyword 'if' that cannot be used here

(invalid-syntax)

56-56: Simple statements must be separated by newlines or semicolons

(invalid-syntax)

56-57: Expected an expression

(invalid-syntax)

57-57: Unexpected indentation

(invalid-syntax)

59-59: Unexpected indentation

(invalid-syntax)

61-61: Expected a statement

(invalid-syntax)

61-61: Expected an identifier, but found a keyword 'if' that cannot be used here

(invalid-syntax)

61-61: Simple statements must be separated by newlines or semicolons

(invalid-syntax)

61-62: Expected an expression

(invalid-syntax)

62-62: Unexpected indentation

(invalid-syntax)

64-64: Unexpected indentation

(invalid-syntax)

66-66: unindent does not match any outer indentation level

(invalid-syntax)

67-67: Unexpected indentation

(invalid-syntax)

95-95: Expected a statement

(invalid-syntax)

🔇 Additional comments (2)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

25-42: LGTM!

The table formatting logic is clean and handles edge cases appropriately. Using pop() to extract specific labels before printing remaining ones is a good approach.

95-119: LGTM!

The CLI setup is well-structured with clear help text, examples, and proper error handling for invalid paths.

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1)
49-63: CRITICAL: Fix indentation and remove duplicate lines.

The code has severe syntax errors preventing execution:

Lines 49-50: Indented with 3 spaces instead of 4

Lines 55-56: Duplicate if status != "success": statement (line 55 marked with - should be removed)

Line 57: Indented with 7 spaces instead of 8

Lines 60-61: Duplicate if "data" not in rules... statement (line 60 marked with - should be removed)

Line 62: Indented with 7 spaces instead of 8

These are merge artifacts where old lines weren't fully removed and new lines have incorrect indentation.

Apply this diff to fix all indentation and remove duplicates:
     base_path = Path(must_gather_path)
 
     # Retrieve active alerts.
-   rules_path = base_path / "monitoring" / "prometheus" / "rules.json"
-   rules = parse_json_file(rules_path)
+    rules_path = base_path / "monitoring" / "prometheus" / "rules.json"
+    rules = parse_json_file(rules_path)
 
     if rules is None:
         return 1
     status = rules.get("status", "")
-   if status != "success":
-   if status != "success":
-       print(f"{rules_path}: unexpected status {status}", file=sys.stderr)
+    if status != "success":
+        print(f"{rules_path}: unexpected status {status}", file=sys.stderr)
         return 1
 
-   if "data" not in rules or "groups" not in rules["data"]:
-   if "data" not in rules or "groups" not in rules["data"]:
-       print(f"Error: Unexpected JSON structure in {rules_path}", file=sys.stderr)
+    if "data" not in rules or "groups" not in rules["data"]:
+        print(f"Error: Unexpected JSON structure in {rules_path}", file=sys.stderr)
         return 1

🧹 Nitpick comments (2)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

33-41: Consider defensive coding and avoid mutating input data.

Two improvements:

Line 35: The slice [:19] will fail if activeAt is unexpectedly short or missing. Add a length check or use a more robust approach.
Lines 37-39: Using pop() mutates the labels dictionary, which may surprise callers who reuse the alerts data. Use .get() instead to preserve the original structure.

Apply this diff:

     for alert in alerts:
         state = alert.get('state', '')
-        since = alert.get('activeAt', '')[:19] + 'Z' # timestamps are always UTC.
+        activeAt = alert.get('activeAt', '')
+        since = (activeAt[:19] + 'Z') if len(activeAt) >= 19 else activeAt
         labels = alert.get('labels', {})
-        namespace = labels.pop('namespace', '')[:50]
-        name = labels.pop('alertname', '')[:50]
-        severity = labels.pop('severity', '')[:10]
+        namespace = labels.get('namespace', '')[:50]
+        name = labels.get('alertname', '')[:50]
+        severity = labels.get('severity', '')[:10]
+        # Create a copy of labels without the keys already displayed
+        remaining_labels = {k: v for k, v in labels.items() if k not in ('namespace', 'alertname', 'severity')}
 
-        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {labels}")
+        print(f"{state:<10} {namespace:<50} {name:<50} {severity:<10} {since:<20} {remaining_labels}")

87-87: Remove unnecessary f-prefix.

The string has no placeholders, so the f prefix is unnecessary.

Apply this diff:

-    print(f"SUMMARY")
+    print("SUMMARY")

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 9aeb04a and 10a05cb.

📒 Files selected for processing (1)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

49-49: unindent does not match any outer indentation level

(invalid-syntax)

50-50: Unexpected indentation

(invalid-syntax)

52-52: Unexpected indentation

(invalid-syntax)

55-55: Expected a statement

(invalid-syntax)

55-55: Expected an identifier, but found a keyword 'if' that cannot be used here

(invalid-syntax)

55-55: Simple statements must be separated by newlines or semicolons

(invalid-syntax)

55-56: Expected an expression

(invalid-syntax)

56-56: Unexpected indentation

(invalid-syntax)

58-58: Unexpected indentation

(invalid-syntax)

60-60: Expected a statement

(invalid-syntax)

60-60: Expected an identifier, but found a keyword 'if' that cannot be used here

(invalid-syntax)

60-60: Simple statements must be separated by newlines or semicolons

(invalid-syntax)

60-61: Expected an expression

(invalid-syntax)

61-61: Unexpected indentation

(invalid-syntax)

63-63: Unexpected indentation

(invalid-syntax)

65-65: unindent does not match any outer indentation level

(invalid-syntax)

66-66: Unexpected indentation

(invalid-syntax)

94-94: Expected a statement

(invalid-syntax)

🔇 Additional comments (3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)

14-22: LGTM!

Error handling is specific and appropriate, with UTF-8 encoding explicitly set and errors logged to stderr.

94-118: LGTM!

The CLI implementation is clean with proper argument parsing, input validation, and error handling.

1-1: File is already executable; no action needed.

The file has correct execute permissions set (-rwxr-xr-x), satisfying the shebang requirement. No changes are needed.

…dd-prometheus-must-gather

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (3)
35-35: Consider more robust timestamp handling.

If activeAt is missing or empty, the result will be just 'Z', which may look odd in the table. Consider adding a check or using a default placeholder like 'N/A'.

Example:
-        since = alert.get('activeAt', '')[:19] + 'Z' # timestamps are always UTC.
+        active_at = alert.get('activeAt', '')
+        since = (active_at[:19] + 'Z') if len(active_at) >= 19 else 'N/A'
66-66: Use defensive dict access for rule fields.

Direct key access to rule["type"] and rule["state"] will raise KeyError if a rule is missing these fields. While the outer structure is validated, individual rules should be accessed defensively.

Apply this diff:
-            if rule["type"] == 'alerting' and rule["state"] != 'inactive':
+            if rule.get("type") == 'alerting' and rule.get("state") != 'inactive':
84-87: Remove extraneous f-string prefix.

Line 85 has an f-string with no placeholders or expressions. Lines 84 and 87 could also be simplified.

Apply this diff:
-    print(f"\n{'='*80}")
-    print(f"SUMMARY")
+    print("\n" + '='*80)
+    print("SUMMARY")
     print(f"Active alerts: {total_alerts} total ({pending} pending, {firing} firing)")
-    print(f"{'='*80}")
+    print('='*80)
As per static analysis hints.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 10a05cb and b7193e4.

📒 Files selected for processing (1)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py

85-85: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (2)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py (2)

14-22: LGTM!

The error handling is well-structured with specific exception types, explicit UTF-8 encoding, and appropriate error messaging to stderr.

92-116: LGTM!

The CLI interface is well-structured with clear argument parsing, input validation, helpful examples, and proper error handling with stderr output and exit codes.

simonpasquier · 2025-11-10T12:45:46Z

/ok-to-test

openshift-ci · 2025-11-10T12:46:02Z

@simonpasquier: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/ok-to-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

LiangquanLi930 · 2025-11-10T14:56:33Z

/ok-to-test

Prashanth684 · 2025-11-10T16:45:56Z

/lgtm

stbenjam · 2025-11-12T15:49:46Z

/lgtm

openshift-ci · 2025-11-12T15:49:52Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Prashanth684, simonpasquier, stbenjam

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [stbenjam]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

feat(must-gather): add Prometheus analysis

d76d384

Add a new script and skill to parse and analyze Prometheus alerts contained in must-gathers. Signed-off-by: Simon Pasquier <spasquie@redhat.com>

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 7, 2025

openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 7, 2025

coderabbitai bot reviewed Nov 7, 2025

View reviewed changes

Address review comments

40491c2

coderabbitai bot reviewed Nov 7, 2025

View reviewed changes

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py Outdated Show resolved Hide resolved

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py Outdated Show resolved Hide resolved

simonpasquier added 3 commits November 7, 2025 15:12

Remove unused import

fc67306

Avoid KeyError exceptions

08418fb

Fix interpolation

dbfbe7e

coderabbitai bot reviewed Nov 7, 2025

View reviewed changes

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py Show resolved Hide resolved

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py Show resolved Hide resolved

coderabbitai bot reviewed Nov 7, 2025

View reviewed changes

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py Show resolved Hide resolved

simonpasquier and others added 3 commits November 10, 2025 11:00

Update plugins/must-gather/skills/must-gather-analyzer/scripts/analyz…

47b6676

…e_prometheus.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Fix bug

9aeb04a

Apply suggestion from @coderabbitai[bot]

10a05cb

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot reviewed Nov 10, 2025

View reviewed changes

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_prometheus.py Outdated Show resolved Hide resolved

Fix more bugs from CodeRabbit

f4b7247

coderabbitai bot reviewed Nov 10, 2025

View reviewed changes

Merge remote-tracking branch 'fork/add-prometheus-must-gather' into a…

b7193e4

…dd-prometheus-must-gather

simonpasquier marked this pull request as ready for review November 10, 2025 10:07

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 10, 2025

openshift-ci bot requested review from dgoodwin and stleerh November 10, 2025 10:07

coderabbitai bot reviewed Nov 10, 2025

View reviewed changes

openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 10, 2025

openshift-ci bot assigned Prashanth684 Nov 10, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 10, 2025

openshift-ci bot assigned stbenjam Nov 12, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2025

openshift-merge-bot bot merged commit 775bb12 into openshift-eng:main Nov 12, 2025
4 checks passed

simonpasquier deleted the add-prometheus-must-gather branch November 12, 2025 15:55

feat(must-gather): add Prometheus analysis #136

feat(must-gather): add Prometheus analysis #136

Uh oh!

Conversation

simonpasquier commented Nov 7, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Checklist:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

.coderabbit.yaml has unrecognized properties

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

openshift-ci bot commented Nov 7, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

simonpasquier commented Nov 10, 2025

Uh oh!

openshift-ci bot commented Nov 10, 2025

Uh oh!

LiangquanLi930 commented Nov 10, 2025

Uh oh!

Prashanth684 commented Nov 10, 2025

Uh oh!

stbenjam commented Nov 12, 2025

Uh oh!

openshift-ci bot commented Nov 12, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

simonpasquier commented Nov 7, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 7, 2025 •

edited

Loading

`.coderabbit.yaml` has unrecognized properties