
Fix: Minor logging uplift for debugging of prompt injection mitigation #7195

Merged
dorien-koelemeijer merged 3 commits into main from fix/pattern-based-fallback on Feb 17, 2026
Conversation

dorien-koelemeijer (Collaborator) commented Feb 13, 2026

Summary

  • Minor logging uplift to validate that scanning falls back to pattern-based detection when the prompt injection mitigation feature is enabled but command injection detection is unavailable or misconfigured.
  • Datadog metrics uplift

Type of Change

  • Feature
  • Bug fix
  • Refactor / Code quality
  • Performance improvement
  • Documentation
  • Tests
  • Security fix
  • Build / Release
  • Other (specify below)

AI Assistance

  • This PR was created or reviewed with AI assistance

Testing

Local/manual testing.

Copilot AI review requested due to automatic review settings February 13, 2026 00:55
Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the prompt-injection scanning internals to surface (via logging) whether pattern-based scanning was used as a fallback when ML-based command injection detection isn’t available or fails.

Changes:

  • Add a used_pattern_detection flag to DetailedScanResult to track when pattern-based scanning was used.
  • Switch the tracing::info! field has_patterns to report the fallback-path usage rather than presence of pattern matches.
  • Propagate used_pattern_detection through intermediate scan results used to build the final explanation/logging.
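As a rough illustration of the bullets above, the scan result carrying the new flag might look like this. Only `used_pattern_detection` and the field names quoted in the review hunks come from the PR; the overall struct shape and the helper function are assumptions for illustration:

```rust
/// Sketch of the scan result with the new fallback flag.
/// Field names follow the hunk quoted in the review below;
/// the derive list and helper are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct DetailedScanResult {
    pub confidence: f32,
    pub pattern_matches: Vec<String>,
    pub ml_confidence: Option<f32>,
    /// True when pattern-based scanning ran as the fallback path.
    pub used_pattern_detection: bool,
}

/// Build a result for the fallback path: no ML signal, flag set.
pub fn pattern_fallback_result(confidence: f32, matches: Vec<String>) -> DetailedScanResult {
    DetailedScanResult {
        confidence,
        pattern_matches: matches,
        ml_confidence: None,
        used_pattern_detection: true,
    }
}
```

With a flag like this in place, the `tracing::info!` field can report the fallback-path usage directly rather than inferring it from whether any pattern matches were found.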

Copilot AI review requested due to automatic review settings February 17, 2026 01:00
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment on lines 243 to 247
confidence: max_confidence,
pattern_matches: Vec::new(),
ml_confidence: Some(max_confidence),
used_pattern_detection: false,
})
Copilot AI Feb 17, 2026

scan_conversation always sets ml_confidence: Some(max_confidence) even if every classifier call failed (all scan_with_classifier results were None), which makes downstream logic treat this as a real ML signal (and currently reduces tool_confidence by 10% when the value is 0.0). Track whether any classification succeeded (e.g., fold an Option<f32> or keep a success flag) and return ml_confidence: None when there were no successful results.
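A minimal sketch of the fold this comment suggests (the function name and call sites are hypothetical; the point is that the result is `None`, not `Some(0.0)`, when no classifier call succeeded):

```rust
/// Fold per-call classifier outputs into a single ML confidence.
/// Returns None when every classifier call failed, so downstream
/// logic can distinguish "no ML signal" from a real score of 0.0.
fn fold_ml_confidence(results: &[Option<f32>]) -> Option<f32> {
    // Skip the failed (None) calls, then keep the maximum score, if any.
    results.iter().flatten().copied().reduce(f32::max)
}
```

`scan_conversation` could then set `ml_confidence: fold_ml_confidence(&results)` instead of unconditionally wrapping `max_confidence` in `Some`.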

Comment on lines +80 to +81
monotonic_counter.goose.security_command_classifier_enabled = if command_classifier_enabled { 1 } else { 0 },
monotonic_counter.goose.security_prompt_classifier_enabled = if prompt_classifier_enabled { 1 } else { 0 },
Copilot AI Feb 17, 2026

Using monotonic_counter.* = if enabled { 1 } else { 0 } is likely to produce confusing metrics (a counter with value 0 is typically a no-op and the name suggests a gauge); consider logging the booleans as normal fields (e.g. command_classifier_enabled = ...) and emitting a separate monotonic_counter metric with value 1 (or separate enabled/disabled counters) so disabled configurations are still observable.

Suggested change
monotonic_counter.goose.security_command_classifier_enabled = if command_classifier_enabled { 1 } else { 0 },
monotonic_counter.goose.security_prompt_classifier_enabled = if prompt_classifier_enabled { 1 } else { 0 },
monotonic_counter.goose.security_classifier_configuration_logged = 1,
security_command_classifier_enabled = command_classifier_enabled,
security_prompt_classifier_enabled = prompt_classifier_enabled,

dorien-koelemeijer added this pull request to the merge queue Feb 17, 2026
Merged via the queue into main with commit e32720f Feb 17, 2026
26 checks passed
dorien-koelemeijer deleted the fix/pattern-based-fallback branch February 17, 2026 23:28
jh-block added a commit that referenced this pull request Feb 18, 2026
* origin/main: (49 commits)
  chore: show important keys for provider configuration (#7265)
  fix: subrecipe relative path with summon (#7295)
  fix extension selector not displaying the correct enabled extensions (#7290)
  Use the working dir from the session (#7285)
  Fix: Minor logging uplift for debugging of prompt injection mitigation (#7195)
  feat(otel): make otel logging level configurable (#7271)
  docs: add documentation for Top Of Mind extension (#7283)
  Document gemini 3 thinking levels (#7282)
  docs: stream subagent tool calls (#7280)
  Docs: delete custom provider in desktop (#7279)
  Everything is streaming (#7247)
  openai: responses models and hardens event streaming handling (#6831)
  docs: disable ai session naming (#7194)
  Added cmd to validate bundled extensions json (#7217)
  working_dir usage more clear in add_extension (#6958)
  Use Canonical Models to set context window sizes (#6723)
  Set up direnv and update flake inputs (#6526)
  fix: restore subagent tool call notifications after summon refactor (#7243)
  fix(ui): preserve server config values on partial provider config save (#7248)
  fix(claude-code): allow goose to run inside a Claude Code session (#7232)
  ...
aharvard added a commit that referenced this pull request Feb 18, 2026
* origin/main:
  feat: add GOOSE_SUBAGENT_MODEL and GOOSE_SUBAGENT_PROVIDER config options (#7277)
  fix(openai): support "reasoning" field alias in streaming deltas (#7294)
  fix(ui): revert app-driven iframe width and send containerDimensions per ext-apps spec (#7300)
  New OpenAI event (#7301)
  ci: add fork guards to scheduled workflows (#7292)
  fix: allow ollama input limit override (#7281)
  chore: show important keys for provider configuration (#7265)
  fix: subrecipe relative path with summon (#7295)
  fix extension selector not displaying the correct enabled extensions (#7290)
  Use the working dir from the session (#7285)
  Fix: Minor logging uplift for debugging of prompt injection mitigation (#7195)
  feat(otel): make otel logging level configurable (#7271)
  docs: add documentation for Top Of Mind extension (#7283)
  Document gemini 3 thinking levels (#7282)
  docs: stream subagent tool calls (#7280)
  Docs: delete custom provider in desktop (#7279)

# Conflicts:
#	ui/desktop/src/components/McpApps/McpAppRenderer.tsx
