Feature request: blocked pattern matching for add-labels safe output #16625

@benvillalobos

🤖 Filed with agentic-workflows agent mid-design conversation

Problem

The add-labels safe output currently supports allowed: [list] (allowlist) and max: N (count cap), but has no way to deny labels matching a pattern.

In large repositories like microsoft/vscode with 600+ labels, maintaining an exhaustive allowlist is impractical. However, there are classes of labels that should never be applied by an agentic workflow — for example:

  • Labels prefixed with ~ (tilde) are used as workflow trigger labels (e.g., ~stale triggers the triage workflow). An agent applying these could cause unintended workflow cascades.
  • Labels prefixed with * have special administrative meaning.

Without infrastructure-level enforcement, these constraints can only be expressed in the prompt ("please don't apply labels starting with ~"), which is a weak defense against prompt injection attacks on workflows that process untrusted public input.

Proposed Solution

Add a blocked: field to add-labels (and potentially remove-labels) safe outputs that supports pattern matching:

safe-outputs:
  add-labels:
    blocked: ["~*", "\\**"]   # deny labels starting with ~ or a literal *
    max: 5

At minimum this should support prefix matching (e.g., ~* matches any label starting with ~). Ideally it would reuse the same glob/wildcard syntax used elsewhere in gh-aw (e.g., forks: patterns).
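To make the intended matching semantics concrete, here is a minimal Python sketch of how a blocked: check could behave. It is not gh-aw's implementation. The function names (compile_pattern, is_blocked, filter_labels) are hypothetical, and it assumes a simple glob dialect where * matches any run of characters and a backslash escapes the next character. It also assumes blocked labels are silently dropped before max is applied; rejecting the whole output instead would be an equally reasonable design.

```python
import re
from typing import Iterable, List, Optional


def compile_pattern(pattern: str) -> "re.Pattern[str]":
    """Translate a tiny glob dialect into a regex (assumed syntax, not gh-aw's).

    `*` matches any run of characters; a backslash makes the next
    character literal, so the two-character sequence backslash-star
    matches a literal `*`.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: match it literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "*":
            parts.append(".*")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts), re.DOTALL)


def is_blocked(label: str, blocked: Iterable[str]) -> bool:
    """Return True if the whole label matches any blocked pattern."""
    return any(compile_pattern(p).fullmatch(label) for p in blocked)


def filter_labels(
    requested: Iterable[str],
    blocked: Iterable[str],
    max_count: Optional[int] = None,
) -> List[str]:
    """Drop blocked labels, then apply the count cap (assumed ordering)."""
    blocked = list(blocked)
    kept = [label for label in requested if not is_blocked(label, blocked)]
    return kept if max_count is None else kept[:max_count]


if __name__ == "__main__":
    # The Python literal "\\**" is the three characters \ * *
    patterns = ["~*", "\\**"]
    print(filter_labels(["bug", "~stale", "*admin", "needs*info"], patterns, max_count=5))
```

Under this dialect, a pattern of backslash-star-star (written "\\**" in a double-quoted YAML string) matches only labels that start with a literal *, whereas a pattern with a leading wildcard such as *\** matches any label that contains one. Whichever syntax gh-aw adopts, the escaping rule for a literal * should be spelled out, since * is both the wildcard and one of the prefixes to be blocked.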

Why This Matters

For workflows that triage public issues (where the issue content is untrusted and may contain prompt injection payloads), the safe-outputs config is the hard security boundary — it's the "you literally can't" layer vs. the prompt-level "please don't" layer. Being able to deny dangerous label patterns at this layer would meaningfully reduce the attack surface for agentic triage workflows.

Current Workaround

Prompt-level instructions telling the agent not to apply certain labels. This works under normal conditions but is not a reliable defense against adversarial input.
