@slekkala1 slekkala1 commented Oct 10, 2025

What does this PR do?

Closed the previous PR because of merge conflicts with multiple other PRs.
This PR addresses all comments from #3768 (sorry for carrying them over to this one).

Test Plan

Added unit tests and integration tests.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 10, 2025
@slekkala1 slekkala1 marked this pull request as ready for review October 10, 2025 22:25
@slekkala1 slekkala1 force-pushed the new-responses-and-safety branch 2 times, most recently from 76f0478 to 76b991c Compare October 13, 2025 19:13
:param type: The type/identifier of the guardrail.
"""

type: str

Contributor

just for my learning: what types are available?

Contributor Author

Not sure about this part, @ashwinb. I usually only know the identifier for a shield, so I'm only supporting that for now. Maybe this is to allow more fields in the future.

if isinstance(guardrail, str):
    guardrail_ids.append(guardrail)
elif isinstance(guardrail, ResponseGuardrailSpec):
    guardrail_ids.append(guardrail.type)

Contributor

this seems confusing: type being used as id. Is there a better way to name this?
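
For readers following this thread, the normalization under discussion can be sketched roughly as below. This is an illustration only, not the PR's code: `ResponseGuardrailSpec` here is a stand-in dataclass that models just the `type` field quoted in the diff, `extract_guardrail_ids` is a hypothetical helper name, and raising `TypeError` on unknown entries is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ResponseGuardrailSpec:
    """Stand-in for the spec in the diff; only the `type` field is modeled."""

    type: str


def extract_guardrail_ids(guardrails):
    """Hypothetical helper: normalize a mixed list into guardrail identifiers."""
    guardrail_ids = []
    for guardrail in guardrails or []:
        if isinstance(guardrail, str):
            # A bare string is already an identifier.
            guardrail_ids.append(guardrail)
        elif isinstance(guardrail, ResponseGuardrailSpec):
            # `type` doubles as the identifier, which is the naming concern raised above.
            guardrail_ids.append(guardrail.type)
        else:
            # Assumption: reject anything else rather than silently skipping it.
            raise TypeError(f"unsupported guardrail: {type(guardrail).__name__}")
    return guardrail_ids
```

Accepting both forms keeps the plain-string case simple while leaving room for more fields on the spec later, which matches the author's guess about why the spec type exists.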


# Input safety validation - check messages before processing
if self.guardrail_ids:
    combined_text = interleaved_content_as_str([msg.content for msg in self.ctx.messages])

Contributor

should we document somewhere that guardrails only apply to text input?

Contributor Author

@slekkala1 slekkala1 Oct 14, 2025

Yes, the shield and moderation APIs don't support images. This is known tech debt; I filed an issue for it earlier.

Contributor

Is it in our user-facing documentation?

Contributor Author

Yeah, maybe not. We need to do that.
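
To make the text-only limitation concrete, here is a minimal sketch of flattening mixed message content into a single string for a safety check. Everything in it is hypothetical: `TextPart`, `ImagePart`, and `text_only` are illustrative names, and this is not the `interleaved_content_as_str` function quoted above, whose exact behavior is not shown in this thread.

```python
from dataclasses import dataclass


@dataclass
class TextPart:
    """Hypothetical text content item."""

    text: str


@dataclass
class ImagePart:
    """Hypothetical image content item; never reaches the guardrail."""

    url: str


def text_only(content) -> str:
    """Flatten content to text, dropping anything that is not text."""
    if isinstance(content, str):
        return content
    if isinstance(content, TextPart):
        return content.text
    if isinstance(content, list):
        parts = [text_only(item) for item in content]
        return " ".join(part for part in parts if part)
    # Images and other non-text items contribute nothing to the checked text.
    return ""
```

Under these assumptions, an image in the input is silently excluded from what the guardrail sees, which is why the gap is worth stating in user-facing documentation.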

@slekkala1 slekkala1 force-pushed the new-responses-and-safety branch 4 times, most recently from a9ebdfe to 1f7d52f Compare October 15, 2025 16:57

)

# Collect content for final response
chat_response_content.append(chunk_choice.delta.content or "")

Contributor

[Re: line +624]

this should be gated too?

Contributor Author

Yes, added gating for reasoning content too.

sequence_number=self.sequence_number,
)
# Skip emitting the text content delta event if guardrails are configured; chunks are emitted only after guardrails are applied
if not self.guardrail_ids:

Contributor

Is it acceptable to drop these chunks entirely, or should we queue them and yield after guardrails pass?

Contributor Author

@slekkala1 slekkala1 Oct 15, 2025

Good point! Queuing the deltas and streaming them when the chunk is safe.
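
The queue-then-yield approach agreed on here can be sketched as a small generator. This is an assumption-laden illustration rather than the merged implementation: `stream_with_guardrails`, `GuardrailViolation`, the `is_safe` callback, and the `check_every` batch size are all hypothetical names and parameters.

```python
from typing import Callable, Iterable, Iterator


class GuardrailViolation(Exception):
    """Hypothetical error raised when buffered text fails the safety check."""


def stream_with_guardrails(
    deltas: Iterable[str],
    is_safe: Callable[[str], bool],
    check_every: int = 3,
) -> Iterator[str]:
    """Buffer deltas and release them only after the buffered text passes."""
    buffer: list[str] = []
    for delta in deltas:
        buffer.append(delta)
        if len(buffer) >= check_every:
            if not is_safe("".join(buffer)):
                raise GuardrailViolation("buffered content failed the check")
            # The batch is safe: release the queued deltas in order.
            yield from buffer
            buffer.clear()
    # Check and flush whatever remains at the end of the stream.
    if buffer:
        if not is_safe("".join(buffer)):
            raise GuardrailViolation("buffered content failed the check")
        yield from buffer
```

Compared with dropping the deltas outright, this keeps the streaming event sequence intact for clients while ensuring no unchecked text is emitted; the trade-off is added latency of up to one batch.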

@slekkala1 slekkala1 force-pushed the new-responses-and-safety branch from 6bd5b4d to 8d10642 Compare October 15, 2025 20:17
@slekkala1 slekkala1 force-pushed the new-responses-and-safety branch from 8c900d7 to 94b5df7 Compare October 15, 2025 21:06
@slekkala1 slekkala1 merged commit 99141c2 into main Oct 15, 2025
21 of 22 checks passed
@slekkala1 slekkala1 deleted the new-responses-and-safety branch October 15, 2025 22:01
3 participants