
Conversation


@Pouyanpi Pouyanpi commented Oct 20, 2025

Description

Add custom HTTP headers support to the ChatNVIDIA class patch, enabling users to pass custom headers
(authentication tokens, request IDs, billing information, etc.) with all requests to NVIDIA AI endpoints.

Example config:

```yaml
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama-3.1-8b-instruct
    parameters:
      base_url: http://localhost:8000/v1
      custom_headers:
        X-Model-Auth: test-bearer-token
        X-Request-ID: "12345"
        X-Billing-Project: project-abc
        X-Trace-ID: trace-xyz
```

The feature has been verified on the microservice side.

Implementation

  • Added an optional custom_headers field to the ChatNVIDIA class, with Pydantic v2 compatibility
  • Implemented runtime method wrapping that intercepts _client.get_req() and _client.get_req_stream()
    to merge custom headers with the existing request headers
  • Included automatic version detection to ensure compatibility with langchain-nvidia-ai-endpoints >=
    0.3.0, with clear error messages for older versions
  • Works with both synchronous invoke() and streaming requests, and is fully compatible with VLMs
    (Vision Language Models)
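The runtime method wrapping described above can be sketched roughly as follows. This is a simplified, hypothetical illustration, not the patch's actual code: `FakeClient` and `wrap_with_custom_headers` are stand-in names, the real patch wraps ChatNVIDIA's internal `_client.get_req()` and `_client.get_req_stream()`, and the assumption that custom headers take precedence over existing ones is mine.

```python
from functools import wraps


class FakeClient:
    """Stand-in for ChatNVIDIA's internal endpoints client (hypothetical)."""

    def get_req(self, payload, extra_headers=None):
        # The real client would issue an HTTP request; here we just
        # return the headers so the merge behavior is visible.
        return dict(extra_headers or {})


def wrap_with_custom_headers(client, custom_headers):
    """Wrap client.get_req so custom headers are merged into every call."""
    original = client.get_req

    @wraps(original)
    def wrapped(payload, extra_headers=None):
        # Default to None (not {}) to avoid the mutable-default pitfall,
        # and copy before merging so the caller's dict is never mutated.
        merged = dict(extra_headers or {})
        merged.update(custom_headers)  # assumed precedence: custom headers win
        return original(payload, extra_headers=merged)

    client.get_req = wrapped
    return client


client = wrap_with_custom_headers(
    FakeClient(), {"X-Model-Auth": "test-bearer-token", "X-Request-ID": "12345"}
)
headers = client.get_req({"prompt": "hi"}, extra_headers={"Accept": "application/json"})
# headers now contains Accept plus both custom headers
print(headers)
```

The same wrapping would be applied to the streaming counterpart (`get_req_stream`) so that invoke() and streaming calls behave identically.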

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

@Pouyanpi Pouyanpi force-pushed the feat/custom-headers-nvidia branch from 0025970 to 4cb7771 Compare October 20, 2025 09:42

codecov-commenter commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| .../providers/_langchain_nvidia_ai_endpoints_patch.py | 66.66% | 1 Missing ⚠️ |


@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

@Pouyanpi Pouyanpi force-pushed the feat/custom-headers-nvidia branch from a5f27cb to a88b954 Compare October 22, 2025 16:20
@Pouyanpi Pouyanpi marked this pull request as ready for review October 22, 2025 16:21
@trebedea trebedea left a comment


Everything is good, just fix the RuntimeError message.


@greptile-apps greptile-apps bot left a comment

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The latest changes address previous feedback by (1) fixing the mutable default argument bug where dict = {} was replaced with dict = None in both wrapped_get_req and wrapped_get_req_stream functions, and (2) consolidating redundant RuntimeError conditions that checked for langchain-nvidia-ai-endpoints compatibility. The developer appears to have removed duplicate version checks and simplified the error handling logic. These changes integrate with NeMo-Guardrails' existing LLM provider patching mechanism (found in nemoguardrails/llm/providers/) which allows runtime customization of third-party provider clients without modifying the original library code.

Important Files Changed

| Filename | Score | Overview |
| --- | --- | --- |
| nemoguardrails/llm/providers/_langchain_nvidia_ai_endpoints_patch.py | 1/5 | Fixed mutable default arguments but introduced critical duplicate super().__init__() calls and redundant nested if self.custom_headers checks |

Confidence score: 1/5

  • This PR contains a critical bug that will cause immediate runtime issues and should not be merged without fixes
  • Score reflects the duplicate super().__init__(**kwargs) call on lines 64 and 66, redundant nested if self.custom_headers checks on lines 65 and 73, and unclear control flow in the __init__ method that suggests merge conflicts or copy-paste errors were not fully resolved
  • Pay close attention to the __init__ method (lines 63-81) - the duplicate initialization and nested conditionals must be fixed before this can safely merge
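The mutable-default bug flagged above (replacing `dict = {}` with `dict = None`) is a classic Python pitfall: a mutable default argument is created once at function definition time and shared across all calls that omit it. A minimal illustration with hypothetical function names, not the patch's actual code:

```python
# Buggy: the default dict is created once, at function definition time,
# and every call that omits the argument mutates the same shared object.
def get_req_buggy(extra_headers={}):
    extra_headers["X-Call"] = "seen"
    return extra_headers


a = get_req_buggy()
b = get_req_buggy()
print(a is b)  # True: both calls returned the same shared dict


# Fixed: default to None and build a fresh dict on every call.
def get_req_fixed(extra_headers=None):
    headers = dict(extra_headers or {})
    headers["X-Call"] = "seen"
    return headers


c = get_req_fixed()
d = get_req_fixed()
print(c is d)  # False: each call gets its own dict
```

In a header-merging wrapper the shared dict would accumulate headers across unrelated requests, which is why the review insisted on the `None` default.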

1 file reviewed, 2 comments



@greptile-apps greptile-apps bot left a comment

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The developer has addressed previous feedback by removing duplicate initialization code in the ChatNVIDIA class patch. Specifically, they eliminated the redundant super().__init__(**kwargs) call that was appearing twice and removed a nested if self.custom_headers check that was already covered by an outer conditional. These changes streamline the initialization logic for the custom HTTP headers feature without altering its functionality. The custom headers support allows users to pass authentication tokens, request IDs, and other metadata through YAML configuration to NVIDIA AI endpoints by intercepting and wrapping the internal _client.get_req() and _client.get_req_stream() methods.

Important Files Changed

| Filename | Score | Overview |
| --- | --- | --- |
| nemoguardrails/llm/providers/_langchain_nvidia_ai_endpoints_patch.py | 4/5 | Removed duplicate super().__init__() call and redundant nested conditional check in ChatNVIDIA initialization |

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it removes obvious code duplication
  • Score reflects that the changes are purely cleanup with no functional modifications, though the overall feature implementation was not fully reviewed here
  • No files require special attention; this is a straightforward bug fix addressing previous review comments

1 file reviewed, 2 comments


@Pouyanpi Pouyanpi added this to the v0.18.0 milestone Oct 24, 2025
@Pouyanpi Pouyanpi self-assigned this Oct 24, 2025
@Pouyanpi Pouyanpi merged commit aafd733 into develop Oct 24, 2025
8 checks passed
@Pouyanpi Pouyanpi deleted the feat/custom-headers-nvidia branch October 24, 2025 12:55