
fix: Tokens cannot be obtained from the model dialogue #2326

Merged
merged 1 commit into main on Feb 19, 2025

Conversation

shaohuzhang1 (Contributor)

fix: Tokens cannot be obtained from the model dialogue


f2c-ci-robot bot commented Feb 19, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected; please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -121,6 +118,9 @@ def _stream(
                 generation_chunk.text, chunk=generation_chunk, logprobs=logprobs
             )
             is_first_chunk = False
+            # custom code: capture token usage metadata from the streamed chunk
+            if generation_chunk.message.usage_metadata is not None:
+                self.usage_metadata = generation_chunk.message.usage_metadata
             yield generation_chunk

     def _create_chat_result(self,
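For reference, LangChain's usage_metadata is a dict-like structure with input_tokens, output_tokens, and total_tokens, and in streaming it typically arrives on the final chunk. A minimal sketch of how a caller might read the value captured by the added lines once the stream is exhausted (the chat_model variable and the attribute read are illustrative assumptions, not code from this PR):

for chunk in chat_model.stream("Hello"):
    print(chunk.content, end="", flush=True)

# After the generator is exhausted, the custom code above has stored the
# final chunk's usage_metadata on the model instance:
usage = chat_model.usage_metadata
print(usage)  # e.g. {'input_tokens': 5, 'output_tokens': 42, 'total_tokens': 47}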
shaohuzhang1 (Contributor, Author)
The provided code snippet is part of an implementation for handling streaming responses from a chat model, specifically within the _stream method. Here's a breakdown with notes on potential improvements:

Key Issues Identified:

  1. Repeated Stream Option Setting: The stream option (the stream keyword argument) is set in multiple places without checking whether it was already defined.
  2. Redundant Usage Metadata Handling: There is redundant logic around usage metadata retrieval.

Potential Improvements:

  1. Single Stream Option Check:
    Ensure the stream option is set only once, ideally during initialization or just before initiating the stream. This reduces ambiguity and potential bugs.

# Single check for stream option
if not kwargs.get('stream', False):
    kwargs["stream"] = True

  2. Avoid Redundant Usage Metadata Retrieval:
    Remove unnecessary checks and assignments around usage metadata; fetching it more than once risks overwriting previously stored values.

# Custom code removed for clarity

  3. Enhance Error Handling (Optional):
    While not directly addressed in this snippet, consider adding error handling for edge cases such as invalid inputs or timeouts when fetching response chunks; a sketch follows this list.
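A minimal sketch of what such error handling might look like, wrapping the stream in a retry loop (the retry count, backoff, and the choice of TimeoutError are assumptions for illustration, not from this PR):

import time

def stream_with_retry(chat_model, messages, max_retries=2):
    # Hypothetical wrapper: retries the whole stream on a timeout.
    # Note: a retry restarts the stream from the beginning, so callers
    # may see repeated chunks; deduplicate downstream if that matters.
    for attempt in range(max_retries + 1):
        try:
            yield from chat_model.stream(messages)
            return
        except TimeoutError:
            if attempt == max_retries:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff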

Suggested Changes:

Here’s how you could refactor the function based on these guidelines:

from typing import Any, Iterator, List, Optional, Type

from langchain_core.callbacks import BaseCallbackHandler, CallbackManagerForLLMRun
from langchain_core.messages import AIMessageChunk, BaseMessage, BaseMessageChunk
from langchain_core.outputs import ChatGenerationChunk


def _stream(
    self,
    messages: List[BaseMessage],
    stop: Optional[List[str]] = None,
    callbacks: Optional[List[BaseCallbackHandler]] = None,
    verbose: bool = False,
    use_cache: bool = True,  # assuming there's a need for caching
    llm_backend: Optional[str] = None,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
    """
    Set default stream options and initiate the streaming response.
    """
    if llm_backend == "azure":
        # pop() avoids a KeyError when the key is absent
        kwargs.pop("stream", None)

    # Ensure the stream option is set exactly once
    kwargs["stream"] = kwargs.get("stream", False)

    if kwargs["stream"]:
        # Additional setup for streaming can go here
        pass

    payload = self._get_request_payload(messages, stop=stop, use_cache=use_cache, **kwargs)
    default_chunk_class: Type[BaseMessageChunk] = AIMessageChunk
    base_generation_info = {}

    # Rest of the code remains mostly unchanged


# Example usage (assuming `client` is a configured chat model instance;
# stream() is synchronous here, so a plain for loop is used)
for chunk in client.stream("Hello"):
    print(chunk.content, end="")

Conclusion:

By managing the stream option in one place and avoiding redundant operations on usage metadata, we improve both the readability and the robustness of the _stream method while keeping the implementation consistent.


f2c-ci-robot bot commented Feb 19, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shaohuzhang1 shaohuzhang1 merged commit a06c5c0 into main Feb 19, 2025
4 checks passed
@shaohuzhang1 shaohuzhang1 deleted the pr@main@fix_chat_tokens branch February 19, 2025 04:06