[PY] fix: Optimizing the to_string Function #2107

jamiesun · 2024-10-11T03:55:36Z

The to_string function has been optimized to add logic for handling string types and null values.

Linked issues

closes: #2065

Details

If non-English characters are used in the conversation, the entire conversation history is saved as an escaped character, causing confusion for the AI model

Change details

Describe your changes, with screenshots and code snippets as appropriate

code snippets:

def to_string(tokenizer: Tokenizer, value: Any, as_json: bool = False) -> str:
    """
    Converts a value to a string representation.
    Dates are converted to ISO strings and Objects are converted to JSON or YAML,
    whichever is shorter.

    Args:
        tokenizer (Tokenizer): The tokenizer object used for encoding.
        value (Any): The value to be converted.
        as_json (bool, optional): Flag indicating whether to return the value as JSON string.
          Defaults to False.

    Returns:
        str: The string representation of the value.
    """
    if value is None:
        return ""
    
    if hasattr(value, "isoformat") and callable(value.isoformat):
        # Used when the value is a datetime object
        return value.isoformat()
    value = todict(value)

    if as_json:
        return json.dumps(value, default=lambda o: o.__dict__, ensure_ascii=False)

    # Return shorter version of object
    yaml_str = yaml.dump(value, allow_unicode=True)
    json_str = json.dumps(value, default=lambda o: o.__dict__, ensure_ascii=False)
    if len(tokenizer.encode(yaml_str)) < len(tokenizer.encode(json_str)):
        return yaml_str

    return json_str

screenshots:

Attestation Checklist

My code follows the style guidelines of this project
I have checked for/fixed spelling, linting, and other errors
I have commented my code for clarity
I have made corresponding changes to the documentation (updating the doc strings in the code is sufficient)
My changes generate no new warnings
I have added tests that validates my changes, and provides sufficient test coverage. I have tested with:
- Local testing
- E2E testing in Teams
New and existing unit tests pass locally with my changes

The to_string function has been optimized to add logic for handling string types and null values.

jamiesun · 2024-10-11T12:25:17Z

@microsoft-github-policy-service agree company="Terateams"

corinagum · 2024-10-11T18:44:50Z

@jamiesun thank you for the contribution! We super appreciate it. Do you happen to know if this same issue occurs in JS / C# as well, or just Python? If they do, I'll create some new tickets to make sure we are tracking for all three scenarios.

lilyydu

This is great, thanks for adding this!

lilyydu · 2024-10-11T21:46:39Z

@jamiesun It looks like there's a linting issue- can you rerun and fix accordingly? Thanks!

jamiesun · 2024-10-14T04:55:25Z

@jamiesun It looks like there's a linting issue- can you rerun and fix accordingly? Thanks!

Thanks for the reminder, I've reworked the commit code

jamiesun added 2 commits October 11, 2024 10:49

refactor: Optimizing the to_string Function

66cfb7e

The to_string function has been optimized to add logic for handling string types and null values.

update to_string function and add test unit

96119a7

jamiesun requested review from aacebo, corinagum, lilyydu and singhk97 as code owners October 11, 2024 03:55

corinagum changed the title ~~refactor: Optimizing the to_string Function~~ [JS] fix: Optimizing the to_string Function Oct 11, 2024

corinagum changed the title ~~[JS] fix: Optimizing the to_string Function~~ [PY] fix: Optimizing the to_string Function Oct 11, 2024

lilyydu previously approved these changes Oct 11, 2024

View reviewed changes

lilyydu self-requested a review October 11, 2024 21:46

update linting issue

7377453

jamiesun dismissed lilyydu’s stale review via 7377453 October 12, 2024 13:03

lilyydu approved these changes Oct 14, 2024

View reviewed changes

lilyydu merged commit d4bf9cc into microsoft:main Oct 14, 2024
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PY] fix: Optimizing the to_string Function #2107

[PY] fix: Optimizing the to_string Function #2107

jamiesun commented Oct 11, 2024 •

edited by corinagum

Loading

jamiesun commented Oct 11, 2024

corinagum commented Oct 11, 2024

lilyydu left a comment

lilyydu commented Oct 11, 2024

jamiesun commented Oct 14, 2024

[PY] fix: Optimizing the to_string Function #2107

[PY] fix: Optimizing the to_string Function #2107

Conversation

jamiesun commented Oct 11, 2024 • edited by corinagum Loading

Linked issues

Details

Change details

Attestation Checklist

jamiesun commented Oct 11, 2024

corinagum commented Oct 11, 2024

lilyydu left a comment

Choose a reason for hiding this comment

lilyydu commented Oct 11, 2024

jamiesun commented Oct 14, 2024

jamiesun commented Oct 11, 2024 •

edited by corinagum

Loading