Skip to content

Commit

Permalink
[devkit][trace] Fix "Failed to load trace ... is not a valid JSON" wh…
Browse files Browse the repository at this point in the history
…en traces have invalid JSON values (#3190)

# Description

`-Infinity`, `Infinity` and `NaN` are serializable in Python, but not in
standard JSON; so trace UI cannot parse such JSON string. This PR
applies a deserialize and serialize to parse them as normal string to
make them standard JSON values.

# All Promptflow Contribution checklist:
- [x] **The pull request does not introduce [breaking changes].**
- [x] **CHANGELOG is updated for new features, bug fixes or other
significant changes.**
- [x] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [ ] **Create an issue and link to the pull request to get dedicated
review from promptflow team. Learn more: [suggested
workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [x] Title of the pull request is clear and informative.
- [x] There are a small number of commits, each of which have an
informative message. This means that previously merged commits do not
appear in the history of the PR. For more information on cleaning up the
commits in your PR, [see this
page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [x] Pull request includes test coverage for the included changes.
  • Loading branch information
zhengfeiwang authored May 10, 2024
1 parent 2b0818a commit 6288dbc
Show file tree
Hide file tree
Showing 3 changed files with 27 additions and 2 deletions.
1 change: 1 addition & 0 deletions src/promptflow-devkit/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
### Bugs Fixed
- Fix the issue that import error will be raised after downgrading promptflow from >=1.10.0 to <1.8.0.
- Fix the issue that `pf flow serve` is broken with exception `NotADirectoryError`.
- Fix "Failed to load trace ... is not valid JSON" when traces inputs/outputs have invalid JSON values like `-Infinity`, `Infinity` and `NaN`.

## v1.10.0 (2024.04.26)

Expand Down
15 changes: 13 additions & 2 deletions src/promptflow-devkit/promptflow/_sdk/entities/_trace.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@
from promptflow._sdk._orm.trace import Event as ORMEvent
from promptflow._sdk._orm.trace import LineRun as ORMLineRun
from promptflow._sdk._orm.trace import Span as ORMSpan
from promptflow._sdk._utilities.general_utils import json_loads_parse_const_as_str


class Event:
Expand Down Expand Up @@ -364,12 +365,22 @@ def _get_outputs_from_span(span: Span) -> typing.Optional[typing.Dict]:

@staticmethod
def _from_orm_object(obj: ORMLineRun) -> "LineRun":
# handle potential nan, inf and -inf in inputs and outputs
# they are serializable in Python, but not in JSON
# so it will result in trace UI parse error
# here convert them into string type to make them standard JSON value
inputs, outputs = copy.deepcopy(obj.inputs), copy.deepcopy(obj.outputs)
if isinstance(inputs, dict):
inputs = json_loads_parse_const_as_str(json.dumps(inputs))
if isinstance(outputs, dict):
outputs = json_loads_parse_const_as_str(json.dumps(outputs))

return LineRun(
line_run_id=obj.line_run_id,
trace_id=obj.trace_id,
root_span_id=obj.root_span_id,
inputs=copy.deepcopy(obj.inputs),
outputs=copy.deepcopy(obj.outputs),
inputs=inputs,
outputs=outputs,
start_time=obj.start_time,
end_time=obj.end_time,
status=obj.status,
Expand Down
13 changes: 13 additions & 0 deletions src/promptflow-devkit/tests/sdk_cli_test/e2etests/test_trace.py
Original file line number Diff line number Diff line change
Expand Up @@ -331,6 +331,19 @@ def test_span_non_json_io_in_attrs_persist(self, pf: PFClient) -> None:
assert isinstance(line_run.inputs, str) and line_run.inputs == str(inputs)
assert isinstance(line_run.outputs, str) and line_run.outputs == str(output)

def test_span_with_nan_as_io(self, pf: PFClient) -> None:
trace_id, span_id, line_run_id = str(uuid.uuid4()), str(uuid.uuid4()), str(uuid.uuid4())
span = mock_span(trace_id=trace_id, span_id=span_id, parent_id=None, line_run_id=line_run_id)
span.events[0]["attributes"]["payload"] = json.dumps(dict(input1=float("nan"), input2=float("inf")))
span.events[1]["attributes"]["payload"] = json.dumps(dict(output1=float("nan"), output2=float("-inf")))
span._persist()
line_run = pf.traces.get_line_run(line_run_id=line_run_id)
line_run_inputs, line_run_outputs = line_run.inputs, line_run.outputs
assert isinstance(line_run_inputs["input1"], str) and line_run_inputs["input1"] == "NaN"
assert isinstance(line_run_inputs["input2"], str) and line_run_inputs["input2"] == "Infinity"
assert isinstance(line_run_outputs["output1"], str) and line_run_outputs["output1"] == "NaN"
assert isinstance(line_run_outputs["output2"], str) and line_run_outputs["output2"] == "-Infinity"

def test_delete_traces_three_tables(self, pf: PFClient) -> None:
# trace operation does not expose API for events and spans
# so directly use ORM class to list and assert events and spans existence and deletion
Expand Down

0 comments on commit 6288dbc

Please sign in to comment.