Adding support for array attributes to Zipkin exporter #1285

Contributor

@robwknox robwknox commented Oct 26, 2020

Description

Adds support for basic sequence (list, tuple, range) attribute values in the Zipkin exporter.

Since the value is serialized to a JSON list string, specific logic is needed to ensure max_tag_value_length is honored at the element boundary.
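To make the element-boundary behaviour concrete, here is a minimal sketch under stated assumptions. The helper name `truncate_sequence_to_json`, its default limit of 128, and the choice to skip non-scalar elements are illustrative and are not taken from the exporter's actual code.

```python
import json


def truncate_sequence_to_json(values, max_len=128):
    """Hypothetical helper: serialize a sequence to a compact JSON list of
    strings, dropping trailing elements so the result fits in max_len."""
    kept = []
    running_length = 2  # opening '[' and closing ']'
    for value in values:
        # Skip elements that are not simple scalars (e.g. None or dicts).
        if not isinstance(value, (bool, int, float, str)):
            continue
        element = str(value)
        # Length of the quoted, escaped element, plus a ',' if it is not first.
        needed = len(json.dumps(element)) + (1 if kept else 0)
        if running_length + needed > max_len:
            break  # stop at an element boundary, never mid-element
        kept.append(element)
        running_length += needed
    return json.dumps(kept, separators=(",", ":"))


print(truncate_sequence_to_json(["a", None, 1]))    # ["a","1"]
print(len(truncate_sequence_to_json([True] * 25)))  # 127
```

With a limit of 128, only 18 of the 25 `"True"` elements fit, so the output is always valid JSON rather than a string cut off mid-element.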

As an aside, I believe I've identified a bug: bool objects are supposed to be serialized to true or false to adhere to the JSON spec, but we appear to be retaining the Python formatting of True and False. Will file a separate bug report.
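The difference described here can be reproduced with the standard library alone. This short illustration is not code from the exporter:

```python
import json

python_style = str(True)       # Python formatting
json_style = json.dumps(True)  # JSON literal

print(python_style)                     # True
print(json_style)                       # true
print(json.dumps([True, False, None]))  # [true, false, null]
```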

Fixes #1110

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

tox environments:

  • py35-test-exporter-zipkin
  • py36-test-exporter-zipkin
  • py37-test-exporter-zipkin
  • py38-test-exporter-zipkin
  • pypy3-test-exporter-zipkin

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@robwknox robwknox requested review from a team, codeboten and aabmass and removed request for a team October 26, 2020 22:07
@robwknox robwknox changed the title from "Adding support for array attributes to Zipkin exporter (#1110)" to "Adding support for array attributes to Zipkin exporter" Nov 3, 2020
@codeboten codeboten added the release:required-for-ga (To be resolved before GA release) and 1.0.0rc2 (release candidate 2 for tracing GA) labels Nov 5, 2020
Comment on lines 389 to 392
    if isinstance(element, (int, bool, float)):
        tag_value_element = str(element)
    elif isinstance(element, str):
        tag_value_element = element
Member

nit: lines 354 to 357 are repeated here and could be refactored. One complication is that sequence attributes are allowed to contain None.

Could also combine the two conditions

Suggested change
- if isinstance(element, (int, bool, float)):
-     tag_value_element = str(element)
- elif isinstance(element, str):
-     tag_value_element = element
+ if isinstance(element, (int, bool, float, str)):
+     tag_value_element = str(element)
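The combined check behaves the same as the two-branch version because `str()` of a string returns an equal string. A small sketch follows; the function name `to_tag_element` is hypothetical and only wraps the suggested condition for illustration:

```python
def to_tag_element(element):
    """Hypothetical wrapper around the suggested combined isinstance check."""
    if isinstance(element, (int, bool, float, str)):
        return str(element)
    return None  # unsupported element types, including None itself


print(to_tag_element("abc"))  # abc
print(to_tag_element(True))   # True
print(to_tag_element(1.5))    # 1.5
print(to_tag_element(None))   # None
```

Note that `bool` is a subclass of `int` in Python, so listing both in the tuple is redundant but harmless.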

Contributor Author

Simplified the formatting as recommended but didn't refactor the duplication as it's small.

I also changed the logic to silently skip invalid sequence elements rather than failing completely, so that we pass along as much data as possible.

    running_string_length += (
        2  # accounts for ', ' connector
    )
    if running_string_length > self.max_tag_value_length:
Member

Is max_tag_value_length supposed to be encoded byte length or string length?

Contributor Author

We're outputting ASCII with json.dumps(), so the two are one and the same.

Member

I'm not sure, actually, so I checked the Python docs:

If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped.

So I think if the escape sequences are long, the serialized output could end up longer than the original string. I think this is OK though 😄 Maybe just update the max_tag_value_length docstring to say it's string length, if that isn't already specified?
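The length difference discussed in this thread is easy to observe with a single non-ASCII character. This is a standalone illustration using only `json.dumps`, not exporter code:

```python
import json

value = "é"
escaped = json.dumps(value)                 # ensure_ascii=True is the default
raw = json.dumps(value, ensure_ascii=False)

print(len(value))                  # 1  (string length)
print(len(value.encode("utf-8")))  # 2  (UTF-8 byte length)
print(escaped, len(escaped))       # "\u00e9" 8  (escaped, with quotes)
print(len(raw))                    # 3  (unescaped, with quotes)
```

So string length, encoded byte length, and escaped JSON length can all differ for the same value.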

Member

@aabmass aabmass left a comment

As an aside, I believe I've identified a bug in that bool objects are supposed to be serialized to true or false to adhere to JSON spec, but we appear to be retaining python formatting of True and False. Will file a separate bug report.

👍

Nice and thanks for addressing comments! LGTM

Contributor

@ocelotl ocelotl left a comment

LGTM, just leaving some suggestions.

    value = self._extract_tag_value_string_from_sequence(
        attribute_value
    )
    if not value:
Contributor

Suggested change
- if not value:
+ if value is None:
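The two conditions differ only for falsy values that are not `None`, such as an empty string. The sample values below are illustrative and do not claim to be what `_extract_tag_value_string_from_sequence` actually returns:

```python
# (value, result of `not value`, result of `value is None`)
results = [(v, not v, v is None) for v in (None, "", "[]", '["a"]')]

for value, is_falsy, is_none in results:
    print(repr(value), is_falsy, is_none)
```

Here `""` is treated as missing by `not value` but as a real value by `value is None`, which is the case the suggestion distinguishes.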

)
self.assertEqual(
tags["list5"],
'["True","True","True","True","True","True","True","True","True","True","True","True","True","True","True","True","True","True"]',
Contributor

Suggested change
- '["True","True","True","True","True","True","True","True","True","True","True","True","True","True","True","True","True","True"]',
+ partial_dumps([str(True)] * 18),

@@ -440,8 +457,66 @@ def test_export_json_max_tag_length(self):
_, kwargs = mock_post.call_args # pylint: disable=E0633

tags = json.loads(kwargs["data"])[0]["tags"]
self.assertEqual(len(tags["k1"]), 128)
self.assertEqual(len(tags["k2"]), 50)

Contributor

To avoid these long testing sequences, may I suggest:

from functools import partial
from json import dumps
...
partial_dumps = partial(dumps, separators=(",", ":"))
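A runnable version of this suggestion, shown with short lists for readability. The `separators=(",", ":")` argument removes the spaces that `json.dumps` inserts by default, which is what lets the helper's output match compact literals like those in the tests:

```python
from functools import partial
from json import dumps

# Compact JSON encoder: no space after ',' or ':'.
partial_dumps = partial(dumps, separators=(",", ":"))

print(partial_dumps(["a"] * 3))                   # ["a","a","a"]
print(partial_dumps([str(i) for i in range(3)]))  # ["0","1","2"]
print(dumps(["a"] * 3))                           # ["a", "a", "a"]
```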

self.assertEqual(len(tags["string2"]), 50)
self.assertEqual(
tags["list1"],
'["a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a"]',
Contributor

Suggested change
- '["a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a","a"]',
+ partial_dumps(["a"] * 25),

)
self.assertEqual(
tags["range1"],
'["0","1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24"]',
Contributor

Suggested change
- '["0","1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24"]',
+ partial_dumps([str(i) for i in range(25)]),

Contributor

@codeboten codeboten left a comment

LGTM

@codeboten codeboten merged commit 9cfe555 into open-telemetry:master Dec 16, 2020
Labels
1.0.0rc2 (release candidate 2 for tracing GA), release:required-for-ga (To be resolved before GA release)
Development

Successfully merging this pull request may close these issues.

Zipkin exporter should support Array attributes
5 participants