Add runtime validation in setAttribute #348

jakemalachowski · 2019-12-27T04:33:36Z

Fixes Issue #347

Added lists as an accepted attribute value data type per the OT spec
Added data type validation when setting an attribute on a span
- Validates type is one of int, float, str, bool or list
- If a value is of type list, validates all values in the list are of a homogeneous primitive data type, per the OT spec

I wasn't sure what the best way to enforce valid attribute values was. I decided to just drop invalid values since that's what was being done in the PR linked in the issue. But I was wondering if it made sense to throw an exception or maybe attempt to coerce the value into a valid type instead.
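For readers following along, the behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the names `PRIMITIVE_TYPES`, `is_valid_attribute_value`, and the free function `set_attribute` are hypothetical, and accepting an empty list is an assumption.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)

# Hypothetical constant: the primitive types named in the description.
PRIMITIVE_TYPES = (bool, str, int, float)


def is_valid_attribute_value(value: Any) -> bool:
    """Return True for a primitive or a homogeneous list of primitives."""
    if isinstance(value, PRIMITIVE_TYPES):
        return True
    if isinstance(value, list):
        if not value:
            return True  # assumption: an empty list is accepted
        first_type = type(value[0])
        if first_type not in PRIMITIVE_TYPES:
            return False
        # Every element must have exactly the same type as the first one.
        return all(type(element) is first_type for element in value)
    return False


def set_attribute(attributes: Dict[str, Any], key: str, value: Any) -> None:
    """Store the value, or log a warning and drop it if it is invalid."""
    if not is_valid_attribute_value(value):
        logger.warning(
            "Dropping attribute %r: invalid value type %s",
            key,
            type(value).__name__,
        )
        return
    attributes[key] = value
```

Raising an exception or coercing the value instead would only change the branch that currently logs and returns.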

Add lists as an accepted data type

ocelotl

Thanks for your contribution and apologies for the late reply. Most of the team is out on an end-of-year break, but I'll look at this tomorrow 👍

jakemalachowski · 2019-12-28T02:32:08Z

No problem at all. I know it's a slow time of year.

ocelotl

Looking good in general, some changes requested 👍

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

opentelemetry-sdk/tests/trace/test_trace.py

codecov-io · 2019-12-29T02:12:40Z

Codecov Report

Merging #348 into master will increase coverage by 0.14%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #348      +/-   ##
==========================================
+ Coverage   84.99%   85.13%   +0.14%     
==========================================
  Files          38       38              
  Lines        1859     1911      +52     
  Branches      224      225       +1     
==========================================
+ Hits         1580     1627      +47     
- Misses        214      219       +5     
  Partials       65       65

Impacted Files	Coverage Δ
opentelemetry-api/src/opentelemetry/util/types.py	`100% <100%> (ø)`	⬆️
...emetry-sdk/src/opentelemetry/sdk/trace/__init__.py	`90.94% <100%> (+0.28%)`	⬆️
...ry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py	`68.18% <0%> (ø)`	⬆️
...ntelemetry-api/src/opentelemetry/trace/__init__.py	`84.56% <0%> (+0.44%)`	⬆️
...elemetry-api/src/opentelemetry/metrics/__init__.py	`87.93% <0%> (+1.93%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 269b006...ad58890. Read the comment docs.

jakemalachowski · 2019-12-29T02:15:36Z

Implemented your suggested changes. I am now just waiting on your feedback with regards to where the validation logic should live. I just made it a private method for now. However, considering that there aren't any other custom private methods in the file, I am assuming it belongs elsewhere.

Thanks for your feedback so far.

ocelotl

Very good, just a minor request to use is not.

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

opentelemetry-sdk/tests/trace/test_trace.py

Add lists as an accepted data type

ocelotl · 2020-01-01T00:46:16Z

Looking good! Please request your Linux Foundation CLA permissions so that I can approve. 👍

jakemalachowski · 2020-01-01T23:24:42Z

I signed it

jakemalachowski · 2020-01-01T23:25:47Z

@ocelotl All checks are now passing. Thanks for the guidance on all this.

ocelotl

LGTM 👍

toumorokoshi

I have a few code conciseness and naming comments that I think we should resolve before merging this in.

Thanks for working on this! The tests and code overall look good.

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

toumorokoshi · 2020-01-03T05:11:45Z

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

@@ -208,8 +209,38 @@ def set_attribute(self, key: str, value: types.AttributeValue) -> None:
        if has_ended:
            logger.warning("Setting attribute on ended span.")
            return
+
+        if not isinstance(value, (int, float, bool, str, list, tuple)):


Could this int, float be consolidated into Number here?

If I use Number rather than int, float here, I will need to change it to:

if not isinstance(value, (bool, str, list, tuple)) and not issubclass(type(value), Number):

I think just using int, float here is more concise, but I can understand the appeal of being consistent with what is done in _check_attribute_value_sequence. What do you think?

isinstance seems to work fine:

$ python -c "from numbers import Number; print(isinstance(1, (bool, Number)))"
True

Similarly, please consider using collections.abc.Sequence instead of list, tuple.

And if using Number here, also use it in the AttributeValue type for consistency (and note that this allows Decimal and Fraction numbers too).

@toumorokoshi I'm not sure why my tests were failing after originally making this change. Just tried this again and it works.
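A small sketch of the ABC-based check discussed in this thread, for illustration only. The names `ACCEPTED` and `is_accepted` are hypothetical. It shows that `isinstance` works directly with `numbers.Number` and `collections.abc.Sequence`, and that the `Sequence` check is broader than `list, tuple`.

```python
from collections.abc import Sequence
from numbers import Number

# Hypothetical constant mirroring the tuple proposed in the review.
ACCEPTED = (bool, str, Number, Sequence)


def is_accepted(value: object) -> bool:
    """Top-level type check using abstract base classes."""
    return isinstance(value, ACCEPTED)


# For ordinary values, isinstance agrees with issubclass(type(value), ...).
assert isinstance(1, Number) == issubclass(type(1), Number)

# Lists and tuples are Sequences, but so are str and range.
print(is_accepted([1, 2]), is_accepted((1, 2)), is_accepted(range(3)))

# Mappings, sets, and None are rejected.
print(is_accepted({"a": 1}), is_accepted({1, 2}), is_accepted(None))
```

Because `str` is itself a `Sequence`, any per-element validation of sequences would need to handle strings before iterating.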

toumorokoshi · 2020-01-03T05:16:25Z

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

+
+        for element in sequence:
+
+            if not isinstance(element, (bool, str, Number)):


Should this tuple be moved into a constant in the module? The same set should be shared with the tuple on line 213 (line 213 could be modified to be that constant plus list, tuple):

VALID_ATTRIBUTE_TYPES + (list, tuple)

I agree this should be a constant. But do you think it might be misleading to have the constant VALID_ATTRIBUTE_TYPES without list, tuple included, since they are actually valid?

Maybe we could define VALID_ATTRIBUTE_SEQUENCE_TYPES and VALID_ATTRIBUTE_NON_SEQUENCE_TYPES, then define VALID_ATTRIBUTE_TYPES as the union of the two. What do you think?
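One possible shape for the constants proposed above, sketched under the assumption that the element types are `(bool, str, Number)` as in the quoted diff. The constant names come from the comment; the contents and how the merged code actually organises them are not confirmed by this thread.

```python
from numbers import Number

# Types allowed for a single value and for each element of a sequence.
VALID_ATTRIBUTE_NON_SEQUENCE_TYPES = (bool, str, Number)

# Container types allowed at the top level only.
VALID_ATTRIBUTE_SEQUENCE_TYPES = (list, tuple)

# Everything accepted at the top level is the union of the two.
VALID_ATTRIBUTE_TYPES = (
    VALID_ATTRIBUTE_NON_SEQUENCE_TYPES + VALID_ATTRIBUTE_SEQUENCE_TYPES
)

# Top-level check uses the union; per-element check uses the first tuple.
print(isinstance([1, 2], VALID_ATTRIBUTE_TYPES))
print(isinstance([1, 2], VALID_ATTRIBUTE_NON_SEQUENCE_TYPES))
```

Keeping the union as its own name avoids the misleading case where `VALID_ATTRIBUTE_TYPES` would exclude lists and tuples even though they are valid.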

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

toumorokoshi · 2020-01-03T05:28:38Z

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

+            if return_code is not None:
+                logger.warning("%s in attribute value sequence", return_code)
+                return
+
        self.attributes[key] = value


There's a potential edge case where AttributeValues that are lists can be mutated afterward, resulting in invalid types that exporters will run into.

I've added a followup ticket on that here: #352

That's a great catch. Would adding a copy of the list rather than the original list resolve this?

We could store the copies of the sequence values in tuples. Now that you mention this, instead of accepting lists or tuples we should accept sequences.

yeah, I think that's a good solution. Would be good to make a separate PR for that.
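The mutation edge case and the tuple-copy idea can be illustrated with a short sketch. The function `store_attribute` is hypothetical and is not the fix that was later implemented for #352; it only shows why storing a copy protects the stored value.

```python
from collections.abc import Sequence
from typing import Any, Dict


def store_attribute(attributes: Dict[str, Any], key: str, value: Any) -> None:
    """Store sequence values as an immutable tuple copy."""
    # str is a Sequence too, so it is excluded from the copy step.
    if isinstance(value, Sequence) and not isinstance(value, str):
        value = tuple(value)
    attributes[key] = value


attributes: Dict[str, Any] = {}
tags = ["a", "b"]
store_attribute(attributes, "tags", tags)

# Mutating the caller's list afterward no longer affects the stored value.
tags.append(3)
print(attributes["tags"])
```

Without the copy, the stored list would now contain a mix of `str` and `int`, which is exactly the invalid state an exporter could run into.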

…m:jakemalachowski/opentelemetry-python into ISSUE-347/attribue-value-type-enforcement

Co-Authored-By: Yusuke Tsutsumi <tsutsumi.yusuke@gmail.com>

…try-python into ISSUE-347/attribue-value-type-enforcement

toumorokoshi

Changes look good! One last reply that I think makes my suggestion possible, but let me know what you think.

opentelemetry-api/src/opentelemetry/util/types.py

Oberon00

First, please split the change of the AttributeValue definition into a separate PR (I would "request changes" for this, but seeing as I have far too little time for OpenTelemetry-Python work recently, I don't want to hold up things).

Second, I am worried that the trade offs for this feature could be wrong. The benefit of this PR is that if the API is used wrongly, the user now gets a log message right at the point of the mistake instead of a (caught) exception later in the exporter, and the span can still be processed correctly (except for the wrong attribute).

However, for users who use the API correctly (maybe they even use mypy to check for these kinds of mistakes), we slow down setting attributes (a very common operation, probably even the most common in the whole OpenTelemetry-Python API).

opentelemetry-api/src/opentelemetry/util/types.py

Change typing to prevent heterogeneous types in lists

jakemalachowski · 2020-01-11T17:21:20Z

@Oberon00 in response to your performance concerns:

You bring up a valid point. Is it specifically the list value validation that you see as a performance concern, or do you think the isinstance(value, (bool, str, Number, Sequence)) check is also a bad trade-off? I think keeping the first isinstance check but removing the validation on the list values might strike a good balance, but I will defer to the opinions of people more familiar with the project here.

I'm okay with implementing this either way but it would be great to get some additional input on what's best here.

carlosalberto · 2020-01-13T00:07:02Z

hey hey @Oberon00

I'm also thinking about this. I'm slightly in favor of not doing this validation, and letting exporters handle this. Of course, there's no perfect solution here, just a matter of trade-offs. In any case, this is a choice that maintainers will have to make ;)

(And of course, the code can be later updated if/as needed).

toumorokoshi · 2020-01-14T16:18:08Z

@carlosalberto @Oberon00 to address performance concerns: I do agree that adding this check will lead to slightly more expensive span creation, at the gain of giving helpful feedback to the user if span attributes are not valid, and removing the concern of type correctness from the exporters.

It's quite possible that, in the future, we may need to remove or pare down this code. But I think this is a very easy check to remove after the fact, and a hard one to add. Real-life use cases will probably better inform whether this becomes a performance bottleneck.

from a rudimentary timeit benchmark, checking isinstance is roughly a 188ns operation:

# on an Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz
$ python -m timeit "isinstance('a', (str, bool, list))"
10000000 loops, best of 3: 0.188 usec per loop

Let me know if you're strongly against it, and we can discuss further. I'll leave this PR up for another day to allow more discussion, but I'm currently inclined to merge (also since @jakemalachowski put the effort in to author it).

Oberon00

I'm only very slightly against the checks. 😃
And the argument that we can remove it later is actually a good one. For example if we decide that we want to allow exporters to support arbitrary value types, we can still do that later by removing the checks if they are in place. If we added the checks later, on the other hand, that would be a breaking change.

Oberon00 · 2020-01-15T16:59:20Z

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

@@ -208,8 +209,36 @@ def set_attribute(self, key: str, value: types.AttributeValue) -> None:
        if has_ended:
            logger.warning("Setting attribute on ended span.")
            return
+
+        if not isinstance(value, (bool, str, Number, Sequence)):


Note that this check is more lenient than the AttributeValue definition (Number vs (int, float)). Also note that it seems checking for isinstance(x, ABC-type) is more expensive:

In [5]: %timeit isinstance(3, (float, Number))
319 ns ± 6.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [6]: %timeit isinstance(3, (float, int))
110 ns ± 2.79 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

This is really interesting. Good optimization to keep in mind if this becomes a bottleneck in the future.
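To reproduce this comparison outside IPython, something like the following sketch using the standard `timeit` module should work. The helper `time_isinstance` is hypothetical, the lambda wrapper adds call overhead to both measurements, and absolute numbers will differ by machine and Python version, so no expected output is given.

```python
import timeit
from numbers import Number


def time_isinstance(type_tuple, value=3, number=200_000):
    """Return the average time per isinstance call, in nanoseconds."""
    total = timeit.timeit(lambda: isinstance(value, type_tuple), number=number)
    return total / number * 1e9


# Concrete types only, as in the AttributeValue definition.
concrete_ns = time_isinstance((float, int))

# Tuple containing an abstract base class, as in the check under review.
abc_ns = time_isinstance((float, Number))

print(f"(float, int):    {concrete_ns:.0f} ns per call")
print(f"(float, Number): {abc_ns:.0f} ns per call")
```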

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

Oberon00

From a technical perspective, I think this PR is fine now (apart from nit https://github.com/open-telemetry/opentelemetry-python/pull/348/files#r367960226). 👌

toumorokoshi · 2020-01-20T17:14:09Z

Great, thanks! Congrats on your first merged PR!

4fca8c9 ("Add runtime validation in setAttribute (open-telemetry#348)") added a robust attribute validation using numbers.Number to validate numeric types. Although the approach is correct, it presents some complications because Complex, Fraction and Decimal are accepted because they are Numbers. This presents a problem to the exporters because they will have to consider all these different cases when converting attributes to the underlying exporter representation. This commit simplifies the logic by accepting only int and float as numeric values.
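The difference described in this commit message can be seen in a short sketch. The helper `is_plain_number` is hypothetical and only illustrates the stricter check; it is not the code from the follow-up commit.

```python
from decimal import Decimal
from fractions import Fraction
from numbers import Number

exotic_values = [Decimal("1.5"), Fraction(1, 3), complex(1, 2)]

# All of these pass a numbers.Number check...
print([isinstance(value, Number) for value in exotic_values])


def is_plain_number(value: object) -> bool:
    """Stricter check: only int and float count as numeric values."""
    return isinstance(value, (int, float))


# ...but none of them pass the stricter int/float check.
print([is_plain_number(value) for value in exotic_values])
```

Note that `bool` is a subclass of `int` in Python, so `True` and `False` still pass an `(int, float)` check.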

Validate attribute value data types before adding to span

6914ff3

Add lists as an accepted data type

jakemalachowski requested a review from a team December 27, 2019 04:33

Change order of checks to remove an else condition

59107ad

ocelotl reviewed Dec 28, 2019

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py (outdated)

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py (outdated)

opentelemetry-sdk/tests/trace/test_trace.py (outdated)

jakemalachowski added 3 commits December 28, 2019 19:38

Create separate sequence check method, add tests, fix linting issues

8152fdb

Fix attribute value typing

1d972e8

Apply linting changes

c535e0c

ocelotl suggested changes Dec 29, 2019

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py (outdated)

opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py (outdated)

opentelemetry-sdk/tests/trace/test_trace.py

Use is not None, use optional type instead of union with None

59dd5f5

jakemalachowski requested review from ocelotl and removed request for a team December 31, 2019 00:13

jakemalachowski added 6 commits December 30, 2019 18:33

Validate attribute value data types before adding to span

2a5df2b

Add lists as an accepted data type

Change order of checks to remove an else condition

bacb09b

Create separate sequence check method, add tests, fix linting issues

3e67a9e

Fix attribute value typing

ff61b9e

Apply linting changes

4d74316

Use is not None, use optional type instead of union with None

8a7ec1c

ocelotl approved these changes Jan 2, 2020

toumorokoshi mentioned this pull request Jan 3, 2020

AttributeValues accepting a list could lead to invalid types being added #352

Closed

toumorokoshi suggested changes Jan 3, 2020

jakemalachowski and others added 3 commits January 3, 2020 07:30

Merge branch 'ISSUE-347/attribue-value-type-enforcement' of github.co…

f05956b

…m:jakemalachowski/opentelemetry-python into ISSUE-347/attribue-value-type-enforcement

Update opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

e7e976a

Co-Authored-By: Yusuke Tsutsumi <tsutsumi.yusuke@gmail.com>

Clarify variable and method names, remove redundant check

75261d8

jakemalachowski added 4 commits January 6, 2020 19:20

Commit lint changes

269b006

Merge branch 'master' of https://github.com/open-telemetry/openteleme…

47a5887

…try-python into ISSUE-347/attribue-value-type-enforcement

Lint changes

669b3eb

Lint changes

4a5812c

toumorokoshi approved these changes Jan 7, 2020

Oberon00 reviewed Jan 10, 2020

opentelemetry-api/src/opentelemetry/util/types.py (outdated)

Oberon00 reviewed Jan 10, 2020

opentelemetry-api/src/opentelemetry/util/types.py (outdated)

Use number instead of int, float

6f1dd5d

Change typing to prevent heterogeneous types in lists

Oberon00 reviewed Jan 15, 2020

Oberon00 changed the title ~~Validate attribute value data types before adding to span~~ Update AttributeValue type definition and add runtime validation in setAttribute Jan 15, 2020

Revert AttributeValue typing change

5225928

jakemalachowski mentioned this pull request Jan 16, 2020

Add int and valid sequences to AttributeValue type #368

Merged

jakemalachowski added 2 commits January 17, 2020 06:51

Prevent duplicate isinstance checks, run linter

b83dc7b

Lint

d8e5946

Oberon00 approved these changes Jan 17, 2020

Move length check inside validation method

ad58890

jakemalachowski changed the title ~~Update AttributeValue type definition and add runtime validation in setAttribute~~ Add runtime validation in setAttribute Jan 18, 2020

toumorokoshi merged commit 4fca8c9 into open-telemetry:master Jan 20, 2020

dgzlopes mentioned this pull request Jan 31, 2020

Add enforcement of AttributeValues for trace events #347

Closed

c24t mentioned this pull request Mar 2, 2020

Freeze sequence-valued span attributes #449

Merged

mauriciovasquezbernal mentioned this pull request Mar 4, 2020

Improve attributes validation #460

Merged

srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020

chore: add plugin developer guide (open-telemetry#348)

61ea2ad

* chore: add plugin developer guide * chore: add a link to example plugin
