Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EventHubs] check for any non-None values in amqp header/properties #27444

Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ def to_outgoing_amqp_message(annotated_message):
:rtype: pyamqp.Message
"""
message_header = None
if annotated_message.header and any(annotated_message.header.values()):
if annotated_message.header and annotated_message.header._any(): # pylint: disable=protected-access
message_header = Header(
delivery_count=annotated_message.header.delivery_count,
ttl=annotated_message.header.time_to_live,
Expand All @@ -93,7 +93,7 @@ def to_outgoing_amqp_message(annotated_message):
)

message_properties = None
if annotated_message.properties and any(annotated_message.properties.values()):
if annotated_message.properties and annotated_message.properties._any(): # pylint: disable=protected-access
message_properties = Properties(
message_id=annotated_message.properties.message_id,
user_id=annotated_message.properties.user_id,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -124,7 +124,7 @@ def to_outgoing_amqp_message(annotated_message):
:rtype: uamqp.Message
"""
message_header = None
if annotated_message.header and any(annotated_message.header.values()):
if annotated_message.header and annotated_message.header._any(): # pylint: disable=protected-access
message_header = MessageHeader()
message_header.delivery_count = annotated_message.header.delivery_count
message_header.time_to_live = annotated_message.header.time_to_live
Expand All @@ -133,7 +133,7 @@ def to_outgoing_amqp_message(annotated_message):
message_header.priority = annotated_message.header.priority

message_properties = None
if annotated_message.properties and any(annotated_message.properties.values()):
if annotated_message.properties and annotated_message.properties._any(): # pylint: disable=protected-access
message_properties = MessageProperties(
message_id=annotated_message.properties.message_id,
user_id=annotated_message.properties.user_id,
Expand Down
13 changes: 13 additions & 0 deletions sdk/eventhub/azure-eventhub/azure/eventhub/amqp/_amqp_message.py
Original file line number Diff line number Diff line change
Expand Up @@ -346,6 +346,12 @@ def __init__(self, **kwargs):
self.durable = kwargs.get("durable")
self.priority = kwargs.get("priority")

def _any(self):
Copy link
Member Author

@swathipil swathipil Nov 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@annatisch -

I tried the following options:

1. any(a)  # baseline
2. any(v is not None for v in a)
3. a.count(None) != len(a)
4.
def _any():
    for val in a:
        if val is not None:
            return True
    return False

I timed the above with different a values. Here's a sample of respective times using timeit (with 1000000 executions *100):

# a1.
a = [None]*13
a[1] = False

0.10174944800062803
0.37777837800007547
0.09170495199796277
0.08195296799996868

# a2
a = [None]*13
a[6] = False

0.09471784099878278
0.6023484600000666
0.08765571700205328
0.08311545400181786

# a3
a = [None]*13

0.09684219000017037
0.7731148380006198
0.08053063299914356
0.08237961799954065

Adding an _any method seems to be the most readable I think.
It's the fastest for a1 and 2. Even when it's slower for a3, it's still faster than the baseline any(). (I can run the perf test on this PR as well.)

Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome stuff! Thanks so much for doing this analysis :)
It's probably worth noting that from a perf perspective - calling a standalone function is not exactly the same as calling an object method - however I expect the difference to be pretty minimal - and even if it's not - we could always just use a standalone function for this as well.

Copy link
Member Author

@swathipil swathipil Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah gotcha. Based on what you said, I just tested calling AmqpMessageProperties, just to double check the speed of method vs. function vs. other calls. In general, it looks like 3. is consistently the fastest.

1. any(a.values())  # baseline
2. any(v is not None for v in a.values())
3.
v = a.values() 
v.count(None) != len(v)
4. def _any():
    for val in a.values():
        if val is not None:
            return True
    return False
5.
class AmqpMessageHeader:
  def _any():
      for val in a.values():
          if val is not None:
              return True
      return False

This is the result of grabbing the min of timeit.repeat.


from azure.eventhub.amqp import AmqpMessageProperties
a = AmqpMessageProperties()
a.message_id = False    # [False] + [None]*12

1.9633787999628112
2.168429500015918
1.89844889997039
2.05894580000313
1.9218437999952585

from azure.eventhub.amqp import AmqpMessageProperties
a = AmqpMessageProperties()
a.content_type = False  # [None]*6 + [False] + [None]*6

1.8671079999767244
2.518583500001114
1.971821799990721
2.1595329999690875
2.083914199960418

from azure.eventhub.amqp import AmqpMessageProperties
a = AmqpMessageProperties()     # [None]*13

1.9913484000135213
2.7721921000047587
1.979206099989824
2.3778754000086337
2.2316732999752276

Esp in the case of all values being None, 3. is the only one that is faster than the baseline. Based on this switching to using solution 3. Any concerns?

# Returns True for values 0, False, "", etc. unlike any()
for val in self.values():
if val is not None:
return True
return False

class AmqpMessageProperties(DictMixin):
# pylint: disable=too-many-instance-attributes
Expand Down Expand Up @@ -437,3 +443,10 @@ def __init__(self, **kwargs):
self.group_id = kwargs.get("group_id")
self.group_sequence = kwargs.get("group_sequence")
self.reply_to_group_id = kwargs.get("reply_to_group_id")

def _any(self):
# Returns True for values 0, False, "", etc. unlike any()
for val in self.values():
if val is not None:
return True
return False
16 changes: 16 additions & 0 deletions sdk/eventhub/azure-eventhub/tests/unittest/test_event_data.py
Original file line number Diff line number Diff line change
Expand Up @@ -173,6 +173,22 @@ def test_amqp_message_str_repr():
assert str(message) == 'A'
assert 'AmqpAnnotatedMessage(body=A, body_type=data' in repr(message)

def test_amqp_message_header_properties_any():
header = AmqpMessageHeader()
header.first_acquirer = False
properties = AmqpMessageProperties()
properties.creation_time = 0

assert header._any()
assert properties._any()

header = AmqpMessageHeader()
properties = AmqpMessageProperties()
properties.message_id = ""

assert not header._any()
assert properties


def test_amqp_message_from_message(uamqp_transport):
if uamqp_transport:
Expand Down