Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A performanсe when publishing many packets to the same topic #52

Open
YuriK75 opened this issue Feb 11, 2021 · 15 comments
Open

A performanсe when publishing many packets to the same topic #52

YuriK75 opened this issue Feb 11, 2021 · 15 comments

Comments

@YuriK75
Copy link

YuriK75 commented Feb 11, 2021

Hi!
As I see, for publish a packet from client - I must:

  1. Make a topic String (or string slice) - contains own copy of data in heap
  2. Make a TopicName which "eats" topic String (or copies string to own copy in heap) and checks it with RegExp
  3. Make a payload - also own copy in heap
  4. Make a PublishPacket which "eats" TopicName and payload
  5. Serialize the packet to stream, which does not "eat" packet
  6. Drop the packet

When I need (and many others probaly too) to publish different (millions by day) payloads to the same (thousands) topics -
i can't reuse TopicName for doing 1-2 steps once and doing only 3-6 steps for every publish.

Also i can't store a packet for later reuse - it does not support payload replacing.

So this architecture breaks performance.

If You can add ability for using in packet shared ref to TopicName (and for payload may be too, but it
is not important) - it will be great!

Or add convenience fn which receives &TopicName, payload: &[u8], &mut Vec
and serializes 1st and 2nd arguments to 3rd...

Yuri

@zonyitoo
Copy link
Owner

zonyitoo commented Feb 11, 2021

It seems that a PublishPacketRef will do the trick.

pub struct PublishPacketRef<'a> {
    fixed_header: FixedHeader,
    topic_name: &'a TopicNameRef,
    packet_identifier: Option<PacketIdentifier>,
    payload: &'a [u8],
}

And impls Packet for it. Very simple solution, but the Packet trait implies that the struct could be be Encodable and Decodable, but PublishPacketRef couldn't be Decodable.

@Spoonbender
Copy link

Hmmm perhaps split Packet to OutgoingPacket and IncomingPacket (Packet impls both), just so we can have this optimization of published packets?
The scenario @YuriK75 presented is quite common, and avoiding these allocations and deallocations would be great.

@YuriK75
Copy link
Author

YuriK75 commented Feb 12, 2021

What about impl only Encodable for PublishPacketRef?
That PublishPacketRef will live very shortly - may be it is too complex for solving one small task...
May the only function will do all - at once will take all parameters and encode it to Write.

pub fn publish<W: Write>(
    // ... FixedHeader or any other
    topic_name: &TopicName,
    packet_identifier: Option<PacketIdentifier>,
    payload: &[u8],
    writer: &mut W) -> Result<(), PublishPacket::Err>

@zonyitoo
Copy link
Owner

zonyitoo commented Feb 12, 2021

Yes, I have already tried both @Spoonbender and @YuriK75 's proposal locally, yesterday.

  1. Splits Packet into OutgoingPacket and IncomingPacket is good, but one problem have to be solved: how to define Pa cket::Payload? Or we should just remove payload() and payload_ref() member functions, and PacketError<T>'s definition of PayloadError, which are breaking changes.
  2. impls Encodable for PublishPacketRef only is good, but what is the Err type? Normally it should be PacketError<T> as all the other *Packet types.

I haven't decided yet, maybe both of yours ideas should be adopted, which may lead to a huge API breaking chage.

@zonyitoo
Copy link
Owner

zonyitoo commented Feb 12, 2021

May the only function will do all - at once will take all parameters and encode it to Write.

Yes, that is a simpler solution, but only be useful for this case.

@Spoonbender
Copy link

@zonyitoo maybe a cow will help with the fields that are either owned or borrowed?

@zonyitoo
Copy link
Owner

Packet::decode_packet requires Self to have 'static lifetime.

@zonyitoo
Copy link
Owner

It should be ok to remove the Packet trait .. Hmm ... Very complicated.

zonyitoo added a commit that referenced this issue Feb 20, 2021
- splits Packet into EncodablePacket and DecodablePacket
- removes default payload() and payload_ref() methods
- fixes encoded_length() implementations for all packets
- add PublishPacketRef for performance optimization
@zonyitoo
Copy link
Owner

Please check if this commit works for you.

zonyitoo added a commit that referenced this issue Feb 20, 2021
- splits Packet into EncodablePacket and DecodablePacket
- removes default payload() and payload_ref() methods
- fixes encoded_length() implementations for all packets
- add PublishPacketRef for performance optimization
zonyitoo added a commit that referenced this issue Feb 20, 2021
- splits Packet into EncodablePacket and DecodablePacket
- removes default payload() and payload_ref() methods
- fixes encoded_length() implementations for all packets
- add PublishPacketRef for performance optimization
@YuriK75
Copy link
Author

YuriK75 commented Feb 25, 2021

Please check if this commit works for you.

Hmmm... so many changes ... I think my issue much less difficult to solve...

  1. Splitting Packet trait - ok, it is reasonable change!
  2. New PublishPacketRef struct - at most ok.
  3. New TopicNameRef tuple - wrong idea, completely unneeded type. I think in PublishPacketRef::topic_name You should use &TopicName instead of &TopicNameRef. And unsafe code in TopicNameRef::new() with dirty C-style pointers and unusing borrow checker feature - completely bad idea. Remove TopicNameRef at all, please!

Yuri

@zonyitoo
Copy link
Owner

TopicNameRef is not a new struct. But I think it is safe to remove because nobody is actually using it.

@YuriK75
Copy link
Author

YuriK75 commented Feb 26, 2021 via email

@zonyitoo
Copy link
Owner

zonyitoo commented Feb 26, 2021

I just double checked the design of TopicNameRef. It is required because it is used in TopicFilterRef, which is "new type" for adding special methods for &str.

It is ok to left it there, you can just pass &TopicName as the first argument of PublishPacketRef::new.

zonyitoo added a commit that referenced this issue Feb 26, 2021
- splits Packet into EncodablePacket and DecodablePacket
- removes default payload() and payload_ref() methods
- fixes encoded_length() implementations for all packets
- add PublishPacketRef for performance optimization
@YuriK75
Copy link
Author

YuriK75 commented Mar 1, 2021

As for me, PublishPacketRef should use &TopicName in topic_name field instead of &TopicNameRef.
It will be the only usable solution for me.

Can You change it?

I think TopicNameRef can be everywhere replaced by &TopicName (& is the same thing as "Ref").
TopicNameRef is some mistake struct... But You decide, may be i am wrong...

@zonyitoo
Copy link
Owner

zonyitoo commented Mar 1, 2021

TopicNameRef is for making a str type that is a VALID Topic Name.

mqtt-rs/src/topic_filter.rs

Lines 274 to 278 in e910579

assert!(matcher.is_match(TopicNameRef::new("sport").unwrap()));
assert!(matcher.is_match(TopicNameRef::new("/").unwrap()));
assert!(matcher.is_match(TopicNameRef::new("abc/def").unwrap()));
assert!(!matcher.is_match(TopicNameRef::new("$SYS").unwrap()));
assert!(!matcher.is_match(TopicNameRef::new("$SYS/abc").unwrap()));

It will be the only usable solution for me.

Could you provide a code snippet about this? Is this design causing unfixable compile errors? Are there any uncomfortable use cases?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants