[Bug] Python client, multiple masters without slaves, no load: intermittent minute-level delay #9145
Comments
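The cause is basically found. Each call to receive in the Python client looks at only one queue, and even if that queue is empty the client does not switch queues immediately; it waits for the invisible duration, which must be set to at least 10 seconds. In the worst case, with 2 masters and 16 queues in total, it takes 160 seconds to poll around to the one queue that holds the message.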
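The Python client's polling strategy is very odd: after consuming one message it switches queues immediately, but when it consumes nothing it waits a full duration before switching.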
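A correction: the delay is not governed by the invisible duration but by the await duration passed in when the simple consumer object is constructed, which in my tests has to be at least 5 seconds. The problem remains. When a queue is empty, the client should switch to another broker directly instead of polling the other queues of a broker that has no messages.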
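Polling only needs to switch between master brokers. For example, with 2 masters a and b, alternating between queue 0 of a and queue 0 of b is enough; there is no need to go through a0 to a7 and then b0 to b7,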
because a single receive on a0 already returns all the messages of a0 to a7, and the same holds for b0.
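To put numbers on the comments above, here is a rough sketch of how long one full polling rotation takes. It is a simplified model, not the client's actual code: it assumes every receive on an empty queue blocks for the full await duration, and that the claim about a0 returning messages for all of a's queues holds. The function name and parameters are made up for illustration.

```python
def full_rotation_seconds(brokers: int, queues_per_broker: int,
                          await_seconds: float, per_broker: bool = False) -> float:
    """Time for one pass over every polling slot when each empty receive
    blocks for await_seconds (a simplified model, not the client's code)."""
    # Per-queue rotation visits every queue of every broker;
    # per-broker rotation visits only one queue per broker.
    slots = brokers if per_broker else brokers * queues_per_broker
    return slots * await_seconds


print(full_rotation_seconds(2, 8, 10))                  # 16 queues, 10 s each
print(full_rotation_seconds(2, 8, 5))                   # 16 queues, 5 s each
print(full_rotation_seconds(2, 8, 5, per_broker=True))  # 2 brokers, 5 s each
```

With 2 masters of 8 queues each this gives 160 seconds at a 10 second wait and 80 seconds at a 5 second wait, versus 10 seconds if the rotation only alternates between brokers. A message that arrives just after its queue was polled waits close to one full rotation.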
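With this change the consumer works fine, but I found that the producer then sends only to queues a0 and b0 and never to a1 to a7 or b1 to b7, so the change needs to check the client's role.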
Check whether the client, when it handles multiple topics, polls each topic in turn and that causes the delay. Try using several separate simple consumers.
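When using a simple consumer, I suggest building a local cache and decoupling consumption from the receive operation: after each receive, put the messages into the local cache and immediately issue the next receive. This can effectively reduce the end-to-end message latency.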
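A minimal sketch of that suggestion, using only the standard library. `receive` here is a placeholder for any zero-argument callable that wraps the client's real receive call and returns a list of messages; its name and shape are assumptions for illustration, not the rocketmq-clients API.

```python
import queue
import threading


def start_receiver(receive, buffer: queue.Queue, stop: threading.Event) -> threading.Thread:
    """Call receive() in a loop on a background thread and buffer the results.

    receive is a hypothetical zero-argument callable; it may block for up to
    the await duration and return an empty list (or None) when nothing arrives.
    """
    def loop():
        while not stop.is_set():
            for message in receive() or []:
                buffer.put(message)  # hand off to the consuming side
            # Loop straight back into receive() without waiting for processing.

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

The consuming side then reads with `buffer.get()` and acknowledges each message after processing it, so slow processing never postpones the next receive. Note that this only shortens the gap between receive calls; it does not change how the client rotates over queues.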
Before Creating the Bug Report
I found a bug, not just asking a question, which should be created in GitHub Discussions.
I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.
I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.
Runtime platform environment
ubuntu
RocketMQ version
5.3.1
JDK Version
Java8
Describe the Bug
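2 masters share the messages, no slaves.
Using the Python client: https://github.com/apache/rocketmq-clients
Topic: normal type.
1 consumer: calls simpleConsumer.receive in a loop to receive messages.
1 producer: calls producer.send to send messages.
Traffic load: no traffic at all, manual testing only.
flushDiskType on both masters is set to synchronous.
Problem: intermittently, after the producer sends a message, the consumer only receives it after a delay of half a minute to 2 minutes.
During the delay the consumer calls receive several times without getting anything, and the dashboard shows the topic's consumer status as a delay of 1 message.
In other words, the master has received the message and the consumer keeps trying to consume it, but it does not get the message.
Workaround: with 1 master, the problem does not occur.
With 1 master and 1 slave, the problem does not occur.
With multiple masters, each with a slave, the problem occurs.
With multiple masters and no slaves, the problem occurs.
That is, the problem occurs only when multiple masters share the traffic.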
Steps to Reproduce
See the bug description.
What Did You Expect to See?
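An MQ is meant for real-time consumption; in any deployment topology, a single-message test should never show a minute-level delay.
I expect that with multiple masters sharing the messages, and with any number of consumers in the consumer group, messages are still consumed in real time.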
What Did You See Instead?
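I would also appreciate an explanation of the regular consumption mechanism for normal-type topics.
My guess is that this problem arises because a consumer can only listen to 1 master at any given moment. With multiple masters, does the consumer switch queues between masters, and does that cause the delay?
If so, please provide a solution.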
Additional Context
No response