Fix the bug that healthy sentinel displays ERROR on the codis-fe #1730
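Symptom
Similar to #1345. Checking the dashboard status shows that the error is raised where parsing the reply to `Info` or `masterCommand` fails.
The sentinel client list shows that the connection between dashboard and sentinel is never closed, which contradicts the logic in the code that closes the connection whenever an error occurs: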
# age=1247007 id=3499285 addr=127.0.0.227:43720 fd=53 name= age=1247007 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=info
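Capturing the Redis requests received by sentinel shows that the sentinel displayed as ERROR unexpectedly received the `INFO` command twice.
Observation over a long period shows that the `Recv-Q` of the redis client maintained by dashboard is never 0.
A tcpdump capture shows that the exchange between the Redis client and Sentinel itself is fine. The problem is that every time dashboard pulls data from the TCP buffer, it gets exactly the reply to the previous request, which causes the reply parsing error: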
[Diagram omitted: sequence of `INFO` and `SENTINEL masters` requests alongside the replies read by dashboard, with the reply to the `INFO` request shown against `SENTINEL masters`.]
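The off-by-one reads described above can be reproduced with a small in-memory model. This is only a sketch: `fakeConn`, `send`, `recv`, `do`, and `simulate` are hypothetical names made up for illustration and are not part of Codis. It assumes one request was sent and then abandoned (a timeout) before its reply was read.

```go
package main

import "fmt"

// fakeConn is a hypothetical stand-in for one pooled connection.
// Replies queue up in the receive buffer in the order requests were sent.
type fakeConn struct {
	recvQ []string
}

// send writes a request; the peer's reply lands in the receive buffer.
func (c *fakeConn) send(cmd string) {
	c.recvQ = append(c.recvQ, "reply to "+cmd)
}

// recv returns the oldest unread reply, whichever request it belongs to.
func (c *fakeConn) recv() string {
	r := c.recvQ[0]
	c.recvQ = c.recvQ[1:]
	return r
}

// do is a normal round trip: send one request, read one reply.
func (c *fakeConn) do(cmd string) string {
	c.send(cmd)
	return c.recv()
}

// simulate abandons one INFO reply (the assumed timeout) and then reuses
// the same connection for further commands.
func simulate() []string {
	c := &fakeConn{}
	c.send("INFO") // timed out: request sent, reply never read

	var got []string
	for _, cmd := range []string{"INFO", "SENTINEL masters", "INFO"} {
		got = append(got, cmd+" <- "+c.do(cmd))
	}
	return got
}

func main() {
	for _, line := range simulate() {
		fmt.Println(line)
	}
}
```

In this model every read returns the reply to the previous request, and one reply always remains unread in the buffer, which is consistent with a `Recv-Q` that never drops to 0.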
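Suspected cause
When a `newRedisStats` call timed out, the client was not closed. Instead, a `defer` returned the redis client to the connection pool while the reply to its request was still unread in the input buffer.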
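One way to avoid returning a client with an unread reply to the pool is to let the deferred function look at the error and close the client on failure. The sketch below is a generic illustration under that assumption; `client`, `pool`, `fetchStats`, and `errTimeout` are hypothetical names and do not reflect the actual Codis types or the exact change in this pull request.

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical pooled redis client.
type client struct {
	closed bool
}

// Close marks the client as closed so it can never be reused.
func (c *client) Close() { c.closed = true }

// pool is a hypothetical connection pool holding idle clients.
type pool struct {
	idle []*client
}

// put returns a client to the pool for reuse.
func (p *pool) put(c *client) { p.idle = append(p.idle, c) }

// errTimeout stands in for a read timeout on the connection.
var errTimeout = errors.New("timeout")

// fetchStats runs query on c. The deferred function inspects the named
// return value: on any error the client is closed instead of pooled,
// because its receive buffer may still hold an unread reply.
func fetchStats(p *pool, c *client, query func(*client) error) (err error) {
	defer func() {
		if err != nil {
			c.Close()
			return
		}
		p.put(c)
	}()
	return query(c)
}

func main() {
	p := &pool{}

	bad := &client{}
	err := fetchStats(p, bad, func(*client) error { return errTimeout })
	fmt.Println(err, bad.closed, len(p.idle))

	good := &client{}
	err = fetchStats(p, good, func(*client) error { return nil })
	fmt.Println(err, good.closed, len(p.idle))
}
```

The named return value `err` is what lets the deferred function see the outcome of `query`; with an unconditional `defer p.put(c)`, the timed-out client would go back into the pool.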