osiris_replica_reader: Stop with `normal` if the leader is already gone during `init/1`

[Why]
In the context of RabbitMQ, if a stream queue is deleted right after being declared, there is a chance that some Osiris processes might not be ready yet at the time the queue is deleted.

In particular, the `osiris_replica_reader` process monitors the given leader (an `osiris_writer` process in the context of a RabbitMQ stream queue) during its `init/1` and that process might be stopped already.

When this happens, here is the crash that is logged:

    [error] <0.1548.0> ** Generic server <0.1548.0> terminating
    [error] <0.1548.0> ** Last message in was {'DOWN',#Ref<0.1118981177.1281884162.97904>,process,
    [error] <0.1548.0> <0.1535.0>,noproc}
    [error] <0.1548.0> ** When Server state == {state,
    [error] <0.1548.0> {osiris_log,
    [error] <0.1548.0> {cfg,
    [error] <0.1548.0> ".../__delete_queue_1716383944197847531",
    [error] <0.1548.0> <<"__delete_queue_1716383944197847531">>,500000000,
    [error] <0.1548.0> 256000,#{},[],
    [error] <0.1548.0> {write_concurrency,
    [error] <0.1548.0> #Ref<0.1118981177.1282015234.97903>},
    [error] <0.1548.0> {osiris_replica_reader,
    [error] <0.1548.0> {resource,<<"/">>,queue,<<"delete_queue">>},
    [error] <0.1548.0> {127,0,0,1},
    [error] <0.1548.0> 6489},
    [error] <0.1548.0> #Fun<osiris_writer.0.78287785>,
    [error] <0.1548.0> #Ref<0.1118981177.1282015234.97826>,16},
    [error] <0.1548.0> {read,data,0,tcp,all,8,undefined},
    [error] <0.1548.0> undefined,undefined,
    [error] <0.1548.0> {file_descriptor,prim_file,
    [error] <0.1548.0> #{handle => #Ref<0.1118981177.1282015238.91045>,
    [error] <0.1548.0> owner => <0.1548.0>,
    [error] <0.1548.0> r_buffer => #Ref<0.1118981177.1282015234.97902>,
    [error] <0.1548.0> r_ahead_size => 0}}},
    [error] <0.1548.0> <<"__delete_queue_1716383944197847531">>,tcp,
    [error] <0.1548.0> #Port<0.84>,<33363.1916.0>,<0.1535.0>,
    [error] <0.1548.0> #Ref<0.1118981177.1281884162.97904>,
    [error] <0.1548.0> {write_concurrency,
    [error] <0.1548.0> #Ref<0.1118981177.1282015234.97903>},
    [error] <0.1548.0> {osiris_replica_reader,
    [error] <0.1548.0> {resource,<<"/">>,queue,<<"delete_queue">>},
    [error] <0.1548.0> {127,0,0,1},
    [error] <0.1548.0> 6489},
    [error] <0.1548.0> -1,0}
    [error] <0.1548.0> ** Reason for termination ==
    [error] <0.1548.0> ** noproc

That is because the `osiris_replica_reader` process receives the `DOWN` message from the leader monitoring with the `noproc` reason. It reuses the reason for its own exit reason. Because this is an abnormal reason, a crash is being logged.

[How]
There is no reason to log such a crash when the process tree is being shut down concurrently. `osiris_replica_reader` can terminate with a `normal` reason.

That is what this patch does: if the leader exit reason is `noproc`, it terminates with the `normal` reason instead.
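
For illustration only, the behaviour described above can be sketched as a minimal stand-alone `gen_server`. This is not the actual `osiris_replica_reader` code: the module name `replica_reader_sketch`, the `start/1` function and the two-field `#state{}` record are hypothetical, and the real process carries much more state. The sketch relies on the fact that `erlang:monitor/2` on a dead pid still succeeds and immediately delivers a `'DOWN'` message with the `noproc` reason.

```erlang
-module(replica_reader_sketch).
-behaviour(gen_server).

-export([start/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical, reduced state: only what is needed to track the leader.
-record(state, {leader_pid :: pid(),
                leader_ref :: reference()}).

%% Start an unlinked sketch process that watches LeaderPid.
start(LeaderPid) ->
    gen_server:start(?MODULE, LeaderPid, []).

init(LeaderPid) ->
    %% If LeaderPid is already dead, monitor/2 still returns a reference
    %% and a 'DOWN' message with reason noproc is delivered right away.
    Ref = erlang:monitor(process, LeaderPid),
    {ok, #state{leader_pid = LeaderPid, leader_ref = Ref}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({'DOWN', Ref, process, Pid, noproc},
            #state{leader_pid = Pid, leader_ref = Ref} = State) ->
    %% The leader was already gone when it was monitored: stop with
    %% normal so that no crash report is logged.
    {stop, normal, State};
handle_info({'DOWN', Ref, process, Pid, Reason},
            #state{leader_pid = Pid, leader_ref = Ref} = State) ->
    %% Any other leader exit reason is reused as our own exit reason.
    {stop, Reason, State};
handle_info(_Info, State) ->
    {noreply, State}.
```

With this shape, only the `noproc` case is special-cased. Every other leader exit reason is still reused as the process's own exit reason, so any abnormal reason keeps producing a crash report.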