Ensure captive core LCL is not ahead of horizon last ingested ledger during restarts #5123
We could modify Horizon ingestion's termination logic so that we ingest whatever ledgers are left in the in-memory buffer before shutting down. But I think that will be tricky to implement because of concurrency issues and coupling between the Horizon ingestion state machine and the captive core process manager. Another approach which I think could be promising is to save the ledgers from the in-memory buffer into a file before shutting down. Then, when Horizon is restarted, it can read the ledgers from the file and insert them into the in-memory buffer before starting up captive core.
I wonder if we capture metrics on how often this in-memory buffer is actually used. Presumably, if Horizon is ingesting faster than captive core is generating ledgers, it shouldn't happen often (or at all?), but obviously we added this for a reason. I ask because maybe we should explore whether it has any utility to us at all and, if not, whether we should even have it. There was a recent Slack conversation where @graydon pointed out that core has/had attempted to design captive core so as to remove the possibility of this happening (in the case of core itself crashing). It seems we re-introduced that possibility by introducing this in-memory queue. Context:
Yeah, the metadata pipe was made synchronous and contains a synchronous flush embedded in core's commit path specifically to allow horizon to make a durability guarantee inside the captive's commit cycle, such that this never happens. Horizon "mitigating" this feature by reading ahead into a non-durable (in-memory) buffer is actually defeating that design. (Thinking about this a bit more, I can see that we might even need a subsequent durability-ack signal from horizon in the form of, say, an additional synchronous dummy frame that we write-and-flush and horizon reads in order to ack the previously-flushed non-dummy txmeta; but the idea is absolutely to do what is necessary to prevent this scenario!)
(if it's a problem to have a captive block while horizon commits, that's a different kettle of fish; we assumed this is not a problem, since captives are guaranteed not to be validators we assumed that it's ok for them to block a bit / fall behind the network a bit, nobody's waiting for their vote)
Point posed by @tamirms at grooming just now: it seems like core commits as soon as Horizon reads from the pipe; however, at that point, Horizon hasn't committed to its own database. So there's still a possibility that Horizon could crash or be shut down while processing that ledger, before it commits to its own database. The durability-ack idea you propose could potentially solve this. Of course, Horizon would need to remove the in-memory queue on its side for that to be workable. Either way, it would definitely be valuable for us to understand the history of why that queue was introduced.
I found the issue which motivated adding the queue: #3132. Another way to solve this issue is if core stored the
If considering an extended custom serial transport protocol over core's meta o/s pipe with ack messaging, it is worth considering OSS network libraries that run a framed transport underneath to accomplish the same, like 0mq.
Core has the
#4845 added a metric which can be used to calculate how many ledgers are stored in the in-memory queue. Note that these metrics are only available on staging because they have not yet been deployed in production. In the past 30 days we have never accumulated more than 1 ledger in the in-memory queue:
We discussed this issue during the onsite last week with the stellar core team. Core cannot restart from a ledger earlier than the LCL because we cannot rewind the state of BucketsDB to a previous ledger. So it looks like we need to implement a solution on the Horizon side (perhaps one of the two options mentioned in #5123 (comment)).
Note that, even if we remove the explicit queue, we are still using a pipe to read the data (with an associated buffer).
As @sreuland says, replacing the pipe with a two-way communication protocol would solve the problem, but that's probably a lot of work to implement and propagate downstream.
A two-way protocol could also allow us to decouple core and Horizon or soroban-rpc in the future. This would be a big win, as currently we have to bundle core in Docker images and there is no way to manage versions independently.
Uhm, you still need to manage the captive core process (restart it on error etc ...)
Yes, absolutely - this would have to be folded into the protocol. Core might also need to grow some additional functionality to allow that. I appreciate that this is hard, of course, but IMO this is the best long-term strategy.
If Horizon is running captive core with BucketsDB enabled, Horizon should be able to restart quickly as long as we run captive core so that it resumes from the LCL (last closed ledger) recorded in the core DB. We have code in Horizon's ingestion library to detect this case:
https://github.com/stellar/go/blob/horizon-v2.27.0-rc2/ingest/ledgerbackend/stellar_core_runner.go#L376-L379
But, there is a scenario where the LCL can be ahead of the last ingested ledger in Horizon's DB:
When captive core writes a ledger to the file pipe, the captive core process (which is single threaded) blocks until the ledger is read out of the file pipe. However, Horizon mitigates the possibility that core is blocked on writing ledgers over the pipe by continuously reading ledgers out of the file pipe until an in-memory buffer is filled up:
https://github.com/stellar/go/blob/horizon-v2.27.0-rc2/ingest/ledgerbackend/buffered_meta_pipe_reader.go
If the speed of Horizon ingestion is faster than the rate at which captive core emits ledgers, the in-memory ledger buffer will not have a chance to accrue many ledgers. But there is always the possibility that, when Horizon receives the shutdown signal, the in-memory ledger buffer will have accumulated some ledgers. In that scenario, the ledgers stored in the buffer are discarded which means that the LCL in the captive core DB is ahead of Horizon's last ingested ledger.
We should mitigate this scenario, where the LCL is ahead of Horizon's DB, so that Horizon can be restarted quickly.