Horizon Lite: Automate ledger cache management for the web service. #4526

Closed
1 of 3 tasks
Tracked by #4317
sreuland opened this issue Aug 9, 2022 · 6 comments

@sreuland
Contributor

sreuland commented Aug 9, 2022

A crucial part of getting Horizon Lite to an acceptable level of performance is caching ledgers locally to avoid latency spikes from downloading ledgers.

There's a toolkit to "pre-load" the cache in addition to the existing behavior of caching ledgers on use, e.g.:

exp/lighthorizon> ./lighthorizon cache build --start <ledger>

This same process should be done in parallel on startup of the web service if a parameter is provided, and it should subsequently be integrated into the k8s deployment.

Acceptance Criteria

  • An operator can easily control how much index data is pre-cached locally, based on a number of ledgers given as a web server startup parameter. Cache warm-up should run in parallel: the web server should be responsive as early as possible and use the cache immediately, even while the cache's state is still changing as the async population process warms it up. That process must not interfere with concurrent web read access to the index.

Sub tasks

@Shaptic Shaptic changed the title exp\lighthorizon: web service automated management of txmeta ledger cache exp/lighthorizon: web service automated management of txmeta ledger cache Aug 9, 2022
@Shaptic Shaptic mentioned this issue Sep 1, 2022
7 tasks
@Shaptic Shaptic changed the title exp/lighthorizon: web service automated management of txmeta ledger cache Horizon Lite: Automate ledger cache management for the web service. Sep 1, 2022
@jcx120 jcx120 moved this to Backlog in Platform Scrum Sep 1, 2022
@jcx120 jcx120 moved this from Backlog to Next Sprint Proposal in Platform Scrum Sep 1, 2022
@sreuland
Contributor Author

sreuland commented Sep 1, 2022

@Shaptic, I wanted to confirm the scope: this would be one new startup parameter to the Horizon Lite web server, like --preload-cache=true|false, and the behavior is that the web server will launch an async parallel routine that executes ./tools cache build --start <ledger>. Can it just pass 0 for --start to default to the earliest ledger in the source index?

@Shaptic
Contributor

Shaptic commented Sep 1, 2022

That's a great question 🤔 I don't think starting at the earliest available ledger is a good idea. Based on our research, recent ledgers make much more sense to cache up-front. Perhaps we can do something like --preload-cache=<count> which does a cache build --start $(latest - count) --count <count> (where latest is a pseudo-placeholder for the latest ledger in the index [or meta? this is hard to determine from the index] store).
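The start = latest - count arithmetic suggested above can be sketched roughly as follows. This is only an illustration: preload_range is a hypothetical helper, and it assumes the latest ledger sequence is already known, which (as noted) is the hard part. It also clamps the start so it never reaches back past the earliest available ledger.

```python
from typing import Tuple


def preload_range(latest: int, count: int, earliest: int = 1) -> Tuple[int, int]:
    """Return (start, count) for preloading the most recent ledgers.

    Hypothetical helper: `latest` and `earliest` are assumed to be known
    ledger sequence numbers; nothing here queries a real index store.
    """
    if count <= 0 or latest < earliest:
        return (earliest, 0)
    # Mirror the suggested arithmetic, start = latest - count, but never
    # reach back past the earliest ledger that is available.
    start = max(earliest, latest - count)
    # If clamping moved the start forward, shrink the count to match.
    return (start, min(count, latest - start))


start, count = preload_range(latest=1000, count=100)
print(f"cache build --start {start} --count {count}")
```

As in the comment, the range stops just short of latest itself; whether the newest ledger should be included is left open here.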

As for async/parallel, my perspective was more that it happens before the web server launches, i.e. if you want to deploy the web server, it first loads X ledgers into the cache, then starts the web server. So it's almost part of the provisioning/installation/setup step of deployment. I'm concerned about synchronization and race condition issues if it's done in parallel with serving requests.

@paulbellamy
Contributor

I think (if possible) it would be better for it to happen in the background after the web server is launched. That way you can launch quickly and still have a hot cache. That might be more complicated, though: since you would be serving requests, you'd want the request-driven cache fills to take precedence so they don't get evicted. If the operator is concerned about cache latency, they could use a ramp-up or delay config option in their load balancer (I forget what HAProxy calls this). But IMO it is better to be serving requests slowly than refusing them until the cache is populated (which may or may not even help, depending on the queries).
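One way to picture "request fills take precedence" is an LRU cache where warm-up entries are inserted at the cold end and never evict anything. The sketch below is an assumption-laden illustration, not how Horizon Lite's cache works: LedgerCache, put_from_request, put_from_warmup, and warm_cache are all hypothetical names, and fetch_ledger stands in for whatever actually downloads a ledger.

```python
import threading
from collections import OrderedDict


class LedgerCache:
    """Thread-safe LRU cache; the front of the OrderedDict is evicted first."""

    def __init__(self, max_size):
        self._max_size = max_size
        self._entries = OrderedDict()
        self._lock = threading.Lock()

    def __contains__(self, seq):
        with self._lock:
            return seq in self._entries

    def get(self, seq):
        with self._lock:
            if seq not in self._entries:
                return None
            self._entries.move_to_end(seq)  # mark as most recently used
            return self._entries[seq]

    def put_from_request(self, seq, ledger):
        """Fill triggered by a web request: most recent, may evict others."""
        with self._lock:
            self._entries[seq] = ledger
            self._entries.move_to_end(seq)
            while len(self._entries) > self._max_size:
                self._entries.popitem(last=False)

    def put_from_warmup(self, seq, ledger):
        """Fill from the background warmer: never evicts or overwrites."""
        with self._lock:
            if seq in self._entries or len(self._entries) >= self._max_size:
                return
            self._entries[seq] = ledger
            # Place at the cold end so request fills push it out first.
            self._entries.move_to_end(seq, last=False)


def warm_cache(cache, fetch_ledger, start, count):
    """Background warm-up loop; fetch_ledger is a hypothetical downloader."""
    for seq in range(start, start + count):
        if seq not in cache:
            cache.put_from_warmup(seq, fetch_ledger(seq))


cache = LedgerCache(max_size=1000)
warmer = threading.Thread(
    target=warm_cache,
    args=(cache, lambda seq: f"ledger-{seq}", 900, 100),
    daemon=True,
)
warmer.start()  # the web server could begin serving requests right away
cache.put_from_request(42, "ledger-42")
warmer.join()
```

A single lock keeps readers and the warmer from corrupting the structure, at the cost of some contention; a real implementation would likely want something finer-grained.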

As for what to populate initially: for the default we use 90 days of ledgers, so maybe the latest 90 days? Or maybe even just the latest 30 days, then start serving requests? Depending on cache size, it could be cache_max / 3 of the latest ledgers?
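To get a feel for the numbers, here is a rough sizing sketch. It assumes ledgers close about every 5 seconds, which is an approximation supplied for illustration and exposed as a parameter, and it combines the two suggestions (a number of days, or cache_max / 3) by taking the smaller one. That combination is a choice made for this example, not something decided in this thread.

```python
def ledgers_in_days(days, close_time_seconds=5.0):
    """Approximate ledger count for a time window.

    close_time_seconds is an assumed average ledger close time.
    """
    return int(days * 24 * 60 * 60 / close_time_seconds)


def default_preload_count(cache_max, days=30, close_time_seconds=5.0):
    """Hypothetical default: the last `days` of ledgers, capped at cache_max / 3."""
    return min(ledgers_in_days(days, close_time_seconds), cache_max // 3)


print(ledgers_in_days(90))              # ledgers in roughly 90 days
print(default_preload_count(3_000_000))  # 30 days fits under the cap
print(default_preload_count(3_000))      # cap of cache_max / 3 applies
```

Under that assumed close time, 90 days is on the order of 1.5 million ledgers, so the cache_max / 3 cap would usually be the binding limit for a modest cache.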

@2opremio
Contributor

2opremio commented Sep 2, 2022

If you want to do it at the start, k8s supports init containers https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

This means that as long as the tool is containerized, you don't need to do any further scripting or embed the command in Horizon Lite itself.

@sreuland sreuland self-assigned this Sep 2, 2022
@sreuland sreuland moved this from Next Sprint Proposal to Current Sprint in Platform Scrum Sep 2, 2022
@sreuland sreuland moved this from Current Sprint to In Progress in Platform Scrum Sep 2, 2022
@sreuland
Contributor Author

@Shaptic, any thoughts on lightweight/fast approaches for getting the 'latest' indexed ledger from the index store? I see that the explorer tool's index stats file:///tmp/indices could get the latest checkpoint ledger from the index store, but it iterates over every account first, so that seems expensive for this purpose. My first thought was to expand the scope here and change the index processor to emit a 'latest' file at the root directory of the index store. It would contain just one line with the greatest sequence number in the index store, so retrieval is O(1): a single file read op.
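The 'latest' file idea could look something like the sketch below. It assumes a local-filesystem index store, and write_latest / read_latest are hypothetical helpers; a remote store would need its own equivalent of the atomic replace. Writing to a temporary file first means a concurrent reader never sees a half-written value.

```python
import os
import tempfile

LATEST_FILENAME = "latest"  # file name taken from the proposal above


def write_latest(index_root, sequence):
    """Atomically record the greatest indexed ledger sequence (one line)."""
    fd, tmp_path = tempfile.mkstemp(dir=index_root, prefix=".latest-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(f"{sequence}\n")
        # os.replace swaps the file in atomically on the same filesystem.
        os.replace(tmp_path, os.path.join(index_root, LATEST_FILENAME))
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_latest(index_root):
    """Single small file read, independent of how many accounts are indexed."""
    with open(os.path.join(index_root, LATEST_FILENAME)) as f:
        return int(f.readline().strip())


root = tempfile.mkdtemp()  # stand-in for an index store directory
write_latest(root, 42000000)
print(read_latest(root))
```

The index processor would call something like write_latest after each batch it finishes, and the web server would call read_latest once at startup to decide where the preload window begins.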

@sreuland sreuland moved this from In Progress to Current Sprint in Platform Scrum Sep 16, 2022
@Shaptic
Contributor

Shaptic commented Sep 16, 2022

You're right, @sreuland, it's pretty much impossible without changes to the index builder (hence the necessity for the tool). I think the initial design meant that the ledger source and the index source were "in sync," but otherwise it's not feasible without additional metadata.

@Shaptic Shaptic moved this from Current Sprint to In Progress in Platform Scrum Sep 27, 2022
@Shaptic Shaptic moved this from In Progress to Needs Review in Platform Scrum Sep 27, 2022
@Shaptic Shaptic moved this from Needs Review to Done in Platform Scrum Nov 16, 2022