Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exp/lighthorizon: Refactor codebase to avoid using HistoryArchiveBackend as a filesystem interface #4460

Closed
Tracked by #4571
Shaptic opened this issue Jul 15, 2022 · 6 comments
Assignees

Comments

@Shaptic
Copy link
Contributor

Shaptic commented Jul 15, 2022

Remove kludges around using historyarchive.ArchiveBackend as a filesystem interface.

Currently, we're using historyarchive.ArchiveBackend as a way to access raw ledgers from the filesystem. This is pretty hacky since these aren't raw LedgerCloseMetas but rather a custom format on top of them (see #4450), so it doesn't make sense as a LedgerBackend anyway, plus we want to be able to use the version and maybe read other config data from the directory.

There's a little bit more context here; the general point is that we shouldn't hack in the ledgerbackend interface for Horizon Lite.

@Shaptic Shaptic mentioned this issue Jul 15, 2022
7 tasks
@Shaptic Shaptic changed the title Remove kludges around using historyarchive.ArchiveBackend as a filesystem interface. exp/lighthorizon: Refactor codebase to avoid using HistoryArchiveBackend as a filesystem interface Jul 15, 2022
@bartekn
Copy link
Contributor

bartekn commented Jul 18, 2022

@Shaptic I'm catching up after Starbridge so let me know if I understand this correctly: LedgerCloseMeta archives will be completely separate from existing archives generated by Stellar-Core. If it's true, maybe we should leave historyarchive.ArchiveBackend (and historyarchive package in general) as is and just create a new ledgermetaarchive package solely for new archive purposes.

Also, I think the design of ArchiveBackend as filesystem makes sense because it's used to access archives in different backends like http, s3 or local file system. I think this can be also true for new archives so in the end we could extract ArchiveBackend to a separate helper package shared by historyarchive and ledgermetaarchive.

WDYT?

@Shaptic
Copy link
Contributor Author

Shaptic commented Jul 18, 2022

LedgerCloseMeta archives will be completely separate from existing archives generated by Stellar-Core.

Yes, definitely separate and most likely storing add'l information or in a different format (i.e. not exclusively a serialized LedgerCloseMeta).

If it's true, maybe we should leave historyarchive.ArchiveBackend (and historyarchive package in general) as is and just create a new ledgermetaarchive package solely for new archive purposes.

That's a great idea! For context, though, the HistoryArchiveBackend and the FsCacheArchiveBackend were both written by @paulbellamy specifically for Light Horizon. That means we can remove them from the historyarchive package and refactor them in whatever form we'd like for the ledgermetaarchive package.

I do agree that we should (and must, in fact) keep it backend-agnostic. The main pain point is that ArchiveBackend.GetLedger() returns a LedgerCloseMeta, while we want it to return our format (for now called SerializedLedgerCloseMeta). So I'm not sure if extracting it entirely into a separate package is the right idea because of that, but I do like it in principle because we have this "backend-agnostic filesystem emulation" paradigm in quite a few places.

@bartekn
Copy link
Contributor

bartekn commented Jul 19, 2022

I do agree that we should (and must, in fact) keep it backend-agnostic. The main pain point is that ArchiveBackend.GetLedger() returns a LedgerCloseMeta

I think there's one misconception here. ArchiveBackend doesn't have GetLedger method, ArchiveInterface does. So we can reuse ArchiveBackend to read raw bytes from archives and then decode them into appropriate structs in higher level package.

@bartekn
Copy link
Contributor

bartekn commented Jul 19, 2022

🤔 So I spend a little more checking the latest code in lighthorizon branch and I think the current structure is fine (I got confused by ArchiveBackend vs ArchiveInterface case above):

  • FsCacheBackend is really just getting files using embedded ArchiveBackend with caching built-in.
  • HistoryArchiveBackend is implementing LedgerBackend. It's using ArchiveBackend simply for getting raw bytes and actual decoding happens inside HistoryArchiveBackend which looks good. The only thing I'd change is maybe a name because now it's easy to mistake it with normal history archives (maybe LedgerMetaArchiveBackend?).

I wonder if we should touch this at all. Eventually we could create the new package (ledgermetaarchive as mentioned above) that encapsulates the structure of files and also contains the code responsible for pushing the files (using one of the existing ArchiveBackends). Then the HistoryArchiveBackend is just a super thin adapter for ledgermetaarchive struct. But maybe such refactoring is not urgent now? We can have a quick chat if you want.

@Shaptic
Copy link
Contributor Author

Shaptic commented Jul 19, 2022

maybe LedgerMetaArchiveBackend

+1 👍

I got confused by ArchiveBackend vs ArchiveInterface case above

I think I did too, apologies for that 🤦 I think everywhere I said historyarchive.ArchiveBackend, I actually meant ingest.LedgerBackend.

I think the main concern is that LedgerBackend.GetLedger() returns xdr.LedgerCloseMeta when we probably want it to return xdr.SerializedLedgerCloseMeta so that we can easily add fields there in the future and have them accessible from Light Horizon. Of course this isn't critical now since there's only the v0 right now that returns only the meta.

But already for example it's slightly problematic that a repository of unpacked ledgers like the one @2opremio just made is not directly associated with a network (pubnet or testnet). The network passphrase isn't stored anywhere and it probably should be. Maybe not per ledger but definitely per repository. And we'd need a way to access this from Light Horizon, but the LedgerBackend interface is too... restricting? Maybe we can keep it implemented, but add more functionality?

We also have another Archive interface in exp/lighthorizon/archive that's a weird amalgamation of a subset of LedgerBackend and TransactionReader... It describes the minimum necessary for Light Horizon.

Basically, getting to a single sane state would be great 😂 As a whole, I'm not entirely sure about the best way to refactor this all, or if it's even necessary as you noted, but I want to throw out that there will be more requirements on the backend interface soon, so we should handle that sooner rather than later if we can. I think I'll leave it up to you, since your picture of the codebase is roughly the same as mine now 👍

@bartekn bartekn self-assigned this Jul 28, 2022
@bartekn
Copy link
Contributor

bartekn commented Aug 1, 2022

I think that #4488 implemented the changes we discussed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants