ESArchiver + System Indices == 🥰 or 😭 #83592

Closed
kobelb opened this issue Nov 17, 2020 · 17 comments
Labels
discuss Project:SystemIndices Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Operations Team label for Operations Team

Comments

@kobelb
Contributor

kobelb commented Nov 17, 2020

As part of the effort to switch to system indices, the Kibana Core team is planning on exposing a SystemIndices ES client that will be used to interact with Kibana's system indices.

Will ESArchiver be able to take advantage of this SystemIndices ES client, or will it also need to be independently updated to use the new ES APIs?

@elasticmachine
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

@kobelb kobelb mentioned this issue Nov 17, 2020
5 tasks
@spalger
Contributor

spalger commented Nov 17, 2020

It doesn't look like any of the code for that client has been written yet, so as long as it's implemented as a package then it shouldn't be a problem. If it's not a package we could probably hack a way to consume it, but it should definitely be written with the intention of being consumed outside of the standard Kibana environment if we want to be able to consume it from the ES Archiver as well.

We trigger the migration logic by pinging a URL that core exposes, so it might make more sense to just write a few APIs for the ES Archiver to hit in order to save/load an archive rather than writing the client in a way that's consumable outside of core.

@kobelb
Contributor Author

kobelb commented Nov 17, 2020

It's always felt a bit off to me that we use the ES Archiver to do test setup for the .kibana index, when we have saved-object HTTP APIs for importing saved-objects. Do you happen to know whether there are reasons why we would have to access the indices directly?

@spalger
Contributor

spalger commented Nov 18, 2020

We've thought through it a couple of times but always come up with a familiar-feeling reason it won't work... I agree that it would be ideal if we could just use the APIs, but I can't remember what it was that stopped us from doing that at least twice in the past... @LeeDr do you happen to remember?

@rudolf
Contributor

rudolf commented Nov 19, 2020

Using the import API would allow core to remove some hacks that allow esArchiver to migrate documents that were inserted into .kibana after Kibana is in the start phase. It would also solve #41352.

If we were to convert esArchiver archives into saved object export ndjson files, we would have to migrate all the documents in these archives. We've sort of been relying on these old documents as an implicit test for plugins' migration transform functions, but these should rather be replaced with intentional/explicit tests.

@pgayvallet
Contributor

pgayvallet commented Nov 19, 2020

> If it's not a package we could probably hack a way to consume it,

From the current state of the discussion, it's probably going to be a custom transport on the new ES client that basically prefixes every request with /.kibana (or any other system namespace). Doing so inside the archiver should be fairly easy (the hardest part would probably be migrating to the new ES client). Extracting a part of core into a package is always tedious work, and even if I'm not opposed to it, I would really like us to weigh the pros and cons before doing so.
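To make the "prefix every request" idea concrete, here is a minimal sketch in TypeScript. It is not the real ES client transport API: `RequestParams`, `Send`, `prefixPath` and `withSystemPrefix` are hypothetical names, and the actual implementation would depend on how the system indices client ends up being written.

```typescript
// Hypothetical request shape, NOT the real ES client transport types.
interface RequestParams {
  method: string;
  path: string;
  body?: unknown;
}

// Hypothetical "send" function standing in for whatever performs the request.
type Send<T> = (params: RequestParams) => T;

// Join a prefix and a path with exactly one slash between them.
function prefixPath(prefix: string, path: string): string {
  const cleanPrefix = '/' + prefix.replace(/^\/+|\/+$/g, '');
  const cleanPath = path.replace(/^\/+/, '');
  return cleanPath === '' ? cleanPrefix : `${cleanPrefix}/${cleanPath}`;
}

// Wrap a send function so that every request goes out on the prefixed path.
function withSystemPrefix<T>(send: Send<T>, prefix: string): Send<T> {
  return (params) => send({ ...params, path: prefixPath(prefix, params.path) });
}
```

If the logic really is this small, duplicating it in the archiver would amount to wrapping whatever function the archiver already uses to send requests.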

> It's always felt a bit off to me that we use the ES Archiver to do test setup for the .kibana index, when we have saved-object HTTP APIs for importing saved-objects.

I overall agree, and think we should really expose a new FTR service for that. SO test datasets should be in the export (ndjson) format, and imported using the SO APIs. That would also allow us to get rid of the atrocious, 100-times-duplicated mappings files in the FTR suites (which have a LOT of inconsistencies), and avoid having to adapt these mapping files when backporting to 7.x. In 99% of the cases (except when testing migrations), the mappings should be generated by the SO service/migrator itself, as done in a 'real' environment. It would also make creating new datasets easier, as I would be able to just export the data I need using the UI.

Now, that's always easier to say than to do...

  • All calls to load SOs will need to be performed AFTER the test kibana instance is ready. Not sure that's the case?
  • (follow-up) I suspect there might be some edge cases where we need the objects to be present in the index before kibana actually starts. Could this be the case with spaces or config objects for example?
  • In xpack 'mode', RBAC is plugged on the SO apis. Would that be an issue?
  • When importing data to multiple spaces, I think we would need to perform multiple imports? Meaning that we would have to split the dataset into multiple ndjson files?
  • We still need a way to test SO type migrations (we do, right?), and I guess having functional tests for that makes more sense than only unit-testing them? In that case, we would still need to adapt the archiver to play nice with system indices.
  • And, last but not least, as @rudolf mentioned, we would need to migrate all the existing datasets to this new format. Given the volume, it means creating a tool to do so.
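As a rough illustration of what such a conversion tool might do, the sketch below turns archive records into export-style ndjson, producing one file per space so each can be imported separately. Both shapes are assumptions made for this example: `ArchiveDoc` assumes archive records carry a raw `.kibana` document whose id looks like `<type>:<id>` or `<namespace>:<type>:<id>`, and `ExportedObject` is a simplified guess at an export line. It does not run any migrations, so old documents would still need to be migrated on import.

```typescript
// Assumed shape of one esArchiver `doc` record for the .kibana index.
interface ArchiveDoc {
  type: 'doc';
  value: {
    index: string;
    id: string; // assumed raw id: "<type>:<id>" or "<namespace>:<type>:<id>"
    source: Record<string, unknown>;
  };
}

// Simplified, assumed shape of one line in a saved-object export file.
interface ExportedObject {
  id: string;
  type: string;
  attributes: unknown;
  references: unknown[];
}

function toExportedObject(doc: ArchiveDoc): { space: string; object: ExportedObject } {
  const source = doc.value.source;
  const type = String(source['type']);
  const rawNamespace = source['namespace'];
  const namespace = typeof rawNamespace === 'string' ? rawNamespace : undefined;

  // Strip the optional "<namespace>:" prefix, then the "<type>:" prefix.
  let id = doc.value.id;
  const prefixes = namespace ? [`${namespace}:`, `${type}:`] : [`${type}:`];
  for (const prefix of prefixes) {
    if (id.indexOf(prefix) === 0) id = id.slice(prefix.length);
  }

  const attributes = source[type] !== undefined ? source[type] : {};
  const rawReferences = source['references'];
  const references = Array.isArray(rawReferences) ? rawReferences : [];
  return { space: namespace || 'default', object: { id, type, attributes, references } };
}

// Returns a map of space id to ndjson file content (one object per line).
function archiveToNdjsonBySpace(docs: ArchiveDoc[]): Record<string, string> {
  const lines: Record<string, string[]> = {};
  for (const doc of docs) {
    const { space, object } = toExportedObject(doc);
    const bucket = lines[space] || [];
    bucket.push(JSON.stringify(object));
    lines[space] = bucket;
  }
  const files: Record<string, string> = {};
  for (const space of Object.keys(lines)) {
    files[space] = (lines[space] || []).join('\n') + '\n';
  }
  return files;
}
```

Splitting by space here reflects the open question above about needing one import call per space; if a single import could target multiple spaces, the grouping step would be unnecessary.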

@pgayvallet pgayvallet added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Nov 19, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@pgayvallet
Contributor

Adding the team for the part of the discussion around the SO-based alternative

@kobelb
Copy link
Contributor Author

kobelb commented Nov 19, 2020

If we do end up deciding that in the long term we should no longer be using ESArchiver to set up the Kibana system-indices, is my understanding correct that this would likely be a large effort? If so, are there any short-term solutions we should explore besides exposing a "package" that allows ESArchiver to re-use the approach we intend to take in core?

@pgayvallet
Contributor

> If we do end up deciding that in the long-term we should no longer be using ESArchiver to setup the Kibana system-indices, is my understanding correct that this would likely be a large effort?

It would need some preliminary code inspection to see if this is large or very large, but yea, that would not be a minor change.

> If so, are there any short-term solutions we should explore besides exposing a "package" that allows ESArchiver to re-use the approach we intend to take in core?

The only alternative I see would be to duplicate this system-indices logic to the es-archiver.

It will depend on the custom transport's implementation (and whether it is enough or not), but if the system-indices specific code is small enough, and if we all agree that mid/long term we'd rather have a new service to use the SO API from the archiver, I would rather duplicate this small bit of code in the archiver than extract code from core just for that.

Extracting this (yet to be written) system-indices client from core would mean extracting the client code that currently lives in src/core/server/elasticsearch/client, which would also require extracting the elasticsearch service's config and schema to the same package (or refactoring the code to remove the coupling between the server and the client's configuration type). Been there, done that, for both the config and the logging services, and I can assure you, that's something that should only be done in case of extreme necessity.

That's just my personal opinion though, would like to see what the other members of @elastic/kibana-core think about it.

We also might want to wait until we actually implement this system indices client in core before continuing the discussion about the options regarding the archiver.

@kobelb
Contributor Author

kobelb commented Nov 19, 2020

> It will depend on the custom transport's implementation (and if it is enough or not), but in the case the system-indices specific code is small enough, and if we all agree that mid/long term we'd rather have a new service to use the SO API from the archiver, I would rather duplicate this small bit of code in the archiver rather than extracting code from core just for that.

👍

@spalger
Contributor

spalger commented Nov 19, 2020

Yeah, it sounds like reimplementing the logic necessary for writing to the system indices in the esArchiver will be pretty simple and it's not worth it to figure out how to share the code.

@rudolf
Contributor

rudolf commented Nov 19, 2020

I agree this is at least a medium size effort and hopefully we can spend some time to make the developer experience better while working on it.

> The only alternative I see would be to duplicate this system-indices logic to the es-archiver.

Agreed.

> All calls to load SOs will need to be performed AFTER the test kibana instance is ready. Not sure that's the case?

We might be able to solve this with #2310

@LeeDr

LeeDr commented Jan 14, 2021

I'm digging my way back through my email and came across this. The functional tests use EsArchiver for 2 kinds of data:

  1. test data like logstash_functional, etc., which is not a system index kind of thing - this could potentially be replaced with snapshots
  2. .kibana index data. Back when we started automating functional tests, the saved object import/export didn't include everything in the .kibana index. I'm not sure whether that's still the case? Snapshots could potentially be used here as well?

I have never dug too deep into using snapshots with functional tests. I know years ago I thought configuring snapshots was pretty cumbersome, but I'm sure it could be automated. I think we'd still want to store them in the kibana repo so that they would be version controlled along with the tests. So a local file system snapshot restore.
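For reference, automating a local file system snapshot restore mostly comes down to two Elasticsearch requests: registering an `fs` repository and restoring a snapshot from it. The sketch below only builds those request descriptions; `EsRequest`, `registerFsRepository` and `restoreSnapshot` are hypothetical helper names, and the repository location is assumed to be listed under `path.repo` in the Elasticsearch config of the test cluster.

```typescript
// Hypothetical description of a request to send to Elasticsearch.
interface EsRequest {
  method: 'PUT' | 'POST';
  path: string;
  body: Record<string, unknown>;
}

// Register a read-only shared file system repository.
// `location` is assumed to be allowed via `path.repo` on every node.
function registerFsRepository(repo: string, location: string): EsRequest {
  return {
    method: 'PUT',
    path: `/_snapshot/${encodeURIComponent(repo)}`,
    body: { type: 'fs', settings: { location, readonly: true } },
  };
}

// Restore the given index patterns from one snapshot, leaving cluster state alone.
function restoreSnapshot(repo: string, snapshot: string, indices: string[]): EsRequest {
  return {
    method: 'POST',
    path: `/_snapshot/${encodeURIComponent(repo)}/${encodeURIComponent(snapshot)}/_restore`,
    body: { indices: indices.join(','), include_global_state: false },
  };
}
```

A test setup step could then send `registerFsRepository('kibana_fixtures', '/path/in/repo')` once and `restoreSnapshot(...)` per suite; both names and paths here are placeholders.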

@rudolf
Contributor

rudolf commented Jan 15, 2021

> .kibana index data. Back when we started automating functional tests, the saved object import/export didn't include everything in the .kibana index. I'm not sure whether that's still the case? Snapshots could potentially be used here as well?

Yes, this is still true, but all saved objects will become importable/exportable as part of removing kibana.index #81536

Snapshots could work. The only downside is that they're not easily inspectable, so if you make a change and update the snapshot, it won't be possible to code review the snapshot changes.

@spalger
Contributor

spalger commented Jan 15, 2021

Using a filesystem repository for snapshots would mean that our tests would no longer be compatible with cloud, so I don't think that's going to work unless we also add the ability to get those snapshots in cloud somehow...

@kobelb
Contributor Author

kobelb commented Jan 29, 2021

With the introduction of external unmanaged system-indices, the question has been answered: ESArchiver + System Indices == 🥰

Closing this issue in favor of #89805

@kobelb kobelb closed this as completed Jan 29, 2021
6 participants