replace direct access of hidden indices with system indices api #12279

Merged
merged 9 commits into from
Oct 6, 2020

Conversation

kaisecheng
Contributor

What does this PR do?

Replace direct access to Elasticsearch hidden indices with the Elasticsearch system indices API.

Why is it important/What is the impact to the user?

Elasticsearch has allowed other services to manipulate hidden indices directly for a long time. Recently, the ES team introduced the System Indices and Hidden Indices concepts to replace indices that start with a dot, e.g. .logstash. These dot indices are an implementation detail that users should not interact with. Therefore, ES restricts access by introducing a new RESTful API, the system indices API. (elastic/elasticsearch#50251)

A user who has the cluster permission manage_logstash_pipeline can call the system indices API.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

related to elastic/elasticsearch#50251 elastic/elasticsearch#53350

Use cases

Screenshots

Logs

Contributor

@roaksoax roaksoax left a comment

I've done a quick look over the PR and while it seems sane, there's one thing that wasn't considered (which Joao brought to my attention).

Nothing stops Logstash from using an older version of ES (< 7.10), and if that were to happen, logstash would fail to communicate. So this needs to be adapted to use the system index API starting from stack version 7.10+. For 7.9 and before, we use the old way.
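
The version gate described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the constant `SYSTEM_INDICES_MIN_VERSION`, the helper `fetcher_kind_for`, and the returned symbols are hypothetical names chosen for illustration. It uses `Gem::Version` because a plain string comparison would sort "7.9" after "7.10".

```ruby
require "rubygems"

# Hypothetical constant: first stack version assumed to expose the system indices API.
SYSTEM_INDICES_MIN_VERSION = Gem::Version.new("7.10.0")

# Hypothetical helper: maps an Elasticsearch version string to the kind of
# fetcher to use. Suffixes such as "-SNAPSHOT" are stripped first, because
# Gem::Version would otherwise treat them as pre-releases that sort lower.
def fetcher_kind_for(version_number)
  release = Gem::Version.new(version_number.split("-").first)
  if release >= SYSTEM_INDICES_MIN_VERSION
    :system_indices
  else
    :legacy_hidden_indices
  end
end
```

Under these assumptions, `fetcher_kind_for("7.9.3")` selects the legacy path and `fetcher_kind_for("8.0.0-SNAPSHOT")` selects the system indices path.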

@roaksoax roaksoax requested review from jsvd and removed request for robbavey September 29, 2020 13:00
Contributor

@roaksoax roaksoax left a comment

This is not a full review, as I'm not a Ruby expert, but I like the implementation path you took.

That said, I think you should add comments to the code that explain what's going on, to make it easier for reviewers and other readers of the code.

@kaisecheng
Contributor Author

jenkins test this please

raise RemoteConfigError, "Cannot find elasticsearch version, server returned status: `#{response["status"]}`, message: `#{response["error"]}`"
end

logger.debug("Elasticsearch version ", response["version"]["number"])
Member

As it stands, this line won't output the value of `response["version"]["number"]`. You can fix this with one of:
`logger.debug("Elasticsearch version {}", response["version"]["number"])`,
`logger.debug("Elasticsearch version #{response['version']['number']}")`, or
`logger.debug("Elasticsearch version ", :version => response["version"]["number"])`

You might also want to add some context as to what is happening, such as "Reading configuration from Elasticsearch version..."
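
As an illustration of the interpolation variant with the extra context, here is a minimal sketch. It uses Ruby's standard library `Logger` writing to a `StringIO` as a stand-in for the Logstash logger (an assumption made so the snippet runs on its own), and the helper name `log_es_version` is hypothetical.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: interpolates the version into the message so the
# value is part of the logged text.
def log_es_version(logger, response)
  version = response["version"]["number"]
  logger.debug("Reading configuration from Elasticsearch version #{version}")
end

# Stand-in logger that captures output in memory.
output = StringIO.new
logger = Logger.new(output)
logger.level = Logger::DEBUG

log_es_version(logger, { "version" => { "number" => "7.10.0" } })
```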

@@ -50,6 +49,21 @@ def config_conflict?
false
end

# decide whether to use the system indices api (7.10+) or the legacy api (< 7.10) based on the elasticsearch server version
def pipeline_fetcher_factory
Member

Nit: Maybe a different name for the method, such as get_pipeline_fetcher?

end
end

# TODO clean up LegacyHiddenIndicesFetcher when 7.9.* is deprecated
Member

Let's create an issue to remove this, and put a link in here

client.get("#{SYSTEM_INDICES_API_PATH}/#{path_ids}")
end

def format_response(response)
Member

Do we need to expose this method, which is always required by the caller? Or could the fetch_config method return the formatted response?

Contributor Author

I will put format_response to fetch_config

@@ -63,33 +77,27 @@ def pipeline_configs
end
end

response = fetch_config(pipeline_ids)
fetcher = pipeline_fetcher_factory
Member

Nit: Maybe call this method get_pipeline_fetcher rather than factory?

client.get("#{SYSTEM_INDICES_API_PATH}/#{path_ids}")
end

def format_response(response)
Member

Do we need to expose this method, or could it be done as part of fetch_config?


if response["found"] == false
def get_pipeline(pipeline_id, response, fetcher)
if response.has_key?(pipeline_id) == false
Member

Nit: Consider using unless instead of if X == false, eg unless response.has_key?(pipeline_id)
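
A small sketch of the `unless` form suggested here. The method name `find_pipeline!` and the use of `KeyError` are hypothetical stand-ins, not the error handling in this PR; `key?` is the shorter alias of `has_key?`.

```ruby
# Hypothetical helper: returns the entry for pipeline_id, or raises when the
# response does not contain it. KeyError is only a stand-in error here.
def find_pipeline!(response, pipeline_id)
  unless response.key?(pipeline_id)
    raise KeyError, "no configuration found for pipeline `#{pipeline_id}`"
  end

  response[pipeline_id]
end
```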

@@ -193,5 +186,63 @@ def client
@client ||= build_client
end
end

class SystemIndicesFetcher
Member

I wonder if this might be simpler if we kept the response object in this class, and had methods like 'config_exists?(pipeline_id)', get_pipeline_config(pipeline_id) and get_pipeline_settings(pipeline_id). This would avoid having to pass around the response and fetcher objects
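
To make the suggestion concrete, a rough sketch of such a class follows. The class name `FetchedPipelineConfigs`, the `"pipeline_settings"` key, and the assumption that the response is a hash keyed by pipeline id are all hypothetical; only the three method names come from the comment above.

```ruby
# Hypothetical wrapper that keeps the response, so callers do not need to
# pass the response and fetcher objects around.
class FetchedPipelineConfigs
  # response: assumed to be a Hash keyed by pipeline id.
  def initialize(response)
    @response = response
  end

  def config_exists?(pipeline_id)
    @response.key?(pipeline_id)
  end

  # Returns the pipeline definition string for the given id.
  def get_pipeline_config(pipeline_id)
    @response.fetch(pipeline_id)["pipeline"]
  end

  # Returns the settings hash, or an empty hash when none are stored.
  # The "pipeline_settings" key is an assumption for this sketch.
  def get_pipeline_settings(pipeline_id)
    @response.fetch(pipeline_id).fetch("pipeline_settings", {})
  end
end
```

A caller would then write something like `configs.get_pipeline_config("main")` after checking `configs.config_exists?("main")`.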

Contributor Author

The question for me is whether we want to refactor the existing code further, or whether the goal is to apply the new API in a manageable way. This involves thirty lines of code, mainly moving get_pipeline and the fetcher in an OO way, which is not a big change. At the same time, the readability of the current version is quite similar to the existing one. I am open to the suggestion. Do you think we should refactor the code?

Member

I'm not sure there is a huge amount of refactoring to the existing code either way, beyond what is already present. You already have the method fetcher.get_single_pipeline_setting(response, pipeline_id)["pipeline"], which could change to something like fetched_config.get_pipeline_settings(pipeline_id), although I do realize that there is more work to be done in the extra classes that you have added.

I'm comfortable either way; what you have appears to be functionally correct after running this code locally against Elasticsearch 7.9 and 8.0.

Member

@robbavey robbavey left a comment

LGTM.

@kaisecheng kaisecheng merged commit 999601c into elastic:master Oct 6, 2020
kaisecheng added a commit that referenced this pull request Oct 6, 2020
* replace direct hidden indices access with system indices api

* fulfill backward compatibility

* fix log msg, rename class, simplify response handling

* modularise fetcher
3 participants