Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add auto create action #55858

Merged
merged 15 commits into from
May 4, 2020
Merged

Conversation

martijnvg
Copy link
Member

@martijnvg martijnvg commented Apr 28, 2020

Currently the TransportBulkAction detects whether an index is missing and
then decides whether it should be auto created. The coordination of the
index creation also happens in the TransportBulkAction on the coordinating node.

This change adds a new transport action that the TransportBulkAction delegates to
if missing indices need to be created. The reasons for this change:

  • Auto creation of data streams can't occur on the coordinating node.
    Based on the index template (v2) either a regular index or a data stream should be created.
    However if the coordinating node is slow in processing cluster state updates then it may be
    unaware of the existence of certain index templates, which then can load to the
    TransportBulkAction creating an index instead of a data stream. Therefor the coordination of
    creating an index or data stream should occur on the master node. See Auto create data streams using index templates v2 #55377
  • From a security perspective it is useful to know whether index creation originates from the
    create index api or from auto creating a new index via the bulk or index api. For example
    a user would be allowed to auto create an index, but not to use the create index api. The
    auto create action will allow security to distinguish these two different patterns of
    index creation.

This change adds the following new transport actions:

  • AutoCreateAction, the TransportBulkAction redirects to this action and this action will actually create the index (instead of the TransportCreateIndexAction). Later via Auto create data streams using index templates v2 #55377, can improve the AutoCreateAction to also determine whether an index or data stream should be created.

The create_index index privilege is also modified, so that if this permission is granted then a user is also allowed to auto create indices. This change does not yet add an auto_create index privilege. A future change can introduce this new index privilege or modify an existing index / write index privilege.

Currently the TransportBulkAction detects whether an index is missing and
then decides whether it should be auto created. The coordination of the
index creation also happens in the TransportBulkAction on the coordinating node.

This change adds a new transport action that the TransportBulkAction delegates to
if missing indices need to be created. The reasons for this change:
* Auto creation of data streams can't occur on the coordinating node.
  Based on the index template (v2) either a regular index or a data stream should be created.
  However if the coordinating node is unaware of certain index templates then the TransportBulkAction
  could create an index instead of a data stream. Therefor the decision of whether an index or
  data stream should be created should happen on the master node. See elastic#55377
* From a security perspective it is useful to know whether index creation originates from the
  create index api or from auto creating a new index via the bulk or index api. For example
  a user would be allowed to auto create an index, but not to use the create index api. The
  auto create action will allow security to distinguish these two different patterns of
  index creation.
@martijnvg martijnvg added >non-issue :Data Management/Indices APIs APIs to create and manage indices and templates v8.0.0 v7.8.0 labels Apr 28, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Indices APIs)

…IndexAction.

This is needed, because AutoCreateAction will make the decision whether to auto create an data stream
or index, and AutoCreateIndexAction really auto creates the index. Future change will add
AutoCreateDataStream action. If this logic is in a single action then we can't distingues
whether a data stream or index is being auto created, which is important because a user may be
able to auto create a data stream, but not an index or visa versa.
@martijnvg
Copy link
Member Author

@elasticmachine run elasticsearch-ci/1

@martijnvg
Copy link
Member Author

@elasticmachine run elasticsearch-ci/docs

@martijnvg martijnvg marked this pull request as ready for review April 29, 2020 13:09
Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @martijnvg , I did an initial review and added a few comments, primarily the one we already discussed on collapsing the two transport actions into one.

Exception suppressed = null;
for (Map.Entry<String, Exception> entry : response.getFailureByNames().entrySet()) {
Exception e = entry.getValue();
if (e == null || ExceptionsHelper.unwrapCause(e) instanceof ResourceAlreadyExistsException) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nicer to handle the ResourceAlreadyExistsException in the auto-create action.

createIndexResponse -> {
// Maybe a bit overkill to ensure visibility of results map across threads...
synchronized (results) {
results.put(name, null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The map is in the response named "failureByNames". I think we should either not add successful indices into it, rename the map or separate the two parts in the response.

createIndexRequest.cause(request.getCause());
createIndexRequest.masterNodeTimeout(request.masterNodeTimeout());
createIndexRequest.preferV2Templates(request.getPreferV2Templates());
client.execute(AutoCreateIndexAction.INSTANCE, createIndexRequest, ActionListener.wrap(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By delegating to a separate transport action, we risk that the retry on no longer master happens inside that transport action, making any decision we make on cluster state here invalid (or at least potentially stale). I would much prefer to do the cluster state update here.

That way we could also create all the indices and streams in one go, reducing the number of cluster states published.

I think auto-create does not have to be specific to data stream or index. It goes together with the document level privileges which are also agnostic of whether the data ends up in a data stream or an index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think auto-create does not have to be specific to data stream or index. It goes together with the document level privileges which are also agnostic of whether the data ends up in a data stream or an index.

Yes and this is another reason to keep all the auto creation logic in a single action.

@martijnvg martijnvg marked this pull request as draft May 3, 2020 20:15
martijnvg added 3 commits May 3, 2020 22:28
* One action for auto creating indices and in the future also auto create data streams
* The TransportBulkAction just executes the auto create action like the create index action.
* Avoid the complexity of auto creating indices in a single request / cluster state update.
  This adds quite some complexity while the benefits are likely only noticeable in edge cases
  (if many indices get auto created)) Also on the security side, authorization of the auto
  create indices would become much more complex compared to authorization of auto creating
  an index one at a time.
@martijnvg
Copy link
Member Author

I've changed this PR, so that there is now a single auto create action that auto creates an index. This can then later be used to also auto create data streams. The auto create action also now handle only a single index at a time instead of multiple indices. The additional complexity isn't worth the benefit from auto creating multiple indices in a single remote call / cluster state update. Also this would require additional logic in security to deny / allow specific indices from being auto created.

@martijnvg martijnvg marked this pull request as ready for review May 4, 2020 08:38
@martijnvg martijnvg requested a review from henningandersen May 4, 2020 08:38
Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@martijnvg martijnvg merged commit 0a4428c into elastic:master May 4, 2020
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 4, 2020
Backport of elastic#55858 to 7.x branch.

Currently the TransportBulkAction detects whether an index is missing and
then decides whether it should be auto created. The coordination of the
index creation also happens in the TransportBulkAction on the coordinating node.

This change adds a new transport action that the TransportBulkAction delegates to
if missing indices need to be created. The reasons for this change:

* Auto creation of data streams can't occur on the coordinating node.
Based on the index template (v2) either a regular index or a data stream should be created.
However if the coordinating node is slow in processing cluster state updates then it may be
unaware of the existence of certain index templates, which then can load to the
TransportBulkAction creating an index instead of a data stream. Therefor the coordination of
creating an index or data stream should occur on the master node. See elastic#55377

* From a security perspective it is useful to know whether index creation originates from the
create index api or from auto creating a new index via the bulk or index api. For example
a user would be allowed to auto create an index, but not to use the create index api. The
auto create action will allow security to distinguish these two different patterns of
index creation.
This change adds the following new transport actions:

AutoCreateAction, the TransportBulkAction redirects to this action and this action will actually create the index (instead of the TransportCreateIndexAction). Later via elastic#55377, can improve the AutoCreateAction to also determine whether an index or data stream should be created.

The create_index index privilege is also modified, so that if this permission is granted then a user is also allowed to auto create indices. This change does not yet add an auto_create index privilege. A future change can introduce this new index privilege or modify an existing index / write index privilege.

Relates to elastic#53100
martijnvg added a commit that referenced this pull request May 4, 2020
Backport of #55858 to 7.x branch.

Currently the TransportBulkAction detects whether an index is missing and
then decides whether it should be auto created. The coordination of the
index creation also happens in the TransportBulkAction on the coordinating node.

This change adds a new transport action that the TransportBulkAction delegates to
if missing indices need to be created. The reasons for this change:

* Auto creation of data streams can't occur on the coordinating node.
Based on the index template (v2) either a regular index or a data stream should be created.
However if the coordinating node is slow in processing cluster state updates then it may be
unaware of the existence of certain index templates, which then can load to the
TransportBulkAction creating an index instead of a data stream. Therefor the coordination of
creating an index or data stream should occur on the master node. See #55377

* From a security perspective it is useful to know whether index creation originates from the
create index api or from auto creating a new index via the bulk or index api. For example
a user would be allowed to auto create an index, but not to use the create index api. The
auto create action will allow security to distinguish these two different patterns of
index creation.
This change adds the following new transport actions:

AutoCreateAction, the TransportBulkAction redirects to this action and this action will actually create the index (instead of the TransportCreateIndexAction). Later via #55377, can improve the AutoCreateAction to also determine whether an index or data stream should be created.

The create_index index privilege is also modified, so that if this permission is granted then a user is also allowed to auto create indices. This change does not yet add an auto_create index privilege. A future change can introduce this new index privilege or modify an existing index / write index privilege.

Relates to #53100
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jun 3, 2020
The create index action name (`indices:admin/create`) can no longer be used to grant privileges to auto create indices and
instead the `create_index` builtin privilege should be used.
@martijnvg
Copy link
Member Author

I've added the breaking label, because the privilege indices:admin/create can no longer be used to allow auto creation of indices and create_index should be used instead.

The indices:admin/create is a privilege based on an internal transport action and we don't provide backwards compatibility guarantees on transport action names being used as privilege and instead the named privileges should be used.

However there is usage of indices:admin/create privilege and therefor I'm marking this pr as breaking.

martijnvg added a commit that referenced this pull request Jun 4, 2020
The create index action name (`indices:admin/create`) can no longer be used to grant privileges to auto create indices and instead the `create_index` builtin privilege should be used.

Relates to #55858

Co-authored-by: Jake Landis <jake.landis@elastic.co>
martijnvg added a commit that referenced this pull request Jun 4, 2020
The create index action name (`indices:admin/create`) can no longer be used to grant privileges to auto create indices and instead the `create_index` builtin privilege should be used.

Relates to #55858

Co-authored-by: Jake Landis <jake.landis@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants