Add require_data_stream feature #101872

Merged (34 commits) on Jan 18, 2024

eyalkoren
Contributor

Closes #97032

This PR adds the ability to set a boolean require_data_stream parameter on several APIs.
For document indexing and update requests, the flag also affects auto-creation of the underlying index: if it is set to true, an index is created only if a matching index template is found and that template contains a data stream template.

@eyalkoren eyalkoren added the >feature, :Data Management/Data streams, and v8.12.0 labels Nov 7, 2023
@eyalkoren eyalkoren self-assigned this Nov 7, 2023
@elasticsearchmachine elasticsearchmachine added the external-contributor label Nov 7, 2023
@elasticsearchmachine
Collaborator

Hi @eyalkoren, I've created a changelog YAML for you.

axw added a commit to axw/apm-server that referenced this pull request Nov 17, 2023
We'll switch to setting require_data_stream when
it's available:
elastic/elasticsearch#101872
Member

@dakrone dakrone left a comment

This looks like a good general direction to me. I think we should check whether writing to an alias that points to a data stream still works with the require_data_stream flag (and, potentially, whether it should). I left only a few minor comments since this is still a draft PR.

@felixbarny felixbarny assigned dakrone and unassigned eyalkoren Dec 8, 2023
@dakrone dakrone changed the title from "[WIP] Adding require_data_stream feature" to "Add require_data_stream feature" Dec 11, 2023
@dakrone dakrone marked this pull request as ready for review December 11, 2023 15:40
@elasticsearchmachine elasticsearchmachine added the Team:Data Management label Dec 11, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@dakrone dakrone removed the request for review from jbaiera December 11, 2023 15:41
@dakrone dakrone requested a review from jbaiera January 9, 2024 23:11
@dakrone
Member

dakrone commented Jan 9, 2024

Alright, I've pulled out the update support, and made this flag mean "if require_data_stream is set, the indexing operation needs to be targeting a data stream, or a data stream that will be created by a template". I've also added more tests for this. Hopefully it's clearer and not too bad to review.
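
To make the rule in the comment above concrete, here is a toy Python model of the check. It is only a sketch and not the Elasticsearch implementation: the IndexTemplate class, the write_allowed function, and the "highest priority matching template wins" selection are illustrative assumptions, and aliases (raised earlier in the review) are deliberately not modeled.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class IndexTemplate:
    """Hypothetical stand-in for an index template (not an Elasticsearch class)."""
    pattern: str                   # index pattern, e.g. "logs-*"
    priority: int = 0              # assumed: highest priority wins among matches
    has_data_stream: bool = False  # True if it contains a data stream template


def write_allowed(target, data_streams, indices, templates, require_data_stream):
    """Toy check: may an indexing operation on `target` proceed?"""
    if not require_data_stream:
        return True    # the flag is opt-in; this check does nothing without it
    if target in data_streams:
        return True    # the target already is a data stream
    if target in indices:
        return False   # the target exists, but as a regular index
    # The target does not exist yet: would auto-creation produce a data stream?
    matching = [t for t in templates if fnmatchcase(target, t.pattern)]
    if not matching:
        return False   # no template, so no data stream would be created
    winner = max(matching, key=lambda t: t.priority)
    return winner.has_data_stream
```

With templates [IndexTemplate("logs-*", 100, True), IndexTemplate("*", 0, False)] and no existing indices or data streams, a flagged write to "logs-app" passes in this model, while a flagged write to "metrics-app" is rejected.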

Member

@jbaiera jbaiera left a comment

LGTM! 🚢

@dakrone
Member

dakrone commented Jan 17, 2024

@elasticmachine update branch

@dakrone dakrone merged commit 6f4e293 into elastic:main Jan 18, 2024
15 checks passed
@axw
Member

axw commented Jan 22, 2024

🎉 thank you @dakrone and @eyalkoren!

@leandrojmp
Contributor

leandrojmp commented Mar 26, 2024

Hello @dakrone and @eyalkoren

Does this change have any impact when using the Logstash elasticsearch output to index data into data streams that have a custom naming pattern?

We are planning an upgrade to 8.13 and are checking the release notes. We have Logstash writing to a custom data stream. Since Logstash does not support custom data stream names, we set data_stream => false, point the index option at the data stream alias name, and set the action to create.

Our Logstash outputs look like this:

output {
    elasticsearch {
        hosts => ["HOSTS"]
        index => "data-stream-name"
        action => "create"
        http_compression => true
        data_stream => false
        manage_template => false
        ilm_enabled => false
        cacert => 'ca.crt'
        user => 'USER'
        password => 'PASSWORD'
    }
}

I am not sure whether this change will break this setup.

@axw
Member

axw commented Mar 27, 2024

@leandrojmp this is an opt-in feature, so it will have no effect on your use case. You would need to reconfigure Logstash to pass ?require_data_stream=true in the _bulk API requests for it to have any effect.
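
For readers who do want to opt in from their own client, the following Python sketch shows how a _bulk request carrying ?require_data_stream=true could be assembled. It only builds the URL and the NDJSON body and sends nothing. The helper names bulk_url and bulk_body and the host http://localhost:9200 are placeholders invented for illustration; the target name is the one from the Logstash config above.

```python
import json
from urllib.parse import urlencode, urljoin


def bulk_url(base_url, require_data_stream=False):
    """Build the _bulk endpoint URL; add the opt-in flag only when requested."""
    url = urljoin(base_url.rstrip("/") + "/", "_bulk")
    if require_data_stream:
        url += "?" + urlencode({"require_data_stream": "true"})
    return url


def bulk_body(target, docs):
    """Serialize docs as NDJSON `create` actions aimed at `target`."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {"_index": target}}))  # action line
        lines.append(json.dumps(doc))                             # source line
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline


# Placeholder host; nothing is sent over the network here.
url = bulk_url("http://localhost:9200", require_data_stream=True)
body = bulk_body("data-stream-name", [{"message": "hello"}])
```

Without the query parameter the URL is a plain _bulk endpoint, which matches the point above that existing clients are unaffected unless they pass the flag.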

Labels
:Data Management/Data streams (Data streams and their lifecycles)
external-contributor (Pull request authored by a developer outside the Elasticsearch team)
>feature
Team:Data Management (Meta label for data/management team)
v8.13.0
Development

Successfully merging this pull request may close these issues.

Option to prevent auto-creating index with no matching index template
9 participants