[ingest] Default ingest node pipeline via index template fails to apply for first document. #32758

Closed
jakelandis opened this issue Aug 9, 2018 · 1 comment · Fixed by #32786
Assignees
Labels
>bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP

Comments

@jakelandis
Contributor

An ingest pipeline set as the default through an index template is not applied to the very first document indexed.

curl -XDELETE "http://localhost:9200/test"
curl -XPUT "http://localhost:9200/_ingest/pipeline/test-pipeline" -H 'Content-Type: application/json' -d'
{
  "description" : "test pipeline",
  "processors" : [
    {
      "set" : {
        "field" : "foo",
        "value" : "bar"
      }
    }
  ]
}'
curl -XPUT "http://localhost:9200/_template/template_test" -H 'Content-Type: application/json' -d'
{
    "index_patterns": ["test"],
    "settings": {
        "index.default_pipeline": "test-pipeline"
    }
}'
curl -XPOST "http://localhost:9200/test/_doc/1" -H 'Content-Type: application/json' -d'
{
  "a" : "b"
}'
curl -XPOST "http://localhost:9200/test/_refresh"
curl -XGET "http://localhost:9200/test/_search?pretty"
curl -XPOST "http://localhost:9200/test/_doc/2" -H 'Content-Type: application/json' -d'
{
  "a" : "b"
}'
curl -XPOST "http://localhost:9200/test/_refresh"
curl -XGET "http://localhost:9200/test/_search?pretty"

results in

"hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "a" : "b"
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "a" : "b",
          "foo" : "bar"
        }
      }
    ]
  }

^^ note that the set pipeline did not take effect on the first document.

It works fine when creating the index without a template:

curl -XDELETE "http://localhost:9200/test"
curl -XPUT "http://localhost:9200/_ingest/pipeline/test-pipeline" -H 'Content-Type: application/json' -d'
{
  "description" : "test pipeline",
  "processors" : [
    {
      "set" : {
        "field" : "foo",
        "value" : "bar"
      }
    }
  ]
}'
curl -XPUT "http://localhost:9200/test" -H 'Content-Type: application/json' -d'
{
    "settings" : {
        "index" : {
            "default_pipeline" : "test-pipeline" 
        }
    }
}'
curl -XPOST "http://localhost:9200/test/_doc/1" -H 'Content-Type: application/json' -d'
{
  "a" : "b"
}'
curl -XPOST "http://localhost:9200/test/_refresh"
curl -XGET "http://localhost:9200/test/_search?pretty"
curl -XPOST "http://localhost:9200/test/_doc/2" -H 'Content-Type: application/json' -d'
{
  "a" : "b"
}'
curl -XPOST "http://localhost:9200/test/_refresh"
curl -XGET "http://localhost:9200/test/_search?pretty"

output is as expected:

"hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "a" : "b",
          "foo" : "bar"
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "a" : "b",
          "foo" : "bar"
        }
      }
    ]
  }

cc: @original-brownbear

@jakelandis jakelandis added >bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP labels Aug 9, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@original-brownbear original-brownbear self-assigned this Aug 10, 2018
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Aug 10, 2018
* Ensures that indices are created before the default pipeline setting is read, to correctly handle the case of an index template containing a default pipeline (without the fix, the first document does not get the pipeline applied, as explained in elastic#32758)
* closes elastic#32758
original-brownbear added a commit that referenced this issue Aug 14, 2018
* INGEST: Create Index Before Pipeline Execute

* Ensures that indices are created before the default pipeline setting is read, to correctly handle the case of an index template containing a default pipeline (without the fix, the first document does not get the pipeline applied, as explained in #32758)
* closes #32758
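
The ordering problem described in this commit message can be illustrated with a toy model. This is only a sketch and not Elasticsearch code: `ToyCluster`, `index_doc`, `PIPELINES` and the `create_first` flag are hypothetical names invented for the illustration, and template matching is approximated with shell-style globbing.

```python
from fnmatch import fnmatch

# Hypothetical pipeline registry: the "set" processor from the report, as a function.
PIPELINES = {"test-pipeline": lambda doc: {**doc, "foo": "bar"}}


class ToyCluster:
    """Hypothetical stand-in for cluster state: templates plus created indices."""

    def __init__(self, templates):
        self.templates = templates  # list of (index pattern, settings) pairs
        self.indices = {}           # index name -> settings of the created index

    def create_index(self, name):
        """Create the index if missing, copying settings from matching templates."""
        if name in self.indices:
            return
        settings = {}
        for pattern, template_settings in self.templates:
            if fnmatch(name, pattern):
                settings.update(template_settings)
        self.indices[name] = settings

    def default_pipeline(self, name):
        """Read the default pipeline from the index settings only."""
        return self.indices.get(name, {}).get("index.default_pipeline")


def index_doc(cluster, index, doc, create_first):
    """Index one document; create_first=True models the order after the fix."""
    if create_first:
        cluster.create_index(index)
    pipeline = cluster.default_pipeline(index)  # None if the index does not exist yet
    if pipeline is not None:
        doc = PIPELINES[pipeline](doc)
    cluster.create_index(index)  # auto-create on first write (no-op if present)
    return doc


templates = [("test", {"index.default_pipeline": "test-pipeline"})]

buggy = ToyCluster(templates)
first_buggy = index_doc(buggy, "test", {"a": "b"}, create_first=False)
second_buggy = index_doc(buggy, "test", {"a": "b"}, create_first=False)

fixed = ToyCluster(templates)
first_fixed = index_doc(fixed, "test", {"a": "b"}, create_first=True)
```

In this model, `first_buggy` lacks `foo` because the setting is read before the template has been applied to any index, while `second_buggy` and `first_fixed` both carry `"foo": "bar"`.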
jakelandis added a commit that referenced this issue Mar 6, 2019
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would still be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the
only index created is the original one or the one set by the ingest node pipeline.
This was the execution order prior to 6.5.0 (#32786).

The execution order was changed in 6.5 to better support default pipelines.
Specifically, the execution order was changed to be able to read the settings
from the index metadata. This commit also includes a change in logic such
that if the target index does not exist when the ingest node pipeline runs, it
will now pull the default pipeline (if one exists) from the settings of the
best-matching index template.

Relates #32786
Relates #32758
Closes #36545
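
The lookup described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the TransportBulkAction code: `resolve_default_pipeline` is a hypothetical helper, "best-matching" is assumed to mean the matching template with the highest `order`, and pattern matching is approximated with shell-style globbing.

```python
from fnmatch import fnmatch


def resolve_default_pipeline(index, indices, templates):
    """Return the default pipeline name for `index`, or None.

    indices:   dict of existing index name -> settings dict
    templates: list of dicts with "index_patterns", "settings", optional "order"
    """
    # An existing index is authoritative: read its own settings.
    if index in indices:
        return indices[index].get("index.default_pipeline")

    # Otherwise fall back to the templates that would apply on creation.
    matching = [
        t for t in templates
        if any(fnmatch(index, pattern) for pattern in t["index_patterns"])
    ]
    # Assumption: a higher "order" wins, so check those templates first.
    for template in sorted(matching, key=lambda t: t.get("order", 0), reverse=True):
        pipeline = template["settings"].get("index.default_pipeline")
        if pipeline is not None:
            return pipeline
    return None


templates = [
    {"index_patterns": ["test"], "order": 0,
     "settings": {"index.default_pipeline": "test-pipeline"}},
    {"index_patterns": ["te*"], "order": 1, "settings": {}},
]

from_template = resolve_default_pipeline("test", {}, templates)
from_index = resolve_default_pipeline("test", {"test": {}}, templates)
no_match = resolve_default_pipeline("other", {}, templates)
```

Here `from_template` is `"test-pipeline"` even though no index exists yet, so the pipeline can run before any index is created. `from_index` is `None` because an existing index without the setting takes precedence over templates.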