-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce no_items_on_success option to the bulk API #55088
base: main
Are you sure you want to change the base?
Introduce no_items_on_success option to the bulk API #55088
Conversation
Pinging @elastic/es-distributed (:Distributed/CRUD) |
builder.startArray(ITEMS); | ||
for (BulkItemResponse item : this) { | ||
item.toXContent(builder, params); | ||
if (noFailuresAnNoItemsRequested == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be great if the decision on whether to include the items can be made earlier.
Maybe in TransportBulkAction
in the finishHim()
method around line 541?
This would then save creating bulk item response array.
Would be great if we could make this decision even earlier, so that no shard level response items
are collected. Then in the case that there are failures then we could serialize the successful items
as null elements? But then I'm afraid that we break the bulk response format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I could do the first, that would definitely help 👍
Not sure the second warrants the additional complexity? The goal of this PR is to shave off serialization time on the consumer. Perhaps something to consider once this lands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martijnvg revisiting this I think it's best not to introduce the complexity in the transport layer and only expose this on the REST layer.
It would introduce a set of new problems
- shards would need to send back how many items they would have returned
- reducing all shard information becomes more complex to expand the
empty
shard responses - We need to find a new representation for successful bulk operations in case of failures.
The PR as it stands does not preclude taking this on as follow up work but is useful as is already.
Keen to hear your thoughts!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mpdreamz I agree, this PR is useful in its current form.
However I think it makes sense to not include the items in the bulk response if there are no failures and no_items_on_success is set. This can simply be done by iterating over the response items and see if there is an error and then have an additional constructor of bulk responses which accepts no response items and noFailuresAnNoItemsRequested
(or sets that always to true).
2b7ab7d
to
c9f3cb0
Compare
a4a4c7d
to
99654a5
Compare
99654a5
to
2d393f4
Compare
server/src/main/java/org/elasticsearch/rest/action/document/RestBulkAction.java
Show resolved
Hide resolved
@@ -153,9 +167,14 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws | |||
builder.field(INGEST_TOOK, ingestTookInMillis); | |||
} | |||
builder.field(ERRORS, hasFailures()); | |||
boolean noItems = noItemsOnSuccess != null && noItemsOnSuccess; | |||
boolean noFailuresAndNoItemsRequested = noItems && this.hasFailures() == false; | |||
builder.field(ITEMS_OMITTED, noFailuresAndNoItemsRequested); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can return items
with null
value instead of returning the items_omitted
field?
It looks like we currently never return null
, so we could use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The third state for items_omitted
adds more complexity to the users of the API IMO. Knowing the property is always there makes checking for it all the easier.
This allows the bulk to return an empty array for `items` if so explicitly instructed by the user. This also introduces an extra `items_omitted` property to broadcast the act of omitting the items from the response very loudly.
…if its never going to be serialized
af86291
to
e623aa6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few more comments.
@@ -4,7 +4,7 @@ | |||
<titleabbrev>Bulk</titleabbrev> | |||
++++ | |||
|
|||
Performs multiple indexing or delete operations in a single API call. | |||
Performs multiple indexing or delete operations in a single API call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe undo these whitespace changes?
|
||
public static final long NO_INGEST_TOOK = -1L; | ||
|
||
private final BulkItemResponse[] responses; | ||
private final long tookInMillis; | ||
private final long ingestTookInMillis; | ||
@Nullable | ||
private final Boolean noItemsOnSuccess; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be a primitive boolean? I think we default to false
.
|
||
public BulkResponse(BulkItemResponse[] responses, long tookInMillis, long ingestTookInMillis, Boolean noItemsOnSuccess) { | ||
final boolean hasError = responses != null && Arrays.stream(responses).anyMatch(r-> r != null && r.isFailed()); | ||
this.responses = noItemsOnSuccess != null && noItemsOnSuccess && hasError == false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a comment that the bulk items are omitted because these are no longer needed if noItemsOnSuccess
is true, which would allow the gc to clean the items up before the response has been serialized over the network.
@@ -49,28 +50,42 @@ | |||
private static final String ERRORS = "errors"; | |||
private static final String TOOK = "took"; | |||
private static final String INGEST_TOOK = "ingest_took"; | |||
private static final String ITEMS_OMITTED = "items_omitted"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe rename to no_items
? Then it is more inline with the field name noItemsOnSuccess
?
This allows the
_bulk
API to return an empty array foritems
if so explicitly requested by the user.This also introduces an extra
items_omitted
property to broadcast the act of omitting the items from the response very loudly.If
no_items_on_success
is set but the bulk has failuresitems
will always return all the items anditems_omitted
will befalse
.