Limit CS Update Task Description Size #79443
Conversation
When working with very large clusters, there are various avenues to create very large batches of tasks that render as strings of O(MB) length. Since we only use these strings for logging, and there is limited value in knowing the exact task descriptions of large numbers of tasks, it seems reasonable to put a hard limit on the logging here to prevent hard-to-work-with logs and save some memory in extreme cases.
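For illustration only, here is a minimal sketch of the kind of hard cap being described: stop rendering task descriptions once the builder passes a length limit and report the total count instead. The class name, the `MAX_SUMMARY_LENGTH` value, and the truncation wording are assumptions, not the actual Elasticsearch code.

```java
import java.util.List;

class TaskSummarySketch {
    // Assumed limit for illustration; the real limit in the PR may differ.
    private static final int MAX_SUMMARY_LENGTH = 8 * 1024;

    static String describeTasks(List<String> taskDescriptions) {
        StringBuilder output = new StringBuilder();
        int appended = 0;
        for (String description : taskDescriptions) {
            if (output.length() > MAX_SUMMARY_LENGTH) {
                // Stop early instead of rendering every task; keep the overall count visible.
                output.append("... (").append(taskDescriptions.size()).append(" tasks in total)");
                return output.toString();
            }
            if (appended++ > 0) {
                output.append(", ");
            }
            output.append(description);
        }
        return output.toString();
    }
}
```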
Pinging @elastic/es-distributed (Team:Distributed)
I think we already have a thing to do almost exactly this; I suggested using it (and maybe generalising it a bit).
server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java
run(updateTask.batchingKey, toExecute, tasksSummary);
private String buildTasksDescription(BatchedTask updateTask,
Likewise here I think?
Here it's a little less fun to use the general string builder than in the other case because I think it'd be nice to have the overall task count?
I'm confused: collectionToDelimitedStringWithLimit does yield the overall count if it truncated the output. There's no filtering happening here.
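For context, a minimal illustration of the behaviour being described, i.e. a delimited-string builder that reports how much was left out when it truncates. This is a stand-in sketch; the real helper lives in the Elasticsearch codebase and its exact signature and truncation message may differ.

```java
import java.util.Collection;

class DelimitedStringLimitSketch {
    // Illustrative stand-in for collectionToDelimitedStringWithLimit; names and
    // message format are assumptions, not the actual Elasticsearch implementation.
    static String toDelimitedStringWithLimit(Collection<String> items, String delimiter, int lengthLimit) {
        StringBuilder sb = new StringBuilder();
        int appended = 0;
        for (String item : items) {
            if (appended > 0) {
                sb.append(delimiter);
            }
            sb.append(item);
            appended++;
            if (sb.length() > lengthLimit && appended < items.size()) {
                // Truncation path: the overall count is still reported, so nothing is silently dropped.
                sb.append(delimiter)
                    .append("... (")
                    .append(items.size())
                    .append(" in total, ")
                    .append(items.size() - appended)
                    .append(" omitted)");
                break;
            }
        }
        return sb.toString();
    }
}
```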
But we have this weird setup here where the tasks are grouped by source, so the counting will only work out correctly if the source is different for each task? I kinda liked having that level of detail on the counts for debugging still.
Oh I see, we use toExecute.size() rather than processTasksBySource.size(). How about either just appending (N tasks in total) if output.length() exceeds the limit we set, or else letting collectionToDelimitedStringWithLimit take some extra detail that it puts into the truncation summary?
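A rough sketch of the first suggestion, for the grouped-by-source case: render each source with its task descriptions, and if the output passes the limit, append the total task count (the toExecute size) rather than the number of distinct sources. All types and names here are placeholders chosen for illustration.

```java
import java.util.List;
import java.util.Map;

class GroupedTaskSummarySketch {
    // Placeholder shapes: sources map to the descriptions of their tasks,
    // and toExecute holds every task in the batch.
    static String summarize(Map<String, List<String>> tasksBySource, List<String> toExecute, int lengthLimit) {
        StringBuilder output = new StringBuilder();
        tasksBySource.forEach((source, descriptions) -> {
            if (output.length() <= lengthLimit) {
                if (output.length() > 0) {
                    output.append(", ");
                }
                output.append(source).append("[").append(String.join(", ", descriptions)).append("]");
            }
        });
        if (output.length() > lengthLimit) {
            // Report the overall task count, not the number of sources.
            output.append(" ... (").append(toExecute.size()).append(" tasks in total)");
        }
        return output.toString();
    }
}
```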
@DaveCTurner I dried up one of the two spots; the other one was somewhat impractical to do the same for. Maybe not worth it to add some tricky generalization for that right now?
LGTM
Thanks David!
Not sure about the numbers here, but I think we need some limits. We're seeing a couple of pretty extreme task descriptions when benchmarking.