Limit CS Update Task Description Size #79443
Conversation
When working with very large clusters, there are various avenues to create very large batches of tasks that render as strings of O(MB) length. Since we only use these strings for logging, and there is limited value in knowing the exact task descriptions of large numbers of tasks, it seems reasonable to put a hard limit on the logging here to prevent hard-to-work-with logs and save some memory in extreme cases.
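For illustration only, here is a minimal sketch of the kind of hard cap being described: stop rendering task descriptions once the builder passes a length limit and report the total count instead. The class name, the `MAX_SUMMARY_LENGTH` value, and the truncation wording are assumptions, not the actual Elasticsearch code.

```java
import java.util.List;

class TaskSummarySketch {
    // Assumed limit for illustration; the real limit in the PR may differ.
    private static final int MAX_SUMMARY_LENGTH = 8 * 1024;

    static String describeTasks(List<String> taskDescriptions) {
        StringBuilder output = new StringBuilder();
        int appended = 0;
        for (String description : taskDescriptions) {
            if (output.length() > MAX_SUMMARY_LENGTH) {
                // Stop early instead of rendering every task; keep the overall count visible.
                output.append("... (").append(taskDescriptions.size()).append(" tasks in total)");
                return output.toString();
            }
            if (appended++ > 0) {
                output.append(", ");
            }
            output.append(description);
        }
        return output.toString();
    }
}
```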
Pinging @elastic/es-distributed (Team:Distributed)
I think we already have a thing to do almost exactly this; I suggested using it (and maybe generalising it a bit).
server/src/main/java/org/elasticsearch/cluster/ClusterStateTaskExecutor.java
run(updateTask.batchingKey, toExecute, tasksSummary);
private String buildTasksDescription(BatchedTask updateTask,
Likewise here I think?
Here it's a little less fun to use the general string builder than in the other case because I think it'd be nice to have the overall task count?
I'm confused: collectionToDelimitedStringWithLimit does yield the overall count if it truncated the output. There's no filtering happening here.
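For context, a minimal illustration of the behaviour being described, i.e. a delimited-string builder that reports how much was left out when it truncates. This is a stand-in sketch; the real helper lives in the Elasticsearch codebase and its exact signature and truncation message may differ.

```java
import java.util.Collection;

class DelimitedStringLimitSketch {
    // Illustrative stand-in for collectionToDelimitedStringWithLimit; names and
    // message format are assumptions, not the actual Elasticsearch implementation.
    static String toDelimitedStringWithLimit(Collection<String> items, String delimiter, int lengthLimit) {
        StringBuilder sb = new StringBuilder();
        int appended = 0;
        for (String item : items) {
            if (appended > 0) {
                sb.append(delimiter);
            }
            sb.append(item);
            appended++;
            if (sb.length() > lengthLimit && appended < items.size()) {
                // Truncation path: the overall count is still reported, so nothing is silently dropped.
                sb.append(delimiter)
                    .append("... (")
                    .append(items.size())
                    .append(" in total, ")
                    .append(items.size() - appended)
                    .append(" omitted)");
                break;
            }
        }
        return sb.toString();
    }
}
```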
But we have this weird setup here where the tasks are grouped by source, so the counting will only work out correctly if the source is different for each task? I kinda liked having that level of detail on the counts for debugging still.
Oh I see, we use toExecute.size() rather than processTasksBySource.size(). How about either just appending (N tasks in total) if output.length() exceeds the limit we set, or else letting collectionToDelimitedStringWithLimit take some extra detail that it puts into the truncation summary?
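A rough sketch of the first suggestion, for the grouped-by-source case: render each source with its task descriptions, and if the output passes the limit, append the total task count (the toExecute size) rather than the number of distinct sources. All types and names here are placeholders chosen for illustration.

```java
import java.util.List;
import java.util.Map;

class GroupedTaskSummarySketch {
    // Placeholder shapes: sources map to the descriptions of their tasks,
    // and toExecute holds every task in the batch.
    static String summarize(Map<String, List<String>> tasksBySource, List<String> toExecute, int lengthLimit) {
        StringBuilder output = new StringBuilder();
        tasksBySource.forEach((source, descriptions) -> {
            if (output.length() <= lengthLimit) {
                if (output.length() > 0) {
                    output.append(", ");
                }
                output.append(source).append("[").append(String.join(", ", descriptions)).append("]");
            }
        });
        if (output.length() > lengthLimit) {
            // Report the overall task count, not the number of sources.
            output.append(" ... (").append(toExecute.size()).append(" tasks in total)");
        }
        return output.toString();
    }
}
```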
@DaveCTurner I dried up one of the two spots; the other one was somewhat impractical to do the same for. Maybe not worth it to add some tricky generalization for that right now?
LGTM
Thanks David!
Not sure about the numbers here, but I think we need some limits. We're seeing a couple of pretty extreme task descriptions when benchmarking.