[RAM] adds bulkUpdatesSchedules method to Task Manager API #132637

vitaliidm · 2022-05-20T15:57:30Z

Addresses: #124850

Summary

Adds a new Task Manager API method, bulkUpdateSchedules.
Calls taskManager.bulkUpdateSchedules in rulesClient.bulkEdit to update the underlying tasks when the updated rules have a scheduledTaskId property.
Enables the remaining rulesClient.bulkEdit operations: schedule, notifyWhen, and throttle.

bulkUpdateSchedules

Using bulkUpdateSchedules, you can instruct TaskManager to update the interval of tasks that are in idle status.
When the interval is updated, a new runAt is computed and the task is updated with that value.

export class Plugin {
  constructor() {
  }

  public setup(core: CoreSetup, plugins: { taskManager }) {
  }

  // `start` is async because the call below is awaited
  public async start(core: CoreStart, plugins: { taskManager }) {
    try {
      const bulkUpdateResults = await plugins.taskManager.bulkUpdateSchedules(
        ['97c2c4e7-d850-11ec-bf95-895ffd19f959', 'a5ee24d1-dce2-11ec-ab8d-cf74da82133d'],
        { interval: '10m' },
      );
      // If no error is thrown, bulkUpdateSchedules has completed successfully.
      // Updates of individual tasks can still fail, for example due to an OCC 409 conflict.
    } catch (err) {
      // If an error is caught, the whole request has failed and the tasks weren't updated.
    }
  }
}
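The comments in the example above point out that the call can succeed as a whole while some individual task updates fail. A minimal sketch of separating those two outcomes follows. The result shape (`tasks` plus `errors`) and the names `BulkUpdateResult`, `TaskSummary`, and `summarize` are assumptions made for illustration; this description does not spell out the actual return type.

```typescript
// Hypothetical result shape: assumed for illustration, not taken from the PR text.
interface TaskSummary {
  id: string;
}

interface BulkUpdateResult {
  tasks: TaskSummary[]; // tasks whose schedule was updated
  errors: Array<{ task: TaskSummary; error: Error }>; // per-task failures, e.g. a 409 conflict
}

// Split a result into the ids that were updated and the ids that failed.
function summarize(result: BulkUpdateResult): { updated: string[]; failed: string[] } {
  return {
    updated: result.tasks.map((task) => task.id),
    failed: result.errors.map(({ task }) => task.id),
  };
}

// Usage with a hand-built result standing in for a real call.
const sample: BulkUpdateResult = {
  tasks: [{ id: 'task-1' }],
  errors: [{ task: { id: 'task-2' }, error: new Error('409 conflict') }],
};
const summary = summarize(sample);
console.log(summary.updated, summary.failed);
```

A caller could log the `failed` ids or pass them to a later retry, since a conflict on one task does not mean the others were left untouched.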

In follow-up PRs

Use taskManager.bulkUpdateSchedules in rulesClient.update ([RAM] refactors RulesClient.update, by using new Task Manager API bulkUpdateSchedules #134027)
Add functional tests for bulkEdit ([RAM] fixes flaky tests for alerting bulk_edit HTTP API, adds additional tests for alerting bulk_edit #133635)

Checklist

Unit or functional tests were updated or added to match the most common scenarios

Release note

Adds a new method to Task Manager, bulkUpdateSchedules, that allows bulk updates of scheduled tasks.
Adds 3 new operations to rulesClient.bulkEdit: updates of schedule, notifyWhen, and throttle.

vitaliidm · 2022-05-20T16:02:15Z

x-pack/plugins/alerting/server/rules_client/rules_client.ts

+    const scheduleOperation = options.operations.find(
+      (op): op is Extract<BulkEditOperation, { field: Extract<BulkEditFields, 'schedule'> }> =>
+        op.field === 'schedule'
+    );


What other operations should affect the rescheduling of tasks?

Should notifyWhen or throttle trigger this bulkUpdateSchedules as well?

The only logic that causes a rule's task to get rescheduled at this time is when the rule interval changes. We should be ok to skip the other operations.
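The type-guard pattern in the diff above can be sketched in isolation. This is a simplified illustration: the `Operation` union below is a stand-in with only three variants, and `isScheduleOperation` and `intervalToReschedule` are hypothetical helper names, not code from the PR.

```typescript
// Simplified stand-in for the operation types; the real union has more variants.
type Operation =
  | { field: 'schedule'; value: { interval: string } }
  | { field: 'throttle'; value: string | null }
  | { field: 'notifyWhen'; value: string };

type ScheduleOperation = Extract<Operation, { field: 'schedule' }>;

// Type guard so that `find` returns the narrowed type rather than the full union.
function isScheduleOperation(op: Operation): op is ScheduleOperation {
  return op.field === 'schedule';
}

// Returns the new interval if the edit touches the schedule, otherwise undefined.
function intervalToReschedule(operations: Operation[]): string | undefined {
  return operations.find(isScheduleOperation)?.value.interval;
}

// Only the second edit contains a schedule operation.
console.log(intervalToReschedule([{ field: 'throttle', value: '1h' }]));
console.log(
  intervalToReschedule([
    { field: 'notifyWhen', value: 'onActiveAlert' },
    { field: 'schedule', value: { interval: '10m' } },
  ])
);
```

With this shape, edits that only change notifyWhen or throttle yield `undefined`, so the caller can skip the task update entirely.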


vitaliidm · 2022-06-06T12:25:16Z

x-pack/plugins/alerting/server/routes/bulk_edit_rules.ts

@@ -34,6 +38,27 @@ const operationsSchema = schema.arrayOf(
      field: schema.literal('actions'),
      value: schema.arrayOf(ruleActionSchema),
    }),
+    schema.object({


Currently the functional tests are skipped due to #132195.
I plan to fix them and add additional tests in a follow-up PR, so as not to mix everything together.

I can do it in this PR if that is preferable.

For reviewers:
I believe this is the follow up PR @vitaliidm is referring to 😉
https://github.com/elastic/kibana/pull/133635/files

I'd prefer to see them here, but TBH doesn't make sense if they're skipped, so ok in the follow-up. I added a comment to that PR as a reminder.

elasticmachine · 2022-06-06T15:10:49Z

Pinging @elastic/response-ops (Team:ResponseOps)

elasticmachine · 2022-06-06T15:10:51Z

Pinging @elastic/security-detections-response (Team:Detections and Resp)

elasticmachine · 2022-06-06T15:10:52Z

Pinging @elastic/security-solution (Team: SecuritySolution)

ymao1

Code and tests look good!

There is this from the original issue:

The update(...) function of the rulesClient will move to use the newly proposed updateSchedule API (see below) to keep the behaviour the same as bulkUpdate(...).

Should that be done in this issue or is there a separate issue for that?

x-pack/plugins/alerting/server/rules_client/rules_client.ts

x-pack/plugins/task_manager/server/task_scheduling.ts

vitaliidm · 2022-06-08T16:40:29Z

There is this from the original issue:

The update(...) function of the rulesClient will move to use the newly proposed updateSchedule API (see below) to keep the behaviour the same as bulkUpdate(...).
Should that be done in this issue or is there a separate issue for that?

I would like to address it in a separate PR.
I will prepare it right after this one is merged.
I will update the description to note that the rulesClient.update change will come next.

ymao1

LGTM! Verified it works as expected. Believe @pmuellr is also going to take a look at this PR

pmuellr

Everything looks good to me, except I don't see any rationale why we only update idle tasks. And I'm concerned about some tasks not getting updated, because they happen to be running when a bulk update occurs, with no indication to the caller that the task was skipped.

pmuellr · 2022-06-09T14:43:02Z

x-pack/plugins/alerting/server/routes/bulk_edit_rules.ts

@@ -12,6 +12,10 @@ import { ILicenseState, RuleTypeDisabledError } from '../lib';
 import { verifyAccessAndContext, rewriteRule, handleDisabledApiKeysError } from './lib';
 import { AlertingRequestHandlerContext, INTERNAL_BASE_ALERTING_API_PATH } from '../types';

+const scheduleSchema = schema.object({
+  interval: schema.string(),


We have a schema config validator we use for intervals, as seen here:

kibana/x-pack/plugins/alerting/server/routes/create_rule.ts

Line 36 in 5c166a6

interval: schema.string({ validate: validateDurationSchema }),

With that, a bad interval will "fail fast" at the route instead of failing somewhere deeper in the framework.

thanks, I've replaced it with validateDurationSchema
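To illustrate what "fail fast" validation of an interval can look like, here is a minimal sketch. It is not the actual `validateDurationSchema` implementation: the accepted units, the regular expression, the error message, and the name `validateInterval` are all assumptions for illustration.

```typescript
// Hypothetical validator: accepts a positive integer followed by s, m, h or d.
const DURATION_REGEX = /^[1-9][0-9]*[smhd]$/;

// Returns an error message for a bad value, or undefined when the value is acceptable.
function validateInterval(interval: string): string | undefined {
  return DURATION_REGEX.test(interval)
    ? undefined
    : `string is not a valid duration: ${interval}`;
}

// A well-formed interval passes; a free-form string is rejected up front.
console.log(validateInterval('10m'));
console.log(validateInterval('10 minutes'));
```

Returning a message instead of throwing lets a schema layer collect the error and reject the request before any rule or task is touched.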

pmuellr · 2022-06-09T14:47:22Z

x-pack/plugins/alerting/server/routes/bulk_edit_rules.ts

@@ -34,6 +38,27 @@ const operationsSchema = schema.arrayOf(
      field: schema.literal('actions'),
      value: schema.arrayOf(ruleActionSchema),
    }),
+    schema.object({


I'd prefer to see them here, but TBH doesn't make sense if they're skipped, so ok in the follow-up. I added a comment to that PR as a reminder.

pmuellr · 2022-06-09T14:54:00Z

x-pack/plugins/task_manager/server/task_scheduling.ts

+            },
+            {
+              term: {
+                'task.status': 'idle',


I'm curious why we only look at idle tasks. Basically, why aren't we operating on running tasks as well? I'm guessing because the runAt would get overwritten when the task completes and reschedules itself.

But this seems like a pain to the users. Ask to update X rules, but some aren't updated because they happen to be running?

I feel like we should at least return the rules which weren't idle, in the response back to the user, so they could "try again", if we can't make this work for running tasks. Or at a minimum log something, so we at least have an indication that some rules were skipped.

@mikecote filled me in - we don't NEED to update the running tasks, as their schedule is recalculated after the run, so it should recalculate the new runAt with the new schedule when it's complete. In case of a race condition - the rule is finished before the schedule is updated in its SO - then there should be at most one more run with the old schedule. Which seems like the best we can do.
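For readers following this thread, the idle-only selection can be sketched as a query builder. Only the `'task.status': 'idle'` term comes from the diff above; matching ids through a `terms` clause on `_id`, and the function name `buildIdleTasksQuery`, are assumptions for illustration.

```typescript
// Builds a bool query that matches the requested tasks only while they are idle.
// Running tasks are deliberately left out: they pick up the new schedule
// when they recalculate runAt at the end of their run.
function buildIdleTasksQuery(taskIds: string[]) {
  return {
    bool: {
      filter: [
        { terms: { _id: taskIds } }, // assumed way of matching task ids
        { term: { 'task.status': 'idle' } }, // taken from the diff in this thread
      ],
    },
  };
}

const query = buildIdleTasksQuery(['a', 'b']);
console.log(JSON.stringify(query));
```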

pmuellr · 2022-06-09T17:29:43Z

x-pack/test/plugin_api_integration/test_suites/task_manager/task_management.ts

@@ -899,6 +909,62 @@ export default function ({ getService }: FtrProviderContext) {
      });
    });

+    it('should bulk updates schedules for multiple tasks', async () => {
+      const initialTime = Date.now();
+      const tasks = await Promise.all([


Seems like it might be hard to arrange, but can we test this with a "running" task? I think it would be some new task that would take a long time to run, and then we'd check to see if we're "ready" by querying the task state looking to see if it's "running". And then verify that its runAt wasn't changed.

Hey @pmuellr, I added a test to cover this bit, here is my solution: 52b794a

Let me know if you are fine with that, and I will merge this PR then

ya, looks good, thx!

pmuellr

Got my comment about not updating non-idle tasks answered by Mike - we simply don't NEED to, as they will pick up the new schedule when they recalculate when the rule finishes.

However, I think we should add a comment about that - presumably in the place where we're filtering on idle, to indicate that. Otherwise I'll be confused about it next year :-)

kibana-ci · 2022-06-13T12:50:46Z

💚 Build Succeeded

Metrics [docs]

Unknown metric groups

API count

id	before	after	diff
`taskManager`	77	80	+3

History

💔 Build #50448 failed eb4af87
💚 Build #50108 succeeded 6fb6bee
💚 Build #49928 succeeded 64e7adb
💚 Build #49350 succeeded 0563f88
💚 Build #49311 succeeded cb5411a

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @vitaliidm

vitaliidm · 2022-06-13T14:27:14Z

thanks @pmuellr for the feedback and approval

However, I think we should add a comment about that - presumably in the place where we're filtering on idle, to indicate that. Otherwise I'll be confused about it next year :-)

Added it to method description

…nal tests for alerting bulk_edit (#133635)

## Summary

Addresses #132195
Follow-up to #132637

- Fixes flaky tests in `x-pack/test/alerting_api_integration/spaces_only/tests/alerting/bulk_edit.ts`
- Adds new tests in spaces_only/tests/alerting/bulk_edit.ts for the following operations: `schedule`, `notifyWhen`, `throttle`
- Adds unit tests in x-pack/plugins/alerting/server/rules_client/lib/apply_bulk_edit_operation.test.ts

### Details on fixing the flaky tests

Although bulkEdit retries when a 409 conflict happens during the bulk update of SOs, a test would still fail occasionally. So all tests in bulk_edit.ts were skipped (93ffd78).

I believe this happens because the mocked rule used for the tests is enabled: its update after a run clashes with the `SavedObject.bulkUpdate` method used inside `RulesClient.bulkEdit`. To fix it, I made this mocked rule disabled, which seems to have fixed the flakiness.

Here are the flaky test runner builds with the ENABLED rule:
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/717
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/713

and with the DISABLED rule:
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/718
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/714

Both enabled runs had a test failure; the disabled runs had none.

Another possible way to fix it: use retry in the test for the `rules.bulkEdit` call and assertion. Let me know if that would be a preferable fix.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

…kUpdateSchedules (#134027)

## Summary

- Follow-up to #132637, #124850
- Replaces the TaskManager API `runNow` with `bulkUpdateSchedules` in the `RulesClient.update` method

When runNow is used at scale, TaskManager capacity can be full, which causes `runNow` to fail. Instead, the new `bulkUpdateSchedules` API is used: when a rule's schedule is updated, it updates the underlying task's schedule and calculates a new `runAt` time.

More details on the new TaskManager API: #132637, #124850

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
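The text above says a new `runAt` is calculated when the schedule changes but does not give the formula. One plausible rule is sketched below: keep the time of the previous run, apply the new interval, and never schedule in the past. That rule, the supported units, and the names `intervalToMs` and `computeNewRunAt` are assumptions for illustration, not the confirmed Task Manager logic.

```typescript
// Convert an interval such as '30s', '10m', '2h' or '1d' to milliseconds.
function intervalToMs(interval: string): number {
  const match = /^([1-9][0-9]*)([smhd])$/.exec(interval);
  if (!match) {
    throw new Error(`Invalid interval: ${interval}`);
  }
  const unitMs: Record<string, number> = { s: 1000, m: 60000, h: 3600000, d: 86400000 };
  return Number(match[1]) * unitMs[match[2]];
}

// Assumed rule: previous run time plus the new interval, clamped so it is never before `now`.
function computeNewRunAt(runAt: Date, oldInterval: string, newInterval: string, now: Date): Date {
  const lastRunMs = runAt.getTime() - intervalToMs(oldInterval);
  return new Date(Math.max(now.getTime(), lastRunMs + intervalToMs(newInterval)));
}

// A task that last ran at `now` on a 1h schedule is moved to run 10 minutes after `now`.
const now = new Date('2022-06-01T00:00:00.000Z');
const currentRunAt = new Date(now.getTime() + intervalToMs('1h'));
console.log(computeNewRunAt(currentRunAt, '1h', '10m', now).toISOString());
```

Under this rule, shortening an interval pulls the next run earlier without requiring an immediate run, which is the capacity problem described for `runNow`.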

init taskManager bulk

2759c2e

vitaliidm self-assigned this May 20, 2022

vitaliidm commented May 20, 2022

vitaliidm and others added 19 commits May 23, 2022 12:48

updates and fixes

d24c2f3

Merge branch 'main' into task-manager-bulk-schedules

ff55e01

fix the rest of tests

6efc5fb

Merge branch 'main' into task-manager-bulk-schedules

655a753

Merge branch 'main' into task-manager-bulk-schedules

74c5a7c

add unit tests

7e1e7c7

tests!!!

815854b

refactor it

79fc692

Merge branch 'main' into task-manager-bulk-schedules

8197409

add test to rukes_client

e9d4426

Merge branch 'task-manager-bulk-schedules' of https://github.com/vita…

51469cc

…liidm/kibana into task-manager-bulk-schedules

tests, more tests

0b96f46

README, docs

9f1ccf7

skip again

edc7890

add rest of ops

d20f2fd

Merge branch 'main' into task-manager-bulk-schedules

f572388

tests

f2bc696

comments updates

0b44ef1

JSDoc

cb5411a

vitaliidm commented Jun 6, 2022

vitaliidm changed the title ~~[RAM]bulkUpdatesSchedules for taskManager~~ [RAM] adds bulkUpdatesSchedules method to Task Manager API Jun 6, 2022

vitaliidm added the backport:skip This commit does not require backporting label Jun 6, 2022

vitaliidm and others added 2 commits June 6, 2022 15:01

few perf improvements

bdeadf6

Merge branch 'main' into task-manager-bulk-schedules

0563f88

vitaliidm marked this pull request as ready for review June 6, 2022 15:10

vitaliidm requested a review from a team as a code owner June 6, 2022 15:10

vitaliidm mentioned this pull request Jun 7, 2022

[RAM] fixes flaky tests for alerting bulk_edit HTTP API, adds additional tests for alerting bulk_edit #133635

Merged

1 task

ymao1 reviewed Jun 8, 2022

x-pack/plugins/alerting/server/rules_client/rules_client.ts (outdated, resolved)

x-pack/plugins/task_manager/server/task_scheduling.ts (resolved)

vitaliidm and others added 2 commits June 8, 2022 17:02

Merge branch 'main' into task-manager-bulk-schedules

bd36d9a

CR: replace auditLogger with logger.error

64e7adb

Merge branch 'main' into task-manager-bulk-schedules

6fb6bee

ymao1 approved these changes Jun 9, 2022

pmuellr reviewed Jun 9, 2022

pmuellr approved these changes Jun 9, 2022

vitaliidm and others added 4 commits June 13, 2022 11:15

CR: minor suggestions addressed

fc2f119

Merge branch 'main' into task-manager-bulk-schedules

eb4af87

CR: fix tests

0e2d63a

CR: add functional test for task in running status

52b794a

vitaliidm merged commit 6e0086d into elastic:main Jun 13, 2022

ymao1 mentioned this pull request Jun 13, 2022

Task Manager plugin API to bulk update task schedules #124850

Closed

vitaliidm mentioned this pull request Jun 13, 2022

[RAM] refactors RulesClient.update, by using new Task Manager API bulkUpdateSchedules #134027

Merged

1 task

vitaliidm deleted the task-manager-bulk-schedules branch March 4, 2024 17:31
