Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Response Ops][Connectors] Add unsecured actions client to allow system to schedule email action #143282

Merged
merged 24 commits into from
Oct 26, 2022

Conversation

ymao1
Copy link
Contributor

@ymao1 ymao1 commented Oct 13, 2022

Resolves #143052

Summary

Creates an UnsecuredActionsClient that contains a single function for bulk enqueuing actions. This differs from the normal bulk enqueue function inside the ActionsClient because it does not check RBAC on the user request object in order to verify that the user has access to the Actions and Connectors feature. Because we want to allow the system to schedule actions without a user in the middle, we need to bypass the RBAC. This unsecured client uses the internalSavedObjectsRespository which uses the internal Kibana user to write the action_task_params saved object required to schedule an action.

This MVP limits the available connectors to preconfigured connectors in order to remove spaces from the equation. It also contains two allowlists:

  • Allowlist in x-pack/plugins/actions/server/create_unsecured_execute_function.ts to limit this feature to just the email connector. Because the system is scheduling this action, we also don't have any API keys to use for the action. The initial notifications feature will only use the email connector which does not require an API key but this allowlist should only contain connectors that don't need API keys (currently only the ES Index connector needs one).
  • Allowlist in x-pack/plugins/actions/server/unsecured_actions_client/unsecured_actions_client.ts to limit this feature to only certain plugins (notifications to start). If another plugin wants to use the unsecured actions client, they should add themselves to the allowlist, which will ping the Response Ops team, allowing us to be aware of where this client is being used.

Note: I added only unit tests in this PR since this functionality will be used by other plugins. I think functional tests should be added by the plugins that use this client.

To Verify

  1. Make sure you have a preconfigured email connector defined in your kibana.yml
  2. Add alerting to the allowlist in x-pack/plugins/actions/server/unsecured_actions_client/unsecured_actions_client.ts
  3. In x-pack/plugins/alerting/server/plugin.ts in the start function, add the following:
const unsecuredActionsClient = plugins.actions.getUnsecuredActionsClient();
unsecuredActionsClient.bulkEnqueueExecution('alerting', [
  {
    id: <preconfigured_connector_id>,
    params: {
      to: [<email_address>],
      subject: 'hello from Kibana!',
      message: 'does this work??',
    },
  },
]);
  1. Start Kibana and verify that the email was scheduled and sent.

@ymao1 ymao1 changed the title Adding unsecured actions client [POC] Add unsecured actions client Oct 13, 2022
@ymao1
Copy link
Contributor Author

ymao1 commented Oct 13, 2022

@legrego Here is POC for an unsecured actions client. Open to feedback!

Copy link
Member

@legrego legrego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for raising this, @ymao1! I left a couple of early thoughts, but this could be viable, pending answers to my questions.

Since these requests will not be tied to a user, we likely won't ever have an API key. We could add a check to the execution enqueuer to only allow enqueuing executions for certain action types?

I think it's a good idea to restrict which action types can be enqueued via this new unsecured client, but I'll ultimately defer to y'all on this question.

The action executor will be writing event log entries for these executions. Do we need a way to differentiate between these actions and actions initiated by alerting?

I think our users would benefit from this. If they're consuming this event log, I imagine they'd want to be able to tell the difference.

spaceIds[actionToExecute.id] = actionToExecute.spaceId;

return {
type: ACTION_TASK_PARAMS_SAVED_OBJECT_TYPE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this SO type is namespaceType: 'multiple-isolated', do we need to also be specifying the initialNamespaces property when we create the saved object? Since we're using the bare repository, this won't be populated for you automatically.

/**
* Optional initial namespaces for the object to be created in. If this is defined, it will supersede the namespace ID that is in
* {@link SavedObjectsCreateOptions}.
*
* * For shareable object types (registered with `namespaceType: 'multiple'`): this option can be used to specify one or more spaces,
* including the "All spaces" identifier (`'*'`).
* * For isolated object types (registered with `namespaceType: 'single'` or `namespaceType: 'multiple-isolated'`): this option can only
* be used to specify a single space, and the "All spaces" identifier (`'*'`) is not allowed.
* * For global object types (registered with `namespaceType: 'agnostic'`): this option cannot be used.
*/
initialNamespaces?: string[];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the action_task_param SO that's created by this, the namespaces field is set to ['default'] which seems ok? I can pass it in explicitly if it's necessary

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know enough about the implications of these params existing in one space or another. By default, everything will live in the default space unless we explicitly say otherwise. For our immediate needs, this is might be ok? If a notification is triggered from a Case Assignment within the marketing space, would it be confusing that the corresponding notification was "sent" from the default space?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsoldevila WDYT of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@legrego it sounds like notifications are fairly space agnostic so there will not be an indication that the notification came from any particular space, even if the case assignment occurred in a specific space @gsoldevila is that a fair statement? Given that, I think we can default to the default space.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to be on the default space as it is not accessible by the user (is this assumption correct?) and it is controlled by the system.

++ I agree. I wasn't sure if this would be surfaced in the Event Log or not, and whether or not we cared about that discrepancy within the Event Log.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I am not sure either 🙂. Maybe @ymao1 can help with this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The event log does not make any references to the action_task_param SO. It does contain information about the preconfigured connector but since the preconfigured connectors have no namespace, it does not write out a namespace for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action executor will be writing event log entries for these executions. Do we need a way to differentiate between these actions and actions initiated by alerting?

For the event log, we already have a different event.action for when connectors are executed via http - execute-via-http, so I think having a separate one for this scenario would be good. execute-via-notification?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect at some point the notification system will care about spaces, and we'll support non-pre-configured connectors that live across spaces, and we can defer figuring this out till then.

A non-pre-configured connector's space will show up in the event log, and presumably always will. Pre-configured connectors likely appear to be in the "default" space today. What happens when we have connectors in multiple spaces, not sure. But that's all about the connector, and it sort of feels like the "space of the notification" is going to be a different thing. If everything was supported I could use connector X that's defined in spaces A and B to perform a notification in space A, but the event log doesn't have a notion of "what space something is executed in". It would probably need to be some new field, if we want to capture it.

@@ -105,12 +105,6 @@ export class ActionExecutor {
throw new Error('ActionExecutor not initialized');
}

if (!this.isESOCanEncrypt) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this check so we only check if isESOCanEncrypt if we are using connector SOs (which we need to decrypt). If this action is using preconfigured connectors, we will not throw if isESOCanEncrypt is false

@mikecote
Copy link
Contributor

POC looks great, thanks for putting this together 🙏

Another idea would be to add an allowlist in the Actions plugin. That way, plugins that want to use the unsecured actions client would have to add themselves to the allowlist, which would trigger a code review from Response Ops.

I'm +1 to start with whatever is easier, in case you feel this approach is.

This POC only considers preconfigured connectors so the logic is a little simpler than the original execution enqueuer.

Makes sense to start with. To remove this limitation down the line, we are able to with this approach?

Since these requests will not be tied to a user, we likely won't ever have an API key. We could add a check to the execution enqueuer to only allow enqueuing executions for certain action types?

Good idea. Most likely just index threshold. Down the line, we could optimize by only passing the API key from the rule if necessary.

The action executor will be writing event log entries for these executions. Do we need a way to differentiate between these actions and actions initiated by alerting?

This would be good for debugging purposes. I wonder if there is a mechanism in place today? (ad-hoc vs from alerting rule maybe?)

@ymao1
Copy link
Contributor Author

ymao1 commented Oct 13, 2022

This POC only considers preconfigured connectors so the logic is a little simpler than the original execution enqueuer.

Makes sense to start with. To remove this limitation down the line, we are able to with this approach?

The action executor needs to get and decrypt the action saved object if we're not using preconfigured connectors. We can use the internalSavedObjectRepository instead of the actionsClient to get the action SO and it doesn't look to me like the encrypted saved objects client needs to be instantiated with a user request object. We would just likely need to make a different version of the executor that doesn't use the actions client.

@mikecote
Copy link
Contributor

The action executor needs to get and decrypt the action saved object if we're not using preconfigured connectors. We can use the internalSavedObjectRepository instead of the actionsClient to get the action SO and it doesn't look to me like the encrypted saved objects client needs to be instantiated with a user request object. We would just likely need to make a different version of the executor that doesn't use the actions client.

Gotcha, sounds good! I'm good waiting down the line for this enhancement 👍

@ymao1 ymao1 changed the title [POC] Add unsecured actions client [Response Ops][Connectors] Add unsecured actions client to allow system to schedule email action Oct 17, 2022
@ymao1 ymao1 self-assigned this Oct 17, 2022
@ymao1 ymao1 added release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) Feature:Actions/ConnectorsManagement Issues related to Connectors Management UX v8.6.0 labels Oct 17, 2022
@ymao1 ymao1 marked this pull request as ready for review October 17, 2022 17:25
@ymao1 ymao1 requested a review from a team as a code owner October 17, 2022 17:25
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@pmuellr
Copy link
Member

pmuellr commented Oct 17, 2022

Note: I added only unit tests in this PR since this functionality will be used by other plugins. I think functional tests should be added by the plugins that use this client.

But if they say the functionality is broken, how can we prove them wrong? :-)

Seems like we could build a test of this with a new route that just executes an email connector. We can use the existing way we test these today, with service set to __json, which returns an indication of the email that it >would< send, that we can assert over.

@ymao1
Copy link
Contributor Author

ymao1 commented Oct 17, 2022

For the event log, we already have a different event.action for when connectors are executed via http - execute-via-http, so I think having a separate one for this scenario would be good. execute-via-notification?

@pmuellr It doesn't look like we ever set the action to execute-via-http although it is defined as an event log constant. Do you recall how we differentiated between a normal execution and an http execution? I like the idea of making this distinction this way but it seems that we'd have to store something inside the action task params so that the action executor knows the source of the action? Otherwise it just knows that it picked up a task and is executing it.

@pmuellr
Copy link
Member

pmuellr commented Oct 17, 2022

It doesn't look like we ever set the action to execute-via-http although it is defined as an event log constant.

hahahaha, we must have lost this in a PR, I guess we never had a test for it? Or I wonder if previously we had executed these "directly" but are now queuing them?

I like the idea of making this distinction this way but it seems that we'd have to store something inside the action task params so that the action executor knows the source of the action? Otherwise it just knows that it picked up a task and is executing it.

Ya, does feel like something should be added to the ATO for this. Maybe 'source', which could be alert http or notification? Something we can defer for now ...

@ymao1
Copy link
Contributor Author

ymao1 commented Oct 18, 2022

But if they say the functionality is broken, how can we prove them wrong? :-)

Seems like we could build a test of this with a new route that just executes an email connector. We can use the existing way we test these today, with service set to __json, which returns an indication of the email that it >would< send, that we can assert over.

@pmuellr Thanks for keeping me honest :) Added functional tests here: 47f1ca9

@ymao1 ymao1 requested a review from pmuellr October 18, 2022 15:22
@ymao1
Copy link
Contributor Author

ymao1 commented Oct 24, 2022

@elasticmachine merge upstream

Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, and works as described (tested success and failure (accidentally!)).

Only question I have is whether we want to do something about missing the "reason" for why the action is executing. After realizing we no longer write the execute-via-http event.action, or whatever. Presumably the reasons today would be "alert", "http", or "notification"? Thinking we should open an issue, not sure if we already did, or we already resolved this, or ...

@ymao1
Copy link
Contributor Author

ymao1 commented Oct 25, 2022

Only question I have is whether we want to do something about missing the "reason" for why the action is executing. After realizing we no longer write the execute-via-http event.action, or whatever. Presumably the reasons today would be "alert", "http", or "notification"? Thinking we should open an issue, not sure if we already did, or we already resolved this, or ...

Thanks for the reminder! Created an issue: #143984

Copy link
Contributor

@doakalexi doakalexi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I was able to test this locally

Copy link
Contributor

@gsoldevila gsoldevila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
actions 209 220 +11

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id before after diff
actions 23 24 +1
Unknown metric groups

API count

id before after diff
actions 214 225 +11

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ymao1

@ymao1 ymao1 merged commit 21806ca into elastic:main Oct 26, 2022
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Oct 26, 2022
@ymao1 ymao1 deleted the actions/unsecured-client branch October 26, 2022 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Actions/ConnectorsManagement Issues related to Connectors Management UX release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.6.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Research how our actions framework can send emails for the system
10 participants