-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Centralized Scheduled Job Identity Management using IdentityPlugin #394
Centralized Scheduled Job Identity Management using IdentityPlugin #394
Conversation
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
@saratvemulapalli @joshpalis @vibrantvarun Would love to know your thoughts on this approach to securing scheduled jobs. |
List.of( | ||
new NamedWriteableRegistry.Entry(AuthToken.class, BasicAuthToken.NAME, BasicAuthToken::new), | ||
new NamedWriteableRegistry.Entry(AuthToken.class, BearerAuthToken.NAME, BearerAuthToken::new) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of implementing this in the JobScheduler plugin could we pull in a static list defined in Core that knows what all of these tokens are so as new tokens are added it automatically supports them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea. Putting this in core.
accessToken | ||
); | ||
} else { | ||
// invoke job runner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment should be deduped and moved down or just erased
this.accessToken = null; | ||
} | ||
|
||
public JobExecutionContext( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we keep a single constructor since auth token can be passed in as null, or if you'd prefer Optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -186,6 +197,82 @@ public void beforeClose() { | |||
this.fullSweepExecutor.shutdown(); | |||
} | |||
|
|||
@Override | |||
public Engine.Index preIndex(ShardId shardId, Engine.Index operation) { | |||
if (JobSchedulerPlugin.GuiceHolder.getIndicesService() != null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you invert the condition so that we exit early if any of these are not true? Makes it easer to read
&& JobSchedulerPlugin.GuiceHolder.getIdentityService().getScheduledJobIdentityManager() != null) { | ||
JobSchedulerPlugin.GuiceHolder.getIdentityService() | ||
.getScheduledJobIdentityManager() | ||
.deleteUserDetails(delete.id(), shardId.getIndexName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if the deleteUserDetails completes, but the deschedule fails?
ParsedDocument parsedDoc = operation.parsedDoc(); | ||
|
||
try { | ||
XContentParser parser = JsonXContent.jsonXContent.createParser( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this parsing logic be pulled out into another class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it could, this was a bit difficult to figure out at first but I learned a lot about XContent trying to figure this out
XContentType.JSON | ||
); | ||
|
||
ParsedDocument docMinusOperator = mapperService.documentMapper().parse(toParse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm having trouble reading this, I'm concerned about this, from this code the implication is that the operator information, including the token are accessible by the plugin, since the data is stored within the existing job index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad you pointed this out because this is where I want the most feedback.
- For plugins, plugins will stash their threadcontext before writing to their jobs index so at the time the job details are indexed the user info is not available in the threadcontext.
- For extensions, extensions will use their service account token when writing to their jobs index so the identity of the requester (service account) is different from the identity of what to associate for the scheduled job.
This is where this convention comes into play. In this model, when writing the job details document to the plugin/extension's job index they would include a field called operator
to tell the identity system the identity to store alongside the job details. This parses that information from the document. Using the preIndex
hook of the IndexOperatorListener
and modifies the document before its stored in the plugin/extensions job index and removes it from the document. Instead, that information would be stored in the centralized scheduled job identity index owned by the security plugin.
See this comment in the companion Core PR for more details: opensearch-project/OpenSearch#7573 (comment)
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
@cwperks Are data flow diagrams / trust boundary diagrams associated with this design? |
Signed-off-by: Craig Perkins <cwperx@amazon.com>
In this diagram the authz information is being stored in a different location. While its slightly more secure to hold this information in a different spot that has limited access, it doesn't expand how AuthN/AuthZ information can be handled. What would it take to augment this design to support external refresh? By allowing assigning jobs to operate as principals other than the current principal this would be a huge upgrade to the manageability of these jobs. |
@peternied The When centralizing the identities associated with scheduled jobs in an index owned by the security plugin, its possible to create APIs to manage identities associated with the jobs. Edit: This design covers how an access token could be issued on job invocation, but if the access token is not issued for a long enough window then an extension may still need to ask for a new one. In that case, the extension would request a new access token by calling a JS API to refresh their token provided a jobId and jobIndex. plugins wouldn't require a refresh because the threadcontext headers/transient headers do not expire for the life of the thread |
If an access token expires while a job is still running, then I am thinking about providing a Job Scheduler API that a Job Runner on an Extension can call to refresh their access token. The API would consume the service account token of the extension, jobId and jobIndex. On the handler of the API, the Job Scheduler would verify that the job 1) exists, 2) is owned by the extension and 3) is currently running. If those conditions are met than it can issue a new access token for a job in the middle of execution. |
@peternied To capture the problems with existing problems of storing authz outside of the security index I filed a campaign issue here: opensearch-project/security#2846 IMO I would love to solve the problems around dangling references and come up with a strategic way of fetching most up-to-date authz at job execution time going forward but I think its unrealistic to expect that that would be ready for the targeted experimental release of 2.9.0. I think that this design of centralizing the identities so that they are not stored in indices owned by extensions is a good move forward for extensions security and does create a clearer path to removing dangling references to static user authz. It will include the authz initially (same as plugins do now), but in this model the security plugin will have the references in a single place which can then make it possible to update the information on changes to a user or deletion of a role by also updating this index storing the identities associated with scheduled jobs. |
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Closing for now. Will re-open at a later date with updates. |
Description
Companion PR in core: opensearch-project/OpenSearch#7573
Companion PR in security repo: https://github.com/opensearch-project/security
This PR uses the IdentityService from core to invoke the ScheduledJobIdentityManager if an implementation is provided via the IdentityService. The main idea is to centralize the identities that jobs are associated with in a single secure index owned by the security plugin. The current de-centralized management where plugins store identities of users associated with a job and inject the user's roles into the thread context at job execution is not a viable solution for an ecosystem with 3rd-party code where extensions cannot be implicitly trusted to inject the proper roles for privilege evaluation. The current concept of roles injection will not have an analogy in the extensions ecosystem.
In its place I am proposing a solution to centralize the management of identities associated with scheduled jobs and then call on the security plugin to perform the roles injection (in case of plugins) or issue an access token (in case of extensions). This is a more secure system where the identity cannot be overwritten by a plugin/extension, but management could be performed by a super admin or UX could be developed in the security-dashboards-plugin repo to change identities associated with jobs by modifying the index that identities associated with scheduled jobs are stored in.
This PR takes advantage of the fact that
JobSweeper
is anIndexOperationListener
and hooks intopreIndex
andpostDelete
to create and delete entries in the centralized scheduled jobs identity index respectively.Job Creation
In the security plugin, the implementation for scheduled job identity management introduces a convention for a plugin/extension to supply the identity they would like to associate with a scheduled job on job creation inside the job details document being indexed. Many plugins currently store jobs details and associated identities in system indices owned by the plugins. In order for a plugin to write to its system index it will stash the thread context and then index a document. Because of the thread context stashing, when indexing the document containing jobs details in the plugin's jobs index, there will be no User in the threadcontext during the time
preIndex
is called inJobSweeper
. To solve for this issue, there is a new convention where the plugin can supply an operator field in the document.i.e.
Inside of
preIndex
the security plugin will read theoperator
field from the document and modify the document prior to being indexed to remove theoperator
being stored in the.hello-world-jobs
index and instead store it in the.opendistro-security-scheduled-job-identity
where there is referential integrity with all plugin jobs indices and the compound key isjob_index
andjob_id
.See below for an example entry in the
.opendistro-security-scheduled-job-identity
index:After the operator is removed from the document the plugin/extension is trying to index the resulting document stored in the jobs details index is:
Job Execution
When the job scheduler determines that a job is due to be run it will invoke the
runJob
method of the Job Runner. TherunJob
method consumesJobExecutionContext
and in this PR it modifies this context for Job Scheduler to request an access token from the identity system and pass it to an extension as part of the execution context. The AuthToken is a NamedWriteable meaning that it is serializable and can be securely passed via transport to the extension running the job. The extension (using the SDK) will extract the access token from that context and pass it along with REST Requests that the job would perform on behalf of the user associated with the job.Job Deletion/De-Scheduled
When a job is deleted from the plugin/extension's job index, Job Scheduler will also hook into the event utilizing
postDelete
or theIndexOperationListener
. In this method, thejobId
andjobIndex
are used call on the identity system to locate and delete the identity associated with the scheduled job in the centralized scheduled job identity index. I am currently looking into ways to ensure that this operation is atomic.Refresh Access Token During Job Execution
It is possible that a job takes longer to run than the access token is valid for. In these circumstances a mechanism will be developed for a job runner to request a new access token. To do this, the job runner will call on a Job Scheduler REST API (RestRefreshTokenAction) using the service account token of the extension along with the
jobId
andjobIndex
of the currently executing job.The handler of this REST Request will verify:
jobId
andjobIndex
and if so issue a new access tokenIssues Resolved
Related to: opensearch-project/security#2528
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.