bulkGet saved objects across spaces #109967
These unit tests were not using mocks correctly, so they passed when they should not have:
* '#find returns empty result if user is unauthorized in any space'
* '#openPointInTimeForType throws error if if user is unauthorized in any space'

The previous refactor fixed the mocks, which caused these two tests to start failing. In this commit, I have fixed the code so the tests pass as written.
Did not look at the implementation in depth for now, but the API looks good.
1. Removed the previous changes that were made to pass the unit tests and changed the tests instead. The tests were meant to assert the "403 Forbidden" error.
2. Updated bulkGet to replace `'*'` with an empty namespaces array when a user is not authorized to access any spaces. This is primarily so that the API is consistent (the Security plugin will still check that they are authorized to access the type in the active space and return a more specific 403 error if not).
At the repository level, if a user tries to bulkGet an isolated object in multiple spaces (explicitly defined, or '*'), they will get a 400 Bad Request error. However, in the Spaces SOC wrapper we deconstruct the '*' identifier and replace it with all available spaces. This causes a problem when the user has access to only one space: the API response would be inconsistent, because the repository would treat this as a perfectly valid request and attempt to fetch the object. So the Spaces SOC wrapper now modifies the results based on what was originally requested, leaving any 403 errors from the Security SOC wrapper intact.
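To illustrate the inconsistency described above, here is a minimal sketch of wildcard expansion. It is not the wrapper's actual code: `ObjectToGet`, `expandNamespaces`, and `requestedMultipleSpaces` are hypothetical names introduced only for this example.

```typescript
const ALL_SPACES_ID = '*';

// Hypothetical shape of one entry in a bulkGet request.
interface ObjectToGet {
  type: string;
  id: string;
  namespaces?: string[];
}

// Replace the '*' wildcard with the spaces the user can actually access.
// An empty list means the user is not authorized in any space.
function expandNamespaces(object: ObjectToGet, availableSpaces: string[]): ObjectToGet {
  if (!object.namespaces || !object.namespaces.includes(ALL_SPACES_ID)) {
    return object;
  }
  return { ...object, namespaces: [...availableSpaces] };
}

// True when the request, as given, asks for more than one space. For a
// space-isolated type, the repository would answer that with a 400.
function requestedMultipleSpaces(object: ObjectToGet): boolean {
  const namespaces = object.namespaces ?? [];
  return namespaces.includes(ALL_SPACES_ID) || namespaces.length > 1;
}
```

For a user who can only access `space-1`, a request for `['*']` expands to `['space-1']`, which no longer looks like a multi-space request. That is why the check has to be made against the original request rather than the expanded one.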
Author's notes for reviewers.
Note: I looked at the EncryptedSavedObjects SOC wrapper and it doesn't need any changes, since it just takes the returned objects and decrypts the attributes accordingly.
- const allTypes = normalTypes.concat(hiddenType);
+ const allTypes = [...normalTypes, ...crossNamespace, ...hiddenType];
This change is unrelated to the rest of this PR, but as I was looking at these test cases I noticed that the `crossNamespace` test cases were not included in the `allTypes` superset. That meant they were getting skipped for the superuser.
  return availableSpacesPromise;
};

const expectedResults = await Promise.all(
- const namespaces = await this.getSearchableSpaces(options.namespaces);
+ let namespaces: string[];
+ try {
+   namespaces = await this.getSearchableSpaces(options.namespaces);
+ } catch (err) {
+   if (Boom.isBoom(err) && err.output.payload.statusCode === 403) {
+     // throw bad request since the user is unauthorized in any space
+     throw SavedObjectsErrorHelpers.createBadRequestError();
+   }
+   throw err;
+ }
The unit tests for this (and for `find`) were not working correctly (see e6588b7). I fixed the tests and added this bit to match the behavior to the unit tests.
@elasticmachine merge upstream
Overall LGTM. A few remarks and questions
{ ...CASES.MULTI_NAMESPACE_ISOLATED_ONLY_SPACE_1, namespaces: [SPACE_1_ID] }, // second try searches for it in a single other space, which is valid
{ ...CASES.MULTI_NAMESPACE_DEFAULT_AND_SPACE_1, namespaces: [SPACE_2_ID], ...fail404() },
I was just wondering: our `bulkGet` security suite only performs tests where it bulkGets a single object, right 😅?
All of the `bulk*` SO test suites behave as follows:
- For non-forbidden cases (successes and object-specific errors), batch them into a single request
- For forbidden cases (where the entire request fails with a 403), test each case individually

Note that this is not automated in any way; it is controlled by the `singleRequest` option in each test definition.
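As a rough sketch of how such a flag could drive batching (the `TestCase` and `TestDefinition` shapes and the `toRequests` helper are assumptions made for illustration, not the test suite's real types):

```typescript
// Hypothetical, simplified shapes; the real test definitions carry more fields.
interface TestCase {
  id: string;
}

interface TestDefinition {
  title: string;
  cases: TestCase[];
  // true: send every case in one bulk request; false: one request per case.
  singleRequest: boolean;
}

// Turn a test definition into the list of request payloads to send.
function toRequests(definition: TestDefinition): TestCase[][] {
  return definition.singleRequest
    ? [definition.cases]
    : definition.cases.map((testCase) => [testCase]);
}
```

With this shape, a definition made up of forbidden cases would set `singleRequest: false`, so that each expected 403 is asserted on its own request.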
  return true;
}

const namespacesToCheck = new Set(namespaces);
NIT: I don't think there's any need to convert to a set here? Can't we just work with the initial array?
Yeah, had the same thought. The only corner case I can imagine where it'd be beneficial to have a `Set` is if we have large `existingNamespaces` and `namespaces` lists that intersect only at the end of these lists (and bulkGet is large enough to have a significant compound effect), but it probably doesn't make sense to cater for that.
Having a set here will not completely protect us from a malicious actor anyway; they can just do something like this for every ID:
const namespaces = [...Array.from({ length: 1000000 }).map(() => Math.random().toString()), 'valid-ns']
The purpose of the Set here is to reduce the time complexity: this takes us down from O(n^2) to O(n). This is because line 179 below loops through `existingNamespaces` and checks to see if any existing space is present in `namespacesToCheck`.
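A minimal sketch of the complexity argument, assuming a simplified overlap check (`hasOverlap` is a hypothetical helper, not the function under review):

```typescript
// Returns true if any existing namespace is also one of the requested ones.
// Building the Set costs O(m); each lookup is O(1) on average, so the whole
// check is O(n + m) instead of the O(n * m) of a nested array scan.
function hasOverlap(existingNamespaces: string[], namespaces: string[]): boolean {
  const namespacesToCheck = new Set(namespaces);
  return existingNamespaces.some((ns) => namespacesToCheck.has(ns));
}
```

For the short lists typical of a single object the difference is unlikely to be measurable, which is the point made in the NIT above; the Set mainly bounds the worst case.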
// documents with the correct namespace prefix. We may revisit this in the future.
const doc1 = createRawDoc(SINGLE_NAMESPACE_TYPE, { namespace: 'some-space' }); // the namespace field is ignored
const doc2 = createRawDoc(SINGLE_NAMESPACE_TYPE, { namespaces: ['some-space'] }); // the namespaces field is ignored
expect(rawDocExistsInNamespaces(registry, doc1, [])).toBe(true);
Just a remark: We already talked about it in a similar function from a previous PR, but I still don't like the way single-NS docs are short-circuiting these functions, as it could lead to errors if a developer decides to use it elsewhere than in a part of the code where we're already assured that the fetched single-NS objects are of the correct requested space.
I'll one-up you, I don't like single-NS docs at all 😄
await this.legacyEnsureAuthorized(
  this.getUniqueObjectTypes(objects),
  'bulk_get',
  options.namespace,
  namespaces,
  {
    args,
  }
TIL: I thought we were doing per-object security checks and returning per-object unauthorized errors (as we do for 404)
Nope! If the user is unauthorized for part of the request, the entire request fails 😄
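The all-or-nothing behavior could be sketched roughly as follows. This is an illustration under assumptions, not the Security wrapper's code: `authorizeBulkGet` and `AuthorizationResult` are hypothetical, and real authorization also takes spaces into account.

```typescript
interface ObjectToGet {
  type: string;
  id: string;
}

// Hypothetical result shape for this sketch only.
interface AuthorizationResult {
  authorized: boolean;
  statusCode: number;
  unauthorizedTypes: string[];
}

// Check the unique object types once for the whole request. A single
// unauthorized type fails everything with a 403; there are no per-object
// "unauthorized" results mixed in with successful ones.
function authorizeBulkGet(
  objects: ObjectToGet[],
  authorizedTypes: ReadonlySet<string>
): AuthorizationResult {
  const uniqueTypes = Array.from(new Set(objects.map((object) => object.type)));
  const unauthorizedTypes = uniqueTypes.filter((type) => !authorizedTypes.has(type));
  return unauthorizedTypes.length === 0
    ? { authorized: true, statusCode: 200, unauthorizedTypes }
    : { authorized: false, statusCode: 403, unauthorizedTypes };
}
```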
let availableSpacesPromise: Promise<string[]> | undefined;
const getAvailableSpaces = async () => {
  if (!availableSpacesPromise) {
    availableSpacesPromise = this.getSearchableSpaces([ALL_SPACES_ID]).catch((err) => {
      if (Boom.isBoom(err) && err.output.payload.statusCode === 403) {
Can't we just do
const getAvailableSpaces = async () => { ... }
const availableSpaces = await getAvailableSpaces();
instead of using `Promise.all` around the `expectedResults`? The current implementation seems more complex than necessary.
Or something like this to block on this only when necessary:
const availableSpaces = objects.some((object) => object.namespaces?.includes(ALL_SPACES_ID))
  ? await this.getSearchableSpaces([ALL_SPACES_ID]).catch((err) => {
      // the user doesn't have access to any spaces
      if (Boom.isBoom(err) && err.output.payload.statusCode === 403) {
        return [];
      }
      throw err;
    })
  : [];
Yeah the idea here was not to fetch all available spaces if we didn't need to, since that is a potentially expensive operation.
We could go with @azasypkin's suggestion, but it effectively means we loop through all objects' `namespaces` fields twice in the worst-case scenario. I'm on the fence about changing it.
> We could go through @azasypkin's suggestion but it effectively means we loop through all objects' namespaces fields twice in the worst case scenario.

My assumption was that the time it takes to iterate through the objects in the request is negligible compared to the request execution time, but that's just an unvalidated assumption. Feel free to pick whichever approach you feel is better!
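For reference, the pattern being weighed in this thread is a lazy, memoized promise: the expensive lookup runs only if something asks for it, and at most once. A minimal sketch, assuming a generic helper (`createLazyLoader` and the stub loader below are hypothetical, not the PR's code):

```typescript
// Wrap an async loader so that it runs on first use only; every later call
// gets the same cached promise back.
function createLazyLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = load();
    }
    return cached;
  };
}

// Stub standing in for the potentially expensive "all available spaces" lookup.
let fetchCount = 0;
const getAvailableSpaces = createLazyLoader(async () => {
  fetchCount += 1;
  return ['default', 'space-1'];
});
```

Callers that never see `'*'` never trigger the lookup, and concurrent callers share one in-flight request. Note that in this sketch a rejected promise is cached as well, so error handling would have to live inside the loader.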
if (isLeft(expectedResult) && actualResult?.error?.statusCode !== 403) {
  const { type, id } = expectedResult.value;
  return ({
    type,
    id,
    error: SavedObjectsErrorHelpers.createBadRequestError(
      '"namespaces" can only specify a single space when used with space-isolated types'
    ).output.payload,
  } as unknown) as SavedObject<T>;
}
Thinking out loud here, but I wonder if we would be able to perform this check before the underlying call to `client.bulkGet`? I think we have all the necessary information without performing the call.
In that case, we could potentially `bulkGet` only the objects that are not in that situation, and aggregate the results with the ones matching this case. May be premature optimization though.
The original implementation that I wrote used that approach: any object that we predicted should result in a bad request was skipped for the base client `bulkGet`.
The problem is that we really need to factor in the behavior of both the Secure SOC wrapper and the SOR. If the user is unauthorized to get a specific type, then the request would never reach the SOR-level validation. So that original implementation caused even more irregular API responses (and made the integration tests fail spectacularly).
However: I realized as I wrote this comment that the `&& actualResult?.error?.statusCode !== 403` bit is not necessary; if the user is unauthorized to fetch one specific object, the whole operation fails, so that code would never be reached anyway. So I'll take that bit out 😄
Code LGTM 👍
const getNamespaceId = (namespaces?: string[]) =>
  namespaces !== undefined ? SavedObjectsUtils.namespaceStringToId(namespaces[0]) : namespace;
Note: no need to change anything, I just wanted to point out that if the reader (okay, that was me) doesn't know the internal implementation of `generateRawId` by heart, they may be confused by the `SavedObjectsUtils.namespaceStringToId(namespaces[0])` for shareable objects, as it gives the impression that we search only in the very first namespace 🙂
Yea, I had to check the implementation too to be sure 😄
I added a comment here to clarify.
💚 Build Succeeded
💚 Backport successful
This backport PR will be merged automatically after passing CI.
Summary
Resolves #109197.