Change from using protos to structs for relationships, ONRs and RRs #2081

Merged: 28 commits merged into authzed:main from rel-structs on Oct 21, 2024

josephschorr
Member

This significantly reduces memory allocation in the datastore and dispatcher and should make other improvements easier down the road

Also changes to use a Go 1.23-style iterator for the relationship iterator rather than the custom one previously used

@github-actions bot added the labels area/api v1 (Affects the v1 API), area/datastore (Affects the storage system), area/dependencies (Affects dependencies), area/tooling (Affects the dev or user toolchain, e.g. tests, ci, build tools), and area/dispatch (Affects dispatching of requests) on Sep 30, 2024
Member

@jzelinskie left a comment

two minor comments while scrolling through

Comment on lines +44 to 57
+ err := set.AddConcreteSubject(tuple.MustParseONR("document:foo#viewer"))
  require.NoError(t, err)

- err = set.AddConcreteSubject(tuple.ParseONR("document:bar#viewer"))
+ err = set.AddConcreteSubject(tuple.MustParseONR("document:bar#viewer"))
  require.NoError(t, err)

- err = set.AddConcreteSubject(tuple.ParseONR("team:something#member"))
+ err = set.AddConcreteSubject(tuple.MustParseONR("team:something#member"))
  require.NoError(t, err)

- err = set.AddConcreteSubject(tuple.ParseONR("team:other#member"))
+ err = set.AddConcreteSubject(tuple.MustParseONR("team:other#member"))
  require.NoError(t, err)

- err = set.AddConcreteSubject(tuple.ParseONR("team:other#manager"))
+ err = set.AddConcreteSubject(tuple.MustParseONR("team:other#manager"))
  require.NoError(t, err)
Member

Easier to maintain over time if you write these with a loop:

for _, rel := range []string{
    "document:foo#viewer",
    "document:bar#viewer",
    "team:something#member",
    "team:other#member",
    "team:other#manager",
} {
    require.NoError(t, set.AddConcreteSubject(tuple.MustParseONR(rel)))
}

func MustIteratorBeClosed(iter *sliceRelationshipIterator) {
if !iter.closed {
panic("Tuple iterator garbage collected before Close() was called")
func NewSliceRelationshipIterator(rels []tuple.Relationship, order options.SortOrder) datastore.RelationshipIterator {
Member

I wonder if a generic relationship iterator constructor with ordering should live in the tuple package (it probably also needs a rename) instead of datastore.

Member Author

We're only really using the slice iterator now in testing

@github-actions github-actions bot added the area/CLI Affects the command line label Sep 30, 2024
@josephschorr josephschorr marked this pull request as ready for review September 30, 2024 20:10
@josephschorr josephschorr requested a review from a team September 30, 2024 20:10
tstirrat15 previously approved these changes Oct 1, 2024
Contributor

@tstirrat15 left a comment

LGTM - no blockers. I've got a bunch of questions but they're mostly understanding questions.

internal/datasets/subjectsetbytype.go (resolved)
@@ -1461,7 +1411,9 @@ func StrictReadModeTest(t *testing.T, ds datastore.Datastore) {
OptionalResourceType: "resource",
})
require.NoError(err)
it.Close()

_, err = datastore.IteratorToSlice(it)
Contributor

Is this basically asserting that the thing is an iterator?

Member Author

It's ensuring it can read everything

Comment on lines +278 to +280
for range tempIterator {
break
}
Contributor

What's the advantage of this idiom?

Member Author

It closes the iterator

Contributor

How is the iterator getting closed this way? It would be doing one iteration, unless there is some magic happening here. If that's the case, can you document it (e.g. with a pointer to the Go docs describing the behavior)?

Member Author

The loop invokes the iterator function, and the break stops it after a single item has been loaded. When the loop exits, yield returns false to the iterator function, which then returns; its defer closes the underlying iterator.

@@ -12,13 +12,13 @@ import (
// NOTE: This is designed solely for the developer API and testing and should *not* be used in any
// performance sensitive code.
type TrackingSubjectSet struct {
- setByType map[string]datasets.BaseSubjectSet[FoundSubject]
+ setByType map[tuple.RelationReference]datasets.BaseSubjectSet[FoundSubject]
Contributor

For my own curiosity - when you've got an object as the key in a map, is it using the pointer to the object as the hashmap key, or does the object itself have to be hashable? Are maps even hashmaps?

Member Author

If the key is a reference (pointer), the map uses the pointer value itself. Otherwise, it compares keys structurally. Here, it is comparing structurally.
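
A minimal sketch of the difference. The relRef type is a hypothetical stand-in for tuple.RelationReference, and its field names are assumptions. Go's built-in map is implemented as a hash table, and any comparable type can be a key; a struct is comparable when all of its fields are:

```go
package main

import "fmt"

// relRef is a hypothetical stand-in for tuple.RelationReference: every field
// is comparable, so the struct can be used directly as a map key.
type relRef struct {
	ObjectType string
	Relation   string
}

// lookups reports whether a second, separately constructed key finds the
// entry when the map is keyed by value and when it is keyed by pointer.
func lookups() (byValue, byPointer bool) {
	valueKeyed := map[relRef]int{{ObjectType: "document", Relation: "viewer"}: 1}
	_, byValue = valueKeyed[relRef{ObjectType: "document", Relation: "viewer"}]

	first := &relRef{ObjectType: "document", Relation: "viewer"}
	second := &relRef{ObjectType: "document", Relation: "viewer"}
	pointerKeyed := map[*relRef]int{first: 1}
	_, byPointer = pointerKeyed[second] // a different pointer, so a different key
	return byValue, byPointer
}

func main() {
	byValue, byPointer := lookups()
	fmt.Println(byValue, byPointer) // prints "true false"
}
```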

Comment on lines -395 to -397
spiceerrors.DebugAssert(func() bool {
return tuple.OnrEqualOrWildcard(tpl.Subject, crc.parentReq.Subject)
}, "somehow got invalid ONR for direct check matching")
Contributor

Did this check get moved or is it no longer necessary?

Member Author

I removed it because I didn't think it was necessary anymore

func Leaf(start *tuple.ObjectAndRelation, subjects ...*core.DirectSubject) *core.RelationTupleTreeNode {
var startONR *core.ObjectAndRelation
if start != nil {
startONR = start.ToCoreONR()
Contributor

I'm curious about this boundary - is this a task for future refactoring or is there a reason that this part of the codebase is still speaking in terms of the proto objects?

Member Author

It's generating protos for external consumption and our testing

func TestCanonicalBytes(t *testing.T) {
foundBytes := make(map[string]string)

for _, tc := range testCases {
Contributor

Where are these testCases defined?

Contributor

Suggested change
- for _, tc := range testCases {
+ // These test cases are defined in parsing_test.go
+ for _, tc := range testCases {

Relation string
}

const onrStructSize = 48 /* size of the struct itself */
Contributor

As in the number of bytes that an empty struct occupies?

Member Author

Yep; i.e. its size without any other data to which it points
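
A minimal sketch of what that constant measures. The onr type is a hypothetical three-string struct; its field names are assumptions modeled on the fields mentioned in this thread. unsafe.Sizeof reports only the struct's own footprint (here three string headers), not the bytes those strings point to:

```go
package main

import (
	"fmt"
	"unsafe"
)

// onr is a hypothetical struct shaped like the ObjectAndRelation discussed above.
type onr struct {
	ObjectType string
	ObjectID   string
	Relation   string
}

// sizes returns the size of an empty onr, the size of a populated onr, and the
// size of a single string header on the current platform.
func sizes() (empty, populated, oneString uintptr) {
	full := onr{ObjectType: "document", ObjectID: "a-much-longer-object-identifier", Relation: "viewer"}
	return unsafe.Sizeof(onr{}), unsafe.Sizeof(full), unsafe.Sizeof("")
}

func main() {
	empty, populated, oneString := sizes()
	// On 64-bit platforms a string header is 16 bytes, so both sizes are 48.
	fmt.Println(empty, populated, oneString)
}
```

Because the result is a property of the platform and compiler, a test that pins the constant will flag any drift between it and the real struct size.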

Comment on lines +129 to +131
UpdateOperationTouch UpdateOperation = iota
UpdateOperationCreate
UpdateOperationDelete
Contributor

What does it mean that there are two identifiers to the left of the assignment here?

Member Author

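In a parenthesized const ( ... ) block, Go lets you omit the type and expression on later lines; each such line implicitly repeats the previous one.

iota is a predeclared identifier that is 0 on the first line of the const block and increments by one on each line below.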

This reads as:

	UpdateOperationTouch UpdateOperation = 0
	UpdateOperationCreate UpdateOperation = 1
	UpdateOperationDelete UpdateOperation = 2
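
A small runnable sketch of the same declaration, showing that the elided lines keep both the type and the incrementing value. The type and constant names mirror the diff above; the underlying type int is an assumption:

```go
package main

import "fmt"

// UpdateOperation's underlying type is assumed to be int for this sketch.
type UpdateOperation int

const (
	UpdateOperationTouch  UpdateOperation = iota // iota is 0 here
	UpdateOperationCreate                        // repeats "UpdateOperation = iota"; iota is 1
	UpdateOperationDelete                        // iota is 2
)

func main() {
	// prints "0 1 2 main.UpdateOperation"
	fmt.Printf("%d %d %d %T\n", UpdateOperationTouch, UpdateOperationCreate, UpdateOperationDelete, UpdateOperationCreate)
}
```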

)

func TestONRStructSize(t *testing.T) {
size := int(unsafe.Sizeof(ObjectAndRelation{}))
Contributor

Curious - will this potentially depend on the Go runtime?

Member Author

yes, hence the test

@josephschorr
Member Author

Rebased

Contributor

@vroldanbet left a comment

Halfway through this, looking good so far

return s.AddSubject(subject, nil)
}

// AddSubject adds the specified subject to the set.
- func (s *SubjectByTypeSet) AddSubject(subject *core.ObjectAndRelation, caveat *core.ContextualizedCaveat) error {
- key := tuple.JoinRelRef(subject.Namespace, subject.Relation)
+ func (s *SubjectByTypeSet) AddSubject(subject tuple.ObjectAndRelation, caveat *core.ContextualizedCaveat) error {
Contributor

Why were the contextualized caveat and integrity bits left as protos? Will it be addressed in a follow-up?

Member Author

A few reasons:

  1. They aren't used nearly as often, so the space savings are minimal
  2. Context can be quite large, so copying it around on the stack seems unwise
  3. Context is often nil, so there won't be any overhead in the common case

Contributor

Thanks for the context. It just feels odd because it won't be clear to the reader why the codebase is left in this mixture of structs and protos, or what pattern should be followed as the codebase evolves. From the arguments laid out, context being quite large seems like something that would be equally problematic for the heap.

Member Author

Large context is better off on the heap: that ensures we're not copying it around. Granted, context is not usually very large, but either way, it's not used as much.
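
A minimal sketch of the trade-off being described. The caveat and relationship types and their fields are hypothetical, not the real tuple package definitions. Passing the struct by value copies its fields, but an optional caveat held behind a pointer is shared rather than copied, and is simply nil when absent:

```go
package main

import "fmt"

// caveat is a hypothetical stand-in for a contextualized caveat with
// potentially large context.
type caveat struct {
	Name    string
	Context map[string]any
}

// relationship is a hypothetical value type with an optional caveat pointer.
type relationship struct {
	Resource       string
	Subject        string
	OptionalCaveat *caveat // nil in the common, uncaveated case
}

// caveatOf receives the relationship by value; only the pointer is copied.
func caveatOf(r relationship) *caveat { return r.OptionalCaveat }

// sharesCaveat reports whether a by-value copy still points at the same caveat.
func sharesCaveat() bool {
	c := &caveat{Name: "ip_allowlist", Context: map[string]any{"cidr": "10.0.0.0/8"}}
	r := relationship{Resource: "document:foo#viewer", Subject: "user:tom", OptionalCaveat: c}
	return caveatOf(r) == c
}

// uncaveatedIsNil reports whether a relationship built without a caveat carries nil.
func uncaveatedIsNil() bool {
	r := relationship{Resource: "document:foo#viewer", Subject: "user:tom"}
	return caveatOf(r) == nil
}

func main() {
	fmt.Println(sharesCaveat(), uncaveatedIsNil()) // prints "true true"
}
```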

ObjectId: userID,
Relation: datastore.Ellipsis,
func docViewer(documentID, userID string) tuple.Relationship {
return tuple.Relationship{
Contributor

It would have been great if we had a fluent API to create struct values

Member Author

I can add it, but it runs the risk of having "partial state" Relationships or needing a builder struct. Thoughts?

Contributor

The risk is the same if you are creating a struct value directly. The struct builder will be on the stack, so the overhead is minimal, and the impact on readability would be great; we do this a lot in the codebase. At the very least we could add the helper for use in tests.

Member Author

Followup?
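
For reference, one possible shape for such a helper, as a sketch only. Every name here (relBuilder, Resource, Subject, Build, errPartial, the simplified Relationship layout, and the "..." ellipsis constant) is hypothetical and not part of this PR. A by-value builder with a validating Build step is one way to address the partial-state concern:

```go
package main

import (
	"errors"
	"fmt"
)

// ellipsis is assumed to be the placeholder relation for terminal subjects.
const ellipsis = "..."

// ObjectAndRelation and Relationship are simplified, hypothetical layouts.
type ObjectAndRelation struct {
	ObjectType string
	ObjectID   string
	Relation   string
}

type Relationship struct {
	Resource ObjectAndRelation
	Subject  ObjectAndRelation
}

// relBuilder is a small value type, so chaining copies it on the stack.
type relBuilder struct{ rel Relationship }

// errPartial is returned when Build is called before both sides are set.
var errPartial = errors.New("relationship is missing a resource or subject")

// Resource starts a builder with the resource side filled in.
func Resource(objectType, objectID, relation string) relBuilder {
	return relBuilder{rel: Relationship{Resource: ObjectAndRelation{objectType, objectID, relation}}}
}

// Subject sets a terminal subject and returns the updated builder copy.
func (b relBuilder) Subject(objectType, objectID string) relBuilder {
	b.rel.Subject = ObjectAndRelation{objectType, objectID, ellipsis}
	return b
}

// Build rejects partial state instead of returning a half-filled Relationship.
func (b relBuilder) Build() (Relationship, error) {
	if b.rel.Resource == (ObjectAndRelation{}) || b.rel.Subject == (ObjectAndRelation{}) {
		return Relationship{}, errPartial
	}
	return b.rel, nil
}

// docViewer mirrors the test helper in the diff above, written with the builder.
func docViewer(documentID, userID string) (Relationship, error) {
	return Resource("document", documentID, "viewer").Subject("user", userID).Build()
}

func main() {
	rel, err := docViewer("foo", "tom")
	fmt.Println(rel, err)
}
```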

@@ -75,7 +74,7 @@ type CreateRelationshipExistsError struct {
error

// Relationship is the relationship that caused the error. May be nil, depending on the datastore.
- Relationship *core.RelationTuple
+ Relationship *tuple.Relationship
Contributor

why was this guy left as a pointer?

Member Author

Because it can be nil; see comment above

Contributor

I see. Just for the sake of argument, you can achieve the same by comparing against a zero-value struct.

Member Author

I prefer the explicit nil to indicate it's not present
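
A minimal sketch contrasting the two options discussed. The types and field names below are hypothetical, not the actual error type from the diff. Both work; the pointer form makes "absent" a distinct state instead of overloading the zero value:

```go
package main

import "fmt"

// relationship is a hypothetical comparable value type.
type relationship struct {
	ResourceType string
	ResourceID   string
}

// existsErrorPtr signals absence with a nil pointer.
type existsErrorPtr struct {
	Relationship *relationship // may be nil, depending on the datastore
}

// existsErrorVal signals absence with the zero value.
type existsErrorVal struct {
	Relationship relationship
}

func describePtr(e existsErrorPtr) string {
	if e.Relationship == nil {
		return "relationship unknown"
	}
	return e.Relationship.ResourceType + ":" + e.Relationship.ResourceID
}

func describeVal(e existsErrorVal) string {
	if e.Relationship == (relationship{}) {
		return "relationship unknown"
	}
	return e.Relationship.ResourceType + ":" + e.Relationship.ResourceID
}

func main() {
	fmt.Println(describePtr(existsErrorPtr{}))
	fmt.Println(describeVal(existsErrorVal{Relationship: relationship{"document", "foo"}}))
}
```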

// NOTE: The ordering of these columns can affect query performance, be aware when changing.
columnsAndValues := map[options.SortOrder][]nameAndValue{
options.ByResource: {
{
- sqf.schema.colNamespace, cursor.ResourceAndRelation.Namespace,
+ sqf.schema.colNamespace, cursor.Resource.ObjectType,
Contributor

Nit: since you took the time to make the refactor, this could have been less verbose. ObjectType could have been Type and ObjectID could have been ID

Member Author

I explicitly put Object in there: I wanted it to read very clearly that it was the Type (and ID) of the object

Contributor

I think it's unnecessary. See how this reads: cursor.Resource.Type

It clearly states that it's the Type of the Resource. Most accesses need to go via Resource or Subject.

Anyway it was just a nit.

Member Author

I still prefer it being explicit, but I get it

internal/datastore/common/sql.go (outdated, resolved)
internal/graph/resourcesubjectsmap.go (resolved)
internal/services/v1/preconditions.go (outdated, resolved)
internal/services/v1/watch.go (outdated, resolved)
pkg/datastore/options/options.go (resolved)
pkg/proto/core/v1/util.go (outdated, resolved)
internal/datastore/proxy/observable.go (resolved)
internal/datastore/spanner/reader.go (resolved)
internal/graph/resourcesubjectsmap2.go (resolved)
pkg/datastore/datastore.go (resolved)
internal/testfixtures/validating.go (resolved)
internal/services/v1/experimental_test.go (outdated, resolved)
internal/services/v1/permissions.go (resolved)
vroldanbet previously approved these changes Oct 21, 2024
pkg/tuple/strings.go (outdated, resolved)
)

// ObjectAndRelation represents an object and its relation.
type ObjectAndRelation struct {
Contributor

makes me wonder if we are running fieldalignment linter

Member Author

I doubt it?

Member Author

Ran it locally and will address
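
For context, a minimal sketch of the kind of layout issue the fieldalignment analyzer reports. The padded and packed types are hypothetical and unrelated to ObjectAndRelation; they only show how field order can change a struct's size:

```go
package main

import (
	"fmt"
	"unsafe"
)

// padded interleaves small and large fields, so the compiler inserts padding
// to keep the int64 aligned.
type padded struct {
	a bool
	b int64
	c bool
}

// packed holds the same fields with the largest first, which removes most of
// the padding.
type packed struct {
	b int64
	a bool
	c bool
}

// structSizes returns the sizes of both layouts on the current platform.
func structSizes() (paddedSize, packedSize uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(packed{})
}

func main() {
	p, q := structSizes()
	// Typically 24 and 16 bytes on 64-bit platforms.
	fmt.Printf("padded=%d packed=%d\n", p, q)
}
```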

@josephschorr josephschorr added this pull request to the merge queue Oct 21, 2024
Merged via the queue into authzed:main with commit 3727476 Oct 21, 2024
22 checks passed
@josephschorr josephschorr deleted the rel-structs branch October 21, 2024 17:31
@github-actions github-actions bot locked and limited conversation to collaborators Oct 21, 2024
Labels: area/api v1 (Affects the v1 API), area/CLI (Affects the command line), area/datastore (Affects the storage system), area/dependencies (Affects dependencies), area/dispatch (Affects dispatching of requests), area/tooling (Affects the dev or user toolchain, e.g. tests, ci, build tools)
4 participants