Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Move field id off of schema #2336

Merged
merged 12 commits into from
Feb 28, 2024

Conversation

AndrewSisley
Copy link
Contributor

Relevant issue(s)

Resolves #2334

Description

Moves field id off of schema and onto collection.

It is now a local property and does not need to be the same across multiple nodes. The new Fields property on CollectionDescription will be where other local field stuff is move too from the schema (relation_name, descriptions, secondary relation fields).

This change is a prerequisite of allowing fields to be deleted and schema branching (see #2333 for more info on that if currious)

@AndrewSisley AndrewSisley added bug Something isn't working feature New feature or request area/schema Related to the schema system labels Feb 21, 2024
@AndrewSisley AndrewSisley added this to the DefraDB v0.10 milestone Feb 21, 2024
@AndrewSisley AndrewSisley requested a review from a team February 21, 2024 17:39
@AndrewSisley AndrewSisley self-assigned this Feb 21, 2024
@AndrewSisley AndrewSisley changed the title fix: Move field id off of schema fix: Move field id off of schema Feb 21, 2024
Copy link

codecov bot commented Feb 21, 2024

Codecov Report

Attention: Patch coverage is 94.58333% with 13 lines in your changes are missing coverage. Please review.

Project coverage is 74.91%. Comparing base (9534113) to head (1b70a23).

❗ Current head 1b70a23 differs from pull request most recent head c9bb80b. Consider uploading reports for the commit c9bb80b to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2336      +/-   ##
===========================================
+ Coverage    74.73%   74.91%   +0.18%     
===========================================
  Files          256      257       +1     
  Lines        25254    25314      +60     
===========================================
+ Hits         18872    18962      +90     
+ Misses        5093     5066      -27     
+ Partials      1289     1286       -3     
Flag Coverage Δ
all-tests 74.91% <94.58%> (+0.18%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
client/document.go 70.64% <100.00%> (ø)
core/encoding.go 64.23% <100.00%> (ø)
db/base/collection_keys.go 84.62% <100.00%> (+4.62%) ⬆️
db/collection_get.go 72.22% <ø> (ø)
db/collection_index.go 90.91% <100.00%> (-0.08%) ⬇️
db/collection_update.go 71.28% <100.00%> (ø)
db/description/schema.go 61.75% <ø> (-1.21%) ⬇️
db/errors.go 61.09% <100.00%> (-1.17%) ⬇️
db/fetcher/encoded_doc.go 76.14% <100.00%> (ø)
db/fetcher/errors.go 54.55% <ø> (+18.18%) ⬆️
... and 18 more

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9534113...c9bb80b. Read the comment docs.

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments to resolve before I can approve :)

for _, field := range schema.Fields {
if field.ID == id {
return field, true
func (def CollectionDefinition) GetFieldByName(fieldName string) (FieldDescription, bool) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: It feels al little weird to have CollectionDefinition methods mixed in here.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I did like having all the GetFieldByName next to each other, but that's partly because the Descriptions are all jumbled up in this one file anyway, and in a way its odd that Definition isn't.

Will let this discussion sit for a while and you/anyone-else talk some more before I try and make a choice RE its location :)

  • Wait and see

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CollectionDefinition is actually defined in a poor place, at the top of client/collection.go - I'd rather not add a bunch of stuff in there. I think it should be moved to a definitions.go file along with FieldDefinition and the funcs they both host. Please let me know if you want that done in this PR, otherwise they might remain where they are for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved them all in this PR, in a separate commit. Is nicer and it might not get done if we don't do it now.

//
// It draws it's information from the [CollectionFieldDescription] on the [CollectionDescription],
// and the [SchemaFieldDescription] on the [SchemaDescription].
type FieldDescription struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: It really threw me off to see FieldDescription still in existance. It's an exact duplicate of SchemaFieldDescription with tha addition of the ID field. Every function that returns FieldDescription or uses it is a method of CollectionDescription or collection and it's odd that when FieldDescription is returned you have to construct the struct with NewFieldDescription. A much cleaner approach would be to embed the SchemaFieldDescription into the CollectionFieldDescription.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It really threw me off to see FieldDescription still in existance.

That's not good - do you have a rough idea as to how much of that was caused my terrible setting of expectations in discord, vs the actual change?

Every function that returns FieldDescription or uses it is a method of CollectionDescription

Yes, that kind of felt like it was breaking expectations when I was consuming it. It kind of flows better at the moment IMO, but that is very possibly because of the limited number of properties on collection-field.

A much cleaner approach would be to embed the SchemaFieldDescription into the CollectionFieldDescription

I do very strongly disagree with this. It creates a large amount of ambiguity in the data model and is kind of reverting back to the starting point where collection had a pointer to schema on it. Amongst other things, it would be very very confusing when mutating collection field stuff as to what you could and couldn't mutate - it is bad with a pointer, and much worse when embedding.

CollectionFieldDescription does not have a SchemaFieldDescription, and in the possibly quite near future, not all CollectionFieldDescription will even have a corresponding SchemaFieldDescription (think secondary fields in a relation).


I did originally have a todo here about moving this struct to the core package, but decided it was nice having the funcs that return it as public. Maybe this choice should be revisited, along with the fact that the functions returning it are hosted on CollectionFieldDescription (which feels uglier the longer I look at it, even though it is kind of convenient atm).

  • Discuss further, maybe move stuff

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not good - do you have a rough idea as to how much of that was caused my terrible setting of expectations in discord, vs the actual change?

I though FieldDescription would become SchemaFieldDescription and CollectionFieldDescription. Not have the two new struct plus FieldDescription.

I do very strongly disagree with this. It creates a large amount of ambiguity in the data model

It feels very ambiguous the way it is now in my opinion.

is kind of reverting back to the starting point where collection had a pointer to schema on it

I disagree. Everywhere FieldDescription is used, both collection and schema description are needed to build the struct. It would be simple to remove the confusion and the have the same functionality you have now with something like this

type CollectionFieldDescription struct {
  Name string
  ID FieldID

  schemaField immutable.Option[SchemaFieldDescription] `json:"-"`

}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. Everywhere FieldDescription is used, both collection and schema description are needed to build the struct. It would be simple to remove the confusion and the have the same functionality you have now with something like this

Only at the moment, because we have only moved one property.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell it would only get worse with more fields in the CollectionFieldDescription.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have pushed changes that I am happy with. I realized part way through that (what was) FieldDescription is basically what CollectionDefinition is, but at a field level. So I ended up just cleaning up the GetFieldByName funcs and renaming it.

CollectionDefinition will eventually suffer from the same (and worse) ambiguity that FieldDescription.Name does when embedding, however at the moment we were able to get away with it because there is currently no overlap in properties. This is almost certainly going to change, and CollectionDefinition will need to be changed for that (redeclaration of it's components in a way that eliminates ambiguity over what should be used, for example a single Fields []FieldDefinition property, containing a merged set of collection and schema fields).

Copy link
Contributor Author

@AndrewSisley AndrewSisley Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and that problem is moderately likely to happen with a few other props in the future

To expand on this, I see the following properties likely to clash, and they may have different values on schema vs collection:

  • Name (if/when it becomes a label this will be much worse, where it's value may be different on schema vs collection)
  • Description, I definitely want this on Collection, and there is value on having one on Schema too (overridable by collection
  • RelationName, I want this on Collection, but we could also allow it to be defined on Schema in the case where users wish to include both halves of a relation on the global schema. In such a case I'm not sure we'd want to allow different values to be set, but we could also allow it to be overrideable like Description and Name.
  • Potentially ACP field level stuff too, I can imaging people might want to be able to set field level stuff on both the schema, and allow local collections to define additional rules on top of the global ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Islam (#2339 (comment)):

todo: I feel like it's wrong having almost identical structs with SchemaFieldDescription.

Moreover, you draw an analogy with collectin/schema description, but these 2 are embedded into CollectionDefinition.
But this one is not. It's inconsistent. I you think that CollectionFieldDescription is too small and has only 1 unique field ID we can make it:

type FieldDefinition struct {
    schemaField SchemaFieldDescription
    collectionFieldID int
}

And you won't need to copy&past all method below (IsObject, IsObjectArray, ...)

Copy link
Contributor Author

@AndrewSisley AndrewSisley Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, you draw an analogy with collectin/schema description, but these 2 are embedded into CollectionDefinition.

They shouldn't be long term, please see my earlier comment:

This is almost certainly going to change, and CollectionDefinition will need to be changed for that (redeclaration of it's components in a way that eliminates ambiguity over what should be used, for example a single Fields []FieldDefinition property, containing a merged set of collection and schema fields).


And you won't need to copy&past all method below (IsObject, IsObjectArray, ...)

I consider copy-paste here to be significantly cheaper than detangling inheritance hierarchies, especially coupled with how embedding structs in Go works (is half flattened, but not quite, so the end interface is not the same as declaring the properties in place). And as mentioned further up, these property sets atm are very similar because only one has been moved, they will diverge more so in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you won't need to copy&past all method below (IsObject, IsObjectArray, ...)

Discussion point bypassed by Islam's suggestion of moving these functions onto FieldKind.

db/collection.go Show resolved Hide resolved
planner/mapper/mapper.go Show resolved Hide resolved
client/descriptions.go Outdated Show resolved Hide resolved
Comment on lines 450 to 374
// IsObject returns true if this field is an object type.
func (f SchemaFieldDescription) IsObject() bool {
return (f.Kind == FieldKind_FOREIGN_OBJECT) ||
(f.Kind == FieldKind_FOREIGN_OBJECT_ARRAY)
}

// IsObjectArray returns true if this field is an object array type.
func (f SchemaFieldDescription) IsObjectArray() bool {
return (f.Kind == FieldKind_FOREIGN_OBJECT_ARRAY)
}

// IsRelation returns true if this field is a relation.
func (f SchemaFieldDescription) IsRelation() bool {
return f.RelationName != ""
}

// IsArray returns true if this field is an array type which includes inline arrays as well
// as relation arrays.
func (f SchemaFieldDescription) IsArray() bool {
return f.Kind == FieldKind_BOOL_ARRAY ||
f.Kind == FieldKind_INT_ARRAY ||
f.Kind == FieldKind_FLOAT_ARRAY ||
f.Kind == FieldKind_STRING_ARRAY ||
f.Kind == FieldKind_FOREIGN_OBJECT_ARRAY ||
f.Kind == FieldKind_NILLABLE_BOOL_ARRAY ||
f.Kind == FieldKind_NILLABLE_INT_ARRAY ||
f.Kind == FieldKind_NILLABLE_FLOAT_ARRAY ||
f.Kind == FieldKind_NILLABLE_STRING_ARRAY
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: Another not ideal side effect of this is that we need to dupliucate these methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is quite cheap, I'm not worried about that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you can justify copy&paste by performance characteristics.

I'd say you either make one struct embed the other or create something like client.IsArrayKind(Kind)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cheap in terms of dev cost, performance is not relevant here

Copy link
Contributor Author

@AndrewSisley AndrewSisley Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

client.IsArrayKind(Kind) or Kind.IsArray() is a nice thought though, can do the same for the others too.

  • Move Kind related funcs

//
// It is to [CollectionFieldDescription] and [SchemaFieldDescription] what [CollectionDefinition]
// is to [CollectionDescription] and [SchemaDescription].
type FieldDefinition struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Islam:

suggestion: would be great to have some information on why it exists and how should it be used.
Especially now that we have 3 different field descriptions/definitions

Copy link
Contributor Author

@AndrewSisley AndrewSisley Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add something (and maybe to CollectionDefinition too), thanks Islam :)

  • expand FieldDefinition docs

client/index.go Outdated
}
if fieldsMap[field.Name] {
// If the FieldDescription has already been added to the result do not add it a second time
// this can happen if a field if referenced by multiple indexes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Islam (#2339 (comment)):

typo: "if a field referenced"

Copy link
Contributor Author

@AndrewSisley AndrewSisley Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will sort out, thanks :)

  • fix typo

@@ -267,7 +267,7 @@ func resolveAggregates(
inputFields []Requestable,
mapping *core.DocumentMapping,
collectionName string,
schema client.SchemaDescription,
schema client.CollectionDefinition,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Islam (#2339 (comment)):

todo: rename the param to colDef or something similar please

Copy link
Contributor Author

@AndrewSisley AndrewSisley Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, thanks for spotting :)

  • Rename param

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say I'm still struggling with the FieldDefinition but the latest changes to the PR make it overall much better than it was initially. I'll give you the benefit of the doubt as to the future benefits that FieldDefinition will bring.

Approving with a few minor comments to resolve before merge.

client/descriptions.go Outdated Show resolved Hide resolved
client/descriptions.go Outdated Show resolved Hide resolved
client/descriptions.go Outdated Show resolved Hide resolved
Copy link
Contributor

@islamaliev islamaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. But as with Fred, this FieldDescription makes me a bit uncomfortable.
Approving with the promise that it will get better in the near future.


// IsArray returns true if this FieldKind is an array type which includes inline arrays as well
// as relation arrays.
func (f FieldKind) IsArray() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: looks much better now

@AndrewSisley AndrewSisley merged commit db75564 into sourcenetwork:develop Feb 28, 2024
28 of 29 checks passed
@AndrewSisley AndrewSisley deleted the 2334-field-id-mv branch February 28, 2024 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/schema Related to the schema system bug Something isn't working feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove FieldDescription.ID from Schema
3 participants