Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support entity fields subset of db fields #49

Merged
merged 6 commits into from
Jul 1, 2020

Conversation

wschurman
Copy link
Member

@wschurman wschurman commented Jun 21, 2020

Why

This is the other approach mentioned in #47 (comment): EntityDataManager and everything below it doesn't operate on the current entity's "fields". Rather an entity is a selection of fields exposed from the full set of fields loaded.

This PR adds the following concept: If an entity wants to only expose a subset of its database row's fields via getField and getAllFields, it can do so by specifying a new TSelectedFields type parameter. To make the change non-breaking, TSelectedFields defaults to all fields on TFields.

The underlying mutators and loaders still operate on the full set of fields, as not doing so creates an interesting case: Hypothetically, let's say that two entities reference the same table and also the same rows. Entity A has field selection a, b, c. Entity B has field selection a, c. Entities are cached by all fields from both entities.
Instance A of Entity A is created with a, b, c.
Instance B is loaded with ID of Instance A.
Instance B is mutated, changing field c to something else.
Now, all the cache entries of the underlying fields of B must be invalidated. If only cache entries for Entity B's fields were invalidated, then the following load would be inconsistent.
Instance A is loaded by field b. Because the cache was invalidated by all fields (including field b) this will be consistent.

How

  • Add type parameter to all entity-related classes that need it, then after it typechecks successfully add default = keyof TFields (to ensure no cases were missed).
  • Add a way to specify field selection at runtime to match compile time specification.
  • Add a new concept, EntityTableDataCoordinator, which is essentially what EntityCompanion used to be but shared amongst all entity companions that use the same table. This way the dataloaders and caches remain consistent.

Test Plan

Run new tests to ensure it works as expected.

@codecov
Copy link

codecov bot commented Jun 21, 2020

Codecov Report

Merging #49 into master will increase coverage by 0.11%.
The diff coverage is 97.05%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #49      +/-   ##
==========================================
+ Coverage   93.95%   94.07%   +0.11%     
==========================================
  Files          57       58       +1     
  Lines        1357     1384      +27     
  Branches      147      147              
==========================================
+ Hits         1275     1302      +27     
  Misses         81       81              
  Partials        1        1              
Flag Coverage Δ
#integration 94.07% <97.05%> (+0.11%) ⬆️
#unittest 94.07% <97.05%> (+0.11%) ⬆️
Impacted Files Coverage Δ
packages/entity/src/EnforcingEntityLoader.ts 100.00% <ø> (ø)
packages/entity/src/Entity.ts 88.00% <ø> (ø)
packages/entity/src/EntityErrors.ts 100.00% <ø> (ø)
packages/entity/src/EntityPrivacyPolicy.ts 96.61% <ø> (ø)
packages/entity/src/ViewerScopedEntityCompanion.ts 100.00% <ø> (ø)
.../entity/src/ViewerScopedEntityCompanionProvider.ts 100.00% <ø> (ø)
...ages/entity/src/ViewerScopedEntityLoaderFactory.ts 100.00% <ø> (ø)
...ges/entity/src/ViewerScopedEntityMutatorFactory.ts 100.00% <ø> (ø)
...ages/entity/src/internal/ReadThroughEntityCache.ts 95.45% <ø> (ø)
...s/entity/src/rules/AlwaysAllowPrivacyPolicyRule.ts 100.00% <ø> (ø)
... and 18 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 89cfb3d...1272324. Read the comment docs.

@ide
Copy link
Member

ide commented Jun 22, 2020

In general I think we should avoid compile-time only features like this. I often say, "Just because it works doesn't mean it's right," but if there is a simple way (like an any cast or a for-in loop`) to make an unintended feature work at runtime, it too often gets used.

One way we could address this is by specifying the selected fields (the exposed subset) statically in a data structure that's available at runtime as part of the entity configuration. The entity library would use this data structure for enforcement at runtime. And since it's statically defined, TypeScript would use it for static enforcement.

Continuing this train of thought, don't we already declare the fields (namely, the mapping from database fields to entity fields) in a static, runtime-available data structure that specifies what fields to load?

What I'm wondering/proposing is to invert this PR a bit and use this existing data structure to know what fields to expose on an entity instance (i.e. the subset). By default, this data structure also would specify what fields to fetch from the database (i.e. the superset).

Entities that share tables (e.g. robots & users) would be able to provide a second data structure that specifies the fields to fetch from the database, since they need a strict superset. For correctness, the RobotEntityConfiguration and UserEntityConfiguration would want to share this data structure so the loaders for both entities fetch the same fields. It would also be correct (but in some cases possibly inefficient) for there to be a way for this second data structure to say "SELECT *".

@wschurman
Copy link
Member Author

One way we could address this is by specifying the selected fields (the exposed subset) statically in a data structure that's available at runtime as part of the entity configuration. The entity library would use this data structure for enforcement at runtime. And since it's statically defined, TypeScript would use it for static enforcement.

I tried this and it turned out pretty well; I'll put up a commit here. It supports filtering the fields at runtime for the getField and getAllFields methods so that only the set of selected fields is accessible both at compile time and runtime.

What I'm wondering/proposing is to invert this PR a bit and use this existing data structure to know what fields to expose on an entity instance (i.e. the subset). By default, this data structure also would specify what fields to fetch from the database (i.e. the superset).

I tried this before this PR actually. The issue is that TypeScript doesn't have a great way of expressing subsets of types (as far as I know). It supports the inverse through extends, so what I tried was having two generic parameters, TFields, TDBFields extends TFields, but this ended up being a lot messier in the end from an API perspective, which is why I leaned more towards the field selection approach mentioned above.

@wschurman
Copy link
Member Author

Made a pretty large update to this. Adding the test for entities that can both reference the same row was critical and an excellent suggestion.

@ide
Copy link
Member

ide commented Jun 26, 2020

The general idea LGTM (I mostly looked at the tests) but I think the API could be more streamlined.

  • When writing an entity class, the first thing that comes to mind are the fields on the entity, not the entire set of fields it loads from the database. Figuring out a way for the entity fields to be the first generic parameter would reflect this (put the most important parameters first).

Untested draft of an idea:

type Example1EntityFields = {
  id: string;
  field_a: string;
};

type Example2EntityFields = {
  id: string;
  field_b: string;
};

// TS will make sure that shared fields ("id" in this example) have compatible types or else they get typed as "never"
type ExampleEntityDatabaseFields = Example1EntityFields & Example2EntityFields;

const sharedExampleEntityConfiguration = new EntityConfiguration<ExampleEntityDatabaseFields>({
  idField: 'id',
  tableName: 'entities',
  schema: {
    id: new UUIDField({ columnName: 'custom_id', cache: true }),
    field_a: new StringField({ columnName: 'field_a', cache: true }),
    field_b: new StringField({ columnName: 'field_b', cache: true }),
  },
});

class Example1Entity extends Entity<Example1EntityFields, string, ViewerContext, ExampleEntityDatabaseFields> { ... }

abstract class Entity<
  TFields extends Record<string, unknown>,
  TID,
  TViewerContext extends ViewerContext,
  TDatabaseFields extends TFields = TFields
> { }
  • The word "selected" is a bit confusing. It could be interpreted as "selected from the database."
  • TFields defines field types while TSelectedFields is just the names -- TSelectedFieldNames would help make this distinction clearer.
  • If two entities backed by the same table specify different TFields would that cause problems? If so, is there some lightweight way we could create a pit of success (API structure, static TS checks) that makes sure same-table entity classes specify the same TFields (the same EntityConfiguration instance, really).

@ide
Copy link
Member

ide commented Jun 26, 2020

The trick in the above proposal code is that T2 extends T1 means that T2 can have more fields than T1 but there are fewer possible instances/values for T2 than there are for T1.

In the PR, the entity field names are values of the types (type TFields = 'a' | 'b' | 'c'; type TSelectedFields = 'a' | b';). That is, TSelectedFields extends TFields.

In the proposal snippet, the entity field names are properties of the types (type TDatabaseFields = { a: any; b: any; c: any; }; type TEntityFields = { a: any; b: any; };). This way we can write TDatabaseFields extends TEntityFields.

@wschurman
Copy link
Member Author

wschurman commented Jun 29, 2020

  • When writing an entity class, the first thing that comes to mind are the fields on the entity, not the entire set of fields it loads from the database. Figuring out a way for the entity fields to be the first generic parameter would reflect this (put the most important parameters first).

In its current form, the default is that both sets of fields are the same and no extra field specification is necessary. There's only one set of fields and it only needs to be specified once. It's only when the user wants to specify two entities that use the same underlying set of fields that they need to specify the extra parameters.

TDatabaseFields extends TFields = TFields

I had actually tried this first (before doing the field key selection stuff). I'll try to articulate the reasons I ended up going with the key subset approach:

  • Most critically, we'd need a way to specify TFields at runtime to prevent an entity instance from being able to return any database field from getField (the same discussion we had in the first iteration of this PR about compile-time-only features). Because EntityConfiguration would now operate on TDatabaseFields (the schema would be shape TDatabaseFields), there would no longer be a place where the shape of TFields is specified at runtime, which would mean we'd need to specify a key subset like we do in the current iteration.
  • For some reason, TS isn't quite powerful enough to infer subset using the extends keyword. A simple example of this can be seen in this playground. Essentially, it struggles to know that TDatabaseFields extends TFields when assigning a field of type keyof TFields to an object of type TDatabaseFields. This manifests itself in the loader and mutator components which need to operate on TFields instead of TDatabaseFields.

I'll put up a stacked PR with my most recent attempt at this though just in case you have any insight. #51

@wschurman wschurman requested a review from ide June 30, 2020 16:39
Copy link
Member

@ide ide left a comment

@wschurman wschurman merged commit 4e40b2e into master Jul 1, 2020
@wschurman wschurman deleted the @wschurman/hoist branch July 1, 2020 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants