feat: support entity fields subset of db fields #49

wschurman · 2020-06-21T21:38:43Z

Why

This is the other approach mentioned in #47 (comment): EntityDataManager and everything below it doesn't operate on the current entity's "fields". Rather an entity is a selection of fields exposed from the full set of fields loaded.

This PR adds the following concept: If an entity wants to only expose a subset of its database row's fields via getField and getAllFields, it can do so by specifying a new TSelectedFields type parameter. To make the change non-breaking, TSelectedFields defaults to all fields on TFields.

The underlying mutators and loaders still operate on the full set of fields, as not doing so creates an interesting case: Hypothetically, let's say that two entities reference the same table and also the same rows. Entity A has field selection a, b, c. Entity B has field selection a, c. Entities are cached by all fields from both entities.
Instance A of Entity A is created with a, b, c.
Instance B is loaded with ID of Instance A.
Instance B is mutated, changing field c to something else.
Now, all the cache entries of the underlying fields of B must be invalidated. If only cache entries for Entity B's fields were invalidated, then the following load would be inconsistent.
Instance A is loaded by field b. Because the cache was invalidated by all fields (including field b) this will be consistent.

How

Add type parameter to all entity-related classes that need it, then after it typechecks successfully add default = keyof TFields (to ensure no cases were missed).
Add a way to specify field selection at runtime to match compile time specification.
Add a new concept, EntityTableDataCoordinator, which is essentially what EntityCompanion used to be but shared amongst all entity companions that use the same table. This way the dataloaders and caches remain consistent.

Test Plan

Run new tests to ensure it works as expected.

codecov · 2020-06-21T21:41:25Z

Codecov Report

Merging #49 into master will increase coverage by 0.11%.
The diff coverage is 97.05%.

@@            Coverage Diff             @@
##           master      #49      +/-   ##
==========================================
+ Coverage   93.95%   94.07%   +0.11%     
==========================================
  Files          57       58       +1     
  Lines        1357     1384      +27     
  Branches      147      147              
==========================================
+ Hits         1275     1302      +27     
  Misses         81       81              
  Partials        1        1

Flag	Coverage Δ
#integration	`94.07% <97.05%> (+0.11%)`	⬆️
#unittest	`94.07% <97.05%> (+0.11%)`	⬆️

Impacted Files	Coverage Δ
packages/entity/src/EnforcingEntityLoader.ts	`100.00% <ø> (ø)`
packages/entity/src/Entity.ts	`88.00% <ø> (ø)`
packages/entity/src/EntityErrors.ts	`100.00% <ø> (ø)`
packages/entity/src/EntityPrivacyPolicy.ts	`96.61% <ø> (ø)`
packages/entity/src/ViewerScopedEntityCompanion.ts	`100.00% <ø> (ø)`
.../entity/src/ViewerScopedEntityCompanionProvider.ts	`100.00% <ø> (ø)`
...ages/entity/src/ViewerScopedEntityLoaderFactory.ts	`100.00% <ø> (ø)`
...ges/entity/src/ViewerScopedEntityMutatorFactory.ts	`100.00% <ø> (ø)`
...ages/entity/src/internal/ReadThroughEntityCache.ts	`95.45% <ø> (ø)`
...s/entity/src/rules/AlwaysAllowPrivacyPolicyRule.ts	`100.00% <ø> (ø)`
... and 18 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 89cfb3d...1272324. Read the comment docs.

ide · 2020-06-22T20:46:33Z

In general I think we should avoid compile-time only features like this. I often say, "Just because it works doesn't mean it's right," but if there is a simple way (like an any cast or a for-in loop`) to make an unintended feature work at runtime, it too often gets used.

One way we could address this is by specifying the selected fields (the exposed subset) statically in a data structure that's available at runtime as part of the entity configuration. The entity library would use this data structure for enforcement at runtime. And since it's statically defined, TypeScript would use it for static enforcement.

Continuing this train of thought, don't we already declare the fields (namely, the mapping from database fields to entity fields) in a static, runtime-available data structure that specifies what fields to load?

What I'm wondering/proposing is to invert this PR a bit and use this existing data structure to know what fields to expose on an entity instance (i.e. the subset). By default, this data structure also would specify what fields to fetch from the database (i.e. the superset).

Entities that share tables (e.g. robots & users) would be able to provide a second data structure that specifies the fields to fetch from the database, since they need a strict superset. For correctness, the RobotEntityConfiguration and UserEntityConfiguration would want to share this data structure so the loaders for both entities fetch the same fields. It would also be correct (but in some cases possibly inefficient) for there to be a way for this second data structure to say "SELECT *".

wschurman · 2020-06-23T17:35:59Z

One way we could address this is by specifying the selected fields (the exposed subset) statically in a data structure that's available at runtime as part of the entity configuration. The entity library would use this data structure for enforcement at runtime. And since it's statically defined, TypeScript would use it for static enforcement.

I tried this and it turned out pretty well; I'll put up a commit here. It supports filtering the fields at runtime for the getField and getAllFields methods so that only the set of selected fields is accessible both at compile time and runtime.

What I'm wondering/proposing is to invert this PR a bit and use this existing data structure to know what fields to expose on an entity instance (i.e. the subset). By default, this data structure also would specify what fields to fetch from the database (i.e. the superset).

I tried this before this PR actually. The issue is that TypeScript doesn't have a great way of expressing subsets of types (as far as I know). It supports the inverse through extends, so what I tried was having two generic parameters, TFields, TDBFields extends TFields, but this ended up being a lot messier in the end from an API perspective, which is why I leaned more towards the field selection approach mentioned above.

wschurman · 2020-06-25T16:41:57Z

Made a pretty large update to this. Adding the test for entities that can both reference the same row was critical and an excellent suggestion.

ide · 2020-06-26T04:21:52Z

The general idea LGTM (I mostly looked at the tests) but I think the API could be more streamlined.

When writing an entity class, the first thing that comes to mind are the fields on the entity, not the entire set of fields it loads from the database. Figuring out a way for the entity fields to be the first generic parameter would reflect this (put the most important parameters first).

Untested draft of an idea:

type Example1EntityFields = {
  id: string;
  field_a: string;
};

type Example2EntityFields = {
  id: string;
  field_b: string;
};

// TS will make sure that shared fields ("id" in this example) have compatible types or else they get typed as "never"
type ExampleEntityDatabaseFields = Example1EntityFields & Example2EntityFields;

const sharedExampleEntityConfiguration = new EntityConfiguration<ExampleEntityDatabaseFields>({
  idField: 'id',
  tableName: 'entities',
  schema: {
    id: new UUIDField({ columnName: 'custom_id', cache: true }),
    field_a: new StringField({ columnName: 'field_a', cache: true }),
    field_b: new StringField({ columnName: 'field_b', cache: true }),
  },
});

class Example1Entity extends Entity<Example1EntityFields, string, ViewerContext, ExampleEntityDatabaseFields> { ... }

abstract class Entity<
  TFields extends Record<string, unknown>,
  TID,
  TViewerContext extends ViewerContext,
  TDatabaseFields extends TFields = TFields
> { }

The word "selected" is a bit confusing. It could be interpreted as "selected from the database."
TFields defines field types while TSelectedFields is just the names -- TSelectedFieldNames would help make this distinction clearer.
If two entities backed by the same table specify different TFields would that cause problems? If so, is there some lightweight way we could create a pit of success (API structure, static TS checks) that makes sure same-table entity classes specify the same TFields (the same EntityConfiguration instance, really).

ide · 2020-06-26T04:29:14Z

The trick in the above proposal code is that T2 extends T1 means that T2 can have more fields than T1 but there are fewer possible instances/values for T2 than there are for T1.

In the PR, the entity field names are values of the types (type TFields = 'a' | 'b' | 'c'; type TSelectedFields = 'a' | b';). That is, TSelectedFields extends TFields.

In the proposal snippet, the entity field names are properties of the types (type TDatabaseFields = { a: any; b: any; c: any; }; type TEntityFields = { a: any; b: any; };). This way we can write TDatabaseFields extends TEntityFields.

packages/entity-example/src/__tests__/NoteEntity-test.ts

packages/entity-example/src/entities/AllowIfUserOwnerPrivacyRule.ts

packages/entity/src/EntityConfiguration.ts

packages/entity/src/__tests__/cases/TwoEntitySameTableOverlappingRows-test.ts

wschurman · 2020-06-29T19:10:01Z

When writing an entity class, the first thing that comes to mind are the fields on the entity, not the entire set of fields it loads from the database. Figuring out a way for the entity fields to be the first generic parameter would reflect this (put the most important parameters first).

In its current form, the default is that both sets of fields are the same and no extra field specification is necessary. There's only one set of fields and it only needs to be specified once. It's only when the user wants to specify two entities that use the same underlying set of fields that they need to specify the extra parameters.

TDatabaseFields extends TFields = TFields

I had actually tried this first (before doing the field key selection stuff). I'll try to articulate the reasons I ended up going with the key subset approach:

Most critically, we'd need a way to specify TFields at runtime to prevent an entity instance from being able to return any database field from getField (the same discussion we had in the first iteration of this PR about compile-time-only features). Because EntityConfiguration would now operate on TDatabaseFields (the schema would be shape TDatabaseFields), there would no longer be a place where the shape of TFields is specified at runtime, which would mean we'd need to specify a key subset like we do in the current iteration.
For some reason, TS isn't quite powerful enough to infer subset using the extends keyword. A simple example of this can be seen in this playground. Essentially, it struggles to know that TDatabaseFields extends TFields when assigning a field of type keyof TFields to an object of type TDatabaseFields. This manifests itself in the loader and mutator components which need to operate on TFields instead of TDatabaseFields.

I'll put up a stacked PR with my most recent attempt at this though just in case you have any insight. #51

ide

Approving this since this is the direction we want to go in and API changes wouldn't need complicated refactors.

Future work:

Here is an example of why I think TS behaves the way it does and a potential fix:

https://www.typescriptlang.org/play/index.html#code/MYGwhgzhAEBC4AsA8AVAYgSwKYgCYywA8AXLAO32gCUtgB7AJ1yQmIYzIHMAaaAVzIBrMnQDuZAHy8UAETDEwAI0hZMOSkVIUY6bHhgBeaLvUQJ0AN4BYAFDRo9MqwZ9gxRgAoADuwBu8rGgGLDBcOjIQAE9oXEU1fQAuYzkFZQhVPXwASksAX1tbe04sYnjmAGloTXJKQSxIugAzYzKzD0bMgDkwAFssJPKspJN9AG1ygF1LQvsgkr4GMmhiBAwIADpY1tGO9W6+iYBuGfybGYB6c+gAYQQwLixcaH8QPn6WzIhxqfdnsFf3rJ5EoVNtJjN0qVMkhKtVtNA6g1miN8BJ2l1eu9yrwXm9hikQekwRMhstVjBrHZZis1ps4p8dhiDtAjLisMcqfZgsQFksaRAOfZTqdbJdoAAJLDBADkMFECEiCQKNlAkBgAGU+F4paqoNMqWAkmQ+D1FFKWdAAIwckUq8B6zWKXUEEg1DVanX2ikzRRGk1mhgWgBMNuVxEi2ugmu1DHDkaMlPsHFY92A72jnrVobONjjgUdeYtiegyYUZDTSUdzuztkcrGgikQFrIWFEcEQSAzsYjWF4BZ7aIsJacZYr0BbbarXo8OVyWQ5YpQ5OgojofDwDcCYGgPjojawPQAhLZG2AEOtIWUPNLS6msNLeBOox6GM6Z-PbEA

wschurman requested a review from ide June 21, 2020 21:38

wschurman mentioned this pull request Jun 21, 2020

feat!: add support for entities sharing the same table #47

Closed

ide reviewed Jun 26, 2020

View reviewed changes

wschurman requested a review from ide June 30, 2020 16:39

wschurman added 6 commits June 30, 2020 11:33

feat: support entity fields subset of db fields

c0b0241

Remove unnecessary test case

7256410

Add ability to specify subset of fields at runtime for an entity

daeb894

Move dataloaders to share on table name rather than entity name

7c4f050

Address comments

a309f90

Fix merge conficts

1272324

wschurman force-pushed the @wschurman/hoist branch from 7ec4e8a to 1272324 Compare June 30, 2020 18:36

ide approved these changes Jun 30, 2020

View reviewed changes

wschurman merged commit 4e40b2e into master Jul 1, 2020

wschurman deleted the @wschurman/hoist branch July 1, 2020 18:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: support entity fields subset of db fields #49

feat: support entity fields subset of db fields #49

wschurman commented Jun 21, 2020 •

edited

Loading

codecov bot commented Jun 21, 2020 •

edited

Loading

ide commented Jun 22, 2020

wschurman commented Jun 23, 2020

wschurman commented Jun 25, 2020

ide commented Jun 26, 2020 •

edited

Loading

ide commented Jun 26, 2020

wschurman commented Jun 29, 2020 •

edited

Loading

ide left a comment

feat: support entity fields subset of db fields #49

feat: support entity fields subset of db fields #49

Conversation

wschurman commented Jun 21, 2020 • edited Loading

Why

How

Test Plan

codecov bot commented Jun 21, 2020 • edited Loading

Codecov Report

ide commented Jun 22, 2020

wschurman commented Jun 23, 2020

wschurman commented Jun 25, 2020

ide commented Jun 26, 2020 • edited Loading

ide commented Jun 26, 2020

wschurman commented Jun 29, 2020 • edited Loading

ide left a comment

Choose a reason for hiding this comment

wschurman commented Jun 21, 2020 •

edited

Loading

codecov bot commented Jun 21, 2020 •

edited

Loading

ide commented Jun 26, 2020 •

edited

Loading

wschurman commented Jun 29, 2020 •

edited

Loading