Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to PureDataObjectFactory & data store creation flows; introduction of detached data store context #3112

Merged
merged 30 commits into from
Aug 31, 2020

Conversation

vladsud
Copy link
Contributor

@vladsud vladsud commented Aug 10, 2020

Alternative exploration to PR #2950 (get rid of createDataStoreWithRealizationFn, introduce local scope)

This is not really apples-to-apples comparison, as in my exploration I went much further in this PR
But it explores detached creation of runtime, i.e. attaching runtime to context without runtime going through factories.
I.e. runtime does not has enough data to validate correctness of operation and ensure that loading path would load same runtime later. But, in this exploration I put that kind of validation inside PureDataObjectFactory. The creation path is mostly synchronous, and can become fully synchronous if we are Ok removing this validation. The alternative that I'd prefer is to move this validation into runtime. Though instantiateInstance() is async to allow some async initialization, so it's not very clear if we can allow sync creation for general flow.
I chose explicit workflow of createDetachedDataStore() & adding IFluidDataStoreContextDetached interface with extra method for attaching, and not exposing it for regular flow.

Key changes:

  1. object creation is done only through factory, it's type-save. I.e. instead of
        const storage = await this.createFluidObject(SpacesStorage.ComponentName);

we have

        const storage = await SpacesStorage.getFactory().createInstance(this.context);

which returns SpacesStorage! No casting or guessing is required!

  1. PureDataObjectFactory.createInstance() now takes either context or IContainerRuntimeBase. It creates a child component and uses either context or runtime's registry to validate it contains entry with factory's type and such factory matches this factory. Previous code was guessing which registry to use, now this choice is done based on type of argument passed in, with fallback code removed.

    • PureDataObject.createFluidObject(type) is gone!. Either create objects through factory, or use anonymous creation through FluidDataObjectFactory.createAnonymousInstance(). If there is desire, same functionality can be added to PureDataObjectFactory itself.
  2. I tried to remove factory type names from usage in placed I've touched, and rely in factories instead. In other workds, if you need vltava object, you can't just refer to it by "vltava", you need to pass Vltava.getFactory() - names are becoming more of the internal thing.

  3. instantiateDataStore() is converted to return (async) runtime, instead of relying on calling bindRuntime(). This is mostly to make sure that any errors on this path are correctly captured and result in failed to load data store, not infinite wait. For backward compatibility, old flow is totally supported for a version.

  4. In spaces, I experimented with using factories through the code, and not using names. Open to suggesting here, I'm not sure if code became more readable. Here, code operates with anonymous objects, so it does not have the benefit of work in 1-2 above.

  5. IFluidDataStoreRegistry moved from runtime to context

Future:

  1. I do not see any validation that we do not have registry with duplicate names. This can happen due to some mistakes and failing early would save developers a lot of time in debugging.
  2. I'd love to convert NamedFluidDataStoreRegistryEntry from [string, Promise] to FluidDataStoreRegistryEntry. get() methods are already async. The only synchronous thing on FluidDataStoreRegistryEntry is type, which is duplciation of NamedFluidDataStoreRegistryEntry[0]. This duplication & structure pushes people to use names instead of actual factories, which potentially can result in hard to diagnose bugs. Any existing code (consumer & producer) can be easily converted to rely on simpler pattern without loss of asynchrony in flows.

Notes for myself:

  1. Breaking.md needs to be updated

);

return requestFluidObject<PureDataObject<P, S>>(router, "/");
const instanceP = this.instantiateInstance(runtime, newContext, initialState);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@curtisman, this is the place where we have asynchrony that can't be removed from this workflow. I.e. even if I remove validation that happens in buildSubPath(), whole workflow is async because PureDataObject creation flow is async.
While it's not ideal (for the reasons @DLehenbauer mentioned in email - code that creates component needs to adjust for time lapse, i.e. if it was about to stuck a handle to new component into merge tree, it needs to adjust CP reference based on changes while this call was in flight), I think it's fine, as many other workflows would force developers to deal with it anyway (like right click | show some menu | insert component - things will change in document while UI is up, so code needs to track CP reference through those changes).

If we are Ok with asynchrony here, I'd move logic in buildSubPath() directly into runtime, to guarantee correctness of operation at lowest level.

Copy link
Member

@curtisman curtisman Aug 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem now is that now you create a async gap between instantiateDataStore and bindRuntime.
So the runtime might not be completely set up before other code can run and the system is in an incomplete state.

Also I scan all the implementation, and I don't see any one of them requiring instantiateDataStore to be async?
The instantiateDataStoreCore in sharedComponentFactory will call an async function with an instanceP, but it doesn't await on them until someone requests it (and that's when any error will propagate currently)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really new - this is how we instantiate runtimes, per contract we have (in reality all users call DataStoreRuntime.load() right away from there instantiateDataStore() and that results in immediate call to bindRuntime). I think the intention here is similar to instantiateRuntime, the process allows asynchrony for data store to fully be initialized (and possibly load all DDSs if needed) before it is returned back.

I do not see an issue with runtime not being attached right away - nobody (except data object that is initializing here) has a pointer to it.

As for error handling, yes, sure we can examine this code and declare that it's not going to throw. We can't do so about any instantiateDataStore() call. I believe data store context should care and cover properly error handling here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I can definitely re-arrange this code to be more like instantiateDataStoreCore() above. But that does not address problem of async runtime creation, and overall I find it less correct for bindRuntime() being called before request handler is setup (as it's done in master). bindRuntime does a ton of stuff (processes ops, calls notifyDataStoreInstantiated which raises events), so runtime is expose to external world before we fully set it up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there was no gap :) instantiateInstance() result is not awaited. But glance at latest iteration - moving validation logic into runtime actually creates that gap you mentioned

Copy link
Member

@curtisman curtisman Aug 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the async instantiateDataStore() that I am mpt sure why it is required.
I took your PR and remove the async, and it seems like it is ok with it not be async?
https://github.com/curtisman/FluidFramework/tree/async

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can do it today. But how is this different in principle from async instantiateRuntime?
In both cases it's async for simplification of workflows. For example, creation of root sub-items (data store in case of container, DDS in case of data store) on creation path, where creation might be async due to code loading. Being able do so as part of instantiation results in simpler logic around op processing & request handling later. It's possible to do all of that even with sync creation, but it's just a bit harder.

Note that once bindinng happens, someone can request an object from data store and attach it's handle. If creation of root DDS is async (or once we allow data stores on channels - creation of root sub-data store is async), then request handling should also take into account that async activity and wait for it before returning anything.

I do not have a strong opinion here, just pointing out that it's symmetrical (and especially true if we allow channels to have other data stores, i.e. nesting). So unless there is really good reason why these workflows needs to be sync, I'd prefer to have freedom here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add that this flow is already async (see context.realize()), so there is no benefit here from trying to make it sync. If we want to make creation synchronous, detached context creation workflow is the one we need to focus on.

…into initialProps2

# Conflicts:
#	examples/data-objects/vltava/src/components/anchor/anchor.ts
#	examples/data-objects/vltava/src/interfaces/componentInternalRegistry.ts
#	packages/runtime/container-runtime/src/containerRuntime.ts
#	packages/runtime/container-runtime/src/dataStoreContext.ts
…into initialProps2

# Conflicts:
#	examples/data-objects/vltava/src/components/anchor/anchor.ts
#	examples/data-objects/vltava/src/components/tabs/newTabButton.tsx
#	examples/data-objects/vltava/src/index.ts
…into initialProps2

# Conflicts:
#	packages/runtime/runtime-definitions/src/dataStoreContext.ts
#	packages/runtime/runtime-definitions/src/dataStoreFactory.ts
#	packages/runtime/runtime-definitions/src/dataStoreRegistry.ts
@vladsud vladsud changed the title Exploration into (almost synchronous) data object creation flow via detached context. Improvements to PureDataObjectFactory & data store creation flows; introduction of detached data store context Aug 13, 2020
@vladsud vladsud requested a review from ChumpChief August 13, 2020 19:40
@vladsud
Copy link
Contributor Author

vladsud commented Aug 13, 2020

This PR graduated from exploration into proposal - it's ready to be merged in - just need your review.

Copy link
Contributor

@heliocliu heliocliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me 👀 my main concern would be that bohemia seems very reliant on having a centralized create path that currently uses the types rather than factories and it may not be an easy api change for them to resolve that.

@ghost ghost added the triage label Aug 18, 2020
@microsoft-github-updates microsoft-github-updates bot changed the base branch from master to main August 19, 2020 02:18
@ghost ghost added the triage label Aug 20, 2020
…to initialProps2

# Conflicts:
#	examples/data-objects/external-component-loader/src/externalComponentLoader.tsx
#	examples/data-objects/external-component-loader/src/urlRegistry.ts
#	examples/data-objects/external-component-loader/src/waterPark.tsx
#	examples/data-objects/simple-fluidobject-embed/src/index.tsx
#	examples/data-objects/vltava/src/components/anchor/anchor.ts
#	examples/data-objects/vltava/src/fluidObjects/tabs/dataModel.ts
#	examples/data-objects/vltava/src/fluidObjects/tabs/newTabButton.tsx
#	examples/data-objects/vltava/src/fluidObjects/vltava/vltava.tsx
#	examples/data-objects/vltava/src/index.ts
#	packages/framework/aqueduct/src/data-object-factories/dataObjectFactory.ts
#	packages/framework/aqueduct/src/data-object-factories/pureDataObjectFactory.ts
#	packages/framework/aqueduct/src/data-objects/pureDataObject.ts
#	packages/runtime/component-runtime/src/componentRuntime.ts
#	packages/test/end-to-end-tests/src/test/componentHandle.spec.ts
#	packages/test/end-to-end-tests/src/test/deRehydrateContainerTests.spec.ts
#	packages/test/test-utils/src/localCodeLoader.ts
…to initialProps2

# Conflicts:
#	packages/framework/aqueduct/src/data-objects/pureDataObject.ts
#	packages/test/end-to-end-tests/src/test/componentHandle.spec.ts
@vladsud vladsud merged commit 3d266cb into microsoft:main Aug 31, 2020
@vladsud vladsud deleted the initialProps2 branch August 31, 2020 07:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants