Design: Creation Flow / Singletons #1096

Closed
vladsud opened this issue Jan 31, 2020 · 12 comments
Labels: area: framework, design-required

@vladsud
Contributor

vladsud commented Jan 31, 2020

Terminology: Component = component runtime, container = container runtime (code)

Problem statement: Today it is tempting to initialize a container by creating singleton components with well-defined names / IDs, relying on being connected throughout the initialization process and on being the only client online. That is the easiest way to initialize, but it is not correct (robust to failure). ‘isExisting’ offers no correctness guarantees: a client dying half-way through the creation process leaves the document in a corrupt state. Without consensus data structures, collisions on component IDs (two components created with the same ID) are not easily detectable, leaving the client (and potentially the document) in a broken state. Consensus data structures can be used here, but the programmability of this pattern needs to improve, and there are limitations: it requires the client to be connected to a web socket, with higher latencies on slow networks.

Without knowing data semantics, the runtime cannot really help here by automatically merging data. Take a sparse matrix as an example: two components with the same name are created on different clients, with matrices of different dimensions and different content. There is no way to merge them automatically while preserving higher-level semantics and invariants without knowing the inner details of what the data represents.

I do not see a clear one-size-fits-all solution here, so I propose a set of steps that would reduce the problem to a minimum:

File is created atomically with all initial content, where the container has a fully initialized state, including all singletons. The host should not rely on the fact that we push final content atomically to storage (which might be a storage-specific concept), and should instead use consensus data structures (which in this case would have zero latency), so that if this assumption is broken, we still have a correct sequence. Both host and runtime likely need triggers to tell when they are done with initialization. A signal from the container runtime would be required for “regular” file creation to indicate the end of container initialization. Similarly, the container may need to play a role here, indicating when the user "commits" the file (sends a message in Teams).

This substantially reduces the problem but does not solve it completely. A new version of the code may introduce new singletons. Creating new singletons should ideally be tied to the proposal to update to the new version of code, and depends on being online. So either consensus data structures should be used to create new singletons, or, better, a version converter would create new singletons as part of an (ideally atomic) upgrade process.

We unify how merges are exposed by removing the ability to name components / DDSs or refer to them by ID. The only way to reference components / DDSs is through handles, and merges (one component instance overwriting another) show up only in places with well-defined DDS events. I.e. if a component handle is stored on a map key or shared string marker, there is an established notification mechanism for anyone interested in listening for a new value overwriting a key or marker properties, which results in a component instance becoming unrooted. In the future, applications will have to deal with these issues by implementing three-way merge. But for the online case, dropping the unrooted component is the simplest and correct way to resolve these conflicts (i.e. the default behavior gives the intended result). A component or DDS that has no references through handles in the container gets garbage collected, and it is the responsibility of the container to keep components alive on an as-needed basis.

We need to have root(s) from which the rest of the tree is exposed. Components should fully initialize themselves before they show up in the tree.

The container must have a root component baked into the runtime. It would be automatically created as part of new container runtime creation (first code proposal).
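
To make the handle / garbage collection idea in the steps above concrete, here is a minimal in-memory sketch. Everything in it (`ContainerGraph`, `storeHandle`, `collectGarbage`, the component names) is hypothetical and only models the proposal; it is not the runtime's API. A component stays alive only while some reachable component holds a handle to it.

```typescript
// Hypothetical in-memory model; none of these names are Fluid Framework APIs.
type ComponentId = string;

interface ComponentNode {
    // Handles this component stores in its DDSes (map values, string markers, ...).
    handles: Set<ComponentId>;
}

class ContainerGraph {
    private readonly nodes = new Map<ComponentId, ComponentNode>();

    constructor(private readonly rootId: ComponentId) {
        this.add(rootId);
    }

    add(id: ComponentId): void {
        if (!this.nodes.has(id)) {
            this.nodes.set(id, { handles: new Set<ComponentId>() });
        }
    }

    // Storing a handle to `to` inside `from` roots `to` whenever `from` is reachable.
    storeHandle(from: ComponentId, to: ComponentId): void {
        this.add(to);
        this.mustGet(from).handles.add(to);
    }

    removeHandle(from: ComponentId, to: ComponentId): void {
        this.mustGet(from).handles.delete(to);
    }

    // Components reachable from the root by following handles.
    reachable(): Set<ComponentId> {
        const seen = new Set<ComponentId>();
        const stack: ComponentId[] = [this.rootId];
        while (stack.length > 0) {
            const id = stack.pop() as ComponentId;
            if (seen.has(id)) {
                continue;
            }
            seen.add(id);
            this.mustGet(id).handles.forEach((next) => stack.push(next));
        }
        return seen;
    }

    // Drops every component that is no longer reachable and returns the dropped ids.
    collectGarbage(): ComponentId[] {
        const live = this.reachable();
        const collected: ComponentId[] = [];
        Array.from(this.nodes.keys()).forEach((id) => {
            if (!live.has(id)) {
                this.nodes.delete(id);
                collected.push(id);
            }
        });
        return collected;
    }

    private mustGet(id: ComponentId): ComponentNode {
        const node = this.nodes.get(id);
        if (node === undefined) {
            throw new Error(`unknown component ${id}`);
        }
        return node;
    }
}

// Scriptor inserts a Table; a second authority also keeps a handle to it.
const graph = new ContainerGraph("root");
graph.storeHandle("root", "scriptor");
graph.storeHandle("root", "sharingManager");
graph.storeHandle("scriptor", "table");
graph.storeHandle("sharingManager", "table");

graph.removeHandle("scriptor", "table");
const afterScriptorDrop = graph.collectGarbage(); // table is still referenced

graph.removeHandle("sharingManager", "table");
const afterManagerDrop = graph.collectGarbage(); // table is now unrooted
```

In this model the Table survives Scriptor deleting its reference because another component still holds a handle, and it is collected only once the last handle is removed.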

Important notes / side-effects of the proposal:

  1. This proposal does not remove singletons as a pattern. They can be retrieved through the request route: the root component can look up the proper named key in its root dictionary and resolve the appropriate handle.
    a. That said, creation of singletons by parts in the container is gone. Only the container runtime / root component can do it. This helps with enforcing locality of parts.
  2. Container/components are responsible for keeping components alive (referenced in the tree) when handing URIs over. In a scenario where Scriptor inserts a Table component and later deletes its reference to Table, it’s not the responsibility of Scriptor to keep Table references alive; it's the responsibility of some other authority (sharing manager?) to keep a copy of a handle to Table to keep it alive on an as-needed basis.
  3. DDS / component attach is not important for this discussion. Merges happen at the point elements are rooted in a tree, so there is no need to keep objects unattached to solve the problems raised in this doc. That said, keeping objects unattached can be a useful strategy to reduce noise and the amount of GC that storage must do if / when things go wrong and an object never gets attached.
  4. For places where two clients create conflicting components, a “first one wins” merge policy might make more sense. That said, it’s not clear how to localize it. I.e. it probably makes sense to have it on the root map of the root component. But what about a random marker in a sequence that represents a table cell (pointing to a component)?
@vladsud
Contributor Author

vladsud commented Jan 31, 2020

Related issues:
#777 New Component Creation Paradigm
#885 Singleton creation pattern

@vladsud vladsud added the component isolation, design-required, and area: framework labels Jan 31, 2020
@DLehenbauer
Contributor

Strongly agree on your main points:

  • Components should generate unique ids on creation.
  • Apps make components discoverable by attaching them to DDSes.
  • Attachment of a subtree is atomic.
  • Resolving merges is impossible at the runtime layer.

@vladsud vladsud changed the title from "Design: Creation Flow / Signletons" to "Design: Creation Flow / Singletons" Jan 31, 2020
@vladsud vladsud added this to the Build2020 milestone Jan 31, 2020
@tanviraumi
Contributor

Thanks for the write up! Strongly agree on the 'isExisting' problems. We should really create an 'ensureCreation' method where clients can race and the first one will win.
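
A rough sketch of what such a race could look like, assuming the ordering service gives every submitted op a place in a single total order. The names `SequencedLog`, `CreateOp`, and the signature of `ensureCreation` are made up for illustration, and the sketch resolves the winner synchronously, whereas a real client would only learn the outcome after its op round-trips.

```typescript
// Hypothetical sketch: these types and functions are illustrative, not an existing API.
interface CreateOp {
    clientId: string;
    key: string;         // well-known singleton name
    candidateId: string; // unique id of the component this client created locally
}

// Stands in for the ordering service: ops get a total order when sequenced.
class SequencedLog {
    private readonly ops: CreateOp[] = [];

    submit(op: CreateOp): number {
        this.ops.push(op);
        return this.ops.length; // sequence number, starting at 1
    }

    all(): readonly CreateOp[] {
        return this.ops;
    }
}

// Every client applies the same sequenced ops, so every client picks the same winner:
// the first create op sequenced for a given key.
function resolveWinners(log: SequencedLog): Map<string, CreateOp> {
    const winners = new Map<string, CreateOp>();
    log.all().forEach((op) => {
        if (!winners.has(op.key)) {
            winners.set(op.key, op);
        }
    });
    return winners;
}

// Submits this client's candidate and reports which component actually won the race.
function ensureCreation(
    log: SequencedLog,
    clientId: string,
    key: string,
    candidateId: string,
): { componentId: string; won: boolean } {
    log.submit({ clientId, key, candidateId });
    const winner = resolveWinners(log).get(key) as CreateOp;
    return {
        componentId: winner.candidateId,
        won: winner.clientId === clientId && winner.candidateId === candidateId,
    };
}

const opLog = new SequencedLog();
const resultA = ensureCreation(opLog, "clientA", "settings", "cmp-a1");
const resultB = ensureCreation(opLog, "clientB", "settings", "cmp-b7");
```

Here `clientB` loses the race and is told to use `cmp-a1`; its own candidate `cmp-b7` is never rooted and can be discarded.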

@vladsud vladsud pinned this issue Jan 31, 2020
@ChumpChief
Contributor

I like it!

The host should [...] use consensus data structures (which in this case would have zero latency), so that if this assumption is broken, we still have a correct sequence. Both host and runtime likely need triggers to tell when they are done with initialization. A signal from the container runtime would be required for “regular” file creation to indicate the end of container initialization. Similarly, the container may need to play a role here, indicating when the user "commits" the file (sends a message in Teams).

How do you see this tying in with code proposal, etc.? E.g. am I likely using this signal in combination with the signal of "contextChanged"? Or maybe this is a replacement?

A new version of the code may introduce new singletons. Creating new singletons should ideally be tied to the proposal to update to the new version of code, and depends on being online. So either consensus data structures should be used to create new singletons, or, better, a version converter would create new singletons as part of an (ideally atomic) upgrade process.

It sounds like this proposal would effectively be a lock on the document while the initialization takes place - would the version converter also be taking a lock in the same fashion? We've talked previously about treating initialization as equivalent to an upgrade (i.e. from v0 to v1), would love if the flow is the same between the two.

@vladsud
Contributor Author

vladsud commented Jan 31, 2020

@ChumpChief, I think initial code proposal (and initial content generation) should be correct no matter how file is created, so there should be no dependency here (this is the place where we need to think how to atomically create root component as part of initial code proposal). Signal is needed only for perf optimization - when to commit a file, and it's optional. At least that's the direction I want to go.

For new code showing up, I do not know that we can make it atomic. All I'm saying is that the converter already has to modify document state, and thus creation of new singletons is part of that process. Whether the results of conversion are applied atomically (or how we deal with multiple clients doing the same conversion) should be dealt with in the conversion design discussion.

@leeviana
Contributor

How would the container store references to the singleton IDs in a world where there may not be any one "root component" (or the root component may be a different component every time)?

@vladsud
Contributor Author

vladsud commented Jan 31, 2020

@leeviana, from above:

The container must have a root component baked into the runtime. It would be automatically created as part of new container runtime creation (first code proposal).

I do not have a good picture of all the mechanics yet (in terms of combining that with the other requirement that components not have IDs / names exposed), but I'm sure we can work through these details.
I think the key here is that proposed code in container can get a handle on that component, i.e. it's either freely available on container runtime, or it is one of the results returned from ContainerRuntime.load().

@curtisman
Member

If no component can be given an ID, I think having a root component would be a requirement to coordinate naming and routing to "well-known" things.

That said, I am still 100% sold on "well-known" singleton pattern, which is worth another discussion

@vladsud
Contributor Author

vladsud commented Apr 16, 2020

I want to add some observations based on offline discussions related to recent data loss investigation.
Comments are as always welcome. Eventually I'll extract it into some form of public documentation. :)

Guidance to container builders (with existing containers)

  1. Start with the root component. If you have an existing container, change your file creation flow to always create a root component. You do not need to wait for any solution from the runtime here; you can make that incremental change now.
    • Create the root component on demand for existing files, on file open. Collisions here will not matter much at the beginning, as this component has almost no state (yet).
  2. All named (singleton) components can be created in the same way as today (named), for compatibility with existing (old) clients. But put a reference to them into the root component by storing a handle in the root component's root directory. Code can continue to load them by name.
  3. At some future point (when old builds are no longer in circulation), stop creating named components, and find them by following a pointer from the root component's root directory.
  4. Any future on-demand singleton components are created and referenced through the root component.
    • You will hit collisions here (race conditions) in the same way as before, but at least there is a clear point of collision that the application can observe. Recovery is unlikely, as the mechanisms are missing here.
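
The migration steps above could be sketched roughly as follows. This is a toy model under stated assumptions: `LegacyContainer`, `RootDirectory`, `ComponentHandle`, and `resolveSingleton` are hypothetical names, and real handles and directories carry far more than a target id.

```typescript
// Hypothetical model of the migration; names are illustrative, not Fluid Framework APIs.
interface ComponentHandle {
    readonly targetId: string;
}

// Stands in for the root component's root directory.
class RootDirectory {
    private readonly entries = new Map<string, ComponentHandle>();

    get(key: string): ComponentHandle | undefined {
        return this.entries.get(key);
    }

    set(key: string, handle: ComponentHandle): void {
        this.entries.set(key, handle);
    }
}

class LegacyContainer {
    readonly rootDirectory = new RootDirectory();
    // Components that old clients created by well-known name (step 2).
    private readonly named = new Set<string>();
    private nextId = 0;

    createNamed(componentName: string): ComponentHandle {
        this.named.add(componentName);
        return { targetId: componentName };
    }

    hasNamed(componentName: string): boolean {
        return this.named.has(componentName);
    }

    // New style: the system picks the id; the caller only ever sees a handle.
    createUnnamed(): ComponentHandle {
        return { targetId: `generated-${this.nextId++}` };
    }
}

function resolveSingleton(container: LegacyContainer, key: string): ComponentHandle {
    // Preferred path (step 3): follow the handle stored in the root directory.
    const stored = container.rootDirectory.get(key);
    if (stored !== undefined) {
        return stored;
    }
    // Compatibility path (step 2): an old client created the component by name,
    // so record a handle to it in the root directory for future lookups.
    if (container.hasNamed(key)) {
        const handle: ComponentHandle = { targetId: key };
        container.rootDirectory.set(key, handle);
        return handle;
    }
    // On-demand creation (step 4): two clients can still race here, but the
    // collision is now visible as an overwrite of one root directory key.
    const created = container.createUnnamed();
    container.rootDirectory.set(key, created);
    return created;
}

const legacy = new LegacyContainer();
legacy.createNamed("settings"); // created by an old build, no handle stored yet
const viaName = resolveSingleton(legacy, "settings");
const viaDirectory = resolveSingleton(legacy, "settings");
const onDemand = resolveSingleton(legacy, "comments");
```

The point of the sketch is that every lookup ends up going through one root directory key, which is where a collision would surface.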

Considerations when to create singleton components

  1. The file can be readonly (meaning permissions do not allow any changes). This information is available as the "readonly" event and as the readonly property on DeltaManager. Sending ops in this mode would result in a critical error and closure of the connection, though we need to revisit this.
  2. There might be long periods without any connection, either because the user is offline or because initial loading / rendering / instantiation of the session on the server takes a while. Local changes should be allowed in this state (i.e. UX components should allow user input). If we use the "connected" event as the trigger to create some named singletons (to reduce the chance of race conditions; this does not eliminate them!), then the rest of the code needs to be able to work without this singleton component for a while.
    • You can use detached component creation. Components in this mode do not generate ops and are not visible to other clients (once connected). That said, you need to be careful, as storing a handle to such a component in any already attached component will trigger the attach process.
  3. Connections can be read-write (ops are allowed, client is in quorum) or view-only (either the user does not have write permissions to the file, or the loader chose to connect in this mode to reduce the number of ops flowing through the system / COGS of the end-to-end system). In the latter case, any op sent to the loader would trigger a reconnection as read-write.
    • You can use DeltaManager.active to differentiate view-only vs. read-write connections.
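
One way to read the considerations above is as a small decision function. The sketch below assumes a hypothetical `ConnectionSnapshot` whose `isReadonly` and `isActive` fields stand in for the readonly and active properties mentioned above, plus an `isConnected` flag; the function name and the decision labels are invented for illustration.

```typescript
// Hypothetical snapshot of connection state; not the DeltaManager interface itself.
interface ConnectionSnapshot {
    isConnected: boolean; // a connection to the service currently exists
    isReadonly: boolean;  // permissions do not allow any changes
    isActive: boolean;    // connection is read-write (client is in quorum)
}

type CreationDecision =
    | "skip"                             // sending ops would be an error
    | "create-detached"                  // work locally, generate no ops yet
    | "create-attached"                  // ops can flow right away
    | "create-attached-expect-reconnect"; // first op switches view-only to read-write

function decideSingletonCreation(state: ConnectionSnapshot): CreationDecision {
    if (state.isReadonly) {
        // Consideration 1: no ops may be sent for a readonly file.
        return "skip";
    }
    if (!state.isConnected) {
        // Consideration 2: offline or still loading; keep the component detached.
        return "create-detached";
    }
    // Consideration 3: a view-only connection reconnects as read-write on the first op.
    return state.isActive ? "create-attached" : "create-attached-expect-reconnect";
}
```

For example, `decideSingletonCreation({ isConnected: false, isReadonly: false, isActive: false })` yields `"create-detached"`, which matches the advice that the rest of the code must be able to work before the singleton is visible to other clients.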

At the end of the day we need to move to a system that allows GC, full offline support, and versioning. That support will require runtime to provide more capabilities for managing merges, but key mechanisms are likely to stay the same.

What is described above about components also applies to DDSs:

  1. You can create a set of DDSs without fear of conflicts at component creation, before the component is live (attached).
  2. Once it's live, avoid creating named DDSs. Let the system pick unique names, and put the DDS into the tree by storing a handle to it in some existing structure (like a key in the root directory).

@anthony-murphy
Contributor

i don't think it's true that a new component will have no content, as a component can be attached after adding content. i would switch this to say that singletons must not have any initial content, so duplicates can be discarded.

I'm also worried about just setting the key. that means the new component will win, and all old content will be lost. in order for this to work we should have write once keys.
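
To illustrate the difference, here is a small sketch that replays the same sequenced "set key" ops under two policies. `SetOp`, `KeyPolicy`, and `applyAll` are hypothetical names, and the "write-once" policy is an assumption about how such keys might behave, not an existing map feature.

```typescript
// Hypothetical sketch; these names do not correspond to an existing map API.
interface SetOp {
    key: string;
    value: string; // id of the component whose handle is being stored
}

type KeyPolicy = "last-write-wins" | "write-once";

// Replays sequenced ops and reports the final map plus the components left unrooted.
function applyAll(
    ops: readonly SetOp[],
    policy: KeyPolicy,
): { state: Map<string, string>; unrooted: string[] } {
    const state = new Map<string, string>();
    const unrooted: string[] = [];
    ops.forEach((op) => {
        const current = state.get(op.key);
        if (current === undefined) {
            state.set(op.key, op.value);
        } else if (policy === "write-once") {
            // The later writer loses; its freshly created component is never rooted.
            unrooted.push(op.value);
        } else {
            // The later writer wins; the older component (and its content) is unrooted.
            unrooted.push(current);
            state.set(op.key, op.value);
        }
    });
    return { state, unrooted };
}

const racingOps: SetOp[] = [
    { key: "comments", value: "cmp-old" },
    { key: "comments", value: "cmp-new" },
];
const lastWriteWins = applyAll(racingOps, "last-write-wins");
const writeOnce = applyAll(racingOps, "write-once");
```

Under last-write-wins the older component `cmp-old` is the one dropped, which is the data-loss case described above; under write-once the duplicate `cmp-new` is dropped instead.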

@vladsud
Contributor Author

vladsud commented Apr 17, 2020

RE key: yes, we will need to improve it, and provide a way to resolve merges at that moment. We do not have that capability now, so that's as far as we can go.

@anthony-murphy, can you please clarify why a newly created singleton component should not have initial content? I think it just shifts the problem from the component level to the DDS level; we have no solution either way. I think we should not dictate any guidance here overall. My wording may have been confusing; I've reworked the guidance part, please take a look.

@skylerjokiel skylerjokiel unpinned this issue May 5, 2020
@ghost ghost added the triage label May 5, 2020
@curtisman curtisman modified the milestones: Build 2020, May 2020 May 8, 2020
@curtisman curtisman modified the milestones: May 2020, June 2020 Jun 1, 2020
@curtisman curtisman modified the milestones: June 2020, July 2020 Jul 6, 2020
@curtisman curtisman modified the milestones: July 2020, August 2020 Aug 3, 2020
@danielroney danielroney removed the triage label Sep 3, 2020
@danielroney danielroney removed this from the August 2020 milestone Sep 3, 2020
@ghost ghost added the triage label Sep 3, 2020
@danielroney danielroney removed the triage label Sep 4, 2020
@danielroney danielroney added this to the Future milestone Sep 4, 2020
@vladsud vladsud self-assigned this Jun 29, 2021
@vladsud vladsud modified the milestones: Future, June 2021, July 2021 Jun 29, 2021
@vladsud
Contributor Author

vladsud commented Jul 14, 2021

That's a rather old issue, and reading through it, it sounds like I created it before (or around the time) the detached container was taking shape, so some of the references are a bit stale.

The latest discussion on this topic should continue in #6465, which has more iterations of various approaches.

This thread has some number of unique thoughts that are worth calling out:

  • singletons must not have any initial content, so duplicates can be discarded.
  • no component can be given an ID (with exception of a root). Implies using DDSs (map, CRC) for putting components into the graph
  • removal of existing flags (this is being actively worked on!)

I think most of the other ideas / areas are already covered by that other issue, so closing this one.

@vladsud vladsud closed this as completed Jul 14, 2021
jason-ha pushed a commit that referenced this issue Jan 23, 2025