-
Notifications
You must be signed in to change notification settings - Fork 47.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Flight] Detriplicate Objects #27537
Conversation
Outline objects that has been seen at least once before so they can be deduped.
This won't always work but it'll work if we discover the Map/Set before the key. The idea is that you'd pass the Map to something like a parent context and then select out of it with a key passed to a child.
Another possible heuristic could be to always outline objects one level below the props object of client components. The idea is that JSX is often not duplicated by reference and creating large JSX trees is where inlining is better. Other than the style object, you don't typically pass deduped props to host elements. Passing flat values to client components is generally better so it's not super common that you should pass rich objects to client components in the first place. It also might have semantic meaning if you do. So eagerly outlining them might not be as big of a cost. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is it not safe to reuse elements on the client?
@@ -810,12 +822,36 @@ function serializeMap( | |||
request: Request, | |||
map: Map<ReactClientValue, ReactClientValue>, | |||
): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maps with object keys (or Sets of objects) seems like an interesting case where the object identity is virtually guaranteed to be semantic. Should we even allow that if there's no guarantee on it matching other places where that object is sent down (including other Map/Set instances)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rationalize it is that you can or cannot use it the same way you can use it in React on the client. If you create a Map and keys during render and pass it around, it depends on if those rerender together whether you can use that pattern or not.
In normal usage like if you create a Map in a parent and then pass ids into that Map to the children to refer to that map, as long as they get recreated together they'll flow downwards together too to get updated. Assuming you follow the updating rules of React.
Similarly, a second request updating the Map would need to do the same.
All the icons in the shared component would share the same id in this case which they wouldn't on the client. This change expands that to another case but it's the same principle. |
I understand why this isn’t safe for useId but I don’t understand what existing behavior has a similar problem. |
My example is showing the problem with existing behavior. The Client component is only rendered once on the server. The loop happens client side which reuses the result of rendering SharedComponent which has only been rendered once. |
This is kind of how I was imagining it would work |
You could also outline anything that’s the result of a |
@@ -379,9 +402,13 @@ function initializeModelChunk<T>(chunk: ResolvedModelChunk<T>): void { | |||
blockedChunk.value = null; | |||
blockedChunk.reason = null; | |||
} else { | |||
const resolveListeners = cyclicChunk.value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when is this not null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discovering a cycle doesn't add to the .deps
so we might end up at zero. The inner property listens for this model to complete so that it can update its property to set the cycle.
Throwing crazy ideas out here maybe we put a special empty chunk at the start of a series of chunks representing an object which uniquely identifies an object. When we see this during flush we store how many bytes into the stream this object appears. We also stored how long the object chunks are when we encoded it. Later when we encounter that object again we serialize a byte offset and length value. On the client when we encounter this offset + length we reparse those bytes and cache the object. On a second encounter to this offset + length we look the object up in a cache using the byte offset. It's unfortunate that we'd need to check each chunk to see if it is a marker chunk but maybe it is worth it for the compactness you get on the wire |
Now that we no longer support Server Context, we can now deduplicate objects. It's not completely safe for useId but only in the same way as it's not safe if you reuse elements on the client, so it's not a new issue. This also solves cyclic object references. The issue is that we prefer to inline objects into a plain JSON format when an object is not going to get reused. In this case the object doesn't have an id. We could potentially serialize a reference to an existing model + a path to it but it bloats the format and complicates the client. In a smarter flush phase like we have in Fizz we could choose to inline or outline depending on what we've discovered so far before a flush. We can't do that here since we use native stringify. However, even in that solution you might not know that you're going to discover the same object later so it's not perfect deduping anyway. Instead, I use a heuristic where I mark previously seen objects and if I ever see that object again, then I'll outline it. The idea is that most objects are just going to be emitted once and if it's more than once it's fairly likely you have a shared reference to it somewhere and it might be more than two. The third object gets deduplicated (or "detriplicated"). It's not a perfect heuristic because when we write the second object we will have already visited all the nested objects inside of it, which causes us to outline every nested object too even those weren't reference more than by that parent. Not sure how to solve for that. If we for some other reason outline an object such as if it suspends, then it's truly deduplicated since it already has an id. DiffTrain build for [f172fa7](f172fa7)
Now that we no longer support Server Context, we can now deduplicate objects. It's not completely safe for useId but only in the same way as it's not safe if you reuse elements on the client, so it's not a new issue. This also solves cyclic object references. The issue is that we prefer to inline objects into a plain JSON format when an object is not going to get reused. In this case the object doesn't have an id. We could potentially serialize a reference to an existing model + a path to it but it bloats the format and complicates the client. In a smarter flush phase like we have in Fizz we could choose to inline or outline depending on what we've discovered so far before a flush. We can't do that here since we use native stringify. However, even in that solution you might not know that you're going to discover the same object later so it's not perfect deduping anyway. Instead, I use a heuristic where I mark previously seen objects and if I ever see that object again, then I'll outline it. The idea is that most objects are just going to be emitted once and if it's more than once it's fairly likely you have a shared reference to it somewhere and it might be more than two. The third object gets deduplicated (or "detriplicated"). It's not a perfect heuristic because when we write the second object we will have already visited all the nested objects inside of it, which causes us to outline every nested object too even those weren't reference more than by that parent. Not sure how to solve for that. If we for some other reason outline an object such as if it suspends, then it's truly deduplicated since it already has an id.
Now that we no longer support Server Context, we can now deduplicate objects. It's not completely safe for useId but only in the same way as it's not safe if you reuse elements on the client, so it's not a new issue. This also solves cyclic object references. The issue is that we prefer to inline objects into a plain JSON format when an object is not going to get reused. In this case the object doesn't have an id. We could potentially serialize a reference to an existing model + a path to it but it bloats the format and complicates the client. In a smarter flush phase like we have in Fizz we could choose to inline or outline depending on what we've discovered so far before a flush. We can't do that here since we use native stringify. However, even in that solution you might not know that you're going to discover the same object later so it's not perfect deduping anyway. Instead, I use a heuristic where I mark previously seen objects and if I ever see that object again, then I'll outline it. The idea is that most objects are just going to be emitted once and if it's more than once it's fairly likely you have a shared reference to it somewhere and it might be more than two. The third object gets deduplicated (or "detriplicated"). It's not a perfect heuristic because when we write the second object we will have already visited all the nested objects inside of it, which causes us to outline every nested object too even those weren't reference more than by that parent. Not sure how to solve for that. If we for some other reason outline an object such as if it suspends, then it's truly deduplicated since it already has an id. DiffTrain build for commit f172fa7.
Now that we no longer support Server Context, we can now deduplicate objects. It's not completely safe for useId but only in the same way as it's not safe if you reuse elements on the client, so it's not a new issue.
This also solves cyclic object references.
The issue is that we prefer to inline objects into a plain JSON format when an object is not going to get reused. In this case the object doesn't have an id. We could potentially serialize a reference to an existing model + a path to it but it bloats the format and complicates the client.
In a smarter flush phase like we have in Fizz we could choose to inline or outline depending on what we've discovered so far before a flush. We can't do that here since we use native stringify. However, even in that solution you might not know that you're going to discover the same object later so it's not perfect deduping anyway.
Instead, I use a heuristic where I mark previously seen objects and if I ever see that object again, then I'll outline it. The idea is that most objects are just going to be emitted once and if it's more than once it's fairly likely you have a shared reference to it somewhere and it might be more than two.
The third object gets deduplicated (or "detriplicated").
It's not a perfect heuristic because when we write the second object we will have already visited all the nested objects inside of it, which causes us to outline every nested object too even those weren't reference more than by that parent. Not sure how to solve for that.
If we for some other reason outline an object such as if it suspends, then it's truly deduplicated since it already has an id.