Specify life time management #15
My current idea: when one of the following conditions is met, …
|
This is annoying :P. (But necessary.) Let me go a bit slower. I think you jumped to the end because you understand this stuff better. My perspective might not be that valuable because I don't understand the implementation complexities as well as you, so feel free to tell me I'm wrong :). Here is my toy example that helps me understand the situation better:

```js
// UA code, written here in JS, but trying to mimic C++ semantics
var resolve;
var foo = {
  closedPromise: new Promise(r => resolve = r)
};
// Expose `foo` to author code, but not `resolve`:
giveAuthor(foo);
```

```js
// author code
foo.closedPromise.then(() => console.log("closed successfully"));
```

The graph in this scenario is: Now, I guess the main issue for a stream is that we want something like web socket's
e.g., "if …". Expanding the above toy example into an analogous one becomes:

```js
// UA code
var resolve;
var foo = {
  closedPromise: new Promise(r => resolve = r)
};
addWeakCallback(foo, function () {
  resolve();
  free(resolve); // we won't use it again.
});
giveAuthor(foo);
```

The stream version of this, I think, would be:

```js
class FetchBodySource {
  start(enqueue, close, error) {
    this@[[closeStream]] = close;
  }
  ...
}
var source = new FetchBodySource(...);
var body = new ReadableStream(source);
addWeakCallback(body, function () {
  source@[[closeStream]]();
  free(source);
});
exposeViaBindings(body);
```

```js
// author code
body.closed.then(
  () => console.log("closed successfully"),
  e => console.error("errored", e)
);
```

Let's say that author code drops all references to `body`. OK, cool. We still have the weak callback (and, through it, `source`). Thus when UA code calls `source@[[closeStream]]()`, the `closed` promise settles and the author's handler runs. Does this make any sense? Or am I crazy and out of my depth and should leave this to the pros? |
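For illustration only (this is not part of the thread): the hypothetical `addWeakCallback` in the examples above can be approximated in modern JavaScript with `FinalizationRegistry`, a feature that landed in the language years after this discussion. All names here are illustrative:

```javascript
// Sketch of the thread's hypothetical addWeakCallback, built on
// FinalizationRegistry (ES2021). The callback runs some time after
// `target` is garbage-collected; timing is implementation-defined.
const cleanupRegistry = new FinalizationRegistry((callback) => callback());

function addWeakCallback(target, callback) {
  // The registry holds `callback` strongly but `target` weakly.
  cleanupRegistry.register(target, callback);
}

let resolveClosed;
const foo = {
  closedPromise: new Promise((r) => (resolveClosed = r)),
};

addWeakCallback(foo, () => {
  // Runs, at the engine's discretion, once `foo` has been collected.
  resolveClosed();
});
```

Note that, exactly as debated later in this thread, this makes GC timing observable: `closedPromise` settles only when the engine decides to collect `foo`.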
Thank you very much for your feedback! You're right, this lifetime management is for the underlying source rather than for Readable[Byte]Stream itself. And I realized that such treatment is unnecessary if the author has both the read end and the write end. Moreover, we need to think about the underlying sink as well. Clearly I need to think more carefully. |
Let me describe my motivation. Imagine that the browser is implemented in JS and all objects are governed by a single GC. Even in such a browser, a connection to the outside of the browser, such as a socket, is a GC root. That means that by default a socket will be kept alive while it is open, and the socket will stay open until closed by the author or the server.
In the above code, the author doesn't have any means to get information from the connection. In such a case, I'd like to allow the system to close the socket. Closing a socket has a side effect, but it is not observable to anyone in the browser, and the server doesn't care about it. So I'd like to allow that.
In the above case, the socket MUST NOT be closed automatically, because the author can detect the close / error timing. Note that the author doesn't have a means to pull data from the response, so the connection will stall if the backpressure mechanism takes effect.
Theoretically this is equivalent to (1) but I'm afraid we cannot detect it. |
Oh, I understand now. This is another "we don't want to expose GC behavior" thing. Because if we did allow that, then we could just use the scheme from my above comment. The browser would automatically close the stream (which is OK because all references to it are gone).

OK. Now that I understand the problem I think your proposed solution is good. The only other alternative I can think of would be not adding the close-stream-on-GC behavior, and forcing users to manually manage their resources. Maybe that is OK?

In ES spec speak you can say "promise has any entries in the List that is the value of its [[PromiseFulfillReactions]] internal slot" (similarly for [[PromiseRejectReactions]]). You can probably simplify that to just "promise@[[PromiseFulfillReactions]] is not empty." Note that once the promise becomes fulfilled or rejected those Lists get emptied, so there's no need for a state check to ensure it's pending. |
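The internal-slot check above cannot be performed from author code, but as a rough illustration of what it captures (purely a sketch, not part of any spec), a userland wrapper can record whether anyone attached a handler while the promise was pending:

```javascript
// Userland approximation (illustrative only) of "does this promise have
// any fulfill/reject reactions?". The real check reads the spec-internal
// [[PromiseFulfillReactions]] slot, which scripts cannot access.
function observeReactions(promise) {
  let sawHandler = false;
  const wrapped = new Proxy(promise, {
    get(target, prop) {
      if (prop === "then" || prop === "catch" || prop === "finally") {
        sawHandler = true; // someone is (about to be) listening
      }
      const value = Reflect.get(target, prop, target);
      return typeof value === "function" ? value.bind(target) : value;
    },
  });
  return { wrapped, hasHandlers: () => sawHandler };
}

const { wrapped, hasHandlers } = observeReactions(new Promise(() => {}));
hasHandlers(); // false: nothing attached yet
wrapped.then(() => {});
hasHandlers(); // true: a fulfill reaction was registered
```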
I found #15 (comment) doesn't work, because @readyPromise has handlers when `.ready` is accessed and the state is "waiting".
|
How feasible do you think it is to force authors to manually manage their open connections and such? Very bad? I am pretty sure in Node.js you have to do so (and in most server-side runtimes), but I appreciate that the browser is different.
I'd like to help figure out the fix but I don't quite understand the problem :-/. Why does having handlers when .ready is accessed and the state is "waiting" mean your original plan does not work? |
That's possible, and in fact it is comfortable for me as an implementer, if authors accept it. I can think of a few reasons why we need this feature.
I don't know how strongly developers want this feature, though.
I uploaded a WIP commit: https://github.com/yutakahirano/fetch-with-streams/tree/auto-close. The possible options I can think of are here.
|
Having discussed with @tyoshino, I'm now inclined to accept Domenic's proposal (manual connection management). Here are the reasons:
|
If you are convinced of this then that is very good. I wasn't 100% sure. I think it might be hard to say that fetch() is nicer than XHR except that it can cause connections to stay open. But maybe it isn't a big deal because in most cases a developer will read the body to the end. The POST example you gave might be an exception. But then again most POST bodies might be small enough that the browser can just read it into memory and close the open connection anyway (without closing the stream itself).

I anticipate pushback from other parts of the implementer and standards community on this, so we might have to figure out an auto-close story anyway, or at least assemble a good explanation of why we think it's OK.

I also think it might be OK to expose GC timing (your 2), because eventually there is a (somewhat controversial) hope to add weak references to JS, as long as the reference only changes between turns of the event loop. That would make the behavior no different from what any JS program could achieve. But again this is controversial. It might be easier to add later when/if weak references make it into JS, and rely on manual collection in the meantime. |
Status Update: Streams API has changed. Now
See whatwg/streams#297 for details. |
I now think the manual management is feasible for readers. It means that a reader (got from Request.body or Response.body) will not be collected while it is active. I'm not certain about streams.

**A: Manual management**

Request.body and Response.body will not be collected while readable. When we are sure that …

**B: No specification**

Request.body and Response.body can be collected if not reachable from the root set. If … |
I would like to choose B because it is feasible for many users if … |
Agreed. The trick of not calling close() or error() is a nice one! But … |
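As an illustration of that trick (a sketch assuming the modern `ReadableStream` constructor, which postdates the API used in the examples above): if the underlying source simply never signals close or error, tearing down the underlying connection leaves no author-visible trace.

```javascript
// If the underlying source never calls close() or error(), the stream's
// `closed` promise stays pending forever, so silently dropping the
// underlying connection is unobservable from author code.
const stream = new ReadableStream({
  start(controller) {
    // Tear the connection down silently here: no close(), no error().
  },
});

const reader = stream.getReader();
let settled = false;
reader.closed.then(() => { settled = true; }, () => { settled = true; });
// `settled` never becomes true: no author-visible signal ever fires.
```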
This should be very straightforward. Everything that would be able to track GC of the object from JavaScript must keep the object alive. And then you have a separate hook for "do this when GC'd" which Fetch will use to terminate. |
@annevk I think it's debatable whether the better model is "if it's at all possible that the termination is unobservable to JS, then terminate" vs. "if the user has crossed a clear boundary line, then they no longer get auto-terminate-on-GC." The former means e.g.

```js
const reader = res.body.getReader();
reader.closed;
// can terminate
```

```js
const reader = res.body.getReader();
reader.closed.then(onClosed);
// can terminate
```

```js
const reader = res.body.getReader();
reader.closed.catch(onAbnormalTermination);
// cannot terminate
```

The distinction between these three cases is subtle, and it seems unfortunate to have the number of open connections to the server end up depending on the differences between them. Whereas, the latter approach says: by calling `getReader()`, you have crossed a boundary line, and you no longer get auto-terminate-on-GC.

The intent of the current spec is the latter. I don't have a strong preference for it, but it makes sense. We noted that in non-web platforms, such as Node, these types of GC hooks do not exist, and resource management of this sort is entirely manual. |
I don't see why we'd disallow optimizations that cannot be observed and are consistent with other platform objects. |
I think it's debatable both whether termination is an optimization, and whether it cannot be observed. (Since the number of connections is indirectly observable on the client, or directly observable on the server.) Concretely, what I'm worried about here is someone coding up a site which works well with 10 open connections, based on code which forgets to call … |
@domenic, I'm all for "fast fail" to avoid subtle errors, but memory constraints are already shifting and hard to pin down. Even if we always leaked here it might work fine and then suddenly start getting OOM'd when the device is used in a slightly different way. Also, encouraging leaks from one site negatively impacts other tabs running different origins. I think this is highly undesirable. In general I think the UA should strive to be as memory efficient as it can given the constraints from the script. But the great thing about different browsers is we can optimize in different ways. I just don't think the spec should re-define observability to rule out optimizations chrome doesn't want to implement. |
I don't think saying "you need to release your resources; we won't do that automatically for you" is "encouraging leaks"... But I see your point about letting UAs do different things. As long as we accept that this loses some interop, it seems fine. I think I've said my piece here and will leave it to @yutakahirano to chime in with his thoughts :) |
I was also going to mention leaking these resources is extra bad in that it also leaks the file descriptor. I don't know about chrome, but I think this could potentially crash the gecko parent process if all fds are exhausted. That's extra double bad. |
I think so, and the cache data will not be truncated. A fetch operation can be terminated for various reasons, including network errors and user abort, and the cache should work well with them. GC termination should be treated in the same manner. |
Just to clarify, are you saying that if a fetch is terminated it should still be written to the http cache in full? Or are you saying it should be removed from the http cache? Thanks! |
I'm not sure if it is specified. That happens when the body is extremely large even without streaming. In the spec, there is a statement … but it says nothing about the case when the update fails. |
We should probably fix that. I wish there was a more object-based specification of the HTTP cache with algorithms that return error or success and some advice what to do under various conditions... |
@annevk, the existing specs like WebSockets or XHR have a Garbage Collection section, but they are written in an explicit fashion, like …

Is it possible (or desirable) to specify it in an implicit fashion, as @wanderview said?
|
Is "suspended" defined? That language still doesn't tell me much about how GC depends on the network, but maybe that's okay? Not sure I'm the best reviewer for this. |
I did not intend to accept the literal expression, but wanted to ask if it's OK to specify it in a more implicit way than XHR's. The most implicit version doesn't have to mention the Response object or its body at all.
Do you think the behavior is well-defined? If so, I'm happy to write so (many examples and notes are needed, though). @domenic, is "can be observed" well-defined in the JS world? |
Do you have any idea? To be clear, I'm happy with the current description - it's easy to understand and easy to implement. But I'm not opposed to loosening the rule so that an implementer can implement a more sophisticated mechanism. Note that terminating a fetch is not mandatory, i.e. a system which doesn't terminate a fetch operation with reason "garbage collection" is conformant. Similarly, a system compliant with the current description will automatically be compliant with the loosened rule. |
Seems no one is interested in this topic... |
Well, I think Mozilla is interested in allowing implementations to close whenever it cannot be observed. |
Hm, I will create a PR containing something like #15 (comment). |
Instead of describing conditions explicitly, describe them in a more implicit way in order to allow implementers' optimizations.
Merged the PR. |
We need to specify the lifetime management, like WebSockets does.
It is necessary because closing an active socket has a side effect that is observable by a remote peer, and the remote peer is not governed by the GC, so we cannot rely on the GC. It's not a general ReadableStream concern, so we should specify it here.