@vasco-santos I think this best reflects my issue with |
Thanks for the PR @filoozom, I think I could figure out the outstanding problems. I have looked through it, ran it and checked both

The interface test looks almost good and we found a new problem 👌🏼 I think we need to have two test cases here. Going through the added test case:

    await stream.sink([randomBuffer()])
    await stream.closeWrite()

Once the sink promise finishes here, the sink ended, which explains why the onSinkEnd will be called before the

With the above in mind, we need three tests:
Does this make sense? |
@vasco-santos It does make sense! You're right, I didn't think about the added test enough, it doesn't do anything. The Unless I'm missing something or it's related to my |
I still need to debug that part. From a quick look I understand your point: perhaps it needs to check whether the error is an AbortError and not wait for the resolve. However, it is strange: as per this test, it should be aborting and finishing, as the async iterator will never end? If we can create a test scenario with abort that hangs in the abortable-iterator, we can create a PR there too. Perhaps a test where nothing is ever returned in the async generator |
The new test works on my machine (which I did not expect at first) but does not on Travis. Strange. Almost, I'm not sure if what I wrote is correct. |
Cool, we are almost there! The interface test generator awaits on a function but as it does not have a loop the underlying |
I'm not sure I understand what you mean by this:
Right now, it doesn't seem to be ending, right? At least I'm assuming that's why the test is timing out. I wrote a new function because I wanted to mimic the scenario in which Actually, this is basically the issue I was seeing in #120. Here it works because we're writing to the Regarding the |
Actually I just tested this code in the code base I was having issues with and it works, so I guess I'll just make the changes so we can get this moved along! |
src/stream.js
Outdated
    log('%s stream %s reset', type, name)
  } else {
    log('%s stream %s error', type, name, err)
    send({ id, type: Types.RESET })
This might result in send attempts after the sink has ended on the other party (read closed), which will cause an ERR_STREAM_DOESNT_EXIST error to be thrown.
Better be safe. Can you create a _send function that checks that sinkEnded is false before doing the actual send? This will guarantee that ongoing writes are stopped if we receive a message that the other party closed the read stream.
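A minimal sketch of the guard suggested above, for illustration only. `createGuardedSend` and the `isSinkEnded` callback are hypothetical names, not the code in src/stream.js; the only assumption taken from the comment is that sends should be skipped once `sinkEnded` is true.

```javascript
// Hypothetical helper: wraps `send` so messages are dropped once the sink ended.
// Neither the name nor the signature comes from the actual src/stream.js.
function createGuardedSend (send, isSinkEnded) {
  return function _send (msg) {
    if (isSinkEnded()) {
      return false // dropped: the write side has already ended
    }
    send(msg)
    return true
  }
}

// Usage
const sent = []
let sinkEnded = false
const _send = createGuardedSend((msg) => sent.push(msg), () => sinkEnded)

_send({ id: 1, type: 'MESSAGE' }) // forwarded
sinkEnded = true
_send({ id: 1, type: 'RESET' }) // dropped

console.log(sent.length) // 1
```

Note that a guard like this also drops a RESET message issued after the sink has ended, which is the trade-off the replies in this thread run into.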
Something doesn't seem to like not receiving the Types.RESET message. I tried 5c17ed6 for only the line you commented on, and it results in the same OOM error as on CI.
Interesting, I have a couple of meetings now. I can recheck later. Otherwise, we can revisit if we should throw the error. I need to properly test this with libp2p first
src/stream.js
Outdated
@@ -75,41 +112,49 @@ module.exports = ({ id, name, send, onEnd = () => {}, type = 'initiator', maxMsg
    onSinkEnd(err)
The problem seems to be here, both in reset and abort. They call onSinkEnd and do the abort. This will cause the sink catch to be triggered, and a second onSinkEnd happens in the end.
Removing onSinkEnd in abort and reset will not work, because if the sink did not start, onEnd will not be triggered. So we will likely need to improve the logic in the catch function of sink to not do the following on abort/reset:

    stream.source.end(err)
    return onSinkEnd(err)

Without the previous change, this was not problematic somehow. I think it is related to something now happening in a different event loop.
What do you think would be the best approach here?
I'm not entirely sure I follow. Both onSourceEnd and onSinkEnd are guarded from running twice, so neither of them should do anything if run a second time, right?
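For readers following along, this is roughly what "guarded from running twice" means. The sketch below is an assumption about the shape of such a guard, not the actual implementation; `createEndHandler` is a hypothetical name.

```javascript
// Hypothetical run-once guard: the wrapped handler ignores every call after the first.
function createEndHandler (onEnd) {
  let ended = false
  return function onSinkEnd (err) {
    if (ended) return // second and later calls are no-ops
    ended = true
    onEnd(err)
  }
}

// Usage: abort/reset calls the handler first, the sink's catch calls it again.
const calls = []
const onSinkEnd = createEndHandler((err) => calls.push(err))

onSinkEnd(new Error('aborted'))
onSinkEnd()

console.log(calls.length) // 1
```

With a guard like this the second call is harmless by itself, so any remaining difference has to come from the order in which the first call and the message sends happen.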
Ah I see, I guess it's because in abort, onEnd is called before the Types.RESET message was sent. And with the new _send it can't be, and apparently this makes everything hang. Maybe a bug in some other software, I guess there should be a timeout somewhere.
I'll see tomorrow if I can clean this up a bit, but at least it works now.
src/stream.js
Outdated
    return sinkClosedDefer.promise
  }

  return stream.sink([])
I think I'm starting to wrap my head around these streams, and I have one question regarding the current code in relation to the spec: does it make sense to create a new stream right before closing it here? Does the spec require this, or could we simply replace it with onSinkEnd() and prohibit the sink function from being called if sinkEnded?
I don't see anything indicating that there needs to be a stream in both directions in https://github.com/libp2p/specs/tree/master/mplex#opening-a-new-stream.
The advantage of doing so is that we inform the other party that we will not write to it anymore and only expect to read, so their side will also be closed for reading.
Alright, I rewrote a bit of stuff to make it work and add a few features:
Let me know what you like and what I should revert! This would also require a few more changes to https://github.com/filoozom/js-libp2p-interfaces/blob/master/src/stream-muxer/types.d.ts |
Thanks for moving this forward @filoozom The general code arrangements look great, but I am concerned about the Promise return values, especially for abort/reset (as we should try to match the common patterns, like Node streams). Can we keep them as before? I briefly tried this (this branch + interfaces branch) with libp2p tests and I got some problems: the close function was not resolving. One of the failing tests was in connection close: https://github.com/libp2p/js-libp2p/blob/v0.31.0-rc.4/test/dialing/direct.spec.js#L382 , blocked on streams being closed. I could not understand the underlying reason yet. Not opening the stream to send the message to the other party saying that their read can be closed might be problematic here. We can revisit if we really need this, but I would do this as a separate PR to evaluate interop with other implementations and guarantee we have no leaked streams |
You're right, |
We use it for tracking the stream in the connection https://github.com/libp2p/js-libp2p/blob/v0.31.0-rc.5/src/upgrader.js#L248 |
As far as I can tell, this doesn't trigger an "outside" event. Basically, in:

    const { stream } = await node.dialProtocol(peer, protocol);
    stream.reset()

how can I tell if the connection was successfully reset or aborted? Does it even matter? I'm not sure 😛 |
Should I remove the |
My main concern is actually potential errors. Perhaps it will be better to make the Stream an EventEmitter and emit the events like node streams, including an error? This means wrap the closeAll with a try catch.
Perhaps remove it for now. Unless there is a specific need for it, which I don't think we have. I created libp2p/js-libp2p#923 but we have some new problems with the latest changes 🤔 |
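To make the EventEmitter idea above concrete, here is a rough sketch under stated assumptions: `EventedStream` is a hypothetical class and `closeAll` stands in for whatever teardown the stream performs. It only illustrates wrapping the teardown in a try/catch and emitting events, in the style of Node streams.

```javascript
const { EventEmitter } = require('events')

// Hypothetical wrapper, not part of the PR: `closeAll` is a placeholder teardown.
class EventedStream extends EventEmitter {
  constructor (closeAll) {
    super()
    this._closeAll = closeAll
  }

  reset () {
    try {
      this._closeAll()
      this.emit('close')
    } catch (err) {
      // Surface teardown failures as an event instead of a returned promise.
      this.emit('error', err)
    }
  }
}

// Usage
const seen = []

const ok = new EventedStream(() => {})
ok.on('close', () => seen.push('close'))
ok.reset()

const failing = new EventedStream(() => { throw new Error('teardown failed') })
failing.on('error', (err) => seen.push(err.message))
failing.reset()

console.log(seen) // [ 'close', 'teardown failed' ]
```

One caveat with this pattern: a Node EventEmitter throws when 'error' is emitted and no listener is registered, so callers would have to attach an error handler.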
Should we do that in this PR or in a new one?
Ok
Yes we do 😜: libp2p/js-libp2p-interfaces#92 (comment) |
We can do a new PR for that
It seems that the close is trying to write for the stream that existed before? |
Previously we were delegating the responsibility to the stream creators, but this was usually resulting in leaked streams. It came from: libp2p/js-libp2p-interfaces#67 I have no idea what is going on yet. I am multitasking with other things that I need to get done and have not had the opportunity to debug yet |
Sorry for deleting my previous comment, I feel like I'm going a bit crazy here. So indeed, removing the Alternatively, I can also comment this line: https://github.com/alanshaw/abortable-iterator/blob/master/index.js#L60, which is exactly what I was seeing at the beginning but kinda fixed itself. So the result is that the writeCloseController.signal.addEventListener('abort', () => console.log('write close aborted')) just before the loop, but the Either I'm crazy, or this issue had been there forever and is why I guess I'll have to do a bit more debugging 😬 |
So, when protocol selection happens, the So I guess the whole thing is waiting on https://github.com/jacobheun/it-handshake/blob/master/src/index.js#L18 (which I'm assuming is blocking the This is completely unknown territory for me so I might be way off base, but I think it kind of makes sense. Basically from inside EDIT: Can't quite reproduce with |
Thanks for the analysis, I will try to understand the root reason for this tomorrow/friday and get back to you. I am also not understanding what is happening, and yesterday when I tested with libp2p I got it to work after changing this manually in libp2p's node_modules. But, I think I only changed one of the _send I will need to get the next release of libp2p out of the door, I thought we could land this with it, but we do it afterwards :) |
Hi, here's a reproduction with

    const abortable = require("abortable-iterator");
    const AbortController = require("abort-controller");
    const pDefer = require("p-defer");
    const handshake = require("it-handshake");

    const abortController = new AbortController();
    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
    const defer = pDefer();

    const stream = {
      sink: async (source) => {
        source = abortable(source, abortController.signal);
        try {
          for await (let data of source) {
            console.log("Read:", data);
          }
        } catch (err) {
          if (err.code !== "ABORT_ERR") {
            throw err;
          }
        }
        console.log("Sink closed");
        defer.resolve();
      },
      close: () => {
        abortController.abort();
        return defer.promise;
      },
    };

    const stop = pDefer();
    let closed = false;

    // Simulate long running process
    // (without it the program quits without resolving stream.close)
    (async () => {
      await Promise.race([stop.promise, sleep(10000)]);
      console.log("Successful close:", closed);
      process.exit(0);
    })();

    (async () => {
      const shake = handshake(stream);
      const { sink } = shake.stream;

      // Same result with any or both of these commented
      // It only hangs on another line in it-handshake
      shake.write([]);
      shake.rest();

      // Doesn't work with (blocks the return function)
      /*
      sink(async function* () {
        let counter = 0;
        while (true) {
          yield counter++;
          await sleep(1000);
        }
      });
      */

      // Only works with finished sink
      //sink([]);

      await sleep(1000);
      console.log("Waiting for the stream to close");
      await stream.close();
      console.log("Stream closed");
      closed = true;
      stop.resolve();
    })();

The abort hangs on either of:
Here's a simpler example using only

    const abortable = require("abortable-iterator");
    const AbortController = require("abort-controller");
    const pDefer = require("p-defer");

    const abortController = new AbortController();
    const stop = pDefer();
    let stopped = false;
    const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

    async function* neverResolves() {
      await new Promise(() => {});
    }

    (async () => {
      await Promise.race([stop.promise, sleep(5000)]);
      console.log("Successful close:", stopped);
      process.exit(0);
    })();

    (async () => {
      setTimeout(() => abortController.abort(), 1000);
      for await (const data of abortable(neverResolves(), abortController.signal)) {
        console.log(data);
      }
      stopped = true;
      stop.resolve();
    })();

At this point I'd say that it's a design problem in

Funnily enough, replacing

    async function* neverResolves() {
      yield* (async function* () {
        while (true) {
          yield 1;
          await sleep(1000);
        }
      })();
      await new Promise(() => {});
    }

actually makes it so that it works and doesn't hang on the Promise... Maybe a bug in NodeJS? |
I found this thread with some information on potential problems that we are hitting: tc39/proposal-async-iteration#126 I tried earlier to go back to a previous version: trigger CI commit and tried it with libp2p and I am also getting problems. But I am pretty sure I tested it before doing the review about Sorry, I could not put my hands back in this stuff properly yet. |
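The behaviour described in that thread can be reproduced without any libp2p or abortable-iterator code. The sketch below is only an illustration: `returnSettles` and `yieldsForever` are hypothetical names, and the timeout is an arbitrary value used to detect a `return()` call that does not settle in time.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Stuck on an await that never settles, like the generator in the reproduction.
async function * neverResolves () {
  await new Promise(() => {})
}

// Reaches a `yield`, so it can be finished from the outside.
async function * yieldsForever () {
  while (true) {
    yield 1
    await sleep(10)
  }
}

// Hypothetical helper: starts the generator, then checks whether return() settles within `ms`.
async function returnSettles (gen, ms) {
  gen.next().catch(() => {}) // start the generator body
  return Promise.race([
    gen.return().then(() => 'returned'), // queued behind the pending next()
    sleep(ms).then(() => 'timed out')
  ])
}

returnSettles(neverResolves(), 100).then((result) => console.log(result)) // timed out
returnSettles(yieldsForever(), 100).then((result) => console.log(result)) // returned
```

Calls to `next()` and `return()` on an async generator are queued, so `return()` cannot complete while the generator body is still waiting on an unsettled promise. Any cleanup that awaits `return()` on such a source would wait just as long.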
No worries, nothing urgent. It's quite a bit of a rabbit hole too!
I think this would explain why the following works:

    async function* neverResolves() {
      yield* (async function* () {
        while (true) {
          yield 1;
          await sleep(1000);
        }
      })();
      await new Promise(() => {});
    }

Also, I wanted to submit a PR to

    const Writer = require("it-pushable");
    const writer = Writer();

    async function* neverResolves() {
      yield* writer;
    }

|
Per 2021-05-03 triage session, the ball is in @vasco-santos' court. |
I spent some time debugging this today, but could not land on a compromise solution. Thanks for all the work and test scenarios @filoozom This code here looks good and does what we expect, but the side effects are worrying when we actually have async generators in scene (like Consider a small resume of the iterations we had:
@filoozom created some simple test examples without libp2p worth taking a look: libp2p integration PR that also shows these errors: libp2p/js-libp2p#923 All this comes essentially from a limitation with async generators |
I'll cut out some time either Friday or some time next week to take a look. |
Based on #121 and fixes an issue when the sink was already written to.
This also prohibits the sink function from being called a second time, as it would break either way.
There's still the issue that the sink does not abort without writing to it.
Adding tests.
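As an illustration of prohibiting a second sink call, here is a minimal sketch. `createSink`, `consume` and the error message are hypothetical, not the code in this PR; the sketch only shows the run-once check.

```javascript
// Hypothetical factory: the returned sink rejects if it is called a second time.
function createSink (consume) {
  let sinkCalled = false
  return async function sink (source) {
    if (sinkCalled) {
      throw new Error('sink already called')
    }
    sinkCalled = true
    for await (const chunk of source) {
      consume(chunk)
    }
  }
}

// Usage
const received = []
const sink = createSink((chunk) => received.push(chunk))

sink([1, 2])
  .then(() => sink([3]))
  .catch((err) => console.log(err.message)) // sink already called
```

Because the sink is an async function, the second call fails with a rejected promise rather than a synchronous throw.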
Related to #120
Closed #121
Closes #115
Needs: