I/O module: where should I/O read calls leave the fileReader on failure? #21345
Comments
This seems right to me, specifically for the cases you mention here. Could the guiding philosophy be "routines that can read a variable/unknown amount of data (and then return that data) leave it where the failure occurs, and those that are more 'all or nothing' revert it"? (That seems like what you may be proposing, though it's not obvious to me.) Other cases to think about would be ones like |
Right, another tricky case is |
That case is interesting. My first attempt to support it would be to return the position inside the format string that caused problems, but the values that could go into each of the different format strings can vary in length, and the problem could be with the string it is trying to match (e.g. it could say "My int is 15, my bool is false, but my real is 3.5"). So maybe refer to each of the arguments, including the format string? |
For any of the
read(my*); // arg 1
read(my*); // arg 2
read(my*); // arg 3
and then to return how many of them were successful; so we'd only need to rewind the cursor to the beginning of the last unsuccessful one. For the [edit: so, for example, |
I feel like having |
That seems very different and new relative to what we've traditionally supported, isn't it? If so, it feels "maybe nice?" but not necessary to me, at least for 2.0.
We've discussed having the compiler warn about dropping return values on the floor, which would help with this. |
I'm very confident that you can write such patterns today. You could have your own |
Sorry, what I was trying to say is that our |
I'm not sure, but let's move to a code example to make it more concrete.
use IO;
var f = openTempFile();
f.writer().write("1 hello");
var reader = f.reader();
writeln("channel position is ", reader.offset());
try {
  var x: int;
  var y: 2*int;
  var z: string;
  reader.read(x, y, z); // input is "1 hello", so what does this do?
} catch e {
  writeln("caught error ", e);
  writeln("channel position is ", reader.offset());
}
What happens when we actually run this?
This issue is asking what the channel position should be in this case (and similar cases). The two patterns I expect we will use are:
Now, I interpreted your comments as potentially suggesting that, in this case, |
Oh shoot, you're right that this is what I was missing (I was thinking it could both return 1 and throw). So we could potentially collapse down the responses after #21345 (comment), which I think is where I took us off-course. |
In a design discussion on this topic, we decided that reading methods on the
With this change, the following example (from above):
use IO;
var f = openTempFile();
f.writer().write("1 hello");
var reader = f.reader();
writeln("channel position is ", reader.offset());
try {
  var x: int;
  var y: 2*int;
  var z: string;
  reader.read(x, y, z);
} catch e {
  writeln("caught error ", e);
  writeln("channel position is ", reader.offset());
}
would report a file offset of
We also discussed the possibility of including the offset and the number of entries read in the error type, so that a user could query those values when handling the error. This would be a non-breaking change, so we decided that it could be designed and added post-2.0. |
@jeremiah-corrado - in the discussion, I was saying that people could use a single-argument |
It doesn't look like I'm not sure what should be done about the memory demands or potential performance drawbacks though. A couple of ideas:
var nonRevertingReader = myFile.reader(revertOnErrors=false); (We'd need to decide which mode
reader.readArray(myLargeArray, revertOnError=false); |
In principle, I think we could alternatively add |
Also, IMO having a default that is "don't mark" would basically be equivalent to what we have today; where to retry you would have to opt in to marking. But you can do that today just by calling |
Me too, @benharsh, do you remember? I believe you ran into this question with the serializer design recently.
I agree, I think the default should be "do mark" with some way to opt-out. |
The problem with adding a default argument alongside varargs is that the compiler will try binding the last actual to the default argument rather than the varargs. So calls like |
@mppf, given that we could add a |
I'm concerned we'll lose the performance in key benchmarks. It would be OK with me to try to implement it, but I think we should discuss it again.
Actually I think the current situation is not so bad if we document it. I wasn't trying to say that the default has to be "do the marking for me" but rather that what we have today is close enough to a default of "don't mark" that it might just be better to ask people to Note that we can add a new formal to the non-varargs It seems like a tricky choice:
|
After some further offline discussion, we've decided to keep the current behavior of leaving the Our reasoning is that the current behavior prioritizes performance over error-recovery, which we believe to be a better default behavior. The original proposal of automatically
To address the concerns in this issue about how users can recover from errors that occur part-way through multiple read operations, the recommendation is to manually mark and revert the fileReader:
proc readItems(fr: fileReader): (int, bool, 3*real) throws {
  var x: int, y: bool, z: 3*real;
  fr.mark(); // mark the fileReader position before attempting to read
  try {
    fr.read(x, y, z);
  } catch e {
    fr.revert(); // there was an error; revert to the starting position
    throw new Error("unable to read values...");
  }
  fr.commit(); // there was no error; pop from the "mark stack"
  return (x, y, z);
}
As a post-2.0 effort, we'd also like to investigate the possibility of providing some mechanism for opting into automatic |
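The same mark/revert/commit calls can also be used to retry a failed read in a different way, rather than re-throwing. The following is only a sketch: `readIntOrWord` is a hypothetical helper (not part of the IO module), the `fr: fileReader` formal mirrors the recipe above, and `locking=false` is passed explicitly as an assumption to keep the example single-task.

```chapel
use IO;

// Hypothetical helper: try to read an int; if that fails, rewind to where
// the attempt started and read the same input as a string instead.
proc readIntOrWord(fr: fileReader): string throws {
  var i: int;
  fr.mark();            // remember the position before the attempt
  try {
    fr.read(i);
    fr.commit();        // success: pop the mark and keep the new position
    return i:string;
  } catch {
    fr.revert();        // failure: go back to the marked position
  }
  var s: string;
  fr.read(s);           // retry from the same position as a string
  return s;
}

var f = openTempFile();
var w = f.writer(locking=false);
w.write("1 hello");
w.close();              // make sure the data is flushed before reading

var r = f.reader(locking=false);
writeln(readIntOrWord(r));  // first item parses as an int
writeln(readIntOrWord(r));  // second item is not an int, so it is re-read as a string
```

Because the position is only marked when a caller asks for it, reads that never need recovery avoid the extra buffering, which matches the performance-first default described above.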
I don't believe we have any additional action items here (that aren't tracked in other places), closing |
If there was an error reading, such as a formatting error, an I/O read call will generally throw. Supposing that the error is caught, what happens to the channel's position? Some possibilities:
Note that the answer to this question has interplay with quite a few other elements of the I/O function under consideration.
fileReader has to do buffering, which can have a performance impact. In particular, any read data would have to end up in the fileReader buffer for potentially being read again in a different way. (Maybe, for something like a binary read into an array, it could do the OS read into the array and then copy it to the buffer if there was a failure? Not sure how useful that is).
ref intent formal that it is updating, rather than an out intent formal or something that is being returned.
out formal since these are never initialized if it throws. Perhaps such calls need to not throw?
See also #19496 (comment), which points out that readBinary(array) is throwing if the entire array is not read, but it also stores the data directly in the ref formal array, without indicating how many elements were read. This takes key information away if the error is caught and error recovery is attempted.
Perhaps we want to use different strategies on different calls. "Go back to where the read started" seems like a lot for readBinary(array), but it seems more reasonable for read(myInt).