Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In/OutStream design flaw introduced with new error set changes. #764

Closed
Hejsil opened this issue Feb 9, 2018 · 12 comments
Closed

In/OutStream design flaw introduced with new error set changes. #764

Hejsil opened this issue Feb 9, 2018 · 12 comments
Labels
use case Describes a real use case that is difficult or impossible, but does not propose a solution.
Milestone

Comments

@Hejsil
Copy link
Contributor

Hejsil commented Feb 9, 2018

So, before error sets, we had In/OutStreams that could be implemented on a memory buffer, files or whatever would fit this interface, and could function to take In/OutStreams, and it would work on all of these.

Now, with error sets, I can see that In/Outstream was redesigned to be generic, taking an error set as the parameter. This ruins the useability of the In/OutStream:

error: expected type '&OutStream(error)', found '&OutStream(error{SystemResources,OperationAborted,IoPending,BrokenPipe,Unexpected,WouldBlock,FileClosed,DestinationAddressRequired,DiskQuota,FileTooBig,InputOutput,NoSpaceLeft,AccessDenied,})'

Now, you would have to pass the error type to all functions taking an In/OutStream, for it to work again:

fn foo(comptime Error: type, stream: &io.OutStream(Error)) !void { ... }

If this is the way they should now be used, then can I be sure that foo doesn't explode into a million instances of foo, as different OutStreams are used? Users of my function now also have to call it like this:

const file_stream = FileOutStream.init(&file);
foo(@typeOf(file_stream).Error, &file_stream.stream);

Which, idk, seems not so ergonomic for the users of my library. It relies on the fact, that the implementer of the OutStream adaptor was kind enough to actually provide this Error constant field. If the implementer did not do this, the user is more screwed, and would have to write this code:

const file_stream = FileOutStream.init(&file);
foo(@typeOf(file_stream.stream.writeFn).ReturnType.ErrorSet, &file_stream.stream);

Ooh god, plz no.

Hejsil added a commit to Hejsil/pokemon-randomizer that referenced this issue Feb 9, 2018
Code doesn't compile, as we now have this issue:
ziglang/zig#764
@Hejsil Hejsil closed this as completed Feb 9, 2018
@Hejsil Hejsil reopened this Feb 9, 2018
@andrewrk andrewrk added this to the 0.2.0 milestone Feb 9, 2018
@andrewrk
Copy link
Member

andrewrk commented Feb 9, 2018

Thanks for the early adoption. Let's get this sorted out.

doesn't explode into a million instances of foo, as different OutStreams are used?

This aspect should be fine. This isn't c++ where we duplicate all the work in every file and then rely on the linker to toss out redundant definitions. LLVM actually has an optimization pass that merges compatible function definitions together. We could potentially even notice when the difference between instantiations differs only in error sets and save time and space even in debug mode.

The API ergonomics and semantics on the other hand, let's talk about that.

Here are our options:

  • OutStream can become non-generic again and use error as the error set, the void * equivalent of error sets. It then "poisons" other error sets, because the union of any error set and error is error. (This would be the equivalent to how it was before error sets merge)
  • OutStream can become non-generic and only define a single error set - error{StreamingFailed} which is the error set for writeFn. This bubbles up to std.io.getStdOut() and you would not be able to distinguish between, for example error.BrokenPipe and error.NoSpaceLeft. It wouldn't make sense to hard code these errors into the OutStreamError set because you might make for example an OutStream on a memory buffer and then the only possible error should be OutOfMemory.
  • You can use var as the type of the outstream parameter in your example like this: https://github.com/zig-lang/zig/blob/2c697e50db36e4f9c0af8b1a9d357deeb8f80026/std/debug/index.zig#L143
  • I thought I had one more but I forgot. I'll keep thinking about it.

Sorry about all the breaking changes in a small amount of time. I've been trying to do the breaking changes all together and sooner to avoid even more pain later.

@Hejsil
Copy link
Contributor Author

Hejsil commented Feb 9, 2018

Actually, the code explosion is more a compile speed concern. foo could be big, and we might want to switch between loggers or something. EDIT: Ops, I guess I didn't read the last part about no difference in error sets optimization. That probably solves this.

Anyways, I think the ergonomics aspect is the bigger concern. I don't know the exact solution to this either, which is why I didn't propose anything. Something similar to var is probably what we want, where we can keep all our error code intact. Sadly, var throws away a lot of readability from the function signature. I don't want to propose some complicated "Concepts" feature, just so I can say foo takes any OutStream.

Maybe #470 is what would solve all of this? fn foo(stream: &OutStream(var)) !void, though I have no idea as to how this would work, as OutStream technically is a function being invoked at comptime to figure out the type.

@tiehuis
Copy link
Member

tiehuis commented Apr 24, 2018

Came across this a few weeks ago when doing a deflate implementation. https://github.com/tiehuis/zig-deflate/blob/master/deflate.zig#L99

This particular case was a bit more troublesome than simply taking a var however since I needed to also generate a wrapped type and store in the new case. Couldn't think of a way at the time to get the comptime Error from the type from a specific instance.

@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Jul 18, 2018
andrewrk added a commit that referenced this issue Dec 2, 2018
Relevant #764

dwarf debug info is modified to use this instead of std.os.File
directly to make it easier for bare metal projects to take advantage
of debug info parsing
@tgschultz
Copy link
Contributor

I was all set to begin overhauling std with comptime interfaces (#1829) when it occurred to me that, of all the interfaces I know about in std, only InStream and OutStream have any real reason to be comptime, by virtue of being the only ones that return dynamic error sets.

The ergonomics of both creating and using interfaces could be a lot cleaner if we didn't have to deal with that, so I'd like to try and gather some more suggestions and opinions on how we could handle errors with InStream and OutStream.

One thing that was not suggested was to have functions that actually care about the error set information take it as a parameter along with the stream interface:
pub fn func(stream: *InStream, comptime InError: type) InError!void

@andrewrk
Copy link
Member

One thing that was not suggested was to have functions that actually care about the error set information take it as a parameter along with the stream interface:

I think that's worth trying. Wouldn't it look like this?

pub fn func(comptime InError: type, stream: *InStream(InError)) InError!void

I got bit by this problem recently as well:

zig/std/debug/index.zig

Lines 1092 to 1093 in 0afb868

pub const DwarfSeekableStream = io.SeekableStream(anyerror, anyerror);
pub const DwarfInStream = io.InStream(anyerror);

You can see I gave up and used anyerror. But I think passing a comptime Error type is a good thing to try.

Then there's still code bloat to solve - this is a really gnarly problem. I certainly agree with the title of this issue that this is a design flaw.

@tgschultz
Copy link
Contributor

I think that's worth trying. Wouldn't it look like this?

pub fn func(comptime InError: type, stream: *InStream(InError)) InError!void

That's a way of doing it. I was thinking that func would @errSetCast from the anyerror returned by InStream's function pointers. One reason for it to be done like that is that a runtime-dispatched function would have to take *InStream(anyerror) in your example, but anyone calling that function would probably have to @ptrCast to get their InStream(SomeErrorSet) in there.

I suppose I'll experiment and see which is more of a headache to rewrite std with.

@andrewrk
Copy link
Member

andrewrk commented Dec 14, 2018

I just realized that @Hejsil already wrote all this up in the original description back in February.

@tgschultz
Copy link
Contributor

So something I've been asking myself is, why do we even care about the ErrorSet in the context of interfaces?

The function taking an interface for generic purposes can't possibly cover all possible error cases, so would always end up punting on the ones it didn't know about when it was written anyway. Presumably someone up the call stack might care about handling all the possible errors, but they would already have the ErrorSet since they have the implementation, right? They'd just have to cast to it.

So I guess what I'm saying is, what is the real problem if all interfaces were to return anyerror? The only thing I can really think of is that the functions called on the interface may add their own errors to the implementation's set (as InStream.readByte() adds EndOfStream), and there's no way for the caller to know that. That could be solved with multiple returns and comptime fields (see: #208).

@andrewrk
Copy link
Member

So I guess what I'm saying is, what is the real problem if all interfaces were to return anyerror?

That might be a worthwhile tradeoff, given the downsides of status quo. But to answer your question the real problem is that you can no longer switch on the error, and have the compiler make sure that you handled all the possible errors, even when the code is later changed so that more error codes are possible, and similarly the compiler would no longer detect dead code for handling an impossible error code. To summarize, it makes it more difficult to write code where it's clear that you've handled all the possibilities. You'll probably have to do a @errorCast which incurs safety-checked undefined behavior - a clear step down from compile-error protection.

@Hejsil
Copy link
Contributor Author

Hejsil commented Dec 19, 2018

Another option. Have interfaces with custom errors have a castToErrorSuperSet function, which @bitCast the struct into a new interface, whose interface is a superset of the original.

const E = error{A, B};
const stream = &OutStream(E).stream;

// Validates that the errorset passed in is a superset.
const stream2 = stream.castToErrorSuperSet(error{A,B,C});

@tgschultz
Copy link
Contributor

tgschultz commented Dec 19, 2018

I think with the interface pattern I'm trying right now this won't be a big deal. Most code will want to take the comptime interface (retrieved as my_stream.inStreamInterface()) as var which will come with full ErrorSet information and inference. If the stream needs to be taken and/or stored as an abstract (type-erased, runtime) interface (retrieved as my_stream.inStream()), then everything will be anyerror because there really isn't any alternative, and you have the option of also taking the ErrorSet as a parameter.

@andrewrk andrewrk modified the milestones: 0.4.0, 0.5.0 Feb 19, 2019
@andrewrk andrewrk modified the milestones: 0.5.0, 0.6.0 Aug 22, 2019
@andrewrk andrewrk modified the milestones: 0.6.0, 0.7.0 Feb 10, 2020
andrewrk added a commit that referenced this issue Mar 10, 2020
The main goal here is to make the function pointers comptime, so that we
don't have to do the crazy stuff with async function frames.

Since InStream, OutStream, and SeekableStream are already generic
across error sets, it's not really worse to make them generic across the
vtable as well.

See #764 for the open issue acknowledging that using generics for these
abstractions is a design flaw.

See #130 for the efforts to make these abstractions non-generic.

This commit also changes the OutStream API so that `write` returns
number of bytes written, and `writeAll` is the one that loops until the
whole buffer is written.
@andrewrk andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 10, 2020
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 Nov 6, 2020
@SpexGuy SpexGuy added the use case Describes a real use case that is difficult or impossible, but does not propose a solution. label Feb 12, 2021
@andrewrk andrewrk modified the milestones: 0.9.0, 0.10.0 May 19, 2021
@andrewrk andrewrk modified the milestones: 0.15.0, 0.12.0 Oct 4, 2023
@andrewrk
Copy link
Member

andrewrk commented Oct 4, 2023

I think this can be considered solved by #17344. Any further discussion is welcome.

@andrewrk andrewrk closed this as completed Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
use case Describes a real use case that is difficult or impossible, but does not propose a solution.
Projects
None yet
Development

No branches or pull requests

5 participants