-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C# Design Notes for Mar 10 and 17, 2015 #1648
Comments
I definitely think option 1 (Attributes/Opt-in diagnostics) is the right approach. The opt-in model also allows you to defer the decision on which rules to enforce to the user - they could opt-in to increasing levels of pedantry. One fringe benefit you don't mention with respect to the Attribute/Opt-in model is that it would provide a ready-made boilerplate for others looking to add similar attributes/checks to their code. Given that this is likely the most useful model for an organization looking to add their own custom checks, having that boilerplate would be handy. |
I think the attribute route is right, too. Given that massive sweeping changes to the CLR and framework are likely out of the question non-nullable reference types seems better represented as annotative for analysis purposes. This is how it works in Java, although I think that with integrated compiler support it can be a little better in C#. IIRC, this is also how Apple Swift handles it since Cocoa and that entire framework has no concept of optional types. But, like with Cocoa, once a direction has been established I do believe that it is very important that the entire framework be updated to support these attributes. Obviously going the attribute route brings limitations. Statements like As for dereferencing, I think that the compiler could intelligently determine scopes where the variable could not be string? x = ...;
if (x != null) {
int i = x.Length; // legal, x can't possibly be null here
} This is similar to how Java analyzers in IntelliJ deal with nullability. The pattern matching syntax also works well. But do we really need both |
This feature feels like something that will be weak in practice, or worse, misleading, and thus rarely used. Adding syntax that has no real teeth lulls the programmer into a false sense of security, making them elide null checks and Since the best we can do is make compile-time suggestions that a variable is nullable or not, why not just use attributes Also agree with HaloFour that the I appreciate that this a commonly requested feature and you guys want to provide a solution here- but it seems pretty clear the solution is not going to be much better than the tools that are already available to the community. I'd much rather see design effort spent on new features which will have a large impact, rather than half-baked solutions to impossible problems. |
I agree this change sounds toothless. Backwards compatibility has always been Microsoft's forte, so I understand your reluctance to introduce breaking changes, but I don't think we need special syntax for something that cannot be enforced across assemblies. Attributes, perhaps with automatic |
same opinion here. |
I don't think "string?" is redundant syntax. It clearly sets intention that I personally think that not nullable reference types is as big and 2015-03-28 13:54 GMT+01:00 qrli notifications@github.com:
|
I like all the ideas, but I'd also like to make the suggestion to always generate the null checks while in debug mode. Yes it's additional IL bloat and added overhead, but for debug mode that's okay. That will hopefully strike a good balance of having the checks while developing, but during release they don't cause performance issues. In the long term is it possible to have the compiler aware of the analyzers that have run, or possibly even have optimizers be additional components like analyzers? That way the compiled code can be optimized with the advantage of the guarantees that the analyzer enforces. (The same idea would be applicable for purity analyzers, perhaps others as well) |
Additional diagnostics to consider (pulled from a
|
How does that nullable reference type is going to help. Reference types are already nullable isn't it? What would be the difference between |
@SriramSakthivel The point of |
Given that we can't start over that would leave C#/.NET with effectively five different kinds of variables. That just seems way over complicated to me. Not to mention confusing because these new types will not play well with generics and cannot be enforced in a huge number of scenarios. On top of that, unless there is also a coordinated effort across the BCL we'll see little benefit across most code-bases, and we'll still see potentially tons of NREs popping up out of calls into the BCL due to parameters still being "raw" reference types. If we're just going the attribute route I think that we should just skip the special syntax and do exactly what Java IDEs do, respect and enforce those attributes as warnings or errors. Add the attributes to the BCL and make Roslyn support issuing warnings based on their application to method parameters. I'd save syntax changes for when there can be a coordinated effort between all of the teams to actually modify the CLR to have real semantics for non-nullable reference types as a core concept of the framework and modify the BCL to support those types properly across the board. |
1. "...but you wouldn't be able to automatically trust a string! to not be null." It sounds very close to what ReSharper is doing: In that case it definitely should not be implemented on a syntax level, but rather use plain attributes. 2. Opt-in diagnostics. Since null is a bad practice anyway, it means we will need to put the exclamation mark pretty much everywhere - without even trusting it completely. I tried it with ReSharper annotations and it sucks. I would rather agree with this obnoxious individual: http://simonhdickson.github.io/blog/2014/02/27/fsharp-31-as-csharp-60-strict-mode Maybe some sort of compiler flag to turn opt-in into opt-out? 3. As a thought experiment: would it be possible to implement everything F# has to offer but with C# syntax? The only downside is that it would cannibalize every other .NET language. |
Once the BCL and other major libraries have been annotated with non-nullability, it might be interesting to introduce a "strict" C# mode, where the default reference type is non-nullable and Hopefully, non-nullability is much more common than nullability in practice, so this feature shouldn't force correct code to! be! littered! with! exclamation! points! because of backward compatibility concerns. |
Thanks all, those are great comments! There's a lot of disagreement it seems; I'll comment on some of the themes emerging: Encoding: There seems to be agreement that if we do the feature, and as long as we cannot modify the CLR for a first class representation, we should take the attribute route. Is the feature worthwhile?: To be honest we aren't sure ourselves. What we do know is that it is worthwhile aggressively pursuing a design when a feature is so highly requested. When a bunch of really smart people inside and outside of Microsoft have collectively produced the best possible design, then we can see if what we ended up with should go in. If our best just isn't good enough, then we should and will feel free to walk away. Syntax vs. attributes: I appreciate the comments about dedicated syntax being misleading, or saving it for "when" the CLR can represent things natively. That said, there are very good reasons for having built-in syntax:
Are three kinds of types necessary?: Why add both Among the comments, there isn't agreement on which of the two should be represented by the unannotated type. We'd have to pick wisely. There are good arguments for making
Arguably, with a two-type approach you end up with a much better language in the end. However, the transition is more jarring: the minute you opt in (through whichever mechanism we land on), we would reinterpret all your as-yet unannotated reference types as non-nullable, causing possibly a slew of warnings. Wouldn't this critically discourage adoption? Also, we end up in a world where the same code ends up meaning different things, depending on whether it's in C# 6 (and earlier) or C# 7 (and later). When you look at a piece of source text, how do you know which meaning is intended? We're not entirely settled on the matter. Annotating BCL types: I agree that this feature has much less value if BCL types aren't annotated. We can only annotate them if such annotation doesn't become a breaking change. The opt-in semantics help with that: if you just want to continue to compile against the BCL, well if you don't opt in you don't see the annotations and you aren't broken. Annotating existing libraries will only break people who opt-in. There's a reasonable YAFIYGI argument to be made that breaking people who opt in is exactly what they want! |
Despite the outcry of requests for non-nullability, I don't believe this is the case. You always need something to represent an empty or unknown value; if you're not using |
Probably it is possible to add new constraint to generic, that will specify not null constraint for type parameter (it can be encoded with attribute same as for dynamic). Looks like compiler has ability to restrict user with correct implementing such generic (modreq on all methods with input/output of such parameter, always assign value on initialization). In that case, it would be safe to trust such generics and user will be able to create G<T!>, if G has been created with such constraint. |
There is a conflict between getting CLR support (so that the feature has more teeth), and compatibility. With CLR support the core BCL libraries are either retrofitted with this feature (potentially breaking compatibility with existing clients), or they are not retrofitted (in which we do not reap most of the benefits of the feature). |
That's simply not true. For instance the following method:
This method makes no sense to call it on a null user. I don't know if anyone has done the statistics, but on the majority of projects I've worked on, null was very rarely needed. |
@mirhagk And if the user doesn't have a favorite color? Even if you use an empty string you're relying on a sentinel value for the sake of representing a non-value. Either way, my points aren't that either side is wrong. It's that the ship has long since sailed. Adding it after the fact is inherently problematic. I do like the concept when it is handled as an intrinsic part of a new language. It's kind of slick in Apple Swift. Short of some painful and massive CLR changes what I'd rather see come out of this is the concept of methods advertising what represents legal arguments, enforcing those arguments and having the analysis to identify that at compile time. Nullability is just the tip of that iceberg. Why can't |
Then the correct signature would be:
You have to pass a username in always.
Yes ideally this should work in conjuction with contracts, and have Although the method of having an analyzer check contracts and enforce them doesn't require any C# changes, it only requires a roslyn analyzer, along with ensuring that analyzer is actually used (perhaps include it in the default visual studio templates). |
How about this: [assembly: NullOptOut()] Opting out of nullability assembly-wide, opting in only where necessary - can't get any lighter :).
Maybe that should be addressed? If there was some way of specifying custom attributes in more places then it would make units of measure possible/easier as well: https://msdn.microsoft.com/en-us/library/dd233243.aspx If they wouldn't make sense as post-compilation meta-data then they could be erased. |
Using attributes for |
@gafter What about a compromise between the two? There would be CLR-enforced This would mean that the feature would have teeth ( The issues I can see with this solution:
|
@svick Assuming that the compiler uses the same encoding method used for That said, I don't really see a reasonable way for the compiler to ever enforce non-nullability on generic types. Assuming the parameter type of |
@MadsTorgersen So suppose I start to use C# 7 with So I don't see "jarring" at all, because it is an attribute based approach, not enforced to the metal anyway. Maybe I missed something? |
In my experience, the default is almost always that an instance should be non-null. Therefore, having |
@ErikSchierboom, how would you represent what you know to be the city where I live? |
@paulomorgado Depends on the context. If I keep a list of people living in cities, it would be a non-nullable string, otherwise it would be a nullable. My point is that when I use reference types, most of the time what I actually want is for them to be non-nullable. |
Are you guys analyzing how Kotlin has gone about this problem? They are dealing with the same issues with an unchangeable JVM but have made it work beautifully with solutions that I'm not seeing being discussed here (ie: see points 2 & 3 below). 1. Opt-in is not the way 2. Different Versions of C# problem 3. The undeclared type solution NOTE: Kotlin takes this very approach and it works beautifully. See the Platform Types concept here: http://kotlinlang.org/docs/reference/java-interop.html. They use the syntax "Type!" for this semantic, meaning "either Type or Type?" instead of the proposed "??" above. I find their syntax a tad awkward and think something like "??" would be more appropriate. The browsing non-attributed Java code and return values in the IntelliJ IDE shows up using the "platform type" syntax. This same approach can be applied to C#, specifically since Microsoft controls both the C# language and the de facto IDE of the ecosystem. Guys let's please not sacrifice the future of C# because of conservative fears for which there are creative solutions for. We can have our cake and eat it too! |
As Mads says, though,
Except OmniSharp is a thing these days, so there's half a dozen other IDEs not under Microsoft's control that will have to include such UX features. I estimate the odds of that to be slim to none.
I'd like to see this as well (maybe), but the big discussions surrounding non-nullability indicate that the implementation should be done as complete and correct as possible from the first stable release onward. I'd rather see them postpone a game-changer like this until they can get it right (or dropped altogether if needed), rather than having it be a half-baked solution resulting in it not being used at all. |
When Code Contracts failed* at least there was some comfort in knowing that it's just an API. When syntax-based LINQ failed*, it will stay in the language forever. (*) failed, as in not getting wide popularity on it's own. Code Contracts is too slow and verbose to be worth it, even Microsoft doesn't want to use it - look at EF7 repo. LINQ query is competing with LINQ extension methods and not winning, there is not even a query-based way to say |
Because it would be meaningless in a query. |
@GeirGrusom what about |
@dsaf that is a perfect example :) |
@dsaf I'm not sure how one would compile the metrics but I would be very surprised if the LINQ query extensions weren't extensively used. Using a single project like EF7 as an anecdote isn't very representative. I personally switch between them depending on the situation. For the most part I prefer the extension method syntax, but I think that the query syntax is more succinct for group joins, outer joins or projecting a sub-expression ( Aside that, I agree. I'm sure that the compiler team has to weigh the reality of features failing to gain traction and ending up as cruft grammar within the language. Understanding why Code Contracts didn't take off is probably significantly more important than what behavior the proposed language feature might have. |
@dsaf Oh, well, in that case we agree. JetBrains probably has it (mostly) right. More important than what it does if the parameters are invalid is how it helps to avoid them from being invalid. The bread-and-butter cases overlap perfectly with the not-nullability proposals. Also agreed about LINQ. There are plenty of common use cases where people do mix both and doing so within a single expression isn't the most pleasant syntax in the world. It's hard to say where to draw that line since LINQ is so open-ended. VB.NET does offer var dictionary = from person in people
where person.Age >= 21
take 10
select person
into dictionary by person.Name; Wading into some kind of convention of decorating static extension methods with attributes to support extending the query syntax would be interesting but probably become very messy very quickly. |
@HaloFour: Are you aware of how F# solves this problem quite elegantly? |
@axel-habermaier, I'm not. Would you care to share how? |
@paulomorgado: Well, F# has this concept of computation expressions which are roughly comparable to monads. The syntax for those expressions is user-extensible and they use it to implement sequence expressions (like However, I'm not so sure that a similar concept would fit well into C#, but it might be worth looking into it. Maybe there is a way to generalize the implementations of those features, make them user-extensible and all of that while still retaining backwards compatibility. |
I think it is easily possible to make LINQ query operators extensible via libraries: just project extension methods to query operators. That at least solves majority simple cases. e.g. var dictionary = from person in people
where person.Age >= 21
Take 10
select person
ToDictionary person.Name; Note the above Taking this way, it seems only |
@qrli If you want a syntax like that, I believe you would somehow need to distinguish between F# uses an attribute for that. |
@svick I'm thinking in a simpler way. The var dictionary = people
.Where(person => person.Age >=21)
.Take(10)
.Select(person => person)
.ToDictionary(person => person.Name); And then some optimization to remove the Select(). |
Design notes have been archived at https://github.com/dotnet/roslyn/blob/future/docs/designNotes/2015-03-10%2C17%20C%23%20Design%20Meeting.md but discussion can continue here. |
C# Design Meeting Notes for Mar 10 and 17, 2015
Agenda
These two meetings looked exclusively at nullable/non-nullable reference types. I've written them up together to add more of the clarity of insight we had when the meetings were over, rather than represent the circuitous path we took to get there.
Initial port and addition of README.md #1. Nullable and non-nullable reference types
The core features on the table are nullable and non-nullable reference types, as in
string?
andstring!
respectively. We might do one or both (or neither of course).The value of these annotations would be to allow a developer to express intent, and to get errors or warnings when working against that intent.
#2. Opt-in diagnostics
However, depending on various design and implementation choices, some of these diagnostics would be a breaking change to add. In order to get the full value of the new feature but retain backward compatibility, we therefore probably need to allow the enforcement of some or most of these diagnostics to be opt-in. That is certainly an uncomfortable concept, and adding switches to the language changing its meaning is not something we have much of an appetite for.
However, there are other ways of making diagnostics opt-in. We now have an infrastructure for custom analyzers (built on the Roslyn infrastructure). In principle, some or all of the diagnostics gained from using the nullability annotations could be custom diagnostics that you'd have to switch on.
The downside of opt-in diagnostics is that we can forget any pretense to guarantees around nullability. The feature would help you find more errors, and maybe guide you in VS, but you wouldn't be able to automatically trust a
string!
to not be null.There's an important upside though, in that it would allow you to gradually strengthen your code to nullability checks, one project at a time.
#3. Representation
The representation of the annotations in metadata is a key decision point, because it affects the number of diagnostics that can be added to the language itself without it being a breaking change. There are essentially four options:
string?
be represented asstring
plus an attribute saying it's nullable. This is similar to how we representdynamic
today, and for generic types etc. we'd use similar tricks to what we do fordynamic
today.NullableRef<T>
andNonNullableRef<T>
or something like that. The structs would have a single field containing the actual reference.We can probably dispense with 3 and 4. We've never used modreq's before, and who knows how existing compilers (of all .NET languages!) will react to them. Besides, they cannot be used on type arguments, so they don't have the right expressiveness. A truly new metadata annotation has similar problems with existing compilers, and also seems like overkill.
Options 1 and 2 are interesting because they both have meaning to existing compilers.
Say a library written in C# 7 offers this method:
With option 1, this would compile down to something like this:
A consuming program in C# 6 would not be constrained by those attributes, because the C# 6 compiler does not know about them. So this would be totally fine:
Unfortunately, if something is fine in C# 6 it has to also be fine in C# 7. So C# 7 cannot have rules to prevent passing null to a nonnullable reference type, or prevent dereferencing a nullable reference type!
That's obviously a pretty toothless - and hence useless - version of the nullability feature in and of itself, given that the value was supposed to be in getting diagnostics to prevent null reference exceptions! This is where the opt-in possibility comes in. Essentially, if we use an attribute encoding, we need all the diagnostics that make nullability annotations useful be opt-in, e.g. as custom diagnostics.
With option 2, the library would compile down to this:
Now the C# 6 program above would not compile. The C# 6 compiler would see structs that can't be null and don't have a Length. Whatever members those structs do have, though, would be accessible, so C# 7 would still have to accept using them as structs. (We could mitigate this by not giving the structs any public members).
For the most part, this approach would make the C# 6 program able to do so little with the API that C# 7, instead of adding restrictions, can allow more things than C# 6.
There are exceptions, though. For instance, casting any returned such struct to
object
would box it in C# 6, whereas presumably the desired behavior in C# 7 would be to unwrap it. This is exactly where the CLR today has special behavior, boxing nullable value types by first unwrapping to the underlying type if possible.Also, having these single-field structs everywhere is likely going to have an impact on runtime performance, even if the JIT can optimize many of them away.
Probably the most damning objection to the wrapper structs is probably the degree to which they would hamper interoperation between the different variations of a type. For instance, the conversion from
string!
tostring
and on tostring?
wouldn't be a reference conversion at runtime. Hence,IEnumerable<string!>
wouldn't convert toIEnumerable<string>
, despite covariance.We are currently leaning strongly in the direction of an attribute-based representation, which means that there needs to be an opt-in mechanism for enforcement of the useful rules to kick in.
#4. Potentially useful rules to enforce
Don't dereference
C?
: you must check for null or assert that the value is not null.Don't pass
null
,C
orC?
toC!
: you must check for null or assert that the value is not null.Don't leave
C!
fields unassigned: require definite assignment at the end of the constructor. (Doesn't prevent observing null during initialization)Avoid
default(C!)
: it would be null!Don't instantiate
C![]
: it's elements would be null. This seems like a draconian restriction - as long as you only ever read fields from the array that were previously written, no-one would observe the default value. Many data structures wrapping arrays observe this discipline.Don't instantiate
G<C!>
: this is because the above rules aren't currently enforced on even unconstrained type parameters, so they could be circumvented in generic types and methods. Again, this restriction seems draconian. No existing generic types could be used on nonnullable reference types. Maybe the generic types could opt in?Don't null-check
C!
: oftentimes using e.g.?.
on something that's already non-nullable is redundant. However, since non-nullable reference types can be null, maybe flagging such checks is not always so helpful?We very much understand that these rules can't be perfect. The trade-off needs to be between adding value and allowing continuity with existing code.
#5. Safely dereferencing nullable reference types
For nullable reference types, the main useful error would come from dereferencing the value without checking for null. That would often be in the shape of the null-conditional operator:
However, just as often you'd want the null test to guard a block of code, wherein dereferencing is safe. An obvious candidate is to use pattern matching:
It is somewhat annoying to have to introduce a new variable name. However, in real code the expression being tested (
ns
in the above example) is more likely to be a more complex expression, not just a local variable. Or rather, theis
expression is how you'd get a local variable for it in the first place.More annoying is having to state the type again in
ns is string! s
. We should think of some shorthand, likens is ! s
orns is var s
or something else.Whatever syntax we come up with here would be equally useful to nullable value types.
#6. Generating null checks for parameters
There'd be no guarantees that a
string!
parameter actually isn't going to be null. Most public API's would probably still want to check arguments for null at runtime. Should we help with that by automatically generating null checks forC!
parameters?Every generated null check is performance overhead and IL bloat. So this may be a bad idea to do on every parameter with a non-nullable reference type. But we could have the user more compactly indicate the desire to do so. As a complete strawman syntax:
Where the double
!!
means the type is non-nullable and a runtime check should be generated.If we choose to also do contracts (#119), it would be natural for this feature to simply be a shorthand for a null-checking
requires
contract.The text was updated successfully, but these errors were encountered: