Proposal for non-nullable references (and safe nullable references) #227
Comments
Follow-on Post

1. Introduction

This follows on from my previous post, which contained the main body of the proposal. This post lists some other cases which are mostly variations on what is presented in the original post, and which would just have cluttered up the original post if I had put them all in.

3. Mandatory References

Should an uninitialised mandatory reference trigger an error? No, because there are situations where you need more complex initialisation. But the reference can't be used until it is initialised.

Dog! mandatoryDog; // OK, but the compiler is keeping a close eye on you. It wants the variable initialised asap.
mandatoryDog.Bark(); // Compiler Error - you can't do anything with the reference until it is initialised.
anotherMandatoryDog = mandatoryDog; // Compiler Error - you can't do anything with the reference until it is initialised.
// There is some complexity in how the variable is initialised (which is why it wasn't initialised when it was declared).
if (getNameFromFile)
{
using (var stream = new StreamReader("DogName.txt"))
{
string name = stream.ReadLine();
mandatoryDog = new Dog(name);
}
}
else
{
mandatoryDog = new Dog("Mandatory");
}
mandatoryDog.Bark(); // OK - compiler knows that the reference has definitely been initialised

See also the Constructors section of my original post, which attempts to address similar issues in the context of constructors.

6. Using Nullable References

The original post showed how to use an 'if' / 'else' statement to apply a null check to a nullable reference so that the compiler would let us use that reference inside the 'if' block. Note that when you are in the 'else' block, there is no point actually using the nullable reference because you know it is null in this context. You might just as well use the constant 'null', as this is clearer. I would like to see this as a compiler error:

if (nullableDog != null)
{
// Can do stuff here with nullableDog
}
else
{
Dog? myNullableDog1 = nullableDog; // Compiler Error - it is pointless and misleading to use the variable when it is definitely null.
Dog? myNullableDog2 = null; // OK - achieves the same thing but is clearer.
}

Note that even though the same reasoning applies to traditional (general) references, we can't enforce this rule or we would break existing code:

if (generalDog != null)
{
// Can do stuff here with generalDog (actually we can do stuff anywhere because it is a general reference).
}
else
{
Dog myGeneralDog1 = generalDog; // OK - otherwise would break existing code.
}

Now, here are some common variations on the use of the 'if' / 'else' statement that the compiler recognises. Firstly, you do not have to have the 'else' block if you don't need to handle the null case:

if (nullableDog != null)
{
nullableDog.Bark(); // OK - the reference behaves like a mandatory reference.
}

Also, you can check for null rather than non-null:

if (nullableDog == null)
{
nullableDog.Bark(); // Compiler Error - the reference still behaves as a nullable reference.
}
else
{
nullableDog.Bark(); // OK - the reference behaves like a mandatory reference.
}

You can also have 'else if' blocks, in which case the reference behaves the same in each 'else if' block as it would in a plain 'else' block:

if (nullableDog != null)
{
nullableDog.Bark(); // OK - the reference behaves like a mandatory reference.
}
else if (someOtherCondition)
{
nullableDog.Bark(); // Compiler Error - the reference still behaves as a nullable reference.
}
else
{
nullableDog.Bark(); // Compiler Error - the reference still behaves as a nullable reference.
}

You can also have 'else if' with a check for null rather than a check for non-null:

if (nullableDog == null)
{
nullableDog.Bark(); // Compiler Error - the reference still behaves as a nullable reference.
}
else if (someOtherCondition)
{
nullableDog.Bark(); // OK - the reference behaves like a mandatory reference.
}
else
{
nullableDog.Bark(); // OK - the reference behaves like a mandatory reference.
}

You can also have additional conditions in the 'if' statement ('AND' or 'OR'):

if (nullableDog != null && thereIsSomethingToBarkAt)
{
nullableDog.Bark(); // OK - the reference behaves like a mandatory reference in this scope.
}
else
{
nullableDog.Bark(); // Compiler Error - reference still behaves as a nullable reference in this scope (we don't know whether it is null or not, as we don't know which condition made us reach here).
Dog? myNullableDog = nullableDog; // OK - unlike the example at the top of this section, it does make sense to use the nullableDog reference here because it could be non-null.
}

if (nullableDog != null || someOtherCondition)
{
nullableDog.Bark(); // Compiler Error - reference still behaves as a nullable reference in this scope (we don't know whether it is null or not, as we don't know which condition made us reach here).
Dog? myNullableDog = nullableDog; // OK - unlike the example at the top of this section, it does make sense to use the nullableDog reference here because it could be non-null.
}
else
{
nullableDog.Bark(); // Compiler Error - reference still behaves as a nullable reference in this scope (in fact we know for certain it is null).
Dog? myNullableDog = nullableDog; // Compiler Error - as in the example at the top of this section, it doesn't make sense to use the nullableDog reference here because we know it is null.
}

You can also have multiple checks in the same 'if' statement:

if (nullableDog1 != null && nullableDog2 != null)
{
nullableDog1.Bark(); // OK - the reference behaves like a mandatory reference in this scope.
nullableDog2.Bark(); // OK - the reference behaves like a mandatory reference in this scope.
}

Note that when you are in the context of a null check, you can do anything with your nullable reference that you would be able to do with a mandatory reference (not only accessing methods and properties, but anything else that a mandatory reference can do):

if (nullableDog != null)
{
nullableDog.Bark(); // OK - the reference behaves like a mandatory reference.
Dog! mandatoryDog = nullableDog; // OK - the reference behaves like a mandatory reference.
}

On a slightly different note - we have established that we can use the following language features to allow a nullable reference to be dereferenced:

string name1 = (nullableDog != null ? nullableDog.Name : null); // OK
string name2 = nullableDog?.Name; // OK

But it is pointless to apply these constructs to a mandatory reference, so the following will generate compiler errors:

string name3 = (mandatoryDog != null ? mandatoryDog.Name : null); // Compiler Error - it is a mandatory reference so it can't be null.
string name4 = mandatoryDog?.Name; // Compiler Error - it is a mandatory reference so it can't be null.

In fact a mandatory reference cannot be compared to null in any circumstances.

9. Class Libraries

What if you have an existing assembly, compiled with an older version of the C# compiler, and you want it to use a class library which has new style references? There should be no issue here, as the older compiler will not look at the new property on ParameterInfo (because it doesn't even know that the new property exists), and in a state of blissful ignorance will treat the library as if it only had traditional (general) references.

On another note, in order to facilitate rapid adoption of the new style references, an attribute like this could be introduced:

[assembly: IgnoreNewStyleReferencesInternally]

This would mean that the ParameterInfo properties would be generated, but the new style references would be ignored internally within the library, so that library writers could get a version of their library with the new style references to market more rapidly. The code within the library would of course not be null reference safe, but it would be no less safe than it already was. They could then make their library null safe internally for a later release. |
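As a point of comparison, the guard-based flow analysis sketched above is close to what C# 8 eventually shipped as nullable reference types, where `Dog?` marks a nullable reference and the plain, unannotated `Dog` plays the role of `Dog!`. A minimal sketch (assumes C# 8+ with the nullable context enabled; `Dog` is a stand-in class, not from the proposal itself):

```csharp
#nullable enable

class Dog
{
    public string Name { get; }
    public Dog(string name) { Name = name; }
    public void Bark() { }
}

class Example
{
    static void Use(Dog? nullableDog)
    {
        // nullableDog.Bark();        // warning CS8602: possible null dereference

        if (nullableDog != null)
        {
            nullableDog.Bark();       // OK - flow analysis knows it is not null here
            Dog mandatoryDog = nullableDog; // OK - unannotated Dog acts like Dog!
        }
    }
}
```

The main difference from the proposal is that C# 8 issues warnings rather than the compiler errors suggested here.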
This is also interesting reading: http://twistedoakstudios.com/blog/Post330_non-nullable-types-vs-c-fixing-the-billion-dollar-mistake |
I'm all for this. However, in example 3 you declare a mandatory reference but then do not initialise it. Wouldn't it be better to require a mandatory reference to be initialised the moment it is declared, the way that Kotlin does it? |
Hi Miista, I hadn't heard of Kotlin before, but having now read its documentation at http://kotlinlang.org/docs/reference/null-safety.html, I realise that I have (unintentionally) pretty much stolen its null safety paradigm :-) Regarding initialisation, I have tried to allow programmers a bit of flexibility to do the sort of initialisation that cannot be done on a single line. It would be possible to be stricter and say that if they do want to do this they have to wrap their initialisation code in a method:

Dog! mandatoryDog = MyInitialisationMethod(); // The method does all the complex stuff and returns a mandatory reference

This may be seen as being too dictatorial about coding style - but it's something worthy of discussion. |
Having read the article by Craig Gidney (http://twistedoakstudios.com/blog/Post330_non-nullable-types-vs-c-fixing-the-billion-dollar-mistake), I now realise that I was on the wrong track saying that "the different types of references are not different 'types' in the way that int and int? are different types". I have amended my original post to remove this statement and I also re-wrote the section on 'var' due to this realisation. |
By the way you can vote for my proposal on UserVoice if you want: https://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/7049944-consider-my-detailed-proposal-for-non-nullable-ref As well as voting for my specific proposal you can also vote for the general idea of adding non-nullable references (this has quite a lot of votes): https://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2320188-add-non-nullable-reference-types-in-c |
Craig Gidney's article (mentioned above) raises the very valid question: what is the compiler meant to do when asked to create an array of mandatory references?

var nullableDogs = new Dog?[10]; // OK.
var mandatoryDogs = new Dog![10]; // Not OK - what does the compiler initially fill the array with?

He explains: "The fundamental problem here is an assumption deeply ingrained into C#: the assumption that every type has a default value". This problem can be dealt with using the same principle that has been used previously in this proposal - teaching the compiler to detect a finite list of clear and intuitive 'null-safe' code structures, and having the compiler generate a compiler error if the programmer steps outside that list. So what would the list look like in this situation? Obviously the compiler will be happy if the array is declared and populated on the same line (as long as no elements are set to null):

Dog![] dogs1 = { new Dog("Ben"), new Dog("Sophie"), new Dog("Rex") }; // OK.
Dog![] dogs2 = { new Dog("Ben"), null, null }; // Compiler Error - nulls not allowed.

The following syntax variations are also ok:

var dogs3 = new Dog![] { new Dog("Ben"), new Dog("Sophie"), new Dog("Rex") }; // OK.
Dog![] dogs4 = new [] { new Dog("Ben"), new Dog("Sophie"), new Dog("Rex") }; // OK.

The compiler will also be happy if we populate the array using a loop, but the loop must be of the exact structure shown below (because the compiler needs to know at compile time that all elements will be populated):

int numberOfDogs = 3;
Dog![] dogs5 = new Dog[numberOfDogs];
for (int i = 0; i < dogs5.Length; i++)
{
dogs5[i] = new Dog("Dog " + i);
}

The compiler won't let you use the array in between the declaration and the loop:

int numberOfDogs = 3;
Dog![] dogs5 = new Dog[numberOfDogs];
Dog![] myDogs = dogs5; // Compiler Error - cannot use the array in any way.
for (int i = 0; i < dogs5.Length; i++)
{
dogs5[i] = new Dog("Dog " + i);
}

The compiler will also be ok if we copy from an existing array of mandatory references:

Dog![] dogs6 = new Dog[numberOfDogs];
Array.Copy(dogs5, dogs6, dogs6.Length);

Similarly to the previous case, the array cannot be used in between being declared and being populated. Also note that the above code could throw an exception if the source array is not long enough, but this has nothing to do with the subject of mandatory references. The compiler will also allow us to clone an existing array of mandatory references:

Dog![] dogs7 = (Dog![])dogs6.Clone();

This seems to me like a reasonable list of recognised safe code structures, but people may be able to think of others. |
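The default-value assumption that Gidney describes is easy to observe in today's C#: a newly allocated array of any reference type starts out with every slot set to null, which is exactly why `new Dog![10]` has no obvious meaning. A minimal sketch (`Dog` is a stand-in class):

```csharp
class Dog
{
    public string Name { get; }
    public Dog(string name) { Name = name; }
}

class Program
{
    static void Main()
    {
        var dogs = new Dog[3];       // every element is default(Dog), i.e. null
        System.Console.WriteLine(dogs[0] == null); // prints True

        // There is no non-null value the runtime could have filled the
        // slots with, so safety depends on populating before reading:
        for (int i = 0; i < dogs.Length; i++)
        {
            dogs[i] = new Dog("Dog " + i);
        }
    }
}
```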
It is great to see some thinking on non-nullable and safely nullable reference types. This gist is another take on it - adding only non-nullable reference types, not safely nullable ones. Not only Kotlin but also Swift has an approach to this. Of course, many functional languages, such as F#, don't even have the issue in the first place. Indeed their approach of using T (never nullable) and Option<T> (where the T can only be gotten at through a matching operation that checks for null) is probably the best inspiration we can get for how to address the problem. I want to point out a few difficulties and possible solutions.

Guards and mutability

The proposal above uses "guards" to establish non-nullness; i.e. it recognizes checks for null and remembers that a given variable was not null. This does have benefits, such as relying on existing language constructs, but it also has limitations. First of all, variables are mutable by default, and in order to trust that they don't change between the null check and the access, the compiler would need to also make sure the variable isn't assigned to. That is only really feasible to do for local variables, so the guard approach breaks down for anything more complex, say a field access.

I think a better approach is to follow the functional languages and use simple matching techniques that test and simultaneously introduce a new variable, guaranteed to contain a non-null value. Following the syntax proposed in #206, something like:

if (o is Dog! d) { ... d.Bark(); ... }

Default values

The fact that every type has a default value is really fundamental to the CLR, and it is an uphill battle to deal with that. Eric Lippert's blog post points to some surprisingly deep issues around ensuring that a field is always definitely assigned. But the real kicker is arrays. How do you ensure that the contents of an array are never observed before they are assigned? You can look for code patterns, as proposed above. But it will be too restrictive.
Say I'm building a … Say that the same … One option here is to just not allow arrays of non-nullable reference types - people will have to use …

Library compatibility

Think of a library method:

public string GetName(Dog d);

In the light of this feature you might want to edit the API. It may throw on a null argument, so you want to annotate the parameter:

public string GetName(Dog! d);

Depending on your conversion rules, this may or may not be a breaking change for a consumer of the library:

Dog dog = ...;
var name = GetName(dog);

If we use the "safe" rule that … Instead we could consider allowing an implicit conversion from … On the other end of the API there's also a problem. Assume that the method never returns null; it should be completely safe to add … Not quite. Notice that the consumer stores the result in a … |
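The matching approach suggested above (test and bind in one step) is essentially what C# later shipped as type patterns: `is` never matches null, so the introduced variable is guaranteed non-null, and being a fresh local it cannot be reassigned between the check and the use. A sketch in current C# (`Dog` is a stand-in class):

```csharp
class Dog
{
    public void Bark() { }
}

class Example
{
    static void Handle(object? o)
    {
        // 'is' never matches null, so 'd' is non-null inside the block,
        // and no other code can mutate it in between.
        if (o is Dog d)
        {
            d.Bark();
        }
    }
}
```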
The way I see it:

public string GetName(Dog! dog) { ... }
Dog! dog = ...;
var name = GetName(dog); // May return null

If you wanted to return a non-nullable reference, you would declare it as such:

public string! GetName(Dog! dog) { ... }
Dog! dog = ...;
var name = GetName(dog); // Will never return null

The non-nullability in the last example could even be enforced by the compiler (to some extent – there may be some edge cases I can't think of). |
In order to maintain backwards compatibility, I believe the type should be inferred to the loosest (may be the wrong term) possible scope, e.g. … Trying to set a non-nullable reference to … |
Yes yes yes, my god yes. The billion dollar mistake infuriates me. It's absolutely insane that references allow null by default. Since I know C# will never be willing to fix the billion dollar mistake, this is at least a viable alternative. And it removes the need to use the stupid … |
The "billion dollar mistake" was having a type system with null at all. This would not fix that, it just makes it slightly less painful. |
@gafter what I want most is for C# to drop nulls entirely unless a reference is specifically marked nullable, but I know that will never happen |
@dotnetchris There is no way to shoehorn that into the existing IL or C# semantics. |
I think there is value in stepping back and watching the Swift and Obj-C communities battle it out over this issue. Apparently, despite the slick appearance of optionals in Swift, they create a number of severe nuisance scenarios, particularly in writing the initialization of a class: Swift Initialization and the Pain of Optionals.

My concern has always been that without … Ultimately, in my opinion, non-nullable references feel a little like Java checked exceptions. Sure, it seems great on paper, and even better with perfectly idiomatic examples, but it also creates obnoxious barriers in practice which encourage developers to take the easy/lazy way out, thus defeating the entire purpose. It feels like erecting guard-rails along a hairpin curve on the edge of a cliff. Sure, the purpose is safety, but perceived safety can encourage recklessness, and I think that developers should be learning how to code more defensively (not just for simple validation, but to also never trust your inputs), not assuming that someone else will relieve them of that burden. Just a devil's advocate rebuttal from someone who would probably welcome them to the language if done well. 😄 |
@HaloFour checked exceptions are the only positive thing I have to say about Java, other than ICloneable actually, you know, cloning. |
I really can't understand what the problem is about … Look at a … I don't know that much about F#, but I can't see … Sure it's a pain to have to be looking out for … The …

So far, the compiler does everything it can to generate code that will behave as intended at run time. For that, it relies on the CLR. What you are proposing goes more in the direction of "looks good on paper, hope it goes well at run time". Having the compiler yield warnings just because your intention is not verifiable, even at compile time, is a very bad idea. Compiler warnings should not be yielded for something that the developer cannot do anything about.
The cast should be either possible or not. Regarding … Is … |
@paulomorgado what it fundamentally boils down to is that I, as the author of the code, should have the authority to determine whether null is valid or absolutely invalid. If I state this code may never allow null, the compiler should within all reason attempt to enforce this statically. While the … It being labeled "The Billion Dollar Mistake" is not hyperbole; I actually expect it to have cost multiple billions, if not tens of billions, at this point. |
What costs billions of dollars are bad programmers and bad practices, not … The great thing about null reference exceptions is that you can always point to where the problem was and fix it. Changing the compiler without having runtime guarantees will just be fooling yourself and, when bitten by the mistake, you might not be able to know where it is or fix it. Sure, I'd like to have non-nullable reference types as much as I wanted nullable value types. But I want it done right. And I don't see how that will ever be possible without changing the runtime, as happened with nullable value types. |
Thanks everyone for engaging in discussion on this topic. I have some responses to what people have said but I haven't had time yet to write them down due to working day and night to meet a deadline. I'll try and post something over the weekend. |
On Generics: I don't think generics should be allowed to be invoked with non-nullable references unless there is a constraint on the generic method.

public static void InvalidGeneric<T>(out T result) { result = default(T); }
public static void OkGeneric<T>(out T result) where T : class! { result = Arbitrary(); }
public static void Bar()
{
string! input;
InvalidGeneric(out input); // Illegal as it would return a mandatory with a null reference
OkGeneric(out input); // OK.
} |
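For comparison, current C# expresses a similar idea with the `notnull` generic constraint (C# 8+): the caller cannot instantiate `T` with a nullable type, and the method avoids `default(T)` entirely by only assigning known non-null values. A hedged sketch, not the proposal's `class!` syntax:

```csharp
#nullable enable

static class Generics
{
    // T is constrained to non-nullable types, so string? or int? are rejected.
    public static void OkGeneric<T>(out T result, T seed) where T : notnull
    {
        result = seed; // assigns a known non-null value; no default(T) involved
    }
}

class Program
{
    static void Main()
    {
        Generics.OkGeneric(out string s, "hello");
        System.Console.WriteLine(s); // prints hello
    }
}
```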
IMO converting a mandatory reference to a "weaker" one should always be allowed and implicit. I.e. if a function takes a nullable reference as an argument you should be able to pass in a mandatory reference. Same if the function takes a legacy reference. You're not losing anything here, the code is expecting weaker guarantees than you can provide. If your code works with a nullable or general reference, then clearly it wouldn't break if I pass it a mandatory reference (it will just always go down any non-null checks inside). I also think nullable to general and vice versa should be allowed. They're effectively the same except for compile time vs runtime checking. So dealing with older code would be painful if you couldn't use good practices (nullable references) in YOUR code without having to make the interface to legacy code ugly. Chances are people will just keep using general references if you add a bunch of friction to that. Make it easy and encouraged to move to safer styles little by little, IMO. This last case may warrant a warning ("this will turn compile time check into runtime check"). The first two cases (mandatory/nullable implicitly converts to general) seems like perfectly reasonable code that you would encourage people to do in order to transition to a safer style. You don't want to penalize that. |
@paulomorgado As much as I hate to bring Java up, its lack of reified types means that generic type information is not around at runtime, and yet in 10 years I've never once accidentally added an integer to a list of strings. (Don't get me wrong, not having reified types causes other issues, usually around reflection, but reflection can cause all sorts of bad things if you don't know what you're doing.) While runtime checking may sound like a good sanity check, it comes at a cost, and it's by no means required to make your system verifiably correct. (Assuming of course you aren't hacking the innards of your classes through reflection.) Re: empty string vs non empty string: Those are two different types, and should be treated as such. You couldn't do any compile-time verification that you didn't pass an empty string to the constructor of NonEmptyString, but you'd at least catch it at the place of instantiation, rather than later classes doing the check and making it difficult to trace back to where the illegal value originated. The same theory goes for converting nullable types to non-null types. By the way, Ceylon does something very similar to this proposal. Might be worthwhile looking at them. |
Is the
|
Nice that the TOP 1 C# requested feature, non-nullable reference types, is alive again. We discussed the topic in depth some months ago. It's a hard topic, with many different alternatives and implications. Consequently, it is easier to write a comment with a naive proposal than to read and understand the other solutions already proposed. In order to work in the same direction, I think it is important to share a common basis about the problem. On top of the current proposal, I think these links are important:
Back to the topic. I think the concept explained here lacks a solution for two hard problems:

Generics

It's explained how to use the feature in a generic type (using …), but … This problem is really important to solve because generics are often used for collections, and in 99% of the cases you don't want nulls in your collection. It's also challenging because it is a must that it works transparently for non-nullable references and nullable value types, even if they are implemented in completely different ways at the binary level. We already have many collections to choose from (…). I think unifying the type system is really important, but this has the consequence that …

Library compatibility

This solution focuses on local variables, but the challenge is in integrating with other libraries, legacy or not, written in C# or other languages. It's important that the tooling is strong in these cases, and that safety is preserved. Unfortunately this requires run-time checks. Also, it is important that library writers (including the BCL) can choose to use the feature without fear of undesired consequences for their client code. I propose three compilation flags: strict, transitory and legacy (similar to …). As Lucian made me see, this solution is better than branching C# into two different languages: one where … |
@HaloFour but it's weird. I'm absolutely sure that I don't want to check whether a passed value is null when I said that it's not null. Imagine the code:
It would be VERY strange if I got a NullReferenceException here. I see only two possibilities: either the compiler automatically adds not-null checks everywhere in the code, or it becomes a CLR feature, so it won't be possible to reference C# 8.0 (or whatever version) assemblies from an earlier one. I think they will use the first approach because, as I said, there is a branch predictor, so the extra not-null check will be predicted and skipped most of the time, and it also makes it easier to implement: for example, if we leave it as a compiler feature, it's hard to make reflection work properly with it. And if we have runtime checks in methods, we don't have to do anything special for reflection. |
I am only relaying the proposal as it currently stands. Both of those approaches have already been discussed and, as of now, neither is being implemented. This will be purely a compiler/analyzer feature. It won't even result in compiler errors, just warnings, which can be disabled and intentionally worked around. I believe the latest version of this proposal is here: #5032. As mentioned, this is at least an additional C# version out, so it's all subject to change. |
@Pzixel I assume (hope) that |
Actually, the non-nullable version would be |
@HaloFour I don't know if T? is good syntax, because it breaks the entire existing codebase. I totally agree that it's more consistent than mixing ?, ! and so on, but if we are looking for backward compatibility, it will break everything in the code. And of course it should be an error, not a warning. Why? Because it's a type mismatch, and that is clearly an error. We should get CS1503 and that's all. It's weird to get a null when I said that I can't get null. If I want a warning I can use the [NotNull] attribute instead of introducing a whole new type. And that makes sense. |
I've already made that argument, but it seems that this is the direction that the team wants to go anyway. I believe that the justification is that the vast majority case is wanting a non-nullable type so having to explicitly decorate them would lead to a lot of unnecessary noise. Pull the band-aid once.
Primarily because of how much of a change it is to the language and because it can never be achieved with 100% certainty. I'm not particularly interested in rehashing all of the comments already on these proposals, but justifications are listed there. |
I'm pretty sure you're going to have to break backwards compatibility anyways, or make type inference useless.
What is the inferred type of x on Line A? If it's inferred as non-null (the expected type), then our code at line B will no longer compile. If we infer on Line A that x is nullable, then everything compiles as it used to, but now your type inference is inferring a less useful type. Either devs won't use non-null types, or devs will stop using type inference. I can imagine which of those two options will win out... |
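A representative sketch of the inference dilemma (assuming an example along these lines for "Line A" and "Line B"):

```csharp
var x = "hello"; // Line A: inferred as non-null string, or nullable string??
x = null;        // Line B: legal today; rejected if Line A inferred non-null
```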
You don't need to rehash anything; basically I only want to get a type mismatch where there is one, instead of warnings and so on. Whether it emits checks or is purely a compiler feature is a topic to discuss, but if we are talking about the interface, a type mismatch should definitely be an error. |
Local variables will have a nullable type state that can be different from one point in the program to another, based on the flow of the code. The state at any given point can be computed by flow analysis. You won't need to use nullable annotations on local variables, because it can be inferred. |
@gafter var is used to infer the type at the point of declaration; we shouldn't analyze any flow after that. |
@Pzixel "Nullability" isn't being treated as a separate type, it's a hint to the compiler. The flow analysis is intentional, to prevent the need for extraneous casts when the compiler can be sure that the value would not be null:

public int? GetLength(string? s) {
if (s == null) {
return null;
}
// because of the previous null check the compiler knows
// that the variable s cannot be null here so it will not
// warn about the dereference
return s.Length;
} |
So I'm still a bit confused. @gafter's comment insinuated that flow analysis would go upwards, while @HaloFour's example demonstrates it going downwards. Downwards flow analysis would be pretty much required in any implementation, and in fact R# already does that sort of analysis with the [NotNull] attributes. However, without the upwards flow analysis, I don't think type inference would be able to provide much benefit, unless breaking backwards compatibility was an option. |
@HaloFour int and int? are completely different types. I really want the same UX for reference types. I can use attributes like [NotNull], [Pure] and so on for a warning. I want to be absolutely sure that I CAN'T receive null if it is marked as not null. So in the provided example:
Of course, ideally I'd like to see something like … |
Simply put, that wouldn't be possible. Even if massive CLR changes were on the table it probably couldn't be done. The notion of a default value is too baked in. Generics, arrays, etc. - there's no way to get around the fact that … Flow analysis is a compromise, one that can be fitted onto the existing run time and one that can work with a language that has 15 years of legacy that it needs to support. It follows the Eiffel route: know where you can't make your guarantees and solve through flow analysis. Even then, sometimes the developer can (and should) override. |
IIRC the type inferred by string s2 = null; // no warning?
int i1 = s1.Length; // warning of potential null dereference
string? s2 = "foo";
int i2 = s2.Length; // no warning |
Generics were introduced once; another major change is possible too. Nobody says that it's easy, but they have to do it to implement it properly. It's the only way to make a strong type system. …

Now I see that it's even simpler than I thought. Just treat them as other types and that's all. Reflection throws a mismatch error at runtime, the compiler does it at compile time, and everyone is happy. And it could even still be a compile-time feature. The only problem is with older versions of C#, but changes in reflection would be changes in the runtime, so it will be a feature for the new .NET. We could make a compatible version with runtime checks: for example, when targeting .NET 4.6 and below we emit runtime checks (if blabla != null); with .NET 4.7 we assume that reflection does its job at runtime and remove them from the code. Elegant solution. |
Generics were additive and worked entirely within the existing semantics of the run time.
That "other" type can't prevent the reference type that it contains from being null. |
But if we get whole new types, we can program safely, as in functional languages. There is no … Yes, C# has legacy, but it's only about syntax, not about its spirit or internals. Again, we do not need to change anything in the CLR, we do not need to change anything in C# or reflection; we just add new types with a couple of rules about upcast and downcast, and several compiler errors. That's enough to implement it in full power. |
@Pzixel Warnings have the huge benefit that you can just ignore or even suppress them. All legacy code and all samples continue to work. With this approach you can use all the benefits of null safety, but you don't have to. If you write all your code using this new feature, you would solve nearly all your NREs.

You won't ever get 100% safety anyway, because there are still things like COM interop, where evil C++ might null something, and unsafe C#, where someone could sneak nulls into your non-nullable fields. So since 100% safety is not possible anyway, and breaking changes are off the table, this is our best option to still get close to 100% safety.

It might be worth discussing whether the compiler could also automatically insert null checks when referencing libraries which were not developed with the null-safety feature, as an optional compiler switch, in addition to the already discussed warning switch for referenced libraries. This would solve some more corner cases and bring you closer to 100%. |
The DLR is completely separate from the CLR and isn't relevant to this discussion.
The
And 15 years worth of applications/libraries that you're asking to be broken.
Which creates a solution which is just as leaky (since you cannot possibly define a wrapping type that actually enforces this behavior) but has the added benefit of breaking every written piece of code. |
@lukasf again, we don't change the meaning of existing keywords; we just add another one, where:
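The code block that followed was lost in extraction. A hypothetical sketch of the kind of `!` annotation this comment seems to propose (the syntax is the commenter's proposal, not actual C#):

```csharp
string a = "may be null";  // today's string keeps its existing meaning
string! b = "never null";  // a new suffix marks the non-nullable type
b = a;                     // compiler error - cannot assign a possibly-null value
a = b;                     // OK - widening to the nullable type is always safe
```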
Something like this. |
@Pzixel

Non-Nullable:

Nullable:

Even when using "!" to create a new, non-nullable type, you'd still get massive problems, especially with libraries. Update one library to non-nullable and BOOM, all projects and libraries referencing that library would stop working, because they all see different, unknown types now. So once you upgrade one lib, you would basically need to update all libs. Again, you would create a kind of different language, where you could not use new libs from old projects, and you would have lots of trouble using old libs with new code.

If the language were created from the ground up, I would surely plead for full null safety as can be seen in other new languages. But C# has been out there for more than a decade, and there is lots of legacy code, lots of libs, lots of samples and knowledge. You cannot introduce such a radical breaking change into an existing and well-established language. It's sad, I'd love to see a really strong nullability concept, but it is not going to happen. So now we'd better look at what the realistic options are. Better an almost-safe nullability system than no system at all. |
@lukasf Use a modopt or modreq to create an overload, and you can have backwards compatibility. Sadly this proposal does not seem to be heading in that direction. |
@lukasf yes, we have legacy, thus we must choose between two evils. The first is a slightly confusing syntax, while with the second you receive nothing except warning noise. Warnings were never a guarantee, while I want to be SURE that if I write a not-null parameter, it will NEVER be null. It's bizarre to write a not-null parameter and then check internally whether it's really not null. In that case we don't even need this extra syntax, since attributes already do this very thing: NotNullAttribute will warn you if you pass a possible null. Why do we need this syntax then, for locals only? Well, locals are local enough and we don't really need this feature there; it is useful, but there are plenty of more significant features to implement.

About BOOM: nobody blamed Microsoft for making nullables absolutely different types, because it is logically correct: it's not a type, it's a wrapper for the type. When we are talking about not-null types, we require a one-side cast to be available, like this one:
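The example was lost in extraction; a sketch of the one-way (widening-only) cast being described, using the hypothetical `!` suffix:

```csharp
string! notNull = "hello";
string maybeNull = notNull;  // OK - implicit widening from not-null to nullable
string! error = maybeNull;   // compiler error - the other direction requires an explicit check
```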
It could be implemented like this:
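The struct definition was lost in extraction. From the `NotNullReference<string>` usages in the reply below, the concept was a wrapper struct along these lines (a sketch, not the commenter's exact code):

```csharp
public struct NotNullReference<T> where T : class
{
    private readonly T value;

    public NotNullReference(T value)
    {
        // reject null at construction time
        if (value == null) throw new ArgumentNullException("value");
        this.value = value;
    }

    public T Value
    {
        get
        {
            // guard against the zero-initialised (default) struct
            if (value == null) throw new InvalidOperationException("Not initialised");
            return value;
        }
    }

    // one-way widening cast back to the plain reference type
    public static implicit operator T(NotNullReference<T> reference)
    {
        return reference.Value;
    }
}
```

Note that the `Value` guard turns the compile-time guarantee into a runtime check for default-initialised instances, which is exactly the weakness the next reply points out.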
It's just a concept (it requires extra space for the struct, etc.), but it CAN be done right now! We only need some syntax sugar for it; internally it could be managed in some other way. But the idea is the same: we CAN'T get a null reference when we have said we don't want one. If we merely want to be warned, welcome to the Attributes world: NotNull, CanBeNull, Pure and so on. |
You can always turn on warnings-as-errors, and then you will get your errors at compile time. If you ignore warnings and then complain that you were not warned about problems, well, that does not really make sense.

The "!" syntax is problematic. The normal case should be that a variable is not nullable; only very few variables are really meant to be nullable. It does not make sense to annotate more than 97% of all variables with a "!". That is useless clutter. The default must be non-nullable, with only the few exceptions getting specially marked. Also, with your concept you would not only add clutter to almost all variables, you would add one struct per reference, which is unnecessary memory and runtime overhead.

I think the general direction has already been decided by the language team, so it's not much use to continue this discussion. The "!" operator is not going to come, and a strong type system is also not going to come. C# was just not made with this in mind, and trouble would arise at various points if a strong system were now somehow forced onto it. The warning approach feels very natural: it's easy to use, has the right syntax as already known from value types, and does not cause breaking changes or other compatibility issues. When used properly, it will lead to the same safety you would get from a strong system.

Null safety is always going to be a compromise for an existing language, and I think the warning approach is a good compromise. You can't have it all, not on C#. Maybe at some point we will get a new language where all this is taken care of from the beginning: M#, D#, whatever... |
It'd have to be... Nope, a wrapper struct can't prevent zero-initialization:

```csharp
// all structs have a default constructor that zero-inits the struct
NotNullReference<string> s1 = new NotNullReference<string>();
// but constructors aren't necessary anyway since the stack is zero-inited anyway
NotNullReference<string> s2 = default(NotNullReference<string>);
// and the CLR has to zero-init array allocations
NotNullReference<string>[] s3 = new NotNullReference<string>[10];
// and then you have generics
public T Foo1<T>() { return default(T); }
public T Foo2<T>() where T : new() { return new T(); }
NotNullReference<string> s4 = Foo1<NotNullReference<string>>();
NotNullReference<string> s5 = Foo2<NotNullReference<string>>();
``` |
@HaloFour there's also always |
@HaloFour as I said, it's a concept. Sure, there could be some workarounds for it, in the same manner that immutable strings are not immutable (you can always pin a string and change its content!). But it's what I call... And again, it's a concept and it might not work quite as expected, but the C# devs don't have such limits. They can even get CLR support for any feature they request, if it's worth it. |
This is now tracked at dotnet/csharplang#36. |
#1. Overview
This is my concept for non-nullable references (and safe nullable references) in C#. I have tried to keep my points brief and clear so I hope you will be interested in having a look through my proposal.
I will begin with an extract from the C# Design Meeting Notes for Jan 21, 2015 (#98):
There's a long-standing request for non-nullable reference types, where the type system helps you ensure that a value can't be null, and therefore is safe to access. Importantly such a feature might go along well with proper safe nullable reference types, where you simply cannot access the members until you've checked for null.
This is my proposal for how this could be designed. The types of references in the language would be:
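The list of reference types was lost in extraction. From the sections that follow, it covered three forms (the `!` and `?` suffixes are the proposal's syntax):

```csharp
Dog! mandatoryDog;  // mandatory reference - can never be null
Dog? nullableDog;   // nullable reference - may be null, must be null-checked before use
Dog generalDog;     // general reference - today's C# reference, unchanged
```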
Important points about this proposal:
Conversely, code will continue to behave identically if the '!' and '?' are removed (but the code will not be protected against any future code changes that are not 'null safe').
The Design Meeting Notes cite a blog post by Eric Lippert (http://blog.coverity.com/2013/11/20/c-non-nullable-reference-types/#.VM_yZmiUe2E) which points out some of the thorny issues that arise when considering non-nullable reference types. I respond to some of his points in this post.
Here is the Dog class that is used in the examples:
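The class definition was lost in extraction. A minimal reconstruction consistent with the examples (a `Dog` takes a name and can `Bark`):

```csharp
public class Dog
{
    public string Name { get; private set; }

    public Dog(string name)
    {
        Name = name;
    }

    public void Bark()
    {
        Console.WriteLine(Name + " says Woof!");
    }
}
```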
#2. Background
I will add a bit of context that will hopefully make the intention of the idea clearer.
I have thought about this topic on and off over the years and my thinking has been along the lines of this type of construct (with a new 'check' keyword):
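The illustration of the construct was lost in extraction. A hypothetical sketch of how such a 'check' keyword might have looked (`FindDog` is an assumed lookup method that may return null):

```csharp
Dog? nullableDog = FindDog("Rex");

check (nullableDog)
{
    // 'check' performs the null test and, within this block,
    // the compiler treats nullableDog as safe to dereference
    nullableDog.Bark();
}
```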
The 'check' keyword does two things: it performs a null check on the reference, and it tells the compiler that within its block the reference can safely be dereferenced.
It then occurred to me that since it is easy to achieve the first objective using the existing C# language, why invent a new syntax and/or keyword just for the sake of the second objective? We can achieve the second objective by teaching the compiler to apply its rules wherever it detects this common construct:
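That common construct is simply the ordinary null check (sketched here with the proposal's `nullableDog` example):

```csharp
if (nullableDog != null)
{
    nullableDog.Bark(); // the compiler can prove this dereference is safe
}
```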
Furthermore it occurred to me that we could extend the idea by teaching the compiler to detect other simple ways of doing null checks that already exist in the language, such as the ternary (?:) operator.
This line of thinking is developed in the explanation below.
#3. Mandatory References
As the name suggests, mandatory references can never be null:
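The example was lost in extraction; a sketch in the proposal's syntax:

```csharp
Dog! mandatoryDog = new Dog("Mandatory"); // OK
Dog! impossibleDog = null;                // compiler error - mandatory references can never be null
```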
However the good thing about mandatory references is that the compiler lets us dereference them (i.e. use their methods and properties) any time we want, because it knows at compile time that a null reference exception is impossible:
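The example was lost in extraction; a sketch:

```csharp
mandatoryDog.Bark(); // OK - no null check needed; a null reference exception is impossible
```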
(See my additional post for more details.)
#4. Nullable References
As the name suggests, nullable references can be null:
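The example was lost in extraction; a sketch in the proposal's syntax:

```csharp
Dog? nullableDog = new Dog("Nullable"); // OK
nullableDog = null;                     // OK - nullable references may be null
```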
However the compiler will not allow us (except in circumstances described later) to dereference nullable references, as it can't guarantee that the reference won't be null at runtime:
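The example was lost in extraction; a sketch:

```csharp
nullableDog.Bark(); // compiler error - the reference might be null here
```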
This may make nullable references sound pretty useless, but there are further details to follow.
#5. General References
General references are the references that C# has always had. Nothing is changed about them.
#6. Using Nullable References
So if you can't call methods or access properties on a nullable reference, what's the use of them?
Well, if you do the appropriate null reference check (I mean just an ordinary null reference check using traditional C# syntax), the compiler will detect that the reference can be safely used, and the nullable reference will then behave (within the scope of the check) as if it were a mandatory reference.
In the example below the compiler detects the null check and this affects the way that the nullable reference can be used within the 'if' block and 'else' block:
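The example was lost in extraction. Based on the null-check pattern described in the surrounding text, it looked roughly like this:

```csharp
if (nullableDog != null)
{
    nullableDog.Bark(); // OK - the compiler knows the reference is not null here
}
else
{
    // here the reference is known to be null, so dereferencing it is a compiler error
}
```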
The compiler will also recognise this sort of null check:
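The example was lost in extraction. A plausible candidate for this pattern is the early-exit check:

```csharp
if (nullableDog == null)
{
    return; // the null case exits early...
}
nullableDog.Bark(); // ...so from here on the compiler knows the reference is not null
```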
And this:
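This example was also lost in extraction. A plausible candidate is the short-circuiting compound condition:

```csharp
if (nullableDog != null && nullableDog.Name == "Rex") // OK - right side only runs when not null
{
    nullableDog.Bark();
}
```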
The compiler will also recognise when you do the null check using other language features:
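The examples were lost in extraction. Given that the Background section mentions the ternary operator, they plausibly included checks like these:

```csharp
string name = nullableDog != null ? nullableDog.Name : "(no dog)"; // OK in the true branch
nullableDog?.Bark(); // the null-conditional operator is safe by construction
```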
Hopefully it is now clear that if the new style references are used throughout the code, null reference exceptions are actually impossible. However, once the effort has been made to convert the code to the new style references, it is important to guard against the accidental use of general references, as this compromises null safety. There needs to be an attribute such as this to tell the compiler to prevent the use of general references:
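The attribute example was lost in extraction; a sketch with a hypothetical attribute name:

```csharp
[assembly: ForbidGeneralReferences] // hypothetical: forbid general references assembly-wide
```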
This attribute could also be applied at the class level, so you could for example forbid general references for the assembly but then allow them for a class (if the class has not yet been converted to use the new style references):
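The class-level example was lost in extraction; a sketch, again with hypothetical attribute names:

```csharp
[AllowGeneralReferences] // hypothetical: opt a not-yet-converted class back out
public class LegacyClass
{
    // general references remain allowed here
}
```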
(See my additional post for more details.)
#7. Can we develop a reasonable list of null check patterns that the compiler can recognise?
I have not listed every possible way that a developer could do a null check; there are any number of complex and obscure ways of doing it. The compiler can't be expected to handle cases like this:
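The example was lost in extraction. One kind of obscure check the text is alluding to is a null test hidden behind a separate flag, which flow analysis cannot reasonably track:

```csharp
bool wasNotNull = nullableDog != null;
DoSomethingElse();      // hypothetical call - the flag may no longer reflect reality
if (wasNotNull)
{
    nullableDog.Bark(); // still a compiler error - the check is too indirect to follow
}
```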
However the fact that the compiler will not handle every case is a feature, not a bug. We don't want the compiler to detect every obscure type of null check construct. We want it to detect a finite list of null checking patterns that reflect clear coding practices and appropriate use of the C# language. If the programmer steps outside this list, it will be very clear to them because the compiler will not let them dereference their nullable references, and the compiler will in effect be telling them to express their intention more simply and clearly in their code.
So is it possible to develop a reasonable list of null checking constructs that the compiler can enforce? Characteristics of such a list would be:
I think the list of null check patterns in the previous section, combined with some variations that I am going to put in a more advanced post, is an appropriate and intuitive list. But I am interested to hear what others have to say.
Am I expecting compiler writers to perform impossible magic here? I hope not - I think that the patterns here are reasonably clear, and the logic is hopefully of the same order of difficulty as the logic in existing compiler warnings and in code checking tools such as ReSharper.
#8. Converting Between Mandatory, Nullable and General References
The principles presented so far lead on to rules about conversions between the three types of references. You don't have to take in every detail of this section to get the general idea of what I'm saying - just skim over it if you want.
Let's define some references to use in the examples that follow.
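The definitions were lost in extraction; a sketch in the proposal's syntax:

```csharp
Dog! mandatoryDog = new Dog("Mandatory");
Dog? nullableDog = new Dog("Nullable");
Dog generalDog = new Dog("General");
```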
Firstly, any reference can be assigned to another reference if it is the same type of reference:
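The example was lost in extraction; a sketch:

```csharp
Dog! anotherMandatoryDog = mandatoryDog; // OK - mandatory to mandatory
Dog? anotherNullableDog = nullableDog;   // OK - nullable to nullable
Dog anotherGeneralDog = generalDog;      // OK - general to general
```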
Here are all the other possible conversions. Note that when I talk about 'intent' I am meaning the idea that a traditional (general) reference is conceptually either mandatory or nullable at any given point in the code. This intent is explicit and self-documenting in the new style references, but it still exists implicitly in general references (e.g. "I know this reference can't be null because I wrote a null check", or "I know that this reference can't or at least shouldn't be null from my knowledge of the business domain").
There has to be some compromise in the last three cases as our code has to interact with existing code that uses general references. These three cases are allowed if an explicit cast is used to make the compromise visible (and perhaps there should also be a compiler warning).
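The conversion list itself was lost in extraction. A plausible reading, consistent with the surrounding text (the cast syntax on the `!`/`?` types is the proposal's, not actual C#):

```csharp
nullableDog = mandatoryDog;      // OK - a mandatory reference is always a valid nullable one
mandatoryDog = nullableDog;      // compiler error - the reference might be null
mandatoryDog = (Dog!)generalDog; // explicit cast required - the intent cannot be verified
nullableDog = (Dog?)generalDog;  // explicit cast required
generalDog = (Dog)mandatoryDog;  // explicit cast required - null safety is lost downstream
```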
Some of the conversions that were not possible by direct assignment can be achieved slightly less directly using existing language features:
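The example was lost in extraction; a sketch of the nullable-to-mandatory conversion achieved via a null check rather than a direct assignment:

```csharp
if (nullableDog != null)
{
    Dog! rescuedDog = nullableDog; // OK - known to be non-null in this scope
}
```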
#9. Class Libraries
As mentioned previously, the compiled IL code will be the same whether you use the new style references or not. If you compile an assembly, the resulting binary will not know what type of references were used in its source code.
This is fine for executables, but in the case of a class library, where the goal is obviously re-use, the compiler will need a way of knowing the types of references used in the public method and public property signatures of the library.
I don't know much about the internal structure of DLLs, but maybe there could be some metadata embedded in the class library which provides this information.
Or even better, maybe reflection could be used - an enum property indicating the type of reference could be added to the ParameterInfo class. Note that the reflection would be used by the compiler to get the information it needs to do its checks - there would be no reflection imposed at runtime. At runtime everything would be exactly the same as if traditional (general) references were used.
Now say we have an assembly that has not yet been converted to use the new style references, but which needs to use a library that does use the new style references. There needs to be a way of turning off the mechanism described above so that the library appears as a traditional library with only general references. This could be achieved with an attribute like this:
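The attribute example was lost in extraction; a sketch with a hypothetical attribute and library name:

```csharp
[assembly: TreatLibraryReferencesAsGeneral("PetLibrary")] // hypothetical attribute name
```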
Perhaps this attribute could also be applied at a class level. The class could remain completely unchanged except for the addition of the attribute, but still be able to make use of a library which uses the new style references.
(See my additional post for more details.)
#10. Constructors
Eric Lippert's post (see reference in the introduction to this post) also raises thorny issues about constructors. Eric points out that "the type system absolutely guarantees that ...[class] fields always contain a valid string reference or null".
A simple (but compromised) way of addressing this may be for mandatory references to behave like nullable references within the scope of a constructor. It is the programmer's responsibility to ensure safety within the constructor, as has always been the case. This is a significant compromise but may be worth it if the thorny constructor issues would otherwise kill off the idea of the new style references altogether.
It could be argued that there is a similar compromise for readonly fields which can be set multiple times in a constructor.
A better option would be to prevent any access to the mandatory field (and to the 'this' reference, which can be used to access it) until the field is initialised:
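The example was lost in extraction; a sketch of the described constructor rule (`Kennel` is an assumed example class):

```csharp
public class Kennel
{
    private Dog! occupant; // mandatory field

    public Kennel(string name)
    {
        // neither 'occupant' nor 'this' may be used until the field is assigned
        occupant = new Dog(name);
        occupant.Bark(); // OK - the compiler knows the field is initialised now
    }
}
```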
Note that it is not an issue if this forces adjustment of existing code - the programmer has chosen to introduce the new style references and thus will inevitably be adjusting the code in various ways as described earlier in this post.
And what if the programmer initializes the property in some way that still makes everything safe but is a bit more obscure and thus more difficult for the compiler to recognise? Well, the general philosophy of this entire proposal is that the compiler recognises a finite list of sensible constructs, and if you step outside of these you will get a compiler error and you will have to make your code simpler and clearer.
#11. Generics
Using mandatory and nullable references in generics seems to be generally ok if we are prepared to have a class constraint on the generic class:
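The example was lost in extraction; a sketch of a generic class using the proposed reference forms under a class constraint:

```csharp
public class Kennel<T> where T : class
{
    private T! occupant;         // mandatory reference to a T
    private T? previousOccupant; // nullable reference to a T
}
```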
However there is more to think about with generics; see the comments below.
#12. Var
This is the way that I think var would work:
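The examples were lost in extraction. From the following paragraph ("'var' can be mandatory, nullable or general depending on the context"), they plausibly showed inference tracking the reference kind of the initialiser:

```csharp
var dog1 = mandatoryDog; // var is inferred as Dog! (mandatory)
var dog2 = nullableDog;  // var is inferred as Dog? (nullable)
var dog3 = generalDog;   // var is inferred as Dog (general)
```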
The first case in each group would be clearer if we had a suffix to indicate a general reference (say #), rather than having no suffix due to the need for backwards compatibility. This would make it clear that 'var#' would be a general reference whereas 'var' can be mandatory, nullable or general depending on the context.
#13. More Cases
In the process of thinking through this idea as thoroughly as possible, I have come up with some other cases that are mostly variations on what is presented above, and which would just have cluttered up this post if I had put them all in. I'll put these in a separate post in case anyone is keen enough to read them.