Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stronger inference of a common type of a set of expressions #1419

Closed
dsaf opened this issue Mar 20, 2015 · 31 comments
Closed

Stronger inference of a common type of a set of expressions #1419

dsaf opened this issue Mar 20, 2015 · 31 comments

Comments

@dsaf
Copy link

dsaf commented Mar 20, 2015

This code will currently not work:

public class Sample
{
    public static void Main()
    {
        //https://msdn.microsoft.com/en-us/library/bb384226.aspx
        //Compiler Error CS0826: No best type found for implicitly typed array.
        var animalsArray = new[] { new Dog(), new Cat() };
    }
}

public abstract class Animal {}

public class Dog : Animal {}

public class Cat : Animal {}

It could go one step further and assume the first common type in the inheritance hierarchy.
Several situations possible:

  1. Most straightforward - single common parent type - see above.
  2. Primitive types - needs explicit manual resolution.
  3. Nullable and non-nullable - should nullable be assumed?
  4. Only several common interfaces are shared - it's better to be explicit in this case rather than picking the first one.
  5. One common interface and a common parent type are shared - always assume the parent type, unless it's object?
  6. Several common interfaces and a common parent type are shared - always assume the parent type.
  7. Generics - probably more tricky, and a separate issue - see [C# feature] Infer generic type arguments in collection initializers #1470 to start.

Related SO question: http://stackoverflow.com/questions/2799902/implicit-typing-of-arrays-that-implement-interfaces.

@dsaf dsaf changed the title [C# feature] Slightly stronger collection initializer type inference [C# feature] Stronger collection initializer type inference Mar 20, 2015
@gafter
Copy link
Member

gafter commented Mar 20, 2015

As I understand it, you are requesting a modification to the best common type of a set of expressions specification (C# language spec section 7.5.2.14).

@svick
Copy link
Contributor

svick commented Mar 20, 2015

It looks to me like you have two proposals:

  1. change how the best common type is inferred for array initializers (and elsewhere? §7.5.2.14 also mentions return types of anonymous functions)
  2. support inferring type parameters of generic types in collection initializers

The two seem to be completely independent to me, so I think they should have separate issues.

@dsaf
Copy link
Author

dsaf commented Mar 21, 2015

@gafter : yes, but maybe more like @svick 's post above? Do I need to create a separate issue for (2)?

@gafter
Copy link
Member

gafter commented Mar 22, 2015

@dsaf Yes, a separate issue would be a good idea, please.

@jveselka
Copy link

I'd rather not see this feature implemented.
This problem often occurs in Scala when you make a mistake in decalration, usualy the top class is inferred. The declaration itself is then valid, but errors start popping from all around the code.
Current C# inference is the perfect mix of power and clarity, don't change it please.

@dsaf
Copy link
Author

dsaf commented Jul 24, 2015

@jveselka

...usualy the top class is inferred...

Well, yes this is the most straightforward assumption to make. If you create a list of an apple a banana and an orange than most likely you want a list of fruit. This is what I would personally expect and Scala seems to be confirming it for me.

...errors start popping from all around the code...

  1. This is not too different from the current situation with a single item inference, e.g. assigning a LINQ expression to a var - it can be a List<> or an IEnumerable<> or an IQueryable<> depending on the last method in the chain. The IDE guides you to fix the problem in seconds. It's already there, it's what people are doing, it works.
  2. This inference would be scoped to a method - that is a very small amount of code.
  3. It's enough to mouse over the var to see the inferred type and adjust the code if needed.

I cannot see how what you have described could be a problem. Can you please give a sample of the situation and code?

@jveselka
Copy link

@dsaf
Imagine you have this code:

//correctly inferred as Vegetable[]
var vegetable = new[] { new Carrot(), new Cucumber() }
//uses Person.Eat(Vegetable v) overload
foreach (var v in vegetable) person.Eat(v);

Then you try to add Tomato forgeting that it is technically a Fruit

//infers Object[] assuming that Vegetable and Fruit have no other common base class
var vegetable = new[] { new Carrot(), new Cucumber(), new Tomato() }
//error: no overload Person.Eat(Object o) found
foreach (var v in vegetable) person.Eat(v); 

So error that should have appeared at declaration appears further in the code as different error.

I can even imagine cases where similar mistake results in runtime error and I fear some developers will rather fall back to using explicitly typed locals rather than dealing with these problems.

@dsaf
Copy link
Author

dsaf commented Jul 26, 2015

@jveselka I see your point and it's a valid concern. There could be ways a modern IDE would circumvent this (e.g. showing a warning 'Suspicious best type inferred - object' together with the error you have given as an example). This issue needs more opinions.

@dsaf
Copy link
Author

dsaf commented Sep 10, 2015

Related to #438.

@HaloFour
Copy link

Not to mention how to determine what that best type is. Do you go up three levels of hierarchy to find a common base class, or two to find a common implemented interface? I think that there are too many questions and concerns to start having the compiler guess. I don't think that type inference should do much more than be a simple convenience so that the developer can type less in those cases where there is no possible ambiguity, but as soon as the intent isn't absolutely crystal-clear the developer should be required to explicitly state their intent through declaration. Not to mention, I'm going to have to be able to read that same code again at some point in the future and I don't want to have to make the same best guess that the compiler will have to.

@paul1956
Copy link
Contributor

If you are worried about typing and reading code in the future why not offer a automatic "make explicit mode" and compile with "Strict On". It solves both issues you raise. Whenever you type

Dim i = 1

The IDE replaces it with

Dim i as integer = 1

@gafter
Copy link
Member

gafter commented Sep 10, 2015

@paul1956 C# doesn't have a Dim keyword.

@paul1956
Copy link
Contributor

@gafter My example was VB but the concept applies.to both languages.

@gafter
Copy link
Member

gafter commented Sep 10, 2015

@paul1956 The languages are different enough that your suggestion could be interpreted to mean one of a number of things in the context of C#.

@paul1956
Copy link
Contributor

I don't know C# well enough to construct examples. The concept is once I finish typing the statement the IDE goes back and adds the missing "Type" information. In VB's case the "As integer", this gets more interesting when the Type is IEnumerable(Of INamedTypeSymbol).. Today at least for VB there are Code Refactoring's to do it but they require more keystrokes.

@gafter
Copy link
Member

gafter commented Sep 10, 2015

Why doesn't C# already do this

There was an explicit decision to have the C# compiler, in type inference, only produce types that were input to the inference process. The reason is that expanding the type inference to other types can result in very complex and nonintuitive results. See, for example, a stackoverflow question about infinite types in Java that result from type inference. It is all well and good to say that we should generalize the common-supertype computation, but actually doing so in practice - in a way that is intuitive to users - is very hard. We would need a specific concrete proposal (not just a list of options) along with a prototype to play with in order to move this kind of proposal forward. If the proposal includes special cases like "if there happens to be only a single common interface..." then it is unlikely to be seen as a good solution, as it will be fragile in the face of adding an interface to an existing class - a kind of fragility we like to avoid. I suspect any solution would require adding the concept of intersection types ala Java.

Any solution for C# would likely be applied equally to VB, so we're treating this issue as covering both languages.

@gafter gafter changed the title [C# feature] Stronger inference of a common type of a set of expressions Stronger inference of a common type of a set of expressions Sep 10, 2015
@gafter
Copy link
Member

gafter commented Sep 10, 2015

@paul1956 A refactoring might help to convert correct code into more explicit correct code. But in this code the original code won't typecheck, so the compiler can't select an appropriate type for the variable.

Generally, C# programmers prefer not to have refactorings and code normalization (other than whitespace) applied automatically.

@paul1956
Copy link
Contributor

I was responding to the comment by @HaloFour where he said he didn't want to guess what the compiler did. Maybe I should open a new issue and move my comments to a VB thread. I am sensitive to this issue because I just spend the last 3 days refactoring someone else's code to add type information and the bug I was chasing was only found when a String was changed to its correct Type Integer. I found existing refactoring extensions to be very slow and in many cases inaccurate.

@alrz
Copy link
Member

alrz commented Dec 31, 2015

@gafter Does this apply numeric promotions for integrals or implicit type conversions for other types?

@gafter
Copy link
Member

gafter commented Dec 31, 2015

@alrz It doesn't have a design or implementation, so it doesn't do anything yet.

@alrz
Copy link
Member

alrz commented Dec 31, 2015

@gafter Couldn't you read that more literally? 😄

@gafter
Copy link
Member

gafter commented Jan 1, 2016

@alrz if/when we design this, we probably will go through all the implicit conversions and ask those questions.

@yufeih
Copy link
Contributor

yufeih commented Jan 7, 2016

What if the type is declared like:

Animal[] animalsArray = new[] { new Dog(), new Cat() };
Animal animal = dog ? new Dog() : new Cat();
Animal GetAnimal(bool dog) => dog ? new Dog() : new Cat();

This still won't compile today, but the compile have enough knowledge about the inferred type.

@gafter
Copy link
Member

gafter commented Jan 11, 2016

We (the language design team) looked at this briefly last week. To summarize our understanding of the proposal, we begin with the two existing sections

7.5.2.14 Finding the best common type of a set of expressions

In some cases, a common type needs to be inferred for a set of expressions. In particular, the element types of implicitly typed arrays and the return types of anonymous functions with block bodies are found in this way.

Intuitively, given a set of expressions E1…Em this inference should be equivalent to calling a method

Tr M<X>(X x1 … X xm)

with the Ei as arguments.

More precisely, the inference starts out with an unfixed type variable X. Output type inferences are then made from each Ei to X. Finally, X is fixed and, if successful, the resulting type S is the resulting best common type for the expressions. If no such S exists, the expressions have no best common type.

and

7.5.2.11 Fixing

An unfixed type variable Xi with a set of bounds is fixed as follows:

  • The set of candidate types Uj starts out as the set of all types in the set of bounds for Xi.
  • We then examine each bound for Xi in turn: For each exact bound U of Xi all types Uj which are not identical to U are removed from the candidate set. For each lower bound U of Xi all types Uj to which there is not an implicit conversion from U are removed from the candidate set. For each upper bound U of Xi all types Uj from which there is not an implicit conversion to U are removed from the candidate set.
  • If among the remaining candidate types Uj there is a unique type V to which there is an implicit conversion from all the other candidate types, then Xi is fixed to V.
  • Otherwise, type inference fails.

The proposal is to modify the penultimate bullet of 7.5.2.11 to be more generous in when it succeeds.

The LDM notes that this is a technically incompatible change to overload resolution. As such it is unlikely that we could accept the change as proposed. Users tend to tar and feather us if we change the meaning of existing programs, and this change would do just that. A demonstration of that fact is left to the reader.

Having said that, all is not lost. Another approach would be to leave type inference alone, and modify 7.5.2.14 to define best common type in terms of the desired new rule, instead of deferring to 7.5.2.11. Alternatively, we could make the desired change to the language, and provide a tool for users to migrate (or check) their code from one language version to the next, warning if there is a change in the semantic meaning of their code between language versions. Neither option is as attractive as the original proposal, presuming it were not a breaking change.

@dsaf
Copy link
Author

dsaf commented Jan 11, 2016

Alternatively, we could make the desired change to the language, and provide a tool for users to migrate (or check) their code from one language version to the next, warning if there is a change in the semantic meaning of their code between language versions.

Makes sense only when/if a critical mass of similarly invasive changes gets accumulated.

Another approach would be to leave type inference alone, and modify 7.5.2.14 to define best common type in terms of the desired new rule, instead of deferring to 7.5.2.11.

I am personally leaning towards @HaloFour 's opinion:

I don't think that type inference should do much more than be a simple convenience so that the developer can type less in those cases where there is no possible ambiguity, but as soon as the intent isn't absolutely crystal-clear the developer should be required to explicitly state their intent through declaration.

@realbart
Copy link

Please don't implement this

var array = new string[]{};
var list = new List<string>();

var collections = new[] { array, list };

What is the type of collections here?
In older .net-versions, the only common interface was IList.
However, in C#4.5 typed arrays also implement IReadOnlyCollection<>.

If you'd allow this kind of type inference, adding an interface in newer framework versions breaks existing code.

@dsaf
Copy link
Author

dsaf commented Oct 11, 2016

@realbart as mentioned above it could be an intersection type of common interfaces #4586.

@Opiumtm
Copy link

Opiumtm commented Oct 12, 2016

Only several common interfaces are shared - it's better to be explicit in this case rather than picking the first one.
One common interface and a common parent type are shared - always assume the parent type, unless it's object?

I think interfaces should never be used in this "common-type" inference.
Only single common parent class should. So, if two classes share one common interface and no common parents, array type of object should be inferred - as object is a single common class.

Reason: interfaces are somewhat a form of multiple inheritance. Multiple inheritance resolving in typing should always be explicit.

Primitive types - needs explicit manual resolution.

Use type priority order for some primitive types.
int -> long -> double

If any other primitive type encountered - use object common class instead.
So, any unsigned integers, decimal values and single-precision float appearing in array with mixed primitive types should result in object common class and compiler should throw warning (not error).

Nullable and non-nullable - should nullable be assumed?

Any null in implicitly-typed array with only primitive types should result in Nullable
Type argument of nullable should be inferred using same int -> long -> double logic

So

var a = new [] { 1, 1.0 }; // having type double[]
var b = new [] { 1, 1.0, null }; // having type double?[]
var c = new [] { 1, 1.0f }; // having type object[] because of float and compiler throw warning
var d = new [] { 1, 1.0f, null }; // having type object[] because of float and compiler throw warning

Structures of same type in array with any null should be inferred as T?[] - where T is a struct type. Any other value type alongside struct (primitive type, another struct) should result in object[] with compiler warning.

Generics - probably more tricky, and a separate issue

Never take interfaces into consideration when inferring types. See above.
Interfaces are form of multiple inheritance and multiple inheritance should always be treated explicitly.

@Opiumtm
Copy link

Opiumtm commented Oct 12, 2016

@dsaf @realbart
Interfaces are form of multiple inheritance so type inferring based on multiple inheritance is unreliable and definitely would break code. So any form of multiple inheritance (interfaces, for example) should always require explicit typing.

But use common parent class for type inference isn't such unreliable nor dangerous.

For your example

var array = new string[]{};
var list = new List<string>();

var collections = new[] { array, list };

Type of collections variable should be object[] as object is a common base class for array and list.

@jcouv
Copy link
Member

jcouv commented Apr 17, 2017

One part of this proposal (finding best common type between 1 and null to be int?) has been discussed in LDM and is considered for a point release:
https://github.com/dotnet/csharplang/blob/master/proposals/improved-common-type.md

Btw, I'll move this discussion over to csharplang repo. Feel free to continue discussion there.

@jcouv
Copy link
Member

jcouv commented Apr 17, 2017

Issue moved to dotnet/csharplang #450 via ZenHub

@jcouv jcouv closed this as completed Apr 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests