Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Method Contracts #119

Closed
stephentoub opened this issue Jan 28, 2015 · 259 comments
Closed

Proposal: Method Contracts #119

stephentoub opened this issue Jan 28, 2015 · 259 comments

Comments

@stephentoub
Copy link
Member

(Note: this proposal was briefly discussed in #98, the C# design notes for Jan 21, 2015. It has not been updated based on the discussion that's already occurred on that thread.)

Background

"Code Contracts" were introduced in .NET 4 as a feature of the Framework, primarily exposed through the System.Diagnostics.Contracts.Contract class. Through method calls like Contract.Requires and Contract.Ensures, developers could add code to their methods to specify preconditions and postconditions, and then a subsequent tool known as a rewriter could be used to post-process the code generated by the compiler to appropriately implement the expressed contracts, e.g. whereas the developer puts a call to Ensures at the beginning of the method, the rewriter ensures that path through the method that could cause it to exit ends with a validation of the expressed postcondition.

public int Insert(T item, int index)
{
    Contract.Requires(index >= 0 && index <= Count);
    Contract.Ensures(Contract.Result<int>() >= 0 && Contract.Result<int>() < Count);
    return InsertCore(item, index);
}

Problem

Very little code actually uses these API-based code contracts. They require additional tooling to be integrated into the build, they're verbose, and it's difficult to write additional robust tooling related to them (e.g. integrating contract information into documentation).

Solution: Language support for method contracts

Basic contract support for preconditions and postconditions can be built directly into the language, without requiring any separate tooling or custom build processes, and more thoroughly integrated throughout the language.

The compiler would generate all of the relevant code to correctly validate the contracts, and the contracts would show up as part of the method signature in the Visual Studio IDE, such as in IntelliSense at call sites to the method.

public int Insert(T item, int index)
    requires index >= 0 && index <= Count
    ensures return >= 0 && return < Count
{
    return InsertCore(item, index);
}

These contracts would primarily be validated at run time, however if the compiler (or a 3rd-party diagnostic) could statically detect a violated contract (e.g. a precondition requires that an argument be non-null, and null is being passed as the argument), it could choose to fail the compilation rather than allowing the bug to make its way to run time. Similarly, if the compiler (frontend or backend) is able to statically determine that a contract will always be satisfied, it can optimize based on that knowledge.

Unlike .NET Code Contracts, which have configurable but complicated behavior provided by a post-compilation rewriter tool (and which are no-ops if such a rewriter isn’t used), failed validation of ‘requires’ and ‘ensures’ clauses would ideally result in "fail fast" (ala Environment.FailFast(...)), meaning the program abandons immediately. This is very useful for validating preconditions and postconditions, which are typically used to catch usage errors by the developer. Such errors should never make it into production.

The compiler would require that preconditions and postconditions can be validated by the caller, and as such it requires that any members used in the requires and ensures clauses are all accessible to any caller, e.g. the requires clause of a public method may not access internal or private members, and the requires clause of an internal method may not access private members.

Preconditions and postconditions should also not throw exceptions.

Virtual methods and interface methods may have preconditions and postconditions, in which case the compiler guarantees that overrides and implementations of these methods satisfy the preconditions. To make this clear to a developer reading the code, the override or interface implementation would state "requires base" or "ensures base", to indicate that there are imposed constraints, while not forcing the developer writing the code to explicitly type them out.

interface IItemCollection
{
    int Count ensures return >= 0 { get; }}

class MyItemCollection : IItemCollection
{
    public int Count ensures base => m_count;}

Alternatives: Fail fast vs exceptions

In support of migrating existing code to use contracts, or for places where an exception is really desired over fail fast, we could optionally consider additional syntax to specify that an exception should be thrown instead of fail fasting.

For example, the previous Insert example's requires and ensures clauses would result in fail fasting if a condition isn't met:

public int Insert(T item, int index)
    requires index >= 0 && index <= Count
    ensures return >= 0 && return < Count
{
    return InsertCore(item, index);
}

But, the developer could explicitly override the fail fast by specifying what should happen in case of failure:

public int Insert(T item, int index)
    requires index >= 0 && index <= Count else throw new ArgumentOutOfRangeException(nameof(index))
    ensures return >= 0 && return < Count
{
    return InsertCore(item, index);
}
@aarondandy
Copy link

Any possibility of getting invariants as well?

Would love to see the static analysis tools we have in Code Contracts working with these new contracts.

I really like the presented syntax to decide between exceptions and fail-fast for preconditions.

@aarondandy
Copy link

Could some kind of flag be provided to omit private/internal contracts from release builds if we have static analysis tools to prove them?

@jaredpar
Copy link
Member

@aarondandy

In previous research we have shied way from invariants in part because it's actually incredibly difficult for developers to predict the effect it will have on the code. Typically invariants checks are validated to be true on every exit / entry point of a given method. Given that can you point out all of the places on the following method where that happens?

class C
{
  invarint Count >= 0

  int[] _array;
  int _count;

  void AddRange(IEnumerable<int> all) 
  {
    EnsureCapacity();
    foreach (var cur in all) {
      _array[i] = cur;
    }
  }
}

Here are just a few of the places where an invariant check must occur in the above code:

  • In the generated call to GetEnumerator.
  • Twice per loop iteration. Once for MoveNext and Current each.
  • Before and after the call to EnsureCapacity
  • At least twice inside the implementation of EnsureCapacity
  • At the beginning and end of the method.

Sure a few of these may be able to be optimized away. But even then on this really simple method the invariant check is occurring at a very high frequency.

In the past when we looked deeply at places where developers wanted invariants we found that requires and ensures were sufficient to the task.

@aarondandy
Copy link

@jaredpar

That is fair considering the complexity The real reason I want invariants is just to reduce the amount of redundant contracts I would need to write. As a compromise would it be terrible to have a way to reuse some pre/post conditions? Something like [ContractAbbreviator] but probably way different?

@mirhagk
Copy link

mirhagk commented Jan 28, 2015

@aarondandy Are you referring to reusing a pre-condition as a post-condition, or referring to reusing a precondition across several methods in a class?

For the first situation it'd be nice if requires ensures or some other syntactic sugar was supported. For the second, it could be handled with allowing the expression to be a method (which can be declared with the shorthand lambda method syntax), so you can have a CheckClassInvariants method and do requires ensures CheckClassInvariants() which isn't too bad.

@aarondandy
Copy link

@mirhagk

"reusing a precondition across several methods" is what I meant but I like your other thoughts too. I don't really want to have a true provable invariant for Count >=0 but I definitely want a way to prove and communicate that each method call on my type won't do something crazy to Count.

@ejball
Copy link

ejball commented Jan 28, 2015

I wonder if ensures could be made to work naturally with async method results.

@alexpolozov
Copy link

Good to see that this is being considered!

Preconditions and postconditions should also not throw exceptions.

How are you going to verify this? For instance, are we allowed to do property access in contracts: requires id != this.Owner.Id? What if Owner is null? Is it an uncompilable code, a NullReferenceException at runtime, or a contract failure?

Just like in API-based Code Contracts, there should be some way to refer to the old (pre-invocation) value of the field/property/parameter. old(foo) looks good, but that makes old a new contextual keyword, which you don't like to introduce often.

@mirhagk
Copy link

mirhagk commented Jan 28, 2015

@apskim

Preconditions and postconditions should also not throw exceptions.

This may or may not be enforced. It might just be a "you shouldn't do that" kind of thing rather than a compile time exception (perhaps static analyzers could look for it).

If it is a compile time thing, then regular property access would have to be disallowed in your case to instead do something like requires id <= this.Owner.?Id. This doesn't throw any exceptions and will fail the validation.

@drewnoakes
Copy link
Member

To what extent can the compiler check a type's contracts for internal consistency? For example, it'd be hard to reliably solve for arbitrary expressions:

class C
{
    double Value ensures Math.log10(return) > 0 { get; }

    void Foo() ensures Value < 1 { ... }
}

There is no value of x < 1 for which Math.log10(x) > 0.

Would these constraints primarily be evaluated at run-time, with best-effort use from tooling?


Also, if arbitrary code can be executed in requires and ensures statements, side effects must be considered. Code may behave differently when compiled to evaluate contracts vs. without.

@mirhagk
Copy link

mirhagk commented Jan 28, 2015

@drewnoakes I'm assuming that an extension will be created that checks all sorts of crazy situations that you'd never want to add to a compiler (for complexity reasons). The compiler can do very minimum checks, and leave most of the checks to extensions.

In #98 it was mentioned that there may want to be some shorthand syntax for common checks (null being the most common). Something like requires item seems like a simple shorthand for the case of null checks, and combined with the ability to separate with a , would probably lead to a wider usage (requires item, data, index would be very simple to write and there'd be little reason to not use it).

An alternative is to use requires item! which makes use of the proposed non-nullable reference syntax ! and separates it from a boolean.

@Clockwork-Muse
Copy link

what about situations where a compile-time check is nice, but a runtime check may impart more of a performance penalty? For example, there's this comment in System.Collections.Immutable's array:

public T this[int index]
{
    get
    {
        // We intentionally do not check this.array != null, and throw NullReferenceException
        // if this is called while uninitialized.
        // The reason for this is perf.
        // Length and the indexer must be absolutely trivially implemented for the JIT optimization
        // of removing array bounds checking to work.
        return this.array[index];
    }
}

...which states that, while you want compile-time checks for existence of the array, and index range, the penalties for runtime checks are too high. (In this case they're letting the hardware/vm throw the errors directly, a rarer occurrence than checking all accesses)

Maybe the validator could biggyback off of hardware/vm protections? So it could see something like requires this.array, index >= 0 && index < this.array.Length, note that it has built-in runtime checks (effectively) for those values anyways, and noop them. Which would probably require sitting in the compiler, than as an external tool.

@jaredpar
Copy link
Member

@Clockwork-Muse

We've did pretty extensive profiling of contracts in our project and found that in general the performance hit they caused was minimal. It was usually in the 1-2% range across the system.

Sure there was the occasional check that was too expensive to make a contract. In those situations we first looked to the API itself. If verifying the invariants of the type was too expensive to put in a contract then perhaps the API itself needs to be changed.

If the API couldn't be made more efficient then we would move the check into Debug.Assert. That was pretty rare though (only a handful of occurrences).

Note: contracts do not always result in a negative performance hit. It is possible that they establish invariants (like bounds) that the JIT can optimize against and result in faster code. There were definitely places where that happened.

@AdamSpeight2008
Copy link
Contributor

Would it be possible to have a method contract in this position?

T Item(int index)
{
  requires (0 <= index) && (index < Size)
  get{ return _a[index]; }
  set(T item) { _a[index]=item; }
}

Using a trait ( #129)

T Item(int index)
{
  trait { _a!; range:  0 <= index < _a,Length; }
  get { _a[index]; }
  set(T item) { _a[index] = item; }
}

@brandf
Copy link

brandf commented Jan 31, 2015

The part about Virtual methods and interface methods made me starting thinking about how this might be seen as an extension to the type system in a way...or rather another kind of type, similar to units of measure in F#. I'm bored, so I'm going to riff on the idea...

If we start with the first example in the solution above:

public int Insert(T item, int index)
    requires index >= 0 && index <= Count
    ensures return >= 0 && return < Count
{
    return InsertCore(item, index);
}

The 'requires' are additional constraints on the methods parameters, similar to how you have a types associated with each parameter. For fun I'm using 'where' below instead of adding new keywords. I think it's worth considering making it look LINQ'ish:

public int Insert(T item, int index where index >= 0 && index <= Count)
    where return >= 0 && return < Count
{
    return InsertCore(item, index);
}

Which then makes me think of the notion of generic type constraints. Similar to how a delegate gives a name to the shape of a function, a typeconstraint is used to give a name to a constraint.

typeconstraint Bounded<Type, Lower, Upper> = Type N where N >=0 Lower && N <= Upper

Perhaps you could then use typeconstraints anywhere you would normally use a type. The generic parameters could be bound to other types, constants, members or other arguments. Now the above can be rewritten:

public Bounded<int, 0, Count> Insert(T item, Bounded<int, 0, Count> index) 
{
    return InsertCore(item, index);
}

...which is looking pretty clean IMO.

And maybe I'm going off the deep end here, but lets consider going a step further by thinking of the traditional type syntax as being syntactic sugar for a specific where clause...

typeconstraint Bounded<T, Lower, Upper> n where typeof(n) is typeof(T) && n >= Lower && n <= Upper

In other words, 'traditional type syntax'...

int x

...is shorthand for...

var x where typeof(x) == typeof(int)

...or for parameters and returns...

x where typeof(x) == typeof(int)

Since non-nullable types are being considered at C#7 design meetings, lets see how that fits in with this:

typeconstraint NotNull n where n != null

...and union types (a bit like TypeScript has)

typeconstraint Union<T1, T2> n where typeof(n) == typeof(T1) || typeof(n) == typeof(T2)

And of course, similar to delegates, an assembly should be able to expose public typeconstraints like any other kind of type.

@chrisaut
Copy link

chrisaut commented Feb 4, 2015

I mentioned it in the original discussion, but what about a short hand syntax for the most common scenarios:

public void M(string! x) {}

expands to:

public void M(string x) requies x != null {}
public void M(int >= 0 x) {}

expands to:

public void M(int x) requires x >= 0 {}
public int >= 0 Count() {}

expands to:

public int Count() ensures return >= 0 {}
public void M(int < y x, int > x y) {}

expands to:

public void M(int x, int y) requires x < y, requires y > x {}

(this specific case looks messy, thus probably ok to only support short hand for simple cases, like operators and constant literals (no method calls, no refering to other arguments or fields/properties etc, for that you would use the "full" syntax).

public C! M(B! b) {}

expands to:

public C M(B b) requires m != null, ensures return != null

I think of contracts as a harder "type", thus I feel it might make sense to have the requirements right there with the parameter (it also avoids double typing).
For instance you don't say
public M(x) requires x is int, the int requirement is right there with the parameter, so why would further requirements on x not be right along there with it?

The downside is it might look a bit messy, especially if there are lots of parameters (but that's code smell anyways), and I guess it's (much?) harder to parse for the compiler.

@AdamSpeight2008
Copy link
Contributor

@chrisaut public void M(int < y x, int > x y) {} can never be satisfied

I think the ( #129 ) would be a better approach.

public void M(int x, int y)
  trait { x < y; y > x }
  {}

As it doesn't mess with the parameters syntax, yet still available for the IDE to use the user and collapse the trait definition.

@chrisaut
Copy link

chrisaut commented Feb 4, 2015

@chrisaut public void M(int < y x, int > x y) {} can never be satisfied

why not?
it expands to:

public void M(int x, int y) requires x < y, requires y > x {}

x needs to be smaller than y, and y needs to be bigger than x (which comes from the first constraint, you wouldn't even need the second constraint).
M(0, 1) satisfies it.
But it just goes to show that for anything but trivial "< constant" stuff the syntax is confusing, hence it's probably not a good idea to allow it for anything that referes to other named parameters/fields/etc.

@AdamSpeight2008
Copy link
Contributor

May have misread it as x < y and y < x

@sirinath
Copy link

sirinath commented Feb 5, 2015

I don't see why this should be method only. Why not be able to use with variables also. In addition some distinction on runtime, compile time if unable at compile time at runtime and enforceability.

@AdamSpeight2008
Copy link
Contributor

Codeplex Topic

@vkhorikov
Copy link

@jaredpar consider binding this feature to non-nullable reference types.
Although they can't be implemented as they are implemented in, for example, F# but some kind of 'global precondition' can mitigate the problem: it can state that no reference type within a class can have null value.
This will help to reduce amount of '!= null' preconditions in code that tries to be more functional.

@johncrim
Copy link

johncrim commented Feb 9, 2015

Could you explain why run-time contract violation shouldn't throw an exception? I've shied away from using CodeContracts static analysis (build are too slow; too much work to fix all warnings), but find them great for run-time validation and documentation. It's also great to be able to dial up and down the amount of run-time checks for debug and release builds (full checks vs ReleaseRequires). In the case of release builds, you DEFINITELY don't want fail-fast behavior; and arguably you'd want consistent behavior but more checks in debug builds.

My biggest problems with CodeContracts have been:

  1. The post-processing step slows down the build a HUGE amount (CodeContracts seems to account for 80% of the build time). This is the main reason I've considered dropping CodeContracts on several occasions.
  2. The syntax could be more readable/expressive/concise
  3. The implementation is completely broken on Mono - this is the primary reason why developers with Macs are having to run Windows VMs.

I would expect that this proposal would address 1 and 2 at a minimum, so I'm a huge fan.

Another important trait to preserve is the ability to intercept contract failures and customize the response: log them, pop up a dialog, etc.

@jamesqo
Copy link
Contributor

jamesqo commented Oct 8, 2016

@kruncher We may not want extra checks we know are true in performance-sensitive code. For example, it may prevent the JIT compiler from inlining a particular method. For example if you go to corefx where the BCL lives, you'll see that basically all of the checks of the internal/private methods are Debug.Asserts.

@kruncher
Copy link

kruncher commented Oct 8, 2016

Hmmm, would it be useful to have a build option to simply disable all contracts for internal/private methods (i.e. an option that can be configured on/off for 'Debug', 'Release' or 'Custom')?

@jamesqo
Copy link
Contributor

jamesqo commented Oct 8, 2016

@kruncher Hmm... might be a good idea, actually. A problem though is that you might want to enable the contracts for only certain internal/private methods.

@Lexcess
Copy link

Lexcess commented Oct 10, 2016

In the Code Contracts implementation there is a explicit option for only checking 'public surface contracts' which exhibits this behaviour. The clear win here is that if a contract is asserted on private methods the Code Contracts implementation will require another assertion or assumption it on any callee meaning that if all callees have already validated the input you don't take the perf hit on checking again.

I'd say this behaviour is the same (or a slight subset of) the statement: "If a compiler could prove the contract was satisfied at compile-time – something ... it was free to elide the check altogether" from earlier in this thread.

@jamesqo
Copy link
Contributor

jamesqo commented Oct 10, 2016

@Lexcess

meaning that if all callees have already validated the input you don't take the perf hit on checking again.

Would not always work. For example, consider this code sample:

T Single<T>(IList<T> list)
{
    if (list == null || list.Count != 1) throw something;

    return SingleInternal(list);
}

T SingleInternal<T>(IList<T> list)
{
    Debug.Assert(list != null && list.Count == 1);

    return list[0];
}

A bad implementation of IList<T> could return different values of Count for the 2 invocations, even though it was not actually modified. So the compiler could not prove that the list.Count == 1 constraint would be satisfied, and would have to check the Count twice.

@Lexcess
Copy link

Lexcess commented Oct 10, 2016

I see what you are saying but I think it is a separate point. I think the point that you could have untrustworthy implementations (note that the MS Code Contracts has contracts for interfaces like IList so you could solve this) is different from needlessly reapplying the same checks all the way down a call stack.

It'd be interesting to see what the Code Contracts does around non-predictable return types such as your example and arbitrary yield returns. I've often found it's analyser was quite good at unravelling such problems.

@aarondandy
Copy link

For that example Code Contracts seems to take the "if it hurts don't do that strategy" and assumes you are not trying to intentionally harm the stability of your system. It does this by requiring methods used in a contract to be [Pure] but it does not enforce that the method or property is actually "Pure".

@gafter
Copy link
Member

gafter commented Mar 28, 2017

This is now tracked at dotnet/csharplang#105

@ghost
Copy link

ghost commented Dec 18, 2017

as I explained in dotnet/csharplang#1198, this syntax will be shorter:

public <0,Count>int Insert(T item, <0,Count>int index)   {
    return InsertCore(item, index);
}

@HaloFour
Copy link

@MohammadHamdyGhanem

It's also not particular generic. What if you wanted to convey a constraint that doesn't involve ranges?

@yaakov-h
Copy link
Member

Or even within ranges, disjointed ranges?

e.g. value must be between 0 and 100, or between 200 and 300.

@ghost
Copy link

ghost commented Dec 18, 2017

What you ask about is a complex condition already, and regular If statements can do it better.
If you want to complicat things, I suggest to use regular expression syntax such as:
[reg exp]int x;
This can allow writting constrains on string data types as well.
If we think at the Data Type level, any thing is poossible!
Thanks.

@yaakov-h
Copy link
Member

@MohammadHamdyGhanem If statements only give you runtime checks, not compile-time safety.

@ghost
Copy link

ghost commented Dec 18, 2017

@yaakov-h
So what? This ia already the case with any assignment that evaluates in run-time.
int i = x/somefunction();
is not runtime safe.
We csn handle exceptions easily , but business logic errors will pass undetected unless writing extra validation code, which we do more often.
In data binding, if a value raises an exception, the bound control will reject it automatically preventing the user from entering bad values without writing extra code.

@ghost
Copy link

ghost commented Dec 18, 2017

Also: Method Contracts are not run time safe, and will raise exceptions. This is why we use try catch.

@ghost
Copy link

ghost commented Dec 18, 2017

So, there are 4 possible ways to do it:

  1. Method contracts. It is verbose but can write generic conditions.
  2. Constrained Data Types: It is a compact syntax and can be extended to use regular expressions.
  3. Validation attributes: Like those used in ASP.NET MVC. It has no effect on the c# syntax if self, and we can use parameters that indicate the error message or other actions. But using attribute for individual method parameters and return value can make the code long and ugly!
  4. Writing an analyzer, which I cannot exactly understand !

Sure we all want some syntax to simplify validation codes.
We can make compromises. For example:
If the constrained types can cause problems, they can only be used to define properties and methods (parameters and return values). They are nit true types but just syntax annotations to tell compiler how validate values, so the compiler can convert then to contracts under the hood.
Attributes also can be used with properties and methods to supply error message and additional options to tell the compiler what to do when the data are not valid.
This approach says that contracts can still be used in more complicated situations (but I see that it is easer to use regular code statements)

@markusschaber
Copy link

In previous research we have shied way from invariants in part because it's actually incredibly difficult for developers to predict the effect it will have on the code. Typically invariants checks are validated to be true on every exit / entry point of a given method. Given that can you point out all of the places on the following method where that happens?

Maybe invariants could be qualified like private and protected, applying them to all methods at the same or higher visibility than the invariant.

Thus, a private methold could violate a public invariant, as it's always called in a context with higher visibility which has to ensure the invariants on it's boundaries anyways.

This allows to refactor common code into private methods while still upholding the invariants to the outside.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests