-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concepts and type-checking generics #168
Comments
I would like to understand how to write the concepts in Emmy under this proposal. Currently they look like type
AdditiveMonoid* = concept x, y, type T
x + y is T
zero(T) is T
AdditiveGroup* = concept x, y, type T
T is AdditiveMonoid
-x is T
x - y is T
MultiplicativeMonoid* = concept x, y, type T
x * y is T
id(T) is T
MultiplicativeGroup* = concept x, y, type T
T is MultiplicativeMonoid
x / y is T
Ring* = concept type T
T is AdditiveGroup
T is MultiplicativeMonoid
EuclideanRing* = concept x, y, type T
T is Ring
x div y is T
x mod y is T
Field* = concept type T
T is Ring
T is MultiplicativeGroup I guess they would become like this? type
AdditiveMonoid* = concept
proc `+`(x, y: self): self
proc zero(T: typedesc[self]): self
AdditiveGroup* = concept
self is AdditiveMonoid # will concept refinement be allowed?
proc `-`(x: self): self
proc `-`(x, y: self): self
MultiplicativeMonoid* = concept
proc `*`(x, y: self): self
proc id(T: typedesc[self]): self
MultiplicativeGroup* = concept
self is MultiplicativeMonoid
proc `/`(x, y: self): self
Ring* = AdditiveGroup and MultiplicativeMonoid
EuclideanRing* = concept
self is Ring
proc `div`(x, y: self): self
proc `mod`(x, y: self): self
Field* = Ring and MultiplicativeGroup |
Yes.
I thought about it and we could support it via |
I like the proposal. Can it deal with recursivity? That's important for my needs (mratsim/Arraymancer#299) In Arraymancer I define a deep learning trainable layer like this but I don't use them at the moment as it's too brittle: type
Variable[T] = object
value: Tensor[T]
gradient: Tensor[T]
Tensor[T] = object
data: seq[T]
TrainableLayer*[TT] = concept layer
block:
var trainable: false
for field in fields(layer):
trainable = trainable or (field is Variable[TT])
trainable
Conv2DLayer*[TT] = object
weight*: Variable[TT]
bias*: Variable[TT]
LinearLayer*[TT] = object
weight*: Variable[TT]
bias*: Variable[TT]
GRULayer*[TT] = object
W3s0*, W3sN*: Variable[TT]
U3s*: Variable[TT]
bW3s*, bU3s*: Variable[TT]
EmbeddingLayer*[TT] = object
weight*: Variable[TT]
#########
# sanity check
var convolution: Conv2DLayer[float32]
echo convolution is TrainableLayer[float32] # Why false?
#########
# user-defined
type
MyNonLayer = object
foo: int
bar: float32
MyLayer[TT] = object
metadata: MyNonLayer
weight: Variable[TT]
var x: MyNonLayer
var y: MyLayer[float32]
echo x is TrainableLayer[float32]
echo y is TrainableLayer[float32] # Why false |
What does that even mean though? "Somewhere in the object type there must be a field of type Variable[T]"? How do you know which one? What if you have more than one? |
A) How come I must repeat in every discussion about concepts that any proposal must cover the requirements of associated types and constants? I guess it would have to be: type Foo = concept
proc ElemType(_: typedesc[self]): typedesc
proc compressedSize(_: typedesc[self]): static[int] I find it a bit funny how these result in syntax that is outlawed in regular Nim. And hey, why was B) A "container concept" must be able to infer static parameter values as well. For example type FixedSizeMatrix[M, N: static[int]; T] = concept
# How do I infer M and N here from a non-generic type?
# In the current design it's done with:
x.Rows == M
x.Cols == N Please not that template Rows(T: type Matrix4x4): int = 4 Having the C) The VTable types are a very important feature in the current concepts design, but I would agree that they can be delivered as a library. D) I'd really like to have the converter type classes though and these require custom code in semcall/sigmatch. E) Regarding the main idea here, mainly the type checking of generic functions, I would approach this with a lot of caution. The internal compiler structures can be defined and derived from the current concept syntax. I think it makes a lot of sense to try to experiment with the approach before making any major decisions about removing the existing concepts-related code in the compiler. In particular, a lot of attention has to be put into code like the following: func isFixedSize*(T0: type): bool {.compileTime.} =
mixin toSszType, enumAllSerializedFields
when T0 is openarray|Option|ref|ptr:
return false
else:
type T = type toSszType(default T0)
when T is BasicType:
return true
elif T is array:
return isFixedSize(ElemType(T))
elif T is object|tuple:
enumAllSerializedFields(T):
when not isFixedSize(FieldType):
return false
return true Notice how the function accepts an underspecified type, but it discovers a more detailed type using the |
I don't care which field, but as long as there is a "Variable[T]" field, the user-defined type constitutes a TrainableLayer indeed. All deep-learning libraries work like that, except that they use dynamic dispatch and inheritance from a Layer class. |
Because it's covered by the proposal as you noticed yourself.
Static T could be done like so: type FixedSizeMatrix[M, N: static[int]; T] = concept
proc rows(x: self): M
proc cols(x: self): N
Well the proposal implies that we have dedicated, custom code for concept matching, moreso than currently.
Of course.
That code is not affected, under-constrained generics continue to exist and your |
proc rows(x: self): M
proc cols(x: self): N Hrm, you are using a static value as a return type. Seems like an irregularity to me. My underspecified code shouldn't be ruled out as "non using concepts". I could have easily constrained the input type with a concept such as |
Well the eternal question is "is |
Well no, under this RFC it couldn't "easily require" such a thing. If you cannot be precise about the generic type constraints, don't fake it with an imprecise concept, use an unconstrained generic instead. If you don't like this, fine, but that it is what this RFC says. |
I've never understood why you insist that values bound to static parameters are "types". Can they be passed to functions accepting regular values - yes. When you initialize a constant from a proc foo(N: static int) =
const N2 = N
proc bar(a: N) = discard
proc baz(a: N2) = discard For me, the "eternal" question has a very definitive answer. The static parameters are values having |
Here is some more fun. If we continue with the "static parameters are types" thinking, we reach the following: type 3DTransform = concept
self is FixedSizeMatrix
proc cols(x: self): 4
proc rows(x: self): 4 |
I'm not sure you're asking the right question. You are thinking about whether these alternative concepts can support what the current concepts can. But the real question is why does it matter to infer the value |
Why not use this: type
TrainableLayer[T] = concept
proc variableT(x: self): Variable[T]
# implementation
type
MyLayer = object
f: Variable[int]
template variableT(x: MyLayer): Variable[int] = x.f |
Because there are many variables, not just one. One actually wants to express a (fixed shape) tree whose leaves are variables |
Ok, how about: type
TrainableLayer = concept
iterator variables(x: self): Variable[each T]
# implementation
type
MyLayer = object
f: Variable[int]
iterator variables(x: MyLayer): Variable[int] = yield x.f |
I guess that could work, but I leave the final word to @mratsim , who has actually tried to put the idea into practice |
This example was not inferring 4, it was requiring it. A 3D transform is an affine transformation matrix with dimensions 4 by 4. There are various algorithms that work only on such matrices, so I would say it's a perfect example for a requirement that a generic 3G graphics library might have. |
It needs to support a variadic number of layers like type
Conv2DLayer*[TT] = object
weight*: Variable[TT]
bias*: Variable[TT]
LinearLayer*[TT] = object
weight*: Variable[TT]
bias*: Variable[TT]
GRULayer*[TT] = object
W3s0*, W3sN*: Variable[TT]
U3s*: Variable[TT]
bW3s*, bU3s*: Variable[TT]
EmbeddingLayer*[TT] = object
weight*: Variable[TT] User-definedResidual NN layers: https://www.quora.com/How-does-deep-residual-learning-work type
ResidualBlock*[TT] = object
conv1*: Conv2DLayer[TT]
conv2*: Conv2DLayer[TT] The concept should be able to match with TrainableLayer[TT] provided with this type definition. At most we can have this: type
ResidualBlock*[TT] = object
conv1*{.trainable.}: Conv2DLayer[TT]
conv2*{.trainable.}: Conv2DLayer[TT] but I'm not sure how to make a concept look for a pragma. |
It doesn't sound like we're gonna be able to type-check "somewhere in there Variable[TT] fields will be processed somehow" anytime soon. That doesn't mean the idea of type-checking generics with concepts is pointless... ;-) |
Nice to see that this is basically what I proposed in #167 + Personally I would still do away with the type
Comparable = concept # no T, an atom
proc cmp(a, b: Comparable): int
ToStringable = concept
proc `$`(a: ToStringable): string
Hashable = concept
proc hash(x: Hashable): int
proc `==`(x, y: Hashable): bool
Swapable = concept
proc swap(x, y: var Swapable) To be honest I'm really after simplicity here, to me introducing any new custom syntax is going too far. Because of this I would simply omit For the type
Appendable = concept
proc add(s: var Appendable, elem: Appendable)
ToStringable = concept
proc `$`(x: ToStringable): string
Quotable = Appendable or ToStringable But all in all, I still prefer this over the current concept syntax even if it does include these additional elements. |
First of all, nice to see this work being done. At first sight it looks clean. Regarding the either/orelse, I don't think it should be supported, at least not because of the gives use case. Generally it is much better to just require one operation, in this case it would be type
Quotable = concept
proc add(s: var string; elem: self)
proc addQuoted[T: Quotable](s: var string; x: T) =
s.add(x)
type
MyType = object
var str: string
var mt: MyType
proc add(s: var string; elem: MyType) =
s.add $elem
str.addQuoted(mt) # without the previous declaration: error |
If we zoom out a bit, a concept can represents two distinct set of requirements:
This proposal focuses entirely on the first type of requirements while ignoring the second type. By doing this, you only lose expressivity and nothing is gained in return. Gathering the set of syntax requirements is all you need to type check generics and it's a false assumption to believe that this is somehow hindered by the current approach to concepts (the set of requirements is well defined - it's exactly what would be needed in order to compile a generic function with the same body as the concept). If one really insists on using more declarative syntax, I'd suggest at least the following additions:
Some optional points to consider:
|
Er, it allows us to avoid the guesswork we do in the "generic prepass" with its ideas of "bind" vs "mixin" symbols. That is a gain. Also, proper type-checking for generics is done by C#, Scala, Rust, etc -- the C++'s template solution so far only has been copied by Nim and D, both languages which lacked an expert on type theory for their initial designs.
Yeah, well, "arbitrary facts" are dangerous. And as I said, if the concept cannot describe the type precisely, there is always the under-specified generic
Yeah, agreed, especially inferring static values needs to get special support. |
By "facts", I meant the predicates that the existing concepts support which doesn't necessarily concern the availability of certain overloads. Requiring a square matrix is an example for this. I meant that by not supporting such predicates you only lose expressivity.
I also tried to explain that the desired type checking is possible with or without such predicates. It's also possible with the current concepts, so this message is a bit of a straw man argument. |
My standard techniques to do this is to create a distinct type with private fields, and ensure that the only way to construct it is via some function that checks the predicate. Some sugar for this would go a long way |
As someone who isn't familiar with Nim, what does Further, is there any chance of nominal concepts ala |
I like the proposal except for these issues:
These could be solved with some modifications:
Code example: type
Dictionary[K, V] = concept
implemented:
proc `[]`(a: self; key: K): V
proc `[]=`(a: var self; key: K; value: V)
# either orelse is no longer needed
implemented( proc capacity(d: self): int ) or
implemented( proc size(d: self): int ) or self.hasField(size, int)
# any other compile-time test, if necessary even
compiles(someExpression)
# ... or some other awesome test that doesn't even exist now. Less compact than the OP, but with less non-obvious rules and much more flexible. |
Is there more to concept refinement than duplicating the concepts constraints (aka the concept's body) in a different concept? |
I don't understand Nim well enough yet to answer that question unfortunately. If refinement is not available though I'd like to avoid the situation where a "chain" of refined concepts has the same block of constraints copy-pasted in at every member of the chain. For example, in a library like Emmy I could have concepts for Magma, Semigroup, Monoid, Group, AbelianGroup, Ring, CommutativeRing and Field, each of which is refinement of the previous concept. The Magma constraints would be repeated seven times, the Semigroup constraints six times etc across the chain, with a corresponding increase in maintenance and readability issues. I'm not sure if that's how the new concepts proposal would work in practice though? |
Copy and paste can always be avoided via Nim's template mechanism anyway. |
Why do copy & pasting at all, doesn't it suffice to point just to the constraints of the concepts that need to be satisfied/implemented too? Question:
|
I agree that refinement is not strictly needed, and copy/paste can be avoided via templates. I guess one could also implement refinement via a type macro, that just substitutes the body of the refined concept in the right spot. Still, conceptually is not bad to have it, if it can be supported with low cost. |
This is just a collection of my thoughts about concepts so far: I find the argument that refinement is not needed, because it can be done with templates a bit weird. Yes, you could copy/paste all the declarations by defining them inside a template, but that would look something like template common() = ...
type
A = concept
common()
B = concept
common()
... which isn't very ergonomic imo and it should be really easy to implement in the compiler (just leave the copy/paste to the compiler, or point to the parent concept). Another thing that I find a bit weird is that Regarding associated types and constants (as they exist in Rust, for example), I'd love to see them, but I'm not sure how that's supposed to work, since items aren't bound to types in Nim (refs #380). I tried to simulate them using a proc that takes a type
Associated = concept
proc T(self: typedesc[Self]): typedesc
proc A(self: typedesc[Self]): static[int]
proc T(self: typedesc[int]): typedesc = bool
proc A(self: typedesc[int]): static[int] = 42
static: assert int is Associated # Error Finally, I think |
@dom96 on 10 Sep 2019:
I like this option, where the syntax is kept as unexceptional as possible, a lot. On refining concepts, would a similar approach do the trick? (modifying the example given by @andreaferretti)
|
What's wrong with just using type
AdditiveMonoid* = concept
proc `+`(x, y: Self): Self
proc zero(T: typedesc[Self]): Self
AdditiveGroup* = concept of AdditiveMonoid
proc `-`(x: Self): Self
proc `-`(x, y: Self): Self |
I find nothing wrong with My thinking with
...but then why not skip separately declaring the awkwardly-named It's perhaps also worth mentioning refining more than one concept at once. This is natural (and inevitable) with
Here, |
Something that came up on the forum a while ago: https://forum.nim-lang.org/t/8642#56264 (I think it's worth reposting here, for the sake of documentation). I really like the idea of having a |
Presently this RFC does not address having fields or properties of a specific type my sugestion is that an ident def could be used there so the following would be valid. Though this doesnt allow knowing if you can assign a field so maybe type
Vector2f = concept
x, y: float32
Vector3f = concept
x, y, z: float32
type MyVec = array[3, float32]
template x(myVec: MyVec): float32 = myVec[0]
template y(myVec: MyVec): float32 = myVec[1]
template z(myVec: MyVec): float32 = myVec[2]
assert MyVec is Vector3f
assert tuple[x: float32, y: float32] is Vector2f |
A declaration like |
"Should" as in "should already", or as in "should if implemented"? type
Fieldy = concept
proc field(x: Self): var int
Foo = object
field*: int
let bar = Foo()
doAssert bar is Fieldy |
"should be implemented" |
Generally, I like the proposal a lot. Although I don't terribly like Rust's traits implementation, I really do like that it is universally used. I think it would be fantastic if an interface like system was stabilized and integrated into the standard library and if this proposal gets them closer to that then I'm all for it. As far as I understand it (thanks to beef) the problem with old concepts are that function / general routine calls and field access (is that all?) are not accessible to the compiler: and so you cannot know that those operations are defined ex. inside a function taking parameters of a concept type? That said: I have some minor syntax/semantics squabbles and a few major issues with the current proposal. Major issues I strongly think these should be called IMO there should be a distinction between all routines: not just I don't think the field keyword needs to distinguish between writable and non-writable fields if I'm thinking about this correctly. The use case of interfaces is as function parameters: and so there is already a This could look like the following: type Stack[T] = interface
func push(s: var Self, val: T)
func pop(s: var Self): T
iterator items(s: Self): T
func `[]=`(s: var Self, a: uint, v: T)
field stack: array[256, T]
field stackptr: uint Minor issues I don't like the syntax of type Serializable = interface
iterator fieldPairs[T](x: Serializable): (string, RootObj)
proc write[T](x: T) I don't like |
Because it does something fundamentally different. There is not a single |
Ideally |
I'm happy to take |
I don't mind I am wondering though: what exactly is the distinction between concepts v2 and interfaces? Is it the lack of a particular (IMO an explicit |
This has been bumping around my head again recently. What do you think about removing the type Stack[T] = interface
push(s: var Self, val: T)
pop(s: var Self): T
items(s: Self): T
`[]=`(s: var Self, a: uint, v: T)
stack: array[256, T]
stackptr: uint I think something like this could be nice. |
That doesn't solve much as the matching needs to be "fuzzy" in other aspects too: Type modifiers like |
This doesn't make sense to me. Iterators behave very differently from funcs/procs/templates/macros (returning the same type), you can use iterators in a |
Concept redesign
Goals:
not only at instantiation time.
debug
echo
orlog
statement does not require a far reaching typeconstraint addition.
concept
's implementation onsystem.compiles
which isunder-specified in Nim, very tied to the current implementation, and
finally, slow.
Non goals:
we have the escape hatch plus the fact that you can leave your
T
underspecified.
proc needs a
cdecl
calling convention!")yet to see convincing examples of these cases. In the worst case, we could
add
enableif
for this without bloating the concept's design.macros. Having said that, since these concepts are declarative, they are
easier to process with tooling and arguably easier to process with macros
as well.
Atoms and containers
Concepts come in two forms: Atoms and containers. A container is a generic
concept like
Iterable[T]
, an atom always lacks any kind of genericparameter (as in
Comparable
).Syntactically a concept consists of a list of
proc
anditerator
declarations. There are 3 syntatic additions:
Self
is a builtin type within the concept's body stands for thecurrent concept.
each
is used to introduce a generic parameterT
within theconcept's body that is not listed within the concept's generic
parameter list.
either orelse
is used to provide basic support for optionalprocs within a concept.
We will see how these are used in the examples.
Atoms
Self
stands for the currently defined concept itself. It is used to avoida recursion,
proc cmp(a, b: Comparable): int
is invalid.Containers
A container has at least one generic parameter (most often called
T
). Thefirst syntactic usage of the generic parameter specifies how to infer and bind
T
.Other usages of
T
are then checked to match what it was bound to.Nothing interesting happens when we use multiple generic parameters:
The usual ": Constraint" syntax can be used to add generic constraints to
the involved generic parameters:
each T
Note:
each T
is currently not implemented.each T
allows to introduce generic parameters that are not part of aconcept's generic parameter list. It is furthermore a special case to
allow for the common "every field has to fulfill property P" scenario:
either orelse
Note:
either orelse
is currently not implemented.In generic code it's often desirable to specialize the code in an ad-hoc manner.
system.addQuoted
is an example of this:If we want to describe
T
with a concept we need some way to describe optionalaspects.
either orelse
can be used:More examples
system.find
It's straight-forward:
Sortable
Note that a declaration like
is possible but unwise. The reason is that
Indexable
either containstoo many procs we don't need or accessors that are slightly off as they don't
offer the right kind of mutability access.
Here is the proper definition:
Concept matching
A type
T
matches a conceptC
if every proc and iterator headerH
ofC
matches an entityE
in the current scope.The matching process is forgiving:
If
H
is aproc
,E
can be a proc, a func, a method, a template,a converter or a macro.
E
can have more parameters thanH
as longas these parameters have default values. The parameter names do not have
to match.
If
H
has the formproc p(x: Self): T
thenE
can be a publicobject field of name
p
and of typeT
.If
H
is an iterator,E
must be an iterator too, butE
's parameternames do not have to match and it can have additional default parameters.
Escape hatch
Generic routines that have at least one concept parameter are type-checked at declaration time. To disable type-checking in certain code sections an
untyped
block can be used:EDITED 2021/03/09
self
was renamed toSelf
and is what the experimental implementation uses.The text was updated successfully, but these errors were encountered: