Rethink Structural Types #1886
Comments
I wonder if the reason structural types aren't used more is the performance and cultural implications. I see people work around not wanting to use structural types with typeclasses, tuples, shapeless. I love the idea of being able to write a function that handles anything with a close function, without needing to write extra boilerplate for every new closeable I want to use it with. Someone will always show up to say, "What if someone misuses that feature on a class Relationship, where close indicates how near you are, or something like that?" But in the end I want types that help me do my job and eliminate bugs, and I have never actually had a bug caused by someone taking my generic reverse function and applying it to a vehicle.
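For illustration, the `close` use case could look roughly like this. This is only a sketch: it assumes a Scala 3 compiler and its `scala.reflect.Selectable.reflectiveSelectable` import, and `HasClose`, `autoClose`, `Door`, and `Connection` are made-up names.

```scala
import scala.reflect.Selectable.reflectiveSelectable

// Any value with a parameterless close(): Unit method conforms to this type.
type HasClose = { def close(): Unit }

// Run `op` on the resource and always close it afterwards.
// The call to close() goes through reflection, enabled by the import above.
def autoClose[A](resource: HasClose)(op: HasClose => A): A =
  try op(resource) finally resource.close()

// Two unrelated classes (hypothetical) that merely happen to share a close method.
class Door { var closed = false; def close(): Unit = closed = true }
class Connection { var closed = false; def close(): Unit = closed = true }
```

Neither `Door` nor `Connection` extends a common trait, and no type class instance has to be written for them; both can be passed to `autoClose` as they are.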
We can't handle them with the proposed scheme. I made a note in scala#1886.
Perhaps it is worth pointing out that Spark used the stringy types so they could make use of an optimized SQL backend. They did not move away from If this is about Spark then I suspect macros and implicit functions could be much more valuable to them.
edit: another possible cause of confusion: you are reusing the method name @ShaneDelmore the next time I write a
For comparison, I've got some mileage in Scalac out of an encoding like this in the past:
and then an implicit instance of
The feature has been implemented as described.
Is it possible to support type level selectable like this:
This implementation assumes a single implementation of Realistically, the implementation would need to dispatch to multiple ones (e.g. one for reflection, one for database A, one for database B, etc.), so it'd need to maintain some runtime registry where those can be registered dynamically (better mechanisms might be possible, but this is the baseline). However, the library design for this (or another solution) hasn't been done, only the compiler changes.
Originally, structural types were introduced to make the language fit better the underlying foundations (type theorists prefer structural), and to emulate the idea of "duck typing" in dynamic languages but with static guarantees. But it turned out that almost nobody uses them. It seems that a combination of traits and classes, together with type classes represented by implicit parameters gives enough flexibility, so the need for duck typing is rarely felt.
However, there is another area where statically typed languages are often more awkward than dynamically typed ones: database access. In a dynamically typed language, it's quite natural to model a row as a record or object, and to select entries with simple dot notation, e.g. `row.columnName`. In a statically typed language, we can do that only if we somehow define a class for every possible row arising from a database manipulation (including rows arising from joins and projections), and set up a scheme to map between a row and the class representing it. This requires a lot of boilerplate code. So quite often one opts for a simpler scheme where column names are represented as strings that are passed to a select operator, e.g. `row.select("columnName")`. But this forgoes all the advantages of static typing and additionally is more awkward to write than the dynamically typed version.
A case in point is the Spark framework. The first version of Spark essentially supported distributed collections using RDDs. Used from Scala, this was very natural: an RDD was just some kind of collection, and was accessed in the same way as other Scala collections. Collection elements were defined by classes which were mapped transparently to database rows.
Later versions of Spark added database schemas ("data frames") for better optimizations and multi-language support. But, sadly, this meant that some amount of type safety was lost and member access was now done via strings instead of the more natural dot notation.
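To make the trade-off concrete, here is a small sketch contrasting the two static encodings mentioned above. `StringRow` and `PersonRow` are hypothetical names chosen for illustration; they are not part of Spark or any other library.

```scala
// Hypothetical string-keyed row: flexible, but every access is untyped.
final class StringRow(cells: Map[String, Any]) {
  // Throws NoSuchElementException if the column name is wrong.
  def select(column: String): Any = cells(column)
}

// Hypothetical class-per-row encoding: typed dot access,
// but one such class is needed for every row shape.
final case class PersonRow(name: String, age: Int)

val raw = new StringRow(Map("name" -> "Emma", "age" -> 42))
// The caller must know the column type and cast;
// a typo in "age" is only detected at run time.
val ageFromString: Int = raw.select("age").asInstanceOf[Int]

val typed = PersonRow("Emma", 42)
// Checked at compile time; a misspelled field name would not compile.
val ageFromClass: Int = typed.age
```

Both accesses return the same value, but only the second one is verified by the compiler.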
It seems the most natural type to represent a row in a database scheme is a structural type, with one field for each column. But unfortunately this does not work, at least not with structural types as they are currently defined in Scala. The problem is that accessing a member of a structural type is always implemented in terms of accessing a field or method in a class, using Java reflection. For database access, this is not what we want; instead we would like to use the field name as a parameter for an operation that's defined by the system. In short, structural types are useless for database access because their member access implementation is not programmable.
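To illustrate what reflection-based member access amounts to, here is a minimal sketch using plain Java reflection. `reflectiveSelect` and `Point` are hypothetical names for this illustration, and the code the compiler actually generates is more involved.

```scala
// Hypothetical helper: select a parameterless public member by name
// via Java reflection.
def reflectiveSelect(receiver: AnyRef, name: String): Any =
  receiver.getClass.getMethod(name).invoke(receiver)

// A val in a Scala class is exposed as a public accessor method of the same name.
class Point(val x: Int, val y: Int)

// The member name is handed straight to the JVM's reflection API; there is
// no hook where a database layer could supply its own meaning for the name.
val px: Any = reflectiveSelect(new Point(3, 4), "x")
```

The lookup is fixed to members of the receiver's class, which is the limitation the paragraph above describes.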
The rest of this note describes a way to change that. It lays out a scheme to define programmatically the meaning of accessing a member of a structural type. The scheme is based on the idea of representing structural types programmatically, using "Selectables". It is implemented in PR #1881.
`Selectable` is a trait. Its principal method is `selectDynamic`: it takes a field name and returns the value associated with that name in the selectable.
To make this precise, assume `r` is a value with structural type `S`. In general `S` is of the form `C { Rs }`, i.e. it consists of a class reference `C` and refinement declarations `Rs`. We call a field selection `r.f` structural if `f` is a name defined by a declaration in `Rs` whereas `C` defines no member of name `f`. Assuming the selection has type `T`, it is mapped to something equivalent to `(r: Selectable).selectDynamic("f").asInstanceOf[T]`.
That is, we make sure `r` conforms to type `Selectable`, potentially by adding an implicit conversion. We then invoke the `selectDynamic` operation of that instance, passing the name `"f"` as a parameter. We finally cast the resulting value back to the statically known type `T`.
`Selectable` also defines another access method called `selectDynamicMethod`. This operation is used to select methods instead of fields. It gets passed the class tags of the selected method's formal parameter types as additional arguments. These can then be used to disambiguate one of several overloaded variants.
Package `scala.reflect` contains an implicit conversion, `reflectiveSelectable`, which can map any value to a selectable that emulates reflection-based selection, in a way similar to what was done until now.
When imported, `reflectiveSelectable` provides a way to access fields of any structural type using Java reflection. This is similar to the current implementation of structural types. The main difference is that to get reflection-based structural access one now has to add an import like `import scala.reflect.Selectable.reflectiveSelectable`. On the other hand, the previously required language feature import of `reflectiveCalls` would now be redundant and can be dropped.
`reflectiveSelectable` checks first whether its argument is already a run-time instance of `Selectable`, in which case it is returned directly. This means that reflection-based accesses only take place as a last resort, if no other `Selectable` is defined.
Other selectable instances can be defined in libraries. For instance, here is a simple class of records that support dynamic selection:
`Record` consists of a list of pairs of element names and values. Its `selectDynamic` operation finds the pair with the given name and returns its value.
Let's define a record value and cast it to a structural type `Person` (we get back to the issue of casting below; for now just note that the cast will succeed, as it checks at runtime only the erased portion of `Person`, which is `Record`).
Then `person.name` will have static type `String`, and will produce `"Emma"` as result.
The safety of this scheme relies on the correctness of the cast. If the cast lies about the structure of the record, the corresponding `selectDynamic` operation would fail. In practice, the cast would likely be part of a database access layer which would ensure its correctness.
It would be nice if the correctness of structural types could be ensured in a way less resembling pulling a rabbit out of your hat. Maybe this could be achieved by providing a language-defined bijection between structural types and a recursive generic type structure such as an `HMap`, i.e. an `HList` over pairs of labels and values. The idea is that one can define type-level operations over the generic type which implement data manipulations in a type-safe way. Structural types themselves do not lend themselves to recursive type-level operations because their fundamental shape is a set (of key/value pairs), not a recursive type such as a list. On the other hand, structural types naturally implement the natural subtyping one would expect for records, which HLists or HMaps cannot do.
Notes:
The scheme does not handle polymorphic methods in structural refinements. For now, such polymorphic methods are flagged as errors. It's not clear whether the use case is common enough to warrant the additional complexity of supporting it.
There are clearly some connections with `scala.Dynamic` here, since both select members programmatically. But there are also some differences.
Fully dynamic selection is not typesafe, but structural selection is, as long as the correspondence of the structural type with the underlying value is as stated.
`Dynamic` is just a marker trait, which gives more leeway where and how to define reflective access operations. By contrast, `Selectable` is a trait which declares the access operations.
One access operation, `selectDynamic`, is shared between both approaches, but the other access operations are different. `Selectable` defines a `selectDynamicMethod`, which takes class tags indicating the method's formal parameter types as additional arguments. `Dynamic` comes with `applyDynamic` and `updateDynamic` methods, which take actual argument values.
It would be interesting to see whether we can arrive at a harmonization between the two schemes. If we only look at `selectDynamic`, this is easy: We can define a class
So one `selectDynamic` operation can perform double duty for structural and dynamic dispatch. The differences between the other methods are a bit harder to bridge, however.