-
-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Array iterator classes private to hide implementation from client #193
base: main
Are you sure you want to change the base?
Changes from all commits
e6691a1
84535ad
42b09a7
5f5d79f
a52eb5f
73099ee
873c378
b0fd354
f492370
62a7b03
48fc8bd
255c487
869eb82
2bfd088
7bf7641
06018fc
e60bb5a
16ec9ef
5b91ce2
7fda29d
4363ab0
5dbe1ca
68b7695
c0da56d
b539018
44103dd
d61d656
18f8d49
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,366 @@ | ||
- Feature Name: array_private_classes | ||
- Start Date: 2022-01-08 | ||
- RFC PR: (leave this empty) | ||
- Pony Issue: (leave this empty) | ||
|
||
Rendered RFC: https://github.com/pmetras/rfcs/blob/rfc-array/text/0000-array-private-classes.md | ||
|
||
# Summary | ||
|
||
Collection classes should not expose internal classes through iterators functions. | ||
|
||
Following information hiding design principle, the builtin classes of collection | ||
data structures must not be made visible through iterator functions like `ArrayKeys`, | ||
`ArrayValues` and `ArrayPairs`. These classes can be made private as they are | ||
only used as return types for `Array` functions `keys`, `values` and `pairs`. | ||
The return values for these functions are changed to the more general interface | ||
`Iterator`. | ||
|
||
A new interface `CollectionIterator` is defined to allow for rewindable iterators, | ||
like it is the case for `Array` `values`. | ||
|
||
This design principle is applied to the other collection classes that expose | ||
internals too like: | ||
Comment on lines
+22
to
+23
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, no, I am not in favor of anything in here that I am in favor of based on an abstract design principle. I can't support this language being included. |
||
|
||
* `List` | ||
* `Map` | ||
* persistent `Map` | ||
* `Vec` | ||
* persistent `Vec` | ||
* `Set` | ||
* `Itertools` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I addressed this below. There are references to |
||
|
||
This is a breaking change for collections' client code that use now internal | ||
classes but a search on Github repositories shows that the impact should be | ||
limited. | ||
|
||
# Motivation | ||
|
||
This change brings: | ||
|
||
- Applying the design principle of | ||
[hiding implementation details](https://en.wikipedia.org/wiki/Information_hiding) | ||
but offer a general and stable interface. Returning interfaces instead of concrete | ||
classes allows changing the implementation. Usually, one must return the most | ||
general type that fullfils the contract of the function (in the case of the | ||
functions discussed in this RFC, iteration). | ||
- Collections' functions `keys`, `values` and `pairs` definitions are made more | ||
general. Iterators implementation details are not public. Internal classes used | ||
by implementation like `*Keys`, `*Values` and `*Pairs` are now | ||
[opaque data types](https://en.wikipedia.org/wiki/Opaque_data_type). Generally, | ||
when using these collection classes, clients are not interested by the iterators | ||
implementation, but by the types these iterators return and that is provided by | ||
the generic parameters. | ||
- The generic return signature of these 3 iterating functions is simpler to | ||
understand for clients of collection classes. | ||
- Reduces the number of public classes in the standard library by hiding 18 | ||
specialised classes (iterators implementations) of which 3 are from the | ||
`builtin` module. | ||
- The interface `CollectionIterator` is added to create rewindable iterators | ||
(can be re-start from first value). | ||
|
||
This change remains compatible with the existing code base but for client code | ||
that is directly using the classes `*Keys`, `*Values` and `*Pairs`. A search on | ||
Github shows that the impact is very limited. | ||
|
||
To quote Antoine de Saint-Exupéry: | ||
|
||
> Perfection is achieved, not when there is nothing more to add, but when there | ||
> is nothing left to take away. | ||
Comment on lines
+66
to
+69
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To quote from the Pony Philosophy:
If making this change to abstract types has a performance impact than it is not inline with our driving philosophy which places performance at a higher priority than simplicity. |
||
|
||
Currently the API makes a guarantee that the return value of each of these functions | ||
is a specific concrete class with a specific public name. That particular guarantee | ||
does not actually serve any practical needs of users (except, arguably, the virtual | ||
calls caveat that was mentioned on [Github discussion](https://github.com/ponylang/rfcs/pull/193#issuecomment-1009733064)). If it is not helpful to users in practice, | ||
it can be removed. And if it can be removed, then it should be removed, in the | ||
name of taking another step toward stabilizing the Pony standard library's API. | ||
Comment on lines
+71
to
+76
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This linked issue comment is in line with my previous review comment concerning performance taking priority over simplicity in the Pony Philosophy. I would like to see some evidence that the compiler as it is today devirtualize the calls and as such we get both performance (no worse than today) and simplicity (improved by this RFC). |
||
|
||
It's worth mentioning that if this change was being debated in reverse (the | ||
current standard library had minimal interfaces for the return values, and | ||
somebody proposed to make them public concrete classes instead, because they | ||
found some tangible benefit in doing so), then it would not be a breaking change | ||
at all, because the new proposed type would be a subtype of the existing return | ||
types. | ||
|
||
It'll always be easier to get more specific with these return types than more | ||
general. So we should default to offering the most general return type that is | ||
still useful to users, and ratchet down to more specific subtypes as needed (as | ||
tangible use cases for more specific return types are discovered). Ratcheting down | ||
will always be possible without a breaking change, but ratcheting up will be a | ||
breaking change each time. It's best to make such "ratchet up" breaking changes | ||
like this RFC before we reach Pony 1.0.0 as part of a general effort to make Pony | ||
and its standard library stable. | ||
Comment on lines
+85
to
+92
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do not agree with this as motivation. It is always easier to start with many concrete implementations/returns which are then used to derive a generic implementation/return and trying to be generic before concrete presents more room for instability. |
||
|
||
# Detailed design | ||
|
||
Iterating functions in collections `keys`, `values` and `pairs` are changed to | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In your example they return a |
||
return `Iterator` and the classes that implement these iterators are made private. | ||
Here are the full implementation of these functions for the `Array` class (changes | ||
in other collection classes are identical). | ||
|
||
As the function `values` of class `Array` uses an iterator with a `rewind` function | ||
that is not part of the `Iterator` interface, a new interface `CollectionIterator` | ||
is added to enable creation of rewindable iterators. | ||
|
||
```pony | ||
fun keys(): CollectionIterator[USize]^ => | ||
""" | ||
Return an iterator over the indices in the array. | ||
""" | ||
_ArrayKeys[A, this->Array[A]](this) | ||
|
||
fun values(): CollectionIterator[this->A]^ => | ||
""" | ||
Return an iterator over the values in the array. | ||
""" | ||
_ArrayValues[A, this->Array[A]](this) | ||
|
||
fun pairs(): CollectionIterator[(USize, this->A)]^ => | ||
""" | ||
Return an iterator over the (index, value) pairs in the array. | ||
""" | ||
_ArrayPairs[A, this->Array[A]](this) | ||
``` | ||
|
||
Note: To remain consistent with `Array` behaviour, functions `keys` and `pairs` | ||
will return a `CollectionIterator`. | ||
|
||
```pony | ||
interface CollectionIterator[A] is Iterator[A] | ||
""" | ||
A `CollectionIterator` is an iterator that can be rewinded, that is start | ||
again from first item. The data structure being iterated on can't change the | ||
order it return iterated items. | ||
""" | ||
fun has_next(): Bool | ||
""" | ||
Return `true` when function `next` can be called to get next iteration item. | ||
""" | ||
|
||
fun ref next(): A ? | ||
""" | ||
Return the next item of the iteration or an error in case there are no other | ||
items. A previous call to `has_next` check if we can continue iteration. | ||
""" | ||
Comment on lines
+135
to
+144
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It may be my personal preference or lack of knowledge on the topic, but is there a reason to repeat these methods which are found on If these methods must be repeated I would prefer we add tests which check that the properties of |
||
|
||
fun ref rewind(): Iterator[A]^ | ||
""" | ||
Start the iterator over again from the beginning. | ||
""" | ||
``` | ||
|
||
The code of the standard library is adapted to remove use of these now private | ||
classes, mainly in tests. Here are the files that must be changed: | ||
Comment on lines
+152
to
+153
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What private classes? Isn't this RFC about making now public classes into private classes? |
||
|
||
* `packages/builtin/array.pony` as shown above | ||
* `packages/itertools/iter.pony` in function `cycle` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see that in source |
||
* `packages/collections/heap.pony` in function `values` | ||
* `packages/collection/builtin/_test.pony` in class `_TestArrayValuesRewind` | ||
* `packages/collections/list.pony` | ||
* `packages/collections/map.pony` | ||
* `packages/collections/persistent/map.pony` | ||
* `packages/collections/persistent/vec.pony` | ||
* `packages/collections/set.pony` | ||
* `test/libponyc/util.cc` to change the name of the class to `_ArrayValues` | ||
|
||
## Detailed changes | ||
|
||
In order to judge how the API becomes simpler to understand for clients of the | ||
collections classes, here are the changes in the functions' signatures. The `-` | ||
line shows the old signature while the `+` one is the new: | ||
|
||
```pony | ||
// Array | ||
- fun keys(): ArrayKeys[A, this->Array[A]]^ => | ||
+ fun keys(): Iterator[USize]^ => | ||
- fun values(): ArrayValues[A, this->Array[A]]^ => | ||
+ fun values(): CollectionIterator[this->A]^ => | ||
- fun pairs(): ArrayPairs[A, this->Array[A]]^ => | ||
+ fun pairs(): Iterator[(USize, this->A)]^ => | ||
|
||
// Heap | ||
- fun values(): ArrayValues[A, this->Array[A]]^ => | ||
+ fun values(): Iterator[this->A]^ => | ||
|
||
// List | ||
- fun nodes(): ListNodes[A, this->ListNode[A]]^ => | ||
+ fun nodes(): Iterator[this->ListNode[A]]^ => | ||
- fun rnodes(): ListNodes[A, this->ListNode[A]]^ => | ||
+ fun rnodes(): Iterator[this->ListNode[A]]^ => | ||
- fun values(): ListValues[A, this->ListNode[A]]^ => | ||
+ fun values(): Iterator[this->A]^ => | ||
- fun rvalues(): ListValues[A, this->ListNode[A]]^ => | ||
+ fun rvalues(): Iterator[this->A]^ => | ||
|
||
// Map | ||
- fun keys(): MapKeys[K, V, H, this->HashMap[K, V, H]]^ => | ||
+ fun keys(): Iterator[this->K]^ => | ||
- fun values(): MapValues[K, V, H, this->HashMap[K, V, H]]^ => | ||
+ fun values(): Iterator[this->V]^ => | ||
- fun pairs(): MapPairs[K, V, H, this->HashMap[K, V, H]]^ => | ||
+ fun pairs(): Iterator[(this->K, this->V)]^ => | ||
|
||
// Persistent Map | ||
- fun val keys(): MapKeys[K, V, H] => | ||
+ fun val keys(): Iterator[K] => | ||
- fun val values(): MapValues[K, V, H] => | ||
+ fun val values(): Iterator[V] => | ||
- fun val pairs(): MapPairs[K, V, H] => | ||
+ fun val pairs(): Iterator[(K, V)] => | ||
|
||
// Persistent Vec | ||
- fun val keys(): VecKeys[A]^ => | ||
+ fun val keys(): Iterator[USize]^ => | ||
- fun val values(): VecValues[A]^ => | ||
+ fun val values(): Iterator[A]^ => | ||
- fun val pairs(): VecPairs[A]^ => | ||
+ fun val pairs(): Iterator[(USize, A)]^ => | ||
|
||
// Set | ||
- fun values(): SetValues[A, H, this->HashSet[A, H]]^ => | ||
+ fun values(): Iterator[this->A]^ => | ||
``` | ||
|
||
For instance, in `Array` | ||
|
||
```pony | ||
- fun keys(): ArrayKeys[A, this->Array[A]]^ => | ||
+ fun keys(): Iterator[USize]^ => | ||
``` | ||
|
||
now the client user knows now that she gets an iterator over `USize`. | ||
|
||
Another more complex example with `Map`, | ||
|
||
```pony | ||
- fun keys(): MapKeys[K, V, H, this->HashMap[K, V, H]]^ => | ||
+ fun keys(): Iterator[this->K]^ => | ||
- fun values(): MapValues[K, V, H, this->HashMap[K, V, H]]^ => | ||
+ fun values(): Iterator[this->V]^ => | ||
``` | ||
|
||
we see that we iterate over keys (`K` type) or values (`V`). The initial signature | ||
gave complex generic classes that the user has to find in the documentation to | ||
understand that they are iterators, and then dive deeper to understand on what | ||
they iterate. | ||
Comment on lines
+242
to
+245
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do think these new signatures are more readily understood. |
||
|
||
# How We Teach This | ||
|
||
This change keeps the code compatible in the vast majority of cases. When client | ||
classes are defining objects of these now private types, the reason is usually | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This RFC would be making them private types, but as of now they are now public types. |
||
to get access to the function `rewind` that was not defined in `Iterator`. By | ||
adding the interface `CollectionIterator`, client code can easily be adapted, | ||
replacing `ArrayValues[A]` by `CollectionIterator[A]`. | ||
|
||
Also, client code generally uses these functions to iterate on the returned types | ||
and does not try to access the iterator directly but is interested by the iterated | ||
items. When client code refers to the iterator type, that's generally useless and | ||
the code can be rewritten to be made shorter and more future proof. | ||
Comment on lines
+257
to
+258
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How is referring to a type useless? Code that is rewritten to update a referenced type is neither more nor less "useless" than before. |
||
|
||
A [search on Github Pony code](https://github.com/search?q=%22ArrayValues%22+language%3APony&type=code) | ||
finds 24 files using the class `ArrayValues`, of which 6 are copies of `array.pony` file. | ||
|
||
For instance, in | ||
[xml2xpath.pony](https://github.com/redvers/pony-libxml2/blob/bbca5d98d48854bfec2c6ee110220873ecc4df34/pony-libxml2/xml2xpath.pony#L41), | ||
the code can be changed from | ||
|
||
```pony | ||
fun values(): ArrayValues[Xml2node, this->Array[Xml2node]]^ ? => | ||
if (allocated) then | ||
ArrayValues[Xml2node, this->Array[Xml2node]](nodearray) | ||
else | ||
error | ||
end | ||
``` | ||
|
||
to | ||
|
||
```pony | ||
fun values(): CollectionIterator[Xml2node]^ ? => | ||
if (allocated) then | ||
nodearray.values() | ||
else | ||
error | ||
end | ||
``` | ||
|
||
In this sample, the developer was not really concerned by the type of the iterator | ||
but that the `values` function must return an `CollectionIterator` over `Xml2node`. | ||
Comment on lines
+287
to
+288
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again you cannot speak on behalf of the mindset of the developer about whether or not Red was "not really concerned" about an aspect of his code. You splice your changes into your explanation here in a way that suggests Red was only concerned that |
||
The new version makes the code simpler to understand. | ||
|
||
This change in `array.pony` and other collections will break such code but it | ||
can be easily adapted to use the new API. And it will make the standard library | ||
easier to learn by reducing the number of public types. | ||
Comment on lines
+291
to
+293
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do agree that reducing the number of public types makes it initially easier to learn Pony. However, it the changes suggested here result in a performance decrease I cannot support these changes. My reason being that it takes only slightly longer to learn what |
||
|
||
# How We Test This | ||
|
||
Pony tests must continue to pass. No additional tests are need as after review | ||
the existing coverage in Pony standard library tests is sufficient | ||
Comment on lines
+297
to
+298
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would like to see tests added that guarantee all |
||
|
||
# Drawbacks | ||
|
||
Will break any existing code that uses any of the classes that are currently | ||
public and will be made private by this RFC. | ||
|
||
# Alternatives | ||
|
||
1. Stay as is. Continue the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why would we continue the conversation of keeping this as is? If no change is needed then no conversation is needed? Are you suggesting other changes which should be pulled into this RFC from Zulip? |
||
[discussion on Zulip](https://ponylang.zulipchat.com/#narrow/stream/189959-RFCs/topic/Make.20Array.20iterators.20private). | ||
|
||
2. Update the existing concrete classes to include `rewind` where needed. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How is this an alternative? |
||
|
||
3. Instead of defining a new type of iterator with the `CollectionIterator` interface, | ||
we can consider only the rewindable part of it in an `Rewindable` interface: | ||
|
||
```pony | ||
interface Rewindable[A] | ||
fun ref rewind(): A^ | ||
""" | ||
Rewind the type `A`. | ||
""" | ||
``` | ||
|
||
Then one can create a rewindable iterator by combining these two interfaces into | ||
a intersection type: | ||
|
||
```pony | ||
type CollectionIterator[A] is (Iterator[A] & Rewindable[Iterator[A]]) | ||
``` | ||
Comment on lines
+312
to
+328
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is my preference, aka how I would write my code, but I have had conversations which swayed me against this. I do not, as I sit here now, recall the reasoning that swayed me, but I am leaving a review comment here to voice my position on this alternative. |
||
|
||
Perhaps, a more general interface name instead of `Rewindable` would be `Resetable` | ||
to define the traits of a type that can be reset to its initial state, as in the | ||
case of iterators `reset` == `rewind`. | ||
|
||
The rewindable type can be used with other types than `Iterator`, like a data | ||
structure that would implement a rewindable property. This alternative was | ||
[put aside](https://github.com/ponylang/rfcs/pull/193#discussion_r780793165) to | ||
prevent name colisions in `builtin` with user-named types. | ||
|
||
## Comparison of alternatives | ||
|
||
1. Do nothing: return public concrete classes. | ||
2. Update the existing return classes and add `rewind`. | ||
3. Add a `CollectionIterator` interface for collections. | ||
4. Create intersection type by defining a `Rewindable` interface and combine | ||
with `Iterator`. | ||
5. Change `Iterator` to add `rewind`. | ||
|
||
| Justification / Alternative # | 1 | 2 | 3 | 4 | 5 | | ||
|:------------------------------------------|:---:|:---:|:---:|:---:|:---:| | ||
| a. Simpler stdlib API | --- | --- | + | +++ | +++ | | ||
| b. Simpler function signatures | --- | --- | + | +++ | +++ | | ||
| c. Stay compatible with existing stdlib | +++ | +++ | + | +++ | + | | ||
| d. Evolutive | --- | + | +++ | +++ | + | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changes for the sake of changes is not evolution -- there must be positive or negative selection which lead microevolution into macroevolution. Put another way, if all the suggested changes have negative effects than it is still "evolution" to do nothing as we do not bring those negative effects forward with us. Side note: I am a career "full stack biologist" who has done work ranging from cancer to conservation to COVID-19 -- Pony is the project that I volunteer my time with when not working on the "evolving nature" of biological problems. |
||
| e. No impact on compiler | +++ | +++ | - | +++ | - | | ||
| f. No impact on performance | +++ | +++ | +++ | +++ | +++ | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Have you benchmarked these changes to ensure no performance impact? |
||
| g. Limit stdlib pollution with interfaces | +++ | +++ | + | + | +++ | | ||
| g. Limit stdlib pollution with classes | --- | --- | +++ | ++ | +++ | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is "g" repeated? |
||
|
||
# Unresolved questions | ||
|
||
- Must analyze how `Range` and `Reverse` would be impacted if defined as | ||
`CollectionIterator`. Particularly, impact on existing client code. | ||
- Possible candidates to be analyzed: `StringBytes`, `StringRunes`, `ByteSeqIter` | ||
and probably others will be found in stdlib code. | ||
- Understand why the intersection type proposal is not considered as it seems to | ||
provide a better flexibility and is easier to understand. | ||
Comment on lines
+359
to
+366
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. None of these are questions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without "Following information hiding design principle"
being removed. I can't support this RFC. There's no "design principle" that I want to codify into Pony via RFC as a precedent for making changes.